THESIS
2022
1 online resource (xvii, 124 pages) : illustrations (chiefly color)
Abstract
Vulnerabilities become prevalent and cost great financial and time lost nowadays.
As one of the most effective ways to detect vulnerabilities, fuzzing has been widely
applied and optimized in both academia and industry. Nonetheless, fuzzing is still
deficient in detecting vulnerabilities hidden behind sophisticated program behaviors
such as complex path conditions and deep calling context. Moreover, along with
the growing size of the software programs, sieving through a specific vulnerability is
similar to finding a needle in the haystack. In this thesis, we demonstrate our novel
designs of fuzzing frameworks with the assistance of static analysis by interpreting the
program behavior for directing fuzzing toward the vulnerabilities more effectively. The
proposed approach has been succes...[
Read more ]
Vulnerabilities become prevalent and cost great financial and time lost nowadays.
As one of the most effective ways to detect vulnerabilities, fuzzing has been widely
applied and optimized in both academia and industry. Nonetheless, fuzzing is still
deficient in detecting vulnerabilities hidden behind sophisticated program behaviors
such as complex path conditions and deep calling context. Moreover, along with
the growing size of the software programs, sieving through a specific vulnerability is
similar to finding a needle in the haystack. In this thesis, we demonstrate our novel
designs of fuzzing frameworks with the assistance of static analysis by interpreting the
program behavior for directing fuzzing toward the vulnerabilities more effectively. The
proposed approach has been successfully deployed in the development environment
of Huawei, which is one of the biggest companies in the world, and awarded by
detecting thousands of potential threats.
Our first addressed problem is the non-incremental issue in existing fuzzing:
fuzzing, which consists of many epochs for analyzing path conditions and generating
new inputs, does not preserve and reuse the previous exploration states in the
successive epochs, and thus solve the path condition redundantly. To tackle this
problem, we present PANGOLIN, an incremental hybrid fuzzing system that preserves
the solved path condition as a precise polyhedra path abstraction. The preserved
abstractions can be reused in further input generation by constraining the random
mutation and accelerating the following constraint solving. Overall, the incremental
mechanism enables PANGOLIN to achieve up to 30% coverage improvement within
the same time budget compared with state of the art. Meanwhile, PANGOLIN also
detects 41 previous unseen bugs, with 17 assigned with CVE ids.
Our second addressed problem can be referred to as infeasible path explosion: To
detect specific vulnerabilities in the target programs, existing directed fuzzers are still inefficient since they either symbolically or concretely execute a large number of
paths that cannot reach the target code, and thus waste the computational resources.
To solve this problem, we design BEACON, a directed fuzzing framework that can
effectively direct fuzzer in the sea of paths in a provable manner. Assisted by a
lightweight static analysis, BEACON infers the sound preconditions for the values of
the program variables that directly make the path-to-target infeasible. The evaluation
results demonstrate that more than 80% of the infeasible path can be rejected and
thus improve the efficiency of reproducing vulnerabilities with 11.50x speedup on
average than state of the art. Moreover, BEACON detects 14 incomplete fixes of
previous vulnerabilities and eight new bugs while 10 of them are exploitable with
new CVE ids assigned.
Our third addressed problem can be referred to as high-cost/benefit trap: The
existence of multiple vulnerabilities in the target programs requires fuzzing to
schedule the seeds, i.e., selecting seeds for mutation from a pool of candidates, which
significantly impacts the speed of a greybox fuzzer to achieve a target coverage rate.
Despite improvement in seeds scheduling, existing works cannot escape from the high-cost
trap or the high-benefit trap. Due to the ignorance of the impacts from either the
cost or the benefits, they often trap fuzzers into mutating the seeds without increasing
any coverage. Therefore, we present BELIEFFUZZ, which transforms the fuzzing
procedure into a Monte Carlo Tree Search (MCTS) system. The system allows us to
compute both the benefits and the cost dynamically during the fuzzing procedure and,
different from existing MCTS-based approaches, can provide theoretical guarantees
for the coverage rate in the worst case. The experimental results demonstrated that
our approach achieves a significant efficiency improvement, with 212%-601% speedups
and 1.50x-2.77x fewer executions needed, over the state of the arts to achieve the
same coverage in 24 hours experiments. Meanwhile, the efficiency improvement
also benefits the effectiveness of coverage discovery with an increase from 118%-221%. Moreover, BELIEFFUZZ also detected 33 more previously-unseen bugs in the
real-world projects evaluated with 17 CVEs assigned.
Post a Comment