THESIS
2020
xiv, 112 pages : illustrations ; 30 cm
Abstract
Software bugs cost developers and software companies a great deal of time and
money. Although previous work has reported many success stories for using static
bug-finding tools, it is still difficult to find bugs hidden behind sophisticated pointer
operations, deep calling contexts, and complex path conditions with a low false-positive rate, while sieving through millions of lines of code in just a few hours. In
this thesis, we present our novel designs of sparse value-flow analysis to tackle a wide
range of software bugs caused by improper value flows. The proposed approach has
been commercialized and deployed in many of the global 500 companies. It also has
reported hundreds of real bugs for open-source software systems.
The first problem addressed in the thesis is referred to...[
Read more ]
Software bugs cost developers and software companies a great deal of time and
money. Although previous work has reported many success stories for using static
bug-finding tools, it is still difficult to find bugs hidden behind sophisticated pointer
operations, deep calling contexts, and complex path conditions with a low false-positive rate, while sieving through millions of lines of code in just a few hours. In
this thesis, we present our novel designs of sparse value-flow analysis to tackle a wide
range of software bugs caused by improper value flows. The proposed approach has
been commercialized and deployed in many of the global 500 companies. It also has
reported hundreds of real bugs for open-source software systems.
The first problem addressed in the thesis is referred to as the pointer trap: a
precise points-to analysis limits the scalability of sparse value-flow analysis and an
imprecise one seriously undermines its precision. To solve the problem, we present
Pinpoint, a holistic approach that decomposes the cost of high-precision points-to
analysis by precisely discovering local data dependence and delaying the expensive
inter-procedural analysis through memorization. Such memorization enables the on-demand slicing and, thus, improves the scalability with high precision. Experiments
show that Pinpoint can check millions of lines of code in less than five hours with a
false positive rate lower than 30%.
The second problem addressed in the thesis is known as the extensional scalability
problem, which happens when we simultaneously check many value-flow properties
with high precision. A major factor to this deficiency is that the core static analysis
engine is oblivious of the mutual synergy among the properties being checked, thus
inevitably losing many optimization opportunities. Our work is to leverage the
inter-property awareness and to capture redundancies and inconsistencies when many
properties are considered together. The evaluation results demonstrate that the
approach, Catapult, is more than 8× faster than Pinpoint but consumes only 1/7 of
the memory when checking twenty value-flow properties together.
The third problem addressed in the thesis is how to improve the parallelism of
static bug finding over the conventional parallel designs. Conventionally, bottom-up
program analysis has been easy to parallelize because functions without caller-callee relations can be analyzed independently. However, functions with caller-callee
relations have to be analyzed sequentially because the analysis of a function depends
on the analysis results of its callees. We present Cheetah, a framework of bottom-up
analysis, in which the analysis task of a function is partitioned into multiple sub-tasks. These sub-tasks are pipelined and run in parallel, even though the caller-callee
relations exist. The evaluation results of Cheetah demonstrate significant speedup
over a conventional parallel design.
Post a Comment