Making call graph construction more practical for program analysis in the real world

HKUST Electronic Theses

Making call graph construction more practical for program analysis in the real world

by Yuandao Cai

THESIS 2023

Ph.D. Computer Science and Engineering

1 online resource (xx, 210 pages) : illustrations (some color)

Abstract

The explosively increasing code size and high concurrency make modern software increasingly complex and error-prone. Furthermore, the prevalence and extensive deployment of modern software have exacerbated the hazards of any software vulnerabilities and errors, potentially affecting millions of users and resulting in significant financial costs. Unfortunately, the ever-growing complexity of modern codebases has made many conventional program analysis tasks impractical for safeguarding today’s software. Specifically, as one of the most fundamental tasks of program analysis, call graph construction for large-scale code has become either imprecise or unscalable, which impairs the effectiveness of many downstream program analysis applications. In addition, as one of the most popular tasks of program analysis, bug detection is faced with significant challenges in effectively detecting concurrency bugs. In particular, unlike detecting sequential bugs, detecting concurrent bugs requires additional characterization of exponentionally possible thread interactions. To address these challenges, in this thesis, we contribute to making fundamental call graph construction and popular concurrency bug detection more practical.

First, we advocate two new approaches for call graph construction with different precision-scalability trade-offs, catering to different scenarios throughout software development (e.g., the continuous integration and daily build process). Specifically, Kelp significantly outperforms existing type-based call graph construction, which enables the construction of precise call graphs for millions of lines of code in a few minutes. On the other hand, Coral largely outperforms existing pointer-analysis-based call graph construction, which facilitates the construction of precise call graphs for millions of lines of code in a few hours. By evaluating the call graphs through the lens of several various downstream program analysis clients (i.e., thread-sharing analysis, program slicing, value-flow bug detection, and directed grey-box fuzzing), our experimental results suggest that KELP and CORAL can dramatically promote their effectiveness for better vulnerability understanding, hunting, and reproduction.

Second, we advocate a series of static approaches for concurrency bug detection, including CANARY for detecting inter-thread value-flow bugs, PEAHEN for detecting deadlocks, and LOCKPICK for detecting lock misuses. It is important to note that the effectiveness of our concurrency bug detection can be also enhanced by our work on practical call graph construction. Our experimental results show that our concurrency bug detectors can beat many popular open-source and commercial static concurrency bug detectors, such as Clang Static Analyzer (CSA) and Facebook/Meta Infer, regarding efficiency and precision. Moreover, our concurrency bug detectors are practical and deployable, checking millions of lines of code in less than five hours, mostly with a false positive rate lower than 30%.

Excitingly, our research prototypes for call graph construction and concurrency bug detection have been successfully deployed in two global 500 companies. Moreover, our concurrency bug detectors have uncovered over two hundred bugs with sixteen CVE IDs assigned in numerous award-winning and popular open-source software, such as the Linux kernel, Firefox, PHP, PostgreSQL, and OpenSSL. We believe that the adoption of our techniques at complicated industrial settings and the capability to uncover genuine bugs in the real world are strong evidence of our approaches’ effectiveness and advancement.

[ Hide abstract ]

View Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection HKUST Electronic Theses Degree Ph.D. Department Computer Science and Engineering Supervisors Zhang, Charles Chuan Authors Cai, Yuandao Subjects Debugging in computer science Computer programs Testing Subroutines (Computer programs) Mathematical optimization Language English Call number Thesis CSE 2023 CaiY DOI 10.14711/thesis-991013231858203412

Full record

Making call graph construction more practical for program analysis in the real world

by Yuandao Cai

Post a Comment Cancel reply