THESIS
2023
1 online resource (vii, 66 pages) : illustrations (some color)
Abstract
Data-flow analysis is a general technique used to compute information of interest at different
points of a program and is considered to be a cornerstone of static analysis. In this
thesis, we consider interprocedural data-flow analysis as formalized by the standard IFDS
framework, which can express many widely-used static analyses such as reaching definitions,
live variables, and null-pointer. We focus on the well-studied on-demand setting in
which queries arrive one-by-one in a stream and each query should be answered as fast as
possible. While the classical IFDS algorithm provides a polynomial-time solution to this
problem, it is not scalable in practice. Specifically, it either requires a quadratic-time pre-processing
phase or takes linear time per query, both of which are untenable...[
Read more ]
Data-flow analysis is a general technique used to compute information of interest at different
points of a program and is considered to be a cornerstone of static analysis. In this
thesis, we consider interprocedural data-flow analysis as formalized by the standard IFDS
framework, which can express many widely-used static analyses such as reaching definitions,
live variables, and null-pointer. We focus on the well-studied on-demand setting in
which queries arrive one-by-one in a stream and each query should be answered as fast as
possible. While the classical IFDS algorithm provides a polynomial-time solution to this
problem, it is not scalable in practice. Specifically, it either requires a quadratic-time pre-processing
phase or takes linear time per query, both of which are untenable for modern
huge codebases with hundreds of thousands of lines. Previous works have already shown
that parameterizing the problem by the treewidth of the program’s control-flow graph is
promising and can lead to significant gains in efficiency. Unfortunately, these results were
only applicable to the limited special case of same-context queries.
In this work, we obtain significant speedups for the general case of on-demand IFDS
with queries that are not necessarily same-context. This is achieved by exploiting a new
graph sparsity parameter, namely the treedepth of the program’s call graph. Our approach
is the first to exploit the sparsity of control-flow graphs and call graphs at the same time
and parameterize by both treewidth and treedepth. We obtain an algorithm with a linear
preprocessing phase that can answer each query in constant time with respect to the input
size. Finally, we show experimental results demonstrating that our approach significantly
outperforms the classical IFDS and its on-demand variant.
Post a Comment