THESIS
2017
xvi, 137 pages : illustrations (some color) ; 30 cm
Abstract
Software crashes are a severe manifestation of software bugs. Crashes often
need to be fixed with high priority. Due to the severity of crashing bugs,
companies (e.g., Microsoft and Apple) and open source communities (e.g., Mozilla and
NetBeans) have widely deployed crash reporting systems to automatically collect
program execution stacks when crashes occur. While crash reporting systems can
collect and group similar crash reports at massive scale, they offer little support for debugging
and fixing crashes. As a result, the crash diagnosis process still relies mostly on manual
effort, which is tedious and expensive.
Automating crash diagnosis involves the following major challenges. First, each
collected crash report contains only the last program execution stack (i.e., crash
stack) when a crash occurs. The crash stack logs the crashing function and its calling
chain, which provides only brief information about the failed execution and is not sufficient
for debugging. Second, crash reports can be numerous because a single bug can
generate many crash reports due to different inputs or configurations. Diagnosing
such a large volume of crash reports is non-trivial. Moreover, diagnosing crashes
requires understanding the root causes of crashing bugs. Through surveys and
literature reviews, we explore two kinds of important crash diagnosis information:
crash-inducing changes and crash trace data. Crash-inducing changes, i.e., the
changes that initially introduce the crashing bug, are in high demand among developers
in practice. However, because the characteristics of crash-inducing changes are
not well understood, identifying crash-inducing changes among the large number of
changes in the code repository is challenging. Besides crash-inducing changes, tracing
the crash executions via program instrumentation is another common practice to
narrow down and understand the root causes. However, automating crash tracing
involves two major challenges. First, deployed software is required to run with
minimal overhead and cannot afford a heavyweight instrumentation approach to
collect program execution information. Second, end users require that the
logged information not reveal sensitive production data.
To address these challenges, in this thesis we first propose a technique, CrashLocator,
to locate the buggy functions by statically analyzing and mining crash
stacks. Then, to locate the crash-inducing changes and facilitate understanding of the
root causes of crashing bugs, we propose a technique, ChangeLocator, which statically
analyzes and mines crash reports. Furthermore, we propose an automatic
program tracing technique, Casper, which collects program call trace information.
We select the program call trace as the tracing data for crashing bugs, since it does not
expose sensitive user data and has been shown to be useful for crash reproduction and
bug diagnosis. Our proposed technique incurs significantly lower runtime and space
overhead for call trace collection than the conventional instrumentation approach.