THESIS
2021
1 online resource (x, 45 pages) : illustrations (some color)
Abstract
Graph neural networks (GNNs) have achieved major success in solving challenging tasks
in malware analysis, social network analysis, molecular networks, image classification,
text comprehension, and other pattern analysis tasks. Despite the rapid development
of GNNs, recent research has demonstrated the feasibility of exploiting GNNs with
adversarial examples, in which a small distortion added to the input data dramatically
misleads the predictions of GNN models.
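To make the notion of an adversarial graph perturbation concrete, the sketch below shows how flipping a single edge in a graph's adjacency matrix can change the output of a GNN-style classifier. The one-layer mean-aggregation model and its random weights are hypothetical stand-ins for a trained GNN, not any model studied in this thesis.

import numpy as np

rng = np.random.default_rng(0)

def gcn_logits(adj, feats, weights):
    # One GCN-like layer: add self-loops, mean-aggregate neighbour features,
    # then project to class scores.
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1, keepdims=True)
    return (a_hat / deg) @ feats @ weights

n_nodes, n_feats, n_classes = 6, 4, 2
adj = (rng.random((n_nodes, n_nodes)) > 0.6).astype(float)
adj = np.triu(adj, 1)
adj = adj + adj.T                                  # symmetric, no self-loops
feats = rng.random((n_nodes, n_feats))
weights = rng.standard_normal((n_feats, n_classes))

clean_pred = gcn_logits(adj, feats, weights).sum(axis=0).argmax()

# "Small distortion": flip a single edge and re-query the model.
perturbed = adj.copy()
perturbed[0, 1] = perturbed[1, 0] = 1.0 - perturbed[0, 1]
adv_pred = gcn_logits(perturbed, feats, weights).sum(axis=0).argmax()

print("graph-level prediction on the clean graph:  ", clean_pred)
print("graph-level prediction after one edge flip: ", adv_pred)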
In this research, we present an attack that perturbs the control flow structure of an
executable to deceive GNN-based software similarity analysis tools. Unlike prior attacks,
which mostly change non-functional code components, our approach designs several
semantics-preserving manipulations applied directly to the control flow graph of a
software executable, making it particularly effective at deceiving GNNs. To speed up the
process, we design a framework that leverages gradient-based or hill climbing-based
optimization to generate adversarial examples in both white-box and black-box settings.
We evaluated our attack against two de facto GNN-based software similarity analysis
tools, ASM2VEC and ncc, and achieved reasonably high success rates. Furthermore, our
attack on an industrial-strength similarity analyzer, BinaryAI, shows that the proposed
attack can fool remote APIs in challenging black-box settings with a success rate of over 92.0%.
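As a rough illustration of the black-box setting, the sketch below runs a greedy hill-climbing loop that repeatedly applies control-flow-graph mutations and keeps only those that lower a similarity score returned by an oracle. The similarity oracle (an out-degree-histogram comparison) and the two mutation operators are simplified stand-ins chosen for illustration; they are not the manipulations or the optimization procedure used against ASM2VEC, ncc, or BinaryAI.

import itertools
import random
import networkx as nx
import numpy as np

random.seed(0)

def similarity(g1, g2, bins=8):
    # Toy black-box oracle: cosine similarity of out-degree histograms.
    h = lambda g: np.bincount([d for _, d in g.out_degree()], minlength=bins)[:bins]
    a, b = h(g1).astype(float), h(g2).astype(float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

_new_id = itertools.count(1000)

def insert_detour(g):
    # Route one edge through a fresh block (mimics an opaque-predicate split).
    g = g.copy()
    u, v = random.choice(list(g.edges()))
    w = next(_new_id)
    g.remove_edge(u, v)
    g.add_edge(u, w)
    g.add_edge(w, v)
    return g

def add_dead_branch(g):
    # Attach a never-executed block to a random node (junk branch).
    g = g.copy()
    u = random.choice(list(g.nodes()))
    g.add_edge(u, next(_new_id))
    return g

def hill_climb(cfg, target, budget=50, candidates=8, threshold=0.5):
    # Greedy black-box attack: keep the mutation that lowers similarity most.
    current, score = cfg, similarity(cfg, target)
    for _ in range(budget):
        if score < threshold:
            break
        pool = [random.choice([insert_detour, add_dead_branch])(current)
                for _ in range(candidates)]
        best = min(pool, key=lambda g: similarity(g, target))
        best_score = similarity(best, target)
        if best_score < score:                 # accept only improving moves
            current, score = best, best_score
    return current, score

# Toy directed graph standing in for a function's control flow graph.
cfg = nx.gnp_random_graph(12, 0.2, seed=1, directed=True)
adv_cfg, final_score = hill_climb(cfg, target=cfg.copy())
print("final similarity to the original CFG:", round(final_score, 3))

In a real black-box attack, the oracle would be a remote query to the similarity analyzer, and every mutation would have to preserve the semantics of the underlying executable.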