Graph Neural Networks (GNNs) have achieved great success in various graph tasks, such
as node classification and link prediction. Generally, they learn node representations by
recursively aggregating information from neighbors. However, recent studies have revealed
two significant issues associated with the deployment of GNNs in real-world applications.
First, GNNs are vulnerable to adversarial attacks on graph data, including
topology modifications and feature perturbations. The attackers can slightly manipulate
graph data to mislead GNNs into misclassifying node labels or increasing the predicted
link probability of a user toward a target item for promotion. Second, in many applications,
such as Twitter and Facebook, graphs are dynamic and evolving with continuous
graph events, such a...[
Read more ]
Graph Neural Networks (GNNs) have achieved great success in various graph tasks, such
as node classification and link prediction. Generally, they learn node representations by
recursively aggregating information from neighbors. However, recent studies have revealed
two significant issues associated with the deployment of GNNs in real-world applications.
First, GNNs are vulnerable to adversarial attacks on graph data, including
topology modifications and feature perturbations. The attackers can slightly manipulate
graph data to mislead GNNs into misclassifying node labels or increasing the predicted
link probability of a user toward a target item for promotion. Second, in many applications,
such as Twitter and Facebook, graphs are dynamic and evolving with continuous
graph events, such as node feature changes and graph structure updates. These events require
the node representations to be updated accordingly. Currently, due to the real-time
requirement, how to efficiently and reliably update node representations under continuous
graph events is still an open problem.
In this thesis, we first investigate the robustness of GNNs, focusing on node classification
and link prediction tasks. Specifically, in the node classification task, we propose a pure black-box attacker PEEGA, which is restricted to accessing node features and graph
topology for practicability. Furthermore, we reveal that effective attackers enable GNNs
to misclassify the labels of nodes by obscuring the neighbor distribution of nodes. As a
result, GNNs are unable to recognize nodes. Based on this observation, we propose a
GNN defender GNAT that incorporates three augmented graphs, i.e., a topology graph,
a feature graph, and an ego graph, to enhance the distinguishability of node contexts.
Besides, in the link prediction task, we propose an attacker DADA, which incorporates
difficulty-aware and diversity-aware objectives. These objectives enable easy users from
various communities, who have a higher tendency to the target item, to contribute more
weight when optimizing attackers. By incorporating these two objectives, the attack behaviors
of DADA can increase the link probability of diverse easy users towards the target
item, thereby enabling platforms to promote the target item to more users. Based on these
observations, we propose two solutions (e.g., attack behavior detection and adversarial
training) to help GNNs resist attacks for link prediction.
Second, we propose an efficient and reliable graph neural network framework, namely
EARLY, to update node representations for dynamic graphs. When graph events arrive,
we first identify the top-k influential nodes that are most affected by the new graph events.
Specifically, we theoretically analyze the impact of graph events on node representations.
Based on this, we formulate the top-k node selection problem and prove that this problem
is NP-hard. We then propose a greedy algorithm with a theoretical guarantee. Also, we
reveal that existing GNNs may sample similar and redundant neighbors to update node
representations, which cannot reflect the distribution of all neighbors. Consequently, node
representations aggregated on these redundant neighbors are not reliable, because they
may differ from the node representations aggregated on all neighbors. Thus, we propose
a diversity-aware layer-wise sampling approach to sample diverse neighbors for these
selected nodes in a layer-wise manner. We theoretically demonstrate that our algorithm
can learn more reliable node representations with lower expectation error.
We demonstrate the superior performance of the proposed approaches for the above
tasks against state-of-the-art techniques, through extensive experiments on real-world
datasets. In the end, we conclude this thesis with future research directions of improving
the robustness and efficiency of GNNs.
Post a Comment