Sanitizing Backdoored Graph Neural Networks: A Multidimensional Approach
Authors: Rong Zhao, Jilian Zhang, Yu Wang, Yinyan Zhang, Jian Weng
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that, at the cost of a slight loss in clean classification accuracy, MAD achieves a considerably lower attack success rate than state-of-the-art backdoor defense methods. |
| Researcher Affiliation | Academia | Rong Zhao, Jilian Zhang, Yu Wang, Yinyan Zhang and Jian Weng, College of Cyber Security, Jinan University, Guangzhou, China. EMAIL, EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the MAD framework in Section 4, detailing its three steps (preprocessing, anomaly detection, and trigger pruning) with prose and equations. However, it does not present these steps in a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Datasets. Four datasets are used, where Cora, Pubmed, and OGB-arxiv are academic citation networks, while Flickr is a large-scale social network. Statistics of these datasets are summarized in Table 1. |
| Dataset Splits | Yes | Implementation Details. Each of the experiment datasets is divided into a training set and a test set, where the former and the latter contain 80% and 20% of data samples, respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. It mentions using GNN models but no hardware specifications. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., Python, PyTorch, TensorFlow, or other libraries/frameworks). |
| Experiment Setup | Yes | We inject backdoors into graph data strictly following the same procedure as in SOTA backdoor attack methods. Meanwhile, we define the attack budget as the number of nodes poisoned by triggers, which is set to 10, 40, 80, and 160 for Cora, PubMed, Flickr, and OGB-arxiv, respectively. As discussed in Section 4, for the computed standard deviation of feature values, Euclidean distance between node embeddings, and entropy of class prediction probabilities, we denote the corresponding thresholds as t_σ, t_dist, and t_H, respectively. Specifically, we set t_σ to the cut-off value of the nodes with the top 3% highest deviations of node features for Cora and Pubmed, while t_σ is set to that of the nodes with the top 1% highest deviations for Flickr and OGB-arxiv, because these two datasets are larger. Similarly, we set t_dist in the same way as t_σ, i.e., the cut-off value of the node pairs with the top 3%, 3%, 1%, and 1% highest Euclidean distance between node embeddings for the four datasets, respectively. As for t_H, however, it is defined as the cut-off value of the nodes with the top 3%, 3%, 1%, and 1% smallest entropy for the four datasets, respectively. Nodes that exceed the three thresholds are regarded as potential trigger nodes or edges and are removed from the graph. We employ three GNN models, i.e., GCN, GAT, and GraphSAGE, when testing, and we report the average ASR and CA. |
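The percentile-based threshold selection quoted in the Experiment Setup row can be illustrated with a short sketch: pick a cut-off so that the top 3% (or 1%) of an anomaly score falls beyond it, then flag nodes accordingly. This is a minimal, hypothetical reconstruction under our own assumptions; the function names, the restriction to two of the three scores (feature deviation and prediction entropy), and the test data are not from the paper.

```python
import numpy as np

def percentile_cutoff(scores, top_frac, largest=True):
    """Cut-off value such that roughly top_frac of scores lie beyond it."""
    q = 100 * (1 - top_frac) if largest else 100 * top_frac
    return np.percentile(scores, q)

def flag_suspicious_nodes(features, probs, top_frac=0.03):
    """Flag nodes whose feature standard deviation is unusually high
    (top top_frac) or whose prediction entropy is unusually low
    (bottom top_frac, i.e., over-confident predictions)."""
    # Standard deviation of each node's feature vector.
    sigma = features.std(axis=1)
    t_sigma = percentile_cutoff(sigma, top_frac, largest=True)

    # Entropy of each node's class prediction probabilities.
    eps = 1e-12
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    t_H = percentile_cutoff(entropy, top_frac, largest=False)

    # A node exceeding either threshold is a potential trigger node.
    return (sigma >= t_sigma) | (entropy <= t_H)
```

An edge-level score (Euclidean distance between node embeddings, thresholded by t_dist) would follow the same percentile pattern over node pairs.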