Explainable Graph Neural Networks via Structural Externalities
Authors: Lijun Wu, Dong Hao, Zhiyi Fan
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental studies on both synthetic and real-world datasets show that GraphEXT outperforms existing baseline methods in terms of fidelity across diverse GNN architectures, significantly enhancing the explainability of GNN models. |
| Researcher Affiliation | Academia | Lijun Wu¹, Dong Hao¹·² and Zhiyi Fan¹ (¹SCSE, University of Electronic Science and Technology of China; ²AI-HSS, University of Electronic Science and Technology of China) |
| Pseudocode | Yes | Algorithm 1: Compute the Network-Based Value Function; Algorithm 2: Shapley Value Sampling and Estimation |
| Open Source Code | Yes | The full version of this paper and source code are at this link. |
| Open Datasets | Yes | We evaluated the effectiveness of GraphEXT using six datasets from node classification and graph-level classification tasks. The statistical details of these datasets are shown in Table 1. These datasets include synthetic datasets, sentiment graph datasets, and biological datasets. BA-Shapes [Ying et al., 2019] is designed for node classification tasks; it is based on a Barabási-Albert graph with added house-like patterns. ... BA-2Motifs [Luo et al., 2020] is used for graph classification tasks... Graph-SST2 and Graph-Twitter [Yuan et al., 2022] are used for graph classification tasks... BBBP and ClinTox [Wu et al., 2018] are designed for graph classification tasks... |
| Dataset Splits | No | For each dataset, we conducted quantitative calculations of the Fidelity metrics using test samples on trained GCN and GIN models to demonstrate the effectiveness of our method. The paper mentions using "test samples" but does not provide specific details on the train/validation/test splits, such as percentages, absolute counts, or references to predefined splits for reproducibility. |
| Hardware Specification | No | No specific hardware details (such as GPU/CPU models, processor types, or memory) used for running the experiments are provided in the paper. |
| Software Dependencies | No | In our experiments, we trained on all datasets using three-layer GCN [Kipf and Welling, 2016] and GIN [Xu et al., 2018] models. ... The datasets and baseline implementations were based on the DIG Library [Liu et al., 2021]. The paper mentions GCN and GIN models and the DIG Library, but does not provide specific version numbers for these software components. |
| Experiment Setup | No | In our experiments, we trained on all datasets using three-layer GCN [Kipf and Welling, 2016] and GIN [Xu et al., 2018] models. The model with the highest accuracy on the test set was selected as our final model, and all models were trained to achieve competitive accuracy. The paper names the three-layer GCN and GIN architectures and the model selection criterion, but does not report the hyperparameters or training configuration (e.g., learning rate, batch size, optimizer settings) needed for reproducibility. |
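The Pseudocode row references an "Algorithm 2: Shapley Value Sampling and Estimation". The paper's exact procedure is not reproduced here, but Shapley values are commonly estimated by Monte Carlo permutation sampling: average each player's marginal contribution over random orderings. A minimal sketch of that generic estimator (the function name and interface are illustrative, not from the paper):

```python
import random

def shapley_monte_carlo(players, value_fn, n_samples=1000, seed=0):
    """Estimate Shapley values by Monte Carlo permutation sampling.

    For each sampled permutation, add players one by one and credit
    each player with its marginal contribution to the coalition value.
    The average over permutations converges to the Shapley value.
    """
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    for _ in range(n_samples):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        prev = value_fn(coalition)
        for p in order:
            coalition.add(p)
            cur = value_fn(coalition)
            phi[p] += cur - prev
            prev = cur
    return {p: total / n_samples for p, total in phi.items()}
```

For an additive value function the estimator is exact after any number of samples, which makes a convenient sanity check; in the paper's setting the players would be graph nodes and `value_fn` the network-based value function of Algorithm 1.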
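Several rows cite the Fidelity metric used for quantitative evaluation. One common formulation (following the GNN-explainability literature, e.g. Yuan et al., 2022; the paper may differ in detail) compares the predicted-class probability on the full graph against the graph with the explanation removed (Fidelity+) or against the explanation subgraph alone (Fidelity-). A minimal sketch under that assumption:

```python
def fidelity_scores(prob_full, prob_without_expl, prob_only_expl):
    """Fidelity metrics from predicted-class probabilities.

    Fidelity+ = p(full graph) - p(graph minus explanation):
        large when removing the explanation hurts the prediction
        (higher is better).
    Fidelity- = p(full graph) - p(explanation subgraph only):
        small when the explanation alone suffices to reproduce
        the prediction (lower is better).
    """
    fid_plus = prob_full - prob_without_expl
    fid_minus = prob_full - prob_only_expl
    return fid_plus, fid_minus
```

A faithful explanation yields a high Fidelity+ and a near-zero Fidelity-; for example, probabilities of 0.9 (full), 0.3 (explanation removed), and 0.85 (explanation only) give scores of about 0.6 and 0.05.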