Explainable Graph Neural Networks via Structural Externalities

Authors: Lijun Wu, Dong Hao, Zhiyi Fan

IJCAI 2025

Reproducibility
Variable | Result | LLM Response
Research Type | Experimental | Experimental studies on both synthetic and real-world datasets show that GraphEXT outperforms existing baseline methods in terms of fidelity across diverse GNN architectures, significantly enhancing the explainability of GNN models.
Researcher Affiliation | Academia | Lijun Wu (1), Dong Hao (1,2) and Zhiyi Fan (1); (1) SCSE, University of Electronic Science and Technology of China; (2) AI-HSS, University of Electronic Science and Technology of China; EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: Compute the Network Based Value Function; Algorithm 2: Shapley Value Sampling and Estimation
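The paper's Algorithm 2 (Shapley Value Sampling and Estimation) is not reproduced here, but permutation-based Monte Carlo Shapley estimation generally follows the pattern below. The value function is a toy stand-in (counting edges retained inside a coalition on a fixed 4-node path graph), not the paper's network-based v(S):

```python
import random

def shapley_sample(players, value_fn, num_samples=2000, seed=0):
    """Monte Carlo Shapley estimation via random permutations.

    For each sampled permutation, a player's marginal contribution is
    v(predecessors + player) - v(predecessors); averaging these over
    permutations converges to the exact Shapley value.
    """
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    for _ in range(num_samples):
        perm = players[:]
        rng.shuffle(perm)
        coalition = set()
        prev = value_fn(coalition)
        for p in perm:
            coalition.add(p)
            cur = value_fn(coalition)
            phi[p] += cur - prev
            prev = cur
    return {p: s / num_samples for p, s in phi.items()}

# Toy value function standing in for the paper's network-based v(S):
# a coalition's value is the number of edges of a fixed path graph
# whose endpoints both lie in the coalition (an illustrative assumption).
EDGES = {(0, 1), (1, 2), (2, 3)}

def toy_value(coalition):
    return sum(1 for u, v in EDGES if u in coalition and v in coalition)

print(shapley_sample([0, 1, 2, 3], toy_value))
```

In this toy game each edge splits its unit of value evenly between its two endpoints, so the estimates approach 0.5 for the end nodes and 1.0 for the middle nodes, and the estimates always sum exactly to v(N) = 3 (the efficiency property holds per permutation).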
Open Source Code | Yes | The full version of this paper and source code are at this link.
Open Datasets | Yes | We evaluated the effectiveness of GraphEXT using six datasets from node classification and graph-level classification tasks. The statistical details of these datasets are shown in Table 1. These datasets include synthetic datasets, sentiment graph datasets, and biological datasets. BA-Shapes [Ying et al., 2019] is designed for node classification tasks; it is based on a Barabási-Albert graph with added house-like patterns. ... BA-2Motifs [Luo et al., 2020] is used for graph classification tasks... Graph-SST2 and Graph-Twitter [Yuan et al., 2022] are used for graph classification tasks... BBBP and ClinTox [Wu et al., 2018] are designed for graph classification tasks...
Dataset Splits | No | For each dataset, we conducted quantitative calculations of the Fidelity metrics using test samples on trained GCN and GIN models to demonstrate the effectiveness of our method. The paper mentions using "test samples" but does not provide specific details on the train/validation/test splits, such as percentages, absolute counts, or references to predefined splits for reproducibility.
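The Fidelity metric referenced above is commonly defined (e.g., in the explainability literature the paper builds on) as the drop in the model's predicted-class score once the explanation subgraph is removed from the input. A minimal sketch, with a toy stand-in for the trained GNN (the actual metric would use the GCN/GIN output probabilities):

```python
# Toy setup: a "graph" is a frozenset of node ids, and the stand-in
# "model" scores a graph by the fraction of motif nodes it contains.
# MOTIF and toy_predict are illustrative assumptions, not the paper's model.
MOTIF = {3, 4, 5}

def toy_predict(nodes):
    return len(nodes & MOTIF) / len(MOTIF)

def remove_nodes(nodes, removed):
    return nodes - removed

def fidelity_plus(predict, graph, explanation):
    """Fidelity+ : drop in the predicted-class score after removing the
    explanation subgraph. Higher means the explanation captures the
    structure the model actually relies on."""
    return predict(graph) - predict(remove_nodes(graph, explanation))

g = frozenset(range(8))
print(fidelity_plus(toy_predict, g, {3, 4, 5}))  # removes the whole motif -> 1.0
print(fidelity_plus(toy_predict, g, {0, 1}))     # irrelevant nodes -> 0.0
```

A faithful explainer should select a subgraph whose removal causes a large score drop, which is why Fidelity is evaluated on held-out test samples.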
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, processor types, or memory) used for running the experiments are provided in the paper.
Software Dependencies | No | In our experiments, we trained on all datasets using three-layer GCN [Kipf and Welling, 2016] and GIN [Xu et al., 2018] models. ... The datasets and baseline implementations were based on the DIG Library [Liu et al., 2021]. The paper mentions GCN and GIN models, and the DIG Library, but does not provide specific version numbers for these software components.
Experiment Setup | No | In our experiments, we trained on all datasets using three-layer GCN [Kipf and Welling, 2016] and GIN [Xu et al., 2018] models. The model with the highest accuracy on the test set was selected as our final model, and all models were trained to achieve competitive accuracy. The paper mentions the use of three-layer GCN and GIN models and the model selection criterion, but it does not provide specific hyperparameters or detailed training configurations (e.g., learning rate, batch size, optimizer settings) needed for reproducibility.
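For context, each layer of the three-layer GCN backbone applies the standard propagation rule H' = ReLU(D^-1/2 (A+I) D^-1/2 H W) from Kipf and Welling [2016]. A dependency-free sketch with arbitrary placeholder weights (not the paper's trained parameters or hyperparameters):

```python
import math

def matmul(x, y):
    """Plain-Python matrix product for the small example below."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def gcn_layer(adj, h, w):
    """One GCN layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops, then symmetrically normalise by node degree.
    a = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    norm = [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbour features, apply the linear map, then ReLU.
    return [[max(0.0, v) for v in row] for row in matmul(matmul(norm, h), w)]

# Three stacked layers on a 3-node path graph, mirroring the three-layer
# backbone described in the paper (weights are illustrative placeholders).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
h = [[1.0], [0.0], [1.0]]
w = [[1.0]]
for _ in range(3):
    h = gcn_layer(adj, h, w)
print(h)
```

In practice the models would be built with a graph library (the paper uses the DIG Library for datasets and baselines); this sketch only makes the per-layer computation concrete.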