Redundancy Undermines the Trustworthiness of Self-Interpretable GNNs
Authors: Wenxin Tai, Ting Zhong, Goce Trajcevski, Fan Zhou
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our findings through extensive experiments across diverse datasets, model architectures, and self-interpretable GNN frameworks, providing a benchmark to guide future research on addressing redundancy and advancing GNN deployment in critical domains. |
| Researcher Affiliation | Academia | 1Department of Software Engineering, University of Electronic Science and Technology of China, China 2Department of Electrical and Computer Engineering, Iowa State University, United States. Correspondence to: Fan Zhou <EMAIL>. |
| Pseudocode | No | The paper describes methods and processes using mathematical formulas and descriptive text, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/ICDM-UESTC/TrustworthyExplanation. |
| Open Datasets | Yes | We select four datasets: one synthetic dataset, BA-2MOTIFS (Luo et al., 2020), and three real-world datasets, MR (Rao et al., 2022), BENZENE (Morris et al., 2020), and MUTAGENICITY (Morris et al., 2020), all sourced from the graph learning community and all with ground-truth explanation labels. All datasets are published and can be downloaded from the Internet (see Table 5). |
| Dataset Splits | No | The paper mentions using a 'validation set' for hyperparameter selection and running methods multiple times with 'random seeds', but it does not specify explicit percentages or counts for training, validation, and test splits required for reproducing the exact data partitioning. |
| Hardware Specification | Yes | All experiments were conducted using PyTorch, trained with the Adam optimizer (Kingma & Ba, 2015), and executed on one NVIDIA RTX 4090 GPU with an Intel Core i7-13700KF CPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' as the framework used and the 'Adam optimizer', but it does not provide specific version numbers for these or other key software components. |
| Experiment Setup | Yes | GIN consists of 2 layers with a hidden size of 64, while GCN has 3 layers with the same hidden size. We employ a 3-layer Multi-Layer Perceptron (MLP) to predict edge weights, with hidden sizes set to 256, 64, and 1. The learning rate is chosen from {0.01, 0.005, 0.001, 0.0005, 0.0001}. The coefficient for EA is selected from {0.01, 0.1, 1, 10, 100}. We start using SWA from the 10th epoch. The hyperparameters (e.g., β, γ) are selected from {0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100}. The Gumbel-Softmax technique is used, with each edge weight computed as e_ij = σ((log ϵ − log(1 − ϵ) + w_ij)/τ), where ϵ ∼ Uniform(0, 1). |
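The Gumbel-Softmax (binary concrete) relaxation quoted in the setup row can be sketched in a few lines. This is a minimal stdlib-only illustration, not the authors' implementation: `w_ij` and `tau` follow the paper's symbols, while the function name and the injectable `rng` hook are ours, added so the stochastic draw can be pinned down for testing.

```python
import math
import random


def gumbel_sigmoid(w_ij: float, tau: float = 1.0, rng=random.random) -> float:
    """Relaxed Bernoulli edge weight: e_ij = sigmoid((log e - log(1 - e) + w_ij) / tau).

    A draw eps ~ Uniform(0, 1) is mapped through the logistic (Gumbel-difference)
    reparameterization, so the edge mask stays differentiable in w_ij while the
    temperature tau controls how close the output sits to a hard 0/1 decision.
    """
    eps = rng()  # eps ~ Uniform(0, 1)
    logit = (math.log(eps) - math.log(1.0 - eps) + w_ij) / tau
    return 1.0 / (1.0 + math.exp(-logit))
```

With `eps` fixed at 0.5 the noise term vanishes and the output reduces to `sigmoid(w_ij / tau)`; lowering `tau` sharpens the relaxation toward a binary edge mask, which matches the role of the temperature τ in the quoted formula.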