From GNNs to Trees: Multi-Granular Interpretability for Graph Neural Networks

Authors: Jie Yang, Yuwen Wang, Kaixuan Chen, Tongya Zheng, Yihe Zhou, Zhenbang Xiao, Ji Cao, Mingli Song, Shunyu Liu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on graph classification benchmarks, with both synthetic and real-world datasets, demonstrate the superiority of TIF in interpretability while delivering prediction performance competitive with state-of-the-art counterparts.
Researcher Affiliation | Academia | Zhejiang University; State Key Laboratory of Blockchain and Data Security, Zhejiang University; Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security; Big Graph Center, Hangzhou City University; Nanyang Technological University
Pseudocode | No | The paper describes the methodology with mathematical equations and textual descriptions (Sections 3.1 to 3.4) but includes no dedicated pseudocode block or algorithm figure.
Open Source Code | Yes | "Our code will be made publicly available at https://github.com/dutyj2020/TIF."
Open Datasets | Yes | Experiments use five real-world datasets spanning several domains: the protein datasets ENZYMES, PROTEINS (Feragen et al., 2013), and D&D (Dobson & Doig, 2003); the molecular dataset MUTAG (Wu et al., 2018); and the scientific collaboration dataset COLLAB (Yanardag & Vishwanathan, 2015). Three synthetic datasets are additionally used to demonstrate the interpretability of the framework.
Dataset Splits | Yes | Each dataset is divided into 10 equal subsets for 10-fold cross-validation: in every iteration, one fold is held out as the validation set while the remaining 9 folds are used for training. Reported per-model times are averaged over the folds.
Hardware Specification | No | The acknowledgments mention "advanced computing resources provided by the Supercomputing Center of Hangzhou City University", but no CPU or GPU models or memory specifications are given.
Software Dependencies | No | The paper lists "Adam" as the optimizer in Table 4 but provides no version numbers for any programming language, library, or other software dependency.
Experiment Setup | Yes | The hyper-parameters include batch size, optimizer, learning rate, and number of epochs, plus three coefficients that weight the loss terms: α1 controls the edge prediction loss Llink, which preserves graph connectivity during hierarchical graph coarsening; α2 governs the perturbation regularization loss Lperturb, which balances the similarity regularization Lsimilarity against the diversity regularization Ldiversity so that embeddings in the learnable graph perturbation module stay diverse yet close to the original; α3 adjusts the entropy regularization loss Lentropy, which promotes diverse path selection in the adaptive routing module. The specific settings are given in Table 4.
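The roles of the loss coefficients described in the Experiment Setup row suggest a weighted-sum objective. The paper's exact formula is not quoted in this report, so the composition below is an assumption: the classification term L_cls and the internal balance weight λ in L_perturb are hypothetical, and only the α-weighted terms are named by the paper.

```latex
\mathcal{L}
  = \mathcal{L}_{\mathrm{cls}}
  + \alpha_1 \,\mathcal{L}_{\mathrm{link}}
  + \alpha_2 \,\mathcal{L}_{\mathrm{perturb}}
  + \alpha_3 \,\mathcal{L}_{\mathrm{entropy}},
\qquad
\mathcal{L}_{\mathrm{perturb}}
  = \lambda \,\mathcal{L}_{\mathrm{similarity}}
  + (1 - \lambda)\,\mathcal{L}_{\mathrm{diversity}}
```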
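The 10-fold protocol in the Dataset Splits row can be sketched as follows. This is a minimal illustration, not the authors' code; the dataset size of 600 (the number of graphs in ENZYMES) and the contiguous fold assignment are illustrative assumptions.

```python
# Sketch of 10-fold cross-validation: split the index set into 10 equal
# subsets, hold one out as validation, and train on the remaining 9.
def k_fold_splits(n_samples, k=10):
    """Yield (train_indices, val_indices) for each of the k folds."""
    indices = list(range(n_samples))
    fold_size = n_samples // k
    for i in range(k):
        val = indices[i * fold_size:(i + 1) * fold_size]
        train = indices[:i * fold_size] + indices[(i + 1) * fold_size:]
        yield train, val

# 600 graphs split into 10 folds of 60, with 540 training graphs each.
splits = list(k_fold_splits(600))
```

Averaging a per-fold metric (such as wall-clock time, as the row describes) then amounts to summing over the 10 (train, val) pairs and dividing by 10.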