Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs

Authors: Michael Scholkemper, Xinyi Wu, Ali Jadbabaie, Michael Schaub

ICLR 2025

Reproducibility assessment (variable: result, followed by the supporting LLM response):
Research Type: Experimental. "Experimental results corroborate the effectiveness of our method, demonstrating improved performance across various GNN architectures and tasks."
Researcher Affiliation: Academia. Michael Scholkemper (Department of Computer Science, RWTH Aachen University); Xinyi Wu (Institute for Data, Systems, and Society, Massachusetts Institute of Technology); Ali Jadbabaie (Institute for Data, Systems, and Society, Massachusetts Institute of Technology); Michael T. Schaub (Department of Computer Science, RWTH Aachen University).
Pseudocode: No. The paper describes its methods through equations, such as update rules (1), (4), and (5), and the definition of GraphNormv2; it also explains the Weisfeiler-Leman algorithm. However, it does not present any of these in a clearly labeled "Pseudocode" or "Algorithm" block with structured steps.
Open Source Code: Yes. "The code is made available here. The results in Figure 1 can be reproduced by running the ablation_study.ipynb notebook in the supplementary material." Aggregating the statistics over multiple runs strengthens the reproducibility of the results.
Open Datasets: Yes. "We investigate the effect of normalization in deep (linear) GNNs on the Cora dataset (Yang et al., 2016)... We perform graph classification tasks on the standard benchmark datasets MUTAG (Schlichtkrull et al., 2017), PROTEINS (Morris et al., 2020) and PTC-MR (Bai et al., 2019) as well as node classification tasks on Cora, Citeseer (Yang et al., 2016) and large-scale ogbn-arxiv (Hu et al., 2020)."
Dataset Splits: Yes. "Following the general set-up of (Errica et al., 2019), we investigate the performance of GIN, GCN and GAT in a 5-fold cross-validation setting... We perform a within-fold 90%/10% train/validation split for model selection. We train the models for 200 epochs using the AdamW optimizer and search the hyperparameter space over the following parameter combinations..."
Hardware Specification: Yes. "We ran all of our experiments on a system with two NVIDIA L40 GPUs, two AMD EPYC 7H12 CPUs and 1 TB RAM."
Software Dependencies: No. The paper mentions software such as PyG (Fey & Lenssen, 2019) and OGB (Hu et al., 2020) and notes their licenses, but does not provide specific version numbers for these or for other components such as Python or PyTorch, which would be necessary for precise replication.
Experiment Setup: Yes. "We train the models for 200 epochs using the AdamW optimizer and search the hyperparameter space over the following parameter combinations: learning rate {10^-4, 10^-3, 10^-2, 10^-1}, feature size {32, 64}, weight decay {0, 10^-2, 10^-4}, number of layers {3, 5}."
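The quoted protocol (a 5-fold cross-validation with a within-fold 90%/10% train/validation split, plus a grid search over the stated hyperparameter space) can be sketched in plain Python. This is a minimal illustration of the setup as described, not the paper's actual code; the function names, the random seed, and the synthetic index set are assumptions.

```python
# Sketch, assuming the splits are made over sample indices:
# 5-fold CV with a within-fold 90%/10% train/validation split,
# and the hyperparameter grid quoted under "Experiment Setup".
from itertools import product
import random

# Hyperparameter grid as stated in the report.
grid = {
    "learning_rate": [1e-4, 1e-3, 1e-2, 1e-1],
    "feature_size": [32, 64],
    "weight_decay": [0, 1e-2, 1e-4],
    "num_layers": [3, 5],
}

def hyperparameter_combinations(grid):
    """Yield every configuration in the grid as a dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

def five_fold_splits(n_samples, seed=0):
    """Yield (train, validation, test) index lists for 5-fold CV,
    with a 90%/10% train/validation split inside each fold."""
    rng = random.Random(seed)  # illustrative seed, not from the paper
    indices = list(range(n_samples))
    rng.shuffle(indices)
    fold_size = n_samples // 5
    for k in range(5):
        test = indices[k * fold_size:(k + 1) * fold_size]
        rest = indices[:k * fold_size] + indices[(k + 1) * fold_size:]
        cut = int(0.9 * len(rest))  # 90% of the remaining data trains
        yield rest[:cut], rest[cut:], test

combos = list(hyperparameter_combinations(grid))
print(len(combos))  # 4 * 2 * 3 * 2 = 48 configurations per fold
```

Under this reading of the protocol, model selection picks, per fold, the configuration with the best validation score among the 48 candidates before evaluating on that fold's held-out test split.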