Improving the Effective Receptive Field of Message-Passing Neural Networks

Authors: Shahaf E. Finder, Ron Shapira Weber, Moshe Eliasof, Oren Freifeld, Eran Treister

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive evaluations on benchmarks such as the Long-Range Graph Benchmark (LRGB), we demonstrate substantial improvements over baseline MPNNs in capturing long-range dependencies while maintaining computational efficiency. ... Our Main Contributions are as follows: ... Through experiments on benchmarks such as the Long Range Graph Benchmark (LRGB), we demonstrate the superior performance of IM-MPNN in capturing long-range dependencies and mitigating over-squashing.
Researcher Affiliation | Academia | 1 Department of Computer Science, Ben-Gurion University, Israel; 2 Data Science Research Center, Ben-Gurion University, Israel; 3 Department of Applied Mathematics and Theoretical Physics, University of Cambridge, United Kingdom; 4 School of Brain Sciences and Cognition, Ben-Gurion University, Israel.
Pseudocode | No | The paper describes the IM-MPNN architecture and its components using natural language and mathematical equations (e.g., equations 14-21) but does not include a clearly labeled 'Pseudocode' or 'Algorithm' block with structured steps.
Open Source Code | Yes | Our code is available at https://github.com/BGU-CS-VIL/IM-MPNN
Open Datasets | Yes | Through experiments on benchmarks such as the Long Range Graph Benchmark (LRGB)... We test our method on Long Range Graph Benchmark (Dwivedi et al., 2022) ... The PascalVOC-SP and COCO-SP datasets are based on the Pascal VOC 2011 image dataset (Everingham et al., 2010) and the MS COCO image dataset (Lin et al., 2014)... We evaluate IM-MPNN on the City-Networks benchmark (Liang et al., 2025)... We evaluate IM-MPNN on five heterophilic node classification benchmarks introduced by Platonov et al. (2023): Roman-Empire, Amazon-Ratings, Minesweeper, Tolokers, and Questions.
Dataset Splits | Yes | For all datasets, we used the official splits as by Dwivedi et al. (2022), and reported the average and standard-deviation performance across 3 seeds. ... We follow the training and evaluation protocols from Platonov et al. (2023), and in particular follow the same splits. ... We follow Liang et al. (2025) training procedure of 20k epochs, batch size of 20k, learning rate of 10^-3, and weight decay of 10^-5.
Hardware Specification | Yes | Table 7: Training and inference runtime per epoch using an Nvidia RTX A6000 GPU. ... Table 8: Runtimes on the Questions dataset using 8-layer network with 256 channels on Nvidia RTX A6000 GPU.
Software Dependencies | No | The paper mentions using the AdamW optimizer and PyTorch Geometric (PyG) but does not specify exact version numbers for any software dependencies, which are necessary for reproducible descriptions.
Experiment Setup | Yes | We follow Liang et al. (2025) training procedure of 20k epochs, batch size of 20k, learning rate of 10^-3, and weight decay of 10^-5. The model is evaluated for accuracy every 100 epochs, and the model with the best validation is saved for final testing. All the scenarios are repeated 5 times, and we report their means and standard deviations. ... For hyperparameters, we consider learning rates and weight decays in the range of 1e-5 to 1e-3 using the AdamW optimizer, and we consider 2, 3, 4, 8 scales within our IM-MPNN framework.
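The quoted experiment setup can be collected into a minimal configuration sketch. This is an illustration only, not the authors' code; the dictionary keys are assumed names, while the values are the ones reported above (City-Networks training protocol, and the LRGB hyperparameter search space):

```python
# Training hyperparameters quoted for the City-Networks protocol
# (key names are assumptions; values are from the reported setup).
config = {
    "optimizer": "AdamW",
    "epochs": 20_000,
    "batch_size": 20_000,
    "lr": 1e-3,            # "learning rate of 10^-3"
    "weight_decay": 1e-5,  # "weight decay of 10^-5"
    "eval_every": 100,     # "evaluated for accuracy every 100 epochs"
    "num_repeats": 5,      # "all the scenarios are repeated 5 times"
}

# Search space described for the LRGB experiments.
search_space = {
    "lr": (1e-5, 1e-3),            # learning rates in [1e-5, 1e-3]
    "weight_decay": (1e-5, 1e-3),  # weight decays in [1e-5, 1e-3]
    "num_scales": [2, 3, 4, 8],    # scales within the IM-MPNN framework
}
```

Note that such a configuration would still be insufficient for exact reproduction without the pinned software versions flagged as missing in the Software Dependencies row.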