Homomorphism Counts as Structural Encodings for Graph Learning

Authors: Linus Bao, Emily Jin, Michael Bronstein, Ismail Ilkan Ceylan, Matthias Lanzinger

ICLR 2025
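The paper's structural encodings (MoSE) are built from homomorphism counts of small pattern graphs into the input graph. As a minimal illustration of the underlying quantity — not the paper's implementation, which uses far more efficient counting — here is a brute-force sketch that enumerates all vertex maps and keeps those that preserve every pattern edge:

```python
from itertools import product

def count_homomorphisms(pattern_edges, pattern_n, target_edges, target_n):
    """Brute-force count of homomorphisms from a pattern graph into a target
    graph: maps f from pattern vertices to target vertices such that every
    pattern edge (u, v) lands on a target edge (f(u), f(v)).

    Illustrative only; real homomorphism counting uses dynamic programming
    over tree decompositions rather than enumeration.
    """
    # Symmetric adjacency set for the (undirected) target graph.
    target_adj = set()
    for u, v in target_edges:
        target_adj.add((u, v))
        target_adj.add((v, u))

    count = 0
    # Enumerate every map from pattern vertices to target vertices.
    for f in product(range(target_n), repeat=pattern_n):
        if all((f[u], f[v]) in target_adj for u, v in pattern_edges):
            count += 1
    return count
```

For example, the triangle has 6 homomorphisms into itself (its automorphisms), while it has none into the 4-cycle, since a bipartite graph contains no image of an odd cycle.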

Reproducibility Variable Result LLM Response
Research Type | Experimental | Empirically, we observe that MoSE outperforms other well-known positional and structural encodings across a range of architectures, and it achieves state-of-the-art performance on a widely studied molecular property prediction dataset. ... 5 EXPERIMENTS
Researcher Affiliation | Collaboration | Linus Bao (University of Oxford), Emily Jin (University of Oxford), Michael Bronstein (University of Oxford / AITHYRA), Ismail Ilkan Ceylan (University of Oxford), Matthias Lanzinger (TU Wien)
Pseudocode | No | The paper describes methods through mathematical formulations and textual descriptions, but it does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | All code and instructions on how to reproduce our results are available at the following link: https://github.com/linusbao/MoSE.
Open Datasets | Yes | We evaluate GPS+MoSE on three different benchmarking datasets, including ZINC (Irwin et al., 2012; Dwivedi et al., 2023), PCQM4Mv2 (Hu et al., 2021), and CIFAR10 (Krizhevsky et al., 2009; Dwivedi et al., 2023). ... QM9 is a real-world molecular dataset that contains over 130,000 graphs (Wu et al., 2018; Brockschmidt, 2020). ... The Peptides-func and Peptides-struct datasets were introduced in the Long Range Graph Benchmark (Dwivedi et al., 2022b).
Dataset Splits | Yes | For ZINC, ... The dataset is split into 10,000 graphs for training, 1,000 graphs for validation, and 1,000 graphs for testing. ... PCQM4Mv2-subset ... results in 322,869 training graphs, 50,000 validation graphs, and 73,545 test graphs. ... The graph benchmarking dataset maintains the same splits as the original image dataset, which contains 45,000 training, 5,000 validation, and 10,000 test graphs. ... QM9 dataset ... They are split into 110,831 graphs for training, 10,000 for validation, and 10,000 for testing.
Hardware Specification | Yes | All experiments were conducted on a cluster with 12 NVIDIA A10 GPUs (24 GB) and 4 NVIDIA H100s. Each node had 64 cores of Intel(R) Xeon(R) Gold 6326 CPU at 2.90GHz and 500GB of RAM.
Software Dependencies | No | The paper does not explicitly mention specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | We provide additional experimental details and hyperparameters for our results in Section 5. ... See the appendix sections that follow for details on our hyperparameter configurations and hyperparameter searches. ... Table 14: Hyperparameter summary for MoSE results on ZINC from Table 2. ... We perform a grid-search on MLP-E for our ZINC results presented in Table 2. We first fix the following hyperparameters: Batch Size: 32; Optimizer: Adam; Weight Decay: 0.001; Base LR: 0.001; Max Epochs: 1200; LR Scheduler: Cosine; LR Warmup Epochs: 10.
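The row above quotes a cosine LR scheduler with 10 warmup epochs, a base LR of 0.001, and 1200 max epochs. The paper does not spell out the exact schedule formula, so the following is a common interpretation offered as a sketch: linear warmup to the base LR, then cosine decay toward zero at the final epoch.

```python
import math

def lr_at_epoch(epoch, base_lr=0.001, warmup_epochs=10, max_epochs=1200):
    """Per-epoch learning rate: linear warmup to base_lr over warmup_epochs,
    then cosine decay to 0 at max_epochs.

    An assumed reconstruction of the quoted schedule, not the authors' code.
    """
    if epoch < warmup_epochs:
        # Linear ramp: epoch 0 gets base_lr / warmup_epochs, the last
        # warmup epoch gets the full base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay over the remaining epochs.
    progress = (epoch - warmup_epochs) / (max_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

With the quoted values, the rate ramps from 0.0001 at epoch 0 to 0.001 by epoch 9, then decays smoothly to (near) zero at epoch 1199.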