Heterophily-informed Message Passing

Authors: Haishan Wang, Arno Solin, Vikas K Garg

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments, conducted across various data sets and GNN architectures, demonstrate performance enhancements and reveal heterophily patterns across standard classification benchmarks. Furthermore, application to molecular generation showcases notable performance improvements on chemoinformatics benchmarks.
Researcher Affiliation | Collaboration | Haishan Wang (EMAIL) Aalto University; Arno Solin (EMAIL) Aalto University; Vikas Garg (EMAIL) YaiYai Ltd and Aalto University
Pseudocode | No | The paper describes processes like message passing, training, and generation in text. While it uses mathematical equations and explains steps, it does not include a distinct, labeled pseudocode or algorithm block.
Open Source Code | Yes | A reference implementation of the methods is available at https://github.com/AaltoML/heterophily-imp.
Open Datasets | Yes | We evaluated on 5 homophilic data sets from citation networks (Yang et al., 2016) (Cora, PubMed, CiteSeer) and co-purchase graphs (Shchur et al., 2018) (Computers, Photo). Furthermore, the 10 heterophilic data sets include hyperlink networks (Pei et al., 2019) (Cornell, Wisconsin, Texas), Wikipedia networks (Rozemberczki et al., 2021) (Chameleon, Squirrel), and the heterophilous graph datasets (Platonov et al., 2023) (Roman-empire, Amazon-ratings, Minesweeper, Tolokers, Questions). We consider two common molecule data sets: qm9 and zinc-250k. The qm9 data set (Ramakrishnan et al., 2014) comprises 134k stable small organic molecules... The zinc-250k (Irwin et al., 2012) data contains 250k drug-like molecules...
Dataset Splits | Yes | The data split settings are training/validation/test 60%/20%/20%. Each configuration (data set and model) is tested for 10 random model initializations and data splits. For three heterophilic data sets (Cornell, Wisconsin, and Texas), the data is split into train/validation/test with 10 fixed seeds from GEOM-GCN (Pei et al., 2019). In this experiment, all data sets are split with a train/test ratio of 80%/20%.
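The 60%/20%/20% split over 10 random seeds described above can be sketched as follows. This is a minimal, stdlib-only illustration of the splitting protocol; the function name and seed handling are assumptions, not the authors' actual code:

```python
import random

def split_indices(n_nodes, seed, ratios=(0.6, 0.2, 0.2)):
    """Shuffle node indices with a fixed seed and cut into train/val/test."""
    rng = random.Random(seed)
    idx = list(range(n_nodes))
    rng.shuffle(idx)
    n_train = int(ratios[0] * n_nodes)
    n_val = int(ratios[1] * n_nodes)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# One split per random seed, mirroring the 10-seed protocol.
splits = [split_indices(1000, seed) for seed in range(10)]
train, val, test = splits[0]
print(len(train), len(val), len(test))  # 600 200 200
```

Fixing the seed per split is what makes the 10 train/validation/test partitions reproducible across model runs.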
Hardware Specification | Yes | All models in this experiment are trained on a Linux cluster equipped with NVIDIA V100 GPUs. The training time and memory requirements for a single run were comparable across all modes (orig., hom., het., mix.) and all architectures. All models in this experiment are trained on a cluster equipped with NVIDIA A100 GPUs.
Software Dependencies | No | The models were implemented in PyTorch (Paszke et al., 2019) and PyTorch Geometric (PyG) (Fey & Lenssen, 2019). The text mentions software names like PyTorch and PyTorch Geometric and references papers, but does not provide specific version numbers for the software components used in the implementation.
Experiment Setup | Yes | Each model and its variants (HetMP, HomMP) contain 2 layers with 128 dimensions for all hidden layers. All the models are trained with the AdamW optimizer (Loshchilov & Hutter, 2019), learning rate 0.001, and dropout ratio 0.2. The HetFlow in Sec. 4.2 is built on GNNs with 4 layers and flows of depth ka = 27, kb = 10 (for qm9) and ka = 38, kb = 10 (for zinc-250k).
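The reported hyperparameters can be gathered into a single configuration for a re-run. This is a sketch only: the key names are illustrative, while the values are taken directly from the setup quoted above:

```python
# Hyperparameters reported for the node-classification experiments.
# Key names are illustrative; values come from the quoted setup.
NODE_CLASSIFICATION_CONFIG = {
    "num_layers": 2,
    "hidden_dim": 128,
    "optimizer": "AdamW",
    "learning_rate": 1e-3,
    "dropout": 0.2,
}

# Flow depths reported for the molecule-generation experiments (HetFlow).
HETFLOW_CONFIG = {
    "gnn_layers": 4,
    "qm9": {"ka": 27, "kb": 10},
    "zinc-250k": {"ka": 38, "kb": 10},
}
```

Centralizing these values in one place makes it easy to check a reproduction attempt against the paper's reported settings.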