Learning Graph Invariance by Harnessing Spuriosity

Authors: Tianjun Yao, Yongqiang Chen, Kai Hu, Tongliang Liu, Kun Zhang, Zhiqiang Shen

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on synthetic datasets demonstrate that LIRS is able to learn more invariant features compared to state-of-the-art graph invariant learning methods that adopt the direct invariant learning paradigm. Furthermore, LIRS shows superior OOD performance on real-world datasets with various types of distribution shifts, highlighting its effectiveness in learning graph invariant features. Our contributions can be summarized as follows: [...] Extensive experiments demonstrate that LIRS outperforms the second-best baseline methods by up to 25.50% across 17 competitive baselines on both synthetic and real-world datasets with various distribution shifts.
Researcher Affiliation | Academia | 1. Mohamed bin Zayed University of Artificial Intelligence; 2. Carnegie Mellon University; 3. The University of Sydney
Pseudocode | Yes | Algorithm 1: The LIRS framework
Open Source Code | Yes | Code is available at https://github.com/tianyao-aka/LIRS-ICLR2025
Open Datasets | Yes | We adopt the GOODMotif and GOODHIV datasets (Gui et al., 2022) and the OGBG-Molbace and OGBG-Molbbbp datasets (Hu et al., 2020; Wu et al., 2018) to comprehensively evaluate the OOD generalization performance of our proposed framework.
Dataset Splits | Yes | Table 7: Details about the datasets used in our experiments. For example, GOOD-HIV (Scaffold split): 24,682 training / 4,113 validation / 4,108 testing graphs, 2 classes, metric ROC-AUC.
Hardware Specification | Yes | We run all the experiments on Linux servers with RTX 4090 GPUs and CUDA 12.2.
Software Dependencies | No | All experiments are run with PyTorch (Paszke et al., 2019) and PyTorch Geometric (Fey & Lenssen, 2019). We adopt the PyGCL (Zhu et al., 2021) package and modify the source code of DualBranchContrast to implement the biased infomax that generates spurious embeddings. To generate logits from the spurious embeddings, we use MiniBatchKMeans, LinearSVC, and CalibratedClassifierCV from the scikit-learn package (Pedregosa et al., 2011). No specific version numbers are provided for these libraries, only the publication years of their respective papers.
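The spurious-logit step described above (cluster spurious embeddings with MiniBatchKMeans, then fit a LinearSVC wrapped in CalibratedClassifierCV to obtain probabilities) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name `spurious_logits`, the cluster count, the calibration settings, and the log-of-probability conversion are all assumptions.

```python
# Hypothetical sketch of the spurious-logit pipeline: pseudo-label graph
# embeddings by cluster, then train a calibrated linear classifier so that
# per-cluster probabilities (and hence logits) are available.
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV

def spurious_logits(embeddings: np.ndarray, n_clusters: int = 5) -> np.ndarray:
    """Map spurious embeddings to an (n_samples, n_clusters) logit matrix."""
    # Step 1: assign each embedding a pseudo-label via clustering.
    pseudo = MiniBatchKMeans(n_clusters=n_clusters, n_init=10,
                             random_state=0).fit_predict(embeddings)
    # Step 2: LinearSVC has no predict_proba, so wrap it in
    # CalibratedClassifierCV to calibrate decision scores into probabilities.
    clf = CalibratedClassifierCV(LinearSVC(), cv=3)
    clf.fit(embeddings, pseudo)
    # Step 3: turn calibrated probabilities into logits (epsilon for log(0)).
    probs = clf.predict_proba(embeddings)
    return np.log(probs + 1e-12)

# Example with 300 random 16-dim "embeddings" standing in for real ones.
rng = np.random.default_rng(0)
Z = rng.normal(size=(300, 16))
logits = spurious_logits(Z)
print(logits.shape)
```

Calibration is the reason for the wrapper: a bare LinearSVC exposes only signed margins, while downstream reweighting needs probability-like scores.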
Experiment Setup | Yes | Optimization and evaluation. By default, we use the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 1e-3 and a batch size of 64 for all experiments. We also employ early stopping with a patience of 10 epochs based on validation performance for all datasets. [...] Hyperparameter search for LIRS. The penalty weight for L_Inv in LIRS is searched over {1e-1, 1e-2, 1e-3}. The reweighting coefficient γ is searched over {0.1, 0.3, 0.5, 0.7, 0.9}. The cluster number C is searched over {3, 5, 10}. The training epoch E at which the spurious embedding is derived from the biased infomax is searched over {50, 60, 70, 80, 90} for real-world datasets, and over {5, 6, 7, 8, 9} for the synthetic datasets.
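The quoted hyperparameter search enumerates to a modest grid. The sketch below lists the real-world-dataset configurations; treating the search as an exhaustive grid sweep (rather than, say, random search) is an assumption, and the dictionary keys are illustrative names for the quantities in the text.

```python
# Enumerate the LIRS hyperparameter grid quoted above (real-world datasets).
from itertools import product

grid = {
    "penalty_weight": [1e-1, 1e-2, 1e-3],        # weight for L_Inv
    "gamma": [0.1, 0.3, 0.5, 0.7, 0.9],          # reweighting coefficient
    "n_clusters": [3, 5, 10],                    # cluster number C
    "epoch_E": [50, 60, 70, 80, 90],             # biased-infomax epoch E
}

# Cartesian product: one dict per candidate configuration.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs))  # 3 * 5 * 3 * 5 = 225
```

With the synthetic-dataset values of E ({5, ..., 9}) the grid has the same size, so each dataset family requires at most 225 training runs per seed.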