Disentangling Invariant Subgraph via Variance Contrastive Estimation under Distribution Shifts
Authors: Haoyang Li, Xin Wang, Xueling Zhu, Weigao Wen, Wenwu Zhu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct extensive experiments to verify that our VIVACE method can effectively handle distribution shifts even on severely biased graph datasets by capturing the invariant subgraphs, including the experimental setup, quantitative comparisons, ablation studies, the impact of the hyper-parameters, etc. Table 1. Experimental results (%) of our method and baselines. The evaluation metric is accuracy for CMNIST, CFashion, and CKuzushiji, and ROC-AUC for MOLSIDER and MOLHIV. ± denotes the standard deviation. The best results are in bold for each row. Our VIVACE outperforms the baselines in all comparisons, indicating its superiority against graph distribution shifts. |
| Researcher Affiliation | Collaboration | 1Department of Computer Science and Technology, BNRist, Tsinghua University, Beijing, China 2Department of Radiology, Xiangya Hospital, Central South University, Changsha, Hunan, China 3Alibaba Group. Correspondence to: Xin Wang <EMAIL>, Wenwu Zhu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 The training procedure of VIVACE. Input: The graph dataset. Output: An optimized invariant subgraph generator Φ(·) and predictor f_I(·) mapping each graph to its label. 1: while not converged do 2: for sampled minibatch B of the graph dataset do 3: for each graph G and the corresponding label y in B do 4: Generate the edge mask matrix M by Eq. (4). 5: Generate the invariant subgraph G_I and the variant subgraph G_V by Eq. (3). 6: end for 7: Calculate the contrastive learning objective by Eq. (5). 8: Calculate the objective of the variant subgraph predictor by Eq. (7). 9: Obtain the propensity score function for reweighting by Eq. (9). 10: Calculate the objective of the invariant subgraph predictor by Eq. (8). 11: Obtain the overall objective by Eq. (11). 12: Update model parameters by backpropagation. 13: end for 14: end while |
| Open Source Code | No | The paper does not contain an explicit statement about releasing code for the described methodology or a direct link to a code repository. |
| Open Datasets | Yes | The datasets are publicly available as follows: CMNIST: http://yann.lecun.com/exdb/mnist/ CFashion: https://github.com/zalandoresearch/fashion-mnist CKuzushiji: https://github.com/rois-codh/kmnist MOLSIDER: https://ogb.stanford.edu/docs/graphprop/ MOLHIV: https://ogb.stanford.edu/docs/graphprop/ |
| Dataset Splits | Yes | The default split separates structurally different molecules with different scaffolds into different subsets, i.e., training/validation/testing sets. We report accuracy for CMNIST, CFashion, and CKuzushiji, and ROC-AUC for MOLSIDER and MOLHIV. |
| Hardware Specification | Yes | All the experiments are conducted with: Operating System: Ubuntu 18.04.1 LTS; CPU: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz; GPU: NVIDIA GeForce GTX TITAN Xp with 12GB of memory |
| Software Dependencies | Yes | Software: Python 3.6.5; NumPy 1.19.2; PyTorch 1.10.1; PyTorch Geometric 2.0.3 (Fey & Lenssen, 2019) |
| Experiment Setup | Yes | The hyper-parameter q in Eq. (7) is 0.7. We adopt the Adam optimizer (Kingma & Ba, 2014). Note that we adopt the default hyperparameter settings following (Fan et al., 2022) for a fair comparison. |
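The setup row fixes the hyper-parameter q in Eq. (7) at 0.7. The paper's exact Eq. (7) is not reproduced here, but a q-parameterized loss of this form is commonly the generalized cross-entropy (GCE) loss, L_q = (1 − p_y^q)/q, which interpolates between standard cross-entropy (q → 0) and MAE (q = 1). The sketch below is a minimal NumPy illustration of that family under this assumption; the function name `gce_loss` and the toy probabilities are hypothetical, not from the paper.

```python
import numpy as np

def gce_loss(probs, labels, q=0.7):
    """Generalized cross-entropy loss: L_q = (1 - p_y^q) / q, averaged
    over the batch. probs: (N, C) predicted class probabilities;
    labels: (N,) integer class indices; q in (0, 1]."""
    p_y = probs[np.arange(len(labels)), labels]  # probability of the true class
    return np.mean((1.0 - p_y ** q) / q)

# Toy batch: one confident and one moderately confident prediction.
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
labels = np.array([0, 1])

loss = gce_loss(probs, labels, q=0.7)
```

As q → 0, (1 − p^q)/q → −log p, recovering ordinary cross-entropy, while larger q down-weights the gradient contribution of low-confidence (potentially mislabeled or spurious) examples, which is why intermediate values such as 0.7 are a common choice.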