Out-of-Distribution Generalization on Graphs via Progressive Inference

Authors: Yiming Xu, Bin Shi, Zhen Peng, Huixiang Liu, Bo Dong, Chen Chen

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments demonstrate that our proposed GPro outperforms the state-of-the-art methods by 4.91% on average. For datasets with more severe distribution shifts, the performance improvement can be up to 6.86%."
Researcher Affiliation | Academia | (1) School of Computer Science and Technology, Xi'an Jiaotong University; (2) Shaanxi Provincial Key Laboratory of Big Data Knowledge Engineering, Xi'an Jiaotong University; (3) School of Distance Education, Xi'an Jiaotong University; (4) University of Virginia, Charlottesville, Virginia, USA
Pseudocode | No | The paper describes the methodology in text and mathematical equations but does not include a clearly labeled pseudocode or algorithm block in the main text. It notes that "The details of our algorithm are summarized in the Appendix," but the appendix was not available for analysis.
Open Source Code | Yes | https://github.com/yimingxu24/GPro
Open Datasets | Yes | "We use three benchmark graph classification datasets in causal learning (Fan et al. 2022), namely CMNIST-75sp, CFashion-75sp, and CKuzushiji-75sp, to evaluate the performance of the models on out-of-distribution (OOD) problems."
Dataset Splits | Yes | The datasets are divided into training, validation, and test sets in a 10K:5K:10K ratio.
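The 10K:5K:10K protocol can be illustrated with a minimal index-based split. This is a hypothetical sketch, not the paper's loading code; the actual benchmarks (CMNIST-75sp and variants) ship with predefined splits.

```python
# Hypothetical contiguous index split mirroring the reported
# 10K:5K:10K train/validation/test protocol.
n_train, n_val, n_test = 10_000, 5_000, 10_000
indices = list(range(n_train + n_val + n_test))

train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]
```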
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU model, CPU, memory) used to run the experiments.
Software Dependencies | No | The paper mentions the Adam optimizer and GCN but does not give version numbers for these or other software dependencies (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | We use the Adam optimizer (Kingma and Ba 2014) with a learning rate of 0.01. For Eq. (7) and Eq. (8), we use a 2-layer GCN (Kipf and Welling 2017) with 146 hidden dimensions as the encoder. We train GPro for 200 epochs and add the L_cou loss at the 100th epoch. The batch size is 256. The default number of causal and non-causal substructure context inference blocks is 2, with ρ values of 0.9 and 0.8, respectively. We set q of the GCE loss to 0.7 to amplify the focus on the non-causal part, with λ1 = 15, λ2 = 0.01, and λ3 = 1.
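The GCE term referenced above is the generalized cross-entropy of Zhang and Sabuncu (2018), L_GCE = (1 - p_y^q) / q, which interpolates between cross-entropy (q → 0) and MAE (q = 1). Below is a minimal NumPy sketch with the reported q = 0.7; the function name and inputs are illustrative assumptions, and GPro's actual implementation lives in the linked repository.

```python
import numpy as np

def gce_loss(probs, labels, q=0.7):
    """Generalized cross-entropy: mean of (1 - p_y^q) / q,
    where p_y is the predicted probability of the true class."""
    p_y = probs[np.arange(len(labels)), labels]
    return float(np.mean((1.0 - p_y ** q) / q))

# Compared with cross-entropy, GCE down-weights the gradient from
# low-confidence examples, which here amplifies focus on the
# non-causal part of the graph.
probs = np.array([[0.9, 0.1],
                  [0.5, 0.5]])
labels = np.array([0, 1])
loss = gce_loss(probs, labels, q=0.7)
```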