Graph Structure Refinement with Energy-based Contrastive Learning

Authors: Xianlin Zeng, Yufeng Wang, Yuqi Sun, Guodong Guo, Wenrui Ding, Baochang Zhang

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments demonstrate that ECL-GSR outperforms the state-of-the-art on eight benchmark datasets in node classification. ECL-GSR achieves faster training with fewer samples and less memory than the leading baseline, highlighting its simplicity and efficiency in downstream tasks. We conduct comprehensive experiments to sequentially evaluate the proposed framework's effectiveness, complexity, and robustness, addressing five research questions: RQ1: How effective is ECL-GSR on the node classification task? RQ2: How efficient is ECL-GSR in terms of training time and space? RQ3: How do the ECL architecture and its hyperparameters impact the performance of node-level representation learning? RQ4: How robust is ECL-GSR in the face of structural attacks or noise? RQ5: What kind of refined structure does ECL-GSR learn?
Researcher Affiliation Collaboration Xianlin Zeng (1,2), Yufeng Wang (1), Yuqi Sun (1), Guodong Guo (3), Wenrui Ding (1), Baochang Zhang (1). 1: Beihang University, Beijing, P.R. China; 2: Postdoctoral Research Station at China Rong Tong Academy of Sciences Group Corporation Limited, Beijing, P.R. China; 3: Ningbo Institute of Digital Twin, Eastern Institute of Technology, Ningbo, P.R. China. EMAIL, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode Yes The pseudocode of ECL-GSR is illustrated in Algorithm 1.
Open Source Code No The paper does not explicitly provide a statement about releasing code, nor does it include a link to a code repository for the described methodology.
Open Datasets Yes Datasets For extensive comparison, we execute experiments on eight benchmark datasets: four citation networks (Cora, Citeseer (Sen et al. 2008), Pubmed (Namata et al. 2012), and OGB-Arxiv (Hu et al. 2020)), three webpage graphs (Cornell, Texas, and Wisconsin (Pei et al. 2020)), and one actor co-occurrence network (Actor (Tang et al. 2009)).
Dataset Splits Yes Evaluation on standard splits As stated in Table 1, three key observations can be made: i) ECL-GSR shows robust performance across all benchmark datasets, demonstrating its superior generalizability to diverse data. Notably, across the eight datasets, ECL-GSR achieves the state-of-the-art with margins ranging from 0.15% to 1.61% over the second-highest approach. ... Evaluation on different train ratios In Table 2, we conduct experiments on the Cora and Citeseer datasets with varying amounts of supervised information, specifically at training ratios of 1%, 3%, 5%, and 10%.
Hardware Specification Yes Our framework operates on an Ubuntu system with an NVIDIA GeForce 3090 GPU, employing PyTorch 1.12.1, DGL 1.1.0, and Python 3.9.16.
Software Dependencies Yes Our framework operates on an Ubuntu system with an NVIDIA GeForce 3090 GPU, employing PyTorch 1.12.1, DGL 1.1.0, and Python 3.9.16.
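The stated software stack can be recreated with pinned installs. A minimal sketch, assuming a Python 3.9.16 interpreter is already active (e.g. via conda) and that the standard PyPI distributions are used; the CUDA-specific wheel selection for the GPU is left to the reader:

```shell
# Pin the versions reported in the paper's environment description.
# These version numbers come from the report; the plain PyPI wheels
# are an assumption (CUDA builds may need the vendors' index URLs).
pip install torch==1.12.1
pip install dgl==1.1.0
```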
Experiment Setup Yes Subgraph sampling batch size N is fixed at 64 for efficiency considerations. In ECL, the backbone fθ(·) is divided into ϕθ(·) for encoding, utilizing three GCN layers with hidden and output dimension F̃ of 128, and φθ(·) for projection, comprising two fully-connected layers with an output dimension F of 128. The learned representation Z̃ is produced by ϕθ(·). Batch normalization is discarded when utilizing SGLD. The data augmentation operator T is a random Gaussian blur. For node classification, the classifier Cθ(·) mirrors the architecture of ϕθ(·). Our model's final hyperparameters are set as: α=0.1, β=0.01, µ=0.01, and τ=0.1. We adopt the Adam optimizer with an initial learning rate of 0.001, halved every 20 epochs. The number of epochs P for Cora, Citeseer, Cornell, Texas, and Wisconsin is 40, and for Actor, Pubmed, and OGB-Arxiv it is 80. The number of SGLD iterations K takes only 3 steps.
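The setup quote above can be collected into a single configuration sketch. This is not from an official codebase; the key names are illustrative, and only the values are taken from the reported setup (the learning-rate helper simply encodes "halved every 20 epochs"):

```python
# Hyperparameters as stated in the experiment setup (sketch; key names
# are hypothetical, values are those reported in the paper).
ECL_GSR_CONFIG = {
    "batch_size": 64,        # subgraph sampling batch size N
    "encoder_layers": 3,     # GCN layers in the encoder phi_theta
    "hidden_dim": 128,       # hidden/output dimension of the encoder
    "proj_layers": 2,        # fully-connected projection head phi_theta
    "proj_dim": 128,         # projection output dimension F
    "alpha": 0.1,
    "beta": 0.01,
    "mu": 0.01,
    "tau": 0.1,
    "lr": 1e-3,              # Adam, halved every 20 epochs
    "lr_halve_every": 20,
    "epochs": {"Cora": 40, "Citeseer": 40, "Cornell": 40, "Texas": 40,
               "Wisconsin": 40, "Actor": 80, "Pubmed": 80, "OGB-Arxiv": 80},
    "sgld_steps": 3,         # SGLD iterations K
}

def lr_at_epoch(epoch, base_lr=1e-3, halve_every=20):
    """Learning rate under the reported step schedule: halve every `halve_every` epochs."""
    return base_lr * 0.5 ** (epoch // halve_every)
```

For example, `lr_at_epoch(0)` gives the initial 0.001, while `lr_at_epoch(20)` gives 0.0005, matching the "halved every 20 epochs" schedule.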