Equivalence is All: A Unified View for Self-supervised Graph Learning

Authors: Yejiang Wang, Yuhai Zhao, Zhengkui Wang, Ling Li, Jiapu Wang, Fangting Li, Miaomiao Huang, Shirui Pan, Xingwei Wang

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental "To demonstrate that GALE achieves superior performance over baselines..." "We demonstrate that GALE surpasses SOTA algorithms through experiments on benchmark datasets." "Performance on Graph-level Tasks." "We evaluate the proposed model on both node classification and graph classification tasks. For node classification, we use 8 benchmark datasets... For graph classification, we evaluate on 8 datasets from the TUDataset benchmark..." "We conduct ablation studies on the loss in Eq. (10) using five benchmark datasets..."
Researcher Affiliation Academia "(1) School of Computer Science and Engineering, Northeastern University, China; (2) Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, China; (3) Infocomm Technology Cluster, Singapore Institute of Technology, SIT X NVIDIA AI Centre, Singapore; (4) Shanxi University, China; (5) Hefei University of Technology, China; (6) Griffith University, Australia. Correspondence to: Yuhai Zhao <EMAIL>."
Pseudocode No The paper describes the methodology using prose and mathematical formulations. There are no explicitly labeled sections such as "Pseudocode" or "Algorithm", nor are there structured, code-like blocks detailing a procedure.
Open Source Code No The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets Yes Datasets. We evaluate the proposed model on both node classification and graph classification tasks. For node classification, we use 8 benchmark datasets: Cora, Citeseer, Pubmed (Kipf & Welling, 2017), Wiki-CS, Amazon Computers, Amazon-Photo, Coauthor-CS, and Coauthor Physics (Shchur et al., 2018). For graph classification, we evaluate on 8 datasets from the TUDataset benchmark (Morris et al., 2020), including NCI1, PROTEINS, DD, MUTAG, COLLAB, RDT-B, RDT-M5K, and IMDB-B.
Dataset Splits Yes Protocol. We follow the standard evaluation protocol of previous state-of-the-art self-supervised learning methods. For node classification, we report the mean accuracy on the test set after 50 runs of training. Pretrained node embeddings are used to train a linear neural network for classification. The dataset is split into 10%/10%/80% for training, validation, and testing, respectively. For graph classification, we evaluate the learned graph representations using a linear SVM classifier. We report the mean 10-fold cross-validation accuracy across 5 runs. For each training fold, the linear SVM is tuned using cross-validation, and the best mean accuracy is reported. The dataset is split into 80%/10%/10% for training, validation, and testing, respectively.
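The quoted node-classification protocol (a random 10%/10%/80% train/validation/test split of the nodes, followed by linear evaluation of frozen embeddings) can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the authors' code; the function name, seed, and split order are hypothetical.

```python
import numpy as np

def split_indices(n, train=0.1, val=0.1, seed=0):
    """Randomly split n node indices into 10%/10%/80% train/val/test sets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(train * n)
    n_val = int(val * n)
    # Remaining 80% of the indices form the test set.
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(1000)
```

Under this protocol the classifier (a linear layer for nodes, a linear SVM for graphs) is trained only on the train indices, tuned on validation, and the reported accuracy is measured on the held-out test indices.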
Hardware Specification Yes OOM indicates Out-Of-Memory on a 24GB GPU.
Software Dependencies No We implement both GALE and its variant GALE-APR using PyTorch Geometric. The key difference is that GALE uses Nauty (McKay & Piperno, 2014) for exact automorphisms, while GALE-APR employs PageRank equivalence with α = 0.85.
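The paper does not spell out how GALE-APR's PageRank equivalence is computed, but a standard power-iteration PageRank with damping factor α = 0.85 (the value quoted above) is the natural building block: automorphically equivalent nodes receive identical PageRank scores, so matching scores can serve as a cheap proxy for exact automorphism. The sketch below is an assumption-laden illustration of plain PageRank, not the authors' equivalence definition.

```python
import numpy as np

def pagerank(adj, alpha=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dense adjacency matrix.

    alpha is the damping factor (0.85, as quoted for GALE-APR).
    """
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    deg[deg == 0] = 1.0            # avoid division by zero for isolated nodes
    P = adj / deg[:, None]         # row-stochastic transition matrix
    r = np.full(n, 1.0 / n)        # uniform initial distribution
    for _ in range(max_iter):
        r_new = alpha * (P.T @ r) + (1 - alpha) / n
        if np.abs(r_new - r).sum() < tol:   # L1 convergence check
            return r_new
        r = r_new
    return r
```

On a symmetric graph such as a 4-cycle, every node is automorphic to every other, and this routine assigns all nodes the same score, illustrating why equal PageRank can approximate structural equivalence.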
Experiment Setup Yes Implementation Details. We implement both GALE and its variant GALE-APR using PyTorch Geometric... We adopt the Adam optimizer, tuning learning rates {0.0001, 0.001, 0.01}, batch sizes {16, 64, 128, 256, 512}.
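The quoted tuning grid (3 learning rates × 5 batch sizes = 15 configurations) amounts to a simple grid search; a minimal sketch follows. The `train` function here is a hypothetical stand-in for one pretraining-plus-evaluation run and is not from the paper.

```python
import itertools

# Hyperparameter grid quoted from the paper's implementation details.
learning_rates = [0.0001, 0.001, 0.01]
batch_sizes = [16, 64, 128, 256, 512]

def train(lr, batch_size):
    """Hypothetical stand-in: pretrain with Adam(lr) at this batch size
    and return validation accuracy."""
    return 0.0  # placeholder

# Exhaustively evaluate all 15 (lr, batch_size) configurations and keep
# the one with the best validation accuracy.
best = max(itertools.product(learning_rates, batch_sizes),
           key=lambda cfg: train(*cfg))
```

In practice each configuration would be scored on the validation split described under the evaluation protocol, with the selected model reported on the test split.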