Equivalence is All: A Unified View for Self-supervised Graph Learning

Authors: Yejiang Wang, Yuhai Zhao, Zhengkui Wang, Ling Li, Jiapu Wang, Fangting Li, Miaomiao Huang, Shirui Pan, Xingwei Wang

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental "To demonstrate that GALE achieves superior performance over baselines..." "We demonstrate that GALE surpasses SOTA algorithms through experiments on benchmark datasets." "Performance on Graph-level Tasks." "We evaluate the proposed model on both node classification and graph classification tasks. For node classification, we use 8 benchmark datasets... For graph classification, we evaluate on 8 datasets from the TUDataset benchmark..." "We conduct ablation studies on the loss in Eq. (10) using five benchmark datasets..."
Researcher Affiliation Academia "(1) School of Computer Science and Engineering, Northeastern University, China; (2) Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, China; (3) Infocomm Technology Cluster, Singapore Institute of Technology, SIT X NVIDIA AI Centre, Singapore; (4) Shanxi University, China; (5) Hefei University of Technology, China; (6) Griffith University, Australia. Correspondence to: Yuhai Zhao <EMAIL>."
Pseudocode No The paper describes the methodology using prose and mathematical formulations. There are no explicitly labeled sections such as "Pseudocode" or "Algorithm", nor are there structured, code-like blocks detailing a procedure.
Open Source Code No The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets Yes Datasets. We evaluate the proposed model on both node classification and graph classification tasks. For node classification, we use 8 benchmark datasets: Cora, Citeseer, Pubmed (Kipf & Welling, 2017), Wiki-CS, Amazon Computers, Amazon-Photo, Coauthor-CS, and Coauthor Physics (Shchur et al., 2018). For graph classification, we evaluate on 8 datasets from the TUDataset benchmark (Morris et al., 2020), including NCI1, PROTEINS, DD, MUTAG, COLLAB, RDT-B, RDT-M5K, and IMDB-B.
Dataset Splits Yes Protocol. We follow the standard evaluation protocol of previous state-of-the-art self-supervised learning methods. For node classification, we report the mean accuracy on the test set after 50 runs of training. Pretrained node embeddings are used to train a linear neural network for classification. The dataset is split into 10%/10%/80% for training, validation, and testing, respectively. For graph classification, we evaluate the learned graph representations using a linear SVM classifier. We report the mean 10-fold cross-validation accuracy across 5 runs. For each training fold, the linear SVM is tuned using cross-validation, and the best mean accuracy is reported. The dataset is split into 80%/10%/10% for training, validation, and testing, respectively.
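The quoted node-classification protocol (a random 10%/10%/80% train/validation/test split of the nodes, followed by linear evaluation of frozen embeddings) can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the authors' code; the function name, seed, and split order are hypothetical.

```python
import numpy as np

def split_indices(n, train=0.1, val=0.1, seed=0):
    """Randomly split n node indices into 10%/10%/80% train/val/test sets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(train * n)
    n_val = int(val * n)
    # Remaining 80% of the indices form the test set.
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(1000)
```

Under this protocol the classifier (a linear layer for nodes, a linear SVM for graphs) is trained only on the train indices, tuned on validation, and the reported accuracy is measured on the held-out test indices.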
Hardware Specification Yes OOM indicates Out-Of-Memory on a 24GB GPU.
Software Dependencies No We implement both GALE and its variant GALE-APR using PyTorch Geometric. The key difference is that GALE uses Nauty (McKay & Piperno, 2014) for exact automorphisms, while GALE-APR employs PageRank equivalence with α = 0.85.
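The paper does not spell out how GALE-APR's PageRank equivalence is computed, but a standard power-iteration PageRank with damping factor α = 0.85 (the value quoted above) is the natural building block: automorphically equivalent nodes receive identical PageRank scores, so matching scores can serve as a cheap proxy for exact automorphism. The sketch below is an assumption-laden illustration of plain PageRank, not the authors' equivalence definition.

```python
import numpy as np

def pagerank(adj, alpha=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dense adjacency matrix.

    alpha is the damping factor (0.85, as quoted for GALE-APR).
    """
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    deg[deg == 0] = 1.0            # avoid division by zero for isolated nodes
    P = adj / deg[:, None]         # row-stochastic transition matrix
    r = np.full(n, 1.0 / n)        # uniform initial distribution
    for _ in range(max_iter):
        r_new = alpha * (P.T @ r) + (1 - alpha) / n
        if np.abs(r_new - r).sum() < tol:   # L1 convergence check
            return r_new
        r = r_new
    return r
```

On a symmetric graph such as a 4-cycle, every node is automorphic to every other, and this routine assigns all nodes the same score, illustrating why equal PageRank can approximate structural equivalence.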
Experiment Setup Yes Implementation Details. We implement both GALE and its variant GALE-APR using PyTorch Geometric... We adopt the Adam optimizer, tuning learning rates {0.0001, 0.001, 0.01}, batch sizes {16, 64, 128, 256, 512}.
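The quoted tuning grid (3 learning rates × 5 batch sizes = 15 configurations) amounts to a simple grid search; a minimal sketch follows. The `train` function here is a hypothetical stand-in for one pretraining-plus-evaluation run and is not from the paper.

```python
import itertools

# Hyperparameter grid quoted from the paper's implementation details.
learning_rates = [0.0001, 0.001, 0.01]
batch_sizes = [16, 64, 128, 256, 512]

def train(lr, batch_size):
    """Hypothetical stand-in: pretrain with Adam(lr) at this batch size
    and return validation accuracy."""
    return 0.0  # placeholder

# Exhaustively evaluate all 15 (lr, batch_size) configurations and keep
# the one with the best validation accuracy.
best = max(itertools.product(learning_rates, batch_sizes),
           key=lambda cfg: train(*cfg))
```

In practice each configuration would be scored on the validation split described under the evaluation protocol, with the selected model reported on the test split.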