Understanding Class Bias Amplification in Graph Representation Learning
Authors: Shengzhong Zhang, Wenjie Yang, Yimin Zhang, Hongwei Zhang, Zengfeng Huang
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on various datasets demonstrate the advantage of our method when dealing with class bias amplification. |
| Researcher Affiliation | Academia | Shengzhong Zhang (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics); Wenjie Yang, Yimin Zhang, Hongwei Zhang (School of Data Science, Fudan University); Zengfeng Huang (School of Data Science, Fudan University; Shanghai Innovation Institution) |
| Pseudocode | Yes | A.4 Random Graph Coarsening Algorithm. Algorithm 1 is a detailed description of our random graph coarsening algorithm. Algorithm 1 Random Graph Coarsening. Input: G = (A, X), threshold δ, the coarsening ratio r, number of nodes n. Output: G′ = (A′, X′) |
| Open Source Code | Yes | Our code is available at: https://github.com/szzhang17/Understanding-Class-Bias-Amplification-in-Graph-Representation-Learning. |
| Open Datasets | Yes | The results are evaluated on six real-world datasets (Kipf & Welling, 2017; Veličković et al., 2018; Zhu et al., 2021; Hu et al., 2020): Cora, Citeseer, Pubmed, Amazon Computers, Amazon Photo, and Ogbn-Arxiv. ... For Ogbn-Arxiv, we use fixed data splits as in previous studies (Hu et al., 2020). |
| Dataset Splits | Yes | On the small-scale datasets Cora, Citeseer, Pubmed, Photo, and Computers, performance is evaluated on random splits. We select 20 labeled nodes per class for training, while the remaining nodes are used for testing. For Ogbn-Arxiv, we use fixed data splits as in previous studies (Hu et al., 2020). |
| Hardware Specification | Yes | Experiments are conducted on a server with NVIDIA 3090 GPU (24 GB memory), NVIDIA A6000 GPU (48 GB memory) and Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz. |
| Software Dependencies | No | All the algorithms and models are implemented in Python and PyTorch Geometric. No specific version numbers for Python or PyTorch Geometric are provided. |
| Experiment Setup | Yes | A.5 Experimental details. For all unsupervised models, the learned representations are evaluated by training and testing a logistic regression classifier except for Ogbn-Arxiv. ... The detailed hyperparameter settings are listed in Table 8 (epoch / learning rate / α / β): Cora 25 / 0.01 / 15000 / 500; Citeseer 200 / 0.0002 / 15000 / 500; Pubmed 25 / 0.02 / 20000 / 200; Amazon-Photo 20 / 0.001 / 100000 / 100000; Amazon-Computers 20 / 0.0002 / 20000 / 20000; Ogbn-Arxiv 10 / 0.0001 / 2000000 / 200000. |
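The pseudocode row above quotes the paper's Algorithm 1 (random graph coarsening) only by its input/output signature. A minimal sketch of what a random coarsening of that shape could look like follows; it contracts randomly chosen edges until roughly r·n supernodes remain. This is an illustration, not the paper's algorithm: the degree threshold δ is omitted, and the function name, union-find bookkeeping, and mean-pooling of features are all assumptions.

```python
import numpy as np

def random_graph_coarsening(A, X, r, seed=0):
    """Illustrative random coarsening (NOT the paper's Algorithm 1):
    contract random edges until about r * n supernodes remain;
    the threshold delta from the paper is omitted in this sketch."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    parent = np.arange(n)              # union-find forest over nodes

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving
            u = parent[u]
        return u

    edges = np.argwhere(np.triu(A) > 0)     # undirected edges (u < v)
    rng.shuffle(edges)
    n_super = n
    target = int(np.ceil(r * n))
    for u, v in edges:
        if n_super <= target:
            break
        ru, rv = find(u), find(v)
        if ru != rv:                   # contract edge (u, v)
            parent[rv] = ru
            n_super -= 1

    # relabel supernodes 0..c-1 and build the membership matrix P
    roots = np.array([find(u) for u in range(n)])
    _, labels = np.unique(roots, return_inverse=True)
    P = np.zeros((n, labels.max() + 1))
    P[np.arange(n), labels] = 1.0

    A_c = P.T @ A @ P                       # coarsened (weighted) adjacency
    X_c = (P.T @ X) / P.sum(0)[:, None]     # mean-pool node features
    return A_c, X_c
```

On a connected graph this reaches the target supernode count whenever enough contractible edges exist; on disconnected graphs it simply stops when the shuffled edge list is exhausted.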
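The dataset-splits row describes the random-split protocol on the small datasets: 20 labeled nodes per class for training, the rest for testing. A minimal sketch of that sampling step, assuming a NumPy label vector (the function name and seeding convention are illustrative, not from the paper):

```python
import numpy as np

def per_class_split(y, k=20, seed=0):
    """Sample k labeled nodes per class as the training set;
    all remaining nodes form the test set (random-split protocol)."""
    rng = np.random.default_rng(seed)
    train = []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)        # all nodes of class c
        train.extend(rng.choice(idx, size=k, replace=False))
    train = np.array(sorted(train))
    test = np.setdiff1d(np.arange(len(y)), train)
    return train, test
```

The Ogbn-Arxiv experiments instead reuse the fixed splits of Hu et al. (2020), so no such sampling is needed there.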