IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning

Authors: Jiawen Qin, Haonan Yuan, Qingyun Sun, Lyujin Xu, Jiaqi Yuan, Pengfeng Huang, Zhaonan Wang, Xingcheng Fu, Hao Peng, Jianxin Li, Philip Yu

ICLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental To bridge this gap, we introduce IGL-Bench, a foundational comprehensive benchmark for imbalanced graph learning, covering 17 diverse graph datasets and 24 distinct IGL algorithms with uniform data processing and splitting strategies. Specifically, IGL-Bench systematically investigates state-of-the-art IGL algorithms in terms of effectiveness, robustness, and efficiency on node-level and graph-level tasks, within the scope of class imbalance and topology imbalance. Extensive experiments demonstrate the potential benefits of IGL algorithms under various imbalanced conditions, offering insights and opportunities in the IGL field.
Researcher Affiliation Academia 1Beihang University 2Guangxi Normal University 3University of Illinois, Chicago
Pseudocode No The paper describes methods and experiments in natural language but does not contain any explicitly labeled pseudocode or algorithm blocks. It refers to specific algorithms by name, but their procedures are explained descriptively rather than formally structured as pseudocode.
Open Source Code Yes Further, we have developed an open-sourced and unified package to facilitate reproducible evaluation and inspire further innovative research, available at: https://github.com/RingBDStack/IGL-Bench.
Open Datasets Yes To comprehensively and effectively evaluate the performance of IGL algorithms, we have integrated 17 real-world datasets from various domains for both the node-level and graph-level tasks. We briefly introduce each category in the following sections. More details are provided in Appendix A.1. ... Cora [65] CiteSeer [65] PubMed [65] ... ogbn-arXiv [14]
Dataset Splits Yes To achieve this, we conduct node and graph classifications, where the train/val/test split satisfies the consistent ratio of 1:1:8. We induce dataset imbalance with the imbalance ratio ρ following the definitions in Table 1, providing a fair comparison under the same imbalance degree.
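The 1:1:8 train/val/test ratio can be sketched as a simple index split. This is a hypothetical helper for illustration, not the IGL-Bench implementation, and it ignores the per-class imbalance ratio ρ applied on top of the split.

```python
import random

def split_indices(n, seed=0):
    """Shuffle n sample indices and split them 1:1:8 into
    train/val/test, mirroring the benchmark's stated ratio.
    (Illustrative sketch only; not the IGL-Bench code.)"""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = n // 10   # 10% train
    n_val = n // 10     # 10% val; remaining 80% test
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test
```

In practice the split would be stratified per class so that ρ (the ratio between the largest and smallest class) is controlled inside the training set.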
Hardware Specification Yes GPU: NVIDIA Tesla A100 SMX4 with 40GB of Memory. CPU: Intel(R) Xeon(R) Platinum 8358 CPU@2.60GHz with 1TB DDR4 of Memory.
Software Dependencies Yes Software: CUDA 10.1, Python 3.8.12, PyTorch (Paszke et al., 2019) 1.9.1, PyTorch Geometric (Fey & Lenssen, 2019) 2.0.1.
Experiment Setup Yes The number of training epochs for optimizing all IGL algorithms is set to 1000. We adopt the early stopping strategy, i.e., stop training if the performance on the validation set does not improve for 50 epochs. All parameters are randomly initialized. We adopt Adam (Kingma & Ba, 2015) with an appropriate learning rate and weight decay for the best performance on the validation split. We randomly run all the experiments 10 times, and report the average results with standard deviations.
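The early-stopping protocol described above (up to 1000 epochs, patience of 50 on validation performance) can be sketched as a generic loop. This is an illustrative sketch under stated assumptions, not the IGL-Bench training code; `step_fn` is a hypothetical callback that runs one epoch and returns a validation score.

```python
def train_with_early_stopping(step_fn, max_epochs=1000, patience=50):
    """Run step_fn once per epoch; stop when the validation score
    has not improved for `patience` consecutive epochs.
    Returns the best score and the epoch it was achieved at."""
    best_score, best_epoch = float("-inf"), 0
    for epoch in range(max_epochs):
        val_score = step_fn(epoch)  # one epoch of training + validation
        if val_score > best_score:
            best_score, best_epoch = val_score, epoch
        elif epoch - best_epoch >= patience:
            break  # patience exhausted
    return best_score, best_epoch
```

Reporting would then repeat this loop 10 times with different random seeds and average the resulting test metrics, as the paper states.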