IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning

Authors: Jiawen Qin, Haonan Yuan, Qingyun Sun, Lyujin Xu, Jiaqi Yuan, Pengfeng Huang, Zhaonan Wang, Xingcheng Fu, Hao Peng, Jianxin Li, Philip Yu

ICLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental To bridge this gap, we introduce IGL-Bench, a foundational comprehensive benchmark for imbalanced graph learning, covering 17 diverse graph datasets and 24 distinct IGL algorithms with uniform data processing and splitting strategies. Specifically, IGL-Bench systematically investigates state-of-the-art IGL algorithms in terms of effectiveness, robustness, and efficiency on node-level and graph-level tasks, within the scope of class imbalance and topology imbalance. Extensive experiments demonstrate the potential benefits of IGL algorithms under various imbalanced conditions, offering insights and opportunities in the IGL field.
Researcher Affiliation Academia 1Beihang University 2Guangxi Normal University 3University of Illinois, Chicago
Pseudocode No The paper describes methods and experiments in natural language but does not contain any explicitly labeled pseudocode or algorithm blocks. It refers to specific algorithms by name, but their procedures are explained descriptively rather than formally structured as pseudocode.
Open Source Code Yes Further, we have developed an open-sourced and unified package to facilitate reproducible evaluation and inspire further innovative research, available at: https://github.com/RingBDStack/IGL-Bench.
Open Datasets Yes To comprehensively and effectively evaluate the performance of IGL algorithms, we have integrated 17 real-world datasets from various domains for both the node-level and graph-level tasks. We briefly introduce each category in the following sections. More details are provided in Appendix A.1. ... Cora [65] CiteSeer [65] PubMed [65] ... ogbn-arXiv [14]
Dataset Splits Yes To achieve this, we conduct node and graph classifications, where the train/val/test split satisfies the consistent ratio of 1:1:8. We induce dataset imbalance with the imbalance ratio ρ following the definitions in Table 1, providing a fair comparison under the same imbalance degree.
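The 1:1:8 train/val/test ratio can be sketched as a simple index split. This is a hypothetical helper for illustration, not the IGL-Bench implementation, and it ignores the per-class imbalance ratio ρ applied on top of the split.

```python
import random

def split_indices(n, seed=0):
    """Shuffle n sample indices and split them 1:1:8 into
    train/val/test, mirroring the benchmark's stated ratio.
    (Illustrative sketch only; not the IGL-Bench code.)"""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = n // 10   # 10% train
    n_val = n // 10     # 10% val; remaining 80% test
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test
```

In practice the split would be stratified per class so that ρ (the ratio between the largest and smallest class) is controlled inside the training set.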
Hardware Specification Yes GPU: NVIDIA Tesla A100 SMX4 with 40GB of Memory. CPU: Intel(R) Xeon(R) Platinum 8358 CPU@2.60GHz with 1TB DDR4 of Memory.
Software Dependencies Yes Software: CUDA 10.1, Python 3.8.12, PyTorch (Paszke et al., 2019) 1.9.1, PyTorch Geometric (Fey & Lenssen, 2019) 2.0.1.
Experiment Setup Yes The number of training epochs for optimizing all IGL algorithms is set to 1000. We adopt the early stopping strategy, i.e., stop training if the performance on the validation set does not improve for 50 epochs. All parameters are randomly initialized. We adopt Adam (Kingma & Ba, 2015) with an appropriate learning rate and weight decay for the best performance on the validation split. We randomly run all the experiments 10 times, and report the average results with standard deviations.
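The early-stopping protocol described above (up to 1000 epochs, patience of 50 on validation performance) can be sketched as a generic loop. This is an illustrative sketch under stated assumptions, not the IGL-Bench training code; `step_fn` is a hypothetical callback that runs one epoch and returns a validation score.

```python
def train_with_early_stopping(step_fn, max_epochs=1000, patience=50):
    """Run step_fn once per epoch; stop when the validation score
    has not improved for `patience` consecutive epochs.
    Returns the best score and the epoch it was achieved at."""
    best_score, best_epoch = float("-inf"), 0
    for epoch in range(max_epochs):
        val_score = step_fn(epoch)  # one epoch of training + validation
        if val_score > best_score:
            best_score, best_epoch = val_score, epoch
        elif epoch - best_epoch >= patience:
            break  # patience exhausted
    return best_score, best_epoch
```

Reporting would then repeat this loop 10 times with different random seeds and average the resulting test metrics, as the paper states.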