Unifews: You Need Fewer Operations for Efficient Graph Neural Networks

Authors: Ningyi Liao, Zihao Yu, Ruixiao Zeng, Siqiang Luo

ICML 2025

Reproducibility assessment: each variable below is listed with its assessed result and the supporting excerpt from the paper (LLM response).
Research Type: Experimental
Evidence: "Extensive experiments demonstrate that UNIFEWS achieves efficiency improvements with comparable or better accuracy, including 10-20x matrix operation reduction and up to 100x acceleration for graphs up to billion-edge scale. ..."
Researcher Affiliation: Academia
Evidence: "College of Computing and Data Science, Nanyang Technological University, Singapore. Correspondence to: Siqiang Luo <EMAIL>."
Pseudocode: Yes
Evidence: "Algorithm 1: UNIFEWS on Decoupled Propagation ... Algorithm 2: UNIFEWS on Iterative GNN"
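The report confirms that pseudocode is present in the paper but does not reproduce it. As a loose, hypothetical sketch of the general idea behind entry-wise pruning during decoupled propagation (the function name, `alpha`, and the threshold `delta` are my own illustration, not the authors' Algorithm 1):

```python
import numpy as np

def prune_and_propagate(adj, x, hops=20, alpha=0.5, delta=1e-3):
    """Illustrative sketch only: threshold-based edge pruning folded into
    decoupled feature propagation. Not the paper's exact algorithm.

    adj:   dense normalized adjacency (n x n), dense purely for brevity
    x:     node feature matrix (n x f)
    delta: pruning threshold; adjacency entries with magnitude below it
           are dropped (set to zero) before each hop
    """
    h = x.copy()
    pruned = adj.copy()
    for _ in range(hops):
        # prune small-magnitude entries, skipping their operations entirely
        pruned = np.where(np.abs(pruned) < delta, 0.0, pruned)
        # personalized-PageRank-style update, chosen only for illustration
        h = alpha * (pruned @ h) + (1 - alpha) * x
    return h, pruned
```

A real implementation would operate on sparse matrices so that pruned entries translate into genuinely skipped multiply-accumulate operations.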
Open Source Code: Yes
Evidence: "Extensive experiments demonstrate that UNIFEWS achieves efficiency improvements with comparable or better accuracy, including 10-20x matrix operation reduction and up to 100x acceleration for graphs up to billion-edge scale. Our code is available at: https://github.com/gdmnl/Unifews."
Open Datasets: Yes
Evidence: "In the main experiment, we adopt 6 representative datasets including 3 small-scale (Kipf & Welling, 2017) and 3 large-scale ones (Hu et al., 2020) considering the applicability of evaluated methods." Table 3 ("Statistics of graph datasets"; f and Nc are the numbers of input attributes and label classes, respectively) lists cora, citeseer, and pubmed (Kipf & Welling, 2017); physics (Shchur et al., 2019); and arxiv, products, and papers100m (Hu et al., 2020).
Dataset Splits: Yes
Evidence: Table 3 reports per-dataset split percentages of nodes in the training/validation/testing sets w.r.t. labeled nodes: cora (Kipf & Welling, 2017) 0.50/0.25/0.25; arxiv (Hu et al., 2020) 0.54/0.18/0.29; products (Hu et al., 2020) 0.08/0.02/0.90; papers100m (Hu et al., 2020) 0.78/0.08/0.14.
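The split percentages in Table 3 can be materialized as index sets. A minimal sketch assuming a uniform random split (the paper may instead use the benchmarks' fixed public splits; `make_split` and its defaults are my own illustration):

```python
import numpy as np

def make_split(num_nodes, fractions=(0.50, 0.25, 0.25), seed=0):
    """Split node indices into train/val/test by the given fractions,
    e.g. cora's 0.50/0.25/0.25 from Table 3. Purely illustrative."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(num_nodes)  # shuffle node indices
    n_train = int(fractions[0] * num_nodes)
    n_val = int(fractions[1] * num_nodes)
    # remaining indices form the test set
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```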
Hardware Specification: Yes
Evidence: "Evaluations are conducted on a server with 32 Intel Xeon CPUs (2.4GHz), an Nvidia A30 GPU (24GB memory), and 512GB RAM."
Software Dependencies: No
Evidence: The paper describes various GNN models (GCN, GAT, SGC, APPNP, etc.) and compression methods as baselines, but it does not explicitly state version numbers for software dependencies such as Python, PyTorch, TensorFlow, or CUDA.
Experiment Setup: Yes
Evidence: "Hyperparameters. We commonly utilize graph normalization r = 0.5, model layer depth L = 2, and layer width f_hidden = 512. For decoupled models, the number of propagation hops is 20. We employ full-batch and mini-batch training for iterative and decoupled methods, respectively. The total number of training epochs is 200, including the pre-training or fine-tuning process in applicable methods. The batch size is 512 for small datasets and 16384 for large ones. We tune the edge and weight sparsity of evaluated models across their entire available ranges."
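The reported hyperparameters can be gathered into a single configuration for quick reference; the key names below are my own, while the values are quoted from the setup above:

```python
# Hyperparameters as reported in the paper's experiment setup.
# Key names are illustrative, not taken from the authors' code.
UNIFEWS_CONFIG = {
    "graph_norm_r": 0.5,        # graph normalization coefficient r
    "num_layers": 2,            # model layer depth L
    "hidden_width": 512,        # layer width f_hidden
    "propagation_hops": 20,     # decoupled models only
    "epochs": 200,              # includes pre-training / fine-tuning
    "batch_size_small": 512,    # small datasets
    "batch_size_large": 16384,  # large datasets
}
```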