Bonsai: Gradient-free Graph Condensation for Node Classification

Authors: Mridul Gupta, Samyak Jain, Vansh Ramani, Hariprasad Kodamana, Sayan Ranu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we benchmark Bonsai and establish: Superior accuracy: Bonsai consistently outperforms existing baselines in terms of accuracy across various compression factors, datasets, and GNN architectures. Enhanced computation and energy efficiency: on average, Bonsai is at least 7 times faster and 17 times more energy efficient than the state-of-the-art baselines. Increased robustness: unlike existing methods that require tuning condensation-specific hyperparameters for each combination of GNN architecture, dataset, and compression ratio, Bonsai achieves superior performance using a single set of parameters across all scenarios. Our implementation is available at https://github.com/idea-iitd/Bonsai.
Researcher Affiliation | Academia | 1: Yardi School of Artificial Intelligence, 2: Department of Computer Science, 3: Department of Chemical Engineering, Indian Institute of Technology Delhi, New Delhi, 110016, India; 4: Indian Institute of Technology Delhi, Abu Dhabi, Zayed City, Abu Dhabi, UAE. {mridul.gupta@scai, cs5200667@, cs5230804@, kodamana@, sayanranu@cse}.iitd.ac.in
Pseudocode | Yes |
Algorithm 1: The greedy approach
Require: graph G, budget b, Rev-k-NN sets of computation trees T_v^L
Ensure: solution set A with |A| = b
1: A ← ∅
2: while |A| < b (within budget) do
3:     T_v^L ← arg max_{T_v^L ∈ T \ A} { Π(A ∪ {T_v^L}) − Π(A) }
4:     A ← A ∪ {T_v^L}
5: return A
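The greedy loop in Algorithm 1 can be sketched in plain Python. Here `utility` is a hypothetical stand-in for the paper's set utility Π (in Bonsai, coverage over Rev-k-NN sets of computation trees); the toy coverage data and all names below are illustrative assumptions, not the authors' code.

```python
def greedy_select(candidates, budget, utility):
    """Repeatedly add the candidate with the largest marginal gain
    utility(A ∪ {c}) - utility(A) until the budget b is reached."""
    selected = set()
    while len(selected) < budget:
        best, best_gain = None, float("-inf")
        for c in candidates - selected:
            gain = utility(selected | {c}) - utility(selected)
            if gain > best_gain:
                best, best_gain = c, gain
        selected.add(best)
    return selected

# Toy utility: each candidate "covers" a set of nodes; Π(A) = size of the union.
coverage = {1: {0, 1}, 2: {1, 2, 3}, 3: {3, 4}}

def util(A):
    return len(set().union(*(coverage[c] for c in A)))

picked = greedy_select(set(coverage), budget=2, utility=util)
```

For a monotone submodular Π such as set coverage, this greedy scheme carries the classic (1 − 1/e) approximation guarantee, which is the usual motivation for budgeted greedy selection.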
Open Source Code | Yes | Our implementation is available at https://github.com/idea-iitd/Bonsai.
Open Datasets | Yes | Datasets: Table 3 lists the benchmark datasets used.
Dataset                          # Nodes    # Edges      # Classes  # Features
Cora (Kipf & Welling, 2017)      2,708      10,556       7          1,433
Citeseer (Kipf & Welling, 2017)  3,327      9,104        6          3,703
Pubmed (Kipf & Welling, 2017)    19,717     88,648       3          500
Flickr (Zeng et al., 2020)       89,250     899,756      7          500
Ogbn-arxiv (Hu et al., 2021)     169,343    2,315,598    40         128
Reddit (Hamilton et al., 2017)   232,965    23,213,838   41         602
MAG240M (Hu et al., 2021)        1,398,159  26,434,726   153        768
Dataset Splits | Yes | Across all datasets except MAG240M, we maintain a train-validation-test split ratio of 60:20:20. In MAG240M, we use a ratio of 80:10:10.
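A minimal sketch of how such a node-level split could be produced; the function name, the fixed seed, and the integer-truncation choice are assumptions for illustration, not the authors' code.

```python
import random

def split_indices(num_nodes, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle node indices and cut them into train/val/test by the given ratios."""
    idx = list(range(num_nodes))
    random.Random(seed).shuffle(idx)
    n_train = int(ratios[0] * num_nodes)
    n_val = int(ratios[1] * num_nodes)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# e.g., Cora's 2,708 nodes with the 60:20:20 ratio used in the paper
train, val, test = split_indices(2708)
```

Assigning any leftover nodes from truncation to the test partition keeps the three parts disjoint and exhaustive.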
Hardware Specification | Yes | All experiments were conducted on a high-performance computing system with the following specifications: CPU: 96 logical cores; RAM: 512 GB; GPU: NVIDIA A100-PCIE-40GB.
Software Dependencies | Yes | Operating System: Linux (Ubuntu 20.04.4 LTS, GNU/Linux 5.4.0-124-generic x86_64); PyTorch version: 1.13.1+cu117; CUDA version: 11.7; PyTorch Geometric version: 2.3.1.
Experiment Setup | Yes | The specifics of our experimental setup, including hardware and software environment, and hyperparameters are detailed in App. B. For the baseline algorithms, we use the code shared by their respective authors. We conduct each experiment 5 times and report the means and standard deviations. ... Number of layers in evaluation models: 2 (with ReLU in between) for GCN, GAT, and GIN. The MLP used in GIN is a simple linear transform with bias, defined by WX + b, where X is the input design matrix. Value of k in Rev-k-NN: 5. Hyperparameters for baselines: we use the config files shared by the authors. We note that the benchmark datasets are common between our experiments and those used in the baselines.
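As a rough illustration of the 2-layer evaluation model described above (GCN-style propagation with ReLU in between, and a WX + b linear map per layer), here is a NumPy sketch. The symmetric normalization with self-loops is the standard GCN formulation and is assumed here; it is not taken from the authors' implementation.

```python
import numpy as np

def gcn_layer(A_hat, X, W, b):
    """One graph-convolution step: aggregate neighbor features, then apply WX + b."""
    return A_hat @ X @ W + b

def two_layer_gcn(A, X, W1, b1, W2, b2):
    # Standard GCN normalization: add self-loops, then scale by degrees symmetrically.
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    A_hat = A_tilde / np.sqrt(np.outer(d, d))
    H = np.maximum(gcn_layer(A_hat, X, W1, b1), 0.0)  # ReLU between the two layers
    return gcn_layer(A_hat, H, W2, b2)                # class logits per node

# Tiny 2-node example with identity features and all-ones weights.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
out = two_layer_gcn(A, np.eye(2), np.ones((2, 3)), 0.0, np.ones((3, 2)), 0.0)
```

The output has one row of class logits per node; in the paper's setup these would feed a softmax over the dataset's classes.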