Aggregation Buffer: Revisiting DropEdge with a New Parameter Block

Authors: Dooho Lee, Myeong Kong, Sagad Hamid, Cheonwoo Lee, Jaemin Yoo

ICML 2025

Reproducibility checklist. Each entry lists the variable, the assessed result, and the supporting LLM response: a quote from the paper where evidence exists, or a note where it does not.
Research Type: Experimental
Evidence: "We demonstrate the effectiveness of AGGB in improving the robustness and overall accuracy of GNNs across 12 node classification benchmarks. In addition, we show that AGGB works as a unifying solution to structural inconsistencies such as degree bias and structural disparity, both of which arise from structural variations in graph datasets. ... We evaluate the accuracy of node classification for 12 widely-used benchmark graphs... All reported performances, including the ablation studies, are averaged over ten independent runs with different random seeds and splits..."
Researcher Affiliation: Academia
Evidence: "¹School of Electrical Engineering, KAIST, Daejeon, Republic of Korea. ²Computer Science Department, University of Münster, Münster, Germany. Correspondence to: Jaemin Yoo <EMAIL>."
Pseudocode: No
Note: The paper includes theoretical analysis and mathematical formulations but does not contain a clearly labeled "Pseudocode" or "Algorithm" block.
Open Source Code: Yes
Evidence: "Code and datasets are available at https://github.com/dooho00/agg-buffer."
Open Datasets: Yes
Evidence: "We evaluate the accuracy of node classification for 12 widely-used benchmark graphs, including Cora, Citeseer, Pubmed, Computers, Photo, CS, Physics, Ogbn-arxiv, Actor, Squirrel and Chameleon (Shchur et al., 2018; Hu et al., 2020a; Pei et al., 2020). For Squirrel and Chameleon, we use the filtered versions provided by Platonov et al. (2023) via their public repository: https://github.com/yandex-research/heterophilous-graphs. The remaining 10 datasets are sourced from the Deep Graph Library (DGL) (Wang et al., 2019)."
Dataset Splits: Yes
Evidence: "We adopt the public dataset splits for Ogbn-arxiv (Hu et al., 2020a), Actor, Squirrel and Chameleon (Pei et al., 2020; Platonov et al., 2023). For the remaining eight datasets, we use an independently randomized 10%/10%/80% split for training, validation, and test, respectively. Our experiments are conducted using a two-layer GCN (Kipf & Welling, 2017), with hyperparameters selected via grid search based on validation accuracy across five runs, following prior works (Luo et al., 2024)."
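The randomized 10%/10%/80% split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `random_split`, the seed, and the Cora-sized example are assumptions for the sketch.

```python
import numpy as np

def random_split(num_nodes: int, seed: int,
                 train_frac: float = 0.1, val_frac: float = 0.1):
    """Randomly partition node indices into train/val/test sets.

    Sketch of the 10%/10%/80% split reported in the paper; the authors'
    actual implementation may differ.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)  # shuffle all node indices
    n_train = int(num_nodes * train_frac)
    n_val = int(num_nodes * val_frac)
    train_idx = perm[:n_train]
    val_idx = perm[n_train:n_train + n_val]
    test_idx = perm[n_train + n_val:]  # remaining ~80%
    return train_idx, val_idx, test_idx

# Example: a Cora-sized graph with 2708 nodes.
train_idx, val_idx, test_idx = random_split(2708, seed=0)
print(len(train_idx), len(val_idx), len(test_idx))  # 270 270 2168
```

Because the paper averages over ten runs with different seeds and splits, a new seed would be passed on each run to draw an independent partition.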
Hardware Specification: Yes
Evidence: "All experiments are conducted on an NVIDIA RTX A6000 GPU with 48 GB of memory."
Software Dependencies: No
Note: The paper mentions using the Adam optimizer and the Deep Graph Library (DGL) but does not specify their versions or list other key software dependencies with version numbers.
Experiment Setup: Yes
Evidence: "Our experiments are conducted using a two-layer GCN (Kipf & Welling, 2017), with hyperparameters selected via grid search based on validation accuracy across five runs... The search space included hidden dimensions [64, 256, 512], dropout ratios [0.2, 0.3, 0.5, 0.7], weight decay values [0, 5e-4, 5e-5], and learning rates [1e-2, 1e-3, 5e-3]. We use the Adam optimizer (Kingma & Ba, 2015) for training with early stopping based on validation accuracy, using a patience of 100 epochs across all datasets. ... AGGB requires tuning on three key hyperparameters: the dropout ratio, DropEdge ratio, and the coefficient λ... λ values: [1, 0.5, 0.1], DropEdge ratio: [0.2, 0.5, 0.7, 1.0], Dropout ratio: [0, 0.2, 0.5, 0.7]. For hyperparameter tuning, we follow the same process used for training the base GCN, conducting a search across five independent runs and selecting the configuration with the highest validation accuracy. To ensure reproducibility, we provide the detailed hyperparameters for training AGGB across datasets in Table 9."
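The grid-search protocol quoted above (exhaustive search over the stated space, scores averaged over five runs, best validation accuracy wins) can be sketched as follows. `train_and_eval` is a hypothetical stand-in, here a deterministic dummy scorer so the sketch runs end to end; a real run would train the two-layer GCN with Adam and early stopping (patience 100 epochs) and return validation accuracy.

```python
from itertools import product
from statistics import mean

# Search space as reported for the base GCN.
search_space = {
    "hidden_dim": [64, 256, 512],
    "dropout": [0.2, 0.3, 0.5, 0.7],
    "weight_decay": [0, 5e-4, 5e-5],
    "lr": [1e-2, 1e-3, 5e-3],
}

def train_and_eval(config, seed):
    """Hypothetical stand-in for training the two-layer GCN and returning
    validation accuracy; replaced here by a deterministic dummy score
    (prefers large hidden dims and moderate dropout) so the sketch runs."""
    return (config["hidden_dim"] / 512
            - abs(config["dropout"] - 0.5)
            + 0.001 * seed)

def grid_search(num_runs=5):
    """Select the configuration with the highest mean validation accuracy
    over `num_runs` independent runs, following the paper's protocol."""
    keys = list(search_space)
    best_config, best_score = None, float("-inf")
    for values in product(*(search_space[k] for k in keys)):
        config = dict(zip(keys, values))
        score = mean(train_and_eval(config, seed) for seed in range(num_runs))
        if score > best_score:
            best_config, best_score = config, score
    return best_config

best = grid_search()  # searches all 3 * 4 * 3 * 3 = 108 configurations
```

Tuning AGGB itself would follow the same loop with the reported λ, DropEdge-ratio, and dropout grids substituted into `search_space`.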