Exact Certification of (Graph) Neural Networks Against Label Poisoning

Authors: Mahalakshmi Sabanayagam, Lukas Gosch, Stephan Günnemann, Debarghya Ghoshdastidar

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental In Sec. 4.1 we thoroughly investigate our sample-wise and collective certificates. Sec. 4.2 discusses in detail the effect of architectural choices and graph structure. Datasets. We use the real-world graph datasets Cora-ML (Bojchevski & Günnemann, 2018) and Citeseer (Giles et al., 1998) for multi-class certification. We evaluate binary class certification using Polblogs (Adamic & Glance, 2005), and by extracting the subgraphs containing the top two largest classes from Cora-ML, Citeseer, Wiki-CS (Mernyei & Cangea, 2020), Cora (McCallum et al., 2000) and Chameleon (Rozemberczki et al., 2021), referring to these as Cora-MLb, Citeseerb, Wiki-CSb, Corab and Chameleonb, respectively.
Researcher Affiliation Academia 1 School of Computation, Information and Technology, Technical University of Munich; 2 Munich Data Science Institute; 3 Munich Center for Machine Learning (MCML); Germany
Pseudocode No The paper describes methods and theorems but does not contain explicitly labeled pseudocode or algorithm blocks. The derivations are mathematical and descriptions are in prose.
Open Source Code Yes The code is available at https://github.com/saper0/qpcert.
Open Datasets Yes We use the real-world graph datasets Cora-ML (Bojchevski & Günnemann, 2018) and Citeseer (Giles et al., 1998) for multi-class certification. We evaluate binary class certification using Polblogs (Adamic & Glance, 2005), and by extracting the subgraphs containing the top two largest classes from Cora-ML, Citeseer, Wiki-CS (Mernyei & Cangea, 2020), Cora (McCallum et al., 2000) and Chameleon (Rozemberczki et al., 2021), referring to these as Cora-MLb, Citeseerb, Wiki-CSb, Corab and Chameleonb, respectively.
Dataset Splits Yes We choose 10 nodes per class for training for all datasets, except for Citeseer, for which we choose 20. No separate validation set is needed as we perform 4-fold cross-validation (CV) for hyperparameter tuning. All results are averaged over 5 seeds (multiclass datasets: 3 seeds) and reported with their standard deviation. The test set for collective certificates consists of all unlabeled nodes on CSBM and CBA, and random samples of 50 unlabeled nodes for real-world graphs. The sample-wise certificate is calculated on all unlabeled nodes.
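The per-class training split described above can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the paper's actual code (the function name and the toy label array are assumptions):

```python
import numpy as np

def sample_training_nodes(labels, per_class=10, seed=0):
    """Pick `per_class` nodes of each class for training; all remaining
    nodes are treated as unlabeled (certificates are computed on these)."""
    rng = np.random.default_rng(seed)
    train_idx = []
    for c in np.unique(labels):
        nodes = np.flatnonzero(labels == c)
        train_idx.extend(rng.choice(nodes, size=per_class, replace=False))
    train_idx = np.array(sorted(train_idx))
    mask = np.ones(len(labels), dtype=bool)
    mask[train_idx] = False
    unlabeled_idx = np.flatnonzero(mask)
    return train_idx, unlabeled_idx

# Toy example: 100 nodes over 3 classes (class sizes are illustrative)
labels = np.repeat([0, 1, 2], [40, 30, 30])
train_idx, unlabeled_idx = sample_training_nodes(labels, per_class=10, seed=0)
```

With 10 nodes per class and 3 classes this yields 30 labeled training nodes; repeating the sampling over several seeds matches the paper's averaging over 5 (or 3) seeds.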
Hardware Specification No We used Gurobi to solve the MILP problems and all our experiments are run on CPU on an internal cluster. The memory requirement to compute sample-wise and collective certificates depends on the length of the MILP solving process.
Software Dependencies Yes All results concern the infinite-width limit and are obtained by solving the MILPs in Thm. 1 and 2 using Gurobi 11.0.1 (Gurobi Optimization, LLC, 2023) and the GNNs' NTK as derived in Gosch et al. (2024) and Sabanayagam et al. (2023).
Experiment Setup Yes All other hyperparameters are chosen based on 4-fold CV, given in App. G.2. We define the row and symmetric normalizations as Srow = D̂^{-1} Â and Ssym = D̂^{-1/2} Â D̂^{-1/2}, with D̂ and Â the degree and adjacency matrices of the given graph G with an added self-loop. For CSBM, we set S to Srow for GCN, SGC, GCN Skip-α and GCN Skip-PC, and to Ssym for APPNP with α = 0.1. GIN and GraphSAGE use a fixed S. In the case of L = 1, the regularization parameter C is 0.001 for all GNNs except APPNP, where C = 0.5. For L = 2, C = 0.001 for all, except GCN with C = 0.25 and GCN Skip-α with C = 0.25. For L = 4, again C = 0.001 for all, except GCN with C = 0.25 and GCN Skip-α with C = 0.5.
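The two propagation matrices Srow = D̂^{-1} Â and Ssym = D̂^{-1/2} Â D̂^{-1/2} can be written out directly. A minimal NumPy sketch, assuming a dense adjacency matrix (the example graph is illustrative, not from the paper):

```python
import numpy as np

def propagation_matrices(A):
    """Row- and symmetric-normalized propagation matrices with a self-loop:
    Srow = D̂^{-1} Â  and  Ssym = D̂^{-1/2} Â D̂^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])           # Â: adjacency with added self-loops
    d = A_hat.sum(axis=1)                    # degrees of Â
    D_inv = np.diag(1.0 / d)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S_row = D_inv @ A_hat
    S_sym = D_inv_sqrt @ A_hat @ D_inv_sqrt
    return S_row, S_sym

# Tiny example: a path graph on 3 nodes
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
S_row, S_sym = propagation_matrices(A)
```

Every row of Srow sums to 1 (a random-walk matrix), while Ssym is symmetric; both are the standard GCN-style normalizations the setup refers to.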