Valid Conformal Prediction for Dynamic GNNs
Authors: Ed Davis, Ian Gallagher, Daniel Lawson, Patrick Rubin-Delanchy
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide real data examples demonstrating validity, showing improved accuracy over baselines, and sign-posting different failure modes which can occur when those assumptions are violated. ... We evaluate the performance of UGNNs using four examples, comprising simulated and real data, summarised in Table 1. We will then delve deeper into a particular dataset to show how variation in prediction sets can tell us something about the underlying network dynamics. |
| Researcher Affiliation | Academia | 1University of Bristol, U.K. 2The University of Melbourne, Australia 3School of Mathematics and Maxwell Institute for Mathematical Sciences, University of Edinburgh, U.K. |
| Pseudocode | Yes | Algorithm 1 Split conformal inference Algorithm 2 Full conformal inference Algorithm 3 Split conformal inference (abstract version) |
| Open Source Code | Yes | Python code for reproducing experiments is available at https://github.com/edwarddavis1/valid_conformal_for_dynamic_gnn. |
| Open Datasets | Yes | SBM (simulated dynamic stochastic block model, described in text) ... School. A dynamic social network between pupils at a primary school in Lyon, France (Stehlé et al., 2011). ... Flight. The Open Sky dataset tracks the number of flights (edges) between airports (nodes) over each month from the start of 2019 to the end of 2021 (Olive et al., 2022). ... Trade. An agricultural trade network between members of the United Nations tracked yearly between 1986 and 2016 (MacDonald et al., 2015), which features in the Temporal Graph Benchmark (Huang et al., 2024b). |
| Dataset Splits | Yes | In the transductive regime, we randomly assign nodes to train, validation, calibration and test sets with ratios 20/10/35/35, regardless of their time point label. ... For the semi-inductive regime, we apply a similar approach, except that we reserve the last 35% of the observation period as the test set... The training, validation and calibration sets are then picked at random, regardless of time point label, from the remaining data with ratios 20/10/35. |
| Hardware Specification | Yes | The maximum time to train an individual model was around a minute on an AMD Ryzen 5 3600 CPU processor. |
| Software Dependencies | No | Useful resources for getting started with GNNs include the introductions (Hamilton, 2020; Sanchez-Lengeling et al., 2021) and the PyTorch Geometric library (Pyt). ... Our contribution should be viewed as a novel interface between CP and GNNs... using recognised and standard baselines for GNNs and CP: Graph Convolutional Networks (GCN) (Kipf and Welling, 2016) and Graph Attention Networks (GAT) (Veličković et al., 2017) for GNNs and APS (Romano et al., 2020) for CP. |
| Experiment Setup | No | On each dataset, we apply GCN (Kipf and Welling, 2016) and GAT (Veličković et al., 2017) to the block diagonal and unfolded matrix structures (referred to as UGCN and UGAT respectively) in both the transductive and the semi-inductive settings. ... For each experiment, we return the mean accuracy, coverage and prediction set size across all conformal runs to evaluate the predictive power of each GNN, as well as its conformal performance. To quantify error, we quote the standard deviation of each metric. |
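The table above references the paper's split conformal inference procedure (Algorithm 1) and its calibration/test split. The sketch below illustrates the generic split conformal recipe that such a pipeline follows: score a held-out calibration set, take a finite-sample-corrected quantile of the scores, and return every class within that threshold as the prediction set. Note this uses a simple `1 - p(true class)` nonconformity score for brevity; the paper uses APS (Romano et al., 2020), whose cumulative-probability score differs, and the function and variable names here are illustrative, not from the paper's code.

```python
import numpy as np

def split_conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal prediction sets from softmax outputs.

    cal_probs:  (n, K) class probabilities on the calibration set
    cal_labels: (n,) true labels on the calibration set
    test_probs: (m, K) class probabilities on the test set
    alpha:      miscoverage level (target coverage is 1 - alpha)
    """
    n = len(cal_labels)
    # Nonconformity score: 1 - probability assigned to the true class
    # (a stand-in for the APS score used in the paper).
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile level ceil((n+1)(1-alpha))/n,
    # capped at 1 so np.quantile stays valid for small n.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    qhat = np.quantile(scores, q_level, method="higher")
    # Prediction set: every class whose score falls within the threshold.
    return [np.where(1.0 - p <= qhat)[0] for p in test_probs]
```

Under exchangeability of calibration and test points, sets built this way cover the true label with probability at least 1 - alpha; the paper's contribution concerns when that exchangeability holds for dynamic GNN embeddings.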