Multi-model Online Conformal Prediction with Graph-Structured Feedback
Authors: Erfan Hajihashemi, Yanning Shen
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on real and synthetic datasets validate that the proposed methods construct smaller prediction sets and outperform existing multi-model online conformal prediction approaches. |
| Researcher Affiliation | Academia | Erfan Hajihashemi EMAIL Department of Electrical Engineering and Computer Science University of California, Irvine Yanning Shen EMAIL Department of Electrical Engineering and Computer Science University of California, Irvine |
| Pseudocode | Yes | Algorithm 1 (Generating Graph G_t) — Require: number of selective nodes J, exploration coefficient η_e > 0, maximum number of models connected to each selective node N, and M pre-trained models. Initialize A_t = 0_{J×M}. For m = 1, ..., M: set p_t^m = (1 − η_e) · w_t^m / Σ_{m'=1}^M w_t^{m'} + η_e / M. For j = 1, ..., J and n = 1, ..., N: select one model according to the PMF p_t = (p_t^m)_{m=1}^M and set A_t(j, m) = 1 if m is the selected model. --- Algorithm 2 (Graph-Structured Feedback Multi-model Ensemble Online Conformal Prediction, GMOCP) — Require: α ∈ [0, 1], M pre-trained models, and step size ε ∈ (0, 1). For t ∈ [T]: receive new datum x_t; generate graph G_t using Algorithm 1; obtain u_t^j = Σ_{m ∈ v_j} w_t^m for j ∈ [J]; set p_t^j = u_t^j / Σ_{i=1}^J u_t^i for each j; select one selective node according to the PMF p_t = (p_t^j)_{j=1}^J; create the set S_t of models connected to the selected node; obtain normalized weights w̄_t^m = w_t^m / Σ_{m' ∈ S_t} w_t^{m'} for m ∈ S_t; select model m̂ according to the PMF w_t^s = (w̄_t^m)_{m ∈ S_t}; obtain threshold q̂^{m̂}_{α_t^{m̂}} according to equation 1 and construct the prediction set C^{m̂}_{α_t^{m̂}}(x_t) via equation 2; observe the true label; calculate l_t^m and update w_t^m and α_t^m according to equations 8, 7, and 5 for all m ∈ S_t. |
| Open Source Code | No | The paper does not provide any explicit links to source code repositories or statements about releasing code for the described methodology. |
| Open Datasets | Yes | Dataset: We utilize corrupted versions of CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009), known as CIFAR-10C and CIFAR-100C (Hendrycks & Dietterich, 2019). ... Here, we conduct experiments on a new dataset featuring a gradual distribution shift using Tiny-ImageNet-C, a corrupted version of the Tiny-ImageNet dataset (Le & Yang, 2015) that contains 200 distinct classes. |
| Dataset Splits | Yes | Each dataset is divided into a training phase (50,000 samples) and a test set (6,000 samples). Additionally, a separate set of 2,000 samples is used for hyperparameter selection in the conformal prediction task. |
| Hardware Specification | Yes | All experiments were performed on a workstation with NVIDIA RTX A4000 GPU. |
| Software Dependencies | No | The paper mentions several deep learning models (GoogLeNet, ResNet, DenseNet, MobileNetV2, EfficientNet-B0) and implies standard frameworks (e.g., ImageNet-pretrained weights; training described in epochs and batch sizes suggests PyTorch or TensorFlow), but it does not specify any software names with version numbers for the implementation. |
| Experiment Setup | Yes | For all experiments conducted on CIFAR-10C and CIFAR-100C in this section, the parameters ϵ, η, and β were selected through grid search, with values of 0.5, 0.05, and 0.05, respectively. Additionally, we set T = 6000, indicating that the algorithm receives sequential data in an online manner over 6000 time steps. ... In all 3 settings, the learning rate is set to 10^-3, and the batch size is fixed at 64. ... The hyperparameters ξ and k_reg are set to 0.02 and 5 for CIFAR-100C, and 0.1 and 1 for CIFAR-10C, respectively. |
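The graph-generation step quoted in the Pseudocode row (Algorithm 1) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `generate_graph` is hypothetical, and the uniform `eta_e / M` exploration term is an assumption about how the exploration coefficient normalizes the PMF.

```python
import numpy as np

def generate_graph(weights, J, N, eta_e, rng=None):
    """Sketch of Algorithm 1: sample the feedback graph G_t.

    weights : current model weights w_t^m, shape (M,)
    J       : number of selective nodes
    N       : max number of models connected to each selective node
    eta_e   : exploration coefficient, mixed with a uniform component
    Returns an adjacency matrix A_t of shape (J, M) with A_t[j, m] = 1
    iff model m is connected to selective node j.
    """
    rng = np.random.default_rng() if rng is None else rng
    weights = np.asarray(weights, dtype=float)
    M = len(weights)
    # Exploration-smoothed PMF over models; the eta_e / M term is an
    # assumed normalization so that p sums to 1.
    p = (1.0 - eta_e) * weights / weights.sum() + eta_e / M
    A = np.zeros((J, M), dtype=int)
    for j in range(J):
        for _ in range(N):
            m = rng.choice(M, p=p)  # draw one model from the PMF
            A[j, m] = 1             # connect it to selective node j
    return A
```

Because each selective node draws N times with replacement, a node ends up connected to at most N distinct models, matching the "maximum number of connected models" requirement in the algorithm statement.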