Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning

Authors: Yan Scholten, Stephan Günnemann

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We experimentally validate our approach on image classification tasks, achieving strong reliability while maintaining utility and preserving coverage on clean data.
Researcher Affiliation Academia Yan Scholten, Stephan Günnemann, Department of Computer Science & Munich Data Science Institute, Technical University of Munich, EMAIL
Pseudocode Yes

Algorithm 1: Reliable conformal score function
Input: Dtrain, kt, deterministic training algorithm T
1: Split Dtrain into kt disjoint partitions P^t_i = {(x_j, y_j) ∈ Dtrain : h(x_j) ≡ i (mod kt)}
2: for i = 1 to kt do
3:   Train classifier f^(i) = T(P^t_i) on partition P^t_i
4: Construct the voting function π_y(x) = (1/kt) Σ_{i=1}^{kt} 1{f^(i)(x) = y}
5: Smooth the voting function s(x, y) = e^{π_y} / (Σ_{i=1}^{K} e^{π_i})
Output: Reliable conformal score function s

Algorithm 2: Reliable conformal prediction sets
Input: Dcalib, kc, s, α, x_{n+1}
1: Split Dcalib into kc disjoint partitions P^c_i = {(x_j, y_j) ∈ Dcalib : h(x_j) ≡ i (mod kc)}
2: for i = 1 to kc do
3:   Compute scores S_i = {s(x_j, y_j)}_{(x_j, y_j) ∈ P^c_i}
4:   Compute the α·n_i-quantile τ_i of the scores S_i
5:   Construct the prediction set for quantile τ_i: C_i(x_{n+1}) = {y : s(x_{n+1}, y) ≥ τ_i}
6: Construct the majority-vote prediction set C_M(x_{n+1}) = {y : Σ_{i=1}^{kc} 1{y ∈ C_i(x_{n+1})} > τ̂(α)}
Output: Reliable conformal prediction set C_M
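The two algorithms above can be sketched in plain NumPy as follows. This is a simplified illustration, not the authors' implementation: the hash-based partitioning is assumed to have already happened, the quantile convention is simplified to the α-quantile, and the vote threshold τ̂(α), which the paper derives formally, is treated as a given input here.

```python
import numpy as np

def smooth_votes(votes):
    """Softmax-smooth the voting function (Algorithm 1, step 5)."""
    e = np.exp(votes)
    return e / e.sum()

def reliable_score(models, x, num_classes):
    """Algorithm 1: aggregate the votes of k_t partition classifiers
    into a smoothed score vector over the K classes."""
    votes = np.zeros(num_classes)
    for f in models:                      # each f was trained on one partition
        votes[f(x)] += 1.0 / len(models)  # pi_y(x): fraction of models voting y
    return smooth_votes(votes)

def majority_vote_set(partition_scores, test_scores, alpha, tau_hat):
    """Algorithm 2: one quantile per calibration partition, then a
    majority vote over the per-partition prediction sets.

    partition_scores: list of k_c arrays of calibration scores s(x_j, y_j)
    test_scores:      (K,) array of scores s(x_{n+1}, y) per candidate label
    tau_hat:          vote threshold; its exact form tau_hat(alpha) is
                      derived in the paper and assumed given here
    """
    votes = np.zeros_like(test_scores)
    for scores in partition_scores:
        tau_i = np.quantile(scores, alpha)  # simplified alpha-quantile of partition i
        votes += (test_scores >= tau_i)     # 1{y in C_i(x_{n+1})}
    return set(np.flatnonzero(votes > tau_hat))
```

A label enters the final set only if enough per-partition sets agree on it, which is what makes the construction robust to a bounded number of poisoned calibration points.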
Open Source Code Yes We also provide code along with detailed reproducibility instructions via the following project page: https://www.cs.cit.tum.de/daml/reliable-conformal-prediction/.
Open Datasets Yes We train ResNet18, ResNet50 and ResNet101 models (He et al., 2016) on SVHN (Netzer et al., 2011), CIFAR10 and CIFAR100 (Krizhevsky et al., 2009).
Dataset Splits Yes We randomly select 1,000 images from the test set for calibration and use the remaining 9,000 datapoints for testing.
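The reported calibration/test split can be reproduced with a simple random permutation of the 10,000 test indices. The seed below is an assumption for illustration; the paper does not report one.

```python
import numpy as np

# Hypothetical re-creation of the reported split: 1,000 of the
# 10,000 test images for calibration, the remaining 9,000 for testing.
rng = np.random.default_rng(0)  # seed is an assumption, not from the paper
idx = rng.permutation(10_000)
calib_idx, test_idx = idx[:1_000], idx[1_000:]
```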
Hardware Specification Yes We train ResNet18 models on an NVIDIA GTX 1080 Ti GPU, and the ResNet50 and ResNet101 models on an NVIDIA A100 40GB. We perform inference of all models on an NVIDIA GTX 1080 Ti GPU, and compute certificates on a Xeon E5-2630 v4 CPU.
Software Dependencies No The paper mentions "We use the torchvision library to load the datasets." and "We further deploy a cosine learning rate scheduler (Loshchilov & Hutter, 2017)", but does not specify version numbers for these software components or for the underlying framework, such as PyTorch.
Experiment Setup Yes We train all models with stochastic gradient descent (learning rate 0.01, momentum 0.9, weight decay 5e-4) for 400 epochs using early stopping if the training accuracy does not improve for 100 epochs. We further deploy a cosine learning rate scheduler (Loshchilov & Hutter, 2017). We use a batch size of 128 during training and 300 at inference.
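The cosine schedule deployed in the training setup follows Loshchilov & Hutter (2017); a minimal sketch of the per-epoch learning rate, using the reported base rate of 0.01, is below. The minimum rate of 0 is an assumption, and in a PyTorch pipeline one would typically use `torch.optim.lr_scheduler.CosineAnnealingLR` rather than computing this by hand.

```python
import math

def cosine_lr(epoch, total_epochs, base_lr=0.01, min_lr=0.0):
    """Cosine learning-rate schedule (Loshchilov & Hutter, 2017).
    base_lr=0.01 matches the reported setup; min_lr=0 is an assumption."""
    t = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))
```

With 400 training epochs, the rate starts at 0.01, passes through 0.005 at the halfway point, and decays to the minimum at the end.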