Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning
Authors: Yan Scholten, Stephan Günnemann
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally validate our approach on image classification tasks, achieving strong reliability while maintaining utility and preserving coverage on clean data. |
| Researcher Affiliation | Academia | Yan Scholten, Stephan Günnemann, Department of Computer Science & Munich Data Science Institute, Technical University of Munich |
| Pseudocode | Yes | Algorithm 1 (Reliable conformal score function). Input: D_train, k_t, deterministic training algorithm T. 1: Split D_train into k_t disjoint partitions P_i^t = {(x_j, y_j) ∈ D_train : h(x_j) ≡ i (mod k_t)}. 2: for i = 1 to k_t do: 3: Train classifier f^(i) = T(P_i^t) on partition P_i^t. 4: Construct the voting function π_y(x) = (1/k_t) Σ_{i=1}^{k_t} 1{f^(i)(x) = y}. 5: Smooth the voting function: s(x, y) = e^{π_y} / (Σ_{i=1}^{K} e^{π_i}). Output: reliable conformal score function s. Algorithm 2 (Reliable conformal prediction sets). Input: D_calib, k_c, s, α, x_{n+1}. 1: Split D_calib into k_c disjoint partitions P_i^c = {(x_j, y_j) ∈ D_calib : h(x_j) ≡ i (mod k_c)}. 2: for i = 1 to k_c do: 3: Compute scores S_i = {s(x_j, y_j)}_{(x_j, y_j) ∈ P_i^c}. 4: Compute the α_{n_i}-quantile τ_i of the scores S_i. 5: Construct the prediction set for quantile τ_i: C_i(x_{n+1}) = {y : s(x_{n+1}, y) ≥ τ_i}. 6: Construct the majority-vote prediction set C_M(x_{n+1}) = {y : Σ_{i=1}^{k_c} 1{y ∈ C_i(x_{n+1})} > τ̂(α)}. Output: reliable conformal prediction set C_M |
| Open Source Code | Yes | We also provide code along with detailed reproducibility instructions via the following project page: https://www.cs.cit.tum.de/daml/reliable-conformal-prediction/. |
| Open Datasets | Yes | We train ResNet18, ResNet50 and ResNet101 models (He et al., 2016) on SVHN (Netzer et al., 2011), CIFAR10 and CIFAR100 (Krizhevsky et al., 2009). |
| Dataset Splits | Yes | We randomly select 1,000 images from the test set for calibration and use the remaining 9,000 data points for testing. |
| Hardware Specification | Yes | We train ResNet18 models on an NVIDIA GTX 1080 Ti GPU, and the ResNet50 and ResNet101 models on an NVIDIA A100 40GB. We perform inference of all models on an NVIDIA GTX 1080 Ti GPU, and compute certificates on a Xeon E5-2630 v4 CPU. |
| Software Dependencies | No | The paper mentions "We use the torchvision library to load the datasets." and "We further deploy a cosine learning rate scheduler (Loshchilov & Hutter, 2017)" but does not specify version numbers for these software components or for the underlying framework (e.g., PyTorch). |
| Experiment Setup | Yes | We train all models with stochastic gradient descent (learning rate 0.01, momentum 0.9, weight decay 5e-4) for 400 epochs using early stopping if the training accuracy does not improve for 100 epochs. We further deploy a cosine learning rate scheduler (Loshchilov & Hutter, 2017). We use a batch size of 128 during training and 300 at inference. |
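The two algorithms quoted in the Pseudocode row can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the partitioning hash `h`, the plain `np.quantile` call (the paper's exact finite-sample α_{n_i} adjustment is omitted), and the externally supplied majority-vote threshold `tau_hat` (the paper's calibrated τ̂(α)) are all assumptions.

```python
import numpy as np

def partition(data, k, h):
    """Split data into k disjoint partitions by h(x) mod k (Alg. 1/2, line 1)."""
    parts = [[] for _ in range(k)]
    for x, y in data:
        parts[h(x) % k].append((x, y))
    return parts

def reliable_score_fn(train_data, k_t, train_algo, h, num_classes):
    """Algorithm 1: train k_t classifiers on disjoint partitions and return
    the smoothed (softmax) voting function as the conformal score."""
    models = [train_algo(p) for p in partition(train_data, k_t, h)]

    def score(x, y):
        votes = np.zeros(num_classes)
        for f in models:
            votes[f(x)] += 1.0 / k_t              # voting function pi_y(x)
        probs = np.exp(votes) / np.exp(votes).sum()  # smoothed score s(x, y)
        return probs[y]
    return score

def reliable_prediction_set(calib_data, k_c, score, alpha, x_new,
                            num_classes, h, tau_hat):
    """Algorithm 2: per-partition quantiles, then a majority vote over the
    resulting prediction sets with threshold tau_hat (assumed given)."""
    in_set_votes = np.zeros(num_classes)
    for part in partition(calib_data, k_c, h):
        s_i = [score(x, y) for x, y in part]
        tau_i = np.quantile(s_i, alpha)   # alpha-quantile (simplified)
        for y in range(num_classes):
            if score(x_new, y) >= tau_i:  # y in C_i(x_new)
                in_set_votes[y] += 1
    return {y for y in range(num_classes) if in_set_votes[y] > tau_hat}
```

With toy data and a constant classifier, `reliable_prediction_set` returns exactly the majority-vote set C_M over the k_c per-partition sets, which is the object the paper certifies against calibration poisoning.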
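The training schedule in the Experiment Setup row can be expressed as a small framework-free sketch. The helper names are illustrative assumptions; the hyperparameter values (SGD with lr 0.01, momentum 0.9, weight decay 5e-4, 400 epochs, patience 100, cosine annealing) come from the quoted setup.

```python
import math

# Reported SGD hyperparameters (see Experiment Setup row).
SGD_CONFIG = {"lr": 0.01, "momentum": 0.9, "weight_decay": 5e-4}
MAX_EPOCHS, PATIENCE = 400, 100

def cosine_lr(epoch, base_lr=0.01, total_epochs=MAX_EPOCHS, lr_min=0.0):
    """Cosine annealing (Loshchilov & Hutter, 2017): decay from base_lr
    at epoch 0 down to lr_min at epoch total_epochs."""
    return lr_min + 0.5 * (base_lr - lr_min) * (
        1 + math.cos(math.pi * epoch / total_epochs))

def should_stop(train_acc_history, patience=PATIENCE):
    """Early stopping: True once training accuracy has not improved on the
    previous best within the last `patience` epochs."""
    if len(train_acc_history) <= patience:
        return False
    best_before = max(train_acc_history[:-patience])
    return max(train_acc_history[-patience:]) <= best_before
```

In a PyTorch setup these would typically correspond to `torch.optim.SGD` and `torch.optim.lr_scheduler.CosineAnnealingLR`; the plain-Python version above just makes the schedule and stopping rule explicit.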