Efficient Robust Conformal Prediction via Lipschitz-Bounded Networks
Authors: Thomas Massena, Léo Andéol, Thibaut Boissin, Franck Mamalet, Corentin Friedrich, Mathieu Serrurier, Sébastien Gerchinovitz
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the whole approach across the CIFAR-10, CIFAR-100, Tiny ImageNet and ImageNet datasets. Our experiments showcase negligible computational overhead compared to vanilla CP, with best-in-class performances for both robust CP and vanilla CP's auditing. |
| Researcher Affiliation | Collaboration | 1 IRIT, 2 SNCF, 3 Institut de Mathématiques de Toulouse, 4 IRT Saint Exupéry. Correspondence to: Thomas Massena <EMAIL>. Acknowledgements: The authors would like to thank Agustin Martin Picard for his insights, along with Arthur Chiron and Luca Mossina for their careful proofreading. This work was carried out within the DEEL project, which is part of IRT Saint Exupéry and the ANITI AI cluster. The authors acknowledge the financial support from DEEL's Industrial and Academic Members and the France 2030 program, Grant agreements n°ANR-10-AIRT-01 and n°ANR-23-IACL-0002. |
| Pseudocode | Yes | Figure 6: Our function for computing the maximum quantile shift and therefore certifying the robustness of prediction sets under calibration-time feature poisoning attacks. |
| Open Source Code | No | Finally, our code will be made available on the following GitHub repository. |
| Open Datasets | Yes | We validate the whole approach across the CIFAR-10, CIFAR-100, Tiny ImageNet and ImageNet datasets. |
| Dataset Splits | Yes | Our methodology follows that of the benchmark of VRCP and we adopt the same calibration, holdout and test set sizes as Jeary et al. (2024) on all these datasets. Also, we give the mean values of the robust CP set sizes and the conformal coverage of these robust sets across 25 different random samplings of Dcal and Dtest (as well as Dholdout for PTT) that were unseen during training. ... For both methods we use 40% of the data points for calibration and the rest for testing. ... We first perform vanilla split CP on Dcal consisting of ncal = 3000 samples. Next, we compute the empirical approximations of (12) and (13) on an evaluation dataset Deval with neval = 5000 samples... We take ncal = 15000 and neval = 35000 for ImageNet. |
| Hardware Specification | Yes | All experiments were conducted on a system equipped with two NVIDIA GeForce RTX 4090 GPUs, each providing 24 GB of GDDR6X memory. |
| Software Dependencies | No | In our experimental setup, we use a standard neural network with two convolutional layers followed by max pooling operations and a linear layer. We study how a drop-in replacement of the vanilla PyTorch layers by our chosen Lipschitz-constrained layers affects the overall training time of our network. ... We train our Lipschitz neural networks with the AdamW optimizer with a learning rate of 1e-3. Also, we use the SoftHKRMulticlassLoss from the deel-torchlip library... |
| Experiment Setup | Yes | We train our Lipschitz neural networks with the AdamW optimizer with a learning rate of 1e-3. Also, we use the SoftHKRMulticlassLoss from the deel-torchlip library with the following standard values (Margin / Temperature / Epochs / Alpha): CIFAR-10: 0.6 / 5.0 / 130 / 0.975; CIFAR-100: 0.6 / 5.0 / 220 / 0.975; Tiny ImageNet: 0.3 / 5.0 / 80 / 0.975 |
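The vanilla split CP step quoted in the Dataset Splits row (calibrate on Dcal, then build prediction sets from a finite-sample-corrected score quantile) can be sketched as follows. This is a minimal NumPy illustration under assumptions not stated in the excerpts: the nonconformity score is taken to be 1 minus the softmax probability of the true class, and the model outputs, labels and miscoverage level are synthetic placeholders, not the paper's exact choices.

```python
import numpy as np

def split_cp_quantile(cal_scores, alpha):
    """Conformal quantile of calibration nonconformity scores.

    Uses the finite-sample-corrected level ceil((n+1)(1-alpha))/n
    that is standard in split conformal prediction.
    """
    n = len(cal_scores)
    level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(level, 1.0), method="higher")

def prediction_set(probs, qhat):
    """Classes whose nonconformity score 1 - p(y|x) is at most qhat."""
    return np.flatnonzero(1.0 - probs <= qhat)

# Toy calibration data: hypothetical softmax outputs and labels.
rng = np.random.default_rng(0)
n_cal, n_classes = 3000, 10  # ncal = 3000 matches the quoted CIFAR setup
probs = rng.dirichlet(np.ones(n_classes), size=n_cal)
labels = rng.integers(0, n_classes, size=n_cal)
cal_scores = 1.0 - probs[np.arange(n_cal), labels]

qhat = split_cp_quantile(cal_scores, alpha=0.1)  # target 90% coverage
test_probs = rng.dirichlet(np.ones(n_classes))
print(prediction_set(test_probs, qhat))
```

The robust variant in the paper then shifts this quantile by a certified bound derived from the network's Lipschitz constant; the sketch above covers only the vanilla baseline being audited.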