Optimal transport-based conformal prediction

Authors: Gauthier Thurin, Kimia Nadjahi, Claire Boyer

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we evaluate our method on practical regression and classification problems, illustrating its advantages in terms of (conditional) coverage and efficiency. [...] Numerical experiments. In what follows, we study a practical regression problem and compare several CP methods described above: OT-CP for forming prediction regions as in (8), a CP approach producing ellipses (ELL, Johnstone & Cox, 2021), and a simple method creating hyperrectangles (REC, Neeven & Smirnov, 2018), with the miscoverage level adjusted by the Bonferroni correction. We simulate univariate inputs X ∼ Unif([0, 2]) with responses Y ∈ ℝ², and we assume that we are given a pre-trained predictor f̂(x) = (2x², (x + 1)²), x ∈ ℝ. [...] We also compare the methods in terms of empirical coverage on test data (Figure 2(c)) and efficiency (volume of prediction regions, Figure 2(d))."
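The REC baseline quoted above — split conformal prediction applied per output coordinate, with the miscoverage level Bonferroni-corrected to α/d — can be sketched on the paper's simulated setup. This is a minimal illustration, not the authors' code; the Gaussian noise model and all variable names are our own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def f_hat(x):
    # Pre-trained predictor from the paper's simulation: f(x) = (2x^2, (x+1)^2)
    return np.stack([2 * x**2, (x + 1) ** 2], axis=-1)

def sample(n):
    # X ~ Unif([0, 2]); Y in R^2 is the predictor's output plus Gaussian noise
    # (the noise model is an assumption for illustration only)
    x = rng.uniform(0, 2, size=n)
    y = f_hat(x) + rng.normal(scale=0.5, size=(n, 2))
    return x, y

alpha = 0.1              # target miscoverage 10% (i.e. 90% coverage)
d = 2                    # output dimension
alpha_bonf = alpha / d   # Bonferroni correction: alpha/d per coordinate

# Calibration: per-coordinate absolute residuals on held-out data
x_cal, y_cal = sample(1000)
res = np.abs(y_cal - f_hat(x_cal))            # shape (n, d)
n = len(x_cal)
# Split-conformal quantile level with the finite-sample correction
level = np.ceil((n + 1) * (1 - alpha_bonf)) / n
q = np.quantile(res, level, axis=0)           # one half-width per coordinate

# Test: a point is covered if every coordinate lies within its interval,
# so the prediction region is a hyperrectangle around f_hat(x)
x_te, y_te = sample(5000)
inside = np.all(np.abs(y_te - f_hat(x_te)) <= q, axis=1)
coverage = inside.mean()
print(f"empirical coverage: {coverage:.3f} (target >= {1 - alpha})")
```

By the union bound, the joint coverage of the hyperrectangle is at least 1 − α, which is why REC tends to over-cover; the paper's volume comparison (Figure 2(d)) measures the resulting loss of efficiency.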
Researcher Affiliation | Academia | "¹CNRS, École Normale Supérieure, Paris, France ²Laboratoire de Mathématiques d'Orsay (LMO), Université Paris-Saclay, France, and Institut universitaire de France. Correspondence to: Gauthier Thurin <EMAIL>."
Pseudocode | No | The paper describes the methodology in prose and bullet points, but does not include any explicitly labeled pseudocode, algorithm blocks, or similarly structured step-by-step procedures.
Open Source Code | Yes | "The code used to produce the results in this paper can be accessed at this GitHub repository."
Open Datasets | Yes | "Next, we evaluate OT-CP+ on real datasets sourced from Mulan (Tsoumakas et al., 2011), with dataset statistics summarized in Table 1. [...] In Figure 8 and Figure 9, we present the results for a random forest on MNIST and Fashion-MNIST."
Dataset Splits | Yes | "We split each dataset into training, calibration, and testing subsets (50%/25%/25% ratio) and train a random forest model as the regressor. [...] We used 25,000 data points split into train/calibration/test with ratio 10%/45%/45%, since this is sufficient for the classifier to reach 90% accuracy and to ensure a reasonable size for the test data. [...] Results in Figures 15 and 16 are averaged over 10 runs, each with 10,000 randomly chosen observations split into train/calibration/test with ratio 50%/40%/10%."
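The three-way splits quoted above are plain random partitions of the data. A minimal sketch, using the 50%/25%/25% example (the helper name and seed handling are ours, not the paper's):

```python
import numpy as np

def three_way_split(n, ratios=(0.5, 0.25, 0.25), seed=0):
    """Randomly partition indices 0..n-1 into train/calibration/test subsets."""
    assert abs(sum(ratios) - 1.0) < 1e-9
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)          # random shuffle of all indices
    n_tr = int(ratios[0] * n)
    n_cal = int(ratios[1] * n)
    # Consecutive slices of the shuffled indices form the three subsets
    return perm[:n_tr], perm[n_tr:n_tr + n_cal], perm[n_tr + n_cal:]

train, cal, test = three_way_split(1000)
print(len(train), len(cal), len(test))  # 500 250 250
```

In split conformal prediction the calibration subset must be disjoint from the training data: the residuals used to set the quantile threshold have to come from points the model never saw.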
Hardware Specification | No | The paper does not provide specific hardware details such as CPU, GPU models, or memory used for running the experiments.
Software Dependencies | No | "In all of our experiments, optimal transport problems are solved using the network simplex method implemented in the Python Optimal Transport library (Flamary et al., 2021). [...] random forest classifier implemented with the Python library scikit-learn."
Experiment Setup | Yes | "Quantile regions for α = 0.9 are constructed using n = 1000 calibration instances. [...] Both methods use a kNN step that selects 10% of the calibration set as neighbors for each test point Xtest. [...] We start by simulating data according to a Gaussian mixture model, represented in Figure 7(a), and we consider a pretrained classifier based on Quadratic Discriminant Analysis. [...] a random forest classifier implemented with the Python library scikit-learn."
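The kNN step quoted above — selecting 10% of the calibration set as neighbors of each test point, so that conformal quantities can be computed locally — can be sketched as follows. The function name and the brute-force Euclidean distance computation are our own assumptions, not the paper's implementation:

```python
import numpy as np

def knn_calibration_neighbors(x_cal, x_test, frac=0.10):
    """For each test point, return the indices of its nearest calibration
    points, with k set to `frac` of the calibration set (10% as quoted)."""
    k = max(1, int(np.ceil(frac * len(x_cal))))
    # Pairwise Euclidean distances, shape (n_test, n_cal)
    dists = np.linalg.norm(x_test[:, None, :] - x_cal[None, :, :], axis=-1)
    # k nearest calibration indices per test point, closest first
    return np.argsort(dists, axis=1)[:, :k]

rng = np.random.default_rng(0)
x_cal = rng.normal(size=(200, 2))
x_test = rng.normal(size=(5, 2))
idx = knn_calibration_neighbors(x_cal, x_test)
print(idx.shape)  # (5, 20): 20 neighbors = 10% of 200 calibration points
```

Restricting calibration to each test point's neighborhood is what lets the method target conditional (rather than only marginal) coverage, at the cost of fewer effective calibration points per region.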