Leave-One-Out Stable Conformal Prediction

Authors: Kiljae Lee, Yuan Zhang

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Our method is theoretically justified and demonstrates superior numerical performance on synthetic and real-world data. We applied our method to a screening problem, where its effective exploitation of training data led to improved test power compared to the state-of-the-art method based on split conformal prediction.
Researcher Affiliation Academia Kiljae Lee, Yuan Zhang (The Ohio State University)
Pseudocode Yes Algorithm 1: (LOO-StabCP) Leave-One-Out Stable Conformal Prediction Set; Algorithm 2: (LOO-cfBH) Conformal Selection by Prediction with Leave-One-Out p-values
Open Source Code Yes The code for reproducing numerical results is available at: https://github.com/KiljaeL/LOO-StabCP.
Open Datasets Yes We considered two models for $\mu(\cdot; \beta)$: linear $\mu(x; \beta) = \sum_{j=1}^{d} \beta_j x_j$ and nonlinear $\mu(x; \beta) = \sum_{j=1}^{d} \beta_j e^{x_j/10}$. The Boston Housing data (Harrison Jr & Rubinfeld, 1978) contain 506 different areas in Boston... The Diabetes data (Efron et al., 2004) measured 442 individuals at their baseline time points... We used the recruitment data set (Ganatara, 2020) that was also analyzed in Jin & Candès (2023).
Dataset Splits Yes We set n = m = 100, α = 0.1 and generated synthetic data... SplitCP: 70% training + 30% calibration... For each data set, we randomly held out m data points (as the test data) for performance evaluation and released the rest to all methods for training/calibration. We tested two settings: m = 1 and m = 100... each time leaving out 20% of the data points as the test data. In cfBH, the data was split into 70% for training and 30% for calibration.
Hardware Specification No The paper does not explicitly describe the hardware used to run its experiments, only the algorithms, datasets, and hyperparameters.
Software Dependencies No The paper mentions algorithmic details and methods like "robust linear regression, equipped with Huber loss", "gradient descent", and "SGD", but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup Yes In this simulation, we set n = m = 100, α = 0.1 and generated synthetic data using $X_i \overset{\text{i.i.d.}}{\sim} N(0, \tfrac{1}{d}\Sigma)$ with d = 100, where $\Sigma_{i,j} = \rho^{|i-j|}$ (i.e., AR(1)). In particular, we chose ρ = 0.5 in this experiment. For the response variable we set $Y_i = \mu(X_i; \beta) + \epsilon_i$, where $\epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, 1)$. We considered two models for $\mu(\cdot; \beta)$: linear $\mu(x; \beta) = \sum_{j=1}^{d} \beta_j x_j$ and nonlinear $\mu(x; \beta) = \sum_{j=1}^{d} \beta_j e^{x_j/10}$. In both models, set $\beta_j \propto (1 - j/d)^5$ for $j \in [d]$, and normalize: $\|\beta\|_2^2 = d$. To fit the model, we used robust linear regression, equipped with Huber loss: $\ell(y, f_\theta(x)) = \begin{cases} \tfrac{1}{2}(y - f_\theta(x))^2, & \text{if } |y - f_\theta(x)| \le \epsilon, \\ \epsilon\,|y - f_\theta(x)| - \tfrac{1}{2}\epsilon^2, & \text{if } |y - f_\theta(x)| > \epsilon, \end{cases}$ where $f_\theta(x) = x^T\theta$ and we set ϵ = 1 throughout. We used absolute residual as non-conformity scores. In RLM, we set $\Omega(\theta) = \|\theta\|^2$ and solved it using gradient descent (Diamond & Boyd, 2016). Throughout, we ran SGD for R = 15 epochs for all methods, except R = 5 for the very slow FullCP. For both RLM and SGD, we set the learning rate to be η = 0.001. To further evaluate the performance of LOO-StabCP with non-convex learning methods, we conducted experiments with a neural network of a single hidden layer of 20 nodes and a sigmoid activation function. We set η = 0.001 and R = 30.
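
The synthetic setup quoted in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the paper does not name its software stack, so NumPy is an assumption here, and the function names `make_beta`, `ar1_cov`, `generate_data`, and `huber_loss` are hypothetical. It covers only the data-generating process and the Huber loss, not the conformal procedures themselves.

```python
import numpy as np


def make_beta(d):
    """Coefficients beta_j proportional to (1 - j/d)^5, rescaled so ||beta||_2^2 = d."""
    j = np.arange(1, d + 1)
    raw = (1.0 - j / d) ** 5
    return raw * np.sqrt(d) / np.linalg.norm(raw)


def ar1_cov(d, rho):
    """AR(1) covariance matrix with entries rho^|i - j|."""
    idx = np.arange(d)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def generate_data(n, d=100, rho=0.5, model="linear", rng=None):
    """Draw n pairs (X_i, Y_i) with X_i ~ N(0, Sigma / d) and Y_i = mu(X_i; beta) + N(0, 1) noise."""
    rng = np.random.default_rng(rng)  # accepts None, an int seed, or a Generator
    beta = make_beta(d)
    X = rng.multivariate_normal(np.zeros(d), ar1_cov(d, rho) / d, size=n)
    if model == "linear":
        mean = X @ beta                    # sum_j beta_j * x_j
    elif model == "nonlinear":
        mean = np.exp(X / 10.0) @ beta     # sum_j beta_j * exp(x_j / 10)
    else:
        raise ValueError("model must be 'linear' or 'nonlinear'")
    y = mean + rng.standard_normal(n)
    return X, y, beta


def huber_loss(y, pred, eps=1.0):
    """Elementwise Huber loss: quadratic for |residual| <= eps, linear beyond it."""
    r = np.abs(y - pred)
    return np.where(r <= eps, 0.5 * r**2, eps * r - 0.5 * eps**2)
```

For example, `generate_data(200, rng=0)` returns 200 draws, which could then be divided into n = 100 training points and m = 100 test points. The absolute residual `np.abs(y - X @ theta)` for a fitted `theta` would play the role of the non-conformity score mentioned above.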