reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Risk-controlling Prediction with Distributionally Robust Optimization

Authors: Franck Iutzeler, Adrien Mazoyer

TMLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In Section 4, we illustrate its encouraging performance in practice. In this section, we illustrate the results of the paper and the applicability of WDRO-based Risk Controlling Prediction Sets. Our goal is thus to show how the WDRO-based bounds behave compared to classical approaches for RCPS and how they can encompass distribution shifts and simultaneous training and conformal prediction, rather than performing a complete performance evaluation. The code used for these experiments is available at https://github.com/iutzeler/rcps-wdro. We use 1000 samples generated from the make_regression function of scikit-learn: 500 are used to train a linear prediction model (using scikit-learn's Linear Regression estimator), 250 are used for calibration (which is n in the notation of the paper), 250 are used for evaluation. In Fig. 2, we display the values of the considered upper-bounds as a function of the prediction size λ for one realization of the experiment. In Fig. 3, we repeat our experiment 100 times and report boxplots of the coverages (i.e., one minus the risk) evaluated on the test data as well as prediction interval sizes.
Researcher Affiliation	Academia	Franck Iutzeler, Institut de Mathématiques de Toulouse, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France; Adrien Mazoyer, Institut de Mathématiques de Toulouse, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France
Pseudocode	No	The paper describes methods and theoretical results but does not contain any structured pseudocode or algorithm blocks.
Open Source Code	Yes	The code used for these experiments is available at https://github.com/iutzeler/rcps-wdro.
Open Datasets	Yes	We use 1000 samples generated from the make_regression function of scikit-learn: n_features=5, n_informative=3, noise=20, leading to a problem in dimension d = 5 with a fair amount of noise.
Dataset Splits	Yes	We use 1000 samples generated from the make_regression function of scikit-learn: 500 are used to train a linear prediction model (using scikit-learn's Linear Regression estimator), 250 are used for calibration (which is n in the notation of the paper), 250 are used for evaluation.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or cloud instance types) used for running the experiments.
Software Dependencies	No	The paper mentions 'scikit-learn' and 'skwdro' but does not provide specific version numbers for these software components.
Experiment Setup	Yes	In all the section, unless otherwise specified, we take α = 0.1 and δ = 0.05. For the WDRO-based approaches, we take ρ = c/ n with c = 2 for SKWDRO and c = 10 2 for Simple WDRO. We use the options n_features=5, n_informative=3, noise=20 leading to a problem in dimension d = 5 with a fair amount of noise. For a fixed λ, we perform a degree-4 polynomial regression of Y from X and compare two approaches for training and conformal inference using the SKWDRO bound.
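
For readers who want to check the data setup quoted in the table, the following is a minimal sketch of it, not the authors' code (which lives in the repository linked above). It generates the 1000 samples with the quoted make_regression options, applies the 500/250/250 split, fits a linear model, and calibrates a prediction size λ at α = 0.1 and δ = 0.05. Several pieces are assumptions made here for illustration: the random_state seed, the symmetric interval [prediction − λ, prediction + λ], the grid over λ, and the use of a Hoeffding upper bound as the "classical" RCPS bound. The WDRO-based bounds of the paper are not reproduced.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

ALPHA, DELTA = 0.1, 0.05  # risk level and confidence level quoted in the table

# 1000 synthetic samples with the options quoted in the table.
# random_state is an assumption, added only so the sketch is repeatable.
X, y = make_regression(n_samples=1000, n_features=5, n_informative=3,
                       noise=20, random_state=0)

# 500 training / 250 calibration / 250 evaluation samples.
X_train, y_train = X[:500], y[:500]
X_cal, y_cal = X[500:750], y[500:750]
X_test, y_test = X[750:], y[750:]

model = LinearRegression().fit(X_train, y_train)


def miscoverage(X_part, y_part, lam):
    """Fraction of points outside [prediction - lam, prediction + lam] (assumed interval form)."""
    return float(np.mean(np.abs(y_part - model.predict(X_part)) > lam))


def hoeffding_bound(lam):
    """Empirical calibration risk plus a Hoeffding term (assumed classical bound, valid for losses in [0, 1])."""
    n = len(y_cal)
    return miscoverage(X_cal, y_cal, lam) + np.sqrt(np.log(1.0 / DELTA) / (2.0 * n))


# The miscoverage is non-increasing in lam, so the first grid point whose
# bound falls below ALPHA is the smallest admissible prediction size on the grid.
cal_residuals = np.abs(y_cal - model.predict(X_cal))
grid = np.linspace(0.0, cal_residuals.max(), 500)
lambda_hat = next(lam for lam in grid if hoeffding_bound(lam) <= ALPHA)

test_coverage = 1.0 - miscoverage(X_test, y_test, lambda_hat)
print(f"lambda_hat = {lambda_hat:.2f}, test coverage = {test_coverage:.3f}")
```

The test coverage printed at the end corresponds to the quantity described for Fig. 3 (one minus the risk on the evaluation data), here for a single realization rather than 100 repetitions.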