Wasserstein-Regularized Conformal Prediction under General Distribution Shift
Authors: Rui Xu, Chao Chen, Yue Sun, Parvathinathan Venkitasubramaniam, Sihong Xie
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on six datasets prove that WR-CP can reduce coverage gaps to 3.2% across different confidence levels and outputs prediction sets 37% smaller than the worst-case approach on average. ... Experiments were conducted on six datasets: (a) the airfoil self-noise dataset (Brooks & Marcolini, 2014); (b) Seattle-loop (Cui et al., 2019), PeMSD4, PeMSD8 (Guo et al., 2019) for traffic speed prediction; (c) Japan-Prefectures, and U.S.-States (Deng et al., 2020) for epidemic spread forecasting. |
| Researcher Affiliation | Academia | Rui Xu, Sihong Xie, The Hong Kong University of Science and Technology (Guangzhou), EMAIL, EMAIL; Chao Chen, Harbin Institute of Technology, EMAIL; Yue Sun, Parvathinathan Venkitasubramaniam, Lehigh University, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 Wasserstein-regularized Conformal Prediction (WR-CP) |
| Open Source Code | Yes | The code of our work is released on https://github.com/rxu0112/WR-CP. |
| Open Datasets | Yes | Experiments were conducted on six datasets: (a) the airfoil self-noise dataset (Brooks & Marcolini, 2014); (b) Seattle-loop (Cui et al., 2019), PeMSD4, PeMSD8 (Guo et al., 2019) for traffic speed prediction; (c) Japan-Prefectures, and U.S.-States (Deng et al., 2020) for epidemic spread forecasting. ... The airfoil self-noise dataset from the UCI Machine Learning Repository (Brooks & Marcolini, 2014). DOI: https://doi.org/10.24432/C5VW2C. |
| Dataset Splits | No | We conducted 10 sampling trials for each dataset. Within each trial, we sampled S^(i)_XY from each subset i, for i = 1, ..., k. After this step, we allocated the remaining elements within each subset for calibration and testing purposes. The parts intended for calibration across all subsets were then unified to form S^P_XY. Lastly, to create diverse testing scenarios, we generated multiple test sets by randomly mixing the parts designated for testing from each subset with replacement. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. It only mentions using an MLP model. |
| Software Dependencies | No | To find the optimized bandwidth values of P̂_X and D̂^(i)_X for i = 1, ..., k on each dataset, we applied the grid search method with a bandwidth pool using the scikit-learn package (Pedregosa et al., 2011). |
| Experiment Setup | Yes | A multi-layer perceptron (MLP) with an architecture of (input dimension, 64, 64, 1) was utilized in all experimental setups to maintain comparison fairness. ... The β values for the WR-CP method are 9, 11, 9, 10, 13, and 13, respectively. ... The β values for the WR-CP method are 4.5, 9, 9, 6, 8, and 20, respectively. ... The selected β values for the results of Figure 5 are shown in Table 2. |
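The splitting procedure quoted in the Dataset Splits row can be sketched as follows. The function name, set sizes, and the calibration fraction are illustrative assumptions; the paper specifies neither the per-subset sample counts nor the split ratios.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_trial(subsets, n_fit, cal_frac=0.5, n_test_sets=3):
    """One sampling trial, paraphrasing the paper's description:
    sample a fit set S^(i)_XY from each subset i, split the remainder
    into calibration and test parts, pool all calibration parts into
    S^P_XY, and build several test sets by mixing the per-subset test
    parts with replacement.  Sizes/fractions here are placeholders."""
    fit_sets, cal_parts, test_parts = [], [], []
    for data in subsets:
        perm = rng.permutation(len(data))
        fit_sets.append(data[perm[:n_fit]])          # S^(i)_XY
        rest = data[perm[n_fit:]]
        n_cal = int(len(rest) * cal_frac)
        cal_parts.append(rest[:n_cal])
        test_parts.append(rest[n_cal:])
    calibration = np.concatenate(cal_parts)          # unified S^P_XY
    test_sets = []
    for _ in range(n_test_sets):                     # random mixtures
        picks = rng.choice(len(test_parts), size=len(test_parts),
                           replace=True)
        test_sets.append(np.concatenate([test_parts[j] for j in picks]))
    return fit_sets, calibration, test_sets
```

Mixing test parts with replacement is what lets a single trial produce multiple test distributions, which is how the paper evaluates coverage under varying shift.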
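The bandwidth selection quoted in the Software Dependencies row maps directly onto scikit-learn's `GridSearchCV` over `KernelDensity`. The bandwidth pool and fold count below are placeholders; the paper does not report the actual grid.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

def select_bandwidth(samples, bandwidth_pool=np.logspace(-1, 1, 20)):
    """Pick the KDE bandwidth maximizing cross-validated log-likelihood,
    as in the paper's described scikit-learn grid search.  The pool and
    cv=5 are assumptions, not values from the paper."""
    grid = GridSearchCV(
        KernelDensity(kernel="gaussian"),
        {"bandwidth": bandwidth_pool},
        cv=5,  # score = held-out log-likelihood of the KDE
    )
    grid.fit(samples)
    return grid.best_params_["bandwidth"]
```

The same routine would be run once for the pooled calibration distribution P̂_X and once per subset for each D̂^(i)_X.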
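The (input dimension, 64, 64, 1) MLP from the Experiment Setup row can be sketched with scikit-learn's `MLPRegressor`; the framework, activation, and optimizer are not stated in the paper, so everything below the architecture itself is an assumption.

```python
from sklearn.neural_network import MLPRegressor

# Sketch of the (input dim, 64, 64, 1) MLP used across all experiments.
# The input layer size is inferred from the data and the single output
# unit is implied by regression; ReLU and the solver are assumptions.
model = MLPRegressor(hidden_layer_sizes=(64, 64), activation="relu",
                     max_iter=300, random_state=0)
```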