Conformal Prediction as Bayesian Quadrature
Authors: Jake C. Snell, Thomas L. Griffiths
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on both synthetic data and calibration data collected from MS-COCO (Lin et al., 2014). For each data setting, we randomly generate M = 10,000 data splits. Each method is used to select λ with the goal of controlling the risk such that R(θ, λ) ≤ α for unknown θ. We compare algorithms on the basis of both the relative frequency of incurring risk greater than α and the prediction set size of the chosen λ. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, Princeton University; ²Department of Psychology, Princeton University. Correspondence to: Jake C. Snell <EMAIL>. |
| Pseudocode | No | The paper describes methods in prose and does not contain structured pseudocode or algorithm blocks in the provided text. |
| Open Source Code | Yes | Code for our experiments is publicly available on GitHub. |
| Open Datasets | Yes | We also compare methods on controlling the false negative rate of multilabel classification on the MS-COCO dataset (Lin et al., 2014). |
| Dataset Splits | Yes | For each data setting, we randomly generate M = 10,000 data splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | For each data setting, we randomly generate M = 10,000 data splits. Each method is used to select λ with the goal of controlling the risk such that R(θ, λ) ≤ α for unknown θ. We set n = 10, K = 4, and α = 0.4. The Monte Carlo simulation of Dirichlet random variates uses 1000 samples. |
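
The experiment setup row mentions Monte Carlo simulation of Dirichlet random variates with 1000 samples and a risk target α = 0.4 with n = 10 calibration points. The following is a minimal sketch of how such a check might look, not the authors' implementation. It assumes (these details are not stated in the table) that the risk is bounded by a weighted sum of the sorted calibration losses plus a worst-case loss, with weights drawn from a uniform Dirichlet distribution over n + 1 cells. The function name `mc_risk_upper_quantile`, the confidence level `beta`, and the synthetic losses are hypothetical.

```python
import numpy as np


def mc_risk_upper_quantile(losses, max_loss=1.0, beta=0.95,
                           num_samples=1000, seed=0):
    """Monte Carlo quantile of a Dirichlet-weighted sum of sorted losses.

    Assumptions (illustrative, not taken from the summary table):
    weights are Dirichlet(1, ..., 1) over n + 1 cells, and the extra
    cell is paired with the worst-case loss `max_loss`.
    """
    rng = np.random.default_rng(seed)
    sorted_losses = np.sort(np.asarray(losses, dtype=float))
    # Append the worst-case loss for the unseen (n + 1)-th cell.
    support = np.append(sorted_losses, max_loss)
    # Each row is one Dirichlet draw; rows sum to 1.
    weights = rng.dirichlet(np.ones(support.size), size=num_samples)
    simulated_risk = weights @ support
    return float(np.quantile(simulated_risk, beta))


# Values n = 10 and alpha = 0.4 come from the table; the losses are synthetic.
n, alpha = 10, 0.4
calib_losses = np.random.default_rng(1).uniform(0.0, 0.5, size=n)
bound = mc_risk_upper_quantile(calib_losses)
accept = bound <= alpha  # keep this candidate λ only if the bound meets α
```

In a full procedure this check would be repeated for each candidate λ, keeping the least conservative value that still satisfies the bound.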