Calibrated Uncertainty Quantification for Operator Learning via Conformal Prediction
Authors: Ziqi Ma, David Pitt, Kamyar Azizzadenesheli, Anima Anandkumar
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on a 2D Darcy flow and a 3D car surface pressure prediction task validate our theoretical results, demonstrating calibrated coverage and efficient uncertainty bands outperforming baseline methods. |
| Researcher Affiliation | Collaboration | Ziqi Ma EMAIL California Institute of Technology David Pitt EMAIL California Institute of Technology Kamyar Azizzadenesheli EMAIL NVIDIA Anima Anandkumar EMAIL California Institute of Technology |
| Pseudocode | Yes | Algorithm 1: Risk-Controlling Quantile Neural Operator |
| Open Source Code | Yes | Code is available at https://github.com/neuraloperator/neuraloperator/tree/main (UQNO module). |
| Open Datasets | Yes | Empirical results on a 2D Darcy flow and a 3D car surface pressure prediction task validate our theoretical results [...] This is a data-rich scenario with 5000 total training data and 421 × 421 resolution, for which we obtain the ground truth from prior work Li et al. (2021). [...] The car surface is represented as a 3D mesh of 3586 points [...] car shapes from Umetani & Bickel (2018) modified from the ShapeNet dataset Chang et al. (2015) car category. |
| Dataset Splits | No | For UQNO, we split the training set in half for training the base and the quantile model. [...] This is a data-rich scenario with 5000 total training data [...] This is a data-scarce setting with only 500 total training samples. While training-set sizes are mentioned, overall train/validation/test splits are not explicitly provided. |
| Hardware Specification | Yes | Training time is approximate GPU hours on a single RTX4090. |
| Software Dependencies | No | The paper discusses neural operator architectures and methods but does not provide specific software names with version numbers for reproducibility. |
| Experiment Setup | Yes | We fix the same Fourier Neural Operator architecture for all methods. [...] MCDropout Gal & Ghahramani (2016): which predicts uncertainty by aggregating results from multiple (we use 10) models trained with random dropout [...] Deep Ensemble Lakshminarayanan et al. (2017): which predicts uncertainty by aggregating results from an ensemble (we use 10) of models [...] For both tasks, we show a high-domain-threshold scenario (α = 0.02 for Darcy and α = 0.04 for car, α values larger for car due to its lower resolution due to the correction term t in Equation 7) and a low-domain-threshold scenario (α = 0.1 for Darcy and α = 0.12 for car). In the Darcy problem, where we have sufficient data, most methods satisfy our calibration target of 98%... |
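The calibration step behind the paper's risk-controlling quantile approach is a split-conformal procedure: held-out residuals are rescaled by a heuristic uncertainty estimate, and a finite-sample-corrected quantile of those scores yields bands with coverage at least 1 − α. The sketch below is a simplified pointwise (not function-space) version under assumed inputs: `residuals` and `quantile_pred` are hypothetical NumPy arrays standing in for the base model's calibration errors and the quantile model's predicted band widths; it is not the authors' released UQNO implementation.

```python
import numpy as np

def conformal_scale(residuals, quantile_pred, alpha):
    """Split-conformal calibration sketch.

    Finds a scale s such that |y - f(x)| <= s * q(x) holds for at
    least a ceil((n+1)(1-alpha))/n fraction of calibration points,
    which implies >= 1 - alpha coverage on exchangeable test data.
    """
    n = len(residuals)
    scores = residuals / quantile_pred  # conformity scores
    # finite-sample correction: quantile level ceil((n+1)(1-alpha))/n
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    # "higher" avoids interpolation, keeping the coverage guarantee
    return np.quantile(scores, level, method="higher")

# Illustrative usage with synthetic calibration data (hypothetical).
rng = np.random.default_rng(0)
residuals = np.abs(rng.standard_normal(500))
quantile_pred = np.full(500, 1.0)  # trivial heuristic band widths
s = conformal_scale(residuals, quantile_pred, alpha=0.1)
coverage = np.mean(residuals <= s * quantile_pred)
```

By construction, `coverage` on the calibration set is at least 1 − α; the statistical guarantee for fresh test points follows from exchangeability, as in the paper's theoretical results.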