Calibrated Uncertainty Quantification for Operator Learning via Conformal Prediction

Authors: Ziqi Ma, David Pitt, Kamyar Azizzadenesheli, Anima Anandkumar

TMLR 2024

Reproducibility Variable: Result — supporting evidence (LLM response)

Research Type: Experimental
  "Empirical results on a 2D Darcy flow and a 3D car surface pressure prediction task validate our theoretical results, demonstrating calibrated coverage and efficient uncertainty bands outperforming baseline methods."

Researcher Affiliation: Collaboration
  Ziqi Ma (California Institute of Technology), David Pitt (California Institute of Technology), Kamyar Azizzadenesheli (NVIDIA), Anima Anandkumar (California Institute of Technology)

Pseudocode: Yes
  "Algorithm 1: Risk-Controlling Quantile Neural Operator"

Open Source Code: Yes
  Code is available at https://github.com/neuraloperator/neuraloperator/tree/main (UQNO module).

Open Datasets: Yes
  "Empirical results on a 2D Darcy flow and a 3D car surface pressure prediction task validate our theoretical results [...] This is a data-rich scenario with 5000 total training data and 421×421 resolution, for which we obtain the ground truth from prior work Li et al. (2021). [...] The car surface is represented as a 3D mesh of 3586 points [...] car shapes from Umetani & Bickel (2018) modified from the ShapeNet dataset Chang et al. (2015) car category."

Dataset Splits: No
  "For UQNO, we split the training set in half for training the base and the quantile model. [...] This is a data-rich scenario with 5000 total training data [...] This is a data-scarce setting with only 500 total training samples." While training-set sizes are mentioned, explicit train/validation/test splits are not provided.

Hardware Specification: Yes
  "Training time is approximate GPU hours on a single RTX 4090."

Software Dependencies: No
  The paper discusses neural operator architectures and methods but does not name specific software packages with version numbers, which limits reproducibility.

Experiment Setup: Yes
  "We fix the same Fourier Neural Operator architecture for all methods. [...] MCDropout Gal & Ghahramani (2016): which predicts uncertainty by aggregating results from multiple (we use 10) models trained with random dropout [...] Deep Ensemble Lakshminarayanan et al. (2017): which predicts uncertainty by aggregating results from an ensemble (we use 10) of models [...] For both tasks, we show a high-domain-threshold scenario (α = 0.02 for Darcy and α = 0.04 for car; α values are larger for car due to its lower resolution, via the correction term t in Equation 7) and a low-domain-threshold scenario (α = 0.1 for Darcy and α = 0.12 for car). In the Darcy problem, where we have sufficient data, most methods satisfy our calibration target of 98%..."
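The calibration step underlying split conformal prediction, which the paper's UQNO approach builds on (holding out part of the training data and targeting a coverage level such as 98%), can be sketched as follows. This is a minimal illustration of the generic split-conformal quantile computation, not the paper's Algorithm 1; the function name and toy data are invented for this example.

```python
import numpy as np

def split_conformal_quantile(scores, alpha):
    """Generic split-conformal calibration threshold.

    scores: nonconformity scores (e.g. residuals) on a held-out
            calibration set, assumed exchangeable with test data
    alpha:  target miscoverage level (e.g. alpha = 0.02 for 98% coverage)
    Returns a threshold q such that the band f(x) +/- q achieves
    at least 1 - alpha marginal coverage in expectation.
    """
    n = len(scores)
    # Finite-sample-corrected quantile level, clipped to 1.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

# Toy usage with synthetic residuals |y - f(x)| from a fitted base model.
rng = np.random.default_rng(0)
cal_scores = np.abs(rng.normal(size=1000))
q = split_conformal_quantile(cal_scores, alpha=0.1)
```

A stricter target (smaller α) yields a larger threshold and hence wider uncertainty bands, which is the coverage/efficiency trade-off the paper's experiments measure.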