Linearization Turns Neural Operators into Function-Valued Gaussian Processes

Authors: Emilia Magnani, Marvin Pförtner, Tobias Weber, Philipp Hennig

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate linearized predictive uncertainty (LUNO) against sample-based approaches (Sample), which require additional approximations to impose a Gaussian process structure over the output space. ... We evaluate the predictive uncertainty using standard metrics: the expected root mean squared error (RMSE) of the mean predictions, the expected marginal χ² statistic, and the expected marginal negative log-likelihood (NLL) over 250 test input-output pairs. ... We demonstrate the capabilities of the framework in a case study on Fourier neural operators.
Researcher Affiliation Academia Tübingen AI Center, University of Tübingen, Tübingen, Germany. Correspondence to: Emilia Magnani <EMAIL>.
Pseudocode No The paper describes the methodology and algorithms using mathematical formulations and textual descriptions, but it does not include any explicit 'Pseudocode' or 'Algorithm' blocks.
Open Source Code No Code. We provide an efficient implementation of the LUNO framework in JAX (Bradbury et al., 2018) at /MethodsOfMachineLearning/luno. The code for our experiments can be found at /2bys/luno-experiments.
Open Datasets Yes To evaluate the performance of the uncertainty quantification methods discussed, we utilize the code in APEBench for generating data from the Burgers, Hyper-Diffusion, and Kuramoto-Sivashinsky (conservative) equations (cf. Koehler et al. (2024) for more details). Table 3 summarizes the characteristics of the datasets we use, the number of trajectories for training and testing, as well as the spatial and temporal resolutions.
Dataset Splits Yes Table 3: Summary of PDE datasets generated using APEBench.
PDE Name | Dim. | Training Traj. | Valid. Traj. | Test Traj. | Spatial Res. | Temp. Res.
Burgers | 1D | 25 | 250 | 250 | 256 | 59
Hyper Diffusion | 1D | 25 | 250 | 250 | 256 | 59
Kuramoto-Sivashinsky (cons.) | 1D | 25 | 250 | 250 | 256 | 59
Hardware Specification No The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models.
Software Dependencies No All training implementations rely on jax (Bradbury et al., 2018), Flax NNX and optax.
Experiment Setup Yes For all experiments, we consider the original Fourier neural operator architecture (Li et al., 2021) with the hyperparameter suggestions of Koehler et al. (2024), i.e., 12 modes (per spatial dimension) and 18 hidden dimensions held constant throughout the network, with a total of 4 Fourier blocks. ... Networks for the low-data experiment are trained for 100 epochs; all remaining networks are trained for 1000 epochs, where one epoch corresponds to iterating through a single input-output pair per trajectory in the training set. During training, the mean squared error loss was minimized using AdamW (Loshchilov & Hutter, 2019) combined with a cosine-decay learning-rate scheduler with warmup.
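The evaluation metrics quoted under Research Type (expected RMSE of the mean predictions, expected marginal χ² statistic, expected marginal NLL) all follow from a Gaussian marginal predictive at each test point. A minimal plain-Python sketch of those three quantities is given below; the paper's pipeline is in JAX, so the function name and this pure-Python form are illustrative assumptions, not the authors' code.

```python
import math

def gaussian_uq_metrics(y_true, mean, var):
    """Marginal Gaussian UQ metrics over paired predictions (illustrative sketch).

    Given per-point predictive means and variances, computes:
      - RMSE of the mean predictions,
      - mean marginal chi-squared statistic (~1 if variances are calibrated),
      - mean marginal Gaussian negative log-likelihood.
    """
    n = len(y_true)
    sq_err = [(y - m) ** 2 for y, m in zip(y_true, mean)]
    rmse = math.sqrt(sum(sq_err) / n)
    # Squared standardized residuals, averaged over test points.
    chi2 = sum(e / v for e, v in zip(sq_err, var)) / n
    # Per-point Gaussian NLL, averaged over test points.
    nll = sum(0.5 * (math.log(2 * math.pi * v) + e / v)
              for e, v in zip(sq_err, var)) / n
    return rmse, chi2, nll
```

In the paper's setup these averages run over 250 test input-output pairs; here they run over whatever arrays are passed in.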
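The training recipe in the Experiment Setup row pairs AdamW with a cosine-decay learning-rate schedule with warmup. As a sketch of what that schedule computes (the authors use a JAX/optax implementation; the function name, signature, and linear-warmup choice here are assumptions for illustration):

```python
import math

def warmup_cosine_lr(step, base_lr, warmup_steps, total_steps):
    """Cosine-decay learning rate with linear warmup (illustrative sketch).

    Linearly ramps from ~0 to base_lr over warmup_steps, then decays to 0
    along a half cosine over the remaining steps.
    """
    if step < warmup_steps:
        # Linear warmup phase.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay phase, clamped so late steps stay at 0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * min(progress, 1.0)))
```

The schedule peaks at base_lr exactly when warmup ends and reaches 0 at total_steps, which matches the usual shape of warmup-plus-cosine-decay schedulers.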