Linearization Turns Neural Operators into Function-Valued Gaussian Processes
Authors: Emilia Magnani, Marvin Pförtner, Tobias Weber, Philipp Hennig
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate linearized predictive uncertainty (LUNO) against sample-based approaches (Sample), which require additional approximations to impose a Gaussian Process structure over the output space. ... We evaluate the predictive uncertainty using standard metrics: the expected root mean squared error (RMSE) of the mean predictions, the expected marginal χ² statistic, and the expected marginal negative log-likelihood (NLL) over 250 test input-output pairs. ... We demonstrate the capabilities of the framework in a case study on Fourier neural operators. |
| Researcher Affiliation | Academia | 1Tübingen AI Center, University of Tübingen, Tübingen, Germany. Correspondence to: Emilia Magnani <EMAIL>. |
| Pseudocode | No | The paper describes the methodology and algorithms using mathematical formulations and textual descriptions, but it does not include any explicit 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code. We provide an efficient implementation of the LUNO framework in JAX (Bradbury et al., 2018) at MethodsOfMachineLearning/luno. The code for our experiments can be found at 2bys/luno-experiments. |
| Open Datasets | Yes | To evaluate the performance of the uncertainty quantification methods discussed, we utilize the APEBench code for generating data from the Burgers, Hyper Diffusion, and Kuramoto-Sivashinsky (conservative) equations (cf. Koehler et al. (2024) for more details). Table 3 summarizes the characteristics of the datasets we use, the number of trajectories for training and testing, as well as the spatial and temporal resolutions. |
| Dataset Splits | Yes | Table 3: Summary of PDE datasets generated using APEBench. Burgers (1D): 25 training / 250 validation / 250 test trajectories, spatial res. 256, temporal res. 59. Hyper Diffusion (1D): 25 / 250 / 250 trajectories, spatial res. 256, temporal res. 59. Kuramoto-Sivashinsky (conservative, 1D): 25 / 250 / 250 trajectories, spatial res. 256, temporal res. 59. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | All training implementations rely on jax (Bradbury et al., 2018), Flax NNX and optax. |
| Experiment Setup | Yes | For all experiments, we consider the original Fourier neural operator architecture (Li et al., 2021) following the hyperparameter suggestions of Koehler et al. (2024), i.e. 12 modes (per spatial dimension) and 18 hidden dimensions held constant throughout the network, with a total of 4 Fourier blocks. ... Networks for the low-data experiment are trained for 100 epochs; all remaining networks are trained for 1000 epochs, where one epoch corresponds to iterating through a single input-output pair per trajectory in the training set. During training, the mean squared error loss was minimized using AdamW (Loshchilov & Hutter, 2019) combined with a cosine decay learning rate scheduler with warmup. |