Residual Deep Gaussian Processes on Manifolds
Authors: Kacper Wyrwal, Andreas Krause, Viacheslav (Slava) Borovitskiy
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We examine residual deep GPs through synthetic and real-world experiments, demonstrating our models' superior performance over shallow geometry-aware GPs on tasks where complex data inherently lies on a manifold. 4 EXPERIMENTS We begin this section by examining how various GVF and variational family choices impact the regression performance of residual deep GPs in synthetic experiments, as discussed in Section 4.1. Throughout, we compare our models to a baseline with Euclidean hidden layers. Next, in the robotics-inspired experiments of Section 4.2, we demonstrate that residual deep GPs can significantly enhance Bayesian optimisation on a manifold when the optimised function is irregular. Following this, in Section 4.3, we show state-of-the-art predictive and uncertainty calibration performance of residual deep GPs in wind velocity modelling on the globe, achieving interpretable patterns even at low altitudes where data is more complex and irregular. Finally, in Section 4.4, we explore potential avenues for using residual deep GPs to accelerate inference for inherently Euclidean data. |
| Researcher Affiliation | Academia | Kacper Wyrwal ETH Zürich, University of Edinburgh; Andreas Krause ETH Zürich; Viacheslav Borovitskiy ETH Zürich |
| Pseudocode | No | The paper describes methods and procedures in paragraph text, but it does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, figures, or sections. |
| Open Source Code | Yes | Correspondence to EMAIL and EMAIL. Code available at https://github.com/KacperWyrwal/residual-deep-gps. |
| Open Datasets | Yes | We consider the task of interpolating the monthly average wind velocity from the ERA5 dataset (Hersbach et al., 2023), from a set of locations on the Aeolus satellite track (Reitebuch, 2012) |
| Dataset Splits | Yes | We take N ∈ {100, 200, 400, 800, 1600} training inputs x on a Fibonacci lattice on S² and put y = f(x) + ε, ε ∼ N(0, 10⁻⁴I). Then, we regress f from x and y. On this problem, we compare different modifications of the residual deep GPs amongst themselves and to a baseline, in terms of the negative log predictive density (NLPD) and the mean squared error (MSE) metrics on the test set of 5000 points, also on a Fibonacci lattice. |
| Hardware Specification | Yes | We use a single Intel i7-13700H CPU. |
| Software Dependencies | No | The paper mentions software like PYMANOPT (Townsend et al., 2016), GEOOPT (Kochurov et al., 2020), and the Adam optimiser (Kingma and Ba, 2015), but it does not provide specific version numbers for any of these components. |
| Experiment Setup | Yes | Training and evaluation We optimise all models using the Adam optimiser (Kingma and Ba, 2015) for 1000 iterations with learning rate set to 0.01. In all experiments, to approximate the ELBO in deep models during training, we use 3 samples from the posterior. In evaluation, we use 10 samples from the posterior to approximate the MSE and NLPD. For kernels of output layers we initialise the variance to σ² = 1.0, while for kernels in hidden layers of an L-layer deep GP we set σ² = 10^(−4/(L−1)) at the start of training. In Section 4.1, Section 4.4, and Section 4.3 we initialise the smoothness parameter to ν = 3/2, while in Section 4.2 we set it to ν = 5/2 to replicate the setup in Jaquier et al. (2022). |
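The synthetic regression setup quoted above (inputs on a Fibonacci lattice on S², observations y = f(x) + ε with ε ∼ N(0, 10⁻⁴I), and a 5000-point Fibonacci-lattice test set) can be sketched as follows. This is a minimal reconstruction, not the authors' code: the target function `f` is a hypothetical placeholder, since the paper's chunk quoted here does not specify it.

```python
import numpy as np

def fibonacci_lattice_sphere(n):
    """Generate n roughly uniform points on the unit sphere S^2 via a Fibonacci lattice."""
    golden = (1 + 5 ** 0.5) / 2
    i = np.arange(n)
    theta = 2 * np.pi * i / golden       # longitude angle
    z = 1 - (2 * i + 1) / n              # z-coordinate, evenly spaced in (-1, 1)
    r = np.sqrt(1 - z ** 2)              # radius of the latitude circle
    return np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=-1)

rng = np.random.default_rng(0)

# Training data: N lattice points with Gaussian noise of variance 1e-4
# (i.e. standard deviation 1e-2), as in the paper's synthetic experiment.
N = 100  # the paper sweeps N over {100, 200, 400, 800, 1600}
x_train = fibonacci_lattice_sphere(N)

f = lambda x: np.sin(3 * x[..., 2])  # hypothetical smooth target on the sphere
y_train = f(x_train) + rng.normal(scale=1e-2, size=N)

# Test data: 5000 points, also on a Fibonacci lattice.
x_test = fibonacci_lattice_sphere(5000)
y_test = f(x_test)
```

A geometry-aware GP (e.g. one built with a Matérn kernel on S²) would then be fit to `(x_train, y_train)` and scored by NLPD and MSE on the test lattice.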