Linear cost and exponentially convergent approximation of Gaussian Matérn processes on intervals

Authors: David Bolin, Vaibhav Mehandiratta, Alexandre B. Simas

JMLR 2025

Reproducibility assessment (each entry lists the variable, its result, and the supporting LLM response):
Research Type: Experimental. "Besides theoretical justifications, we demonstrate accuracy empirically through carefully designed simulation studies, which show that the method outperforms state-of-the-art alternatives in accuracy for fixed computational cost in tasks like Gaussian process regression."
Researcher Affiliation: Academia. David Bolin, Vaibhav Mehandiratta, and Alexandre B. Simas: CEMSE Division, Statistics Program, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia.
Pseudocode: No. The paper describes methods and propositions but does not present any structured pseudocode or algorithm blocks.
Open Source Code: Yes. "The proposed method is implemented in the R package rSPDE (Bolin and Simas, 2023), available on CRAN, and all code for the comparisons, as well as a Shiny application with further results, can be found at https://github.com/vpnsctl/MarkovApproxMatern/."
Open Datasets: No. "We first consider the case of a dataset with 5000 observations, y = [y_1, y_2, ..., y_n], with the observation locations being evenly spaced over the interval I = [0, 50]. Each observation is generated as y_i | u(·) ~ N(u(t_i), σ_e²), where t_i are the observation locations, and u is a centered Gaussian process with the Matérn covariance function given by (1)." This indicates that the data was synthetically generated for the study, not obtained from an existing public dataset.
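To make the quoted data-generating mechanism concrete, the following is a minimal NumPy/SciPy sketch of the same construction (it is not the paper's rSPDE-based implementation; the function names and the reduced sample size n = 200 are illustrative assumptions):

```python
import numpy as np
from scipy.special import kv, gamma

def matern_cov(h, sigma=1.0, kappa=1.0, nu=1.0):
    """Matérn covariance: C(h) = sigma^2 * 2^(1-nu)/Gamma(nu) * (kappa*h)^nu * K_nu(kappa*h)."""
    h = np.asarray(h, dtype=float)
    c = np.full(h.shape, sigma**2)                 # C(0) = sigma^2
    pos = h > 0
    hk = kappa * h[pos]
    c[pos] = sigma**2 * 2.0**(1 - nu) / gamma(nu) * hk**nu * kv(nu, hk)
    return c

rng = np.random.default_rng(0)
n, sigma_e = 200, 0.1                              # illustrative n (the paper uses 5000)
t = np.linspace(0.0, 50.0, n)                      # evenly spaced locations on I = [0, 50]
C = matern_cov(np.abs(t[:, None] - t[None, :]), nu=1.5, kappa=np.sqrt(2 * 1.5))
C += 1e-8 * np.eye(n)                              # jitter for numerical stability
u = rng.multivariate_normal(np.zeros(n), C)        # centered Gaussian Matérn process
y = u + sigma_e * rng.normal(size=n)               # y_i | u ~ N(u(t_i), sigma_e^2)
```

Sampling via a dense covariance matrix is O(n³) and is exactly the cost the paper's linear-cost approximation avoids; the sketch only illustrates the target model.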
Dataset Splits: Yes. "We first consider the case of a dataset with 5000 observations, y = [y_1, y_2, ..., y_n], with the observation locations being evenly spaced over the interval I = [0, 50]. ... The goal is to compute the posterior mean µ_{u|y} with elements (µ_{u|y})_j = E(u(p_j) | y), and the posterior standard deviation σ_{u|y} with elements (σ_{u|y})_j = √Var(u(p_j) | y), where p_j are the prediction locations. To make the comparison simple, we initially choose p_j = t_j evenly spaced in the interval. ... In Scenario 1, we consider a forecasting setting where we construct a regular mesh of 1501 points in the interval [0, 15]. The first 1001 points, evenly spaced in [0, 10], serve as observation locations. For prediction, we use all observation locations plus the next n ∈ {0, 1, ..., 10, 20, 30, 40, 50, 75, 100, 125, ..., 500} consecutive points from the mesh as prediction locations. In Scenario 2, we instead use 250 observation locations randomly sampled uniformly in the interval [0, 10], with a minimum spacing of 10^-3 between locations ensured through resampling if needed, to ensure stability for nnGP. For an increasing sequence of values n between 0 and 3000, we consider n evenly spaced prediction locations in the interval."
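The posterior quantities named in the quote follow from the standard Gaussian conditioning formulas: µ_{u|y} = C_po (C_oo + σ_e² I)^{-1} y and σ²_{u|y} = diag(C_pp) - diag(C_po (C_oo + σ_e² I)^{-1} C_op). A self-contained sketch of this computation in a Scenario-1-style forecasting layout (assumptions: an exponential kernel, i.e. Matérn with ν = 1/2, and reduced grid sizes for brevity; this is a dense O(n³) baseline, not any of the compared linear-cost methods):

```python
import numpy as np

def exp_cov(s, t, kappa=1.0):
    """Matérn covariance with nu = 1/2 (exponential kernel): C(h) = exp(-kappa*|h|)."""
    return np.exp(-kappa * np.abs(s[:, None] - t[None, :]))

rng = np.random.default_rng(1)
t_obs = np.linspace(0.0, 10.0, 101)            # observation locations in [0, 10]
t_pred = np.linspace(0.0, 12.0, 121)           # prediction locations extending past the data
sigma_e = 0.1

C_oo = exp_cov(t_obs, t_obs)
u_obs = rng.multivariate_normal(np.zeros(len(t_obs)), C_oo)
y = u_obs + sigma_e * rng.normal(size=len(t_obs))

K = C_oo + sigma_e**2 * np.eye(len(t_obs))     # covariance of the noisy observations
C_po = exp_cov(t_pred, t_obs)                  # cross-covariance, prediction vs. observation
mu_post = C_po @ np.linalg.solve(K, y)         # (mu_{u|y})_j = E(u(p_j) | y)
V = np.linalg.solve(K, C_po.T)
var_post = exp_cov(t_pred, t_pred).diagonal() - np.einsum('ij,ji->i', C_po, V)
sd_post = np.sqrt(np.maximum(var_post, 0.0))   # (sigma_{u|y})_j = sqrt(Var(u(p_j) | y))
```

As expected for forecasting, the posterior standard deviation grows as the prediction locations move beyond the last observation at t = 10.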
Hardware Specification: Yes. "The results were obtained using a MacBook Pro laptop with an M3 Max processor and 128 GB of memory, without using any parallel computations to make the comparison as fair as possible."
Software Dependencies: Yes. "The proposed method is implemented in the R package rSPDE (Bolin and Simas, 2023), available on CRAN, and all code for the comparisons, as well as a Shiny application with further results, can be found at https://github.com/vpnsctl/MarkovApproxMatern/." The paper also mentions "R-INLA (Lindgren and Rue, 2015)" and "inlabru (Bachl et al., 2019)".
Experiment Setup: Yes. "We set σ = 1 (as it is merely a scaling parameter) and explore two different noise levels: σ_e = 0.1 and 0.1. Additionally, we vary the smoothness parameter ν over the interval (0, 2.5). For each value of ν, we choose κ = √(2ν), ensuring that the practical correlation range, ρ = √(8ν)/κ, remains fixed at 2. ... The total prediction times were averaged over 100 samples to obtain the calibrations."
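The parameterization in the quote is internally consistent: substituting κ = √(2ν) into the practical-range formula gives ρ = √(8ν)/√(2ν) = √4 = 2 for every ν. A short check (the grid of ν values is illustrative):

```python
import numpy as np

nus = np.linspace(0.1, 2.4, 24)        # smoothness values inside (0, 2.5)
kappas = np.sqrt(2.0 * nus)            # kappa = sqrt(2*nu), as in the setup
rhos = np.sqrt(8.0 * nus) / kappas     # practical range rho = sqrt(8*nu)/kappa
# rho = sqrt(8*nu) / sqrt(2*nu) = sqrt(4) = 2 for all nu
```

This choice decouples the smoothness sweep from the correlation range, so accuracy differences across ν are not confounded by a changing range.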