Linear cost and exponentially convergent approximation of Gaussian Matérn processes on intervals

Authors: David Bolin, Vaibhav Mehandiratta, Alexandre B. Simas

JMLR 2025

Reproducibility assessment (each entry lists the variable, its result, and the supporting LLM response):
Research Type: Experimental. "Besides theoretical justifications, we demonstrate accuracy empirically through carefully designed simulation studies, which show that the method outperforms state-of-the-art alternatives in accuracy for fixed computational cost in tasks like Gaussian process regression."
Researcher Affiliation: Academia. David Bolin, Vaibhav Mehandiratta, and Alexandre B. Simas: CEMSE Division, Statistics Program, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia.
Pseudocode: No. The paper describes methods and propositions but does not present any structured pseudocode or algorithm blocks.
Open Source Code: Yes. "The proposed method is implemented in the R package rSPDE (Bolin and Simas, 2023), available on CRAN, and all code for the comparisons, as well as a Shiny application with further results, can be found at https://github.com/vpnsctl/MarkovApproxMatern/."
Open Datasets: No. "We first consider the case of a dataset with 5000 observations, y = [y_1, y_2, ..., y_n], with the observation locations being evenly spaced over the interval I = [0, 50]. Each observation is generated as y_i | u(·) ~ N(u(t_i), σ_e²), where t_i are the observation locations, and u is a centered Gaussian process with the Matérn covariance function given by (1)." This indicates that the data was synthetically generated for the study, not obtained from an existing public dataset.
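To make the quoted data-generating mechanism concrete, the following is a minimal NumPy/SciPy sketch of the same construction (it is not the paper's rSPDE-based implementation; the function names and the reduced sample size n = 200 are illustrative assumptions):

```python
import numpy as np
from scipy.special import kv, gamma

def matern_cov(h, sigma=1.0, kappa=1.0, nu=1.0):
    """Matérn covariance: C(h) = sigma^2 * 2^(1-nu)/Gamma(nu) * (kappa*h)^nu * K_nu(kappa*h)."""
    h = np.asarray(h, dtype=float)
    c = np.full(h.shape, sigma**2)                 # C(0) = sigma^2
    pos = h > 0
    hk = kappa * h[pos]
    c[pos] = sigma**2 * 2.0**(1 - nu) / gamma(nu) * hk**nu * kv(nu, hk)
    return c

rng = np.random.default_rng(0)
n, sigma_e = 200, 0.1                              # illustrative n (the paper uses 5000)
t = np.linspace(0.0, 50.0, n)                      # evenly spaced locations on I = [0, 50]
C = matern_cov(np.abs(t[:, None] - t[None, :]), nu=1.5, kappa=np.sqrt(2 * 1.5))
C += 1e-8 * np.eye(n)                              # jitter for numerical stability
u = rng.multivariate_normal(np.zeros(n), C)        # centered Gaussian Matérn process
y = u + sigma_e * rng.normal(size=n)               # y_i | u ~ N(u(t_i), sigma_e^2)
```

Sampling via a dense covariance matrix is O(n³) and is exactly the cost the paper's linear-cost approximation avoids; the sketch only illustrates the target model.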
Dataset Splits: Yes. "We first consider the case of a dataset with 5000 observations, y = [y_1, y_2, ..., y_n], with the observation locations being evenly spaced over the interval I = [0, 50]. ... The goal is to compute the posterior mean µ_{u|y} with elements (µ_{u|y})_j = E(u(p_j) | y), and the posterior standard deviation σ_{u|y} with elements (σ_{u|y})_j = √Var(u(p_j) | y), where p_j are the prediction locations. To make the comparison simple, we initially choose p_j = t_j evenly spaced in the interval. ... In Scenario 1, we consider a forecasting setting where we construct a regular mesh of 1501 points in the interval [0, 15]. The first 1001 points, evenly spaced in [0, 10], serve as observation locations. For prediction, we use all observation locations plus the next n ∈ {0, 1, ..., 10, 20, 30, 40, 50, 75, 100, 125, ..., 500} consecutive points from the mesh as prediction locations. In Scenario 2, we instead use 250 observation locations randomly sampled uniformly in the interval [0, 10], with a minimum spacing of 10^-3 between locations ensured through resampling if needed, to ensure stability for nnGP. For an increasing sequence of values n between 0 and 3000, we consider n evenly spaced prediction locations in the interval."
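The posterior quantities named in the quote follow from the standard Gaussian conditioning formulas: µ_{u|y} = C_po (C_oo + σ_e² I)^{-1} y and σ²_{u|y} = diag(C_pp) - diag(C_po (C_oo + σ_e² I)^{-1} C_op). A self-contained sketch of this computation in a Scenario-1-style forecasting layout (assumptions: an exponential kernel, i.e. Matérn with ν = 1/2, and reduced grid sizes for brevity; this is a dense O(n³) baseline, not any of the compared linear-cost methods):

```python
import numpy as np

def exp_cov(s, t, kappa=1.0):
    """Matérn covariance with nu = 1/2 (exponential kernel): C(h) = exp(-kappa*|h|)."""
    return np.exp(-kappa * np.abs(s[:, None] - t[None, :]))

rng = np.random.default_rng(1)
t_obs = np.linspace(0.0, 10.0, 101)            # observation locations in [0, 10]
t_pred = np.linspace(0.0, 12.0, 121)           # prediction locations extending past the data
sigma_e = 0.1

C_oo = exp_cov(t_obs, t_obs)
u_obs = rng.multivariate_normal(np.zeros(len(t_obs)), C_oo)
y = u_obs + sigma_e * rng.normal(size=len(t_obs))

K = C_oo + sigma_e**2 * np.eye(len(t_obs))     # covariance of the noisy observations
C_po = exp_cov(t_pred, t_obs)                  # cross-covariance, prediction vs. observation
mu_post = C_po @ np.linalg.solve(K, y)         # (mu_{u|y})_j = E(u(p_j) | y)
V = np.linalg.solve(K, C_po.T)
var_post = exp_cov(t_pred, t_pred).diagonal() - np.einsum('ij,ji->i', C_po, V)
sd_post = np.sqrt(np.maximum(var_post, 0.0))   # (sigma_{u|y})_j = sqrt(Var(u(p_j) | y))
```

As expected for forecasting, the posterior standard deviation grows as the prediction locations move beyond the last observation at t = 10.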
Hardware Specification: Yes. "The results were obtained using a MacBook Pro laptop with an M3 Max processor and 128 GB of memory, without using any parallel computations to make the comparison as fair as possible."
Software Dependencies: Yes. "The proposed method is implemented in the R package rSPDE (Bolin and Simas, 2023), available on CRAN, and all code for the comparisons, as well as a Shiny application with further results, can be found at https://github.com/vpnsctl/MarkovApproxMatern/." The paper also mentions "R-INLA (Lindgren and Rue, 2015)" and "inlabru (Bachl et al., 2019)".
Experiment Setup: Yes. "We set σ = 1 (as it is merely a scaling parameter) and explore two different noise levels: σ_e = 0.1 and 0.1. Additionally, we vary the smoothness parameter ν over the interval (0, 2.5). For each value of ν, we choose κ = √(2ν), ensuring that the practical correlation range, ρ = √(8ν)/κ, remains fixed at 2. ... The total prediction times were averaged over 100 samples to obtain the calibrations."
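The parameterization in the quote is internally consistent: substituting κ = √(2ν) into the practical-range formula gives ρ = √(8ν)/√(2ν) = √4 = 2 for every ν. A short check (the grid of ν values is illustrative):

```python
import numpy as np

nus = np.linspace(0.1, 2.4, 24)        # smoothness values inside (0, 2.5)
kappas = np.sqrt(2.0 * nus)            # kappa = sqrt(2*nu), as in the setup
rhos = np.sqrt(8.0 * nus) / kappas     # practical range rho = sqrt(8*nu)/kappa
# rho = sqrt(8*nu) / sqrt(2*nu) = sqrt(4) = 2 for all nu
```

This choice decouples the smoothness sweep from the correlation range, so accuracy differences across ν are not confounded by a changing range.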