Convergence of Sparse Variational Inference in Gaussian Processes Regression
Authors: David R. Burt, Carl Edward Rasmussen, Mark van der Wilk
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we investigate upper and lower bounds on how M needs to grow with N to ensure high quality approximations. We show that we can make the KL-divergence between the approximate model and the exact posterior arbitrarily small for a Gaussian-noise regression model with M ≪ N. Specifically, for the popular squared exponential kernel and D-dimensional Gaussian distributed covariates, M = O((log N)^D) suffices and a method with an overall computational cost of O(N (log N)^(2D) (log log N)^2) can be used to perform inference. ... We provide recommendations on how to select inducing variables in practice, and demonstrate empirical improvements. |
| Researcher Affiliation | Collaboration | David R. Burt (EMAIL) and Carl Edward Rasmussen (EMAIL), Department of Engineering, University of Cambridge, UK; Mark van der Wilk (EMAIL), Department of Computing, Imperial College London, UK, and Prowler.io, Cambridge, UK |
| Pseudocode | Yes | Algorithm 1: MCMC algorithm for approximately sampling from an M-DPP (Anari et al., 2016). Input: training inputs X = {x_i}_{i=1}^N, number of points to choose M, kernel k, number of MCMC steps T. Returns: an (approximate) sample from an M-DPP with kernel matrix K_ff formed by evaluating k at X. Initialize by greedily selecting M columns to maximize the determinant of the resulting submatrix; call this set of column indices Z_0. For τ = 1 to T: sample i uniformly from Z_τ and j uniformly from X \ Z_τ; define Z' = Z_τ \ {i} ∪ {j}; compute p_{i,j} := (1/2) min{1, det(K_{Z'}) / det(K_{Z_τ})}; with probability p_{i,j} set Z_{τ+1} = Z', otherwise Z_{τ+1} = Z_τ. Return: Z_T. |
| Open Source Code | Yes | We provide a GPflow-based (Matthews et al., 2017) implementation of the initialization methods and experiments that builds on other open source software (Coelho, 2017; Virtanen et al., 2020), available at https://github.com/markvdw/RobustGP. |
| Open Datasets | Yes | We consider 3 data sets from the UCI repository that are commonly used in benchmarking regression algorithms, Naval (Ntrain = 10740, Ntest = 1194, D = 14), Elevators (Ntrain = 14939, Ntest = 1660, D = 18) and Energy (Ntrain = 691, Ntest = 77, D = 8). |
| Dataset Splits | Yes | We consider 3 data sets from the UCI repository that are commonly used in benchmarking regression algorithms, Naval (Ntrain = 10740, Ntest = 1194, D = 14), Elevators (Ntrain = 14939, Ntest = 1660, D = 18) and Energy (Ntrain = 691, Ntest = 77, D = 8). |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments. |
| Software Dependencies | No | We provide a GPflow-based (Matthews et al., 2017) implementation of the initialization methods and experiments that builds on other open source software (Coelho, 2017; Virtanen et al., 2020)... For K-means, we run the Scipy implementation of K-means++ with M centres... We ran 10^4 steps of L-BFGS... using the default Scipy settings. |
| Experiment Setup | Yes | The hyperparameters are set to the optimal values for an exact GP model, or, for Naval, a sparse GP with 1000 inducing points... For all experiments, we use a squared exponential kernel with automatic relevance determination (ARD), i.e. a separate lengthscale per input dimension... We ran 10^4 steps of L-BFGS, at which point any improvement was negligible compared to adding more inducing variables. |
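The pseudocode quoted in the Pseudocode row can be sketched in Python. This is a minimal illustrative sketch, not the authors' implementation: the function name `sample_mdpp_mcmc`, the `kernel` callable interface, and the brute-force greedy initialization are all assumptions made here for clarity.

```python
import numpy as np

def sample_mdpp_mcmc(X, M, kernel, T, rng=None):
    """Approximately sample M column indices from an M-DPP via MCMC
    (in the spirit of Algorithm 1 / Anari et al., 2016).
    `kernel(A, B)` is assumed to return the kernel matrix between
    the rows of A and the rows of B."""
    rng = np.random.default_rng(rng)
    N = X.shape[0]
    K = kernel(X, X)  # N x N kernel matrix K_ff

    # Greedy initialization: repeatedly add the column that maximizes
    # the (log-)determinant of the resulting submatrix.
    Z = []
    for _ in range(M):
        best, best_logdet = None, -np.inf
        for j in range(N):
            if j in Z:
                continue
            idx = Z + [j]
            sign, logdet = np.linalg.slogdet(K[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best, best_logdet = j, logdet
        Z.append(best)

    Z = set(Z)
    for _ in range(T):
        # Propose swapping one index in Z for one outside of Z.
        i = rng.choice(sorted(Z))
        j = rng.choice(sorted(set(range(N)) - Z))
        Z_prime = (Z - {i}) | {j}
        # Acceptance probability (1/2) * min{1, det(K_Z') / det(K_Z)},
        # computed stably via log-determinants.
        _, ld_old = np.linalg.slogdet(K[np.ix_(sorted(Z), sorted(Z))])
        _, ld_new = np.linalg.slogdet(K[np.ix_(sorted(Z_prime), sorted(Z_prime))])
        p = 0.5 * min(1.0, float(np.exp(ld_new - ld_old)))
        if rng.random() < p:
            Z = Z_prime
    return sorted(Z)
```

The greedy initialization here is O(N M) determinant evaluations and is only meant to mirror the algorithm's description; the paper's cost analysis relies on more efficient implementations.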
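The Software Dependencies row mentions a K-means++ initialization of inducing points via Scipy. A minimal sketch of that idea, assuming Scipy's `kmeans2` with `minit='++'` (the function name `kmeans_inducing_points` is hypothetical and not from the paper's code):

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def kmeans_inducing_points(X, M, seed=0):
    """Initialize M inducing points as K-means centres of the training
    inputs, using K-means++ seeding. This is an illustrative sketch of
    the initialization described in the report, not the authors' code."""
    centres, _ = kmeans2(X, M, minit='++', seed=seed)
    return centres
```

The returned centres can then be passed as the initial inducing-input locations of a sparse GP before hyperparameter optimization.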