Generic Inference in Latent Gaussian Process Models

Authors: Edwin V. Bonilla, Karl Krauth, Amir Dezfouli

JMLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our approach quantitatively and qualitatively with experiments on small datasets, medium-scale datasets and large datasets, showing its competitiveness under different likelihood models and sparsity levels. On the large-scale experiments involving prediction of airline delays and classification of handwritten digits, we show that our method is on par with the state-of-the-art hard-coded approaches for scalable GP regression and classification."
Researcher Affiliation | Academia | Edwin V. Bonilla (EMAIL), Machine Learning Research Group, CSIRO's Data61, Sydney NSW 2015, Australia; Karl Krauth (EMAIL), Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA 94720-1776, USA; Amir Dezfouli (EMAIL), Machine Learning Research Group, CSIRO's Data61, Sydney NSW 2015, Australia
Pseudocode | No | The paper describes the inference method through detailed mathematical derivations and explanations, but it does not include any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code | Yes | "We have implemented our SAVIGP method in Python and all the code is publicly available at https://github.com/Karl-Krauth/Sparse-GP."
Open Datasets | Yes | The datasets are summarized in Table 9.3 and are the same as those used by Nguyen and Bonilla (2014a). For example, "boston" and "abalone" refer to Bache and Lichman (2013), which points to the "UCI Machine Learning Repository, 2013. URL http://archive.ics.uci.edu/ml". The "mnist" dataset, "mnist8m (Loosli et al., 2007)", and the "sarcos dataset (Vijayakumar and Schaal, 2000)" are also cited.
Dataset Splits | Yes | The datasets are summarized in Table 9.3, including Ntrain and Ntest for each. For the airline delay prediction, "we selected the first 700,000 data points starting at a given offset as the training set and the next 100,000 data points as the test set. We generated five training/test sets by setting the initial offset to 0 and increasing it by 200,000 each time."
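The airline split scheme quoted above is easy to misread, so here is a minimal sketch of it as code. The helper name and defaults are hypothetical (the paper's actual preprocessing code may differ); only the numbers 700,000 / 100,000 / 200,000 and the five splits come from the quoted text.

```python
# Hypothetical helper reproducing the split scheme described in the quote:
# five train/test sets, each taking 700,000 training rows starting at an
# offset and the next 100,000 rows as the test set, with the offset
# advancing by 200,000 between splits.

def airline_splits(n_train=700_000, n_test=100_000, step=200_000, n_splits=5):
    """Return a list of (train_range, test_range) half-open index pairs."""
    splits = []
    for i in range(n_splits):
        offset = i * step
        train = (offset, offset + n_train)
        test = (offset + n_train, offset + n_train + n_test)
        splits.append((train, test))
    return splits

splits = airline_splits()
# First split: train on rows [0, 700000), test on rows [700000, 800000).
```

Note that consecutive splits overlap heavily (each offset moves by only 200,000 rows), so the five test sets are not disjoint from later training sets.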
Hardware Specification | Yes | "Most of our experiments were either run on g2.2 AWS instances, or on a desktop machine with an Intel Core i5-4460 CPU, 8GB of RAM, and a GTX 760 GPU."
Software Dependencies | No | "We have implemented our SAVIGP method in Python... an implementation of SAVIGP that uses Theano (Al-Rfou et al., 2016), a library that allows users to define symbolic mathematical expressions that get compiled to highly optimized GPU CUDA code." The text names the software used but gives no specific version numbers for Theano or CUDA.
Experiment Setup | Yes | For optimization in the batch setting, each set of parameters was optimized using L-BFGS, with the maximum number of global iterations limited to 200. For stochastic optimization, the Adadelta method (Zeiler, 2012) was used with parameters ϵ = 10^-6 and a decay rate of 0.95. SAVIGP was trained on the mnist8m dataset by optimizing only the variational parameters stochastically, with a batch size of 1000 and 2000 inducing points. Prior mean depths of 200m, 500m, 1600m and 2200m and prior mean velocities of 1950m/s, 2300m/s, 2750m/s and 3650m/s were used; the corresponding standard deviations for the depths were set to 15% of the layer mean, and for the velocities to 10% of the layer mean. A squared exponential covariance function with unit length-scale was used.
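For reference, the Adadelta update rule with the quoted hyperparameters (ϵ = 10^-6, decay rate ρ = 0.95) can be sketched as below. This is an illustrative re-implementation of Zeiler (2012), not the paper's code; the class and variable names are my own.

```python
import numpy as np

class Adadelta:
    """Minimal Adadelta sketch (Zeiler, 2012): per-parameter step sizes
    derived from decaying averages of squared gradients and squared updates.
    Defaults match the values quoted above: rho = 0.95, eps = 1e-6."""

    def __init__(self, rho=0.95, eps=1e-6):
        self.rho, self.eps = rho, eps
        self.acc_grad = None   # running average of squared gradients, E[g^2]
        self.acc_step = None   # running average of squared updates, E[dx^2]

    def step(self, params, grad):
        if self.acc_grad is None:
            self.acc_grad = np.zeros_like(params)
            self.acc_step = np.zeros_like(params)
        # Accumulate gradient: E[g^2] <- rho * E[g^2] + (1 - rho) * g^2
        self.acc_grad = self.rho * self.acc_grad + (1 - self.rho) * grad ** 2
        # Update: dx = -sqrt(E[dx^2] + eps) / sqrt(E[g^2] + eps) * g
        delta = -np.sqrt(self.acc_step + self.eps) \
                / np.sqrt(self.acc_grad + self.eps) * grad
        # Accumulate update: E[dx^2] <- rho * E[dx^2] + (1 - rho) * dx^2
        self.acc_step = self.rho * self.acc_step + (1 - self.rho) * delta ** 2
        return params + delta

# Toy usage: minimize f(x) = x^2 (gradient 2x) for a few iterations.
opt = Adadelta()
x = np.array([1.0])
for _ in range(100):
    x = opt.step(x, 2 * x)
```

Note the absence of a global learning rate: Adadelta's per-parameter step size adapts from the two accumulators, which is convenient when optimizing heterogeneous variational parameters in batches, as in the stochastic setting described above.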