Preconditioning Kernel Matrices

Authors: Kurt Cutajar, Michael Osborne, John Cunningham, Maurizio Filippone

ICML 2016

Reproducibility variables, extracted results, and supporting LLM responses:
Research Type: Experimental. "We evaluate datasets over a range of problem size and dimensionality. Because PCG is exact in the limit of iterations (unlike approximate techniques), we demonstrate a tradeoff between accuracy and computational effort that improves beyond state-of-the-art approximation and factorization approaches. In this section, we provide an empirical exploration of these preconditioners in a practical setting."
Researcher Affiliation: Academia. Kurt Cutajar (EURECOM, Department of Data Science); Michael A. Osborne (University of Oxford, Department of Engineering Science); John P. Cunningham (Columbia University, Department of Statistics); Maurizio Filippone (EURECOM, Department of Data Science).
Pseudocode: Yes. Algorithm 1, "The Preconditioned CG Algorithm", adapted from Golub & Van Loan (1996). Require: data X, vector v, convergence threshold ε, initial vector x0, maximum no. of iterations T.
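The row above quotes only the requirements of Algorithm 1; the algorithm body is not reproduced here. A minimal NumPy sketch of a generic preconditioned CG solve is given below for orientation. This is not the authors' R implementation: the function name `pcg` and the `M_inv` callback (applying the inverse preconditioner to a vector) are hypothetical, and the stopping rule simply compares the squared residual norm against ε².

```python
import numpy as np

def pcg(A, b, M_inv, eps2, T, x0=None):
    """Generic preconditioned conjugate gradient for SPD A, solving A x = b.

    M_inv : callable applying the inverse preconditioner M^{-1} to a vector.
    eps2  : squared-residual convergence threshold.
    T     : maximum number of iterations.
    """
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.copy()
    r = b - A @ x          # initial residual
    z = M_inv(r)           # preconditioned residual
    p = z.copy()           # initial search direction
    rz = r @ z
    for _ in range(T):
        if r @ r < eps2:   # converged: squared residual below threshold
            break
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p   # update search direction
        rz = rz_new
    return x

# Example with the identity preconditioner (plain CG) on a small SPD system.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = pcg(A, b, M_inv=lambda v: v, eps2=1e-12, T=100)
```

With `M_inv` set to the identity this reduces to ordinary CG; the paper's preconditioners would replace that callback with, e.g., a solve against a low-rank-plus-diagonal approximation of the kernel matrix.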
Open Source Code: Yes. "Code to replicate all results in this paper is available at http://github.com/mauriziofilippone/preconditioned_GPs".
Open Datasets: Yes. Regression, from the UCI repository (Asuncion & Newman, 2007): Concrete (n = 1030, d = 8), Power Plant (n = 9568, d = 4), and Protein (n = 45730, d = 9). GP classification: Spam (n = 4601, d = 57) and EEG (n = 14979, d = 14).
Dataset Splits: Yes. "All methods are initialized from the same set of kernel parameters, and the curves are averaged over 5 folds (3 for the Protein and EEG datasets)."
Hardware Specification: Yes. "For the sake of integrity, we ran each method in the comparison individually on a workstation with Intel Xeon E5-2630 CPU having 16 cores and 128GB RAM."
Software Dependencies: No. The paper states that "The CG, PCG and CHOL approaches have been implemented in R" but does not specify an R version or any versioned libraries/packages critical for reproducibility. GPstuff is mentioned as a comparison target, likewise without a version.
Experiment Setup: Yes. "The convergence threshold is set to ε² = n × 10⁻¹⁰ so as to roughly accept an average error of 10⁻⁵ on each element of the solution." The paper focuses on an isotropic RBF variant of the kernel in eq. 1, fixing the marginal variance σ² to one; the lengthscale l and noise variance λ are varied on a log₁₀ scale, the stepsize is set to one, and all methods are initialized from the same set of kernel parameters.
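The setup above can be sketched concretely. The following is a minimal NumPy illustration, not the authors' code: it builds an isotropic RBF kernel matrix with marginal variance σ² = 1 plus noise variance λ on the diagonal, and computes the threshold ε² = n × 10⁻¹⁰. The exact parameterization of the paper's eq. 1 (here, exp(−‖x − x′‖²/(2l²))) and the function name `rbf_kernel` are assumptions.

```python
import numpy as np

def rbf_kernel(X, lengthscale, noise_var, sigma2=1.0):
    # Isotropic RBF kernel matrix; assumed form: sigma2 * exp(-||x-x'||^2 / (2 l^2)),
    # with noise variance added on the diagonal.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    K = sigma2 * np.exp(-0.5 * d2 / lengthscale**2)
    return K + noise_var * np.eye(len(X))

# Toy stand-in for a UCI regression set (e.g. Power Plant has d = 4).
n, d = 50, 4
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))

K = rbf_kernel(X, lengthscale=1.0, noise_var=0.1)
eps2 = n * 1e-10   # convergence threshold from the paper: ~1e-5 average per-element error
```

The resulting `K` is the symmetric positive-definite matrix the CG/PCG solvers act on; `eps2` would be passed as the stopping threshold.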