Preconditioning Kernel Matrices
Authors: Kurt Cutajar, Michael Osborne, John Cunningham, Maurizio Filippone
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate datasets over a range of problem size and dimensionality. Because PCG is exact in the limit of iterations (unlike approximate techniques), we demonstrate a tradeoff between accuracy and computational effort that improves beyond state-of-the-art approximation and factorization approaches. In this section, we provide an empirical exploration of these preconditioners in a practical setting. We begin by considering three datasets for regression from the UCI repository (Asuncion & Newman, 2007), namely the Concrete dataset (n = 1030, d = 8), the Power Plant dataset (n = 9568, d = 4), and the Protein dataset (n = 45730, d = 9). |
| Researcher Affiliation | Academia | Kurt Cutajar EMAIL EURECOM, Department of Data Science Michael A. Osborne EMAIL University of Oxford, Department of Engineering Science John P. Cunningham EMAIL Columbia University, Department of Statistics Maurizio Filippone EMAIL EURECOM, Department of Data Science |
| Pseudocode | Yes | Algorithm 1 The Preconditioned CG Algorithm, adapted from (Golub & Van Loan, 1996) Require: data X, vector v, convergence threshold ϵ, initial vector x0, maximum no. of iterations T |
| Open Source Code | Yes | Code to replicate all results in this paper is available at http://github.com/mauriziofilippone/preconditioned_GPs |
| Open Datasets | Yes | We begin by considering three datasets for regression from the UCI repository (Asuncion & Newman, 2007), namely the Concrete dataset (n = 1030, d = 8), the Power Plant dataset (n = 9568, d = 4), and the Protein dataset (n = 45730, d = 9). GP classification: Spam dataset (n = 4601, d = 57) and EEG dataset (n = 14979, d = 14). |
| Dataset Splits | Yes | All methods are initialized from the same set of kernel parameters, and the curves are averaged over 5 folds (3 for the Protein and EEG datasets). |
| Hardware Specification | Yes | For the sake of integrity, we ran each method in the comparison individually on a workstation with Intel Xeon E5-2630 CPU having 16 cores and 128GB RAM. |
| Software Dependencies | No | The paper states that "The CG, PCG and CHOL approaches have been implemented in R;" but does not specify a version for R or any specific libraries/packages with version numbers that would be critical for reproducibility. It mentions GPstuff as a comparison baseline, also without a version. |
| Experiment Setup | Yes | The convergence threshold is set to ε² = n × 10⁻¹⁰ so as to roughly accept an average error of 10⁻⁵ on each element of the solution. We focus on an isotropic RBF variant of the kernel in eq. 1, fixing the marginal variance σ² to one. We vary the lengthscale parameter l and the noise variance λ in log₁₀ scale. We set the stepsize to one. All methods are initialized from the same set of kernel parameters. |
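The setup quoted above (isotropic RBF kernel with unit marginal variance, noise variance λ, and a PCG solver stopped at ε² = n × 10⁻¹⁰) can be sketched as follows. Note this is a minimal illustration, not the authors' implementation: their code is in R (see the repository linked above) and proposes structured preconditioners such as Nyström approximations, whereas this sketch substitutes a plain Jacobi (diagonal) preconditioner; the function names and the synthetic data are ours, not the paper's.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0, noise=1e-2):
    # Isotropic RBF kernel with marginal variance fixed to one,
    # plus the noise term lambda * I on the diagonal (eq. 1 variant).
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    K = np.exp(-0.5 * d2 / lengthscale**2)
    return K + noise * np.eye(X.shape[0])

def pcg(K, v, precond_solve, eps2, max_iter=1000):
    # Preconditioned conjugate gradient for K x = v, in the standard
    # Golub & Van Loan form; precond_solve(r) applies the inverse
    # preconditioner to a residual vector.
    n = len(v)
    x = np.zeros(n)
    r = v - K @ x
    z = precond_solve(r)
    p = z.copy()
    for _ in range(max_iter):
        if r @ r < eps2:          # stop once ||r||^2 < eps^2
            break
        Kp = K @ p
        alpha = (r @ z) / (p @ Kp)
        x = x + alpha * p
        r_new = r - alpha * Kp
        z_new = precond_solve(r_new)
        beta = (r_new @ z_new) / (r @ z)
        p = z_new + beta * p
        r, z = r_new, z_new
    return x

rng = np.random.default_rng(0)
n, d = 200, 8                      # toy problem; Concrete is n=1030, d=8
X = rng.standard_normal((n, d))
K = rbf_kernel(X, lengthscale=2.0, noise=0.1)
v = rng.standard_normal(n)

eps2 = n * 1e-10                   # paper's threshold: eps^2 = n * 10^-10
diag = np.diag(K)                  # Jacobi preconditioner (illustrative only)
x = pcg(K, v, lambda r: r / diag, eps2)
print(np.max(np.abs(K @ x - v)))   # elementwise residual, roughly <= 10^-5
```

The stopping rule mirrors the quoted rationale: with ‖r‖² < n × 10⁻¹⁰, the average per-element residual is on the order of 10⁻⁵. Swapping `precond_solve` for a better preconditioner changes only the iteration count, not the fixed point, which is the exactness-in-the-limit property the paper trades off against cost.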