High-Dimensional Gaussian Process Regression with Soft Kernel Interpolation

Authors: Chris L Camaño, Daniel Huang

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of Soft KI across various examples and show that it is competitive with other approximated GP methods when the data dimensionality is modest (around 10). We evaluate Soft KI on a variety of datasets from the UCI repository (Kelly et al., 2017) and demonstrate that it achieves test root mean square error (RMSE) comparable to other inducing point methods for datasets with moderate dimensionality (approximately d = 10) (Section 4.1).
Researcher Affiliation | Collaboration | Chris Camaño (EMAIL), Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA; Daniel Huang (EMAIL), Base26, CA, USA
Pseudocode | Yes | Algorithm 1 Soft KI Regression. The procedure kmeans performs k-means clustering, batch splits the dataset into batches, and softmax_interpolation produces a softmax interpolation matrix (see Section 3.1).
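The softmax_interpolation step referenced above can be illustrated with a minimal NumPy sketch. The function name matches Algorithm 1, but the specific logits (negative squared distances) and the temperature parameter T are illustrative assumptions, not the paper's exact parameterization:

```python
import numpy as np

def softmax_interpolation(X, Z, T=1.0):
    """Return an (n x m) softmax interpolation matrix.

    Each row holds softmax weights of data point x_i over the m
    inducing points in Z. The negative-squared-distance logits and
    the temperature T are assumptions for illustration only.
    """
    # Pairwise squared distances between data and inducing points, shape (n, m).
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / T
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    W = np.exp(logits)
    return W / W.sum(axis=1, keepdims=True)      # rows sum to 1
```

Each row of the resulting matrix is a probability distribution over inducing points, which is what makes the interpolation "soft" rather than sparse.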
Open Source Code | No | No explicit statement about releasing code or a link to a repository for Soft KI's implementation is provided in the paper. The paper mentions using "a default implementation of SGPR and SVGP from GPyTorch" but does not release its own code.
Open Datasets | Yes | We evaluate the efficacy of Soft KI against other scalable GP methods on data sets of varying size n and data dimensionality d from the UCI repository (Kelly et al., 2017), a common GP benchmark (Section 4.1). Next, we test Soft KI on high-dimensional molecule data sets from the domain of computational chemistry (Section 4.2). Finally, we explore the numerical stability of Soft KI (Section 4.3).
Dataset Splits | Yes | For this experiment, we split the data set into 0.9 for training and 0.1 for testing. We standardize the data to have mean 0 and standard deviation 1 using the training data set.
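The split-and-standardize protocol quoted above can be sketched as follows; the function name and the fixed seed are illustrative, not from the paper. The key detail is that the test set is standardized with the training set's statistics:

```python
import numpy as np

def split_and_standardize(X, y, train_frac=0.9, seed=0):
    """Random 90/10 split; standardize features with training stats only."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(train_frac * len(X))
    tr, te = idx[:n_train], idx[n_train:]
    # Mean/std computed on the training portion alone, to avoid leakage.
    mu, sd = X[tr].mean(axis=0), X[tr].std(axis=0)
    X_tr = (X[tr] - mu) / sd
    X_te = (X[te] - mu) / sd  # test set reuses training statistics
    return X_tr, y[tr], X_te, y[te]
```

Computing the statistics only on the training split prevents information from the test set leaking into the model.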
Hardware Specification | Yes | We run all experiments on a single NVIDIA RTX 3090 GPU, which has 24 GB of VRAM. Our machine uses an Intel i9-10900X CPU at 3.70 GHz with 10 cores.
Software Dependencies | No | The paper mentions libraries like "GPyTorch" and "KeOps" but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | We use a Matérn 3/2 kernel and a learnable output scale for each dimension (ARD). We choose m = 512 inducing points for Soft KI. We perform 50 epochs of training using the Adam optimizer (Kingma & Ba, 2014) for all methods with a learning rate of η = 0.01. The learning rate for SGPR is η = 0.1 since we are not performing batching. For Soft KI and SVGP, we use a minibatch size of 1024.
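The minibatching described in the setup above (and the batch procedure in Algorithm 1) corresponds to a loop like the following NumPy sketch; the helper name and the shuffling seed are assumptions, not the paper's code:

```python
import numpy as np

def batches(X, y, batch_size=1024, seed=0):
    """Yield shuffled minibatches of at most batch_size points (one epoch)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))  # reshuffle the data each epoch
    for start in range(0, len(X), batch_size):
        b = idx[start:start + batch_size]
        yield X[b], y[b]
```

With n training points and a batch size of 1024, each epoch yields ceil(n / 1024) batches, the last of which may be smaller.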