High-Dimensional Gaussian Process Regression with Soft Kernel Interpolation

Authors: Chris L Camaño, Daniel Huang

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of Soft KI across various examples and show that it is competitive with other approximated GP methods when the data dimensionality is modest (around 10). We evaluate Soft KI on a variety of datasets from the UCI repository (Kelly et al., 2017) and demonstrate that it achieves test root mean square error (RMSE) comparable to other inducing point methods for datasets with moderate dimensionality (approximately d = 10) (Section 4.1).
Researcher Affiliation | Collaboration | Chris Camaño (EMAIL), Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA; Daniel Huang (EMAIL), Base26, CA, USA
Pseudocode | Yes | Algorithm 1 Soft KI Regression. The procedure kmeans performs k-means clustering, batch splits the dataset into batches, and softmax_interpolation produces a softmax interpolation matrix (see Section 3.1).
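The softmax_interpolation step referenced above can be illustrated with a minimal NumPy sketch. The function name matches Algorithm 1, but the specific logits (negative squared distances) and the temperature parameter T are illustrative assumptions, not the paper's exact parameterization:

```python
import numpy as np

def softmax_interpolation(X, Z, T=1.0):
    """Return an (n x m) softmax interpolation matrix.

    Each row holds softmax weights of data point x_i over the m
    inducing points in Z. The negative-squared-distance logits and
    the temperature T are assumptions for illustration only.
    """
    # Pairwise squared distances between data and inducing points, shape (n, m).
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / T
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    W = np.exp(logits)
    return W / W.sum(axis=1, keepdims=True)      # rows sum to 1
```

Each row of the resulting matrix is a probability distribution over inducing points, which is what makes the interpolation "soft" rather than sparse.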
Open Source Code | No | No explicit statement about releasing code or a link to a repository for Soft KI's implementation is provided in the paper. The paper mentions using "a default implementation of SGPR and SVGP from GPyTorch" but does not release its own code.
Open Datasets | Yes | We evaluate the efficacy of Soft KI against other scalable GP methods on data sets of varying size n and data dimensionality d from the UCI repository (Kelly et al., 2017), a common GP benchmark (Section 4.1). Next, we test Soft KI on high-dimensional molecule data sets from the domain of computational chemistry (Section 4.2). Finally, we explore the numerical stability of Soft KI (Section 4.3).
Dataset Splits | Yes | For this experiment, we split the data set into 0.9 for training and 0.1 for testing. We standardize the data to have mean 0 and standard deviation 1 using the training data set.
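The split-and-standardize protocol quoted above can be sketched as follows; the function name and the fixed seed are illustrative, not from the paper. The key detail is that the test set is standardized with the training set's statistics:

```python
import numpy as np

def split_and_standardize(X, y, train_frac=0.9, seed=0):
    """Random 90/10 split; standardize features with training stats only."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(train_frac * len(X))
    tr, te = idx[:n_train], idx[n_train:]
    # Mean/std computed on the training portion alone, to avoid leakage.
    mu, sd = X[tr].mean(axis=0), X[tr].std(axis=0)
    X_tr = (X[tr] - mu) / sd
    X_te = (X[te] - mu) / sd  # test set reuses training statistics
    return X_tr, y[tr], X_te, y[te]
```

Computing the statistics only on the training split prevents information from the test set leaking into the model.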
Hardware Specification | Yes | We run all experiments on a single NVIDIA RTX 3090 GPU, which has 24 GB of VRAM. Our machine uses an Intel i9-10900X CPU at 3.70 GHz with 10 cores.
Software Dependencies | No | The paper mentions libraries like "GPyTorch" and "KeOps" but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | We use a Matérn 3/2 kernel and a learnable output scale for each dimension (ARD). We choose m = 512 inducing points for Soft KI. We perform 50 epochs of training using the Adam optimizer (Kingma & Ba, 2014) for all methods with a learning rate of η = 0.01. The learning rate for SGPR is η = 0.1 since we are not performing batching. For Soft KI and SVGP, we use a minibatch size of 1024.
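The minibatching described in the setup above (and the batch procedure in Algorithm 1) corresponds to a loop like the following NumPy sketch; the helper name and the shuffling seed are assumptions, not the paper's code:

```python
import numpy as np

def batches(X, y, batch_size=1024, seed=0):
    """Yield shuffled minibatches of at most batch_size points (one epoch)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))  # reshuffle the data each epoch
    for start in range(0, len(X), batch_size):
        b = idx[start:start + batch_size]
        yield X[b], y[b]
```

With n training points and a batch size of 1024, each epoch yields ceil(n / 1024) batches, the last of which may be smaller.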