Pathwise Conditioning of Gaussian Processes
Authors: James T. Wilson, Viacheslav Borovitskiy, Alexander Terenin, Peter Mostowsky, Marc Peter Deisenroth
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then ground these results by exploring the practical implications of pathwise conditioning in various applied settings, such as global optimization and reinforcement learning. The paper includes multiple sections (e.g., "7. Applications") with empirical evaluations, performance metrics, and comparative analyses, such as Figure 6, Figure 8, and Figure 9, which display median performances, simulations, and success rates. |
| Researcher Affiliation | Academia | James T. Wilson EMAIL Imperial College London, Viacheslav Borovitskiy EMAIL St. Petersburg State University and St. Petersburg Department of Steklov Mathematical Institute of Russian Academy of Sciences, Alexander Terenin EMAIL Imperial College London, Peter Mostowsky EMAIL St. Petersburg State University, Marc Peter Deisenroth EMAIL Centre for Artificial Intelligence, University College London. All listed affiliations are academic institutions. |
| Pseudocode | No | The paper provides detailed mathematical derivations and descriptions of methods but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks or figures. |
| Open Source Code | Yes | We provide a general framework for pathwise conditioning of Gaussian processes based on GPflow (Matthews et al., 2017). Code is available online at https://github.com/j-wilson/GPflowSampling. Additionally, 'PILCO implementation available separately at https://github.com/j-wilson/GPflowPILCO.' |
| Open Datasets | Yes | As an illustrative example, we trained a deep GP to act as an autoencoder for the MNIST dataset (LeCun and Cortes, 2010). |
| Dataset Splits | No | The paper mentions using the MNIST dataset and refers to 'randomly chosen test images' but does not provide specific details on the training, validation, and test splits (e.g., percentages, sample counts, or methodology for generating these splits). For other experiments, data was generated rather than split from a predefined dataset. |
| Hardware Specification | No | The paper mentions 'Running on a single GPU' but does not specify the model or any other detailed hardware specifications such as CPU, RAM, or specific GPU model numbers used for running experiments. |
| Software Dependencies | No | The paper mentions using GPflow and TensorFlow, as well as optimizers like L-BFGS and ADAM, but it does not specify any version numbers for these software components, which is necessary for reproducibility. |
| Experiment Setup | Yes | The paper provides extensive details on experimental setups across various applications. For example, in Section 7.1, it states: 'black-box functions drawn from a known Matérn-5/2 prior with an isotropic length scale ℓ = √d/100 and Gaussian observations y ∼ N(f(x), 10⁻³)'. We set κ = d. In Section 7.3: 'where we have chosen τ = 0.25 ms, α = 0.75, β = 0.75, γ = 20, and Σε = 10⁻⁴I'. In Section 7.4: 'At each round, θ was updated 5000 times using ADAM (Kingma and Ba, 2015) with gradient norms clipped to one and an initial learning rate 0.01 that decreased by a factor of ten after every third of training'. In Section 7.5: 'Model evaluations were performed by using the sparse update (30) together with functions drawn from approximate priors constructed using ℓ = 256 random Fourier features. We associate each input image with a single draw of the model. Running on a single GPU, the model outlined above was jointly trained in just over 40 minutes using 10⁴ steps of gradient descent with a batch size of 128'. |
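The "Experiment Setup" row above references the paper's core recipe: draw functions from an approximate prior built with random Fourier features, then condition each draw on the data via a Matheron-style pathwise update. A minimal NumPy sketch of that recipe, assuming an RBF kernel rather than the paper's Matérn-5/2 for brevity; all function and parameter names here are invented for illustration, not taken from the GPflowSampling API:

```python
import numpy as np


def rff_prior_sample(num_features, dim, lengthscale, rng):
    """Draw one function from an approximate GP prior via random Fourier
    features for an RBF kernel (spectral density N(0, I / lengthscale^2))."""
    omega = rng.standard_normal((num_features, dim)) / lengthscale
    phase = rng.uniform(0.0, 2.0 * np.pi, num_features)
    weights = rng.standard_normal(num_features)

    def f(x):
        # Feature map: sqrt(2/F) * cos(x @ omega^T + phase); its weighted
        # sum is a sample whose covariance approximates the RBF kernel.
        phi = np.sqrt(2.0 / num_features) * np.cos(x @ omega.T + phase)
        return phi @ weights

    return f


def rbf_kernel(a, b, lengthscale):
    """Exact RBF kernel matrix between row-stacked inputs a and b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)


def pathwise_conditional(X, y, noise, lengthscale, num_features=256, rng=None):
    """Pathwise conditioning (Matheron's rule): a posterior sample is a
    prior sample plus a data-driven correction,
        (f | y)(.) = f(.) + k(., X) (K + noise*I)^{-1} (y - f(X) - eps)."""
    if rng is None:
        rng = np.random.default_rng(0)
    f = rff_prior_sample(num_features, X.shape[1], lengthscale, rng)
    eps = np.sqrt(noise) * rng.standard_normal(len(y))  # simulated obs. noise
    K = rbf_kernel(X, X, lengthscale) + noise * np.eye(len(y))
    v = np.linalg.solve(K, y - f(X) - eps)  # correction weights

    def sample(x_new):
        # Evaluate the conditioned path at arbitrary new inputs.
        return f(x_new) + rbf_kernel(x_new, X, lengthscale) @ v

    return sample
```

With small observation noise, each conditioned path nearly interpolates the data at the training inputs while reverting to prior behavior far from them; repeated calls with fresh seeds give independent posterior samples, which is what the paper exploits for Thompson sampling and model rollouts.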