Pathwise Conditioning of Gaussian Processes
Authors: James T. Wilson, Viacheslav Borovitskiy, Alexander Terenin, Peter Mostowsky, Marc Peter Deisenroth
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then ground these results by exploring the practical implications of pathwise conditioning in various applied settings, such as global optimization and reinforcement learning. The paper includes multiple sections (e.g., "7. Applications") with empirical evaluations, performance metrics, and comparative analyses, such as Figure 6, Figure 8, and Figure 9, which display median performances, simulations, and success rates. |
| Researcher Affiliation | Academia | James T. Wilson EMAIL Imperial College London, Viacheslav Borovitskiy EMAIL St. Petersburg State University and St. Petersburg Department of Steklov Mathematical Institute of Russian Academy of Sciences, Alexander Terenin EMAIL Imperial College London, Peter Mostowsky EMAIL St. Petersburg State University, Marc Peter Deisenroth EMAIL Centre for Artificial Intelligence, University College London. All listed affiliations are academic institutions. |
| Pseudocode | No | The paper provides detailed mathematical derivations and descriptions of methods but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks or figures. |
| Open Source Code | Yes | We provide a general framework for pathwise conditioning of Gaussian processes based on GPflow (Matthews et al., 2017). Code is available online at https://github.com/j-wilson/GPflowSampling. Additionally, 'PILCO implementation available separately at https://github.com/j-wilson/GPflowPILCO.' |
| Open Datasets | Yes | As an illustrative example, we trained a deep GP to act as an autoencoder for the MNIST dataset (LeCun and Cortes, 2010). |
| Dataset Splits | No | The paper mentions using the MNIST dataset and refers to 'randomly chosen test images' but does not provide specific details on the training, validation, and test splits (e.g., percentages, sample counts, or methodology for generating these splits). For other experiments, data was generated rather than split from a predefined dataset. |
| Hardware Specification | No | The paper mentions 'Running on a single GPU' but does not specify the model or any other detailed hardware specifications such as CPU, RAM, or specific GPU model numbers used for running experiments. |
| Software Dependencies | No | The paper mentions using GPflow and TensorFlow, as well as optimizers like L-BFGS and ADAM, but it does not specify any version numbers for these software components, which is necessary for reproducibility. |
| Experiment Setup | Yes | The paper provides extensive details on experimental setups across various applications. For example, in Section 7.1, it states: 'black-box functions drawn from a known Matérn-5/2 prior with an isotropic length scale ℓ = √d/100 and Gaussian observations y ∼ N(f(x), 10⁻³)'. We set κ = d. In Section 7.3: 'where we have chosen τ = 0.25 ms, α = 0.75, β = 0.75, γ = 20, and Σε = 10⁻⁴I'. In Section 7.4: 'At each round, θ was updated 5000 times using ADAM (Kingma and Ba, 2015) with gradient norms clipped to one and an initial learning rate 0.01 that decreased by a factor of ten after every third of training'. In Section 7.5: 'Model evaluations were performed by using the sparse update (30) together with functions drawn from approximate priors constructed using ℓ = 256 random Fourier features. We associate each input image with a single draw of the model. Running on a single GPU, the model outlined above was jointly trained in just over 40 minutes using 10⁴ steps of gradient descent with a batch size of 128'. |
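The "Experiment Setup" row above references the paper's core recipe: draw functions from an approximate prior built with random Fourier features, then condition each draw on the data via a Matheron-style pathwise update. A minimal NumPy sketch of that recipe, assuming an RBF kernel rather than the paper's Matérn-5/2 for brevity; all function and parameter names here are invented for illustration, not taken from the GPflowSampling API:

```python
import numpy as np


def rff_prior_sample(num_features, dim, lengthscale, rng):
    """Draw one function from an approximate GP prior via random Fourier
    features for an RBF kernel (spectral density N(0, I / lengthscale^2))."""
    omega = rng.standard_normal((num_features, dim)) / lengthscale
    phase = rng.uniform(0.0, 2.0 * np.pi, num_features)
    weights = rng.standard_normal(num_features)

    def f(x):
        # Feature map: sqrt(2/F) * cos(x @ omega^T + phase); its weighted
        # sum is a sample whose covariance approximates the RBF kernel.
        phi = np.sqrt(2.0 / num_features) * np.cos(x @ omega.T + phase)
        return phi @ weights

    return f


def rbf_kernel(a, b, lengthscale):
    """Exact RBF kernel matrix between row-stacked inputs a and b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)


def pathwise_conditional(X, y, noise, lengthscale, num_features=256, rng=None):
    """Pathwise conditioning (Matheron's rule): a posterior sample is a
    prior sample plus a data-driven correction,
        (f | y)(.) = f(.) + k(., X) (K + noise*I)^{-1} (y - f(X) - eps)."""
    if rng is None:
        rng = np.random.default_rng(0)
    f = rff_prior_sample(num_features, X.shape[1], lengthscale, rng)
    eps = np.sqrt(noise) * rng.standard_normal(len(y))  # simulated obs. noise
    K = rbf_kernel(X, X, lengthscale) + noise * np.eye(len(y))
    v = np.linalg.solve(K, y - f(X) - eps)  # correction weights

    def sample(x_new):
        # Evaluate the conditioned path at arbitrary new inputs.
        return f(x_new) + rbf_kernel(x_new, X, lengthscale) @ v

    return sample
```

With small observation noise, each conditioned path nearly interpolates the data at the training inputs while reverting to prior behavior far from them; repeated calls with fresh seeds give independent posterior samples, which is what the paper exploits for Thompson sampling and model rollouts.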