How Deep Are Deep Gaussian Processes?

Authors: Matthew M. Dunlop, Mark A. Girolami, Andrew M. Stuart, Aretha L. Teckentrup

JMLR 2018

Reproducibility Variable Result LLM Response
Research Type: Experimental
"We also describe numerical experiments which illustrate the theory, and which demonstrate some of the limitations of the framework in the inference context, suggesting the need for further algorithmic innovation and theoretical understanding."
Researcher Affiliation: Academia
"Matthew M. Dunlop (EMAIL), Computing and Mathematical Sciences, Caltech, Pasadena, CA 91125, USA
Mark A. Girolami (EMAIL), Department of Mathematics, Imperial College London, London SW7 2AZ, UK, and The Alan Turing Institute, 96 Euston Road, London NW1 2DB, UK
Andrew M. Stuart (EMAIL), Computing and Mathematical Sciences, Caltech, Pasadena, CA 91125, USA
Aretha L. Teckentrup (EMAIL), School of Mathematics, University of Edinburgh, Edinburgh EH9 3FD, UK, and The Alan Turing Institute, 96 Euston Road, London NW1 2DB, UK"
Pseudocode: Yes
"Algorithm 1 (Non-Centred)
1. Fix β_0, ..., β_{N-1} ∈ (0, 1] and define B = diag(β_j). Choose an initial state ξ^(0) ∈ X, set u^(0) = T(ξ^(0)) ∈ X, and set k = 0.
2. Propose ξ̂^(k) = (I − B^2)^{1/2} ξ^(k) + B ζ^(k), where ζ^(k) ~ N(0, I).
3. Set ξ^(k+1) = ξ̂^(k) with probability α_k = min{1, exp(Φ(T(ξ^(k)); y) − Φ(T(ξ̂^(k)); y))}; otherwise set ξ^(k+1) = ξ^(k).
4. Set k ↦ k + 1 and go to 2."
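The non-centred algorithm above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, signature, and the choice of passing T (the transformation from white-noise coordinates to the field) and Φ (the negative log-likelihood) as callables are all assumptions made for the sketch.

```python
import numpy as np

def pcn_noncentred(T, Phi, y, xi0, beta, n_samples, rng=None):
    """Sketch of the non-centred pCN sampler (Algorithm 1).

    T    : map from white-noise coordinates xi to the field u = T(xi)
    Phi  : negative log-likelihood Phi(u; y)
    beta : per-coordinate jump sizes in (0, 1], the diagonal of B
    """
    rng = rng or np.random.default_rng()
    xi = np.asarray(xi0, dtype=float)
    beta = np.asarray(beta, dtype=float)
    sqrt_term = np.sqrt(1.0 - beta**2)        # (I - B^2)^{1/2} for diagonal B
    phi_cur = Phi(T(xi), y)
    samples = []
    for _ in range(n_samples):
        zeta = rng.standard_normal(xi.shape)   # zeta ~ N(0, I)
        xi_hat = sqrt_term * xi + beta * zeta  # pCN proposal (step 2)
        phi_hat = Phi(T(xi_hat), y)
        # Accept with probability min{1, exp(Phi_cur - Phi_hat)} (step 3)
        if np.log(rng.uniform()) < phi_cur - phi_hat:
            xi, phi_cur = xi_hat, phi_hat
        samples.append(T(xi))
    return np.array(samples)
```

Note that the acceptance probability involves only the likelihood potential Φ, not the prior: the pCN proposal is prior-reversible, which is what keeps the accept/reject step this simple.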
Open Source Code: No
The paper does not provide an explicit statement about open-source code availability or a link to a code repository.
Open Datasets: No
"We consider first the case D = (0, 1), where the forward map is given by a number of point evaluations: G_j(u) = u(x_j) for some sequence {x_j}_{j=1}^J ⊂ D. We compare the quality of reconstruction versus both the number of point evaluations and the number of levels in the deep Gaussian prior. We use the same parameters for the family of covariance operators as in subsection 4.2. The base layer u_0 is taken to be Gaussian with covariance of the form (15), with Γ(u) ≡ 20^2. The true unknown field u is given by the indicator function u = 1_{(0.3,0.7)}, shown in Figure 6. It is generated on a mesh of 400 points, and three data sets are created wherein it is observed on uniform grids of J = 25, 50 and 100 points, and corrupted by white noise with standard deviation γ = 0.02. Sampling is performed on a mesh of 200 points to avoid an inverse crime (Kaipio and Somersalo, 2006)."
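The data-generation recipe quoted above (an indicator-function truth on a fine mesh, observed at J uniformly spaced points and corrupted by white noise) can be sketched directly. The random seed and the use of interpolation for the point evaluations are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is illustrative, not from the paper
gamma = 0.02                    # observational noise standard deviation

# True field on a fine mesh of 400 points over D = (0, 1):
# the indicator function of the interval (0.3, 0.7)
x_fine = np.linspace(0, 1, 400)
u_true = ((x_fine > 0.3) & (x_fine < 0.7)).astype(float)

def make_dataset(J):
    """Observe u at J uniform points and corrupt with white noise."""
    x_obs = np.linspace(0, 1, J)
    G_u = np.interp(x_obs, x_fine, u_true)  # point evaluations G_j(u) = u(x_j)
    return x_obs, G_u + gamma * rng.standard_normal(J)

# Three data sets with increasing observation density, as in the paper
datasets = {J: make_dataset(J) for J in (25, 50, 100)}
```

Generating the truth on a finer mesh than the one used for sampling is what the quoted passage calls avoiding an "inverse crime": the inference never sees data produced by its own discretisation.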
Dataset Splits: No
The paper generates custom data on specified grids and observation points, but does not define training/validation/test splits. The "data sets" it creates differ only in observation density; they are not splits for ML model training and evaluation.
Hardware Specification: No
The paper does not describe the hardware used for its experiments; it mentions only that sampling was performed in MATLAB.
Software Dependencies: No
"In Figure 1, we show four independent realizations of the first seven layers u_0, ..., u_6, where u_0 is taken as a sample of the stationary Gaussian process with correlation kernel ρ_S. The domain D is here chosen as the interval (0, 1), and the sampling points are given by the uniform grid x_i = (i − 1)/256, for i = 1, ..., 257. Each column in Figure 1 corresponds to one realization, and each row corresponds to a given layer u_n, the first row showing u_0. We can clearly see the non-stationary behaviour in the samples when progressing through the levels. We note that the ergodicity of the chain is also reflected in the samples, with the distribution of the samples u_n looking similar for larger values of n. Figure 2 shows the same information as Figure 1, in the case where the domain D is (0, 1)^2 and the sampling points are the tensor product of the one-dimensional points x_i^1 = (i − 1)/64, for i = 1, ..., 65. To generate the samples, we use the command mvnrnd in MATLAB, and when plotting the samples, we use linear interpolation."
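The layer-by-layer sampling described above can be sketched in Python, with NumPy's `multivariate_normal` playing the role of MATLAB's `mvnrnd`. The covariance family of the paper is not reproduced here; a Paciorek–Schervish (Gibbs) non-stationary kernel with a hypothetical length-scale map `ell(u)` stands in for it, purely to illustrate how each layer is drawn as a Gaussian conditioned on the previous one.

```python
import numpy as np

def sample_layers(n_layers, grid, ell, rng):
    """Draw one realization of layers u_0, ..., u_{n_layers-1} on a 1-D grid.

    Each layer is Gaussian given the previous one. A Gibbs (Paciorek-
    Schervish) kernel with pointwise length-scales ell(u) is used as a
    hypothetical stand-in for the paper's covariance family.
    """
    diff = grid[:, None] - grid[None, :]
    n = len(grid)
    u = np.zeros(n)  # conditioning field for the base layer
    layers = []
    for _ in range(n_layers):
        L = ell(u)                     # length-scales from the previous layer
        Li, Lj = L[:, None], L[None, :]
        denom = Li**2 + Lj**2
        # Gibbs kernel: positive semi-definite for any positive L
        C = np.sqrt(2 * Li * Lj / denom) * np.exp(-diff**2 / denom)
        C += 1e-8 * np.eye(n)          # jitter for numerical stability
        u = rng.multivariate_normal(np.zeros(n), C, check_valid="ignore")
        layers.append(u)
    return layers

# Uniform grid x_i = (i - 1)/256, i = 1, ..., 257, as in the quoted passage;
# the length-scale map below is an arbitrary illustrative choice.
grid = np.array([(i - 1) / 256 for i in range(1, 258)])
layers = sample_layers(7, grid, lambda u: 0.05 + 0.1 / (1 + u**2), np.random.default_rng(1))
```

Because each layer's covariance is built from the realized values of the layer below, the draws become non-stationary from the second layer onward, which is the qualitative behaviour the quoted passage describes in Figures 1 and 2.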
Experiment Setup: Yes
"For numerical experiments, we take F(u) = min{F_− + a e^{b u^2}, F_+} for some F_+, F_−, a, b > 0. In particular, in one spatial dimension we take F_+ = 150^2, F_− = 200, a = 100 and b = 2. In two dimensions, we take F_+ = 150^2, F_− = 50, a = 25 and b = 0.3. We take α = 4 in both cases, and choose σ such that E[u(x)^2] ≈ 1. Sampling is performed on a mesh of 200 points to avoid an inverse crime (Kaipio and Somersalo, 2006). 10^6 samples are generated per chain, with the first 2 × 10^5 discarded as burn-in when calculating means. The jump parameters β_j are adaptively tuned to keep acceptance rates close to 30%. It is generated on a uniform square mesh of 2^14 points, and two data sets are created wherein it is observed on uniform square grids of J = 2^10 and 2^8 points, and corrupted by white noise with standard deviation γ = 0.02. Sampling is performed on a mesh of 2^12 points to again avoid an inverse crime. 4 × 10^5 samples are generated per chain, with the first 2 × 10^5 discarded as burn-in when calculating means. Again the jump parameters β_j are adaptively tuned to keep acceptance rates close to 30%."
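The bounded function F from the experiment setup is simple to write down. The exponent sign in F(u) = min{F_− + a e^{b u^2}, F_+} is reconstructed from the garbled source (a growing exponential capped at F_+ is the reading consistent with the stated parameters); the values below are the one-dimensional ones quoted above.

```python
import numpy as np

# One-dimensional parameters from the experiment setup:
# F_+ = 150^2, F_- = 200, a = 100, b = 2
F_plus, F_minus, a, b = 150.0**2, 200.0, 100.0, 2.0

def F(u):
    """F(u) = min{F_- + a * exp(b * u^2), F_+}: grows with |u| until capped."""
    return np.minimum(F_minus + a * np.exp(b * np.asarray(u) ** 2), F_plus)
```

At u = 0 this gives F_− + a = 300, and for |u| large the exponential saturates the cap, so F stays within [300, 22500] here.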