Approximate Bayesian inference from noisy likelihoods with Gaussian process emulated MCMC

Authors: Marko Järvenpää, Jukka Corander

JMLR 2024

Reproducibility Variable Result LLM Response
Research Type: Experimental. "In this section we investigate the effect of the tolerance parameter ε and the developed sequential experimental design strategies on the quality of the resulting posterior approximation. We compare our GP-MH and MH-BLFI implementations and also consider a BLFI implementation with the theoretically well-motivated and best-performing integrated median interquantile range (IMIQR) strategy by Järvenpää et al. (2021). We consider three scenarios: 1) synthetically constructed log-densities corrupted with additive Gaussian noise (Section 5.1), 2) SL inference for simulator-based models (Section 5.2), and 3) likelihood-free generalised Bayesian inference (Section 5.3)."
Researcher Affiliation: Academia. "Marko Järvenpää, Department of Biostatistics, University of Oslo, Norway. Jukka Corander, Department of Biostatistics, University of Oslo, Norway; Department of Mathematics and Statistics, University of Helsinki, Finland; Wellcome Sanger Institute, United Kingdom."
Pseudocode: Yes. "Algorithm 1: Metropolis-Hastings (MH) sampler for π(θ | xo) ... Algorithm 2: General form of B(O)LFI implementation ... Algorithm 3: Approximate GP-emulated MH (GP-MH) ... Algorithm 4: GP-MH reformulated in BLFI framework (MH-BLFI)"
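The quoted Algorithm 1 is a standard Metropolis-Hastings sampler. As a minimal illustrative sketch (in Python; the paper's implementation is in MATLAB, and the target and proposal scale below are toy placeholders, not the paper's models):

```python
import numpy as np

def metropolis_hastings(log_target, theta0, prop_cov, n_iter, seed=None):
    """Random-walk Metropolis-Hastings targeting the density exp(log_target)."""
    rng = np.random.default_rng(seed)
    d = len(theta0)
    chol = np.linalg.cholesky(prop_cov)       # for correlated Gaussian proposals
    theta = np.asarray(theta0, dtype=float)
    log_p = log_target(theta)
    samples = np.empty((n_iter, d))
    for i in range(n_iter):
        prop = theta + chol @ rng.standard_normal(d)
        log_p_prop = log_target(prop)
        # Accept with probability min(1, pi(prop)/pi(theta)); proposal is symmetric,
        # so the proposal densities cancel in the acceptance ratio.
        if np.log(rng.uniform()) < log_p_prop - log_p:
            theta, log_p = prop, log_p_prop
        samples[i] = theta
    return samples

# Toy example: sample a standard 2-D Gaussian.
samples = metropolis_hastings(lambda t: -0.5 * t @ t,
                              theta0=np.zeros(2),
                              prop_cov=0.5**2 * np.eye(2),
                              n_iter=5000,
                              seed=0)
```

GP-MH (Algorithm 3) differs from this exact sampler in that `log_target` is replaced by a GP emulator of the noisy log-likelihood, with new expensive evaluations acquired only when the accept/reject decision is too uncertain.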
Open Source Code: Yes. "Two particular implementations of the proposed framework were investigated numerically; code for reproducing the experiments and visualisations is available at https://github.com/mjarvenpaa/GP-MH."
Open Datasets: Yes. "We use the same real data as Numminen et al. (2013), which describes colonisations with the bacterium Streptococcus pneumoniae and consists of varying numbers of sampled attendees at 29 day care centres at a single time point. Further details of the model and data can be found in Numminen et al. (2013)."
Dataset Splits: No. "The initial samples (e.g. the first half) are often discarded as burn-in. The remaining samples, here denoted θ^(0), ..., θ^(n), are approximately distributed as π(θ | xo) and can be used to estimate (1) as ĥ_{n+1} = (1/(n+1)) Σ_{i=0}^{n} h(θ^(i))."
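The quoted estimator — discard a burn-in fraction of the chain, then average h over the remaining draws — can be sketched as follows (function and argument names are ours, for illustration only):

```python
import numpy as np

def posterior_expectation(chain, h, burn_in_frac=0.5):
    """Estimate E[h(theta) | x_o] by the MCMC average
    (1/(n+1)) * sum_{i=0}^{n} h(theta^(i)),
    taken over the chain after discarding the first burn_in_frac of it."""
    kept = chain[int(len(chain) * burn_in_frac):]
    return np.mean([h(theta) for theta in kept], axis=0)

# Toy 1-D chain: an initial transient stuck at 10.0, then stationary at 2.0.
chain = np.concatenate([np.full(50, 10.0),   # burn-in transient (discarded)
                        np.full(50, 2.0)])   # stationary part (kept)
est = posterior_expectation(chain, h=lambda t: t)  # -> 2.0
```

Discarding the transient matters here: averaging the whole chain would give 6.0 rather than the stationary value 2.0.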
Hardware Specification: Yes. "One log-SL evaluation takes approximately 7 s (on a PC laptop with an Intel Core i5-8265U, using the C code by Price et al., 2018)."
Software Dependencies: Yes. "The experiments were performed using MATLAB 2022a. Some GP functionality was taken from GPstuff 4.7 (Vanhatalo et al., 2013)."
Experiment Setup: Yes. "The number of initial evaluations is t_init = 10. Algorithm 3 is run for i_MH = 10^5 iterations. ... The initial location and initial proposal covariance are θ^(0) = (3.4, 0.9, 3.0, 8.0, 0.3) and Σ_0 = diag(0.05, 0.1, 0.25, 0.5, 0.05)^2, respectively. ... Interestingly, a fairly large tolerance (ε ≈ 0.3) is already sufficient to produce reasonable approximations."
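For concreteness, the quoted initial MH state can be instantiated as below; the values are from the quote, while the variable names and the single proposal draw are ours. This is only the starting point of the sampler — the GP emulation and proposal adaptation of Algorithm 3 are omitted:

```python
import numpy as np

# Initial location theta^(0) and proposal covariance Sigma_0 = diag(...)^2,
# as stated in the quoted experiment setup.
theta0 = np.array([3.4, 0.9, 3.0, 8.0, 0.3])
sigma0 = np.diag([0.05, 0.1, 0.25, 0.5, 0.05]) ** 2  # squaring a diagonal matrix
                                                     # squares its diagonal entries

# One random-walk proposal draw theta' ~ N(theta^(0), Sigma_0):
rng = np.random.default_rng(0)
proposal = rng.multivariate_normal(theta0, sigma0)
```

Note that Σ_0 is specified via standard deviations per coordinate, so the diagonal variances are (0.05², 0.1², 0.25², 0.5², 0.05²).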