Approximate Bayesian inference from noisy likelihoods with Gaussian process emulated MCMC

Authors: Marko Järvenpää, Jukka Corander

JMLR 2024

Reproducibility Variable Result LLM Response
Research Type: Experimental. "In this section we investigate the effect of the tolerance parameter ε and the developed sequential experimental design strategies on the quality of the resulting posterior approximation. We compare our GP-MH and MH-BLFI implementations and also consider a BLFI implementation with the theoretically well-motivated and best-performing integrated median interquantile range (IMIQR) strategy by Järvenpää et al. (2021). We consider three scenarios: 1) synthetically constructed log-densities corrupted with additive Gaussian noise (Section 5.1), 2) SL inference for simulator-based models (Section 5.2), and 3) likelihood-free generalised Bayesian inference (Section 5.3)."
Researcher Affiliation: Academia. "Marko Järvenpää, Department of Biostatistics, University of Oslo, Norway. Jukka Corander, Department of Biostatistics, University of Oslo, Norway; Department of Mathematics and Statistics, University of Helsinki, Finland; Wellcome Sanger Institute, United Kingdom."
Pseudocode: Yes. "Algorithm 1: Metropolis-Hastings (MH) sampler for π(θ | xo) ... Algorithm 2: General form of B(O)LFI implementation ... Algorithm 3: Approximate GP-emulated MH (GP-MH) ... Algorithm 4: GP-MH reformulated in BLFI framework (MH-BLFI)"
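The quoted Algorithm 1 is a standard Metropolis-Hastings sampler. As a minimal illustrative sketch (in Python; the paper's implementation is in MATLAB, and the target and proposal scale below are toy placeholders, not the paper's models):

```python
import numpy as np

def metropolis_hastings(log_target, theta0, prop_cov, n_iter, seed=None):
    """Random-walk Metropolis-Hastings targeting the density exp(log_target)."""
    rng = np.random.default_rng(seed)
    d = len(theta0)
    chol = np.linalg.cholesky(prop_cov)       # for correlated Gaussian proposals
    theta = np.asarray(theta0, dtype=float)
    log_p = log_target(theta)
    samples = np.empty((n_iter, d))
    for i in range(n_iter):
        prop = theta + chol @ rng.standard_normal(d)
        log_p_prop = log_target(prop)
        # Accept with probability min(1, pi(prop)/pi(theta)); proposal is symmetric,
        # so the proposal densities cancel in the acceptance ratio.
        if np.log(rng.uniform()) < log_p_prop - log_p:
            theta, log_p = prop, log_p_prop
        samples[i] = theta
    return samples

# Toy example: sample a standard 2-D Gaussian.
samples = metropolis_hastings(lambda t: -0.5 * t @ t,
                              theta0=np.zeros(2),
                              prop_cov=0.5**2 * np.eye(2),
                              n_iter=5000,
                              seed=0)
```

GP-MH (Algorithm 3) differs from this exact sampler in that `log_target` is replaced by a GP emulator of the noisy log-likelihood, with new expensive evaluations acquired only when the accept/reject decision is too uncertain.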
Open Source Code: Yes. "Two particular implementations of the proposed framework were investigated numerically; code for reproducing the experiments and visualisations is available at https://github.com/mjarvenpaa/GP-MH."
Open Datasets: Yes. "We use the same real data as Numminen et al. (2013), which describes colonisations with the bacterium Streptococcus pneumoniae and consists of varying numbers of sampled attendees at 29 day care centres at a single time point. Further details of the model and data can be found in Numminen et al. (2013)."
Dataset Splits: No. "The initial samples (e.g. the first half) are often discarded as burn-in. The remaining samples, here denoted θ^(0), ..., θ^(n), are approximately distributed as π(θ | xo) and can be used to estimate (1) as ĥ_{n+1} = (1/(n+1)) Σ_{i=0}^{n} h(θ^(i))."
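The quoted estimator — discard a burn-in fraction of the chain, then average h over the remaining draws — can be sketched as follows (function and argument names are ours, for illustration only):

```python
import numpy as np

def posterior_expectation(chain, h, burn_in_frac=0.5):
    """Estimate E[h(theta) | x_o] by the MCMC average
    (1/(n+1)) * sum_{i=0}^{n} h(theta^(i)),
    taken over the chain after discarding the first burn_in_frac of it."""
    kept = chain[int(len(chain) * burn_in_frac):]
    return np.mean([h(theta) for theta in kept], axis=0)

# Toy 1-D chain: an initial transient stuck at 10.0, then stationary at 2.0.
chain = np.concatenate([np.full(50, 10.0),   # burn-in transient (discarded)
                        np.full(50, 2.0)])   # stationary part (kept)
est = posterior_expectation(chain, h=lambda t: t)  # -> 2.0
```

Discarding the transient matters here: averaging the whole chain would give 6.0 rather than the stationary value 2.0.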
Hardware Specification: Yes. "One log-SL evaluation takes approximately 7 s (on a PC laptop with an Intel Core i5-8265U, using the C code by Price et al., 2018)."
Software Dependencies: Yes. "The experiments were performed using MATLAB 2022a. Some GP functionality was taken from GPstuff 4.7 (Vanhatalo et al., 2013)."
Experiment Setup: Yes. "The number of initial evaluations is t_init = 10. Algorithm 3 is run for i_MH = 10^5 iterations. ... The initial location and initial proposal covariance are θ^(0) = (3.4, 0.9, 3.0, 8.0, 0.3) and Σ_0 = diag(0.05, 0.1, 0.25, 0.5, 0.05)^2, respectively. ... Interestingly, a fairly large tolerance (ε ≈ 0.3) is already sufficient to produce reasonable approximations."
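For concreteness, the quoted initial MH state can be instantiated as below; the values are from the quote, while the variable names and the single proposal draw are ours. This is only the starting point of the sampler — the GP emulation and proposal adaptation of Algorithm 3 are omitted:

```python
import numpy as np

# Initial location theta^(0) and proposal covariance Sigma_0 = diag(...)^2,
# as stated in the quoted experiment setup.
theta0 = np.array([3.4, 0.9, 3.0, 8.0, 0.3])
sigma0 = np.diag([0.05, 0.1, 0.25, 0.5, 0.05]) ** 2  # squaring a diagonal matrix
                                                     # squares its diagonal entries

# One random-walk proposal draw theta' ~ N(theta^(0), Sigma_0):
rng = np.random.default_rng(0)
proposal = rng.multivariate_normal(theta0, sigma0)
```

Note that Σ_0 is specified via standard deviations per coordinate, so the diagonal variances are (0.05², 0.1², 0.25², 0.5², 0.05²).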