Approximate Bayesian inference from noisy likelihoods with Gaussian process emulated MCMC
Authors: Marko Järvenpää, Jukka Corander
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we investigate the effect of the tolerance parameter ε and the developed sequential experimental design strategies on the quality of the resulting posterior approximation. We compare our GP-MH and MH-BLFI implementations and also consider a BLFI implementation with the theoretically well-motivated and best-performing integrated median interquantile range (IMIQR) strategy by Järvenpää et al. (2021). We consider three scenarios: 1) synthetically constructed log-densities corrupted with additive Gaussian noise (Section 5.1), 2) SL inference for simulator-based models (Section 5.2), 3) likelihood-free generalised Bayesian inference (Section 5.3). |
| Researcher Affiliation | Academia | Marko Järvenpää EMAIL Department of Biostatistics, University of Oslo, Norway; Jukka Corander EMAIL Department of Biostatistics, University of Oslo, Norway; Department of Mathematics and Statistics, University of Helsinki, Finland; Wellcome Sanger Institute, United Kingdom |
| Pseudocode | Yes | Algorithm 1 Metropolis-Hastings (MH) sampler for π(θ \| xo) ... Algorithm 2 General form of B(O)LFI implementation ... Algorithm 3 Approximate GP-emulated MH (GP-MH) ... Algorithm 4 GP-MH reformulated in BLFI framework (MH-BLFI) |
| Open Source Code | Yes | Two particular implementations of the proposed framework were investigated numerically, codes for reproducing the experiments and visualisations are available at https://github.com/mjarvenpaa/GP-MH. |
| Open Datasets | Yes | We use the same real data as Numminen et al. (2013) which describes colonisations with the bacterium Streptococcus pneumoniae and consists of varying numbers of sampled attendees at 29 day care centres at a single time point. Further details of the model and data can be found in Numminen et al. (2013). |
| Dataset Splits | No | The initial samples (e.g. the first half) are often discarded as burn-in. The remaining samples, here denoted as $\theta^{(0)}, \ldots, \theta^{(n)}$, are approximately distributed as $\pi(\theta \mid x_o)$ and can be used to estimate (1) as $h \approx \hat{h}_{n+1} = \frac{1}{n+1} \sum_{i=0}^{n} h(\theta^{(i)})$. |
| Hardware Specification | Yes | One log-SL evaluation takes approximately 7s (on a PC laptop with Intel Core i5 8265U and using the C-code by Price et al., 2018) |
| Software Dependencies | Yes | The experiments were performed using MATLAB 2022a. Some GP functionality was taken from GPstuff4.7 (Vanhatalo et al., 2013). |
| Experiment Setup | Yes | The number of initial evaluations is $t_{\text{init}} = 10$. Algorithm 3 is run for $i_{\text{MH}} = 10^5$ iterations. ... The initial location and initial proposal covariance are $\theta^{(0)} = (3.4, 0.9, 3.0, 8.0, 0.3)$ and $\Sigma_0 = \operatorname{diag}(0.05, 0.1, 0.25, 0.5, 0.05)^2$, respectively. ... Interestingly, a fairly large tolerance (ε 0.3) is already sufficient to produce reasonable approximations. |
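
The Pseudocode and Dataset Splits rows refer to a Metropolis-Hastings sampler and to averaging over the samples that remain after discarding a burn-in period. The following is a minimal Python sketch of that generic pattern only. It is not the authors' GP-MH or MH-BLFI code (their experiments used MATLAB), and it assumes an exactly computable log-density rather than a noisy, GP-emulated one. The target `toy_log_target`, the function names, the proposal scales, and the iteration count are all hypothetical choices made for illustration.

```python
import math
import random


def metropolis_hastings(log_target, theta0, proposal_sd, n_iter, seed=0):
    """Random-walk MH with an independent Gaussian proposal per coordinate."""
    rng = random.Random(seed)
    theta = list(theta0)
    log_p = log_target(theta)
    chain = [list(theta)]
    for _ in range(n_iter):
        proposal = [t + rng.gauss(0.0, s) for t, s in zip(theta, proposal_sd)]
        log_p_new = log_target(proposal)
        # The proposal is symmetric, so the acceptance ratio is the target ratio.
        # 1 - rng.random() lies in (0, 1], which keeps the logarithm finite.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            theta, log_p = proposal, log_p_new
        chain.append(list(theta))
    return chain


def posterior_mean_estimate(chain, h=lambda theta: theta[0]):
    """Discard the first half as burn-in and average h over the rest."""
    kept = chain[len(chain) // 2:]
    return sum(h(theta) for theta in kept) / len(kept)


def toy_log_target(theta):
    # Hypothetical stand-in: independent unit-variance Gaussians centred at (1, -2).
    return -0.5 * ((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2)


chain = metropolis_hastings(
    toy_log_target, theta0=[0.0, 0.0], proposal_sd=[1.0, 1.0], n_iter=20000
)
estimate = posterior_mean_estimate(chain)  # Monte Carlo estimate of E[theta_1]
```

Because the toy target has mean 1 in its first coordinate, `estimate` should land close to 1, with some Monte Carlo error. In the paper's setting, the exact `log_target` call would be replaced by noisy log-likelihood evaluations handled through a GP emulator, which this sketch does not attempt to reproduce.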