Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Bayesian Optimization for Likelihood-Free Inference of Simulator-Based Statistical Models
Authors: Michael U. Gutmann, Jukka Corander
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Applications of the developed methodology are given in Section 7, and Section 8 concludes the paper. We here apply the developed methodology to infer the complete Ricker model and the complete model of bacterial infections in day care centers. As in the previous section, using Bayesian optimization in likelihood-free inference (BOLFI) reduces the amount of required simulations by several orders of magnitude. |
| Researcher Affiliation | Academia | Michael U. Gutmann EMAIL Helsinki Institute for Information Technology HIIT Department of Mathematics and Statistics, University of Helsinki Department of Information and Computer Science, Aalto University Jukka Corander EMAIL Helsinki Institute for Information Technology HIIT Department of Mathematics and Statistics, University of Helsinki |
| Pseudocode | No | The paper describes algorithms and methods but does not include any structured pseudocode or algorithm blocks for its proposed methodology. |
| Open Source Code | No | The paper does not contain an explicit statement offering the source code for the described methodology, nor does it provide a direct link to a code repository. It mentions the code of Wood (2010) but not its own. |
| Open Datasets | Yes | Example 2 (Ricker model)... Wood (2010) used this example to illustrate his synthetic likelihood approach to inference. Example 3 (Bacterial infections in day care centers). The model was developed by Numminen et al. (2013) to infer the transmission dynamics of bacterial infections in day care centers. |
| Dataset Splits | No | The paper discusses 'training data in the form of tuples $(\theta^{(i)}, \Delta_\theta^{(i)})$' used for regression in its framework, but it does not specify any traditional training/validation/test splits for a fixed dataset in a manner required for reproducibility of data partitioning. The 'training data' refers to simulated observations for the Bayesian optimization process itself. |
| Hardware Specification | No | The paper mentions "The PMC computations took about 4.5 days on a cluster with 200 cores" and "took only about 1.5 hours on a desktop computer", but does not specify any particular CPU or GPU models, or detailed system specifications. |
| Software Dependencies | No | The paper references |
| Experiment Setup | Yes | For BOLFI, we modeled the random log synthetic likelihood $\hat{\ell}_s^N$ as a log-Gaussian process with a quadratic prior mean function (using N = 500 as Wood, 2010). Bayesian optimization was performed with the stochastic acquisition rule and 20 initial data points. For BOLFI, we modeled the discrepancy $\Delta_\theta$ as a Gaussian process with a quadratic prior mean function and used the stochastic acquisition rule. The algorithm was initialized with 20 data points. |
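
The setup quoted in the Experiment Setup row (a Gaussian process model of the discrepancy with a quadratic mean, a stochastic acquisition rule, and 20 initial points) can be illustrated with a minimal sketch. Everything beyond those three ingredients is an assumption made for illustration and does not come from the paper: the toy one-dimensional simulator, the squared-exponential kernel with fixed hyperparameters, the lower-confidence-bound acquisition function minimized on a grid, and the Gaussian jitter used as a stand-in for the stochastic acquisition rule. The quadratic mean is fitted by least squares here, which simplifies the prior-mean treatment described in the quote.

```python
import numpy as np

PRIOR_VAR = 1.0   # assumed kernel variance (not from the paper)
LENGTH = 1.0      # assumed kernel length scale (not from the paper)
NOISE_VAR = 0.01  # assumed observation noise variance (not from the paper)


def simulate_discrepancy(theta, rng):
    # Hypothetical toy simulator: noisy squared distance to a "true" value of 2.0.
    return (theta - 2.0) ** 2 + 0.1 * rng.standard_normal()


def quad_features(theta):
    # Design matrix [1, theta, theta^2] for the quadratic mean function.
    theta = np.asarray(theta, dtype=float)
    return np.stack([np.ones_like(theta), theta, theta ** 2], axis=1)


def sq_exp_kernel(a, b):
    # Squared-exponential covariance between two sets of 1-D points.
    d = a[:, None] - b[None, :]
    return PRIOR_VAR * np.exp(-0.5 * (d / LENGTH) ** 2)


def gp_posterior(x_train, y_train, x_test):
    # Simplification: fit the quadratic mean by least squares, then model
    # the residuals with a zero-mean Gaussian process.
    coef, *_ = np.linalg.lstsq(quad_features(x_train), y_train, rcond=None)
    resid = y_train - quad_features(x_train) @ coef
    K = sq_exp_kernel(x_train, x_train) + NOISE_VAR * np.eye(len(x_train))
    Ks = sq_exp_kernel(x_test, x_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, resid))
    mean = quad_features(x_test) @ coef + Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = PRIOR_VAR - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 1e-12)


def run_bolfi_sketch(n_init=20, n_iter=30, bounds=(-5.0, 5.0), seed=0):
    rng = np.random.default_rng(seed)
    # 20 initial points, as in the quoted setup; drawn uniformly (assumption).
    thetas = rng.uniform(*bounds, size=n_init)
    deltas = np.array([simulate_discrepancy(t, rng) for t in thetas])
    grid = np.linspace(*bounds, 201)
    for _ in range(n_iter):
        mean, var = gp_posterior(thetas, deltas, grid)
        lcb = mean - 2.0 * np.sqrt(var)  # lower confidence bound (assumption)
        # Stand-in for the stochastic acquisition rule: jitter the minimizer.
        theta_next = grid[np.argmin(lcb)] + 0.1 * rng.standard_normal()
        theta_next = float(np.clip(theta_next, *bounds))
        thetas = np.append(thetas, theta_next)
        deltas = np.append(deltas, simulate_discrepancy(theta_next, rng))
    mean, _ = gp_posterior(thetas, deltas, grid)
    # Return all evaluations and the grid point minimizing the posterior mean.
    return thetas, deltas, float(grid[np.argmin(mean)])
```

Calling `run_bolfi_sketch()` returns the evaluated parameter values, their simulated discrepancies, and the grid point at which the fitted posterior mean of the discrepancy is smallest. The jitter keeps successive evaluations from collapsing onto a single grid point, which is the role the sketch assigns to the stochastic acquisition rule.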