reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Bayesian Optimization for Likelihood-Free Inference of Simulator-Based Statistical Models

Authors: Michael U. Gutmann, Jukka Corander

JMLR 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Applications of the developed methodology are given in Section 7, and Section 8 concludes the paper. We here apply the developed methodology to infer the complete Ricker model and the complete model of bacterial infections in day care centers. As in the previous section, using Bayesian optimization in likelihood-free inference (BOLFI) reduces the amount of required simulations by several orders of magnitude.
Researcher Affiliation	Academia	Michael U. Gutmann: Helsinki Institute for Information Technology HIIT; Department of Mathematics and Statistics, University of Helsinki; Department of Information and Computer Science, Aalto University. Jukka Corander: Helsinki Institute for Information Technology HIIT; Department of Mathematics and Statistics, University of Helsinki.
Pseudocode	No	The paper describes algorithms and methods but does not include any structured pseudocode or algorithm blocks for its proposed methodology.
Open Source Code	No	The paper does not contain an explicit statement offering the source code for the described methodology, nor does it provide a direct link to a code repository. It mentions code of Wood (2010) but not its own.
Open Datasets	Yes	Example 2 (Ricker model)... Wood (2010) used this example to illustrate his synthetic likelihood approach to inference. Example 3 (Bacterial infections in day care centers). The model was developed by Numminen et al. (2013) to infer the transmission dynamics of bacterial infections in day care centers.
Dataset Splits	No	The paper discusses 'training data in the form of tuples (θ^(i), Δ_θ^(i))' used for regression in its framework, but it does not specify traditional training/validation/test splits of a fixed dataset, as would be needed to reproduce a data partitioning. The 'training data' refers to simulated observations generated during the Bayesian optimization process itself.
Hardware Specification	No	The paper mentions "The PMC computations took about 4.5 days on a cluster with 200 cores" and "took only about 1.5 hours on a desktop computer", but does not specify any particular CPU or GPU models, or detailed system specifications.
Software Dependencies	No	The paper references
Experiment Setup	Yes	For BOLFI, we modeled the random log synthetic likelihood ℓ̂_s^N as a log-Gaussian process with a quadratic prior mean function (using N = 500 as Wood, 2010). Bayesian optimization was performed with the stochastic acquisition rule and 20 initial data points. For BOLFI, we modeled the discrepancy Δ_θ as a Gaussian process with a quadratic prior mean function and used the stochastic acquisition rule. The algorithm was initialized with 20 data points.
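
The Experiment Setup entry can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a one-dimensional parameter, a hypothetical toy discrepancy function `toy_discrepancy`, a quadratic mean fitted by least squares (a simplification of a Gaussian-process prior mean), a fixed RBF kernel, and a "stochastic" acquisition step implemented as Gaussian jitter around the minimizer of a lower confidence bound. Only the 20 initial data points and the general ingredients (Gaussian process, quadratic mean, stochastic acquisition) come from the quoted text; every name and numeric setting below is an assumption.

```python
import numpy as np

def toy_discrepancy(theta, rng):
    # Hypothetical stand-in for a simulator-based discrepancy (assumption):
    # a noisy squared distance to the value theta = 2.0.
    return (theta - 2.0) ** 2 + 0.1 * rng.standard_normal()

def quad_features(theta):
    # Basis [1, theta, theta^2] used for the quadratic mean function.
    theta = np.asarray(theta, dtype=float)
    return np.column_stack([np.ones_like(theta), theta, theta ** 2])

def rbf(a, b, length=1.0):
    # Unit-variance squared-exponential kernel (assumed hyperparameters).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(theta_train, d_train, theta_grid, noise=0.1):
    # Fit the quadratic mean by least squares, then model residuals with a GP.
    phi = quad_features(theta_train)
    coef, *_ = np.linalg.lstsq(phi, d_train, rcond=None)
    resid = d_train - phi @ coef
    K = rbf(theta_train, theta_train) + noise ** 2 * np.eye(len(theta_train))
    Ks = rbf(theta_grid, theta_train)
    mean = quad_features(theta_grid) @ coef + Ks @ np.linalg.solve(K, resid)
    var = 1.0 - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def bolfi_sketch(n_init=20, n_iter=30, bounds=(-5.0, 5.0), seed=0):
    rng = np.random.default_rng(seed)
    # 20 initial data points, as in the quoted setup; drawn uniformly here.
    thetas = rng.uniform(bounds[0], bounds[1], size=n_init)
    discs = np.array([toy_discrepancy(t, rng) for t in thetas])
    grid = np.linspace(bounds[0], bounds[1], 401)
    for _ in range(n_iter):
        mean, std = gp_posterior(thetas, discs, grid)
        lcb = mean - 2.0 * std  # lower confidence bound acquisition
        # Stochastic acquisition (assumed form): jitter around the minimizer.
        theta_next = grid[np.argmin(lcb)] + 0.1 * rng.standard_normal()
        theta_next = float(np.clip(theta_next, bounds[0], bounds[1]))
        thetas = np.append(thetas, theta_next)
        discs = np.append(discs, toy_discrepancy(theta_next, rng))
    mean, _ = gp_posterior(thetas, discs, grid)
    # Return all evaluations and the minimizer of the posterior mean.
    return thetas, discs, float(grid[np.argmin(mean)])

if __name__ == "__main__":
    thetas, discs, estimate = bolfi_sketch()
    print(len(thetas), estimate)
```

In this sketch each loop iteration costs one simulator call, which is the quantity the record describes as being reduced by BOLFI. The acquisition constant, kernel length scale, and jitter width are arbitrary choices and would need tuning for a real simulator.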