Target–Aware Bayesian Inference: How to Beat Optimal Conventional Estimators
Authors: Tom Rainforth, Adam Goliński, Frank Wood, Sheheryar Zaidi
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show empirically that it can also breach this limit in practice. We utilize our TABI framework by combining it with adaptive importance sampling approaches and show both theoretically and empirically that the resulting estimators are capable of converging faster than the standard O(1/N) Monte Carlo rate, potentially producing rates as fast as O(1/N²). See Sections 3.3 (An Empirical Demonstration), 4.5 (Experiments), and 5.3 (Experiments). |
| Researcher Affiliation | Academia | Tom Rainforth EMAIL Department of Statistics, University of Oxford, 29 St Giles, Oxford, OX1 3LB, United Kingdom; Adam Goliński EMAIL Department of Statistics and Department of Engineering Science, University of Oxford, 29 St Giles, Oxford, OX1 3LB, United Kingdom; Frank Wood EMAIL Department of Computer Science, University of British Columbia, 2366 Main Mall 201, Vancouver, BC V6T 1Z4, Canada; Sheheryar Zaidi EMAIL Department of Statistics, University of Oxford, 29 St Giles, Oxford, OX1 3LB, United Kingdom |
| Pseudocode | No | The paper describes methods and algorithms verbally and in mathematical notation (e.g., Section 6.1 Target Aware Nested Sampling outlines steps for NS). However, it does not contain clearly labeled 'Pseudocode' or 'Algorithm' blocks with structured, code-like formatting. |
| Open Source Code | Yes | Code for these experiments and others is available at https://github.com/twgr/tabi. An implementation for AMCI and our associated experiments is available at http://github.com/talesa/amci. |
| Open Datasets | No | The paper uses various synthetic models (e.g., Gaussian model, banana problem, tail integral calculation) and a simulated cancer treatment scenario. These are models and simulations, not external publicly available datasets with specific access information (links, DOIs, or citations). |
| Dataset Splits | No | The paper discusses synthetic models and simulations for its experiments, rather than predefined datasets with explicit splits. While it mentions 'training and validation data sets' in the context of AMCI training (Appendix D), it does not provide specific split percentages or sample counts for experimental evaluation, nor does it reference standard benchmark splits with citations. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processors, or memory amounts used for running its experiments. It focuses on the methodological aspects and experimental results without specifying the underlying computational infrastructure. |
| Software Dependencies | No | The paper mentions software such as 'normalizing flows (Rezende and Mohamed, 2015)', 'conditional masked autoregressive flows (CMAF) (Papamakarios et al., 2017)', the 'implementation from http://github.com/ikostrikov/pytorch-flows', and the 'Adam optimizer (Kingma and Ba, 2015)'. However, it does not provide version numbers for any of these software components or libraries, which would be needed for reproducible software dependencies. |
| Experiment Setup | Yes | For the Gaussian example: 'We draw R = 200 samples from each qt(x) between each proposal update... We further take N = M for TAAIS. ...We take Σmin = 0.4² when targeting γ2(x) and Σmin = 0.2² when targeting γ1(x).' For the banana example: 'We use S = 40 such chains and draw R = 200 samples from each qt(x). ...choose the following covariance setups for the different problem configurations: [fa, γ2] Σ = 36I and ΣMCMC = 2.25I; [fa, γ1⁺ and γ1⁻] Σ = 2.25I and ΣMCMC = 2.25I; [fa, γ1] Σ = 9I and ΣMCMC = 2.25I; [fb, all γ] Σ = 16I and ΣMCMC = I.' For the tail integral calculation: 'For the one-dimensional case, our flow comprised 10 radial flow layers... This network was comprised of 3 fully connected layers with 1000 hidden units in each layer and ReLU activation functions. For the more challenging five-dimensional case, we instead used conditional masked autoregressive flows (CMAF)... with 4 flow layers of 1024 hidden units each. ...The Adam optimizer (Kingma and Ba, 2015) was adopted for both, with learning rates of 10⁻² and 10⁻⁴ respectively.' For the cancer treatment: 'We then train a single-layer perceptron with 500 hidden units to predict the parameters of these distributions as a function of (c0, c5). ...Training was performed using the Adam optimizer with a learning rate of 10⁻⁴.' For TANS: 'Step 2(c) above is conducted by running 20 steps of a Metropolis-Hastings chain... The proposal for this sampler is based on an isotropic Gaussian with fixed variance. ...The variance of the proposal was set to I, 0.09I, and 0.01I for D = 10, 25, and 50 respectively.' For TAAnIS: 'We use n = 200 intermediate distributions and each τi is a 5-step Metropolis-Hastings chain... with covariances of 0.1225I, 0.04I, and 0.01I for D = 10, 25, and 50 respectively.' |