Target–Aware Bayesian Inference: How to Beat Optimal Conventional Estimators
Authors: Tom Rainforth, Adam Goliński, Frank Wood, Sheheryar Zaidi
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show empirically that it can also breach this limit in practice. We utilize our TABI framework by combining it with adaptive importance sampling approaches and show both theoretically and empirically that the resulting estimators are capable of converging faster than the standard O(1/N) Monte Carlo rate, potentially producing rates as fast as O(1/N²). See Sections 3.3 (An Empirical Demonstration), 4.5 (Experiments), and 5.3 (Experiments). |
| Researcher Affiliation | Academia | Tom Rainforth EMAIL Department of Statistics, University of Oxford, 29 St Giles, Oxford, OX1 3LB, United Kingdom; Adam Goliński EMAIL Department of Statistics and Department of Engineering Science, University of Oxford, 29 St Giles, Oxford, OX1 3LB, United Kingdom; Frank Wood EMAIL Department of Computer Science, University of British Columbia, 2366 Main Mall 201, Vancouver, BC V6T 1Z4, Canada; Sheheryar Zaidi EMAIL Department of Statistics, University of Oxford, 29 St Giles, Oxford, OX1 3LB, United Kingdom |
| Pseudocode | No | The paper describes methods and algorithms verbally and in mathematical notation (e.g., Section 6.1 Target Aware Nested Sampling outlines steps for NS). However, it does not contain clearly labeled 'Pseudocode' or 'Algorithm' blocks with structured, code-like formatting. |
| Open Source Code | Yes | Code for these experiments and others is available at https://github.com/twgr/tabi. An implementation for AMCI and our associated experiments is available at http://github.com/talesa/amci. |
| Open Datasets | No | The paper uses various synthetic models (e.g., Gaussian model, banana problem, tail integral calculation) and a simulated cancer treatment scenario. These are models and simulations, not external publicly available datasets with specific access information (links, DOIs, or citations). |
| Dataset Splits | No | The paper discusses synthetic models and simulations for its experiments, rather than predefined datasets with explicit splits. While it mentions 'training and validation data sets' in the context of AMCI training (Appendix D), it does not provide specific split percentages or sample counts for experimental evaluation, nor does it reference standard benchmark splits with citations. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processors, or memory amounts used for running its experiments. It focuses on the methodological aspects and experimental results without specifying the underlying computational infrastructure. |
| Software Dependencies | No | The paper mentions software such as 'normalizing flows (Rezende and Mohamed, 2015)', 'conditional masked autoregressive flows (CMAF) (Papamakarios et al., 2017)', the 'implementation from http://github.com/ikostrikov/pytorch-flows', and the 'Adam optimizer (Kingma and Ba, 2015)'. However, it does not provide version numbers for any of these software components or libraries, which would be needed for reproducible software dependencies. |
| Experiment Setup | Yes | For the Gaussian example: 'We draw R = 200 samples from each qt(x) between each proposal update... We further take N = M for TAAIS. ...We take Σmin = 0.4² when targeting γ2(x) and Σmin = 0.2² when targeting γ1(x).' For the banana example: 'We use S = 40 such chains and draw R = 200 samples from each qt(x). ...choose the following covariance setups for the different problem configurations: [fa, γ2] Σ = 36I and ΣMCMC = 2.25I; [fa, γ1⁺ and γ1⁻] Σ = 2.25I and ΣMCMC = 2.25I; [fa, γ1] Σ = 9I and ΣMCMC = 2.25I; [fb, all γ] Σ = 16I and ΣMCMC = I.' For the tail integral calculation: 'For the one-dimensional case, our flow comprised 10 radial flow layers... This network was comprised of 3 fully connected layers with 1000 hidden units in each layer and ReLU activation functions. For the more challenging five-dimensional case, we instead used conditional masked autoregressive flows (CMAF)... with 4 flow layers of 1024 hidden units each. ...The Adam optimizer (Kingma and Ba, 2015) was adopted for both, with learning rates of 10⁻² and 10⁻⁴ respectively.' For the cancer treatment: 'We then train a single-layer perceptron with 500 hidden units to predict the parameters of these distributions as a function of (c0, c5). ...Training was performed using the Adam optimizer with a learning rate of 10⁻⁴.' For TANS: 'Step 2(c) above is conducted by running 20 steps of a Metropolis-Hastings chain... The proposal for this sampler is based on an isotropic Gaussian with fixed variance. ...The variance of the proposal was set to I, 0.09I, and 0.01I for D = 10, 25, and 50 respectively.' For TAAnIS: 'We use n = 200 intermediate distributions and each τi is a 5-step Metropolis-Hastings chain... with covariances of 0.1225I, 0.04I, and 0.01I for D = 10, 25, and 50 respectively.' |