Quantifying the Effectiveness of Linear Preconditioning in Markov Chain Monte Carlo
Authors: Max Hird, Samuel Livingstone
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude with a numerical study comparing preconditioners in different models, and we show how proper preconditioning can greatly reduce compute time in Hamiltonian Monte Carlo. Keywords: Markov chain Monte Carlo, Preconditioning, Bayesian inference, Bayesian Computation, Condition Number |
| Researcher Affiliation | Academia | Max Hird (EMAIL), Department of Statistical Science, University College London, 1-19 Torrington Place, London WC1E 7HB; Samuel Livingstone (EMAIL), Department of Statistical Science, University College London, 1-19 Torrington Place, London WC1E 7HB |
| Pseudocode | Yes | Algorithm 1: Metropolized Markov chain. Input: chain length N, initial distribution µ, initial state X_0 ∼ µ, proposal parameters θ. Output: Π-invariant Markov chain {X_i}_{i=1}^N |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its own source code, nor does it provide a link to a code repository for the methodology described. It mentions using third-party software like Stan and TensorFlow Probability, but this is not an indication of the authors' own code release. |
| Open Datasets | No | The paper generates synthetic data for its experiments. For instance, in Section 4.2: "Every element of X is an independent standard normal random variable, and Y is generated by sampling β_0 from the prior and setting Y = Xβ_0 + ϵ with ϵ ∼ N(0, I_d), meaning σ = 1." In Section 4.3: "We generate the design matrix X ∈ R^(n×d) with X = G + M where G_ij ∼ N(0, 1) independently and M_ij = µ for all i ∈ [n], j ∈ [d]." It does not use or provide access to any publicly available datasets. |
| Dataset Splits | No | The paper does not explicitly describe training/test/validation dataset splits. The experiments involve simulating Markov chains and evaluating effective sample size over a certain number of iterations, rather than typical dataset splitting for supervised learning tasks. The data itself is generated synthetically per experiment. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as CPU, GPU models, or memory specifications used for running its experiments. It only mentions general computational concepts but no concrete hardware. |
| Software Dependencies | No | The paper mentions software components like "Stan", the "TensorFlow Probability library", and the "coda package", but does not provide specific version numbers for any of them. For example: "the effectiveSize function from the coda package (Plummer et al., 2006)". |
| Experiment Setup | Yes | In Section 4.2, the paper specifies various parameters for the MALA chains: "We set d ∈ {2, 5, 10, 20, 100} and n = {1, 5, 20} × d for each value of d. At each combination of n and d we run 15 chains for each preconditioner. Each chain is composed by initialising at β = (X^T X)^(-1) X^T Y and taking 10^4 samples to equilibrate. We initialise the step size at d^(-1/6) and adapt it along the course of the chain seeking an optimal acceptance rate of 0.574 according to the results of Roberts and Rosenthal (2001). We then continue the chain with preconditioning and a fixed step size of d^(-1/6) for a further 10^4 samples, over which we measure the ESS of each dimension." Similar details are provided for other experiments, such as acceptance rates and initialization strategies. |
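The Section 4.2 data-generating process quoted above (standard normal design, Y = Xβ_0 + ϵ with unit noise) can be sketched in a few lines of NumPy. The sizes d and n and the standard normal prior on β_0 are illustrative assumptions here; the paper sweeps over several (n, d) pairs and does not restate its prior in the quoted excerpt.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5            # illustrative dimension; the paper uses d in {2, 5, 10, 20, 100}
n = 5 * d        # illustrative sample size; the paper uses n in {1, 5, 20} * d

# Design matrix: every element an independent standard normal draw (Section 4.2)
X = rng.standard_normal((n, d))

# Sample beta_0 from the prior (assumed standard normal here, an illustration)
beta0 = rng.standard_normal(d)

# Y = X beta_0 + eps with eps ~ N(0, I), i.e. sigma = 1
Y = X @ beta0 + rng.standard_normal(n)

# The chains in the paper are initialised at the least-squares solution
beta_init = np.linalg.solve(X.T @ X, X.T @ Y)
```

The least-squares initialisation matches the quoted β = (X^T X)^(-1) X^T Y, computed here with a linear solve rather than an explicit inverse.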
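The preconditioned MALA updates the paper benchmarks can be sketched on a toy target. This is a minimal illustration, not the paper's code: the two-dimensional ill-conditioned Gaussian target, the step size h, and the comparison of an identity preconditioner against the target covariance are all assumptions made for the example.

```python
import numpy as np

# Toy ill-conditioned Gaussian target: pi = N(0, Sigma), condition number 100
Sigma = np.diag([1.0, 100.0])
Sigma_inv = np.linalg.inv(Sigma)

def log_pi(x):
    return -0.5 * x @ Sigma_inv @ x

def grad_log_pi(x):
    return -Sigma_inv @ x

def mala_chain(n_iter, h, M, rng):
    """Preconditioned MALA: propose y = x + (h^2/2) M grad + h sqrt(M) xi."""
    L = np.linalg.cholesky(M)       # sqrt(M) via Cholesky factor
    M_inv = np.linalg.inv(M)
    x = np.zeros(2)
    lp, g = log_pi(x), grad_log_pi(x)
    accepts, samples = 0, []
    for _ in range(n_iter):
        mean_x = x + 0.5 * h**2 * (M @ g)
        y = mean_x + h * (L @ rng.standard_normal(2))
        lp_y, g_y = log_pi(y), grad_log_pi(y)
        mean_y = y + 0.5 * h**2 * (M @ g_y)
        # log q(x|y) - log q(y|x) for N(mean, h^2 M) proposals
        dq = (-0.5 / h**2) * ((x - mean_y) @ M_inv @ (x - mean_y)
                              - (y - mean_x) @ M_inv @ (y - mean_x))
        if np.log(rng.uniform()) < lp_y - lp + dq:
            x, lp, g = y, lp_y, g_y
            accepts += 1
        samples.append(x)
    return np.array(samples), accepts / n_iter

# Same step size, identity preconditioner vs. the target covariance
_, acc_id = mala_chain(2000, 0.8, np.eye(2), np.random.default_rng(2))
_, acc_pre = mala_chain(2000, 0.8, Sigma, np.random.default_rng(2))
```

With M equal to the target covariance the chain behaves as MALA on a well-conditioned (identity-covariance) target, which is the idealised linear preconditioning whose benefit the paper quantifies; in practice one would also adapt h toward the 0.574 acceptance rate cited above.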