Quantifying the Effectiveness of Linear Preconditioning in Markov Chain Monte Carlo

Authors: Max Hird, Samuel Livingstone

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conclude with a numerical study comparing preconditioners in different models, and we show how proper preconditioning can greatly reduce compute time in Hamiltonian Monte Carlo." Keywords: Markov chain Monte Carlo, Preconditioning, Bayesian inference, Bayesian Computation, Condition Number
Researcher Affiliation | Academia | Max Hird (EMAIL), Department of Statistical Science, University College London, 1-19 Torrington Place, London WC1E 7HB; Samuel Livingstone (EMAIL), Department of Statistical Science, University College London, 1-19 Torrington Place, London WC1E 7HB
Pseudocode | Yes | Algorithm 1: Metropolized Markov chain. Input: chain length N, initial distribution µ, initial state X_0 ∼ µ, proposal parameters θ. Output: Π-invariant Markov chain {X_i}_{i=1}^N
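For reference, a minimal runnable sketch of a Metropolized chain of this shape. The symmetric Gaussian random-walk proposal and the standard-normal target are illustrative assumptions; the paper's Algorithm 1 is generic in the proposal parameters θ.

```python
import numpy as np

def metropolized_chain(log_target, x0, n_steps, step_size, rng):
    """Generic Metropolis chain with a symmetric Gaussian random-walk proposal.

    Returns an array of n_steps + 1 states, starting from x0.
    """
    x = np.asarray(x0, dtype=float)
    chain = [x.copy()]
    for _ in range(n_steps):
        prop = x + step_size * rng.standard_normal(x.shape)
        # Symmetric proposal: accept with probability min(1, pi(prop)/pi(x))
        if np.log(rng.uniform()) < log_target(prop) - log_target(x):
            x = prop
        chain.append(x.copy())
    return np.array(chain)

rng = np.random.default_rng(0)
# Illustrative target: standard normal in 2 dimensions
samples = metropolized_chain(lambda x: -0.5 * np.sum(x**2),
                             np.zeros(2), 2000, 0.8, rng)
```

The Π-invariance follows from the usual Metropolis accept/reject correction; only the proposal family changes between the algorithms the paper compares.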
Open Source Code | No | The paper does not explicitly state that its own source code is released, nor does it link to a code repository for the described methodology. It mentions using third-party software such as Stan and TensorFlow Probability, but that does not constitute a release of the authors' own code.
Open Datasets | No | The paper generates synthetic data for its experiments. For instance, in Section 4.2: "Every element of X is an independent standard normal random variable, and Y is generated by sampling β0 from the prior and setting Y = Xβ0 + ϵ with ϵ ∼ N(0, I_d), meaning σ = 1." In Section 4.3: "We generate the design matrix X ∈ R^{n×d} with X = G + M where G_ij ∼ N(0, 1) independently and M_ij = µ for all i ∈ [n], j ∈ [d]." It does not use or provide access to any publicly available datasets.
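Under the quoted descriptions, the synthetic data could be generated roughly as follows. The prior on β0 is not specified in this excerpt, so a standard normal prior is assumed here purely for illustration.

```python
import numpy as np

def make_regression_data(n, d, rng):
    """Section 4.2 style data: Y = X beta0 + eps with standard normal design."""
    X = rng.standard_normal((n, d))          # independent N(0, 1) entries
    beta0 = rng.standard_normal(d)           # ASSUMPTION: standard normal prior
    Y = X @ beta0 + rng.standard_normal(n)   # eps ~ N(0, I), i.e. sigma = 1
    return X, Y, beta0

def make_shifted_design(n, d, mu, rng):
    """Section 4.3 style design: X = G + M, G_ij ~ N(0,1), M_ij = mu."""
    return rng.standard_normal((n, d)) + mu

rng = np.random.default_rng(1)
X, Y, beta0 = make_regression_data(100, 5, rng)
Xs = make_shifted_design(100, 5, 3.0, rng)
```

The constant shift µ in the second design controls how ill-conditioned X^T X becomes, which is what makes preconditioning matter in that experiment.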
Dataset Splits | No | The paper does not describe training/test/validation dataset splits. The experiments simulate Markov chains and evaluate effective sample size over a number of iterations, rather than using the splits typical of supervised learning; the data are generated synthetically for each experiment.
Hardware Specification | No | The paper does not provide specific hardware details such as CPU or GPU models or memory capacity. It discusses computational cost in general terms but names no concrete hardware.
Software Dependencies | No | The paper mentions software components such as Stan, the TensorFlow Probability library, and the coda package, but does not provide version numbers for any of them. For example: "the effectiveSize function from the coda package (Plummer et al., 2006)".
Experiment Setup | Yes | In Section 4.2, the paper specifies various parameters for the MALA chains: "We set d ∈ {2, 5, 10, 20, 100} and n ∈ {1, 5, 20} × d for each value of d. At each combination of n and d we run 15 chains for each preconditioner. Each chain is composed by initialising at β̂ = (X^T X)^{-1} X^T Y and taking 10^4 samples to equilibrate. We initialise the step size at d^{-1/6} and adapt it along the course of the chain, seeking an optimal acceptance rate of 0.574 according to the results of Roberts and Rosenthal (2001). We then continue the chain with preconditioning and a fixed step size of d^{-1/6} for a further 10^4 samples, over which we measure the ESS of each dimension." Similar details are provided for other experiments, such as acceptance rates and initialization strategies.
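A hedged sketch of a preconditioned MALA step matching the quoted setup (step size d^{-1/6}, initialisation at the least-squares estimate). The Gaussian posterior with a flat prior and the choice of the inverse precision as preconditioner are illustrative assumptions, not claims about the paper's exact preconditioners.

```python
import numpy as np

def mala_step(x, log_pi, grad_log_pi, h, M, L, rng):
    """One MALA step preconditioned by M = L L^T.

    Proposal: y = x + (h^2/2) M grad(x) + h L xi,  xi ~ N(0, I).
    """
    def prop_logpdf(y, x):
        mean = x + 0.5 * h**2 * M @ grad_log_pi(x)
        z = np.linalg.solve(L, y - mean) / h
        return -0.5 * z @ z  # up to a constant that cancels in the ratio
    y = x + 0.5 * h**2 * M @ grad_log_pi(x) + h * L @ rng.standard_normal(x.shape)
    log_alpha = (log_pi(y) + prop_logpdf(x, y)) - (log_pi(x) + prop_logpdf(y, x))
    return (y, True) if np.log(rng.uniform()) < log_alpha else (x, False)

rng = np.random.default_rng(2)
n, d = 100, 5
X = rng.standard_normal((n, d))
Y = X @ rng.standard_normal(d) + rng.standard_normal(n)
P = X.T @ X                              # posterior precision (flat-prior assumption)
beta_hat = np.linalg.solve(P, X.T @ Y)   # least-squares initialisation, as quoted
log_pi = lambda b: -0.5 * (b - beta_hat) @ P @ (b - beta_hat)
grad = lambda b: -P @ (b - beta_hat)
M = np.linalg.inv(P)                     # ASSUMPTION: inverse-precision preconditioner
Lc = np.linalg.cholesky(M)
h = d ** (-1 / 6)                        # step size from the quoted setup
b, acc = beta_hat.copy(), 0
for _ in range(2000):
    b, accepted = mala_step(b, log_pi, grad, h, M, Lc, rng)
    acc += accepted
```

With a well-chosen M the proposal is effectively operating on a well-conditioned target, which is the mechanism the paper quantifies; the adaptive tuning toward an acceptance rate of 0.574 is omitted here for brevity.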