Graph-accelerated Markov Chain Monte Carlo using Approximate Samples
Authors: Leo L. Duan, Anirban Bhattacharya
JMLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate improved mixing performances for challenging problems, such as those involving multiple modes, non-convex density contour, or large-dimension latent variables. Section 4. Simulations Section 5. Application: Estimating Latent Gaussian Model for Power Outage Data |
| Researcher Affiliation | Academia | Leo L Duan EMAIL Department of Statistics University of Florida, USA Anirban Bhattacharya EMAIL Department of Statistics Texas A&M University, USA |
| Pseudocode | Yes | In each iteration, with probability (1 w), the sampler will update θ using K(θt, ), the baseline algorithm; with probability w, the sampler will use Q(θt, ) to take a graph jump consisting of the following steps: 1. (Project to a node) Find the projection of θt to one βj, j = N(θt) := arg minl βl θt . 2. (Walk on the graph) Draw a new node i uniformly from the ball B(j; r). 3. (Relaxation from βi) Draw a proposal θ from a relaxation distribution F(θ | βi, θt). 4. (Metropolis Hastings adjustment) Accept θ as θt+1 with probability α(θt, θ ) = min 1, Π(θ | y)|B(i; r)| 1F(θt | βj, θ ) Π(θt | y)|B(j; r)| 1F(θ | βi, θt) 1 N(θ ) = i ; (2) otherwise keep θt+1 as the same as θt. |
| Open Source Code | Yes | The software is hosted and maintained on github repository under the following link: https://github.com/leoduan/graph_acc_mcmc |
| Open Datasets | No | The paper describes a dataset: "We use the power outage count for a zip code area in the south of Florida collected during a 90-day time period in the 2009 hurricane season." However, it does not provide any concrete access information (link, DOI, repository, or citation) for this dataset to be publicly available. |
| Dataset Splits | No | The paper mentions simulated data sizes ("n {100, 500, 1000, 2000}") and describes the power outage count data, but does not provide specific train/test/validation dataset splits, methodology, or references to predefined splits for any of the data used in the experiments. |
| Hardware Specification | Yes | it takes about 2 seconds to find the minimum spanning tree for m = 1600 on a laptop with Apple Silicon processor. |
| Software Dependencies | No | The paper mentions software like "numpyro package (Phan et al., 2019)" and "JAX package (Bradbury et al., 2018)" and references "Abril-Pla et al. (2023) for its sampler implementation" (referring to PyMC), but it does not specify version numbers for these software dependencies within the text. |
| Experiment Setup | Yes | In each iteration, with probability (1 w), the sampler will update θ using K(θt, ), the baseline algorithm; with probability w, the sampler will use Q(θt, ) to take a graph jump For acceleration, we use a variational distribution β... and use the graph-accelerated algorithm in (1) with w = 0.3 and Gaussian for F. We run both the baseline and accelerated algorithms for 10,000 iterations Using random-walk Metropolis as the baseline algorithm, we tweak s to around 0.5 so that the Metropolis acceptance rate is around 0.234. We run the algorithm for 3000 iterations and use the last 2000 as a Markov chain sample. We run the accelerated algorithm for 2000 iterations with w = 0.5 and r = 3, and report the empirical acceptance rate as the number of accepted graph jump steps divided by 1000, as equal to 2000 * 0.5. In prior specification, we use h Inverse-Gamma(2, 1) for the bandwidth, r N(0, )(0, 1) half Gaussian for the inverse dispersion parameter, τ Inverse-Gamma(2, 1) for the scale. We run the baseline algorithm for 20,000 iterations, and treat the first 5,000 as burn-in. we construct the graph and run the accelerated algorithm for 20,000 iterations (with the first 5,000 as burn-in). |