Stacking Variational Bayesian Monte Carlo
Authors: Francesco Silvestrin, Chengkun Li, Luigi Acerbi
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we introduce Stacking Variational Bayesian Monte Carlo (S-VBMC), a method that overcomes this limitation by constructing a robust, global posterior approximation from multiple independent VBMC runs. Our approach merges these local approximations through a principled and inexpensive post-processing step that leverages VBMC's mixture posterior representation and per-component evidence estimates. Crucially, S-VBMC requires no additional likelihood evaluations and is naturally parallelisable, fitting seamlessly into existing inference workflows. We demonstrate its effectiveness on two synthetic problems designed to challenge VBMC's exploration and two real-world applications from computational neuroscience, showing substantial improvements in posterior approximation quality across all cases. |
| Researcher Affiliation | Academia | Francesco Silvestrin EMAIL Department of Computer Science University of Helsinki Chengkun Li EMAIL Department of Computer Science University of Helsinki Luigi Acerbi EMAIL Department of Computer Science University of Helsinki |
| Pseudocode | Yes | Algorithm 1 S-VBMC. Input: outputs of M VBMC runs {ϕ_m, Î_m, ELBO(ϕ_m)}_{m=1}^{M}. 1. Stack the posteriors to obtain a mixture of Σ_{m=1}^{M} K_m Gaussians. 2. Reparametrise the weights with unconstrained logits a. 3. Initialise the logits as in Eq. 16. Repeat: (a) compute the normalised weights w̃ as in Eq. 15; (b) estimate ELBO_stacked(w̃) as in Eq. 11, with the entropy approximated as in Eq. 12; (c) compute dELBO_stacked(a)/da; (d) update a with a step of Adam (Kingma & Ba, 2014); until ELBO_stacked(w̃) converges. Return ϕ̃, ELBO_stacked. |
| Open Source Code | Yes | Our code is available as a Python package at https://github.com/acerbilab/svbmc. |
| Open Datasets | Yes | Our second real-world problem involved fitting a visuo-vestibular causal inference model to empirical data from a representative participant (S1 from Acerbi et al., 2018). In each trial of the modelled experiment, participants seated in a moving chair reported whether they perceived their movement direction (s_vest) as congruent with an experimentally-manipulated looming visual field (s_vis). The model assumes participants receive noisy sensory measurements, with vestibular information z_vest ∼ N(s_vest, σ²_vest) and visual information z_vis ∼ N(s_vis, σ²_vis(c)), where σ²_vest and σ²_vis represent sensory noise variances. |
| Dataset Splits | No | The paper uses synthetic GMM and ring targets, and empirical data for neuronal and multisensory models, but does not provide explicit training/test/validation splits for these datasets. It describes the data source for real-world problems but not how it was partitioned. |
| Hardware Specification | Yes | All the experiments presented in this work were run on an AMD EPYC 7452 Processor with 16GB of RAM. |
| Software Dependencies | No | The paper mentions software like the 'pyvbmc Python package', 'Adam optimiser', and 'NEURON simulation environment', but does not provide specific version numbers for these tools. |
| Experiment Setup | Yes | We performed all VBMC runs using the default settings (which always resulted in posteriors with 50 components, K_m = 50 ∀m ∈ {1, ..., M}) and random uniform initialisation within plausible parameter bounds (Acerbi, 2018). For all benchmark problems, the entropy is approximated as in Eq. 12 with S = 20 during optimisation, and a final estimation (reported in all tables and figures) is performed with S = 100 after convergence. The ELBO is optimised using Adam (Kingma & Ba, 2014) with learning rate set to 0.1. For a fair comparison with S-VBMC, we set the target evaluation budget for a BBVI run to 2000(D + 2) and 3000(D + 2) evaluations for noiseless and noisy problems, respectively, matching the maximum evaluations used by 40 VBMC runs in total. |
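To make the stacking step described in Algorithm 1 concrete, below is a minimal, self-contained 1-D sketch. It is *not* the paper's implementation: the component means, standard deviations, and per-component evidence estimates `I_hat` are made-up placeholders, the logits are initialised uniformly rather than via the paper's Eq. 16, and the mixture entropy is computed by deterministic quadrature on a grid instead of the Monte Carlo estimate of Eq. 12. What it does show faithfully is the core idea: pool components from several runs, reparametrise the mixture weights with a softmax over unconstrained logits, and re-optimise only those logits with Adam to maximise a stacked ELBO, with no further likelihood evaluations.

```python
import numpy as np

# Hypothetical pooled components from several VBMC-style runs (all values
# invented for illustration): means, stds, and per-component evidence
# estimates I_hat (the Î_m of Algorithm 1, flattened across runs).
mus    = np.array([-2.0, -1.5, 1.0, 1.8])
sigmas = np.array([ 0.5,  0.7, 0.6, 0.4])
I_hat  = np.array([-1.2, -1.0, -0.8, -1.1])

# Fixed 1-D quadrature grid; component densities q_k(x), shape (G, K).
grid = np.linspace(-6.0, 6.0, 2000)
dx = grid[1] - grid[0]
comp = np.exp(-0.5 * ((grid[:, None] - mus) / sigmas) ** 2) \
       / (sigmas * np.sqrt(2.0 * np.pi))

def softmax(a):
    """Map unconstrained logits to normalised mixture weights."""
    e = np.exp(a - a.max())
    return e / e.sum()

def neg_elbo(a):
    """Negative stacked ELBO: -(w . I_hat + H[q_w]), entropy by quadrature."""
    w = softmax(a)
    q = comp @ w                                  # stacked mixture density
    entropy = -np.sum(q * np.log(q + 1e-300)) * dx
    return -(w @ I_hat + entropy)

def num_grad(f, a, h=1e-5):
    """Central finite-difference gradient (fine here: objective is smooth)."""
    return np.array([(f(a + h * e) - f(a - h * e)) / (2.0 * h)
                     for e in np.eye(len(a))])

# Adam on the logits, learning rate 0.1 as in the paper's setup.
a = np.zeros(len(mus))                 # uniform init (placeholder for Eq. 16)
elbo_init = -neg_elbo(a)
m = np.zeros_like(a); v = np.zeros_like(a)
lr, b1, b2, eps = 0.1, 0.9, 0.999, 1e-8
for t in range(1, 301):
    g = num_grad(neg_elbo, a)
    m = b1 * m + (1.0 - b1) * g
    v = b2 * v + (1.0 - b2) * g ** 2
    a -= lr * (m / (1.0 - b1 ** t)) / (np.sqrt(v / (1.0 - b2 ** t)) + eps)

w_final = softmax(a)
elbo_final = -neg_elbo(a)
print("stacked weights:", np.round(w_final, 3))
print("ELBO: %.3f -> %.3f" % (elbo_init, elbo_final))
```

Because only the K pooled weights are optimised (the Gaussian components themselves stay frozen), the objective is cheap to evaluate and the whole post-processing step needs no new calls to the target model, which is the property the abstract highlights.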