SoftCVI: Contrastive variational inference with self-generated soft labels
Authors: Daniel Ward, Mark Beaumont, Matteo Fasiolo
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically investigate the performance on a variety of Bayesian inference tasks, using both simple (e.g. normal) and expressive (normalizing flow) variational distributions. We find that SoftCVI can be used to form objectives which are stable to train and mass-covering, frequently outperforming inference with other variational approaches. |
| Researcher Affiliation | Academia | Daniel Ward1, Mark Beaumont2, Matteo Fasiolo1 1School of Mathematics, University of Bristol, UK 2School of Biological Sciences, University of Bristol, UK |
| Pseudocode | Yes | An algorithm outlining the overall approach of SoftCVI is shown in algorithm 1. Algorithm 1: SoftCVI |
| Open Source Code | Yes | We provide a pair of Python packages, `pyrox` and `softcvi_validation`, which provide the implementation and the code for reproducing the results of this paper, respectively. |
| Open Datasets | Yes | For tasks where the reference posterior is created through sampling methods, we relied on reference posteriors provided by PosteriorDB (Magnusson et al., 2024) or the SBI benchmark (Lueckmann et al., 2021), and for each run sampled from the available observations if multiple are present. |
| Dataset Splits | Yes | We use 300 training data points, 150 validation data points and 1000 testing data points for computing the metrics. |
| Hardware Specification | Yes | Run time / s (measured on a CPU with 8GB RAM, including compilation time). |
| Software Dependencies | No | Our implementation and experiments made wide use of the Python packages JAX (Bradbury et al., 2018), equinox (Kidger & Garcia, 2021), NumPyro (Phan et al., 2019), FlowJAX (Ward, 2024) and optax (DeepMind et al., 2020). While the FlowJAX reference mentions version 14.0.0, explicit version numbers are not given for JAX, equinox, NumPyro, or optax in the text describing their usage. |
| Experiment Setup | Yes | For all methods, K = 8 samples from qϕ(θ) were used when computing the objectives, and optimization used the Adam optimizer (Kingma & Ba, 2014). The reported step counts differ between experiments: 50,000 optimization steps in one setup, and 100,000 steps with a batch size of 1 (again with K = 8) in another. |
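The paper's Algorithm 1 is not reproduced in this review. As a rough illustration of the "self-generated soft labels" idea named in the title, the sketch below computes a soft (probabilistic) label for each of K variational samples as a self-normalized softmax over log-ratios between an unnormalized posterior and a proposal density. The function name `soft_labels`, the log-ratio construction, and the toy numbers are this review's assumptions for illustration, not the authors' exact formulation.

```python
import math

def soft_labels(log_unnorm_post, log_proposal):
    """Hypothetical sketch: soft labels as a softmax over log-ratios.

    For K samples theta_1..theta_K, the label for sample k is
    proportional to p~(theta_k, x) / pi(theta_k), normalized over k.
    This is an illustrative guess at the construction, not Algorithm 1.
    """
    logits = [lp - lq for lp, lq in zip(log_unnorm_post, log_proposal)]
    # Stable softmax: subtract the max logit before exponentiating.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# K = 8 samples, matching the reported setup; toy log-densities.
labels = soft_labels(
    [-1.0, -2.0, -3.0, -4.0, -1.5, -2.5, -0.5, -5.0],  # log p~(theta_k, x)
    [-2.0] * 8,                                          # log pi(theta_k)
)
print(labels)
```

Labels produced this way sum to one, so they can be plugged into a standard cross-entropy classification objective, which is broadly how the abstract describes recasting variational inference as a contrastive estimation problem.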