SoftCVI: Contrastive variational inference with self-generated soft labels
Authors: Daniel Ward, Mark Beaumont, Matteo Fasiolo
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically investigate the performance on a variety of Bayesian inference tasks, using both simple (e.g. normal) and expressive (normalizing flow) variational distributions. We find that SoftCVI can be used to form objectives which are stable to train and mass-covering, frequently outperforming inference with other variational approaches. |
| Researcher Affiliation | Academia | Daniel Ward1, Mark Beaumont2, Matteo Fasiolo1 1School of Mathematics, University of Bristol, UK 2School of Biological Sciences, University of Bristol, UK |
| Pseudocode | Yes | An algorithm outlining the overall approach of SoftCVI is shown in algorithm 1. Algorithm 1: SoftCVI |
| Open Source Code | Yes | We provide a pair of Python packages, `pyrox` and `softcvi_validation`, which provide the implementation and the code for reproducing the results of this paper, respectively. |
| Open Datasets | Yes | For tasks where the reference posterior is created through sampling methods, we relied on reference posteriors provided by PosteriorDB (Magnusson et al., 2024) or the SBI benchmark (Lueckmann et al., 2021), and for each run sampled from the available observations if multiple are present. |
| Dataset Splits | Yes | We use 300 training data points, 150 validation data points and 1000 testing data points for computing the metrics. |
| Hardware Specification | Yes | Run time / s (measured on a CPU with 8GB RAM, including compilation time). |
| Software Dependencies | No | Our implementation and experiments made wide use of the Python packages JAX (Bradbury et al., 2018), equinox (Kidger & Garcia, 2021), NumPyro (Phan et al., 2019), FlowJAX (Ward, 2024) and optax (DeepMind et al., 2020). While the FlowJAX reference mentions version 14.0.0, explicit version numbers are not given for JAX, equinox, NumPyro, or optax in the text describing their usage. |
| Experiment Setup | Yes | For all methods, K = 8 samples from qϕ(θ) were used when computing the objectives, and optimization used the Adam optimizer (Kingma & Ba, 2014). The reported step counts differ between experiments: 50,000 optimization steps in one setup, and 100,000 steps with a batch size of 1 (again with K = 8) in another. |
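The paper's Algorithm 1 is not reproduced in this review. As a rough illustration of the "self-generated soft labels" idea named in the title, the sketch below computes a soft (probabilistic) label for each of K variational samples as a self-normalized softmax over log-ratios between an unnormalized posterior and a proposal density. The function name `soft_labels`, the log-ratio construction, and the toy numbers are this review's assumptions for illustration, not the authors' exact formulation.

```python
import math

def soft_labels(log_unnorm_post, log_proposal):
    """Hypothetical sketch: soft labels as a softmax over log-ratios.

    For K samples theta_1..theta_K, the label for sample k is
    proportional to p~(theta_k, x) / pi(theta_k), normalized over k.
    This is an illustrative guess at the construction, not Algorithm 1.
    """
    logits = [lp - lq for lp, lq in zip(log_unnorm_post, log_proposal)]
    # Stable softmax: subtract the max logit before exponentiating.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# K = 8 samples, matching the reported setup; toy log-densities.
labels = soft_labels(
    [-1.0, -2.0, -3.0, -4.0, -1.5, -2.5, -0.5, -5.0],  # log p~(theta_k, x)
    [-2.0] * 8,                                          # log pi(theta_k)
)
print(labels)
```

Labels produced this way sum to one, so they can be plugged into a standard cross-entropy classification objective, which is broadly how the abstract describes recasting variational inference as a contrastive estimation problem.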