Stochastic variance-reduced Gaussian variational inference on the Bures-Wasserstein manifold

Authors: Hoang Phuc Hau Luu, Hanlin Yu, Bernardo Williams, Marcelo Hartmann, Arto Klami

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically demonstrate that the proposed estimator gains order-of-magnitude improvements over previous Bures-Wasserstein methods. We demonstrate the method on a collection of controlled problems, comparing it against recent methods for VI on the BW manifold, namely BWGD (Lambert et al., 2022) and SGVI (Diao et al., 2023).
Researcher Affiliation | Academia | Department of Computer Science, University of Helsinki, Helsinki, Finland
Pseudocode | Yes | Algorithm 1: Stochastic variance-reduced Gaussian Variational Inference (SVRGVI)
Open Source Code | Yes | Our code is available at https://github.com/MCS-hub/vr25
Open Datasets | No | Even though our experiments focused on synthetic targets, they expanded on the previous experimentation of VI optimized in the BW space. We confirm the finding of Diao et al. (2023) that BWGD and SGVI are effectively identical except for the instability of the former, but now show how their performance degrades in higher dimensions while our algorithm remains effective.
Dataset Splits | No | The paper uses synthetic targets (Gaussian, Student's t, Bayesian logistic regression) that are approximated by variational inference, not traditional datasets requiring training/validation/test splits for model evaluation.
Hardware Specification | No | "The authors wish to acknowledge CSC IT Center for Science, Finland, for computational resources." This statement is too general and does not provide specific hardware details such as GPU/CPU models or memory specifications.
Software Dependencies | No | We use the BFGS optimizer (Nocedal & Wright, 2006) as implemented in SciPy (Virtanen et al., 2020)... We use the Adam optimizer (Kingma & Ba, 2015)... While these software components are cited, explicit version numbers (e.g., for SciPy) are not provided, which reproducibility requires.
Experiment Setup | Yes | We set the step size to 1 for all algorithms (for an experiment on varying step size, see Appendix B.7) and fix the covariate coefficient c = 0.9 (for an experiment on the effect of c, see Appendix B.6)... Table 1, optimization details for EVI: Gaussian 10 (learning rate 0.01, 5,000 iterations); Gaussian 50 (learning rate 0.01, 5,000 iterations); Gaussian 200 (learning rate 0.001, 10,000 iterations); Student-t 200 (learning rate 0.001, 8,000 iterations); Logistic Regression 200 (learning rate 0.01, 3,000 iterations).
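The variance-reduction idea behind the quoted Algorithm 1 (SVRGVI) can be illustrated with a generic control-variate Monte Carlo gradient estimator. The sketch below is not the paper's actual estimator: the potential V, its derivatives, and all names and sample sizes are illustrative assumptions. It shows the general mechanism of subtracting a zero-mean surrogate term so the gradient estimate stays unbiased while its variance shrinks.

```python
import numpy as np

# Illustrative smooth potential V(x) = 0.25 * ||x||^4 (an assumed target,
# not one from the paper) together with its gradient and Hessian.
def grad_V(x):
    return np.dot(x, x) * x

def hess_V(x):
    return np.dot(x, x) * np.eye(len(x)) + 2.0 * np.outer(x, x)

def naive_grad_estimate(m, chol, z):
    """Plain Monte Carlo estimate of E[grad V(x)] for x ~ N(m, chol @ chol.T),
    using standard-normal draws z via the reparameterization x = m + chol z."""
    xs = m + z @ chol.T
    return np.mean([grad_V(x) for x in xs], axis=0)

def cv_grad_estimate(m, chol, z):
    """Control-variate estimate: subtract H(m)(x - m), whose expectation under
    N(m, Sigma) is exactly zero, so the estimator remains unbiased while the
    per-sample fluctuation of grad V near the mean is largely cancelled."""
    H = hess_V(m)
    xs = m + z @ chol.T
    return np.mean([grad_V(x) - H @ (x - m) for x in xs], axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, chol = np.array([1.0, 0.5]), 0.1 * np.eye(2)
    naive, cv = [], []
    for _ in range(200):
        z = rng.standard_normal((8, 2))  # shared draws for a fair comparison
        naive.append(naive_grad_estimate(m, chol, z))
        cv.append(cv_grad_estimate(m, chol, z))
    print("naive variance:", np.var(naive, axis=0).sum())
    print("control-variate variance:", np.var(cv, axis=0).sum())
```

With a small covariance, grad V is nearly linear around m, so the control variate removes most of the sampling noise; the same intuition underlies variance-reduced estimators for Bures-Wasserstein VI, where the cost of the correction term is amortized across iterations.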