Variational Inference for Uncertainty Quantification: an Analysis of Trade-offs

Authors: Charles C. Margossian, Loucas Pillaud-Vivien, Lawrence K. Saul

JMLR 2025

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, we empirically evaluate the validity of this ordering when the target distribution p is not Gaussian. [...] In section 6, we empirically investigate the validity of this ordering, on a variety of targets and data sets, when q is Gaussian but p is not." |
| Researcher Affiliation | Academia | Charles C. Margossian, Department of Statistics, University of British Columbia, Vancouver, BC, Canada; Loucas Pillaud-Vivien, CERMICS Laboratory, École des Ponts ParisTech, Champs-sur-Marne, France; Lawrence K. Saul, Center for Computational Mathematics, Flatiron Institute, New York, NY, USA |
| Pseudocode | No | The paper describes algorithms and derivations but does not present any structured pseudocode or algorithm blocks with explicit labels such as "Algorithm" or "Pseudocode". |
| Open Source Code | Yes | "Code to reproduce the experiments can be found at https://github.com/charlesm93/VI-ordering." |
| Open Datasets | Yes | "We next experiment with target distributions from the inference gym, a curated library of diverse models for the study of inference algorithms (Sountsov et al., 2020). [...] German Credit. (n=25) A logistic regression applied to a credit data set (Dua and Graff, 2017)" |
| Dataset Splits | No | The paper discusses various target distributions and models but does not specify any training, validation, or test splits for the datasets used. |
| Hardware Specification | No | "In addition, we parallelize many calculations on GPU, keeping run times short." This statement is too general and does not name specific hardware models (e.g., NVIDIA A100, RTX 3090). |
| Software Dependencies | No | "Each method is implemented in Python and uses the JAX library to calculate derivatives (Bradbury et al., 2018)." Specific version numbers for Python or JAX are not provided. |
| Experiment Setup | Yes | "Optimization is performed with the Adam optimizer (Kingma and Ba, 2015). As a baseline, we run the optimizer for 500 iterations, and at each iteration, we use B = 10,000 draws from q (or p when using oracle samples) to estimate the relevant objective function and its gradient. This large number of draws is overkill for many problems, but it helps stabilize the optimization of α-divergences. In addition, we parallelize many calculations on GPU, keeping run times short. For certain targets, we find it necessary to fine-tune the optimizer in order to avoid numerical instability. We do so by adjusting the learning rate of Adam, the number B of Monte Carlo draws, and the number of iterations. [...] We find a constant learning rate λt = 1 to work well for the experiments in Section 6." |
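The baseline setup described above (Adam for 500 iterations, with B = 10,000 Monte Carlo draws per iteration to estimate the objective and its gradient) can be sketched in plain NumPy. This is a minimal illustration, not the authors' implementation (which uses JAX on GPU): the Gaussian target, its mean, the dimensionality, and the learning rate of 0.02 are all illustrative assumptions, and the objective here is the standard ELBO with reparameterized gradients rather than the paper's full family of divergences.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative target p(z) = N(target_mu, I); a well-fitted factorized
# Gaussian q should recover its mean and unit variances.
dim = 3
target_mu = np.array([1.0, -2.0, 0.5])

def neg_elbo_grads(mu, rho, B):
    """Monte Carlo gradients of the negative ELBO for q = N(mu, diag(exp(rho)^2)),
    via the reparameterization z = mu + exp(rho) * eps with eps ~ N(0, I)."""
    sigma = np.exp(rho)
    eps = rng.standard_normal((B, dim))
    z = mu + sigma * eps
    resid = z - target_mu                 # -grad_z log p(z) for a unit-variance Gaussian
    g_mu = resid.mean(axis=0)
    g_rho = (resid * eps).mean(axis=0) * sigma - 1.0  # the -1 comes from q's entropy term
    return g_mu, g_rho

def adam_step(param, grad, state, lr, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam update with bias-corrected moment estimates."""
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    mhat, vhat = m / (1 - b1**t), v / (1 - b2**t)
    return param - lr * mhat / (np.sqrt(vhat) + eps), (m, v, t)

mu, rho = np.zeros(dim), np.zeros(dim)
state_mu = state_rho = (np.zeros(dim), np.zeros(dim), 0)
B, lr = 10_000, 0.02   # B matches the paper's baseline; lr = 0.02 is an assumed value
for _ in range(500):    # 500 iterations, as in the paper's baseline
    g_mu, g_rho = neg_elbo_grads(mu, rho, B)
    mu, state_mu = adam_step(mu, g_mu, state_mu, lr)
    rho, state_rho = adam_step(rho, g_rho, state_rho, lr)

print(mu, np.exp(rho))  # fitted mean and scales, close to target_mu and ones
```

The large batch B plays the stabilizing role noted in the quote: the per-coordinate Monte Carlo noise in each gradient estimate scales as 1/√B, so with B = 10,000 the Adam iterates settle tightly around the optimum even with a constant learning rate.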