Variational Inference for Uncertainty Quantification: an Analysis of Trade-offs

Authors: Charles C. Margossian, Loucas Pillaud-Vivien, Lawrence K. Saul

JMLR 2025

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, we empirically evaluate the validity of this ordering when the target distribution p is not Gaussian. [...] In section 6, we empirically investigate the validity of this ordering, on a variety of targets and data sets, when q is Gaussian but p is not." |
| Researcher Affiliation | Academia | Charles C. Margossian, Department of Statistics, University of British Columbia, Vancouver, BC, Canada; Loucas Pillaud-Vivien, CERMICS Laboratory, École des Ponts ParisTech, Champs-sur-Marne, France; Lawrence K. Saul, Center for Computational Mathematics, Flatiron Institute, New York, NY, USA |
| Pseudocode | No | The paper describes algorithms and derivations but does not present any structured pseudocode or algorithm blocks with explicit labels such as "Algorithm" or "Pseudocode". |
| Open Source Code | Yes | "Code to reproduce the experiments can be found at https://github.com/charlesm93/VI-ordering." |
| Open Datasets | Yes | "We next experiment with target distributions from the inference gym, a curated library of diverse models for the study of inference algorithms (Sountsov et al., 2020). [...] German Credit. (n=25) A logistic regression applied to a credit data set (Dua and Graff, 2017)" |
| Dataset Splits | No | The paper discusses various target distributions and models but does not specify any training, validation, or test splits for the datasets used. |
| Hardware Specification | No | "In addition, we parallelize many calculations on GPU, keeping run times short." This statement is too general and does not name specific hardware models (e.g., NVIDIA A100, RTX 3090). |
| Software Dependencies | No | "Each method is implemented in Python and uses the JAX library to calculate derivatives (Bradbury et al., 2018)." Specific version numbers for Python or JAX are not provided. |
| Experiment Setup | Yes | "Optimization is performed with the Adam optimizer (Kingma and Ba, 2015). As a baseline, we run the optimizer for 500 iterations, and at each iteration, we use B = 10,000 draws from q (or p when using oracle samples) to estimate the relevant objective function and its gradient. This large number of draws is overkill for many problems, but it helps stabilize the optimization of α-divergences. In addition, we parallelize many calculations on GPU, keeping run times short. For certain targets, we find it necessary to fine-tune the optimizer in order to avoid numerical instability. We do so by adjusting the learning rate of Adam, the number B of Monte Carlo draws, and the number of iterations. [...] We find a constant learning rate λt = 1 to work well for the experiments in Section 6." |
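The baseline setup described above (Adam for 500 iterations, with B = 10,000 Monte Carlo draws per iteration to estimate the objective and its gradient) can be sketched in plain NumPy. This is a minimal illustration, not the authors' implementation (which uses JAX on GPU): the Gaussian target, its mean, the dimensionality, and the learning rate of 0.02 are all illustrative assumptions, and the objective here is the standard ELBO with reparameterized gradients rather than the paper's full family of divergences.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative target p(z) = N(target_mu, I); a well-fitted factorized
# Gaussian q should recover its mean and unit variances.
dim = 3
target_mu = np.array([1.0, -2.0, 0.5])

def neg_elbo_grads(mu, rho, B):
    """Monte Carlo gradients of the negative ELBO for q = N(mu, diag(exp(rho)^2)),
    via the reparameterization z = mu + exp(rho) * eps with eps ~ N(0, I)."""
    sigma = np.exp(rho)
    eps = rng.standard_normal((B, dim))
    z = mu + sigma * eps
    resid = z - target_mu                 # -grad_z log p(z) for a unit-variance Gaussian
    g_mu = resid.mean(axis=0)
    g_rho = (resid * eps).mean(axis=0) * sigma - 1.0  # the -1 comes from q's entropy term
    return g_mu, g_rho

def adam_step(param, grad, state, lr, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam update with bias-corrected moment estimates."""
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    mhat, vhat = m / (1 - b1**t), v / (1 - b2**t)
    return param - lr * mhat / (np.sqrt(vhat) + eps), (m, v, t)

mu, rho = np.zeros(dim), np.zeros(dim)
state_mu = state_rho = (np.zeros(dim), np.zeros(dim), 0)
B, lr = 10_000, 0.02   # B matches the paper's baseline; lr = 0.02 is an assumed value
for _ in range(500):    # 500 iterations, as in the paper's baseline
    g_mu, g_rho = neg_elbo_grads(mu, rho, B)
    mu, state_mu = adam_step(mu, g_mu, state_mu, lr)
    rho, state_rho = adam_step(rho, g_rho, state_rho, lr)

print(mu, np.exp(rho))  # fitted mean and scales, close to target_mu and ones
```

The large batch B plays the stabilizing role noted in the quote: the per-coordinate Monte Carlo noise in each gradient estimate scales as 1/√B, so with B = 10,000 the Adam iterates settle tightly around the optimum even with a constant learning rate.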