Variational Learning of Fractional Posteriors

Authors: Kian Ming A. Chai, Edwin V. Bonilla

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate two cases where gradients can be obtained analytically and a simulation study on mixture models showing that our fractional posteriors can be used to achieve better calibration compared to posteriors from the conventional variational bound. When applied to variational autoencoders (VAEs), our approach attains higher evidence bounds and enables learning of high-performing approximate Bayes posteriors jointly with fractional posteriors. We show that VAEs trained with fractional posteriors produce decoders that are better aligned for generation from the prior. (...) We perform a simulation study on mixture models, demonstrating that fractional posteriors achieve better-calibrated uncertainties compared to conventional VI. (...) Finally, we consider variational autoencoders (VAEs) (Kingma & Welling, 2014), showing that fractional posteriors not only give higher evidence bounds but also enhance generative performance by aligning decoders with the prior, a known issue in standard VAEs (Dai & Wipf, 2019).
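For context, the "fractional posterior" discussed throughout follows the standard tempered-likelihood construction; a sketch of the conventional definition (notation is ours and may differ from the paper's):

```latex
% Fractional (tempered) posterior: the likelihood is raised to a power
% gamma in (0, 1]; gamma = 1 recovers the ordinary Bayes posterior.
\pi_\gamma(\theta \mid \mathcal{D}) \;\propto\; p(\mathcal{D} \mid \theta)^{\gamma}\, p(\theta)
```

Variational learning then fits an approximation to this tempered target rather than to the exact posterior, which is what the calibration and VAE experiments quoted above compare against conventional VI.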
Researcher Affiliation Academia DSO National Laboratories, Singapore; CSIRO's Data61, Australia.
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks. It describes methodologies and derivations in narrative text and mathematical equations, but without formal algorithm listings.
Open Source Code Yes Our source codes are available at https://github.com/csiro-funml/Variational-learning-of-Fractional-Posteriors/.
Open Datasets Yes We follow the experimental setup and the neural network models of Ruthotto & Haber (2021) for the gray-scale MNIST dataset (Lecun et al., 1998). (...) We illustrate this with the Fashion-MNIST dataset (Xiao et al., 2017) (...) The MNIST and Fashion-MNIST data sets are available via Torchvision.
Dataset Splits Yes We follow the experimental setup and the neural network models of Ruthotto & Haber (2021) for the gray-scale MNIST dataset (Lecun et al., 1998). (...) We illustrate this with the Fashion-MNIST dataset (Xiao et al., 2017) (...) To quantify, we generate 10,000 images from each trained decoder and measure their Fréchet inception distances (FIDs, Heusel et al. 2017; Seitzer 2020) to the test set...
Hardware Specification Yes Our source codes are available at https://github.com/csiro-funml/Variational-learning-of-Fractional-Posteriors/. They are executable on a single NVIDIA T4 GPU, which is available for free (with limitations) on Google Colab, Kaggle, and Amazon SageMaker Studio Lab at the time of writing.
Software Dependencies No Other than the standard Python and PyTorch (https://pytorch.org/) packages, including Torchvision, we take reference from and make use of the following source code: VAE https://github.com/EmoryMLIP/DeepGenerativeModelingIntro; SIVI https://github.com/mingzhang-yin/SIVI; FID https://github.com/mseitzer/pytorch-fid.
Experiment Setup Yes For training the explicit posteriors with L_γ, we use 100 samples per datum. For the semi-implicit posteriors with L^h_γ, we draw 10 samples from the implicit distribution per datum, then 10 latent variables from the explicit distribution per implicit sample. For L^bh_γ, involving semi-implicit fractional and Bayes posteriors, we additionally use 5 of the 10 implicit samples to estimate the cross-entropy term. A batch size of 64 is used for training with the Adam optimizer, where the learning rate and weight decay are set to 10⁻³ and 10⁻⁵. We train for 500 epochs instead of the 50 epochs that they used, so that we obtain results closer to convergence and reduce doubts about the comparisons.
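The reported hyperparameters (Adam, learning rate 10⁻³, weight decay 10⁻⁵, Monte Carlo sampling of the bound) can be sketched as follows. The toy Gaussian model, the choice of gamma, and the bound estimator are illustrative assumptions, not the paper's actual code:

```python
# Minimal sketch of the quoted training settings, assuming a toy diagonal
# Gaussian q(z) and stand-in model terms; not the authors' implementation.
import torch

torch.manual_seed(0)

# Variational parameters of q(z) = N(mu, exp(log_std)^2).
mu = torch.zeros(2, requires_grad=True)
log_std = torch.zeros(2, requires_grad=True)

# Settings quoted in the report: Adam with lr = 1e-3, weight decay = 1e-5.
opt = torch.optim.Adam([mu, log_std], lr=1e-3, weight_decay=1e-5)

def fractional_bound(x, gamma=0.5, n_samples=100):
    """Monte Carlo estimate of a gamma-tempered evidence bound (illustrative):
    E_q[gamma * log p(x|z) + log p(z) - log q(z)], with p(x|z) = N(x; z, I)
    and p(z) = N(0, I) as stand-in model components, 100 samples per datum.
    """
    std = log_std.exp()
    eps = torch.randn(n_samples, 2)
    z = mu + std * eps                        # reparameterised samples
    log_lik = -0.5 * ((x - z) ** 2).sum(-1)   # log N(x; z, I), up to constants
    log_prior = -0.5 * (z ** 2).sum(-1)       # log N(z; 0, I), up to constants
    log_q = (-0.5 * eps ** 2 - log_std).sum(-1)
    return (gamma * log_lik + log_prior - log_q).mean()

x = torch.tensor([1.0, -1.0])
loss = -fractional_bound(x)  # maximise the bound = minimise its negative
opt.zero_grad()
loss.backward()
opt.step()
```

With gamma = 1 the estimator reduces to the usual single-level ELBO; the semi-implicit variants quoted above would replace the explicit q(z) with a two-level sampling scheme (10 implicit samples, 10 explicit samples each).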