Denoising Diffusion Variational Inference: Diffusion Models as Expressive Variational Posteriors
Authors: Wasu Top Piriyakulkij, Yingheng Wang, Volodymyr Kuleshov
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate DDVI on synthetic benchmarks and on a real problem in biological data analysis: inferring human ancestry from genetic data. Our method outperforms strong baselines on the Thousand Genomes dataset (Siva 2008) and learns a low-dimensional latent space that preserves biologically meaningful structure (Haghverdi, Buettner, and Theis 2015). We compare DDVI with Auto-Encoding Variational Bayes (AEVB) (Kingma and Welling 2013), AEVB with inverse autoregressive flow posteriors (AEVB-IAF) (Kingma et al. 2016), Adversarial Auto-Encoding Bayes (AAEB) (Makhzani et al. 2015), and Path Integral Sampler (PIS) (Zhang and Chen 2021) on MNIST (Lecun et al. 1998) and CIFAR-10 (Krizhevsky and Hinton 2009) in unsupervised and semi-supervised learning settings, and also on the Thousand Genomes dataset (Siva 2008). From Table 1 and Table 7 in the Appendix, we see that our method DDVI achieves the best ELBO in all but one scenario, in which it still performs competitively. |
| Researcher Affiliation | Academia | Wasu Top Piriyakulkij*1, Yingheng Wang*1, Volodymyr Kuleshov1,2; 1Department of Computer Science, Cornell University; 2The Jacobs Technion-Cornell Institute, Cornell Tech; EMAIL |
| Pseudocode | No | The paper describes the optimization steps in Section 3.4 'Optimization: Extending Wake-Sleep' by listing steps 1 and 2, but these steps are presented as prose within the text rather than as a formally structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to a code repository. |
| Open Datasets | Yes | We compare DDVI with Auto-Encoding Variational Bayes (AEVB) (Kingma and Welling 2013), AEVB with inverse autoregressive flow posteriors (AEVB-IAF) (Kingma et al. 2016), Adversarial Auto-Encoding Bayes (AAEB) (Makhzani et al. 2015), and Path Integral Sampler (PIS) (Zhang and Chen 2021) on MNIST (Lecun et al. 1998) and CIFAR-10 (Krizhevsky and Hinton 2009) in unsupervised and semi-supervised learning settings, and also on the Thousand Genomes dataset (Siva 2008). |
| Dataset Splits | No | The paper mentions specific numbers of labels observed for semi-supervised learning (e.g., '1,000 for MNIST and 10,000 for CIFAR-10'), but it does not provide explicit train/test/validation split percentages or sample counts for the full datasets in the main text. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models used for running experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The main text does not specify the experiment setup, stating only that 'The priors, model architecture, and training details can also be found in Appendix H.' |