Doubly Robust Conditional VAE via Decoder Calibration: An Implicit KL Annealing Approach

Authors: Chuanhui Liu, Xiao Wang

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on synthetic and real-world datasets demonstrate the superior performance of our method across various conditional density estimation tasks, highlighting its significance for accurate and reliable probabilistic modeling. The implementation is publicly available at https://github.com/chuanhuiliu/calibrated_cvae.
Researcher Affiliation | Academia | Chuanhui Liu EMAIL Department of Statistics, Purdue University; Xiao Wang EMAIL Department of Statistics, Purdue University
Pseudocode | Yes | A general pseudo-code can be found in Algorithm 1.
Open Source Code | Yes | The implementation is publicly available at https://github.com/chuanhuiliu/calibrated_cvae.
Open Datasets | Yes | Experimental results on synthetic and real-world datasets demonstrate the superior performance of our method across various conditional density estimation tasks, highlighting its significance for accurate and reliable probabilistic modeling. The implementation is publicly available at https://github.com/chuanhuiliu/calibrated_cvae. ... we also validate our method on the MNIST (Deng, 2012) and CelebA (Liu et al., 2018) datasets for conditional image generation and reconstruction in Appendix I & J. ... Finally, we compare the performance of calibrated σ-CVAE with Bayesian conditional normalizing flows (Trippe & Turner, 2018) on 6 UCI datasets, as listed in Table 4.
Dataset Splits | Yes | The random train-test split is 75% to 25%.
Hardware Specification | Yes | We implemented the proposed method using PyTorch 1.8.2+cu111 with Python 3.7 on an Ubuntu internal cluster with multiple Nvidia GPUs including A10, A30, A100, A100-40GB, A100-80GB, and V100.
Software Dependencies | Yes | We implemented the proposed method using PyTorch 1.8.2+cu111 with Python 3.7 on an Ubuntu internal cluster with multiple Nvidia GPUs including A10, A30, A100, A100-40GB, A100-80GB, and V100.
Experiment Setup | Yes | In this section, we provide evidence and more practical insights into the calibration of σ-CVAE through extensive numerical experiments given a finite training sample. We use the Adam (Kingma, 2014) stochastic gradient descent algorithm for training neural networks. The general learning rate is 0.005, and the convergence threshold is 0.001 in the average loss change. ... In Section 5.1, we used fully connected 4-layer neural networks with a hyperbolic tangent activation function for the encoding and decoding networks. The latent dimension is set to 2, and the width of the hidden layers is [16, 8, 4, 2] and [2, 4, 16, 4], respectively. σ is initialized at 1. The batch size is equal to the sample size of the training data.
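The reported setup (random 75%/25% split, 4-layer tanh MLPs with widths [16, 8, 4, 2] and [2, 4, 16, 4], latent dimension 2, Adam with learning rate 0.005, σ initialized at 1, full-batch training) can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the authors' implementation: the data dimensions (3 covariates, 1 response), the random seed, the deterministic encoder output, and the exact placement of the listed widths are assumptions; the real σ-CVAE encoder would also emit a latent variance, and σ here is the decoder scale the paper calibrates.

```python
import torch
import torch.nn as nn

def mlp(sizes):
    """Fully connected layers with tanh activations between them."""
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:
            layers.append(nn.Tanh())
    return nn.Sequential(*layers)

torch.manual_seed(0)
X = torch.randn(100, 3)  # placeholder covariates; dimensions are illustrative

# Random 75%/25% train-test split, as reported.
perm = torch.randperm(X.shape[0])
n_train = int(0.75 * X.shape[0])
X_train, X_test = X[perm[:n_train]], X[perm[n_train:]]

LATENT_DIM = 2
# One plausible reading of the hidden widths: encoder ends at the latent
# dimension, decoder starts from it (variance head omitted for brevity).
encoder = mlp([3, 16, 8, 4, LATENT_DIM])
decoder = mlp([LATENT_DIM, 4, 16, 4, 1])

# sigma initialized at 1 (log sigma = 0); learning rate 0.005 as stated.
log_sigma = torch.zeros(1, requires_grad=True)
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()) + [log_sigma],
    lr=0.005,
)
# Full-batch training: the batch size equals the training sample size,
# so each optimizer step would use all of X_train at once.
```

Training would then iterate Adam steps on the full training batch until the average loss change falls below the stated 0.001 threshold.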