Identifiable Deep Generative Models via Sparse Decoding

Authors: Gemma Elyse Moran, Dhanya Sridhar, Yixin Wang, David Blei

TMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically study the sparse VAE with both simulated and real data. We find that it recovers meaningful latent factors and has smaller heldout reconstruction error than related methods.
Researcher Affiliation | Academia | Gemma E. Moran (Columbia University); Dhanya Sridhar (Mila Quebec AI Institute and Université de Montréal); Yixin Wang (University of Michigan); David M. Blei (Columbia University)
Pseudocode | Yes | Algorithm 1: The sparse VAE
Open Source Code | Yes | The sparse VAE implementation may be found at https://github.com/gemoran/sparse-vae-code.
Open Datasets | Yes | PeerRead (Kang et al., 2018). Dataset of word counts for paper abstracts (N ≈ 10,000, G = 500). MovieLens (Harper and Konstan, 2015). Dataset of binary user-movie ratings (N = 100,000, G = 300). Zeisel (Zeisel et al., 2015). Dataset of RNA molecule counts in mouse cortex cells (N = 3005, G = 558).
Dataset Splits | Yes | All results are averaged over five splits of the data, with standard deviation in parentheses. We assess this question using the semi-synthetic PeerRead dataset, where the train and test data were generated by factors with different correlations.
Hardware Specification | Yes | GPU: NVIDIA TITAN Xp graphics card (24GB). CPU: Intel E4-2620 v4 processor (64GB).
Software Dependencies | No | For stochastic optimization, we use automatic differentiation in PyTorch, with optimization using Adam (Kingma and Ba, 2015) with default settings (beta1=0.9, beta2=0.999). For LDA, we used Python's sklearn package with default settings.
Experiment Setup | Yes | Table 6: Settings for each experiment. Synthetic data ... # hidden layers: 3; # layer dimension: 50; latent space dimension: 5; learning rate: 0.01; epochs: 200; batch size: 100; loss function: Gaussian; Sparse VAE: λ1 = 1, λ0 = 10; β-VAE: [2, 4, 6, 8, 16]; VSC: α = 0.01; OI-VAE: λ = 1, p = 5; runtime per split: CPU, 2 mins.
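
As a rough illustration of how the synthetic-data settings quoted from Table 6 might be collected in code, the following Python sketch groups them into a single configuration object. The class name `SyntheticExperimentConfig`, its field names, and the helpers are hypothetical and are not taken from the authors' repository. The use of the sample standard deviation in `format_mean_sd` is also an assumption, since the source only says that results are averaged over five splits with standard deviation in parentheses.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class SyntheticExperimentConfig:
    """Hypothetical container for the synthetic-data settings quoted from Table 6."""

    n_hidden_layers: int = 3
    layer_dim: int = 50
    latent_dim: int = 5
    learning_rate: float = 0.01
    epochs: int = 200
    batch_size: int = 100
    loss: str = "gaussian"
    # Adam defaults quoted in the Software Dependencies row.
    adam_betas: Tuple[float, float] = (0.9, 0.999)
    # Method-specific hyperparameters from the same table.
    sparse_vae_lambda1: float = 1.0
    sparse_vae_lambda0: float = 10.0
    beta_vae_grid: Tuple[int, ...] = (2, 4, 6, 8, 16)
    vsc_alpha: float = 0.01
    oi_vae_lambda: float = 1.0
    oi_vae_p: int = 5

    def hidden_widths(self) -> List[int]:
        """Width of each hidden layer, e.g. three layers of 50 units."""
        return [self.layer_dim] * self.n_hidden_layers

    def adam_kwargs(self) -> Dict[str, object]:
        """Optimizer settings as keyword arguments (learning rate and betas)."""
        return {"lr": self.learning_rate, "betas": self.adam_betas}


def format_mean_sd(values: Sequence[float], digits: int = 2) -> str:
    """Format per-split results as 'mean (sd)'.

    Assumption: sample standard deviation (n - 1 denominator); the source
    does not say which estimator was used.
    """
    return f"{mean(values):.{digits}f} ({stdev(values):.{digits}f})"


if __name__ == "__main__":
    cfg = SyntheticExperimentConfig()
    print(cfg.hidden_widths())
    print(cfg.adam_kwargs())
    # Made-up numbers standing in for a metric measured on five splits.
    print(format_mean_sd([1.0, 2.0, 3.0, 4.0, 5.0]))
```

The dictionary returned by `adam_kwargs()` uses the keyword names `lr` and `betas`, which `torch.optim.Adam` accepts, so it could be unpacked directly into that constructor. The values passed to `format_mean_sd` above are placeholders, not results from the paper.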