Identifiable Deep Generative Models via Sparse Decoding
Authors: Gemma Elyse Moran, Dhanya Sridhar, Yixin Wang, David Blei
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically study the sparse VAE with both simulated and real data. We find that it recovers meaningful latent factors and has smaller heldout reconstruction error than related methods. |
| Researcher Affiliation | Academia | Gemma E. Moran (Columbia University), Dhanya Sridhar (Mila – Quebec AI Institute and Université de Montréal), Yixin Wang (University of Michigan), David M. Blei (Columbia University) |
| Pseudocode | Yes | Algorithm 1: The sparse VAE |
| Open Source Code | Yes | The sparse VAE implementation may be found at https://github.com/gemoran/sparse-vae-code. |
| Open Datasets | Yes | PeerRead (Kang et al., 2018): dataset of word counts for paper abstracts (N = 10,000, G = 500). MovieLens (Harper and Konstan, 2015): dataset of binary user-movie ratings (N = 100,000, G = 300). Zeisel (Zeisel et al., 2015): dataset of RNA molecule counts in mouse cortex cells (N = 3005, G = 558). |
| Dataset Splits | Yes | All results are averaged over five splits of the data, with standard deviation in parentheses. We assess this question using the semi-synthetic Peer Read dataset, where the train and test data were generated by factors with different correlations. |
| Hardware Specification | Yes | GPU: NVIDIA TITAN Xp graphics card (24GB). CPU: Intel E4-2620 v4 processor (64GB). |
| Software Dependencies | No | For stochastic optimization, we use automatic differentiation in PyTorch, with optimization using Adam (Kingma and Ba, 2015) with default settings (beta1=0.9, beta2=0.999). For LDA, we used Python's sklearn package with default settings. |
| Experiment Setup | Yes | Table 6: Settings for each experiment. Synthetic data ... # hidden layers: 3; layer dimension: 50; latent space dimension: 5; learning rate: 0.01; epochs: 200; batch size: 100; loss function: Gaussian; Sparse VAE: λ1 = 1, λ0 = 10; β-VAE: β ∈ [2, 4, 6, 8, 16]; VSC: α = 0.01; OI-VAE: λ = 1, p = 5; runtime per split: 2 min (CPU). |
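To make the quoted setup concrete, here is a minimal PyTorch sketch wiring together the hyperparameters reported above (3 hidden layers of width 50, latent dimension 5, batch size 100, Gaussian loss, Adam with lr = 0.01 and default betas). This is an illustrative standard-VAE skeleton, not the authors' implementation: the per-feature spike-and-slab masking that makes the sparse VAE sparse (the λ0 = 10, λ1 = 1 prior) is omitted for brevity, and all class and variable names here are assumptions; the real code is in the linked repository.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted from Table 6 of the paper.
LATENT_DIM = 5   # "Latent space dimension 5"
HIDDEN_DIM = 50  # "# layer dimension 50"
N_HIDDEN = 3     # "# hidden layers 3"

def mlp(in_dim, out_dim):
    """Stack of N_HIDDEN hidden layers, each of width HIDDEN_DIM."""
    layers, d = [], in_dim
    for _ in range(N_HIDDEN):
        layers += [nn.Linear(d, HIDDEN_DIM), nn.ReLU()]
        d = HIDDEN_DIM
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

class VAESketch(nn.Module):
    """Plain VAE skeleton (hypothetical name); the paper's sparse VAE
    additionally masks each feature's decoder with a per-feature
    factor-selection vector under a spike-and-slab prior."""
    def __init__(self, n_features):
        super().__init__()
        self.encoder = mlp(n_features, 2 * LATENT_DIM)  # mean and log-variance
        self.decoder = mlp(LATENT_DIM, n_features)

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.decoder(z), mu, logvar

model = VAESketch(n_features=20)  # feature count chosen for illustration
# "Adam ... with default settings (beta1=0.9, beta2=0.999)"; lr from Table 6.
opt = torch.optim.Adam(model.parameters(), lr=0.01, betas=(0.9, 0.999))

x = torch.randn(100, 20)  # one batch ("Batch size 100")
recon, mu, logvar = model(x)
# Gaussian loss function, as listed in Table 6, plus the standard KL term.
recon_loss = nn.functional.mse_loss(recon, x, reduction="sum")
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon_loss + kl
loss.backward()
opt.step()
```

In the full experiments this single gradient step would be repeated over 200 epochs per data split, with results averaged over the five splits described above.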