Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity
Authors: Gonçalo Correia, Vlad Niculae, Wilker Aziz, André Martins
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We report successful results in three tasks covering a range of latent variable modeling applications: a semisupervised deep generative model, a latent communication game, and a generative model with a bit-vector latent representation." Also, from Section 5 (Experimental Analysis): "We next demonstrate the applicability of our proposed strategies by tackling three tasks: a deep generative model with semisupervision (§5.1), an emergent communication two-player game over a discrete channel (§5.2), and a variational autoencoder with latent binary factors (§5.3)." |
| Researcher Affiliation | Collaboration | Gonçalo M. Correia (Instituto de Telecomunicações, Lisbon, Portugal); Vlad Niculae (IvI, University of Amsterdam, The Netherlands); Wilker Aziz (ILLC, University of Amsterdam, The Netherlands); André F. T. Martins (Instituto de Telecomunicações; LUMLIS, the Lisbon ELLIS Unit, Instituto Superior Técnico; Unbabel, Lisbon, Portugal) |
| Pseudocode | No | The paper describes algorithms and procedures in prose (e.g., "The active set algorithm for SparseMAP"), and Appendix B details "The Active Set Algorithm for SparseMAP", but it does not present structured pseudocode or a formally labeled algorithm block. |
| Open Source Code | Yes | Code is publicly available at https://github.com/deep-spin/sparse-marginalization-lvm |
| Open Datasets | Yes | "Data and architecture. We evaluate this model on the MNIST dataset [31], using 10% of labeled data, treating the remaining data as unlabeled." Also: "We use Fashion-MNIST [42], consisting of 256-level grayscale images x ∈ {0, 1, …, 255}^(28×28)." |
| Dataset Splits | No | The paper mentions using 10% labeled data for the semisupervised VAE task and discusses training epochs, but it does not provide the specific train/validation/test splits (percentages or counts) needed for reproducibility. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper mentions using PyTorch [62] for implementation, but it does not specify version numbers for PyTorch or any other software dependencies needed to reproduce the experiments. |
| Experiment Setup | Yes | "We describe any further architecture and hyperparameter details in App. E." The experimental sections give concrete settings, e.g., "For top-k sparsemax, we choose k = 10" and "we used b = D/2". Appendix E adds: "Each model was trained for 200 epochs." and "All methods are trained for 500 epochs." |
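For context on the technique the table refers to: the paper's efficient marginalization relies on sparse distributions over the latent variable, so that the expectation only needs to be computed over the (small) support. Below is a minimal NumPy sketch of sparsemax (Martins & Astudillo, 2016), the simplex projection underlying this sparsity; the function name, example scores, and the toy payoff function are illustrative, not the authors' implementation.

```python
import numpy as np

def sparsemax(z):
    """Euclidean projection of scores z onto the probability simplex.

    Unlike softmax, sparsemax can assign exactly zero probability to
    low-scoring entries, so a downstream expectation only needs to sum
    over the nonzero support.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]            # descending
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    # support size: largest k with 1 + k * z_sorted[k-1] > cumsum[k-1]
    k_max = k[1 + k * z_sorted > cumsum][-1]
    tau = (cumsum[k_max - 1] - 1.0) / k_max  # threshold
    return np.maximum(z - tau, 0.0)

# Toy usage: marginalize a function f over the latent variable by
# summing only where the probability is nonzero.
scores = np.array([1.0, 0.8, 0.1, -1.0])
p = sparsemax(scores)                      # e.g. [0.6, 0.4, 0.0, 0.0]
f = np.array([10.0, 20.0, 30.0, 40.0])    # hypothetical per-latent payoffs
expectation = np.sum(p[p > 0] * f[p > 0])  # exact, touches only 2 entries
```

The zero entries are exact (not merely small), which is what makes the marginalization in the paper tractable: gradients and expectations never touch latent assignments outside the support.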