Revisiting Topic-Guided Language Models
Authors: Carolina Zheng, Keyon Vafa, David Blei
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we detail the reproducibility study and results. We also investigate the quality of learned topics and probe the LSTM-LM's hidden representations to find the amount of retained topic information. |
| Researcher Affiliation | Academia | Carolina Zheng EMAIL Department of Computer Science Columbia University Keyon Vafa EMAIL Department of Computer Science Columbia University David M. Blei EMAIL Department of Statistics Department of Computer Science Columbia University |
| Pseudocode | No | The paper describes models and their components using mathematical equations and textual descriptions, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We make public all code used for this study.1 1https://github.com/carolinazheng/revisiting-tglms |
| Open Datasets | Yes | We use four publicly available natural language datasets: APNEWS,2 IMDB (Maas et al., 2011), BNC (Consortium, 2007), and Wiki Text-2 (Merity et al., 2017). We follow the training, validation, and test splits from Lau et al. (2017) and Merity et al. (2017). |
| Dataset Splits | Yes | We follow the training, validation, and test splits from Lau et al. (2017) and Merity et al. (2017). Table 4 shows the dataset statistics. The data is preprocessed as follows. For Wiki Text-2, we use the standard vocabulary, tokenization, and splits from Merity et al. (2017). |
| Hardware Specification | Yes | The models in our codebase train to convergence within three days on a single Tesla V100 GPU. rGBN-RNN, trained using its public codebase, trains to convergence within one week on the same GPU. The experiments can be replicated on an AWS Tesla V100 GPU with 16GB GPU memory. |
| Software Dependencies | Yes | LSTM-LM, Topic RNN, VRTM, and TDLM are implemented in our codebase in PyTorch 1.12. We use the original implementation of rGBN-RNN, which uses TensorFlow 1.9. |
| Experiment Setup | Yes | For all LSTM-LM baselines, we use a hidden size of 600, word embeddings of size 300 initialized with Google News word2vec embeddings (Mikolov et al., 2013), and dropout of 0.4 between the LSTM input and output layers (and between the hidden layers for the 3-layer models). We train the RNN components using truncated backpropagation through time with a sequence length of 30. Following Lau et al. (2017), Rezaee & Ferraro (2020), and Guo et al. (2020), we use the Adam optimizer with a learning rate of 0.001 on APNEWS, IMDB, and BNC. For Wiki Text-2, we follow Merity et al. (2017) and use stochastic gradient descent; the initial learning rate is 20 and is divided by 4 when validation perplexity is worse than the previous iteration. The models are trained until validation perplexity does not improve for 5 epochs and we use the best validation checkpoint. We train all models on single GPUs with a language model batch size of 64. We train LDA via Gibbs sampling using Mallet (McCallum, 2002). The hyperparameters are: α (topic density) = 50, β (word density) = 0.01, number of iterations = 1000. |
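The experiment-setup row can be made concrete with a minimal PyTorch sketch of the LSTM-LM baseline and one truncated-BPTT step. The hyperparameters (hidden size 600, embedding size 300, dropout 0.4, sequence length 30, batch size 64, Adam at lr 0.001) come from the paper; the class name `LSTMLM`, the random embedding initialization (the paper initializes from Google News word2vec), and the dummy batch are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class LSTMLM(nn.Module):
    """Hypothetical sketch of the LSTM-LM baseline described above."""

    def __init__(self, vocab_size, emb_size=300, hidden_size=600,
                 num_layers=1, dropout=0.4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_size)
        # Dropout of 0.4 between the LSTM input and output layers.
        self.drop = nn.Dropout(dropout)
        # For the 3-layer models the paper also applies dropout between
        # hidden layers, which nn.LSTM's `dropout` argument provides.
        self.lstm = nn.LSTM(emb_size, hidden_size, num_layers,
                            dropout=dropout if num_layers > 1 else 0.0,
                            batch_first=True)
        self.decoder = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens, hidden=None):
        emb = self.drop(self.embedding(tokens))
        out, hidden = self.lstm(emb, hidden)
        logits = self.decoder(self.drop(out))
        return logits, hidden

# One truncated-BPTT step over a length-30 sequence with batch size 64,
# using Adam at lr 0.001 (the APNEWS/IMDB/BNC setting).
vocab_size, bptt, batch_size = 1000, 30, 64  # toy vocab for illustration
model = LSTMLM(vocab_size)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (batch_size, bptt + 1))  # dummy batch
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict next token
logits, _ = model(inputs)
loss = criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

For Wiki Text-2 the paper instead uses SGD with an initial learning rate of 20, divided by 4 whenever validation perplexity fails to improve, which in PyTorch corresponds to swapping the optimizer for `torch.optim.SGD` with a manual learning-rate cut.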