Document Neural Autoregressive Distribution Estimation

Authors: Stanislas Lauly, Yin Zheng, Alexandre Allauzen, Hugo Larochelle

JMLR 2017

Reproducibility Variable | Result | LLM Response
Research Type: Experimental
To compare the topic models, two quantitative measures are used. The first evaluates the generative ability of the different models by computing the perplexity on held-out texts. The second compares the quality of the document representations for information retrieval. Two datasets are used for the experiments in this section: a small one, 20 Newsgroups, and a relatively big one, RCV1-V2 (Reuters Corpus Volume I).
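The second measure, retrieval quality, can be sketched as follows. This is a minimal illustration, assuming (as is common for this benchmark) that each held-out document's hidden representation queries the training set by cosine similarity, and precision is the fraction of the top-k retrieved documents that share the query's class label; all function and variable names here are illustrative, not the authors' code.

```python
import numpy as np

def retrieval_precision(query_vecs, query_labels, db_vecs, db_labels, k):
    """Mean fraction of the k most similar database documents
    whose class label matches the query's label."""
    # L2-normalize so the dot product equals cosine similarity
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = db_vecs / np.linalg.norm(db_vecs, axis=1, keepdims=True)
    sims = q @ d.T                           # (n_queries, n_db)
    topk = np.argsort(-sims, axis=1)[:, :k]  # indices of k nearest documents
    hits = db_labels[topk] == query_labels[:, None]
    return hits.mean()
```

In the paper's setup the vectors would be the models' 50-dimensional hidden representations; here any real-valued document vectors work.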
Researcher Affiliation: Collaboration
Stanislas Lauly (EMAIL), Département d'informatique, Université de Sherbrooke, Sherbrooke, Québec, Canada; Yin Zheng (EMAIL), Tencent AI Lab, Shenzhen, Guangdong, China; Alexandre Allauzen (EMAIL), LIMSI-CNRS, Université Paris-Sud, Orsay, France; Hugo Larochelle (EMAIL), Département d'informatique, Université de Sherbrooke, Sherbrooke, Québec, Canada
Pseudocode: Yes
Algorithm 1: Computing the cost for training and the hidden layer representing an entire document (DocNADE model)
    Input: multinomial observation v
    a ← c
    NLL ← 0
    for i = 1 to D do
        h_i ← g(a)
        p(v_i) ← 1
        for m = 1 to |π(v_i)| do
            p(π(v_i)_m = 1 | v_<i) ← sigm(b_{l(v_i)_m} + V_{l(v_i)_m,:} h_i(v_<i))
            p(v_i) ← p(v_i) · p(π(v_i)_m = 1 | v_<i)^{π(v_i)_m} · (1 − p(π(v_i)_m = 1 | v_<i))^{1 − π(v_i)_m}
        end for
        NLL ← NLL − log p(v_i)
        a ← a + W_{:,v_i}
    end for
    h_final ← g(a)
    return NLL, h_final
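As a reading aid, Algorithm 1 can be sketched in NumPy. This is a toy implementation, assuming a complete binary tree over the vocabulary with heap-style node indexing and a sigmoid hidden activation g; the parameter names and sizes (W, V_tree, b_tree, c, DEPTH) are illustrative, not taken from the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

V_SIZE, H = 8, 5                      # toy vocabulary size and hidden dimension
DEPTH = 3                             # tree depth: 2**3 = 8 leaf words
W = rng.normal(0, 0.1, (H, V_SIZE))   # word embeddings, one column per word
c = np.zeros(H)                       # hidden bias
V_tree = rng.normal(0, 0.1, (2**DEPTH - 1, H))  # one weight row per internal node
b_tree = np.zeros(2**DEPTH - 1)                 # one bias per internal node

def g(a):
    """Hidden activation (sigmoid in this sketch)."""
    return 1.0 / (1.0 + np.exp(-a))

def path_and_decisions(word):
    """Internal-node indices l(v) and left/right bits pi(v) for a word's leaf."""
    nodes, bits, node = [], [], 0
    for d in reversed(range(DEPTH)):
        bit = (word >> d) & 1         # 1 means "go right" at this depth
        nodes.append(node)
        bits.append(bit)
        node = 2 * node + 1 + bit     # heap-style child index
    return nodes, bits

def docnade_nll(doc):
    """NLL of a document (list of word ids) and its final hidden vector."""
    a = c.copy()
    nll = 0.0
    for v_i in doc:
        h = g(a)                      # hidden layer for the prefix v_<i
        log_p = 0.0
        for node, bit in zip(*path_and_decisions(v_i)):
            q = 1.0 / (1.0 + np.exp(-(b_tree[node] + V_tree[node] @ h)))
            log_p += np.log(q) if bit else np.log1p(-q)
        nll -= log_p                  # accumulate -log p(v_i | v_<i)
        a += W[:, v_i]                # cheap update: add the word's embedding
    return nll, g(a)
```

The tree factorization is what makes each word's probability cost O(log V) sigmoids instead of a full softmax over the vocabulary, and the running activation `a` lets every prefix hidden layer be computed in a single pass over the document.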
Open Source Code: No
The paper contains no explicit statement about the public availability of source code and provides no link to a code repository.
Open Datasets: Yes
Two datasets are used for the experiments in this section: a small one, 20 Newsgroups, and a relatively big one, RCV1-V2 (Reuters Corpus Volume I). The 20 Newsgroups corpus has 18,786 documents (postings) partitioned into 20 classes (newsgroups). RCV1-V2 is a much bigger dataset, composed of 804,414 documents (newswire stories) manually categorized into 103 classes (topics).
Dataset Splits: Yes
The setup consists of 11,284 and 402,207 training examples for 20 Newsgroups and RCV1-V2, respectively. We randomly extract 1,000 and 10,000 documents from the training sets of 20 Newsgroups and RCV1-V2, respectively, to build a validation set. The average perplexity per word is used for comparison; it is estimated on the first 50 documents of a separate test set.
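The average per-word perplexity can be sketched as follows, assuming the standard formula exp(-(1/T) Σ_t (1/|v_t|) log p(v_t)), where T is the number of held-out documents and |v_t| the length of document t; the function and argument names are illustrative.

```python
import math

def average_perplexity(log_probs, doc_lengths):
    """Average per-word perplexity over held-out documents.

    log_probs[t]   -- log p(v_t), the model's log-likelihood of document t
    doc_lengths[t] -- |v_t|, the number of words in document t
    """
    T = len(log_probs)
    per_word = sum(lp / n for lp, n in zip(log_probs, doc_lengths))
    return math.exp(-per_word / T)
```

For example, a single 4-word document in which the model assigns each word probability 1/10 yields a perplexity of exactly 10.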
Hardware Specification: No
The paper mentions an "efficient implementation on the GPU" but does not specify a GPU model or any other hardware used for the experiments.
Software Dependencies: No
The paper mentions using the "Adam optimizer (Kingma and Ba, 2014)" but does not list software dependencies with version numbers.
Experiment Setup: Yes
For these experiments, the hidden representations of all models are composed of 50-dimensional features. Early stopping on the validation set is used to avoid overfitting and for model selection. Our best Deep DocNADE models were trained with the Adam optimizer (Kingma and Ba, 2014) and the tanh activation function. The hyper-parameters of Adam were selected on the validation set.