Principled Selection of Hyperparameters in the Latent Dirichlet Allocation Model
Authors: Clint P. George, Hani Doss
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on both synthetic and real data show good performance of our methodology. |
| Researcher Affiliation | Academia | Clint P. George, Informatics Institute, University of Florida, Gainesville, FL 32611, USA; Hani Doss, Department of Statistics, University of Florida, Gainesville, FL 32611, USA |
| Pseudocode | Yes | Algorithm 1: Serial tempering. See the discussion in Sec. 2.3 and Appendices A.1 and 4. |
| Open Source Code | Yes | Software for implementation of our algorithms as well as datasets we use are available as an R package at https://github.com/clintpgeorge/ldamcmc |
| Open Datasets | Yes | The 20Newsgroups dataset (http://qwone.com/~jason/20Newsgroups) is commonly used in the machine learning literature for experiments on applications of text classification and clustering algorithms. It contains approximately 20,000 articles that are partitioned relatively evenly across 20 different newsgroups or categories. We created the second set of corpora from web articles downloaded from the English Wikipedia, with the help of the MediaWiki API (http://www.mediawiki.org/wiki/API:Query). |
| Dataset Splits | No | The paper describes methods for evaluating model fit using held-out documents (implicitly, leave-one-out cross-validation), such as "For d = 1, . . . , D, let w^(−d) denote the corpus consisting of all the documents except for document d. To evaluate a given model (in our case the LDA model indexed by a given h), in essence we see how well the model based on w^(−d) predicts document d, the held-out document." However, it does not explicitly provide details about training/test/validation dataset splits for general model training beyond this evaluation strategy. |
| Hardware Specification | Yes | Our experiments were conducted through the R programming language, using Rcpp, on a 3.70GHz quad core Intel Xeon Processor E5-1630V3. |
| Software Dependencies | No | The paper mentions "R programming language, using Rcpp" and the "R package mcmcse (Flegal et al., 2016)" but does not provide specific version numbers for R, Rcpp, or the mcmcse package. |
| Experiment Setup | Yes | For each hyperparameter value h_j (j = 1, . . . , 220), we took Φ_j to be the Markov transition function of the Augmented Collapsed Gibbs Sampler alluded to earlier and described in detail in Section 4... We took the sequence h_1, . . . , h_J to consist of an 11 × 20 grid of 220 evenly-spaced values over the region (η, α) ∈ [.6, 1.1] × [.15, 1.1]... We took the Markov transition function K(j, ·) on L = {1, . . . , 220} to be the uniform distribution on N_j, where N_j is the subset of L consisting of the indices of the h_l's that are neighbors of the point h_j... Each plot was constructed from a serial tempering chain, using the methodology described in Section 2.3. We took the number of cycles of the Gibbs sampler to be 10,000... For the serial tempering estimate, each run consisted of 2 tuning iterations with a chain length of 51,000 cycles, and a final iteration with a chain length of 101,000 cycles; and in each case, the first 1,000 cycles were deleted as burn-in. |
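The experiment-setup row above describes the serial tempering grid: 220 hyperparameter values laid out as an 11 × 20 grid over (η, α) ∈ [.6, 1.1] × [.15, 1.1], with a transition kernel K(j, ·) that is uniform over the neighbors N_j of grid point h_j. The sketch below (Python, not the authors' R package `ldamcmc`) illustrates that grid construction and neighbor kernel under one possible reading; the paper's exact neighbor set N_j may differ (e.g., it may include diagonal neighbors), so the four-point adjacency here is an assumption for illustration.

```python
import numpy as np

# Illustrative sketch (not the authors' R implementation): the 11 x 20 grid
# of 220 evenly-spaced (eta, alpha) values over [0.6, 1.1] x [0.15, 1.1],
# and a uniform-over-neighbors proposal kernel K(j, .) as used in the
# serial tempering chain.

N_ETA, N_ALPHA = 11, 20
etas = np.linspace(0.6, 1.1, N_ETA)       # 11 evenly-spaced eta values
alphas = np.linspace(0.15, 1.1, N_ALPHA)  # 20 evenly-spaced alpha values
grid = [(e, a) for e in etas for a in alphas]  # 220 (eta, alpha) pairs

def neighbors(j):
    """Indices of grid points axis-adjacent to point j (an assumed
    neighbor structure; the paper's N_j may be defined differently)."""
    r, c = divmod(j, N_ALPHA)
    nbrs = []
    for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        rr, cc = r + dr, c + dc
        if 0 <= rr < N_ETA and 0 <= cc < N_ALPHA:
            nbrs.append(rr * N_ALPHA + cc)
    return nbrs

def propose(j, rng):
    """Draw the next grid index uniformly from N_j, i.e. sample K(j, .)."""
    nbrs = neighbors(j)
    return nbrs[rng.integers(len(nbrs))]
```

In a full serial tempering run, each cycle would alternate a Gibbs-sampler update under the current h_j with a Metropolis-type move to an index drawn from `propose`, accepted according to the tuned pseudo-prior weights; that machinery is omitted here.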