Poisson Random Fields for Dynamic Feature Models

Authors: Valerio Perrone, Paul A. Jenkins, Dario Spanò, Yee Whye Teh

JMLR 2017

Reproducibility assessment — each variable, the assessed result, and the LLM's supporting evidence from the paper:
Research Type: Experimental. Evidence: "We apply our construction to develop a nonparametric focused topic model for collections of time-stamped text documents and test it on the full corpus of NIPS papers published from 1987 to 2015." "Section 6 combines the model with a linear-Gaussian likelihood and evaluates it on a range of synthetic data sets." "Finally, Section 7 illustrates the application of the WF-IBP to topic modeling and presents results obtained on both synthetic data and on the real-world data set consisting of the full text of papers from the NIPS conferences between the years 1987 and 2015."
Researcher Affiliation: Academia. Evidence: "Valerio Perrone* (EMAIL), Paul A. Jenkins* (EMAIL), Dario Spanò* (EMAIL); *Department of Statistics and Department of Computer Science, University of Warwick, Coventry, CV4 7AL, UK. Yee Whye Teh (EMAIL), Department of Statistics, University of Oxford, Oxford, OX1 3LB, UK."
Pseudocode: Yes. Evidence: Algorithm 1 (Particle Gibbs); Algorithm 2 (PG: features seen for the first time at time t1); Algorithm 3 (Thinning: unseen features alive at time t0); Algorithm 4 (Thinning: unseen features born between times t0 and t1).
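Algorithms 3 and 4 in the paper use thinning to simulate unseen features; those exact procedures are not reproduced in this report. As background only, the generic thinning construction for an inhomogeneous Poisson process on an interval can be sketched as follows (the function name, interface, and seeding are illustrative assumptions, not the paper's code):

```python
import random

def thinned_poisson_times(rate_bound, intensity, t0, t1, seed=0):
    """Simulate an inhomogeneous Poisson process on (t0, t1] by thinning.

    Candidates arrive as a homogeneous Poisson process with rate
    `rate_bound` (an upper bound on `intensity`); a candidate at time t
    is then kept with probability intensity(t) / rate_bound.
    """
    rng = random.Random(seed)
    times, t = [], t0
    while True:
        t += rng.expovariate(rate_bound)   # next candidate arrival time
        if t > t1:
            return times
        if rng.random() < intensity(t) / rate_bound:
            times.append(t)                # accept the candidate point
```

With a constant intensity equal to `rate_bound`, every candidate is accepted and the sketch reduces to an ordinary homogeneous Poisson process.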
Open Source Code: No. Evidence: the paper does not provide an explicit statement about the release of source code, nor a link to a code repository for the described methodology.
Open Datasets: Yes. Evidence: "We apply our construction to develop a nonparametric focused topic model for collections of time-stamped text documents and test it on the full corpus of NIPS papers published from 1987 to 2015. The data set is available at https://archive.ics.uci.edu/ml/datasets/NIPS+Conference+Papers+1987-2015."
Dataset Splits: Yes. Evidence: "The results in Figure 15 were obtained by holding out different percentages of words (50%, 60%, 70% and 80%) from all the papers published in 1999 and by training the model over the papers published in the time range 1987-1999. The held-out words were then used to compute the test-set perplexity after 5 repeated runs with random initializations."
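As background on the evaluation protocol quoted above, holding out a fraction of a document's words and scoring them by test-set perplexity can be sketched as below. The function names and the simple per-word log-likelihood interface are assumptions for illustration, not the paper's code:

```python
import math
import random

def split_document(words, heldout_frac, seed=0):
    """Randomly hold out a fraction of a document's words for evaluation."""
    rng = random.Random(seed)
    idx = list(range(len(words)))
    rng.shuffle(idx)
    cut = int(len(words) * heldout_frac)
    held = [words[i] for i in idx[:cut]]      # evaluated, never trained on
    train = [words[i] for i in idx[cut:]]     # used to fit the model
    return train, held

def heldout_perplexity(word_log_probs):
    """Test-set perplexity: exp of the negative mean per-word log-likelihood."""
    return math.exp(-sum(word_log_probs) / len(word_log_probs))
```

A sanity check on the formula: if the model assigns every held-out word probability 1/V, the perplexity is exactly V.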
Hardware Specification: No. Evidence: the paper does not report the hardware (e.g., CPU or GPU models) used to run the experiments.
Software Dependencies: No. Evidence: the paper does not list software dependencies or library versions used for the experiments.
Experiment Setup: Yes. Evidence: "1000 iterations of the overall algorithm were performed, choosing a burn-in period of 100 iterations and setting the time-units and drift parameters of the W-F diffusion equal to the true ones in the PG update." "We set the hyperparameters α and β equal to 1 and the time step to 0.12 diffusion time-units per year so as to reflect realistic evolutions of topic popularity." "The Markov chain was run for 2000 iterations with a burn-in period of 200 iterations, setting η = 0.001 and placing a Gamma(5,1) hyper-prior on γ."
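The quoted setup runs a Markov chain for a fixed number of iterations and discards an initial burn-in before collecting samples. A generic sketch of that bookkeeping follows; the `step` transition kernel is a placeholder, not the paper's particle Gibbs update, and the Gamma(5, 1) draw only illustrates initializing γ from its stated hyper-prior:

```python
import random

def run_chain(step, n_iter=2000, burn_in=200, seed=0):
    """Run an MCMC chain and keep only the post-burn-in samples."""
    rng = random.Random(seed)
    state = rng.gammavariate(5.0, 1.0)   # e.g. draw gamma from its Gamma(5, 1) hyper-prior
    samples = []
    for i in range(n_iter):
        state = step(state, rng)         # one full sweep of the sampler
        if i >= burn_in:                 # discard the first `burn_in` sweeps
            samples.append(state)
    return samples
```

With the paper's reported settings (2000 iterations, burn-in 200), this collects 1800 samples.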