Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
The Population Posterior and Bayesian Modeling on Streams
Authors: James McInerney, Rajesh Ranganath, David Blei
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study our method with several large-scale data sets. With held-out likelihood as a measure of model fitness, we found our method to give better models of the data than approaches based on full Bayesian inference [14] or Bayesian updating [8]. |
| Researcher Affiliation | Academia | James McInerney Columbia University EMAIL Rajesh Ranganath Princeton University EMAIL David Blei Columbia University EMAIL |
| Pseudocode | Yes | Algorithm 1 Population Variational Bayes |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | With LDA we analyze three large-scale streamed corpora: 1.7M articles from the New York Times spanning 10 years, 130K Science articles written over 100 years, and 7.4M tweets collected from Twitter on Feb 2nd, 2014. The Ivory Coast location data contains 18M discrete cell tower locations for 500K users recorded over 6 months [6]. The Microsoft Geolife dataset contains 35K latitude-longitude GPS locations for 182 users over 5 years. |
| Dataset Splits | Yes | We measure model fitness by evaluating the average predictive log likelihood on held-out data. This involves splitting held-out observations (that were not involved in the posterior approximation of β) into two equal halves, inferring the local component distribution based on the first half, and testing with the second half [14, 26]. |
| Hardware Specification | No | The paper does not specify any particular hardware components such as GPU or CPU models, memory details, or cloud instance types used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software libraries, frameworks, or programming languages used in the experiments. |
| Experiment Setup | Yes | The other hyperparameters to our experiments are reported in Appendix C. |
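The Dataset Splits row describes a standard held-out predictive-likelihood protocol: each held-out document's tokens are divided into two equal halves, the first half is used to infer the document's local component distribution, and the second half is scored. A minimal sketch of that split step, assuming documents are represented as word-count vectors over a fixed vocabulary (the function name and representation are illustrative, not from the paper):

```python
import numpy as np

def split_heldout(doc_word_counts, rng):
    """Split each held-out document's tokens into two equal halves.

    The first half would be used to infer the document's local
    component distribution; the second half is held back for scoring
    the predictive log likelihood.
    """
    observe, score = [], []
    for counts in doc_word_counts:
        counts = np.asarray(counts)
        # Expand the count vector into individual word tokens.
        tokens = np.repeat(np.arange(len(counts)), counts)
        rng.shuffle(tokens)
        half = len(tokens) // 2
        # Re-bin each half into count vectors over the vocabulary.
        observe.append(np.bincount(tokens[:half], minlength=len(counts)))
        score.append(np.bincount(tokens[half:], minlength=len(counts)))
    return observe, score
```

The random shuffle before splitting avoids systematic bias from token ordering; both halves together always reconstitute the original document.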