reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Bayesian Nonparametric Comorbidity Analysis of Psychiatric Disorders

Authors: Francisco J. R. Ruiz, Isabel Valera, Carlos Blanco, Fernando Perez-Cruz

JMLR 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In Section 5, we validate our model on synthetic data and apply it over the real data extracted from the NESARC database. Finally, we use the model to analyze comorbidity among the psychiatric disorders diagnosed by experts from the NESARC database.
Researcher Affiliation	Academia	Francisco J. R. Ruiz EMAIL Isabel Valera EMAIL Department of Signal Processing and Communications University Carlos III in Madrid Avda. de la Universidad, 30 28911 Leganés (Madrid, Spain) Carlos Blanco EMAIL Department of Psychiatry, New York State Psychiatric Institute Columbia University 1051 Riverside Drive, Unit #69 New York, NY 10032 (United States of America) Fernando Perez-Cruz EMAIL Department of Signal Processing and Communications University Carlos III in Madrid Avda. de la Universidad, 30 28911 Leganés (Madrid, Spain)
Pseudocode	No	The paper describes the generative model and inference algorithms (Gibbs sampler and variational inference) in detail using mathematical notation and prose, but does not provide specific structured pseudocode or algorithm blocks with numbered steps.
Open Source Code	No	The paper describes the methods and algorithms used but does not include any explicit statement or link indicating that the source code for the methodology is openly available or will be released.
Open Datasets	Yes	We use the large amount of information collected in the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) database and propose to model these data using a nonparametric latent model based on the Indian Buffet Process (IBP). Public use data are currently available for this wave of data collection. [Footnote 2: See http://aspe.hhs.gov/hsp/06/catalog-ai-an-na/nesarc.htm]
Dataset Splits	Yes	We run the Gibbs sampler over 3,500 randomly chosen subjects out of the 43,093 participants in the survey, having initialized the sampler with an active feature, i.e., K+ = 1, having set z_nk = 1 randomly with probability 0.5, and fixing α = 1 and σ_B^2 = 1. After convergence, we run an additional Gibbs sampler with 10 iterations for each of the remaining subjects in the database, restricted to their latent features (that is, we fix the latent features learned for the 3,500 subjects to sample the feature vector of each subject).
Hardware Specification	No	The paper describes the computational complexity and performance of the algorithms but does not specify any particular hardware (e.g., GPU, CPU models, or cloud computing resources) used for running the experiments.
Software Dependencies	No	The text mentions "for avid Matlab users" in the context of notation simplification, suggesting Matlab might have been used, but no specific software or library versions are provided to ensure reproducibility.
Experiment Setup	Yes	The Gibbs sampler has been initialized with K+ = 2, setting each z_nk = 1 with probability 1/2, and setting the hyperparameters to α = 0.5 and σ_B^2 = 1. We run the Gibbs sampler over 3,500 randomly chosen subjects out of the 43,093 participants in the survey, having initialized the sampler with an active feature, i.e., K+ = 1, having set z_nk = 1 randomly with probability 0.5, and fixing α = 1 and σ_B^2 = 1. We apply the variational inference algorithm truncated to K = 25 features, as detailed in Section 4, to find the latent structure of the data.
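
The subject split and sampler initialization quoted in the Dataset Splits and Experiment Setup rows can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `make_split_and_init`, the `seed` argument, and the use of integer row indices as subject identifiers are all hypothetical, and the paper excerpt does not report a random seed. Only the constants (43,093 participants, 3,500 sampled subjects, K+ = 1, probability 0.5, α = 1, σ_B^2 = 1) come from the quoted text.

```python
import random

# Values taken from the quoted experiment setup (real-data run).
N_TOTAL = 43_093   # participants in the survey
N_TRAIN = 3_500    # subjects the Gibbs sampler is run over
K_INIT = 1         # one active feature at initialization (K+ = 1)
P_ACTIVE = 0.5     # probability of setting z_nk = 1 at initialization
ALPHA = 1.0        # fixed hyperparameter alpha
SIGMA2_B = 1.0     # fixed hyperparameter sigma_B^2


def make_split_and_init(seed=0):
    """Draw the random subject subset and an initial binary feature matrix.

    `seed` is an assumption added for repeatability; the source does not
    state one. Subjects are represented by hypothetical indices 0..N_TOTAL-1.
    """
    rng = random.Random(seed)

    # Randomly choose the subjects used to learn the latent features.
    train_ids = sorted(rng.sample(range(N_TOTAL), N_TRAIN))
    train_set = set(train_ids)

    # Remaining subjects: per the quoted text, each later gets 10 Gibbs
    # iterations with the learned features held fixed (not implemented here).
    heldout_ids = [n for n in range(N_TOTAL) if n not in train_set]

    # Initial feature matrix Z: N_TRAIN rows, K_INIT columns, entries in {0, 1}.
    Z = [[1 if rng.random() < P_ACTIVE else 0 for _ in range(K_INIT)]
         for _ in train_ids]
    return train_ids, heldout_ids, Z


train_ids, heldout_ids, Z = make_split_and_init(seed=0)
```

Recording the seed alongside the chosen subject indices is what would make such a split repeatable; without it, the 3,500-subject subset described in the paper cannot be recovered exactly.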