Bayesian Nonparametric Comorbidity Analysis of Psychiatric Disorders

Authors: Francisco J. R. Ruiz, Isabel Valera, Carlos Blanco, Fernando Perez-Cruz

JMLR 2014

Reproducibility Variable Result LLM Response
Research Type Experimental In Section 5, we validate our model on synthetic data and apply it over the real data extracted from the NESARC database. Finally, we use the model to analyze comorbidity among the psychiatric disorders diagnosed by experts from the NESARC database.
Researcher Affiliation Academia Francisco J. R. Ruiz (EMAIL) and Isabel Valera (EMAIL): Department of Signal Processing and Communications, University Carlos III in Madrid, Avda. de la Universidad, 30, 28911 Leganés (Madrid, Spain). Carlos Blanco (EMAIL): Department of Psychiatry, New York State Psychiatric Institute, Columbia University, 1051 Riverside Drive, Unit #69, New York, NY 10032 (United States of America). Fernando Perez-Cruz (EMAIL): Department of Signal Processing and Communications, University Carlos III in Madrid, Avda. de la Universidad, 30, 28911 Leganés (Madrid, Spain).
Pseudocode No The paper describes the generative model and inference algorithms (Gibbs sampler and variational inference) in detail using mathematical notation and prose, but does not provide specific structured pseudocode or algorithm blocks with numbered steps.
Open Source Code No The paper describes the methods and algorithms used but does not include any explicit statement or link indicating that the source code for the methodology is openly available or will be released.
Open Datasets Yes We use the large amount of information collected in the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) database and propose to model these data using a nonparametric latent model based on the Indian Buffet Process (IBP). Public use data are currently available for this wave of data collection (footnote 2: see http://aspe.hhs.gov/hsp/06/catalog-ai-an-na/nesarc.htm).
Dataset Splits Yes We run the Gibbs sampler over 3,500 randomly chosen subjects out of the 43,093 participants in the survey, having initialized the sampler with an active feature, i.e., K_+ = 1, having set z_nk = 1 randomly with probability 0.5, and fixing α = 1 and σ_B^2 = 1. After convergence, we run an additional Gibbs sampler with 10 iterations for each of the remaining subjects in the database, restricted to their latent features (that is, we fix the latent features learned for the 3,500 subjects to sample the feature vector of each subject).
Hardware Specification No The paper describes the computational complexity and performance of the algorithms but does not specify any particular hardware (e.g., GPU, CPU models, or cloud computing resources) used for running the experiments.
Software Dependencies No The text mentions "for avid Matlab users" in the context of notation simplification, suggesting Matlab might have been used, but no specific software or library versions are provided to ensure reproducibility.
Experiment Setup Yes The Gibbs sampler has been initialized with K_+ = 2, setting each z_nk = 1 with probability 1/2, and setting the hyperparameters to α = 0.5 and σ_B^2 = 1. We run the Gibbs sampler over 3,500 randomly chosen subjects out of the 43,093 participants in the survey, having initialized the sampler with an active feature, i.e., K_+ = 1, having set z_nk = 1 randomly with probability 0.5, and fixing α = 1 and σ_B^2 = 1. We apply the variational inference algorithm truncated to K = 25 features, as detailed in Section 4, to find the latent structure of the data.
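
The subject split and sampler initialization quoted in the Dataset Splits and Experiment Setup rows can be sketched as follows. This is only an illustrative sketch, not the authors' code (the paper releases none): the function name `init_gibbs_state`, the seed, and the dictionary keys are hypothetical, and the Gibbs updates themselves are not implemented here. It draws the 3,500 training subjects at random from the 43,093 participants and sets each z_nk = 1 independently with probability 0.5 for K_+ = 1 initial feature.

```python
import random


def init_gibbs_state(n_total, n_train, k_init, p_active=0.5, seed=0):
    """Pick a random training subset and draw an initial binary matrix Z.

    Hypothetical helper for illustration; the name and signature are
    assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # Subjects used to learn the latent features.
    train_idx = sorted(rng.sample(range(n_total), n_train))
    train_set = set(train_idx)
    # Remaining subjects, whose feature vectors are sampled afterwards
    # with the learned features held fixed.
    rest_idx = [n for n in range(n_total) if n not in train_set]
    # Z[n][k] = 1 with probability p_active, independently per entry.
    Z = [[1 if rng.random() < p_active else 0 for _ in range(k_init)]
         for _ in range(n_train)]
    return train_idx, rest_idx, Z


# Values quoted above for the NESARC experiment.
train_idx, rest_idx, Z = init_gibbs_state(n_total=43093, n_train=3500, k_init=1)
hyperparams = {"alpha": 1.0, "sigma2_B": 1.0}  # fixed, as stated in the text
```

For the other configuration quoted in the Experiment Setup row, the same helper would be called with `k_init=2` and `alpha` set to 0.5; the number of observations for that run is not given in the text above, so it is left unspecified here.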