reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Integrative Analysis using Coupled Latent Variable Models for Individualizing Prognoses

Authors: Peter Schulam, Suchi Saria

JMLR 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We apply our approach to the problem of predicting lung disease trajectories in scleroderma, a complex autoimmune disease. We show that our model improves over state-of-the-art baselines in predictive accuracy and we provide a qualitative analysis of our model's output. Finally, the variability of disease presentation in scleroderma makes clinical trial recruitment challenging. We show that a prognostic tool that integrates multiple types of routinely collected longitudinal data can be used to identify individuals at greatest risk of rapid progression and to target trial recruitment.
Researcher Affiliation	Academia	Department of Computer Science Johns Hopkins University Baltimore, MD 21218, USA
Pseudocode	Yes	Figure 2: Two-stage procedure for fitting the Coupled Latent Trajectory Model (C-LTM).
Open Source Code	No	The paper does not contain an explicit statement about the availability of source code or a link to a code repository.
Open Datasets	No	We train and validate our model using data from the Johns Hopkins Scleroderma Center patient registry, one of the largest collections of clinical scleroderma data in the world.
Dataset Splits	Yes	We divide our data into 10 folds and use log-likelihood on the first fold for tuning hyperparameters. For PFVC, we select G = 9 subtypes using BIC. For the kernel hyperparameters Θ1 = {Σb, α, ℓ, σ²} we set Σb ∈ ℝ to be 16.0, which corresponds to the variance of individual-specific intercepts. We set α = 6, ℓ = 2, and σ² = 1 using a grid search over values chosen using domain knowledge. Qualitatively, these make sense; we expect transient deviations to last around 2 years and to change PFVC by around 6 units. Finally, we penalize the expected log-likelihood with respect to β1:G as in Eq. 4 and set the weight ρ = 0.01, which was chosen based on the clinical interpretability of the learned subtype trajectories. The remaining 9 folds were used for our cross-validation experiments.
Hardware Specification	No	On a standard laptop, we are able to train the model on 772 patients (5,458 PFVC measurements) in 10-20 minutes.
Software Dependencies	No	We optimize the objective using the Orthant-Wise Limited-memory Quasi-Newton (OWL-QN) algorithm (Andrew and Gao, 2007).
Experiment Setup	Yes	For the population model, we use constant functions (i.e. the basis expansion Φp(t) contains an intercept term whose coefficient is determined by baseline covariates). For the subpopulation B-splines, we set boundary knots at 0 and 25 years (the maximum observation time in our data set is 23 years), use two interior knots that divide the time period from 0-25 years into three equally spaced chunks, and use quadratics as the piecewise components. For the individual-specific long-term basis Φℓ, we use the same basis as the population model (constant functions). We divide our data into 10 folds and use log-likelihood on the first fold for tuning hyperparameters. For PFVC, we select G = 9 subtypes using BIC. For the kernel hyperparameters Θ1 = {Σb, α, ℓ, σ²} we set Σb ∈ ℝ to be 16.0, which corresponds to the variance of individual-specific intercepts. We set α = 6, ℓ = 2, and σ² = 1 using a grid search over values chosen using domain knowledge. Qualitatively, these make sense; we expect transient deviations to last around 2 years and to change PFVC by around 6 units. Finally, we penalize the expected log-likelihood with respect to β1:G as in Eq. 4 and set the weight ρ = 0.01, which was chosen based on the clinical interpretability of the learned subtype trajectories.
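
Illustrative sketch (not from the paper, which releases no code): the subpopulation B-spline basis described under Experiment Setup can be reproduced roughly as follows. The sketch assumes a standard clamped knot vector and the Cox-de Boor recursion. The function and constant names (`bspline_knots`, `bspline_value`, `bspline_row`, `KNOTS`, `design`) are hypothetical, and the evaluation times are arbitrary examples rather than registry data.

```python
# Quadratic B-spline basis with boundary knots at 0 and 25 years and two
# interior knots splitting [0, 25] into three equal intervals.
# Assumption: clamped knot vector + Cox-de Boor recursion; the paper does
# not say which B-spline implementation the authors used.

def bspline_knots(boundary, interior, degree):
    """Clamped knot vector: each boundary knot is repeated degree + 1 times."""
    lo, hi = boundary
    return [lo] * (degree + 1) + list(interior) + [hi] * (degree + 1)


def bspline_value(i, k, t, knots):
    """Value of the i-th B-spline of degree k at time t (Cox-de Boor)."""
    if k == 0:
        if knots[i] <= t < knots[i + 1]:
            return 1.0
        # Close the last non-empty interval on the right so t == upper bound works.
        if t == knots[-1] and knots[i] < knots[i + 1] == knots[-1]:
            return 1.0
        return 0.0
    left = 0.0
    width = knots[i + k] - knots[i]
    if width > 0:  # skip terms with zero-width support (repeated knots)
        left = (t - knots[i]) / width * bspline_value(i, k - 1, t, knots)
    right = 0.0
    width = knots[i + k + 1] - knots[i + 1]
    if width > 0:
        right = (knots[i + k + 1] - t) / width * bspline_value(i + 1, k - 1, t, knots)
    return left + right


def bspline_row(t, knots, degree):
    """All basis function values at time t (one row of a design matrix)."""
    n_basis = len(knots) - degree - 1
    return [bspline_value(i, degree, t, knots) for i in range(n_basis)]


DEGREE = 2                             # "quadratics as the piecewise components"
BOUNDARY = (0.0, 25.0)                 # boundary knots, in years
INTERIOR = [25.0 / 3.0, 50.0 / 3.0]    # three equally spaced chunks
KNOTS = bspline_knots(BOUNDARY, INTERIOR, DEGREE)

# Example evaluation times in years (arbitrary, for illustration only).
design = [bspline_row(t, KNOTS, DEGREE) for t in (0.0, 5.0, 12.5, 23.0, 25.0)]
```

Under these assumptions the basis has five functions (two interior knots plus degree plus one), and each row of `design` sums to one, so every subtype trajectory would be described by five spline coefficients.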