Patient Risk Stratification with Time-Varying Parameters: A Multitask Learning Approach
Authors: Jenna Wiens, John Guttag, Eric Horvitz
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Applied to a held out set of approximately 25,000 patient admissions, we achieve an area under the receiver operating characteristic curve of 0.81 (95% CI 0.78-0.84). The model has been integrated into the health record system at a large hospital in the US, and can be used to produce daily risk estimates for each inpatient. |
| Researcher Affiliation | Collaboration | Jenna Wiens, EMAIL, Computer Science & Engineering, University of Michigan, Ann Arbor, MI; John Guttag, EMAIL, Department of EECS, Massachusetts Institute of Technology, Cambridge, MA; Eric Horvitz, EMAIL, Microsoft Research, Redmond, WA |
| Pseudocode | No | The paper describes the problem setup and the learning algorithms using mathematical formulations (Equations 1 and 2), but does not contain structured pseudocode or an algorithm block. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its own source code, nor does it provide a link to a code repository. It does mention using LIBLINEAR (Fan et al., 2008), but this is a third-party tool. |
| Open Datasets | No | We considered all adult inpatient admissions to a large private hospital in the US over a two year period. We leverage the contents of EHRs from over 50,000 patient admissions from a single hospital. |
| Dataset Splits | Yes | We split the data into a training set and a holdout set based on time, training on data from the first year, and validating our model on data from the second year. The training data consisted of patient admissions from 2011-04-12 to 2012-04-11, totaling 190,675 visit days pertaining to 24,607 unique visits. Within the training data, 258 admissions had a positive test for C. difficile resulting in 2,608 training days with a positive label. [...] The validation set consisted of patient admissions from 2012-04-12 to 2013-04-12 and was composed of 24,399 admissions, of which 242 had a positive test result for C. difficile. [...] To select the hyperparameter C in (1), we performed repeated five-fold cross validation on the training data, choosing a setting that maximized the AUROC. |
| Hardware Specification | No | The paper does not contain any specific details about the hardware used for running its experiments, such as GPU/CPU models or memory. |
| Software Dependencies | No | The model parameters, i.e., θ, were solved for using LIBLINEAR (Fan et al., 2008). (This cites a tool, but does not provide a specific version number for this or any other software dependency for replication.) |
| Experiment Setup | Yes | We selected the number of tasks T, and the corresponding temporal intervals τj for j = 1, ..., T based on the number of training examples available for each interval. For our data, this resulted in six distinct tasks, corresponding to six distinct time periods: D1, D2, D3, D4, D5, D6. [...] To select the hyperparameter C in (1), we performed repeated five-fold cross validation on the training data, choosing a setting that maximized the AUROC. |
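
The split and model-selection procedure quoted in the table can be illustrated with a minimal sketch. This is not the authors' code (none was released). It assumes a hypothetical record layout in which each admission is a dict with an `admit_date` field, and the helper names `temporal_split`, `auroc`, and `repeated_kfold` are invented for illustration. The number of repeats and the choice to fold by admission id are also assumptions, since the source states neither.

```python
from datetime import date
import random

# First admission date of the second year, as quoted in the Dataset Splits row.
CUTOFF = date(2012, 4, 12)

def temporal_split(admissions, cutoff=CUTOFF):
    """Split admission records into (train, holdout) by admission date."""
    train = [a for a in admissions if a["admit_date"] < cutoff]
    holdout = [a for a in admissions if a["admit_date"] >= cutoff]
    return train, holdout

def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def repeated_kfold(ids, k=5, repeats=3, seed=0):
    """Yield (train_ids, val_ids) pairs for repeated k-fold cross validation.

    Assumption: folds are formed over admission ids; `repeats` is hypothetical.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = list(ids)
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for i in range(k):
            train = [x for j, fold in enumerate(folds) if j != i for x in fold]
            yield train, folds[i]
```

In a full pipeline, each candidate value of C would be scored by averaging `auroc` over the validation folds produced by `repeated_kfold` on the training year, and the best setting would then be evaluated once on the holdout year.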