Weibull Racing Survival Analysis with Competing Events, Left Truncation, and Time-Varying Covariates
Authors: Quan Zhang, Yanxun Xu, Mei-Cheng Wang, Mingyuan Zhou
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 4, we use synthetic data to showcase WDR’s parsimonious nonlinear modeling capacity and outstanding performance. In Section 5, we analyze real data of lymphoma and Alzheimer’s disease to understand how hazards of competing events are influenced by covariates and show the potential of WDR in discovering and diagnosing new diseases. In all the experiments, we set K = 10 in WDR to allow up to 10 latent sub-events for each competing event, run 20,000 MCMC iterations, and collect the last 2,000 for posterior estimations. We use the Brier score (Gerds et al., 2008; Steyerberg et al., 2010) to quantify the prediction accuracy. |
| Researcher Affiliation | Academia | Quan Zhang, Department of Accounting and Information Systems, Michigan State University, East Lansing, MI 48824, U.S.A.; Yanxun Xu, Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD 21218, U.S.A.; Mei-Cheng Wang, Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, U.S.A.; Mingyuan Zhou, Department of Information, Risk, and Operations Management and Department of Statistics and Data Sciences, The University of Texas at Austin, Austin, TX 78712, U.S.A. |
| Pseudocode | Yes | Algorithm 1: Simulation of the survival data with time-varying covariates. Input: number of subjects n, number of competing events J, right-censoring time T_r.c., covariate distribution P_x, maximum number of covariate updates L ∈ Z+, potential covariate update times τ^(0), τ^(1), ..., τ^(L) with τ^(L) < T_r.c., Weibull parameters a and {λ_j(x)}_{j=1}^J. Output: {t_i, y_i, x_i^(0), ..., x_i^(L_i), τ_i^(0), ..., τ_i^(L_i)}_{i=1}^n. 1: for i = 1, ..., n do ... 16: end for |
| Open Source Code | No | The paper does not provide concrete access to its own source code. It mentions using third-party packages like 'pycox' but does not offer a link or explicit statement for the WDR implementation. |
| Open Datasets | Yes | We apply WDR to the diffuse large B-cell lymphoma (DLBCL) data (Rosenwald et al., 2002) where the covariates are time-invariant. The data is publicly accessible at https://llmpp.nih.gov/DLBCL/. Last access in July 2022. |
| Dataset Splits | Yes | For each data set, we simulate 2000 subjects and take 20 random partitions into a training set of 1800 and a testing set of 200. We evaluate the Brier scores at the five time points as in Table 1 and report in Table 3 the average score over the partitions and the times evaluated. For each data set, we simulate 1000 subjects and take 20 random partitions into a training set of 900 and a testing set of 100. We report in Tables 4 and 5 the Brier scores (mean ± standard error) for events 1 and 2, respectively, by the six models. In addition, we randomly split the subjects into 80% of training data and 20% of testing data to assess the performance of WDR and defer the model comparison to Table 14 in the Appendix. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | Yes | We use the mcmc function in the R package diversitree (FitzJohn, 2012). We use R for the MCMC algorithm of WDR and the package riskRegression (Gerds and Scheike, 2015) for FG, the package CoxBoost (Binder, 2013) for the gradient boosting method in KFG, and the package randomForestSRC (Ishwaran and Kogalur, 2018) for RF. For DeepHit and PCH, we use the Python package pycox (https://github.com/havakv/pycox; last access in February 2023). We use the R package LiblineaR (Helleputte, 2015) for L2-MLR, where a bias term is included and the regularization parameter is selected by a five-fold cross-validation on the training set from (2^-10, 2^-9, ..., 2^15). For SVMs, we use the LIBSVM (Chang and Lin, 2011) provided by the R package e1071 (Meyer et al., 2015). |
| Experiment Setup | Yes | In all the experiments, we set K = 10 in WDR to allow up to 10 latent sub-events for each competing event, run 20,000 MCMC iterations, and collect the last 2,000 for posterior estimations. For the kernel Fine-Gray (KFG) model, we use the radial basis function kernel and select the kernel width from (2^-5, 2^-4, ..., 2^5) by maximizing the partial likelihood of validation data, which is randomly sampled from, and accounts for, 20% of the training data. For the random survival forests (RF), we set the number of trees equal to 1000 and the number of splits equal to 2 if x_i ∈ R^3 and equal to 4 if x_i ∈ R^10, which are roughly equal to the square root of the covariate dimensions. For the DeepHit and the piecewise constant hazards (PCH) models, we discretize the continuous time into 20 intervals of equal length, in each of which the survival or hazard function is constant, and use a feedforward neural network with the ReLU activation functions and two hidden layers, each of which has 20 nodes. Early stopping is implemented by incorporating a validation set. |
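The simulation described in the Pseudocode row (Algorithm 1) generates competing-risks survival data with Weibull hazards. A minimal sketch of the core sampling step is below, under simplifying assumptions not in the paper: covariates are held time-invariant, all causes share one Weibull shape a, and the cause-specific hazards are constants λ_j rather than the paper's covariate-dependent λ_j(x). It is illustrative only, not the authors' Algorithm 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_competing(n, lambdas, a, t_cens):
    """Simulate competing-risks data with cause-specific Weibull hazards
    h_j(t) = lambda_j * a * t^(a-1), a shared shape a, and right censoring.
    Simplified sketch: no time-varying covariates, constant lambdas."""
    lam_total = np.sum(lambdas)
    # All-cause survival is S(t) = exp(-lam_total * t^a); invert it to sample T
    u = rng.uniform(size=n)
    t_event = (-np.log(u) / lam_total) ** (1.0 / a)
    # With proportional cause-specific hazards, the cause is drawn
    # independently of T with probability lambda_j / lam_total
    cause = rng.choice(len(lambdas), size=n, p=lambdas / lam_total) + 1
    # Apply administrative right censoring at t_cens; y = 0 marks censoring
    t_obs = np.minimum(t_event, t_cens)
    y = np.where(t_event <= t_cens, cause, 0)
    return t_obs, y

# Toy usage mirroring the simulation scale in the Dataset Splits row
t, y = simulate_competing(2000, np.array([0.5, 0.3]), a=1.5, t_cens=2.0)
```

The paper's full algorithm additionally walks each subject through covariate update times τ^(0), ..., τ^(L), re-drawing the residual event time from λ_j(x) after each update; that piecewise construction is omitted here.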
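Prediction accuracy above is quantified with the Brier score at fixed evaluation times. A minimal sketch of the time-dependent Brier score for one competing event is below; note the paper uses the inverse-probability-of-censoring-weighted (IPCW) version of Gerds et al. (2008), whereas this sketch omits the censoring weights for brevity, so it is only valid when no subject is censored before the evaluation time.

```python
import numpy as np

def brier_score(t_eval, times, events, pred_cif, cause=1):
    """Unweighted Brier score for one competing event at time t_eval:
    mean squared error between the outcome indicator 1{T_i <= t_eval, y_i = cause}
    and the model's predicted cumulative incidence F_cause(t_eval | x_i).
    Sketch only: the IPCW weights of Gerds et al. (2008) are omitted."""
    observed = ((times <= t_eval) & (events == cause)).astype(float)
    return float(np.mean((observed - pred_cif) ** 2))

# Toy usage: three subjects evaluated at t_eval = 1.0 (0 = censored)
times = np.array([0.5, 1.5, 0.8])
events = np.array([1, 0, 2])
pred = np.array([0.9, 0.2, 0.1])   # hypothetical predicted incidences
bs = brier_score(1.0, times, events, pred)
# errors are 0.01, 0.04, 0.01, so bs = 0.02
```

Lower is better; a model predicting the empirical incidence exactly would score 0, and the score is averaged over the five evaluation time points and 20 partitions as reported in Table 3.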