Time-to-Event Prediction with Neural Networks and Cox Regression

Authors: Håvard Kvamme, Ørnulf Borgan, Ida Scheel

JMLR 2019

Each reproducibility variable is listed below with its result and the LLM's supporting response.
Research Type: Experimental
Evidence: Through simulation studies, the proposed loss function is verified to be a good approximation of the Cox partial log-likelihood. The proposed methodology is compared to existing methodologies on real-world data sets and is found to be highly competitive, typically yielding the best performance in terms of Brier score and binomial log-likelihood. From the paper: "In Section 5, we conduct a simulation study, verifying that the methods we propose behave as expected. In Section 6 we evaluate our methods on five real-world data sets and compare their performances with existing methodology."
Researcher Affiliation: Academia
Evidence: "Håvard Kvamme (EMAIL), Ørnulf Borgan (EMAIL), Ida Scheel (EMAIL), Department of Mathematics, University of Oslo, P.O. Box 1053 Blindern, 0316 Oslo, Norway"
Pseudocode: No
Evidence: The paper describes the methodologies using mathematical formulas and prose, but it does not contain clearly labeled pseudocode or algorithm blocks.
Open Source Code: Yes
Evidence: "A python package for the proposed methods is available at https://github.com/havakv/pycox. Implementations of methods and the data sets are available at https://github.com/havakv/pycox."
Open Datasets: Yes
Evidence: "Implementations of methods and the data sets are available at https://github.com/havakv/pycox." "For FLCHAIN, we remove individuals with missing values. Further, we remove the chapter covariate, which gives the cause of death. Table 1 provides a summary of the data sets. For a more detailed description, we refer to the original sources (Therneau, 2015; Katzman et al., 2018)." "The WSDM KKBox's churn prediction challenge was proposed... The competition was hosted by Kaggle in 2017, with the goal of predicting customer churn on a data set donated by KKBox..." (https://www.kaggle.com/c/kkbox-churn-prediction-challenge)
Dataset Splits: Yes
Evidence: "As the four data sets are somewhat small, we scored our fitted models using 5-fold cross-validation, where the hyperparameter search was performed individually for each fold." "We split the data into a training, a testing, and a validation set, and some information about these subsets are listed in Table 5."
Hardware Specification: No
Evidence: The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies: No
Evidence: The paper mentions software such as the PyTorch framework (Paszke et al., 2017), the Lifelines python package (Davidson-Pilon et al., 2018), and the survival packages of R (Therneau, 2015), but it does not provide version numbers for these components, which are required for reproducibility.
Experiment Setup: Yes
Evidence: "The networks are standard multi-layer perceptrons with the same number of nodes in every layer, ReLU activations, and batch normalization between layers. We used dropout, normalized decoupled weight decay (Loshchilov and Hutter, 2019), and early stopping for regularization. SGD was performed by AdamWR (Loshchilov and Hutter, 2019) with an initial cycle length of one epoch, and we double the cycle length after each cycle. Learning rates were found using the methods proposed by Smith (2017). All networks were trained with batch size of 1028, and the best performing architectures can be found in Table 6. For the proposed Cox-MLP (CC) and Cox-Time, we used a fixed penalty λ = 0.001 in (10). Table A.1 and A.2 provide detailed hyperparameter search spaces and chosen values for KKBox."
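For context on the "approximation for the Cox partial log-likelihood" mentioned under Research Type: the paper's proposed loss approximates the full Cox partial log-likelihood by sampling from each risk set. A minimal NumPy sketch of the full (unsampled) negative partial log-likelihood it approximates — function and variable names are mine, and tied event times are not handled:

```python
import numpy as np

def neg_partial_log_likelihood(scores, durations, events):
    """Negative Cox partial log-likelihood, averaged over events.

    scores:    relative risk scores g(x_i) from the model (higher = higher hazard)
    durations: observed times
    events:    1 if the time is an observed event, 0 if censored
    """
    scores = np.asarray(scores, dtype=float)
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=bool)
    total, n_events = 0.0, 0
    for i in np.flatnonzero(events):
        at_risk = durations >= durations[i]  # risk set R_i: still under observation
        total += np.log(np.exp(scores[at_risk]).sum()) - scores[i]
        n_events += 1
    return total / n_events
```

The paper's mini-batch-friendly loss replaces the full risk set `R_i` with a small sampled subset, which is what the simulation study verifies to be a good approximation.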
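The protocol under Dataset Splits (5-fold cross-validation with the hyperparameter search repeated inside each fold) can be sketched as below. This is a generic illustration, not the paper's code; `fit_and_score` and the candidate list are hypothetical stand-ins for the model-fitting routine and search space:

```python
import numpy as np

def kfold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and split them into k disjoint folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

def cross_validate(n, candidates, fit_and_score, k=5, seed=0):
    """For each fold: search hyperparameters using only the training part,
    then score the best configuration on the held-out fold."""
    folds = kfold_indices(n, k, seed)
    test_scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Per-fold hyperparameter search, restricted to this fold's training data.
        best = max(candidates, key=lambda c: fit_and_score(c, train_idx, train_idx))
        test_scores.append(fit_and_score(best, train_idx, test_idx))
    return float(np.mean(test_scores))
```

Repeating the search per fold, rather than fixing one configuration up front, keeps the held-out scores honest: no test fold ever influences the hyperparameter choice used to score it.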
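Two pieces of the training recipe quoted under Experiment Setup are easy to make concrete: the AdamWR warm-restart schedule (an initial cycle of one epoch, doubling after each cycle) and early stopping. A small sketch under my own naming, with the patience value chosen arbitrarily for illustration:

```python
def cycle_lengths(initial_epochs=1, n_cycles=6):
    """AdamWR-style restart schedule: each cycle twice as long as the last."""
    return [initial_epochs * 2 ** i for i in range(n_cycles)]

class EarlyStopping:
    """Stop training once validation loss fails to improve for `patience` checks."""

    def __init__(self, patience=10):
        self.best = float("inf")
        self.bad_checks = 0
        self.patience = patience

    def should_stop(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience
```

With an initial cycle of one epoch, six cycles span 1 + 2 + 4 + 8 + 16 + 32 = 63 epochs; early stopping then bounds training independently of the schedule.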