Doubly Robust Conformalized Survival Analysis with Right-Censored Data

Authors: Matteo Sesia, Vladimir Svetnik

ICML 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Empirical studies on simulated and real data demonstrate that our method leads to relatively informative predictive inferences and is especially robust in challenging settings where the survival model may be inaccurate. ... 4. Numerical Experiments ... 5. Application to Real Data |
| Researcher Affiliation | Collaboration | (1) Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA, USA. (2) Thomas Lord Department of Computer Science, University of Southern California, Los Angeles, CA, USA. (3) Merck & Co., Inc., Rahway, NJ, USA. Correspondence to: Matteo Sesia <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Imputation of Latent Censoring Times ... Algorithm 2 DR-COSARC with Fixed Cutoffs ... Algorithm 3 DR-COSARC with Adaptive Cutoffs |
| Open Source Code | Yes | Software Availability: A software implementation of the methods described in this paper is available online at https://github.com/msesia/conformal_survival. |
| Open Datasets | Yes | We apply our method to seven publicly available datasets: VALCT, PBC, GBSG, METABRIC, COLON, HEART, and RETINOPATHY. These datasets cover a range of study designs and sizes; Table A3 in Appendix A4 provides details on the number of observations, covariates, and data sources. ... The datasets were obtained from various publicly available sources. VALCT, PBC, COLON, HEART, and RETINOPATHY are included in the survival R package. GBSG was sourced from GitHub: https://github.com/jaredleekatzman/DeepSurv/. METABRIC was accessed via https://www.cbioportal.org/study/summary?id=brca_metabric. |
| Dataset Splits | Yes | We generate independent training, calibration, and test datasets, each with 1000 samples. ... The datasets are split into 60% for training, 20% for calibration, and 20% for testing, and each experiment is repeated 100 times using independent random splits. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. It discusses computational cost but without specifying the underlying hardware. |
| Software Dependencies | No | The paper lists several R packages used for modeling (grf, survival, randomForestSRC), but it does not specify their version numbers or the version of R itself, which is required for reproducible software dependencies. |
| Experiment Setup | Yes | We compute 90% survival LPBs for the test set. Performance is evaluated by the average proportion of test points where the true survival time exceeds the LPB (targeting 90%) and the average LPB value... All experiments are repeated 100 times, and results are averaged. ... we always set equal to the median of the observed censoring times. |
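
The split and evaluation protocol quoted above (60% / 20% / 20% random splits repeated 100 times, with coverage and average LPB as the reported metrics) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which is written in R and available at the repository linked above. The function names `split_indices` and `evaluate_lpb`, the use of the repetition index as the random seed, and the sample size in the usage loop are assumptions made for illustration. The coverage computation also assumes that the true survival times are known, as in a simulation, and counts ties as covered.

```python
import random


def split_indices(n, seed, fractions=(0.6, 0.2, 0.2)):
    """Randomly partition range(n) into train / calibration / test index lists.

    Hypothetical helper: the 60/20/20 proportions follow the quoted text,
    everything else (seeding, rounding) is an illustrative assumption.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(fractions[0] * n)
    n_cal = round(fractions[1] * n)
    # The test set takes whatever remains after training and calibration.
    return idx[:n_train], idx[n_train:n_train + n_cal], idx[n_train + n_cal:]


def evaluate_lpb(true_times, lpbs):
    """Return (empirical coverage, average LPB) for paired sequences.

    Coverage is the fraction of test points whose true survival time is at
    least the lower prediction bound (ties counted as covered: an assumption).
    """
    covered = sum(t >= lpb for t, lpb in zip(true_times, lpbs))
    return covered / len(lpbs), sum(lpbs) / len(lpbs)


if __name__ == "__main__":
    n = 500  # hypothetical dataset size
    for rep in range(100):  # 100 independent random splits
        train, cal, test = split_indices(n, seed=rep)
        # Model fitting, calibration, and LPB computation would go here.
```

Under this sketch, the figures reported for an experiment would be the means of the two values returned by `evaluate_lpb` across the 100 repetitions, with coverage compared against the 90% target.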