Fair Federated Survival Analysis
Authors: Md Mahmudur Rahman, Sanjay Purushotham
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on distributed non-IID datasets demonstrate FairFSA's superiority in fairness and accuracy over state-of-the-art FSA methods, establishing it as a robust FSA approach capable of handling censoring while providing equitable and accurate survival predictions for all subjects. |
| Researcher Affiliation | Academia | University of Maryland, Baltimore County, Baltimore, Maryland, USA |
| Pseudocode | No | The paper describes procedures and processes (e.g., iterative gradient descent approach, Federated Training steps) but does not present them in a structured pseudocode or algorithm block format. Figure 1 shows a high-level framework diagram, not pseudocode. |
| Open Source Code | Yes | Local and global fairness definitions are in the supplementary materials at this GitHub link: https://github.com/umbcsanjaylab/FairFSA |
| Open Datasets | Yes | We evaluated four publicly available real-world survival datasets namely FLChain (Dispenzieri et al. 2012), SUPPORT (Knaus et al. 1995), SEER (TENG 2019), and MSK-MET (Nguyen et al. 2022) for centralized and federated fair survival analysis. Detailed descriptions of these datasets are provided in the supplementary materials. |
| Dataset Splits | Yes | At each client, local datasets are split into 80% training and 20% test sets. The local test datasets are evaluated in both centralized and federated settings, ensuring that the combined test data remains the same across all experiments. We perform 5-fold cross-validation and compute evaluation metrics on the same combined test set, reporting the mean and standard deviation (shown in the supplementary materials) of each metric. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using the "Adam optimizer (Kingma and Ba 2014)" and the "Survival EVAL package (Qi, Sun, and Greiner 2023)" but does not specify version numbers for these or other software libraries or tools. |
| Experiment Setup | Yes | We use the Adam optimizer (Kingma and Ba 2014) with early-stopping based on the best validation C-index. In the federated settings, we consider 5 clients, with a total of 10 communication rounds and 20 local epochs per round. We also choose the learning rate from [0.01, 0.001] and set the batch size to 128. We compute the pseudo values for the unique time points in the training data. Detailed hyperparameter settings are provided in the supplementary materials. |
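The dataset-split protocol quoted in the table (each client splits its local data 80/20, and the held-out portions are pooled into one combined test set used across centralized and federated experiments) can be sketched as follows. This is a minimal illustration, not the authors' code; `split_clients` and the toy record layout are hypothetical names assumed here.

```python
import random

def split_clients(client_datasets, test_frac=0.2, seed=0):
    """Split each client's local data into train/test and pool the
    held-out portions into one combined test set."""
    rng = random.Random(seed)
    train_sets, combined_test = [], []
    for data in client_datasets:
        idx = list(range(len(data)))
        rng.shuffle(idx)
        cut = int(len(data) * (1 - test_frac))
        train_sets.append([data[i] for i in idx[:cut]])
        combined_test.extend(data[i] for i in idx[cut:])
    return train_sets, combined_test

# 5 toy clients, each a list of (feature, survival_time, event) records
clients = [[(i, float(i + 1), i % 2) for i in range(100)] for _ in range(5)]
trains, test = split_clients(clients)
print(len(trains), len(trains[0]), len(test))  # 5 80 100
```

Because every experiment evaluates on the same pooled test set, metric differences between centralized and federated runs reflect the training regime rather than the evaluation data.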
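The experiment setup row describes a standard federated loop: 5 clients, 10 communication rounds, and 20 local epochs per round. A minimal FedAvg-style sketch of that schedule is below; it uses plain SGD on a toy linear regression as a stand-in for the paper's Adam-based training of the survival model, and `local_update`/`fedavg` are illustrative names, not the authors' API.

```python
ROUNDS, LOCAL_EPOCHS, NUM_CLIENTS, LR = 10, 20, 5, 0.01

def local_update(weights, data, epochs=LOCAL_EPOCHS, lr=LR):
    # Plain SGD on squared error; the paper uses Adam with
    # early stopping on the validation C-index instead.
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            grad = (w[0] * x + w[1]) - y
            w[0] -= lr * grad * x
            w[1] -= lr * grad
    return w

def fedavg(client_data):
    global_w = [0.0, 0.0]
    for _ in range(ROUNDS):
        client_ws = [local_update(global_w, d) for d in client_data]
        # Server aggregates by coordinate-wise averaging (FedAvg).
        global_w = [sum(ws) / len(ws) for ws in zip(*client_ws)]
    return global_w

# Each client holds samples of y = 2x + 1
data = [[(x / 10, 2 * (x / 10) + 1) for x in range(10)]
        for _ in range(NUM_CLIENTS)]
w = fedavg(data)
```

With identical client data this reduces to ordinary SGD; the paper's setting is non-IID, where the averaging step is what reconciles divergent local models.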
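The setup also states that pseudo values are computed at the unique time points of the training data. In survival analysis these are typically jackknife pseudo-observations of the Kaplan-Meier estimator, θ_i(t) = n·Ŝ(t) − (n−1)·Ŝ₋ᵢ(t); the sketch below assumes that standard construction (the paper's exact variant may differ), and the function names are illustrative.

```python
def km_survival(times, events, t):
    """Kaplan-Meier survival probability at time t."""
    s = 1.0
    for u in sorted(set(tt for tt, e in zip(times, events) if e)):
        if u > t:
            break
        at_risk = sum(1 for tt in times if tt >= u)
        deaths = sum(1 for tt, e in zip(times, events) if tt == u and e)
        s *= 1 - deaths / at_risk
    return s

def pseudo_values(times, events, eval_times):
    """Jackknife pseudo-observations of KM at each evaluation time:
    theta_i(t) = n * S(t) - (n - 1) * S_{-i}(t)."""
    n = len(times)
    pv = []
    for i in range(n):
        loo_t = times[:i] + times[i + 1:]
        loo_e = events[:i] + events[i + 1:]
        pv.append([n * km_survival(times, events, t)
                   - (n - 1) * km_survival(loo_t, loo_e, t)
                   for t in eval_times])
    return pv
```

As a sanity check, with no censoring the pseudo value at time t reduces to the indicator 1{Tᵢ > t}, which is why pseudo values let standard regression losses handle censored survival targets.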