EHRDiff: Exploring Realistic EHR Synthesis with Diffusion Models

Authors: Hongyi Yuan, Songchi Zhou, Sheng Yu

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this study, we investigate the potential of diffusion models for EHR data synthesis and introduce a novel method, EHRDiff. Through extensive experiments, EHRDiff establishes new state-of-the-art quality for synthetic EHR data while protecting private information.
Researcher Affiliation | Academia | Hongyi Yuan, Songchi Zhou, and Sheng Yu are all affiliated with the Center for Statistical Science, Tsinghua University.
Pseudocode | Yes | Algorithm 1 (Heun's 2nd-order method for sampling); inputs: time step t_i and noise level σ_{t_i}. (A runnable sketch of such a sampler is given below the table.)
Open Source Code | Yes | Code is released at https://github.com/sczzz3/EHRDiff.git.
Open Datasets | Yes | In this work, we use a publicly available EHR database, MIMIC-III, to evaluate EHRDiff; MIMIC-III (Johnson et al., 2016) integrates de-identified and comprehensive clinical EHR data. CinC2012 Data (Silva et al., 2012) is a dataset proposed in the CinC 2012 challenge for predicting the mortality of ICU patients. PTB-ECG Data (Bousseljot et al., 1995) is a collection of ECG signals for heart disease diagnosis.
Dataset Splits | Yes | The final extracted number of EHRs is 46,520, of which 41,868 are randomly selected for model training while the rest are held out for evaluation. Sets A and B of the CinC2012 Data are used as the training and held-out testing sets, respectively. The PTB-ECG Data is split 8:2 into training and held-out testing sets. (A sketch of the MIMIC-III split is given below the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models or memory amounts) used for its experiments; it only mentions general settings such as "training on synthetic data" without further hardware specifications.
Software Dependencies | No | The paper mentions using LightGBM (Ke et al., 2017) as classifiers and an MLP with ReLU (Nair & Hinton, 2010) activations, but does not provide version numbers for these or for other key software components (e.g., Python, PyTorch/TensorFlow).
Experiment Setup | Yes | In our experiments, for the diffusion noise schedule, we set σ_min and σ_max to 0.02 and 80. ρ is set to 7 and the time axis is discretized into N = 32 steps. P_mean and P_std are both set to 1.2 for the noise distribution in the training process. F_θ in Equation 8 is parameterized by an MLP with ReLU (Nair & Hinton, 2010) activations, with hidden sizes [1024, 384, 384, 384, 1024]. For the baseline methods, we follow the settings reported in their papers. The reported standard errors (marked in the results tables) are calculated over 5 different runs. (A sketch of the noise schedule and MLP configuration is given below the table.)
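
The Algorithm 1 row above refers to Heun's 2nd-order sampling method. Below is a minimal sketch of one such sampling step in the style of EDM diffusion samplers (Karras et al.), not the authors' exact implementation; the `denoiser` callable and variable names are assumptions.

```python
import torch

def heun_sampling_step(denoiser, x, sigma_cur, sigma_next):
    """One Heun's 2nd-order step from noise level sigma_cur down to sigma_next.

    `denoiser` is assumed to map (x, sigma) to a denoised estimate, so the
    probability-flow ODE derivative is d = (x - denoiser(x, sigma)) / sigma.
    """
    # Euler (predictor) step.
    d_cur = (x - denoiser(x, sigma_cur)) / sigma_cur
    x_next = x + (sigma_next - sigma_cur) * d_cur
    # Heun (corrector) step; skipped at the final step where sigma_next == 0.
    if sigma_next > 0:
        d_next = (x_next - denoiser(x_next, sigma_next)) / sigma_next
        x_next = x + (sigma_next - sigma_cur) * 0.5 * (d_cur + d_next)
    return x_next
```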
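
The MIMIC-III split reported above (41,868 of 46,520 records for training, the rest held out) can be reproduced with a simple random permutation; this is a hypothetical sketch, and the random seed is an assumption, not taken from the paper.

```python
import numpy as np

# Hypothetical reproduction of the reported MIMIC-III split: 46,520 EHRs in
# total, 41,868 randomly selected for training, the remainder held out.
rng = np.random.default_rng(seed=0)  # the seed is an assumption
n_total, n_train = 46_520, 41_868
perm = rng.permutation(n_total)
train_idx, test_idx = perm[:n_train], perm[n_train:]
assert len(test_idx) == n_total - n_train  # 4,652 held-out records
```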
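
The experiment-setup hyperparameters above match the discretized noise schedule of Karras et al. (EDM). The sketch below instantiates that schedule with the reported values and builds an MLP with the reported hidden sizes; `feature_dim` and the omission of noise-level conditioning are assumptions made for brevity.

```python
import numpy as np
import torch
import torch.nn as nn

# EDM-style discretized noise schedule with the reported hyperparameters:
# sigma_min = 0.02, sigma_max = 80, rho = 7, N = 32 time steps.
sigma_min, sigma_max, rho, N = 0.02, 80.0, 7.0, 32
steps = np.arange(N)
sigmas = (sigma_max ** (1 / rho)
          + steps / (N - 1) * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))) ** rho

# Training-time noise levels: ln(sigma) ~ Normal(P_mean, P_std^2), with the
# reported P_mean = 1.2 and P_std = 1.2.
def sample_training_sigma(batch_size, p_mean=1.2, p_std=1.2):
    return torch.exp(torch.randn(batch_size) * p_std + p_mean)

# MLP for F_theta with ReLU activations and the reported hidden sizes
# [1024, 384, 384, 384, 1024]. `feature_dim` (the EHR feature dimensionality)
# is an assumption; noise-level conditioning is omitted for brevity, though
# the actual F_theta also receives the noise level as input.
def build_mlp(feature_dim):
    sizes = [feature_dim, 1024, 384, 384, 384, 1024, feature_dim]
    layers = []
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(d_in, d_out), nn.ReLU()]
    return nn.Sequential(*layers[:-1])  # drop the trailing ReLU on the output
```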