A Flexible Generative Model for Heterogeneous Tabular EHR with Missing Modality
Authors: Huan He, William Hao, Yuanzhe Xi, Yong Chen, Bradley Malin, Joyce Ho
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that our model consistently outperforms existing state-of-the-art synthetic EHR generation methods both in fidelity by up to 3.10% and utility by up to 7.16%. Additionally, we show that our method can be successfully used in privacy-sensitive settings, where the original patient-level data cannot be shared. |
| Researcher Affiliation | Academia | Huan He, Department of Biostatistics, University of Pennsylvania; William Hao, Department of Computer Science, Emory University; Yuanzhe Xi, Department of Mathematics, Emory University; Yong Chen, Department of Biostatistics, University of Pennsylvania; Bradley Malin, Department of Biomedical Informatics, Vanderbilt University; Joyce C. Ho, Department of Biostatistics, Emory University |
| Pseudocode | Yes | A.4 ALGORITHM OF FLEXGEN-EHR Algorithm 1: Training of FLEXGEN-EHR |
| Open Source Code | No | The paper states that codes for *baseline models* are available online (with links provided), but does not provide an explicit statement or link for the source code of FLEXGEN-EHR itself. |
| Open Datasets | Yes | We use two real-world de-identified EHR datasets, MIMIC-III (Johnson et al., 2016) and eICU (Pollard et al., 2018). |
| Dataset Splits | No | The paper does not provide specific percentages or methodology for train/validation/test splits, nor does it explicitly mention a validation set. It mentions using 'test datasets' but not the splitting strategy. |
| Hardware Specification | Yes | For training the models, we used Adam (Kingma & Ba, 2015) with the learning rate set to 0.001, and a mini-batch of 128 on a machine equipped with one Nvidia GeForce RTX 3090 and CUDA 11.2. |
| Software Dependencies | Yes | We implemented FLEXGEN-EHR with PyTorch. For training the models, we used Adam (Kingma & Ba, 2015) with the learning rate set to 0.001, and a mini-batch of 128 on a machine equipped with one Nvidia GeForce RTX 3090 and CUDA 11.2. |
| Experiment Setup | Yes | For training the models, we used Adam (Kingma & Ba, 2015) with the learning rate set to 0.001, and a mini-batch of 128... Hyperparameters of FLEXGEN-EHR are selected after grid search. We use a timestep of 50 and a noise scheduling β from 1×10⁻⁴ to 1×10⁻². |
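The experiment-setup row reports a diffusion noise schedule with 50 timesteps and β ranging from 1×10⁻⁴ to 1×10⁻². A minimal sketch of such a schedule is below; note the *shape* of the schedule (linear interpolation between the endpoints) is an assumption, since the paper excerpt only gives the two endpoints and the number of steps:

```python
import numpy as np

# Sketch of the reported diffusion noise schedule:
# T = 50 timesteps, beta from 1e-4 to 1e-2.
# Linear spacing is assumed; the paper only states the endpoints.
T = 50
betas = np.linspace(1e-4, 1e-2, T)

# Standard DDPM-style derived quantities:
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # cumulative signal retained at each step t
```

Under this (assumed linear) schedule, `alpha_bars[-1]` stays well above zero, i.e. a substantial fraction of the signal survives all 50 noising steps, which is typical for short-horizon tabular diffusion models.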