Going Beyond Static: Understanding Shifts with Time-Series Attribution
Authors: Jiashuo Liu, Nabeel Seedat, Peng Cui, Mihaela van der Schaar
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies in real-world healthcare applications highlight how the TSSA framework enhances the understanding of time-series shifts, facilitating reliable model deployment and driving targeted improvements from both algorithmic and data-centric perspectives. |
| Researcher Affiliation | Academia | Jiashuo Liu, Nabeel Seedat, Peng Cui & Mihaela van der Schaar — Tsinghua University, University of Cambridge |
| Pseudocode | No | The paper describes methodologies and processes but does not include any explicitly labeled pseudocode or algorithm blocks. For example, it describes the TSSA framework's parts and the doubly robust estimator's formulation, but not in a structured pseudocode format. |
| Open Source Code | No | The paper does not provide any specific links to a code repository, an explicit statement of code release, or mention of code in supplementary materials. |
| Open Datasets | Yes | Through our experiments, we use the Medical Information Mart for Intensive Care (MIMIC) (Johnson et al., 2016) dataset. |
| Dataset Splits | Yes | We follow the standard design outlined by Jarrett et al., randomly splitting the patients in the MIMIC-III dataset into a training set (18,490 patients, P) and a test set (4,610 patients, Q), ensuring no patient overlap between the two sets. For the validation set, we use the same patients as in the training set but select different time segments for their time-series features, denoted as Pval. ... Specifically, for the training set P, we utilize the last 24-hour time segments for all time-series features, while for the test set Q, we select the first 24-hour time segments for all features. This setup allows us to assess whether the model can effectively withstand these temporal shifts and accurately identify patients at high risk of mortality in the early stage. We train a Transformer model fθ(·) on P, which comprises 12,574 patients, and validate it on an additional 5,547 patients. To control for other shifts, we use the same set of patients for both the validation and test sets Q; the only difference lies in the time segments used: the last 24 hours for validation and the first 24 hours for testing. ... We consider a realistic scenario in which a model trained on historical data (12,574 patients, first 24-hour time series, denoted as P) must be deployed for new patients and future time segments (an additional 5,547 patients, second 24-hour time series, denoted as Q). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or detailed computer specifications used for running its experiments. It only mentions using a Transformer model and an attribution model without hardware context. |
| Software Dependencies | No | The paper mentions using a 'Transformer model' and 'XGBoost' for comparison, but does not provide specific version numbers for any software libraries, frameworks, or programming languages. |
| Experiment Setup | Yes | For the original model (under evaluation), we use a Transformer model (n_head: 4, n_layer: 3, hidden dim: 32); the learning rate is 1e-3, the total epoch number is 200, the batch size is 256, and early stopping is used during training (based on the last 10 epochs). For the attribution model: the architecture is shown in Figure 2, where we use a two-layer MLP with hidden size selected from {16, 32, 64, 128} for each part according to the validation results, with learning rate 1e-3 and batch size 64. |
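The paper does not release code, but the reported hyperparameters and the early-stopping rule ("according to last 10 epochs") can be captured in a short sketch. This is a minimal, hedged reconstruction: the config names are hypothetical, the values are those quoted above, and the exact stopping criterion is an assumption (stop when validation loss has not improved within the last 10 epochs).

```python
# Hypothetical config names; values are the ones reported in the paper.
transformer_cfg = {
    "n_head": 4,        # attention heads
    "n_layer": 3,       # Transformer layers
    "hidden_dim": 32,   # hidden dimension
    "lr": 1e-3,         # learning rate
    "epochs": 200,      # total epoch budget
    "batch_size": 256,
}
attribution_cfg = {
    "hidden_dim_grid": [16, 32, 64, 128],  # selected per part on validation
    "lr": 1e-3,
    "batch_size": 64,
}

def should_stop(val_losses, patience=10):
    """Assumed early-stopping rule: stop once the best validation loss
    is more than `patience` epochs in the past."""
    if len(val_losses) <= patience:
        return False
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return len(val_losses) - 1 - best_epoch >= patience
```

For example, a run whose validation loss last improved at epoch 1 and then stayed flat for 10 more epochs would be stopped, while any run shorter than the patience window is always continued.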