Timer-XL: Long-Context Transformers for Unified Time Series Forecasting

Authors: Yong Liu, Guo Qin, Xiangdong Huang, Jianmin Wang, Mingsheng Long

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct evaluations of Timer-XL in three aspects, including (1) supervised training as a task-specific forecaster, (2) large-scale pre-training as a zero-shot forecaster, and (3) assessing the effectiveness of Time Attention and model efficiency. Given that the long-context forecasting paradigm has received less attention in the community, partly because performance saturation on previous benchmarks (Makridakis et al., 2020; Wu et al., 2022) can conceal its benefits, we establish new long-context forecasting benchmarks. Detailed experimental configurations are provided in Appendix B.
Researcher Affiliation | Academia | Yong Liu, Guo Qin, Xiangdong Huang, Jianmin Wang, Mingsheng Long, School of Software, BNRist, Tsinghua University, Beijing 100084, China
Pseudocode | No | The paper describes the methodology using mathematical equations and descriptive text, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at this repository: https://github.com/thuml/Timer-XL.
Open Datasets | Yes | We conduct experiments on well-acknowledged benchmarks to evaluate performance of the proposed Timer-XL, which includes (1) ETT (Zhou et al., 2021) [...] (9) GTWSF (Wu et al., 2023) is a dataset collected from the National Centers for Environmental Information (NCEI). [...] (10) UTSD (Liu et al., 2024c) is a multi-domain time series dataset, which includes seven domains with a hierarchy of four volumes. We adopt the largest volume that encompasses 1 billion time points for pre-training.
Dataset Splits | Yes | We follow the same data processing and train-validation-test split protocol used in TimesNet (Wu et al., 2022), where the train, validation, and test datasets are divided according to chronological order to prevent data leakage. Detailed dataset descriptions and prediction settings are provided in Table 9.
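The chronological split protocol quoted above can be sketched as follows. This is a minimal illustration of splitting by time order rather than by random shuffling; the 70/10/20 ratios are assumptions for the example, not values taken from the paper.

```python
import numpy as np

def chronological_split(series, train_ratio=0.7, val_ratio=0.1):
    """Split a time series into train/val/test segments in chronological
    order, so no future data leaks into earlier splits.

    Ratios are illustrative placeholders, not the paper's exact settings.
    """
    n = len(series)
    train_end = int(n * train_ratio)
    val_end = int(n * (train_ratio + val_ratio))
    # Earlier time points go to train, the middle to validation,
    # and the most recent points to test.
    return series[:train_end], series[train_end:val_end], series[val_end:]

data = np.arange(100)  # stand-in for a univariate series of 100 time points
train, val, test = chronological_split(data)
```

Because the segments are contiguous in time, every training point precedes every validation point, which in turn precedes every test point.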
Hardware Specification | Yes | All the experiments are implemented by PyTorch (Paszke et al., 2019) on NVIDIA A100 Tensor Core GPUs.
Software Dependencies | No | All the experiments are implemented by PyTorch (Paszke et al., 2019) on NVIDIA A100 Tensor Core GPUs. We employ the Adam optimizer (Kingma & Ba, 2014) and MSE loss for model optimization. A specific version number for PyTorch is not provided in the text.
Experiment Setup | Yes | Detailed experimental configurations are provided in Table 11. We employ the Adam optimizer (Kingma & Ba, 2014) and MSE loss for model optimization.
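The optimization setup stated above (Adam optimizer with MSE loss in PyTorch) can be sketched as a single training step. The model, batch shapes, and learning rate here are placeholders for illustration; they are not the paper's architecture or hyperparameters.

```python
import torch
import torch.nn as nn

# Stand-in forecaster: maps a 96-step context window to a 24-step horizon.
# A placeholder for Timer-XL, which is not reproduced here.
model = nn.Linear(96, 24)

# Adam + MSE, as stated in the assessed paper; lr is an assumed value.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()

x = torch.randn(32, 96)  # dummy batch of input windows
y = torch.randn(32, 24)  # dummy forecast targets

# One optimization step: forward pass, MSE loss, backward pass, update.
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```

Without a pinned PyTorch version in the paper, any recent release supporting `torch.optim.Adam` and `nn.MSELoss` would suffice for this sketch.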