Sundial: A Family of Highly Capable Time Series Foundation Models

Authors: Yong Liu, Guo Qin, Zhiyuan Shi, Zhi Chen, Caiyin Yang, Xiangdong Huang, Jianmin Wang, Mingsheng Long

ICML 2025

Reproducibility checklist (Variable / Result / LLM Response):
Research Type: Experimental
"We evaluate Sundial on best-recognized zero-shot forecasting benchmarks (Section 5.1) and investigate the scaling behavior of Sundial (Section 5.2). We compare TimeFlow with other training objectives (Section 5.3). We delve into test-time calibration of generative forecasters (Section 5.4). We conduct model adaptation of Sundial, i.e., instruction tuning (Section 5.5), and provide in-depth ablation studies to evaluate our modular enhancement (Section 5.6)."
Researcher Affiliation: Academia
"School of Software, BNRist, Tsinghua University. Yong Liu <EMAIL>. Guo Qin <EMAIL>. Correspondence to: Mingsheng Long <EMAIL>."
Pseudocode: Yes
"Algorithm 1 TimeFlow Loss: Sampling
Require: condition h_i ∈ R^D, path steps K.
1: Sample initial noise ŷ_i ~ N(0, I).
2: Δt = 1/K
3: for k in {0, 1, ..., K−1} do
4:   ŷ_i ← ŷ_i + FM-Net(ŷ_i, k·Δt, h_i) · Δt
5: end for
6: Return: ŷ_i"
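The sampling loop above is a plain fixed-step Euler integration of the learned flow from Gaussian noise to a forecast sample. A minimal NumPy sketch follows; `toy_fm_net` is a hypothetical stand-in for the paper's trained FM-Net, used only so the loop runs end to end.

```python
import numpy as np

def sample_time_flow(condition, fm_net, K=50, dim=16, rng=None):
    """Euler-integrate the flow from noise to a sample, mirroring Algorithm 1.

    fm_net(y, t, condition) predicts a velocity; K steps of size 1/K are taken.
    """
    rng = np.random.default_rng(rng)
    y = rng.standard_normal(dim)          # step 1: initial noise ~ N(0, I)
    dt = 1.0 / K                          # step 2: uniform path step
    for k in range(K):                    # steps 3-5: Euler updates
        y = y + fm_net(y, k * dt, condition) * dt
    return y                              # step 6

# Hypothetical velocity field: pull the sample toward the condition vector.
def toy_fm_net(y, t, h):
    return h - y

h = np.ones(16)
sample = sample_time_flow(h, toy_fm_net, K=50, dim=16, rng=0)
```

With a real FM-Net, repeating this loop from different initial noise yields the multiple forecast trajectories that the paper's test-time calibration operates on.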
Open Source Code: Yes
"Code is available at: https://github.com/thuml/Sundial."
Open Datasets: Yes
"We collected and curated TimeBench, which comprises over a trillion time points from various sources, as shown in Figure 3. Several datasets originate from research teams (Woo et al., 2024; Ansari et al., 2024; Liu et al., 2024a;b). ... The statistical details of TimeBench are summarized in Table 4. In addition to open-source datasets from research teams on time series foundation models (Woo et al., 2024; Ansari et al., 2024; Liu et al., 2024b;a), we collected substantial real-world time series from various domains such as finance, IoT, meteorology, and healthcare (Goldberger et al., 2000)."
Dataset Splits: No
"Metrics (MSE/MAE) are calculated from all predicted windows in the test split of each dataset following Liu et al. (2024a). To prevent data leakage, we exclude all datasets evaluated in Section 5.1 to make sure that Sundial conducts zero-shot forecasting."
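As an illustration of that evaluation protocol, here is a sketch of accumulating MSE/MAE over every prediction window of a test split. The stride and the `forecast_fn` interface are assumptions for the sketch, not the paper's exact implementation.

```python
import numpy as np

def rolling_eval(series, context_len, pred_len, forecast_fn, stride=1):
    """Average MSE/MAE over all (context, target) windows of a test series."""
    se, ae, n = 0.0, 0.0, 0
    for start in range(0, len(series) - context_len - pred_len + 1, stride):
        ctx = series[start:start + context_len]
        target = series[start + context_len:start + context_len + pred_len]
        pred = forecast_fn(ctx, pred_len)
        se += float(np.sum((pred - target) ** 2))  # squared errors
        ae += float(np.sum(np.abs(pred - target)))  # absolute errors
        n += pred_len
    return se / n, ae / n

# Naive last-value forecaster, just to exercise the loop.
naive = lambda ctx, F: np.full(F, ctx[-1])
mse, mae = rolling_eval(np.arange(20.0), context_len=4, pred_len=2, forecast_fn=naive)
```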
Hardware Specification: Yes
"All experiments are implemented using PyTorch (Paszke et al., 2019) and executed with 32 NVIDIA A100 GPUs."
Software Dependencies: No
"All experiments are implemented using PyTorch (Paszke et al., 2019) and executed with 32 NVIDIA A100 GPUs."
Experiment Setup: Yes
"On the FEV leaderboard (Ansari et al., 2024), which consists of short-term forecasting datasets, we train Sundial models by TimeFlow Loss with the prediction length of F = 16. For point forecasting (Wu et al., 2022) and GIFT-Eval (Aksu et al., 2024), which consist of forecasting datasets with prediction lengths ranging from 6 to 900, we train Sundial models by TimeFlow Loss with the prediction length of F = 720. ... The sampling step is fixed as K = 50. Configurations of Sundial in different sizes are provided in Table 5."
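For quick reference, the hyperparameters quoted above can be collected into a small config. The key names here are illustrative; only the values come from the excerpt.

```python
# Evaluation setups quoted from the paper; key names are illustrative.
SUNDIAL_EVAL_SETUPS = {
    "fev_leaderboard": {"prediction_length": 16, "sampling_steps": 50},
    "point_forecasting_and_gift_eval": {"prediction_length": 720, "sampling_steps": 50},
}
```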