Battling the Non-stationarity in Time Series Forecasting via Test-time Adaptation

Authors: HyunGi Kim, Siwon Kim, Jisoo Mok, Sungroh Yoon

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on diverse benchmark datasets and cutting-edge architectures demonstrate the efficacy and generality of TAFAS, especially in long-term forecasting scenarios that suffer from significant distribution shifts."
Researcher Affiliation | Academia | Hyun Gi Kim¹, Siwon Kim¹, Jisoo Mok¹, Sungroh Yoon¹·²·³. ¹Department of Electrical and Computer Engineering, Seoul National University; ²Interdisciplinary Program in Artificial Intelligence, Seoul National University; ³AIIS, ASRI, and INMC, Seoul National University. EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology in detail using text and mathematical equations, and refers to an appendix for a summary of the overall pipeline, but it does not contain a clearly labeled pseudocode or algorithm block in the main text.
Open Source Code | Yes | Code: https://github.com/kimanki/TAFAS
Open Datasets | Yes | "We demonstrate the effectiveness of TAFAS using the seven widely used multivariate TSF benchmark datasets: ETTh1, ETTm1, ETTh2, ETTm2, Exchange, Illness, and Weather (Wu et al. 2021)."
Dataset Splits | Yes | "We split datasets in chronological order with the ratio of (0.6, 0.2, 0.2) for ETTh1, ETTm1, ETTh2, and ETTm2 and (0.7, 0.1, 0.2) for Exchange, Illness, and Weather to construct train, validation, and test sets."
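The chronological split quoted above can be sketched as follows. This is a minimal illustration of splitting a series by time order with the stated ratios, not the authors' released code; the function name and array layout (`[T, C]`, time by channels) are assumptions for the example.

```python
import numpy as np

def chronological_split(series: np.ndarray, ratios=(0.6, 0.2, 0.2)):
    """Split a series of shape [T, C] into train/val/test in time order.

    Earlier timesteps go to train, later ones to val and test, so no
    future information leaks into the training portion.
    """
    assert abs(sum(ratios) - 1.0) < 1e-9, "ratios must sum to 1"
    T = len(series)
    n_train = int(T * ratios[0])
    n_val = int(T * ratios[1])
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test

# Toy example: 100 timesteps, 1 channel, split (0.6, 0.2, 0.2)
data = np.arange(100, dtype=float).reshape(100, 1)
train, val, test = chronological_split(data)
print(len(train), len(val), len(test))  # 60 20 20
```

For Exchange, Illness, and Weather the paper uses `ratios=(0.7, 0.1, 0.2)` instead.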
Hardware Specification | No | The paper does not provide specific details on the hardware used for experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions various TSF architectures used (e.g., iTransformer, DLinear, FreTS) and normalization modules (RevIN, Dish-TS, SAN), but it does not specify any software libraries or packages with their version numbers.
Experiment Setup | Yes | "We use the look-back window length L = 36 for Illness and L = 96 for the other datasets. For forecasting window length H, we evaluate on 4 different lengths, H ∈ {24, 36, 48, 60} for Illness and H ∈ {96, 192, 336, 720} for the other datasets. We repeat each pre-training run over three different seeds and select the pre-trained source forecaster with the lowest average validation MSE. More details on training processes are provided in Appendix."
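The look-back/forecast windowing described in the setup row can be sketched as a sliding-window pair builder. This is an assumed, generic construction (function name and shapes are illustrative, not from the TAFAS repository): each sample pairs an input window of length L with the immediately following target window of length H.

```python
import numpy as np

def make_windows(series: np.ndarray, L: int = 96, H: int = 96):
    """Build (look-back, horizon) pairs from a series of shape [T, C].

    Returns X of shape [N, L, C] and Y of shape [N, H, C], where
    N = T - L - H + 1 and Y[i] directly follows X[i] in time.
    """
    X, Y = [], []
    for t in range(len(series) - L - H + 1):
        X.append(series[t:t + L])
        Y.append(series[t + L:t + L + H])
    return np.stack(X), np.stack(Y)

# Toy example: 500 timesteps, 7 channels, L = 96, H = 96
data = np.random.randn(500, 7)
X, Y = make_windows(data, L=96, H=96)
print(X.shape, Y.shape)  # (309, 96, 7) (96, 7) per horizon sample
```

For Illness one would use `L=36` with `H` in {24, 36, 48, 60}, matching the quoted configuration.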