In-context Time Series Predictor

Authors: Jiecheng Lu, Yan Sun, Shihao Yang

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We conduct comprehensive experiments under full-data, few-shot, and zero-shot settings using widely-used TSF datasets (details in A.2), including ETTs (Zhou et al., 2021), Traffic, Electricity (ECL), and Weather. We use K = 3 TF layers with d = 128 and 8 heads. We set LI = 1440, Lb = 512, and LP ∈ {96, 192, 336, 720}, performing 4 experiments for each dataset.
Researcher Affiliation Academia Jiecheng Lu, Yan Sun, Shihao Yang; Georgia Institute of Technology; EMAIL, EMAIL
Pseudocode No The paper describes the model architecture and processes using mathematical formulations (e.g., Equation 1 for Transformer layers, Equations 6-10 for Token Retrieval) but does not contain a dedicated section or figure presenting pseudocode or an algorithm block.
Open Source Code Yes Code implementation is available at: https://anonymous.4open.science/r/ICTSP-C995
Open Datasets Yes Our main TSF experiments are conducted based on commonly used time series forecasting datasets, detailed as follows: ETT Datasets (Zhou et al., 2021): This dataset includes... (https://github.com/zhouhaoyi/ETDataset). Electricity Dataset: This dataset covers... (https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014). Traffic Dataset: Sourced from... (http://pems.dot.ca.gov/). Weather Dataset: This dataset captures... (https://www.bgc-jena.mpg.de/wetter/).
Dataset Splits Yes In the full-data experiment setting, we split each dataset with a 70% training set, 10% validation set, and 20% test set.
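The 70/10/20 split reported above is the standard chronological split for these TSF benchmarks. A minimal sketch of such a split (function name and use of NumPy are illustrative assumptions, not the paper's code):

```python
import numpy as np

def chronological_split(series, train_frac=0.7, val_frac=0.1):
    """Split a time series chronologically into train/val/test segments.

    The remaining fraction (here 20%) becomes the test set; order is
    preserved so no future data leaks into earlier splits.
    """
    n = len(series)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return (series[:n_train],
            series[n_train:n_train + n_val],
            series[n_train + n_val:])

data = np.arange(100)  # toy stand-in for a real dataset
train, val, test = chronological_split(data)
print(len(train), len(val), len(test))  # 70 10 20
```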
Hardware Specification Yes Our models are trained on a single Nvidia RTX 4090 GPU with a batch size of 32 for most of the datasets.
Software Dependencies No The ICTSP model is trained using the Adam optimizer and MSE loss in PyTorch, with a learning rate of 0.0005 for each dataset. The paper mentions using PyTorch but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup Yes We use K = 3 TF layers with d = 128 and 8 heads. We set LI = 1440, Lb = 512, and LP ∈ {96, 192, 336, 720}, performing 4 experiments for each dataset. We use sampling step m = 8 and the token retrieval method with q = 10%, r = 30 in main experiments. The ICTSP model is trained using the Adam optimizer and MSE loss in PyTorch, with a learning rate of 0.0005 for each dataset. We test the model every 200 training steps with an early-stopping patience of 30 × 200 steps. The first 1000 steps are for learning rate warm-up, followed by a linear decay of learning rate. We set the random seed as 2024.
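The warm-up-then-linear-decay schedule described above can be sketched as a step-indexed multiplier on the base rate of 0.0005. The paper specifies 1000 warm-up steps but not the total step budget, so `total_steps` below is an assumed placeholder:

```python
def lr_multiplier(step, warmup=1000, total_steps=20000):
    """Linear warm-up for the first `warmup` steps, then linear decay to 0.

    `total_steps` is an assumed value; the paper does not report the
    total number of training steps.
    """
    if step < warmup:
        return step / warmup
    return max(0.0, (total_steps - step) / (total_steps - warmup))

base_lr = 0.0005
print(base_lr * lr_multiplier(500))    # mid warm-up: half the base rate
print(base_lr * lr_multiplier(1000))   # peak: full base rate
print(base_lr * lr_multiplier(20000))  # end of schedule: 0.0
```

In a PyTorch training loop this multiplier would typically be wrapped in `torch.optim.lr_scheduler.LambdaLR` around the Adam optimizer; the early-stopping rule then halts training once validation loss fails to improve for 30 consecutive evaluations (one every 200 steps).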