TimeBridge: Non-Stationarity Matters for Long-term Time Series Forecasting

Authors: Peiyuan Liu, Beiliang Wu, Yifan Hu, Naiqi Li, Tao Dai, Jigang Bao, Shu-Tao Xia

ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that TimeBridge consistently achieves state-of-the-art performance in both short-term and long-term forecasting. Additionally, TimeBridge demonstrates exceptional performance in financial forecasting on the CSI 500 and S&P 500 indices, further validating its robustness and effectiveness.
Researcher Affiliation | Academia | ¹Tsinghua Shenzhen International Graduate School, ²Shenzhen University. Correspondence to: Tao Dai <EMAIL>, Naiqi Li <EMAIL>.
Pseudocode | No | The paper describes the TimeBridge framework with components like Patch Embedding, Integrated Attention, Patch Downsampling, and Cointegrated Attention. However, it does not present these components or the overall method in a structured pseudocode or algorithm block. The methodology is explained through descriptive text and diagrams.
Open Source Code | Yes | Code is available at https://github.com/Hank0626/TimeBridge.
Open Datasets | Yes | We conduct long-term forecasting experiments on several widely-used real-world datasets, including the Electricity Transformer Temperature (ETT) dataset with its four subsets (ETTh1, ETTh2, ETTm1, ETTm2) (Wu et al., 2021; Miao et al., 2024a), as well as Weather, Electricity, Traffic, and Solar (Liu et al., 2025a;b). These datasets exhibit strong non-stationary characteristics, detailed in Appendix D. Following previous works (Zhou et al., 2021; Wu et al., 2021), we use Mean Square Error (MSE) and Mean Absolute Error (MAE) as evaluation metrics. We set the input length I to 720 for our method. For other baselines, we adopt the setting that searches for the optimal input length I and other hyperparameters. Details of the metric and the searching process can be found in Appendix C.1 and Appendix F.1.
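As a concrete illustration of the evaluation metrics named above, here is a minimal NumPy sketch (the function names `mse` and `mae` are my own, not taken from the paper's codebase):

```python
import numpy as np

def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean Square Error over all forecast points."""
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean Absolute Error over all forecast points."""
    return float(np.mean(np.abs(y_true - y_pred)))

# Toy example: a length-4 forecast against ground truth
truth = np.array([1.0, 2.0, 3.0, 4.0])
pred = np.array([1.5, 2.0, 2.5, 4.0])
print(mse(truth, pred))  # 0.125
print(mae(truth, pred))  # 0.25
```

In long-term forecasting benchmarks, both metrics are typically averaged over every variable and every step of the prediction horizon, which is what the flat `np.mean` over the whole array does here.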
Dataset Splits | Yes | Dataset Size denotes the total number of time points in the (Train, Validation, Test) splits respectively. Prediction Length denotes the future time points to be predicted. Frequency denotes the sampling interval of time points.
Hardware Specification | Yes | All experiments are implemented in PyTorch (Paszke et al., 2019) and conducted on two NVIDIA RTX 3090 24GB GPUs.
Software Dependencies | No | All experiments are implemented in PyTorch (Paszke et al., 2019) and conducted on two NVIDIA RTX 3090 24GB GPUs. We use the Adam optimizer (Kingma, 2014) with a learning rate selected from {1e-3, 1e-4, 5e-4}. While PyTorch is mentioned, a specific version number for the library itself is not provided. Adam is an algorithm, not a software dependency with a version number.
Experiment Setup | Yes | We use the Adam optimizer (Kingma, 2014) with a learning rate selected from {1e-3, 1e-4, 5e-4}. The number of patches N is set accordingly to different datasets. We adopt a hybrid MAE loss that operates in both the time and frequency domains for stable training (Wang et al., 2024a). For additional details on hyperparameter settings and loss function, please refer to Appendix E. Table 8 details hyperparameter settings for different datasets, including lr, d_model, d_ff, and alpha.
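The hybrid MAE loss is attributed to Wang et al. (2024a) and is not spelled out in this excerpt, so the following is only a plausible NumPy sketch of a time-plus-frequency MAE combined with a balancing weight `alpha` (the function name and the exact form of the frequency term are assumptions, though `alpha` is listed among the paper's hyperparameters):

```python
import numpy as np

def hybrid_mae_loss(pred: np.ndarray, target: np.ndarray, alpha: float = 0.5) -> float:
    """Weighted sum of a time-domain MAE and a frequency-domain MAE.

    alpha balances the two terms; it would be tuned per dataset,
    analogous to the `alpha` hyperparameter in the paper's Table 8.
    """
    # Time-domain term: ordinary MAE on the raw series
    time_term = np.mean(np.abs(pred - target))
    # Frequency-domain term: MAE between magnitudes of the real FFT,
    # penalizing mismatched spectral content (one possible choice)
    freq_term = np.mean(np.abs(np.abs(np.fft.rfft(pred)) - np.abs(np.fft.rfft(target))))
    return float(alpha * time_term + (1.0 - alpha) * freq_term)

# Toy example: a sine wave vs. a vertically shifted copy
series = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))
noisy = series + 0.1
loss = hybrid_mae_loss(noisy, series, alpha=0.5)
```

A frequency-domain term of this kind is often added because a pure time-domain MAE can be minimized by over-smoothed forecasts that miss periodic structure; matching spectral magnitudes discourages that failure mode.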