VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters

Authors: Mouxiang Chen, Lefei Shen, Zhuo Li, Xiaoyun Joy Wang, Jianling Sun, Chenghao Liu

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments reveal intrinsic similarities between images and real-world time series, suggesting that visual models may offer a free lunch for TSF and highlight the potential for future cross-modality research. Our code is publicly available at https://github.com/Keytoyze/VisionTS. ... Comprehensive evaluations of VISIONTS on large-scale benchmarks across multiple domains demonstrate its significant forecasting performance, surpassing few-shot text-based TSF foundation models and achieving comparable or superior results to zero-shot TS-based models.
Researcher Affiliation Collaboration 1Zhejiang University 2State Street Technology (Zhejiang) Ltd 3Salesforce Research Asia. Correspondence to: Chenghao Liu <EMAIL>, Zhuo Li <EMAIL>.
Pseudocode No The paper describes the methodology in Section 3 and provides a visual representation in Figure 3, but it does not contain any explicitly structured pseudocode or algorithm blocks.
Open Source Code Yes Our code is publicly available at https://github.com/Keytoyze/VisionTS.
Open Datasets Yes We evaluate our proposed VISIONTS on large-scale benchmarks, including 8 long-term TSF (Zhou et al., 2021), 29 Monash (Godahewa et al., 2021), and 23 GIFT-Eval (Aksu et al., 2024) datasets, spanning diverse domains, frequencies, and multivariates. ... a visual masked autoencoder, pre-trained on the ImageNet dataset
Dataset Splits Yes To prevent data leakage, we selected six widely-used datasets from the long-term TSF benchmark that are not included in MOIRAI's pre-training set for evaluation. Since most baselines cannot perform zero-shot forecasting, we report their few-shot results by fine-tuning on 10% of the individual target datasets. ... We conduct hyperparameter tuning on validation sets to determine the optimal context length L, detailed in Appendix B.1.
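The few-shot protocol quoted above (fine-tuning baselines on 10% of each target dataset, with a validation set reserved for tuning the context length L) can be sketched as a chronological split. This is a minimal illustration, not the paper's code; the 70/10/20 split ratios and function names are assumptions.

```python
# Hedged sketch of a chronological few-shot split, as used for the
# baselines in the review above. The 0.7/0.1 train/val ratios and the
# helper name are illustrative assumptions; only the 10% few-shot
# fraction comes from the quoted text.

def few_shot_split(series, train_ratio=0.7, val_ratio=0.1, few_shot_frac=0.1):
    """Split a series chronologically, then keep only the first
    `few_shot_frac` of the training portion for few-shot fine-tuning."""
    n = len(series)
    train_end = int(n * train_ratio)
    val_end = int(n * (train_ratio + val_ratio))
    train = series[:train_end]
    val = series[train_end:val_end]        # used to tune context length L
    test = series[val_end:]
    few_shot_train = train[: max(1, int(len(train) * few_shot_frac))]
    return few_shot_train, val, test

series = list(range(100))                  # toy univariate series
fs_train, val, test = few_shot_split(series)
# fs_train keeps 7 of the 70 training points (10%)
```

Splitting chronologically (rather than randomly) avoids the leakage concern the review highlights: future values never appear in the fine-tuning subset.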
Hardware Specification Yes All experiments are conducted using Time-Series-Library (https://github.com/thuml/Time-Series-Library) and GluonTS library (Alexandrov et al., 2020) on an NVIDIA A800 GPU.
Software Dependencies No The paper mentions 'Time-Series-Library' and 'GluonTS library' but does not specify their version numbers.
Experiment Setup Yes We conduct hyperparameter tuning on validation sets to determine the optimal context length L. ... We set the hyperparameters to r = c = 0.4. ... We use an Adam optimizer with a learning rate 0.0001 and a batch size 256 to fine-tune MAE. All experiments are repeated three times. The training epoch is one for all the datasets except Illness, for which we train MAE for 100 epochs with an early stop due to the limited training dataset scale.
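The fine-tuning schedule quoted above (one epoch for most datasets, up to 100 epochs with early stopping for Illness) can be made concrete with a small sketch. This is illustrative only: the helper names and the patience value are assumptions, and the actual MAE fine-tuning loop lives in the authors' repository.

```python
# Hedged sketch of the epoch budget and early-stopping logic described
# in the setup: Adam with lr 1e-4 and batch size 256, one epoch for all
# datasets except Illness (100 epochs with early stopping). The patience
# value and class/function names are illustrative assumptions.

LEARNING_RATE = 1e-4
BATCH_SIZE = 256

def epochs_for(dataset: str) -> int:
    """Epoch budget per dataset, following the quoted setup."""
    return 100 if dataset == "Illness" else 1

class EarlyStopper:
    """Stop when validation loss fails to improve for `patience` epochs."""
    def __init__(self, patience: int = 3):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Toy validation curve: improves twice, then plateaus.
stopper = EarlyStopper(patience=3)
losses = [0.9, 0.8, 0.85, 0.86, 0.87]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
# Training halts at epoch 4 after three non-improving epochs.
```

Early stopping matters here because Illness is a small dataset: a fixed 100-epoch run would likely overfit, so training halts once validation loss plateaus.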