Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Winner-takes-all for Multivariate Probabilistic Time Series Forecasting

Authors: Adrien Cortés, Rémi Rehm, Victor Letzelter

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We provide insights into our approach using synthetic data and evaluate it on real-world time series, demonstrating its promising performance at a light computational cost. In this section, we empirically validate our method, with experiments on real-world time series. The goal is to compare TimeMCL with state-of-the-art probabilistic time series forecasters, emphasizing its balance between quantization, predictive performance, and computational efficiency.
Researcher Affiliation Collaboration ¹LTCI, Télécom Paris, Institut Polytechnique de Paris, France; ²Valeo.ai, Paris, France.
Pseudocode No The paper describes the training scheme in Section 4.1 using numbered steps, but it is a descriptive explanation rather than a structured pseudocode block or algorithm.
Open Source Code Yes Code available at https://github.com/Victorletzelter/timeMCL.
Open Datasets Yes Our approach is evaluated on six well-established benchmark datasets taken from Gluonts library (Alexandrov et al., 2020), preprocessed exactly as in Salinas et al. (2019); Rasul et al. (2021a).
Dataset Splits Yes Note for each dataset, we used the official train/test split. We dedicate 10 times the number of prediction steps for validation, at the end of the training data.
Hardware Specification Yes Inference time was computed on a single NVIDIA GeForce RTX 2080 Ti, while making sure it is the only process that runs on the node.
Software Dependencies No The paper mentions software like PyTorch, Gluonts, fvcore, and statsmodels but does not provide specific version numbers for these dependencies.
Experiment Setup Yes Training is conducted using the Adam optimizer with a learning rate of 10⁻³, following a reduce-LR-on-plateau scheduler (see the PyTorch documentation), a weight decay of 10⁻⁸, and 200 training epochs. Additionally, a separate validation split is used with a temporal size equal to 10 times the prediction length. All models are trained using gradient norm clipping with a threshold of 10.
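The split rule quoted above (a validation window of 10 times the prediction length, taken from the end of the training data) can be illustrated with a toy sketch. This is an assumption-laden illustration, not the paper's code; `train_val_split` is a hypothetical helper operating on a plain Python list.

```python
def train_val_split(series, prediction_length):
    """Reserve 10x the prediction length at the end of the training
    data for validation, per the split protocol quoted above."""
    val_size = 10 * prediction_length
    return series[:-val_size], series[-val_size:]

# Example: 100 observations, prediction length 3 -> last 30 for validation
train, val = train_val_split(list(range(100)), prediction_length=3)
# len(train) == 70, len(val) == 30
```

The same slicing generalizes to multivariate arrays by cutting along the time axis.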
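The training hyperparameters reported above can be sketched in PyTorch. This is a minimal illustration of the stated configuration only; the `Linear` model and the dummy loss are hypothetical stand-ins, not the paper's architecture or objective.

```python
import torch

# Hypothetical stand-in model; the paper's architecture is not reproduced here.
model = torch.nn.Linear(10, 1)

# Adam with lr 10^-3 and weight decay 10^-8, plus a reduce-on-plateau schedule.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-8)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)

for epoch in range(200):  # 200 training epochs, as reported
    optimizer.zero_grad()
    # Dummy forward pass and loss in place of the paper's objective.
    loss = model(torch.randn(4, 10)).pow(2).mean()
    loss.backward()
    # Gradient norm clipping with a threshold of 10.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
    optimizer.step()
    # In practice the scheduler would step on the validation loss each epoch:
    # scheduler.step(val_loss)
```

Because the scheduler only reduces the learning rate when the monitored validation loss plateaus, the step call is shown commented out where a real validation loop would supply it.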