Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Winner-takes-all for Multivariate Probabilistic Time Series Forecasting

Authors: Adrien Cortés, Rémi Rehm, Victor Letzelter

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We provide insights into our approach using synthetic data and evaluate it on real-world time series, demonstrating its promising performance at a light computational cost. In this section, we empirically validate our method, with experiments on real-world time series. The goal is to compare TimeMCL with state-of-the-art probabilistic time series forecasters, emphasizing its balance between quantization, predictive performance, and computational efficiency.
Researcher Affiliation Collaboration ¹LTCI, Télécom Paris, Institut Polytechnique de Paris, France; ²Valeo.ai, Paris, France.
Pseudocode No The paper describes the training scheme in Section 4.1 using numbered steps, but it is a descriptive explanation rather than a structured pseudocode block or algorithm.
Open Source Code Yes Code available at https://github.com/Victorletzelter/timeMCL.
Open Datasets Yes Our approach is evaluated on six well-established benchmark datasets taken from Gluonts library (Alexandrov et al., 2020), preprocessed exactly as in Salinas et al. (2019); Rasul et al. (2021a).
Dataset Splits Yes Note for each dataset, we used the official train/test split. We dedicate 10 times the number of prediction steps for validation, at the end of the training data.
Hardware Specification Yes Inference time was computed on a single NVIDIA GeForce RTX 2080 Ti, while making sure it is the only process that runs on the node.
Software Dependencies No The paper mentions software like PyTorch, Gluonts, fvcore, and statsmodels but does not provide specific version numbers for these dependencies.
Experiment Setup Yes Training is conducted using the Adam optimizer with a learning rate of 10⁻³, following a reduce-LR-on-plateau scheduler (see the PyTorch documentation), a weight decay of 10⁻⁸, and 200 training epochs. Additionally, a separate validation split is used with a temporal size equal to 10 times the prediction length. All models are trained using gradient norm clipping with a threshold of 10.
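The split rule quoted above (a validation window of 10 times the prediction length, taken from the end of the training data) can be illustrated with a toy sketch. This is an assumption-laden illustration, not the paper's code; `train_val_split` is a hypothetical helper operating on a plain Python list.

```python
def train_val_split(series, prediction_length):
    """Reserve 10x the prediction length at the end of the training
    data for validation, per the split protocol quoted above."""
    val_size = 10 * prediction_length
    return series[:-val_size], series[-val_size:]

# Example: 100 observations, prediction length 3 -> last 30 for validation
train, val = train_val_split(list(range(100)), prediction_length=3)
# len(train) == 70, len(val) == 30
```

The same slicing generalizes to multivariate arrays by cutting along the time axis.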
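The training hyperparameters reported above can be sketched in PyTorch. This is a minimal illustration of the stated configuration only; the `Linear` model and the dummy loss are hypothetical stand-ins, not the paper's architecture or objective.

```python
import torch

# Hypothetical stand-in model; the paper's architecture is not reproduced here.
model = torch.nn.Linear(10, 1)

# Adam with lr 10^-3 and weight decay 10^-8, plus a reduce-on-plateau schedule.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-8)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)

for epoch in range(200):  # 200 training epochs, as reported
    optimizer.zero_grad()
    # Dummy forward pass and loss in place of the paper's objective.
    loss = model(torch.randn(4, 10)).pow(2).mean()
    loss.backward()
    # Gradient norm clipping with a threshold of 10.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
    optimizer.step()
    # In practice the scheduler would step on the validation loss each epoch:
    # scheduler.step(val_loss)
```

Because the scheduler only reduces the learning rate when the monitored validation loss plateaus, the step call is shown commented out where a real validation loop would supply it.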