Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Winner-takes-all for Multivariate Probabilistic Time Series Forecasting
Authors: Adrien Cortés, Rémi Rehm, Victor Letzelter
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide insights into our approach using synthetic data and evaluate it on real-world time series, demonstrating its promising performance at a light computational cost. In this section, we empirically validate our method, with experiments on real-world time series. The goal is to compare TimeMCL with state-of-the-art probabilistic time series forecasters, emphasizing its balance between quantization, predictive performance, and computational efficiency. |
| Researcher Affiliation | Collaboration | ¹LTCI, Télécom Paris, Institut Polytechnique de Paris, France; ²Valeo.ai, Paris, France. |
| Pseudocode | No | The paper describes the training scheme in Section 4.1 using numbered steps, but it is a descriptive explanation rather than a structured pseudocode block or algorithm. |
| Open Source Code | Yes | Code available at https://github.com/Victorletzelter/timeMCL. |
| Open Datasets | Yes | Our approach is evaluated on six well-established benchmark datasets taken from Gluonts library (Alexandrov et al., 2020), preprocessed exactly as in Salinas et al. (2019); Rasul et al. (2021a). |
| Dataset Splits | Yes | Note that for each dataset, we used the official train/test split. We dedicate 10 times the number of prediction steps for validation, at the end of the training data. |
| Hardware Specification | Yes | Inference time was computed on a single NVIDIA GeForce RTX 2080 Ti, while making sure it is the only process that runs on the node. |
| Software Dependencies | No | The paper mentions software like PyTorch, Gluonts, fvcore, and statsmodels but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | Training is conducted using the Adam optimizer with a learning rate of 10⁻³, following a ReduceLROnPlateau scheduler (see PyTorch documentation), with a weight decay of 10⁻⁸ and 200 training epochs. Additionally, a separate validation split is used with a temporal size equal to 10 times the prediction length. All models are trained using gradient norm clipping with a threshold of 10. |
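The training configuration quoted in the Experiment Setup and Dataset Splits rows can be sketched in PyTorch. This is a hedged illustration only: the `Linear` model, the synthetic loss, and the `prediction_length` value are placeholders, not the paper's actual forecaster or datasets; only the optimizer, scheduler, weight decay, epoch count, and gradient-clipping values come from the quoted text.

```python
import torch

# Placeholder stand-in for the paper's forecaster (hypothetical model).
model = torch.nn.Linear(8, 8)

# Values from the quoted setup: Adam, lr 1e-3, weight decay 1e-8.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-8)
# "LR on plateau" scheduler, per the PyTorch documentation reference.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)

# Validation window: 10x the prediction length (prediction_length is
# a hypothetical example value; it varies per dataset).
prediction_length = 24
validation_length = 10 * prediction_length

# The paper trains for 200 epochs; a few iterations suffice to sketch the loop.
for epoch in range(5):
    loss = model(torch.randn(4, 8)).pow(2).mean()  # synthetic placeholder loss
    optimizer.zero_grad()
    loss.backward()
    # Gradient norm clipping with threshold 10, as described.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
    optimizer.step()
    scheduler.step(loss.item())  # plateau scheduler steps on a monitored metric
```

The plateau scheduler is stepped with the monitored loss rather than unconditionally each epoch, which is what "LR on plateau" implies in the quoted setup.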