Temporal horizons in forecasting: a performance-learnability trade-off
Authors: Pau Vilimelis Aceituno, Jack William Miller, Noah Marti, Youssef Farag, Victor Boussange
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theory through numerical experiments and discuss practical implications for selecting training horizons. Our results provide a principled foundation for hyperparameter optimization in autoregressive forecasting models. |
| Researcher Affiliation | Academia | 1Institute of Neuroinformatics, ETH Zürich and University of Zürich, Winterthurerstrasse 190, Zürich 8057, Switzerland 2School of Computing, Australian National University, 108 North Rd, Acton ACT 2601, Australia 3Unit of Land Change Science, Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Zürcherstrasse 111, Birmensdorf 8903, Switzerland |
| Pseudocode | Yes | Algorithm 1 Iterative scheduling of k and η |
| Open Source Code | No | The paper does not explicitly state that the authors are releasing source code for the methodology described in this paper. While it mentions the use of 'Chaotic Inference Julia library' by Noah Marti, this is a third-party tool used by the authors, not their own implementation code for the paper's methods. |
| Open Datasets | Yes | The ClimSim dataset (see appendix D for details), which is a complex deterministic simulation of weather patterns. The National Oceanic and Atmospheric Administration Sea Surface Temperature dataset (NOAA SST, Huang et al. (2021)), which comes from real-world measurements and thus contains noise. The AMZN (Amazon) stocks from 1997 to 2017 (Kouroupetroglou, 2019), which is a short dataset that is noisy and non-stationary. |
| Dataset Splits | Yes | We split the training and validation datasets by year: 2000 to 2009 was used for training and 2011 to 2017 was used for validation. ... The training and validation sets were taken from non-overlapping year-long periods. |
| Hardware Specification | Yes | We use less than 34000 core hours, or equivalent time of 1888 hours on a single V100 GPU. |
| Software Dependencies | No | The paper mentions using 'Julia using Chaotic Inference Marti (2024)' and 'Python using diffrax (Kidger, 2021)' and refers to 'Differential Equations.jl a performant and feature-rich ecosystem for solving differential equations in Julia (Rackauckas & Nie, 2017)'. However, specific version numbers for these software packages or programming languages are not provided, preventing full reproducibility. |
| Experiment Setup | Yes | We trained residual MLPs with different training temporal horizons for the four dynamical systems presented in appendix C until they appeared to reach convergence or a large total wall time cutoff. ... As a general rule, we tried to maintain some consistency between the hyperparameter choices for various datasets and architectures (for example we typically used a batch size of 512). ... The scheme, detailed in algorithm 1, seeks to automatically choose T (or equivalently k, the number of forecasting steps) and η to overcome a common difficulty in gradient descent: the presence of plateaus. ... Initialize: k ← 1, η ← η0, θ_prev ← θ0, γ ← 1.5 × 10⁻⁴, look ← True, s = k_max/T_max, succeeded ← False |
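The experiment-setup row describes a scheme (Algorithm 1 in the paper) that starts training at a short horizon and jointly grows the number of forecasting steps k while adjusting the learning rate η when optimization plateaus. The paper's exact plateau test and update rules are not quoted here, so the following is only a minimal sketch under stated assumptions: the function name `schedule_k_eta`, the plateau criterion (parameter change below the threshold γ), and the halving of η are all hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def schedule_k_eta(train_step, theta0, eta0=1e-3, k_max=16,
                   gamma=1.5e-4, max_iters=500):
    """Hypothetical sketch of iteratively scheduling the forecast
    horizon k and learning rate eta (cf. Algorithm 1 of the paper).

    Training begins with horizon k = 1; whenever progress plateaus
    (parameter change below gamma, an assumed criterion), the horizon
    is lengthened and the learning rate reduced (assumed halving).
    """
    k, eta = 1, eta0
    theta_prev = np.asarray(theta0, dtype=float).copy()
    history = []
    for _ in range(max_iters):
        # one optimization step on the k-step forecasting objective
        theta = train_step(theta_prev, k, eta)
        change = np.linalg.norm(theta - theta_prev)
        if change < gamma and k < k_max:
            k += 1        # plateau detected: lengthen the horizon
            eta *= 0.5    # shrink the step size for the harder objective
        theta_prev = theta
        history.append((k, eta))
    return theta_prev, history
```

As a toy usage example, `train_step` can be a single gradient step on a quadratic loss; the recorded history then shows k climbing from 1 toward `k_max` as the iterates converge and plateaus are detected.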