Temporal horizons in forecasting: a performance-learnability trade-off
Authors: Pau Vilimelis Aceituno, Jack William Miller, Noah Marti, Youssef Farag, Victor Boussange
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theory through numerical experiments and discuss practical implications for selecting training horizons. Our results provide a principled foundation for hyperparameter optimization in autoregressive forecasting models. |
| Researcher Affiliation | Academia | 1Institute of Neuroinformatics, ETH Zürich and University of Zürich, Winterthurerstrasse 190, Zürich 8057, Switzerland 2School of Computing, Australian National University, 108 North Rd, Acton ACT 2601, Australia 3Unit of Land Change Science, Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Zürcherstrasse 111, Birmensdorf 8903, Switzerland |
| Pseudocode | Yes | Algorithm 1 Iterative scheduling of k and η |
| Open Source Code | No | The paper does not explicitly state that the authors are releasing source code for the methodology described in this paper. While it mentions the use of 'Chaotic Inference Julia library' by Noah Marti, this is a third-party tool used by the authors, not their own implementation code for the paper's methods. |
| Open Datasets | Yes | The ClimSim dataset (see appendix D for details), which is a complex deterministic simulation of weather patterns. The National Oceanic and Atmospheric Administration Sea Surface Temperature dataset (NOAA SST, Huang et al. (2021)), which comes from real-world measurements and thus contains noise. The AMZN (Amazon) stocks from 1997 to 2017 (Kouroupetroglou, 2019), which is a short dataset that is noisy and non-stationary. |
| Dataset Splits | Yes | We split the training and validation datasets by year: 2000 to 2009 was used for training and 2011 to 2017 was used for validation. ... The training and validation sets were taken from non-overlapping year-long periods. |
| Hardware Specification | Yes | We use less than 34000 core hours, or equivalent time of 1888 hours on a single V100 GPU. |
| Software Dependencies | No | The paper mentions using 'Julia using Chaotic Inference Marti (2024)' and 'Python using diffrax (Kidger, 2021)' and refers to 'Differential Equations.jl a performant and feature-rich ecosystem for solving differential equations in Julia (Rackauckas & Nie, 2017)'. However, specific version numbers for these software packages or programming languages are not provided, preventing full reproducibility. |
| Experiment Setup | Yes | We trained residual MLPs with different training temporal horizons for the four dynamical systems presented in appendix C until they appeared to reach convergence or a large total wall time cutoff. ... As a general rule, we tried to maintain some consistency between the hyperparameter choices for various datasets and architectures (for example we typically used a batch size of 512). ... The scheme, detailed in algorithm 1, seeks to automatically choose T (or equivalently k, the number of forecasting steps) and η to overcome a common difficulty in gradient descent: the presence of plateaus. ... Initialize: k ← 1, η ← η0, θ_prev ← θ0, γ ← 1.5 × 10⁻⁴, look ← True, s = k_max/T_max, succeeded ← False |
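The experiment-setup row describes a scheme (Algorithm 1 in the paper) that starts training at a short horizon and jointly grows the number of forecasting steps k while adjusting the learning rate η when optimization plateaus. The paper's exact plateau test and update rules are not quoted here, so the following is only a minimal sketch under stated assumptions: the function name `schedule_k_eta`, the plateau criterion (parameter change below the threshold γ), and the halving of η are all hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def schedule_k_eta(train_step, theta0, eta0=1e-3, k_max=16,
                   gamma=1.5e-4, max_iters=500):
    """Hypothetical sketch of iteratively scheduling the forecast
    horizon k and learning rate eta (cf. Algorithm 1 of the paper).

    Training begins with horizon k = 1; whenever progress plateaus
    (parameter change below gamma, an assumed criterion), the horizon
    is lengthened and the learning rate reduced (assumed halving).
    """
    k, eta = 1, eta0
    theta_prev = np.asarray(theta0, dtype=float).copy()
    history = []
    for _ in range(max_iters):
        # one optimization step on the k-step forecasting objective
        theta = train_step(theta_prev, k, eta)
        change = np.linalg.norm(theta - theta_prev)
        if change < gamma and k < k_max:
            k += 1        # plateau detected: lengthen the horizon
            eta *= 0.5    # shrink the step size for the harder objective
        theta_prev = theta
        history.append((k, eta))
    return theta_prev, history
```

As a toy usage example, `train_step` can be a single gradient step on a quadratic loss; the recorded history then shows k climbing from 1 toward `k_max` as the iterates converge and plateaus are detected.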