Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

LETS Forecast: Learning Embedology for Time Series Forecasting

Authors: Abrar Majeedi, Viswanatha Reddy Gajjala, Satya Sai Srinath Namburi Gnvv, Nada Magdi Elkordi, Yin Li

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To evaluate our method, we conduct comprehensive experiments on synthetic data of nonlinear dynamical systems as well as real-world time series across domains. Our results show that DeepEDM is robust to input noise, and outperforms state-of-the-art methods in forecasting accuracy."
Researcher Affiliation | Academia | "1Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison; 2Department of Computer Sciences, University of Wisconsin-Madison."
Pseudocode | No | The paper describes the model architecture and its components (base predictor, encoder, kernel regression, decoder) using mathematical formulations and descriptive text, but it contains no explicit pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code is available at: https://abrarmajeedi.github.io/deep_edm."
Open Datasets | Yes | "For multivariate forecasting, we evaluate on 10 real-world datasets: ETTh1, ETTh2, ETTm1, ETTm2 (Zhou et al., 2021), National Illness (ILI) (Lai et al., 2018), Solar-Energy (Lai et al., 2018) (see appendix), Electricity (see appendix), Traffic (PeMS) (Wu et al., 2021), Weather (Wetterstation) (Wu et al., 2021), and Exchange (Lai et al., 2018). For univariate forecasting, we leverage the well-established M4 dataset (Makridakis et al., 2020) (see appendix), which contains 6 subsets of periodically collected univariate marketing data."
Dataset Splits | Yes | "Our experimental protocol adheres to the preprocessing methods and data split ratios established by prominent prior works such as TimesNet (Wu et al., 2023) and Koopa (Liu et al., 2024b). For the ETT datasets, which contain 7 sequences, we train on sequences 0–2 using only timesteps from the standard training split, and test on sequences 4–6 using the standard test split. Similarly, for the Exchange dataset (8 sequences), we train on the first 4 sequences and test on the last 4. For the Weather dataset (21 sequences), we train on sequences 0–9 and test on sequences 10–19."
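The per-sequence splits quoted above can be written out explicitly. The sketch below is an illustrative reconstruction from the quoted text only; the dataset keys, index conventions, and helper name `split_sequences` are assumptions, not taken from the paper's released code.

```python
# Illustrative reconstruction of the per-sequence train/test splits
# quoted above (indices inferred from the text, not from released code).
SEQUENCE_SPLITS = {
    # dataset: (train sequence indices, test sequence indices)
    "ETT":      (list(range(0, 3)),  list(range(4, 7))),    # 7 sequences: train 0-2, test 4-6
    "Exchange": (list(range(0, 4)),  list(range(4, 8))),    # 8 sequences: first 4 / last 4
    "Weather":  (list(range(0, 10)), list(range(10, 20))),  # 21 sequences: train 0-9, test 10-19
}

def split_sequences(dataset, sequences):
    """Partition a list of per-series arrays into train/test groups."""
    train_idx, test_idx = SEQUENCE_SPLITS[dataset]
    train = [sequences[i] for i in train_idx]
    test = [sequences[i] for i in test_idx]
    return train, test
```

Note that within each training sequence only the standard training timesteps are used, per the quote; that temporal slicing is not shown here.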
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using the "Time-Series-Library benchmarking repository" and the "AdamW optimizer" but does not specify version numbers for these or any other software dependencies (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | "In our implementation, the base predictor f(·) is instantiated as an MLP with 1 to 3 layers, each followed by a non-linear activation and dropout. The number of DeepEDM blocks is also varied between 1 and 3 based on dataset size... DeepEDM is trained for 250 epochs using the AdamW (Loshchilov & Hutter, 2017) optimizer with a learning rate of 0.0005 and a batch size of 32. Following standard practices in time-series forecasting, an early stopping mechanism based on validation set performance metrics is implemented to mitigate overfitting."
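The quoted setup (AdamW, learning rate 0.0005, batch size 32, up to 250 epochs, validation-based early stopping) can be summarized as a configuration plus a minimal early-stopping helper. This is a sketch of the described protocol, not the authors' code; in particular, the patience value is an assumption, as the excerpt does not state one.

```python
# Hyperparameters taken from the quoted excerpt; the patience value below
# is an assumed default (the excerpt does not specify one).
TRAIN_CONFIG = {
    "optimizer": "AdamW",
    "learning_rate": 5e-4,
    "batch_size": 32,
    "max_epochs": 250,
}

class EarlyStopping:
    """Stop training when validation loss fails to improve for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

A training loop would call `step` once per epoch and break out when it returns True, restoring the checkpoint saved at the best validation loss.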