On the Regularization of Learnable Embeddings for Time Series Forecasting

Authors: Luca Butera, Giovanni De Felice, Andrea Cini, Cesare Alippi

TMLR 2025

Reproducibility assessment — each entry lists the variable, the assessed result, and the LLM's supporting response:
Research Type: Experimental
  LLM response: "Specifically, we perform the first extensive empirical study on the subject and show how such regularizations consistently improve performance in widely adopted architectures." and "5 Experiments: We evaluate the effectiveness of different regularization strategies for local embeddings under three different scenarios: time series forecasting benchmarks (Sec. 5.1), transfer learning (Sec. 5.3), and a sensitivity analysis through embedding perturbations (Sec. 5.4)."
Researcher Affiliation: Academia
  LLM response: "Luca Butera (Università della Svizzera Italiana, IDSIA)", "Giovanni De Felice (University of Liverpool)", "Andrea Cini (Università della Svizzera Italiana, IDSIA)", "Cesare Alippi (Università della Svizzera Italiana, IDSIA; Politecnico di Milano)"
Pseudocode: No
  LLM response: The paper describes its methods and procedures in narrative text and mathematical equations, but does not include any clearly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code: Yes
  LLM response: "Reproducibility: Python code to reproduce the experiments is available online." The linked repository is https://github.com/LucaButera/TS-embedding-regularization
Open Datasets: Yes
  LLM response: "We consider six real-world datasets of time series collections, spanning four different application domains: METR-LA and PEMS-BAY (Li et al., 2018) as two established benchmarks for traffic forecasting, AQI (Zheng et al., 2015) from the air quality domain, CER-E (CER, 2016) from the energy consumption domain, CLM-D (De Felice et al., 2024) and EngRAD (Marisca et al., 2024) as two multivariate climatic datasets. Details on the datasets, data splits and forecasting settings can be found in Appendix A." Appendix A provides specific links and citations for all datasets, e.g., "AQI: Air quality data from Zheng et al. (2015). The dataset collects measurements of the PM2.5 pollutant from air quality stations in 43 Chinese cities and is available at https://www.microsoft.com/en-us/research/publication/forecasting-fine-grained-air-quality-based-on-big-data/"
Dataset Splits: Yes
  LLM response: "All datasets were split 70%/10%/20% into train, validation and test along the temporal axis."
Hardware Specification: Yes
  LLM response: "Computing resources: Experiments were run on A100 and A5000 NVIDIA GPUs. The vast majority of the experiments conducted in our work can be easily run on moderate GPU hardware, with at least 8 GBs of VRAM."
Software Dependencies: No
  LLM response: "We used the Python (Van Rossum & Drake, 2009) programming language, leveraging Torch Spatiotemporal (Cini & Marisca, 2022), PyTorch (Paszke et al., 2019) and PyTorch Lightning (Falcon & The PyTorch Lightning team, 2019) to implement all the experiments." The paper names its software stack but does not provide specific version numbers for PyTorch, PyTorch Lightning, or Torch Spatiotemporal, which are crucial for reproducibility.
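Since missing version pins are the issue flagged here, a minimal sketch of how a run could record the exact versions it used; the distribution names are assumptions based on the PyPI packages for the cited libraries:

```python
from importlib import metadata

def pinned_versions(packages):
    """Return {package: installed version}, or 'missing' when a
    distribution is not installed, for logging alongside results."""
    out = {}
    for pkg in packages:
        try:
            out[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            out[pkg] = "missing"
    return out

# Assumed PyPI distribution names for the paper's stack.
print(pinned_versions(["torch", "pytorch-lightning", "torch-spatiotemporal"]))
```

Logging this dictionary with every experiment run makes results reproducible even when the repository omits a lockfile.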
Experiment Setup: Yes
  LLM response: "For the experiments in Tab. 1, optimal hyperparameters, i.e., learning rate lr and hidden size dh, for each model and dataset, were found via a grid-search over lr ∈ [0.00025, 0.00075, 0.0015, 0.003] and dh ∈ [32, 64, 128, 256]. ... This resulted in: weight of L2 λl2 = 0.0001, weight of L1 λl1 = 0.00001, weight of variational regularization λvar = 0.00005 and weight of clustering regularization λclst = 0.0005. Dropout's probability of dropping a connection was set to p = 0.5. ... We set the resetting period of forgetting to k = 20 epochs with 30 epochs of warm-up. ... The training lasted up to 150 epochs with 50 epochs of early stopping patience, while fine-tuning lasted up to 1000 epochs with 100 epochs of patience."
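The quoted grid-search can be sketched as an exhaustive sweep over the learning-rate and hidden-size grid. Only the grid values and regularization weights come from the text above; `run_experiment` is a hypothetical stand-in for the paper's full training loop:

```python
from itertools import product

# Grid values and regularization weights quoted from the setup above.
GRID = {
    "lr": [0.00025, 0.00075, 0.0015, 0.003],
    "d_h": [32, 64, 128, 256],
}
REG_WEIGHTS = {"l2": 1e-4, "l1": 1e-5, "var": 5e-5, "clst": 5e-4}

def grid_search(run_experiment):
    """Sweep every (lr, d_h) pair and keep the configuration with the
    lowest validation loss. `run_experiment` is a hypothetical callable
    standing in for one complete training run."""
    best_loss, best_cfg = float("inf"), None
    for lr, d_h in product(GRID["lr"], GRID["d_h"]):
        val_loss = run_experiment(lr=lr, d_h=d_h, reg_weights=REG_WEIGHTS)
        if val_loss < best_loss:
            best_loss, best_cfg = val_loss, {"lr": lr, "d_h": d_h}
    return best_cfg, best_loss
```

With 4 learning rates and 4 hidden sizes, this is 16 runs per model/dataset pair, which matches the moderate-hardware claim in the report.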