DynaConF: Dynamic Forecasting of Non-Stationary Time Series

Authors: Siqi Liu, Andreas Lehrmann

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We compare our approach with 2 univariate and 9 multivariate time series models on synthetic (Section 4.1) and real-world (Section 4.2 and 4.3) datasets; see Table 1 for an overview, including references to the relevant literature and implementations. For evaluation we use a rolling-window approach with a window size of 10 steps. The final evaluation metrics are the aggregated results from all 100 test windows. We report the mean squared error (MSE) and continuous ranked probability score (CRPS) (Matheson & Winkler, 1976), a commonly used score to measure how close the predicted distribution is to the true distribution (see Appendix E for details).
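The rolling-window evaluation and CRPS described in this excerpt can be sketched as follows. This is a minimal illustration, not the authors' code: `predict` is a hypothetical forecaster returning samples, and the CRPS is estimated with the standard sample-based form CRPS ≈ E|X − y| − 0.5·E|X − X′|.

```python
import numpy as np

def sample_crps(samples, y):
    """Sample-based CRPS estimator for a scalar observation y:
    CRPS ~= E|X - y| - 0.5 * E|X - X'| over forecast samples X, X'."""
    samples = np.asarray(samples, dtype=float)
    term1 = np.abs(samples - y).mean()
    term2 = np.abs(samples[:, None] - samples[None, :]).mean()
    return term1 - 0.5 * term2

def rolling_window_eval(series, predict, window=10, n_windows=100):
    """Evaluate `predict` on consecutive non-overlapping test windows,
    aggregating MSE and CRPS over all windows, as in the quoted protocol.
    `predict(history)` is assumed to return samples of shape (S, window)."""
    mses, crpss = [], []
    start = len(series) - n_windows * window
    for w in range(n_windows):
        t = start + w * window
        history, target = series[:t], series[t:t + window]
        samples = predict(history)           # forecast samples, (S, window)
        point = samples.mean(axis=0)         # point forecast for MSE
        mses.append(np.mean((point - target) ** 2))
        crpss.append(np.mean([sample_crps(samples[:, h], target[h])
                              for h in range(window)]))
    return float(np.mean(mses)), float(np.mean(crpss))
```

By the triangle inequality the sample-based CRPS is always non-negative, and it is zero only when every forecast sample equals the observation.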
Researcher Affiliation Industry Siqi Liu (EMAIL), Borealis AI; Andreas Lehrmann (EMAIL), Borealis AI
Pseudocode No The paper describes the methodology using mathematical equations and descriptive text, and provides an architecture diagram (Figure 2), but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code Yes The core of our model, called DynaConF [1], is a clean decoupling of the time-variant (non-stationary) and the time-invariant (stationary) part of the distribution. [1] https://github.com/BorealisAI/dcf
Open Datasets Yes We evaluate the proposed method on 6 widely-used datasets [7] with published results (Lai et al., 2018; Salinas et al., 2019): (Exchange) daily exchange rates of 8 different countries from 1990 to 2016; (Solar) [8] hourly solar power production in 137 PV plants in 2006; (Electricity) [9] hourly electricity consumption of 370 customers from 2012 to 2014; (Traffic) [10] hourly occupancy data at 963 sensor locations in the San Francisco Bay area; (Taxi) rides taken in 30-minute intervals at 1214 locations in New York City in January 2015/2016; (Wikipedia) daily page views of 2000 Wikipedia articles. ... [8] http://www.nrel.gov/grid/solar-power-data.html [9] https://archive.ics.uci.edu/ml/datasets/Electricity_Load_Diagrams2011_2014 [10] http://pems.dot.ca.gov. We further evaluate our method against state-of-the-art baselines on two more publicly available datasets: (Walmart) [11] weekly sales of 45 Walmart stores from February 2010 to October 2012; (Temperature) [12] monthly average temperatures of 1000 cities from January 1980 to September 2020. ... [11] https://www.kaggle.com/datasets/yasserh/walmart-dataset [12] https://www.kaggle.com/datasets/hansukyang/temperature-history-of-1000-cities-1980-to-2020
Dataset Splits Yes For our experiments on synthetic data we simulate four conditionally non-stationary stochastic processes for T = 2500 time steps, where we use the first 1000 steps as training data, the next 500 steps as validation data, and the remaining 1000 steps as test data. ... We use the last 10% of the training time period as the validation set and choose the initial learning rate, number of training epochs, and model sizes using the performance on the validation set for StatiConF. ... For Walmart, the forecast window size is 4 weeks, and the test set consists of the last 20 weeks. For Temperature, the forecast window size is 3 months, and the test set consists of the last 24 months.
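The chronological 1000 / 500 / 1000 split of the T = 2500 synthetic series quoted above can be written as a small helper. This is an illustrative sketch; the function name and signature are assumptions, not the authors' code.

```python
def chronological_split(series, n_train=1000, n_val=500, n_test=1000):
    """Split a time series in temporal order into train / validation / test,
    defaulting to the 1000 / 500 / 1000 split of T = 2500 synthetic steps."""
    assert len(series) >= n_train + n_val + n_test
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:n_train + n_val + n_test]
    return train, val, test
```

Splitting in time order (rather than randomly) is essential here, since the processes are non-stationary and the task is forecasting future behavior from past data.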
Hardware Specification Yes OOM: The model ran out of memory on our 16GB GPU using the minimum batch size and lookback window size.
Software Dependencies No The paper mentions software components like "Adam" (optimizer), "GluonTS", "PyTorchTS", and "TSlib" (libraries/frameworks), but does not provide specific version numbers for any of these components as used in the authors' own implementation or experiments.
Experiment Setup Yes For our models, we use Adam (Kingma & Ba, 2014) as the optimizer with the default initial learning rate of 0.001 unless it is chosen using the validation set. The dimension of the latent vector zt,i (see Section 3.2) is set to E = 4 across all the experiments. ... We perform 50 updates per epoch. We use 32 hidden units for our 2-layer MLP encoder. ... For our method, we first train StatiConF and then reuse its learned encoder in DynaConF, so the optimization of DynaConF is focused on the dynamic model. For DynaConF, we use the same validation set to choose the number of training epochs and use 0.01 as the initial learning rate. ... Our models use a two-layer LSTM with 128 hidden units as the encoder, except for the 8-dimensional Exchange data, where the hidden size is 8.
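The hyperparameters quoted in this row can be collected into a configuration sketch. The key names and two-dictionary layout below are hypothetical (the paper does not publish a configuration schema); only the values are taken from the excerpt.

```python
# Hypothetical configuration; key names are illustrative, values are from the
# quoted experiment setup. Not the authors' actual configuration format.
STATICONF_CONFIG = {
    "optimizer": "Adam",
    "lr": 1e-3,              # default initial learning rate, unless tuned
    "latent_dim": 4,         # dimension E of the latent vector z_{t,i}
    "updates_per_epoch": 50,
    "mlp_encoder": {"layers": 2, "hidden_units": 32},
    "lstm_encoder": {"layers": 2, "hidden_units": 128},  # 8 for Exchange
}

DYNACONF_CONFIG = {
    **STATICONF_CONFIG,
    "lr": 1e-2,              # DynaConF uses 0.01 as the initial learning rate
    "reuse_encoder": True,   # encoder is taken from the trained StatiConF
}
```

The two-stage structure mirrors the quoted training procedure: StatiConF is trained first, and DynaConF then reuses its encoder so that optimization focuses on the dynamic model.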