Flipped Classroom: Effective Teaching for Time Series Forecasting
Authors: Philipp Teutsch, Patrick Mäder
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For our experiments, we utilize six datasets generated from prominent chaotic systems. We found that the newly proposed increasing training scale curricula with a probabilistic iteration scale curriculum consistently outperform previous training strategies, yielding an NRMSE improvement of up to 81% over FR or TF training. |
| Researcher Affiliation | Academia | Philipp Teutsch EMAIL Technische Universität Ilmenau Patrick Mäder EMAIL Technische Universität Ilmenau Friedrich-Schiller-Universität Jena |
| Pseudocode | No | The paper describes the training strategies and curricula using mathematical equations (Eqs. 1-10) and narrative text, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | A reproduction package for the experiments is available on GitHub: https://github.com/phit3/flipped_classroom. |
| Open Datasets | Yes | The datasets used in this paper are published on Dataverse: https://doi.org/10.7910/DVN/YEIZDT. We use six different time series datasets that we built by approximating six commonly studied chaotic systems (cp. Tab. II), i.e., Mackey-Glass (Mackey & Glass, 1977), Rössler (Rössler, 1976), Thomas cyclically symmetric attractor (Thomas, 1999), Hyper Rössler (Rössler, 1979), Lorenz (Lorenz, 1963) and Lorenz 96 (Lorenz, 1996). |
| Dataset Splits | Yes | We split each dataset into 80% training samples and 10% validation and testing samples respectively. |
| Hardware Specification | No | The paper describes hyper-parameters and training configurations (optimizer, batch size, learning rate, input length, hidden state size) in Section 5.5 'Training Procedure', but does not provide any specific hardware details such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and Reduce Learning Rate on Plateau (RLROP) but does not provide specific version numbers for any software libraries, frameworks, or programming languages used in the implementation. |
| Experiment Setup | Yes | We performed a full grid search over the hyper-parameters learning rate, batch size, learning rate reduction factor, loss plateau, input length n, and hidden state size to determine suitable configurations for the experiments. Based on this optimization, we use the Adam (Kingma et al., 2015) optimizer with a batch size of 128 and apply Reduce Learning Rate on Plateau (RLROP) with an initial learning rate of 1e-3 and a reduction factor of 0.6, i.e., a 40% learning rate reduction, given a loss plateau of 10 epochs for all datasets except Lorenz 96, where we use a reduction factor of 0.9 and a 20-epoch plateau respectively. Furthermore, we found an input length of n = 150 steps and a hidden state size of 256 to be most suitable. We use early stopping with a patience of 100 epochs and a minimum improvement threshold of 1% to ensure the convergence of the model while preventing overfitting. |
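The Research Type row contrasts teacher forcing (TF), free running (FR), and a probabilistic curriculum that interpolates between them. The core mechanism can be sketched in plain Python: at each decoding step the model is fed either the ground-truth previous value or its own prediction, with the teacher-forcing probability decaying over epochs. The exponential-decay schedule and the function names below are illustrative assumptions for exposition, not the authors' exact curricula from the reproduction package.

```python
import random

def teacher_forcing_prob(epoch, k=0.97):
    """Probability of feeding the ground truth (teacher forcing) at a given
    epoch. Exponential decay toward free running is one common schedule;
    the paper's curricula differ in how this ratio evolves (illustrative)."""
    return k ** epoch

def choose_input(epoch, ground_truth, prediction, rng=random):
    """Per-step probabilistic choice between TF and FR conditioning."""
    if rng.random() < teacher_forcing_prob(epoch):
        return ground_truth  # TF: condition on the true previous value
    return prediction        # FR: condition on the model's own output

# The ratio starts at 1 (pure TF) and decays toward 0 (pure FR).
schedule = [teacher_forcing_prob(e) for e in (0, 50, 100)]
```

With this shape, a curriculum is just a different `teacher_forcing_prob`; an "increasing training scale" variant would additionally grow the number of free-running steps per sequence as training progresses.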
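The Experiment Setup row combines two stopping-related mechanisms: learning-rate reduction on a loss plateau and early stopping on a minimum relative improvement. A minimal stand-alone sketch of that scheduler logic, using the reported hyper-parameters (lr 1e-3, factor 0.6, 10-epoch plateau, 100-epoch patience, 1% improvement threshold); the class name and exact bookkeeping are assumptions, not the authors' implementation:

```python
class PlateauScheduler:
    """Reduce-LR-on-plateau plus early stopping in one helper.
    Illustrative sketch mirroring the reported hyper-parameters."""

    def __init__(self, lr=1e-3, factor=0.6, plateau=10,
                 patience=100, min_improvement=0.01):
        self.lr, self.factor = lr, factor
        self.plateau, self.patience = plateau, patience
        self.min_improvement = min_improvement
        self.best = float("inf")
        self.bad_epochs = 0  # epochs since the last sufficient improvement

    def step(self, val_loss):
        """Call once per epoch with the validation loss.
        Returns False when early stopping should end training."""
        if val_loss < self.best * (1 - self.min_improvement):
            self.best, self.bad_epochs = val_loss, 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs % self.plateau == 0:
                self.lr *= self.factor  # 40% learning-rate reduction
        return self.bad_epochs < self.patience
```

In a training loop this would wrap the optimizer's learning rate; frameworks such as PyTorch ship equivalent built-ins (`ReduceLROnPlateau`, plus a patience counter), which the paper's setup most likely relies on.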