Interpretable Sequence Learning for Covid-19 Forecasting
Authors: Sercan Arik, Chun-Liang Li, Jinsung Yoon, Rajarishi Sinha, Arkady Epshteyn, Long Le, Vikas Menon, Shashank Singh, Leyou Zhang, Martin Nikoltchev, Yash Sonthalia, Hootan Nakhost, Elli Kanal, Tomas Pfister
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that our model provides more accurate forecasts compared to the alternatives, and that it provides qualitatively meaningful explanatory insights. ... 6 Experiments We conduct all experiments on US COVID-19 data. |
| Researcher Affiliation | Industry | Sercan O. Arık, Chun-Liang Li, Jinsung Yoon, Rajarishi Sinha, Arkady Epshteyn, Long T. Le, Vikas Menon, Shashank Singh, Leyou Zhang, Martin Nikoltchev, Yash Sonthalia, Hootan Nakhost, Elli Kanal, Tomas Pfister Google Cloud AI EMAIL |
| Pseudocode | Yes | Algorithm 1 Pseudo-code for training the proposed model |
| Open Source Code | No | The paper does not provide an explicit statement about open-sourcing the code for their method or a link to a code repository. |
| Open Datasets | Yes | We conduct all experiments on US COVID-19 data. The primary ground truth data for the progression of the disease, for Q and D, are from [39] as used by several others, e.g. [28]. They obtain the raw data from the state and county health departments. ... Ground truth data for the H, C and V (see Fig. 1) are obtained from [40]. |
| Dataset Splits | Yes | We split the observed data into training and validation with the last τ timesteps to mimic the testing scenario. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software like TensorFlow and XGBoost, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We choose the compartment weights λD = λQ = 0.1, λH = 0.01 and λR(d) = λC = λV = 0.001. We employ [41] for hyperparameter tuning (including all the loss coefficients, learning rate, and initial conditions) with the objective of optimizing for the best validation loss, with 400 trials and we use F = 300 fine-tuning iterations. ... In the first stage of training, we use teacher forcing with ν ∈ [0, 1], which is a hyperparameter. For fine-tuning (please see below), we use ν = 1 to unroll the last τ steps to mimic the real forecasting scenario. |
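The teacher-forcing scheme quoted above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function and argument names (`model_step`, `y_obs`, `nu`) are hypothetical, and the model step is abstracted to a single callable. The key point is that ν controls how often the model's own prediction, rather than the observed value, is fed back during unrolling; ν = 1 reproduces the pure forecasting setting used in fine-tuning.

```python
import random

def unroll(model_step, y_obs, y0, nu):
    """Unroll a one-step sequence model with scheduled teacher forcing.

    At each step, with probability nu the model's own prediction is fed
    back as the next input; otherwise the ground-truth observation is
    fed back (teacher forcing). nu = 1 means the model is unrolled on
    its own predictions, mimicking real forecasting.
    """
    preds, prev = [], y0
    for t in range(len(y_obs)):
        y_hat = model_step(prev)
        preds.append(y_hat)
        # Mix model output and observed value according to nu.
        prev = y_hat if random.random() < nu else y_obs[t]
    return preds
```

For example, with a toy step function `lambda x: x + 1`, setting `nu=1.0` makes the outputs depend only on the model's own rollout, while `nu=0.0` resets the input to the observed value at every step.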