S4M: S4 for multivariate time series forecasting with Missing values
Authors: Jing Peng, Meiqi Yang, Qiong Zhang, Xiaoxiao Li
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive empirical evaluations on diverse real-world datasets, we demonstrate that S4M consistently achieves state-of-the-art performance. These results underscore the efficacy of our integrated approach in handling missing data, showcasing its robustness and superiority over traditional imputation-based methods. Our findings highlight the potential of S4M to advance reliable time series forecasting in practical applications, offering a promising direction for future research and deployment. |
| Researcher Affiliation | Academia | Jing Peng¹, Meiqi Yang², Qiong Zhang¹, Xiaoxiao Li³,⁴ (¹Renmin University of China, ²Princeton University, ³The University of British Columbia, ⁴Vector Institute) |
| Pseudocode | Yes | Algorithm 1 Bank Reading, Algorithm 2 Bank Writing, Algorithm 3 Testing Pipeline, Algorithm 4 Training Pipeline |
| Open Source Code | Yes | Code is available at https://github.com/WINTERWEEL/S4M.git. |
| Open Datasets | Yes | We select four commonly used time series datasets for forecasting: Electricity (Wu et al., 2021), ETTh1 (Zhou et al., 2021), Traffic (Wu et al., 2021), and Weather (Wu et al., 2021). For a more general evaluation, we also include a real-world dataset, the USHCN climate dataset (Menne et al., 2015), with 271,728 time steps and 10 variables in total. |
| Dataset Splits | Yes | After obtaining the dataset with missing values, we split it chronologically into training, validation, and test sets, with a ratio of 0.7/0.1/0.2. |
| Hardware Specification | No | To measure the training and inference time, we conducted performance experiments using the Electricity dataset, with a batch size of 16 and a hidden size of 512. The maximum memory usage, along with the training and inference times, was recorded for a single epoch. (This text reports computational metrics only and does not name specific hardware such as GPU/CPU models.) |
| Software Dependencies | No | The learning rates are set to 0.01 for the Electricity and Traffic datasets, 0.005 for the ETTh1 dataset, and 0.001 for the Weather dataset. The dimensions of the hidden layers are set to 512 for the Electricity and Traffic datasets, and 256 for the ETTh1 and Weather datasets. The number of basic blocks or layers is selected from {2, 4, 8}. The batch size is set to 16 for all experiments. We use the Adam optimizer and implement an early stopping strategy across all experiments. (This text mentions the Adam optimizer but gives no version numbers for software libraries or frameworks, e.g., Python, PyTorch, or TensorFlow.) |
| Experiment Setup | Yes | The learning rates are set to 0.01 for the Electricity and Traffic datasets, 0.005 for the ETTh1 dataset, and 0.001 for the Weather dataset. The dimensions of the hidden layers are set to 512 for the Electricity and Traffic datasets, and 256 for the ETTh1 and Weather datasets. The number of basic blocks or layers is selected from {2, 4, 8}. The batch size is set to 16 for all experiments. We use the Adam optimizer and implement an early stopping strategy across all experiments. |
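The chronological 0.7/0.1/0.2 split quoted in the Dataset Splits row can be sketched as follows. This is a minimal illustration, not the paper's actual preprocessing code; the function name and the use of integer truncation at the boundaries are assumptions.

```python
def chronological_split(series, ratios=(0.7, 0.1, 0.2)):
    """Split a time-ordered sequence into train/val/test sets without shuffling,
    so that earlier observations are never used to predict the past."""
    n = len(series)
    train_end = int(n * ratios[0])          # first 70% for training
    val_end = train_end + int(n * ratios[1])  # next 10% for validation
    return series[:train_end], series[train_end:val_end], series[val_end:]

# Example using the USHCN length quoted above (271,728 time steps).
train, val, test = chronological_split(list(range(271728)))
```

Because the split is chronological rather than random, the test set covers the most recent 20% of the series, which matches standard practice in forecasting evaluation.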
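The Experiment Setup row lists per-dataset learning rates and hidden sizes plus an early stopping strategy. A minimal sketch of both is below; the `EarlyStopping` class, its `patience` value, and the `configs` dictionary layout are assumptions (the quoted text does not specify a patience or how hyperparameters are stored), while the numeric values come from the row above.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience  # True means: stop training

# Per-dataset hyperparameters quoted in the Experiment Setup row.
configs = {
    "Electricity": {"lr": 0.01,  "hidden": 512},
    "Traffic":     {"lr": 0.01,  "hidden": 512},
    "ETTh1":       {"lr": 0.005, "hidden": 256},
    "Weather":     {"lr": 0.001, "hidden": 256},
}
```

In a training loop, `step` would be called once per epoch with the current validation loss, and training would break as soon as it returns `True`.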