Decomposed Spatio-Temporal Mamba for Long-Term Traffic Prediction
Authors: Sicheng He, Junzhong Ji, Minglong Lei
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results across five real-world datasets demonstrate that DST-Mamba can capture both local fluctuations and global trends within traffic patterns, achieving state-of-the-art performance with favorable efficiency. |
| Researcher Affiliation | Academia | Sicheng He¹, Junzhong Ji¹,², Minglong Lei¹,²*; ¹College of Computer Science, Beijing University of Technology, Beijing, China; ²Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, China; sicheng EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the methodology using text and mathematical equations (e.g., Eq. 1-8) but does not include any explicit pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | Our code is available at https://github.com/Anle-He/DST-Mamba. |
| Open Datasets | Yes | To evaluate the performance of DST-Mamba, we carry out experiments on five real-world traffic datasets (Wang et al. 2025), including Traffic and the PEMS datasets. Table 1 gives the detailed statistics of these datasets. |
| Dataset Splits | No | We adopt the same data processing and dataset split setting as S-Mamba, which strictly follows chronological order to avoid data leakage. The input series length is set to 96 for all datasets, and the models are evaluated with different prediction horizons. |
| Hardware Specification | Yes | The experiments are conducted on a single NVIDIA GeForce RTX 4090 with 24 GB memory. |
| Software Dependencies | No | The paper mentions using MSE as the loss function and ADAM as the optimizer, but does not specify version numbers for any software libraries or programming languages. |
| Experiment Setup | Yes | We use MSE as the loss function and ADAM as the optimizer with an initial learning rate of $10^{-3}$. The batch size is set to 32, and training is stopped early within 10 epochs. The number of encoder layers (bi-directional Mamba blocks) is selected from {1, 2, 3, 4}. The weight of the trend predictions is selected from the range 0.5 to 1. The down-sampling window size used to generate different scales of trend series is selected from {2, 3, 4}, the number of Mamba blocks ranges from 1 to 4, and the dimension of the adaptive spatial embedding is selected from {16, 32, 64, 128}. |
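
The settings in the Experiment Setup row can be collected into a small configuration grid. The following is a minimal sketch, not the authors' code: all key names and the `iter_configs` helper are hypothetical, the learning rate is read as 10^-3, "early stopped within 10 epochs" is interpreted as an epoch cap, the encoder-layer count and the Mamba-block count are treated as one parameter, and the three grid points for the trend weight are illustrative only, since the row gives just the range 0.5 to 1.

```python
from itertools import product

# Fixed settings taken from the Experiment Setup and Dataset Splits rows.
# Key names are hypothetical; they do not come from the DST-Mamba repository.
FIXED = {
    "loss": "mse",
    "optimizer": "adam",
    "learning_rate": 1e-3,  # assumption: the initial learning rate is 10^-3
    "batch_size": 32,
    "max_epochs": 10,       # assumption: "early stopped within 10 epochs" read as a cap
    "input_length": 96,
}

# Search ranges reported in the table.
SEARCH_SPACE = {
    # assumption: encoder layers and Mamba blocks are the same parameter
    "num_encoder_layers": [1, 2, 3, 4],
    "downsample_window": [2, 3, 4],
    "spatial_embed_dim": [16, 32, 64, 128],
    # assumption: only the range 0.5 to 1 is reported; these points are illustrative
    "trend_weight": [0.5, 0.75, 1.0],
}


def iter_configs(fixed=FIXED, space=SEARCH_SPACE):
    """Yield one merged config dict per point of the Cartesian search grid."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield {**fixed, **dict(zip(keys, values))}


configs = list(iter_configs())
```

Each element of `configs` is a flat dictionary that could be passed to a training entry point; enumerating the grid up front makes the number of runs explicit before any are launched.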