Long Short-Term Imputer: Handling Consecutive Missing Values in Time Series

Authors: Jiacheng You, Xinyang Chen, Yu Sun, Weili Guan, Liqiang Nie

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that our approach, on average, reduces the error by 57.4% compared to state-of-the-art deep models across five datasets. ... In this section, we conduct experiments on five real-world datasets and compare them with the current mainstream imputation and forecasting methods. The experimental results indicate that our approach significantly outperforms existing methods in cases of long-interval consecutive missing data.
Researcher Affiliation | Academia | Jiacheng You EMAIL School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen); Xinyang Chen EMAIL School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen); Yu Sun EMAIL College of Computer Science, DISSec, Nankai University; Weili Guan EMAIL School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen); Liqiang Nie EMAIL School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
Pseudocode | No | The paper describes the model architecture and procedures using textual descriptions and mathematical equations, but does not include any explicit 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a direct link to a code repository.
Open Datasets | Yes | We use five real-world datasets in experiments. (1) Electricity (UCI) collects hourly electricity consumption data of 321 customers from 2012 to 2014. (2) Traffic (PeMS) contains hourly road occupancy rates measured by 862 sensors on San Francisco Bay area freeways from January 2015 to December 2016. (3) METR-LA (Metro) records four months of statistics on traffic speed on 207 sensors on the highways of Los Angeles County. (4) Guangzhou (Chen et al., 2018) records traffic speeds per ten minutes on 214 anonymous roads in Guangzhou from August 1, 2016 to September 30, 2016. (5) PEMS04 (Chen et al., 2001) is a subset of PeMS, collected by 307 detectors over a continuous period of 59 days, starting from January 1, 2018.
Dataset Splits | Yes | To ensure consistency and comparability, we adopted the data processing methodology from TimesNet. Each dataset was partitioned into training, validation, and test sets in a 7:1:2 ratio.
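The 7:1:2 chronological partition reported above can be sketched as a small helper. This is a hypothetical illustration of the split described in the paper (the names `chronological_split` and `ratios` are ours); the exact boundary handling in the original TimesNet-derived pipeline may differ.

```python
def chronological_split(series, ratios=(0.7, 0.1, 0.2)):
    """Split a time series along the time axis into train/val/test.

    `series` is any indexable sequence ordered by time; `ratios` is the
    (train, val, test) proportion, e.g. the paper's 7:1:2 split.
    """
    n = len(series)
    train_end = int(n * ratios[0])           # first 70% of time steps
    val_end = train_end + int(n * ratios[1])  # next 10%
    return series[:train_end], series[train_end:val_end], series[val_end:]

# Example: 1000 hourly observations -> 700 / 100 / 200 time steps
data = list(range(1000))
train, val, test = chronological_split(data)
```

Because the split is by position rather than by random sampling, the test set always covers the most recent time steps, which matches standard practice for time-series evaluation.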
Hardware Specification | Yes | The experiments in this study were conducted on a high-performance computing system with the following specifications: eight NVIDIA GeForce RTX 4090 GPUs, each equipped with 24GB of VRAM, enabling efficient execution of deep learning and high-performance computational tasks. The system is powered by a 128-core AMD EPYC 7513 processor, which provides substantial parallel processing power. Additionally, the server is configured with 503GB of RAM, facilitating the handling of large datasets and memory-intensive operations.
Software Dependencies | No | The paper mentions using TimesNet as a backbone and other models as baselines, but does not provide specific version numbers for software libraries or dependencies (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | Implementation details: Our LSTI is a general training framework; hence the backbone of FPNet, BPNet, and SMNet can be any advanced deep network specially designed for time series. In our experiments, we use TimesNet as the backbone in LSTI, employing the same hyperparameter settings as in the original paper, namely S = 96 and L = 96. In other baseline models, hyperparameters are set according to the configurations specified in their original papers.