Filling the Missings: Spatiotemporal Data Imputation by Conditional Diffusion

Authors: Wenying He, Jieling Huang, Junhua Gu, Ji Zhang, Yude Bai

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The extensive experiments reveal that CoFILL's noise prediction network successfully transforms random noise into meaningful values that align with the true data distribution. The results also show that CoFILL outperforms state-of-the-art methods in imputation accuracy.
Researcher Affiliation | Academia | (1) School of Artificial Intelligence, Hebei University of Technology, Tianjin, China; (2) School of Software, Tiangong University, Tianjin, China; (3) University of Southern Queensland, Queensland, Australia
Pseudocode | Yes | Algorithm 1: Training of CoFILL; Algorithm 2: Imputation of CoFILL.
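The paper's Algorithm 1 is not reproduced here, but the generic training step of a conditional diffusion imputer can be sketched as follows. This is a minimal NumPy sketch under standard DDPM assumptions, not the paper's exact procedure; `eps_theta`, the noise schedule, and the loss masking are all illustrative stand-ins for CoFILL's noise prediction network and its actual design choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Standard linear DDPM noise schedule (assumed; the paper's schedule may differ).
T = 50
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def training_step(x0, obs_mask, eps_theta):
    """One conditional-diffusion training step: noise the data, condition on
    the observed entries, and score the noise prediction on missing entries."""
    t = rng.integers(T)
    eps = rng.standard_normal(x0.shape)
    # Forward diffusion: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    cond = np.where(obs_mask, x0, 0.0)   # observed values act as the condition
    eps_hat = eps_theta(x_t, cond, t)    # stand-in for the noise prediction network
    miss = ~obs_mask                     # evaluate the loss on missing entries only
    return np.mean((eps_hat - eps)[miss] ** 2)
```

Imputation (Algorithm 2) would then run the learned denoiser in reverse, starting from random noise on the missing entries while keeping the observed entries fixed as the condition.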
Open Source Code | Yes | The source code is publicly available at https://github.com/joyHJL/CoFILL.
Open Datasets | Yes | We use three spatiotemporal datasets for imputation algorithms. The AQI-36 dataset [Zheng et al., 2014] serves as a good benchmark due to its inherent missing data patterns and complex spatiotemporal dependencies. The traffic datasets, METR-LA and PEMS-BAY [Li et al., 2017], present different scales to test our model's capabilities.
Dataset Splits | Yes | The AQI-36 dataset requires careful temporal consideration due to its seasonal nature. We distribute the data across seasons by selecting March, June, September, and December for testing, which captures seasonal variations throughout the year. The validation set draws from the final 10% of data in February, May, August, and November to maintain seasonal representation. The remaining months form the training set. To evaluate model performance under different missing data scenarios, we design two types of data corruption schemes. For AQI-36, we implement a simulated failure (SF) pattern that replicates real-world sensor malfunction distributions. For PEMS-BAY and METR-LA traffic datasets, we create controlled missing data scenarios through mask matrices. These scenarios include random point missing (Point), where we mask 25% of observations uniformly at random, and structured block missing (Block), which combines 5% random masking with continuous missing segments. These segments span 1 to 4 hours per sensor and occur with 0.15% probability, simulating extended sensor outages.
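The Point and Block corruption schemes described above can be sketched as mask matrices (a NumPy sketch; `True` marks a masked observation). The 12-to-48-step block length assumes the traffic datasets' 5-minute sampling interval, which the quoted text does not state explicitly.

```python
import numpy as np

def point_mask(n_steps, n_sensors, p=0.25, rng=None):
    """Random point missing (Point): mask p of observations uniformly at random."""
    if rng is None:
        rng = np.random.default_rng(0)
    return rng.random((n_steps, n_sensors)) < p

def block_mask(n_steps, n_sensors, p_point=0.05, p_block=0.0015,
               min_len=12, max_len=48, rng=None):
    """Structured block missing (Block): 5% random masking plus contiguous
    outages of 1-4 hours per sensor, each starting with 0.15% probability.
    12-48 steps assumes 5-minute sampling (an assumption, not from the text)."""
    if rng is None:
        rng = np.random.default_rng(0)
    mask = rng.random((n_steps, n_sensors)) < p_point
    starts = rng.random((n_steps, n_sensors)) < p_block
    for t, s in zip(*np.nonzero(starts)):
        mask[t:t + rng.integers(min_len, max_len + 1), s] = True
    return mask
```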
Hardware Specification | Yes | Our implementation runs on an NVIDIA RTX 4090 GPU with 24GB VRAM.
Software Dependencies | No | The paper mentions "Adam" for optimization but does not provide specific software library versions (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | We optimize the model using Adam with cosine annealing learning rate decay. The learning rate starts at 10^-3 and decays to 10^-5. Table 1 details the complete hyperparameter configuration for each dataset.
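The stated schedule, cosine annealing from 10^-3 down to 10^-5, corresponds to the standard cosine-annealing formula. A minimal sketch, assuming the anneal runs over a full training horizon of `t_max` steps (the horizon itself is not given in the quoted text):

```python
import math

def cosine_annealing_lr(step, t_max, lr_max=1e-3, lr_min=1e-5):
    """Cosine annealing: lr decays from lr_max at step 0 to lr_min at t_max."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * step / t_max))
```

In PyTorch, this matches wrapping `torch.optim.Adam` with `torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max, eta_min=1e-5)`.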