LSCD: Lomb–Scargle Conditioned Diffusion for Time Series Imputation
Authors: Elizabeth Fons, Alejandro Sztrajman, Yousef El-Laham, Luciana Ferrer, Svitlana Vyetrenko, Manuela Veloso
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on synthetic and real-world benchmarks demonstrate that our method recovers missing data more accurately than purely time-domain baselines, while simultaneously producing consistent frequency estimates. Crucially, our method can be easily integrated into learning frameworks, enabling broader adoption of spectral guidance in machine learning approaches involving incomplete or irregular data. |
| Researcher Affiliation | Collaboration | ¹J.P. Morgan AI Research, ²University of Cambridge, ³University of Buenos Aires, ⁴CONICET. Correspondence to: Elizabeth Fons <EMAIL>. |
| Pseudocode | Yes | The following listing provides a PyTorch implementation of the Lomb–Scargle periodogram, designed to efficiently compute spectral estimates for irregularly sampled time series. This implementation is fully differentiable, allowing seamless integration into learning-based models for gradient-based optimization. (Listing 1: Batch Lomb-Scargle Periodogram with Masking, beginning `import torch` and ending `return P`.) |
| Open Source Code | Yes | To facilitate broader adoption, we provide a differentiable implementation which can be seamlessly integrated into learning pipelines. ... E.1. Lomb–Scargle implementation. The following listing provides a PyTorch implementation of the Lomb–Scargle periodogram, designed to efficiently compute spectral estimates for irregularly sampled time series. This implementation is fully differentiable, allowing seamless integration into learning-based models for gradient-based optimization. |
| Open Datasets | Yes | The first dataset, PhysioNet (Silva et al., 2012), comprises 4,000 health measurements from ICU patients, covering 35 features. ... The second dataset consists of PM2.5 air quality measurements collected from 36 stations in Beijing over a 12-month period (Yi et al., 2016). |
| Dataset Splits | Yes | For evaluation, we hold out 10%, 50%, and 90% of the observed values as ground truth and assess imputation quality on these missing entries. ... In practice, during training a conditional mask m^co ∈ {0, 1}^{K×L} is introduced to artificially split the observed values into x_0^co = m^co ⊙ X and x_0^ta = (M - m^co) ⊙ X, in order to train the conditional denoising function. |
| Hardware Specification | Yes | All computations in this analysis were performed using a g5.2xlarge AWS instance (AMD EPYC 7R32 CPU, with an Nvidia A10G 24 GB GPU). |
| Software Dependencies | No | The paper mentions 'PyTorch' implicitly through code examples and 'PyGrinder' for dataset generation, but no specific version numbers are provided for these software components. |
| Experiment Setup | Yes | We train for 400 epochs and select the best checkpoint via a validation set. The final z_S from our spectral encoder is concatenated with the other conditioning signals (e.g. partial observations) at every denoising step to guide the model. Diffusion steps: T_max = 50; batch size: 16; learning rate: 1e-3. |
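The table above quotes the paper's differentiable, batched Lomb–Scargle periodogram (Listing 1) without reproducing its body. The sketch below is a minimal reconstruction of that idea, not the authors' actual code: the function name, argument layout, and frequency grid are assumptions. It evaluates the standard Lomb–Scargle estimate at a set of angular frequencies while ignoring missing entries via a binary mask, using only differentiable tensor ops so gradients flow back to the input values.

```python
import torch

def lomb_scargle(t, x, mask, freqs, eps=1e-8):
    """Masked Lomb-Scargle periodogram for a batch of irregular series.

    t:     (B, L) sample times
    x:     (B, L) values; entries where mask == 0 are ignored
    mask:  (B, L) 1.0 for observed, 0.0 for missing
    freqs: (F,)   angular frequencies to evaluate at
    Returns (B, F) periodogram values; differentiable w.r.t. x.
    """
    # Masked mean, then center the observed values.
    n = mask.sum(dim=-1, keepdim=True).clamp(min=1.0)
    xbar = (x * mask).sum(dim=-1, keepdim=True) / n
    xc = (x - xbar) * mask

    # Broadcast omega * t to shape (B, F, L).
    wt = freqs.view(1, -1, 1) * t.unsqueeze(1)
    m = mask.unsqueeze(1)

    # Time offset tau that decorrelates the sin/cos bases (masked sums).
    s2 = (m * torch.sin(2 * wt)).sum(-1)
    c2 = (m * torch.cos(2 * wt)).sum(-1)
    tau = 0.5 * torch.atan2(s2, c2) / freqs.clamp(min=eps)

    arg = wt - freqs.view(1, -1, 1) * tau.unsqueeze(-1)
    cos_a, sin_a = torch.cos(arg), torch.sin(arg)

    xc_ = xc.unsqueeze(1)
    num_c = (xc_ * cos_a).sum(-1) ** 2
    num_s = (xc_ * sin_a).sum(-1) ** 2
    den_c = (m * cos_a ** 2).sum(-1).clamp(min=eps)
    den_s = (m * sin_a ** 2).sum(-1).clamp(min=eps)
    return 0.5 * (num_c / den_c + num_s / den_s)
```

Because every operation is a smooth tensor op, the periodogram can sit inside a training loss or a conditioning encoder, which is the property the paper emphasizes for spectral guidance.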
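The Dataset Splits row describes a self-supervised trick: during training, a conditional mask m^co splits the observed values into a conditioning part x_0^co = m^co ⊙ X and a held-out target part x_0^ta = (M - m^co) ⊙ X. The snippet below is a hypothetical sketch of that split; the function name `split_observed` and the Bernoulli masking rate `p_cond` are assumptions, not details from the paper.

```python
import torch

def split_observed(X, M, p_cond=0.5, generator=None):
    """Split observed entries into conditioning and target sets.

    X: (K, L) data tensor.
    M: (K, L) observation mask, 1.0 where X is observed.
    Each observed entry joins the conditioning set with prob p_cond.
    Returns (x_co, x_ta, m_co).
    """
    r = torch.rand(X.shape, generator=generator)
    m_co = (r < p_cond).float() * M      # m^co is a subset of M
    x_co = m_co * X                      # values the model conditions on
    x_ta = (M - m_co) * X                # held-out targets for the loss
    return x_co, x_ta, m_co
```

The denoiser is then trained to reconstruct x_ta given x_co, so at test time the same network can impute genuinely missing entries conditioned on whatever is observed.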