Single Image Rolling Shutter Removal with Diffusion Models
Authors: Zhanglei Yang, Haipeng Li, Mingbo Hong, Chen-Lin Zhang, Jiajun Li, Shuaicheng Liu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that RS-Diffusion surpasses previous single-frame RS methods, demonstrates the potential of diffusion-based approaches, and provides a valuable dataset for further research. ... Quantitative Comparison ... Ablation Studies |
| Researcher Affiliation | Collaboration | 1 University of Electronic Science and Technology of China, 2 Megvii Technology, 3 Moonshot AI, 4 Noumena AI |
| Pseudocode | No | The paper describes the methodology and framework with textual descriptions and diagrams (e.g., Figure 3), but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code https://github.com/lhaippp/RS-Diffusion ... We publicly share our code and dataset with the community at https://github.com/lhaippp/RS-Diffusion. |
| Open Datasets | Yes | Datasets https://huggingface.co/Lhaippp/RS-Diffusion ... In addition, we present the RS-Real dataset, comprised of captured RS frames alongside their corresponding Global Shutter (GS) ground-truth pairs. ... We publicly share our code and dataset with the community at https://github.com/lhaippp/RS-Diffusion. |
| Dataset Splits | Yes | The dataset contains 40,000 train and 1,000 test samples. ... Our dataset, designed for realism in content, RS-motion, and label accuracy, comprises 40,000 training and 1,000 test pairs across diverse scenes. |
| Hardware Specification | Yes | while it could run inference in real-time speed, i.e., up to 28.1 ms per frame on one NVIDIA 2080Ti. |
| Software Dependencies | No | The paper mentions building the framework upon CFG (Ho and Salimans 2022) and DDIM (Song, Meng, and Ermon 2020) but does not specify software dependencies with version numbers (e.g., Python, PyTorch versions). |
| Experiment Setup | Yes | We contrast our image-to-motion pipeline with traditional image-to-image methods that use diffusion models to convert RS images to GS images at a fixed resolution of 256 x 256, later upscaled to 600 x 800 for visual metric comparison with GT GS images... The loss function comprises the MSELoss, calculated between x̂₀ and x₀, and the photometric loss, computed between the GT GS image and I^GS... Consequently, the overall loss can be computed as a dynamically weighted sum. In other words, it continually adjusts ℓ_pl to be equal to ℓ_mse, formulated as: ℓ_overall = ℓ_mse + (|ℓ_mse| / |ℓ_pl|) · ℓ_pl. |
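The dynamically weighted loss quoted in the last row can be sketched as follows. This is a minimal illustration under our reading of the formula, not the authors' implementation; the function name and the `eps` guard are our additions.

```python
def overall_loss(l_mse: float, l_pl: float, eps: float = 1e-8) -> float:
    """Dynamically weighted sum of the two loss terms:

        l_overall = l_mse + (|l_mse| / |l_pl|) * l_pl

    The weight rescales the photometric loss so its magnitude matches
    the MSE loss at every step. `eps` (our addition) avoids division
    by zero when the photometric loss vanishes.
    """
    weight = abs(l_mse) / (abs(l_pl) + eps)
    return l_mse + weight * l_pl
```

In a training framework the weight would typically be treated as a constant (detached from the computation graph) so gradients flow through the unscaled loss terms, but the paper excerpt does not specify this detail.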