Single Image Rolling Shutter Removal with Diffusion Models

Authors: Zhanglei Yang, Haipeng Li, Mingbo Hong, Chen-Lin Zhang, Jiajun Li, Shuaicheng Liu

AAAI 2025

Reproducibility assessment — each item lists the variable, the result, and the supporting LLM response:

- Research Type: Experimental. "Experiments show that RS-Diffusion surpasses previous single-frame RS methods, demonstrates the potential of diffusion-based approaches, and provides a valuable dataset for further research." ... Quantitative Comparison ... Ablation Studies
- Researcher Affiliation: Collaboration. (1) University of Electronic Science and Technology of China; (2) Megvii Technology; (3) Moonshot AI; (4) Noumena AI
- Pseudocode: No. The paper describes the methodology and framework with textual descriptions and diagrams (e.g., Figure 3), but it does not include any explicitly labeled pseudocode or algorithm blocks.
- Open Source Code: Yes. "Code https://github.com/lhaippp/RS-Diffusion ... We publicly share our code and dataset with the community at https://github.com/lhaippp/RS-Diffusion."
- Open Datasets: Yes. "Datasets https://huggingface.co/Lhaippp/RS-Diffusion ... In addition, we present the RS-Real dataset, comprised of captured RS frames alongside their corresponding Global Shutter (GS) ground-truth pairs. ... We publicly share our code and dataset with the community at https://github.com/lhaippp/RS-Diffusion."
- Dataset Splits: Yes. "The dataset contains 40,000 train and 1,000 test samples. ... Our dataset, designed for realism in content, RS-motion, and label accuracy, comprises 40,000 training and 1,000 test pairs across diverse scenes."
- Hardware Specification: Yes. "While it could run inference at real-time speed, i.e., up to 28.1 ms per frame on one NVIDIA 2080Ti."
- Software Dependencies: No. The paper mentions building the framework upon CFG (Ho and Salimans 2022) and DDIM (Song, Meng, and Ermon 2020) but does not specify software dependencies with version numbers (e.g., Python, PyTorch versions).
- Experiment Setup: Yes. "We contrast our image-to-motion pipeline with traditional image-to-image methods that use diffusion models to convert RS images to GS images at a fixed resolution of 256 × 256, later upscaled to 600 × 800 for visual metric comparison with GT GS images. ... The loss function comprises the MSE loss, calculated between x̂₀ and x₀, and the photometric loss, computed between the GT GS image and I_GS. ... Consequently, the overall loss is a dynamically weighted sum that continually rescales ℓ_pl to match ℓ_mse, formulated as: ℓ_overall = ℓ_mse + (|ℓ_mse| / |ℓ_pl|) · ℓ_pl."
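The dynamically weighted loss quoted above can be sketched in a few lines. This is a minimal illustration of the formula ℓ_overall = ℓ_mse + (|ℓ_mse| / |ℓ_pl|) · ℓ_pl, not the authors' implementation; the function name and the small `eps` guard against division by zero are assumptions not present in the paper.

```python
def overall_loss(l_mse: float, l_pl: float, eps: float = 1e-8) -> float:
    """Dynamically weighted sum of the MSE and photometric losses.

    The photometric term is rescaled by |l_mse| / |l_pl| so that its
    contribution continually matches l_mse in magnitude, per the paper:
        l_overall = l_mse + (|l_mse| / |l_pl|) * l_pl
    The eps guard (an assumption, not in the paper) avoids division by
    zero when the photometric loss is exactly zero.
    """
    weight = abs(l_mse) / (abs(l_pl) + eps)
    return l_mse + weight * l_pl
```

Because the weight is recomputed from the current loss values, the photometric term always contributes on the same scale as the MSE term, so neither loss dominates training regardless of their raw magnitudes.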