Infinite-Resolution Integral Noise Warping for Diffusion Models
Authors: Yitong Deng, Winnie Lin, Lingxiao Li, Dmitriy Smirnov, Ryan Burgert, Ning Yu, Vincent Dedun, Mohammad Taghavi
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We prove and experimentally validate our theoretical claims, and demonstrate our method's effectiveness in real-world applications. We verify our theoretical claims by showing that both variants of our method preserve Gaussian white noise distribution, and that Chang et al. (2024) (HIWYN) converges to our grid-based variant as N increases. We analyze the behaviors of our grid and particle-based variants under diffeomorphic and non-diffeomorphic deformations. We then apply our method to video generation tasks and benchmark against existing methods (Ge et al., 2023; Chen et al., 2023; Chang et al., 2024). Finally, we extend our method to warping volumetric noise and demonstrate a use case in 3D graphics. |
| Researcher Affiliation | Collaboration | Yitong Deng (1, 2), Winnie Lin (1), Lingxiao Li (1), Dmitriy Smirnov (1), Ryan Burgert (3, 4), Ning Yu (3), Vincent Dedun (1), Mohammad H. Taghavi (1). Affiliations: (1) Netflix, (2) Stanford University, (3) Netflix Eyeline Studios, (4) Stony Brook University |
| Pseudocode | Yes | Algorithm 1: Infinite-Resolution Integral Noise Warp; Algorithm 2: Grid-based Partition; Algorithm 3: Particle-based Partition |
| Open Source Code | No | The paper mentions a reimplementation of a third-party method but does not explicitly state that the source code for their own described methodology is publicly available or provide a link to it. |
| Open Datasets | Yes | We integrate our method with I2SB (Liu et al., 2023) and adapt its pre-trained image 4× super-resolution model (bicubic) to perform video super-resolution. In Figure 7, we stress test both variants under non-diffeomorphic maps obtained using optical flow (Teed & Deng, 2020) on a real-world video (Brox & Malik, 2011). We extend our particle-based algorithm to 3D by replacing the bilinear kernel with the trilinear kernel in Algorithm 3 and apply it to Gaussian Cube (Zhang et al., 2024), which denoises a dense 3D noise grid to reconstruct 3D Gaussians. |
| Dataset Splits | No | The paper uses various datasets for evaluation but does not provide specific training, validation, or test splits for any of them. It applies its method to existing models or real-world videos without defining splits for its own experimental setup. |
| Hardware Specification | Yes | warping 1024 × 1024 noise images in 0.045s (grid variant) and 0.0086s (particle variant) using a laptop with an Nvidia RTX 3070 Ti GPU. The computation is done on a laptop with Intel i7-12700H and Nvidia RTX 3070 Ti. |
| Software Dependencies | No | The paper mentions using a reimplementation in Taichi (Hu et al., 2019) but does not provide a specific version number for Taichi or any other software dependencies. |
| Experiment Setup | Yes | We set SDEdit's parameter t0 to 0.4, and apply Perturbed-Attention Guidance (Ahn et al., 2024) with a strength of 3.0. Further results that additionally integrate cross-frame attention (Ceylan et al., 2023) (anchored every 3 frames) are shown in Figure 10 and B.2. |
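
The Research Type entry above refers to warps that preserve the Gaussian white noise distribution. The sketch below illustrates what that property means in a toy setting. It is not the paper's Algorithm 1, 2, or 3: it only handles a periodic horizontal shift, uses a finite upsampling factor `k`, and the function names `bilinear_shift` and `subpixel_shift` are hypothetical. It compares plain linear interpolation, which shrinks the variance, with a simple upsample-shift-and-sum scheme that keeps unit variance.

```python
import numpy as np


def bilinear_shift(noise, dx):
    """Shift noise right-to-left by a fractional offset dx in [0, 1)
    using linear interpolation with periodic boundaries."""
    return (1.0 - dx) * noise + dx * np.roll(noise, -1, axis=1)


def subpixel_shift(noise, shift_subpixels, k, rng):
    """Toy variance-preserving shift (illustrative assumption, not the paper's method).

    Each pixel is split into k horizontal sub-pixels that are i.i.d. N(0, 1/k)
    and constrained to sum to the original pixel value. The fine row is shifted
    by an integer number of sub-pixels and re-summed in groups of k.
    """
    h, w = noise.shape
    z = rng.standard_normal((h, w, k))
    # Conditional upsampling: remove the mean of z so the sub-pixels sum to the pixel.
    sub = noise[..., None] / k + (z - z.mean(axis=-1, keepdims=True)) / np.sqrt(k)
    fine = np.roll(sub.reshape(h, w * k), -shift_subpixels, axis=1)
    # Each output pixel sums k independent N(0, 1/k) samples, so it is N(0, 1).
    return fine.reshape(h, w, k).sum(axis=-1)


rng = np.random.default_rng(0)
noise = rng.standard_normal((256, 256))

blurred = bilinear_shift(noise, 0.5)
warped = subpixel_shift(noise, shift_subpixels=4, k=8, rng=rng)
identity = subpixel_shift(noise, shift_subpixels=0, k=8, rng=rng)

print("variance after bilinear shift:", blurred.var())
print("variance after sub-pixel shift:", warped.var())
```

With a half-pixel offset, linear interpolation averages two independent unit-variance samples, so the expected variance drops to 0.5, while the sub-pixel scheme stays near 1. A zero shift returns the input noise up to floating-point error, since the sub-pixels are constrained to sum to their parent pixel.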