Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations

Authors: Litu Rout, Yujia Chen, Nataniel Ruiz, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type Experimental We evaluate RF inversion on stroke-to-image generation and image editing tasks, with additional qualitative results on cartoonization, object insertion, image generation, and content-style composition. Our method significantly improves photorealism in stroke-to-image generation, surpassing a state-of-the-art (SoTA) method (Mokady et al., 2023) by 89%, while maintaining faithfulness to the input stroke. In addition, we show that RF inversion outperforms DM inversion (Meng et al., 2022) in faithfulness by 4.7% and in realism by 13.8% on the LSUN-Bedroom dataset (Wang et al., 2017). Figure 1 shows a graphical illustration of our method, RF-Inversion.
Researcher Affiliation Collaboration 1 Google, 2 UT Austin
Pseudocode Yes Algorithm 1: Controlled Forward ODE (8). Input: Discretization steps N, reference image y0, prompt embedding network Φ, Flux model u(·, ·, ·; ϕ), Flux noise scheduler σ : [0, 1] → R. Tunable parameter: Controller guidance γ. Output: Structured noise Y1
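The controlled forward ODE can be sketched as a simple Euler loop. This is a minimal illustration, not the authors' implementation: a toy callable stands in for the Flux vector field u(·, ·, ·; ϕ), and the conditional field (y1 − Y_t)/(1 − t) pulling the trajectory toward a sampled Gaussian endpoint y1 is assumed from the rectified-flow interpolation; the blend weight γ matches the paper's controller guidance.

```python
import numpy as np

def controlled_forward_ode(y0, velocity, gamma=0.5, num_steps=28, seed=0):
    """Sketch of a controlled forward ODE in the spirit of Algorithm 1.

    `velocity` is any callable (Y, t) -> dY/dt standing in for the
    unconditional Flux vector field. The controller blends in the
    conditional field (y1 - Y_t) / (1 - t), which drives the state
    toward a sampled Gaussian endpoint y1, with strength gamma.
    """
    rng = np.random.default_rng(seed)
    y1 = rng.standard_normal(y0.shape)  # structured-noise endpoint
    Y = np.asarray(y0, dtype=float).copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt  # t stays strictly below 1, so (1 - t) never vanishes
        u_uncond = velocity(Y, t)
        u_cond = (y1 - Y) / (1.0 - t)  # conditional field toward y1
        Y = Y + dt * (u_uncond + gamma * (u_cond - u_uncond))
    return Y

# Toy linear field standing in for the rectified-flow model.
toy_field = lambda Y, t: -Y
noise = controlled_forward_ode(np.zeros((4, 4)), toy_field, gamma=0.5)
```

With gamma = 1.0 the update reduces to the pure conditional field, so the Euler trajectory lands exactly on the sampled endpoint y1; with gamma = 0.0 it follows the unconditional field alone.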
Open Source Code Yes See our project page https://rf-inversion.github.io/ for code and demo.
Open Datasets Yes We evaluate RF inversion on stroke-to-image generation and image editing tasks, with additional qualitative results on cartoonization, object insertion, image generation, and content-style composition. ... We show that RF inversion outperforms DM inversion across three benchmarks: LSUN-church, LSUN-bedroom (Wang et al., 2017), and SFHQ (Beniaguev, 2022) on two tasks: Stroke2Image generation and image editing.
Dataset Splits Yes On the test split of the LSUN-Bedroom dataset, our approach is 4.7% more faithful and 13.79% more realistic than the best optimization-free method, SDEdit-SD1.5. ... We conduct a user study on the test splits of both the LSUN-Bedroom and LSUN-Church datasets using Amazon Mechanical Turk
Hardware Specification No No specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) were explicitly mentioned for running the experiments.
Software Dependencies No The paper mentions software like "NTI codebase", "Diffusers library", and "Flux" without specifying exact version numbers for these or other underlying libraries/frameworks (e.g., PyTorch, Python).
Experiment Setup Yes In Table 4, we provide the hyper-parameters for the empirical results reported in Section 5. We use a fixed γ = 0.5 in our controlled forward ODE (8) and a time-varying guidance parameter ηt in our controlled reverse ODE (15), as motivated in Remark 3.3 and Remark 3.6. Thus, our algorithm introduces one additional hyper-parameter ηt into the Flux pipeline. For each experiment, we use a fixed time-varying schedule of ηt described by a starting time (s), a stopping time (τ), and a strength (η). We use the default config for the Flux model: 3.5 for classifier-free guidance and 28 for the total number of inference steps.
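The ηt schedule described above (a starting time s, a stopping time τ, and a strength η) can be written as a small step function. The concrete values below are placeholders for illustration; the per-experiment values are given in the paper's Table 4, and the piecewise-constant form is an assumption consistent with the (s, τ, η) parameterization.

```python
def eta_schedule(t, start=0.0, stop=0.25, strength=0.9):
    """Time-varying controller guidance eta_t for the reverse ODE.

    Assumed piecewise-constant form: full strength eta on [start, stop),
    zero elsewhere. start/stop/strength correspond to the paper's
    (s, tau, eta); the default values here are illustrative only.
    """
    return strength if start <= t < stop else 0.0

# Guidance is active early in the reverse trajectory, then switched off.
schedule = [eta_schedule(i / 28) for i in range(28)]
```

Because the schedule is the only new hyper-parameter beyond the stock Flux pipeline (classifier-free guidance 3.5, 28 inference steps), sweeping (s, τ, η) is the main tuning knob for trading editing strength against faithfulness.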