EasyInv: Toward Fast and Better DDIM Inversion
Authors: Ziyue Zhang, Mingbao Lin, Shuicheng Yan, Rongrong Ji
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Table 1 and Table 2, we compare our EasyInv with other inversion methods, using SD V1.4 on one NVIDIA RTX 3090 GPU. For Fixed-Point Iteration (Pan et al., 2023), we re-implemented it using settings from the paper. We set the data type of all methods to float16 by default to improve efficiency. The inversion and denoising steps are T = 50, except for Fixed-Point Iteration, which recommends T = 20. For our EasyInv, we set 0.05T < t < 0.25T and η = 0.5. In Table 3 we compare the performance of different downstream tasks when using different inversion methods; the dataset and code we used in this experiment are from PNPInversion (Ju et al., 2024). We use three major quantitative metrics: the LPIPS index (Zhang et al., 2018), SSIM (Wang et al., 2004), and PSNR, along with the inference time. We sample 2,298 images from the COCO 2017 test and validation sets (Lin et al., 2014). |
| Researcher Affiliation | Collaboration | 1Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China 2Skywork AI, Singapore 3National University of Singapore. Correspondence to: Rongrong Ji <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 illustrates the approximate operation of our method in conjunction with an inversion framework. Algorithm 1 (Add EasyInv to an existing inversion method) — Require: an inversion algorithm Inv(·), total inversion steps T, latent z, chosen-step set 𝒯, empirical parameter η. 1: for t in T do; 2: z_{t+1} = Inv(z_t, t); 3: if t ∈ 𝒯 then; 4: z_{t+1} = η·z_{t+1} + (1 − η)·z_t; 5: end if; 6: end for. Output: inverted latent z_T |
| Open Source Code | Yes | See code at https://github.com/potato-kitty/EasyInv. |
| Open Datasets | Yes | We sample 2,298 images from the COCO 2017 test and validation sets (Lin et al., 2014). Most of the results in this table are from PNPInversion (Ju et al., 2024), where a dataset for 9 different image editing tasks is introduced. |
| Dataset Splits | Yes | We sample 2,298 images from the COCO 2017 test and validation sets (Lin et al., 2014). Most of the results in this table are from PNPInversion (Ju et al., 2024), where a dataset for 9 different image editing tasks is introduced, along with the corresponding code for both editing and evaluation. We used this code and dataset for this experiment. |
| Hardware Specification | Yes | In Table 1 and Table 2, we compare our EasyInv with other inversion methods, using SD V1.4 on one NVIDIA RTX 3090 GPU. Most of the results in this table are from PNPInversion (Ju et al., 2024), where a dataset for 9 different image editing tasks is introduced, along with the corresponding code for both editing and evaluation. We used this code and dataset for this experiment. The only modification we made was to incorporate our method into Direct Inv, the inversion method proposed in their work (Ju et al., 2024); its results are indicated as ours+Direct Inv in Table 3. As we pointed out, our method can be combined with most existing inversion algorithms. The results in Table 3 show our advantage: by adding our method, the performance of Direct Inv improves in 5 out of 7 metrics across all editing tasks, with minimal changes in the remaining 2 metrics. We averaged results on A800 and RTX 3090 GPUs, since different environments lead to slightly different performance. |
| Software Dependencies | No | The paper mentions 'SD V1.4', 'SD-XL', 'SD-V1-4', and 'Stable Diffusion v1.5' as base models, but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch versions). |
| Experiment Setup | Yes | The inversion and denoising steps are T = 50, except for Fixed-Point Iteration, which recommends T = 20. For our EasyInv, we set 0.05T < t < 0.25T and η = 0.5. In Table 3 we compare the performance of different downstream tasks when using different inversion methods; the dataset and code we used in this experiment are from PNPInversion (Ju et al., 2024). ... We set the data type of all methods to float16 by default to improve efficiency. ... For our EasyInv, we set 0.05T < t < 0.25T and η = 0.5. |
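The Pseudocode row above (Algorithm 1) describes EasyInv as a thin wrapper around any existing inversion routine: at the early chosen steps (0.05T < t < 0.25T), the newly inverted latent is blended with the previous one via z_{t+1} = η·z_{t+1} + (1 − η)·z_t. A minimal NumPy sketch of that loop follows; `inv_step`, `easyinv_inversion`, and the dummy one-step inversion are illustrative names, not the authors' implementation, which operates on diffusion-model latents rather than toy arrays.

```python
import numpy as np

def easyinv_inversion(z, inv_step, T=50, eta=0.5):
    """Sketch of Algorithm 1: wrap a one-step inversion routine
    `inv_step(z, t)` (e.g. a DDIM inversion step) and blend
    consecutive latents at the chosen early steps."""
    # Chosen steps per the reported setup: 0.05*T < t < 0.25*T.
    chosen = {t for t in range(T) if 0.05 * T < t < 0.25 * T}
    for t in range(T):
        z_next = inv_step(z, t)                # z_{t+1} = Inv(z_t, t)
        if t in chosen:
            # EasyInv blending: z_{t+1} = eta*z_{t+1} + (1 - eta)*z_t
            z_next = eta * z_next + (1.0 - eta) * z
        z = z_next
    return z                                   # inverted latent z_T

# Toy usage with a dummy "inversion" step that just scales the latent.
z0 = np.ones(4)
zT = easyinv_inversion(z0, lambda z, t: 1.01 * z, T=50, eta=0.5)
```

With η = 1.0 the blend is a no-op and the wrapper reduces to the underlying inversion method, which is why the paper can report it as a drop-in addition to Direct Inv and other algorithms.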