OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision
Authors: Cong Wei, Zheyang Xiong, Weiming Ren, Xeron Du, Ge Zhang, Wenhu Chen
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Both automatic evaluation and human evaluations demonstrate that OMNI-EDIT can significantly outperform all the existing models. Our code, dataset and model will be available at https://tiger-ai-lab.github.io/OmniEdit/ ... We perform comprehensive automatic and human evaluation to show the significant boost of OMNI-EDIT over the existing baseline models... In Section 5.2, we study the advantages of importance sampling for synthetic data. In Section 5.2, we perform an analysis to study the design of OMNI-EDIT. |
| Researcher Affiliation | Collaboration | 1University of Waterloo, 2University of Wisconsin-Madison, 3Vector Institute, 4M-A-P |
| Pseudocode | Yes | We formally provide the algorithm in Algorithm 1. |
| Open Source Code | Yes | Our code, dataset and model will be available at https://tiger-ai-lab.github.io/OmniEdit/ |
| Open Datasets | Yes | We constructed the training dataset D by sampling high-resolution images with a minimum resolution of 1 megapixel from the LAION-5B (Schuhmann et al., 2022) and Open Image V6 (Kuznetsova et al., 2020) databases. Our code, dataset and model will be available at https://tiger-ai-lab.github.io/OmniEdit/ |
| Dataset Splits | No | The paper describes creating a training dataset of 1.2M entries and a test benchmark (OMNI-EDIT-BENCH) with 434 edits. While it mentions the size of the curated training set and the test set, it does not provide specific percentages or counts for traditional training, validation, and test splits for the main model training, nor does it refer to predefined splits in a way that would allow direct reproduction of the data partitioning process for internal validation. |
| Hardware Specification | Yes | We train OMNI-EDIT on the 1.2M OMNI-EDIT training dataset for 2 epochs on a single node with 8 H100 GPUs. |
| Software Dependencies | No | The paper mentions building upon "Stable diffusion 3 Medium" but does not specify version numbers for any software libraries, frameworks (like PyTorch or TensorFlow), programming languages, or other ancillary software components used for implementation. |
| Experiment Setup | Yes | The OMNI-EDIT model is built upon Stable Diffusion 3 Medium (Esser et al., 2024) with the EditNet architecture. Stable Diffusion 3 has 24 DiT layers. Each layer has a corresponding EditNet layer. We train OMNI-EDIT on the 1.2M OMNI-EDIT training dataset for 2 epochs on a single node with 8 H100 GPUs. |
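The experiment-setup row describes a one-to-one pairing between the 24 DiT layers of the base model and trainable EditNet layers. A minimal sketch of that layer pairing is shown below; it is not the authors' implementation, and every class, method, and parameter name here is an illustrative assumption (the real model operates on diffusion latents with learned weights, not scalar lists).

```python
# Hypothetical sketch of the EditNet layout described in the paper:
# each of the 24 base DiT layers is paired with a corresponding
# EditNet layer whose output adjusts that layer's activations.
# All names and arithmetic here are illustrative stand-ins.

class DiTLayer:
    def forward(self, x):
        # Stand-in for a frozen base transformer block (identity here).
        return [v * 1.0 for v in x]


class EditNetLayer:
    def forward(self, x, cond):
        # Stand-in for the trainable editing branch: produce a small
        # residual from the editing condition (e.g. source-image features).
        return [c * 0.1 for c in cond]


class OmniEditSketch:
    def __init__(self, num_layers=24):
        # One EditNet layer per DiT layer, as stated in the paper.
        self.dit_layers = [DiTLayer() for _ in range(num_layers)]
        self.edit_layers = [EditNetLayer() for _ in range(num_layers)]

    def forward(self, x, cond):
        # Each DiT layer's output is adjusted by its paired EditNet layer.
        for dit, edit in zip(self.dit_layers, self.edit_layers):
            x = dit.forward(x)
            residual = edit.forward(x, cond)
            x = [a + b for a, b in zip(x, residual)]
        return x


model = OmniEditSketch()
out = model.forward([1.0, 2.0], [0.5, 0.5])
```

The point of the sketch is only the control-flow: the base layers stay untouched while a parallel per-layer branch injects edit-conditioned residuals, which matches the "each layer has a corresponding EditNet layer" description.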