InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences
Authors: Chenyang Zhu, Kai Li, Yue Ma, Longxiang Tang, Chengyu Fang, Chubin Chen, Qifeng Chen, Xiu Li
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive evaluations demonstrate the superiority and versatility of INSTANTSWAP. 1 Introduction: ... To provide a comprehensive evaluation for CCS, we introduce ConSwapBench, the first benchmark for customized concept swapping. Extensive qualitative and quantitative evaluations demonstrate the effectiveness and superiority of our INSTANTSWAP. We also conduct comprehensive ablation studies to verify the effectiveness of each component of our approach. Relevant sections: 4 Experiments, 4.3 Qualitative Comparison, 4.4 Quantitative Comparison, 4.5 Ablation Study. |
| Researcher Affiliation | Collaboration | Chenyang Zhu¹, Kai Li², Yue Ma³, Longxiang Tang¹, Chengyu Fang¹, Chubin Chen¹, Qifeng Chen³, Xiu Li¹. ¹Tsinghua University, ²Meta, ³HKUST |
| Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations (e.g., Equations 1-19) and block diagrams (Figure 3, Figure 4), but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | The code is available in https://github.com/chenyangzhu1/InstantSwap. |
| Open Datasets | No | To meet the needs of comprehensive evaluation, we introduce ConSwapBench, the first benchmark dataset specifically designed for customized concept swapping. ConSwapBench consists of two sub-benchmarks: ConceptBench and SwapBench. ConceptBench comprises 62 images covering 10 different target concepts used for customization, while SwapBench includes 160 real images containing one or more objects to be swapped, serving as source images. For each image in SwapBench, we use Grounding SAM (Ren et al., 2024) to acquire the bbox of the foreground concepts as the ground truth for evaluation purposes... More details can be found in Appendix C. Appendix C describes the components of ConSwapBench and its sources (Unsplash, PIE-Bench, Visual Genome) but does not provide a direct link or repository for downloading the compiled ConSwapBench dataset itself. |
| Dataset Splits | No | The paper mentions how images for evaluation are generated ("During the evaluation phase, each concept from ConceptBench is applied to each image in SwapBench for concept swapping, generating 1,600 images for evaluation.") and that DreamBooth customization involves a "set of images (typically fewer than 5) $X_t = \{x_i\}_{i=1}^{M}$". However, it does not specify explicit training, validation, or test dataset splits with percentages or sample counts for any model training or evaluation process. |
| Hardware Specification | Yes | We conduct the experiments with Stable Diffusion (Rombach et al., 2022) v2.1-base on a single RTX3090. |
| Software Dependencies | No | The paper mentions using Stable Diffusion v2.1-base, DreamBooth, and SGD, but it does not specify version numbers for general software dependencies like Python, PyTorch, or CUDA libraries used in the implementation. |
| Experiment Setup | Yes | We set the SSGU factor λ to 5, α to 2, β to 0.5 and the guidance scale to 7.5. The bbox is obtained through the first three steps. Subsequently, we use SGD (Robbins & Monro, 1951) with a learning rate of 0.1 to optimize for 550 steps of iterations. |
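
The reported setup and evaluation protocol can be collected in one place for anyone attempting a rerun. The sketch below is only a bookkeeping aid under stated assumptions: the class name, field names, and helper function are hypothetical and are not taken from the InstantSwap repository. Only the numeric values come from the table above. It also checks that pairing 10 concepts with 160 source images yields the 1,600 evaluation images quoted in the Dataset Splits row.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the table above; all field names are hypothetical."""
    base_model: str = "Stable Diffusion v2.1-base"
    gpu: str = "RTX3090"
    ssgu_lambda: float = 5.0      # SSGU factor (lambda)
    alpha: float = 2.0
    beta: float = 0.5
    guidance_scale: float = 7.5
    bbox_steps: int = 3           # bbox is obtained through the first three steps
    optimizer: str = "SGD"
    learning_rate: float = 0.1
    optimization_steps: int = 550


def evaluation_pairs(num_concepts: int, num_source_images: int):
    """Pair every concept index with every source-image index."""
    return list(product(range(num_concepts), range(num_source_images)))


setup = ReportedSetup()
pairs = evaluation_pairs(num_concepts=10, num_source_images=160)
print(setup)
print(len(pairs))
```

Freezing the dataclass keeps the reported values from being modified by accident when the object is passed around during a reproduction attempt.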