InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences

Authors: Chenyang Zhu, Kai Li, Yue Ma, Longxiang Tang, Chengyu Fang, Chubin Chen, Qifeng Chen, Xiu Li

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive evaluations demonstrate the superiority and versatility of INSTANTSWAP. 1 INTRODUCTION... To provide a comprehensive evaluation for CCS, we introduce ConSwapBench, the first benchmark for customized concept swapping. Extensive qualitative and quantitative evaluations demonstrate the effectiveness and superiority of our INSTANTSWAP. We also conduct comprehensive ablation studies to verify the effectiveness of each component of our approach. 4 EXPERIMENTS; 4.3 QUALITATIVE COMPARISON; 4.4 QUANTITATIVE COMPARISON; 4.5 ABLATION STUDY
Researcher Affiliation | Collaboration | Chenyang Zhu¹, Kai Li², Yue Ma³, Longxiang Tang¹, Chengyu Fang¹, Chubin Chen¹, Qifeng Chen³, Xiu Li¹; ¹ Tsinghua University, ² Meta, ³ HKUST
Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations (e.g., Equations 1-19) and block diagrams (Figure 3, Figure 4), but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | The code is available in https://github.com/chenyangzhu1/InstantSwap.
Open Datasets | No | To meet the needs of comprehensive evaluation, we introduce ConSwapBench, the first benchmark dataset specifically designed for customized concept swapping. ConSwapBench consists of two sub-benchmarks: ConceptBench and SwapBench. ConceptBench comprises 62 images covering 10 different target concepts used for customization, while SwapBench includes 160 real images containing one or more objects to be swapped, serving as source images. For each image in SwapBench, we use Grounding SAM (Ren et al., 2024) to acquire the bbox of the foreground concepts as the ground truth for evaluation purposes... More details can be found in Appendix C. Appendix C describes the components of ConSwapBench and its sources (Unsplash, PIE-Bench, Visual Genome) but does not provide a direct link or repository for downloading the compiled ConSwapBench dataset itself.
Dataset Splits | No | The paper mentions how images for evaluation are generated ("During the evaluation phase, each concept from ConceptBench is applied to each image in SwapBench for concept swapping, generating 1,600 images for evaluation.") and that DreamBooth customization involves a 'set of images (typically fewer than 5) $X_t = \{x_i\}_{i=1}^{M}$'. However, it does not specify explicit training, validation, or test dataset splits with percentages or sample counts for any model training or evaluation process.
Hardware Specification | Yes | We conduct the experiments with Stable Diffusion (Rombach et al., 2022) v2.1-base on a single RTX3090.
Software Dependencies | No | The paper mentions using Stable Diffusion v2.1-base, DreamBooth, and SGD, but it does not specify version numbers for general software dependencies like Python, PyTorch, or CUDA libraries used in the implementation.
Experiment Setup | Yes | We set the SSGU factor λ to 5, α to 2, β to 0.5 and the guidance scale to 7.5. The bbox is obtained through the first three steps. Subsequently, we use SGD (Robbins & Monro, 1951) with a learning rate of 0.1 to optimize for 550 steps of iterations.
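
For anyone attempting a reproduction, the reported settings can be gathered into one place. The sketch below is only an illustration: the class name SwapConfig, its field names, and the helper num_eval_images are hypothetical and do not come from the InstantSwap repository. The values are the ones quoted in the Experiment Setup and Dataset Splits rows above, and the 1,600 evaluation images follow from applying each of the 10 concepts to each of the 160 source images.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwapConfig:
    """Settings quoted in the table above. Field names are hypothetical."""
    base_model: str = "Stable Diffusion v2.1-base"
    ssgu_lambda: float = 5.0      # SSGU factor (lambda)
    alpha: float = 2.0
    beta: float = 0.5
    guidance_scale: float = 7.5
    bbox_steps: int = 3           # the bbox is obtained through the first three steps
    optimizer: str = "SGD"
    learning_rate: float = 0.1
    num_steps: int = 550          # optimization steps, as quoted


def num_eval_images(num_concepts: int, num_source_images: int) -> int:
    """Each concept is applied to each source image, so the count is a product."""
    return num_concepts * num_source_images


cfg = SwapConfig()
total = num_eval_images(num_concepts=10, num_source_images=160)
print(cfg)
print(total)
```

Because the table reports no Python, PyTorch, or CUDA versions, a config like this would still need to be paired with pinned dependency versions chosen by whoever reruns the experiments.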