Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
Authors: Yoad Tewel, Rinon Gal, Dvir Samuel, Yuval Atzmon, Lior Wolf, Gal Chechik
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Without task-specific fine-tuning, Add-it achieves state-of-the-art results on both real and generated image insertion benchmarks, including our newly constructed Additing Affordance Benchmark for evaluating object placement plausibility, outperforming supervised methods. Human evaluations show that Add-it is preferred in over 80% of cases, and it also demonstrates improvements in various automated metrics. ... Section 4: Experiments |
| Researcher Affiliation | Collaboration | Yoad Tewel NVIDIA, Tel-Aviv University Rinon Gal NVIDIA, Tel-Aviv University Dvir Samuel Bar-Ilan University Yuval Atzmon NVIDIA Lior Wolf Tel-Aviv University Gal Chechik NVIDIA |
| Pseudocode | No | The paper describes the methodology in prose and mathematical equations within Section 3 ('Our Method') and its subsections, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | REPRODUCIBILITY STATEMENT: We will open-source all the code upon publication of the paper. |
| Open Datasets | Yes | We introduce the Additing Affordance Benchmark, where we manually annotate suitable areas for object insertion in images and propose a new protocol specifically designed to evaluate the plausibility of object placement. ... We also evaluate our method on an existing benchmark (Sheynin et al., 2023) with real images, as well as our newly proposed Additing Benchmark for generated images. ... We provide the proposed Additing Benchmark and Additing Affordance Benchmark in the supplementary material of our submission. |
| Dataset Splits | No | The paper mentions using a "subset of Emu Edit's (Sheynin et al., 2023) validation set" and constructing new benchmarks (the Additing Benchmark and the Additing Affordance Benchmark, each comprising 100 or 200 sets/images), but it does not specify explicit train/test/validation splits (percentages or sample counts) for the experiments conducted by the authors. |
| Hardware Specification | No | The paper does not explicitly state the hardware specifications (GPU/CPU models, memory, etc.) used for running the experiments. It mentions using 'FLUX.1-dev model' but not the underlying hardware. |
| Software Dependencies | No | The paper mentions using 'diffusers implementation of the FLUX.1-dev model', 'SAM-2 (Ravi et al., 2024)', and 'Grounding-DINO (Liu et al., 2023)'. However, it does not provide specific version numbers for general software libraries, frameworks (like PyTorch or TensorFlow), or programming languages (like Python) that would be needed for replication. |
| Experiment Setup | Yes | When evaluating Add-it, we use tstruct = 933 for generated images and tstruct = 867 for real images and tblend = 500. For the scaling factor γ, we use the root-finding solver described in section 3.2 on a set of validation images and set γ to 1.05, as it is close to the average result and performs well in practice. We generate the images with 30 denoising steps, building upon the diffusers implementation of the FLUX.1-dev model. We apply the extended attention mechanism until step t = 670 in the multi-stream blocks, and step t = 340 for the single-stream blocks. |
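The hyperparameters quoted above can be gathered into a single configuration for a replication attempt. This is a minimal sketch: the `ADD_IT_CONFIG` name, dict layout, and helper function are our own illustration, not part of the paper's (unreleased) code.

```python
# Hyperparameters reported in the paper's experiment setup (values quoted above).
# The dict name and structure are illustrative; the official code is unreleased.
ADD_IT_CONFIG = {
    "t_struct_generated": 933,      # structure-transfer timestep for generated images
    "t_struct_real": 867,           # structure-transfer timestep for real images
    "t_blend": 500,                 # latent blending timestep
    "gamma": 1.05,                  # attention scaling factor (set via root-finding solver)
    "num_inference_steps": 30,      # denoising steps, diffusers FLUX.1-dev backbone
    "t_extended_attn_multi": 670,   # extended-attention cutoff, multi-stream blocks
    "t_extended_attn_single": 340,  # extended-attention cutoff, single-stream blocks
}


def t_struct(is_real_image: bool) -> int:
    """Select the structure-transfer timestep by image source, as reported."""
    key = "t_struct_real" if is_real_image else "t_struct_generated"
    return ADD_IT_CONFIG[key]
```

Note that the two extended-attention cutoffs differ because FLUX.1-dev has separate multi-stream and single-stream transformer blocks; any replication would need to thread these timesteps into the respective attention processors.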