PIXELS: Progressive Image Xemplar-based Editing with Latent Surgery

Authors: Shristi Das Biswas, Matthew Shreve, Xuelu Li, Prateek Singhal, Kaushik Roy

AAAI 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that PIXELS delivers high-quality edits efficiently, leading to a notable improvement in quantitative metrics as well as human evaluation."
Researcher Affiliation | Collaboration | 1: Purdue University, 2: Amazon Fashion; EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: Progressive Image Editing
Open Source Code | Yes | Code: https://github.com/amazon-science/PIXELS
Open Datasets | Yes | "For fair quantitative comparison, we sample 3000 pairs of random images from ImageNet's validation set (Russakovsky et al. 2015) as inputs to all methods, with a randomly chosen edit map from our database."
Dataset Splits | Yes | "For fair quantitative comparison, we sample 3000 pairs of random images from ImageNet's validation set (Russakovsky et al. 2015) as inputs to all methods, with a randomly chosen edit map from our database."
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, clock speeds, or memory amounts) used for its experiments. It only mentions using "off-the-shelf SDXL" for denoising.
Software Dependencies | No | The paper mentions using "off-the-shelf SDXL (Podell et al. 2023) for denoising" but does not provide version numbers for SDXL or for any other software dependencies such as programming languages or libraries.
Experiment Setup | Yes | "Implementation Details. For experiments, we use off-the-shelf SDXL (Podell et al. 2023) for denoising. However, the algorithm can be generalized across any DM (see the Appendix). We do not assume anything about the source of the edit maps and find that they can be easily generated by operations like growing and blurring, or simple histogram transformations on binary masks created using tools like Language Segment-Anything, user interaction, automatic depth maps (Miangoleh et al. 2021), etc. Unless stated otherwise, we keep text prompts empty (p = ∅) for all experiments."
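The Experiment Setup row says edit maps "can be easily generated by operations like growing and blurring" a binary mask. A minimal NumPy sketch of that idea is below; the function name, kernel sizes, and iteration counts are illustrative assumptions, not taken from the PIXELS codebase.

```python
import numpy as np


def make_edit_map(mask: np.ndarray, grow: int = 4, blur_passes: int = 3) -> np.ndarray:
    """Turn a binary mask into a soft [0, 1] edit map by growing, then blurring.

    Hypothetical helper: 'grow' applies a 3x3 morphological dilation `grow`
    times; 'blur_passes' applies a 3x3 box blur that many times.
    """
    m = mask.astype(np.float32)
    h, w = m.shape

    # Grow: repeated max over each pixel's 3x3 neighborhood (dilation).
    for _ in range(grow):
        p = np.pad(m, 1)
        m = np.max(
            np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)]),
            axis=0,
        )

    # Blur: repeated mean over each pixel's 3x3 neighborhood (box blur).
    for _ in range(blur_passes):
        p = np.pad(m, 1, mode="edge")
        m = np.mean(
            np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)]),
            axis=0,
        )

    return np.clip(m, 0.0, 1.0)


# Toy usage: a 16x16 mask with a square region marked for editing.
mask = np.zeros((16, 16), dtype=np.uint8)
mask[6:10, 6:10] = 1
edit_map = make_edit_map(mask, grow=2, blur_passes=2)
```

The same soft map could instead come from Language Segment-Anything masks, user scribbles, or depth maps, as the paper notes; the point is only that no learned component is needed to produce it.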