PIXELS: Progressive Image Xemplar-based Editing with Latent Surgery
Authors: Shristi Das Biswas, Matthew Shreve, Xuelu Li, Prateek Singhal, Kaushik Roy
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that PIXELS delivers high-quality edits efficiently, leading to a notable improvement in quantitative metrics as well as human evaluation. |
| Researcher Affiliation | Collaboration | ¹Purdue University, ²Amazon Fashion |
| Pseudocode | Yes | Algorithm 1: Progressive Image Editing |
| Open Source Code | Yes | Code https://github.com/amazon-science/PIXELS |
| Open Datasets | Yes | For fair quantitative comparison, we sample 3000 pairs of random images from ImageNet's validation set (Russakovsky et al. 2015) as inputs to all methods, with a randomly chosen edit map from our database. |
| Dataset Splits | Yes | For fair quantitative comparison, we sample 3000 pairs of random images from ImageNet's validation set (Russakovsky et al. 2015) as inputs to all methods, with a randomly chosen edit map from our database. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. It only mentions using 'off-the-shelf SDXL' for denoising. |
| Software Dependencies | No | The paper mentions using "off-the-shelf SDXL (Podell et al. 2023) for denoising" but does not provide specific version numbers for SDXL or any other software dependencies like programming languages or libraries. |
| Experiment Setup | Yes | Implementation Details. For experiments, we use off-the-shelf SDXL (Podell et al. 2023) for denoising. However, the algorithm can be generalized across any DM (see the Appendix). We do not assume anything about the source of the edit maps and find that they can be easily generated by operations like growing and blurring, or simple histogram transformations on binary masks created using tools like Language Segment-Anything, user interaction, automatic depth maps (Miangoleh et al. 2021), etc. Unless stated otherwise, we keep text prompts empty (p = "") for all experiments. |
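
The Experiment Setup entry notes that edit maps can be produced by growing and blurring a binary mask. The following is a minimal sketch of that idea, not the authors' implementation: the function names (`grow_mask`, `box_blur`, `make_edit_map`), the 3x3 growth neighbourhood, the use of a box (mean) blur, and the default parameter values are all assumptions chosen for illustration, since the paper excerpt does not specify them.

```python
import numpy as np


def grow_mask(mask: np.ndarray, iterations: int = 1) -> np.ndarray:
    """Grow (dilate) a 2-D binary mask by one pixel per iteration (3x3 neighbourhood)."""
    grown = mask.astype(bool)
    h, w = grown.shape
    for _ in range(iterations):
        # Pad with False so the mask does not grow in from outside the image.
        padded = np.pad(grown, 1, mode="constant", constant_values=False)
        out = np.zeros_like(grown)
        # A pixel is set if any pixel in its 3x3 neighbourhood is set.
        for dy in range(3):
            for dx in range(3):
                out |= padded[dy:dy + h, dx:dx + w]
        grown = out
    return grown


def box_blur(image: np.ndarray, radius: int = 1) -> np.ndarray:
    """Blur a 2-D array with a (2*radius+1) x (2*radius+1) mean filter (edge padding)."""
    size = 2 * radius + 1
    h, w = image.shape
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    acc = np.zeros((h, w), dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / (size * size)


def make_edit_map(mask: np.ndarray, grow_iters: int = 2, blur_radius: int = 2) -> np.ndarray:
    """Turn a binary mask into a soft map in [0, 1] by growing, then blurring (assumed recipe)."""
    grown = grow_mask(mask, grow_iters)
    soft = box_blur(grown.astype(np.float64), blur_radius)
    return np.clip(soft, 0.0, 1.0)


# Usage: a 4x4 square mask inside a 16x16 image.
mask = np.zeros((16, 16), dtype=bool)
mask[6:10, 6:10] = True
edit_map = make_edit_map(mask)
```

The result is a float array with the same shape as the mask, equal to 1 well inside the grown region, 0 far from it, and intermediate values along the boundary. A Gaussian blur or a histogram transformation could be substituted for the box blur; the sketch uses NumPy only so that it stays self-contained.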