Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment

Authors: Yizhi Song, Liu He, Zhifei Zhang, Soo Ye Kim, He Zhang, Wei Xiong, Zhe Lin, Brian Price, Scott Cohen, Jianming Zhang, Daniel Aliaga

ICLR 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments and comparisons demonstrate that our pipeline greatly pushes the boundary of fine details in the image synthesis models. Project page: https://song630.github.io/Refine-by-Align-Project-Page/" ... "Quantitative and qualitative comparisons (i.e., using several well-established metrics and a user study; Sec. 4.2, Sec. 4.3) show that in terms of detail and appearance preservation our model outperforms all six baseline models." |
| Researcher Affiliation | Collaboration | Yizhi Song¹, Liu He¹, Zhifei Zhang², Soo Ye Kim², He Zhang², Wei Xiong², Zhe Lin², Brian Price², Scott Cohen², Jianming Zhang², Daniel Aliaga¹ (¹Purdue University, ²Adobe Research) |
| Pseudocode | Yes | "Algorithm 1: Optimal Cross-Attention Alignment (refer to Fig. 4 for visualization)" |
| Open Source Code | No | Only a project page is provided: https://song630.github.io/Refine-by-Align-Project-Page/ |
| Open Datasets | Yes | "To provide insight on the appearance of generative artifacts and an effective evaluation of our artifacts refinement approach, we present Gen Artifact Bench, the first benchmark for reference-guided artifacts refinement (refer to the Appendix for examples), featuring: ..." "We use Pixabay (Song et al., 2023) with panoptic segmentation labels as the training dataset." "Our training data consists of Pixabay and MVObj, a manually annotated dataset where an object appears in multiple images with different contexts and views." |
| Dataset Splits | No | The paper mentions using Pixabay as a training dataset and Gen Artifact Bench for evaluation, but it does not specify training/validation/test split percentages, sample counts, or a methodology for data partitioning. |
| Hardware Specification | No | The paper does not report the hardware used for its experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions models and frameworks such as DINOv2, AnyDoor, IMPRINT, Stable Diffusion, and BLIP-2, but it does not give version numbers for any software libraries, programming languages, or other dependencies. |
| Experiment Setup | No | The paper states "Parameters. t = 0 and l = 9 are used in all the comparisons below." and notes that this combination was chosen by a grid search, but it does not report common training hyperparameters such as learning rate, batch size, number of epochs, or optimizer settings. |