Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment
Authors: Yizhi Song, Liu He, Zhifei Zhang, Soo Ye Kim, He Zhang, Wei Xiong, Zhe Lin, Brian Price, Scott Cohen, Jianming Zhang, Daniel Aliaga
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments and comparisons demonstrate that our pipeline greatly pushes the boundary of fine details in the image synthesis models. Project page: https://song630.github.io/Refine-by-Align-Project-Page/ 1 INTRODUCTION ... Quantitative and qualitative comparisons (i.e., using several well-established metrics and a user study; Sec. 4.2, Sec. 4.3) show that in terms of detail and appearance preservation our model outperforms all six baseline models |
| Researcher Affiliation | Collaboration | Yizhi Song1 Liu He1 Zhifei Zhang2 Soo Ye Kim2 He Zhang2 Wei Xiong2 Zhe Lin2 Brian Price2 Scott Cohen2 Jianming Zhang2 Daniel Aliaga1 1 Purdue University 2 Adobe Research |
| Pseudocode | Yes | Algorithm 1 Optimal Cross-Attention Alignment (refer to Fig. 4 for visualization) |
| Open Source Code | No | Project page: https://song630.github.io/Refine-by-Align-Project-Page/ |
| Open Datasets | Yes | To provide insight on the appearance of generative artifacts and an effective evaluation of our artifacts refinement approach, we present Gen Artifact Bench, the first benchmark for reference-guided artifacts refinement (refer to the Appendix for examples), featuring: We use Pixabay (Song et al., 2023) with panoptic segmentation labels as the training dataset. Our training data consists of Pixabay and MVObj, a manually annotated dataset where an object appears in multiple images with different contexts and views. |
| Dataset Splits | No | The paper mentions using 'Pixabay' as a training dataset and 'Gen Artifact Bench' for evaluation, but it does not specify any training, validation, or test split percentages, sample counts, or methodology for data partitioning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions several models and frameworks like DINOv2, AnyDoor, IMPRINT, Stable Diffusion, and BLIP2, but it does not specify version numbers for any software libraries, programming languages, or other dependencies. |
| Experiment Setup | No | The paper states: "Parameters. t = 0 and l = 9 are used in all the comparisons below." and discusses an optimal combination of these parameters derived from a grid search. However, it does not provide common training hyperparameters such as learning rate, batch size, number of epochs, or optimizer settings for the neural network training. |