Training-and-Prompt-Free General Painterly Harmonization via Zero-Shot Disentanglement on Style and Content References

Authors: Teng-Fang Hsiao, Bo-Kai Ruan, Hong-Han Shuai

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments demonstrate the efficacy of our method in all benchmarks."
Researcher Affiliation | Academia | National Yang Ming Chiao Tung University, Taiwan
Pseudocode | No | "The overall algorithm and visualization can be found in the Appendix."
Open Source Code | Yes | Code: https://github.com/BlueDyee/TF-GPH
Open Datasets | Yes | "We generalize the computational metrics and benchmarks from various image editing methods, including painterly image harmonization, prompt-based image composition, and style transfer... and aim to mitigate the shortcomings of existing benchmarks such as WikiArt combined with COCO (Tan et al. 2019; Lin et al. 2014) and the TF-ICON benchmark (Lu, Liu, and Kong 2023)."
Dataset Splits | No | "Details and experiments on these datasets can be found in the Appendix."
Hardware Specification | No | No specific hardware details such as GPU/CPU models or processor types are mentioned in the paper.
Software Dependencies | No | "We employ the Stable Diffusion model (Rombach et al. 2022) as the pretrained backbone and utilize DPM-Solver++ as the scheduler with a total of 25 steps for both inversion and reconstruction."
Experiment Setup | Yes | "Experiments Setup. We employ the Stable Diffusion model (Rombach et al. 2022) as the pretrained backbone and utilize DPM-Solver++ as the scheduler with a total of 25 steps for both inversion and reconstruction. Specifically, we first resize the input images I_f, I_b, and I_c to 512×512 and encode them into the corresponding latents z_0^f, z_0^b, and z_0^c. Afterward, we take these latents with the prompt embedding ρ_exceptional as input during both the inversion and reconstruction stages... Without loss of generality, we place a higher tendency on the style reference z^b by setting β to 1.1, and a minor preference on content preservation related to z^f by setting α to 0.9."
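The α/β weighting quoted above can be illustrated with a minimal sketch: a toy single-head attention in which keys and values from a content reference and a style reference are shared, with content logits scaled by α = 0.9 and style logits by β = 1.1. This is purely an illustrative assumption about how such preference weights could enter shared attention; the function name, shapes, and structure here are hypothetical and are not the authors' implementation.

```python
import numpy as np

ALPHA = 0.9  # weaker preference on content-related features (hypothetical use)
BETA = 1.1   # stronger preference on style-reference features (hypothetical use)

def shared_attention(q, k_f, v_f, k_b, v_b, alpha=ALPHA, beta=BETA):
    """Toy shared attention: keys/values from a content reference (k_f, v_f)
    and a style reference (k_b, v_b) are concatenated, with the content
    attention logits scaled by alpha and the style logits by beta."""
    d = q.shape[-1]
    logits_f = alpha * (q @ k_f.T) / np.sqrt(d)   # content contribution
    logits_b = beta * (q @ k_b.T) / np.sqrt(d)    # style contribution
    logits = np.concatenate([logits_f, logits_b], axis=-1)
    # numerically stable softmax over the combined key axis
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    v = np.concatenate([v_f, v_b], axis=0)
    return weights @ v

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
out = shared_attention(
    q,
    rng.normal(size=(6, 8)), rng.normal(size=(6, 8)),  # content keys/values
    rng.normal(size=(6, 8)), rng.normal(size=(6, 8)),  # style keys/values
)
print(out.shape)  # (4, 8)
```

Setting β > 1 tilts the softmax toward the style reference's keys, while α < 1 correspondingly softens the content contribution, matching the stated tendency toward the style reference with minor content preservation.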