Training-free Fourier Phase Diffusion for Style Transfer

Authors: Siyuan Zhang, Wei Ma, Libin Liu, Zheng Li, Hongbin Zha

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experimental results demonstrate that our method outperforms state-of-the-art models in both content preservation and stylization." Section 5 is titled "Experiments" and includes "Implementation Details", "Qualitative Comparison", "User Study", "Quantitative Comparison", and "Ablation Study".
Researcher Affiliation | Academia | 1 College of Computer Science, Beijing University of Technology; 2 Key Laboratory of Machine Perception (MOE), School of IST, Peking University.
Pseudocode | No | The paper describes the method using figures (Figure 3: overall pipeline of the proposed model; Figure 4: the phase fusion module) and text, but provides no explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/zhang2002forwin/Fourier-Phase-Diffusion-for-Style-Transfer.
Open Datasets | No | The paper refers to the content images and style text descriptions used in experiments (e.g., "Participants were presented with 25 groups of results, each paired with the corresponding content image and style text description."), but it does not provide access information (links, DOIs, or citations) for them. It mentions using Stable Diffusion XL 1.0 as a foundational model, but no evaluation dataset.
Dataset Splits | No | "Our method is training-free and does not require fine-tuning." The paper describes no dataset splits for evaluation, as it focuses on training-free style transfer using pre-trained models.
Hardware Specification | Yes | "All experiments are conducted on a single RTX 3090 GPU."
Software Dependencies | No | The paper mentions using "Stable Diffusion XL 1.0 as the foundational text-to-image model" and "the DDIM sampler", but gives no version numbers for these or for any other software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | "The phase fusion module is applied during the early sampling stage, specifically at the 2nd, 5th, and 8th timesteps, with the parameters set to α = 0.5 and β = 0.7. We utilize Stable Diffusion XL 1.0 as the foundational text-to-image model and employ the DDIM sampler with 30 sampling steps for each stylized image generation."
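The reported setup (Fourier phase fusion applied at early DDIM timesteps 2, 5, and 8, with α = 0.5 and β = 0.7) can be sketched in NumPy. The paper does not state the fusion formula, so the blend below — weighting the content phase by α and the style amplitude by β — is a hypothetical illustration of a Fourier phase-fusion step, not the authors' actual module; `phase_fusion` and its parameter roles are assumptions.

```python
import numpy as np

def phase_fusion(content_feat, style_feat, alpha=0.5, beta=0.7):
    """Hypothetical Fourier phase fusion of two 2-D feature maps.

    Assumed roles (not from the paper): alpha weights the content
    phase against the style phase; beta weights the style amplitude
    against the content amplitude.
    """
    Fc = np.fft.fft2(content_feat)   # content spectrum
    Fs = np.fft.fft2(style_feat)     # style spectrum
    # Blend amplitudes (style-dominant) and phases (content-dominant).
    amp = (1.0 - beta) * np.abs(Fc) + beta * np.abs(Fs)
    phase = alpha * np.angle(Fc) + (1.0 - alpha) * np.angle(Fs)
    fused = amp * np.exp(1j * phase)
    return np.fft.ifft2(fused).real  # back to the spatial domain

# In a DDIM loop with 30 steps, such a module would only fire at the
# early timesteps named in the paper, e.g. `if t in {2, 5, 8}: ...`.
```

Fusing a feature map with itself reproduces it exactly (amplitude and phase blends both collapse to the original spectrum), which is a quick sanity check on the formula.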