Training-free Fourier Phase Diffusion for Style Transfer
Authors: Siyuan Zhang, Wei Ma, Libin Liu, Zheng Li, Hongbin Zha
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate that our method outperforms state-of-the-art models in both content preservation and stylization. Section 5 is titled "Experiments" and includes "Implementation Details", "Qualitative Comparison", "User Study", "Quantitative Comparison", and "Ablation Study". |
| Researcher Affiliation | Academia | 1College of Computer Science, Beijing University of Technology; 2Key Laboratory of Machine Perception (MOE), School of IST, Peking University |
| Pseudocode | No | The paper describes the method using figures (Figure 3: Overall pipeline of the proposed model, Figure 4: Our phase fusion module) and text, but no explicit pseudocode or algorithm blocks are provided. |
| Open Source Code | Yes | Code is available at https://github.com/zhang2002forwin/Fourier-Phase-Diffusion-for-Style-Transfer. |
| Open Datasets | No | The paper refers to content images and style text descriptions used in experiments (e.g., 'Participants were presented with 25 groups of results, each paired with the corresponding content image and style text description.'), but it does not provide specific access information (links, DOIs, citations) for these datasets. It mentions using Stable Diffusion XL 1.0 as a foundational model, but not a dataset for evaluation. |
| Dataset Splits | No | Our method is training-free and does not require fine-tuning. The paper does not describe any specific dataset splits for evaluation, as it focuses on training-free style transfer using pre-trained models. |
| Hardware Specification | Yes | All experiments are conducted on a single RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions using 'Stable Diffusion XL 1.0 as the foundational text-to-image model' and 'the DDIM sampler', but does not provide specific version numbers for these or any other software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The phase fusion module is applied during the early sampling stage, specifically at the 2nd, 5th, and 8th timesteps, with the parameters set to α = 0.5 and β = 0.7. We utilize Stable Diffusion XL 1.0 as the foundational text-to-image model and employ the DDIM sampler with 30 sampling steps for each stylized image generation. |
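The setup row above mentions a phase fusion module applied at selected early DDIM timesteps with blending parameters α = 0.5 and β = 0.7. The paper provides no pseudocode, so the following is only a minimal NumPy sketch of what a Fourier phase fusion step *could* look like: it blends the content latent's phase spectrum into the style latent's while keeping the style amplitude, then mixes the result back with the original latent. The function name `phase_fusion` and the exact roles of `alpha` (phase blending) and `beta` (residual mixing) are assumptions, not the authors' implementation.

```python
import numpy as np

def phase_fusion(content_latent, style_latent, alpha=0.5, beta=0.7):
    """Hypothetical sketch of Fourier phase fusion between two latents.

    content_latent, style_latent: arrays of shape (C, H, W).
    alpha: assumed weight for blending the content phase into the style phase.
    beta: assumed weight for mixing the fused result back into the style latent.
    """
    # 2-D FFT over the spatial axes of each latent
    f_content = np.fft.fft2(content_latent, axes=(-2, -1))
    f_style = np.fft.fft2(style_latent, axes=(-2, -1))

    # Keep the style amplitude; blend the two phase spectra
    amp_style = np.abs(f_style)
    fused_phase = alpha * np.angle(f_content) + (1 - alpha) * np.angle(f_style)

    # Recombine amplitude and fused phase, return to the spatial domain
    fused = np.real(np.fft.ifft2(amp_style * np.exp(1j * fused_phase),
                                 axes=(-2, -1)))

    # Residual mix with the original style latent
    return beta * fused + (1 - beta) * style_latent
```

In the paper's pipeline this operation would be invoked only at the 2nd, 5th, and 8th of the 30 DDIM sampling steps; the sketch above covers a single such fusion call, not the full sampling loop.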