Diffusion-RainbowPA: Improvements Integrated Preference Alignment for Diffusion-based Text-to-Image Generation

Authors: Haoyuan Sun, Bin Liang, Bo Xia, Jiaqi Wu, Yifei Zhao, Kai Qin, Yongzhe Chang, Xueqian Wang

TMLR 2025

Reproducibility Assessment — each entry lists the variable, the assessed result, and the supporting evidence:
Research Type: Experimental. Evidence: "With a comprehensive evaluation and comparison of alignment performance, it is demonstrated that Diffusion-RainbowPA outperforms current state-of-the-art methods. We also conduct ablation studies on the introduced components, which reveal that incorporating each positively enhances alignment performance."
Researcher Affiliation: Academia. Evidence: Haoyuan Sun (Tsinghua University), Bin Liang (University of Technology Sydney), Bo Xia (Tsinghua University), Jiaqi Wu (Tsinghua University), Yifei Zhao (Tsinghua University), Kai Qin (Tsinghua University), Yongzhe Chang (Tsinghua University), Xueqian Wang (Tsinghua University).
Pseudocode: No. Evidence: The paper describes its methods using mathematical formulations and prose but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code: No. Evidence: The paper does not provide an explicit statement about releasing its own source code, nor a link to a code repository for the described methodology. It mentions using an "open-source step-aware preference model," but that refers to a third-party artifact, not the authors' own code.
Open Datasets: Yes. Evidence: "In selecting the training dataset, to ensure a fair comparison, we adopt the same dataset utilized by SPO (Liang et al., 2024), which consists of 4K randomly chosen prompts from the Pick-a-Pic V1 dataset. ... on four zero-shot datasets: GenEval (Ghosh et al., 2024), T2I-CompBench++ (Huang et al., 2025), GenAI-Bench (Li et al., 2024a), and DPG-Bench (Hu et al., 2024)."
Dataset Splits: No. Evidence: The paper mentions training on 4K randomly chosen prompts from the Pick-a-Pic V1 dataset and evaluating on four zero-shot datasets, but it does not specify explicit training/validation/test splits (e.g., percentages or exact counts) or describe how the data were partitioned, so the partitioning cannot be reproduced.
Hardware Specification: Yes. Evidence: "In this study, the experiments are conducted on a machine equipped with 4 NVIDIA A100-PCIE-40GB GPUs. ... on consumer-grade graphics cards, specifically utilizing a machine equipped with 4 NVIDIA GeForce RTX 3090 GPUs (each with 24GB of memory)."
Software Dependencies: No. Evidence: The paper does not provide specific version numbers for any software libraries, frameworks, or solvers used in the experiments.
Experiment Setup: Yes. Evidence: "Hyperparameters. We simultaneously set all terms in Equation (12) to share β = 10, corresponding to the SPO condition. Based on the tuning results reported in (Sun et al., 2025d), the positive enhancement intensity λ in Equation (11) is set to 100 and the threshold to log 0.9; for the MSPA term, the margin strengthening intensity η is empirically set to 0.5."
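As a minimal sketch of how the reported hyperparameters could be pinned down for reproduction, the values quoted above can be collected into a single configuration object. The class and field names below are hypothetical (the paper does not name them); only the numeric values — β = 10, λ = 100, threshold = log 0.9, η = 0.5 — come from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RainbowPAHyperparams:
    """Hyperparameters reported for Diffusion-RainbowPA (names hypothetical)."""
    # Shared strength beta for all terms in the paper's Equation (12),
    # matching the SPO condition.
    beta: float = 10.0
    # Positive enhancement intensity lambda in the paper's Equation (11).
    positive_enhancement_lambda: float = 100.0
    # Threshold used with the positive-enhancement term, set to log 0.9.
    threshold: float = math.log(0.9)
    # Margin strengthening intensity eta for the MSPA term.
    mspa_eta: float = 0.5


params = RainbowPAHyperparams()
print(params.threshold)  # log 0.9 ≈ -0.105
```

A frozen dataclass makes the configuration immutable and hashable, so the same object can be logged alongside results to document exactly which settings produced a run.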