Fair Text-to-Image Diffusion via Fair Mapping

Authors: Jia Li, Lijie Hu, Jingfeng Zhang, Tianhang Zheng, Hua Zhang, Di Wang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "With comprehensive experiments on face image generation, we show that our method significantly improves image generation fairness with almost the same image quality compared to conventional diffusion models when prompted with descriptions related to humans." "Comprehensive Experimental Evaluations: Finally, we conduct comprehensive experiments of our methods to ensure our generated images fairness and quality."
Researcher Affiliation | Academia | (1) Provable Responsible AI and Data Analytics (PRADA) Lab; (2) King Abdullah University of Science and Technology; (3) SDAIA-KAUST; (4) Institute of Information Engineering, Chinese Academy of Sciences; (5) University of Auckland; (6) The State Key Laboratory of Blockchain and Data Security, Zhejiang University; (7) Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security
Pseudocode | No | The paper describes methods using text and mathematical equations, such as equations (1) through (7), and a diagram (Figure 3) illustrating the training and inference stages. It does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about providing source code, nor does it include a link to a code repository in the main text or references.
Open Datasets | Yes | "We select a total of 150 occupations and 20 emotions for the fair human face image generation following [Ning, Li, and Jianlin Su 2023]. For sensitive groups, we choose gender groups (male and female), racial groups (Black, Asian, White, Indian) and Age (young, middle age, old) provided by [Kärkkäinen and Joo 2021]."
Dataset Splits | No | The paper states: "Standardized prompts are used for occupations and emotions. More details are in Appendix." However, it does not provide specific details on dataset splits (e.g., percentages or counts for training, validation, and testing) within the main body of the paper.
Hardware Specification | Yes | "On an Nvidia V100, our debiasing method trains on 150 occupations in just 50 minutes, demonstrating impressive efficiency."
Software Dependencies | Yes | "In our experiments, we use stable Diffusion v1.5 [Rombach, Blattmann, and Dominik Lorenz 2022] as the base model and implement 50 DDIM denoising steps for generation."
Experiment Setup | Yes | "In our experiments, we use stable Diffusion v1.5 [Rombach, Blattmann, and Dominik Lorenz 2022] as the base model and implement 50 DDIM denoising steps for generation. Standardized prompts are used for occupations and emotions."