LIVS: A Pluralistic Alignment Dataset for Inclusive Public Spaces

Authors: Rashid Mushkani, Shravan Nayak, Hugo Berard, Allison Cohen, Shin Koseki, Hadrien Bertrand

ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | We introduce the Local Intersectional Visual Spaces (LIVS) dataset, a benchmark for multi-criteria alignment, developed through a two-year participatory process with 30 community organizations to support the pluralistic alignment of text-to-image (T2I) models in inclusive urban planning. The dataset encodes 37,710 pairwise comparisons across 13,462 images, structured along six criteria (Accessibility, Safety, Comfort, Invitingness, Inclusivity, and Diversity) derived from 634 community-defined concepts. Using Direct Preference Optimization (DPO), we fine-tune Stable Diffusion XL to reflect multi-criteria spatial preferences and evaluate the LIVS dataset and the fine-tuned model through four case studies.
Researcher Affiliation | Academia | 1 Université de Montréal, 2 Mila - Quebec AI Institute. Correspondence to: Rashid Mushkani <EMAIL>.
Pseudocode | Yes | Algorithm 1: Selecting the 4 Most Diverse Images Using CLIP Similarity Scores
Open Source Code | No | The paper mentions that "An open-source annotation platform was developed to facilitate this process" but does not provide a specific link to the source code for this platform or for the main methodology (DPO fine-tuning, model evaluation).
Open Datasets | Yes | The LIVS dataset, including citizen-provided self-identification markers (with consent), is available for research purposes at mid-space.one. This release aims to establish a benchmark for pluralistic alignment in text-to-image generation and supports both criterion-specific and user-specific customization.
Dataset Splits | Yes | We collected 35,510 multi-criteria preference annotations, each covering one to three criteria, to fine-tune a Stable Diffusion XL model using Direct Preference Optimization (DPO) (Wallace et al., 2023; Rafailov et al., 2024). We then tested the fine-tuned model with 2,200 additional annotations, comparing it to the baseline model.
Hardware Specification | Yes | Hardware: Single NVIDIA A100 80GB GPU
Software Dependencies | No | The paper mentions using "Stable Diffusion XL", "Direct Preference Optimization (DPO)", and "GPT-4o", but does not provide specific version numbers for these or for other software libraries or dependencies used in the implementation.
Experiment Setup | Yes | We fine-tuned Stable Diffusion XL using Direct Preference Optimization (DPO; Rafailov et al. 2024), closely following the original hyperparameters: Batch Size: 64; Learning Rate: 1 × 10^-8 with 20% linear warmup; Beta (β): 5,000; Training Steps: 500 for smaller subsets, 1,500 when combining the entire preference dataset.
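The paper's Algorithm 1 selects the 4 most diverse images using CLIP similarity scores. The paper does not reproduce the algorithm here, but a common way to do this is greedy farthest-point selection over CLIP image embeddings; the sketch below illustrates that heuristic on random vectors standing in for CLIP features (the paper's exact procedure may differ).

```python
import numpy as np

def select_diverse(embeddings: np.ndarray, k: int = 4) -> list[int]:
    """Greedily pick k items whose pairwise cosine similarity is low.

    A sketch of farthest-point selection; the paper's Algorithm 1
    may differ in seeding or tie-breaking.
    """
    # Normalize rows so dot products are cosine similarities.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = emb @ emb.T
    # Seed with the image least similar to all others on average.
    selected = [int(np.argmin(sim.mean(axis=1)))]
    while len(selected) < k:
        # Each candidate's worst-case similarity to the chosen set.
        max_sim = sim[:, selected].max(axis=1)
        max_sim[selected] = np.inf  # never re-pick a chosen image
        selected.append(int(np.argmin(max_sim)))
    return selected

# Toy demo: 12 random 512-d vectors in place of real CLIP embeddings.
rng = np.random.default_rng(0)
fake_clip = rng.normal(size=(12, 512))
picks = select_diverse(fake_clip, k=4)
print(picks)
```

In practice the embeddings would come from a CLIP image encoder; the greedy step keeps each new pick maximally dissimilar from everything already selected.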
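The DPO objective used to fine-tune the diffusion model (Wallace et al., 2023) compares the policy's denoising error against a frozen reference model on the preferred versus rejected image. The sketch below shows the core pairwise loss on scalar per-sample errors, omitting the timestep weighting from the full Diffusion-DPO formulation; the function and variable names are illustrative, not from the paper.

```python
import math

def log_sigmoid(x: float) -> float:
    # Numerically stable log(sigmoid(x)); with beta = 5,000 a naive
    # exp() would overflow for even small error margins.
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))

def diffusion_dpo_loss(err_w_theta: float, err_w_ref: float,
                       err_l_theta: float, err_l_ref: float,
                       beta: float = 5000.0) -> float:
    """Pairwise Diffusion-DPO loss on noise-prediction errors.

    err_*_theta: policy model's squared denoising error on the
    preferred (w) / rejected (l) image; err_*_ref: same for the
    frozen reference model. Timestep weighting is omitted here.
    """
    margin = (err_w_theta - err_w_ref) - (err_l_theta - err_l_ref)
    return -log_sigmoid(-beta * margin)

# Improving on the preferred image relative to the reference
# (and worsening on the rejected one) should lower the loss.
good = diffusion_dpo_loss(0.0100, 0.0101, 0.0102, 0.0101)
bad = diffusion_dpo_loss(0.0102, 0.0101, 0.0100, 0.0101)
print(good, bad)
```

The large β (5,000) reflects that per-pixel denoising errors differ only slightly between policy and reference, so the margin must be scaled up before the sigmoid.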
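The reported schedule is a learning rate of 1 × 10^-8 with 20% linear warmup. A minimal sketch of that schedule follows; holding the rate constant after warmup is an assumption, since the paper only states the warmup fraction.

```python
def lr_at_step(step: int, total_steps: int = 500,
               base_lr: float = 1e-8, warmup_frac: float = 0.20) -> float:
    """Linear warmup to base_lr over the first 20% of steps.

    After warmup the rate is held constant (an assumption; the paper
    specifies only '20% linear warmup').
    """
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr

# With the 500-step setting, warmup covers the first 100 steps.
print(lr_at_step(0), lr_at_step(50), lr_at_step(250))
```

The same function covers the 1,500-step run on the full preference dataset by passing `total_steps=1500`, which stretches warmup to the first 300 steps.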