Position: Rethinking LLM Bias Probing Using Lessons from the Social Sciences
Authors: Kirsten Morehouse, Siddharth Swaroop, Weiwei Pan
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Overall, the paper has four key contributions. 1. We review key psychological methods for studying human bias and examine how these approaches have been adapted for detecting bias in LLMs. In doing so, we show how theories from psychology can improve LLM social bias probing. 2. We examine existing taxonomies for LLM bias probes and highlight their limitations. 3. We introduce EcoLevels, a novel framework with two components: (a) ecological validity (i.e., the degree a probe aligns with the target task; see Fig. 2) and (b) the level at which bias is probed. We demonstrate how EcoLevels enables systematic bias probe selection and generates testable predictions about bias generalization. 4. We apply our framework to the domain of gender-occupation bias to demonstrate its practical utility in (a) determining appropriate probes, (b) reconciling conflicting findings, and (c) clarifying bias boundary conditions. We conclude by summarizing the five lessons that underpin our work and outlining our hopes for this research area. |
| Researcher Affiliation | Academia | (1) Department of Psychology, Harvard University, Cambridge, MA, USA; (2) Department of Computer Science, Harvard University, Cambridge, MA, USA. Correspondence to: Kirsten Morehouse <EMAIL>. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. It primarily presents a theoretical framework and discussions. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide links to a code repository. |
| Open Datasets | Yes | Word Embedding Association Test (WEAT) (Caliskan et al., 2017); WinoBias (Zhao et al., 2018); Bias in Bios (De-Arteaga et al., 2019); BOLD (Dhamala et al., 2021); BBQ (Parrish et al., 2022); CrowS-Pairs (Nangia et al., 2020). |
| Dataset Splits | No | The paper is theoretical and proposes a framework; it does not conduct its own experiments or specify dataset splits. |
| Hardware Specification | No | The paper is theoretical and proposes a framework; it does not conduct its own experiments and therefore does not specify hardware details. |
| Software Dependencies | No | The paper is theoretical and proposes a framework; it does not conduct its own experiments and therefore does not specify software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and proposes a framework; it does not conduct its own experiments or provide details about an experimental setup. |