Investigating the Effects of Fairness Interventions Using Pointwise Representational Similarity
Authors: Camila Kolling, Till Speicher, Vedant Nanda, Mariya Toneva, Krishna P. Gummadi
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we introduce Pointwise Normalized Kernel Alignment (PNKA), a pointwise representational similarity measure that addresses these limitations by measuring how debiasing measures affect the intermediate representations of individuals. On tabular data, the use of PNKA reveals previously unknown insights: while group fairness predominantly influences a small subset of the population, maintaining high representational similarity for the majority, individual fairness constraints uniformly impact representations across the entire population, altering nearly every data point. We show that by evaluating representations using PNKA, we can reliably predict the behavior of ML models trained on these representations. Moreover, applying PNKA to language embeddings shows that existing debiasing methods may not perform as intended, failing to remove biases from stereotypical words and sentences. |
| Researcher Affiliation | Academia | Camila Kolling EMAIL MPI-SWS Till Speicher EMAIL MPI-SWS Vedant Nanda EMAIL MPI-SWS Mariya Toneva EMAIL MPI-SWS Krishna P. Gummadi EMAIL MPI-SWS |
| Pseudocode | No | The paper describes mathematical formulas and concepts (e.g., PNKA formula in Section 2.2) and outlines methodologies in paragraph form. However, it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present any structured, code-like steps for its procedures. |
| Open Source Code | No | The paper mentions 'More information can be obtained in https://github.com/Trusted-AI/AIF360.' in footnote 4, but this refers to a third-party toolkit (AIF360) used for dataset preprocessing, not the authors' own source code for the PNKA methodology or their experiments. There is no explicit statement from the authors about releasing their own code or a direct link to a repository containing their implementation. |
| Open Datasets | Yes | We use the COMPAS (Larson et al., 2016) dataset, debiased for race, and the Adult (Becker & Kohavi, 1996) dataset, debiased for gender. ... We report the analysis on CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009). ... We use PNKA to measure similarity between the original GloVe (baseline) and the debiased versions of GloVe embeddings, i.e., (GN-)GloVe (Zhao et al., 2018), Gender Preserving (GP-)GloVe and GP-GN-GloVe (Kaneko & Bollegala, 2019). ... We focused on a debiasing method that aims to remove gender bias from contextual representations of stereotypical words ... Specifically the SEAT-7 and SEAT-8 datasets (May et al., 2019). |
| Dataset Splits | No | The paper mentions specific datasets like COMPAS, Adult, CIFAR-10/100, and SEAT-7/8, and Section A.1 discusses the datasets, but it does not state explicit train/validation/test split sizes, percentages, or the procedure used to construct the splits. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA A100), CPU models (e.g., Intel Xeon), or other processor types and memory amounts used for running the experiments. It lacks any concrete specifications of the computational resources employed. |
| Software Dependencies | No | The paper mentions using specific models like 'Albert (albert-base-v2) model' in Section D. However, it does not provide specific version numbers for any key software components, libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages that would be necessary to replicate the experiments. |
| Experiment Setup | No | The paper describes the objectives of the learning algorithms (e.g., 'optimizing a loss function that maintains as much utility as possible, while removing information about protected attributes' and 'classification accuracy (denoted by us as utility), statistical parity (to achieve group fairness), and data loss (as a proxy for achieving individual fairness)') and mentions training 'logistic regression models' in Section 3.3. However, it does not provide any specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations (e.g., optimizer type and settings, model initialization, dropout rate) in the main text. |
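Since the paper releases no code (see "Open Source Code" above), the following is a minimal NumPy sketch of a pointwise representational similarity measure in the spirit of PNKA. It assumes the common construction for such measures: for each data point, compare the row of that point in the pairwise cosine-similarity matrix of one representation against the corresponding row from the other representation. The function name `pointwise_similarity` and the exact normalization are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def pointwise_similarity(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Per-point similarity between two representations X, Y of shape (n, d).

    Sketch of a PNKA-style measure (assumed construction, not the paper's code):
    each point's score is the cosine similarity between its row in the
    pairwise-similarity matrix of X and its row in that of Y.
    Returns an array of n scores in [-1, 1]; 1 means the point's relationship
    to all other points is preserved across the two representations.
    """
    # Row-normalize so inner products below are cosine similarities.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    Kx = Xn @ Xn.T  # (n, n) similarity structure under representation X
    Ky = Yn @ Yn.T  # (n, n) similarity structure under representation Y
    # Cosine similarity between corresponding rows -> one score per point.
    num = np.sum(Kx * Ky, axis=1)
    den = np.linalg.norm(Kx, axis=1) * np.linalg.norm(Ky, axis=1)
    return num / den
```

Under this construction, a debiasing method that leaves most points' neighborhood structure intact (as the paper reports for group-fairness interventions) would yield scores near 1 for the majority and low scores for the affected subset, while an intervention that perturbs every point (as reported for individual-fairness constraints) would depress scores across the board.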