Counterfactual Situation Testing: From Single to Multidimensional Discrimination
Authors: Jose M. Alvarez, Salvatore Ruggieri
JAIR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the CST framework on synthetic and real ADM datasets. We use a k-nearest neighbor implementation of the framework, k-NN CST, to compare it to its situation testing counterpart, k-NN ST, by Thanh et al. (2011). Our experiments show that CST uncovers a higher number of cases than ST, even when the model is counterfactually fair. |
| Researcher Affiliation | Academia | Jose M. Alvarez (EMAIL), Department of Computer Science, KU Leuven, 3001 Leuven, Belgium; Salvatore Ruggieri (EMAIL), Department of Computer Science, University of Pisa, 56126 Pisa, Italy |
| Pseudocode | Yes | Algorithm 1 reports the pseudo-code of the k-NN CST w/o algorithm. The pseudo-code is self-explanatory. After selecting the control and test search space (lines 1-2) as stated in Definition 4.1, the algorithm iterates over the protected instances. |
| Open Source Code | Yes | The code is available in this repository: https://github.com/cc-jalvarez/counterfactual-situation-testing. |
| Open Datasets | Yes | We use US data from the Law School Admission Council survey (Wightman, 1998), and recreate an admissions scenario for a top US law school. |
| Dataset Splits | No | The paper describes generating synthetic data for n = 5000 and using the LSAC dataset with n = 21790 applicants. However, it does not specify explicit training, validation, or test splits for evaluating the proposed CST method. Instead, CST is applied to the entire dataset of classifier decisions to detect individual discrimination cases using k-nearest neighbors. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as CPU, GPU models, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions implementing k-NN CST and referring to other methods, but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, specific libraries and their versions). |
| Experiment Setup | Yes | We use a significance level of α = 0.05, an accepted deviation of τ = 0.0, and the neighborhood sizes of k ∈ {15, 30, 50, 100, 250}. We define b() as Ŷ = 1{X1 + 5·X2 > $225000}. |
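
The experiment setup quoted in the last row can be written down as a small configuration plus the decision rule b(). The following is a minimal sketch, not the authors' code: the constant names (`ALPHA`, `TAU`, `K_VALUES`, `THRESHOLD`) and the vectorized NumPy form are assumptions made for illustration, and the reading of the rule as Ŷ = 1{X1 + 5·X2 > 225000} with a strict inequality is taken from the quoted text.

```python
import numpy as np

# Settings as quoted in the Experiment Setup row (names are hypothetical).
ALPHA = 0.05                      # significance level
TAU = 0.0                         # accepted deviation
K_VALUES = (15, 30, 50, 100, 250) # neighborhood sizes
THRESHOLD = 225_000               # cutoff in the decision rule b()


def b(x1, x2):
    """Decision rule: return 1 where x1 + 5 * x2 > THRESHOLD, else 0.

    Accepts scalars or array-likes of equal length; the comparison is
    strict, so a score exactly at the threshold yields 0.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return (x1 + 5 * x2 > THRESHOLD).astype(int)


# Example: two hypothetical instances, scored 200000 and 250000.
decisions = b([100_000, 200_000], [20_000, 10_000])
```

Here `decisions` holds one binary outcome per instance; an analysis following the quoted setup would then be repeated for each value in `K_VALUES`.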