Transport-based Counterfactual Models
Authors: Lucas De Lara, Alberto González-Sanz, Nicholas Asher, Laurent Risser, Jean-Michel Loubes
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Sections 6, 7 and 8, we illustrate the practicality of our approach for fairness in machine learning. We apply the mass-transportation viewpoint of structural counterfactuals by recasting the counterfactual fairness criterion (Kusner et al., 2017) into a transport-like one. Then, we propose new causality-free criteria by substituting the causal model by transport-based models in the original criterion. Finally, we address the training of counterfactually fair classifiers and predictors, providing statistical guarantees and numerical experiments over various data sets. In this section, we present the implementation of our counterfactually fair learning procedure on real data, and show that it has the expected behaviour. |
| Researcher Affiliation | Academia | Lucas De Lara lucas.de EMAIL Institut de Mathématiques de Toulouse, Université Paul Sabatier, Toulouse, France; Alberto González-Sanz EMAIL Department of Statistics, Columbia University, New York, United States; Nicholas Asher EMAIL Institut de Recherche en Informatique de Toulouse, CNRS, Toulouse, France; Laurent Risser EMAIL Institut de Mathématiques de Toulouse, CNRS, Toulouse, France; Jean-Michel Loubes EMAIL Institut de Mathématiques de Toulouse, Université Paul Sabatier, Toulouse, France |
| Pseudocode | No | The paper describes methods and procedures in narrative text and mathematical formulations. It does not include any clearly labeled 'Pseudocode', 'Algorithm', or structured code blocks. |
| Open Source Code | Yes | The code is available at https://github.com/lucasdelara/PI-Fair. |
| Open Datasets | Yes | The Adult Data Set from the UCI Machine Learning Repository (Dua and Graff, 2019) has become a gold reference data set to evaluate and benchmark fairness frameworks. The Communities and Crimes data set can also be found in the UCI Machine Learning Repository (Dua and Graff, 2019). We follow Kusner et al. (2017) and try to predict the risk of recidivism while avoiding discrimination on the basis of race, using the same data. This is the data set used in Section 4.4.1, gathering statistics from 163 US law schools and more than 20,000 students. Here again we follow Kusner et al. (2017). |
| Dataset Splits | Yes | We divide it into a training set of size n_train = 32,724 and a testing set of size n_test = 16,118. Finally, we divide the data into a training set of size n_train = 4,120 and a testing set of size n_test = 2,030. All in all, we have d = 2 features excluding the outcome and the protected attributes, and work with n_train = 13,109 training entries and n_test = 6,458 testing entries. After processing the 128 numerical and categorical attributes composing the data set, we obtain d + 1 = 98 features over n_train = 1,335 training instances and n_test = 659 testing instances. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. It describes the datasets, models, and training procedures but omits hardware specifications. |
| Software Dependencies | No | In practice, we rely on the Python Optimal Transport (POT) library to compute an approximation of the mapping from data (Flamary et al., 2021). This mention of the POT library does not include a specific version number. No other software with version numbers is mentioned. |
| Experiment Setup | Yes | For a given counterfactual model Π := (π_{s'\|s})_{s,s'∈S} and a given weight λ > 0, we define the following expected risk on the predictors... The regularization weight λ successively takes all values in the grid {10^{-4}, 10^{-3.5}, ..., 10^{1}}. We repeat the training and evaluation of our models and the baselines over 10 repetitions for every data set. For classification tasks, we consider logistic models; for regression tasks, we consider linear regression models. In the classification setting we set ϵ = 0, while in the regression setting we work with ϵ = (1/2)E[\|Y − Y′\|], where Y′ is an independent copy of Y. As the empirical counterfactual models we use are non-deterministic (although their continuous counterparts may be deterministic), we set δ = 0.1 regardless of the prediction task. |
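To make the transport-based counterfactual model concrete, the sketch below computes an empirical optimal-transport coupling between two protected groups and maps each point of one group to its counterfactual in the other. This is a minimal illustration, not the authors' code: the paper relies on the POT library, whereas here, for equal-size uniformly weighted samples, the optimal coupling reduces to an assignment problem solved with SciPy; the group samples `Xs` and `Xt` are synthetic placeholders.

```python
# Minimal sketch (not the authors' implementation) of an empirical
# optimal-transport counterfactual map between two protected groups.
# For n uniformly weighted points on each side, the optimal coupling
# is a permutation, found with the Hungarian algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n = 50
Xs = rng.normal(0.0, 1.0, size=(n, 2))  # illustrative features of group s
Xt = rng.normal(1.0, 1.0, size=(n, 2))  # illustrative features of group s'

# Squared-Euclidean cost matrix between the two empirical samples
C = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(axis=-1)

# Optimal permutation coupling: row i of Xs is matched to col[i] of Xt
row, col = linear_sum_assignment(C)

# Counterfactual of each instance in group s under the transport map
counterfactuals = Xt[col]
print(counterfactuals.shape)  # (50, 2)
```

With unequal group sizes or non-uniform weights, one would instead solve the full transport problem (e.g. `ot.emd` in POT) and take a barycentric projection of the coupling, which is the kind of empirical, generally non-deterministic counterfactual model the setup above refers to.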