Optimized Score Transformation for Consistent Fair Classification

Authors: Dennis Wei, Karthikeyan Natesan Ramamurthy, Flavio P. Calmon

JMLR 2021

Reproducibility assessment — each variable is listed with its result and the supporting LLM response:
Research Type: Experimental — Comprehensive experiments comparing the Fair Score Transformer (FST) to 10 existing methods show that FST has advantages on score-based metrics such as Brier score and AUC while remaining competitive on binary label-based metrics such as accuracy. "We have conducted comprehensive experiments, reported in Section 6 and Appendix C, comparing FST to 10 existing methods, a number that compares favorably to recent meta-studies (Friedler et al., 2019)."
Researcher Affiliation: Collaboration — Dennis Wei and Karthikeyan Natesan Ramamurthy, IBM Research, 1101 Kitchawan Road, Yorktown Heights, NY 10598, USA; Flavio P. Calmon, John A. Paulson School of Engineering and Applied Sciences, Harvard University, 150 Western Ave, Allston, MA 02134, USA.
Pseudocode: Yes — "Under the first decomposition, application of the scaled ADMM algorithm (Boyd et al., 2011, Section 3.1.1) to (19) yields the following three steps in each iteration k = 0, 1, ...:

$$\mu^{(k+1)}(x_i) = \arg\min_{\mu}\; \frac{1}{n}\, g\big(\mu;\, \hat{r}(x_i)\big) + \frac{\rho}{2}\big(\mu - (\lambda^{(k)})^T \hat{f}(x_i) + c^{(k)}(x_i)\big)^2, \quad i = 1, \dots, n \tag{20a}$$

$$\lambda^{(k+1)} = \arg\min_{\lambda}\; \epsilon \|\lambda\|_1 + \frac{\rho}{2} \sum_{i=1}^{n} \big(\mu^{(k+1)}(x_i) - \lambda^T \hat{f}(x_i) + c^{(k)}(x_i)\big)^2 \tag{20b}$$

$$c^{(k+1)}(x_i) = c^{(k)}(x_i) + \mu^{(k+1)}(x_i) - (\lambda^{(k+1)})^T \hat{f}(x_i), \quad i = 1, \dots, n. \tag{20c}$$"
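The three-step structure quoted above (primal update, l1-regularized update, scaled dual update) is the standard scaled ADMM pattern. A minimal sketch on a scalar toy problem illustrates it; this is not the paper's FST implementation, and the objective, penalty rho, and soft-threshold update are assumptions chosen for a simple lasso-type example:

```python
# Scaled ADMM on a toy problem: minimize (1/2)(x - a)^2 + eps*|z|  s.t. x = z.
# Mirrors the three updates (20a)-(20c): a smooth primal step, an
# l1-regularized step solved by soft-thresholding, and a scaled dual update.

def soft_threshold(v, t):
    """Proximal operator of t*|.| (closed-form l1 update)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def admm_scalar_lasso(a, eps, rho=1.0, iters=100):
    x = z = u = 0.0
    for _ in range(iters):
        # x-update: argmin (1/2)(x - a)^2 + (rho/2)(x - z + u)^2
        x = (a + rho * (z - u)) / (1.0 + rho)
        # z-update: argmin eps*|z| + (rho/2)(x - z + u)^2
        z = soft_threshold(x + u, eps / rho)
        # scaled dual update, the analogue of step (20c)
        u = u + x - z
    return z

# The fixed point is the soft-thresholded value soft_threshold(a, eps).
print(admm_scalar_lasso(2.0, 0.5))  # prints 1.5
```

The iterates here converge geometrically to the known closed-form solution, which makes the toy useful for checking the three-step wiring before scaling up to the vector-valued problem in (20a)-(20c).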
Open Source Code: No — The paper discusses third-party code used for comparison (e.g., reductions, FERM) and provides links to it, but it does not state that the authors' own implementation of the Fair Score Transformer method is available, and no link to that implementation is given.
Open Datasets: Yes — "Four data sets were used, the first three of which are standard in the fairness literature: 1) Adult Income, 2) ProPublica's COMPAS recidivism, 3) German credit risk, 4) Medical Expenditure Panel Survey (MEPS). Specifically, we used versions pre-processed by an open-source library for algorithmic fairness (Bellamy et al., 2018)."
Dataset Splits: Yes — "Each data set was randomly split 10 times into training (75%) and test (25%) sets and all methods were subject to the same splits."
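The split protocol can be sketched with the standard library alone; this is an illustrative stand-in, not the authors' code, and the fixed seed is an assumption made so that every method would see identical splits:

```python
import random

def repeated_splits(n_samples, n_repeats=10, train_frac=0.75, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated random splits.

    Illustrative stand-in for the protocol of 10 random 75%/25%
    train/test splits shared across all compared methods.
    """
    rng = random.Random(seed)  # fixed seed: all methods get the same splits
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        rng.shuffle(indices)
        cut = int(train_frac * n_samples)
        yield indices[:cut], indices[cut:]  # slices copy, so yields are stable

# Sanity check on a toy dataset of 100 rows: each split is a 75/25
# partition of the full index set.
for train_idx, test_idx in repeated_splits(100):
    assert len(train_idx) == 75 and len(test_idx) == 25
    assert set(train_idx).isdisjoint(test_idx)
```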
Hardware Specification: Yes — "Experiments were performed on a machine running Ubuntu OS with 32 cores, and 64 GB RAM."
Software Dependencies: No — The paper mentions using "scikit-learn (Pedregosa et al., 2011)" for base classifiers and references "http://cvxopt.org/" for a generic convex optimization solver, but it provides no version numbers for these components, which a reproducible description requires.
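For illustration, a pinned requirements file is the kind of artifact that would close this gap. The version numbers below are hypothetical placeholders (the paper itself gives none) and do not describe the authors' actual environment:

```
# requirements.txt (sketch) -- versions are illustrative placeholders only,
# NOT taken from the paper
scikit-learn==0.21.3
cvxopt==1.2.5
```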
Experiment Setup: Yes — "5-fold cross-validation to select parameters for LR (regularization parameter C from [10^-4, 10^4]) and GBM (minimum number of samples per leaf from {5, 10, 15, 20, 30}) was done only once per training set. All other parameters were set to the scikit-learn defaults."
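The selection protocol can be sketched with plain 5-fold index generation plus the reported grids; this is a toy stand-in (the actual experiments used scikit-learn), and the log-spaced grid resolution for C is an assumption, since the paper only gives the range [10^-4, 10^4]:

```python
def kfold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous (train, validation) folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, val))
        start += size
    return folds

# Grids mirroring the reported search spaces:
# C on a log grid spanning [1e-4, 1e4] for logistic regression (LR);
# minimum samples per leaf for gradient boosting (GBM).
lr_grid = [10.0 ** e for e in range(-4, 5)]   # 1e-4 ... 1e4
gbm_grid = [5, 10, 15, 20, 30]

# One CV pass per training set: each grid value would be scored on the
# 5 folds and the best kept (the scoring step is omitted in this sketch).
folds = kfold_indices(100, k=5)
assert len(folds) == 5 and all(len(val) == 20 for _, val in folds)
```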