Fairness guarantees in multi-class classification with demographic parity

Authors: Christophe Denis, Romuald Elie, Mohamed Hebiri, François Hu

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The approach is evaluated on both synthetic and real datasets and proves very effective in decision making with a preset level of unfairness. In addition, our method is competitive with (if not better than) the state of the art in binary and multi-class tasks. Keywords: algorithmic fairness, demographic parity, multi-class classification
Researcher Affiliation | Collaboration | Christophe Denis (LPSM, UMR-CNRS 8001, Sorbonne Université, 4 Place Jussieu, 75005 Paris, France); Romuald Elie (LAMA, UMR-CNRS 8050, Université Gustave Eiffel, 5 Bd Descartes, 77454 Marne-la-Vallée cedex 2, France); Mohamed Hebiri (LAMA, UMR-CNRS 8050, Université Gustave Eiffel, 5 Bd Descartes, 77454 Marne-la-Vallée cedex 2, France); François Hu (R&D Department, AI Lab, Milliman France, 14 Av. de la Grande Armée, 75017 Paris, France)
Pseudocode | Yes | The implementation pseudocode is provided in Algorithm 1.
Algorithm 1 (ε-fairness calibration)
Input: approximate fairness parameter ε; new data point (x, s); base estimators (p̂_k)_k; unlabeled sample D_N; i.i.d. uniform perturbations (ζ_k)_k and (ζ^s_{k,i})_{k,i,s} in [0, 10^-5].
Step 0. Split D_N and construct the samples (S_1, ..., S_N) and {X^s_1, ..., X^s_{N_s}} for s ∈ S;
Step 1. Compute the empirical frequencies (π̂_s)_s based on (S_1, ..., S_N);
Step 2. Compute λ̂^(1) = (λ̂^(1)_1, ..., λ̂^(1)_K) and λ̂^(2) = (λ̂^(2)_1, ..., λ̂^(2)_K) as a solution of Eq. (1); the sequential quadratic programming of Section 4.1 can be used for this step;
Step 3. Compute ĝ via Eq. (2);
Output: ε-fair classification ĝ(x, s) at the point (x, s).
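A rough sketch of Algorithm 1's data flow might look as follows. This is only a schematic: the objective and fairness constraint below are placeholders, not the paper's Eq. (1) or Eq. (2), and all variable names and values are assumptions; SciPy's SLSQP solver stands in for the sequential quadratic programming mentioned in Section 4.1.

```python
import numpy as np
from scipy.optimize import minimize

# Schematic sketch of Algorithm 1's data flow only. The objective and
# constraint below are PLACEHOLDERS, not the paper's Eq. (1)/(2).
rng = np.random.default_rng(0)
K, N, eps = 3, 400, 0.1  # classes, sample size, fairness parameter (assumed)

# Step 0: "unlabeled" sample of sensitive attributes s in {0, 1}
S = rng.integers(0, 2, size=N)

# Step 1: empirical group frequencies pi_hat_s
pi_hat = np.bincount(S, minlength=2) / N

# Base scores p_hat_k(x) for one new point, plus tiny uniform tie-breaking noise
p_hat = rng.dirichlet(np.ones(K))
zeta = rng.uniform(0.0, 1e-5, size=K)
scores = p_hat + zeta

# Step 2: solve a toy program with SLSQP (a sequential quadratic
# programming method, in the spirit of Section 4.1)
objective = lambda lam: np.sum((lam - 1.0) ** 2)                    # placeholder for Eq. (1)
fairness = {"type": "ineq", "fun": lambda lam: eps - np.sum(lam ** 2)}
res = minimize(objective, x0=np.zeros(K), method="SLSQP", constraints=[fairness])
lam_hat = res.x

# Step 3: schematic prediction (placeholder for Eq. (2)):
# group-rescaled scores shifted by the calibrated lambda
def g_fair(scores, s, lam, pi):
    return int(np.argmax(scores / pi[s] - lam))

pred = g_fair(scores, int(S[0]), lam_hat, pi_hat)
```

The point of the sketch is the pipeline shape (split, estimate frequencies, solve a constrained program, shift-and-argmax), not the numerical values.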
Open Source Code | Yes | The source code of our method can be found at https://github.com/curiousML/epsilon-fairness.
Open Datasets | Yes | Drug Consumption (DRUG): this dataset (Fehrman et al., 2017) contains demographic information such as age, gender, and education level, as well as measures of personality traits thought to influence drug use, for 1885 respondents. Communities & Crime (CRIME): this dataset contains socio-economic, law-enforcement, and crime data about communities in the US, with 1994 examples. Following Calders et al. (2013), the sensitive feature is a binary variable corresponding to ethnicity.
Dataset Splits | Yes | We generate n = 5000 synthetic examples and split the data into three sets (60% training, 20% hold-out, and 20% unlabeled).
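The 60/20/20 split can be reproduced with a simple index permutation; the feature dimension and seed below are arbitrary assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed (assumption)
n = 5000
X = rng.normal(size=(n, 4))      # placeholder synthetic features

# 60% training, 20% hold-out, 20% unlabeled
idx = rng.permutation(n)
n_train, n_hold = int(0.6 * n), int(0.2 * n)
train_idx = idx[:n_train]
hold_idx = idx[n_train:n_train + n_hold]
unlab_idx = idx[n_train + n_hold:]

print(len(train_idx), len(hold_idx), len(unlab_idx))  # 3000 1000 1000
```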
Hardware Specification | Yes | All algorithms were executed on an Apple M1 Pro processor to maintain consistency in the experimental setup.
Software Dependencies | No | The paper mentions "Random Forest (RF) with default parameters in scikit-learn" and "reglog, we use the default parameters in scikit-learn" without specifying a version number for scikit-learn or any other library.
Experiment Setup | Yes | For RF, we set the number of trees in {10, 11, ..., 200}, the maximum depth of each tree in {2, 3, ..., 16}, the minimum number of samples required to split an internal node in {2, 3, ..., 10}, and the minimum number of samples required at a leaf node in {1, ..., 8}. For GBM, we set the L1 and L2 regularization terms on weights both in {0, 0.1, 1, 2, 5, 10, 20, 50}, the number of boosted trees in {10, 11, ..., 200}, the maximum number of tree leaves in {6, 7, ..., 50}, the maximum depth of each tree in {2, 3, ..., 16}, and the minimum number of samples required in a child node for a split to occur in {10, 11, ..., 100}.
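These search spaces map naturally onto common estimator parameter names. The mapping below is an assumption: scikit-learn's `RandomForestClassifier` names for RF, and LightGBM-style names for GBM (the report does not say which GBM library is used).

```python
# Hedged mapping of the reported search grids to common parameter names.
# RF keys follow scikit-learn's RandomForestClassifier; GBM keys follow
# LightGBM's LGBMClassifier (an assumption: the GBM library is not named).
rf_grid = {
    "n_estimators": list(range(10, 201)),      # number of trees
    "max_depth": list(range(2, 17)),           # maximum depth of each tree
    "min_samples_split": list(range(2, 11)),   # min samples to split a node
    "min_samples_leaf": list(range(1, 9)),     # min samples at a leaf
}
gbm_grid = {
    "reg_alpha": [0, 0.1, 1, 2, 5, 10, 20, 50],    # L1 regularization
    "reg_lambda": [0, 0.1, 1, 2, 5, 10, 20, 50],   # L2 regularization
    "n_estimators": list(range(10, 201)),          # boosted trees
    "num_leaves": list(range(6, 51)),              # maximum tree leaves
    "max_depth": list(range(2, 17)),               # maximum depth
    "min_child_samples": list(range(10, 101)),     # min samples in a child node
}
```

Dictionaries in this shape can be passed directly to a search utility such as scikit-learn's `RandomizedSearchCV` as its parameter distributions.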