Fair Text Classification via Transferable Representations
Authors: Thibaud Leteno, Michael Perrot, Charlotte Laclau, Antoine Gourru, Christophe Gravier
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide both theoretical and empirical evidence that our approach is well-founded. Section 6 introduces the setting of our experiments, and Section 7 presents the experiments and their interpretations. We further validate our approach empirically by comparing it to state-of-the-art methods and evaluating different variations of our architecture. |
| Researcher Affiliation | Academia | Thibaud Leteno EMAIL Université Jean Monnet Saint-Étienne, CNRS, Institut d'Optique Graduate School, Laboratoire Hubert Curien UMR 5516, F-42023, Saint-Étienne, France. Michael Perrot EMAIL Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, F-59000, Lille, France. Charlotte Laclau EMAIL LTCI, Télécom Paris, Institut Polytechnique de Paris, France. |
| Pseudocode | Yes | In this section, we describe the full algorithm of WFC. Algorithm 1 provides the detailed algorithm for WFC used in our experiments. Algorithm 1: WFC Algorithm |
| Open Source Code | Yes | Our implementation is available on GitHub: https://github.com/LetenoThibaud/wasserstein_fair_classification. |
| Open Datasets | Yes | We employ two widely used data sets to evaluate fairness in the context of text classification, building upon prior research (Ravfogel et al., 2020; Han et al., 2021b; Shen et al., 2022b). Both data sets are readily available in the fairlib library (Han et al., 2022). Bias in Bios (De-Arteaga et al., 2019). Moji (Blodgett et al., 2016). |
| Dataset Splits | Yes | Bias in Bios (De-Arteaga et al., 2019). This data set, referred to as the Bios data set in the rest of the paper, consists of brief biographies from Common Crawl associated with occupations (28 in total) and genders (male or female). As per the partitioning prepared by Ravfogel et al. (2020), the training, validation, and test sets comprise 257,000, 40,000, and 99,000 samples, respectively. Moji (Blodgett et al., 2016). This data set contains tweets written in either Standard American English (SAE) or African American English (AAE), annotated with positive or negative polarity. We use the data set prepared by Ravfogel et al. (2020), which includes 100,000 training examples, 8,000 validation examples, and 8,000 test examples. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, or memory specifications used for running the experiments. It mentions using 'BERT model' and 'SFR-Embedding-2 R model' for representations, but these refer to language models, not hardware. |
| Software Dependencies | No | Our experiments use the previously mentioned Fairlib framework. Note that the values are computed exactly using the POT library (Flamary et al., 2021). The optimizer used is Adam. While specific software names like Fairlib and POT library are mentioned, no version numbers are provided for these or other software components to ensure reproducibility. |
| Experiment Setup | Yes | We evaluate and optimize the hyperparameters for our models on a validation set, focusing on the MLP and Critic learning rates, the value of nd (the number of batches used to train the main MLP), the layers producing Za and Zy, the value of β, and the value used to clamp the weights to enforce the Lipschitz constraint. In all our experiments, unless mentioned otherwise, β is set to 1. Main MLP hyperparameters (Bios / Moji): input dimension 768 / 2304; hidden layers 2 / 2; hidden dimension 300 / 300; learning rate 1e-4 / 1e-5; batch size 128 / 128; max epochs 10000 / 10000; activation TanH / TanH; β 1 / 1; nc 20 / 5; nd 5 / 5; clipping value 0.01 / 0.01; layer used last / last. Critic hyperparameters: number of hidden layers 1; hidden dimension 512; activation ReLU; optimizer RMSProp; learning rate 5e-5. |
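
The split counts in the Dataset Splits row can be restated programmatically as a sanity check. The dictionary layout below is my own sketch; the counts are the ones reported in the table (Bios: 257,000 / 40,000 / 99,000; Moji: 100,000 / 8,000 / 8,000).

```python
# Sanity-check the dataset split sizes reported above.
# Counts come from the Ravfogel et al. (2020) partitions cited in the paper.
splits = {
    "bios": {"train": 257_000, "valid": 40_000, "test": 99_000},
    "moji": {"train": 100_000, "valid": 8_000, "test": 8_000},
}

for name, s in splits.items():
    total = sum(s.values())
    train_frac = s["train"] / total
    print(f"{name}: total={total}, train fraction={train_frac:.2%}")
```

Note that the two data sets use quite different train fractions (roughly 65% for Bios versus 86% for Moji), which matters when comparing validation-set hyperparameter tuning across them.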
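
The Software Dependencies row notes that Wasserstein values are computed exactly with the POT library. As a self-contained illustration of what that distance measures, here is a pure-Python sketch of the 1-D Wasserstein-1 distance between two equal-size empirical samples (sort both samples and average the absolute differences). This is only the 1-D special case, not the authors' computation.

```python
# Minimal sketch of the 1-D Wasserstein-1 distance between two empirical
# distributions with equal sample counts: sort both samples and average
# the pointwise absolute differences. The paper computes its Wasserstein
# values exactly with the POT library; this is an illustrative stand-in.
def wasserstein_1d(xs, ys):
    assert len(xs) == len(ys), "equal-size samples assumed in this sketch"
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

print(wasserstein_1d([0.0, 0.0], [1.0, 1.0]))  # -> 1.0
```

Identical samples give distance 0, and shifting every point by a constant c shifts the distance by |c|, which is the behavior that makes it useful as a fairness regularizer between group-conditional representation distributions.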
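
The hyperparameters in the Experiment Setup row can be mirrored as configuration dictionaries. The key names and the `layer_sizes` helper are my own illustrative sketch, not the authors' code; the values are copied from the row above (28 is the number of Bios occupation classes from the Dataset Splits row).

```python
# Main-MLP hyperparameters from the experiment-setup row, restated as
# config dicts. Key names are illustrative; values are as reported.
CLASSIFIER_CONFIG = {
    "bios": {"input_dim": 768,  "hidden_layers": 2, "hidden_dim": 300,
             "lr": 1e-4, "batch_size": 128, "max_epochs": 10_000,
             "activation": "tanh", "beta": 1, "nc": 20, "nd": 5,
             "clip": 0.01, "layer_used": "last"},
    "moji": {"input_dim": 2304, "hidden_layers": 2, "hidden_dim": 300,
             "lr": 1e-5, "batch_size": 128, "max_epochs": 10_000,
             "activation": "tanh", "beta": 1, "nc": 5, "nd": 5,
             "clip": 0.01, "layer_used": "last"},
}

def layer_sizes(cfg, n_classes):
    """Layer widths of the MLP implied by a config (sketch only)."""
    return ([cfg["input_dim"]]
            + [cfg["hidden_dim"]] * cfg["hidden_layers"]
            + [n_classes])

print(layer_sizes(CLASSIFIER_CONFIG["bios"], 28))  # -> [768, 300, 300, 28]
```

Spelling the configuration out this way makes the cross-data-set differences easy to scan: only the input dimension, learning rate, and nc (Critic steps) change between Bios and Moji.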