Adversarial Latent Feature Augmentation for Fairness

Authors: Hoin Jung, Junyi Chai, Xiaoqian Wang

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments across diverse datasets, modalities, and backbone networks validate that training with these adversarial features significantly enhances fairness while maintaining predictive accuracy in classification tasks. The code is available on GitHub.
Researcher Affiliation Academia Hoin Jung, Junyi Chai & Xiaoqian Wang Elmore Family School of Electrical and Computer Engineering Purdue University West Lafayette, IN, 47907, USA EMAIL
Pseudocode Yes Algorithm 1 Adversarial Latent Feature Augmentation
Open Source Code Yes The code is available on GitHub.
Open Datasets Yes In this paper, we use four different tabular datasets: Adult (Dua et al., 2017), COMPAS (Jeff Larson & Angwin, 2016), German (Dua et al., 2017), and Drug (Dua et al., 2017). The CelebA (Liu et al., 2018) and Wikipedia Toxicity (Thain et al., 2017) datasets are also used to verify the performance of the proposed method in image and text classification, respectively.
Dataset Splits Yes All datasets are split 60:20:20 into train, validation, and test subsets, respectively.
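A 60:20:20 split like the one described above can be sketched as follows; the dataset size and seed here are illustrative, not taken from the paper:

```python
import random

def split_indices(n, seed=0):
    """Shuffle n sample indices and split them 60:20:20 into train/val/test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(0.6 * n)
    n_val = int(0.2 * n)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # 600 200 200
```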
Hardware Specification Yes CPU: AMD Ryzen Threadripper 3960X 24-Core Processor; GPU: NVIDIA GeForce RTX 3090
Software Dependencies No During pre-training, we choose the best parameters when the validation accuracy is highest. In the attacking step, the parameters of both the encoder and classifier are fixed, while only the last layer is newly initialized for fine-tuning with the augmented latent features. Different learning rates are used in each step: the Adam optimizer with learning rate 1e-3 in pre-training and fine-tuning, and the Adam optimizer with learning rate 0.1 in the adversarial attack.
Experiment Setup Yes To verify our approach, we apply our method to two base classifiers for tabular datasets: Logistic Regression and an MLP with the ReLU activation function and two hidden layers of 128 dimensions. For the CelebA dataset, we adopt ResNet-50 (He et al., 2016), ViT (Dosovitskiy, 2020), and Swin Transformer (Liu et al., 2021) as baselines. For the Wiki dataset, we use LSTM (Hochreiter & Schmidhuber, 1997), BERT (Devlin, 2018), and DistilBERT (Sanh, 2019) as baselines. During pre-training, we choose the best parameters when the validation accuracy is highest. In the attacking step, the parameters of both the encoder and classifier are fixed, while only the last layer is newly initialized for fine-tuning with the augmented latent features. Different learning rates are used in each step: the Adam optimizer with learning rate 1e-3 in pre-training and fine-tuning, and the Adam optimizer with learning rate 0.1 in the adversarial attack. For each experiment, we take the result when the validation accuracy is highest. For a fair comparison, we train each case 10 times and report the mean and standard deviation for the tabular datasets and the text dataset.
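The three-step procedure described above (pre-training, an adversarial attack in latent space with the encoder and classifier frozen, then fine-tuning a re-initialized last layer) might look roughly like the following sketch. All module names, input dimensions, and the attack objective are hypothetical; in particular, the paper's Algorithm 1 optimizes a fairness-oriented objective, for which the cross-entropy loss below is only a runnable stand-in:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical encoder for tabular data: two 128-dim hidden layers with ReLU,
# matching the MLP setup described above. Input dim 10 is arbitrary.
encoder = nn.Sequential(nn.Linear(10, 128), nn.ReLU(), nn.Linear(128, 128), nn.ReLU())
classifier = nn.Linear(128, 2)  # the "last layer" that is later re-initialized

x = torch.randn(32, 10)          # toy batch; real inputs come from the datasets above
y = torch.randint(0, 2, (32,))

# Step 1: pre-training with Adam, lr 1e-3 (as reported).
opt = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()), lr=1e-3)
loss = F.cross_entropy(classifier(encoder(x)), y)
opt.zero_grad(); loss.backward(); opt.step()

# Step 2: adversarial attack. Encoder and classifier parameters are frozen;
# only a perturbation delta on the latent features is optimized (Adam, lr 0.1).
for p in encoder.parameters():
    p.requires_grad_(False)
for p in classifier.parameters():
    p.requires_grad_(False)
z = encoder(x).detach()
delta = torch.zeros_like(z, requires_grad=True)
atk_opt = torch.optim.Adam([delta], lr=0.1)
atk_loss = -F.cross_entropy(classifier(z + delta), y)  # stand-in attack objective
atk_opt.zero_grad(); atk_loss.backward(); atk_opt.step()
z_aug = (z + delta).detach()  # augmented latent features

# Step 3: re-initialize the last layer and fine-tune it on the augmented
# latent features (Adam, lr 1e-3).
classifier = nn.Linear(128, 2)
ft_opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)
ft_loss = F.cross_entropy(classifier(z_aug), y)
ft_opt.zero_grad(); ft_loss.backward(); ft_opt.step()
```

The key design point the setup text describes is that augmentation happens in latent space, so only the inexpensive last layer needs retraining on the perturbed features while the pre-trained encoder stays fixed.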