FairICP: Encouraging Equalized Odds via Inverse Conditional Permutation

Authors: Yuheng Lai, Leying Guan

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental The efficacy and adaptability of our method are demonstrated through both simulation studies and empirical analyses of real-world datasets.
Researcher Affiliation Academia 1Department of Statistics, University of Wisconsin-Madison 2Department of Biostatistics, Yale University. Correspondence to: Leying Guan <EMAIL>.
Pseudocode Yes Algorithm 1 FairICP: Fairness-aware learning via inverse conditional permutation. Input: data (X, A, Y) = {(Xi, Ai, Yi)}, i ∈ Itr. Parameters: penalty weight μ, step size α, number of gradient steps Ng, and number of iterations T. Output: predictive model f̂_θf(·) and discriminator D̂_θd(·).
Open Source Code No The paper discusses implementing FairICP alongside baseline methods, adapting code for FDL from 'https://github.com/yromano/fair_dummies', HGR from 'https://github.com/criteo-research/continuous-fairness', GerryFair from 'https://github.com/algowatchpenn/GerryFair', and Reduction from 'https://github.com/fairlearn/fairlearn'. However, no explicit statement or link is provided for the open-source code of FairICP itself.
Open Datasets Yes Communities and Crime Data Set: This dataset contains 1994 samples and 122 features; the goal is to build a regression model predicting the continuous violent crime rate. ACSIncome Dataset: We use the ACSIncome dataset from Ding et al. (2021) with 100,000 instances (subsampled) and 10 features. Adult Dataset: The dataset consists of 48,842 instances, and the task is the same as ACSIncome. COMPAS Dataset: ProPublica's COMPAS recidivism dataset contains 5,278 examples and 11 features (Fabris et al., 2022).
Dataset Splits Yes For all experiments, the data is repeatedly divided into a training set (60%) and a test set (40%) 100 times, with the average results on the test sets reported. We set K ∈ {1, 5, 10} and ω ∈ {0.6, 0.9} to investigate different levels of dependence on A, and the sample size for training/test data to be 500/400.
Hardware Specification No The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory details) used for running its experiments.
Software Dependencies Yes For KPC (Huang et al., 2022), we use the R package KPC (Huang, 2022) with the default Gaussian kernel and other default parameters. Huang, Z. KPC: Kernel Partial Correlation Coefficient, 2022. URL https://CRAN.R-project.org/package=KPC. R package version 0.1.2.
Experiment Setup Yes For all the models evaluated (FairICP, FairCP, FDL, Oracle), we set the hyperparameters as follows: f is a linear model trained with the Adam optimizer, with a mini-batch size in {16, 32, 64}, a learning rate in {1e-4, 1e-3, 1e-2}, and a number of epochs in {20, 40, 60, 80, 100, 120, 140, 160, 180, 200}. The discriminator is implemented as a four-layer neural network with a hidden layer of size 64 and ReLU non-linearities, trained with Adam at a fixed learning rate of 1e-4. For the MAF used to estimate the conditional densities (Y|A and A|Y) in the training phase, we use a MAF with one MADE component and one hidden layer with two conditional input nodes, optimized with Adam using an l1 penalty of 0.01 and a learning rate of 0.1.
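The Pseudocode row above summarizes Algorithm 1 only at the interface level (penalty weight μ, step size α, Ng discriminator steps interleaved with predictor updates over T iterations). As a rough illustration of that adversarial structure, here is a minimal NumPy sketch. Everything below the interface is an assumption, not the authors' implementation: a linear predictor, a linear logistic discriminator, and a plain marginal permutation of A standing in for the paper's inverse conditional permutation; `train_fairicp_sketch` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_fairicp_sketch(X, A, Y, mu=1.0, alpha=0.01, Ng=5, T=50):
    """Toy alternating predictor/discriminator loop (illustration only)."""
    n, p = X.shape
    theta_f = np.zeros(p)   # linear predictor weights (assumption)
    theta_d = np.zeros(3)   # logistic discriminator on [f(x), a, 1] (assumption)
    for _ in range(T):
        # Stand-in for the inverse conditional permutation: a marginal
        # permutation of A (the paper permutes conditionally instead).
        A_tilde = rng.permutation(A)
        preds = X @ theta_f
        Z_real = np.column_stack([preds, A, np.ones(n)])
        Z_fake = np.column_stack([preds, A_tilde, np.ones(n)])
        Z = np.vstack([Z_real, Z_fake])
        labels = np.concatenate([np.ones(n), np.zeros(n)])
        # Ng discriminator steps: ascend the logistic log-likelihood.
        for _ in range(Ng):
            probs = 1.0 / (1.0 + np.exp(-Z @ theta_d))
            theta_d += alpha * Z.T @ (labels - probs) / len(labels)
        # One predictor step: accuracy gradient plus mu-weighted fairness
        # gradient that discourages the discriminator from telling real
        # (prediction, A) pairs apart from permuted ones.
        grad_acc = X.T @ (preds - Y) / n
        probs_real = 1.0 / (1.0 + np.exp(-Z_real @ theta_d))
        grad_fair = X.T @ ((1.0 - probs_real) * theta_d[0]) / n
        theta_f -= alpha * (grad_acc + mu * grad_fair)
    return theta_f, theta_d
```

Note that a linear discriminator cannot capture the joint dependence between predictions and A; the paper uses a neural-network discriminator for exactly that reason, so this sketch only conveys the loop structure.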
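The evaluation protocol in the Dataset Splits row (100 repeated 60/40 train/test splits, averaging results on the test sets) is simple to sketch. The helper below, `repeated_split_eval`, is a hypothetical illustration of that protocol, not the authors' code; `fit` and `metric` are caller-supplied callables.

```python
import numpy as np

def repeated_split_eval(X, Y, fit, metric, n_rep=100, train_frac=0.6, seed=0):
    """Average a test-set metric over repeated random train/test splits."""
    rng = np.random.default_rng(seed)
    n = len(Y)
    n_tr = int(train_frac * n)
    scores = []
    for _ in range(n_rep):
        idx = rng.permutation(n)          # fresh random split each repeat
        tr, te = idx[:n_tr], idx[n_tr:]
        model = fit(X[tr], Y[tr])         # train on 60%
        scores.append(metric(model, X[te], Y[te]))  # evaluate on 40%
    return float(np.mean(scores))

# Example plug-ins: ordinary least squares and test-set MSE.
# fit = lambda Xtr, Ytr: np.linalg.lstsq(Xtr, Ytr, rcond=None)[0]
# metric = lambda coef, Xte, Yte: float(np.mean((Xte @ coef - Yte) ** 2))
```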
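The Experiment Setup row specifies the discriminator as a four-layer neural network with hidden size 64 and ReLU non-linearities. A forward-pass sketch of a network with that shape is below; the He-style initialization, the reading of "four-layer" as four linear layers, and the helper names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_mlp(in_dim, hidden=64, n_layers=4, out_dim=1):
    """Random weights for an MLP with n_layers linear layers, width `hidden`."""
    dims = [in_dim] + [hidden] * (n_layers - 1) + [out_dim]
    return [(rng.normal(0.0, np.sqrt(2.0 / dims[i]), size=(dims[i], dims[i + 1])),
             np.zeros(dims[i + 1]))
            for i in range(n_layers)]

def forward(params, x):
    """Forward pass: ReLU after every layer except the last."""
    h = x
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)  # ReLU non-linearity
    return h
```

In the paper's setup this network would take the prediction and the sensitive attribute as input and be trained with Adam at a fixed learning rate of 1e-4; the sketch omits the optimizer.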