Aequitas Flow: Streamlining Fair ML Experimentation
Authors: Sérgio Jesus, Pedro Saleiro, Inês Oliveira e Silva, Beatriz M. Jorge, Rita P. Ribeiro, João Gama, Pedro Bizarro, Rayid Ghani
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Aequitas Flow is an open-source framework and toolkit for end-to-end Fair Machine Learning (ML) experimentation and benchmarking in Python. The package fills integration gaps that exist in other fair ML packages. In addition to the existing audit capabilities in Aequitas, the Aequitas Flow module provides a pipeline for fairness-aware model training, hyperparameter optimization, and evaluation, enabling easy-to-use and rapid experiments and analysis of results. The goal is to help 1) researchers compare and benchmark new methods they develop against existing methods in a systematic and reproducible manner, and 2) practitioners easily evaluate existing bias mitigation methods and deploy the ones that best match their goals. Table 1: Comparison of packages for training and evaluation of fair ML methods. Figure 2: Plots introduced in Aequitas Flow; plot (a) is designed for model selection, and plot (b) compares the different tested methods. |
| Researcher Affiliation | Collaboration | Sérgio Jesus (1, 2), Pedro Saleiro (1), Inês Oliveira e Silva (1), Beatriz M. Jorge (1), Rita P. Ribeiro (2), João Gama (2), Pedro Bizarro (1), Rayid Ghani (3); 1: Feedzai, 2: University of Porto, 3: Carnegie Mellon University |
| Pseudocode | No | The paper includes Python code snippets demonstrating the usage of the Aequitas Flow framework, such as `exp = Experiment(config_file="configs/experiment.yaml"); exp.run()` and `model = methods.inprocessing.FairGBM(); model.fit(train.X, train.y, train.s)`. However, these are examples of implementation code rather than structured pseudocode or algorithm blocks describing a method's steps. |
| Open Source Code | Yes | Aequitas Flow is an open-source framework and toolkit for end-to-end Fair Machine Learning (ML) experimentation and benchmarking in Python. This paper introduces Aequitas Flow, an open-source framework for reproducible and extensible end-to-end fair ML experimentation that extends Aequitas, the authors' original bias audit toolkit. Repository: https://github.com/dssg/aequitas |
| Open Datasets | Yes | The framework initially encompasses eleven tabular datasets, including the Bank Account Fraud suite (Jesus et al., 2022) and Folktables (Ding et al., 2021). The component also permits user-supplied datasets in CSV or parquet formats, with splits based on a column or generated randomly. Example from the paper: `dataset = datasets.FolkTables(variant=ACSIncome)` |
| Dataset Splits | No | The paper mentions that the Datasets component has two primary functions: "loading the data and generating splits." It also states that "The component also permits user-supplied datasets in CSV or parquet formats with splits based on a column, or randomly." While it describes the capability of the framework to create splits, it does not provide specific details (e.g., percentages, methodology, random seeds) for the experimental splits used in the paper itself. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. It focuses on the software framework and its capabilities. |
| Software Dependencies | No | The paper mentions the use of 'Python', 'Optuna (Akiba et al., 2019)' for hyperparameter selection, 'Aequitas (Saleiro et al., 2018)' for bias auditing, and 'pandas dataframe format (pandas development team, 2020)'. However, it does not specify exact version numbers for Python or any of these libraries. |
| Experiment Setup | No | The paper describes the functionalities of the 'Optimizer' component, stating that 'Several attributes of the hyperparameter optimization can be determined by configurations, such as the number of trials and jobs, the selection algorithm (e.g., random search, grid search), and the random seed.' However, it does not provide the specific hyperparameter values or training configurations used for any experiments conducted by the authors in the paper, instead describing how a user of the framework *could* set them. |
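The Pseudocode row quotes a `fit(train.X, train.y, train.s)` call that takes a sensitive attribute alongside features and labels. As a minimal illustration of that interface pattern, here is a toy pure-Python sketch; the class, its per-group thresholding logic, and all names are hypothetical, not the FairGBM algorithm or the Aequitas Flow API:

```python
class GroupThresholdClassifier:
    """Toy fairness-aware model following the fit(X, y, s) pattern:
    scores by a single feature and learns one decision threshold per
    sensitive group (a simple per-group thresholding idea)."""

    def fit(self, X, y, s):
        # X: list of feature vectors, y: 0/1 labels, s: group id per row.
        self.thresholds = {}
        for group in set(s):
            # Threshold at the mean first-feature value of the group's positives.
            pos = [x[0] for x, label, g in zip(X, y, s) if g == group and label == 1]
            self.thresholds[group] = sum(pos) / len(pos) if pos else 0.0
        return self

    def predict(self, X, s):
        # Apply each row's group-specific threshold.
        return [1 if x[0] >= self.thresholds[g] else 0 for x, g in zip(X, s)]
```

The point of the sketch is only the signature: bias mitigation methods consume the sensitive attribute `s` at training time, unlike the standard two-argument `fit(X, y)` convention.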
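The Dataset Splits row notes that the Datasets component generates splits either randomly or based on a column. A minimal pure-Python sketch of those two behaviors, with hypothetical function names (not the Aequitas Flow API), assuming rows as dicts and a fixed seed for reproducibility:

```python
import random

def random_split(rows, fractions=(0.7, 0.15, 0.15), seed=42):
    """Shuffle rows with a fixed seed and cut into train/val/test."""
    assert abs(sum(fractions) - 1.0) < 1e-9
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def column_split(rows, column, train_values):
    """Assign rows to train/test by the value of a given column
    (e.g., temporal splits on a month column)."""
    train = [r for r in rows if r[column] in train_values]
    test = [r for r in rows if r[column] not in train_values]
    return train, test
```

Column-based splitting is the relevant mode for datasets such as Bank Account Fraud, where evaluation is typically done on later time periods.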
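The Experiment Setup row describes the Optimizer's configurable attributes: number of trials, selection algorithm (e.g., random search), and random seed. A minimal random-search sketch in pure Python illustrating those three knobs; the function and parameter names are hypothetical, not the Optuna or Aequitas Flow API:

```python
import random

def random_search(objective, space, n_trials=20, seed=42):
    """Sample hyperparameters uniformly from `space` and keep the best trial.

    `space` maps a parameter name to a (low, high) range; `objective`
    receives a dict of sampled values and returns a score to maximize.
    The seed makes the trial sequence reproducible.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high)
                  for name, (low, high) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

In the actual framework these attributes are set through configuration files rather than function arguments, and Optuna supplies the selection algorithms.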