Robust ML Auditing using Prior Knowledge

Authors: Jade Garcia Bourrée, Augustin Godinot, Sayan Biswas, Anne-Marie Kermarrec, Erwan Le Merrer, Gilles Tredan, Martijn De Vos, Milos Vujasinovic

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, our experiments with two standard datasets illustrate the maximum level of unfairness a platform can hide before being detected as malicious. Our formalization and generalization of manipulation-proof auditing with a prior opens up new research directions for more robust fairness audits."
Researcher Affiliation | Academia | "1 Université de Rennes, Rennes, France; 2 Inria, Rennes, France; 3 IRISA/CNRS, Rennes, France; 4 PEReN, Paris, France; 5 EPFL, Lausanne, Switzerland; 6 LAAS, CNRS, Toulouse, France."
Pseudocode | No | No explicit pseudocode or algorithm blocks are provided in the paper.
Open Source Code | Yes | The code to run the experiments is available online at https://github.com/grodino/merlin.
Open Datasets | Yes | The tabular dataset comes from the ACSEmployment task for the state of Minnesota in 2018, which is derived from US Census data and provided in folktables (Ding et al., 2021). For the vision modality, the paper studies CelebA (Liu et al., 2015), which consists of images of celebrities along with several binary attributes per image, such as whether the person is blond, smiling, or whether the photo is blurry.
Dataset Splits | No | The paper uses the ACSEmployment and CelebA datasets but does not provide training/validation/test split percentages, sample counts, or a split methodology for the models trained on them. It refers to an "audit budget" for the audit set S, which is distinct from the model-training splits.
Hardware Specification | No | The paper does not provide hardware details (e.g., GPU/CPU models or memory amounts) for its experiments; it only discusses software implementations and training parameters.
Software Dependencies | No | The paper mentions scikit-learn and the Adam optimizer but does not specify version numbers for these or any other software dependencies.
Experiment Setup | Yes | GBDT and Log. Reg. are trained using the default parameters of their respective scikit-learn implementations. Meanwhile, LeNet is trained irrespective of the target attribute using the Adam optimizer with a learning rate of γ = 0.001, a batch size of 32, and for two epochs, which is sufficient for the model to converge on all features.
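The tabular part of this setup can be sketched as follows. This is a minimal illustration, not the paper's pipeline: it uses a synthetic stand-in for the ACSEmployment features (the real data would come from folktables), and the `LogisticRegression` estimator corresponds to the paper's "Log. Reg." model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the tabular (ACSEmployment-style) features.
X, y = make_classification(n_samples=2000, n_features=16, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Both models use the default parameters of their scikit-learn
# implementations, as stated in the experiment setup.
gbdt = GradientBoostingClassifier().fit(X_train, y_train)
logreg = LogisticRegression().fit(X_train, y_train)

print(f"GBDT accuracy:      {gbdt.score(X_test, y_test):.3f}")
print(f"Log. Reg. accuracy: {logreg.score(X_test, y_test):.3f}")
```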
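The vision setup (LeNet trained with Adam, γ = 0.001, batch size 32) can likewise be sketched in PyTorch. The input resolution (3×32×32) and the classic LeNet-5 layer sizes are assumptions, since the paper does not specify the exact architecture variant or CelebA preprocessing; random tensors stand in for the images.

```python
import torch
import torch.nn as nn

# LeNet-style CNN; the 3x32x32 input shape is an assumption.
class LeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, 5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, 2),  # one binary target attribute
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = LeNet()
# Adam with learning rate 0.001, batch size 32, as in the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

# One illustrative optimization step on a random stand-in batch.
images = torch.randn(32, 3, 32, 32)
labels = torch.randint(0, 2, (32,))
loss = criterion(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The full run would repeat this step over the CelebA training set for two epochs, once per target attribute.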