Say My Name: a Model's Bias Discovery Framework

Authors: Massimiliano Ciranni, Luca Molinaro, Carlo Alberto Barbano, Attilio Fiandrotti, Vittorio Murino, Vito Paolo Pastore, Enzo Tartaglione

TMLR 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Evaluation on typical benchmarks demonstrates its effectiveness in detecting biases and even disclaiming them. When sided with a traditional debiasing approach for bias mitigation, it can achieve state-of-the-art performance while having the advantage of associating a semantic meaning with the discovered bias. The code is available at https://github.com/SayMyName-BiasNaming/samyna-tmlr. (Section 4, Empirical Results) |
| Researcher Affiliation | Academia | Massimiliano Ciranni (MaLGa, DIBRIS, University of Genoa, Italy); Luca Molinaro (Computer Science Department, University of Turin, Italy); Carlo Alberto Barbano (Computer Science Department, University of Turin, Italy); Attilio Fiandrotti (Computer Science Department, University of Turin, Italy, and LTCI, Télécom Paris, Institut Polytechnique de Paris, France); Vittorio Murino (Istituto Italiano di Tecnologia (IIT), Genoa, Italy, and University of Verona, Italy); Vito Paolo Pastore (MaLGa, DIBRIS, University of Genoa, Italy, and Istituto Italiano di Tecnologia (IIT), Genoa, Italy); Enzo Tartaglione (LTCI, Télécom Paris, Institut Polytechnique de Paris, France) |
| Pseudocode | No | The paper describes the method's steps as a pipeline in Section 3.2 and defines metrics in Section 3.1, but it does not present a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | The code is available at https://github.com/SayMyName-BiasNaming/samyna-tmlr. |
| Open Datasets | Yes | For our study, we employ the following datasets: Waterbirds (Sagawa* et al., 2020), CelebA (Liu et al., 2015), BAR (Nam et al., 2020), and ImageNet-A (Hendrycks et al., 2021). |
| Dataset Splits | No | The paper describes the datasets used and some characteristics of their composition (e.g., for Waterbirds), but it does not provide explicit train/validation/test split percentages or sample counts for all datasets, which would be needed to reproduce the data partitioning directly. |
| Hardware Specification | Yes | For our experiments, we have employed an NVIDIA A5000 with 24GB of VRAM, except for the captioning step, for which we have employed an NVIDIA A100 equipped with 64GB of VRAM. |
| Software Dependencies | No | The paper mentions software such as torchvision, LLaVA-NeXT (34B configuration, quantized to 8 bits), the Hugging Face library, and the MiniLM model from the sentence-transformers library, but it does not provide version numbers for these components, which limits reproducibility. |
| Experiment Setup | Yes | For this step, we train with a batch size of 128 and a learning rate of 0.001 for Waterbirds, as done in (Sagawa* et al., 2020); for CelebA, we use a batch size of 256 and a learning rate of 0.0001, following (Nam et al., 2020). For both, we employ SGD with Nesterov momentum set to 0.9. Finally, for BAR, we employ a batch size of 256 and a learning rate of 0.001, with Adam as the optimizer (Kim et al., 2021). |
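The Dataset Splits row flags that exact partitions are not reported. One way a reproduction could still pin its own partitioning is a seeded, deterministic index split; a minimal sketch in plain Python (the 80/10/10 ratios and seed below are illustrative assumptions, not values from the paper):

```python
import random

def split_indices(n_samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Deterministically partition dataset indices into train/val/test.

    The ratios and seed are illustrative; the paper does not specify them.
    A fixed seed makes the shuffle, and hence the split, reproducible.
    """
    assert abs(sum(ratios) - 1.0) < 1e-9, "ratios must sum to 1"
    rng = random.Random(seed)           # local RNG: no global-state side effects
    idx = list(range(n_samples))
    rng.shuffle(idx)
    n_train = int(ratios[0] * n_samples)
    n_val = int(ratios[1] * n_samples)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train_idx, val_idx, test_idx = split_indices(1000)
```

Publishing the seed and ratios alongside the code would make the partitioning directly reproducible, which is exactly what the row above finds missing.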
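The Software Dependencies row notes that package versions are not given. A lightweight remedy is to log the installed versions at run time; a sketch using only the standard library (the helper name and package list are ours, chosen to match the packages the paper mentions):

```python
import importlib.metadata as md
import sys

def report_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that cannot be found are marked 'not installed' rather than
    raising, so the report is always complete. Emitting such a report with
    every run is one way to document the software stack for reproducibility.
    """
    report = {"python": sys.version.split()[0]}
    for pkg in packages:
        try:
            report[pkg] = md.version(pkg)
        except md.PackageNotFoundError:
            report[pkg] = "not installed"
    return report

# Packages named in the paper; versions depend on the local environment.
print(report_versions(["torchvision", "transformers", "sentence-transformers"]))
```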
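The per-dataset hyperparameters quoted in the Experiment Setup row can be collected into a single lookup table; a framework-agnostic sketch (the dictionary layout and helper name are ours, the numbers are the paper's):

```python
# Training hyperparameters as reported in the Experiment Setup quote.
# "momentum" applies to SGD with Nesterov; BAR uses Adam instead.
TRAIN_CONFIGS = {
    "waterbirds": {"batch_size": 128, "lr": 1e-3,
                   "optimizer": "sgd_nesterov", "momentum": 0.9},
    "celeba":     {"batch_size": 256, "lr": 1e-4,
                   "optimizer": "sgd_nesterov", "momentum": 0.9},
    "bar":        {"batch_size": 256, "lr": 1e-3, "optimizer": "adam"},
}

def get_config(dataset):
    """Look up the reported training configuration (case-insensitive)."""
    key = dataset.lower()
    if key not in TRAIN_CONFIGS:
        raise KeyError(f"no reported configuration for {dataset!r}")
    return TRAIN_CONFIGS[key]
```

Centralizing the settings this way makes it easy to verify each run against the values the paper states, e.g. `get_config("Waterbirds")["batch_size"]` yields 128.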