reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

FairDen: Fair Density-Based Clustering

Authors: Lena Krieger, Anna Beer, Pernille Matthews, Anneka Thiesson, Ira Assent

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show that FairDen finds meaningful and fair clusters in extensive experiments. 3 EXPERIMENTAL EVALUATION We measure cluster quality with DCSI (Gauss et al., 2024) and group-level fairness with generalized balance Bera et al. (2019) (Sect. 3.2, 3.1). We study real-world benchmarks (Sect. 3.3) and compare to state-of-the-art fair clustering methods FairSC (Kleindessner et al., 2019b), normalized FairSC (Kleindessner et al., 2019b), Fairlets (Chierichetti et al., 2017), and Scalable Fair Clustering (Backurs et al., 2019) (Section 3.4).
Researcher Affiliation	Academia	1 IAS-8: Data Analytics and Machine Learning, Forschungszentrum Jülich, Jülich, Germany 2 Faculty of Computer Science, University of Vienna, Vienna, Austria 3 Department of Computer Science, Aarhus University, Aarhus, Denmark
Pseudocode	Yes	A.1 PSEUDO-CODE The pseudo-code for our novel fair density-based clustering method FairDen is given in Algorithm 1: Algorithm 1 FairDen
Open Source Code	Yes	Our code is available at GitHub. Footnote 1: https://jugit.fz-juelich.de/ias-8/fairden
Open Datasets	Yes	We use the common benchmark datasets for fair clustering (Chhabra et al., 2021; Le Quy et al., 2022), details shown in Table 6: The datasets Adult (Kohavi et al., 1996), Bank (Moro et al., 2014), Communities and Crime (Asuncion & Newman, 2007), and Diabetes (Strack et al., 2014) provide different scenarios in terms of dimensionality and number of sensitive groups.
Dataset Splits	No	The paper describes sampling points from datasets (e.g., "We sampled 2000 data points from the dataset" for Adult, "We sample the dataset to 5000 data points" for Bank and Diabetes), and for runtime experiments, it mentions generating datasets with DENSIRED and randomly assigning a binary sensitive attribute. However, it does not specify explicit training/test/validation splits, percentages, or predefined partitions for the benchmark datasets used in the main evaluation.
Hardware Specification	Yes	The experiments are performed on a MacBook Pro, with an M2 Pro, and 16 GB of RAM using Python 3.9. The runtime experiments are performed on a workstation with an AMD Ryzen Threadripper PRO 3955W, 250 GB RAM, and an RTX 3090.
Software Dependencies	No	The experiments are performed on a MacBook Pro, with an M2 Pro, and 16 GB of RAM using Python 3.9. While Python 3.9 is mentioned, no specific version numbers for libraries or other key software components are provided to ensure full reproducibility.
Experiment Setup	Yes	Following Schubert et al. (2017), we fix the parameter µ = 2d − 1 for FairDen and show an ablation in App. A.2. The parameters for DBSCAN, minPts and ε, were determined with a hyperparameter optimization. The criterion for the optimization is the DCSI score with a constant setting of minPts_DCSI = 5. The parameter space for minPts_DBSCAN comprises {4, 5, 10, 15, 2d − 1}, with d being the dimension of the dataset, and for ε {0.01, 0.05, 0.1, .., 0.5, 0.6, .., 2.5, 2.6, 2.8, 3, 3.25, 3.5, 3.75}. The final parameter settings are given in Table 6.
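
The DBSCAN search space quoted in the Experiment Setup row can be sketched as a small candidate grid. This is an illustrative sketch, not the authors' code: the function names `min_pts_candidates` and `dbscan_grid` are hypothetical, the garbled "2d 1" is read as 2d − 1 (an assumption), and the ε list contains only the values written out explicitly, because the step sizes behind the ".." elisions are not stated. Scoring each pair with DCSI is left out.

```python
from itertools import product


def min_pts_candidates(d: int) -> list[int]:
    """Candidate minPts values for DBSCAN on a d-dimensional dataset."""
    # Assumption: the extracted "2d 1" is read as 2*d - 1.
    # A set removes the duplicate that appears when 2*d - 1 equals a fixed value.
    return sorted({4, 5, 10, 15, 2 * d - 1})


def dbscan_grid(d: int, eps_values: list[float]) -> list[tuple[int, float]]:
    """All (minPts, eps) pairs to evaluate; eps_values is supplied by the caller."""
    return list(product(min_pts_candidates(d), eps_values))


# Only the eps values written out explicitly in the setup.
# The ranges elided with ".." are deliberately not filled in.
EXPLICIT_EPS = [0.01, 0.05, 0.1, 0.5, 0.6, 2.5, 2.6, 2.8, 3, 3.25, 3.5, 3.75]

# Hypothetical example: a 5-dimensional dataset.
grid = dbscan_grid(d=5, eps_values=EXPLICIT_EPS)
```

Each pair in `grid` would then be passed to a DBSCAN implementation and ranked by the chosen criterion; the paper reports the selected settings in its Table 6.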