reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Relative Error Fair Clustering in the Weak-Strong Oracle Model

Authors: Vladimir Braverman, Prathamesh Dharangutte, Shaofeng H.-C. Jiang, Hoai-An Nguyen, Chen Wang, Yubo Zhang, Samson Zhou

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We implement and evaluate our algorithm on two real-world datasets for fair k-median. We use Adult which has about 50, 000 instances and 8 features (Becker & Kohavi, 1996). The second dataset is Default of Credit Card Clients with about 30, 000 instances and 9 features (Yeh, 2009). For both we make the sensitive attribute gender. ... We compare our algorithm against a baseline algorithm that uniformly at random samples and re-weights points. The experiments show that the proposed algorithm in Theorem 1 could significantly outperform the uniform baseline. We defer the experiments to Section 5.
Researcher Affiliation	Collaboration	1Google Research 2Department of Computer Science, Rutgers University 3School of Computer Science, Peking University 4Computer Science Department, Carnegie Mellon University 5Department of Computer Science, Rice University 6Department of Computer Science & Engineering, Texas A&M University.
Pseudocode	Yes	Algorithm 1. The construction algorithm for (1 + ε) coreset ... Algorithm 2 (CORESET-UPDATE). An algorithm that adds points to coreset. ... Algorithm 3 (PEELING). An algorithm that removes points. ... Algorithm 4 (CONSERVATIVE-PEELING). An algorithm that removes points. ... Algorithm 5. The construction algorithm for (1 + ε) coreset ... Algorithm 6. A charging scheme (a thought process for the analysis purpose only). ... Algorithm 7. A charging scheme (a thought process for the analysis purpose only).
Open Source Code	No	All experiments were run locally on a Macbook Air. The code can be found here.
Open Datasets	Yes	We use Adult which has about 50, 000 instances and 8 features (Becker & Kohavi, 1996). The second dataset is Default of Credit Card Clients with about 30, 000 instances and 9 features (Yeh, 2009).
Dataset Splits	No	Due to computational limits, we first subsampled the dataset to size roughly 2000 using Meyerson sampling (Meyerson, 2001a). ... After subsampling the original dataset using Meyerson sampling to a smaller dataset, we ran our fair coreset algorithm on this smaller dataset to create a coreset of size 100. We also made a uniform coreset as a baseline which simply uniformly selected 100 points.
Hardware Specification	Yes	All experiments were run locally on a Macbook Air.
Software Dependencies	No	We use the fairtree algorithm (Backurs et al., 2019), which we run on top of the uniform and our coreset. ... Due to computational limits, we first subsampled the dataset to size roughly 2000 using Meyerson sampling (Meyerson, 2001a).
Experiment Setup	Yes	For all experiments we use the values of balance parameters p = 1 and q = 10 (refer to Backurs et al. (2019) for more details) and we report the cost averaged over 10 independent runs.