EL-Clustering: Combining Upper- and Lower-Bounded Clusterings for Equitable Load Constraints
Authors: Rajni Dabas, Neelima Gupta, Rudra Bhardwaj, Sapna Grover
TMLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also do empirical evaluation for k-Median objective on benchmark datasets to show that both, the cost as well as the violation factor are significantly smaller in practice than the theoretical worst-case guarantees. |
| Researcher Affiliation | Academia | Rajni Dabas EMAIL Department of Computer Science, Northwestern University P.G.D.A.V. College, University of Delhi Neelima Gupta EMAIL Department of Computer Science, University of Delhi Rudra Bhardwaj EMAIL Department of Computer Science, University of Delhi Sapna Grover EMAIL Department of Computer Science, University of Delhi Dyal Singh College, University of Delhi |
| Pseudocode | Yes | Algorithm 1: Constructing Graph G2(σL, σU) Algorithm 2: Breaking Cycles: Constructing an almost-DAG G2(ˆσL, σU) Algorithm 3: Process(Si) Algorithm 4: Constructing SI |
| Open Source Code | Yes | All source code, logs, and charts are publicly available8. 8https://github.com/0-rudra-0/el-clustering |
| Open Datasets | Yes | We empirically evaluate our combination algorithm for the k-Median objective on benchmark datasets from the UCI Machine Learning Repository, as preprocessed in Almanza et al. (2022): Adult, Diabetes, and Bank. The Adult dataset Dua & Graff (2017) contains census records, the Bank dataset Moro et al. (2014) contains data points from a marketing campaign, and the Diabetes dataset Strack et al. (2014) contains admission records of diabetic patients. |
| Dataset Splits | No | The paper mentions varying clustering parameters (k, L, U) for experiments but does not provide specific training/test/validation dataset splits for the benchmark datasets used. |
| Hardware Specification | Yes | The algorithm is implemented in Python and executed on a Apple laptop with an Apple M1 processor, 8 GB RAM, running mac OS (Tahoe 26 beta). |
| Software Dependencies | No | The paper states the algorithm is implemented in Python but does not specify a version number for Python or any specific libraries/packages used. |
| Experiment Setup | Yes | We vary the number of clusters k for k {10, 25, 50, 75, 100}. The lower bound L and upper bound U are derived from the natural cluster size n/k, with buffers of {10%, 25%, 50%}. Facility pool F of varying sizes ({5%, 10%, 20%, 25%} of the input size) are selected randomly from the input points. |