Approximation Algorithms for Stochastic Clustering
Authors: David G. Harris, Shi Li, Thomas Pensyl, Aravind Srinivasan, Khoa Trinh
JMLR 2019 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We consider stochastic settings for clustering, and develop provably-good approximation algorithms for a number of these notions. These algorithms yield better approximation ratios compared to the usual deterministic clustering setting. Additionally, they offer a number of advantages including clustering which is fairer and has better long-term behavior for each user. In particular, they ensure that every user is guaranteed to get good service (on average). We also complement some of these with impossibility results. ... Keywords: clustering, k-center, k-median, lottery, approximation algorithms |
| Researcher Affiliation | Collaboration | David G. Harris EMAIL Department of Computer Science, University of Maryland College Park, MD 20742; Shi Li EMAIL University at Buffalo, Buffalo, NY; Thomas Pensyl EMAIL Bandwidth, Inc. Raleigh, NC; Aravind Srinivasan EMAIL Department of Computer Science and Institute for Advanced Computer Studies University of Maryland, College Park, MD 20742; Khoa Trinh EMAIL Google, Mountain View, CA 94043 |
| Pseudocode | Yes | Algorithm 1 Greedy Cluster(F, w)... Algorithm 2 Rounding algorithm for half-homogeneous chance k-coverage... Algorithm 3 Iterative rounding algorithm for chance k-coverage... Algorithm 4 Rounding algorithm with clusters... Algorithm 5 Rounding algorithm for k-center... Algorithm 7 Partial-cluster based algorithm... Algorithm 8 Approximation algorithm for E[d(j, S)]: first phase... Algorithm 9 (α, β)-determinization algorithm... Algorithm 10 (1, k + 2)-determinization algorithm |
| Open Source Code | No | We wrote C code to perform this computation to upper-bound EQ ˆR with M = 10, ϵ = 2 12, L = 7. |
| Open Datasets | No | The paper focuses on theoretical approximation algorithms for clustering problems and does not describe experiments run on specific datasets, therefore no concrete access information for open datasets is provided. |
| Dataset Splits | No | The paper presents theoretical approximation algorithms and does not include empirical experiments with datasets that would require training/test/validation splits. |
| Hardware Specification | No | We wrote C code to perform this computation to upper-bound EQ ˆR with M = 10, ϵ = 2 12, L = 7. This runs in about an hour on a single CPU core. |
| Software Dependencies | No | We wrote C code to perform this computation to upper-bound EQ ˆR with M = 10, ϵ = 2 12, L = 7. |
| Experiment Setup | No | The paper is theoretical, focusing on algorithm design and proofs, and does not contain details about specific experimental setups, hyperparameters, or training configurations. |