Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Clustering with Semidefinite Programming and Fixed Point Iteration
Authors: Pedro Felzenszwalb, Caroline Klivans, Alice Paul
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that using fixed point iteration for rounding the Max k-Cut SDP relaxation leads to significantly better results when compared to randomized rounding. In Section 6 we compare our fixed point iteration method to the randomized rounding procedure in several examples, showing that the fixed point approach can produce much better clusterings in practice. Finally, in Section 6 we illustrate experimental results of our new rounding method and compare them to the randomized rounding approach from Frieze and Jerrum (1997). We also compare the result of our method to the k-means algorithm for clustering images in the MNIST handwritten digit dataset. |
| Researcher Affiliation | Academia | Pedro Felzenszwalb, School of Engineering and Department of Computer Science, Brown University, Providence, RI 02912, USA; Caroline Klivans, Division of Applied Mathematics, Brown University, Providence, RI 02912, USA; Alice Paul, Department of Biostatistics, Brown University, Providence, RI 02912, USA |
| Pseudocode | No | The paper describes algorithms and methods in prose and mathematical formulations but does not present any explicitly labeled pseudocode blocks or algorithms in a structured, code-like format. |
| Open Source Code | No | The paper mentions "The algorithms were implemented in Python" but does not provide any statement about releasing the code or a link to a repository for the methodology described. |
| Open Datasets | Yes | Section 6.2 MNIST digits: In this section we evaluate the performance of our clustering method on a subset of the MNIST handwritten digits dataset (Le Cun et al.). The full reference is provided: Yann Le Cun, Corinna Cortes, and Christopher Burges. MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/. |
| Dataset Splits | No | For synthetic data: "We performed several experiments using synthetic data in R^2...". For MNIST: "In each trial we selected 20 random examples for each of 5 digits (0, 1, 2, 3 and 4).". The paper describes data generation and sampling but does not provide specific train/test/validation splits with percentages, sample counts, or defined random seeds for reproducibility. |
| Hardware Specification | Yes | The algorithms were implemented in Python and run on a computer with an Intel i7 CPU @ 2.6 GHz with 8GB of RAM. |
| Software Dependencies | No | The algorithms were implemented in Python... We use the cvxpy package for convex optimization together with the SCS (splitting conic solver) package to solve SDPs. We used the scipy library implementation of k-means. While specific software packages are mentioned, no version numbers are provided for Python, cvxpy, SCS, or scipy. |
| Experiment Setup | Yes | The fixed point iteration method converged after 1 to 5 iterations in each case. The number of iterations until convergence in the first setting was between 3 and 10, with an average of 7.02. The number of iterations until convergence in the second setting was between 3 and 4, with an average of 3.01. Each trial involved 10 different random initializations of the initial cluster centers using the k-means++ method. |