Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Augmented Sparsifiers for Generalized Hypergraph Cuts
Authors: Nate Veldt, Austin R. Benson, Jon Kleinberg
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In addition to its strong theoretical guarantees, Sparse Card is very practical and leads to substantial improvements in benchmark image segmentation problems and localized hypergraph clustering tasks. We focus on image segmentation and localized hypergraph clustering tasks that simultaneously include component functions of large and small support, which are common in practice (Shanu et al., 2016; Ene et al., 2017; Veldt et al., 2022; Liu et al., 2021; Purkait et al., 2016). Image segmentation experiments were run on a laptop with a 2.2 GHz Intel Core i7 processor and 8GB of RAM. For our local hypergraph clustering experiments, in order to run a larger number of experiments we used a machine with 4 x 18-core, 3.10 GHz Intel Xeon gold processors with 1.5 TB RAM. We consider public data sets previously made available for academic research, and use existing open source software for competing methods. Image data sets are available at http://people.csail.mit.edu/stefje/code.html and hypergraph clustering data sets are available at www.cs.cornell.edu/~arb/data/. Our code is available at https://github.com/nveldt/SparseCardDSFM. |
| Researcher Affiliation | Academia | Nate Veldt EMAIL Department of Computer Science and Engineering Texas A&M University College Station, TX, USA Austin R. Benson EMAIL Department of Computer Science Cornell University Ithaca, NY, USA Jon Kleinberg EMAIL Department of Computer Science Cornell University Ithaca, NY, USA |
| Pseudocode | Yes | Algorithm 1: Find-Best-PL-Approx(w, ε) (solves problem (22)). Input: w ∈ S_r, ε ≥ 0. Output: f ∈ F_r optimizing problem (22). L = {g(0)}, where g(0)(x) = w(1)·x; p = max{i ∈ [r] : g(0)(i) ≤ (1 + ε)·w(i)}; ℓ = p + 1. while ℓ ≤ r do: (g′, p) = Find-Next(w, ε, ℓ); ℓ ← p + 1; L ← L ∪ {g′}; if p = r then L ← L ∪ {g(r)}, where g(r)(x) = w(r); end if; end while. Return f defined by f(x) = min_{g ∈ L} g(x). |
| Open Source Code | Yes | Our code is available at https://github.com/nveldt/SparseCardDSFM. |
| Open Datasets | Yes | Image data sets are available at http://people.csail.mit.edu/stefje/code.html and hypergraph clustering data sets are available at www.cs.cornell.edu/~arb/data/. Our code is available at https://github.com/nveldt/SparseCardDSFM. We consider the smallplant and octopus segmentation tasks from Jegelka and Bilmes (2011). |
| Dataset Splits | Yes | We focus on a set of 45 local clusters, all of which are question topics involving between 2,000 and 10,000 nodes. For each cluster, we generate a random seed set by selecting 5% of the nodes in the target cluster uniformly at random, and then add neighboring nodes to the seed set to grow it into a larger input set Z to use for Hyper Local (see Veldt et al. 2020a for details). |
| Hardware Specification | Yes | Image segmentation experiments were run on a laptop with a 2.2 GHz Intel Core i7 processor and 8GB of RAM. For our local hypergraph clustering experiments, in order to run a larger number of experiments we used a machine with 4 x 18-core, 3.10 GHz Intel Xeon gold processors with 1.5 TB RAM. |
| Software Dependencies | No | The continuous optimization methods for DSFM that we compare against are implemented in C++ with a MATLAB front end. |
| Experiment Setup | Yes | We set δ = 5000 for the δ-linear hyperedge cut function and set a locality parameter for Hyper Local to be 1.0 for all experiments. With this setup, using Hyper Local with the δ-linear penalty will then reproduce our original experimental settings (Veldt et al., 2020a). Our goal is to show how using Sparse Card leads to fast and often improved results for alternative penalties that could not previously be used. Sparse Card was run for a range of ε values on a decreasing logarithmic scale from 1 to 10⁻⁴, and obtains significantly better results on all but the octopus with 500 superpixels instance. |
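The pseudocode quoted above builds a piecewise-linear approximation of a concave, cardinality-based weight function w as the minimum over a small set of linear pieces, f(x) = min over g in L of g(x), with each value kept within a (1 + ε) factor of w. A minimal sketch of that min-of-lines representation is below; the target function w(i) = √i, the hand-picked pieces, and the helper name `pl_approx` are illustrative assumptions, not taken from the paper (in particular, the paper's Find-Next routine for choosing pieces is not reproduced here).

```python
import math


def pl_approx(pieces):
    """f(x) = min over linear pieces (slope a, intercept b) of a*x + b.

    A minimum of affine functions is concave, matching the shape of
    concave cardinality-based penalties.
    """
    return lambda x: min(a * x + b for a, b in pieces)


# Illustrative target: w(i) = sqrt(i) on i = 1..9 (an assumption for this sketch).
w = {i: math.sqrt(i) for i in range(1, 10)}

# Hand-picked pieces: g0(x) = w(1)*x through the origin, a flat piece at w(9),
# and one intermediate line -- mirroring the first/last pieces in Algorithm 1.
pieces = [(1.0, 0.0), (0.0, 3.0), (0.25, 1.0)]
f = pl_approx(pieces)

# Worst-case multiplicative error of the approximation over the integer points:
# the smallest eps for which f(i) <= (1 + eps) * w(i) holds everywhere.
eps_needed = max(f(i) / w[i] for i in w) - 1.0
```

With these three pieces, f matches w exactly at i = 1, 4, and 9 and over-estimates it by at most about 6% elsewhere, i.e. the pieces realize the (1 + ε) guarantee with ε ≈ 0.06 for this toy target.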