Curvature-based Clustering on Graphs
Authors: Yu Tian, Zachary Lubberts, Melanie Weber
JMLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide both theoretical and empirical evidence for the utility of our curvature-based clustering algorithms. In addition, we give several results on the relationship between the curvature of a graph and that of its dual, which enable the efficient implementation of our proposed mixed-membership community detection approach and which may be of independent interest for curvature-based network analysis. |
| Researcher Affiliation | Academia | Yu Tian EMAIL Center for Systems Biology Dresden Dresden, 01307, Germany Zachary Lubberts EMAIL Department of Statistics University of Virginia Charlottesville, VA 22904, USA Melanie Weber EMAIL Harvard University Cambridge, MA 02138, USA |
| Pseudocode | Yes | Algorithm 1 Curvature-based Clustering via Ricci Flow |
| Open Source Code | No | Our implementations for augmented FRC and our ORC approximation build on the code in this library. |
| Open Datasets | Yes | To test our methods on real data, we consider two data sets, (i) collaboration networks from DBLP (Yang and Leskovec, 2015), and (ii) ego-networks from Facebook (Leskovec and Mcauley, 2012), for which ground truth labels are available through SNAP (Leskovec and Krevl, 2014)... Obtained from Neuro Data’s open source data base https://neurodata.io/project/connectomes/, accessed on 2 May 2022. |
| Dataset Splits | No | For each set of parameters, we generate ns = 10 graphs and run our algorithms for both ORC and FRC, together with other popular community detection methods... For each set of parameters, we construct ns = 10 graphs, and run our algorithms for both ORC and FRC, together with the Louvain algorithm and Spectral clustering, in the same way as in the single-membership setting, but on the line graphs L(G). |
| Hardware Specification | No | Part of the computations in this paper were run on the FASRC Cannon cluster supported by the FAS Division of Science Research Computing Group at Harvard University. Part of the computations were enabled by resources provided by the Swedish National Infrastructure for Computing (SNIC) at the PDC Center for High Performance Computing, KTH Royal Institute of Technology, partially funded by the Swedish Research Council through grant agreement no. 2018-05973. |
| Software Dependencies | No | Specifically, for ORC, the optimal transport can be obtained by solving the earth mover s distance exactly (ORC-E) or approximately with Sinkhorn (ORC-S), while we can also approximate it by the mean of the upper and lower bounds we have developed (ORC-A). For FRC, we can consider the 1-d version as in Equation (4.9) where no face information is included (FRC-1), the augmented version with triangular faces as in Equation (4.11) (FRC-2) and also the one with further quadrangular faces as in Definition 17 (FRC-3). For Spectral Clustering, we present the results from the following variants: (i) the highly optimized version in scikit-learn (Pedregosa et al., 2011) together with a grid search for the number of clusters, from 2 up to 10 (Spectra-S), and (ii) automatic embedding dimension selection (Zhu and Ghodsi, 2006), followed by kmeans clustering (Spectra-A). |
| Experiment Setup | Yes | We further choose equally-spaced cut-offpoints {xi}nf i=0, where x0 = max{u,v} E w T u,v, x1 = x0 δ, . . . , xnf = ((x0 1) mod δ + 1), and the step size for the cut-offpoints is set to be δ = 0.025. Other hyperparameter choices include a constant step size ν = 1, ϵ = 10 4, drop threshold ϵd = 0.1, and stopping time T = 10, i.e., we evolve edge weights under Ricci flow for 10 iterations... In the absence of given edge weights, we initialize edge weights to 1 as before. Adaptive step sizes are chosen proportional to the inverse of the highest absolute curvature of any edge in a given iteration, i.e., νt = 1/(1.1 max{u,v} E |κt u,v|). In our experiments below, we evolve edge-weights for T = 10 iterations under the Ricci flow. To infer communities, cut-offpoints are chosen as {xi}nf i=0, where x0 = max{u,v} E w T u,v, x1 = max{u,v} E,w Tu,v<x0 w T u,v, . . . , xn f = w T q , xn f+1 = xn f δ, . . . , xnf = ((xn f 1.1wmin) mod δ + 1.1wmin). Here, w T q is the qth quantile of all weights; we set q = 0.999, step size δ = 0.25, and wmin = min{u,v} E w T u,v. |