Signed Laplacians for Constrained Graph Clustering
Authors: John Stewart Fabila Carrasco, He Sun
ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted experiments to compare the spectral clustering method with the constrained clustering approach, using both synthetic and real-world datasets. The experimental results, visualised in Figure 1, illustrate the superior performance of CC++ compared to traditional SC, particularly as the inter-cluster edge probability q increases. Our experiments on synthetic and real-world datasets confirm the robustness and scalability of our algorithm, significantly outperforming standard spectral clustering under challenging scenarios. |
| Researcher Affiliation | Academia | 1School of Informatics, University of Edinburgh, Edinburgh, United Kingdom. Correspondence to: He Sun <EMAIL>. |
| Pseudocode | Yes | See Algorithm 1 for the formal description of our algorithm. Algorithm 1 The Constrained Clustering Algorithm |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We considered a binary Stochastic Block Model (SBM) with n = 1, 000 vertices divided into two equal-sized communities. Based on the Geometric Random Graphs (RGG) (Avrachenkov et al., 2021; Dall & Christensen, 2002), we generated two clusters of vertices... We evaluated SC and CC++ on real-world temperature data from ground stations in Brittany, January 2014 (Girault, 2015). |
| Dataset Splits | No | The paper describes the generation of synthetic datasets and the use of a real-world dataset but does not provide specific training/test/validation splits. For the synthetic datasets, it describes how the graphs were generated and varied (e.g., 'n = 1,000 vertices divided into two equal-sized communities'), and for the real-world data, it mentions using 'all available measurements, one per hour, across 24 hours over 31 days, totaling 744 observations', but no explicit splits for training, validation, or testing are detailed for any dataset. |
| Hardware Specification | Yes | All simulations were run on a PC equipped with an Intel Core i7-10610U CPU running at 1.80 GHz and 32 GB of RAM, using MATLAB R2024a for computation. |
| Software Dependencies | Yes | All simulations were run on a PC equipped with an Intel Core i7-10610U CPU running at 1.80 GHz and 32 GB of RAM, using MATLAB R2024a for computation. |
| Experiment Setup | Yes | Specifically, we fixed p = 0.2 and varied q from 0.12 to 0.2 in 30 equidistant steps. Each configuration of cluster separation was repeated 10 times, and we generated G and H on the same set of vertices but with different connectivity structures. Parameters were set to σ2 1 = 5 108 and σ2 = 105 (Girault, 2015). |