Signed Laplacians for Constrained Graph Clustering

Authors: John Stewart Fabila Carrasco, He Sun

ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We conducted experiments to compare the spectral clustering method with the constrained clustering approach, using both synthetic and real-world datasets. The experimental results, visualised in Figure 1, illustrate the superior performance of CC++ compared to traditional SC, particularly as the inter-cluster edge probability q increases. Our experiments on synthetic and real-world datasets confirm the robustness and scalability of our algorithm, significantly outperforming standard spectral clustering under challenging scenarios.
Researcher Affiliation Academia 1School of Informatics, University of Edinburgh, Edinburgh, United Kingdom. Correspondence to: He Sun <EMAIL>.
Pseudocode Yes See Algorithm 1 for the formal description of our algorithm. Algorithm 1 The Constrained Clustering Algorithm
Open Source Code No The paper does not contain an explicit statement about the release of source code, nor does it provide a link to a code repository.
Open Datasets Yes We considered a binary Stochastic Block Model (SBM) with n = 1, 000 vertices divided into two equal-sized communities. Based on the Geometric Random Graphs (RGG) (Avrachenkov et al., 2021; Dall & Christensen, 2002), we generated two clusters of vertices... We evaluated SC and CC++ on real-world temperature data from ground stations in Brittany, January 2014 (Girault, 2015).
Dataset Splits No The paper describes the generation of synthetic datasets and the use of a real-world dataset but does not provide specific training/test/validation splits. For the synthetic datasets, it describes how the graphs were generated and varied (e.g., 'n = 1,000 vertices divided into two equal-sized communities'), and for the real-world data, it mentions using 'all available measurements, one per hour, across 24 hours over 31 days, totaling 744 observations', but no explicit splits for training, validation, or testing are detailed for any dataset.
Hardware Specification Yes All simulations were run on a PC equipped with an Intel Core i7-10610U CPU running at 1.80 GHz and 32 GB of RAM, using MATLAB R2024a for computation.
Software Dependencies Yes All simulations were run on a PC equipped with an Intel Core i7-10610U CPU running at 1.80 GHz and 32 GB of RAM, using MATLAB R2024a for computation.
Experiment Setup Yes Specifically, we fixed p = 0.2 and varied q from 0.12 to 0.2 in 30 equidistant steps. Each configuration of cluster separation was repeated 10 times, and we generated G and H on the same set of vertices but with different connectivity structures. Parameters were set to σ2 1 = 5 108 and σ2 = 105 (Girault, 2015).