Regularized spectral methods for clustering signed networks
Authors: Mihai Cucuringu, Apoorv Vikram Singh, Déborah Sulem, Hemant Tyagi
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We complement our theoretical results with an extensive set of numerical experiments on synthetic data, and three real world data sets standard in the signed networks literature. |
| Researcher Affiliation | Academia | Mihai Cucuringu (Department of Statistics and Mathematical Institute, University of Oxford; The Alan Turing Institute, London, UK); Apoorv Vikram Singh (Department of Computer Science and Engineering, New York University); Déborah Sulem (Department of Statistics, University of Oxford); Hemant Tyagi (Inria, Univ. Lille, CNRS, UMR 8524 Laboratoire Paul Painlevé, F-59000 Lille) |
| Pseudocode | No | The paper describes algorithms and procedures textually and mathematically throughout Sections 2 and 6, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Python implementations of a suite of algorithms for signed clustering are available at https://github.com/alan-turing-institute/signet |
| Open Datasets | Yes | Finally, we measure the performances of our unregularized and regularized algorithms on three benchmark data sets in the signed clustering problem: the Wikipedia Requests for Adminship, the Slashdot Zoo and Bitcoin data sets from Leskovec and Krevl (2014). |
| Dataset Splits | No | The paper describes the generation of synthetic data and the number of runs for experiments but does not provide explicit training/test/validation splits for either synthetic or real-world datasets. For instance, in Section 6.2, it states: "1. Fix s'_1 = 1/k, pick a value for ρ and compute s'_k = s'_1/ρ. 2. For i ∈ [2, k−1], sample s'_i from the uniform distribution on the interval [s'_1, s'_k]. 3. Compute the proportions s_i = s'_i / Σ_{i=1}^k s'_i, and then sample the graph from the resulting proportions. 4. Repeat 20 times the steps 1-3 mentioned above, and record the average performance over the 20 runs." |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware used for running its experiments, such as GPU or CPU models, or cloud computing specifications. |
| Software Dependencies | No | The paper mentions using "Python implementation", the "scikit-learn package", and the "signet package" but does not specify any version numbers for these software components. For example, it states: "using the implementation in scikit-learn of the k-means++ algorithm" and "with the implementation in the signet package (Cucuringu et al., 2019)". |
| Experiment Setup | Yes | In the following experiments, the Signed Stochastic Block Model will be sampled with the following set of parameters: the number of nodes n = 5000, the number of communities k ∈ {3, 5, 10, 20}, the relative size of communities ρ = 1 (equal-size clusters) and ρ = 1/k (non-equal size clusters). For the edge density parameter p, we choose two sparsity regimes, Regime I and Regime II... The noise level η is chosen such that the recovery of the clusters is unsatisfactory for a subset of pairs of parameters (τ+, τ−). For each set of parameters, we sample 20 graphs from the SSBM and average the resulting ARI. For the combinatorial and symmetric Signed Laplacians L and L_sym, we compute (k−1)-dimensional embeddings before applying the k-means++ algorithm. For all other methods, we use the k smallest eigenvectors. |
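The community-proportion procedure quoted in the Dataset Splits row can be sketched as below. This is a minimal illustration of the quoted steps 1-3 only; the function name and NumPy-based structure are assumptions, not the authors' code.

```python
import numpy as np

def sample_community_sizes(k, rho, rng=None):
    """Illustrative sketch of the quoted sampling steps:
    fix s'_1 = 1/k, set s'_k = s'_1 / rho, draw the remaining
    s'_i uniformly between them, then normalize to proportions."""
    rng = np.random.default_rng(rng)
    s = np.empty(k)
    s[0] = 1.0 / k           # step 1: fix s'_1 = 1/k
    s[k - 1] = s[0] / rho    # step 1: compute s'_k = s'_1 / rho
    # step 2: sample s'_i uniformly in the interval [s'_1, s'_k]
    lo, hi = min(s[0], s[k - 1]), max(s[0], s[k - 1])
    s[1:k - 1] = rng.uniform(lo, hi, size=k - 2)
    # step 3: normalize so the proportions sum to one
    return s / s.sum()
```

With ρ = 1/k this yields unequal cluster sizes (s'_k = 1, much larger than s'_1 = 1/k), matching the "non-equal size clusters" regime in the setup row.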
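The pipeline described in the Experiment Setup row (sample an SSBM, embed via the smallest eigenvectors of a signed Laplacian, cluster with k-means++, score with ARI) can be sketched at toy scale. Everything here is an assumed minimal re-implementation, not the paper's signet code: the generator is a simplified SSBM, and the plain signed Laplacian L = D̄ − A (with D̄ the degrees of |A|) stands in for the paper's family of regularized Laplacians.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def sample_ssbm(n, k, p, eta, rng=None):
    """Toy signed SBM (illustrative, not the paper's exact generator):
    each pair is connected with probability p; within-cluster edges are
    +1 and across-cluster edges are -1, with the sign flipped w.p. eta."""
    rng = np.random.default_rng(rng)
    labels = rng.integers(0, k, size=n)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                sign = 1.0 if labels[i] == labels[j] else -1.0
                if rng.random() < eta:  # noise flips the edge sign
                    sign = -sign
                A[i, j] = A[j, i] = sign
    return A, labels

def cluster_signed_laplacian(A, k):
    """Embed with the k smallest eigenvectors of L = D_bar - A
    (D_bar built from |A| degrees), then apply k-means++."""
    d = np.abs(A).sum(axis=1)
    L = np.diag(d) - A
    _, vecs = eigh(L, subset_by_index=[0, k - 1])
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    return km.fit_predict(vecs)
```

A usage sketch at small scale (the paper uses n = 5000 and averages ARI over 20 sampled graphs):

```python
A, truth = sample_ssbm(n=150, k=3, p=0.5, eta=0.05, rng=0)
pred = cluster_signed_laplacian(A, k=3)
print(adjusted_rand_score(truth, pred))
```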