Sparse Popularity Adjusted Stochastic Block Model
Authors: Majid Noroozi, Marianna Pensky, Ramchandra Rimal
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Simulations and Real Data Examples In this section we evaluate the performance of our method using synthetic networks. We assume that the number of communities (clusters) K is known and for simplicity consider a perfectly balanced model with n/K nodes in each cluster. We generate each network from a random graph model with a symmetric probability matrix P given by the SPABM model with a clustering matrix Z and a block matrix Λ. |
| Researcher Affiliation | Academia | Majid Noroozi EMAIL Department of Mathematical Sciences University of Memphis Memphis, TN 38152, USA Marianna Pensky EMAIL Department of Mathematics University of Central Florida Orlando, FL 32816, USA Ramchandra Rimal EMAIL Department of Mathematical Sciences Middle Tennessee State University Murfreesboro, TN 37132, USA |
| Pseudocode | No | "In this paper, we use Sparse Subspace Clustering (SSC) since it allows one to take advantage of the knowledge that, for a given K, columns of matrix P lie in the union of K distinct subspaces, each of dimension at most K. If matrix P were known, the weight matrix W would be based on writing every data point as a sparse linear combination of all other points by solving the following optimization problem: min_{W_j} ‖W_j‖_1 s.t. (P)_j = Σ_{k≠j} W_{kj} (P)_k (40). In the case of data contaminated by noise, the SSC algorithm does not attempt to write data as an exact linear combination of other points. Instead, SSC can be built upon the solution of the elastic net problem Ŵ_j ∈ argmin_{W_j} { 2^{-1} ‖A_j − A W_j‖_2^2 + γ_1 ‖W_j‖_1 + γ_2 ‖W_j‖_2^2 } s.t. W_{jj} = 0, j = 1, …, n (41)." These are mathematical optimization problems, not structured pseudocode blocks. |
| Open Source Code | No | We solve (41) using the LARS algorithm Efron et al. (2004) implemented in SPAMS Matlab toolbox (see Mairal et al. (2014)). This refers to a third-party toolbox and does not indicate the authors are providing their own source code for the methodology described in the paper. |
| Open Datasets | Yes | "To study the ego-network, we use the dataset described comprehensively in Leskovec and Mcauley (2012)." and "Our second example involves analyzing a human brain functional network, constructed on the basis of the resting-state functional MRI (rsfMRI). We use the brain connectivity dataset presented as a Group Average rsfMRI matrix described in Crossley et al. (2013)." |
| Dataset Splits | No | "For our study, we extract the five largest circles of this network, obtaining a network with 629 nodes and 12557 edges." and "for evaluating the performance of SSC on this network, we extract 6 largest communities derived by the Asymptotical Surprise, obtaining a network with 422 nodes and 15447 edges." These statements describe data selection or subsetting for analysis, not explicit training/validation/test splits for model training or evaluation in a predictive context. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or specific cluster configurations) are mentioned in the paper. |
| Software Dependencies | No | "We solve (41) using the LARS algorithm Efron et al. (2004) implemented in SPAMS Matlab toolbox (see Mairal et al. (2014))." and "In this paper, we use the normalized cut algorithm Shi and Malik (2000) to perform spectral clustering." While the SPAMS Matlab toolbox is mentioned, no version numbers for SPAMS or Matlab are provided. |
| Experiment Setup | Yes | "solve optimization problem (41) with γ1 = 30ρ(A) and γ2 = 125(1 − ρ(A)), where ρ(A) is the density of matrix A, i.e., the proportion of nonzero entries of A. The values of γ1 and γ2 have been obtained empirically by testing on synthetic networks." |
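The column-wise elastic-net step of SSC quoted in the Pseudocode row (Eq. 41) can be sketched in plain coordinate descent. The code below is a hypothetical minimal re-implementation for illustration only: the paper solves (41) with the LARS algorithm via the SPAMS Matlab toolbox, so the function names here (`elastic_net_column`, `ssc_affinity`) are our own and the scaling of `gamma1`/`gamma2` need not match the authors' parameterization (γ1 = 30ρ(A), γ2 = 125(1 − ρ(A))) exactly.

```python
# Hypothetical sketch of the SSC elastic-net step in Eq. (41), not the
# authors' code: for each column j, minimize
#   0.5*||A_j - A w||_2^2 + gamma1*||w||_1 + gamma2*||w||_2^2,  w_j = 0,
# by cyclic coordinate descent with soft-thresholding.
import numpy as np

def _soft_threshold(x, t):
    """Soft-thresholding operator, the proximal map of t*|.|_1."""
    return np.sign(x) * max(abs(x) - t, 0.0)

def elastic_net_column(A, j, gamma1, gamma2, n_iter=200):
    """Solve the elastic-net problem for column j of A with w_j forced to 0."""
    n = A.shape[1]
    w = np.zeros(n)
    r = A[:, j].astype(float).copy()          # residual A_j - A w (w starts at 0)
    col_sq = (A.astype(float) ** 2).sum(axis=0)
    for _ in range(n_iter):
        for k in range(n):
            if k == j or col_sq[k] == 0.0:    # enforce W_jj = 0; skip empty columns
                continue
            r += A[:, k] * w[k]               # partial residual excluding coordinate k
            w[k] = _soft_threshold(A[:, k] @ r, gamma1) / (col_sq[k] + 2.0 * gamma2)
            r -= A[:, k] * w[k]
    return w

def ssc_affinity(A, gamma1, gamma2):
    """Stack the column solutions of (41) and symmetrize into an SSC affinity matrix."""
    n = A.shape[1]
    W = np.column_stack([elastic_net_column(A, j, gamma1, gamma2) for j in range(n)])
    return np.abs(W) + np.abs(W).T
```

In the paper's pipeline, spectral clustering (the normalized cut algorithm of Shi and Malik, 2000) would then be applied to the resulting symmetric affinity matrix.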