Sparse Popularity Adjusted Stochastic Block Model
Authors: Majid Noroozi, Marianna Pensky, Ramchandra Rimal
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Simulations and Real Data Examples In this section we evaluate the performance of our method using synthetic networks. We assume that the number of communities (clusters) K is known and for simplicity consider a perfectly balanced model with n/K nodes in each cluster. We generate each network from a random graph model with a symmetric probability matrix P given by the SPABM model with a clustering matrix Z and a block matrix Λ. |
| Researcher Affiliation | Academia | Majid Noroozi EMAIL Department of Mathematical Sciences University of Memphis Memphis, TN 38152, USA Marianna Pensky EMAIL Department of Mathematics University of Central Florida Orlando, FL 32816, USA Ramchandra Rimal EMAIL Department of Mathematical Sciences Middle Tennessee State University Murfreesboro, TN 37132, USA |
| Pseudocode | No | "In this paper, we use Sparse Subspace Clustering (SSC) since it allows one to take advantage of the knowledge that, for a given K, columns of matrix P lie in the union of K distinct subspaces, each of dimension at most K. If matrix P were known, the weight matrix W would be based on writing every data point as a sparse linear combination of all other points by solving the following optimization problem: min_{W_j} ‖W_j‖_1 s.t. (P)_j = Σ_{k≠j} W_{kj} (P)_k (40). In the case of data contaminated by noise, the SSC algorithm does not attempt to write data as an exact linear combination of other points. Instead, SSC can be built upon the solution of the elastic net problem Ŵ_j ∈ argmin_{W_j} { 2^{-1} ‖A_j − A W_j‖_2^2 + γ_1 ‖W_j‖_1 + γ_2 ‖W_j‖_2^2 } s.t. W_{jj} = 0, j = 1, …, n (41)." These are mathematical optimization problems, not structured pseudocode blocks. |
| Open Source Code | No | We solve (41) using the LARS algorithm Efron et al. (2004) implemented in SPAMS Matlab toolbox (see Mairal et al. (2014)). This refers to a third-party toolbox and does not indicate the authors are providing their own source code for the methodology described in the paper. |
| Open Datasets | Yes | "To study the ego-network, we use the dataset described comprehensively in Leskovec and Mcauley (2012)." and "Our second example involves analyzing a human brain functional network, constructed on the basis of the resting-state functional MRI (rsfMRI). We use the brain connectivity dataset presented as a Group Average rsfMRI matrix described in Crossley et al. (2013)." |
| Dataset Splits | No | "For our study, we extract the five largest circles of this network, obtaining a network with 629 nodes and 12557 edges." and "for evaluating the performance of SSC on this network, we extract 6 largest communities derived by the Asymptotical Surprise, obtaining a network with 422 nodes and 15447 edges." These statements describe data selection or subsetting for analysis, not explicit training/validation/test splits for model training or evaluation in a predictive context. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or specific cluster configurations) are mentioned in the paper. |
| Software Dependencies | No | "We solve (41) using the LARS algorithm Efron et al. (2004) implemented in SPAMS Matlab toolbox (see Mairal et al. (2014))." and "In this paper, we use the normalized cut algorithm Shi and Malik (2000) to perform spectral clustering." While the SPAMS Matlab toolbox is mentioned, no version numbers for SPAMS or Matlab are provided. |
| Experiment Setup | Yes | "solve optimization problem (41) with γ1 = 30ρ(A) and γ2 = 125(1 − ρ(A)), where ρ(A) is the density of matrix A, i.e., the proportion of nonzero entries of A. The values of γ1 and γ2 have been obtained empirically by testing on synthetic networks." |
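The column-wise elastic-net step of SSC quoted in the Pseudocode row (Eq. 41) can be sketched in plain coordinate descent. The code below is a hypothetical minimal re-implementation for illustration only: the paper solves (41) with the LARS algorithm via the SPAMS Matlab toolbox, so the function names here (`elastic_net_column`, `ssc_affinity`) are our own and the scaling of `gamma1`/`gamma2` need not match the authors' parameterization (γ1 = 30ρ(A), γ2 = 125(1 − ρ(A))) exactly.

```python
# Hypothetical sketch of the SSC elastic-net step in Eq. (41), not the
# authors' code: for each column j, minimize
#   0.5*||A_j - A w||_2^2 + gamma1*||w||_1 + gamma2*||w||_2^2,  w_j = 0,
# by cyclic coordinate descent with soft-thresholding.
import numpy as np

def _soft_threshold(x, t):
    """Soft-thresholding operator, the proximal map of t*|.|_1."""
    return np.sign(x) * max(abs(x) - t, 0.0)

def elastic_net_column(A, j, gamma1, gamma2, n_iter=200):
    """Solve the elastic-net problem for column j of A with w_j forced to 0."""
    n = A.shape[1]
    w = np.zeros(n)
    r = A[:, j].astype(float).copy()          # residual A_j - A w (w starts at 0)
    col_sq = (A.astype(float) ** 2).sum(axis=0)
    for _ in range(n_iter):
        for k in range(n):
            if k == j or col_sq[k] == 0.0:    # enforce W_jj = 0; skip empty columns
                continue
            r += A[:, k] * w[k]               # partial residual excluding coordinate k
            w[k] = _soft_threshold(A[:, k] @ r, gamma1) / (col_sq[k] + 2.0 * gamma2)
            r -= A[:, k] * w[k]
    return w

def ssc_affinity(A, gamma1, gamma2):
    """Stack the column solutions of (41) and symmetrize into an SSC affinity matrix."""
    n = A.shape[1]
    W = np.column_stack([elastic_net_column(A, j, gamma1, gamma2) for j in range(n)])
    return np.abs(W) + np.abs(W).T
```

In the paper's pipeline, spectral clustering (the normalized cut algorithm of Shi and Malik, 2000) would then be applied to the resulting symmetric affinity matrix.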