Modeling Random Networks with Heterogeneous Reciprocity
Authors: Daniel Cirkovic, Tiandong Wang
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare Bayesian and frequentist model fitting techniques for large networks, as well as computationally efficient variational alternatives. Cases where the number of communities is known and unknown are both considered. We apply the presented methods to the analysis of Facebook and Reddit networks where users have nonuniform reciprocal behavior patterns. |
| Researcher Affiliation | Academia | Daniel Cirkovic, Department of Statistics, Texas A&M University, College Station, TX 77843, USA; Tiandong Wang, Shanghai Center for Mathematical Sciences, Fudan University, Shanghai 200438, China |
| Pseudocode | Yes | Algorithm 1: Gibbs sampling for heterogeneous reciprocal PA with known K; Algorithm 2: CAVI for heterogeneous reciprocal PA with known K; Algorithm 3: VEM for heterogeneous reciprocal PA with known K; Algorithm 4: Telescoping sampler for heterogeneous reciprocal PA with unknown K; Algorithm 5: Initialization of VEM for heterogeneous reciprocal PA |
| Open Source Code | No | The paper does not contain any explicit statement about providing access to the source code for the methodology described, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | We apply the presented methods to the analysis of Facebook and Reddit networks where users have nonuniform reciprocal behavior patterns. Now we apply the heterogeneous reciprocal PA model to the Facebook wall post data from KONECT analyzed in Viswanath et al. (2009) and Cirkovic et al. (2023a). We additionally analyze Reddit user replies from December 11th, 2005 to December 31st, 2006 (Hessel et al., 2016; Liu et al., 2019). |
| Dataset Splits | No | The paper describes trimming and processing the datasets, e.g., 'This trimming procedure results in a connected network of 16,099 nodes and 123,920 edges...', but does not specify any training, validation, or test splits for experimental reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'R's mcmcse package (Flegal et al., 2017)' and the 'R package igraph (Csardi et al., 2006)', but does not provide specific version numbers for these or other ancillary software components. |
| Experiment Setup | Yes | For the VEM algorithm, we terminate the E-step once either the ELBO has increased by less than ϵ = 0.01 or the total number of iterations exceeds 500, and terminate the entire algorithm once the element-wise differences in the parameters fall below κ = 0.01. We also terminate the VB algorithm via the same conditions as in the E-step of the VEM algorithm. We run the fully Bayesian method for M = 5,000 MCMC samples, and discard the first half as burn-in. Further, when K is unknown, we assume a BNB(1, 4, 3) prior on K as recommended by Frühwirth-Schnatter et al. (2021) and set Kmax = 20. For the variational methods, we search over K = 1, 2, 3, 4. |
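The stopping rules quoted in the Experiment Setup row (E-step halts once the ELBO gain drops below ϵ = 0.01 or 500 iterations pass; the outer loop halts once element-wise parameter changes fall below κ = 0.01) can be sketched generically. This is a hypothetical illustration of those termination criteria only: the toy `elbo_fn`, coordinate update, and M-step below are stand-ins, not the paper's reciprocal preferential-attachment model.

```python
import numpy as np

def run_e_step(elbo_fn, state, eps=0.01, max_iter=500):
    """Toy E-step: iterate until the ELBO gain is below eps or max_iter is hit."""
    prev = -np.inf
    for it in range(max_iter):
        state = state * 0.5 + 1.0          # toy coordinate update (fixed point at 2.0)
        cur = elbo_fn(state)
        if cur - prev < eps:               # ELBO improved by less than eps: stop
            break
        prev = cur
    return state, it + 1

def vem(theta0, eps=0.01, kappa=0.01):
    """Toy outer VEM loop: stop when element-wise parameter changes fall below kappa."""
    theta = np.asarray(theta0, dtype=float)
    while True:
        # Stand-in ELBO: concave with maximum at 2.0
        new_theta, _ = run_e_step(lambda s: -np.sum((s - 2.0) ** 2), theta, eps)
        new_theta = 0.5 * (new_theta + 2.0)  # toy M-step pulling parameters toward 2.0
        if np.max(np.abs(new_theta - theta)) < kappa:
            return new_theta
        theta = new_theta
```

With `theta0 = [0.0]`, the loop converges near the toy optimum 2.0 after a few outer iterations; the same two-threshold pattern (ELBO tolerance inside, parameter tolerance outside) is what the quoted ϵ and κ control in the paper's VEM algorithm.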