Edge Sampling Using Local Network Information

Authors: Can M. Le

JMLR 2021

Reproducibility Variable | Result | LLM Response
Research Type: Experimental — In this section, we empirically analyze the behavior of the parameter α and the accuracy of the proposed sampling methods. We use these methods to sparsify networks, both simulated and real-world, and then measure the accuracy of the resulting sparsified networks by comparing their Laplacians with those of the original networks.
Researcher Affiliation: Academia — Can M. Le, Department of Statistics, University of California, Davis, Davis, CA 95616-5270, USA
Pseudocode: Yes — Algorithm 1 (Estimating the numbers of common neighbors): Choose ε ∈ (0, 1) and k ≥ 1. For each edge (i, j), let i be the node with d_i ≥ d_j. If d_i ≤ k, calculate t_ij directly by counting the number of elements of N_i ∩ N_j. If d_i > k, sample k neighbors Z_1, ..., Z_k of i independently and uniformly from N_i, calculate θ̂_ij = (1/k) Σ_{ℓ=1}^{k} 1(Z_ℓ ∈ N_j), and proceed as follows: if θ̂_ij < ε, calculate t_ij directly by counting the number of elements of N_i ∩ N_j; if θ̂_ij ≥ ε, estimate t_ij by t̂_ij = d_i θ̂_ij.
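Algorithm 1 as quoted above can be sketched in Python roughly as follows. This is a minimal illustration, not the paper's implementation; the function name, the adjacency-set representation, and the default values of `eps` and `k` are assumptions.

```python
import random

def estimate_common_neighbors(neighbors, i, j, eps=0.1, k=50, rng=random):
    """Estimate t_ij = |N_i ∩ N_j| following the steps of Algorithm 1.

    `neighbors` maps each node to the set of its neighbors (a hypothetical
    representation chosen for this sketch).
    """
    # Let i be the endpoint with the larger degree.
    if len(neighbors[i]) < len(neighbors[j]):
        i, j = j, i
    d_i = len(neighbors[i])
    if d_i <= k:
        # Small degree: count the intersection exactly.
        return len(neighbors[i] & neighbors[j])
    # Sample k neighbors of i uniformly (with replacement).
    sample = rng.choices(list(neighbors[i]), k=k)
    theta_hat = sum(z in neighbors[j] for z in sample) / k
    if theta_hat < eps:
        # Estimated overlap is small: fall back to exact counting.
        return len(neighbors[i] & neighbors[j])
    # Otherwise return the sampled estimate t̂_ij = d_i * θ̂_ij.
    return d_i * theta_hat
```

The exact-counting fallback for small θ̂_ij keeps the estimator from returning a high-variance estimate when few sampled neighbors land in N_j.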
Open Source Code: No — The paper neither states that source code for the described methodology is released nor links to a code repository.
Open Datasets: Yes — We further report in Table 2 the value of α, the clustering coefficient, and the average degree of several well-known real-world networks: the karate club network (Zachary, 1977), dolphins network (Lusseau et al., 2003), political blogs network (Adamic and Glance, 2005), Facebook ego network (McAuley and Leskovec, 2012), Astrophysics collaboration network (Leskovec et al., 2007), Enron email network (Klimt and Yang, 2004), Twitter social circles (Yang and Leskovec, 2012), Google+ social circles (Yang and Leskovec, 2012), DBLP collaboration network (Yang and Leskovec, 2012), and LiveJournal social network (Yang and Leskovec, 2012).
Dataset Splits: No — The paper focuses on graph sparsification and on evaluating the sparsified graph against the original. It does not describe typical machine learning dataset splits (e.g., training, validation, and test sets) for tasks such as classification or regression.
Hardware Specification: No — The paper does not specify the hardware used for the numerical studies, such as CPU/GPU models, memory, or the computing environment.
Software Dependencies: No — The paper mentions an "MPI-based distributed memory parallel algorithm in Arifuzzaman et al. (2019)", but this refers to a method from another work, not to software dependencies or versions used for the current paper's implementation.
Experiment Setup: Yes — For each ρ ∈ {0.01, 0.1}, we consider three settings corresponding to different community size ratios: (1/20, 9/20, 1/2), (1/10, 2/5, 1/2), and (1/3, 1/3, 1/3). ... We vary the sample size m by setting m = τn, where τ is the sample size factor taking values from 10 to 210. To approximate the number of common neighbors for CNA, we sample k = 50 neighbors using Algorithm 1. ... We vary the sample size m by setting m = τn with τ ∈ [10, 50]. To approximate the number of common neighbors for CNA, we sample k = 20 neighbors according to Algorithm 1.
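The parameter grid quoted above could be laid out as below. This is only a sketch of the reported configuration; the step size for τ and all variable names are assumptions, not taken from the paper.

```python
# Hypothetical layout of the experiment grid described above.
densities = [0.01, 0.1]                      # ρ values
size_ratios = [                              # community size ratios
    (1/20, 9/20, 1/2),
    (1/10, 2/5, 1/2),
    (1/3, 1/3, 1/3),
]

def sample_sizes(n, taus):
    """Sample sizes m = τ·n for each sample-size factor τ."""
    return [int(tau * n) for tau in taus]

taus_simulated = range(10, 211, 10)  # τ from 10 to 210 (step assumed)
taus_real = range(10, 51, 10)        # τ from 10 to 50 (step assumed)
k_simulated, k_real = 50, 20         # neighbors sampled by Algorithm 1
```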