Edge Sampling Using Local Network Information

Authors: Can M. Le

JMLR 2021

Reproducibility Variable | Result | LLM Response
Research Type: Experimental — In this section, we empirically analyze the behavior of the parameter α and the accuracy of the proposed sampling methods. We use these methods to sparsify networks, both simulated and real-world, and then measure the accuracy of the resulting sparsified networks by comparing their Laplacians with those of the original networks.
Researcher Affiliation: Academia — Can M. Le, Department of Statistics, University of California, Davis, Davis, CA 95616-5270, USA
Pseudocode: Yes — Algorithm 1 (Estimating the numbers of common neighbors): Choose ε ∈ (0, 1) and k ≥ 1. For each edge (i, j), let i be the node with d_i ≥ d_j. If d_i ≤ k, calculate t_ij directly by counting the number of elements of N_i ∩ N_j. If d_i > k, sample k neighbors Z_1, ..., Z_k of i independently and uniformly from N_i, calculate θ̂_ij = (1/k) Σ_{ℓ=1}^{k} 1(Z_ℓ ∈ N_j), and proceed as follows: if θ̂_ij < ε, calculate t_ij directly by counting the number of elements of N_i ∩ N_j; if θ̂_ij ≥ ε, estimate t_ij by t̂_ij = d_i θ̂_ij.
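Algorithm 1 as quoted above can be sketched in Python roughly as follows. This is a minimal illustration, not the paper's implementation; the function name, the adjacency-set representation, and the default values of `eps` and `k` are assumptions.

```python
import random

def estimate_common_neighbors(neighbors, i, j, eps=0.1, k=50, rng=random):
    """Estimate t_ij = |N_i ∩ N_j| following the steps of Algorithm 1.

    `neighbors` maps each node to the set of its neighbors (a hypothetical
    representation chosen for this sketch).
    """
    # Let i be the endpoint with the larger degree.
    if len(neighbors[i]) < len(neighbors[j]):
        i, j = j, i
    d_i = len(neighbors[i])
    if d_i <= k:
        # Small degree: count the intersection exactly.
        return len(neighbors[i] & neighbors[j])
    # Sample k neighbors of i uniformly (with replacement).
    sample = rng.choices(list(neighbors[i]), k=k)
    theta_hat = sum(z in neighbors[j] for z in sample) / k
    if theta_hat < eps:
        # Estimated overlap is small: fall back to exact counting.
        return len(neighbors[i] & neighbors[j])
    # Otherwise return the sampled estimate t̂_ij = d_i * θ̂_ij.
    return d_i * theta_hat
```

The exact-counting fallback for small θ̂_ij keeps the estimator from returning a high-variance estimate when few sampled neighbors land in N_j.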
Open Source Code: No — The paper neither states that source code for the described methodology is released nor links to a code repository.
Open Datasets: Yes — We further report in Table 2 the value of α, the clustering coefficient, and the average degree of several well-known real-world networks: the karate club network (Zachary, 1977), dolphins network (Lusseau et al., 2003), political blogs network (Adamic and Glance, 2005), Facebook ego network (McAuley and Leskovec, 2012), Astrophysics collaboration network (Leskovec et al., 2007), Enron email network (Klimt and Yang, 2004), Twitter social circles (Yang and Leskovec, 2012), Google+ social circles (Yang and Leskovec, 2012), DBLP collaboration network (Yang and Leskovec, 2012), and LiveJournal social network (Yang and Leskovec, 2012).
Dataset Splits: No — The paper focuses on graph sparsification and on evaluating the sparsified graph against the original. It does not describe typical machine learning dataset splits (e.g., training, validation, and test sets) for tasks such as classification or regression.
Hardware Specification: No — The paper does not specify the hardware used for the numerical studies, such as CPU/GPU models, memory, or the computing environment.
Software Dependencies: No — The paper mentions an "MPI-based distributed memory parallel algorithm in Arifuzzaman et al. (2019)", but this refers to a method from another work, not to software dependencies or versions used for the current paper's implementation.
Experiment Setup: Yes — For each ρ ∈ {0.01, 0.1}, we consider three settings corresponding to different community size ratios: (1/20, 9/20, 1/2), (1/10, 2/5, 1/2), and (1/3, 1/3, 1/3). ... We vary the sample size m by setting m = τn, where τ is the sample size factor taking values from 10 to 210. To approximate the number of common neighbors for CNA, we sample k = 50 neighbors using Algorithm 1. ... We vary the sample size m by setting m = τn with τ ∈ [10, 50]. To approximate the number of common neighbors for CNA, we sample k = 20 neighbors according to Algorithm 1.
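The parameter grid quoted above could be laid out as below. This is only a sketch of the reported configuration; the step size for τ and all variable names are assumptions, not taken from the paper.

```python
# Hypothetical layout of the experiment grid described above.
densities = [0.01, 0.1]                      # ρ values
size_ratios = [                              # community size ratios
    (1/20, 9/20, 1/2),
    (1/10, 2/5, 1/2),
    (1/3, 1/3, 1/3),
]

def sample_sizes(n, taus):
    """Sample sizes m = τ·n for each sample-size factor τ."""
    return [int(tau * n) for tau in taus]

taus_simulated = range(10, 211, 10)  # τ from 10 to 210 (step assumed)
taus_real = range(10, 51, 10)        # τ from 10 to 50 (step assumed)
k_simulated, k_real = 50, 20         # neighbors sampled by Algorithm 1
```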