Statistical Inference of Random Graphs With a Surrogate Likelihood Function

Authors: Dingbo Wu, Fangzheng Xie

JMLR 2025

Reproducibility
Research Type: Experimental. "The empirical performance of the proposed surrogate-likelihood-based methods is validated through the analyses of simulation examples and two real-world data sets."
Researcher Affiliation: Academia. Dingbo Wu, Department of Statistics, Indiana University, Bloomington, IN 47405, USA; Fangzheng Xie, Department of Statistics, Indiana University, Bloomington, IN 47405, USA.
Pseudocode: Yes. "We present the detailed stochastic gradient descent algorithm for computing the maximum surrogate likelihood estimator in Algorithm 1, the convergence of which is guaranteed by Theorem 9 below." (Algorithm 1: Stochastic gradient descent for maximum surrogate likelihood estimation.)
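Algorithm 1 appears in the paper only as pseudocode. As a hedged illustration of the general shape of such a method, the following is a minimal minibatch stochastic gradient ascent sketch; the logistic-model gradient is a hypothetical placeholder for the paper's surrogate-likelihood gradient, and the step-halving schedule the paper mentions is omitted for brevity:

```python
import numpy as np

def sgd_msle(A, X0, batch_size=500, n_iter=200, lr=0.1, seed=0):
    """Minibatch stochastic gradient ascent sketch for maximizing a
    surrogate log-likelihood over latent positions X (n x d).
    The gradient below uses a logistic edge model as a stand-in for
    the paper's actual surrogate likelihood."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    X = X0.copy()
    for _ in range(n_iter):
        # sample a minibatch of vertices (batch_size = n recovers full GD)
        batch = rng.choice(n, size=min(batch_size, n), replace=False)
        # placeholder gradient: logistic-model score restricted to the batch,
        # rescaled so its expectation matches the full gradient
        P = 1.0 / (1.0 + np.exp(-(X @ X[batch].T)))   # n x |batch|
        G = (A[:, batch] - P) @ X[batch] * (n / len(batch))
        X += lr * G
    return X
```

Setting `batch_size=n` corresponds to the classical gradient descent comparison the paper describes.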
Open Source Code: No. The paper contains no explicit statement about releasing source code for the described methodology and no link to a code repository; it mentions only the license for the paper itself and that a detailed algorithm is provided in the supplementary material, which refers to the pseudocode.
Open Datasets: Yes. "The network data is structured as follows: The vertices represent 1382 Wikipedia articles that are connected to the article named Algebraic geometry within two hyperlinks... The data set is publicly available at http://www.cis.jhu.edu/~parky/Data/data.html." and "We now consider the political blogs network (Adamic and Glance, 2005), a benchmark network data set that has also been analyzed by Karrer and Newman (2011); Zhao et al. (2012); Amini et al. (2013); Jin (2015); Bickel and Sarkar (2015); Le et al. (2016)."
Dataset Splits: No. The paper describes generating synthetic data and using real-world data sets, but it does not specify any training/test/validation splits, sample counts for splits, or cross-validation strategy for its proposed methods. For the real-world data sets, clustering is applied to the full set of estimated latent positions without any data partitioning for model training or evaluation.
Hardware Specification: No. The paper gives no details about the hardware used to conduct the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies: No. The paper mentions using the R built-in optim function and coda::heidel.diag() in R, but it does not specify version numbers for R, for the coda package, or for any other key software dependencies.
Experiment Setup: Yes. "For the Bayes estimate, we use the uniform prior on the unit disk for all x_i. The Metropolis-Hastings sampler is implemented with parallelization over the vertices i ∈ [n], and each Markov chain contains 1000 burn-in iterations and 2000 post-burn-in samples with a thinning of 5. The posterior mean is taken as the point estimate." and "For the MSLE, we implement the step-halving stochastic gradient descent algorithm with the batch size set to s = 500 and s = n (giving rise to the classical gradient descent algorithm) to compare the computational costs."
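The quoted MCMC configuration (uniform prior on the unit disk, 1000 burn-in iterations, 2000 post-burn-in samples with thinning of 5, one chain per vertex) can be sketched for a single vertex as a random-walk Metropolis-Hastings update. This is not the authors' code: the logistic edge model in log_lik is a hypothetical placeholder for the paper's likelihood, and the step size is an assumed tuning parameter:

```python
import numpy as np

def mh_vertex(A, X, i, n_burn=1000, n_post=2000, thin=5, step=0.1, seed=0):
    """Random-walk Metropolis-Hastings sketch for one latent position x_i
    under a uniform prior on the unit disk; the other rows of X are held
    fixed, and the posterior mean over thinned post-burn-in samples is
    returned as the point estimate."""
    rng = np.random.default_rng(seed)
    mask = np.ones(A.shape[0], dtype=bool)
    mask[i] = False  # exclude the (absent) self-loop term

    def log_lik(x):
        # hypothetical logistic edge model standing in for the paper's likelihood
        p = 1.0 / (1.0 + np.exp(-(X[mask] @ x)))
        return np.sum(A[i, mask] * np.log(p) + (1.0 - A[i, mask]) * np.log1p(-p))

    x = X[i].copy()
    samples = []
    for t in range(n_burn + n_post * thin):
        prop = x + step * rng.normal(size=x.shape)
        # uniform-disk prior: proposals outside the unit disk have zero density
        if prop @ prop <= 1.0 and np.log(rng.random()) < log_lik(prop) - log_lik(x):
            x = prop
        if t >= n_burn and (t - n_burn) % thin == 0:
            samples.append(x.copy())
    return np.mean(samples, axis=0)  # posterior-mean point estimate
```

Running this update independently for each vertex i mirrors the per-vertex parallelization described in the setup.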