Statistical Inference of Random Graphs With a Surrogate Likelihood Function
Authors: Dingbo Wu, Fangzheng Xie
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The empirical performance of the proposed surrogate-likelihood-based methods is validated through the analyses of simulation examples and two real-world data sets. |
| Researcher Affiliation | Academia | Dingbo Wu EMAIL Department of Statistics Indiana University Bloomington, IN 47405, USA and Fangzheng Xie EMAIL Department of Statistics Indiana University Bloomington, IN 47405, USA |
| Pseudocode | Yes | We present the detailed stochastic gradient descent algorithm for computing the maximum surrogate likelihood estimator in Algorithm 1, the convergence of which is guaranteed by Theorem 9 below. Algorithm 1 Stochastic gradient descent for maximum surrogate likelihood estimation |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. It only mentions the license for the paper itself and states that a detailed algorithm (the pseudocode) is provided in the supplementary material. |
| Open Datasets | Yes | The network data is structured as follows: The vertices represent 1382 Wikipedia articles that are connected to the article named Algebraic geometry within two hyperlinks... The data set is publicly available at http://www.cis.jhu.edu/~parky/Data/data.html. and We now consider the political blogs network (Adamic and Glance, 2005), a benchmark network data that has also been analyzed by Karrer and Newman (2011); Zhao et al. (2012); Amini et al. (2013); Jin (2015); Bickel and Sarkar (2015); Le et al. (2016). |
| Dataset Splits | No | The paper describes generating synthetic data and using real-world datasets, but it does not specify any training/test/validation splits, sample counts for splits, or cross-validation strategies applied to its own proposed methods. For real-world datasets, it applies clustering to the full estimated latent positions without detailing data partitioning for model training or evaluation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to conduct the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using 'the R built-in optim function' and 'coda::heidel.diag() in R' but does not specify the version numbers for R or the 'coda' package, nor any other key software dependencies with their respective versions. |
| Experiment Setup | Yes | For the Bayes estimate, we use the uniform prior on the unit disk for all xi. The Metropolis-Hastings sampler is implemented with parallelization over vertices i ∈ [n], and each Markov chain contains 1000 burn-in iterations and 2000 post-burn-in samples with a thinning of 5. The posterior mean is taken as the point estimate. and For the MSLE, we implement the step-halving stochastic gradient descent algorithm with the batch size set to s = 500 and s = n (giving rise to the classical gradient descent algorithm) to compare the computational costs. |
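The quoted Bayes setup (uniform prior on the unit disk, 1000 burn-in iterations, 2000 post-burn-in draws with a thinning of 5, posterior mean as the point estimate) can be sketched with a toy random-walk Metropolis-Hastings chain. This is a minimal illustration only: the log-likelihood below is a hypothetical stand-in for a single latent position, not the paper's surrogate likelihood, and the paper parallelizes such chains over all vertices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative "truth" for one 2-D latent position inside the unit disk.
target = np.array([0.3, -0.4])

def log_post(x):
    """Uniform prior on the unit disk times a toy stand-in likelihood."""
    if x @ x > 1.0:                       # prior density is zero outside the disk
        return -np.inf
    return -10.0 * np.sum((x - target) ** 2)

def mh_chain(burn_in=1000, n_post=2000, thin=5, scale=0.2):
    """Random-walk MH: discard burn_in draws, keep every thin-th afterwards."""
    x = np.zeros(2)
    lp = log_post(x)
    kept = []
    for t in range(burn_in + n_post):
        prop = x + scale * rng.normal(size=2)   # Gaussian random-walk proposal
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis accept/reject
            x, lp = prop, lp_prop
        if t >= burn_in and (t - burn_in) % thin == 0:
            kept.append(x.copy())
    return np.array(kept)

draws = mh_chain()
post_mean = draws.mean(axis=0)   # posterior mean as the point estimate
```

With these settings the chain retains 400 of the 2000 post-burn-in iterations, and every retained draw respects the unit-disk support because out-of-disk proposals are always rejected.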
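The MSLE setup quotes a step-halving stochastic gradient descent (Algorithm 1 in the paper) with batch size s = 500, recovering classical gradient descent at s = n. A minimal sketch of that idea follows, under stated assumptions: the least-squares objective on a random matrix is an illustrative stand-in for the surrogate likelihood, and the update touches only the sampled batch rows, a common simplification rather than the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: recover latent positions X from a noisy matrix A ~ X X^T.
n, d = 1000, 2
X_true = rng.normal(size=(n, d))
A = X_true @ X_true.T + 0.1 * rng.normal(size=(n, n))

def objective(X):
    """Stand-in for the negative surrogate log-likelihood."""
    return np.sum((A - X @ X.T) ** 2) / n**2

def minibatch_grad(X, s):
    """Gradient of the loss restricted to a random batch of s vertex rows."""
    idx = rng.choice(n, size=s, replace=False)
    R = X[idx] @ X.T - A[idx]            # residuals for the sampled rows
    G = np.zeros_like(X)
    G[idx] = 2.0 * R @ X / (s * n)
    return G

def sgd_step_halving(X, s=500, lr=0.5, iters=200):
    """Halve the step size until the full objective decreases, then accept."""
    f = objective(X)
    for _ in range(iters):
        G = minibatch_grad(X, s)
        step = lr
        while step > 1e-8 and objective(X - step * G) >= f:
            step /= 2.0                  # step-halving line search
        cand = X - step * G
        if objective(cand) < f:          # accept only improving steps
            X, f = cand, objective(cand)
    return X, f

X0 = rng.normal(size=(n, d))
X_hat, f_hat = sgd_step_halving(X0, s=500)
```

Setting `s=n` in `sgd_step_halving` uses every row each iteration, which corresponds to the classical (full) gradient descent the paper compares against.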