Change Point Estimation in a Dynamic Stochastic Block Model

Authors: Monika Bhattacharjee, Moulinath Banerjee, George Michailidis

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The results are illustrated on synthetic data. In summary, this work provides an in-depth investigation of the novel problem of change point analysis for networks generated by stochastic block models, identifies key conditions for the consistent estimation of the change point, and proposes a computationally fast algorithm that solves the problem in many settings that occur in applications. Finally, it discusses challenges posed by employing clustering algorithms in this problem, that require additional investigation for their full resolution. ... The numerical performance of the two strategies based on synthetic data is illustrated in Section 4.1.
Researcher Affiliation | Academia | Monika Bhattacharjee (EMAIL), Informatics Institute, University of Florida, Gainesville, USA; Moulinath Banerjee (EMAIL), Department of Statistics, University of Michigan, Ann Arbor, USA; George Michailidis (EMAIL), Informatics Institute and Department of Statistics, University of Florida, Gainesville, USA
Pseudocode | Yes | Clustering Algorithm I (proposed in Bhattacharyya and Chatterjee (2017)):
1. Obtain the sums of the adjacency matrices before and after b as B1 = Σ_{t=1}^{nb} A(t) and B2 = Σ_{t=nb+1}^{n} A(t), respectively.
2. Obtain Û^m_K and V̂^m_K consisting of the leading K eigenvectors of B1 and B2, respectively, corresponding to their largest absolute eigenvalues.
3. Use a (1 + ε)-approximate K-means clustering algorithm on the row vectors of Û and V̂ to obtain z_{b,n,m} and w_{b,n,m}, respectively.
...
2-Step Algorithm:
Step 1: Ignore the community structures and assume z(i) = w(i) = i for all 1 ≤ i ≤ m. Compute the least-squares criterion function L(·) given in (3.2) and obtain the estimate τ̂_{m,n} = argmin_{b ∈ (c*, 1 − c*)} L(b).
Step 2: Estimate the remaining parameters of the DSBM: estimate z and w by ẑ = z_{τ̂_{m,n},n,m} and ŵ = w_{τ̂_{m,n},n,m}, respectively, and subsequently estimate Λ and its post-change counterpart by the corresponding plug-in estimates evaluated at (ẑ, τ̂_{m,n} n, m) and (ŵ, τ̂_{m,n} n, m).
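The clustering pseudocode quoted above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the paper's (1 + ε)-approximate K-means is replaced by plain Lloyd's K-means, and all function and variable names (`clustering_algorithm_I`, `kmeans`, `A_seq`) are our own, not the authors'.

```python
import numpy as np

def kmeans(rows, K, n_iter=50, seed=0):
    """Basic Lloyd's K-means on the rows of a matrix; returns integer labels.
    Stand-in for the (1 + eps)-approximate K-means used in the paper."""
    rng = np.random.default_rng(seed)
    centers = rows[rng.choice(len(rows), K, replace=False)]
    labels = np.zeros(len(rows), dtype=int)
    for _ in range(n_iter):
        # Assign each row to its nearest center, then recompute centers.
        d = np.linalg.norm(rows[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = rows[labels == k].mean(axis=0)
    return labels

def clustering_algorithm_I(A_seq, b, K):
    """A_seq: list of n symmetric m x m adjacency matrices;
    b: candidate change point (number of snapshots before the change);
    K: number of communities."""
    # Step 1: sum the adjacency matrices before and after b.
    B1 = sum(A_seq[:b])
    B2 = sum(A_seq[b:])
    # Step 2: leading K eigenvectors, ranked by largest absolute eigenvalue.
    def top_K_eigvecs(B):
        vals, vecs = np.linalg.eigh(B)
        idx = np.argsort(np.abs(vals))[::-1][:K]
        return vecs[:, idx]
    U_hat, V_hat = top_K_eigvecs(B1), top_K_eigvecs(B2)
    # Step 3: K-means on the eigenvector rows yields community labels z and w.
    z = kmeans(U_hat, K)
    w = kmeans(V_hat, K)
    return z, w
```

In practice one would run this for each candidate b inside the change-point scan; the eigendecomposition of the summed matrices is the dominant cost per candidate.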
Open Source Code | No | The paper does not provide access to source code for the methodology described. It only mentions the license for the paper itself and attribution requirements for JMLR publications.
Open Datasets | No | The results are illustrated on synthetic data. ... Next, we discuss the performance of the three change-point estimates τ_{m,n}, τ̂_{m,n} and τ_{m,n} based on synthetic data generated according to the following mechanism, focusing on the impact of the parameters m, n and small community connection probabilities on their performance.
Dataset Splits | No | The paper uses synthetic data and discusses simulation experiments involving "100 replicates" but does not specify dataset splits such as training, validation, or test sets in the machine-learning sense.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments or simulations.
Software Dependencies | No | The paper does not provide version numbers for any software, libraries, or programming languages used in the experiments or simulations.
Experiment Setup | Yes | Next, we discuss the performance of the three change-point estimates τ_{m,n}, τ̂_{m,n} and τ_{m,n} based on synthetic data generated according to the following mechanism, focusing on the impact of the parameters m, n and small community connection probabilities on their performance. Effect of m and n: We simulate from the following DSBMs (1), (2), (3) for three choices of (m, n, nτ_n) = (60, 60, 30), (500, 20, 10), (500, 100, 50) and two choices of λ = 0, 1/20. These results are presented in Tables 1-6.