Sequential change-point detection in high-dimensional Gaussian graphical models

Authors: Hossein Keshavarz, George Michailidis, Yves Atchadé

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical works on both synthetic and real data illustrate the good performance of the proposed methodology both in terms of computational and statistical efficiency across numerous experimental settings. Keywords: Sequential change-point detection, Gaussian graphical models, Pseudolikelihood, Mini-batch update, Asymptotic analysis
Researcher Affiliation | Academia | Hossein Keshavarz, Institute for Mathematics and its Applications, University of Minnesota, Minneapolis, MN 55455, USA; George Michailidis, Department of Statistics and UF Informatics Institute, University of Florida, Gainesville, FL 32611, USA; Yves Atchadé, Department of Mathematics and Statistics, Boston University, Boston, MA 02215, USA
Pseudocode | Yes | Algorithm 1 (Online detection with batch update of the pre-change precision matrix). Input: n0, w, B, ζ_{π0}, and tuning parameter τ. Initialization: set D̂ = ∅ and b = 0; given X1, . . . , Xn0, obtain Ω̂(1) by CLIME or QUIC with tuning parameter τ; also set t̂_last = 0, where t̂_last denotes the estimated location of the last change-point. Iterate: for t = n0 + 1, . . . , T
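The quoted algorithm can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the pre-change precision estimator (CLIME/QUIC in the paper) is replaced by a hypothetical `estimate_precision` stand-in (ridge-regularized inverse covariance), and `detect_change` is a placeholder statistic, not the paper's test.

```python
import numpy as np

def estimate_precision(X, tau):
    """Stand-in for CLIME/QUIC: ridge-regularized inverse sample covariance."""
    p = X.shape[1]
    S = np.cov(X, rowvar=False)
    return np.linalg.inv(S + tau * np.eye(p))

def detect_change(Omega, x_window):
    """Placeholder statistic: deviation of the windowed sample covariance
    from the covariance implied by the pre-change precision estimate."""
    S_w = np.cov(x_window, rowvar=False)
    return np.abs(S_w - np.linalg.inv(Omega)).max()

def online_detection(X, n0, w, B, threshold, tau):
    """Algorithm 1 skeleton: online detection with batched pre-change updates."""
    T = X.shape[0]
    Omega = estimate_precision(X[:n0], tau)      # initial pre-change estimate
    t_last, change_points = 0, []                # last change point, detections
    for t in range(n0 + 1, T):
        stat = detect_change(Omega, X[t - w:t])  # statistic on a size-w window
        if stat > threshold:
            change_points.append(t)
            t_last = t
            # Re-initialize after a detection (sketch: in practice one would
            # wait until n0 fresh post-change samples have arrived).
            Omega = estimate_precision(X[t:t + n0], tau)
        elif (t - t_last) % B == 0:
            # Mini-batch refresh of the pre-change estimate every B samples.
            Omega = estimate_precision(X[t_last:t], tau)
    return change_points
```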
Open Source Code | No | The paper does not provide an explicit statement or link to the source code for the methodology described. It mentions using third-party tools like CLIME, QUIC, and scikit-learn's graphical lasso, but not their own implementation code for this work. The CC-BY license refers to the paper content, not the software.
Open Datasets | Yes | Numerical works on both synthetic and real data illustrate the good performance of the proposed methodology both in terms of computational and statistical efficiency across numerous experimental settings. ... We assess the performance of Ξ̂t in a real-world scenario. The objective is to estimate the time location of abrupt changes in the dependency structure of S&P 500 daily close price data for the period from 2000-01-01 to 2016-03-03 (total of 3814 trading days). ... We follow the cleaning procedure from Atchade and Bybee (2017) and select a fixed list of 436 ticker symbols from 2004-02-06 to 2016-03-03, consisting of 3039 trading days.
Dataset Splits | Yes | The burn-in period n0 refers to the number of samples used for computing an initial estimate of the post-change precision matrix (with the premise that the first change-point occurs at t = 0). ... We assume that the distance between two consecutive change-points is greater than n0. ... We set T = 500, with a single change point at t = 250. ... In all experiments, we choose p = 100, π0 = 0.01 and w = 20. Moreover, three abrupt changes occur at {3000, 6000, 9000} in T = 10^4 samples.
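The simulated abrupt changes described in the Experiment Setup row (Ω1 = (1 + β)Ω0, and a change affecting only the top-r eigenvalues) can be constructed as follows. A minimal sketch: the function names are my own, and the eigenvalue-change formula is reconstructed from the paper's partially garbled expression Σ_{i=1}^{r} λi β vi viᵀ + Ω0.

```python
import numpy as np

def scale_change(Omega0, beta):
    """First change type: uniformly affects all nodes, Omega1 = (1 + beta) * Omega0."""
    return (1 + beta) * Omega0

def top_eigen_change(Omega0, r, beta):
    """Second change type: perturb only the top-r eigenvalues, keeping
    eigenvectors fixed: Omega2 = sum_{i=1}^{r} beta * lambda_i v_i v_i^T + Omega0."""
    lam, V = np.linalg.eigh(Omega0)          # eigenvalues ascending
    idx = np.argsort(lam)[::-1][:r]          # indices of the top-r eigenvalues
    delta = (V[:, idx] * (beta * lam[idx])) @ V[:, idx].T
    return Omega0 + delta
```

For example, with β = 0.4 and r = 1, only the leading eigenvalue of Ω0 is inflated by a factor 1.4 while all eigenvectors are unchanged.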
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running its experiments.
Software Dependencies | No | Our approach to updating the estimated pre-change precision matrix in phase (b): again for simplicity, we only focus on updating Ω(1). A mini-batch procedure is used for updating Ω(1), wherein we first obtain B (a predetermined block size) new samples and subsequently compute a new estimate at time t = n0 + kB (k ∈ ℕ) by employing X1, . . . , Xn0+kB; the parameter k tracks the number of size-B batches before the first abrupt change. Throughout this paper, we employ the CLIME algorithm (Cai et al., 2011) or alternatively the QUIC estimator (Hsieh et al., 2014) for estimating Ω(1), since both enjoy desirable theoretical and numerical properties. ... The scikit-learn implementation of the graphical lasso with the coordinate descent optimization algorithm is utilized for estimating Ω(t).
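The mini-batch refresh schedule quoted above (re-estimation at t = n0 + kB from all samples observed so far) can be sketched as below. The helper names are my own, and a simple ridge-regularized inverse covariance stands in for CLIME/QUIC; where available, the scikit-learn graphical lasso could be substituted.

```python
import numpy as np

def batch_update_times(n0, B, T):
    """Times t = n0 + k*B (k = 1, 2, ...) at which the pre-change
    precision matrix is re-estimated from X_1, ..., X_{n0 + k*B}."""
    return [n0 + k * B for k in range(1, (T - n0) // B + 1)]

def refresh_precision(X, t, tau):
    """Stand-in for CLIME/QUIC at refresh time t: re-estimate the precision
    matrix from all t samples seen so far (ridge-regularized inverse)."""
    S = np.cov(X[:t], rowvar=False)
    return np.linalg.inv(S + tau * np.eye(S.shape[0]))
```

For instance, with n0 = 100 and B = 50, the estimate is refreshed at t = 150, 200, 250, . . ., so the cost of re-estimation is amortized over batches rather than paid at every new sample.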
Experiment Setup | Yes | In all experiments, we choose p = 100, π0 = 0.01 and w = 20. Moreover, three abrupt changes occur at {3000, 6000, 9000} in T = 10^4 samples. ... In each replicate, Ω0 is independently generated with the same procedure as in Remark 15 with p = 100, d = 20 and λ0 = 0.1. The first change point uniformly affects all nodes with β = 0.2, i.e. Ω1 = (1 + β)Ω0. The second abrupt change only impacts the top r = 50 eigenvalues without changing the eigenvectors; specifically, Ω2 = Σ_{i=1}^{r} λi β vi viᵀ + Ω0, with β = 0.4. ... We set B = 50 and κ = 4 for all experiments, while varying n0 in the set {1100, 1300, 1500, 1700, 1900, 2100}. ... For each experiment, we conduct our sequential detection algorithm on a network of 100 randomly chosen tickers, i.e., p = 100. We also set Ξ̂t = 1 for t in the burn-in period. ... We choose n0 = 200, w = 22 (corresponding to the number of trading days in a month), π0 = 0.05, κ = 2, and B = 10. ... The optimal value of τ is chosen on a grid of size 20 to minimize the Bayesian information criterion (BIC) score. In particular, when t̂_last stands for the last detected abrupt change before t and n := t − t̂_last, then we optimize the BIC score over the grid {10^(−1+j/10) / √n : j = 0, . . . , 19} (for finding the best j ∈ {0, . . . , 19}). ... The regularization parameters are chosen as 0.01, 0.01, 0.01, and 0.005 from left to right and top to bottom panels, respectively (Figure 4 caption).
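The τ-selection step quoted above can be sketched as below. Two caveats: the grid formula 10^(−1+j/10)/√n is my reconstruction of the garbled expression in the extracted text, and `gaussian_bic` is a standard BIC score for Gaussian graphical models (fit term plus a log(n) penalty on the number of edges), which may differ in detail from the paper's exact score.

```python
import numpy as np

def tau_grid(n, size=20):
    """Candidate regularization values tau_j = 10**(-1 + j/10) / sqrt(n),
    j = 0, ..., size-1 (reconstructed form of the paper's grid)."""
    return [10 ** (-1 + j / 10) / np.sqrt(n) for j in range(size)]

def gaussian_bic(S, Omega, n):
    """Standard BIC for a Gaussian graphical model with sample covariance S
    and estimated precision Omega: n * (tr(S Omega) - logdet(Omega))
    plus log(n) times the number of nonzero off-diagonal (upper) entries."""
    _, logdet = np.linalg.slogdet(Omega)
    k = np.count_nonzero(np.triu(Omega, 1))
    return n * (np.trace(S @ Omega) - logdet) + np.log(n) * k
```

In use, one would fit the precision matrix once per grid value, score each fit with `gaussian_bic` (with n = t − t̂_last samples since the last detected change), and keep the τ with the smallest score.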