Sequential change-point detection in high-dimensional Gaussian graphical models

Authors: Hossein Keshavarz, George Michailidis, Yves Atchadé

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical works on both synthetic and real data illustrate the good performance of the proposed methodology both in terms of computational and statistical efficiency across numerous experimental settings. Keywords: Sequential change-point detection, Gaussian graphical models, Pseudolikelihood, Mini-batch update, Asymptotic analysis
Researcher Affiliation | Academia | Hossein Keshavarz, Institute for Mathematics and its Applications, University of Minnesota, Minneapolis, MN 55455, USA; George Michailidis, Department of Statistics and UF Informatics Institute, University of Florida, Gainesville, FL 32611, USA; Yves Atchadé, Department of Mathematics and Statistics, Boston University, Boston, MA 02215, USA
Pseudocode | Yes | Algorithm 1 (Online detection with batch update of the pre-change precision matrix). Input: n0, w, B, ζ_{π0}, and tuning parameter τ. Initialization: set D̂ = ∅ and b = 0; given X1, . . . , Xn0, obtain Ω̂(1) by CLIME or QUIC with tuning parameter τ; also set t̂_last = 0, where t̂_last denotes the estimated location of the last change-point. Iterate: for t = n0 + 1, . . . , T
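The quoted algorithm can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the pre-change precision estimator (CLIME/QUIC in the paper) is replaced by a hypothetical `estimate_precision` stand-in (ridge-regularized inverse covariance), and `detect_change` is a placeholder statistic, not the paper's test.

```python
import numpy as np

def estimate_precision(X, tau):
    """Stand-in for CLIME/QUIC: ridge-regularized inverse sample covariance."""
    p = X.shape[1]
    S = np.cov(X, rowvar=False)
    return np.linalg.inv(S + tau * np.eye(p))

def detect_change(Omega, x_window):
    """Placeholder statistic: deviation of the windowed sample covariance
    from the covariance implied by the pre-change precision estimate."""
    S_w = np.cov(x_window, rowvar=False)
    return np.abs(S_w - np.linalg.inv(Omega)).max()

def online_detection(X, n0, w, B, threshold, tau):
    """Algorithm 1 skeleton: online detection with batched pre-change updates."""
    T = X.shape[0]
    Omega = estimate_precision(X[:n0], tau)      # initial pre-change estimate
    t_last, change_points = 0, []                # last change point, detections
    for t in range(n0 + 1, T):
        stat = detect_change(Omega, X[t - w:t])  # statistic on a size-w window
        if stat > threshold:
            change_points.append(t)
            t_last = t
            # Re-initialize after a detection (sketch: in practice one would
            # wait until n0 fresh post-change samples have arrived).
            Omega = estimate_precision(X[t:t + n0], tau)
        elif (t - t_last) % B == 0:
            # Mini-batch refresh of the pre-change estimate every B samples.
            Omega = estimate_precision(X[t_last:t], tau)
    return change_points
```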
Open Source Code | No | The paper does not provide an explicit statement or link to the source code for the methodology described. It mentions using third-party tools like CLIME, QUIC, and scikit-learn's graphical lasso, but not their own implementation code for this work. The CC-BY license refers to the paper content, not the software.
Open Datasets | Yes | Numerical works on both synthetic and real data illustrate the good performance of the proposed methodology both in terms of computational and statistical efficiency across numerous experimental settings. ... We assess the performance of Ξ̂t in a real-world scenario. The objective is to estimate the time location of abrupt changes in the dependency structure of S&P 500 daily close price data for the period from 2000-01-01 to 2016-03-03 (total of 3814 trading days). ... We follow the cleaning procedure from Atchade and Bybee (2017) and select a fixed list of 436 ticker symbols from 2004-02-06 to 2016-03-03, consisting of 3039 trading days.
Dataset Splits | Yes | The burn-in period n0 refers to the number of samples used for computing an initial estimate of the post-change precision matrix (with the premise that the first change-point occurs at t = 0). ... We assume that the distance between two consecutive change-points is greater than n0. ... We set T = 500, with a single change point at t = 250. ... In all experiments, we choose p = 100, π0 = 0.01 and w = 20. Moreover, three abrupt changes occur at {3000, 6000, 9000} in T = 10^4 samples.
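The simulated abrupt changes described in the Experiment Setup row (Ω1 = (1 + β)Ω0, and a change affecting only the top-r eigenvalues) can be constructed as follows. A minimal sketch: the function names are my own, and the eigenvalue-change formula is reconstructed from the paper's partially garbled expression Σ_{i=1}^{r} λi β vi viᵀ + Ω0.

```python
import numpy as np

def scale_change(Omega0, beta):
    """First change type: uniformly affects all nodes, Omega1 = (1 + beta) * Omega0."""
    return (1 + beta) * Omega0

def top_eigen_change(Omega0, r, beta):
    """Second change type: perturb only the top-r eigenvalues, keeping
    eigenvectors fixed: Omega2 = sum_{i=1}^{r} beta * lambda_i v_i v_i^T + Omega0."""
    lam, V = np.linalg.eigh(Omega0)          # eigenvalues ascending
    idx = np.argsort(lam)[::-1][:r]          # indices of the top-r eigenvalues
    delta = (V[:, idx] * (beta * lam[idx])) @ V[:, idx].T
    return Omega0 + delta
```

For example, with β = 0.4 and r = 1, only the leading eigenvalue of Ω0 is inflated by a factor 1.4 while all eigenvectors are unchanged.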
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running its experiments.
Software Dependencies | No | Our approach to updating the estimated pre-change precision matrix in phase (b): again for simplicity, we only focus on updating Ω(1). A mini-batch procedure is used for updating Ω(1), wherein we first obtain B (a predetermined block size) new samples and subsequently compute a new estimate at time t = n0 + kB (k ∈ ℕ) by employing X1, . . . , Xn0+kB; the parameter k tracks the number of size-B batches before the first abrupt change. Throughout this paper, we employ the CLIME algorithm (Cai et al., 2011) or alternatively the QUIC estimator (Hsieh et al., 2014) for estimating Ω(1), since both enjoy desirable theoretical and numerical properties. ... The scikit-learn implementation of the graphical lasso with the coordinate descent optimization algorithm is utilized for estimating Ω(t).
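The mini-batch refresh schedule quoted above (re-estimation at t = n0 + kB from all samples observed so far) can be sketched as below. The helper names are my own, and a simple ridge-regularized inverse covariance stands in for CLIME/QUIC; where available, the scikit-learn graphical lasso could be substituted.

```python
import numpy as np

def batch_update_times(n0, B, T):
    """Times t = n0 + k*B (k = 1, 2, ...) at which the pre-change
    precision matrix is re-estimated from X_1, ..., X_{n0 + k*B}."""
    return [n0 + k * B for k in range(1, (T - n0) // B + 1)]

def refresh_precision(X, t, tau):
    """Stand-in for CLIME/QUIC at refresh time t: re-estimate the precision
    matrix from all t samples seen so far (ridge-regularized inverse)."""
    S = np.cov(X[:t], rowvar=False)
    return np.linalg.inv(S + tau * np.eye(S.shape[0]))
```

For instance, with n0 = 100 and B = 50, the estimate is refreshed at t = 150, 200, 250, . . ., so the cost of re-estimation is amortized over batches rather than paid at every new sample.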
Experiment Setup | Yes | In all experiments, we choose p = 100, π0 = 0.01 and w = 20. Moreover, three abrupt changes occur at {3000, 6000, 9000} in T = 10^4 samples. ... In each replicate, Ω0 is independently generated with the same procedure as in Remark 15 with p = 100, d = 20 and λ0 = 0.1. The first change point uniformly affects all nodes with β = 0.2, i.e. Ω1 = (1 + β)Ω0. The second abrupt change only impacts the top r = 50 eigenvalues without changing the eigenvectors; specifically, Ω2 = Σ_{i=1}^{r} λi β vi viᵀ + Ω0, with β = 0.4. ... We set B = 50 and κ = 4 for all experiments, while varying n0 in the set {1100, 1300, 1500, 1700, 1900, 2100}. ... For each experiment, we conduct our sequential detection algorithm on a network of 100 randomly chosen tickers, i.e., p = 100. We also set Ξ̂t = 1 for t in the burn-in period. ... We choose n0 = 200, w = 22 (corresponding to the number of trading days in a month), π0 = 0.05, κ = 2, and B = 10. ... The optimal value of τ is chosen on a grid of size 20 to minimize the Bayesian information criterion (BIC) score. In particular, when t̂_last stands for the last detected abrupt change before t and n := t − t̂_last, then we optimize the BIC score over the grid {10^(−1+j/10) / √n : j = 0, . . . , 19} (for finding the best j ∈ {0, . . . , 19}). ... The regularization parameters are chosen as 0.01, 0.01, 0.01, and 0.005 from left to right and top to bottom panels, respectively (Figure 4 caption).
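The τ-selection step quoted above can be sketched as below. Two caveats: the grid formula 10^(−1+j/10)/√n is my reconstruction of the garbled expression in the extracted text, and `gaussian_bic` is a standard BIC score for Gaussian graphical models (fit term plus a log(n) penalty on the number of edges), which may differ in detail from the paper's exact score.

```python
import numpy as np

def tau_grid(n, size=20):
    """Candidate regularization values tau_j = 10**(-1 + j/10) / sqrt(n),
    j = 0, ..., size-1 (reconstructed form of the paper's grid)."""
    return [10 ** (-1 + j / 10) / np.sqrt(n) for j in range(size)]

def gaussian_bic(S, Omega, n):
    """Standard BIC for a Gaussian graphical model with sample covariance S
    and estimated precision Omega: n * (tr(S Omega) - logdet(Omega))
    plus log(n) times the number of nonzero off-diagonal (upper) entries."""
    _, logdet = np.linalg.slogdet(Omega)
    k = np.count_nonzero(np.triu(Omega, 1))
    return n * (np.trace(S @ Omega) - logdet) + np.log(n) * k
```

In use, one would fit the precision matrix once per grid value, score each fit with `gaussian_bic` (with n = t − t̂_last samples since the last detected change), and keep the τ with the smallest score.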