Reliever: Relieving the Burden of Costly Model Fits for Changepoint Detection

Authors: Chengde Qian, Guanghui Wang, Changliang Zou

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the effectiveness of the Reliever approach compared to the original implementation of various grid-search algorithms, we explore two scenarios: high-dimensional linear changepoint models (cf. Example 2) and nonparametric changepoint models (cf. Example 3). The grid-search algorithms assessed include SN, WBS (with $M = 100$ random intervals), and Seeded BS (using a decay parameter $a = 2^{-1/2}$, as recommended by Kovács et al. (2022)), implemented with a known number of changepoints for a fair comparison. For nonparametric models, we also consider OP and PELT, which do not presuppose the number of changepoints. The accuracy of changepoint estimation is quantified using the Hausdorff distance $\max\{\mathrm{OE}, \mathrm{UE}\}$, where $\mathrm{OE} = \max_{1 \le j \le \hat{K}} \min_{1 \le k \le K} |\tau^*_k - \hat{\tau}_j|$ is the over-segmentation error and $\mathrm{UE} = \max_{1 \le k \le K} \min_{1 \le j \le \hat{K}} |\tau^*_k - \hat{\tau}_j|$ is the under-segmentation error. The following results are based on 500 replications.
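The Hausdorff criterion $\max\{\mathrm{OE}, \mathrm{UE}\}$ quoted above is simple to compute from the two changepoint vectors. A minimal NumPy sketch (not the authors' code; the changepoint values below are illustrative):

```python
import numpy as np

def hausdorff_error(true_cps, est_cps):
    """Hausdorff distance max{OE, UE} between true and estimated changepoints.

    OE (over-segmentation): worst distance from an estimate to the nearest true cp.
    UE (under-segmentation): worst distance from a true cp to the nearest estimate.
    """
    # pairwise |tau*_k - tauhat_j|: rows = estimates, columns = true changepoints
    d = np.abs(np.subtract.outer(np.asarray(est_cps, dtype=float),
                                 np.asarray(true_cps, dtype=float)))
    oe = d.min(axis=1).max()  # max_j min_k |tau*_k - tauhat_j|
    ue = d.min(axis=0).max()  # max_k min_j |tau*_k - tauhat_j|
    return max(oe, ue)

# e.g., true changepoints {66, 165, 231} plus one spurious estimate at 290:
print(hausdorff_error([66, 165, 231], [66, 165, 231, 290]))  # -> 59.0
```

Note that OE alone would ignore missed changepoints and UE alone would ignore spurious ones, which is why the maximum of the two is reported.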
Researcher Affiliation | Academia | Chengde Qian (EMAIL), School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai 200240, China; Guanghui Wang (EMAIL), School of Statistics and Data Science, LPMC, KLMDASR, and LEBPS, Nankai University, Tianjin 300071, China; Changliang Zou (EMAIL), NITFID, School of Statistics and Data Science, LPMC, KLMDASR, and LEBPS, Nankai University, Tianjin 300071, China
Pseudocode | No | The paper describes algorithms verbally and illustrates workflows (e.g., Figure 1: "Workflow comparison between a standard grid-search algorithm and the same algorithm equipped with Reliever"), but does not include explicit pseudocode blocks or algorithms formatted as such.
Open Source Code | No | The paper does not provide any explicit statements about code availability, such as a link to a repository or a declaration that code is included in supplementary materials. It mentions the use of the glmnet package (Friedman et al., 2010) in R for parameter estimation, but this refers to a third-party tool, not the authors' own implementation code.
Open Datasets | No | In this scenario, we examine changepoint detection in high-dimensional linear models as outlined in Example 2, with $n \in \{300, 600, 900, 1200\}$ and $p = 100$. The covariates $\{x_i\}$ are i.i.d. from the standard multivariate Gaussian distribution, and the noises $\{\epsilon_i\}$ are i.i.d. from the standard Gaussian distribution $N(0, 1)$. We introduce three changepoints at $\{\tau^*_k\}_{k=1}^{3} = \{\lfloor 0.22n \rfloor, \lfloor 0.55n \rfloor, \lfloor 0.77n \rfloor\}$. The regression coefficients $\{\theta^*_k\}$ are generated such that $\theta^*_{k,j} = 0$ for $j = 3, \ldots, p$, and $\theta^*_{k,1}$ and $\theta^*_{k,2}$ are uniformly sampled, satisfying the SNRs $\|\theta^*_1\|_2^2 / \mathrm{Var}(\epsilon_1) = 2$ and $\|\theta^*_k - \theta^*_{k-1}\|_2^2 / \mathrm{Var}(\epsilon_1) = 2^{-1}$ for $k = 2, 3, 4$. Here $\theta^*_{k,j}$ denotes the $j$th element of $\theta^*_k$.
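The generative model described in this quote can be sketched as follows (Python/NumPy, not the authors' code; the direction of the uniformly sampled nonzero coefficients and a jump SNR of $2^{-1}$ are assumptions read off the quoted setup):

```python
import numpy as np

def simulate_hd_linear(n=300, p=100, snr_first=2.0, snr_jump=0.5, seed=0):
    """Piecewise linear regression with changepoints at floor(c*n), c in {0.22, 0.55, 0.77}."""
    rng = np.random.default_rng(seed)
    taus = [int(np.floor(c * n)) for c in (0.22, 0.55, 0.77)]
    bounds = [0] + taus + [n]

    def sparse_direction(norm_sq):
        # nonzero only in the first two coordinates, scaled to a target squared norm
        v = np.zeros(p)
        u = rng.uniform(-1.0, 1.0, size=2)
        v[:2] = np.sqrt(norm_sq) * u / np.linalg.norm(u)
        return v

    thetas = [sparse_direction(snr_first)]  # ||theta*_1||_2^2 / Var(eps_1) = snr_first
    for _ in range(3):
        # each jump satisfies ||theta*_k - theta*_{k-1}||_2^2 / Var(eps_1) = snr_jump
        thetas.append(thetas[-1] + sparse_direction(snr_jump))

    X = rng.standard_normal((n, p))  # covariates i.i.d. N(0, I_p)
    eps = rng.standard_normal(n)     # noises i.i.d. N(0, 1)
    y = np.empty(n)
    for k in range(4):
        s, e = bounds[k], bounds[k + 1]
        y[s:e] = X[s:e] @ thetas[k] + eps[s:e]
    return X, y, taus
```

With $n = 300$ this places the changepoints at 66, 165, and 231.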
Dataset Splits | No | The numerical studies describe synthetic data generation based on specified distributions and parameters (e.g., 'The covariates $\{x_i\}$ are i.i.d. from the standard multivariate Gaussian distribution'), and state that 'The following results are based on 500 replications'. This indicates simulation-based evaluation rather than the use of pre-defined training, validation, and test splits from a fixed dataset.
Hardware Specification | No | The paper does not contain any specific details regarding the hardware used to conduct the experiments, such as GPU models, CPU types, or memory specifications. It primarily discusses computational time and efficiency in terms of algorithmic complexity and model fitting operations.
Software Dependencies | No | We set $\delta_m = 20$ for numerical stability and employ the lasso for parameter estimation using the glmnet package (Friedman et al., 2010) in R.
Experiment Setup | Yes | In this scenario, we examine changepoint detection in high-dimensional linear models as outlined in Example 2, with $n \in \{300, 600, 900, 1200\}$ and $p = 100$. The covariates $\{x_i\}$ are i.i.d. from the standard multivariate Gaussian distribution, and the noises $\{\epsilon_i\}$ are i.i.d. from the standard Gaussian distribution $N(0, 1)$. We introduce three changepoints at $\{\tau^*_k\}_{k=1}^{3} = \{\lfloor 0.22n \rfloor, \lfloor 0.55n \rfloor, \lfloor 0.77n \rfloor\}$. The regression coefficients $\{\theta^*_k\}$ are generated such that $\theta^*_{k,j} = 0$ for $j = 3, \ldots, p$, and $\theta^*_{k,1}$ and $\theta^*_{k,2}$ are uniformly sampled, satisfying the SNRs $\|\theta^*_1\|_2^2 / \mathrm{Var}(\epsilon_1) = 2$ and $\|\theta^*_k - \theta^*_{k-1}\|_2^2 / \mathrm{Var}(\epsilon_1) = 2^{-1}$ for $k = 2, 3, 4$. Here $\theta^*_{k,j}$ denotes the $j$th element of $\theta^*_k$. We set $\delta_m = 20$ for numerical stability and employ the lasso for parameter estimation using the glmnet package (Friedman et al., 2010) in R. We apply three grid-search algorithms, SN, WBS (with $M = 100$ random intervals), and Seeded BS (using a decay parameter $a = 2^{-1/2}$, as recommended by Kovács et al. (2022)), assuming a known number of changepoints $\hat{K} = K = 3$. For each algorithm, we scale the regularization parameter as $\lambda_I = \lambda |I|^{1/2}$, with $\lambda$ ranging over a set of 30 values. The changepoint detection error for each algorithm is reported as the minimal Hausdorff distance achieved across all values of $\lambda$.
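The evaluation protocol in the last two sentences (sweep $\lambda$ over a 30-value grid, report the best Hausdorff error) can be sketched generically; here the detector is a stand-in for SN/WBS/Seeded BS, and the grid endpoints are arbitrary assumptions since the paper's grid is not quoted:

```python
import numpy as np

def hausdorff_error(true_cps, est_cps):
    # Hausdorff distance max{OE, UE} between true and estimated changepoints
    d = np.abs(np.subtract.outer(np.asarray(est_cps, dtype=float),
                                 np.asarray(true_cps, dtype=float)))
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def min_error_over_lambda(detect, lam_grid, true_cps):
    """Run the detector once per lambda (internally the per-interval penalty
    scales as lambda_I = lam * sqrt(|I|)) and report the minimal error."""
    return min(hausdorff_error(true_cps, detect(lam)) for lam in lam_grid)

# 30 candidate lambda values on a log scale (endpoints are illustrative only)
lam_grid = np.geomspace(0.01, 10.0, num=30)
```

This "best over the grid" convention measures each algorithm at its oracle tuning, so the comparison between Reliever and the original implementations is not confounded by the choice of $\lambda$.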