Model-free Change-Point Detection Using AUC of a Classifier

Authors: Rohit Kanrar, Feiyu Jiang, Zhanrui Cai

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive simulation studies and the analysis of two real-world data sets illustrate the superior performance of our approach compared to existing model-free change-point detection methods." ... "Extensive numerical studies on both simulated data and two real applications further corroborate the theoretical findings. The empirical results underscore the effectiveness of our testing framework, demonstrating its ability to control size performance and showcasing competitive power performance compared to existing methods."
Researcher Affiliation | Academia | Rohit Kanrar (EMAIL), Department of Statistics, Iowa State University, Ames, IA 50011-1090, USA; Feiyu Jiang (EMAIL), School of Management, Fudan University, Shanghai, 200433, China; Zhanrui Cai (EMAIL), Faculty of Business and Economics, University of Hong Kong, Hong Kong, China
Pseudocode | Yes | "Algorithm 1 changeAUC: single change-point detection." ... "Algorithm 2 changeAUC-SBS: multiple change-points detection."
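The statistic underlying changeAUC is the classification AUC evaluated on held-out data. As a point of reference only (this is a generic sketch, not the authors' implementation), the empirical AUC of classifier scores equals the normalized Mann-Whitney U statistic and can be computed directly:

```python
import numpy as np

def empirical_auc(scores0, scores1):
    """Empirical AUC: the probability that a score from the post-split
    class exceeds a score from the pre-split class, counting ties as 1/2.
    This is the Mann-Whitney U statistic divided by n0 * n1."""
    s0 = np.asarray(scores0, dtype=float)
    s1 = np.asarray(scores1, dtype=float)
    greater = np.sum(s1[:, None] > s0[None, :])
    ties = np.sum(s1[:, None] == s0[None, :])
    return (greater + 0.5 * ties) / (len(s0) * len(s1))
```

An AUC near 1/2 indicates the classifier cannot separate the two segments (no change), while an AUC well above 1/2 signals a distributional change.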
Open Source Code | Yes | "Both R and Python code developed for changeAUC is available at https://github.com/rohitkanrar/changeAUC."
Open Datasets | Yes | "In particular, we detect change-points in artificially constructed sequences of images of various animals (dog, cat, deer, horse, etc.) from the CIFAR10 database (Krizhevsky et al., 2009)" ... "We collect For Hire Vehicle (FHV) Trip records during 2018 and 2022 from the NYC Taxi and Limousine Commission (TLC) Trip Record Database." ... https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page
Dataset Splits | Yes | "For a small trimming parameter ϵ ∈ (0, 1/2) with m = ⌊Tϵ⌋, data are split into three parts: D_0 := {Z_t : t = 1, …, m}, D_v := {Z_t : t = m+1, …, T−m} and D_1 := {Z_t : t = T−m+1, …, T}." ... "validation data D_v are split into two parts for each k ∈ I_cp: D_v,0(k) := {Z_t : t = m+1, …, k} and D_v,1(k) := {Z_t : t = k+1, …, T−m}." ... "In practice, we propose to fix ϵ = 0.15." ... "We propose to fix η = 0.05 for practical implementations."
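The three-part split and the per-candidate validation split quoted above can be sketched as follows. This is a minimal illustration with our own function and variable names; the authors' R/Python implementation may differ in details such as rounding of m.

```python
import numpy as np

def three_way_split(Z, eps=0.15):
    """Split a length-T sequence into D0 = {Z_t : t <= m},
    Dv = {Z_t : m < t <= T-m}, D1 = {Z_t : t > T-m}, with m = floor(T*eps)."""
    T = len(Z)
    m = int(np.floor(T * eps))
    return Z[:m], Z[m:T - m], Z[T - m:], m

def split_validation(Dv, m, k):
    """For a candidate change-point k, split the validation block into
    Dv0(k) = {Z_t : m+1 <= t <= k} and Dv1(k) = {Z_t : k+1 <= t <= T-m}.
    Dv is 0-indexed and starts at original time index m+1."""
    return Dv[:k - m], Dv[k - m:]

# Example: T = 100, eps = 0.15 gives m = 15, so |D0| = |D1| = 15, |Dv| = 70.
Z = np.arange(100)
D0, Dv, D1, m = three_way_split(Z)
```

A classifier is then trained to distinguish D0 from D1, and its AUC is evaluated on the validation splits for each candidate k.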
Hardware Specification | Yes | "To illustrate the idea, we use a 16-core AMD EPYC 7542 CPU with 32 GB RAM for all methods except for Fnn, which is performed using a single-core Intel Xeon 6226R CPU with a single NVIDIA Tesla V100 GPU."
Software Dependencies | No | "R package glmnet is used (Friedman et al., 2010)." ... "Python framework TensorFlow is used." ... "R package randomForest (Liaw and Wiener, 2002) is used." ... "we adopt the pre-trained weight imagenet (Deng et al., 2009), which is default in TensorFlow."
Experiment Setup | Yes | "For a small trimming parameter ϵ ∈ (0, 1/2)" ... "In practice, we propose to fix ϵ = 0.15." ... "For Fully-connected Neural Network: We use a fully connected neural network with three hidden layers, ReLU activation and the binary cross-entropy loss. Additionally, a LASSO penalty is enforced in each layer." ... "For vgg16, vgg19: The default imagenet embeddings are frozen to train classifiers, and only the last layer of the neural network is modified during back-propagation. A single feed-forward layer with 512 neurons is added on top of the embeddings."
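The fully connected classifier described above (three hidden layers, ReLU, binary cross-entropy, LASSO penalty in each layer) can be sketched in Keras. Layer widths, the L1 strength, and the optimizer are our illustrative choices, not values reported by the authors.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_classifier(input_dim, width=32, l1=1e-4):
    """Three ReLU hidden layers and a sigmoid output, each Dense layer
    carrying an L1 (LASSO) weight penalty; trained with binary
    cross-entropy, as in the quoted setup."""
    model = tf.keras.Sequential([
        layers.Input(shape=(input_dim,)),
        layers.Dense(width, activation="relu",
                     kernel_regularizer=regularizers.l1(l1)),
        layers.Dense(width, activation="relu",
                     kernel_regularizer=regularizers.l1(l1)),
        layers.Dense(width, activation="relu",
                     kernel_regularizer=regularizers.l1(l1)),
        layers.Dense(1, activation="sigmoid",
                     kernel_regularizer=regularizers.l1(l1)),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model
```

For the VGG-based variants, the analogous sketch would freeze the imagenet embeddings and train only a single 512-unit feed-forward layer on top, per the quoted description.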