Maximum Mean Discrepancy on Exponential Windows for Online Change Detection

Authors: Florian Kalinke, Marco Heyden, Georg Gntuni, Edouard Fouché, Klemens Böhm

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments on standard benchmark data streams show that MMDEW obtains the best F1-score on most data sets."
Researcher Affiliation | Academia | Florian Kalinke, Karlsruhe Institute of Technology, Germany
Pseudocode | Yes | Algorithm 1: Proposed MMDEW change detection algorithm.
Open Source Code | Yes | "Our code is available at https://github.com/FlopsKa/mmdew-change-detector."
Open Datasets | Yes | Listed as (observations, dimensionality, changes): CIFAR10 (Krizhevsky et al., 2009): 60,000, 1,024, 9; Fashion MNIST (Xiao et al., 2017): 70,000, 784, 9; Gas (Vergara et al., 2012): 13,910, 128, 5; HAR (Anguita et al., 2013): 10,299, 561, 5; MNIST (Deng, 2012): 70,000, 784, 9.
Dataset Splits | No | The paper does not provide explicit training/test/validation dataset splits. It describes how classification data sets are converted into streams for change detection: "For each data set, we first order the observations by their classes; a change occurs if the class changes. To introduce variation into the order of change points, we randomly permute the order of the classes before each run but use the same permutation across all algorithms."
Hardware Specification | Yes | "We ran all experiments on a server running Ubuntu 20.04 with 124 GB RAM and 32 cores with 2 GHz each."
Software Dependencies | No | The paper names Ubuntu 20.04 as the server operating system but does not give version numbers for any software libraries, frameworks, or programming languages used in the implementation.
Experiment Setup | Yes | "We run a grid parameter optimization per data set and algorithm and report the best result w.r.t. the F1-score. We note that such an optimization is difficult to perform in practice (here one typically prefers approaches with fewer or easy-to-set parameters) but allows a fair comparison. Table 3 in Appendix C lists all the parameters we tested. For kernel-based algorithms (MMDEW, NEWMA, and Scan B-statistics) we use the Gaussian kernel k(x, y) = exp(-γ‖x - y‖²) (γ > 0) and set γ using the median heuristic (Garreau et al., 2018) on the first 100 observations. We also supply the first 100 observations to competitors requiring data upfront to estimate further parameters (IBDD, WATCH)."
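To make the experiment-setup row concrete, the following is a minimal, illustrative Python sketch (not the authors' MMDEW implementation): it sets the Gaussian-kernel parameter γ with one common variant of the median heuristic and computes a biased quadratic-time estimate of the squared MMD between two samples. The function names and the exact heuristic variant (γ = 1 / (2 · median(‖x − y‖)²)) are assumptions for illustration.

```python
import numpy as np


def median_heuristic_gamma(X):
    """One common median-heuristic variant (assumed here):
    sigma = median pairwise distance, gamma = 1 / (2 * sigma**2)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    sigma = np.median(np.sqrt(d2[np.triu_indices_from(d2, k=1)]))
    return 1.0 / (2.0 * sigma ** 2)


def gaussian_kernel(X, Y, gamma):
    """k(x, y) = exp(-gamma * ||x - y||^2), evaluated for all pairs."""
    d2 = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * d2)


def mmd2_biased(X, Y, gamma):
    """Biased quadratic-time estimate of the squared MMD between X and Y.
    Equals the squared RKHS distance of the empirical mean embeddings,
    so it is always nonnegative."""
    kxx = gaussian_kernel(X, X, gamma).mean()
    kyy = gaussian_kernel(Y, Y, gamma).mean()
    kxy = gaussian_kernel(X, Y, gamma).mean()
    return kxx + kyy - 2.0 * kxy
```

A batch two-sample statistic like this is only the building block; per the paper's title, MMDEW maintains such estimates online over exponentially sized windows, which the sketch above does not attempt to reproduce.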