Early Concept Drift Detection via Prediction Uncertainty

Authors: Pengqian Lu, Jie Lu, Anjin Liu, Guangquan Zhang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluations on both synthetic and real-world datasets demonstrate PUDD's efficacy in detecting drift in structured and image data.
Researcher Affiliation | Academia | Australian Artificial Intelligence Institute (AAII), University of Technology Sydney, Ultimo, NSW 2007, Australia {Pengqian.Lu@student., Jie.Lu@, Anjin.Liu@, Guangquan.Zhang@}uts.edu.au
Pseudocode | No | The pseudo-code and time complexity analysis are provided in the Appendix.
Open Source Code | Yes | Code: https://github.com/RocStone/PUDD
Open Datasets | Yes | Our experiments utilize 3 real-world datasets (airline (Ikonomovska 2011), elec2 (Harries 1999), powersupply (Dau et al. 2019)) and 4 synthetic sets (sine (Gama et al. 2004), mixed (Gama et al. 2004), CIFAR-10-CD, sea variants (Bifet et al. 2010)).
Dataset Splits | No | The paper describes partitioning data streams into chunks and using a sliding window strategy for drift detection. For example: "If the data is collected in chunks, then the stream includes a set of chunks $D_{1,t} = \{D_j \mid j \in [1, t]\}$, where each chunk $D_j = \{(x_{jk}, y_{jk}) \mid k \in [1, M]\}$ includes M examples." and "$D_{t_1,t+1}$ is split into $D_{t_1,r}$ and $D_{r,t+1}$ for the Adaptive PU-index Bucketing algorithm." However, it does not provide specific train/test/validation splits with percentages or counts for the classifiers used in the experiments.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | The paper mentions evaluating methods using "three classifiers: DNN (architecture detailed in the Appendix), Gaussian Naive Bayes (GNB) (Virtanen et al. 2020), and VFDT (Hulten, Spencer, and Domingos 2001)". While Virtanen et al. 2020 refers to SciPy 1.0, this is a citation for the GNB implementation used, not an explicit statement of multiple software dependencies with version numbers for the authors' own method.
Experiment Setup | Yes | Our method is denoted as PUDD-X, where X represents the exponent in $10^{-X}$. ... The p-value obtained from Pearson's chi-square test serves as a precise control mechanism for our tolerance to false alarms. By adjusting the significance level (α), we can directly modulate the trade-off between sensitivity and false positive rate.
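
The role of the significance level in the Experiment Setup entry can be illustrated with a small sketch. It is not the authors' implementation, and it makes several assumptions: PUDD-X is read as α = 10^(-X); the two input windows are hypothetical lists of per-example uncertainty scores; fixed equal-frequency buckets stand in for the paper's Adaptive PU-index Bucketing; and the function names (`chi2_sf_even_dof`, `bucket_counts`, `drift_detected`) are invented for illustration. To stay within the standard library it uses the closed-form chi-square tail for an even number of degrees of freedom, so the bucket count must be odd; a real pipeline would more likely call `scipy.stats.chi2_contingency`.

```python
import math


def chi2_sf_even_dof(stat, dof):
    """P(X > stat) for a chi-square variable with a positive even dof."""
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form needs a positive even dof")
    half = stat / 2.0
    term, total = 1.0, 1.0
    # exp(-half) * sum_{i < dof/2} half**i / i!
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def bucket_counts(values, edges):
    """Histogram of values over the buckets delimited by sorted interior edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[sum(v > e for e in edges)] += 1
    return counts


def drift_detected(old_scores, new_scores, x_exponent, n_buckets=5):
    """Pearson chi-square homogeneity test between two windows of scores.

    Returns (flag, p_value), where flag is True when p_value < 10**(-x_exponent).
    """
    alpha = 10.0 ** (-x_exponent)  # assumed meaning of the X in PUDD-X
    pooled = sorted(list(old_scores) + list(new_scores))
    # Interior edges at pooled quantiles (a stand-in for adaptive bucketing).
    edges = [pooled[len(pooled) * i // n_buckets] for i in range(1, n_buckets)]
    rows = [bucket_counts(old_scores, edges), bucket_counts(new_scores, edges)]
    col_totals = [a + b for a, b in zip(*rows)]
    if min(col_totals) == 0:
        raise ValueError("empty bucket; use fewer buckets")
    n = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    p_value = chi2_sf_even_dof(stat, n_buckets - 1)
    return p_value < alpha, p_value
```

Calling `drift_detected(old, new, x_exponent=3)` tests at α = 0.001. Raising `x_exponent` makes the detector more conservative (fewer false alarms, later detections), which is the trade-off the entry describes.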