Detecting Change Intervals with Isolation Distributional Kernel
Authors: Yang Cao, Ye Zhu, Kai Ming Ting, Flora D. Salim, Hong Xian Li, Luxing Yang, Gang Li
JAIR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The effectiveness and efficiency of iCID have been systematically verified on both synthetic and real-world datasets. Our empirical results show that iCID performs better than 6 state-of-the-art methods, including deep learning methods, in terms of F1-score and/or runtime. |
| Researcher Affiliation | Academia | Yang Cao, Ye Zhu (Deakin University, Geelong, Australia); Kai Ming Ting (Nanjing University, Nanjing, China); Flora D. Salim (University of New South Wales, Kensington, Australia); Hong Xian Li, Luxing Yang, Gang Li (Deakin University, Geelong, Australia) |
| Pseudocode | Yes | Algorithm 1: Offline iCID(D, w, Ψ); Algorithm 2: Online iCID(D, D′, w, Ψ) |
| Open Source Code | Yes | The source code of iCID can be obtained from https://github.com/IsolationKernel/iCID. |
| Open Datasets | Yes | In order to demonstrate the effectiveness of the proposed iCID method in a variety of applications, we include 6 real-world and 2 synthetic datasets. Table 2 presents the properties of each dataset. Yahoo-16 records hardware resource usage... (https://webscope.sandbox.yahoo.com/catalog.php?datatype=s). USC-HAD (Zhang & Sawchuk, 2012) includes 14 subjects... (http://sipi.usc.edu/had). Google-Trend is a monthly time series data... (https://github.com/zhaokg/Rbeast/blob/master/Matlab/testdata/googletrend.mat). Well Log (O Ruanaidh & Fitzgerald, 1996) contains measurements... (https://github.com/alan-turing-institute/TCPD/tree/master/datasets/well_log). Weather is global monthly mean temperature... (https://datahub.io/core/global-temp#readme). HASC (Kawaguchi et al., 2011a, 2011b) is human activity data... (http://hasc.jp/hc2011/). |
| Dataset Splits | Yes | For the online version, ψ is obtained based on a reference dataset D, e.g., the first k points of the streaming data. For the online iCID, we used the first half of each dataset as the reference D and applied a sliding window to generate the current streaming data D′. |
| Hardware Specification | Yes | iCID is tested using Matlab R2022b with Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz and TS-CP2 is tested using Python with NVIDIA Quadro RTX 5000 GPU. |
| Software Dependencies | Yes | iCID is tested using Matlab R2022b with Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz and TS-CP2 is tested using Python with NVIDIA Quadro RTX 5000 GPU. |
| Experiment Setup | Yes | Offline iCID has three inputs: dataset D, window size w and subsample size list Ψ. The parameter search ranges are provided in Table 3. Table 3: Parameter search ranges for iCID. Subsample size ψ: {2, 4, 8, 16, 32, 64}; window size w: 10, 15, ..., 400; power factor α: 0, 0.1, 0.2, ..., 3. |
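
The search ranges quoted from Table 3 and the reference/stream split quoted under Dataset Splits can be written out as a short Python sketch. This is an illustration only, not the authors' Matlab code: the function names, the `step` parameter, and the choice to enumerate the ranges as a full Cartesian grid are assumptions made here, since the excerpts do not say how the sliding window advances or whether the parameters are searched jointly.

```python
from itertools import product

# Search ranges as quoted from Table 3.
PSI_VALUES = [2, 4, 8, 16, 32, 64]                      # subsample size psi
WINDOW_SIZES = list(range(10, 401, 5))                  # window size w: 10, 15, ..., 400
ALPHA_VALUES = [round(0.1 * i, 1) for i in range(31)]   # power factor alpha: 0, 0.1, ..., 3


def parameter_grid():
    """Return every (psi, w, alpha) combination.

    Assumption: a joint grid over all three ranges; the quoted text only
    lists the ranges, not how they are combined.
    """
    return product(PSI_VALUES, WINDOW_SIZES, ALPHA_VALUES)


def split_reference_and_windows(series, w, step=None):
    """Split a series into a reference half and sliding windows over the rest.

    The first half plays the role of the reference set D; the second half is
    cut into windows of length w that stand in for the current streaming data.
    `step` is a hypothetical parameter (default: non-overlapping windows).
    """
    if step is None:
        step = w
    half = len(series) // 2
    reference = series[:half]
    stream = series[half:]
    windows = [stream[i:i + w] for i in range(0, len(stream) - w + 1, step)]
    return reference, windows


if __name__ == "__main__":
    reference, windows = split_reference_and_windows(list(range(20)), w=4)
    print(len(reference), len(windows))
    print(sum(1 for _ in parameter_grid()))
```

Passing `step=1` gives fully overlapping windows; which stride matches the paper's experiments is not stated in the quoted text.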