Anomaly detection with semi-supervised classification based on risk estimators
Authors: Le Thi Khanh Hien, Sukanya Patra, Souhaib Ben Taieb
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments provide evidence of the effectiveness of the risk-based anomaly detection methods. (Section 6, Experiments) |
| Researcher Affiliation | Academia | Le Thi Khanh Hien (EMAIL), Department of Mathematics and Operational Research, University of Mons, Belgium; Sukanya Patra (EMAIL), Department of Computer Science, University of Mons, Belgium; Souhaib Ben Taieb (EMAIL), Department of Computer Science, University of Mons, Belgium |
| Pseudocode | No | The paper presents optimization problems in mathematical notation (14) and (15) but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not explicitly state that source code for the described methods is openly available, nor does it provide any links to a code repository. |
| Open Datasets | Yes | We test the algorithms on 26 classical anomaly detection benchmark datasets from Han et al. (2022), whose πn ranges from 0.02 to 0.4. We test the algorithms on 3 benchmark k-classes-out datasets: MNIST, Fashion-MNIST, and CIFAR-10 (all have 10 classes). |
| Dataset Splits | Yes | We randomly split each dataset 30 times into train and test data with a ratio of 7:3, i.e. we have 30 trials for each dataset. Then, for each trial, we randomly select 5% of the train data to make the labeled data and keep the remaining 95% as unlabeled data. For each πn ∈ {0.01, 0.05, 0.1, 0.2}, we set one of the ten classes to be a positive class, letting the remaining nine classes be anomalies and maintaining the ratio between normal instances and anomaly instances such that the setup has the required πn (so we have 10 setups corresponding to 10 classes). We note that the anomalous data in our generation process can originate from more than one of the nine classes (unlike in the setup of deep SAD where the anomaly is only from one of the nine classes). For each πn, we repeat this generation process 2 times to get 20 AD setups (or 20 trials). Then, in each trial, we randomly choose a γl (with γl ∈ {0.05, 0.1, 0.2}) portion of the train data to be labeled and keep the remaining (1 − γl) portion as unlabeled data. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU, CPU models, or cloud computing instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions using Adam for optimization, and refers to existing implementations for Deep SAD and nnPU, but does not provide specific version numbers for any software libraries or dependencies. |
| Experiment Setup | Yes | For rAD, we use ℓ2 regularization and take ϕ(x) = x in (18), i.e. no kernel is used. We set a = 0.1 and π̃p = 0.8 (π̃n = 0.2) as default values for both the shallow rAD and the PU methods. For Deep SAD and nnPU, we use default hyperparameter settings and network architectures as in their original implementation by the authors. We use the same network architectures as Deep SAD for experiments on Fashion-MNIST and MNIST datasets. For experiments on CIFAR-10, the network architecture from nnPU is used. In deep rAD, the optimization problem in (19) is solved using Adam. We implement 4 losses for deep rAD: squared loss, sigmoid loss, logistic loss, and modified Huber loss. We set a = 0.1 and π̃p = 0.8 (thus π̃n = 0.2) as default values for deep rAD. |
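The tabular split protocol quoted under Dataset Splits (7:3 train/test, then 5% of the train data labeled) can be sketched as below. This is a minimal illustration, not the authors' code; the function name `split_trial` and its parameters are assumptions.

```python
import numpy as np

def split_trial(n, rng, test_ratio=0.3, labeled_ratio=0.05):
    """One trial of the quoted protocol: randomly split n instances 7:3 into
    train/test, then mark 5% of the train indices as labeled."""
    perm = rng.permutation(n)
    n_test = int(round(n * test_ratio))
    test_idx, train_idx = perm[:n_test], perm[n_test:]
    # 5% of the *train* portion becomes the labeled set, the rest stays unlabeled
    n_labeled = int(round(len(train_idx) * labeled_ratio))
    labeled_idx = rng.choice(train_idx, size=n_labeled, replace=False)
    unlabeled_idx = np.setdiff1d(train_idx, labeled_idx)
    return train_idx, test_idx, labeled_idx, unlabeled_idx

# The paper repeats this 30 times per dataset; one trial shown here.
rng = np.random.default_rng(0)
train_idx, test_idx, labeled_idx, unlabeled_idx = split_trial(1000, rng)
```

Repeating the call with 30 different seeds reproduces the "30 trials per dataset" setup.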
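The k-classes-out generation quoted above (one class normal, the other nine anomalous, subsampled to a target anomaly fraction πn) can be sketched as follows. This is an illustrative reconstruction under stated assumptions; `k_classes_out` and its signature are not from the paper.

```python
import numpy as np

def k_classes_out(X, y, normal_class, pi_n, rng):
    """Build one AD setup: instances of `normal_class` are normal, all other
    classes are anomalies, subsampled so anomalies form a fraction pi_n."""
    normal_idx = np.where(y == normal_class)[0]
    anom_idx = np.where(y != normal_class)[0]
    # n_anom / (n_normal + n_anom) = pi_n  =>  n_anom = pi_n / (1 - pi_n) * n_normal
    n_anom = int(round(pi_n / (1 - pi_n) * len(normal_idx)))
    anom_sub = rng.choice(anom_idx, size=n_anom, replace=False)
    idx = np.concatenate([normal_idx, anom_sub])
    # AD labels: 0 = normal, 1 = anomaly (anomalies may span several classes)
    labels = np.concatenate([np.zeros(len(normal_idx)), np.ones(n_anom)])
    return X[idx], labels

# Toy usage: 10 classes with 100 instances each, class 0 normal, pi_n = 0.1.
rng = np.random.default_rng(1)
y = np.repeat(np.arange(10), 100)
X = np.zeros((len(y), 4))
X_ad, labels = k_classes_out(X, y, normal_class=0, pi_n=0.1, rng=rng)
```

Looping `normal_class` over all 10 classes and repeating twice per πn yields the 20 setups per πn described in the quote.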