Automatic Unsupervised Outlier Model Selection

Authors: Yue Zhao, Ryan Rossi, Leman Akoglu

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments show that selecting a model by METAOD significantly outperforms no model selection (e.g., always using the same popular model or the ensemble of many) as well as other meta-learning techniques that we tailored for UOMS."
Researcher Affiliation | Collaboration | Yue Zhao (Carnegie Mellon University, EMAIL); Ryan A. Rossi (Adobe Research, EMAIL); Leman Akoglu (Carnegie Mellon University, EMAIL)
Pseudocode | Yes | "We also provide the detailed steps of METAOD in pseudo-code, for both meta-training (offline) and model selection (online), in Appendix D, Algo. 1."
Open Source Code | Yes | "We open-source METAOD and our meta-learning database for practical use and to foster further research on the UOMS problem." Code available at: https://github.com/yzhao062/UOMS
Open Datasets | Yes | "1. Proof-of-Concept (POC) testbed contains 100 datasets that form clusters of similar datasets, where 5 different detection tasks ('siblings') are created from each one of 20 'mothersets'. 2. Stress Testing (ST) testbed consists of 62 independent datasets from 3 different public-domain OD dataset repositories, which exhibit relatively lower similarity to one another. We use the benchmark datasets by Emmott et al. [11], who created childsets from 20 independent mothersets by sampling." (Datasets: https://ir.library.oregonstate.edu/concern/datasets/47429f155)
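The POC testbed structure (5 sibling tasks sampled from each of 20 mothersets, giving 100 datasets) can be sketched as below. This is an illustrative assumption, not the actual childset-construction procedure of Emmott et al. [11]; `make_siblings`, the sampling fraction `frac`, and the seed are all hypothetical.

```python
import random

def make_siblings(motherset, n_siblings=5, frac=0.8, seed=0):
    """Create sibling detection tasks by row-subsampling one motherset.

    Hypothetical sketch: n_siblings=5 matches the paper's testbed layout,
    while frac and seed are arbitrary choices for illustration.
    """
    rng = random.Random(seed)
    size = int(frac * len(motherset))
    return [rng.sample(motherset, size) for _ in range(n_siblings)]

# 20 mothersets x 5 siblings each -> a 100-dataset POC-style testbed
mothersets = [[(m, i) for i in range(500)] for m in range(20)]
poc_testbed = [child for m in mothersets for child in make_siblings(m)]
```

Siblings drawn from the same motherset are therefore similar by construction, which is what lets the POC testbed probe meta-learning under high meta-train/meta-test similarity.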
Dataset Splits | Yes | "We split them into 5 folds for cross-validation, each test fold containing 20 independent childsets without siblings. For evaluation on ST, we use leave-one-out cross-validation; each time using 61 datasets as meta-train."
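The two split protocols can be sketched with scikit-learn; the dataset identifiers below are placeholders, and the plain `KFold` shown here does not enforce the paper's sibling-exclusion constraint (a grouped split over mothersets would be needed for that):

```python
from sklearn.model_selection import KFold, LeaveOneOut

# Hypothetical dataset identifiers standing in for the real testbeds:
# POC has 100 datasets (20 mothersets x 5 siblings), ST has 62.
poc_datasets = [f"poc_{i}" for i in range(100)]
st_datasets = [f"st_{i}" for i in range(62)]

# POC protocol: 5-fold cross-validation over datasets, 20 per test fold.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
poc_fold_sizes = [len(test_idx) for _, test_idx in kf.split(poc_datasets)]

# ST protocol: leave-one-out, using the remaining 61 datasets as meta-train.
loo = LeaveOneOut()
st_rounds = [(len(tr), len(te)) for tr, te in loo.split(st_datasets)]
```

With 100 POC datasets, each test fold holds exactly 20; with 62 ST datasets, leave-one-out yields 62 rounds of 61 meta-train datasets each, matching the quoted setup.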
Hardware Specification | Yes | "All models are built using the PyOD library [61] on an Intel i7-9700 @ 3.00 GHz, 64 GB RAM, 8-core workstation."
Software Dependencies | No | The paper mentions the PyOD library [61] but does not provide specific version numbers for it or for any other ancillary software dependencies.
Experiment Setup | Yes | "We pair 8 SOTA OD algorithms and their corresponding hyperparameters to compose a model set M with 302 unique models. (See Appendix A, Table 2 for the complete list.)"
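One way to materialize such a model set is a Cartesian product over per-algorithm hyperparameter grids. The grids below are illustrative placeholders only; the paper's actual grids, which yield 302 models across 8 algorithms, are listed in Appendix A, Table 2.

```python
from itertools import product

# Illustrative hyperparameter grids (4 algorithms shown, not the paper's 8).
hyper_grids = {
    "LOF":     {"n_neighbors": [1, 5, 10, 20, 50]},
    "kNN":     {"n_neighbors": [1, 5, 10, 20, 50], "method": ["largest", "mean"]},
    "iForest": {"n_estimators": [50, 100, 200], "max_features": [0.5, 1.0]},
    "OCSVM":   {"nu": [0.1, 0.5, 0.9], "kernel": ["rbf", "sigmoid"]},
}

def expand(algo, grid):
    """Yield (algorithm, hyperparameter-dict) pairs from one grid."""
    keys = sorted(grid)
    for values in product(*(grid[k] for k in keys)):
        yield (algo, dict(zip(keys, values)))

model_set = [cfg for algo, grid in hyper_grids.items()
             for cfg in expand(algo, grid)]
# 5 + 10 + 6 + 6 = 27 illustrative configurations
```

Each `(algorithm, hyperparameters)` pair is one candidate model; METAOD's task is then to pick one such pair for a new dataset without labels.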