Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Unsupervised Anomaly Detection with Rejection

Authors: Lorenzo Perini, Jesse Davis

NeurIPS 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally address the following research questions: Q1. How does REJEX's cost compare to the baselines? Q2. How does varying the cost function affect the results? Q3. How does REJEX's CPU time compare to the baselines? Q4. Do the theoretical results hold in practice?
Researcher Affiliation | Academia | Lorenzo Perini, DTAI lab & Leuven.AI, KU Leuven, Belgium (EMAIL); Jesse Davis, DTAI lab & Leuven.AI, KU Leuven, Belgium (EMAIL)
Pseudocode | No | The paper includes mathematical formulations and proofs (e.g., Theorem 3.1, Theorem 3.5, Theorem 3.6, Theorem 3.8), but it does not feature any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Code available at: https://github.com/Lorenzo-Perini/RejEx.
Open Datasets | Yes | We carry out our study on 34 publicly available benchmark datasets, widely used in the literature [23]. These datasets can be downloaded at the following link: https://github.com/Minqi824/ADBench/tree/main/datasets/Classical.
Dataset Splits | Yes | (1) we split the dataset into training and test sets (80-20) using 5-fold cross-validation; ... This uses (any) unsupervised detector to obtain pseudo labels for the training set. It then sets the rejection threshold as follows: 1) it creates a held-out validation set (20%)
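The quoted split procedure (80-20 train/test via 5-fold cross-validation, plus a 20% held-out validation set carved from each training fold) can be sketched as follows. This is a minimal illustration with placeholder data, not the authors' code; the array sizes and random seeds are assumptions.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

# Placeholder data standing in for one of the 34 benchmark datasets.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# 5-fold CV: each fold yields an 80-20 train/test split.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X):
    X_train_full, X_test = X[train_idx], X[test_idx]
    # A further 20% of the training data is held out as a
    # validation set for setting the rejection threshold.
    X_train, X_val = train_test_split(
        X_train_full, test_size=0.2, random_state=0
    )
    assert len(X_test) == 20 and len(X_val) == 16 and len(X_train) == 64
```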
Hardware Specification | Yes | All experiments were run on an Intel(R) Xeon(R) Silver 4214 CPU.
Software Dependencies | No | The paper mentions software tools like PYOD [66] and Bayesian Optimization [17] but does not specify their version numbers or the version of the programming language (e.g., Python) used.
Experiment Setup | Yes | We set our tolerance ε = 2e^(−T) with T = 32. Note that the exponential smooths out the effect of T, which makes setting a different T have little impact. We use a set of 12 unsupervised anomaly detectors implemented in PYOD [66] with default hyperparameters [62] because the unsupervised setting does not allow us to tune them: KNN [3], IFOREST [42], LOF [5], OCSVM [58], AE [8], HBOS [21], LODA [53], COPOD [39], GMM [2], ECOD [40], KDE [36], INNE [4]. We set all the baselines' rejection thresholds via Bayesian Optimization with 50 calls [17].
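The setup above runs a battery of unsupervised detectors with default hyperparameters and scores points by anomalousness. A minimal sketch of that pattern, using scikit-learn analogues of IFOREST, LOF, and OCSVM as stand-ins for the PYOD implementations (the data and detector subset are assumptions for illustration):

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

# Placeholder data standing in for a benchmark dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))

# Default hyperparameters throughout: the unsupervised setting
# provides no labels to tune them with.
detectors = {
    "IFOREST": IsolationForest(random_state=0),
    "LOF": LocalOutlierFactor(novelty=True),
    "OCSVM": OneClassSVM(),
}

scores = {}
for name, det in detectors.items():
    det.fit(X)
    # Negate score_samples so that higher values mean more anomalous.
    scores[name] = -det.score_samples(X)
    assert scores[name].shape == (200,)
```

In the paper these per-detector anomaly scores feed the rejection step, where each baseline's threshold is tuned by Bayesian Optimization (50 calls); that tuning layer is omitted here.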