Adversarial Subspace Generation for Outlier Detection in High-Dimensional Data

Authors: Jose Cribeiro-Ramallo, Federico Matteucci, Paul Enciu, Alexander Jenke, Vadim Arzamasov, Thorsten Strufe, Klemens Böhm

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on 42 real-world datasets show that using V-GAN subspaces to build ensemble methods leads to a significant increase in one-class classification performance compared to existing subspace selection, feature selection, and embedding methods. Further experiments on synthetic data show that V-GAN identifies subspaces more accurately while scaling better than other relevant subspace selection methods.
Researcher Affiliation | Academia | All seven authors (Jose Cribeiro-Ramallo, Federico Matteucci, Paul Enciu, Alexander Jenke, Vadim Arzamasov, Thorsten Strufe, Klemens Böhm) are affiliated with the Karlsruhe Institute of Technology; email addresses are redacted.
Pseudocode | Yes | A pseudo-code of the training is included in the Appendix.
Open Source Code | Yes | Finally, we provide the code for all of our experiments and methods (https://github.com/jcribeiro98/V-GAN).
Open Datasets | Yes | We used 42 normalized datasets from the benchmark study by Han et al., listed in Tables 11-15 in the appendix. For those datasets with multiple versions, we chose the first in alphanumeric order. Details about each dataset are available in (Han et al., 2022).
Dataset Splits | Yes | 1. Split the dataset D into a training set D_train containing 80% of the inliers from D, and a test set D_test containing the remaining 20% and the outliers.
Hardware Specification | Yes | Experiments ran on a Ryzen 9 7900X CPU and an Nvidia RTX 4090 GPU.
Software Dependencies | No | All experiments were implemented in Python. We used popular implementations for all competitors and baselines and implemented V-GAN in PyTorch. We used the Python package pyod for all outlier detectors.
Experiment Setup | Yes | We trained the network for 2000 epochs with minibatch gradient descent, using the Adadelta optimizer (Zeiler, 2012) following preliminary results. In particular, we use batches of size 500 and a learning rate of lr_G = lr_E = 0.007 for the generator and the encoder, respectively. We set the momentum to 0.99 and the weight decay to 0.04 (Goodfellow et al., 2016). Additionally, we updated E_ϕ once every 5 epochs.
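The Research Type row describes building ensemble one-class detectors on V-GAN subspaces. As a generic illustration only (not the paper's pipeline, which uses pyod detectors), the sketch below averages a simple kNN-distance outlier score over hand-picked feature subspaces; `knn_score` and `subspace_ensemble_scores` are illustrative names, not functions from the V-GAN code base:

```python
import numpy as np

def knn_score(X_train, X_test, k=5):
    """Distance to the k-th nearest training point: a simple outlier score."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, k - 1]

def subspace_ensemble_scores(X_train, X_test, subspaces, k=5):
    """Average outlier scores from detectors fit on feature subspaces.

    `subspaces` is a list of column-index lists, standing in for the
    subspaces a method like V-GAN would generate.
    """
    scores = [knn_score(X_train[:, s], X_test[:, s], k) for s in subspaces]
    return np.mean(scores, axis=0)
```

Points far from the training data in several subspaces accumulate large average scores and are flagged as outliers.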
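The Dataset Splits row quotes the paper's protocol: 80% of the inliers form the training set, and the remaining inliers plus all outliers form the test set. A minimal sketch of that protocol, assuming labels `y` with 0 = inlier and 1 = outlier (`split_inliers` is an illustrative name):

```python
import numpy as np

def split_inliers(X, y, train_frac=0.8, seed=0):
    """Sketch of the described split: train on a fraction of the inliers,
    test on the remaining inliers plus every outlier. Assumes y == 0
    marks inliers and y == 1 marks outliers."""
    rng = np.random.default_rng(seed)
    inlier_idx = np.flatnonzero(y == 0)
    outlier_idx = np.flatnonzero(y == 1)
    rng.shuffle(inlier_idx)
    n_train = int(train_frac * len(inlier_idx))
    train_idx = inlier_idx[:n_train]
    test_idx = np.concatenate([inlier_idx[n_train:], outlier_idx])
    return (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])
```

This mirrors the one-class setting: the model sees only normal data at training time, while evaluation measures how well outliers are separated from held-out inliers.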
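The Experiment Setup row amounts to an optimizer configuration. A hedged PyTorch sketch under stated assumptions: the generator and encoder modules below are placeholders, and mapping the reported momentum of 0.99 onto Adadelta's `rho` parameter is our assumption, since `torch.optim.Adadelta` exposes no separate momentum argument:

```python
import torch
import torch.nn as nn

# Placeholder modules; the actual V-GAN architectures are defined in the
# authors' repository.
generator = nn.Linear(10, 10)  # stand-in for the generator G
encoder = nn.Linear(10, 10)    # stand-in for the encoder E_phi

# Reported hyperparameters: lr_G = lr_E = 0.007, weight decay 0.04,
# momentum 0.99 (mapped here to Adadelta's rho; an assumption).
opt_G = torch.optim.Adadelta(generator.parameters(),
                             lr=0.007, rho=0.99, weight_decay=0.04)
opt_E = torch.optim.Adadelta(encoder.parameters(),
                             lr=0.007, rho=0.99, weight_decay=0.04)

EPOCHS, BATCH_SIZE, ENC_UPDATE_EVERY = 2000, 500, 5
for epoch in range(EPOCHS):
    ...  # minibatch (size 500) gradient steps on the generator via opt_G
    if epoch % ENC_UPDATE_EVERY == 0:
        ...  # update the encoder E_phi once every 5 epochs via opt_E
</n
```

The training loop bodies are elided; the point is how the reported numbers translate into optimizer arguments and an update schedule.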