Adversarial Subspace Generation for Outlier Detection in High-Dimensional Data
Authors: Jose Cribeiro-Ramallo, Federico Matteucci, Paul Enciu, Alexander Jenke, Vadim Arzamasov, Thorsten Strufe, Klemens Böhm
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on 42 real-world datasets show that using V-GAN subspaces to build ensemble methods leads to a significant increase in one-class classification performance compared to existing subspace selection, feature selection, and embedding methods. Further experiments on synthetic data show that V-GAN identifies subspaces more accurately while scaling better than other relevant subspace selection methods. |
| Researcher Affiliation | Academia | Jose Cribeiro-Ramallo (EMAIL, Karlsruhe Institute of Technology); Federico Matteucci (EMAIL, Karlsruhe Institute of Technology); Paul Enciu (EMAIL, Karlsruhe Institute of Technology); Alexander Jenke (EMAIL, Karlsruhe Institute of Technology); Vadim Arzamasov (EMAIL, Karlsruhe Institute of Technology); Thorsten Strufe (EMAIL, Karlsruhe Institute of Technology); Klemens Böhm (EMAIL, Karlsruhe Institute of Technology) |
| Pseudocode | Yes | A pseudo-code of the training is included in the Appendix |
| Open Source Code | Yes | Finally, we provide the code for all of our experiments and methods1. 1https://github.com/jcribeiro98/V-GAN |
| Open Datasets | Yes | We used 42 normalized datasets from the benchmark study by Han et al., listed in Tables 11-15 in the appendix. For those datasets with multiple versions, we chose the first in alphanumeric order. Details about each dataset are available in (Han et al., 2022). |
| Dataset Splits | Yes | 1. Split the dataset D into a training set Dtrain containing 80% of the inliers from D, and a test set Dtest containing the remaining 20% and the outliers. |
| Hardware Specification | Yes | Experiments ran on a Ryzen 9 7900X CPU and an Nvidia RTX 4090 GPU. |
| Software Dependencies | No | All experiments were implemented in Python. We used popular implementations for all competitors and baselines and implemented V-GAN in PyTorch. We used the Python package pyod for all outlier detectors. |
| Experiment Setup | Yes | We trained the network for 2000 epochs, with minibatch gradient descent using the Adadelta optimizer (Zeiler, 2012) following preliminary results. In particular, we use batches of size 500 and a learning rate of lr_G = lr_E = 0.007 for the generator and the encoder, respectively. We set momentum to 0.99 and weight decay to 0.04 (Goodfellow et al., 2016). Additionally, we updated E_ϕ once every 5 epochs. |
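The dataset-split protocol quoted above (train on 80% of the inliers; test on the remaining 20% of inliers plus all outliers) can be sketched as follows. This is a minimal illustration in NumPy, not the authors' code; the function name `one_class_split` and the label convention (0 = inlier, 1 = outlier) are assumptions for the sketch.

```python
import numpy as np

def one_class_split(X, y, train_frac=0.8, seed=0):
    """One-class classification split: the training set holds train_frac
    of the inliers (label 0); the test set holds the remaining inliers
    plus all outliers (label 1)."""
    rng = np.random.default_rng(seed)
    inliers = np.flatnonzero(y == 0)
    outliers = np.flatnonzero(y == 1)
    rng.shuffle(inliers)
    n_train = int(train_frac * len(inliers))
    train_idx = inliers[:n_train]
    test_idx = np.concatenate([inliers[n_train:], outliers])
    return X[train_idx], X[test_idx], y[test_idx]

# Toy data: 100 inliers, 10 outliers
X = np.random.default_rng(1).normal(size=(110, 5))
y = np.array([0] * 100 + [1] * 10)
X_train, X_test, y_test = one_class_split(X, y)
```

With 100 inliers and 10 outliers, this yields 80 training points and a 30-point test set containing all 10 outliers.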
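The quoted training configuration maps onto PyTorch's Adadelta optimizer roughly as below. This is a hedged sketch, not the authors' implementation: the `nn.Linear` modules are hypothetical stand-ins for V-GAN's generator and encoder (the real architectures are in the linked repository), and the quoted "momentum (0.99)" is read here as Adadelta's `rho` parameter, which is an assumption since `torch.optim.Adadelta` has no momentum argument.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in modules for the generator G and encoder E_phi.
generator = nn.Linear(16, 16)
encoder = nn.Linear(16, 16)

# Quoted hyperparameters: lr_G = lr_E = 0.007, weight decay 0.04;
# "momentum (0.99)" is mapped to Adadelta's rho here (an assumption).
opt_G = torch.optim.Adadelta(generator.parameters(),
                             lr=0.007, rho=0.99, weight_decay=0.04)
opt_E = torch.optim.Adadelta(encoder.parameters(),
                             lr=0.007, rho=0.99, weight_decay=0.04)

EPOCHS, BATCH_SIZE, ENC_UPDATE_EVERY = 2000, 500, 5

def update_encoder_this_epoch(epoch):
    # Per the quoted setup, E_phi steps only once every 5 epochs.
    return epoch % ENC_UPDATE_EVERY == 0
```

The actual training loop and adversarial losses are omitted; they are defined in the paper and its repository.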