Towards a Unified Framework of Clustering-based Anomaly Detection
Authors: Zeyu Fang, Ming Gu, Sheng Zhou, Jiawei Chen, Qiaoyu Tan, Haishuai Wang, Jiajun Bu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments, involving 17 baseline methods across 30 diverse datasets, validate the effectiveness and generalization capability of the proposed method, surpassing state-of-the-art methods. |
| Researcher Affiliation | Academia | ¹Zhejiang Key Laboratory of Accessible Perception and Intelligent Systems, Zhejiang University, Hangzhou, China; ²The State Key Laboratory of Blockchain and Data Security, Hangzhou, China; ³New York University Shanghai, Shanghai, China. Correspondence to: Sheng Zhou <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Model Training for UniCAD |
| Open Source Code | Yes | The code for reproducing our experiments is publicly available at https://github.com/BabelTower/UniCAD. |
| Open Datasets | Yes | We evaluated UniCAD on an extensive collection of datasets, comprising 30 tabular datasets that span 16 diverse fields. We specifically focused on naturally occurring anomaly patterns, rather than synthetically generated or injected anomalies, as this aligns more closely with real-world scenarios. The detailed descriptions are provided in Table 4 of Appendix D.1. Following the setup in ADBench (Han et al., 2022), we adopt an inductive setting to predict newly emerging data, a highly beneficial approach for practical applications. |
| Dataset Splits | No | The paper mentions using an "inductive setting to predict newly emerging data" and evaluation using "AUC-ROC and AUC-PR metrics" following the setup in ADBench (Han et al., 2022). However, it does not explicitly provide specific percentages, counts, or a detailed methodology for how the training, validation, and test splits were performed for the 30 datasets used in its experiments. |
| Hardware Specification | No | The paper provides a "Runtime Comparison" in Table 2 but does not specify any hardware details (e.g., GPU models, CPU types, memory) used for running the experiments or training the models. |
| Software Dependencies | No | The paper mentions using the "Adam optimizer" and states that a "two-layer MLP" was employed, but it does not specify any software frameworks (e.g., PyTorch, TensorFlow) or library version numbers (e.g., Python 3.x, scikit-learn x.x.x). |
| Experiment Setup | Yes | For all datasets, we employ a two-layer MLP with a hidden dimension of d = 128 and ReLU activation function as both encoder and decoder. We utilize the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 1e-4 for 100 epochs. For the EM process, we set the maximum iteration number to 100 and a tolerance of 1e-3 for stopping training when the objectives converge. The number of components in the mixture model is set as k = 10, and the proportion of the outlier is set as l = 1%. |
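The reported setup can be summarized in code. The sketch below is illustrative only (the paper does not release these exact names or a framework choice): it collects the stated hyperparameters in a config dict and shows the shape of the two-layer MLP with ReLU that serves as both encoder and decoder. All key names and the `two_layer_mlp` helper are assumptions, not the authors' code.

```python
# Hedged sketch of the UniCAD experiment setup as reported in the paper.
# Config keys and function names are illustrative assumptions.
CONFIG = {
    "hidden_dim": 128,           # d = 128
    "activation": "ReLU",
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "epochs": 100,
    "em_max_iters": 100,         # EM stopping: max iterations
    "em_tolerance": 1e-3,        # EM stopping: objective tolerance
    "num_components": 10,        # k, mixture components
    "outlier_proportion": 0.01,  # l = 1%
}

def relu(v):
    """Elementwise ReLU over a plain-list vector."""
    return [max(0.0, x) for x in v]

def linear(v, W, b):
    """Affine layer: W is out_dim x in_dim, b has length out_dim."""
    return [sum(wi * xi for wi, xi in zip(row, v)) + bi
            for row, bi in zip(W, b)]

def two_layer_mlp(x, W1, b1, W2, b2):
    """Forward pass of a two-layer MLP with ReLU, matching the
    reported encoder/decoder architecture (hidden dim = 128)."""
    return linear(relu(linear(x, W1, b1)), W2, b2)
```

In practice the encoder and decoder would be trained jointly with the EM procedure in Algorithm 1; this fragment only pins down the architecture and hyperparameter values quoted above.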