An Empirical Evaluation of Ranking Measures With Respect to Robustness to Noise

Authors: D. Berrar

JAIR 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Using both synthetic and real-world data sets, we investigate how different types and levels of noise affect the area under the ROC curve (AUC), the area under the ROC convex hull, the scored AUC, the Kolmogorov-Smirnov statistic, and the H-measure."
Researcher Affiliation | Academia | Daniel Berrar, EMAIL, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, 4259 Nagatsuta, Midori-ku, Yokohama 226-8502, Japan
Pseudocode | Yes | "Appendix A. Pseudocodes. Algorithm 1: Pseudocode for the KS statistic."
Open Source Code | No | The paper states that experiments were carried out in R 2.10.1, but it provides no repository link and no explicit statement that the source code is released. "All experiments are described in pseudocode in Appendix A and were carried out in R 2.10.1 (R Development Core Team, 2009)."
Open Datasets | Yes | "We used ten benchmark data sets from the UCI repository (Bache & Lichman, 2013)."
Dataset Splits | Yes | "Then, we compared the performance of C1 and C2 in 10-fold cross-validation. We repeated this experiment 1000 times and recorded how many times C2 was declared the better model by the respective ranking measure (see Appendix A, Algorithm 2)."
Hardware Specification | No | The paper states that experiments were carried out in R 2.10.1, which is a software environment, but it does not specify any hardware (e.g., CPU or GPU model, memory).
Software Dependencies | Yes | "All experiments are described in pseudocode in Appendix A and were carried out in R 2.10.1 (R Development Core Team, 2009)."
Experiment Setup | No | The paper describes the general process for generating synthetic data and adding noise, and mentions using 'naive Bayes learning to construct our base classifier', but it does not provide specific hyperparameters for this classifier (e.g., smoothing parameters) or other system-level training settings.
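Two of the ranking measures assessed above, the AUC and the Kolmogorov-Smirnov (KS) statistic, can be computed directly from classifier scores. The paper's experiments were carried out in R and its exact pseudocode is in its Appendix A; the following Python sketch is ours, an illustration of the standard definitions only (all function and variable names are our own):

```python
# Illustrative sketch (not the paper's code): AUC and the KS statistic
# computed from the scores a classifier assigns to positive and
# negative examples.

def auc(pos_scores, neg_scores):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly drawn positive is scored above a randomly drawn negative,
    counting ties as 1/2."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def ks_statistic(pos_scores, neg_scores):
    """KS statistic: the maximum vertical distance between the empirical
    CDFs of the positive-class and negative-class score distributions."""
    thresholds = sorted(set(pos_scores) | set(neg_scores))
    best = 0.0
    for t in thresholds:
        f_pos = sum(s <= t for s in pos_scores) / len(pos_scores)
        f_neg = sum(s <= t for s in neg_scores) / len(neg_scores)
        best = max(best, abs(f_pos - f_neg))
    return best

pos = [0.9, 0.8, 0.7, 0.6]   # scores of positive examples
neg = [0.65, 0.5, 0.4, 0.3]  # scores of negative examples
print(auc(pos, neg))          # 0.9375 (15 of 16 pairs ranked correctly)
print(ks_statistic(pos, neg)) # 0.75
```

A perfectly separating classifier yields AUC = 1 and KS = 1; a classifier whose score distributions coincide for both classes yields AUC = 0.5 and KS = 0, which is what makes both measures sensitive to the class-label and score noise studied in the paper.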