An Empirical Evaluation of Ranking Measures With Respect to Robustness to Noise

Authors: D. Berrar

JAIR 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Using both synthetic and real-world data sets, we investigate how different types and levels of noise affect the area under the ROC curve (AUC), the area under the ROC convex hull, the scored AUC, the Kolmogorov-Smirnov statistic, and the H-measure."
Researcher Affiliation | Academia | Daniel Berrar, EMAIL, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, 4259 Nagatsuta, Midori-ku, Yokohama 226-8502, Japan
Pseudocode | Yes | "Appendix A. Pseudocodes. Algorithm 1: Pseudocode for the KS statistic."
Open Source Code | No | The paper states that experiments were carried out in R 2.10.1, but it provides no repository link and no explicit statement that the source code is released. "All experiments are described in pseudocode in Appendix A and were carried out in R 2.10.1 (R Development Core Team, 2009)."
Open Datasets | Yes | "We used ten benchmark data sets from the UCI repository (Bache & Lichman, 2013)."
Dataset Splits | Yes | "Then, we compared the performance of C1 and C2 in 10-fold cross-validation. We repeated this experiment 1000 times and recorded how many times C2 was declared the better model by the respective ranking measure (see Appendix A, Algorithm 2)."
Hardware Specification | No | The paper states that experiments were carried out in R 2.10.1, which is a software environment, but it does not specify any hardware (e.g., CPU or GPU model, memory).
Software Dependencies | Yes | "All experiments are described in pseudocode in Appendix A and were carried out in R 2.10.1 (R Development Core Team, 2009)."
Experiment Setup | No | The paper describes the general process for generating synthetic data and adding noise, and mentions using 'naive Bayes learning to construct our base classifier', but it does not provide specific hyperparameters for this classifier (e.g., smoothing parameters) or other system-level training settings.
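Two of the ranking measures assessed above, the AUC and the Kolmogorov-Smirnov (KS) statistic, can be computed directly from classifier scores. The paper's experiments were carried out in R and its exact pseudocode is in its Appendix A; the following Python sketch is ours, an illustration of the standard definitions only (all function and variable names are our own):

```python
# Illustrative sketch (not the paper's code): AUC and the KS statistic
# computed from the scores a classifier assigns to positive and
# negative examples.

def auc(pos_scores, neg_scores):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly drawn positive is scored above a randomly drawn negative,
    counting ties as 1/2."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def ks_statistic(pos_scores, neg_scores):
    """KS statistic: the maximum vertical distance between the empirical
    CDFs of the positive-class and negative-class score distributions."""
    thresholds = sorted(set(pos_scores) | set(neg_scores))
    best = 0.0
    for t in thresholds:
        f_pos = sum(s <= t for s in pos_scores) / len(pos_scores)
        f_neg = sum(s <= t for s in neg_scores) / len(neg_scores)
        best = max(best, abs(f_pos - f_neg))
    return best

pos = [0.9, 0.8, 0.7, 0.6]   # scores of positive examples
neg = [0.65, 0.5, 0.4, 0.3]  # scores of negative examples
print(auc(pos, neg))          # 0.9375 (15 of 16 pairs ranked correctly)
print(ks_statistic(pos, neg)) # 0.75
```

A perfectly separating classifier yields AUC = 1 and KS = 1; a classifier whose score distributions coincide for both classes yields AUC = 0.5 and KS = 0, which is what makes both measures sensitive to the class-label and score noise studied in the paper.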