On Averaging ROC Curves

Authors: Jack Hogan, Niall M. Adams

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | For a simple illustration of the proposed averaging methods, we simulate scores arising from a probabilistic classifier employed on two datasets: the first well separable, the second poorly separable. We assume equally balanced classes for both datasets but allow the total number of instances to differ between datasets. The number of scores simulated for the negative and positive classes of the well separable dataset was n_N1 = n_P1 = 150; for the poorly separable dataset n_N2 = n_P2 = 75. Density estimates of the simulated positive and negative scores for the two datasets are shown in the top two panels of Figure 1a.
Researcher Affiliation | Academia | Jack Hogan EMAIL Department of Mathematics, Imperial College London, London, UK. Niall M. Adams EMAIL Department of Mathematics, Imperial College London, London, UK.
Pseudocode | No | The paper describes methods and concepts in prose and mathematical notation without explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper discusses third-party software packages (ROCR, scikit-learn) and refers to their online tutorials, but it does not state that the authors are releasing their own code for the methodology described in this paper.
Open Datasets | No | The paper uses simulated scores for illustration, stating: 'For a simple illustration of the proposed averaging methods, we simulate scores arising from a probabilistic classifier employed on two datasets.' There is no mention of using or providing access to any publicly available or open datasets.
Dataset Splits | No | The paper uses simulated scores to illustrate concepts and mentions 'M = 100 datasets simulated in this way' but does not specify training, validation, or test splits for these datasets, as they are generated for illustrative purposes rather than used for model training and evaluation within the paper's own experimental setup.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory used for running simulations or computations.
Software Dependencies | No | The paper mentions 'the R package ROCR' and 'the Python package scikit-learn' in a general discussion of existing tools, but it does not specify version numbers for any software dependencies used by the authors to conduct their simulations or analysis.
Experiment Setup | Yes | For a simple illustration of the proposed averaging methods, we simulate scores arising from a probabilistic classifier employed on two datasets: the first well separable, the second poorly separable. We assume equally balanced classes for both datasets but allow the total number of instances to differ between datasets. The number of scores simulated for the negative and positive classes of the well separable dataset was n_N1 = n_P1 = 150; for the poorly separable dataset n_N2 = n_P2 = 75. Density estimates of the simulated positive and negative scores for the two datasets are shown in the top two panels of Figure 1a. The output classification scores for two arbitrary classifiers were simulated as follows: For classifier 1 (C-1), negative and positive class scores for dataset i are simulated from Gaussian distributions N(µ_i0, σ_i0) and N(µ_i1, σ_i1) respectively.
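
The simulation setup quoted in the Experiment Setup entry can be sketched roughly as follows. This is only an illustration, not the authors' code: the summary does not report the Gaussian parameters, so the means and spreads below are hypothetical placeholders, σ is assumed to be a standard deviation, and the helper names (`simulate_scores`, `empirical_roc`, `trapezoid_auc`) are invented for this example. Only the class sizes (150 and 75 per class) come from the text.

```python
import numpy as np


def simulate_scores(rng, n_neg, n_pos, mu_neg, sd_neg, mu_pos, sd_pos):
    """Draw Gaussian scores for the negative (0) and positive (1) class."""
    neg = rng.normal(mu_neg, sd_neg, n_neg)
    pos = rng.normal(mu_pos, sd_pos, n_pos)
    scores = np.concatenate([neg, pos])
    labels = np.concatenate([np.zeros(n_neg, dtype=int),
                             np.ones(n_pos, dtype=int)])
    return scores, labels


def empirical_roc(scores, labels):
    """Empirical ROC: lower the threshold through the scores, highest first.

    Assumes continuous scores with no ties, which holds almost surely
    for Gaussian draws.
    """
    order = np.argsort(-scores, kind="stable")
    sorted_labels = labels[order]
    tp = np.cumsum(sorted_labels)        # true positives at each threshold
    fp = np.cumsum(1 - sorted_labels)    # false positives at each threshold
    tpr = np.concatenate([[0.0], tp / tp[-1]])
    fpr = np.concatenate([[0.0], fp / fp[-1]])
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


rng = np.random.default_rng(0)

# Class sizes follow the text; all means and standard deviations are
# hypothetical, since the summary does not report them.
well_scores, well_labels = simulate_scores(rng, 150, 150, 0.0, 1.0, 3.0, 1.0)
poor_scores, poor_labels = simulate_scores(rng, 75, 75, 0.0, 1.0, 0.5, 1.0)

fpr_well, tpr_well = empirical_roc(well_scores, well_labels)
fpr_poor, tpr_poor = empirical_roc(poor_scores, poor_labels)

auc_well = trapezoid_auc(fpr_well, tpr_well)
auc_poor = trapezoid_auc(fpr_poor, tpr_poor)
```

Under these placeholder parameters the well separable dataset should give an AUC close to 1 and the poorly separable one an AUC only moderately above 0.5. The two ROC curves also have different numbers of points (301 versus 151), which is one reason averaging ROC curves across datasets needs care.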