A Finite Sample Analysis of the Naive Bayes Classifier
Authors: Daniel Berend, Aryeh Kontorovich
JMLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We revisit, from a statistical learning perspective, the classical decision-theoretic problem of weighted expert voting. In particular, we examine the consistency (both asymptotic and finitary) of the optimal Naive Bayes weighted majority and related rules. In the case of known expert competence levels, we give sharp error estimates for the optimal rule. We derive optimality results for our estimates and also establish some structural characterizations. When the competence levels are unknown, they must be empirically estimated. We provide frequentist and Bayesian analyses for this situation. Some of our proof techniques are non-standard and may be of independent interest. Several challenging open problems are posed, and experimental results are provided to illustrate the theory. |
| Researcher Affiliation | Academia | Daniel Berend EMAIL Department of Computer Science and Department of Mathematics Ben-Gurion University Beer Sheva, Israel Aryeh Kontorovich EMAIL Department of Computer Science Ben-Gurion University Beer Sheva, Israel |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. It primarily presents mathematical derivations and proofs. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. There is no mention of a repository link, an explicit code release statement, or code being available in supplementary materials. |
| Open Datasets | No | In each trial, a vector of expert competences p ∈ [0, 1]^n was drawn independently componentwise, with p_i ∼ Beta(1, 1). These values (i.e., αi = βi = 1) were used for f̂^Ba. The results are reported in Figure 2. The paper describes a generative process for the data used in experiments (drawing expert competences from Beta distributions) rather than using a pre-existing public dataset, and no access information is provided for the generated data. |
| Dataset Splits | No | The paper describes generating data for its experiments (e.g., "In each trial, a vector of expert competences p ∈ [0, 1]^n was drawn independently componentwise, with p_i ∼ Beta(1, 1)."). However, it does not provide specific dataset split information (e.g., train/test/validation percentages or counts) as would be applicable for pre-existing datasets. It mentions that "results were averaged over 10^5 trials," which is a simulation setting, not a dataset split. |
| Hardware Specification | No | The paper describes the experimental setup in Section 6, including the number of experts (n=5), sample sizes (mi), and number of trials (10^5). However, it does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to conduct these experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers (e.g., Python, PyTorch, or specific solvers). |
| Experiment Setup | Yes | In our experiments, we set n = 5 and the sample sizes mi were identical for all experts. The results were averaged over 10^5 trials. For f̂^Ba, we used αi = βi = 1 for all i. |
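Since the paper releases no code, a minimal sketch of the simulation described in the table (n = 5 experts, competences drawn from Beta(1, 1), aggregation by the Naive Bayes weighted majority rule with log-odds weights) might look as follows. All function and variable names here are our own assumptions, not the authors' implementation, and the trial count is reduced from the paper's 10^5 for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def naive_bayes_vote(votes, p):
    """Naive Bayes weighted majority: weight each expert's +/-1 vote
    by the log-odds of its competence and take the sign of the sum.

    votes: array of +/-1 expert votes; p: competence levels in (0, 1).
    """
    w = np.log(p / (1 - p))  # competent experts get positive weight
    return 1 if np.dot(w, votes) >= 0 else -1

def simulate(n=5, trials=10_000):
    """Estimate the error rate of the optimal rule with known competences,
    following the generative process described in the table above
    (a hedged reconstruction, not the authors' script)."""
    errors = 0
    for _ in range(trials):
        p = rng.beta(1.0, 1.0, size=n)   # competences p_i ~ Beta(1, 1)
        truth = rng.choice([-1, 1])      # true label, uniform
        correct = rng.random(n) < p      # expert i is right w.p. p_i
        votes = np.where(correct, truth, -truth)
        if naive_bayes_vote(votes, p) != truth:
            errors += 1
    return errors / trials

err = simulate()
```

This only covers the known-competence case; the paper's frequentist and Bayesian estimators for unknown competences would replace `p` in the weight computation with empirical or posterior estimates.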