Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Generalized Resubstitution for Classification Error Estimation
Authors: Parisa Ghane, Ulisses Braga-Neto
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted extensive numerical experiments with various classification rules trained on synthetic data, which indicate that the new family of error estimators proposed here produces the best results overall, except in the case of very complex, overfitting classifiers, in which semi-bolstered resubstitution should be used instead. In addition, results of an image classification experiment using the LeNet-5 convolutional neural network and the MNIST data set show that naive-Bayes bolstered resubstitution with a simple data-driven calibration procedure produces excellent results, demonstrating the potential of this class of error estimators in deep learning for computer vision. |
| Researcher Affiliation | Academia | Parisa Ghane EMAIL Ulisses Braga-Neto EMAIL Department of Electrical and Computer Engineering Texas A&M University College Station, TX 77843 USA |
| Pseudocode | No | The paper describes the methods using mathematical formulations and prose, but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | In addition, results of an image classification experiment using the LeNet-5 convolutional neural network and the MNIST data set show that naive-Bayes bolstered resubstitution with a simple data-driven calibration procedure produces excellent results, demonstrating the potential of this class of error estimators in deep learning for computer vision. |
| Dataset Splits | Yes | The LeNet-5 network was trained using 200 epochs of stochastic gradient descent, with batch size 32, employing 10% of the training data in each case as a validation data set to stop training early if the validation loss was not reduced for 10 consecutive epochs. ... The bias is roughly estimated by training the classifier on a random sample of 80% of the images from the available training data, and testing it on the remaining 20%. |
| Hardware Specification | No | The paper discusses the use of a 'LeNet-5 convolutional neural network' but does not specify any particular hardware (e.g., GPU or CPU models) used for its implementation or training. |
| Software Dependencies | No | The paper mentions 'stochastic gradient descent' as a training method and refers to a 'LeNet-5 convolutional neural network' but does not specify any software platforms (e.g., PyTorch, TensorFlow) or their version numbers. |
| Experiment Setup | Yes | The LeNet-5 network was trained using 200 epochs of stochastic gradient descent, with batch size 32, employing 10% of the training data in each case as a validation data set to stop training early if the validation loss was not reduced for 10 consecutive epochs. ... The bias is roughly estimated by training the classifier on a random sample of 80% of the images from the available training data, and testing it on the remaining 20%. ... increasing κ by a fixed step-size (here, 0.1) until the magnitude of the roughly estimated bias does not decrease for two consecutive iterations. |
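The calibration procedure quoted in the Experiment Setup row (increase κ by a fixed step of 0.1 until the magnitude of the roughly estimated bias fails to decrease for two consecutive iterations) can be sketched as a small search loop. This is an illustrative reading of the excerpt, not the authors' code: the function name `calibrate_kappa` and the `estimate_bias` callback (which would wrap the 80%/20% train/test bias estimate described in the paper) are hypothetical placeholders.

```python
def calibrate_kappa(estimate_bias, kappa0=0.0, step=0.1, patience=2):
    """Sketch of the data-driven calibration loop described in the paper.

    estimate_bias: callable returning a rough bias estimate for a given
        kappa (e.g., obtained from an 80/20 split of the training data).
    Increases kappa by `step` until |bias| has not decreased for
    `patience` consecutive iterations, then returns the best kappa seen.
    """
    best_kappa = kappa0
    best_abs_bias = abs(estimate_bias(kappa0))
    stall = 0  # consecutive iterations without improvement
    kappa = kappa0
    while stall < patience:
        kappa += step
        abs_bias = abs(estimate_bias(kappa))
        if abs_bias < best_abs_bias:
            best_kappa, best_abs_bias = kappa, abs_bias
            stall = 0
        else:
            stall += 1
    return best_kappa, best_abs_bias
```

With a toy bias curve minimized at κ = 0.5, the loop steps past the minimum, stalls twice, and returns the best κ found, matching the stopping rule in the excerpt.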