A Unified Theory of Diversity in Ensemble Learning

Authors: Danny Wood, Tingting Mu, Andrew M. Webb, Henry W. J. Reeve, Mikel Luján, Gavin Brown

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present illustrative experiments, estimating bias/variance/diversity terms from data. Figure 5 shows an experiment with Bagged regression trees, varying ensemble size.
Researcher Affiliation | Academia | Department of Computer Science, University of Manchester, UK; School of Mathematics, University of Bristol, UK
Pseudocode | Yes | Algorithm 1: Algorithm for collecting data, later used to estimate diversity of a Bagging ensemble, whilst varying ensemble size m (a minimal sketch of this loop appears below the table).
Open Source Code | Yes | Code for all experiments at: https://github.com/EchoStatements/Decompose
Open Datasets | Yes | Figure 5 shows an experiment with Bagged regression trees, varying ensemble size (California Housing data). We compare Bagging MLPs on MNIST.
Dataset Splits | Yes | Each trial uses a 90% sub-sample of the full training data, as outlined in Figure 17 ("Visualisation of the sub-sampling scheme used for Bagging ensembles").
Hardware Specification | No | No hardware details (GPU or CPU models, etc.) are provided for the experiments.
Software Dependencies | No | No software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, scikit-learn) are mentioned in the paper.
Experiment Setup | Yes | The following configuration was used in all MLP experiments: learning rate 0.1 (stochastic gradient descent); num epochs 50 (MNIST) or 200 (other data sets); hidden layer size 20 (smaller data sets) or 100 (larger); number of trials 100 (a hypothetical reconstruction of this setup appears below).
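
Algorithm 1 in the paper describes collecting data from repeated Bagging trials so that bias, variance, and diversity terms can be estimated afterwards. The sketch below is a minimal illustration of that style of loop, not the authors' Decompose package: it assumes squared loss, bagged regression trees on California Housing, and a 90% sub-sample of the training data per trial as in Figure 17, and it checks the exact squared-loss ambiguity identity (ensemble error = average member error minus diversity). Trial count and ensemble size are illustrative.

```python
# Minimal sketch (not the authors' Decompose package) of an Algorithm-1-style
# data-collection loop for a Bagging ensemble under squared loss.
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X, y = fetch_california_housing(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

n_trials, m = 20, 10            # illustrative trial count and ensemble size
avg_err, ens_err, diversity = [], [], []

for t in range(n_trials):
    # Each trial sees a 90% sub-sample of the training data (cf. Figure 17).
    idx = rng.choice(len(X_tr), size=int(0.9 * len(X_tr)), replace=False)
    Xs, ys = X_tr[idx], y_tr[idx]
    preds = []
    for i in range(m):
        # Standard Bagging: bootstrap within the trial's sub-sample.
        b = rng.choice(len(Xs), size=len(Xs), replace=True)
        tree = DecisionTreeRegressor(random_state=i).fit(Xs[b], ys[b])
        preds.append(tree.predict(X_te))
    preds = np.stack(preds)               # shape (m, n_test)
    fbar = preds.mean(axis=0)             # uniformly weighted ensemble
    avg_err.append(((preds - y_te) ** 2).mean())     # average member error
    ens_err.append(((fbar - y_te) ** 2).mean())      # ensemble error
    diversity.append(((preds - fbar) ** 2).mean())   # ambiguity/diversity

# Squared-loss ambiguity identity: the two printed values should agree.
print(np.mean(ens_err), np.mean(avg_err) - np.mean(diversity))
```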
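
The quoted MLP configuration is specific enough to reconstruct an estimator. A hypothetical instantiation of the MNIST variant, assuming scikit-learn (the paper does not state which library was used): one hidden layer of 100 units (the "larger" variant), plain SGD with a constant learning rate of 0.1, and 50 epochs.

```python
from sklearn.neural_network import MLPClassifier

# Hypothetical reconstruction of the quoted MNIST setup; hyperparameters
# follow the table above, the library choice is an assumption.
mlp = MLPClassifier(
    hidden_layer_sizes=(100,),
    solver="sgd",
    learning_rate="constant",
    learning_rate_init=0.1,
    max_iter=50,        # max_iter counts epochs for the SGD solver
)
```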