A Unified Theory of Diversity in Ensemble Learning
Authors: Danny Wood, Tingting Mu, Andrew M. Webb, Henry W. J. Reeve, Mikel Luján, Gavin Brown
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present illustrative experiments, estimating bias/variance/diversity terms from data. Figure 5 shows an experiment with Bagged regression trees, varying ensemble size. |
| Researcher Affiliation | Academia | Department of Computer Science, University of Manchester, UK; School of Mathematics, University of Bristol, UK |
| Pseudocode | Yes | Algorithm 1: Algorithm for collecting data, later used to estimate diversity of a Bagging ensemble, whilst varying ensemble size m. |
| Open Source Code | Yes | Code for all experiments at: https://github.com/EchoStatements/Decompose |
| Open Datasets | Yes | Figure 5 shows an experiment with Bagged regression trees, varying ensemble size (California Housing data). We compare Bagging MLPs on MNIST. |
| Dataset Splits | Yes | each trial uses a 90% sub-sample of the full training data, as outlined in Figure 17. Figure 17: Visualisation of the sub-sampling scheme used for Bagging ensembles. |
| Hardware Specification | No | No specific hardware details (GPU models, CPU models, etc.) are provided in the paper for the experiments conducted. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, scikit-learn versions) are mentioned in the paper. |
| Experiment Setup | Yes | The following configuration was used in all MLP experiments: learning rate 0.1 (stochastic gradient descent); epochs: 50 (MNIST), 200 (other data sets); hidden layer size: 20 (small data sets) / 100 (larger data sets); number of trials: 100 |
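
The data-collection scheme described in the Pseudocode and Dataset Splits rows (Algorithm 1 / Figure 17: per trial, members are trained on bootstrap samples drawn from a 90% sub-sample of the training data, and bias/variance/diversity terms are estimated from the collected predictions) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the data here is a synthetic stand-in for the California Housing set used in Figure 5, and all names are hypothetical. For squared loss, the ambiguity decomposition gives: average member error = ensemble error + diversity.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the regression data used in Figure 5.
X = rng.normal(size=(2000, 8))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=2000)
X_train, y_train = X[:1500], y[:1500]
X_test, y_test = X[1500:], y[1500:]

def one_trial(m, rng):
    """One trial of a Bagging ensemble of m regression trees.

    Per the scheme in Figure 17, each trial uses a 90% sub-sample of
    the full training data; each member is then trained on a bootstrap
    sample drawn from that sub-sample.
    """
    idx = rng.choice(len(X_train), size=int(0.9 * len(X_train)), replace=False)
    Xs, ys = X_train[idx], y_train[idx]
    preds = []
    for _ in range(m):
        boot = rng.choice(len(Xs), size=len(Xs), replace=True)  # bootstrap resample
        tree = DecisionTreeRegressor(max_depth=5).fit(Xs[boot], ys[boot])
        preds.append(tree.predict(X_test))
    preds = np.stack(preds)           # shape (m, n_test)
    f_bar = preds.mean(axis=0)        # averaged (ensemble) prediction

    avg_member_err = ((preds - y_test) ** 2).mean()   # mean_i E[(f_i - y)^2]
    ensemble_err = ((f_bar - y_test) ** 2).mean()     # E[(f_bar - y)^2]
    diversity = ((preds - f_bar) ** 2).mean()         # mean_i E[(f_i - f_bar)^2]
    return avg_member_err, ensemble_err, diversity

avg_err, ens_err, div = one_trial(m=10, rng=rng)
# Ambiguity decomposition for squared loss (holds exactly, pointwise):
assert np.isclose(avg_err, ens_err + div)
```

Varying `m` across trials and averaging the three terms reproduces the kind of curves shown in the paper's Figure 5 (ensemble error falling as diversity absorbs member variance).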
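
The reported MLP configuration can be written out concretely, with the caveat that the paper does not name its software framework; scikit-learn is assumed here purely for illustration, and unreported settings (e.g. momentum) are flagged in comments.

```python
from sklearn.neural_network import MLPClassifier

# Hedged sketch of the reported MLP setup; framework choice is an
# assumption, since the paper gives no software dependencies.
mlp_mnist = MLPClassifier(
    hidden_layer_sizes=(100,),  # 100 units for larger data sets (20 for small ones)
    solver="sgd",               # stochastic gradient descent, as reported
    learning_rate_init=0.1,     # reported learning rate
    max_iter=50,                # 50 epochs for MNIST (200 for other data sets)
    momentum=0.0,               # momentum is not reported; disabled here
)
```

Under the paper's protocol this base learner would be wrapped in a Bagging ensemble and run for 100 trials per setting.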