MESA: Boost Ensemble Imbalanced Learning with MEta-SAmpler
Authors: Zhining Liu, Pengfei Wei, Jing Jiang, Wei Cao, Jiang Bian, Yi Chang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on both synthetic and real-world tasks demonstrate the effectiveness, robustness, and transferability of MESA. Our code is available at https://github.com/ZhiningLiu1998/mesa. To thoroughly assess the effectiveness of MESA, two series of experiments are conducted: one on controlled synthetic toy datasets for visualization and the other on real-world imbalanced datasets to validate MESA's performance in practical applications. |
| Researcher Affiliation | Collaboration | Zhining Liu (Jilin University), Pengfei Wei (National University of Singapore), Jing Jiang (University of Technology Sydney), Wei Cao (Microsoft Research), Jiang Bian (Microsoft Research), Yi Chang (Jilin University) |
| Pseudocode | Yes | Algorithm 1: Sample(Dτ; F, µ, σ); Algorithm 2: Ensemble training in MESA; Algorithm 3: Meta-training in MESA |
| Open Source Code | Yes | Our code is available at https://github.com/ZhiningLiu1998/mesa. |
| Open Datasets | Yes | We extend the experiments to real-world imbalanced classification tasks from the UCI repository [10] and KDD CUP 2004. For each dataset, we hold out a 20% validation set and report the result of 4-fold stratified cross-validation (i.e., a 60%/20%/20% training/validation/test split). |
| Dataset Splits | Yes | For each dataset, we hold out a 20% validation set and report the result of 4-fold stratified cross-validation (i.e., a 60%/20%/20% training/validation/test split). |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Setup Details. We build a series of imbalanced toy datasets corresponding to different levels of underlying class distribution overlap, as shown in Fig. 3. All the datasets have the same imbalance ratio (|N|/|P| = 2,000/200 = 10). In this experiment, MESA is compared with four representative EIL algorithms drawn from the 4 major EIL branches (Parallel/Iterative Ensemble + Under/Over-sampling), i.e., SMOTEBOOST [7], SMOTEBAGGING [42], RUSBOOST [35], and UNDERBAGGING [2]. All EIL methods are deployed with decision trees as base classifiers and an ensemble size of 5. |
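The split protocol quoted above (a held-out 20% validation set plus 4-fold stratified cross-validation on the remainder, preserving the 10:1 imbalance ratio) can be sketched as follows. This is an illustrative stdlib-only reconstruction, not the authors' implementation; the function name `stratified_kfold` and the round-robin fold assignment are my own choices.

```python
import random

def stratified_kfold(labels, k=4, seed=0):
    """Partition sample indices into k folds, preserving the class ratio.

    Each class's indices are shuffled, then dealt round-robin across the
    k folds, so every fold inherits (approximately) the global imbalance.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    return folds

# Toy labels with the paper's imbalance ratio: |N|/|P| = 2000/200 = 10
labels = [0] * 2000 + [1] * 200
folds = stratified_kfold(labels, k=4)
for fold in folds:
    neg = sum(labels[i] == 0 for i in fold)
    pos = sum(labels[i] == 1 for i in fold)
    print(neg, pos)  # each fold keeps the 10:1 ratio (500 neg, 50 pos)
```

Holding out one of these stratified 20% chunks as the final test set before running cross-validation on the rest yields the 60%/20%/20% training/validation/test arrangement the paper describes.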