A Unified View on Multi-class Support Vector Classification

Authors: Ürün Doğan, Tobias Glasmachers, Christian Igel

JMLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compared the performance of nine different multi-class SVMs in a thorough empirical study. Our results suggest using the Weston & Watkins SVM, which can be trained comparatively fast and gives good accuracies on benchmark data sets. If training time is a major concern, the one-vs-all approach is the method of choice.
Researcher Affiliation | Collaboration | Ürün Doğan (Microsoft Research), Tobias Glasmachers (Institut für Neuroinformatik, Ruhr-Universität Bochum, Germany), Christian Igel (Department of Computer Science, University of Copenhagen, Denmark)
Pseudocode | No | The paper includes mathematical formulations for SVMs and dual problems, such as Equation (3) for the regularized empirical risk and Equation (5) for the dual problem, but it does not present any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code | Yes | All algorithms were implemented in the Shark open source machine learning library (Igel et al., 2008). Source code for reproducing the results can be found in the supplementary material.
Open Datasets | Yes | First, twelve standard benchmark data sets were considered for non-linear SVM learning, and careful model selection was conducted. This already makes our experiments the most extensive comparison of multi-class SVMs so far. Second, additional experiments for linear SVMs are presented later in this section. [...] These data sets are available from the libsvm data collection.
Dataset Splits | Yes | Repeated cross-validation was employed as model selection criterion. Five-fold cross-validation was repeated ten times using ten independent random splits into five folds. This stabilizes the model selection procedure, especially for small data sets. [...] Using the best parameters found during model selection, 100 machines were trained on 100 random splits into training and test data (preserving the original set sizes).
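The repeated cross-validation protocol quoted above (five folds, ten independent random splits) can be sketched as follows. This is a minimal illustration, not code from the paper or from Shark; the function name and seeding are assumptions.

```python
import random

def repeated_kfold(n_samples, n_folds=5, n_repeats=10, seed=0):
    """Yield (train, test) index lists for repeated k-fold cross-validation.

    Mirrors the described protocol: five-fold cross-validation repeated
    ten times on independent random splits.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        rng.shuffle(indices)  # a fresh random split for each repeat
        base, rem = divmod(n_samples, n_folds)
        folds, start = [], 0
        for i in range(n_folds):
            # Distribute any remainder over the first `rem` folds.
            end = start + base + (1 if i < rem else 0)
            folds.append(indices[start:end])
            start = end
        for i in range(n_folds):
            test = folds[i]
            train = [x for j, f in enumerate(folds) if j != i for x in f]
            yield train, test

splits = list(repeated_kfold(150))  # 10 repeats x 5 folds = 50 splits
```

Each of the 50 (train, test) pairs partitions the sample indices, so a model-selection score can be averaged over all of them before picking hyperparameters.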
Hardware Specification | No | The paper mentions that training time was measured 'on a single core' and discusses linear vs. non-linear machines, but it does not specify any particular CPU or GPU models, or other hardware components used for running the experiments.
Software Dependencies | No | The paper states: 'All algorithms were implemented in the Shark open source machine learning library (Igel et al., 2008).' While it names a software library, it does not provide a specific version number for Shark or any other key software components used in the implementation.
Experiment Setup | Yes | The bandwidth γ of the Gaussian kernel and the regularization parameter C of the machine were determined by nested grid search. [...] We set the initial grid to γ ∈ {2^(−12+3i) | i = 0, 1, ..., 4} and C ∈ {2^(3i) | i = 0, 1, ..., 4}. Let (γ0, C0) denote the parameter configuration picked in the first stage. Then in the second stage the parameters were further refined on the grid γ ∈ {2^i γ0 | i = −2, −1, 0, 1, 2} and C ∈ {2^i C0 | i = −2, −1, 0, 1, 2}.
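The two-stage grid refinement described above can be sketched as follows. This is an illustrative reconstruction, not the paper's code: `evaluate(gamma, C)` stands in for a user-supplied function that returns a cross-validation score (assumed higher-is-better), and the function name is an assumption.

```python
def two_stage_grid_search(evaluate):
    """Two-stage nested grid search over the Gaussian-kernel bandwidth
    gamma and the regularization parameter C, using the grids quoted above.
    """
    # Stage 1: coarse grid, gamma in {2^(-12+3i)}, C in {2^(3j)}, i, j = 0..4.
    coarse = [(2.0 ** (-12 + 3 * i), 2.0 ** (3 * j))
              for i in range(5) for j in range(5)]
    gamma0, c0 = max(coarse, key=lambda gc: evaluate(*gc))
    # Stage 2: refine around (gamma0, C0) by factors 2^i, i = -2..2.
    fine = [(gamma0 * 2.0 ** i, c0 * 2.0 ** j)
            for i in range(-2, 3) for j in range(-2, 3)]
    return max(fine, key=lambda gc: evaluate(*gc))
```

The coarse stage evaluates 25 configurations and the refinement another 25, so each model-selection run costs 50 cross-validation evaluations per machine.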