Bridging performance gap between minimal and maximal SVM models
Authors: Ondrej Šuch, René Fabricius
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments to uncover metrical and topological properties that impact the accuracy of a multi-class SVM model. Based on their results we propose a way to construct intermediate multi-class SVM models. In our work we use convolutional data sets, which have multiple advantages for benchmarking multi-class SVM models. |
| Researcher Affiliation | Academia | Ondrej Šuch EMAIL Mathematical institute of Slovak Academy of Sciences Ďumbierska 1 Banská Bystrica, 974 01, Slovakia René Fabricius EMAIL Faculty of Management and Informatics Žilinská Univerzita v Žiline Univerzitná 8215/1 Žilina, 010 26, Slovakia |
| Pseudocode | Yes | Algorithm 1 Incremental multi-class SVM model creation — 1: procedure Grow_SVM(T, V) 2: # Argument T is a training dataset with N classes 3: # Argument V is a validation dataset (possibly V ⊆ T) 4: 5: Choose a spanning tree with star topology at random. Denote by E the set of its edges. 6: for e := (i, j) ∈ E do 7: Train the probabilistic pairwise SVM model on S distinguishing classes i and j 8: Set step ← 1 9: Set F to be the complement of E in the set of edges of the complete graph on N vertices. 10: while F is nonempty do 11: yield(tuple(step = step, model graph = E)) 12: Compute confusion matrix X on V using (7) to fill missing pairwise odds 13: Set Y = X + Xᵀ 14: Choose an edge f = (i, j) in F such that Y_ij = max_{(m,n)∈F} Y_mn 15: Train the probabilistic pairwise SVM model on S distinguishing classes i and j 16: Set F ← F \ {f} 17: Set E ← E ∪ {f} |
| Open Source Code | Yes | The source repository can be found on https://github.com/ondrej-such/svm3. |
| Open Datasets | Yes | An overview of the datasets used throughout the paper is summarized in Table 1. The number of classes was selected to balance the requirements of being able to run multiple experiments to get error bounds and to represent a large-class SVM problem while being within our computational capacity. In sections 3 and 4 we have used three datasets each having ten classes: CIFAR-10 (Krizhevsky et al., 2009), Imagenette, and Imagewoof (Howard, 2022), the latter two being subsets of the well-known Imagenet 2012 dataset (Russakovsky et al., 2015). |
| Dataset Splits | Yes | Table 1: Summary of datasets used in the experiments. Train samples for SVM include augmented data samples. Columns: dataset name / classes / # samples when training a neural network (per class) / # samples when training SVM (per class) / # samples in the testing dataset (per class). CIFAR-10: 10 / 6000 / 10000 / 1000. Imagenette: 10 / 1000 / 10000 / 50. Imagewoof: 10 / 1000 / 10000 / 50. Imagenet-50: 50 / 1000 / 10000 / 50. Each training matrix had 10000 entries per class, which included all training samples and also augmented data samples. We set aside 200 extra samples for each class from the training dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory configurations used for running experiments. |
| Software Dependencies | Yes | For computations of SVM, we use R package e1071 (R Core Team, 2021; Meyer et al., 2020). For data processing, we use tidyverse (Wickham et al., 2019) and for visualizations package ggplot2 (Wickham, 2016). Meyer et al., 2020 is listed as "e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien, 2020. URL https://CRAN.R-project.org/package=e1071. R package version 1.7-4." |
| Experiment Setup | Yes | For CIFAR-10 we have used a custom architecture by David C. Page, which can be quickly trained to 94% accuracy (Page, 2019). For Imagenette and Imagewoof datasets, we used the Resnet-18 network (He et al., 2016). This architecture was designed to solve the Imagenet classification problem, which has 1000 classes. We have trained 20 instances of each architecture on CIFAR-10 and Imagenet respectively. We used default parameters for the svm function from the e1071 library. In particular, data were scaled, and we used default kernel and cost parameters. |
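The Grow_SVM procedure in the Pseudocode row can be sketched as follows. This is a minimal illustration, not the authors' implementation: the training and confusion-matrix routines are passed in as stand-in callables (the paper's equation (7) for filling missing pairwise odds is replaced by an opaque `confusion_on_validation` argument), and the step counter is assumed to increment once per added edge.

```python
# Sketch of Algorithm 1 (Grow_SVM): start from a random star spanning tree,
# then greedily add the remaining edge whose class pair shows the most
# mutual confusion on the validation set.
import itertools
import random

def grow_svm(train_pairwise, confusion_on_validation, n_classes):
    """Yield (step, edge_set) snapshots of the growing model graph.

    train_pairwise(i, j)       -- trains a probabilistic pairwise SVM (stub here)
    confusion_on_validation(E) -- returns an n x n confusion matrix X for the
                                  model with edge set E (stand-in for eq. (7))
    """
    center = random.randrange(n_classes)
    # Spanning tree with star topology rooted at a random class
    E = {tuple(sorted((center, k))) for k in range(n_classes) if k != center}
    for (i, j) in E:
        train_pairwise(i, j)
    all_edges = set(itertools.combinations(range(n_classes), 2))
    F = all_edges - E          # edges of the complete graph not yet trained
    step = 1
    while F:
        yield step, frozenset(E)
        X = confusion_on_validation(E)
        # Symmetrize: Y = X + X^T, so Y[i][j] counts confusion in both directions
        Y = [[X[m][n] + X[n][m] for n in range(n_classes)]
             for m in range(n_classes)]
        # Choose the untrained edge (i, j) maximizing Y[i][j]
        f = max(F, key=lambda e: Y[e[0]][e[1]])
        train_pairwise(*f)
        F.discard(f)
        E.add(f)
        step += 1
    yield step, frozenset(E)
```

For N classes the star tree contributes N - 1 pairwise models and the loop adds the remaining (N - 1)(N - 2)/2, ending with the full N(N - 1)/2 one-vs-one ensemble.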
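The setup row states that the svm function from e1071 was used with defaults (scaled data, default kernel and cost). As an illustrative analogue only, not the authors' code: e1071's documented defaults are an RBF kernel with cost C = 1 and gamma = 1/n_features, which roughly corresponds to the following scikit-learn pipeline (`gamma="auto"` is sklearn's 1/n_features setting, and `probability=True` plays the role of the probabilistic pairwise outputs).

```python
# Illustrative scikit-learn analogue of the e1071 svm() defaults described
# in the paper; dataset and parameter mapping are assumptions.
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
clf = make_pipeline(
    StandardScaler(),                  # e1071 scales data by default
    SVC(C=1.0, kernel="rbf", gamma="auto",  # e1071 defaults: cost=1, radial kernel
        probability=True, random_state=0),  # Platt-scaled pairwise probabilities
)
clf.fit(X, y)
probs = clf.predict_proba(X[:5])       # per-class probabilities for 5 samples
```

Note that sklearn's Platt scaling and e1071's probability fitting differ in cross-validation details, so calibrated odds will not match exactly across the two libraries.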