Approximation Vector Machines for Large-scale Online Learning
Authors: Trung Le, Tu Dinh Nguyen, Vu Nguyen, Dinh Phung
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted extensive experiments for classification and regression tasks in batch and online modes using several benchmark datasets. The quantitative results show that our proposed AVM obtained predictive performance comparable with current state-of-the-art methods while simultaneously achieving a significant computational speed-up, owing to the ability of the proposed AVM to maintain the model size. |
| Researcher Affiliation | Academia | Trung Le, Tu Dinh Nguyen, Vu Nguyen, Dinh Phung. Centre for Pattern Recognition and Data Analytics, School of Information Technology, Deakin University, Australia, Waurn Ponds Campus |
| Pseudocode | Yes | Algorithm 1: Stochastic Gradient Descent algorithm. Algorithm 2: Approximation Vector Machine. Algorithm 3: Constructing hypersphere δ-coverage. Algorithm 4: Constructing hyperrectangle δ-coverage. Algorithm 5: Multiclass Approximation Vector Machine. |
| Open Source Code | Yes | The source code and experimental scripts are published for reproducibility at https://github.com/tund/avm. |
| Open Datasets | Yes | Except for airlines, all of the datasets can be downloaded from the LIBSVM and UCI websites. The airlines dataset is provided by the American Statistical Association (ASA) and can be downloaded from http://stat-computing.org/dataexpo/2009/. |
| Dataset Splits | Yes | In batch classification experiments, we follow the original divisions of training and testing sets in LIBSVM and UCI sites wherever available. For KDDCup99, covtype and airlines datasets, we split the data into 90% for training and 10% for testing. In online classification and regression tasks, we either use the entire datasets or concatenate training and testing parts into one. The online learning algorithms are then trained in a single pass through the data. In both batch and online settings, for each dataset, the models perform 10 runs on different random permutations of the training data samples. |
| Hardware Specification | Yes | All experiments are conducted using a Windows machine with 3.46GHz Xeon processor and 96GB RAM. |
| Software Dependencies | No | Our models are implemented in Python with the NumPy package. Their C++ implementations with MATLAB interfaces are published as part of the LIBSVM, Budgeted SVM, and LSOKL toolboxes. |
| Experiment Setup | Yes | The hyperparameters are varied over certain ranges and selected for the best performance on the validation set. The ranges are: C ∈ {2^−5, 2^−3, ..., 2^15}, λ ∈ {2^−4/N, 2^−2/N, ..., 2^16/N}, γ ∈ {2^−8, 2^−4, 2^−2, 2^0, 2^2, 2^4, 2^8}, and η ∈ {16.0, 8.0, 4.0, 2.0, 0.2, 0.02, 0.002, 0.0002}, where N is the number of data points. The coverage diameter δ of AVM is selected following the approach described in Section 9.2. For the budget size B in the NOGD and Pegasos algorithms, and the feature dimension D in FOGD, we use for each dataset the same values as in Section 7.1.1 of Lu et al. (2015). We use the RBF kernel, i.e., K(x, x') = exp(−γ‖x − x'‖²), for all algorithms including ours. For a fair comparison, these hyperparameters are specified using cross-validation on the training subset. In particular, we further partition the training set into 80% for learning and 20% for validation. For large-scale databases, we use only 1% of the training set so that the search can finish within an acceptable time budget. |
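The hyperparameter grids and the 80/20 learning/validation split described above can be sketched as follows. This is a minimal illustration in Python/NumPy (the paper's implementation language), not the authors' actual script; the dataset size `N` and the random seed are hypothetical, and `rbf_kernel` simply evaluates the stated formula K(x, x') = exp(−γ‖x − x'‖²).

```python
import numpy as np

# Hypothetical dataset size for illustration; the paper sets N to the
# number of data points in each benchmark dataset.
N = 10000

# Hyperparameter grids as reported in the experiment setup.
C_grid = [2.0**k for k in range(-5, 16, 2)]           # {2^-5, 2^-3, ..., 2^15}
lambda_grid = [2.0**k / N for k in range(-4, 17, 2)]  # {2^-4/N, ..., 2^16/N}
gamma_grid = [2.0**k for k in (-8, -4, -2, 0, 2, 4, 8)]
eta_grid = [16.0, 8.0, 4.0, 2.0, 0.2, 0.02, 0.002, 0.0002]

def rbf_kernel(x, x_prime, gamma):
    """RBF kernel K(x, x') = exp(-gamma * ||x - x'||^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(x_prime, dtype=float)
    return np.exp(-gamma * diff.dot(diff))

# 80% learning / 20% validation split over a random permutation of the
# training indices, as described in the setup.
rng = np.random.default_rng(0)  # seed is arbitrary here
idx = rng.permutation(N)
n_learn = int(0.8 * N)
learn_idx, valid_idx = idx[:n_learn], idx[n_learn:]
```

Cross-validation would then loop over the Cartesian product of these grids, train on `learn_idx`, and keep the configuration that scores best on `valid_idx`.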