Adaptive Sampling for Large Scale Boosting
Authors: Charles Dubout, François Fleuret
JMLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments in image classification and object recognition on four standard computer vision data sets show that the adaptive methods we propose outperform basic sampling and state-of-the-art bandit methods. |
| Researcher Affiliation | Academia | Computer Vision and Learning Group Idiap Research Institute CH-1920 Martigny, Switzerland |
| Pseudocode | Yes | Algorithm 1 The Tasting 1.Q algorithm first samples uniformly R features from every feature subset Fk. It uses these features at every boosting step to find the optimal feature subset k from which to sample. After the selection of the Q features, the algorithm continues like AdaBoost. Algorithm 2 The Tasting Q.1 algorithm first samples uniformly R features from every feature subset Fk. It uses them to find the optimal subset kq for every one of the Q features to sample at every boosting step. After the selection of the Q features, the algorithm continues like AdaBoost. Algorithm 3 The M.A.S. naive algorithm models the current edge distribution with a Gaussian mixture model fitted on the edges estimated at the previous iteration. It uses this density model to compute the pair (Q, S) maximizing the expectation of the true edge of the selected learner E[ϵ], and then samples the corresponding number of weak learners and training examples, before keeping the weak learner with the highest approximated edge. After the selection of the Q features, the algorithm continues like AdaBoost. Algorithm 4 The Laminating algorithm starts by sampling Q weak learners and S examples at the beginning of every boosting iteration, and refines those by successively halving the number of learners and doubling the number of examples until only one learner remains. After the selection of the Q features, the algorithm continues like AdaBoost. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it provide any links to a code repository. The conclusion section discusses future extensions but not code availability. |
| Open Datasets | Yes | The first data set that we used is the MNIST handwritten digits database (LeCun et al., 1998). ... The second data set that we used is the INRIA Person data set (Dalal and Triggs, 2005). ... The third data set that we used is Caltech 101 (Fei-Fei et al., 2004) ... The fourth and last data set that we used is CIFAR-10 (Krizhevsky, 2009). |
| Dataset Splits | Yes | The first data set that we used is the MNIST handwritten digits database (LeCun et al., 1998). It is composed of 10 classes and its training and testing sets consist respectively of 60,000 and 10,000 grayscale images of resolution 28×28 pixels... The second data set that we used is the INRIA Person data set (Dalal and Triggs, 2005). It is composed of a training and a testing set respectively of 2,418 and 1,126 color images... The third data set that we used is Caltech 101 (Fei-Fei et al., 2004)... We sampled 15 training examples and 20 distinct test examples from every class, as advised on the data set website. The fourth and last data set that we used is CIFAR-10 (Krizhevsky, 2009)... its training and testing sets consist respectively of 50,000 and 10,000 color images... |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments. It only discusses the experimental setup in terms of algorithms and datasets. |
| Software Dependencies | No | The paper mentions using "Ada Boost.MH algorithm (Schapire and Singer, 1999) with decision stumps as weak learners" and several bandit algorithms (UCB, Exp3.P, ϵ-greedy), but it does not specify any software libraries or their version numbers for implementation. |
| Experiment Setup | Yes | We used the Ada Boost.MH algorithm (Schapire and Singer, 1999) with decision stumps as weak learners to be able to use all methods in the same conditions. ... We set the maximum cost of all the algorithms to 10N, setting Q = 10 and S = N for the baselines, as this configuration leads to the best results after 10,000 boosting rounds. ... We set the values of the parameters of Exp3.P to η = 0.3 and λ = 0.15 as recommended in (Busa-Fekete and Kégl, 2010). |
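The Tasting 1.Q caption quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `tasting_1q`, the dictionary of feature subsets, and the `edge(f)` callback (standing in for an edge estimate under the current boosting weights) are all assumptions made for the sketch.

```python
import random

def tasting_1q(feature_subsets, r, q, edge, rounds):
    """Sketch of the Tasting 1.Q scheme.

    Once at the start, taste r features uniformly from every subset.
    At each boosting round, the tasted features are scored to pick the
    most promising subset, and the q features for that round are then
    sampled from it; the best of those q is kept, as AdaBoost would.
    """
    # One-time uniform "tasting" of r features per subset.
    tasted = {k: random.sample(fs, min(r, len(fs)))
              for k, fs in feature_subsets.items()}
    chosen = []
    for _ in range(rounds):
        # Pick the subset whose tasted features currently look best.
        best_k = max(tasted, key=lambda k: max(edge(f) for f in tasted[k]))
        pool = feature_subsets[best_k]
        candidates = random.sample(pool, min(q, len(pool)))
        # Keep the strongest of the q sampled features.
        chosen.append(max(candidates, key=edge))
    return chosen
```

With a toy `edge` that just returns the feature value, the scheme reliably concentrates its sampling on whichever subset contains the high-edge features.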