Agnostic Pointwise-Competitive Selective Classification

Authors: Yair Wiener, Ran El-Yaniv

JAIR 2015

Reproducibility assessment. Each variable is listed with its result and the supporting quote extracted from the paper.
Research Type: Experimental
"We thus consider a heuristic approximation procedure that is based on SVMs, and show empirically that this algorithm consistently outperforms a traditional rejection mechanism based on distance from decision boundary. ... In Section 7 we present some numerical examples over medical classification problems and examine the empirical performance of the new algorithm and compare its performance with that of the widely used selective classification method for rejection, based on distance from decision boundary."
Researcher Affiliation: Academia
"Yair Wiener (EMAIL), Ran El-Yaniv (EMAIL), Computer Science Department, Technion, Israel Institute of Technology, Haifa 32000, Israel"
Pseudocode: Yes
"Strategy 1: Agnostic low-error selective strategy (LESS)
Input: S_m, m, δ, d
Output: a pointwise-competitive selective classifier (h, g) w.p. 1 − δ
1: Set f̂ = ERM(F, S_m), i.e., f̂ is any empirical risk minimizer from F w.r.t. S_m
2: Set G = V̂(f̂, 2σ(m, δ/4, d)) (see Eq. (2) and (3))
3: Construct g such that g(x) = 1 ⇔ x ∈ X \ DIS(G)"
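The quoted strategy can be illustrated concretely. Below is a minimal sketch of LESS for a toy finite class of 1-D threshold classifiers h_t(x) = 1[x ≥ t]: step 1 takes the empirical risk minimizer, step 2 collects the "low-error" set of hypotheses within 2σ of the minimum empirical risk (here the slack 2σ(m, δ/4, d) of Eq. (2)–(3) is stood in for by a fixed constant `slack`), and step 3 rejects exactly the points on which that set disagrees. All names are illustrative, not from the paper's code.

```python
import numpy as np

def less(X, y, thresholds, slack):
    """Toy LESS: return (erm_threshold, g), where g(x)=1 means predict, 0 means reject."""
    # Empirical risk of each threshold hypothesis h_t(x) = 1[x >= t]
    risks = np.array([np.mean((X >= t).astype(int) != y) for t in thresholds])
    erm = thresholds[np.argmin(risks)]                 # step 1: f_hat = ERM(F, S_m)
    G = thresholds[risks <= risks.min() + 2 * slack]   # step 2: low-error set (version space V_hat)
    def g(x):                                          # step 3: accept iff all of G agree on x
        preds = {int(x >= t) for t in G}
        return 1 if len(preds) == 1 else 0
    return erm, g
```

For example, with X = [0..5], y = [0,0,0,1,1,1] and slack = 0.1, the low-error set contains the thresholds {2, 3, 4}, so a query at 2.5 (where these disagree) is rejected while queries at 0 or 5 are accepted.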
Open Source Code: No
"For implementation we used LIBSVM (Chang & Lin, 2011). We tested our algorithm on standard medical diagnosis problems from the UCI repository, including all datasets used by Grandvalet, Rakotomamonjy, Keshet, and Canu (2008). ... Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm"
Open Datasets: Yes
"We tested our algorithm on standard medical diagnosis problems from the UCI repository, including all datasets used by Grandvalet, Rakotomamonjy, Keshet, and Canu (2008)."
Dataset Splits: Yes
"In each iteration we choose uniformly at random non-overlapping training set (100 samples) and test set (200 samples) for each dataset."
Hardware Specification: No
The paper does not provide specific hardware details (such as CPU/GPU models or memory) used for running the experiments; it only discusses the software and datasets used.
Software Dependencies: No
"For implementation we used LIBSVM (Chang & Lin, 2011)."
Experiment Setup: Yes
"Using support vector machines (SVMs) we use a high C value (10^5 in our experiments) to penalize more on training errors than on small margin (see definitions of the SVM parameters in, e.g., Chang & Lin, 2011). In order to estimate R̂(f_x) we have to restrict the SVM optimizer to only consider hypotheses that classify the point x in a specific way. To accomplish this we use a weighted SVM for unbalanced data. We add the point x as another training point with weight 10 times larger than the weight of all training points combined. ... First we generate an odd number k of different samples (S_m^1, S_m^2, ..., S_m^k) using bootstrap sampling (we used k = 11)."
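The constrained-ERM heuristic quoted above can be sketched as follows. This is not the authors' implementation (they used LIBSVM directly); it is a minimal illustration using scikit-learn's SVC, whose `sample_weight` argument plays the role of the weighted SVM for unbalanced data. The query point x is appended to the training set with weight 10 times the combined weight of all training points to force its label, the resulting empirical risk R̂(f_x) is measured on the original sample, and a point is accepted only when one of the two forced labels cannot be realized by a low-error hypothesis. The function names and the `slack` threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def constrained_risk(X, y, x, forced_label, C=1e5):
    """Empirical risk of the best SVM forced to assign `forced_label` to point x."""
    Xa = np.vstack([X, x])              # append x as an extra training point
    ya = np.append(y, forced_label)
    w = np.ones(len(ya))
    w[-1] = 10 * len(X)                 # weight 10x all training points combined
    clf = SVC(C=C, kernel="linear").fit(Xa, ya, sample_weight=w)
    return np.mean(clf.predict(X) != y)  # risk measured on the original sample only

def accept(X, y, x, slack, C=1e5):
    """g(x)=1 iff only one forced label admits a hypothesis within `slack` of the ERM risk."""
    base = np.mean(SVC(C=C, kernel="linear").fit(X, y).predict(X) != y)
    r0 = constrained_risk(X, y, x, 0, C)
    r1 = constrained_risk(X, y, x, 1, C)
    # x lies outside the disagreement region iff forcing one of the labels
    # necessarily pushes empirical risk above the low-error band
    return (r0 > base + slack) or (r1 > base + slack)
```

On two well-separated clusters, a query far beyond the positive cluster is accepted (forcing the negative label ruins the fit), while a query midway between the clusters is rejected, since low-error hypotheses can label it either way.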