Active Learning for Cost-Sensitive Classification

Authors: Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang, Hal Daumé III, John Langford

JMLR 2019

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | We empirically compare COAL to passive learning and several active learning baselines, showing significant improvements in labeling effort and test cost on real-world datasets. Keywords: Active Learning, Cost-sensitive Learning, Structured Prediction, Statistical Learning Theory, Oracle-based Algorithms. ... Experimentally, we show that COAL substantially outperforms the passive learning baseline with orders of magnitude savings in the labeling effort on a number of hierarchical classification datasets (see Figure 1 for a comparison between passive learning and COAL on Reuters text categorization). ... We now turn to an empirical evaluation of COAL. For further computational efficiency, we implemented an approximate version of COAL using: 1) a relaxed version space G_i(y) ← {g ∈ G : R̂_i(g; y) ≤ R̂_i(g_{i,y}; y) + Δ_i}, which does not enforce monotonicity, and 2) online optimization, based on online linear least-squares regression. The algorithm processes the data in one pass, and the idea is to (1) replace g_{i,y}, the ERM, with an approximation g°_{i,y} obtained by online updates, and (2) compute the minimum and maximum costs via a sensitivity analysis of the online update. We describe this algorithm in detail in Subsection 7.1. Then, we present our experimental results, first for simulated active learning (Subsection 7.2) and then for learning to search for joint prediction (Subsection 7.3).
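The relaxed version-space construction quoted above can be illustrated with a minimal sketch. This is not the Vowpal Wabbit implementation: the finite regressor class, the `delta` value, and all names here are illustrative assumptions, and the actual algorithm uses online updates with a sensitivity analysis rather than explicit enumeration.

```python
import numpy as np

def in_relaxed_version_space(risks, delta):
    """Relaxed version space (monotonicity not enforced): keep every
    regressor whose empirical risk is within delta of the best risk.
    `risks` holds empirical squared-loss risks, one per candidate
    regressor in a toy finite class (hypothetical setup)."""
    return risks <= risks.min() + delta

# Toy example: four candidate regressors predict the cost of one label.
preds = np.array([0.1, 0.3, 0.5, 0.9])  # predicted costs, one per regressor
observed = 0.2                           # realized cost for this label
risks = (preds - observed) ** 2          # empirical squared-loss risk
mask = in_relaxed_version_space(risks, delta=0.05)

# The spread of predicted costs over surviving regressors drives the
# query rule: a label is informative when this max-min gap is large.
spread = preds[mask].max() - preds[mask].min()
```

With these toy numbers, only the two low-risk regressors survive the threshold, and the query decision would be based on their ≈0.2 cost spread.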
Researcher Affiliation | Collaboration | Akshay Krishnamurthy EMAIL Microsoft Research New York, NY 10011 ... Alekh Agarwal EMAIL Microsoft Research Redmond, WA 98052 ... Tzu-Kuo Huang EMAIL Uber Advanced Technology Center Pittsburgh, PA 15201 ... Hal Daumé III EMAIL Microsoft Research New York, NY 10011 ... John Langford EMAIL Microsoft Research New York, NY 10011
Pseudocode | Yes | Algorithm 1 Cost Overlapped Active Learning (COAL) ... Algorithm 2 Max Cost
Open Source Code | Yes | Our code is publicly available as part of the Vowpal Wabbit machine learning library.3 ... 3. http://hunch.net/~vw
Open Datasets | Yes | We performed simulated active learning experiments with three datasets. ImageNet 20 and 40 are sub-trees of the ImageNet hierarchy covering the 20 and 40 most frequent classes... The third, RCV1-v2 (Lewis et al., 2004), is a multilabel text-categorization dataset... 3. http://hunch.net/~vw ... RCV1-v2 (Lewis et al., 2004). Data available at http://www.jmlr.org/papers/volume5/lewis04a/lyrl2004_rcv1v2_README.htm.
Dataset Splits | No | The paper mentions: "We randomly permute the training data 100 times and make one pass through the training set with each parameter setting." This describes how the training data was processed, but it does not specify how the original datasets were split into training, validation, and test sets, nor does it provide absolute counts or percentages for these splits.
Hardware Specification | No | The paper does not explicitly describe the hardware used for running the experiments. It discusses experimental procedures and mentions using Vowpal Wabbit but provides no details on specific CPU, GPU, or other computational resources.
Software Dependencies | No | Our code is publicly available as part of the Vowpal Wabbit machine learning library.3 ... We use the cost-sensitive one-against-all (csoaa) implementation in Vowpal Wabbit... The paper mentions "Vowpal Wabbit" but does not specify a version number for this or any other software dependency.
Experiment Setup | Yes | There are two tuning parameters in our implementation. First, instead of Δ_i, we set the radius of the version space to Δ_i = κ·ν_{i−1}/(i−1) (i.e., the log(n) term in the definition of ν_n is replaced with log(i)), and we instead tune the constant κ. This alternate mellowness parameter controls how aggressive the query strategy is. The second parameter is the learning rate used by online linear regression. For all experiments, we show the results obtained by the best learning rate for each mellowness on each dataset, which is tuned as follows. We randomly permute the training data 100 times and make one pass through the training set with each parameter setting. ... The best learning rates for different datasets and mellowness settings are in Table 2. ... We choose the mellowness by visual inspection for the baselines and use 0.01 for COAL.
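The tuning protocol quoted in this row (randomly permute the training data, make one pass per parameter setting, report the best learning rate for each mellowness) can be sketched as follows. `run_once`, the grids, and the loss are hypothetical placeholders, not the paper's actual values or implementation.

```python
import random

def tune(data, mellowness_grid, lr_grid, run_once, n_perms=100, seed=0):
    """For each (mellowness, learning-rate) pair, randomly permute the
    training data and make one pass, averaging the loss over permutations;
    report the best learning rate per mellowness setting (sketch only)."""
    rng = random.Random(seed)
    best = {}
    for kappa in mellowness_grid:
        avg_loss = {}
        for lr in lr_grid:
            total = 0.0
            for _ in range(n_perms):
                perm = list(data)
                rng.shuffle(perm)
                # run_once: hypothetical callable doing one online pass
                # over one permutation and returning a loss (lower = better)
                total += run_once(perm, kappa, lr)
            avg_loss[lr] = total / n_perms
        best[kappa] = min(avg_loss, key=avg_loss.get)
    return best
```

For example, with a toy loss minimized at a learning rate of 0.5, `tune` returns 0.5 as the best learning rate for every mellowness in the grid.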