Fast Rates in Pool-Based Batch Active Learning

Authors: Claudio Gentile, Zhilei Wang, Tong Zhang

JMLR 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our work is theoretical in nature. We aim to understand the statistical sample efficiency of pool-based batch active learning, although some of the methods proposed in the paper do not pay specific attention to the computational aspects of the procedures involved. Specifically, whereas our methods are computationally efficient in the linear function case, they need not be in the general nonlinear case. |
| Researcher Affiliation | Collaboration | Claudio Gentile (Google Research, New York City, NY, USA); Zhilei Wang (WorldQuant LLC, New York City, NY, USA); Tong Zhang (University of Illinois Urbana-Champaign, IL, USA) |
| Pseudocode | Yes | Algorithm 1: pool-based batch active learning algorithm for linear models. Algorithm 2: pool-based batch active learning algorithm for general non-linear models. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor does it include a link to a code repository. The authors state their work is "theoretical in nature". |
| Open Datasets | No | The paper refers to a generic pool P of T unlabeled instances x_1, ..., x_T ∈ X, drawn i.i.d. according to a marginal distribution D_X, rather than a specific, named public dataset with access information. |
| Dataset Splits | No | The paper describes a theoretical framework and does not include any experimental results that would require dataset splits (e.g., train/test/validation percentages or counts). |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require hardware specifications. |
| Software Dependencies | No | The paper is theoretical and focuses on algorithms and analysis rather than implementation, so it does not list specific software dependencies or versions. |
| Experiment Setup | No | The paper is theoretical and does not present experimental results, so no experimental setup details such as hyperparameters or training configurations are provided. |
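The paper's Algorithms 1 and 2 are not reproduced here. As an illustration only, the following is a minimal sketch of the generic pool-based batch active learning loop in the linear-model case: the learner holds a pool of unlabeled points, repeatedly queries labels for a batch of the most uncertain points (smallest margin under the current model), and retrains once per batch. The margin-based selection rule and the perceptron trainer are stand-ins for the paper's actual selection criterion and estimator; all function names and parameters are our own.

```python
import random

def train_perceptron(X, y, epochs=100):
    """Fit a linear separator w (last coordinate is the bias) by perceptron updates."""
    d = len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi + [1.0]))
            if yi * score <= 0:  # mistake: move w toward yi * xi
                for j in range(d):
                    w[j] += yi * xi[j]
                w[d] += yi
    return w

def margin(w, x):
    """Unsigned score |<w, (x, 1)>|; small values mean high uncertainty."""
    return abs(sum(wj * xj for wj, xj in zip(w, x + [1.0])))

def batch_active_learn(pool, oracle, batch_size, rounds, seed=0):
    """Query `batch_size` labels per round, preferring points near the current boundary."""
    rng = random.Random(seed)
    unlabeled = list(range(len(pool)))
    rng.shuffle(unlabeled)
    labeled = unlabeled[:batch_size]        # seed batch, chosen at random
    unlabeled = unlabeled[batch_size:]
    X = [pool[i] for i in labeled]
    y = [oracle(pool[i]) for i in labeled]
    w = train_perceptron(X, y)
    for _ in range(rounds):
        unlabeled.sort(key=lambda i: margin(w, pool[i]))  # most uncertain first
        batch, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        X += [pool[i] for i in batch]
        y += [oracle(pool[i]) for i in batch]
        w = train_perceptron(X, y)          # retrain once per batch, not per label
    return w, len(X)
```

The batch structure is the point of interest: the model is updated only after each batch of labels arrives, which mirrors the pool-based batch setting the paper analyzes, rather than the fully sequential setting where the learner may adapt after every single label.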