Accelerated Deep Active Learning with Graph-based Sub-Sampling

Authors: Dan Kushnir, Shiyun Xu

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments validate our goal of reducing query time while maintaining the highest accuracy. We report DGAL's improved query-time vs. accuracy trade-off and compare it with pivotal SOTA baselines. Additionally, we provide a classical active-learning empirical analysis of the trade-off between the number of queried data points and accuracy, and we report average query times for all baselines and data sets. We provide an ablation study with VAE-SEALS (see Algorithm 3) and with random versions of pool-set restriction. Training details are provided in Table 1 in the appendix.
Researcher Affiliation | Collaboration | Dan Kushnir (Bell Laboratories, Nokia); Shiyun Xu (Department of Applied Mathematics and Computational Science, University of Pennsylvania)
Pseudocode | Yes | We provide the pseudo-code of our method in Algorithm 2. We note that the input to VAE-DGAL includes the labeled and unlabeled pool set, the VAE architecture, and the task classification network f.
Open Source Code | No | The paper does not provide a direct link to a source-code repository for the methodology described in this paper, nor does it include an explicit statement about releasing the code for this work.
Open Datasets | Yes | We experimented with the benchmark data sets MNIST (LeCun et al., 1998), EMNIST (Cohen et al., 2017), SVHN (Netzer et al., 2011), CIFAR10 (Krizhevsky et al., 2009), CIFAR100 (Krizhevsky et al., 2009), and Mini-ImageNet (Ravi & Larochelle, 2017). ImageNet (Deng et al., 2009) is a well-known large-scale dataset in computer vision.
Dataset Splits | No | The paper mentions using well-known datasets such as MNIST, CIFAR10, and ImageNet, but it does not explicitly provide the training/validation/test splits (e.g., percentages or sample counts) used for the experiments in the main text. It notes that 'Training details are provided in table 1 in the appendix,' implying such details are not in the main body.
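For illustration, the kind of split specification the paper omits can be pinned down with a seeded shuffle. The 80/10/10 ratio, seed, and function name below are hypothetical defaults chosen for this sketch, not values taken from the paper:

```python
import random

def split_indices(n, train_frac=0.8, val_frac=0.1, seed=0):
    """Deterministically partition range(n) into train/val/test index lists.

    The 80/10/10 ratio and the seed are illustrative defaults, not values
    reported in the paper.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded shuffle for reproducibility
    n_train = int(train_frac * n)
    n_val = int(val_frac * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```

Reporting the ratios together with the seed makes the split reconstructible by other groups without shipping index files.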
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | Yes | Additional parameters: the batch size B is set at no more than 500 (see the batch size for the diffusion algorithm in Kushnir & Venturi (2023)). The number of epochs is set with a stopping criterion based on the convergence of the loss function.
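The two reported settings, a query batch capped at B = 500 and training stopped once the loss converges, can be sketched generically. The acquisition scores, tolerance, and function names below are assumptions for illustration, not the paper's implementation:

```python
def query_batch(scores, B=500):
    """Select at most B unlabeled-pool indices with the highest acquisition
    scores; the scoring rule itself is whatever the AL method defines."""
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return ranked[:min(B, len(scores))]

def train_until_converged(run_epoch, tol=1e-4, max_epochs=1000):
    """Run training epochs until the loss improves by less than `tol`
    (a loss-convergence stopping criterion, as described in the paper)."""
    prev = float("inf")
    epochs = 0
    for _ in range(max_epochs):
        loss = run_epoch()  # one epoch of training; returns the epoch loss
        epochs += 1
        if prev - loss < tol:
            break
        prev = loss
    return epochs
```

Capping the batch at the pool size via `min(B, len(scores))` keeps the selection well defined in the final rounds, when fewer than B unlabeled points remain.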