Enhancing Cost Efficiency in Active Learning with Candidate Set Query
Authors: Yeho Gwon, Sehyun Hwang, Hoyoung Kim, Jungseul Ok, Suha Kwak
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations on CIFAR-10, CIFAR-100, and ImageNet64x64 demonstrate the effectiveness and scalability of our framework. Notably, it reduces labeling cost by 48% on ImageNet64x64. The project page can be found at https://yehogwon.github.io/csq-al. We verify the effectiveness and generalizability of CSQ through extensive experiments with varying datasets, acquisition functions, and budgets. |
| Researcher Affiliation | Academia | Yeho Gwon EMAIL Department of Computer Science and Engineering POSTECH; Sehyun Hwang EMAIL Department of Computer Science and Engineering POSTECH; Hoyoung Kim EMAIL Graduate School of Artificial Intelligence POSTECH; Jungseul Ok EMAIL Graduate School of Artificial Intelligence POSTECH; Suha Kwak EMAIL Graduate School of Artificial Intelligence POSTECH |
| Pseudocode | Yes | Algorithm 1 Cost-efficient active learning with candidate set query |
| Open Source Code | No | The paper states: "The project page can be found at https://yehogwon.github.io/csq-al." This is a project page, i.e., a demonstration or overview page, not a direct link to a source-code repository containing the methodology's implementation. |
| Open Datasets | Yes | We use three image classification datasets: CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009), and ImageNet64x64 (Chrabaszcz et al., 2017). The R52 dataset (Lewis, 1997) is a subset of the Reuters-21578 (Lewis, 1997) news collection. |
| Dataset Splits | Yes | CIFAR-10 comprises 50K training and 10K validation images across 10 classes. CIFAR-100 contains the same number of images as CIFAR-10, but is associated with 100 classes. ImageNet64x64... consists of 1.2M training and 50K validation images with 1000 classes. In the initial round, we randomly sample 1K images for CIFAR-10, 5K images for CIFAR-100, and 60K images for ImageNet64x64. We set the size of the calibration dataset n_cal to 500 for CIFAR-10 and CIFAR-100, and 5K for ImageNet64x64. |
| Hardware Specification | Yes | We trained our classification model on CIFAR-10 and CIFAR-100 using an NVIDIA RTX 3090 and on ImageNet64x64 using 4 NVIDIA A100 GPUs in parallel. |
| Software Dependencies | No | The paper mentions various software components and models such as ResNet18, AdamW, Mix-up, WRN-36-5, SVM classifier, TF-IDF, and RoBERTa-Large, but does not provide specific version numbers for any of these software dependencies. |
| Experiment Setup | Yes | For CIFAR-10 and CIFAR-100, we adopt ResNet18 (He et al., 2016) as a classification model. We train it for 200 epochs using the AdamW (Loshchilov & Hutter, 2019) optimizer with an initial learning rate of 1e-3, decreasing by a factor of 0.2 at epochs 60, 120, and 160. We apply a weight decay of 5e-4 and a data augmentation consisting of random crop, random horizontal flip, and random rotation. For ImageNet64x64, we adopt WRN-36-5 (Zagoruyko, 2016), and train it for 30 epochs using the AdamW optimizer with an initial learning rate of 8e-3. We apply a learning rate warm-up for 10 epochs from 2e-3. After the warm-up, we decay the learning rate by a factor of 0.2 every 10 epochs. We adopt random horizontal flip and random translation as data augmentation. For all the datasets, we use Mix-up (Zhang et al., 2018), where a mixing ratio is sampled from Beta(1, 1). We set the size of the calibration dataset n_cal to 500 for CIFAR-10 and CIFAR-100, and 5K for ImageNet64x64. For all datasets and acquisition functions, hyperparameter d in Eq. (8) is set to 0.3. |
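The training hyperparameters quoted above fully determine the learning-rate schedules and the Mix-up sampling, so they can be sketched directly. The following is a minimal stdlib-only sketch of those three pieces, not the authors' implementation; function names and the linear shape of the warm-up are assumptions, and the feature mixing is shown on plain lists rather than tensors.

```python
import random


def cifar_lr(epoch, base_lr=1e-3, milestones=(60, 120, 160), gamma=0.2):
    """CIFAR-10/100 schedule: start at 1e-3, multiply by 0.2
    at each of epochs 60, 120, and 160 (step decay)."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr


def imagenet64_lr(epoch, warmup_epochs=10, warmup_start=2e-3,
                  base_lr=8e-3, gamma=0.2, step=10):
    """ImageNet64x64 schedule: warm-up from 2e-3 toward 8e-3 over the
    first 10 epochs (assumed linear), then decay by 0.2 every 10 epochs."""
    if epoch < warmup_epochs:
        t = epoch / warmup_epochs
        return warmup_start + t * (base_lr - warmup_start)
    steps = (epoch - warmup_epochs) // step
    return base_lr * (gamma ** steps)


def mixup(x1, y1, x2, y2, alpha=1.0):
    """Mix-up with mixing ratio lam ~ Beta(alpha, alpha);
    Beta(1, 1), as used in the paper, is the uniform distribution."""
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    # Loss is typically computed as lam * loss(x, y1) + (1 - lam) * loss(x, y2).
    return x, (y1, y2, lam)
```

For example, `cifar_lr(59)` is still 1e-3 while `cifar_lr(60)` drops to 2e-4, matching the stated step-decay milestones.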