Queried Unlabeled Data Improves and Robustifies Class-Incremental Learning

Authors: Tianlong Chen, Sijia Liu, Shiyu Chang, Lisa Amini, Zhangyang Wang

TMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments demonstrate that CIL-QUD achieves substantial accuracy gains on CIFAR-10 and CIFAR-100, compared to previous state-of-the-art CIL approaches. Moreover, RCIL-QUD establishes the first strong milestone for robustness-aware CIL. Codes are available in https://github.com/VITA-Group/CIL-QUD." (Abstract) "Figure 1: Summary of our achieved performance on CIFAR-10, where CIL is conducted over 5 incremental learning tasks (each 2-class). Figure (a) presents the standard accuracy (SA) achieved by CIL-QUD versus the previous SOTA (Belouadah & Popescu, 2019). Figure (b) shows the robust accuracy (RA) achieved by RCIL-QUD versus the baseline of directly applying adversarial training (Madry et al., 2018) to CIL." "5 Experiments and Analyses / 5.1 Implementation Details / Dataset, Memory Bank, and External Source. We evaluate our proposed method on CIFAR-10 and CIFAR-100 datasets (Krizhevsky et al., 2009)." "Table 1: Final performance for each task Ti of CIL-QUD, compared with SOTAs." "Table 2: Robust accuracy for each task Ti of RCIL-QUD and its variants."
Researcher Affiliation | Collaboration | Tianlong Chen (University of Texas at Austin); Sijia Liu (Michigan State University; MIT-IBM Watson AI Lab, IBM Research); Shiyu Chang (University of California, Santa Barbara); Lisa Amini (MIT-IBM Watson AI Lab, IBM Research); Zhangyang Wang (University of Texas at Austin)
Pseudocode | Yes | "More details are referred to the supplement Section S1.1 and Algorithm 1."
Open Source Code | Yes | "Codes are available in https://github.com/VITA-Group/CIL-QUD."
Open Datasets | Yes | "We evaluate our proposed method on CIFAR-10 and CIFAR-100 datasets (Krizhevsky et al., 2009). The default external source of queried unlabeled data is the 80 Million Tiny Images dataset (Torralba et al., 2008), while another source, ImageNet 32×32 (Deng et al., 2009), is investigated later."
Dataset Splits | Yes | "We randomly split the original training dataset into training and validation sets with a ratio of 9 : 1. ... On CIFAR-10, we divide the 10 classes into splits of 2 classes with a random order (10/2 = 5 tasks); On CIFAR-100, we divide 100 classes into splits of 20 classes with a random order (100/20 = 5 tasks)."
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) are provided in the main text of the paper.
Software Dependencies | No | No specific software dependencies with version numbers are mentioned in the main text of the paper.
Experiment Setup | Yes | "where θc,1, θc,2 denote the parameters of the primary and auxiliary classifiers respectively, λ is a hyperparameter controlling the contributions of LwF regularizers on queried unlabeled data. In our case, λ = 0.5. For tuning of hyperparameters, we perform a grid search." (Section 3.4) "where γ1, γ2 balance the effect between AT and robust regularizers. In our case, γ1 = 0.05, γ2 = 0.2. δ is the adversarial perturbation generated by Projected Gradient Descent (PGD) (Madry et al., 2018). ϵ is the upper bound of perturbations under the ℓ∞ norm." (Section 4) "Adversarial samples are crafted by n-step Projected Gradient Descent (PGD) with perturbation magnitude ϵ = 8/255 and step size α = 2/255 for both adversarial training and evaluation, with n = 10 for training and n = 20 for evaluation, following Madry et al. (2018)." (Section 5.1) "We set the memory bank to store 100 images per class for CIFAR-10 and 10 images for CIFAR-100 by default. ... During each incremental session, we query 5,000 and 500 unlabeled images per class for CIFAR-10 and CIFAR-100, respectively, by default. Increasing the amount of queried unlabeled data may improve performance further but incurs more training costs. We use a buffer of fixed capacity to store 128 queried unlabeled images at each training iteration." (Section 5.1)
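The split protocol quoted in the Dataset Splits row (a random class order partitioned into 5 tasks, plus a 9:1 train/validation split) can be sketched as below. This is a minimal illustration, not the paper's code; the function names and the seed handling are assumptions for the sketch.

```python
import numpy as np

def make_cil_splits(num_classes=10, classes_per_task=2, seed=0):
    """Partition a random class ordering into incremental tasks,
    e.g. CIFAR-10 -> 10/2 = 5 tasks of 2 classes each."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_classes)
    return [order[i:i + classes_per_task].tolist()
            for i in range(0, num_classes, classes_per_task)]

def train_val_split(indices, val_ratio=0.1, seed=0):
    """9:1 train/validation split over example indices."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(np.asarray(indices))
    n_val = int(len(idx) * val_ratio)
    return idx[n_val:].tolist(), idx[:n_val].tolist()

tasks = make_cil_splits()                       # 5 disjoint 2-class tasks
train_idx, val_idx = train_val_split(range(1000))  # 900 / 100 split
```

For CIFAR-100 the same helper with `num_classes=100, classes_per_task=20` reproduces the quoted 100/20 = 5 task layout.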
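The PGD settings quoted in the Experiment Setup row (ℓ∞ bound ϵ = 8/255, step size α = 2/255, random start, n steps) correspond to the standard attack of Madry et al. (2018). The sketch below shows the projection-and-clip loop in NumPy; the `grad_fn` argument stands in for backpropagation through a real model, which is an assumption of this sketch (the paper's experiments use trained classifiers, not a toy gradient).

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=8/255, alpha=2/255, n_steps=10, seed=0):
    """l-inf bounded PGD with random start.

    x        : clean input in [0, 1]
    grad_fn  : callable returning d(loss)/d(x_adv) -- a stand-in for
               model backprop in this sketch
    """
    rng = np.random.default_rng(seed)
    # Random start inside the eps-ball, as in Madry et al. (2018).
    x_adv = x + rng.uniform(-eps, eps, size=x.shape)
    x_adv = np.clip(x_adv, 0.0, 1.0)
    for _ in range(n_steps):
        g = grad_fn(x_adv)
        x_adv = x_adv + alpha * np.sign(g)        # gradient-sign ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv

# Toy check: with a constant all-ones gradient, every pixel is pushed to
# the eps boundary but never beyond it.
x_clean = np.full((3, 4, 4), 0.5)
x_adv = pgd_attack(x_clean, lambda x: np.ones_like(x), n_steps=10)
```

Per the quoted setup, training would use `n_steps=10` and evaluation `n_steps=20`, with the same ϵ and α.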