Similarity-Based Adaptation for Task-Aware and Task-Free Continual Learning

Authors: Tameem Adel

JAIR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we empirically evaluate the performance of the proposed STAF. We chiefly aim at evaluating the following aspects: i) the task-aware CL performance of STAF, measured by the average classification accuracy over all tasks encountered by the learner; ii) the degree to which catastrophic forgetting can be reduced with STAF; iii) the task-free CL performance of the corresponding STAF version; and iv) an ablation study gauging the impact of the adaptation layer on the obtained results."
Researcher Affiliation | Academia | Tameem Adel, National Physical Laboratory (NPL), Maxwell Centre, University of Cambridge, JJ Thomson Avenue, Cambridge, CB3 0HE, United Kingdom
Pseudocode | Yes | Algorithm 1: Training of the task-aware STAF algorithm.
  Input: a sequence of m datasets D_t, t = 1, 2, ..., m.
  Initialize the shared and task-specific parameters θ_h and θ_f^t for all tasks, and the adaptation parameters θ_a^1 for task 1.
  for t = 2, ..., m do
    // PN learning
    Observe sample D_t = {x_t^n, y_t^n}_{n=1}^{N_t}.
    Embed data x_t as G_r(x_t; φ_r), where φ_r are the mapping parameters.
    Compute the current prototype μ_t using (5).
    Predict the most similar task t_NN using (6).
    Train on D_t via the negative log-likelihood; compute probabilities using (7).
    // Main CL architecture
    Initialize θ_a^t with the final values of θ_a^{NN}.
    Train the parameters θ_f^t, θ_a^t and θ_h using (2), (3) and (4).
  end for
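The prototype and nearest-task steps of Algorithm 1 can be sketched numerically. This is a minimal illustration under stated assumptions: `embed` stands in for the paper's learned mapping G_r(x; φ_r) (here just an identity), and Euclidean distance is assumed for the similarity in (6); the function names are ours, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(x):
    # Placeholder for G_r(x; phi_r); the paper uses a learned mapping,
    # here we simply pass features through unchanged.
    return x

def prototype(x):
    # Eq. (5)-style prototype: mean of a task's embedded samples.
    return embed(x).mean(axis=0)

def most_similar_task(mu_t, prototypes):
    # Eq. (6)-style prediction: index of the nearest stored prototype
    # (Euclidean distance assumed for illustration).
    dists = [np.linalg.norm(mu_t - mu) for mu in prototypes]
    return int(np.argmin(dists))

# Two previously seen tasks with well-separated feature means.
task_data = [rng.normal(0.0, 0.1, size=(50, 8)),
             rng.normal(5.0, 0.1, size=(50, 8))]
prototypes = [prototype(x) for x in task_data]

# A new task drawn near the second task's distribution; its adaptation
# parameters would then be initialized from that task's theta_a.
t_nn = most_similar_task(prototype(rng.normal(4.9, 0.1, size=(50, 8))),
                         prototypes)
# t_nn == 1, i.e. the second task is selected as most similar
```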
Open Source Code | No | The paper does not provide concrete access to source code: there is no code repository link, no explicit code-release statement, and no mention of code in supplementary materials.
Open Datasets | Yes | "We perform CL experiments on six datasets: Split MNIST (Goodfellow et al., 2014a; Zenke et al., 2017), Fashion-MNIST (Xiao et al., 2017), Omniglot (Lake et al., 2011), Split CIFAR-10 (Krizhevsky & Hinton, 2009), Split CIFAR-100 (Krizhevsky & Hinton, 2009; Rebuffi et al., 2017) and Split mini-ImageNet (Deng et al., 2009; Vinyals et al., 2016; Chaudhry et al., 2019b)."
Dataset Splits | Yes | "All methods are trained with a minibatch size of 256 and for 100 epochs. We adopt a 60/20/20% allocation for training, validation and test, respectively."
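The reported 60/20/20% allocation can be reproduced with a simple shuffled index split. This is a sketch under assumptions (the paper does not specify shuffling or seeding; the function name is ours):

```python
import numpy as np

def split_60_20_20(n_samples, seed=0):
    # Shuffle indices and allocate 60% / 20% / 20% to
    # train / validation / test, as reported in the paper.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.6 * n_samples)
    n_val = int(0.2 * n_samples)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train_idx, val_idx, test_idx = split_60_20_20(1000)
# lengths: 600 / 200 / 200, jointly covering all 1000 indices
```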
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper mentions "Adam (Kingma & Ba, 2015) is the optimizer used" but does not specify versions for programming languages, libraries, or other software dependencies necessary for replication.
Experiment Setup | Yes | "All methods are trained with a minibatch size of 256 and for 100 epochs. We adopt a 60/20/20% allocation for training, validation and test, respectively. Initial values assigned to the shared learning rate αh and the task-specific learning rate αt f are 0.02 and 0.05, respectively."
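The two separate learning rates (shared α_h = 0.02, task-specific α_f^t = 0.05) can be illustrated with a single parameter update. This is a sketch only: the paper uses Adam, whereas this example applies plain gradient steps purely to show the two distinct rates; the function name is ours.

```python
import numpy as np

# Reported hyperparameters: shared lr 0.02, task-specific lr 0.05
# (minibatch size 256, 100 epochs in the paper's setup).
ALPHA_H, ALPHA_F = 0.02, 0.05

def two_rate_step(theta_h, theta_f, grad_h, grad_f):
    # One update applying distinct learning rates to the shared
    # parameters theta_h and the task-specific parameters theta_f.
    return theta_h - ALPHA_H * grad_h, theta_f - ALPHA_F * grad_f

theta_h, theta_f = np.ones(4), np.ones(4)
theta_h, theta_f = two_rate_step(theta_h, theta_f,
                                 np.full(4, 0.5), np.full(4, 0.5))
# theta_h -> 1 - 0.02 * 0.5 = 0.99; theta_f -> 1 - 0.05 * 0.5 = 0.975
```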