Similarity-Based Adaptation for Task-Aware and Task-Free Continual Learning
Authors: Tameem Adel
JAIR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically evaluate the performance of the proposed STAF. We chiefly aim at evaluating the following aspects: i) the task-aware CL performance of STAF, measured by the average classification accuracy over all tasks encountered by the learner; ii) the degree to which catastrophic forgetting can be reduced with STAF; iii) the task-free CL performance of the corresponding STAF version; and iv) an ablation study gauging the impact of the adaptation layer on the obtained results. |
| Researcher Affiliation | Academia | Tameem Adel EMAIL National Physical Laboratory (NPL), Maxwell Centre, University of Cambridge JJ Thomson Avenue, Cambridge, CB3 0HE, United Kingdom |
| Pseudocode | Yes | Algorithm 1 (Training of the task-aware STAF). Input: a sequence of m datasets D_t, t = 1, 2, ..., m. Initialize the shared and task-specific parameters θ_h and θ_f^t for all tasks, and adaptation parameters θ_a^1 for task 1. For t = 2, ..., m: // PN learning: observe sample D_t = {x_t^n, y_t^n}, n = 1, ..., N_t; embed the data x_t as G_r(x_t; φ_r), where φ_r are the mapping parameters; compute the current prototype μ_t using (5); predict the most similar task t_NN using (6); train on D_t via the negative log-likelihood; compute probabilities using (7). // Main CL architecture: initialize θ_a^t from the final values of θ_a^{t_NN}; train the parameters θ_f^t, θ_a^t and θ_h using (2), (3) and (4). |
| Open Source Code | No | The paper does not provide concrete access information to source code. There is no mention of a code repository link, an explicit code release statement, or code being available in supplementary materials. |
| Open Datasets | Yes | We perform CL experiments on six datasets: Split MNIST (Goodfellow et al., 2014a; Zenke et al., 2017), Fashion-MNIST (Xiao et al., 2017), Omniglot (Lake et al., 2011), Split CIFAR-10 (Krizhevsky & Hinton, 2009), Split CIFAR-100 (Krizhevsky & Hinton, 2009; Rebuffi et al., 2017) and Split mini-ImageNet (Deng et al., 2009; Vinyals et al., 2016; Chaudhry et al., 2019b). |
| Dataset Splits | Yes | All methods are trained with a minibatch size of 256 and for 100 epochs. We adopt a 60/20/20% allocation for training, validation and test, respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Adam (Kingma & Ba, 2015) is the optimizer used' but does not specify versions for programming languages, libraries, or other software dependencies necessary for replication. |
| Experiment Setup | Yes | All methods are trained with a minibatch size of 256 and for 100 epochs. We adopt a 60/20/20% allocation for training, validation and test, respectively. Initial values assigned to the shared learning rate αh and the task-specific learning rate αt f are 0.02 and 0.05, respectively. |
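The Algorithm 1 row above describes the outer loop of the task-aware STAF training procedure: embed each new task, form its prototype, find the most similar previous task by prototype distance, and warm-start the new task's adaptation parameters from that neighbour. A minimal sketch of that control flow follows; the helper names (`train_staf`, `embed`) and the Euclidean nearest-prototype rule are assumptions for illustration, and the inner training steps referenced as equations (2)-(7) in the paper are omitted.

```python
import numpy as np

def train_staf(datasets, embed, seed=0):
    """Hypothetical sketch of Algorithm 1's outer loop.

    datasets: list of (X, y) pairs, one per task.
    embed: stand-in for the mapping G_r(x; phi_r) into the similarity space.
    """
    rng = np.random.default_rng(seed)
    prototypes = []   # mu_t, one per task seen so far
    theta_a = {}      # adaptation parameters per task

    # Task 1 has no previous neighbour: initialise its adaptation params directly.
    X0, _ = datasets[0]
    prototypes.append(embed(X0).mean(axis=0))
    theta_a[0] = rng.normal(size=4)

    for t, (Xt, _) in enumerate(datasets[1:], start=1):
        mu_t = embed(Xt).mean(axis=0)          # current prototype, cf. eq. (5)
        # Predict the most similar previous task, cf. eq. (6):
        # here approximated as the nearest prototype in the embedding space.
        dists = [np.linalg.norm(mu_t - mu) for mu in prototypes]
        t_nn = int(np.argmin(dists))
        # Initialise theta_a^t from the final values of theta_a^{t_NN}.
        theta_a[t] = theta_a[t_nn].copy()
        prototypes.append(mu_t)
        # Training of theta_f^t, theta_a^t and theta_h via eqs (2)-(4) omitted.
    return theta_a, prototypes
```

With an identity embedding, the loop assigns each incoming task the adaptation parameters of its nearest prototype, which is the warm-start behaviour the algorithm row describes.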
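The Dataset Splits and Experiment Setup rows report the reproducible hyperparameters directly: a 60/20/20% train/validation/test allocation, minibatch size 256, 100 epochs, and initial learning rates of 0.02 (shared, α_h) and 0.05 (task-specific, α_f^t). A small sketch of that configuration, with a hypothetical `split_60_20_20` helper:

```python
import numpy as np

# Hyperparameters as reported in the paper (Adam optimizer).
BATCH_SIZE = 256
EPOCHS = 100
LR_SHARED_INIT = 0.02         # alpha_h
LR_TASK_SPECIFIC_INIT = 0.05  # alpha_f^t

def split_60_20_20(n, seed=0):
    """Return index arrays for a 60/20/20% train/val/test allocation."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```

Note the paper does not state how the split is randomised (or whether it is stratified), so the shuffling here is an assumption.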