RECAST: Reparameterized, Compact weight Adaptation for Sequential Tasks
Authors: Nazia Tasnim, Bryan Plummer
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across six diverse datasets demonstrate RECAST outperforms the state-of-the-art by up to 1.5% and improves baselines > 3% across various scales, architectures, and parameter spaces. We evaluate RECAST's effectiveness on diverse datasets in a Task-incremental IL setting. We compare against 16 baselines comprising state-of-the-art IL methods on both CNN and Transformer architectures, where RECAST reports > 3% gain over prior work. |
| Researcher Affiliation | Academia | Nazia Tasnim Boston University EMAIL Bryan A. Plummer Boston University EMAIL |
| Pseudocode | Yes | Algorithm 1 RECAST Framework; Algorithm 2 Neural Mimicry |
| Open Source Code | Yes | Code: Repository; The implementation details of our custom ResNet and Vision Transformer architectures are fully described in the main text, with complete code provided in the supplementary materials. |
| Open Datasets | Yes | Following standard experiment setups in similar works by Aljundi et al. (2019); Ge et al. (2023), we employ six diverse benchmarking datasets covering fine-grained to coarse image classification tasks across various domains, including flowers (Nilsback & Zisserman, 2008), scenes (Quattoni & Torralba, 2009), birds (Wah et al., 2011), animals (Krizhevsky & Hinton, 2009), vehicles (Maji et al., 2013), and other man-made objects (Krizhevsky & Hinton, 2009). In Table 3 we have summarized the class variations and number of samples for the six datasets we have used in our TIL experiments. |
| Dataset Splits | Yes | All the datasets were split with a 75%-15%-15% train-validation-test split. |
| Hardware Specification | Yes | All experiments were run on a single RTX8000 GPU. |
| Software Dependencies | No | We trained GDUMB (Prabhu et al., 2020), EWC (Lee et al., 2019), LWF (Li & Hoiem, 2016), and L2P (Wang et al., 2022b) using the avalanche library (Carta et al., 2023). Official PyTorch implementations of other methods were modified for TIL settings. All experiments were conducted using PyTorch, with specific versions and dependencies listed in the supplementary materials. The main text does not provide specific version numbers for PyTorch or avalanche. |
| Experiment Setup | Yes | RECAST, MeLo (Zhu et al., 2024), AdaptFormer (Chen et al., 2022), RoSA (Nikdan et al., 2024), and DoRA (Liu et al., 2024b) used AdamW (Loshchilov & Hutter, 2017) with 2e-3 to 5e-3 learning rates, 1e-6 weight decay, and stepwise LR scheduling (decay by 0.1 every 33 epochs) for 100 epochs. Default hyperparameters were used for avalanche models and methods like HAT (Serra et al., 2018), Piggyback (Mallya et al., 2018), and CLR (Ge et al., 2023), trained for 100 epochs. For RECAST-ViT, we used a group size of 6, 2 templates per bank, and 2 coefficient sets. |
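The stepwise learning-rate schedule quoted above (decay by 0.1 every 33 epochs over a 100-epoch run) can be sketched in plain Python. This is an illustrative reconstruction, not the paper's code; the function name `step_lr` and its parameters are assumptions made for the sketch.

```python
# Illustrative sketch of a StepLR-style schedule matching the reported setup:
# learning rate decays by a factor of 0.1 every 33 epochs over 100 epochs.
# `step_lr` and its argument names are hypothetical, not from the paper's code.

def step_lr(base_lr: float, epoch: int, step: int = 33, gamma: float = 0.1) -> float:
    """Return the learning rate at a given epoch under stepwise decay."""
    return base_lr * gamma ** (epoch // step)

# With base_lr = 2e-3 (the low end of the reported 2e-3 to 5e-3 range),
# the schedule plateaus at 2e-3 (epochs 0-32), 2e-4 (33-65), 2e-5 (66-98),
# and 2e-6 (epoch 99).
schedule = [step_lr(2e-3, e) for e in range(100)]
```

In PyTorch this behavior corresponds to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=33, gamma=0.1)` stepped once per epoch, paired with an `AdamW` optimizer using `weight_decay=1e-6`.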