Advancing Prompt-Based Methods for Replay-Independent General Continual Learning
Authors: Zhiqi Kang, Liyuan Wang, Xingxing Zhang, Karteek Alahari
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results demonstrate substantial performance gains of our approach compared to recent competitors, especially without a replay buffer (e.g., up to 18.39, 22.06, and 11.96 percentage points performance lead on CIFAR-100, Tiny-ImageNet, and ImageNet-R, respectively). |
| Researcher Affiliation | Academia | Zhiqi Kang Inria Liyuan Wang Tsinghua University Xingxing Zhang Tsinghua University Karteek Alahari Inria |
| Pseudocode | No | The paper describes the methods in descriptive text but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our source code is publicly available at https://github.com/kangzhiq/MISA. |
| Open Datasets | Yes | We consider three representative datasets: CIFAR-100, Tiny-ImageNet and ImageNet-R with 60k, 100k, 30k training samples and 100, 200, 200 classes, respectively. ... By default, our ISA uses the ImageNet-1k dataset (Deng et al., 2009) of 1000-class large-scale images. |
| Dataset Splits | Yes | For FAM, we split ImageNet-1k into 900 classes for D_id and 100 classes for D_ood without overlap. ... We follow the setting in Si-Blurry (Moon et al., 2023) in our GCL experiments. |
| Hardware Specification | Yes | Specifically, we report the average execution time for each method to learn from one batch on 1 RTX A5000 GPU. |
| Software Dependencies | No | The paper mentions specific tools and models like ViT-B/16 and Adam optimizer but does not provide specific version numbers for software libraries or programming languages used for implementation. |
| Experiment Setup | Yes | For MISA, we follow the same training configuration to ensure a fair comparison, e.g., using an Adam optimizer with learning rate 0.005 and batch size 32. In ISA, we use an Adam optimizer with learning rate 0.0001 and batch size 128 to train the prompts in an offline manner for three epochs. For FAM, we split ImageNet-1k into 900 classes for D_id and 100 classes for D_ood without overlap. We randomly sample 10 classes as D_ood to simulate a small-scale downstream task and resample new ones when we iterate over this subset. More aggressive augmentation is applied to D_ood data, as we apply double auto-augmentation (Cubuk et al., 2019). |
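The experiment setup row above can be sketched in code. This is a minimal, hedged illustration of the reported hyperparameters and the FAM class split, not the authors' actual implementation; all names (`MISA_CONFIG`, `ISA_CONFIG`, `split_imagenet1k_classes`, the `seed` parameter) are assumptions introduced here for clarity.

```python
import random

# Hyperparameters as reported in the paper's experiment setup (assumed names).
MISA_CONFIG = {
    "optimizer": "Adam",
    "learning_rate": 0.005,   # online GCL training
    "batch_size": 32,
}

ISA_CONFIG = {
    "optimizer": "Adam",
    "learning_rate": 0.0001,  # offline prompt training
    "batch_size": 128,
    "epochs": 3,
}


def split_imagenet1k_classes(num_classes=1000, num_ood=100, seed=0):
    """Partition ImageNet-1k class indices into D_id (900 classes) and
    D_ood (100 classes) with no overlap, as described for the FAM split.
    The random seed is an illustrative assumption, not from the paper."""
    rng = random.Random(seed)
    classes = list(range(num_classes))
    rng.shuffle(classes)
    d_ood = sorted(classes[:num_ood])
    d_id = sorted(classes[num_ood:])
    return d_id, d_ood


def sample_ood_task(d_ood, task_size=10, seed=0):
    """Randomly draw a 10-class subset of D_ood to simulate a
    small-scale downstream task, as the setup describes."""
    rng = random.Random(seed)
    return rng.sample(d_ood, task_size)


d_id, d_ood = split_imagenet1k_classes()
task = sample_ood_task(d_ood)
```

A usage check: `len(d_id) == 900`, `len(d_ood) == 100`, the two sets are disjoint, and each sampled task contains 10 classes drawn from D_ood.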