LADA: Scalable Label-Specific CLIP Adapter for Continual Learning
Authors: Mao-Lin Luo, Zi-Hao Zhou, Tong Wei, Min-Ling Zhang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive results show that LADA achieves state-of-the-art performance in continual learning settings. The implementation code is available at https://github.com/MaolinLuo/LADA. ... We evaluate our method on both 16-shot and full-shot settings. |
| Researcher Affiliation | Academia | 1School of Computer Science and Engineering, Southeast University, Nanjing 210096, China 2Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, China. Correspondence to: Tong Wei <EMAIL>. |
| Pseudocode | No | The paper describes methods using mathematical formulations (Eq. 1-10) and prose, but does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The implementation code is available at https://github.com/MaolinLuo/LADA. |
| Open Datasets | Yes | We conduct experiments on the recently proposed X-TAIL (Xu et al., 2024) benchmark which consists of 10 image classification datasets: Aircraft (Maji et al., 2013), Caltech101 (Fei-Fei et al., 2004), DTD (Cimpoi et al., 2014), EuroSAT (Helber et al., 2019), Flowers (Nilsback & Zisserman, 2008), Food (Bossard et al., 2014), MNIST (Deng, 2012), Oxford Pet (Parkhi et al., 2012), Stanford Cars (Krause et al., 2013), and SUN397 (Xiao et al., 2010). |
| Dataset Splits | Yes | In addition to the 16-shot setting proposed by (Xu et al., 2024), in which 16 training samples per class were selected for each task, we also evaluate the benchmark under a full-shot setting. This more realistic scenario maintains the original dataset distribution, with varying numbers of training samples across tasks, providing a more comprehensive and challenging evaluation for continual learning methods. |
| Hardware Specification | Yes | All experiments of LADA are conducted on a single NVIDIA 4090 GPU. |
| Software Dependencies | No | The paper mentions the use of CLIP, ViT-B/16, AdaptFormer, and AdamW optimizer, but does not specify version numbers for these or any other software dependencies such as Python, PyTorch, or CUDA versions. |
| Experiment Setup | Yes | The training process is carried out using the AdamW (Loshchilov & Hutter, 2019) optimizer, with a learning rate of 0.001 and a batch size of 64 across all tasks. For the primary experiments, we set the hyperparameters as λ1 = 16 and λ2 = 4. |
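For reference, the optimizer settings quoted in the Experiment Setup row can be sketched as a standard AdamW update with decoupled weight decay. This is a minimal illustration, not the paper's implementation: the update rule is textbook AdamW (Loshchilov & Hutter, 2019), and `LAMBDA1`/`LAMBDA2` appear only as named constants, since the paper's table excerpt does not state their exact role in the loss.

```python
import math

def adamw_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter.

    Weight decay is applied directly to the parameter (decoupled),
    rather than being folded into the gradient as in plain Adam.
    """
    m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * (m_hat / (math.sqrt(v_hat) + eps)
                          + weight_decay * theta)
    return theta, m, v

# Hyperparameters reported for the primary experiments (role assumed unknown here).
LAMBDA1, LAMBDA2 = 16, 4

# Toy run: minimize 0.5 * theta^2 (gradient = theta) for ten steps
# with the reported learning rate of 0.001.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 11):
    theta, m, v = adamw_step(theta, grad=theta, m=m, v=v, t=t)
```

With lr = 0.001, each step moves the parameter by roughly the learning rate, so ten steps shrink `theta` only slightly, as expected for so small a step size.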