Online Continual Learning via Logit Adjusted Softmax
Authors: Zhehao Huang, Tao Li, Chenhe Yuan, Yingwen Wu, Xiaolin Huang
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The abstract states: "We evaluate our approach on various benchmarks and demonstrate significant performance improvements compared to prior arts. For example, our approach improves the best baseline by 4.6% on CIFAR10. Codes are available at https://github.com/K1nght/online_CL_logit_adjusted_softmax." Section 6 is titled "Experiment" and contains comprehensive experimental results, comparisons, and ablation studies. |
| Researcher Affiliation | Academia | All authors are affiliated with "Shanghai Jiao Tong University," which is an academic institution. The email domains provided are all "@sjtu.edu.cn". |
| Pseudocode | Yes | The paper includes pseudocode in Appendix B, with "Algorithm 1 Experience Replay (ER)", "Algorithm 2 Experience Replay with Logit Adjusted Softmax (ER-LAS)", and "Algorithm 3 Knowledge distillation with Logit Adjusted Softmax (KD-LAS)". |
| Open Source Code | Yes | The abstract states: "Codes are available at https://github.com/K1nght/online_CL_logit_adjusted_softmax." |
| Open Datasets | Yes | The paper evaluates on well-known public datasets such as "CIFAR10 (Krizhevsky, 2009)", "CIFAR100 (Krizhevsky, 2009)", "Tiny ImageNet (Le & Yang, 2015)", "ImageNet ILSVRC 2012 (Deng et al., 2009)", and "iNaturalist 2017 (Horn et al., 2017)". It also provides a download link for iNaturalist: "We download the dataset of iNaturalist from https://github.com/visipedia/inat_comp." |
| Dataset Splits | Yes | Section 6 "Benchmark setups" and Appendix C.2 "Continual Learning Setup Details" provide detailed information on dataset splits. For example, "C-CIFAR10 (5 tasks) is split into 5 disjoint tasks with 2 classes each" and "We split CIFAR100 and Tiny ImageNet into 10 blurry tasks according to (Koh et al., 2022) with disjoint ratio Nblurry = 50 and blurry level Mblurry = 10." |
| Hardware Specification | Yes | Table 7 in Appendix F.6 states: "Training time compared with top-3 fast methods on C-CIFAR100 (10 tasks) by one Nvidia Geforce GTX 2080 Ti." |
| Software Dependencies | No | The paper lists various baseline methods and their code sources (e.g., "https://github.com/aimagelab/mammoth") and hyperparameters (e.g., learning rate) in Appendix E.1. However, it does not explicitly state the specific versions of core software components like Python, PyTorch/TensorFlow, or CUDA used for their own experimental setup. |
| Experiment Setup | Yes | Section 6 "Training Protocol" details the experimental setup: "We use the full ResNet18 as the feature extractor... We use SGD optimizer without momentum and weight decay. The learning rate is set to 0.03 and kept constant. Incoming and buffer batch sizes are both 32. On C-ImageNet and C-iNaturalist, we set both batch sizes to 128... By default, we set τ = 1.0 and l = 1 for LAS." |
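For context on the method being reproduced: the paper's core idea is to train with a logit-adjusted softmax cross-entropy, offsetting each logit by the (scaled) log class prior before the softmax. The sketch below is an illustrative NumPy rendering of that general technique, not the authors' released code; the function name, the empirical-count prior, and the `tau` default (mirroring the paper's τ = 1.0) are our assumptions.

```python
import numpy as np

def logit_adjusted_softmax_ce(logits, labels, class_counts, tau=1.0):
    """Cross-entropy with logit-adjusted softmax (illustrative sketch).

    Adds tau * log(pi_y) to each logit, where pi is the empirical class
    prior estimated from class_counts, before the (log-)softmax.
    With tau = 0 this reduces to standard cross-entropy.
    """
    priors = class_counts / class_counts.sum()
    adjusted = logits + tau * np.log(priors + 1e-12)  # broadcast over batch
    # Numerically stable log-softmax over the class axis.
    z = adjusted - adjusted.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

In the online continual-learning setting of the paper, the prior would come from running estimates of per-class sample counts seen so far, so that frequently observed (old or majority) classes are down-weighted at training time.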