Coreset Selection via Reducible Loss in Continual Learning

Authors: Ruilin Tong, Yuhang Liu, Javen Qinfeng Shi, Dong Gong

ICLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments on data summarization and continual learning demonstrate the effectiveness and efficiency of our approach. Our code is available at https://github.com/RuilinTong/CSReL-Coreset-CL.
Researcher Affiliation Academia 1 School of Computer Science and Engineering, University of New South Wales; 2 Australian Institute for Machine Learning, The University of Adelaide
Pseudocode Yes Algorithm 1: The proposed CSReL; Algorithm 2: CSReL Continual Learning; Algorithm 3: Coreset selection considering previous tasks; Algorithm 4: CSReL-CL-Prv; Algorithm 5: CSReL-RS; Algorithm 6: Continual learning with CSReL-RS
Open Source Code Yes Our code is available at https://github.com/RuilinTong/CSReL-Coreset-CL.
Open Datasets Yes We evaluate our CSReL selection method on MNIST (Deng, 2012), CIFAR-10 (Krizhevsky et al., 2009) and the more challenging CIFAR-100 (Krizhevsky et al., 2009). We conduct experiments on Split MNIST (Zenke et al., 2017), Perm MNIST (Goodfellow et al., 2013), Split CIFAR-10 and CIFAR-100. We evaluate our methods on Split CIFAR-100 and Split Tiny ImageNet (Wu et al., 2017) with memory sizes of 200 and 500. Additionally, we evaluate the scalability of our method on a 500-class subset of ImageNet-1K in Appendix N.
Dataset Splits Yes Following Borsos et al. (2020), we split CIFAR-10 and MNIST into 5 tasks with 2 classes per task. We split CIFAR-100 into 10 tasks with 10 classes per task. Perm MNIST contains 10 tasks, each a randomly permuted version of MNIST.
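The class-incremental splits described above (Split MNIST/CIFAR-10: 5 tasks of 2 classes; Split CIFAR-100: 10 tasks of 10 classes) can be sketched with a small helper. This is an illustrative partitioning function, not code from the paper's repository; the function name is a placeholder.

```python
def split_classes_into_tasks(num_classes: int, classes_per_task: int) -> list[list[int]]:
    """Partition class labels into consecutive tasks, e.g. Split MNIST = 5 tasks x 2 classes."""
    assert num_classes % classes_per_task == 0, "classes must divide evenly into tasks"
    return [list(range(start, start + classes_per_task))
            for start in range(0, num_classes, classes_per_task)]

# Split MNIST / Split CIFAR-10: 5 tasks with 2 classes each
print(split_classes_into_tasks(10, 2))   # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
# Split CIFAR-100: 10 tasks with 10 classes each
print(len(split_classes_into_tasks(100, 10)))  # 10
```

Perm MNIST differs: every task uses all 10 classes, and instead the pixel order of each image is shuffled with a fixed per-task permutation.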
Hardware Specification Yes All experiments are conducted on 2 NVIDIA GeForce RTX 3090 graphics cards with 24 GB memory each. Our CPU is an Intel(R) Core(TM) i9-12900K, and the server has 32 GB of memory.
Software Dependencies No The paper mentions 'Adam optimizer (Kingma, 2014)' and 'SGD optimizer' but does not provide specific version numbers for software libraries, frameworks (like PyTorch or TensorFlow), or programming languages used.
Experiment Setup Yes We use a CNN as the backbone for MNIST, containing two blocks of convolution, dropout, max-pooling and ReLU activation; the two convolution layers have 32 and 64 filters with 5x5 kernels. Two fully connected layers of size 128 and 10 with dropout follow the convolution blocks. The dropout probability is 0.5. We train the CNN on the selected coreset using the SGD optimizer with learning rate 2e-2; the training batch size is set to 32 and the number of training epochs to 3000. This training protocol is applied in all compared methods. For experiments on CIFAR-10 and CIFAR-100, we train a ResNet on the selected coreset using the Adam optimizer with learning rate 5e-4; the training batch size is set to 64 and the number of training epochs to 1800. Table 13 and Table 14 provide specific optimization hyperparameters for the continual learning experiments, including batch size, epochs, optimizer, learning rate, and loss factors.
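The MNIST backbone described above can be sketched in PyTorch as follows. This is a hedged reconstruction from the textual description only (32/64 5x5 conv filters, max-pooling, ReLU, dropout 0.5, FC layers of 128 and 10, SGD with lr 2e-2, batch size 32); the class name, layer ordering within each block, and placement of dropout are assumptions, not taken from the paper's code.

```python
import torch
import torch.nn as nn

class MNISTBackbone(nn.Module):
    """Sketch of the two-block CNN described in the setup (layer ordering assumed)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            # Block 1: 32 filters, 5x5 kernel -> ReLU -> 2x2 max-pool -> dropout
            nn.Conv2d(1, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2), nn.Dropout(0.5),
            # Block 2: 64 filters, 5x5 kernel -> ReLU -> 2x2 max-pool -> dropout
            nn.Conv2d(32, 64, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2), nn.Dropout(0.5),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            # 28x28 input -> 24x24 after conv1 -> 12x12 after pool -> 8x8 -> 4x4,
            # so the flattened feature size is 64 * 4 * 4 = 1024
            nn.Linear(64 * 4 * 4, 128), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = MNISTBackbone()
# Optimizer and batch size as stated in the setup
optimizer = torch.optim.SGD(model.parameters(), lr=2e-2)
logits = model(torch.randn(32, 1, 28, 28))  # shape: (32, 10)
```

The spatial arithmetic (24 -> 12 -> 8 -> 4) assumes no padding on the 5x5 convolutions, which matches the stated filter sizes but is not explicitly confirmed by the description.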