Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning
Authors: Mohammadamin Banayeeanzade, Mahdi Soltanolkotabi, Mohammad Rostami
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, through extensive empirical evaluations, we demonstrate that our theoretical findings are also applicable to deep neural networks, offering valuable guidance for designing MTL and CL models in practice. |
| Researcher Affiliation | Academia | Amin Banayeeanzade, Department of Computer Science, University of Southern California; Mahdi Soltanolkotabi, Department of Electrical and Computer Engineering, University of Southern California; Mohammad Rostami, Department of Computer Science, University of Southern California |
| Pseudocode | No | The paper includes mathematical formulations and derivations but does not present any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/aminbana/MTL-Theory . |
| Open Datasets | Yes | We empirically show that our findings for the linear models are generalizable to DNNs by conducting various experiments using different architectures and practical datasets such as CIFAR-100, ImageNet-R, and CUB-200. ... We use CIFAR-100, ImageNet-R (IN-R), and CUB-200 datasets in our experiments. |
| Dataset Splits | Yes | We generate MTL or CL tasks by randomly splitting these datasets into 10 tasks, with an equal number of classes in each task (van de Ven et al., 2022). ... The sample budget is linearly proportional to the number of training images stored in the memory, where 100% corresponds to storing 10% of the whole training set. |
| Hardware Specification | Yes | All experiments in the main text are reproducible on a single NVIDIA GeForce 2080 Ti GPU. |
| Software Dependencies | No | We mainly used PyTorch (Paszke et al., 2019) for our implementations and datasets are freely accessible... The paper mentions PyTorch but does not provide a specific version number for it or other key software components like Python or CUDA. |
| Experiment Setup | Yes | We utilized the SGD optimizer with a learning rate of 0.01 and Nesterov momentum of 0.95. We trained all our models for 100 epochs. |
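The task construction quoted in the Dataset Splits row (randomly splitting a 100-class dataset into 10 tasks with an equal number of classes each) can be sketched as follows. This is an illustrative sketch only, not the authors' released code; the function name and seed are assumptions.

```python
import random

def split_into_tasks(num_classes=100, num_tasks=10, seed=0):
    # Randomly partition class IDs into equally sized tasks,
    # as described for CIFAR-100 (100 classes -> 10 tasks of 10).
    rng = random.Random(seed)
    classes = list(range(num_classes))
    rng.shuffle(classes)
    per_task = num_classes // num_tasks
    return [classes[i * per_task:(i + 1) * per_task]
            for i in range(num_tasks)]

tasks = split_into_tasks()
```

Each element of `tasks` is the list of class IDs assigned to one task; together the ten lists cover all 100 classes exactly once.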
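The optimizer settings in the Experiment Setup row (SGD with learning rate 0.01 and Nesterov momentum 0.95, trained for 100 epochs) can be made concrete with a minimal sketch of the Nesterov update rule on a toy quadratic. The toy objective and function names are assumptions for illustration; the paper's actual training uses PyTorch's built-in optimizer.

```python
def nesterov_sgd_step(w, v, grad_fn, lr=0.01, momentum=0.95):
    # Nesterov momentum: evaluate the gradient at the look-ahead point
    # w + momentum * v, then update the velocity and the parameter.
    g = grad_fn(w + momentum * v)
    v = momentum * v - lr * g
    return w + v, v

def grad(w):
    # Gradient of the toy objective f(w) = w**2 (illustrative only).
    return 2.0 * w

w, v = 5.0, 0.0
for _ in range(100):  # the paper trains for 100 epochs
    w, v = nesterov_sgd_step(w, v, grad)
```

With these hyperparameters the iterate spirals toward the minimizer at 0, illustrating why a high momentum value still converges when paired with a small learning rate.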