Privacy-Aware Lifelong Learning
Authors: Ozan Özdenizci, Elmar Rueckert, Robert Legenstein
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate the scalability of PALL across various architectures in image classification, and provide a state-of-the-art solution that uniquely integrates lifelong learning and privacy-aware unlearning mechanisms for responsible AI applications. In Table 1 we evaluate our approach against state-of-the-art methods in lifelong learning, by extending various methods to the experimental setting of task incremental learning and unlearning. |
| Researcher Affiliation | Academia | Ozan Özdenizci1, Elmar Rueckert1, Robert Legenstein2 — 1 Chair of Cyber-Physical-Systems, Montanuniversität Leoben, Austria; 2 Institute of Machine Learning and Neural Computation, Graz University of Technology, Austria |
| Pseudocode | Yes | Our proposed methodology for privacy-aware lifelong learning is outlined in Algorithm 1. Algorithm 1 Privacy-Aware Lifelong Learning (PALL) |
| Open Source Code | Yes | Our code is available at: https://github.com/oozdenizci/PALL. Our algorithm is also outlined in Appendix A.2. |
| Open Datasets | Yes | We performed experiments with sequential CIFAR10 (S-CIFAR10: 5 tasks × 2 classes), CIFAR100 (S-CIFAR100: 10 tasks × 10 classes) and TinyImageNet (S-TinyImageNet: 20 tasks × 10 classes, 40 tasks × 5 classes, or 100 tasks × 2 classes) datasets. We used ResNet-18 and ResNet-34 models in S-CIFAR10/100 experiments which are commonly used as benchmarks in lifelong learning, and attention-based ViT-T/8 architectures with S-TinyImageNet (see Appendix A.1 for details). |
| Dataset Splits | Yes | We experimented with S-CIFAR10: 5 tasks × 2 classes, S-CIFAR100: 10 tasks × 10 classes (Krizhevsky, 2009), and S-TinyImageNet in various sequential configurations: 20 tasks × 10 classes, 40 tasks × 5 classes, and 100 tasks × 2 classes (Le & Yang, 2015). To adapt these datasets for our experimental setting, we first randomly allocate the classes in the dataset into disjoint tasks, and subsequently generate user request sequences R1:r consisting of T task learning and Nu task unlearning instructions arranged in a logically consistent manner. This process was repeated in a randomized manner for each random seed (e.g., seed 0 to 19). |
| Hardware Specification | Yes | We compare training times of each algorithm in Table A7 using an NVIDIA A40 GPU. We evaluate all algorithms on the same test sequence that begins as: R1:r = [(1, D1, L), (2, D2, L), (1, U), ...]. |
| Software Dependencies | No | The paper mentions optimizers like SGD and Adam, but does not provide specific version numbers for software libraries or frameworks (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | Training Configurations: We use a stochastic gradient descent (SGD) optimizer with momentum for 20 epochs per S-CIFAR10/100 task learning instruction, with a batch size of 32, learning rate of 0.01, and weight decay with parameter 0.0005. For S-TinyImageNet, we use an Adam optimizer for 100 epochs with a batch size of 256, and a cosine annealing learning rate scheduler with an initial value of 0.001. Here, we do not use weight decay but instead apply dropout to intermediate activations of ViT-T/8 with p = 0.1 (Steiner et al., 2022). All methods requiring a memory buffer had a total capacity of 500 and 1000 samples (evenly split across tasks) in S-CIFAR and S-TinyImageNet experiments, respectively. Unless otherwise specified, for Eq. (6) we use Nf = 50 and β = 0.5 (see Appendix A.2 for further details). |
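The "Dataset Splits" evidence describes generating user request sequences of T task learning and Nu task unlearning instructions in a logically consistent order (a task can only be unlearned after it has been learned). A minimal sketch of one such generator, assuming a simplified representation where each request is a `(task_id, op)` tuple with `op` in `{"L", "U"}` (the paper's requests also carry the task data Dt); this is a hypothetical helper, not the authors' code:

```python
import random

def make_request_sequence(num_tasks, num_unlearn, seed=0):
    """Generate a consistent request sequence: every unlearning request
    ('U') for a task appears strictly after that task's learning request
    ('L'), and each task is unlearned at most once."""
    rng = random.Random(seed)
    # Learning requests for tasks 1..T, in order.
    requests = [(t, "L") for t in range(1, num_tasks + 1)]
    # Pick which tasks receive an unlearning request (without replacement).
    for t in rng.sample(range(1, num_tasks + 1), num_unlearn):
        learn_pos = requests.index((t, "L"))
        # Insert the unlearning request anywhere after task t was learned.
        pos = rng.randint(learn_pos + 1, len(requests))
        requests.insert(pos, (t, "U"))
    return requests
```

Repeating this with a different seed per run mirrors the paper's randomized repetition (e.g., seeds 0 to 19).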
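The training configurations quoted above can be collected into a single lookup for reference. A sketch, assuming a hypothetical `training_config` helper; the hyperparameter values are transcribed from the paper's setup, while the dictionary layout is purely illustrative:

```python
def training_config(dataset):
    """Per-task training hyperparameters reported in the experiment setup."""
    if dataset in ("S-CIFAR10", "S-CIFAR100"):
        return {
            "optimizer": "SGD",        # with momentum
            "epochs": 20,
            "batch_size": 32,
            "lr": 0.01,
            "weight_decay": 0.0005,
            "buffer_size": 500,        # total, evenly split across tasks
        }
    if dataset == "S-TinyImageNet":
        return {
            "optimizer": "Adam",
            "epochs": 100,
            "batch_size": 256,
            "lr": 0.001,               # initial value, cosine annealing schedule
            "weight_decay": 0.0,       # dropout p=0.1 on ViT-T/8 used instead
            "dropout": 0.1,
            "buffer_size": 1000,
        }
    raise ValueError(f"unknown dataset: {dataset}")
```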