Distributed Multi-Agent Lifelong Learning
Authors: Prithviraj Tarale, Edward Rietman, Hava T Siegelmann
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our solutions on vision datasets CIFAR-100 and MiniImageNet. |
| Researcher Affiliation | Academia | Prithviraj P. Tarale EMAIL Department of Computer Science University of Massachusetts Amherst Amherst, MA 01003 |
| Pseudocode | Yes | Algorithm 1 The PEEPLL Algorithm Algorithm 2 Evaluating TRUE Confidence |
| Open Source Code | Yes | Our code is available at https://github.com/Prithvitarale/peepll. |
| Open Datasets | Yes | We evaluate our solutions on vision datasets CIFAR-100 and MiniImageNet. |
| Dataset Splits | No | The data in the training set is divided into two parts: (1.1) pretraining data for all agents and (1.2) lifelong learning data for the QA. The pretraining data (1.1) is further divided into (1.1.Q-Pre) and (1.1.R-Pre). (1.1.Q-Pre) contains data from 5 classes assigned for the QA's pretraining. (1.1.R-Pre) contains data from classes assigned for the pretraining of the RAs. These classes are distributed among RAs, ensuring that 2-3 RAs are assigned to each class to foster diverse responses. Each RA is trained on data from its assigned classes in (1.1.R-Pre). From the lifelong learning data (1.2), the data for classes not in (1.1.Q-Pre) are extracted and referred to as (1.2.LL). (1.2.LL) is segmented into distinct tasks, each comprising a unique set of classes to facilitate the Class-Incremental Scenario as described by Aljundi et al. (2019b;a); Mai et al. (2020). Task_t has classes C_t absent in previously seen tasks Task_{1:(t-1)}, such that C_t ⊆ C \ C_{1:(t-1)}. The QA is incrementally introduced to a Task_t, which it learns in communication with the other agents (RAs) in the network. |
| Hardware Specification | No | The paper does not explicitly mention specific hardware (e.g., GPU models, CPU models, memory) used for running its experiments. It mentions 'edge conditions' in a general sense, but not specific hardware specifications for the experimental setup. |
| Software Dependencies | No | We implemented conventional single-agent LL strategies using Avalanche. We used maximum memory as 5k, the batch size of memory as 5k, and the Class Balanced Buffer style of memory with no adaptive size. This is the same for our implementations of our PEEPLL strategies. While Avalanche is mentioned, a specific version number is not provided. Other software components or libraries with version numbers are not specified. |
| Experiment Setup | Yes | We use VGG16 (Simonyan & Zisserman, 2015) as the backbone for our agents under PEEPLL. We deviate from current lifelong learning research, which typically uses ResNet18 (He et al., 2015). Our methodology utilizes VGG16 due to its lack of skip connections, enabling the isolation of our lifelong learning strategies' impact. ResNet18's skip connections, known to mitigate vanishing gradients, could introduce confounding factors. PEEPLL models employ VGG16 as the encoder with a small MLP decoder for task-specific outputs. The total parameters of our PEEPLL model were 15417124. We use the same optimizer hyperparameters for pretraining and lifelong learning, as it would not be practical to tune those parameters further for online learning. We implemented conventional single-agent LL strategies using Avalanche. We used a maximum memory of 5k, a memory batch size of 5k, and the Class Balanced Buffer style of memory with no adaptive size. This is the same for our implementations of our PEEPLL strategies. |
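The memory configuration described above (5k maximum memory, 5k memory batch size, class-balanced buffer with no adaptive size) can be illustrated with a minimal sketch of a class-balanced replay buffer. This is an assumption-laden illustration of the general technique, not the paper's code or Avalanche's actual API; all class and method names here are hypothetical.

```python
import random
from collections import defaultdict

class ClassBalancedReplayBuffer:
    """Sketch of a class-balanced replay buffer with a fixed
    (non-adaptive) capacity. Names are illustrative, not Avalanche's API."""

    def __init__(self, max_size=5000):  # paper reports max memory of 5k
        self.max_size = max_size
        self.buffer = defaultdict(list)  # class label -> stored examples

    def update(self, examples, labels):
        """Add new examples, then trim so every seen class keeps an
        equal share of the fixed budget (no adaptive resizing)."""
        for x, y in zip(examples, labels):
            self.buffer[y].append(x)
        quota = self.max_size // max(1, len(self.buffer))
        for y in self.buffer:
            if len(self.buffer[y]) > quota:
                self.buffer[y] = random.sample(self.buffer[y], quota)

    def sample(self, batch_size=5000):  # paper reports a memory batch of 5k
        """Draw a replay batch uniformly from the stored examples."""
        pool = [(x, y) for y, xs in self.buffer.items() for x in xs]
        return random.sample(pool, min(batch_size, len(pool)))
```

With `max_size` and `batch_size` both set to 5000, the replay batch effectively covers the whole buffer each step, matching the reported configuration; the per-class quota is what keeps rehearsal balanced across classes in the class-incremental scenario.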