Cooperative Online Learning with Feedback Graphs
Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Riccardo Della Vecchia
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on synthetic data confirm our theoretical findings. |
| Researcher Affiliation | Academia | Nicolò Cesa-Bianchi EMAIL Politecnico di Milano & Università degli Studi di Milano, Milano, Italy; Tommaso Cesari EMAIL School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada; Riccardo Della Vecchia EMAIL Inria, Université de Lille, CNRS, Centrale Lille, UMR 9189 CRIStAL |
| Pseudocode | Yes | Algorithm 1: Exp3-α² (Locally run by each agent v ∈ A) |
| Open Source Code | Yes | Our code is available at Della Vecchia (2024). Riccardo Della Vecchia. Cooperative online learning with feedback graphs. https://github.com/riccardodv/COOP-learning, 2024. |
| Open Datasets | No | Experiments on synthetic data confirm our theoretical findings. In our experiments, we fix the time horizon (T = 10,000), the number of arms (K = 20), and the number of agents (A = 20). ... The feedback graph F and the communication graph N are Erdős–Rényi random graphs of parameters p_N, p_F ∈ {0.2, 0.8}. |
| Dataset Splits | No | The paper uses synthetic data generated based on specified parameters (time horizon, number of arms, agents, graph types, etc.) but does not describe traditional dataset splits (e.g., train/test/validation percentages or counts) for a pre-existing dataset. It mentions 20 repetitions of each experiment for statistical averaging. |
| Hardware Specification | Yes | Experiments were run on a local cluster of CPUs (Intel Xeon E5-2623 v3, 3.00GHz), parallelizing the code over four cores. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | In our experiments, we fix the time horizon (T = 10,000), the number of arms (K = 20), and the number of agents (A = 20). We also set the delay δN to 1. The loss of each action is a Bernoulli random variable of parameter 1/2, except for the optimal action, which has parameter 1/2 − √(K/T). The activation probabilities q(v) are the same for all agents v ∈ A, and range in the set {0.05, 0.5, 1}. This implies that Q ∈ {1, 10, 20}. The feedback graph F and the communication graph N are Erdős–Rényi random graphs of parameters p_N, p_F ∈ {0.2, 0.8}. For each choice of the parameters, the same realization of N and F was kept fixed in all the experiments, see Figure 1. |
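
The synthetic setup in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code (see the linked repository for that). It rests on several assumptions: the optimal arm's Bernoulli parameter is read as 1/2 − √(K/T), both graphs are undirected and stored without self-loops, and Q is taken to be the sum of the activation probabilities q(v) over all agents. The function names, the seed, and the choice of arm 0 as the optimal arm are all hypothetical.

```python
import math
import random

T = 10_000  # time horizon
K = 20      # number of arms
A = 20      # number of agents


def erdos_renyi(n, p, rng):
    """Adjacency sets of an undirected G(n, p) graph (assumption: no self-loops)."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # each edge is present independently with prob. p
                adj[i].add(j)
                adj[j].add(i)
    return adj


def loss_means(k, horizon, best_arm=0):
    """Bernoulli parameters: 1/2 everywhere, 1/2 - sqrt(k/horizon) for the best arm.

    The index of the best arm is a hypothetical choice made for illustration.
    """
    means = [0.5] * k
    means[best_arm] = 0.5 - math.sqrt(k / horizon)
    return means


def sample_round(means, q, num_agents, rng):
    """Draw one round: a 0/1 loss per arm and the set of active agents."""
    losses = [1 if rng.random() < m else 0 for m in means]
    active = [v for v in range(num_agents) if rng.random() < q]
    return losses, active


rng = random.Random(0)          # hypothetical seed, for reproducibility only
N = erdos_renyi(A, 0.2, rng)    # communication graph over agents, p_N = 0.2
F = erdos_renyi(K, 0.8, rng)    # feedback graph over arms, p_F = 0.8
means = loss_means(K, T)

# Assumed definition: Q is the sum of q(v) over the A agents (same q for all).
Q_values = [round(A * q, 10) for q in (0.05, 0.5, 1)]

losses, active = sample_round(means, 0.5, A, rng)
```

Keeping `N` and `F` fixed after they are drawn once mirrors the statement that the same realization of the graphs was reused across experiments; only the losses and activations are resampled in each round.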