Cooperative Online Learning with Feedback Graphs

Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Riccardo Della Vecchia

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic data confirm our theoretical findings.
Researcher Affiliation | Academia | Nicolò Cesa-Bianchi, EMAIL, Politecnico di Milano & Università degli Studi di Milano, Milano, Italy; Tommaso Cesari, EMAIL, School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada; Riccardo Della Vecchia, EMAIL, Inria, Université de Lille, CNRS, Centrale Lille, UMR 9189 CRIStAL
Pseudocode | Yes | Algorithm 1: Exp3-α² (locally run by each agent v ∈ A)
Open Source Code | Yes | Our code is available at Della Vecchia (2024). Riccardo Della Vecchia. Cooperative online learning with feedback graphs. https://github.com/riccardodv/COOP-learning, 2024.
Open Datasets | No | Experiments on synthetic data confirm our theoretical findings. In our experiments, we fix the time horizon (T = 10,000), the number of arms (K = 20), and the number of agents (A = 20). ... The feedback graph F and the communication graph N are Erdős-Rényi random graphs of parameters p_N, p_F ∈ {0.2, 0.8}.
Dataset Splits | No | The paper uses synthetic data generated from specified parameters (time horizon, number of arms, number of agents, graph types, etc.) and does not describe traditional dataset splits (e.g., train/test/validation percentages or counts) for a pre-existing dataset. It mentions 20 repetitions of each experiment for statistical averaging.
Hardware Specification | Yes | Experiments were run on a local cluster of CPUs (Intel Xeon E5-2623 v3, 3.00GHz), parallelizing the code over four cores.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | In our experiments, we fix the time horizon (T = 10,000), the number of arms (K = 20), and the number of agents (A = 20). We also set the delay δN to 1. The loss of each action is a Bernoulli random variable of parameter 1/2, except for the optimal action, which has parameter 1/2 − √(K/T). The activation probabilities q(v) are the same for all agents v ∈ A, and range in the set {0.05, 0.5, 1}. This implies that Q ∈ {1, 10, 20}. The feedback graph F and the communication graph N are Erdős-Rényi random graphs of parameters p_N, p_F ∈ {0.2, 0.8}. For each choice of the parameters, the same realization of N and F was kept fixed in all the experiments, see Figure 1.
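
The synthetic setup described in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code (which is in the repository linked above). It makes several assumptions: both graphs are undirected and have no self-loops, N is drawn over the A agents and F over the K arms, agents activate independently, the optimal action's parameter is taken to be 1/2 − √(K/T), and Q is read as the sum of the activation probabilities. All function names and the choice of arm 0 as the optimal arm are hypothetical.

```python
import math
import random

# Values quoted in the experiment setup.
T = 10_000                           # time horizon
K = 20                               # number of arms
A = 20                               # number of agents
ACTIVATION_PROBS = (0.05, 0.5, 1.0)  # common value of q(v)
EDGE_PROBS = (0.2, 0.8)              # values of p_N and p_F


def erdos_renyi(n, p, rng):
    """Sample an undirected Erdős-Rényi graph G(n, p) as adjacency sets."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:  # each pair is linked independently
                adj[u].add(v)
                adj[v].add(u)
    return adj


def loss_means(num_arms, horizon, best_arm=0):
    """Bernoulli parameters: 1/2 for every arm, 1/2 - sqrt(K/T) for the best arm."""
    means = [0.5] * num_arms
    means[best_arm] = 0.5 - math.sqrt(num_arms / horizon)  # best_arm is hypothetical
    return means


def sample_round(means, q, num_agents, rng):
    """Draw one round: the list of active agents and a 0/1 loss per arm."""
    # Assumption: agents activate independently, each with probability q.
    active = [v for v in range(num_agents) if rng.random() < q]
    losses = [1 if rng.random() < m else 0 for m in means]
    return active, losses


def expected_active(q, num_agents):
    """Sum of the activation probabilities (read here as the quantity Q)."""
    return q * num_agents
```

For example, `rng = random.Random(0)` followed by `N = erdos_renyi(A, 0.2, rng)` and `F = erdos_renyi(K, 0.8, rng)` draws one realization of each graph, which can then be kept fixed across repetitions as the setup describes. With q = 0.05, 0.5, and 1, `expected_active(q, A)` gives 1, 10, and 20, matching the stated values of Q.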