Multi-Player Bandits: The Adversarial Case
Authors: Pragnya Alatur, Kfir Y. Levy, Andreas Krause
JMLR 2020 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We consider a setting where multiple players sequentially choose among a common set of actions (arms). ... In this work, we design the first multi-player Bandit algorithm that provably works in arbitrarily changing environments... We run experiments with three different setups and compare the performance of C&P to the Musical Chairs algorithm (MC) from Rosenski et al. (2016). ... For each setup, we create a plot that shows the average regret and the standard deviation (as a colored region around the average). |
| Researcher Affiliation | Academia | Kfir Y. Levy EMAIL Faculty of Electrical Engineering Technion Israel Institute of Technology Haifa, 3200003, Israel Andreas Krause EMAIL Department of Computer Science ETH Zurich 8092 Zürich, Switzerland |
| Pseudocode | Yes | Algorithm 1 K-Metaplayer algorithm (Input: η) Algorithm 2 C&P Ranking Algorithm 3 C&P Coordinator algorithm Algorithm 4 C&P Follower algorithm Algorithm 5 Sampling from a K-DPP (based on Algorithm 1 from Kulesza and Taskar (2012)) Algorithm 6 Sampling K eigenvectors (Algorithm 8 in Kulesza and Taskar (2012)) Algorithm 7 Sub-algorithm for Alg. 6: Computing elementary symmetric polynomials (Algorithm 7 in Kulesza and Taskar (2012)) |
| Open Source Code | No | The paper does not contain any explicit statements about code availability, such as a link to a repository, a mention of supplementary materials containing code, or phrases indicating code release. |
| Open Datasets | No | We choose N mean losses in [0,1] u.a.r. with a gap of at least 0.05 between the Kth and (K + 1)-th best arms. For each arm, the losses are sampled i.i.d. from a Bernoulli distribution with the selected means. |
| Dataset Splits | No | The paper describes generating synthetic data for its experiments but does not mention any training/test/validation splits. The data is generated on the fly based on specified parameters for the Bernoulli distribution. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU or CPU models, or memory specifications. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, frameworks, or solvers). |
| Experiment Setup | Yes | For all experiments, we set N = 8, K = 4, T = 240000, TR = 20 and T0 = 3000. ... We initially set the mean loss µi for each arm i as follows: µ1 = µ2 = µ3 = µ4 = 0.1 and µ5 = µ6 = µ7 = µ8 = 0.3. Each arm i s losses are sampled i.i.d. from Bernoulli distribution Ber(µi). At time T/4, link" (arm) 1 fails and its remaining losses are sampled i.i.d. from Ber(0.9). |