Linear Multi-Resource Allocation with Semi-Bandit Feedback
Authors: Tor Lattimore, Koby Crammer, Csaba Szepesvari
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present two experiments to demonstrate the behaviour of Algorithm 2. All code and data is available in the supplementary material. |
| Researcher Affiliation | Academia | Tor Lattimore, Department of Computing Science, University of Alberta, Canada, EMAIL; Koby Crammer, Department of Electrical Engineering, The Technion, Israel, EMAIL; Csaba Szepesvári, Department of Computing Science, University of Alberta, Canada, EMAIL |
| Pseudocode | Yes | Algorithm 1 and Algorithm 2 are presented in the paper. |
| Open Source Code | Yes | All code and data is available in the supplementary material. |
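| Open Datasets | Yes | All code and data is available in the supplementary material. For this experiment we used $D = K = 2$ and $n = 10^6$ and $\nu = \begin{pmatrix} 8/10 & 2/10 \\ 4/10 & 2 \end{pmatrix}$ and $M = \begin{pmatrix} 1 & 0 \\ 1/2 & 1/2 \end{pmatrix}$, where the $k$th column is the parameter/allocation for the $k$th task. We fix $n = 5 \times 10^5$ and $D = K = 2$. For $\alpha \in (0, 1)$ we define $\nu_\alpha = \begin{pmatrix} 1/2 & \alpha/2 \\ 1/2 & \alpha/2 \end{pmatrix}$ and $M = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$. |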
| Dataset Splits | No | The paper describes a sequential online learning problem and does not explicitly state traditional training/validation/test dataset splits. It refers to a time horizon 'n' and average regret over runs. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., GPU/CPU models, memory details). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For this experiment we used $D = K = 2$ and $n = 10^6$ and $\nu = \begin{pmatrix} 8/10 & 2/10 \\ 4/10 & 2 \end{pmatrix}$ and $M = \begin{pmatrix} 1 & 0 \\ 1/2 & 1/2 \end{pmatrix}$, where the $k$th column is the parameter/allocation for the $k$th task. We ran two versions of the algorithm. The first, exactly as given in Algorithm 2, and the second identical except that the weights were fixed to $\gamma_{tk} = 4$ for all $t$ and $k$ (this value is chosen because it corresponds to the minimum inverse variance for a Bernoulli variable). The data was produced by taking the average regret over 8 runs. We fix $n = 5 \times 10^5$ and $D = K = 2$. For $\alpha \in (0, 1)$ we define $\nu_\alpha = \begin{pmatrix} 1/2 & \alpha/2 \\ 1/2 & \alpha/2 \end{pmatrix}$ and $M = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$. |
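
The experiment parameters quoted in the table can be written down as a small configuration sketch. This is only an illustration, not the authors' code: all names (`NU`, `M`, `nu_alpha`, `column`, `bernoulli_inverse_variance`) are hypothetical, and the 2x2 layout assumes the flattened matrix entries in the extracted text are read row by row. The last part checks numerically why a fixed weight of 4 matches the minimum inverse variance of a Bernoulli variable, since the variance p(1 - p) is largest at p = 1/2.

```python
# Illustrative encoding of the quoted experiment parameters.
# All names here are hypothetical; they do not come from the paper's code.
# Assumption: the flattened matrices are 2x2 and listed row by row.

N_EXPERIMENT_1 = 10**6        # horizon n in the first experiment
N_EXPERIMENT_2 = 5 * 10**5    # horizon n in the second experiment

# First experiment: the k-th column is the parameter/allocation for task k.
NU = [[8 / 10, 2 / 10],
      [4 / 10, 2.0]]
M = [[1.0, 0.0],
     [0.5, 0.5]]

# Second experiment: allocation matrix and a parameter family indexed by alpha.
M_ALPHA = [[1.0, 0.0],
           [1.0, 0.0]]


def nu_alpha(alpha):
    """Return the 2x2 parameter matrix for a given alpha in (0, 1)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return [[0.5, alpha / 2],
            [0.5, alpha / 2]]


def column(matrix, k):
    """Extract the k-th column (0-indexed) of a matrix stored as nested lists."""
    return [row[k] for row in matrix]


def bernoulli_inverse_variance(p):
    """Inverse variance 1 / (p (1 - p)) of a Bernoulli(p) variable, 0 < p < 1."""
    return 1.0 / (p * (1.0 - p))


# Scan a grid of success probabilities; the minimum is attained at p = 1/2.
grid = [i / 100 for i in range(1, 100)]
min_inverse_variance = min(bernoulli_inverse_variance(p) for p in grid)
```

For example, `column(NU, 0)` gives the parameter vector of the first task under the row-by-row reading, and `min_inverse_variance` agrees with the fixed weight used in the second version of the algorithm.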