Instance-optimal PAC Algorithms for Contextual Bandits
Authors: Zhaoqi Li, Lillian Ratliff, houssam nassif, Kevin G. Jamieson, Lalit Jain
NeurIPS 2022 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We characterize the first instance-dependent PAC sample complexity of contextual bandits through a quantity , and provide matching upper and lower bounds in terms of for the agnostic and linear contextual best-arm identification settings. We show that no algorithm can be simultaneously minimax-optimal for regret minimization and instance-dependent PAC for best-arm identification. Our main result is a new instance-optimal and computationally efficient algorithm that relies on a polynomial number of calls to an argmax oracle. |
| Researcher Affiliation | Collaboration | Zhaoqi Li Department of Statistics University of Washington EMAIL Lillian Ratliff Department of Electrical and Computer Engineering University of Washington EMAIL Houssam Nassif Amazon EMAIL Kevin Jamieson Allen School of Computer Science & Engineering University of Washington EMAIL Lalit Jain Foster School of Business University of Washington EMAIL |
| Pseudocode | Yes | Algorithm 1 Elimination Contextual RAGE Algorithm 2 Non-elimination Contextual RAGE Algorithm 3 Contextual Oracle-efficient Dualized Algorithm (CODA) |
| Open Source Code | No | The paper does not provide a link to open-source code or explicitly state that code for the methodology is available. The checklist also states '[N/A]' for code reproduction. |
| Open Datasets | No | The paper is theoretical and does not conduct empirical experiments on a publicly available dataset. It makes an assumption about access to a large dataset of contexts D (Assumption 1) but does not specify a real-world public dataset with concrete access information. The checklist also states '[N/A]' for experiments. |
| Dataset Splits | No | The paper is theoretical and does not conduct empirical experiments, thus no training/validation/test splits are specified. The checklist also states '[N/A]' for experiments. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for experiments. The checklist also states '[N/A]' for experiments. |
| Software Dependencies | No | The paper is theoretical and does not describe any specific software dependencies with version numbers used for experiments. The checklist also states '[N/A]' for experiments. |
| Experiment Setup | No | The paper is theoretical and does not describe specific experimental setup details such as hyperparameters. The checklist also states '[N/A]' for experiments. |