Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1]
Understanding Bandits with Graph Feedback
Authors: Houshuang Chen, zengfeng Huang, Shuai Li, Chihao Zhang
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose the notions of the fractional weak domination number δ and the k-packing independence number capturing upper bound and lower bound for the regret respectively. We show that the two notions are inherently connected via aligning them with the linear program of the weakly dominating set and its dual the fractional vertex packing set respectively. Based on this connection, we utilize the strong duality theorem to prove a general regret upper bound O (δ log |V |) 1 3 T 2 3 and a lower bound Ω (δ /α) 1 3 T 2 3. Our main algorithmic result is: Theorem 1. There exists an algorithm such that for any weakly observable graph, any time horizon T n3 log(n)/δ 2(G), its regret satisfies R(G, T) = O (δ (G) log n) 1 3 T 2 3. |
| Researcher Affiliation | Academia | Houshuang Chen Shanghai Jiao Tong University EMAIL Zengfeng Huang Fudan University EMAIL Shuai Li Shanghai Jiao Tong University EMAIL Chihao Zhang Shanghai Jiao Tong University EMAIL |
| Pseudocode | Yes | Algorithm 1: Online Stochastic Mirror Descent with Exploration |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments on specific datasets, therefore it does not mention publicly available datasets for training. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with datasets, thus it does not specify validation splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware, therefore no hardware specifications are mentioned. |
| Software Dependencies | No | The paper describes algorithms (OSMD, EXP3) but does not provide any specific software names or version numbers used for implementation or experiments. |
| Experiment Setup | No | The paper is theoretical and presents an algorithm (Algorithm 1) but does not provide specific experimental setup details such as hyperparameter values, training configurations, or system-level settings. |