Efficient Graph Bandit Learning with Side-Observations and Switching Constraints
Authors: Xueping Gong, Jiheng Zhang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive numerical experiments on various types of graphs, including two real-world datasets, demonstrate the efficacy of our proposed methods and their advantages over benchmark methods in graph bandit settings. |
| Researcher Affiliation | Academia | Xueping Gong, Jiheng Zhang. The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong S.A.R., China. |
| Pseudocode | Yes | Algorithm 1: UCB-GGmax; Algorithm 2: LP-GG |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the methodology described is publicly available. |
| Open Datasets | Yes | We conduct numerical experiments on the dataset of roads in a small area of California (Leskovec and Krevl 2014). We conduct numerical experiments on the dataset of products in Amazon (Leskovec and Krevl 2014). Reference: Leskovec, J.; and Krevl, A. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. http://snap.stanford.edu/data. |
| Dataset Splits | No | The paper describes how reward distributions are initialized and graphs are generated for simulations, and mentions the use of real-world datasets, but it does not specify any training/test/validation splits for these datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. It only mentions running simulations. |
| Software Dependencies | No | The paper mentions implementing Q-learning and its hyperparameters but does not specify any software libraries, frameworks, or their version numbers used for the implementation or experiments. |
| Experiment Setup | Yes | Q-learning (Sutton and Barto 2018): we choose the learning rate 0.5, the discount factor 0.9, and the exploration probability min(1, 2|V|/T). |
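
The Q-learning settings in the Experiment Setup row can be illustrated with a minimal sketch of tabular, epsilon-greedy Q-learning on a graph. Only the three hyperparameters (learning rate 0.5, discount factor 0.9, exploration probability min(1, 2|V|/T)) come from the paper. Everything else is an assumption made for illustration: the function names, the adjacency-dict graph representation, the Gaussian reward noise, and the choice of treating the current node as the state and the next node as the action. This is not the authors' implementation, which is not publicly available.

```python
import random


def exploration_probability(num_nodes, horizon):
    """Exploration probability min(1, 2|V|/T) as stated in the table."""
    return min(1.0, 2.0 * num_nodes / horizon)


def q_learning_on_graph(neighbors, mean_reward, horizon, start,
                        alpha=0.5, gamma=0.9, noise_std=0.1, seed=0):
    """Tabular epsilon-greedy Q-learning where the agent walks along edges.

    neighbors   : dict mapping node -> list of reachable nodes (hypothetical format;
                  every node must have at least one entry)
    mean_reward : dict mapping node -> mean reward (hypothetical reward model)
    alpha, gamma: learning rate 0.5 and discount factor 0.9 from the paper
    noise_std   : Gaussian noise level, an assumption not taken from the paper
    """
    rng = random.Random(seed)
    eps = exploration_probability(len(neighbors), horizon)

    # Q[v][u] estimates the value of moving from node v to neighbor u.
    q = {v: {u: 0.0 for u in nbrs} for v, nbrs in neighbors.items()}

    state = start
    total_reward = 0.0
    for _ in range(horizon):
        actions = neighbors[state]
        if rng.random() < eps:
            # Explore: pick a uniformly random neighbor.
            nxt = rng.choice(actions)
        else:
            # Exploit: pick the neighbor with the highest current estimate.
            nxt = max(actions, key=lambda u: q[state][u])

        # Observe a noisy reward at the node the agent moved to.
        reward = rng.gauss(mean_reward[nxt], noise_std)

        # Standard Q-learning update toward the bootstrapped target.
        best_next = max(q[nxt].values())
        q[state][nxt] += alpha * (reward + gamma * best_next - q[state][nxt])

        state = nxt
        total_reward += reward

    return q, total_reward
```

With |V| = 5 and T = 1000, the exploration probability in this sketch is 0.01; it is capped at 1 whenever T is at most 2|V|, in which case every step is a uniformly random move.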