Contextual Bandits for Unbounded Context Distributions
Authors: Puning Zhao, Rongfei Fan, Shaowei Wang, Li Shen, Qixin Zhang, Zong Ke, Tianhang Zheng
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 7. Numerical Experiments: To begin with, to validate our theoretical analysis, we run experiments using some synthesized data. We then move on to experiments with the MNIST dataset (Le Cun, 1998). ... From these experiments, it can be observed that the adaptive nearest neighbor method significantly outperforms the other baselines. |
| Researcher Affiliation | Academia | 1 Shenzhen Campus of Sun Yat-sen University, Shenzhen, China 2 Guangming Laboratory, Shenzhen, China 3 Beijing Institute of Technology, Beijing, China 4 Guangzhou University, Guangzhou, China 5 Nanyang Technological University, Singapore 6 National University of Singapore, Singapore 7 Zhejiang University, Hangzhou, China. Correspondence to: Li Shen <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Adaptive nearest neighbor with UCB exploration and Algorithm 2 Adaptive nearest neighbor with UCB exploration |
| Open Source Code | No | The paper does not provide any statement about making its source code publicly available, nor does it include a link to a code repository. There is no mention of code in supplementary materials either. |
| Open Datasets | Yes | 7.2. Real Data: Now we run experiments using the MNIST dataset (Le Cun, 1998), which contains 60,000 images of handwritten digits with size 28 × 28. |
| Dataset Splits | No | Now we run experiments using the MNIST dataset (Le Cun, 1998), which contains 60,000 images of handwritten digits with size 28 × 28. Following the settings in (Guan & Jiang, 2018), the images are regarded as contexts, and there are 10 actions from 0 to 9. The reward is 1 if the selected action equals the true label, and 0 otherwise. The paper does not explicitly state the training, validation, or test splits for the MNIST dataset. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory used for running the experiments. It only mentions conducting numerical experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies or versions, such as programming languages, libraries, or frameworks used for the implementation or experiments. |
| Experiment Setup | Yes | In each experiment, we run T = 1,000 steps and compare the performance of the adaptive nearest neighbor method with the UCBogram (Rigollet & Zeevi, 2010), ABSE (Perchet & Rigollet, 2013) and fixed k nearest neighbor method. For a fair comparison, for UCBogram and ABSE, we try different numbers of bins and only pick the one with the best performance. The values in each curve in Figure 1 are averaged over m = 100 random and independent trials. The figure legends also mention fixed k values like "fixed, k=15", "fixed, k=10", etc. |
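
Since the paper releases no code, the following is only a minimal sketch of the evaluation protocol described in the table: a contextual bandit where the reward is 1 if the chosen action equals the true label and 0 otherwise, run for T steps and averaged over m independent trials. It uses a fixed-k nearest neighbor rule with a UCB bonus, not the paper's adaptive method. The function names, the 1-D Gaussian contexts, the labeling rule, and the form of the exploration bonus are all assumptions made for illustration.

```python
import math
import random


def knn_ucb_action(history, x, n_actions, k, t):
    """Pick the action with the largest k-NN mean reward plus a UCB bonus.

    history holds (context, action, reward) triples from earlier steps.
    The bonus sqrt(log(t + 1) / n) is an assumed form, not the paper's.
    """
    best_action, best_score = 0, -math.inf
    for a in range(n_actions):
        # Distances and rewards of past steps where action a was played.
        past = [(abs(cx - x), r) for cx, act, r in history if act == a]
        if not past:
            return a  # an untried action is explored first
        nearest = sorted(past)[:k]
        mean = sum(r for _, r in nearest) / len(nearest)
        bonus = math.sqrt(math.log(t + 1) / len(nearest))
        if mean + bonus > best_score:
            best_action, best_score = a, mean + bonus
    return best_action


def run_trial(T, k, n_actions, rng):
    """Run one trial of T steps and return the average reward per step."""
    history, total_reward = [], 0
    for t in range(1, T + 1):
        x = rng.gauss(0.0, 1.0)  # context with unbounded support (assumed)
        # Hypothetical labeling rule: unit-width bands, clamped at the ends.
        label = min(n_actions - 1, max(0, math.floor(x + n_actions / 2)))
        a = knn_ucb_action(history, x, n_actions, k, t)
        r = 1 if a == label else 0  # reward rule as described in the table
        history.append((x, a, r))
        total_reward += r
    return total_reward / T


def average_reward(T=200, k=5, n_actions=2, m=5, seed=0):
    """Average the per-step reward over m independent trials."""
    rng = random.Random(seed)
    return sum(run_trial(T, k, n_actions, rng) for _ in range(m)) / m


if __name__ == "__main__":
    print(average_reward())
```

The defaults are deliberately small so the sketch runs quickly in pure Python; the paper's reported setting corresponds to `T=1000` and `m=100`, and different fixed values of `k` can be compared by changing the `k` argument.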