Contextual Bandits for Unbounded Context Distributions

Authors: Puning Zhao, Rongfei Fan, Shaowei Wang, Li Shen, Qixin Zhang, Zong Ke, Tianhang Zheng

ICML 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "7. Numerical Experiments. To begin with, to validate our theoretical analysis, we run experiments using some synthesized data. We then move on to experiments with the MNIST dataset (LeCun, 1998). ... From these experiments, it can be observed that the adaptive nearest neighbor method significantly outperforms the other baselines."
Researcher Affiliation | Academia | "1 Shenzhen Campus of Sun Yat-sen University, Shenzhen, China; 2 Guangming Laboratory, Shenzhen, China; 3 Beijing Institute of Technology, Beijing, China; 4 Guangzhou University, Guangzhou, China; 5 Nanyang Technological University, Singapore; 6 National University of Singapore, Singapore; 7 Zhejiang University, Hangzhou, China. Correspondence to: Li Shen <EMAIL>."
Pseudocode | Yes | Algorithm 1 and Algorithm 2, both titled "Adaptive nearest neighbor with UCB exploration".
Open Source Code | No | The paper does not provide any statement about making its source code publicly available, nor does it include a link to a code repository. There is no mention of code in supplementary materials either.
Open Datasets | Yes | "7.2. Real Data. Now we run experiments using the MNIST dataset (LeCun, 1998), which contains 60,000 images of handwritten digits with size 28×28."
Dataset Splits | No | "Now we run experiments using the MNIST dataset (LeCun, 1998), which contains 60,000 images of handwritten digits with size 28×28. Following the settings in (Guan & Jiang, 2018), the images are regarded as contexts, and there are 10 actions from 0 to 9. The reward is 1 if the selected action equals the true label, and 0 otherwise." The paper does not explicitly state training, validation, or test splits for the MNIST dataset.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory used for running the experiments. It only mentions conducting numerical experiments.
Software Dependencies | No | The paper does not specify any software dependencies or versions, such as programming languages, libraries, or frameworks used for the implementation or experiments.
Experiment Setup | Yes | "In each experiment, we run T = 1,000 steps and compare the performance of the adaptive nearest neighbor method with the UCBogram (Rigollet & Zeevi, 2010), ABSE (Perchet & Rigollet, 2013), and the fixed-k nearest neighbor method. For a fair comparison, for UCBogram and ABSE, we try different numbers of bins and pick only the one with the best performance. The values in each curve in Figure 1 are averaged over m = 100 random and independent trials." The figure legends also list fixed k values such as "fixed, k=15" and "fixed, k=10".
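The MNIST setting quoted above (images as contexts, the 10 digit classes as actions, reward 1 if the chosen action matches the true label) can be sketched as a bandit-feedback wrapper around any labeled dataset. This is an illustrative sketch, not code from the paper; the function name `classification_bandit_feedback` and the `policy` interface are assumptions.

```python
import numpy as np

def classification_bandit_feedback(contexts, labels, policy):
    """Simulate bandit feedback from a labeled dataset (illustrative sketch).

    Each context (e.g. an MNIST image) is shown to the policy, which picks
    one class action; only the reward of the chosen action is revealed:
    1 if it matches the true label, 0 otherwise.
    """
    rewards = []
    for x, y in zip(contexts, labels):
        a = policy(x)                        # chosen action for this context
        rewards.append(1.0 if a == y else 0.0)
    return np.array(rewards)
```

For instance, a constant policy that always plays one class earns average reward equal to that class's frequency in the data, which is the chance-level baseline a learner should beat.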
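The experiment loop described in the last row can be illustrated with a fixed-k nearest neighbor learner with a UCB bonus on synthetic data. This is a minimal sketch of the fixed-k baseline, not the paper's adaptive Algorithm 1; the function name, the bonus form, and the values k=10 and alpha=1.0 are all assumptions made here for illustration.

```python
import numpy as np

def knn_ucb_bandit(contexts, reward_fn, n_actions, k=10, alpha=1.0):
    """Fixed-k nearest neighbor contextual bandit with a UCB bonus (sketch).

    For each action, the mean reward over the k nearest past contexts on
    which that action was played is combined with an exploration bonus
    that shrinks as neighbors accumulate.
    """
    hist_x = [[] for _ in range(n_actions)]   # contexts seen per action
    hist_r = [[] for _ in range(n_actions)]   # rewards observed per action
    total = 0.0
    for t, x in enumerate(contexts):
        scores = np.empty(n_actions)
        for a in range(n_actions):
            if not hist_x[a]:
                scores[a] = np.inf            # untried action: explore first
                continue
            d = np.linalg.norm(np.asarray(hist_x[a]) - x, axis=1)
            near = np.argsort(d)[:k]          # indices of the k nearest neighbors
            mean = float(np.mean(np.asarray(hist_r[a])[near]))
            scores[a] = mean + alpha * np.sqrt(np.log(t + 1) / len(near))
        a = int(np.argmax(scores))
        r = reward_fn(x, a)
        hist_x[a].append(x)
        hist_r[a].append(r)
        total += r
    return total
```

On a synthetic two-action problem where action 1 is optimal exactly when the first context coordinate is positive, this learner should accumulate clearly more than the chance-level reward of T/2 over T rounds.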