Neural Contextual Bandits with Deep Representation and Shallow Exploration

Authors: Pan Xu, Zheng Wen, Handong Zhao, Quanquan Gu

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct experiments on contextual bandit problems based on real-world datasets, demonstrating better performance and computational efficiency of NeuralLinUCB over LinUCB and existing neural bandit algorithms such as NeuralUCB, which aligns well with our theory."
Researcher Affiliation | Collaboration | Pan Xu (California Institute of Technology), Zheng Wen (DeepMind), Handong Zhao (Adobe Research), Quanquan Gu (University of California, Los Angeles)
Pseudocode | Yes | Algorithm 1: Deep Representation and Shallow Exploration (NeuralLinUCB); Algorithm 2: Update Weight Parameters with Gradient Descent
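To illustrate the "deep representation, shallow exploration" idea named in Algorithm 1, here is a minimal NumPy sketch: LinUCB-style exploration run on the last hidden layer of a ReLU network used as the context representation. The network, reward, and all function names are illustrative assumptions, not the authors' code; only the structure (UCB over learned features) follows the algorithm's description.

```python
import numpy as np

def relu_features(x, W1, W2):
    """Toy 2-layer ReLU network (the paper uses L = 2); the last hidden
    layer serves as the feature map phi(x) for the linear bandit."""
    return np.maximum(W2 @ np.maximum(W1 @ x, 0.0), 0.0)

def ucb_action(phis, theta, A_inv, alpha):
    """Pick the arm maximizing phi^T theta + alpha * sqrt(phi^T A^{-1} phi)."""
    scores = [phi @ theta + alpha * np.sqrt(phi @ A_inv @ phi) for phi in phis]
    return int(np.argmax(scores))

def linucb_update(A, b, phi, reward):
    """Accumulate the design matrix A and response vector b, re-solve theta."""
    A += np.outer(phi, phi)
    b += reward * phi
    return A, b, np.linalg.solve(A, b)

# Toy simulation (dimensions shrunk for speed; the paper uses m = 100).
rng = np.random.default_rng(0)
d, m, K = 5, 8, 3                      # input dim, feature width, arms
W1, W2 = rng.normal(size=(m, d)), rng.normal(size=(m, m))
A, b, theta = np.eye(m), np.zeros(m), np.zeros(m)   # lambda = 1 prior
for t in range(50):
    xs = [rng.normal(size=d) for _ in range(K)]
    phis = [relu_features(x, W1, W2) for x in xs]
    a = ucb_action(phis, theta, np.linalg.inv(A), alpha=0.02)
    r = float(phis[a].sum() > 0)       # placeholder reward signal
    A, b, theta = linucb_update(A, b, phis[a], r)
```

In the full algorithm the weights `W1, W2` would also be refit by gradient descent every `H` rounds (Algorithm 2); here they stay fixed to keep the exploration step isolated.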
Open Source Code | No | The paper does not provide a statement or link for open-sourcing the code.
Open Datasets | Yes | "Specifically, following the experimental setting in Zhou et al. (2020), we use the datasets Statlog (Shuttle), Magic, and Covertype from the UCI machine learning repository (Dua & Graff, 2017), and the MNIST dataset from LeCun et al. (1998)."
Dataset Splits | No | The paper does not describe train/validation/test dataset splits.
Hardware Specification | Yes | "All numerical experiments were run on a workstation with Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz."
Software Dependencies | No | The paper mentions using a 'ReLU neural network' and 'stochastic gradient descent' but does not specify software versions for libraries such as PyTorch, TensorFlow, or scikit-learn.
Experiment Setup | Yes | "We use a ReLU neural network defined as in (2.3) with L = 2 and m = 100 for the UCI datasets (Statlog, Magic, Covertype). ... We set the time horizon T = 15,000. ... We use stochastic gradient descent to optimize the network weights, with a step size η_q = 1e-5 and maximum iteration number n = 1,000. ... the network parameter w is updated every H = 100 rounds. ... We set λ = 1 and α_t = 0.02 for all algorithms, t ∈ [T]."
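For anyone attempting a reproduction, the quoted hyperparameters can be collected into a single configuration sketch. The values are exactly those stated above; the dictionary layout and key names are assumptions for illustration only.

```python
# Hyperparameters quoted in the reproducibility report above.
# The structure of this dict is illustrative, not from the paper.
experiment_config = {
    "network": {"type": "ReLU", "L": 2, "m": 100},  # for UCI datasets
    "horizon_T": 15_000,
    "optimizer": {
        "name": "SGD",          # stochastic gradient descent
        "step_size": 1e-5,      # eta_q
        "max_iterations": 1_000,  # n
    },
    "weight_update_period_H": 100,  # refit network weights every H rounds
    "regularization_lambda": 1.0,
    "exploration_alpha": 0.02,      # alpha_t, constant for all t in [T]
}
```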