Towards Domain Adaptive Neural Contextual Bandits

Authors: Ziyan Wang, Xiaoming Huo, Hao Wang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results show that our approach outperforms the state-of-the-art contextual bandit algorithms on real-world datasets. Code will soon be available at https://github.com/Wang-ML-Lab/DABand. In this section, we compare DABand with existing methods on real-world datasets. To demonstrate the effectiveness of our DABand, we evaluate our methods in terms of prediction accuracy and zero-shot target regret on three datasets, i.e., DIGIT (Ganin et al., 2016a), VisDA17 (Peng et al., 2017), and S2RDA49 (Tang & Jia, 2023).
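The row above names zero-shot target regret as an evaluation metric. As a point of reference, cumulative regret is conventionally the gap between the oracle arm's expected reward and the chosen arm's, summed over rounds. The sketch below shows that generic computation with dummy data; it is an illustration of the standard definition, not DABand's exact evaluation code.

```python
import numpy as np

def cumulative_regret(expected_rewards, chosen_arms):
    """Standard cumulative regret: sum over rounds of (best arm - chosen arm).

    expected_rewards: (T, K) array of expected reward per round and arm.
    chosen_arms: length-T sequence of chosen arm indices.
    """
    expected_rewards = np.asarray(expected_rewards, dtype=float)
    chosen_arms = np.asarray(chosen_arms)
    best = expected_rewards.max(axis=1)  # oracle arm's reward each round
    got = expected_rewards[np.arange(len(chosen_arms)), chosen_arms]
    return float(np.sum(best - got))     # always >= 0

# Dummy 3-round, 2-arm example (illustrative numbers only).
rewards = np.array([[1.0, 0.2], [0.3, 0.9], [0.5, 0.5]])
print(cumulative_regret(rewards, [0, 1, 0]))  # oracle policy -> 0.0
print(cumulative_regret(rewards, [1, 0, 0]))  # 0.8 + 0.6 + 0.0, i.e. about 1.4
```

A policy's regret curve flattening over rounds is the usual sign that it has learned the target domain's reward structure.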
Researcher Affiliation | Academia | Ziyan Wang (Georgia Institute of Technology), Xiaoming Huo (Georgia Institute of Technology), Hao Wang (Rutgers University)
Pseudocode | Yes | Algorithm 1: DABand Training Algorithm
Open Source Code | No | Code will soon be available at https://github.com/Wang-ML-Lab/DABand.
Open Datasets | Yes | To demonstrate the effectiveness of our DABand, we evaluate our methods in terms of prediction accuracy and zero-shot target regret on three datasets, i.e., DIGIT (Ganin et al., 2016a), VisDA17 (Peng et al., 2017), and S2RDA49 (Tang & Jia, 2023). See details for each dataset in Appendix B.
Dataset Splits | Yes | VisDA17. The training set consists of 3D rendering images, whereas the validation and test sets feature real images from the COCO (Lin et al., 2014) and YouTube-BoundingBoxes (Real et al., 2017) datasets, respectively. ... For our purposes, we use the training set as the source domain and the validation set as the target domain. ... S2RDA49. ... The source domain (i.e., the synthetic domain) is synthesized by rendering 3D models from ShapeNet (Chang et al., 2015). ... The target domain (i.e., the real domain) of S2RDA49 contains 60,535 images from 49 classes, collected from the ImageNet validation set (Deng et al., 2009), ObjectNet (Barbu et al., 2019), the VisDA2017 validation set (Peng et al., 2017), and the web. ... DIGIT. Within the DIGIT dataset framework, we use the MNIST dataset as the source domain and MNIST-M as the target domain.
Hardware Specification | Yes | We use PyTorch to implement our method, and all experiments are run on servers with NVIDIA A5000 GPUs.
Software Dependencies | No | We use PyTorch to implement our method, and all experiments are run on servers with NVIDIA A5000 GPUs. ... Use the Adam optimizer (Kingma & Ba, 2014) to update encoder bϕi and discriminator g by back-propagation to solve the minimax optimization in Eqn. (10).
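The quoted step describes an encoder and a discriminator updated against each other to solve a minimax problem. The toy below illustrates the underlying alternating gradient descent-ascent dynamic on a simple convex-concave function with a saddle point at (0, 0); plain gradient steps stand in for Adam, and scalar `theta`/`w` are stand-ins for the encoder and discriminator parameters. This is a sketch of the optimization pattern, not DABand's Eqn. (10).

```python
# Toy minimax: f(theta, w) = theta**2 + theta*w - w**2.
# theta plays the minimizing (encoder) role, w the maximizing
# (discriminator) role; the unique saddle point is (0, 0).
theta, w, lr = 1.0, -1.0, 0.1
for _ in range(200):
    theta -= lr * (2 * theta + w)  # gradient descent step for the minimizer
    w += lr * (theta - 2 * w)      # gradient ascent step for the maximizer
print(theta, w)  # both iterates approach the saddle point at 0
```

Because this surrogate is strongly convex in `theta` and strongly concave in `w`, the alternating updates converge; in the adversarial setting the same alternation is what drives the encoder toward domain-invariant features.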
Experiment Setup | Yes | For the optimal hyperparameters, we set the learning rate to 1e-5, with λ chosen from {1.0, 5.0, 10.0, 15.0, 20.0} and kept the same for all experiments. Additionally, we set the exploration rate α to 0.05. ... Each image undergoes normalization and is resized to 28×28 pixels with 3 channels to accommodate the format requirements of both domains. Then, an encoder is utilized to diminish the data's dimensionality to a more manageable latent space. Following this reduction, the data is processed through two fully connected neural network layers, ending in the final latent space necessary for loss computation as delineated in the main paper. ... For hyperparameters, a learning rate of 1e-5 is applied, with λ chosen from {1.0, 5.0, 10.0, 15.0, 20.0} and then kept the same for all experiments. We set the exploration rate α to 0.05. ... For hyperparameters, a learning rate of 1e-3 is applied, with λ chosen from {1.0, 5.0, 10.0, 15.0, 20.0} and then kept the same for all experiments. We set the exploration rate α to 0.01. ... Each episode contains 64 samples.
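The setup fixes an exploration rate α (0.05 or 0.01 depending on the dataset). One common reading of an exploration rate is the ε of ε-greedy arm selection, sketched below with dummy scores; DABand's actual exploration rule may differ (e.g., a UCB-style bonus scaled by α), so treat this as a generic illustration only.

```python
import numpy as np

def select_arm(scores, alpha, rng):
    """Epsilon-greedy: with probability alpha pick a uniform random arm,
    otherwise pick the arm with the highest predicted score."""
    if rng.random() < alpha:
        return int(rng.integers(len(scores)))  # explore
    return int(np.argmax(scores))              # exploit

rng = np.random.default_rng(0)
scores = np.array([0.1, 0.7, 0.3])  # dummy per-arm scores; arm 1 is greedy
picks = [select_arm(scores, alpha=0.05, rng=rng) for _ in range(1000)]
print(sum(p != 1 for p in picks) / 1000)  # small non-greedy fraction, near alpha * 2/3
```

With α = 0 the rule is purely greedy; raising α trades predicted reward for coverage of under-explored arms.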