Towards Domain Adaptive Neural Contextual Bandits
Authors: Ziyan Wang, Xiaoming Huo, Hao Wang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that our approach outperforms the state-of-the-art contextual bandit algorithms on real-world datasets. Code will soon be available at https://github.com/Wang-ML-Lab/DABand. In this section, we compare DABand with existing methods on real-world datasets. To demonstrate the effectiveness of our DABand, we evaluate our methods in terms of prediction accuracy and zero-shot target regret on three datasets, i.e., DIGIT (Ganin et al., 2016a), VisDA17 (Peng et al., 2017), and S2RDA49 (Tang & Jia, 2023). |
| Researcher Affiliation | Academia | Ziyan Wang¹, Xiaoming Huo¹, Hao Wang² — ¹Georgia Institute of Technology, ²Rutgers University; EMAIL¹, EMAIL.edu² |
| Pseudocode | Yes | Algorithm 1 DABand Training Algorithm |
| Open Source Code | No | Code will soon be available at https://github.com/Wang-ML-Lab/DABand. |
| Open Datasets | Yes | To demonstrate the effectiveness of our DABand, we evaluate our methods in terms of prediction accuracy and zero-shot target regret on three datasets, i.e., DIGIT (Ganin et al., 2016a), VisDA17 (Peng et al., 2017), and S2RDA49 (Tang & Jia, 2023). See details for each dataset in Appendix B. |
| Dataset Splits | Yes | VisDA17. The training set consists of 3D rendering images, whereas the validation and test sets feature real images from the COCO (Lin et al., 2014) and YouTube Bounding Boxes (Real et al., 2017) datasets, respectively. ... For our purposes, we use the training set as the source domain and the validation set as the target domain. ... S2RDA49. ... The source domain (i.e., the synthetic domain) is synthesized by rendering 3D models from ShapeNet (Chang et al., 2015). ... The target domain (i.e., the real domain) of S2RDA49 contains 60,535 images from 49 classes, collected from the ImageNet validation set (Deng et al., 2009), ObjectNet (Barbu et al., 2019), VisDA2017 validation set (Peng et al., 2017), and the web. ... DIGIT. Within the DIGIT dataset framework, we use the MNIST dataset as the source domain and MNIST-M as the target domain. |
| Hardware Specification | Yes | We use PyTorch to implement our method, and all experiments are run on servers with NVIDIA A5000 GPUs. |
| Software Dependencies | No | We use PyTorch to implement our method, and all experiments are run on servers with NVIDIA A5000 GPUs. ... Use the Adam optimizer (Diederik, 2014) to update encoder bϕᵢ and discriminator g by back-propagation to solve the minimax optimization in Eqn. (10). |
| Experiment Setup | Yes | For the optimal hyperparameters, we set the learning rate to 1e-5, with λ chosen from {1.0, 5.0, 10.0, 15.0, 20.0} and kept the same for all experiments. Additionally, we set the exploration rate α to 0.05. ... Each image undergoes normalization and is resized to 28×28 pixels with 3 channels to accommodate the format requirements of both domains. Then, an encoder is utilized to diminish the data's dimensionality to a more manageable latent space. Following this reduction, the data is processed through two fully connected neural network layers, ending in the final latent space necessary for loss computation as delineated in the main paper. ... For hyperparameters, a learning rate of 1e-5 is applied, with λ chosen from {1.0, 5.0, 10.0, 15.0, 20.0} and then kept the same for all experiments. We set the exploration rate α to 0.05. ... For hyperparameters, a learning rate of 1e-3 is applied, with λ chosen from {1.0, 5.0, 10.0, 15.0, 20.0} and then kept the same for all experiments. We set the exploration rate α to 0.01. ... Each episode contains 64 samples. |
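The Experiment Setup excerpt specifies an exploration rate α (0.05 or 0.01) for the bandit's action selection. As a minimal sketch only — the function name `select_arm` and the uniform-random exploration rule are assumptions, not the paper's DABand selection rule, which is defined by its Algorithm 1 — an α-greedy choice over per-arm scores can be written in plain Python:

```python
import random


def select_arm(scores, alpha=0.05, rng=random):
    """alpha-greedy arm selection (hypothetical sketch).

    With probability `alpha`, explore by picking an arm uniformly at
    random; otherwise exploit the arm with the highest current score.
    `scores` is a list of per-arm estimates for the current context.
    """
    if rng.random() < alpha:
        return rng.randrange(len(scores))  # explore
    return max(range(len(scores)), key=lambda i: scores[i])  # exploit
```

With α = 0 this reduces to a purely greedy policy, which is the limit the quoted setting approaches as α shrinks from 0.05 to 0.01.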