Zero-Inflated Bandits
Authors: Haoyu Wei, Runzhe Wan, Lei Shi, Rui Song
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The superior empirical performance of these methods is demonstrated through extensive numerical studies. |
| Researcher Affiliation | Collaboration | 1Department of Economics, University of California San Diego, La Jolla, USA 2Amazon, Seattle, USA. This work does not relate to the positions at Amazon. Correspondence to: Haoyu Wei <EMAIL>, Rui Song <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 UCB for ZI MAB with light tails; Algorithm 2 General template of UCB for ZI contextual bandits; Algorithm B.1 UCB for zero-inflated bandits with heavy tails; Algorithm C.1 TS for zero-inflated MAB with light tails; Algorithm C.2 General template of UCB for zero-inflated generalized linear bandits; Algorithm C.3 General template of TS for ZI GLM bandits |
| Open Source Code | No | The paper does not provide any explicit statement or link to its own open-source code for the methodology described. |
| Open Datasets | Yes | We apply our ZI contextual bandit algorithms to a real dataset of loan records from a U.S. online auto loan company, a widely studied public dataset (Columbia Business School, Columbia University, 2024; Phillips et al., 2015; Ban & Keskin, 2021; Bastani et al., 2022). This dataset can be accessed upon request at https://business.columbia.edu/cprm. |
| Dataset Splits | No | The paper describes an online evaluation process for the real data application and does not provide specific training/test/validation dataset splits or methodologies for reproducibility of data partitioning. |
| Hardware Specification | Yes | Our simulations were performed on a Mac mini (2020) with an M1 chip and 8GB of RAM. |
| Software Dependencies | No | The paper mentions several algorithms and techniques (e.g., GMS generation, Huber regression) but does not provide specific version numbers for any software libraries, frameworks, or solvers used in its implementation. |
| Experiment Setup | Yes | For UCB-type algorithms, we set the confidence level δ = 4/T^2 consistently across all experiments. Prior parameters and tuning parameters for TS-type algorithms follow the recommendations in (Jin et al., 2021; Shi et al., 2023) for the MOTS algorithm and GMS generation. For contextual bandits, we design K = 100 arms and generate a vector νk ∈ R^d with d = 10 for each arm k ∈ [K]. The zero-inflation structure θ exhibits sparsity s, meaning only s elements in θ are drawn from a uniform distribution, while the rest are zeros. For TS algorithms, we follow the original setting in Agrawal & Goyal (2013) while adding hyper-tuning parameters ϵX and ϵY such that ϵX, ϵY ∝ log(1/T), as suggested by Zhong et al. (2021), via φ^2_β = 24σ^2d / (K^2 ϵX log(1/δ)) and φ^2_θ = 24σ^2d / (K^2 ϵY log(1/δ)). For simplicity, in our implementation, we fix ϵX and ϵY to log(1/T). Additionally, for both UCB and TS algorithms, we set the parameter δ to 1/T. |
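To make the reported multi-armed-bandit setup concrete, the sketch below simulates a zero-inflated reward (a Bernoulli nonzero indicator times a positive draw) and runs a plain UCB index policy with the paper's confidence level δ = 4/T². This is a minimal illustration, not the authors' implementation: the reward distribution, the exploration bonus √(2 log(1/δ)/n), and all function names here are assumptions for demonstration.

```python
import math
import random


def zi_reward(p_nonzero, mean_pos, rng):
    """Zero-inflated reward: with prob. p_nonzero, draw a positive-mean
    Gaussian; otherwise return exactly zero (assumed reward model)."""
    if rng.random() < p_nonzero:
        return rng.gauss(mean_pos, 1.0)
    return 0.0


def ucb_zi_mab(arm_params, T, delta, seed=0):
    """Run a standard UCB index policy on K zero-inflated arms.

    arm_params: list of (p_nonzero, mean_pos) pairs, one per arm.
    Returns per-arm pull counts and the cumulative reward.
    """
    rng = random.Random(seed)
    K = len(arm_params)
    counts = [0] * K
    sums = [0.0] * K
    total = 0.0
    for t in range(T):
        if t < K:
            a = t  # pull each arm once to initialize
        else:
            # empirical mean plus a sub-Gaussian-style confidence bonus
            a = max(
                range(K),
                key=lambda k: sums[k] / counts[k]
                + math.sqrt(2.0 * math.log(1.0 / delta) / counts[k]),
            )
        p, mu = arm_params[a]
        r = zi_reward(p, mu, rng)
        counts[a] += 1
        sums[a] += r
        total += r
    return counts, total


if __name__ == "__main__":
    T = 2000
    delta = 4.0 / T**2  # confidence level used in the paper's experiments
    counts, total = ucb_zi_mab([(0.2, 1.0), (0.8, 1.0)], T, delta)
    print(counts)
```

With a 0.6 gap in expected reward between the two arms, the policy concentrates its pulls on the arm with the higher zero-inflation probability after a short exploration phase.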