Zero-Inflated Bandits

Authors: Haoyu Wei, Runzhe Wan, Lei Shi, Rui Song

ICML 2025

Reproducibility Variable Result LLM Response
Research Type: Experimental. The superior empirical performance of these methods is demonstrated through extensive numerical studies.
Researcher Affiliation: Collaboration. 1) Department of Economics, University of California San Diego, La Jolla, USA; 2) Amazon, Seattle, USA. This work does not relate to the positions at Amazon. Correspondence to: Haoyu Wei <EMAIL>, Rui Song <EMAIL>.
Pseudocode: Yes. Algorithm 1: UCB for ZI MAB with light tails; Algorithm 2: General template of UCB for ZI contextual bandits; Algorithm B.1: UCB for zero-inflated bandits with heavy tails; Algorithm C.1: TS for zero-inflated MAB with light tails; Algorithm C.2: General template of UCB for zero-inflated generalized linear bandits; Algorithm C.3: General template of TS for ZI GLM bandits.
Open Source Code: No. The paper does not provide any explicit statement or link to its own open-source code for the methodology described.
Open Datasets: Yes. We apply our ZI contextual bandit algorithms to a real dataset of loan records from a U.S. online auto loan company, a widely studied public dataset (Columbia Business School, Columbia University, 2024; Phillips et al., 2015; Ban & Keskin, 2021; Bastani et al., 2022). This dataset can be accessed upon request at https://business.columbia.edu/cprm.
Dataset Splits: No. The paper describes an online evaluation process for the real data application and does not provide specific training/test/validation dataset splits or methodologies for reproducibility of data partitioning.
Hardware Specification: Yes. Our simulations were performed on a Mac mini (2020) with an M1 chip and 8GB of RAM.
Software Dependencies: No. The paper mentions several algorithms and techniques (e.g., GMS generation, Huber regression) but does not provide specific version numbers for any software libraries, frameworks, or solvers used in its implementation.
Experiment Setup: Yes. For UCB-type algorithms, we set the confidence level δ = 4/T^2 consistently across all experiments. Prior parameters and tuning parameters for TS-type algorithms follow the recommendations of Jin et al. (2021) and Shi et al. (2023) for the MOTS algorithm and GMS generation. For contextual bandits, we design K = 100 arms and generate a vector νk ∈ R^d with d = 10 for each arm k ∈ [K]. The zero-inflation structure θ exhibits sparsity s, meaning only s elements of θ are drawn from a uniform distribution while the rest are zeros. For TS algorithms, we follow the original setting of Agrawal & Goyal (2013) while adding tuning parameters ϵX and ϵY with ϵX, ϵY ∝ log(1/T), as suggested by Zhong et al. (2021), via φ^2_β = 24σ^2 d / (K^2 ϵX log(1/δ)) and φ^2_θ = 24σ^2 d / (K^2 ϵY log(1/δ)). For simplicity, our implementation fixes ϵX and ϵY to log(1/T). Additionally, for both UCB and TS algorithms, we set δ = 1/T.
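The reported setup can be sketched in code. This is a minimal, illustrative transcription of the stated formulas only, not the paper's implementation: the function names, the choice σ^2 = 1, the Uniform(0, 1) law for the nonzero entries of θ, and the example horizon T = 1000 are assumptions; the formulas for φ^2_β, φ^2_θ, ϵX = ϵY = log(1/T), and δ = 1/T are taken verbatim from the setup above.

```python
import math
import random

def ts_tuning_params(T, d=10, K=100, sigma2=1.0):
    """phi^2_beta, phi^2_theta, and delta as transcribed from the setup.

    eps_X = eps_Y = log(1/T) (fixed, per the setup; note log(1/T) < 0
    for T > 1 -- transcribed as stated), and delta = 1/T.
    """
    eps = math.log(1.0 / T)                # eps_X = eps_Y = log(1/T)
    delta = 1.0 / T                        # delta = 1/T for UCB and TS
    phi2 = 24.0 * sigma2 * d / (K ** 2 * eps * math.log(1.0 / delta))
    return phi2, phi2, delta               # phi^2_beta = phi^2_theta here

def sparse_theta(d=10, s=3, rng=None):
    """Zero-inflation vector theta with sparsity s: s entries drawn from
    a uniform distribution (assumed Uniform(0, 1)), the rest zeros."""
    rng = rng or random.Random(0)
    nonzero = set(rng.sample(range(d), s))
    return [rng.uniform(0.0, 1.0) if i in nonzero else 0.0
            for i in range(d)]
```

For example, `ts_tuning_params(T=1000)` returns δ = 10^-3 together with the two variance parameters, and `sparse_theta(d=10, s=3)` produces a 10-dimensional θ with exactly 3 nonzero coordinates.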