Linear Bandits with Partially Observable Features

Authors: Wonyoung Kim, Sungwoo Park, Garud Iyengar, Assaf Zeevi, Min-Hwan Oh

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our numerical experiments confirm that our algorithm consistently outperforms both OLB and MAB algorithms, validating both its practicality and theoretical guarantees (Section 6).
Researcher Affiliation | Academia | 1 Chung-Ang University, Seoul, Korea; 2 Seoul National University, Seoul, Korea; 3 Columbia University, New York, USA.
Pseudocode | Yes | Algorithm 1: Robust to Latent Feature (RoLF). Algorithm 2: Robust to Latent Feature with Ridge Estimator (RoLF-Ridge).
Open Source Code | No | The paper does not contain an explicit statement about the release of source code, nor does it provide a link to a code repository.
Open Datasets | No | For both scenarios, the features including the true features z_a, observed features x_a, and unobserved features u_a are constructed differently based on the relationship between the row space spanned by the observed features, R(X), and the row space spanned by the unobserved features, R(U). In Case 1, the general case, the true features z_a for each arm a ∈ [K] are sampled from N(0, I_{d_z}), and the observed features x_a are obtained by truncating the first d elements of z_a, following the definition given in Eq. (1).
Dataset Splits | No | The paper describes the generation of simulated data for experiments and sets a total decision horizon (T = 1200) but does not provide traditional training/test/validation dataset splits or specific percentages for data partitioning, as it is a bandit problem with online learning.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, or cloud computing resources) used to run the numerical experiments.
Software Dependencies | No | The paper does not mention specific software names with version numbers (e.g., Python, PyTorch, or specific libraries) that were used for the implementation or experiments.
Experiment Setup | Yes | For the hyperparameters in our algorithms, the coupling probability p and the confidence parameter δ are set to 0.6 and 10^{-4}, respectively. The total decision horizon is T = 1200. Throughout the experiments, we fix the number of arms at K = 30 and the dimension of the true features at d_z = 35.
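
Since no source code is released, the Case 1 feature construction quoted above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "truncating the first d elements" means keeping the first d coordinates of each true feature vector as the observed part and treating the rest as unobserved. The function name make_case1_features, the seed, and the example value d = 10 are hypothetical; K, d_z, T, p, and δ are the values quoted in the Experiment Setup entry.

```python
import numpy as np

# Values quoted in the Experiment Setup entry.
K = 30             # number of arms
D_Z = 35           # dimension of the true features z_a
T = 1200           # total decision horizon
P_COUPLING = 0.6   # coupling probability p
DELTA = 1e-4       # confidence parameter delta


def make_case1_features(d, k=K, d_z=D_Z, seed=0):
    """Sample true features and split them into observed and unobserved parts.

    Hypothetical helper: `d` (observed dimension) and `seed` are assumptions,
    not values taken from the paper.
    """
    if not 0 < d <= d_z:
        raise ValueError("d must satisfy 0 < d <= d_z")
    rng = np.random.default_rng(seed)
    # Each row is one arm's true feature vector z_a ~ N(0, I_{d_z}).
    Z = rng.standard_normal((k, d_z))
    # Assumed reading of Eq. (1): the first d coordinates are observed.
    X = Z[:, :d]
    # The remaining d_z - d coordinates are the unobserved features u_a.
    U = Z[:, d:]
    return Z, X, U


# Example with a hypothetical observed dimension d = 10.
Z, X, U = make_case1_features(d=10)
```

Stacking X and U column-wise recovers Z, which is a quick sanity check that the split matches the assumed partition of the true features.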