Switching Latent Bandits
Authors: Alessio Russo, Alberto Maria Metelli, Marcello Restelli
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, Section 7 shows numerical simulations on synthetic and semi-synthetic data. We provide additional experiments that highlight the difference in performance between our estimation procedure and a technique based on SD approaches. |
| Researcher Affiliation | Academia | Alessio Russo, Department of Electronics, Information and Bioengineering, Politecnico di Milano |
| Pseudocode | Yes | Algorithm 1: Estimation Procedure. Input: Action Observation matrix O, number of rounds N. Algorithm 2: SL-EC Algorithm. Input: Observation model O, Exploration horizon T0, Total horizon T. |
| Open Source Code | No | The paper does not contain any explicit statement about providing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | In this section, we provide numerical simulations on synthetic and semi-synthetic data based on the MovieLens 1M (Harper & Konstan, 2015) dataset, demonstrating the effectiveness of the proposed Markov chain estimation procedure. |
| Dataset Splits | Yes | From the obtained matrix, we select 70% of all ratings as a training dataset and use the remaining 30% as a test set. |
| Hardware Specification | Yes | Experiments involving a higher number of states were instead unable to reach convergence with a number of samples of the order of 10^5, and trying to increase this quantity caused memory space problems with the hardware used (Intel i7-11th and 16 GB RAM). |
| Software Dependencies | No | The paper does not explicitly mention any specific software dependencies with version numbers. |
| Experiment Setup | Yes | The parameters for all the baseline algorithms have been properly tuned according to the settings considered. For the specific experiments considered, we adopted scaled values for the exploration horizon T0 w.r.t. the result derived from the theory. |
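
The 70%/30% train/test split quoted in the Dataset Splits row can be illustrated with a minimal sketch. It assumes the split is a uniform random selection over individual rating records, which the quoted text does not state. The function name `split_ratings`, the seed, and the toy records are hypothetical and are not taken from the paper or from MovieLens.

```python
import random


def split_ratings(ratings, train_fraction=0.7, seed=0):
    """Shuffle rating records and split them into train and test lists.

    Assumption: ratings are selected uniformly at random; the paper only
    says that 70% of all ratings form the training set.
    """
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(ratings)   # copy, so the caller's data is not modified
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy records (user_id, item_id, rating); not MovieLens data.
toy_ratings = [(u, i, (u + i) % 5 + 1) for u in range(10) for i in range(10)]
train, test = split_ratings(toy_ratings)
print(len(train), len(test))
```

Fixing the seed keeps the split repeatable across runs, which matters when comparing several algorithms on the same test set.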