Offline Learning for Combinatorial Multi-armed Bandits
Authors: Xutong Liu, Xiangxiang Dai, Jinhang Zuo, Siwei Wang, Carlee Joe-Wong, John C.S. Lui, Wei Chen
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical Validation: Finally, extensive experiments on both synthetic and real-world datasets for learning to rank and LLM caching validate the superior performance of CLCB compared to baseline algorithms. |
| Researcher Affiliation | Collaboration | ¹ECE Department, Carnegie Mellon University, Pittsburgh, PA, United States; ²CSE Department, Chinese University of Hong Kong, Hong Kong SAR, China; ³CS Department, City University of Hong Kong, Hong Kong SAR, China; ⁴Microsoft Research, Beijing, China. |
| Pseudocode | Yes | Algorithm 1 CLCB: Combinatorial Lower Confidence Bound Algorithm for Off-CMAB |
| Open Source Code | No | The paper does not provide explicit links to source code for the methodology described, nor does it contain an unambiguous statement of code release. |
| Open Datasets | Yes | For real-world evaluation, we use the Yelp dataset, where users rate businesses (Dai et al., 2024c). ... We use the SciQ dataset (Welbl et al., 2017). |
| Dataset Splits | No | The paper mentions running experiments over a certain number of rounds (e.g., "n = 100 rounds") or with specific cache sizes, but it does not provide specific training/test/validation dataset splits for reproducibility of data partitioning. |
| Hardware Specification | Yes | All tests were performed on a macOS system equipped with an Apple M3 Pro processor and 18 GB of RAM. |
| Software Dependencies | No | The paper mentions using GPT-4o and GPT-4-turbo, along with OpenAI's tiktoken library and the OpenAI LLM API, but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | In the synthetic setup, we simulate 100 distinct queries with a cache size of 40, following a power-law frequency distribution (α = 0.9) as in (Zhu et al., 2023). ... For the evaluation, we work with 100 distinct prompts from the SciQ dataset in an offline setting, performing a total of 10,000 queries with cache sizes of K = 10 and K = 20, respectively. |
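Since the paper releases no code, the pessimistic principle behind Algorithm 1 (CLCB) can only be sketched. The fragment below is an illustrative assumption, not the authors' implementation: it computes a Hoeffding-style lower confidence bound on each base arm's mean reward from offline counts, then selects a top-K combinatorial action (matching the cache-size-K setup quoted above). The function names, the confidence-radius constants, and the top-K oracle are all our choices; the paper's exact radius and oracle may differ.

```python
import math

def lcb_estimates(counts, reward_sums, delta=0.01):
    """Pessimistic (lower-confidence-bound) mean-reward estimates per base arm.

    counts[i]      -- times arm i was observed in the offline dataset
    reward_sums[i] -- total observed reward for arm i (rewards in [0, 1])
    A Hoeffding-style radius is assumed; CLCB's exact constants may differ.
    """
    m = len(counts)
    lcbs = []
    for c, s in zip(counts, reward_sums):
        if c == 0:
            lcbs.append(0.0)  # unseen arm: most pessimistic estimate
            continue
        mean = s / c
        radius = math.sqrt(math.log(2 * m / delta) / (2 * c))
        lcbs.append(max(0.0, mean - radius))
    return lcbs

def select_top_k(lcbs, k):
    """Simple cardinality-constrained combinatorial oracle: top-K arms by LCB."""
    return sorted(range(len(lcbs)), key=lambda i: lcbs[i], reverse=True)[:k]
```

Note how pessimism penalizes rarely observed arms: an arm seen 10 times keeps a much wider radius than one seen 100 times, so a high empirical mean built on little data can still lose the top-K selection.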