Collapsing Bandits and Their Application to Public Health Intervention
Authors: Aditya Mate, Jackson Killian, Haifeng Xu, Andrew Perrault, Milind Tambe
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our algorithm on several data distributions including data from a real-world healthcare task in which a worker must monitor and deliver interventions to maximize their patients' adherence to tuberculosis medication. Our algorithm achieves a 3-order-of-magnitude speedup compared to state-of-the-art RMAB techniques, while achieving similar performance. |
| Researcher Affiliation | Academia | Aditya Mate (Harvard University, Cambridge, MA 02138); Jackson A. Killian (Harvard University, Cambridge, MA 02138); Haifeng Xu (University of Virginia, Charlottesville, VA 22903); Andrew Perrault (Harvard University, Cambridge, MA 02138); Milind Tambe (Harvard University, Cambridge, MA 02138) |
| Pseudocode | Yes | Algorithm 1: Sequential index computation algorithm |
| Open Source Code | Yes | The code is available at: https://github.com/AdityaMate/collapsing_bandits |
| Open Datasets | Yes | We first test on tuberculosis medication adherence monitoring data, which contains daily adherence information recorded for each real patient in the system, as obtained from Killian et al. [17]. |
| Dataset Splits | No | The paper does not explicitly state specific training, validation, or test dataset splits (e.g., percentages or sample counts). It mentions using real-world data and synthetic distributions for evaluation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., CPU, GPU models, or memory specifications). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies (e.g., libraries, frameworks, or programming languages). |
| Experiment Setup | Yes | Reward is measured as the undiscounted sum of patients (arms) in the adherent state over all rounds, where each trial lasts T = 180 days (matching the length of first-line TB treatment) with N patients and a budget of k calls per day. All experiments in this section set all δ to 0.05. ... We set the resource level, k = 10%N in our simulation for Fig. 5a. |
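
The reward definition in the Experiment Setup row can be illustrated with a minimal sketch: N two-state arms (adherent or non-adherent), a budget of k = 10% of N calls per day, a horizon of T = 180 days, and an undiscounted count of adherent patient-days. This is not the paper's algorithm or its data. The function name `simulate_reward`, the transition probabilities, and the uniformly random calling policy are all assumptions made for illustration, and the sketch ignores partial observability.

```python
import random


def simulate_reward(n_patients=100, horizon=180, budget_frac=0.10, seed=0):
    """Return the undiscounted number of adherent patient-days.

    Hypothetical illustration only: a random policy, not the paper's index policy.
    """
    rng = random.Random(seed)
    k = round(budget_frac * n_patients)  # budget of k calls per day

    # ASSUMED probabilities of being adherent tomorrow, keyed by
    # (current state, action). State 1 = adherent, action 1 = called.
    # These values are placeholders, not estimates from the TB data.
    p_adherent = {
        (0, 0): 0.2,
        (0, 1): 0.5,
        (1, 0): 0.7,
        (1, 1): 0.9,
    }

    states = [1] * n_patients  # assume every patient starts adherent
    total_reward = 0
    for _ in range(horizon):
        # Random policy: call k distinct patients chosen uniformly.
        called = set(rng.sample(range(n_patients), k))
        states = [
            1 if rng.random() < p_adherent[(s, int(i in called))] else 0
            for i, s in enumerate(states)
        ]
        # Reward for the day is the number of adherent patients.
        total_reward += sum(states)
    return total_reward


if __name__ == "__main__":
    print(simulate_reward())
```

Replacing the random choice of `called` with a ranking of arms by some index would turn this into an index-policy simulation; the reward bookkeeping would stay the same.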