reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

An Information-Theoretic Analysis of Thompson Sampling

Authors: Daniel Russo, Benjamin Van Roy

JMLR 2016 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We provide an information-theoretic analysis of Thompson sampling that applies across a broad range of online optimization problems in which a decision-maker must learn from partial feedback. This analysis inherits the simplicity and elegance of information theory and leads to regret bounds that scale with the entropy of the optimal-action distribution. This strengthens preexisting results and yields new insight into how information improves performance.
Researcher Affiliation	Academia	Daniel Russo EMAIL Department of Management Science and Engineering Stanford University Stanford, California 94305 Benjamin Van Roy EMAIL Departments of Management Science and Engineering and Electrical Engineering Stanford University Stanford, California 94305
Pseudocode	Yes	Algorithm 1 Linear Gaussian Thompson Sampling
Open Source Code	No	The paper does not provide concrete access to source code. It mentions "forthcoming work of ours leverages this to produce an algorithm that outperforms Thompson sampling," which refers to future work rather than code released for the present paper.
Open Datasets	No	The paper is a theoretical analysis of Thompson sampling and regret bounds. It discusses problem settings like "classical multi armed bandit problem" and "online linear optimization problems" but does not describe empirical experiments using specific datasets, nor does it provide access information for any datasets.
Dataset Splits	No	The paper is theoretical and focuses on mathematical analysis and regret bounds. It does not describe any empirical experiments or the use of specific datasets, therefore, there is no mention of training/test/validation dataset splits.
Hardware Specification	No	The paper is a theoretical analysis and does not describe any empirical experiments or computational implementations that would require specific hardware. Therefore, no hardware specifications are mentioned.
Software Dependencies	No	The paper focuses on theoretical analysis and algorithms, without discussing specific implementations or experimental setups. Therefore, it does not list any software dependencies with version numbers.
Experiment Setup	No	The paper presents a theoretical analysis and derives regret bounds. It does not describe any empirical experiments, and consequently, there are no details provided regarding experimental setup, hyperparameters, or training configurations.