Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits

Authors: Yue Kang, Cho-Jui Hsieh, Thomas Lee

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we show by experiments that our hyperparameter tuning framework outperforms the theoretical hyperparameter setting and other tuning methods with various (generalized) linear bandit algorithms. We run comprehensive experiments on both simulations and real-world datasets. Specifically, for the real data, we use the benchmark Movielens 100K dataset along with the Yahoo News dataset."
Researcher Affiliation | Collaboration | Yue Kang (University of California, Davis); Cho-Jui Hsieh (Google and University of California, Los Angeles); Thomas C. M. Lee (University of California, Davis)
Pseudocode | Yes | "Algorithm 1: Zooming TS algorithm with Restarts"
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets | Yes | "We run comprehensive experiments on both simulations and real-world datasets. Specifically, for the real data, we use the benchmark Movielens 100K dataset along with the Yahoo News dataset... Movielens 100K dataset: This dataset contains 100K ratings from 943 users on 1,682 movies. For data pre-processing, we utilize LIBPMF (Yu et al., 2014)... Yahoo News dataset: We downloaded the Yahoo Recommendation dataset R6A, which contains Yahoo data from May 1 to May 10, 2009... We transform the contextual information into a 6-dimensional vector based on the processing in (Chu et al., 2009)."
Dataset Splits | Yes | "We also include a warming-up period of length T1 in the beginning to guarantee sufficient exploration... Specifically, for LinUCB, LinTS, UCB-GLM, GLM-TSL and Laplace-TS, we choose it to be 118. For GLOC and SGD-TS, we set it as 45."
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. It only mentions running times in seconds without specifying the computational environment.
Software Dependencies | No | The paper mentions using LIBPMF (Yu et al., 2014) for data preprocessing but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | Yes | "We run comprehensive experiments on both simulations and real-world datasets... We believe a large value of warm-up period T1 may abandon some useful information in practice, and hence we use T1 = T^{2/(p+3)} according to Theorem 4.2 in experiments. And we would restart our hyperparameter tuning layer after every T2 = 3T^{(p+2)/(p+3)} rounds... The time horizon T is set to 14,000. For linear models, the expected reward of arm a is formulated as x_{t,a}^T θ and random noise is sampled from N(0, 0.5)... Each experiment is repeated 20 times."
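The quoted setup can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: it computes the warm-up length T1 = T^{2/(p+3)} and restart period T2 = 3T^{(p+2)/(p+3)}, and samples a linear-model reward x^T θ with N(0, 0.5) noise. Here p is assumed to denote the number of hyperparameters being tuned, and 0.5 is treated as the noise standard deviation; both readings are assumptions where the quotes are ambiguous.

```python
import numpy as np

def tuning_schedule(T: int, p: int) -> tuple[int, int]:
    """Warm-up length T1 = T^(2/(p+3)) and restart period T2 = 3*T^((p+2)/(p+3)),
    truncated to integers. p is assumed to be the number of tuned hyperparameters."""
    T1 = int(T ** (2 / (p + 3)))
    T2 = int(3 * T ** ((p + 2) / (p + 3)))
    return T1, T2

def linear_reward(x: np.ndarray, theta: np.ndarray, rng: np.random.Generator) -> float:
    """Linear-model reward: expected value x^T theta plus Gaussian noise.
    (0.5 is treated as the standard deviation here; the quote is ambiguous.)"""
    return float(x @ theta + rng.normal(0.0, 0.5))

# With T = 14,000 as in the paper, p = 1 gives T1 = 118 and p = 2 gives T1 = 45,
# matching the warm-up lengths quoted for the two groups of algorithms.
T1, T2 = tuning_schedule(14_000, 1)
```

With T = 14,000, `tuning_schedule` reproduces the reported warm-up lengths (118 for one tuned hyperparameter, 45 for two), which is consistent with the "Dataset Splits" quote above.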