Learning to Incentivize in Repeated Principal-Agent Problems with Adversarial Agent Arrivals

Authors: Junyan Liu, Arnab Maiti, Artin Tajdini, Kevin Jamieson, Lillian J. Ratliff

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We initiate the study of a repeated principal-agent problem over a finite horizon T, where a principal sequentially interacts with K ≥ 2 types of agents arriving in an adversarial order... Without prior knowledge of agent behavior, we show that the problem becomes intractable, leading to linear regret. We analyze two key settings where sublinear regret is achievable. In the first setting, ...we propose an algorithm that achieves a regret bound of O(min{√(KT log N), K√T}) and provide a matching lower bound... In the second setting, ...we show that there is an algorithm with a regret bound of Õ((LN)^{1/3} T^{2/3}) and establish a matching lower bound... All our results are summarized in Table 1.
Researcher Affiliation | Academia | 1 Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, USA; 2 Department of Electrical and Computer Engineering, University of Washington, Seattle, USA. Correspondence to: Junyan Liu <EMAIL>, Arnab Maiti <EMAIL>, Artin Tajdini <EMAIL>.
Pseudocode | No | The paper describes algorithms such as the EXP3 algorithm for linear bandits, the Tsallis-INF algorithm, and ADVERSARIALZOOMING, but does not present them in structured pseudocode blocks or figures within the main text.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide links to any code repositories.
Open Datasets | No | The paper studies a theoretical repeated principal-agent model and does not mention the use of any publicly available or open datasets for empirical evaluation; no concrete dataset access information is provided.
Dataset Splits | No | The paper does not use any datasets for experimental evaluation, so there is no mention of training/validation/test splits.
Hardware Specification | No | The paper is theoretical in nature, focusing on algorithm design, regret bounds, and lower bounds. It does not describe any experimental setup or the hardware used to run experiments.
Software Dependencies | No | The paper references algorithms from prior work, such as the EXP3 algorithm for linear bandits and the Tsallis-INF algorithm, but it does not specify any software, libraries, or version numbers required to implement or reproduce the authors' own methodology.
Experiment Setup | No | The paper is a theoretical work focused on algorithm design and mathematical analysis, including regret bounds and lower bounds. It does not describe any empirical experiments, so no experimental setup details, hyperparameters, or training configurations are provided.