Multi-objective Linear Reinforcement Learning with Lexicographic Rewards

Authors: Bo Xue, Dake Bu, Ji Cheng, Yuanyu Wan, Qingfu Zhang

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | "To bridge this gap, we examine MORL under lexicographic reward structures..." "We introduce the first MORL algorithm with provable regret guarantees..." "Our work provides a comprehensive theoretical analysis of regret bounds..." "This work establishes the first theoretical regret guarantee for MORL."
Researcher Affiliation | Academia | "(1) Department of Computer Science, City University of Hong Kong, Hong Kong, China; (2) School of Software Technology, Zhejiang University, Ningbo, China. Correspondence to: Qingfu Zhang <EMAIL>."
Pseudocode | Yes | Algorithm 1: Lexicographic Linear Reinforcement Learning; Algorithm 2: Lexicographic Action Elimination (see the sketch after this table).
Open Source Code | No | The paper contains no explicit statements or links indicating that open-source code for the described methodology is available.
Open Datasets | No | The paper is theoretical, focusing on algorithm design and regret analysis; it conducts no empirical studies on datasets that would be made publicly available.
Dataset Splits | No | The paper is theoretical and involves no experiments on datasets, so no dataset splits are provided.
Hardware Specification | No | The paper presents a theoretical algorithm and proofs; it includes no experimental evaluation requiring specific hardware, so no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and reports no experimental results, so it specifies no software dependencies or version numbers.
Experiment Setup | No | The paper focuses on algorithm development and regret analysis; it describes no experimental setup, hyperparameters, or training configurations.
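
The paper provides Algorithm 2 (Lexicographic Action Elimination) only as pseudocode, and no reference implementation is released. The snippet below is a minimal, hypothetical sketch of the general lexicographic-elimination idea, not the authors' algorithm: actions are filtered objective by objective in priority order, and at each stage only the actions whose estimated value is within a tolerance of the best surviving action are kept. All names here (lexicographic_eliminate, q_estimates, slacks) are assumptions introduced for illustration.

```python
import numpy as np

def lexicographic_eliminate(q_estimates, slacks):
    """Generic lexicographic action elimination (illustrative sketch only).

    q_estimates: array of shape (num_objectives, num_actions) holding value
        estimates, with objectives ordered from highest to lowest priority.
    slacks: per-objective tolerances (standing in for confidence widths); an
        action survives objective i if its estimate is within slacks[i] of the
        best estimate among actions that survived objectives 0..i-1.
    Returns the indices of actions that survive every objective.
    """
    q_estimates = np.asarray(q_estimates, dtype=float)
    surviving = np.arange(q_estimates.shape[1])
    for i, slack in enumerate(slacks):
        values = q_estimates[i, surviving]
        best = values.max()
        # Keep only actions that are near-optimal for the current objective.
        surviving = surviving[values >= best - slack]
    return surviving

if __name__ == "__main__":
    # Two objectives, four actions; objective 0 has priority over objective 1.
    q = np.array([[1.0, 0.98, 0.6, 1.01],   # objective 0 estimates
                  [0.2, 0.90, 0.9, 0.10]])  # objective 1 estimates
    print(lexicographic_eliminate(q, slacks=[0.05, 0.05]))  # -> [1]
```

In the paper, the elimination step is coupled with value estimates and confidence widths derived from linear function approximation; the fixed slacks above are a placeholder for those data-dependent quantities.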