Response-Based Approachability with Applications to Generalized No-Regret Problems

Authors: Andrey Bernstein, Nahum Shimkin

JMLR 2015 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical Here we introduce an approachability algorithm that relies on Blackwell s dual condition, which requires the agent to have a feasible response to each mixed action of the opponent... We demonstrate the utility of the proposed approach by applying it to certain generalizations of the classical regret minimization problem... In this paper, we introduce an approachability algorithm that avoids this projection computation step. Instead, the algorithm relies on the availability of a response map... In this section we present our basic response-based algorithm, and establish its convergence properties. Theorem 4 Let Assumption 1 hold, and suppose that the agent follows the strategy specified in Algorithm 1. Then d ( rn, S) λn ρ n, n 1, (4) for any strategy of the opponent.
Researcher Affiliation Academia Andrey Bernstein EMAIL School of Computer and Communication Sciences EPFL Ecole Polytechnique F ed erale de Lausanne Lausanne CH-1015, Switzerland Nahum Shimkin EMAIL Department of Electrical Engineering Technion Israel Institute of Technology Haifa 32000, Israel
Pseudocode Yes Algorithm 1 Response-Based Approachability Initialization: At time step n = 1, use arbitrary mixed action p1 and set an arbitrary target point r 1 S. At time step n = 2, 3, ...: 1. Set an approachability direction ... Algorithm 2 Generalized No-Regret Algorithm Input: The reward function v : A B RK; a set-valued goal function V : (B) RK; and for each q (B), a mixed action (or actions) p (A) such that v(p, q) V (q).
Open Source Code No The paper does not contain any explicit statement about releasing source code, nor does it provide a link to a code repository.
Open Datasets No The paper is theoretical, presenting algorithms and their convergence properties, and applying them to generalized no-regret problems. It does not use or reference any specific datasets for empirical evaluation.
Dataset Splits No No datasets are used for experiments in this theoretical paper, therefore no dataset split information is provided.
Hardware Specification No The paper presents theoretical algorithms and their properties, without conducting any empirical experiments that would require specific hardware. Therefore, no hardware specifications are mentioned.
Software Dependencies No The paper is theoretical and does not describe any empirical experiments. Therefore, it does not specify any software dependencies with version numbers.
Experiment Setup No This paper is theoretical, focusing on algorithm design and proofs of convergence. It does not describe empirical experiments with specific hyperparameters or training configurations.