Best of Both Worlds: Regret Minimization versus Minimax Play

Authors: Adrian Müller, Jon Schneider, Stratis Skoulakis, Luca Viano, Volkan Cevher

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We experimentally compare our Algorithm 2 for EFGs to the standard OMD algorithm with dilated KL (Kozuno et al., 2021) as well as to minimax play. Our evaluations confirm our theoretical findings, revealing that Algorithm 2 can achieve the best of both no-regret algorithms and minimax play."
Researcher Affiliation | Collaboration | 1 ETH Zürich, 2 Google Research, 3 Aarhus University, 4 EPFL. Correspondence to: Adrian Müller <EMAIL>.
Pseudocode | Yes | "Algorithm 1: Phased Aggression with Importance-Weighting"
Open Source Code | Yes | "We provide the code in the supplementary material."
Open Datasets | Yes | "We consider Kuhn poker (Kuhn, 1950), which serves as a simple yet fundamental example of two-player zero-sum imperfect-information EFGs. Kuhn poker is a common 3-card simplification of standard poker, where each player selects one card from the deck {Jack, Queen, King} without replacement and initially bets one unit." (Footnote 3: https://en.wikipedia.org/wiki/Kuhn_poker)
Dataset Splits | No | The paper describes experiments in a game environment (Kuhn poker) with repeated play over T rounds, but does not provide the training/validation/test splits typical of data-driven experiments.
Hardware Specification | No | The paper reports experiments but does not specify hardware details such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper mentions the dilated KL divergence as part of the algorithm but does not list specific software libraries or version numbers.
Experiment Setup | Yes | "In all algorithms, we used the same fixed learning rates (η 1/T) and the (unbalanced) dilated KL divergence for fairness and simplicity. We consider two types of experiments: First, we run the three algorithms against each other... Second, we evaluate how well each algorithm allows Alice to exploit exploitable strategies... We repeat each experiment 5 times." T = 1000 rounds.
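The experimental protocol quoted above — two no-regret learners playing a zero-sum game for T = 1000 rounds and the average strategies being evaluated for exploitability — can be sketched in simplified form. This is not the paper's Algorithm 2 or its dilated-KL OMD baseline (those operate on the EFG treeplex of Kuhn poker); it is a hypothetical stand-in that runs entropic online mirror descent (multiplicative weights) on a toy payoff matrix, to illustrate the repeated-play setup only.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.uniform(-1, 1, size=(3, 3))  # toy zero-sum payoff matrix (row player maximizes)

def mw_step(p, grad, eta):
    """One entropic-OMD / multiplicative-weights update: p_i ∝ p_i * exp(eta * grad_i)."""
    q = p * np.exp(eta * grad)
    return q / q.sum()

T = 1000                      # rounds, matching the experiments above
eta = 1.0 / np.sqrt(T)        # a fixed learning rate (assumed schedule, for illustration)
x = np.ones(3) / 3            # row player's mixed strategy
y = np.ones(3) / 3            # column player's mixed strategy
x_avg = np.zeros(3)
y_avg = np.zeros(3)

for t in range(T):
    x_avg += x / T
    y_avg += y / T
    gx = A @ y                # row player's payoff gradient at the current profile
    gy = -(A.T @ x)           # column player (minimizer) sees the negated gradient
    x = mw_step(x, gx, eta)
    y = mw_step(y, gy, eta)

# Exploitability of the average profile: total gain available to best responders.
# For no-regret self-play this converges to 0 as T grows; a minimax strategy
# would have zero exploitability from the start but would not exploit weak opponents.
exploit = (A @ y_avg).max() - (x_avg @ A).min()
print(f"exploitability of average play: {exploit:.4f}")
```

Running the three algorithms against each other, or against fixed exploitable strategies as in the paper's second experiment, amounts to swapping one of the two update rules in the loop.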