Designing Skill-Compatible AI: Methodologies and Frameworks in Chess

Authors: Karim Hamade, Reid McIlroy-Young, Siddhartha Sen, Jon Kleinberg, Ashton Anderson

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our agents outperform state-of-the-art chess AI (based on AlphaZero) despite being weaker in conventional chess, demonstrating that skill-compatibility is a tangible trait that is qualitatively and measurably distinct from raw performance. Our evaluations further explore and clarify the mechanisms by which our agents achieve skill-compatibility.
Researcher Affiliation | Collaboration | Karim Hamade (University of Toronto), Reid McIlroy-Young (University of Toronto), Siddhartha Sen (Microsoft Research), Jon Kleinberg (Cornell University), Ashton Anderson (University of Toronto)
Pseudocode | No | The paper does not contain any sections explicitly labeled as "Pseudocode" or "Algorithm", nor does it present structured steps in a code-like format.
Open Source Code | Yes | Our code is released at github.com/CSSLab/skill-compatibility-chess. We also include several of our trained models.
Open Datasets | No | The paper states that maia was trained on games from lichess.org, an open-source platform. However, it does not provide a direct link, DOI, or specific repository for the dataset used, nor a formal citation of the dataset itself; only the platform source is named.
Dataset Splits | Yes | To create att, a dataset of 10,000 games (80% train, 10% validate, and 10% test) is generated from games of the move sequence leela–maia–leela–maia for STT, or leela–maia–leela–maia for HB.
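The 80/10/10 split described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the placeholder game IDs and the fixed seed are assumptions for reproducibility of the sketch.

```python
import random

def split_games(games, seed=0):
    """Shuffle and split a list of games into 80% train, 10% validate, 10% test."""
    games = list(games)
    random.Random(seed).shuffle(games)
    n = len(games)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    return (games[:n_train],
            games[n_train:n_train + n_val],
            games[n_train + n_val:])

# 10,000 placeholder game IDs standing in for the generated leela/maia games
train, val, test = split_games(range(10000))
print(len(train), len(val), len(test))  # 8000 1000 1000
```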
Hardware Specification | Yes | We made use of four Tesla K80 GPUs for the purpose of experimentation, each with 12 GB of VRAM.
Software Dependencies | Yes | Against stockfish 13 (60k nodes), a strong classical engine that uses alpha-beta search, this version of leela obtains a score of 59±3.
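A match score like the 59±3 quoted above is conventionally the percentage of points scored, with a win worth 1 point and a draw worth 0.5. A small sketch of that convention, using hypothetical win/draw/loss tallies rather than figures from the paper:

```python
def match_score(wins, draws, losses):
    """Percentage match score: a win counts 1 point, a draw 0.5, a loss 0."""
    games = wins + draws + losses
    return 100.0 * (wins + 0.5 * draws) / games

# Hypothetical tallies over a 100-game match (illustrative only)
print(match_score(45, 28, 27))  # 59.0
```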
Experiment Setup | Yes | To create att, a dataset of 10,000 games (80% train, 10% validate, and 10% test) is generated from games of the move sequence leela–maia–leela–maia for STT, or leela–maia–leela–maia for HB. Then, starting with leela's weights, and using a learning rate of 1e-5 and 10,000 iterations, we run back-propagation to update leela's policy and value neural network.
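The fine-tuning recipe in that row (start from pretrained weights, 10,000 gradient steps at learning rate 1e-5) can be sketched as a toy loop. A scalar quadratic loss here stands in for leela's policy/value networks; the function names and objective are illustrative assumptions, not the authors' implementation.

```python
LEARNING_RATE = 1e-5   # learning rate quoted in the paper
ITERATIONS = 10_000    # iteration count quoted in the paper

def finetune(w_init, target):
    """Gradient descent on loss(w) = (w - target)^2, starting from pretrained w_init."""
    w = w_init
    for _ in range(ITERATIONS):
        grad = 2.0 * (w - target)   # d/dw of (w - target)^2
        w -= LEARNING_RATE * grad
    return w

# Weights drift from their pretrained value toward the fine-tuning objective
w = finetune(w_init=1.0, target=0.0)
print(0.0 < w < 1.0)  # True
```

With such a small learning rate the weights move only partway toward the target in 10,000 steps, mirroring why fine-tuning preserves much of the pretrained behavior.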