JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games

Authors: Yang Li, Kun Xiong, Yingping Zhang, Jiangcheng Zhu, Stephen Marcus McAleer, Wei Pan, Jun Wang, Zonghong Dai, Yaodong Yang

TMLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental This paper presents an empirical exploration of non-transitivity in perfect-information games, specifically focusing on Xiangqi... By analyzing over 10,000 records of human Xiangqi play, we highlight the existence of both transitive and non-transitive elements... We evaluate the algorithm empirically using a We Chat mini program and achieve a Master level with a 99.41% win rate against human players. The algorithm s effectiveness in overcoming non-transitivity is confirmed by a plethora of metrics, such as relative population performance and visualization results.
Researcher Affiliation Collaboration Yang Li1, Kun Xiong2, Yingping Zhang2, Jiangcheng Zhu2, Stephen McAleer3, Wei Pan1, Jun Wang4, Zonghong Dai2, Yaodong Yang5; 1The University of Manchester, 2Huawei, 3Carnegie Mellon University, 4University College London, 5Peking University
Pseudocode Yes Algorithm 1: Algorithm for building the payoff matrix M Algorithm 2: Algorithm for Populationer
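Algorithm 1 itself is not reproduced in this report, but the general shape of building a payoff matrix M over a population of agents can be sketched as follows. This is a hypothetical stand-in: the paper's version plays Xiangqi games between agents, whereas here `play` is an illustrative callback returning agent i's average score against agent j in [-1, 1]. A rock-paper-scissors toy population makes the non-transitive cycle the paper studies visible directly in M.

```python
import itertools

# Sketch in the spirit of Algorithm 1 (building the payoff matrix M).
# `play(a, b)` is a hypothetical evaluation callback; the paper's actual
# procedure plays Xiangqi games between the two agents instead.
def build_payoff_matrix(agents, play):
    n = len(agents)
    M = [[0.0] * n for _ in range(n)]
    for i, j in itertools.product(range(n), repeat=2):
        M[i][j] = play(agents[i], agents[j])
    return M

# Toy population with a non-transitive cycle (rock-paper-scissors).
BEATS = {"R": "S", "P": "R", "S": "P"}

def rps_play(a, b):
    if BEATS[a] == b:
        return 1.0
    if BEATS[b] == a:
        return -1.0
    return 0.0

M = build_payoff_matrix(["R", "P", "S"], rps_play)
# M is antisymmetric with a cycle: R loses to P, P loses to S, S loses to R.
```

The antisymmetry of M (M[i][j] = -M[j][i]) is what makes the two-player zero-sum structure explicit; a transitive game would admit a consistent ranking of the agents, which the cycle here rules out.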
Open Source Code No Our project site is available at https://sites.google.com/view/jiangjun-site/.
Open Datasets No In this study, we delve into the intricate geometry of Xiangqi, leveraging a dataset comprising over 10,000 game records from human gameplay as the foundational basis for our investigation... To thoroughly analyze Xiangqi's geometry, we obtained a dataset consisting of over 10,000 records of real-world Xiangqi games, which were sourced from the PlayOK game platform1... 1www.playok.com
Dataset Splits Yes
Deployment Time | Stage | Wins | Ties | Losses | Total | Win Rate
Month 1 | Training | 717 | 11 | 8 | 736 | 97.42%
Month 2 | Training | 724 | 0 | 17 | 741 | 97.71%
Month 3 | Training | 462 | 0 | 3 | 465 | 99.35%
Month 4-6 | Evaluation | 5089 | 3 | 27 | 5119 | 99.41%
Table 2: Monthly statistics of the JiangJun mini-program over a six-month period are presented in this table. The data is divided into two stages: Training and Evaluation.
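The quoted win rates can be cross-checked directly from the raw counts, taking win rate = wins / (wins + ties + losses); a quick sketch using only the numbers from Table 2 as quoted above:

```python
# Cross-check of Table 2: win rate = wins / (wins + ties + losses),
# with the counts quoted above from the paper.
rows = [
    ("Month 1", "Training", 717, 11, 8),
    ("Month 2", "Training", 724, 0, 17),
    ("Month 3", "Training", 462, 0, 3),
    ("Month 4-6", "Evaluation", 5089, 3, 27),
]
for period, stage, wins, ties, losses in rows:
    total = wins + ties + losses
    print(f"{period} ({stage}): {wins}/{total} = {100 * wins / total:.2f}%")
```

All four totals and percentages reproduce the table, including the headline 99.41% evaluation-stage figure (5089/5119).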
Hardware Specification Yes The training of the JiangJun algorithm to the "Master" level was facilitated by our proposed training framework that effectively utilizes the computational capabilities of up to 90 V100 GPUs on the Huawei Cloud ModelArts platform... utilizing a total of 90 V100 GPUs. Specifically, 78 of these GPUs were allocated for the MCTS Actor, 4 GPUs were used for the Training, and 8 GPUs were dedicated to the Populationer... The execution of these experiments relied on the power of high-performance computing, specifically utilizing 40 V100 GPUs.
Software Dependencies No The paper mentions a "ResNets-based neural network" and "ResNet-18 architecture" for the neural network, and the "Simplex method" for solving the LP. However, it does not specify any software libraries or frameworks with version numbers (e.g., PyTorch 1.x, Python 3.x, CUDA 11.x) that were used to implement these components.
Experiment Setup Yes l = (z - v)^2 - α π^T log p + β ||θ||^2, (5) where α, β are balance constants between 0 and 1, and ||θ||^2 is the L2 weight regularization of the JiangJun agent. ...The hyperparameters of the network and training are provided as follows. network filters: 192, network layers: 10, batch size: 2048, sample games: 500, c_puct: 1.5, saver step: 400, learning rate: [0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001, 0.0003, 0.001, 0.003, 0.01], minimum games in one block: 5000, maximum training blocks: 100, minimum training blocks: 3, number of processes: 10.
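Equation (5) combines a value mean-squared-error term, a policy cross-entropy term weighted by α, and β-weighted L2 regularization (the AlphaZero-style loss). A minimal sketch, assuming purely illustrative inputs; only the symbols z, v, π, p, θ, α, β come from the paper:

```python
import math

# Illustrative sketch of the loss in Eq. (5):
#   l = (z - v)^2 - alpha * pi^T log p + beta * ||theta||^2
# z: game outcome, v: value prediction, pi: MCTS policy target,
# p: network policy (positive entries), theta: network weights.
def loss(z, v, pi, p, theta, alpha, beta):
    value_mse = (z - v) ** 2
    policy_ce = -sum(t * math.log(q) for t, q in zip(pi, p))
    l2 = sum(w * w for w in theta)
    return value_mse + alpha * policy_ce + beta * l2

# A perfectly matched value prediction with uniform policies:
# the value and L2 terms vanish, leaving alpha times the entropy ln 2.
l = loss(z=1.0, v=1.0, pi=[0.5, 0.5], p=[0.5, 0.5], theta=[0.0], alpha=0.5, beta=0.1)
```

Minimizing the cross-entropy term pulls the network policy p toward the MCTS visit distribution π, while the MSE term regresses the value head onto the game outcome z.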