Combinatorial Multi-armed Bandits for Real-Time Strategy Games
Authors: Santiago Ontañón
JAIR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We analyze the theoretical properties of several variants of naïve sampling, and empirically compare it against the other existing strategies in the literature for CMABs. We then evaluate these strategies in the context of real-time strategy (RTS) games, a genre of computer games characterized by their very large branching factors. Our results show that as the branching factor grows, naïve sampling outperforms the other sampling strategies. |
| Researcher Affiliation | Academia | Santiago Ontañón EMAIL Computer Science Department, Drexel University, Philadelphia, PA, USA |
| Pseudocode | Yes | Algorithm 1 Select And Expand Node(n0) |
| Open Source Code | Yes | "A snapshot of all the source code and data necessary to reproduce all the experiments presented in this paper can be downloaded from the author's website." Footnote 2: https://sites.google.com/site/santiagoontanonvillar/code/Naive Sampling-journal-2016source-code.zip |
| Open Datasets | Yes | "A snapshot of all the source code and data necessary to reproduce all the experiments presented in this paper can be downloaded from the author's website." The µRTS game simulator is used as the application domain; it is a deterministic and fully-observable RTS game (although it can be configured for partial observability or non-determinism). Footnote 1: https://github.com/santiontanon/microrts |
| Dataset Splits | No | The paper describes experimental setups involving different game maps and round-robin tournaments, but does not use traditional train/test/validation dataset splits in the machine learning sense. The experiments are conducted within a simulator, and specific game situations are used, but these are not referred to as dataset splits. |
| Hardware Specification | Yes | In the Java µRTS implementation of these algorithms, running MC search takes about 271ms to run 1000 playouts in 8x8 maps, and about 615ms in 12x12 maps on an Intel Core i7 3.1GHz (the sampling strategy makes little difference, since the main bottleneck is running the forward model to run the playouts). |
| Software Dependencies | No | The paper mentions 'Java µRTS implementation' and provides a link to the µRTS simulator (https://github.com/santiontanon/microrts), but does not specify version numbers for Java, µRTS, or any other libraries or frameworks. |
| Experiment Setup | Yes | We used ε0 = 0.8 (emphasizing exploration), εl = 0.4, and εg = 0 (emphasizing the fact that the global MAB is used for exploitation), since they achieved the best results in our experiments. We performed a series of round-robin tournaments involving six different AIs... In each round-robin tournament, each AI played 20 games against each other AI (10 times as player max, and 10 times as player min) in each of the 8 maps, resulting in a total of 2400 games per tournament. We played six such round-robin tournaments, using a computation budget of 500, 1000, 2000, 3000, 4000 and 5000 playouts per game cycle respectively. As the reward function, we use the result of running a Monte Carlo simulation of the game during 100 game cycles (using a random action selection strategy), and then applying an evaluation function to the resulting game state. As the evaluation function, we used one of the built-in functions in µRTS (SimpleSqrtEvaluationFunction3). |
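For context, the ε-parameterized naive sampling strategy whose settings are quoted in the table (ε0 choosing between exploration via per-variable local MABs and exploitation via a global MAB over previously seen macro-arms) can be sketched roughly as follows. This is an illustrative Python sketch under our own naming, not the paper's Java µRTS implementation:

```python
import random
from collections import defaultdict

class NaiveSampling:
    """Rough sketch of naive sampling for a CMAB (illustrative, not the paper's code).

    A macro-arm is a tuple with one value per decision variable. With
    probability eps0 we explore: each variable's value is chosen by its own
    eps-greedy local MAB (parameter eps_l). Otherwise we exploit: an
    eps-greedy global MAB (parameter eps_g) picks among macro-arms sampled so far.
    """

    def __init__(self, arms_per_variable, eps0=0.8, eps_l=0.4, eps_g=0.0):
        self.arms = arms_per_variable  # list of lists: legal values per variable
        self.eps0, self.eps_l, self.eps_g = eps0, eps_l, eps_g
        self.local = defaultdict(lambda: [0.0, 0])    # (var index, value) -> [reward sum, count]
        self.global_ = defaultdict(lambda: [0.0, 0])  # macro-arm tuple -> [reward sum, count]

    def _eps_greedy(self, eps, candidates, stats):
        # Explore with probability eps, or when nothing has been tried yet;
        # otherwise pick the candidate with the highest average reward.
        if random.random() < eps or not any(stats[c][1] for c in candidates):
            return random.choice(candidates)
        return max(candidates,
                   key=lambda c: stats[c][0] / stats[c][1] if stats[c][1] else float("-inf"))

    def sample(self):
        if random.random() < self.eps0 or not self.global_:
            # Explore: assemble a macro-arm variable by variable via the local MABs.
            return tuple(
                self._eps_greedy(self.eps_l, [(i, v) for v in vals], self.local)[1]
                for i, vals in enumerate(self.arms)
            )
        # Exploit: eps-greedy over macro-arms already recorded in the global MAB.
        return self._eps_greedy(self.eps_g, list(self.global_.keys()), self.global_)

    def update(self, macro_arm, reward):
        # Credit the observed reward to each variable's local MAB and to the global MAB.
        for i, v in enumerate(macro_arm):
            s = self.local[(i, v)]
            s[0] += reward
            s[1] += 1
        g = self.global_[macro_arm]
        g[0] += reward
        g[1] += 1
```

With εg = 0, as in the quoted setup, the global MAB is purely greedy, matching the paper's remark that it is used for exploitation.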