Bridging Local and Global Knowledge via Transformer in Board Games
Authors: Yan-Ru Ju, Tai-Lin Wu, Chung-Chin Shih, Ti-Rong Wu
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | ResTNet improves playing strength across multiple board games, increasing win rate from 54.6% to 60.8% in 9x9 Go, 53.6% to 60.9% in 19x19 Go, and 50.4% to 58.0% in 19x19 Hex. In addition, ResTNet effectively processes global information and tackles two long-sequence patterns in 19x19 Go, namely circular patterns and ladder patterns. |
| Researcher Affiliation | Academia | Institute of Information Science, Academia Sinica, Taiwan EMAIL |
| Pseudocode | No | The paper describes methods and processes like the AlphaZero algorithm and network architecture (Figure 2) but does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://rlg.iis.sinica.edu.tw/papers/restnet. |
| Open Datasets | Yes | We trained these models using supervised learning on a human game collection. The collection contains a total of 1 million games played by 7 dan to 9 dan human Go players on Tygem [Cho and Corporation, 2001], a popular online Go platform. and Wang et al. [2023] provided a game collection containing 24 games, each featuring circular patterns... These games are available at https://goattack.far.ai/adversarial-policy-katago. |
| Dataset Splits | Yes | For 19x19 Hex, we train two 10-block models directly using the Gumbel AlphaZero algorithm with 32 simulations. Each model trains with 500,000 self-play games and 100,000 optimization steps. and After training, we evaluate these models on a separate testing ladder dataset consisting of 166,500 ladder patterns. |
| Hardware Specification | Yes | Each training generates a total of 1 million self-play games and includes 100,000 network optimization steps, requiring approximately 200 1080Ti GPU hours. |
| Software Dependencies | No | The paper mentions using the Gumbel AlphaZero algorithm [Danihelka et al., 2022] and an open-sourced AlphaZero framework [Wu et al., 2025], but does not provide specific software library names or their version numbers (e.g., Python, PyTorch, CUDA versions) used for implementation. |
| Experiment Setup | Yes | Each residual block consists of 256 filters, whereas each Transformer block comprises 81 (9×9) tokens with an embedding size of 256. and we train each network using the Gumbel AlphaZero algorithm [Danihelka et al., 2022] with 64 simulations based on an open-sourced AlphaZero framework [Wu et al., 2025]. Each training generates a total of 1 million self-play games and includes 100,000 network optimization steps... |
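The Experiment Setup row describes Transformer blocks operating on 81 (9×9) board tokens with an embedding size of 256, which is how ResTNet mixes global information into an otherwise local residual network. The sketch below is a minimal NumPy illustration of that global self-attention step only (flatten the board feature map into tokens, attend, reshape back); the weight matrices and initialization are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    # Single-head scaled dot-product attention.
    # tokens: (num_tokens, embed_dim) -> (num_tokens, embed_dim)
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return weights @ v

rng = np.random.default_rng(0)
d = 256                                  # embedding size from the paper
board = rng.normal(size=(d, 9, 9))       # 256-channel feature map over a 9x9 board
tokens = board.reshape(d, 81).T          # 81 tokens, one per intersection

# Hypothetical projection weights, small random init for the sketch.
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.02 for _ in range(3))

out = self_attention(tokens, Wq, Wk, Wv)  # every token attends to all 81 positions
board_out = out.T.reshape(d, 9, 9)        # back to a spatial feature map
```

Because every token attends to all 81 positions, one such block can propagate information across the whole board in a single step, which is the property the paper credits for handling long-sequence patterns like ladders.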