Understanding the learned look-ahead behavior of chess neural networks
Authors: Diogo Cruz
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate the look-ahead capabilities of chess-playing neural networks, specifically focusing on the Leela Chess Zero policy network. Our findings reveal that the network's look-ahead behavior is highly context-dependent, varying significantly based on the specific chess position. We demonstrate that the model can process information about board states up to seven moves ahead, utilizing similar internal mechanisms across different future time steps. All experiments were run using an RTX 3070Ti, with a combined runtime of 2 days. |
| Researcher Affiliation | Industry | Diogo Cruz EMAIL Pivotal |
| Pseudocode | No | The paper describes analysis techniques (Activation Patching, Probing, Ablation) but does not present them in a structured pseudocode or algorithm block format. It describes the methodology in narrative text. |
| Open Source Code | Yes | Our implementation is heavily based on the implementation described in Jenner et al. (2024), and previously made available at https://github.com/HumanCompatibleAI/leela-interp. For the activation patching, probing, and zero ablation results, modifications were made to account for the case of more than 3 moves. Code for reproducing our results is available at https://github.com/diogo-cruz/leela-interp. |
| Open Datasets | Yes | We use the Lichess 4-million-puzzle database as a starting point. Each puzzle in our dataset has a starting state with a single winning move for the player whose turn it is, along with an annotated principal variation (the optimal sequence of moves for both players from the starting state). Lichess. Lichess database: Puzzles. https://database.lichess.org/#puzzles, 2025. Data under CC0 1.0; puzzles file last updated 2025-08-02. |
| Dataset Splits | Yes | The puzzles were curated into three datasets: a 22k-puzzle dataset used in Jenner et al. (2024), solvable by the Leela model but difficult for weaker models, used for the 3- and 5-move analyses; a 2.2k dataset of 7-move puzzles; and 609 puzzles for the alternative-move analysis. Additional details on the dataset generation, and their difficulty level, can be found in Appendices F and H. |
| Hardware Specification | Yes | All experiments were run using an RTX 3070Ti, with a combined runtime of 2 days. |
| Software Dependencies | No | The paper mentions using a 'Leela Chess Zero (Leela) policy network' and that 'Our implementation is heavily based on the implementation described in Jenner et al. (2024)'. It also mentions 'Stockfish (depth 22, 8 threads, 2GB hash table, NNUE enabled)' for evaluation. However, specific version numbers for software components like Leela, Python, or machine learning frameworks (e.g., PyTorch, TensorFlow) are not provided. |
| Experiment Setup | Yes | We employ three main techniques to analyze the internal representations of the model: Activation Patching, Probing, and Ablation. For activation patching, we first run the model on the original position to get the clean activations. We then create a corrupted position by replacing specific moves in the game history and run the model on this corrupted position. Let m_c be the correct move, s_p be the patched model state, and s_c be the clean model state. The log odds change L of the target move is then defined as: L = log odds(m_c | s_p) − log odds(m_c | s_c). For probing, we extract activations from each attention head when running the model on chess positions. We then train a bilinear probe to predict the board square associated with the move of interest. For ablation, we selectively set certain activations to zero. The original 3-move dataset was created by starting from the Lichess chess puzzle database and filtering for puzzles where: the weaker model assigned less than a 5% probability to the optimal first move; the Leela model assigned at least a 50% probability to the 1st, 2nd, and 3rd optimal moves; the weaker model assigned more than a 70% probability to the optimal 2nd move. |
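The log-odds-change metric quoted above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names are ours, and the move probabilities would in practice come from the Leela policy network's output for the clean and patched runs.

```python
import math

def log_odds(p: float) -> float:
    """Log odds of a probability p, assumed strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def log_odds_change(p_patched: float, p_clean: float) -> float:
    """Change in log odds of the correct move m_c between the patched
    state s_p and the clean state s_c:

        L = log odds(m_c | s_p) - log odds(m_c | s_c)

    A negative L means the patch reduced the model's confidence in the
    correct move."""
    return log_odds(p_patched) - log_odds(p_clean)

# Hypothetical example: corrupting the game history drops the model's
# probability for the correct move from 0.80 to 0.50.
L = log_odds_change(0.50, 0.80)
```

Here `L = log(1) - log(4) ≈ -1.386`, quantifying how much the corruption hurt the correct move's log odds.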