The Computational Complexity of Circuit Discovery for Inner Interpretability
Authors: Federico Adolfi, Martina G. Vilas, Todd Wareham
ICLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | To address this, we study circuit discovery with classical and parameterized computational complexity theory: (1) we describe a conceptual scaffolding to reason about circuit finding queries in terms of affordances for description, explanation, prediction and control; (2) we formalize a comprehensive set of queries for mechanistic explanation, and propose a formal framework for their analysis; (3) we use it to settle the complexity of many query variants and relaxations of practical interest on multi-layer perceptrons. Our findings reveal a challenging complexity landscape. Many queries are intractable, remain fixed-parameter intractable relative to model/circuit features, and inapproximable under additive, multiplicative, and probabilistic approximation schemes. To navigate this landscape, we prove there exist transformations to tackle some of these hard problems with better-understood heuristics, and prove the tractability or fixed-parameter tractability of more modest queries which retain useful affordances. |
| Researcher Affiliation | Academia | Federico Adolfi ESI Neuroscience, Max-Planck Society & University of Bristol EMAIL Martina G. Vilas Department of Computer Science Goethe University Frankfurt EMAIL Todd Wareham Department of Computer Science Memorial University of Newfoundland EMAIL |
| Pseudocode | No | The paper defines various problems (e.g., Problem 1. BOUNDED LOCAL SUFFICIENT CIRCUIT (BLSC)) and discusses algorithms (e.g., "We describe an efficient algorithm to compute it..."), but it does not include any structured pseudocode or algorithm blocks in the main text. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the methodology described, nor does it provide any links to a code repository or mention code in supplementary materials. |
| Open Datasets | No | The paper is theoretical, focusing on computational complexity analysis. It does not describe or use any specific datasets in its own methodology or experiments, therefore it does not provide access information for any public datasets. |
| Dataset Splits | No | The paper is a theoretical work on computational complexity and does not conduct experiments involving datasets. Therefore, there is no information provided regarding dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup that would require specific hardware. No hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe any experimental setup that would require specific software dependencies with version numbers. No such information is provided. |
| Experiment Setup | No | The paper is theoretical and focuses on computational complexity analysis. It does not describe any experimental setup, hyperparameters, or system-level training settings. |