Position: A Theory of Deep Learning Must Include Compositional Sparsity
Authors: David A. Danhofer, Davide D'Ascenzo, Rafael Dubach, Tomaso A. Poggio
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this position paper we argue that it is the ability of DNNs to exploit the compositionally sparse structure of the target function that drives their success. As such, DNNs can leverage the property that most practically relevant functions can be composed from a small set of constituent functions, each of which relies only on a low-dimensional subset of all inputs. We show that this property is shared by all efficiently Turing-computable functions and is therefore highly likely to be present in all current learning problems. While some promising theoretical insights on questions concerned with approximation and generalization exist in the setting of compositionally sparse functions, several important questions on the learnability and optimization of DNNs remain. |
| Researcher Affiliation | Academia | 1Center for Brains, Minds and Machines (CBMM), MIT, Cambridge, MA, USA 2ETH Zurich, Zurich, Switzerland 3Politecnico di Torino, Torino, Italy 4University of Milan, Milan, Italy 5University of Zurich, Zurich, Switzerland. Correspondence to: Davide D'Ascenzo <EMAIL>. |
| Pseudocode | No | The paper includes mathematical definitions, theorems, and proofs (e.g., in Appendix A), but it does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured, code-like steps for a procedure. |
| Open Source Code | No | The paper does not contain any statements about releasing code, links to code repositories, or mentions of code being available in supplementary materials for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not present experiments that would require specific datasets with access information provided by the authors. While it refers to datasets used in other research (e.g., ImageNet, AlphaGo, LLM training corpora), it does not provide access information for a dataset it uses for its own analysis or experiments. |
| Dataset Splits | No | The paper is theoretical and does not present experiments or analyze specific datasets, therefore, it does not provide any information regarding dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not present any experimental results. Therefore, it does not specify any hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not present any experimental results. Therefore, it does not specify any software dependencies with version numbers needed to replicate an experiment. |
| Experiment Setup | No | The paper is theoretical and does not describe any specific experimental setup, hyperparameters, or system-level training settings as it does not present its own experiments. |
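The paper's central notion, reported in the Research Type row above, is that practically relevant target functions are *compositionally sparse*: they are built from a small set of constituent functions, each depending on only a low-dimensional subset of the inputs. The following sketch is purely illustrative and not from the paper; the function `g` is a hypothetical two-argument constituent, and the binary-tree composition is one simple way such a function over many inputs can arise.

```python
# Illustrative sketch (not from the paper): a compositionally sparse
# function of 8 inputs, built as a binary tree of constituent
# functions that each see only 2 arguments.

def g(a, b):
    # Hypothetical low-dimensional constituent: depends on just 2 inputs.
    return a * b + 1.0

def compositional(xs):
    # Repeatedly combine adjacent pairs until a single value remains,
    # so every node in the composition tree has fan-in 2.
    level = list(xs)
    while len(level) > 1:
        level = [g(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

print(compositional([1.0] * 8))  # three levels of pairwise composition
```

Although `compositional` maps 8 inputs to 1 output, no constituent ever processes more than 2 values at once; this is the kind of structure the paper argues DNNs exploit to avoid the curse of dimensionality.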