From Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning
Authors: Noa Rubin, Kirsten Fischer, Javed Lindner, Inbar Seroussi, Zohar Ringel, Michael Krämer, Moritz Helias
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This work presents a theoretical framework of multi-scale adaptive feature learning bridging these two views. Using methods from statistical mechanics, we derive analytical expressions for network output statistics which are valid across scaling regimes and in the continuum between them. ... In Fig. 2, we compare theoretical values for training and test discrepancies against empirical measurements for linear networks trained on a linearly separable Ising task (see App. C.3 for details). Comparing to the NNGP as a baseline, we find that, while the NNGP fails to match network outputs, the multi-scale adaptive theory accurately predicts the values observed in trained networks. |
| Researcher Affiliation | Academia | 1The Racah Institute of Physics, The Hebrew University of Jerusalem, Jerusalem, Israel 2Institute for Advanced Simulation (IAS-6), Computational and Systems Neuroscience, Jülich Research Centre, Jülich, Germany 3RWTH Aachen University, Aachen, Germany 4Department of Physics, RWTH Aachen University, Aachen, Germany 5Institute for Theoretical Particle Physics and Cosmology, RWTH Aachen University, Aachen, Germany 6Department of Applied Mathematics, School of Mathematical Sciences, Tel-Aviv University, Tel-Aviv, Israel. Correspondence to: Noa Rubin <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Annealing of solutions across scaling regimes. Input: data X, labels Y, scales {χᵢ}ᵢ. Compute NNGP train predictors f_α^NNGP from data X and labels Y; set the initial value to the NNGP predictor f_α^NNGP. For χ in {χᵢ}ᵢ: set g_w ← g_w/χ; solve the self-consistency equations for the tree-level approximation f_α^TL with initial value f_α^NNGP; solve the self-consistency equations for the one-loop approximation f_α^1-Loop with initial value f_α^TL. End for. |
| Open Source Code | Yes | The code for theory and experiments can be found at https://doi.org/10.5281/zenodo.15480898. |
| Open Datasets | Yes | In addition, our theory does not make any assumptions on the data set; we show results for an Ising task, a teacher-student task, and MNIST. |
| Dataset Splits | Yes | Parameters: γ = 1, P_train = 80, N = 100, D = 200, κ0 = 1, P_test = 10^3, g_v = g_w = 0.5, p = 0.1. |
| Hardware Specification | No | The authors gratefully acknowledge the computing time granted by the JARA Vergabegremium and provided on the JARA Partition part of the supercomputer JURECA at Forschungszentrum Jülich (computation grant JINB33). |
| Software Dependencies | No | The time-discrete version of (164) is implemented in our PyTorch code as |
| Experiment Setup | Yes | Parameters: γ = 1, P_train = 80, N = 100, D = 200, κ0 = 1, P_test = 10^3, g_v = g_w = 0.5, p = 0.1. |
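The annealing procedure quoted in the Pseudocode row (Algorithm 1) warm-starts each self-consistent solution from the one found at the previous scale: the NNGP predictor seeds the tree-level solution, which in turn seeds the one-loop solution, and the converged result carries over to the next value of χ. A minimal sketch of that control flow, assuming hypothetical `tree_level_update` and `one_loop_update` callables standing in for the paper's actual self-consistency equations:

```python
import numpy as np

def fixed_point(update, f0, tol=1e-8, max_iter=1000):
    # Generic fixed-point iteration: repeat f <- update(f) until converged.
    f = np.asarray(f0, dtype=float)
    for _ in range(max_iter):
        f_new = update(f)
        if np.max(np.abs(f_new - f)) < tol:
            return f_new
        f = f_new
    return f

def anneal_predictors(f_nngp, scales, tree_level_update, one_loop_update):
    """Sketch of Algorithm 1: anneal self-consistent solutions across scales.

    tree_level_update / one_loop_update are placeholder callables
    (f, chi) -> f; the paper's real updates follow from its
    tree-level and one-loop self-consistency equations.
    """
    f = np.asarray(f_nngp, dtype=float)  # start from the NNGP predictor
    for chi in scales:  # step away from the NNGP limit scale by scale
        # tree-level solution, initialized from the previous solution
        f_tl = fixed_point(lambda g: tree_level_update(g, chi), f)
        # one-loop solution, initialized from the tree-level solution
        f = fixed_point(lambda g: one_loop_update(g, chi), f_tl)
    return f

# Toy contraction maps with known fixed points, for illustration only.
y = np.array([1.0, -1.0])
tl = lambda g, chi: 0.5 * g + 0.5 * y / (1.0 + chi)
ol = lambda g, chi: 0.5 * g + 0.5 * y / (1.0 + 0.9 * chi)
out = anneal_predictors(np.zeros(2), [0.1, 1.0, 10.0], tl, ol)
```

With these toy maps the final one-loop fixed point at χ = 10 is y / (1 + 0.9 · 10) = y / 10; the warm starts only speed up convergence, mirroring how the algorithm continues solutions smoothly between scaling regimes rather than solving each scale from scratch.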