Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry

Authors: Chi-Ning Chou, Hang Le, Yichen Wang, SueYeon Chung

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show, in both theoretical and empirical settings, that as networks learn features, task-relevant manifolds untangle, with changes in manifold geometry revealing distinct learning stages and strategies beyond the lazy-rich dichotomy. This framework provides novel insights into feature learning across neuroscience and machine learning, shedding light on structural inductive biases in neural circuits and the mechanisms underlying out-of-distribution generalization."
Researcher Affiliation | Collaboration | "¹Center for Computational Neuroscience, Flatiron Institute, New York, NY, USA; ²University of California, Los Angeles (UCLA), Los Angeles, CA, USA; ³Center for Neural Science, New York University, New York, NY, USA. Correspondence to: Chi-Ning Chou <EMAIL>, Hang Le <EMAIL>, SueYeon Chung <EMAIL>."
Pseudocode | Yes | "Algorithm 1: Estimate simulated manifold capacity... Algorithm 2: Estimate manifold capacity and effective geometric measures"
Open Source Code | Yes | "All code required to reproduce the figures presented is available under an MIT License at https://github.com/chungneuroai-lab/feature-learning-geometry"
Open Datasets | Yes | "Specifically, we considered VGG-11 (Simonyan & Zisserman, 2015) and ResNet-18 (He et al., 2016) and datasets CIFAR-10 (Krizhevsky & Hinton, 2009), CIFAR-100 (Krizhevsky & Hinton, 2009), CIFAR-10C (Hendrycks & Dietterich, 2018)."
Dataset Splits | Yes | "The CIFAR-10 dataset (Krizhevsky & Hinton, 2009) consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. ... The CIFAR-100 dataset (Krizhevsky & Hinton, 2009) is similar to CIFAR-10, except that it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class."
Hardware Specification | No | "All experiments were performed using the Flatiron Institute's high-performance computing cluster."
Software Dependencies | No | "Optimizer: We use Stochastic Gradient Descent with momentum (implemented as torch.optim.SGD(momentum=0.9)) to train the models. ... The error bar indicates the bootstrapped 95% confidence interval calculated using seaborn.lineplot(errorbar=('ci', 95))."
Experiment Setup | Yes | "Optimizer: We use Stochastic Gradient Descent with momentum (implemented as torch.optim.SGD(momentum=0.9)) to train the models. Data augmentation: We apply the following data augmentation during training: RandomCrop(32, padding=4), RandomHorizontalFlip. Learning rate and learning schedule: We follow the practice in (Chizat et al., 2019) and set initial learning rate η0 = 1.0 for VGG-11 and η0 = 0.2 for ResNet-18. The learning rate schedule is defined as ηt = η0 / (1 + (1/3)t)."
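The dataset splits quoted above (balanced per-class splits, 50000 train / 10000 test for both CIFAR variants) can be sanity-checked with a short sketch; the dictionary layout and helper name here are illustrative, not taken from the paper's code:

```python
# Per-class split sizes as quoted from the CIFAR dataset descriptions.
CIFAR10 = {"classes": 10, "train_per_class": 5000, "test_per_class": 1000}
CIFAR100 = {"classes": 100, "train_per_class": 500, "test_per_class": 100}

def totals(cfg):
    """Return (train_total, test_total) implied by a balanced per-class split."""
    return (cfg["classes"] * cfg["train_per_class"],
            cfg["classes"] * cfg["test_per_class"])

# Both datasets share the same overall 50000/10000 train/test split.
assert totals(CIFAR10) == totals(CIFAR100) == (50000, 10000)
```

In torchvision these splits are selected with the `train=True` / `train=False` flag of `torchvision.datasets.CIFAR10` and `torchvision.datasets.CIFAR100`.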
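The bootstrapped 95% confidence interval that seaborn.lineplot(errorbar=('ci', 95)) draws can be sketched as a percentile bootstrap in plain Python; the helper name, resample count, and example data below are illustrative, not the paper's code:

```python
import random
import statistics

def bootstrap_ci(data, n_boot=1000, ci=95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)
    # Resample with replacement, compute the mean of each resample, and sort.
    means = sorted(
        statistics.fmean(rng.choices(data, k=len(data))) for _ in range(n_boot)
    )
    alpha = (100 - ci) / 2  # e.g. 2.5 for a 95% interval
    lo = means[int(alpha / 100 * n_boot)]
    hi = means[min(int((100 - alpha) / 100 * n_boot), n_boot - 1)]
    return lo, hi

lo, hi = bootstrap_ci([1.0, 2.0, 3.0, 4.0, 5.0])
```

Seaborn performs the same kind of resampling internally (over the values at each x position) before drawing the shaded error band.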