A Rainbow in Deep Network Black Boxes
Authors: Florentin Guth, Brice Ménard, Gaspar Rochette, Stéphane Mallat
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also verify numerically our modeling assumptions on deep CNNs trained on image classification tasks, and show that the trained networks approximately satisfy the rainbow hypothesis. In particular, rainbow networks sampled from the corresponding random feature model achieve similar performance as the trained networks. Our results highlight the central role played by the covariances of network weights at each layer, which are observed to be low-rank as a result of feature learning. Keywords: deep neural networks, infinite-width limit, random features, representation alignment, weight covariance. |
| Researcher Affiliation | Academia | Florentin Guth EMAIL Center for Data Science, New York University, 60 5th Avenue, New York, NY 10011, USA; Flatiron Institute, 162 5th Avenue, New York, NY 10010, USA. Brice Ménard EMAIL Department of Physics & Astronomy, Johns Hopkins University, Baltimore, MD 21218, USA. Gaspar Rochette EMAIL Département d'informatique, École Normale Supérieure, CNRS, PSL University, 45 rue d'Ulm, 75005 Paris, France. Stéphane Mallat EMAIL Collège de France, 11 place Marcelin-Berthelot, 75231 Paris, France; Flatiron Institute, 162 5th Avenue, New York, NY 10010, USA |
| Pseudocode | No | The paper describes methods verbally and mathematically but does not present explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code to reproduce all our experiments can be found at https://github.com/FlorentinGuth/Rainbow. |
| Open Datasets | Yes | Architectures and tasks. In this paper, we consider two architectures, learned scattering networks (Zarka et al., 2021; Guth et al., 2022) and ResNets (He et al., 2016), trained on two image classification datasets, CIFAR-10 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015). |
| Dataset Splits | Yes | The alignment rotations are computed using the CIFAR-10 train set, while network accuracy is evaluated on the test set, so that the measured performance is not a result of overfitting. |
| Hardware Specification | No | We thank the Scientific Computing Core at the Flatiron Institute for the use of their computing resources. |
| Software Dependencies | No | Network weights are initialized with i.i.d. samples from a uniform distribution (Glorot and Bengio, 2010) with so-called Kaiming variance scaling (He et al., 2015), which is the default in the PyTorch library (Paszke et al., 2019). |
| Experiment Setup | Yes | Scattering networks are trained for 150 epochs with an initial learning rate of 0.01 which is divided by 10 every 50 epochs, with a batch size of 128. ResNets are trained for 90 epochs with an initial learning rate of 0.1 which is divided by 10 every 30 epochs, with a batch size of 256. We use the SGD optimizer with a momentum of 0.9 and a weight decay of 10⁻⁴ (except for Figures 4 and 10 where weight decay has been disabled). We use classical data augmentations: horizontal flips and random crops for CIFAR, random resized crops of size 224 and horizontal flips for ImageNet. |
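The step learning-rate schedule quoted in the Experiment Setup row (initial rate divided by 10 every fixed number of epochs) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name `step_lr` is hypothetical, though the behavior matches PyTorch's built-in `StepLR` scheduler that such a setup would typically use.

```python
def step_lr(initial_lr: float, epoch: int, step_size: int, gamma: float = 0.1) -> float:
    """Learning rate at a given epoch under a step schedule:
    multiplied by `gamma` (here 1/10) every `step_size` epochs."""
    return initial_lr * gamma ** (epoch // step_size)

# Scattering networks: 150 epochs, lr 0.01 divided by 10 every 50 epochs
scattering_lrs = [step_lr(0.01, e, 50) for e in (0, 50, 100)]

# ResNets: 90 epochs, lr 0.1 divided by 10 every 30 epochs
resnet_lrs = [step_lr(0.1, e, 30) for e in (0, 30, 60)]
```

With these settings, the scattering network sees rates of 0.01, 0.001, and 0.0001 over its three 50-epoch phases, and the ResNet sees 0.1, 0.01, and 0.001 over its three 30-epoch phases.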