A Bregman Learning Framework for Sparse Neural Networks

Authors: Leon Bungert, Tim Roith, Daniel Tenbrinck, Martin Burger

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In Section 4 we first discuss our statistical sparse initialization strategy and then evaluate our algorithms on benchmark data sets (MNIST, Fashion-MNIST, CIFAR-10) using feedforward, convolutional, and residual neural networks. Table 1 shows that all algorithms manage to compute very sparse networks with ca. 2% drop in test accuracy on Fashion-MNIST, compared to vanilla dense training with Adam. Table 3 shows the resulting sparsity levels of the total number of parameters and the percentage of non-zero convolutional kernels as well as the train and test accuracies."
Researcher Affiliation | Academia | Leon Bungert, Hausdorff Center for Mathematics, University of Bonn, Endenicher Allee 62, Villa Maria, 53115 Bonn, Germany. Tim Roith, Department of Mathematics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Cauerstraße 11, 91058 Erlangen, Germany.
Pseudocode | Yes | "Algorithm 1: LinBreg, an inverse scale space algorithm for training sparse neural networks by successively adding weights whilst minimizing the loss. Algorithm 2: LinBreg with momentum, an acceleration of LinBreg using momentum-based gradient memory. Algorithm 3: AdaBreg, a Bregman version of the Adam algorithm which uses moment-based bias correction."
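The core of the LinBreg update (Algorithm 1) can be illustrated with a pure-Python toy: a dual variable accumulates gradients while the primal weights are obtained by soft-thresholding, so weights "activate" one at a time (inverse scale space). This is a minimal sketch on a separable quadratic loss, not the authors' implementation; the loss, `tau`, `delta`, and all function names are illustrative assumptions.

```python
# Toy sketch of a linearized Bregman iteration in the spirit of LinBreg.
# Loss: L(theta) = 0.5 * sum((theta_i - t_i)^2) for a fixed target t.
# The dual variable v accumulates gradient steps; theta = shrink(v, delta)
# applies soft-thresholding, keeping weights exactly zero until their
# accumulated gradient signal exceeds delta.

def shrink(x, delta):
    """Soft-thresholding: the proximal map of delta * |.|_1, applied elementwise."""
    return [max(abs(xi) - delta, 0.0) * (1.0 if xi > 0 else -1.0) for xi in x]

def linbreg(target, tau=0.1, delta=0.5, iters=500):
    n = len(target)
    v = [0.0] * n       # dual (subgradient) variable
    theta = [0.0] * n   # primal weights, start fully sparse
    for _ in range(iters):
        grad = [th - t for th, t in zip(theta, target)]  # gradient of the toy loss
        v = [vi - tau * gi for vi, gi in zip(v, grad)]   # dual gradient step
        theta = shrink(v, delta)                         # sparse primal update
    return theta

weights = linbreg([3.0, 0.0, -2.0, 0.0])
# components with no gradient signal never leave zero, so sparsity is preserved
```

On this toy problem the iteration recovers the non-zero target entries while the zero entries stay exactly zero, which is the qualitative behaviour the paper exploits for sparse training.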
Open Source Code | Yes | Code is available at https://github.com/TimRoith/BregmanLearning.
Open Datasets | Yes | "We consider the classification task on the MNIST dataset (LeCun and Cortes, 2010) for studying the impact of the hyperparameters of these methods. The set consists of 60,000 images of handwritten digits which we split into 55,000 images used for the training and 5,000 images used for a validation process during training. We train a fully connected net with ReLU activations and two hidden layers (200 and 80 neurons), and use the ℓ1-regularization from (1.13), on the MNIST dataset... In this example we apply our algorithms to a convolutional neural network... to solve the classification task on Fashion-MNIST. In this experiment we trained a ResNet-18 architecture for classification on CIFAR-10."
Dataset Splits | Yes | "The set consists of 60,000 images of handwritten digits which we split into 55,000 images used for the training and 5,000 images used for a validation process during training."
Hardware Specification | No | No specific hardware details (such as GPU/CPU models or types) are provided in the paper.
Software Dependencies | No | "Our code is available on GitHub at https://github.com/TimRoith/BregmanLearning and relies on PyTorch (Paszke et al., 2019)." No specific PyTorch version number is given.
Experiment Setup | Yes | "The learning rate is chosen as τ = 0.1 and is multiplied by a factor of 0.5 whenever the validation accuracy stagnates. We initialize the weights with 1% non-zero entries, i.e., r = 0.01."
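The two setup choices quoted above can be sketched in a few lines of pure Python: an initialization with a fraction r = 0.01 of non-zero weights, and a schedule that multiplies the learning rate by 0.5 when validation accuracy stagnates. The function names, the uniform weight density, the `patience` window, and the tolerance `eps` are illustrative assumptions, not details from the paper.

```python
import random

def sparse_init(n, r=0.01, scale=1.0, seed=0):
    """Return n weights with only round(r * n) non-zero entries (rest exactly 0)."""
    rng = random.Random(seed)
    nonzero = set(rng.sample(range(n), max(1, round(r * n))))
    # non-zero entries drawn from a scaled uniform distribution (placeholder density)
    return [rng.uniform(-scale, scale) if i in nonzero else 0.0 for i in range(n)]

def decay_on_plateau(lr, val_acc_history, factor=0.5, patience=3, eps=1e-4):
    """Multiply lr by factor if validation accuracy has not improved
    over the last `patience` epochs."""
    if len(val_acc_history) > patience:
        recent = max(val_acc_history[-patience:])
        before = max(val_acc_history[:-patience])
        if recent <= before + eps:
            return lr * factor
    return lr

w = sparse_init(10_000, r=0.01)                          # 1% non-zero weights
lr = decay_on_plateau(0.1, [0.90, 0.91, 0.91, 0.91, 0.91])  # stagnating accuracy
```

In a PyTorch training loop the same plateau behaviour would typically be delegated to `torch.optim.lr_scheduler.ReduceLROnPlateau`; the sketch above only makes the logic explicit.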