Variational Stochastic Gradient Descent for Deep Neural Networks

Authors: Anna Kuzina, Haotian Chen, Babak Esmaeili, Jakub M. Tomczak

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Lastly, we carry out experiments on two image classification datasets and four deep neural network architectures, where we show that VSGD outperforms Adam and SGD."
Researcher Affiliation | Academia | Haotian Chen (EMAIL), Department of Mathematics and Computer Science, Eindhoven University of Technology, Netherlands; Anna Kuzina (EMAIL), Department of Computer Science, Vrije Universiteit Amsterdam, Netherlands; Babak Esmaeili (EMAIL), Department of Mathematics and Computer Science, Eindhoven University of Technology, Netherlands; Jakub M. Tomczak (EMAIL), Department of Mathematics and Computer Science, Eindhoven University of Technology, Netherlands.
Pseudocode | Yes | Algorithm 1 (VSGD):
  Input: SVI learning-rate parameters {κ1, κ2}, learning rate η, prior strength γ, prior variance ratio K_g
  Initialize: θ_0; a_{0,g} = γ; a_{0,ĝ} = γ; b_{0,g} = γ; b_{0,ĝ} = K_g·γ; μ_{0,g} = 0
  for t = 1 to T do
      Compute ĝ_t for L(θ; ·)
      ρ_{t,1} = t^{−κ1};  ρ_{t,2} = t^{−κ2}
      Update σ²_{t,g}, μ_{t,g}  (Eqs. 14, 15)
      Update a_{t,g}, a_{t,ĝ}  (Eq. 16)
      Update b_{t,g}, b_{t,ĝ}  (Eqs. 19, 20)
      Update θ_t  (Eq. 23)
  end for
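The loop structure of Algorithm 1 can be sketched in code. The helper below is an illustrative stand-in, not the paper's method: the closed-form posterior updates inside `vsgd_step` are simplified Gaussian/Gamma-style updates that only mimic the shape of Eqs. 14-23 (which are not reproduced in this report), and the hyperparameter values are assumed for the demo.

```python
import numpy as np

def vsgd_init(theta, gamma=1e-8, K_g=30.0):
    """State for the sketch: posterior mean of the true gradient and
    Gamma parameters (shape a, rate b) for the two precision variables."""
    return (np.zeros_like(theta),               # mu_g: posterior mean of g
            gamma, gamma,                       # a_g, a_ghat (Gamma shapes)
            np.full_like(theta, gamma),         # b_g (Gamma rates)
            np.full_like(theta, K_g * gamma))   # b_ghat

def vsgd_step(theta, g_hat, state, t,
              eta=0.1, kappa1=0.81, kappa2=0.9, gamma=1e-8):
    """One VSGD-style update; the update formulas are simplified stand-ins
    for the paper's Eqs. 14-23, kept only to show the algorithm's skeleton."""
    mu_g, a_g, a_gh, b_g, b_gh = state
    rho1, rho2 = t ** -kappa1, t ** -kappa2         # SVI step sizes
    lam_g, lam_gh = a_g / b_g, a_gh / b_gh          # expected precisions
    var_g = 1.0 / (lam_g + lam_gh)                  # posterior variance of g
    mu_new = var_g * (lam_g * mu_g + lam_gh * g_hat)  # precision-weighted mean
    mu_g = (1.0 - rho1) * mu_g + rho1 * mu_new      # smoothed mean update
    a_g = a_gh = gamma + 0.5                        # conjugate Gamma shapes
    b_g = (1.0 - rho2) * b_g + rho2 * (gamma + 0.5 * (mu_g ** 2 + var_g))
    b_gh = (1.0 - rho2) * b_gh + rho2 * (gamma + 0.5 * ((g_hat - mu_g) ** 2 + var_g))
    theta = theta - eta * mu_g                      # descend along posterior mean
    return theta, (mu_g, a_g, a_gh, b_g, b_gh)

# Usage: minimize f(theta) = 0.5 * ||theta||^2 from noisy gradient estimates.
rng = np.random.default_rng(0)
theta = np.array([5.0, -3.0])
state = vsgd_init(theta)
for t in range(1, 201):
    g_hat = theta + 0.1 * rng.standard_normal(theta.shape)
    theta, state = vsgd_step(theta, g_hat, state, t)
```

The Kalman-like step (precision-weighted averaging of the previous mean and the noisy gradient) is what distinguishes this family of optimizers from plain momentum: the effective smoothing adapts to the estimated gradient noise instead of using a fixed coefficient.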
Open Source Code | Yes | Code is available at github.com/generativeai-tue/vsgd
Open Datasets | Yes | "Data: We used three benchmark datasets: CIFAR100 (Krizhevsky et al., 2009), Tiny Imagenet-200 (Deng et al., 2009a), and Imagenet-1k (Deng et al., 2009b)."
Dataset Splits | Yes | "The CIFAR100 dataset contains 60000 small (32×32) RGB images labeled into 100 different classes; 50000 images are used for training, and 10000 are left for testing. In the case of Tiny Imagenet-200, the models are trained on 100000 images from 200 different classes and tested on 10000 images."
Hardware Specification | Yes | "Table 2: Average training time on GeForce RTX 2080 Ti (seconds per training iteration) on CIFAR100 dataset."
Software Dependencies | No | The paper mentions open-source implementations for VGG, ConvMixer, and ResNeXt, e.g., 'github.com/alecwangcq/KFAC-Pytorch/blob/master/models/cifar/vgg.py', but does not provide specific version numbers for software libraries such as PyTorch or other dependencies.
Experiment Setup | Yes | "Hyperparameters: We conducted a grid search over the following hyperparameters: learning rate (all optimizers); weight decay (AdamW, VSGD); momentum coefficient (SGD). For each set of hyperparameters, we trained the models with three different random seeds and chose the best one based on the validation dataset. The complete set of hyperparameters used in all experiments is reported in Table 3. Furthermore, we apply a learning rate scheduler, which halves the learning rate every 10000 training iterations for CIFAR100 and every 20000 iterations for Tiny Imagenet-200. We train VGG and ConvMixer using batch size 256 for CIFAR100 and batch size 128 for Tiny Imagenet-200. We use a smaller batch size (128 for CIFAR100 and 64 for Tiny Imagenet-200) with the ResNeXt architecture to fit training on a single GPU."
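The halving schedule quoted above is a plain step decay and can be written as a one-line function. This is a hypothetical sketch of the described behavior (the paper does not name a specific scheduler implementation); `scheduled_lr` and its `every` parameter are names introduced here for illustration.

```python
def scheduled_lr(base_lr: float, iteration: int, every: int = 10000) -> float:
    """Step decay: halve the learning rate after every `every` training
    iterations (10000 for CIFAR100, 20000 for Tiny Imagenet-200)."""
    return base_lr * 0.5 ** (iteration // every)

# CIFAR100 schedule: halve every 10000 iterations.
print(scheduled_lr(0.1, 0))                    # -> 0.1
print(scheduled_lr(0.1, 10000))                # -> 0.05
# Tiny Imagenet-200 schedule: halve every 20000 iterations.
print(scheduled_lr(0.1, 20000, every=20000))   # -> 0.05
```

In a PyTorch training loop the same behavior could be obtained with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=10000, gamma=0.5)` stepped once per iteration, though the paper does not state which mechanism was used.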