Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On the numerical reliability of nonsmooth autodiff: a MaxPool case study

Authors: Ryan Boustany

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental This paper considers the reliability of automatic differentiation for neural networks involving the nonsmooth MaxPool operation across various precision levels (16, 32, 64 bits), architectures (LeNet, VGG, ResNet), and datasets (MNIST, CIFAR10, SVHN, ImageNet). ... Using SGD for training, we found that nonsmooth MaxPool Jacobians with lower norms maintain stable and efficient test accuracy, while higher norms can result in instability and decreased performance. ... In Section 4, we present detailed experiments on neural network training.
Researcher Affiliation Collaboration Ryan Boustany (Toulouse School of Economics, Université de Toulouse; Thales LAS France)
Pseudocode Yes
    def max1(x):
        res = x[0]
        for i in range(1, 4):
            if x[i] > res:
                res = x[i]
        return res

    def max2(x):
        return torch.max(x)

    def zero(t):
        z = t * x
        return max1(z) - max2(z)
Figure 7: Implementation of programs max1, max2 and zero using PyTorch. Programs max1 and max2 are equivalent implementations of max, but have different derivatives due to their implementations.
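The point of Figure 7 can be illustrated without PyTorch: two implementations of max that agree everywhere in value can still induce different autodiff derivatives at tie points. The sketch below is a minimal pure-Python illustration, not the paper's code; `grad_max2`'s split-the-tie convention is a hypothetical stand-in for how a built-in max reduction might backprop through ties.

```python
def max1(x):
    # Loop-based max: on ties, the earliest maximal index wins the backprop path.
    res, arg = x[0], 0
    for i in range(1, len(x)):
        if x[i] > res:
            res, arg = x[i], i
    return res, arg

def grad_max1(x):
    # Derivative induced by max1: all gradient mass goes to the argmax the loop picked.
    _, arg = max1(x)
    g = [0.0] * len(x)
    g[arg] = 1.0
    return g

def grad_max2(x):
    # Hypothetical alternative convention: split the gradient equally among all
    # maximal entries. Equivalent in value to max1, different in derivative.
    m = max(x)
    ties = [i for i, v in enumerate(x) if v == m]
    g = [0.0] * len(x)
    for i in ties:
        g[i] = 1.0 / len(ties)
    return g

x = [1.0, 1.0, 0.0, 0.0]
assert max1(x)[0] == max(x)   # identical values...
print(grad_max1(x))           # [1.0, 0.0, 0.0, 0.0]
print(grad_max2(x))           # [0.5, 0.5, 0.0, 0.0]  ...different derivatives
```

A `zero`-style program built from two such implementations evaluates to 0 everywhere, yet its autodiff "gradient" can be nonzero at ties, which is the pathology the paper probes.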
Open Source Code Yes All experiments were conducted using PyTorch (Paszke et al., 2019), and our source code is publicly available at https://github.com/ryanboustany/MaxPool-numerical
Open Datasets Yes Our experiments used the CIFAR10 (Krizhevsky & Hinton, 2010), MNIST (LeCun et al., 1998) and ImageNet (Deng et al., 2009) datasets. ... Datasets: In this work, we utilized various well-known image classification benchmarks: MNIST, CIFAR10, SVHN, and ImageNet. The corresponding references for these datasets are LeCun et al. (1998); Krizhevsky & Hinton (2010); Netzer et al. (2011).
Dataset Splits Yes Appendix C.1 Benchmark datasets and architectures ...
Dataset | Dimensionality | Training set | Test set
MNIST | 28x28 (grayscale) | 60K | 10K
CIFAR10 | 32x32 (RGB) | 60K | 10K
SVHN | 32x32 (RGB) | 600K | 26K
ImageNet | 224x224 (RGB) | 1.3M | 50K
Hardware Specification Yes Experiments were conducted with PyTorch on Nvidia V100 GPUs ... Computational Resources: All the experiments were conducted on four Nvidia V100 GPUs.
Software Dependencies No The paper mentions "PyTorch (Paszke et al., 2019)" multiple times but does not specify the version of PyTorch or of any other software used in the experiments. Figures 7 and 9 show Python code using `torch`, but no version is specified.
Experiment Setup Yes Training settings: The default optimizer is SGD. ... with mini-batches B_q of sizes |B_q| ∈ {1, ..., N}, where α_q > 0 is the learning rate for each mini-batch q. Each program P_i in P = {P_i}_{i=1}^{N} implements a function l_i (as in Definition 1). The SGD algorithm updates network parameters θ_{q,P} by:

    θ_{q+1,P} = θ_{q,P} − γ α_q Σ_{i ∈ B_q} backprop[P_i](θ_{q,P})    (16)

with γ > 0 indicating the step-size parameter. ... We trained seven VGG11 networks {P_i}_{i=0}^{6} at 32-bit precision on CIFAR10 for 200 epochs, using 128-size mini-batches, a fixed learning rate α_q = 1.0 for each mini-batch q, and step-size parameter γ ∈ [0.01, 0.012].
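The update in Eq. (16) can be sketched as a single function. This is a minimal pure-Python illustration of the sum-over-mini-batch structure only, not the paper's training code; `backprop(i, theta)` is a hypothetical stand-in for backprop[P_i](θ_{q,P}).

```python
def sgd_step(theta, backprop, batch, gamma, alpha_q):
    # One update of Eq. (16):
    #   theta <- theta - gamma * alpha_q * sum_{i in B_q} backprop[P_i](theta)
    total = [0.0] * len(theta)
    for i in batch:                      # accumulate per-sample gradients over B_q
        for j, g in enumerate(backprop(i, theta)):
            total[j] += g
    return [t - gamma * alpha_q * s for t, s in zip(theta, total)]

# Toy usage: two parameters, constant per-sample gradient [1.0, -1.0],
# gamma = 0.01 and alpha_q = 1.0 as in the paper's VGG11 setting.
theta = sgd_step([0.0, 0.0], lambda i, th: [1.0, -1.0],
                 batch=[0, 1], gamma=0.01, alpha_q=1.0)
print(theta)  # [-0.02, 0.02]
```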