Guarantees of confidentiality via Hammersley-Chapman-Robbins bounds

Authors: Kamalika Chaudhuri, Chuan Guo, Laurens van der Maaten, Saeed Mahloujifar, Mark Tygert

Venue: TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments indicate that the HCR bounds are on the precipice of being effectual for small neural nets with the data sets, MNIST and CIFAR-10... Section 3 reports numerical experiments with the popular data sets MNIST, CIFAR-10, and ImageNet-1000 for image classification, when processed with standard architectures such as ResNets and Swin Transformers as well as with some especially simple, illustrative neural nets.
Researcher Affiliation | Industry | Kamalika Chaudhuri (EMAIL), Chuan Guo (EMAIL), Laurens van der Maaten (EMAIL), Saeed Mahloujifar (EMAIL), and Mark Tygert (EMAIL), all of Fundamental Artificial Intelligence Research, Meta Platforms, Inc.
Pseudocode | Yes | Algorithm 1: Calculation of a perturbation ε to the vector of parameters θ
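The paper's Algorithm 1 is not reproduced in this report. Purely as a hypothetical illustration of what "a perturbation ε to the vector of parameters θ" could look like, the sketch below adds Gaussian noise rescaled to a specified Euclidean size s; the function name, the normalization, and the construction of ε are all assumptions here, not the paper's actual algorithm:

```python
import numpy as np

def perturb_parameters(theta, s, rng):
    """Return theta plus Gaussian noise rescaled to Euclidean norm s.

    Illustrative stand-in only, NOT the paper's Algorithm 1; the actual
    algorithm's construction of the perturbation epsilon may differ.
    """
    z = rng.standard_normal(theta.shape)   # raw Gaussian direction
    epsilon = (s / np.linalg.norm(z)) * z  # rescale to norm exactly s
    return theta + epsilon

rng = np.random.default_rng(0)
theta = np.zeros(1000)                     # toy parameter vector
theta_perturbed = perturb_parameters(theta, s=1 / 200, rng=rng)
```

The rescaling guarantees the perturbation's Euclidean norm is exactly s regardless of the dimensionality of θ.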
Open Source Code | Yes | Permissively licensed open-source codes that can automatically reproduce all results of the present paper are available at https://github.com/facebookresearch/hcrbounds
Open Datasets | Yes | Subsection 3.1 considers MNIST, a classic data set of 28 × 28 pixel grayscale scans of handwritten digits... Subsection 3.2 does similarly for CIFAR-10, a classic data set of 32 × 32 pixel color images of 10 classes... Subsection 3.3 considers ImageNet, a standard data set with 1000 classes...
Dataset Splits | Yes | For training, we use random minibatches of 32 examples each, over 6 epochs of optimization (thus sweeping 6 times through all 60,000 examples from the training set of MNIST)... On the test set of MNIST... For training, we use random minibatches of 32 examples each, over 7 epochs of optimization (thus sweeping 7 times through all 50,000 examples from the training set of CIFAR-10)... On 2,500 examples drawn at random without replacement from the test set of CIFAR-10... All examples of the present subsection consider 128 examples from the validation set of ImageNet-1000, drawing the examples uniformly at random without replacement.
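As a back-of-the-envelope sanity check on these splits (a sketch, not code from the paper): 6 epochs of minibatches of 32 over MNIST's 60,000 training examples, and drawing 2,500 CIFAR-10 test examples uniformly at random without replacement, can be expressed as:

```python
import numpy as np

# MNIST: 60,000 training examples, minibatches of 32, 6 epochs.
batches_per_epoch = 60_000 // 32       # 1875 full minibatches per epoch
total_batches = 6 * batches_per_epoch  # 11250 minibatches over training

# CIFAR-10 evaluation: 2,500 examples drawn uniformly at random without
# replacement from the 10,000-example test set (seed is illustrative).
rng = np.random.default_rng(0)
eval_indices = rng.choice(10_000, size=2_500, replace=False)
```

`replace=False` in `rng.choice` is what implements sampling without replacement, so every selected index is distinct.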
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types) used for running its experiments. It only mentions general models like ResNet-18 and Swin-T without hardware specifications.
Software Dependencies | No | The paper mentions software tools and algorithms like AdamW, LSQR, and TorchVision, but it does not specify version numbers for these software components, which is required for reproducibility.
Experiment Setup | Yes | For training, we use random minibatches of 32 examples each, over 6 epochs of optimization... We minimize the empirical average cross-entropy loss using AdamW of Loshchilov & Hutter (2019), with a learning rate of 0.001. ... All results reported are HCR bounds maximized over 25 independent and identically distributed pseudorandom realizations of z_ε in (12), obtained by running the algorithm (Algorithm 1) of Subsection 2.2 with the n entries of the starting vector z being proportional to the normally distributed noise added to the features. The constant of proportionality is 1/√n times the size s of the perturbation specified in the captions to the subfigures (these sizes are 1/200, 1/500, and 1/1000 for the different subfigures, as indicated in the captions).
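The stated scaling of the starting vector z can be illustrated as follows. This is a minimal sketch assuming the constant of proportionality is s/√n, i.e., each of the n entries of z is standard normal noise scaled by s/√n, so that the expected squared Euclidean norm of z is n · (s/√n)² = s²:

```python
import numpy as np

n = 100_000  # number of entries in the starting vector z (illustrative)
s = 1 / 200  # size of the perturbation, as in one of the subfigures

rng = np.random.default_rng(0)
# Each entry is proportional to standard normal noise, with constant of
# proportionality s / sqrt(n), so E[||z||^2] = n * (s / sqrt(n))^2 = s^2.
z = (s / np.sqrt(n)) * rng.standard_normal(n)

print(np.linalg.norm(z))  # concentrates near s for large n
```

For large n the norm of z concentrates tightly around s (the relative fluctuation is on the order of 1/√(2n)), which is consistent with the perturbation sizes 1/200, 1/500, and 1/1000 quoted above.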