Labels, Information, and Computation: Efficient Learning Using Sufficient Labels

Authors: Shiyu Duan, Spencer Chang, Jose C. Principe

JMLR 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Extensive experimental results are provided to support our theory. ... In addition to our theoretical contributions, we present the following experimental results. ... Section 6 contains our experimental results." |
| Researcher Affiliation | Academia | "Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611, USA" |
| Pseudocode | Yes | "Algorithm 1: An example algorithm for converting part of a fully-labeled set of patient records into a sufficiently-labeled one" |
| Open Source Code | No | The paper does not contain any explicit statement about providing source code or a link to a repository for the methodology described in this paper. |
| Open Datasets | Yes | "On MNIST, Fashion-MNIST (Xiao et al., 2017), SVHN (Netzer et al., 2011), CIFAR10, and CIFAR-100 (Krizhevsky and Hinton, 2009)" |
| Dataset Splits | Yes | "For a training set of size 50k/100k/200k/60k/120k/240k/73,257/146,514/293,028/400k/800k, we randomly selected 5k/10k/20k/5k/10k/20k/5k/10k/20k/40k/80k training examples to form the validation set. ... For all datasets, we report performance on the standard test sets." |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions models like LeNet-5 and ResNet-18, and optimizers like stochastic gradient descent, but does not provide specific software versions for libraries or frameworks used. |
| Experiment Setup | Yes | "The optimizer used was stochastic gradient descent with batch size 128. Each hidden module (or full model, in the cases where only full labels were used) was trained with step size 0.1, 0.01, and 0.001, for 200, 100, and 50 epochs, respectively. For CIFAR-100 in online mode, the hidden module was trained for 1000, 800, and 400 epochs with the said step sizes. If needed, each output layer was trained with step size 0.1 for 50 epochs." |
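The dataset-splits row describes holding out a random subset of the training examples as a validation set (for example, 5k of SVHN's 73,257 training examples). A minimal sketch of such a split is below; the function name, seed, and index-based representation are assumptions for illustration, not the authors' code.

```python
import random

def split_train_val(n_train: int, n_val: int, seed: int = 0):
    """Randomly partition example indices into disjoint train/validation subsets.

    Hypothetical helper mirroring the split described in the report:
    n_val indices are held out for validation, the rest remain for training.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible split (assumed)
    indices = list(range(n_train))
    rng.shuffle(indices)
    return indices[n_val:], indices[:n_val]

# e.g. SVHN: 73,257 training examples, 5,000 held out for validation
train_idx, val_idx = split_train_val(73_257, 5_000)
```

Performance would then be reported on the standard test sets, which are never touched by this split.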
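The experiment-setup row specifies a three-phase step-size schedule: SGD at step sizes 0.1, 0.01, and 0.001 for 200, 100, and 50 epochs respectively (1000/800/400 epochs for CIFAR-100 in online mode). The framework-agnostic sketch below encodes only that schedule; the generator interface is an assumption, not the authors' implementation.

```python
def lr_schedule(online_cifar100: bool = False):
    """Yield the SGD step size for each training epoch, per the reported schedule.

    Illustrative encoding of the schedule from the report: three phases with
    step sizes 0.1 / 0.01 / 0.001, run for 200/100/50 epochs (or 1000/800/400
    for CIFAR-100 in online mode).
    """
    step_sizes = (0.1, 0.01, 0.001)
    epochs = (1000, 800, 400) if online_cifar100 else (200, 100, 50)
    for lr, n_epochs in zip(step_sizes, epochs):
        for _ in range(n_epochs):
            yield lr

schedule = list(lr_schedule())  # 200 + 100 + 50 = 350 epochs in the standard setting
```

A training loop would consume this generator one epoch at a time, setting the optimizer's step size before each pass; the separately trained output layers use a single phase (step size 0.1 for 50 epochs).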