Knowledge Matters: Importance of Prior Information for Optimization

Authors: Çağlar Gülçehre, Yoshua Bengio

JMLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We explored the effect of introducing prior knowledge into the intermediate level of deep supervised neural networks on two tasks. On a task we designed, all black-box state-of-the-art machine learning algorithms which we tested failed to generalize well. We motivate our work from the hypothesis that there is a training barrier involved in the nature of such tasks, and that humans learn useful intermediate concepts from other individuals by using a form of supervision or guidance using a curriculum. Our results provide positive evidence in favor of this hypothesis. In our experiments, we trained a two-tiered MLP architecture on a dataset for which each input image contains three sprites, and the binary target class is 1 if all three shapes belong to the same category and 0 otherwise.
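The labeling rule quoted above (target 1 exactly when all three sprites share a category) can be sketched in a few lines. The function name and category strings are illustrative, not taken from the paper's code.

```python
def pentomino_label(shape_categories):
    """Return 1 if all three sprite categories in an image match, else 0.

    Illustrative sketch of the task's target rule; not the paper's code.
    """
    assert len(shape_categories) == 3
    return 1 if len(set(shape_categories)) == 1 else 0

print(pentomino_label(["L", "L", "L"]))  # -> 1
print(pentomino_label(["L", "T", "L"]))  # -> 0
```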
Researcher Affiliation | Academia | Çağlar Gülçehre EMAIL Yoshua Bengio EMAIL Département d'informatique et de recherche opérationnelle, Université de Montréal, Montréal, QC, Canada
Pseudocode | No | The paper describes the model architectures (SMLP, P1NN, P2NN) using mathematical equations and textual descriptions, but it does not include explicitly labeled pseudocode blocks or algorithms.
Open Source Code | Yes | The source code of some experiments presented in that paper is available at https://github.com/caglar/kmatters. (...) The source code of the structured MLP is available at the GitHub repository: https://github.com/caglar/structured_mlp. (...) The codes to reproduce these experiments are available at https://github.com/caglar/Pentomino Exps.
Open Datasets | Yes | In order to test our hypothesis, we designed an artificial dataset for object recognition using 64×64 binary images. The source code for the script that generates the artificial Pentomino datasets (Arcade-Universe) is available at: https://github.com/caglar/Arcade-Universe.
Dataset Splits | Yes | Initially the models are cross-validated by using 5-fold cross-validation. With 40,000 examples, this gives 32,000 examples for training and 8,000 examples for testing. (...) For the experimental results shown in Table 4, we used 3 training set sizes of 20k, 40k and 80k examples. We generated each dataset with different random seeds (so they do not overlap).
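The arithmetic in the split above (5 folds over 40,000 examples gives 32,000 train / 8,000 test per fold) can be reproduced with a minimal pure-Python fold generator. The paper does not specify its exact splitting code; this is a sketch under that assumption.

```python
def five_fold_indices(n_examples, n_folds=5):
    """Partition indices into n_folds (train, test) splits, contiguous folds.

    Illustrative only; the paper's actual cross-validation code is not given.
    """
    fold_size = n_examples // n_folds
    folds = []
    for k in range(n_folds):
        lo, hi = k * fold_size, (k + 1) * fold_size
        test = list(range(lo, hi))
        train = [i for i in range(n_examples) if not (lo <= i < hi)]
        folds.append((train, test))
    return folds

folds = five_fold_indices(40_000)
print(len(folds), len(folds[0][0]), len(folds[0][1]))  # 5 32000 8000
```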
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU models) used for running its experiments. It mentions using 'a Theano... implementation of Convolutional Neural Networks', which implies computation on CPU/GPU, but no specific models are named.
Software Dependencies | No | The paper mentions several software packages like 'scikit-learn', 'libsvm', 'Theano', and 'pylearn2' but does not provide specific version numbers for any of them.
Experiment Setup | Yes | The P1NN has a highly overcomplete architecture with 1024 hidden units per patch, and L1 and L2 weight decay regularization coefficients on the weights (not the biases) are respectively 1e-6 and 1e-5. The learning rate for the P1NN is 0.75. (...) The P2NN has 2048 hidden units. L1 and L2 penalty coefficients for the P2NN are 1e-6, and the learning rate is 0.1. (...) With extensive hyperparameter optimization and using standardization in the intermediate level of the SMLP with softmax nonlinearity, SMLP-nohints was able to get 5.3% training and 6.7% test error on the 80k Pentomino training dataset. (...) We used 2050 hidden units in the P1NN, 11 softmax outputs per patch, and 1024 hidden units in the P2NN. The network was trained with a learning rate of 0.1 without using any adaptive learning rate. The SMLP uses a rectifier nonlinearity for the hidden layers of both P1NN and P2NN. We also applied a small amount of L1 and L2 regularization on the weights of the network.
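The first set of hyperparameters quoted above can be collected in one place, together with the combined L1/L2 weight-decay term they parameterize (applied to weights, not biases). The dictionary layout and function name are illustrative assumptions, not the paper's configuration format.

```python
import numpy as np

# Hyperparameters as quoted from the paper (first reported configuration);
# the dict structure itself is a hypothetical sketch.
P1NN = {"hidden_units": 1024, "l1": 1e-6, "l2": 1e-5, "learning_rate": 0.75}
P2NN = {"hidden_units": 2048, "l1": 1e-6, "l2": 1e-6, "learning_rate": 0.1}

def weight_penalty(W, l1, l2):
    """Combined L1/L2 weight-decay term on a weight matrix (biases excluded)."""
    return l1 * np.abs(W).sum() + l2 * (W ** 2).sum()

W = np.ones((4, 4))  # toy weight matrix: 16 entries of 1.0
print(weight_penalty(W, P1NN["l1"], P1NN["l2"]))  # 16*1e-6 + 16*1e-5 = 1.76e-4
```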