Monotone Learning with Rectified Wire Networks
Authors: Veit Elser, Dan Schmidt, Jonathan Yedidia
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate a training algorithm using this update, called sequential deactivation (SDA), on MNIST and some synthetic datasets. Upon adopting a natural choice for the nodal weights, SDA has no hyperparameters other than those describing the network structure. Our experiments explore behavior with respect to network size and depth in a family of sparse expander networks. |
| Researcher Affiliation | Collaboration | Veit Elser (EMAIL), Department of Physics, Cornell University, Ithaca, NY 14853-2501, USA; Dan Schmidt (EMAIL) and Jonathan Yedidia (EMAIL), Analog Devices, Inc., Boston, MA, USA |
| Pseudocode | Yes | Algorithm 1 Elementary network procedures |
| Open Source Code | Yes | This construction is implemented by the publicly available[1] C program expander and was used for all the experiments reported in the next section. All our experiments were carried out with a publicly available[1] C implementation of the SDA algorithm called rainman. (Footnote 1: github.com/veitelser/rectified-wires) |
| Open Datasets | Yes | Conservative learning on rectified wire networks with the SDA algorithm is demonstrated for MNIST and synthetic datasets. |
| Dataset Splits | Yes | Seen as images, the MNIST handwritten digits (LeCun et al., 1998) are analog data. We compute the cumulative probability function from the training data and use the same function when processing the test data (with test samples below the minimum or above the maximum training samples mapped to 0 and 1 respectively). |
| Hardware Specification | Yes | On a single Intel Xeon 2.00GHz core rainman runs at a rate of 50ns per iteration per network edge. |
| Software Dependencies | No | The paper mentions 'C program expander', 'C implementation of the SDA algorithm called rainman', and 'C++ SGD optimizer' but does not specify version numbers for these software components. |
| Experiment Setup | Yes | The two-parameter sparse expander networks offer a convenient way to study behavior both with respect to network size and depth. The mini-batch size for SGD was fixed at 100 and we employed standard stochastic gradient descent without momentum. With the learning rate set at 0.002, training accuracy reaches a maximum of 91.5% in 6 minutes; this is also the test accuracy for this mode of operation. [...] switching after 10 epochs improved training and test accuracies to 94.4% and 94.1%, respectively. This trend in improvement continues and reaches 96.8% (training) and 95.6% (test) when the switch is made after 20 epochs (121 minutes). |
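The preprocessing described in the Dataset Splits row (mapping analog values through the training set's cumulative probability function, with out-of-range test values clamped to 0 and 1) can be sketched as follows. This is a minimal illustration, not the authors' code; the helper name `fit_empirical_cdf` is hypothetical.

```python
import numpy as np

def fit_empirical_cdf(train_values):
    """Fit an empirical CDF on training data; returns a mapping to [0, 1].

    Test values below the training minimum map to 0 and values above
    the training maximum map to 1, matching the paper's description.
    """
    sorted_vals = np.sort(np.asarray(train_values, dtype=float))
    n = sorted_vals.size

    def cdf(x):
        # Count of training samples <= x, normalized to [0, 1].
        ranks = np.searchsorted(sorted_vals, np.asarray(x, dtype=float),
                                side="right")
        return ranks / n

    return cdf
```

The same fitted `cdf` would then be applied unchanged to both training and test samples, so no information from the test set leaks into the preprocessing.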