Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Primer for Neural Arithmetic Logic Modules

Authors: Bhumika Mistry, Katayoun Farrahi, Jonathon Hare

JMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variables
Research Type: Experimental
LLM Response: "To alleviate the existing inconsistencies, we create a benchmark which compares all existing arithmetic NALMs. ... We compare existing findings across modules. ... Therefore, we provide results on a Single Module Arithmetic Task, training modules on their respective operations over a range of different interpolation distributions and testing over a range of extrapolation distributions. ... We present the NALMs' performances on the four main arithmetic operations. Each figure consists of plots for each evaluation metric (success rate, speed of convergence and sparsity error) discussed in the evaluation paragraph above, with confidence intervals calculated over 25 seeds."
Researcher Affiliation: Academia
LLM Response: "Bhumika Mistry EMAIL Katayoun Farrahi EMAIL Jonathon Hare EMAIL Department of Vision, Learning, and Control, Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, United Kingdom"
Pseudocode: No
LLM Response: The paper provides mathematical definitions for the modules (e.g., Equations 1-5 for the NALU, Equations 6-12 for the iNALU) and architectural illustrations (Figures 2-9). Appendix C also provides a 'Step-by-step Example using the NALU', which details calculations in a narrative format, but there are no structured pseudocode or algorithm blocks explicitly labeled as such.
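Although the paper defines the NALU only through equations, those definitions (originally from Trask et al., 2018) translate directly into code. The NumPy sketch below is an illustrative single NALU cell, not the authors' implementation; the parameter names and the epsilon value are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nalu_forward(x, W_hat, M_hat, G, eps=1e-7):
    """One forward pass of a NALU cell (per Trask et al., 2018).

    x: (batch, in) inputs; W_hat, M_hat, G: (out, in) parameters.
    """
    # Constrained weight matrix, biased towards values in {-1, 0, 1}.
    W = np.tanh(W_hat) * sigmoid(M_hat)
    a = x @ W.T                                 # additive path: a = Wx
    m = np.exp(np.log(np.abs(x) + eps) @ W.T)   # multiplicative path in log space
    g = sigmoid(x @ G.T)                        # learned gate between the two paths
    return g * a + (1.0 - g) * m
```

With saturated parameters (large positive `W_hat` and `M_hat`, so `W` is close to 1) the gate selects between an almost-exact sum and an almost-exact product of the inputs, which is the behaviour the module is designed to learn.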
Open Source Code: Yes
LLM Response: "Code is available at: https://github.com/bmistry4/nalm-benchmark"
Open Datasets: Yes
LLM Response: "MNIST is also used to evaluate NALU's abilities on being part of end-to-end applications. ... Madsen and Johansen (2020) also use MNIST for testing the module's abilities to act as a recurrent module for adding/multiplying the digits. ... Interpolation (train/validation) and extrapolation (test) ranges are presented in Table 3. Data (as floats) is drawn from a Uniform distribution with the range values as the lower and upper bounds."
Dataset Splits: Yes
LLM Response: "Interpolation (training/validation) and extrapolation (test) ranges are presented in Table 3. ... Table 3: Interpolation (train/validation) and extrapolation (test) ranges used for the Single Module Arithmetic Task. Data (as floats) is drawn from a Uniform distribution with the range values as the lower and upper bounds."
Hardware Specification: No
LLM Response: "The authors acknowledge the use of the IRIDIS High Performance Computing Facility, the ECS Alpha Cluster, and associated support services at the University of Southampton in the completion of this work." This refers to computing facilities but lacks specific details such as CPU/GPU models or memory amounts.
Software Dependencies: No
LLM Response: Table 2 lists the programming frameworks used by different experiment setups: PyTorch (Python), Flux (Julia), and TensorFlow (Python). However, specific version numbers for these frameworks, or any other software dependencies, are not provided.
Experiment Setup: Yes
LLM Response: "Setup. A single module is used. The input size is two and output size is one, hence there is no input redundancy. Hence, the objective is to model: y = x1 ∘ x2, where ∘ ∈ {+, −, ×, ÷}. We test the: NALU, iNALU, G-NALU, NAC+, NAC•, NAU, NMU, NPU, and Real NPU. Each run trains for 50,000 iterations to allow for enough iterations until convergence. A MSE loss is used with an Adam optimiser. Interpolation (training/validation) and extrapolation (test) ranges are presented in Table 3. Early stopping is applied using a validation dataset sampled from the interpolation range. Experiment/hyper-parameters set can be found in Appendix D." Appendix D (Tables 5, 6, 7, 8) contains detailed parameters such as 'Total iterations 50000', 'Learning rate 1.00E-03', 'Optimiser Adam (with default parameters)', and specific parameters for the NPU, Real NPU, NAU, NMU, and iNALU.
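The data-generation step quoted above (two uniformly sampled inputs, a target computed by the chosen operation, with separate interpolation and extrapolation ranges) can be sketched as follows. The helper name, signature, and batch size are hypothetical; the actual range endpoints come from Table 3 of the paper:

```python
import numpy as np

def make_batch(op, low, high, batch_size=128, rng=None):
    """Sample one batch for the Single Module Arithmetic Task.

    x1, x2 ~ U(low, high); (low, high) would be an interpolation range
    for training/validation or a disjoint extrapolation range for testing.
    op is one of '+', '-', '*', '/'.
    """
    if rng is None:
        rng = np.random.default_rng()
    x = rng.uniform(low, high, size=(batch_size, 2))
    ops = {'+': np.add, '-': np.subtract, '*': np.multiply, '/': np.divide}
    y = ops[op](x[:, 0], x[:, 1])   # target: y = x1 op x2
    return x, y
```

Extrapolation performance is then measured by generating test batches from a range disjoint from the training one, e.g. training on U(1, 2) and testing on U(2, 6).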