Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
A Primer for Neural Arithmetic Logic Modules
Authors: Bhumika Mistry, Katayoun Farrahi, Jonathon Hare
JMLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To alleviate the existing inconsistencies, we create a benchmark which compares all existing arithmetic NALMs. ... We compare existing findings across modules. ... Therefore, we provide results on a Single Module Arithmetic Task, training modules on their respective operations over a range of different interpolation distributions and testing over a range of extrapolation distributions. ... We present the NALMs' performances on the four main arithmetic operations. Each figure consists of plots for each evaluation metric (success rate, speed of convergence and sparsity error) discussed in the evaluation paragraph above, with confidence intervals calculated over 25 seeds. |
| Researcher Affiliation | Academia | Bhumika Mistry EMAIL Katayoun Farrahi EMAIL Jonathon Hare EMAIL Department of Vision, Learning, and Control, Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, United Kingdom |
| Pseudocode | No | The paper provides mathematical definitions for the modules (e.g., Equations 1-5 for NALU, Equations 6-12 for iNALU) and architectural illustrations (Figures 2-9). Appendix C also provides a 'Step-by-step Example using the NALU', which details calculations in a narrative format, but there are no structured pseudocode or algorithm blocks explicitly labeled as such. |
| Open Source Code | Yes | Code is available at: https://github.com/bmistry4/nalm-benchmark |
| Open Datasets | Yes | MNIST is also used to evaluate NALU's abilities on being part of end-to-end applications. ... Madsen and Johansen (2020) also use MNIST for testing the module's abilities to act as a recurrent module for adding/multiplying the digits. ... Interpolation (train/validation) and extrapolation (test) ranges are presented in Table 3. Data (as floats) is drawn from a Uniform distribution with the range values as the lower and upper bounds. |
| Dataset Splits | Yes | Interpolation (training/validation) and extrapolation (test) ranges are presented in Table 3. ... Table 3: Interpolation (train/validation) and extrapolation (test) ranges used for the Single Module Arithmetic Task. Data (as floats) is drawn from a Uniform distribution with the range values as the lower and upper bounds. |
| Hardware Specification | No | The authors acknowledge the use of the IRIDIS High Performance Computing Facility, the ECS Alpha Cluster, and associated support services at the University of Southampton in the completion of this work. This refers to computing facilities but lacks specific details like CPU/GPU models or memory amounts. |
| Software Dependencies | No | Table 2 lists 'Programming framework Pytorch (Python) Flux (Julia) Tensorflow (Python)' for different experiment setups. However, specific version numbers for these frameworks or any other software dependencies are not provided. |
| Experiment Setup | Yes | Setup. A single module is used. The input size is two and output size is one, hence there is no input redundancy. Hence, the objective is to model: y = x1 ∘ x2 where ∘ ∈ {+, −, ×, ÷}. We test the: NALU, iNALU, G-NALU, NAC+, NAC•, NAU, NMU, NPU, and Real NPU. Each run trains for 50,000 iterations to allow for enough iterations until convergence. An MSE loss is used with an Adam optimiser. Interpolation (training/validation) and extrapolation (test) ranges are presented in Table 3. Early stopping is applied using a validation dataset sampled from the interpolation range. Experiment/hyper-parameters set can be found in Appendix D. ... Appendix D (Table 5, 6, 7, 8) contains detailed parameters like 'Total iterations 50000', 'Learning rate 1.00E-03', 'Optimiser Adam (with default parameters)', and specific parameters for NPU, Real NPU, NAU, NMU, and iNALU. |
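The Pseudocode row notes that the paper defines the modules via equations rather than algorithm blocks (e.g., Equations 1-5 for NALU). As context for readers, a minimal scalar-output sketch of the standard NALU forward pass (following Trask et al., 2018, not this paper's exact notation) looks like the following; the parameter names `w_hat`, `m_hat`, and `g` are illustrative:

```python
import math

EPS = 1e-7  # stabiliser inside log, as in the NALU definition


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def nalu_forward(x, w_hat, m_hat, g):
    """Scalar-output NALU on an input vector x (illustrative sketch)."""
    # Weight construction: tanh * sigmoid biases entries toward {-1, 0, 1}.
    w = [math.tanh(wh) * sigmoid(mh) for wh, mh in zip(w_hat, m_hat)]
    # Additive path (NAC+): a plain linear combination.
    a = sum(wi * xi for wi, xi in zip(w, x))
    # Multiplicative path: exp-sum-log turns products into sums.
    m = math.exp(sum(wi * math.log(abs(xi) + EPS) for wi, xi in zip(w, x)))
    # Learned gate interpolates between the two paths.
    gate = sigmoid(sum(gi * xi for gi, xi in zip(g, x)))
    return gate * a + (1.0 - gate) * m
```

With saturated weights the module recovers exact arithmetic: pushing `w` toward 1 and the gate toward 1 yields addition, while pushing the gate toward 0 yields multiplication, which is the behaviour the benchmark's success-rate metric tests for.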
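The data-generation protocol quoted above (floats drawn from a Uniform distribution, with separate interpolation and extrapolation ranges) can be sketched as follows. The ranges 1.0-2.0 and 2.0-6.0 are hypothetical placeholders, not the actual values from the paper's Table 3:

```python
import random


def make_batch(n, lo, hi, op):
    """Sample n two-input examples from Uniform[lo, hi) for one operation."""
    ops = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b,
           "mul": lambda a, b: a * b, "div": lambda a, b: a / b}
    f = ops[op]
    xs, ys = [], []
    for _ in range(n):
        x1, x2 = random.uniform(lo, hi), random.uniform(lo, hi)
        xs.append((x1, x2))
        ys.append(f(x1, x2))  # target is the exact arithmetic result
    return xs, ys


# Hypothetical ranges; the paper's Table 3 lists the ones actually used.
train_x, train_y = make_batch(128, 1.0, 2.0, "mul")  # interpolation (train)
test_x, test_y = make_batch(128, 2.0, 6.0, "mul")    # extrapolation (test)
```

The validation set used for early stopping would be drawn from the interpolation range in the same way, matching the setup quoted in the table.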