Weight-balancing fixes and flows for deep learning
Authors: Lawrence K. Saul
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Fig. 2 plots the convergence of the multiplicative updates in Algorithm 1 for different values of p and q and for three randomly initialized networks with differing numbers of hidden layers but the same overall numbers of input (200), hidden (3750), and output (10) units. From shallowest to deepest, the networks had 200-2500-1250-10 units, 200-2000-1000-500-250-10 units, and 200-1000-750-750-500-500-250-10 units. The networks were initialized with zero-valued biases and zero-mean Gaussian random weights whose variances were inversely proportional to the fan-in at each unit (He et al., 2015). The panels in the figure plot the ratio ‖W‖p,q / ‖W0‖p,q as a function of the number of multiplicative updates, where ‖W0‖p,q and ‖W‖p,q are respectively the ℓp,q-norms, defined in eq. (1), of the initial and updated weight matrices. Results are shown for several values of p and q. |
| Researcher Affiliation | Industry | Lawrence K. Saul EMAIL Flatiron Institute, Center for Computational Mathematics 162 Fifth Avenue, New York, NY 10010 |
| Pseudocode | Yes | Algorithm 1: Given a network with weights W0 and biases b0, this procedure returns a functionally equivalent network whose rescaled weights W and biases b minimize the norm ‖W‖p,q in eq. (1) up to some tolerance δ > 0. The set H contains the indices of the network's hidden units. |
| Open Source Code | No | The paper does not provide a link to source code for the described methodology. It mentions related work and future directions but makes no explicit statement about releasing code for this paper's contributions. |
| Open Datasets | No | Fig. 2 plots the convergence of the multiplicative updates in Algorithm 1 for different values of p and q and for three randomly initialized networks with differing numbers of hidden layers but the same overall numbers of input (200), hidden (3750), and output (10) units. ... The networks were initialized with zero-valued biases and zero-mean Gaussian random weights... The experiments are performed on these randomly initialized synthetic networks, not a publicly available dataset. |
| Dataset Splits | No | The paper uses randomly initialized synthetic networks for its demonstration, rather than a specific dataset. Therefore, there are no dataset splits to specify. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | The networks were initialized with zero-valued biases and zero-mean Gaussian random weights whose variances were inversely proportional to the fan-in at each unit (He et al., 2015). |
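The core idea summarized in the table (Algorithm 1 rescales weights without changing the network function) rests on the positive homogeneity of ReLU-style units: scaling a hidden unit's incoming weights and bias by c > 0 and its outgoing weights by 1/c leaves the network function unchanged. The sketch below is not the paper's Algorithm 1 (which iterates multiplicative updates for general ℓp,q-norms); it is a minimal one-pass NumPy illustration for the squared Frobenius norm (p = q = 2), where choosing c to equate each hidden unit's incoming and outgoing ℓ2-norms reduces the total norm by the AM-GM inequality. All variable names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny one-hidden-layer ReLU network: x -> relu(W1 x + b1) -> W2 h + b2.
W1 = rng.standard_normal((5, 3))
b1 = np.zeros(5)
W2 = rng.standard_normal((2, 5))
b2 = np.zeros(2)

def forward(x, W1, b1, W2, b2):
    h = np.maximum(0.0, W1 @ x + b1)
    return W2 @ h + b2

x = rng.standard_normal(3)
y0 = forward(x, W1, b1, W2, b2)

# Rescale each hidden unit i: multiply its incoming row (and bias) by c
# and divide its outgoing column by c.  ReLU is positively homogeneous,
# so the network computes the same function.  Setting c so that the
# incoming and outgoing l2-norms match minimizes a^2 c^2 + g^2 / c^2.
W1b, b1b, W2b = W1.copy(), b1.copy(), W2.copy()
for i in range(W1b.shape[0]):
    a = np.linalg.norm(W1b[i, :])   # incoming l2-norm of unit i
    g = np.linalg.norm(W2b[:, i])   # outgoing l2-norm of unit i
    c = np.sqrt(g / a)
    W1b[i, :] *= c
    b1b[i] *= c
    W2b[:, i] /= c

y1 = forward(x, W1b, b1b, W2b, b2)
assert np.allclose(y0, y1)          # function preserved exactly

norm0 = np.sum(W1**2) + np.sum(W2**2)
norm1 = np.sum(W1b**2) + np.sum(W2b**2)
assert norm1 <= norm0 + 1e-12       # squared Frobenius norm not increased
```

With deeper networks, such per-unit rescalings interact across layers, which is why the paper's Algorithm 1 iterates its multiplicative updates to a tolerance δ rather than balancing each unit once.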