Statistical Guarantees for Approximate Stationary Points of Shallow Neural Networks
Authors: Mahsa Taheri, Fang Xie, Johannes Lederer
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide here some numerical observations to clarify theories of Section 2 and Section 3. We minimize a least-squares objective complemented by ℓ1-regularization for shallow neural networks with linear and ReLU activation functions. We set our tuning parameter on the order of √(log(np)/n) based on our experiments. We consider neural networks with d = w = 10, that are trained over 500 and tested over 300 data samples generated from a standard normal distribution and labeled by a sparse-target network (having the same structure as the considered model) plus Gaussian noise. ... We report the relative training error and the relative test error for a potential global optimum, an approximate stationary point, and a randomly generated network... |
| Researcher Affiliation | Academia | Mahsa Taheri EMAIL Department of Mathematics University of Hamburg; Fang Xie EMAIL Guangdong Provincial Key Laboratory of IRADS Beijing Normal-Hong Kong Baptist University; Johannes Lederer EMAIL Department of Mathematics University of Hamburg |
| Pseudocode | No | The paper describes mathematical derivations and theoretical proofs, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or sections with structured, code-like steps. |
| Open Source Code | No | The paper does not contain an unambiguous statement that the authors are releasing their code for the methodology described, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | We applied our method to the MNIST, fashion-MNIST, and K-MNIST datasets using cross-entropy loss, with a neural network consisting of 10 layer weight matrices and ReLU activations, with network width 50. |
| Dataset Splits | Yes | We consider neural networks with d = w = 10, that are trained over 500 and tested over 300 data sample generated from a standard normal distribution and labeled by a sparse-target network... |
| Hardware Specification | Yes | All the simulations were executed on a local computer (Apple M2, 16GB memory), with an average run time of less than 10 minutes per individual run in Python. |
| Software Dependencies | No | We use PyTorch's default initialization, where weights are drawn from a uniform distribution in [−1/√p, 1/√p]... We use stochastic gradient descent with a small convergence threshold... All the simulations were executed... in Python. For optimization, we employed SGD with the learning rate 0.02. ... Specifically, we replaced SGD with Adam, using a learning rate of 0.005... |
| Experiment Setup | Yes | We set our tuning parameter on the order of √(log(np)/n) based on our experiments. ... We use stochastic gradient descent with a small convergence threshold to ensure that the optimization process does not stop early. ... We use PyTorch's default initialization, where weights are drawn from a uniform distribution in [−1/√p, 1/√p]... For optimization, we employed SGD with the learning rate 0.02. ... Specifically, we replaced SGD with Adam, using a learning rate of 0.005... |
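The setup quoted in the table can be sketched roughly as follows. This is a minimal NumPy reconstruction, not the authors' PyTorch code: the noise level, number of iterations, sparsity pattern of the target network, and the exact form of the tuning parameter λ are assumptions for illustration; only d = w = 10, the 500/300 train/test split, the ℓ1-regularized least-squares objective, the ±1/√p uniform initialization, and the learning rate 0.02 come from the report.

```python
import numpy as np

rng = np.random.default_rng(0)
d, w = 10, 10                    # input dimension and width, as in the report
n_train, n_test = 500, 300       # train/test sizes, as in the report

# Sparse-target network with the same shallow ReLU structure (sparsity
# pattern is an assumption for illustration).
W_star = np.zeros((w, d)); W_star[:3, :3] = rng.normal(size=(3, 3))
v_star = np.zeros(w); v_star[:3] = rng.normal(size=3)

def forward(X, W, v):
    """Shallow ReLU network: v^T ReLU(W x)."""
    return np.maximum(X @ W.T, 0.0) @ v

# Standard-normal inputs, labels from the sparse target plus Gaussian noise
# (noise scale 0.1 is an assumption).
X_tr = rng.normal(size=(n_train, d))
X_te = rng.normal(size=(n_test, d))
y_tr = forward(X_tr, W_star, v_star) + 0.1 * rng.normal(size=n_train)
y_te = forward(X_te, W_star, v_star) + 0.1 * rng.normal(size=n_test)

p = d * w + w                                  # total parameter count (assumption)
lam = np.sqrt(np.log(n_train * p) / n_train)   # tuning parameter ~ sqrt(log(np)/n)

# PyTorch-style default init: uniform in [-1/sqrt(fan_in), 1/sqrt(fan_in)].
W = rng.uniform(-1 / np.sqrt(d), 1 / np.sqrt(d), size=(w, d))
v = rng.uniform(-1 / np.sqrt(w), 1 / np.sqrt(w), size=w)

# Full-batch (sub)gradient descent on least squares + l1 penalty, lr 0.02.
lr = 0.02
for _ in range(2000):
    H = np.maximum(X_tr @ W.T, 0.0)            # hidden activations, n x w
    r = H @ v - y_tr                           # residuals
    grad_v = H.T @ r / n_train + lam * np.sign(v)
    grad_W = ((np.outer(r, v) * (H > 0)).T @ X_tr) / n_train + lam * np.sign(W)
    v -= lr * grad_v
    W -= lr * grad_W

# Relative errors of the resulting approximate stationary point.
rel_train = np.linalg.norm(forward(X_tr, W, v) - y_tr) / np.linalg.norm(y_tr)
rel_test = np.linalg.norm(forward(X_te, W, v) - y_te) / np.linalg.norm(y_te)
```

Full-batch subgradient steps stand in for the paper's SGD-with-convergence-threshold loop; the reported comparison against a random network amounts to evaluating `rel_test` at the initialization before the loop runs.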