Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Nonparametric Learning of Two-Layer ReLU Residual Units
Authors: Zhunxuan Wang, Linyun He, Chunchuan Lyu, Shay B. Cohen
TMLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further prove the strong statistical consistency of our algorithm, and demonstrate its robustness and sample efficiency through experimental results on synthetic data and a set of benchmark regression datasets. In our experiments, we describe a first set of synthetic dataset experiments that show that our algorithm identifies the true parameters from which the data were generated (Section 7.1), and then a second set of experiments that use our algorithm on standard real-world benchmark datasets (Section 7.2). |
| Researcher Affiliation | Collaboration | Zhunxuan Wang EMAIL Amazon London EC2A 2FA, United Kingdom Linyun He EMAIL Georgia Institute of Technology Atlanta, GA 30332, United States Chunchuan Lyu EMAIL Instituto de Telecomunicações Torre Norte 1049-001 Lisbon, Portugal Shay B. Cohen EMAIL University of Edinburgh Edinburgh EH8 9AB, United Kingdom |
| Pseudocode | Yes | Algorithm 1 Learn a ReLU residual unit, layer 2. Algorithm 2 Learn a ReLU residual unit, layer 1. Algorithm 3 Learn a ReLU residual unit by LR. Algorithm 4 Rescale a ĜNPE₂ minimizer. Algorithm 5 Learn a ReLU residual unit, layer 2. |
| Open Source Code | Yes | Our code is available at https://github.com/uuzeeex/relu-resunit-learning. |
| Open Datasets | Yes | The first six datasets are standard benchmark datasets taken from Delve and the UCI Machine Learning Repository. The last dataset, Jigsaw, is bigger, with the goal of using FastText word embeddings of tweets to predict their level of toxicity. |
| Dataset Splits | Yes | In all of our experiments, we report five-fold cross-validation results, where the first fold is used to tune the hyperparameters, and the last fold is used as both a validation set for early stopping (first half) and to report the results (second half). |
| Hardware Specification | Yes | CPU specification: 2.8 GHz Quad-Core Intel Core i7. |
| Software Dependencies | Yes | We use CVX (Grant & Boyd, 2014; 2008) that calls SDPT3 (Toh et al., 1999) (a free solver under GPLv3 license) and solves our convex QP/LP in polynomial time. For this set of experiments, with all datasets, we use the MOSEK solver with an academic license (ApS, 2019). |
| Experiment Setup | Yes | SGD is conducted on mini-batch empirical losses of L(A, B) with batch size 32 for 256 epochs in each learning trial. We apply time-based learning rate decay η = η₀/(1 + γT) with initial rate η₀ = 10⁻³ and decay rate γ = 10⁻⁵, where T is the epoch number. With backpropagation, we use SGD with a learning rate of 0.000001 and batch size of 500. We run backpropagation until the mean squared error does not change between epochs within a fraction of 1/10000. |
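The time-based decay schedule quoted in the experiment setup row can be sketched directly. This is a minimal illustration, not the authors' code; the function name `decayed_lr` is hypothetical, and the default values follow the quoted η₀ = 10⁻³ and γ = 10⁻⁵.

```python
def decayed_lr(epoch: int, eta0: float = 1e-3, gamma: float = 1e-5) -> float:
    """Time-based learning rate decay: eta = eta0 / (1 + gamma * T).

    `epoch` is the epoch number T; eta0 and gamma default to the values
    quoted in the paper's experiment setup (1e-3 and 1e-5).
    """
    return eta0 / (1.0 + gamma * epoch)

# With gamma = 1e-5, the rate decays very slowly over the 256 epochs
# of a learning trial: decayed_lr(0) == 1e-3, and decayed_lr(256) is
# only slightly smaller.
```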
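The dataset-splits row describes an unusual five-fold protocol: the first fold tunes hyperparameters, and the last fold is halved into an early-stopping validation set and a test set. A possible reading of that protocol, with all index logic being my assumption (the function `split_indices` is hypothetical, not from the paper's code):

```python
import numpy as np

def split_indices(n: int, seed: int = 0):
    """Split n examples into the five-fold scheme quoted in the report:
    fold 0 -> hyperparameter tuning; folds 1..3 -> training;
    fold 4 -> first half early-stopping validation, second half test.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), 5)
    tune = folds[0]
    train = np.concatenate(folds[1:-1])
    half = len(folds[-1]) // 2
    early_stop, test = folds[-1][:half], folds[-1][half:]
    return train, tune, early_stop, test
```

Under this reading, for n = 100 the split is 60 training, 20 tuning, 10 early-stopping, and 10 test examples, with the four sets disjoint.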