Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks

Authors: Roey Magen, Ohad Shamir

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | We provide several new results on the sample complexity of vector-valued linear predictors (parameterized by a matrix), and more generally neural networks. Focusing on size-independent bounds, where only the Frobenius norm distance of the parameters from some fixed reference matrix W0 is controlled, we show that the sample complexity behavior can be surprisingly different than what we may expect considering the well-studied setting of scalar-valued linear predictors. This also leads to new sample complexity bounds for feed-forward neural networks, tackling some open questions in the literature, and establishing a new convex linear prediction problem that is provably learnable without uniform convergence. |
| Researcher Affiliation | Academia | Roey Magen, Weizmann Institute of Science; Ohad Shamir, Weizmann Institute of Science |
| Pseudocode | Yes | Algorithm 1 Stochastic Gradient Descent (SGD), with projection step and initialization at W0 |
| Open Source Code | No | The paper does not provide any specific links to open-source code or explicit statements about code availability for the described methodology. |
| Open Datasets | No | The paper discusses theoretical input domains such as 'inputs from {x ∈ ℝ^d : ‖x‖ ≤ 1}' or 'inputs from {x ∈ ℝ^d : ‖x‖ ≤ b_x}', but does not refer to any specific, named, publicly available datasets used for training. |
| Dataset Splits | No | The paper is theoretical and focuses on mathematical bounds and proofs. It does not describe any specific dataset splits (training, validation, test) or mention the process of validation for empirical experiments. |
| Hardware Specification | No | The paper is theoretical and does not describe any hardware used to run experiments. |
| Software Dependencies | No | The paper is theoretical and does not describe any specific software dependencies or version numbers used for implementation or experiments. |
| Experiment Setup | No | The paper is theoretical and does not describe specific experimental setup details such as hyperparameters, training configurations, or system-level settings. |
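
The Pseudocode entry above names SGD with a projection step and initialization at W0, and the Research Type entry describes a constraint on the Frobenius norm distance from W0. Since the paper provides no code, the following is only a minimal illustrative sketch of that general technique, not the authors' Algorithm 1. The function names, the squared-loss gradient, the fixed step size, and the choice to return the average iterate are all assumptions made for illustration.

```python
import numpy as np


def project_to_frobenius_ball(W, W0, radius):
    """Project W onto the set {V : ||V - W0||_F <= radius}."""
    delta = W - W0
    norm = np.linalg.norm(delta)  # Frobenius norm for a 2-D array
    if norm <= radius:
        return W
    return W0 + delta * (radius / norm)


def squared_loss_grad(W, x, y):
    """Gradient of 0.5 * ||W x - y||^2 with respect to W (hypothetical loss)."""
    return np.outer(W @ x - y, x)


def projected_sgd(grad_fn, samples, W0, radius, step_size):
    """One pass of projected SGD started at W0.

    Returns the average of the iterates (an assumption of this sketch;
    the paper's algorithm may define its output differently).
    """
    samples = list(samples)
    W = np.array(W0, dtype=float)
    running_sum = np.zeros_like(W)
    for x, y in samples:
        running_sum += W
        step = W - step_size * grad_fn(W, x, y)
        W = project_to_frobenius_ball(step, W0, radius)
    return running_sum / max(len(samples), 1)
```

Because the constraint set is a convex ball centered at W0, every iterate and their average stay within Frobenius distance `radius` of W0. A call might look like `projected_sgd(squared_loss_grad, samples, W0, radius=0.5, step_size=0.1)`, where `samples` is a list of `(x, y)` pairs.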