Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

The Implicit Bias of Minima Stability: A View from Function Space

Authors: Rotem Mulayoff, Tomer Michaeli, Daniel Soudry

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now verify our theoretical predictions in experiments. We train a single-hidden-layer ReLU network using GD with varying step sizes, all initialized at the same point. Figure 3(a) shows the training data and solutions to which GD converged.
Researcher Affiliation | Academia | Rotem Mulayoff, Technion Israel Institute of Technology, EMAIL; Tomer Michaeli, Technion Israel Institute of Technology, EMAIL; Daniel Soudry, Technion Israel Institute of Technology, EMAIL
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any statements or links regarding the availability of open-source code for the methodology described.
Open Datasets | No | The paper refers to "empirical distribution of the data" and uses synthetic data derived from "Uniform distribution", "Gaussian distribution", and "Laplace distribution" for illustrative purposes, but does not provide access information for a specific, named public dataset.
Dataset Splits | No | The paper does not specify exact split percentages or sample counts for training, validation, or test sets.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models) used for running the experiments.
Software Dependencies | No | The paper does not specify any software names with version numbers used in the experiments.
Experiment Setup | Yes | We train a single-hidden-layer ReLU network using GD with varying step sizes, all initialized at the same point... Figure 3(c) visualizes the sharpness of the solution as a function of the learning rate.
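Since the paper provides neither code nor pseudocode, the experiment setup quoted above can only be approximated. The following is a minimal NumPy sketch, not the authors' implementation: it trains a single-hidden-layer ReLU network with full-batch GD from one shared initialization at several step sizes, then estimates the sharpness (largest Hessian eigenvalue) of each solution. The training data, network width, step sizes, iteration count, and all function names (`make_problem`, `run_gd`, `sharpness`, and so on) are assumptions chosen for illustration. The printed value 2/lr is the standard linear-stability bound on sharpness for GD with step size lr, included only for comparison.

```python
import numpy as np

def unpack(theta, width):
    """Split the flat parameter vector into (w1, b1, w2, b2)."""
    return (theta[:width], theta[width:2 * width],
            theta[2 * width:3 * width], theta[3 * width])

def loss(theta, x, y, width):
    """Half mean squared error of the network on scalar inputs x."""
    w1, b1, w2, b2 = unpack(theta, width)
    hidden = np.maximum(np.outer(x, w1) + b1, 0.0)
    residual = hidden @ w2 + b2 - y
    return 0.5 * np.mean(residual ** 2)

def grad(theta, x, y, width):
    """Analytic gradient of `loss` with respect to the flat vector theta."""
    w1, b1, w2, b2 = unpack(theta, width)
    pre = np.outer(x, w1) + b1                  # pre-activations, shape (n, width)
    hidden = np.maximum(pre, 0.0)
    r = (hidden @ w2 + b2 - y) / len(x)         # dLoss/dOutput per sample
    d_pre = r[:, None] * (pre > 0) * w2         # dLoss/dPre-activation
    return np.concatenate([d_pre.T @ x, d_pre.sum(axis=0), hidden.T @ r, [r.sum()]])

def sharpness(theta, x, y, width, eps=1e-5):
    """Largest Hessian eigenvalue, via central differences of the gradient."""
    d = theta.size
    hessian = np.empty((d, d))
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        hessian[:, j] = (grad(theta + step, x, y, width)
                         - grad(theta - step, x, y, width)) / (2 * eps)
    return np.linalg.eigvalsh(0.5 * (hessian + hessian.T)).max()

def run_gd(theta0, x, y, width, lr, steps):
    """Full-batch gradient descent; stops early if the iterates blow up."""
    theta = theta0.copy()                       # every run starts from the same point
    for _ in range(steps):
        theta = theta - lr * grad(theta, x, y, width)
        if not np.all(np.isfinite(theta)):
            break
    return theta

def make_problem(width=8, n=12, seed=0):
    """Hypothetical 1-D regression data and a shared random initialization."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n)
    y = np.sin(3.0 * x)                         # assumed target, not from the paper
    theta0 = rng.normal(size=3 * width + 1)
    return x, y, theta0

if __name__ == "__main__":
    width = 8
    x, y, theta0 = make_problem(width)
    for lr in (0.01, 0.05, 0.2):                # assumed step sizes
        theta = run_gd(theta0, x, y, width, lr, steps=20000)
        if np.all(np.isfinite(theta)):
            print(f"lr={lr}: loss={loss(theta, x, y, width):.4g}, "
                  f"sharpness={sharpness(theta, x, y, width):.4g}, 2/lr={2 / lr:.4g}")
        else:
            print(f"lr={lr}: diverged")
```

Plotting the estimated sharpness against the step size would give a rough analogue of the quantity described for Figure 3(c), though the numbers depend entirely on the assumed data and initialization. The finite-difference Hessian is adequate for a few dozen parameters; a larger network would call for power iteration on Hessian-vector products instead.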