On the Local Complexity of Linear Regions in Deep ReLU Networks
Authors: Niket Nikul Patel, Guido Montúfar
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We replicate similar experiments in Figure 3. In this work, we aim to develop theory to explain some of the empirical results of Humayun et al. (2024b)." and "We empirically demonstrate this behavior on the MNIST dataset in Figure 3." and "We empirically validate this claim in Figure 13, where we demonstrate that the local complexity will typically be lower for networks trained with a larger weight decay." |
| Researcher Affiliation | Academia | 1. Department of Mathematics, UCLA, USA; 2. Department of Statistics & Data Science, UCLA, USA; 3. MPI MiS, Germany. Correspondence to: Niket Patel <EMAIL>, Guido Montúfar <EMAIL>. |
| Pseudocode | No | The paper describes methods using mathematical formulations and textual explanations without presenting any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | "We empirically demonstrate this behavior on the MNIST dataset in Figure 3." and "Specifically, we show similar trends for the CIFAR-10 (Krizhevsky & Hinton, 2009) and Imagenette (Howard, 2019) datasets." |
| Dataset Splits | No | "Here we train a 4 layer MLP with 200 neurons in each layer on a subset of 1000 images across all classes in the MNIST dataset." The paper mentions using subsets of datasets but does not provide specific training/validation/test splits or references to standard splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | "We train a 4 layer MLP with 200 neurons in each layer... We use an initialization scale that is 2x the standard He initialization." and "We train with the Adam optimizer with learning rate 1e-4." (See the illustrative sketch after this table.) |
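
To make the quoted setup concrete, below is a minimal PyTorch sketch of the reported configuration: a 4-layer MLP with 200 neurons per layer, weights scaled to 2x the standard He initialization, trained with Adam at learning rate 1e-4 on a 1000-image MNIST subset. The paper does not state the framework, batch size, epoch count, loss function, or how the 1000-image subset is sampled, so those choices (and names such as `make_mlp` and the data path) are assumptions for illustration only.

```python
# Hypothetical reconstruction of the reported training setup.
# Framework, batch size, epochs, loss, and subset sampling are assumed.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

def make_mlp(in_dim=784, width=200, depth=4, out_dim=10, init_scale=2.0):
    """4 hidden ReLU layers of 200 units each, plus a linear output layer."""
    layers, dim = [], in_dim
    for _ in range(depth):
        linear = nn.Linear(dim, width)
        # Standard He (Kaiming) initialization, then scaled by 2x as reported.
        nn.init.kaiming_normal_(linear.weight, nonlinearity="relu")
        linear.weight.data.mul_(init_scale)
        layers += [linear, nn.ReLU()]
        dim = width
    layers.append(nn.Linear(dim, out_dim))
    return nn.Sequential(*layers)

# 1000-image subset "across all classes"; the exact sampling is not specified,
# so taking the first 1000 training images is an assumption.
train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
loader = DataLoader(Subset(train_set, range(1000)),
                    batch_size=100, shuffle=True)  # batch size assumed

model = make_mlp()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr from the paper
criterion = nn.CrossEntropyLoss()  # loss not stated; assumed

for epoch in range(10):  # number of epochs not reported; placeholder value
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images.view(images.size(0), -1)), labels)
        loss.backward()
        optimizer.step()
```

The sketch only reproduces the training configuration quoted in the table; the paper's local complexity estimator is not restated here and is not implemented above.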