On Space Folds of ReLU Neural Networks
Authors: Michal Lewandowski, Hamid Eghbalzadeh, Bernhard Heinzl, Raphael Pisoni, Bernhard A. Moser
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, we provide empirical analysis on a geometrical analysis benchmark (CantorNet) as well as an image classification benchmark (MNIST). Our work advances the understanding of the activation space in ReLU neural networks by leveraging the phenomena of geometric folding, providing valuable insights on how these models process input information. |
| Researcher Affiliation | Collaboration | Michal Lewandowski EMAIL Software Competence Center Hagenberg (SCCH); Hamid Eghbalzadeh EMAIL AI at Meta; Bernhard Heinzl EMAIL Software Competence Center Hagenberg (SCCH); Raphael Pisoni EMAIL Software Competence Center Hagenberg (SCCH); Bernhard A. Moser EMAIL Software Competence Center Hagenberg (SCCH), Johannes Kepler University of Linz (JKU) |
| Pseudocode | Yes | Algorithm 1: Computation of the Space Folding Measure (Eq. 4) ... Algorithm 2: Computation of Aggregated Folding Measures for Digit Pairs |
| Open Source Code | No | The paper does not contain any explicit statements about releasing code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Furthermore, we provide empirical analysis on a geometrical analysis benchmark (CantorNet) as well as an image classification benchmark (MNIST). |
| Dataset Splits | Yes | We then train those networks for 30 epochs on pre-defined random seeds (in Python, NumPy, and Torch), and store their parameters. For the networks that we managed to train to a high validation accuracy (especially for deeper networks it is highly dependent on the initialization), we analyze the relationship between the depth of a ReLU network and the aggregated median of maxima of non-zero space folding across all pairs of digits in the MNIST test set (100 pairs, 1M paths Γ for each pair). |
| Hardware Specification | No | The paper mentions that networks were trained and experiments performed but does not specify any hardware details such as CPU, GPU models, or memory. |
| Software Dependencies | No | We then train those networks for 30 epochs on pre-defined random seeds (in Python, NumPy, and Torch), and store their parameters. While Python, NumPy, and Torch are mentioned, specific version numbers are not provided. |
| Experiment Setup | Yes | We keep the number of hidden neurons constant (equal to 60), and we experiment with the depth and the width of the network, trying the following architectures: 2 × 30, 3 × 20, 4 × 15, 5 × 12, 6 × 10, 10 × 6, with the notation (no. layers) × (no. neurons). We then train those networks for 30 epochs on pre-defined random seeds (in Python, NumPy, and Torch), and store their parameters. |
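Since the paper's code is not released, the quoted experiment setup can only be reconstructed approximately. The sketch below is an assumption, not the authors' implementation: it enumerates the listed architectures and checks that each keeps the constant 60-hidden-neuron budget, using the standard MNIST dimensions (784 inputs, 10 classes) as the assumed input/output sizes.

```python
# Architectures from the paper, written as (no. layers, no. neurons per layer).
# All are stated to share a constant total of 60 hidden neurons.
ARCHITECTURES = [(2, 30), (3, 20), (4, 15), (5, 12), (6, 10), (10, 6)]


def layer_sizes(n_layers, n_neurons, n_in=784, n_out=10):
    """Full layer-size list for an MNIST MLP with the given hidden shape.

    n_in=784 and n_out=10 are assumptions (flattened 28x28 MNIST images,
    10 digit classes); the paper does not spell these out.
    """
    return [n_in] + [n_neurons] * n_layers + [n_out]


for n_layers, n_neurons in ARCHITECTURES:
    sizes = layer_sizes(n_layers, n_neurons)
    total_hidden = sum(sizes[1:-1])
    assert total_hidden == 60  # constant hidden-neuron budget from the paper
    print(f"{n_layers} x {n_neurons}: {sizes}")
```

A check like this makes the depth-vs-width comparison explicit: only the shape of the hidden stack varies between runs, so differences in the folding measure cannot be attributed to a changing parameter budget of hidden units.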