On Space Folds of ReLU Neural Networks

Authors: Michal Lewandowski, Hamid Eghbalzadeh, Bernhard Heinzl, Raphael Pisoni, Bernhard A. Moser

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Furthermore, we provide empirical analysis on a geometrical analysis benchmark (CantorNet) as well as an image classification benchmark (MNIST). Our work advances the understanding of the activation space in ReLU neural networks by leveraging the phenomenon of geometric folding, providing valuable insights into how these models process input information."
Researcher Affiliation | Collaboration | Michal Lewandowski (EMAIL), Software Competence Center Hagenberg (SCCH); Hamid Eghbalzadeh (EMAIL), AI at Meta; Bernhard Heinzl (EMAIL), Software Competence Center Hagenberg (SCCH); Raphael Pisoni (EMAIL), Software Competence Center Hagenberg (SCCH); Bernhard A. Moser (EMAIL), Software Competence Center Hagenberg (SCCH) and Johannes Kepler University of Linz (JKU)
Pseudocode | Yes | "Algorithm 1: Computation of the Space Folding Measure (Eq. 4) ... Algorithm 2: Computation of Aggregated Folding Measures for Digit Pairs"
Open Source Code | No | The paper does not contain any explicit statements about releasing code, nor does it provide a link to a code repository.
Open Datasets | Yes | "Furthermore, we provide empirical analysis on a geometrical analysis benchmark (CantorNet) as well as an image classification benchmark (MNIST)."
Dataset Splits | Yes | "We then train those networks for 30 epochs on pre-defined random seeds (in Python, NumPy and Torch), and store their parameters. For the networks that we managed to train to a high validation accuracy (especially for deeper networks it is highly dependent on the initialization), we analyze the relationship between the depth of a ReLU network and the aggregated median of maxima of non-zero space folding across all pairs of digits in the MNIST test set (100 pairs, 1M paths Γ for each pair)."
Hardware Specification | No | The paper mentions that networks were trained and experiments performed but does not specify any hardware details such as CPU or GPU models, or memory.
Software Dependencies | No | "We then train those networks for 30 epochs on pre-defined random seeds (in Python, NumPy and Torch), and store their parameters." While Python, NumPy, and Torch are mentioned, specific version numbers are not provided.
Experiment Setup | Yes | "We keep the number of hidden neurons constant (equal 60), and we experiment with the depth and the width of the network, trying the following architectures: 2 × 30, 3 × 20, 4 × 15, 5 × 12, 6 × 10, 10 × 6, with the notation (no. layers) × (no. neurons). We then train those networks for 30 epochs on pre-defined random seeds (in Python, NumPy and Torch), and store their parameters."
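The quoted setup trades depth against width under a fixed budget of 60 hidden neurons. A minimal sketch that expands each (no. layers) × (no. neurons) spec into a full layer-size list and checks the budget; the 784-dimensional input and 10-class output are assumed from standard MNIST, and the helper names are illustrative, not from the paper.

```python
# (no. layers) x (no. neurons per layer) pairs from the quoted setup.
ARCHITECTURES = [(2, 30), (3, 20), (4, 15), (5, 12), (6, 10), (10, 6)]

def layer_sizes(depth, width, n_in=784, n_out=10):
    """Full layer-size list for one architecture spec.

    n_in/n_out are assumed standard MNIST dimensions (28*28 inputs,
    10 digit classes), not stated in the quoted excerpt.
    """
    return [n_in] + [width] * depth + [n_out]

def total_hidden(depth, width):
    """Total hidden-neuron budget used by a (depth, width) spec."""
    return depth * width

# Every configuration spends the same 60-neuron hidden budget.
assert all(total_hidden(d, w) == 60 for d, w in ARCHITECTURES)
```

Keeping the hidden-neuron count fixed isolates the effect of depth, since any change in the folding measure cannot be attributed to a larger overall capacity.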