Fully Automatic Neural Network Reduction for Formal Verification

Authors: Tobias Ladner, Matthias Althoff

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluation shows that our approach reduces large neural networks to a fraction of the original number of neurons and thus shortens the verification time to a similar degree. We implemented our approach in the MATLAB toolbox CORA (Althoff, 2015; Singh et al., 2018a) and evaluate it using several benchmarks and neural network variants from the VNN competition (Bak et al., 2021). The main results are presented in this section. All evaluation details, additional experiments, and ablation studies are given in Appendix B.
Researcher Affiliation | Academia | Tobias Ladner EMAIL, Matthias Althoff EMAIL, School of Computation, Information and Technology, Technical University of Munich, Germany
Pseudocode | Yes | Algorithm 1: Automatic Determination of Bucket Tolerance; Algorithm 2: On-the-fly Neural Network Reduction; Algorithm 3: Sound Compression Preprocessing
Open Source Code | No | We implemented our approach in the MATLAB toolbox CORA (Althoff, 2015; Singh et al., 2018a) and evaluate it using several benchmarks and neural network variants from the VNN competition (Bak et al., 2021). Explanation: The paper states that the approach was implemented *in* an existing toolbox (CORA), but does not provide a specific link or explicit statement that *their* implementation for this paper is open-sourced.
Open Datasets | Yes | We implemented our approach in the MATLAB toolbox CORA (Althoff, 2015; Singh et al., 2018a) and evaluate it using several benchmarks and neural network variants from the VNN competition (Bak et al., 2021). For example, on the ERAN 6×200 sigmoid network classifying MNIST images... We show this iterative process on the CIFAR-10 dataset in Fig. 6b using the large convolutional neural network taken from the Marabou benchmark. ...ACAS Xu benchmark (Bak et al., 2021).
Dataset Splits | No | For all image datasets, we sample 100 correctly classified images from the test set and average the results. Explanation: The paper mentions using a 'test set' and sampling 100 images from it, implying standard splits for these benchmarks. However, it does not explicitly state the specific percentages or methodology for training/validation/test splits for reproducibility.
Hardware Specification | Yes | All computations were performed on an Intel Core Gen. 11 i7-11800H CPU @2.30GHz with 64GB memory if not stated otherwise. The GPU results for α, β-CROWN were computed on the same device using an NVIDIA GeForce RTX 3080 laptop GPU.
Software Dependencies | No | We implemented our approach in the MATLAB toolbox CORA (Althoff, 2015; Singh et al., 2018a) and evaluate it using several benchmarks and neural network variants from the VNN competition (Bak et al., 2021). Explanation: The paper mentions using the 'MATLAB toolbox CORA' and 'α, β-CROWN', but does not provide specific version numbers for MATLAB, CORA, or α, β-CROWN.
Experiment Setup | Yes | Thus, we automatically determine δ given a desired remaining number of neurons in Alg. 1 using binary search. We denote the ratio of remaining neurons compared to the original network with the reduction rate ρ ∈ [0, 1]. To verify a given specification, we initially choose a very small ρ and iteratively increase it if the reduction is too outer-approximative. ... The bucket tolerance δ ∈ R+ influences how many neurons are merged, where a larger value results in more aggressive neuron merging and thus a larger outer approximation. ... The perturbation radius ϵ ∈ R+ is always stated with respect to the normalized images X ⊆ [0, 1]^{n0}. If not otherwise stated, feed-forward neural networks are reduced using static merge buckets and convolutional neural networks using dynamic merge buckets (Sec. 3.2).