LEVIS: Large Exact Verifiable Input Spaces for Neural Networks
Authors: Mohamad Fares El Hajj Chehade, Wenting Li, Brian Wesley Bell, Russell Bent, Saif R. Kazi, Hao Zhu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our methods on two neural network applications, electric power flow regression and image classification, demonstrating both performance gains and visual insights into the geometry of the verifiable input space. |
| Researcher Affiliation | Collaboration | 1Chandra Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, USA 2Los Alamos National Laboratory, Los Alamos, NM, USA. Correspondence to: Mohamad Chehade <EMAIL>, Wenting Li <EMAIL>, Brian W. Bell <EMAIL>, Russell Bent <EMAIL>, Saif R. Kazi <EMAIL>, Hao Zhu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: LEVIS-α (Iterative Refinement for Center Estimation); Algorithm 2: LEVIS-β. |
| Open Source Code | Yes | Code and experiments are being made available at: https://github.com/LEVIS-LANL/LEVIS |
| Open Datasets | Yes | For the optimal power flow (OPF) problem, we generate a direct-current (DC) power flow dataset, denoted as D1 = {(xk, yk) \| xk ∈ R^3, yk ∈ R^3}, k = 1, ..., 1000, using the IEEE 9-bus power grid benchmark, a widely recognized model for simulating energy flow in power systems. For image classification, we utilize the MNIST (LeCun et al., 2010) and CIFAR-10 (Krizhevsky et al., 2009) datasets. |
| Dataset Splits | Yes | The dataset is created by solving the DC Optimal Power Flow (DC-OPF) problem (Frank et al., 2012) using nominal inputs x0 = [125, 90, 100]T from standard datasets, perturbed by 10% uniform noise to produce a diverse set of 1,000 data samples. ...selecting 80% of the data samples randomly for training and reserving the remaining for testing. MNIST consists of 60,000 training and 10,000 testing samples with an input dimension of 784, while CIFAR-10 includes 50,000 training and 10,000 testing samples with an input dimension of 3,072. |
| Hardware Specification | Yes | We used three machines for the computations; their specifications are shown in Table 4, Table 5, and Table 6. Table 4 (CPU 1): Intel(R) Core(TM) i7-8550U @ 1.80GHz, base speed 1.99 GHz (current speed 1.57 GHz, can vary), 4 cores / 8 logical processors, L1 cache 256 KB, L2 cache 1.0 MB, L3 cache 8.0 MB, virtualization disabled, Hyper-V support: yes. Table 5 (CPU 2): Intel(R) Xeon(R) Gold 6258R @ 2.70GHz, base speed 2.70 GHz, 56 cores / 112 logical processors, L1 cache 100 MB, L2 cache 3136 MB, L3 cache 154 MB. Table 6 (CPU 3): Apple M3 Max @ 64GHz, 16 cores. |
| Software Dependencies | No | To solve the mixed-integer programming (MIP) problems in (1) and (3), we use Gurobi with OMLT (Ceccon et al., 2022) for small neural networks (fewer than 100 neurons) trained on DC-OPF datasets. For large-scale MNIST and CIFAR-10 networks, we adopt the Complementary Constraints (CC) approach (Section 4.2), employing Ipopt (Wächter & Biegler, 2006) for the NLP formulation and Gurobi for the reduced MIP. All optimization models are implemented using Pyomo (Hart et al., 2024). No specific version numbers were provided for Gurobi, OMLT, Ipopt, or Pyomo. |
| Experiment Setup | Yes | We employ three-layer dense neural networks to model solutions for both optimal power flow (regression) and image classification tasks. We train neural networks to regress optimal power flow solutions and compare the verifiable radius from our method (solving the NLP in (1)-(1f)) against the Lipschitz-based bound. We implement a two-layer ReLU neural network trained using cross-entropy loss and optimized with the Adam optimizer at a learning rate of 0.001. Specifically, let pi and qi be the complementarity variables for the ith neuron. The nonlinear optimization of (1) transforms into: ... To improve convergence behavior, we introduce a small regularization parameter ε > 0 (suggested to be 10^-5). |
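The classifier setup quoted in the Experiment Setup row (a two-layer ReLU network trained with cross-entropy loss and Adam at a learning rate of 0.001) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the hidden width, weight initialization, and the random toy data are assumptions made here purely to exercise the training loop.

```python
import numpy as np

# Sketch of a two-layer ReLU classifier with softmax cross-entropy and Adam
# (lr = 0.001, as quoted). Layer sizes and data are illustrative assumptions.
rng = np.random.default_rng(0)
D_IN, D_HID, D_OUT = 784, 64, 10          # MNIST-like input dim; hidden width assumed
LR, B1, B2, EPS = 1e-3, 0.9, 0.999, 1e-8  # standard Adam hyperparameters

params = {"W1": rng.normal(0, 0.05, (D_IN, D_HID)), "b1": np.zeros(D_HID),
          "W2": rng.normal(0, 0.05, (D_HID, D_OUT)), "b2": np.zeros(D_OUT)}
adam_m = {k: np.zeros_like(v) for k, v in params.items()}
adam_v = {k: np.zeros_like(v) for k, v in params.items()}

def loss_and_grads(X, y, p):
    # forward pass: linear -> ReLU -> linear -> softmax cross-entropy
    z1 = X @ p["W1"] + p["b1"]
    h = np.maximum(z1, 0.0)
    logits = h @ p["W2"] + p["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    n = X.shape[0]
    loss = -np.log(probs[np.arange(n), y] + 1e-12).mean()
    # backward pass
    dlogits = probs.copy()
    dlogits[np.arange(n), y] -= 1.0
    dlogits /= n
    grads = {"W2": h.T @ dlogits, "b2": dlogits.sum(axis=0)}
    dz1 = (dlogits @ p["W2"].T) * (z1 > 0)                # ReLU gradient mask
    grads["W1"] = X.T @ dz1
    grads["b1"] = dz1.sum(axis=0)
    return loss, grads

def adam_step(p, grads, m, v, t):
    # one Adam update with bias-corrected first/second moment estimates
    for k in p:
        m[k] = B1 * m[k] + (1 - B1) * grads[k]
        v[k] = B2 * v[k] + (1 - B2) * grads[k] ** 2
        p[k] -= LR * (m[k] / (1 - B1 ** t)) / (np.sqrt(v[k] / (1 - B2 ** t)) + EPS)

# toy data, just to exercise the loop (not the paper's datasets)
X = rng.normal(size=(256, D_IN))
y = rng.integers(0, D_OUT, size=256)
losses = []
for t in range(1, 51):
    loss, grads = loss_and_grads(X, y, params)
    adam_step(params, grads, adam_m, adam_v, t)
    losses.append(loss)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

On real data the same loop would simply swap in MNIST or CIFAR-10 batches; the sketch fixes a single batch to keep it self-contained.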
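The complementarity variables p_i, q_i and the regularization ε > 0 mentioned in the Experiment Setup row can be made concrete with the standard complementarity encoding of a ReLU activation. The symbols z_i (pre-activation) and h_i (post-activation) are introduced here for illustration; the paper's exact variable names and constraint set may differ.

```latex
% ReLU h_i = \max(0, z_i) encoded via complementarity variables (sketch):
%   p_i plays the role of the output, q_i of the negative-part slack.
h_i \ge 0, \qquad h_i - z_i \ge 0, \qquad
p_i := h_i, \quad q_i := h_i - z_i, \qquad p_i \, q_i \le \varepsilon
% The exact complementarity p_i q_i = 0 is relaxed to p_i q_i \le \varepsilon
% with a small \varepsilon > 0 (e.g. 10^{-5}) to improve NLP convergence.
```

With ε = 0 the constraints force each neuron to be exactly on one side of the ReLU kink; the small positive ε smooths the feasible set so interior-point solvers such as Ipopt behave better.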