Gaussian Loss Smoothing Enables Certified Training with Tight Convex Relaxations
Authors: Stefan Balauca, Mark Niklas Müller, Yuhao Mao, Maximilian Baader, Marc Fischer, Martin Vechev
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that when combined with tight relaxations, these methods surpass state-of-the-art methods when training on the same network architecture for many settings. Our results clearly demonstrate the promise of Gaussian Loss Smoothing for training certifiably robust neural networks and pave a path towards leveraging tighter relaxations for certified training. |
| Researcher Affiliation | Collaboration | Stefan Balauca (INSAIT, Sofia University St. Kliment Ohridski, Bulgaria); Mark Niklas Müller (LogicStar.ai; work done while at ETH Zürich); Yuhao Mao (Department of Computer Science, ETH Zürich, Switzerland); Maximilian Baader (Department of Computer Science, ETH Zürich, Switzerland); Marc Fischer (Invariant Labs; work done while at ETH Zürich); Martin Vechev (Department of Computer Science, ETH Zürich, Switzerland) |
| Pseudocode | No | The paper describes the methods PGPE and RGS textually and illustrates their processes with figures (Figure 6 and Figure 7), but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Reproducibility Statement We release the complete code used for our experiments at github.com/stefanrzv2000/GLS-Cert-Training. A detailed description of the experimental setup and hyperparameters is provided in F. |
| Open Datasets | Yes | We use the MNIST (LeCun et al., 2010), CIFAR-10 (Krizhevsky et al., 2009) and TinyImageNet (Le & Yang, 2015) datasets for our experiments. All are open source and freely available, though no license is specified. |
| Dataset Splits | Yes | We train on the corresponding train set and certify on the validation set, as adopted in the literature (Shi et al., 2021; Müller et al., 2023; Mao et al., 2023a; De Palma et al., 2024). |
| Hardware Specification | Yes | For PGPE and RGS training, we used between 2 and 8 NVIDIA L4-24GB or NVIDIA A100-40GB GPUs. For standard certified training and for certification of all models we used single L4 GPUs. |
| Software Dependencies | No | The paper mentions "We implement all certified training methods in PyTorch (Paszke et al., 2019)..." but does not specify a version number for PyTorch or any other key software dependency. |
| Experiment Setup | Yes | We train with the Adam optimizer (Kingma & Ba, 2015) with a starting learning rate of 5e-5 for 70 epochs on MNIST and 160 epochs on CIFAR-10 and TinyImageNet. We use the first 20 epochs on MNIST and the first 80 epochs on CIFAR-10 and TinyImageNet for epsilon-annealing, with the first epoch having epsilon = 0 for CIFAR-10 and TinyImageNet. We decay the learning rate by a factor of 0.2 after epochs 50 and 60 for MNIST, and after epochs 120 and 140 for CIFAR-10 and TinyImageNet. |
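The training schedule quoted in the Experiment Setup row (step-decayed learning rate plus linear epsilon-annealing) can be sketched as two plain-Python helpers. This is an illustrative sketch only, using the MNIST milestones from the quote; the function names are ours and do not come from the paper's released code.

```python
def lr_at_epoch(epoch, base_lr=5e-5, milestones=(50, 60), gamma=0.2):
    """Step decay: multiply the base rate by gamma once per passed milestone.

    Defaults reflect the quoted MNIST schedule (decay by 0.2 after epochs
    50 and 60); CIFAR-10 / TinyImageNet would use milestones=(120, 140).
    """
    factor = gamma ** sum(epoch >= m for m in milestones)
    return base_lr * factor

def eps_at_epoch(epoch, eps_target, anneal_epochs=20, start_zero=False):
    """Linear epsilon-annealing over the first `anneal_epochs` epochs.

    With start_zero=True (CIFAR-10 / TinyImageNet), epoch 0 trains at
    epsilon = 0 before the linear ramp begins.
    """
    if start_zero and epoch == 0:
        return 0.0
    return min(1.0, epoch / anneal_epochs) * eps_target
```

For example, `lr_at_epoch(55)` yields 1e-5 (one decay applied) and `eps_at_epoch(10, 0.3)` yields 0.15 (halfway through a 20-epoch ramp); after the ramp the epsilon stays at its target value.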