Cauchy-Schwarz Regularizers

Authors: Sueda Taner, Ziyi Wang, Christoph Studer

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate the efficacy of CS regularizers, we provide results for solving underdetermined systems of linear equations and weight quantization in neural networks. ... Finally, we showcase the efficacy and versatility of CS regularizers for solving underdetermined systems of linear equations and neural network weight binarization and ternarization. All proofs and additional experimental results are relegated to the appendices in the supplementary material. The code for our numerical experiments is available under https://github.com/IIP-Group/CS_regularizers. ... We conduct experiments on the benchmark datasets ImageNet (ILSVRC12) (Deng et al., 2009) and CIFAR-10 (Krizhevsky, 2009) for image classification using PyTorch (Paszke et al., 2019).
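The weight-binarization use case mentioned above can be illustrated with a small sketch. The penalty below is an assumed illustrative form derived from the Cauchy-Schwarz inequality, not necessarily the paper's exact regularizer: applying the inequality to the pair (|w|, 1) gives ||w||_1^2 ≤ n·||w||_2^2, with equality exactly when all entries of w share the same magnitude, i.e., when w is binary up to a common scale.

```python
import torch


def cs_binarization_penalty(w: torch.Tensor) -> torch.Tensor:
    """Cauchy-Schwarz-style binarization penalty (illustrative form only).

    By the Cauchy-Schwarz inequality applied to (|w|, 1):
        ||w||_1^2 <= n * ||w||_2^2,
    with equality iff all entries of w have equal magnitude, i.e.,
    w is {-a, +a}-valued for some scale a. The nonnegative gap is
    therefore a differentiable measure of how far w is from binary.
    """
    w = w.flatten()
    n = w.numel()
    return n * w.square().sum() - w.abs().sum().square()


# Zero for any scaled binary vector, strictly positive otherwise.
binary = torch.tensor([0.5, -0.5, 0.5, -0.5])   # penalty -> 0
mixed = torch.tensor([1.0, 2.0])                # penalty -> 2*5 - 9 = 1
```

Because the penalty is a polynomial in w, it can be added to any task loss and minimized with standard gradient-based optimizers, which is the general usage pattern the paper describes.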
Researcher Affiliation | Academia | Sueda Taner, ETH Zurich, Switzerland (EMAIL); Ziyi Wang, ETH Zurich, Switzerland (EMAIL); Christoph Studer, ETH Zurich, Switzerland (EMAIL)
Pseudocode | No | The paper describes its methods and procedures in narrative text and mathematical formulations; for example, Section 3.4.1 details the method for weight quantization in three steps. However, it does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks or figures.
Open Source Code | Yes | All proofs and additional experimental results are relegated to the appendices in the supplementary material. The code for our numerical experiments is available under https://github.com/IIP-Group/CS_regularizers.
Open Datasets | Yes | We conduct experiments on the benchmark datasets ImageNet (ILSVRC12) (Deng et al., 2009) and CIFAR-10 (Krizhevsky, 2009) for image classification using PyTorch (Paszke et al., 2019).
Dataset Splits | Yes | ImageNet has over 1.2 M training images and 50 k validation images from 1000 object classes. ... CIFAR-10 (Krizhevsky, 2009) consists of 50 k training images and 10 k testing images from 10 object classes.
Hardware Specification | Yes | For these three scenarios with ResNet-18, we measured 660 s, 680 s, and 650 s, respectively; with ResNet-20, we measured 4.1 s, 5.2 s, and 3.8 s. These numbers demonstrate that, while the calculation of the CS regularizers naturally results in some overhead in Step 1, the training might be even faster than full-precision training in Step 3, depending on the size of the network and the training dataset.
Software Dependencies | No | For our neural network weight quantization experiments in Section 3.4, we use PyTorch (Paszke et al., 2019). ... For our underdetermined linear systems experiments in Section 3.1, we used MATLAB. The paper mentions PyTorch and MATLAB but provides no version numbers for either, nor for any libraries used alongside PyTorch.
Experiment Setup | Yes | For ImageNet, we use ResNet-18 (He et al., 2016), initialize the weights with a pretrained full-precision model from PyTorch, and train the network for 40 and 20 epochs in Steps 1 and 3, respectively, with a batch size of 1024. For CIFAR-10, we use ResNet-20, initialize the weights with a pretrained full-precision model from Idelbayev (2021) similarly to Qin et al. (2020), and train the network for 400 and 20 epochs in Steps 1 and 3, respectively, with a batch size of 128. For both datasets, we set λ = 10 for binarization and λ = 10^5 for ternarization. We use the Adam optimizer (Kingma & Ba, 2017) with its learning rate initialized at 0.001 and the cosine annealing learning rate scheduler (Loshchilov & Hutter, 2016).
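The reported optimizer and scheduler settings can be sketched in PyTorch as follows. The model, data, regularizer body, and loss wiring below are stand-in assumptions for illustration; only Adam with initial learning rate 0.001, cosine annealing, and the λ-weighted penalty term reflect the reported setup.

```python
import torch

# Stand-ins (assumptions): a tiny linear model instead of ResNet-18/20,
# random data instead of ImageNet/CIFAR-10, and a placeholder penalty
# instead of the paper's CS regularizer.
model = torch.nn.Linear(8, 2)
lam = 10.0      # reported lambda for binarization
epochs = 40     # reported Step-1 epochs for ImageNet

# Reported setup: Adam with lr initialized at 0.001 and a
# cosine-annealing learning-rate schedule over the training run.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)


def regularizer(params):
    # Placeholder for the CS regularizer summed over weight tensors.
    return sum(p.square().sum() for p in params)


x, y = torch.randn(4, 8), torch.randint(0, 2, (4,))
for _ in range(epochs):
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss = loss + lam * regularizer(model.parameters())
    loss.backward()
    optimizer.step()
    scheduler.step()  # learning rate decays along a cosine curve
```

With `T_max` equal to the number of training steps, the cosine schedule anneals the learning rate from 0.001 toward zero by the end of training.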