Cauchy-Schwarz Regularizers
Authors: Sueda Taner, Ziyi Wang, Christoph Studer
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate the efficacy of CS regularizers, we provide results for solving underdetermined systems of linear equations and weight quantization in neural networks. ... Finally, we showcase the efficacy and versatility of CS regularizers for solving underdetermined systems of linear equations and neural network weight binarization and ternarization. All proofs and additional experimental results are relegated to the appendices in the supplementary material. The code for our numerical experiments is available under https://github.com/IIP-Group/CS_regularizers. ... We conduct experiments on the benchmark datasets ImageNet (ILSVRC12) (Deng et al., 2009) and CIFAR-10 (Krizhevsky, 2009) for image classification using PyTorch (Paszke et al., 2019). |
| Researcher Affiliation | Academia | Sueda Taner ETH Zurich, Switzerland EMAIL Ziyi Wang ETH Zurich, Switzerland EMAIL Christoph Studer ETH Zurich, Switzerland EMAIL |
| Pseudocode | No | The paper describes methods and procedures in narrative text and mathematical formulations. For example, Section 3.4.1 details the 'METHOD' for weight quantization in three steps. However, it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures. |
| Open Source Code | Yes | All proofs and additional experimental results are relegated to the appendices in the supplementary material. The code for our numerical experiments is available under https://github.com/IIP-Group/CS_regularizers. |
| Open Datasets | Yes | We conduct experiments on the benchmark datasets ImageNet (ILSVRC12) (Deng et al., 2009) and CIFAR-10 (Krizhevsky, 2009) for image classification using PyTorch (Paszke et al., 2019). |
| Dataset Splits | Yes | ImageNet has over 1.2M training images and 50k validation images from 1000 object classes. ... CIFAR-10 (Krizhevsky, 2009) consists of over 50k training images and 10k testing images from 10 object classes. |
| Hardware Specification | Yes | For these three scenarios with ResNet-18, we measured 660 s, 680 s, and 650 s, respectively; with ResNet-20, we measured 4.1 s, 5.2 s, and 3.8 s. These numbers demonstrate that, while the calculation of the CS regularizers naturally results in some overhead in Step 1, the training might be even faster than full-precision training in Step 3, depending on the size of the network and the training dataset. |
| Software Dependencies | No | For our neural network weight quantization experiments in Section 3.4, we use PyTorch (Paszke et al., 2019). ... For our underdetermined linear systems experiments in Section 3.1, we used MATLAB. The paper mentions PyTorch and MATLAB but does not provide specific version numbers for either, nor version numbers for any libraries used with PyTorch. |
| Experiment Setup | Yes | For ImageNet, we use ResNet-18 (He et al., 2016), initialize the weights with a pretrained full-precision model from PyTorch, and train the network for 40 and 20 epochs in Steps 1 and 3, respectively, with a batch size of 1024. For CIFAR-10, we use ResNet-20, initialize the weights with a pretrained full-precision model from Idelbayev (2021) similarly to Qin et al. (2020), and train the network for 400 and 20 epochs in Steps 1 and 3, respectively, with a batch size of 128. For both datasets, we set λ = 10 for binarization and λ = 10^5 for ternarization. We use the Adam optimizer (Kingma & Ba, 2017) with its learning rate initialized to 0.001 and the cosine annealing learning rate scheduler (Loshchilov & Hutter, 2016). |
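The optimizer and scheduler configuration quoted in the Experiment Setup row can be sketched with standard PyTorch APIs. This is a minimal illustration under stated assumptions, not the authors' code: the tiny `nn.Linear` model stands in for ResNet-18/ResNet-20, and the CS-regularized loss term is elided since its formula is not given here.

```python
import torch
from torch import nn

# Stand-in model; the paper uses ResNet-18 (ImageNet) and ResNet-20 (CIFAR-10),
# initialized from pretrained full-precision weights.
model = nn.Linear(8, 2)

# Adam with initial learning rate 0.001, as reported.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Cosine annealing over the epoch budget (20 epochs in Step 3 for both datasets).
epochs = 20
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

for epoch in range(epochs):
    # ... forward pass, task loss + lambda * CS regularizer, backward() ...
    optimizer.step()   # placeholder step (no gradients computed in this sketch)
    scheduler.step()   # anneal the learning rate toward 0 over `epochs` steps
```

With the default `eta_min=0`, the learning rate decays from 0.001 to 0 over the `T_max` scheduler steps; the paper's λ (10 for binarization, 10^5 for ternarization) would weight the regularizer inside the elided loss.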