SMMF: Square-Matricized Momentum Factorization for Memory-Efficient Optimization
Authors: Kwangryeol Park, Seulki Lee
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiment, SMMF takes up to 96% less memory compared to state-of-the-art memory-efficient optimizers, e.g., Adafactor, CAME, and SM3, while achieving comparable model performance on various CNN and Transformer tasks. |
| Researcher Affiliation | Academia | Kwangryeol Park (Artificial Intelligence Graduate School, UNIST, South Korea), Seulki Lee (Department of Computer Science and Engineering, UNIST, South Korea) |
| Pseudocode | Yes | Algorithm 1: Overall SMMF applied to each layer. The elements of r, c, M, V, and S are initially set to zeros. |
| Open Source Code | Yes | Code https://github.com/eai-lab/SMMF |
| Open Datasets | Yes | We apply the five optimizers, including SMMF, to two representative image tasks, i.e., image classification and object detection, and evaluate them by 1) training ResNet-50 (He et al. 2016) and MobileNetV2 (Dong et al. 2020) on CIFAR100 (Krizhevsky, Hinton et al. 2009) and ImageNet (Russakovsky et al. 2015), and 2) training YOLOv5s and YOLOv5m (Ultralytics 2021) on COCO (Lin et al. 2015). |
| Dataset Splits | Yes | We apply the five optimizers, including SMMF, to two representative image tasks, i.e., image classification and object detection, and evaluate them by 1) training ResNet-50 (He et al. 2016) and MobileNetV2 (Dong et al. 2020) on CIFAR100 (Krizhevsky, Hinton et al. 2009) and ImageNet (Russakovsky et al. 2015), and 2) training YOLOv5s and YOLOv5m (Ultralytics 2021) on COCO (Lin et al. 2015). (Implies use of standard benchmark splits for these well-known datasets.) |
| Hardware Specification | No | The paper does not explicitly state the specific hardware used for running its experiments, such as GPU models, CPU models, or memory. It mentions 'different machines' in passing, but no specifics. |
| Software Dependencies | No | We implement the proposed SMMF using PyTorch (Paszke et al. 2017), which is available both on GitHub and in Appendix M. (PyTorch is mentioned but without a specific version number.) |
| Experiment Setup | No | The detailed experimental setups and training configurations are provided in Appendix L. |
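To make the memory-saving idea behind optimizers like SMMF and Adafactor concrete, the following is a minimal sketch of the rank-1 nonnegative factorization that lets an optimizer store row and column statistics (vectors r and c, as in Algorithm 1 above) instead of a full second-moment matrix. This is an illustrative Adafactor-style factorization, not the authors' exact SMMF implementation; the function names are hypothetical.

```python
import numpy as np

def factorize_second_moment(V):
    """Rank-1 nonnegative factorization of an n x m second-moment matrix V:
    keep only the row sums r (n values) and column sums c (m values)."""
    r = V.sum(axis=1)  # row sums, shape (n,)
    c = V.sum(axis=0)  # column sums, shape (m,)
    return r, c

def reconstruct(r, c):
    """Approximate V as the outer product r c^T divided by the total sum."""
    total = r.sum()  # equals c.sum(): the sum of all entries of V
    return np.outer(r, c) / total

rng = np.random.default_rng(0)
V = rng.random((4, 6)) ** 2          # nonnegative stand-in for a second moment
r, c = factorize_second_moment(V)
V_hat = reconstruct(r, c)

# Memory: n + m scalars instead of n * m.
print(r.size + c.size, "vs", V.size)  # prints: 10 vs 24

# The reconstruction is exact whenever V itself is rank-1 and nonnegative:
u, w = rng.random(4), rng.random(6)
r1, c1 = factorize_second_moment(np.outer(u, w))
assert np.allclose(reconstruct(r1, c1), np.outer(u, w))
```

The "square-matricized" part of SMMF goes further by reshaping each parameter tensor into a near-square matrix before factorizing, which the paper reports cuts optimizer state memory by up to 96% relative to Adafactor, CAME, and SM3.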