Optimal Brain Apoptosis

Authors: Mingyuan Sun, Zheng Fang, Jiaxu Wang, Junjie Jiang, Delei Kong, Chenming Hu, Yuetong Fang, Renjing Xu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our code is available at https://github.com/NEU-REAL/OBA. This approach allows for a more precise pruning process, particularly in the context of CNNs and Transformers, as validated in our experiments including VGG19, ResNet32, ResNet50, and ViT-B/16 on the CIFAR10, CIFAR100, and ImageNet datasets.
Researcher Affiliation | Academia | Mingyuan Sun1, Zheng Fang1, Jiaxu Wang2, Junjie Jiang1, Delei Kong3, Chenming Hu1, Yuetong Fang2 & Renjing Xu2; 1Northeastern University, 2The Hong Kong University of Science and Technology (Guangzhou), 3Hunan University
Pseudocode | Yes | Algorithm 1: Importance Score Acquisition of OBA
Open Source Code | Yes | Our code is available at https://github.com/NEU-REAL/OBA.
Open Datasets | Yes | as validated in our experiments including VGG19, ResNet32, ResNet50, and ViT-B/16 on the CIFAR10, CIFAR100, and ImageNet datasets.
Dataset Splits | Yes | We set the maximum epochs to 150 for the CIFAR10 and CIFAR100 datasets. For ImageNet experiments, we use the same settings as Fang et al. (2023) for ResNet50 and ViT-B/16, with maximum epochs set to 100.
Hardware Specification | No | The paper mentions "GPUs and TPUs" generally but does not specify the exact models or configurations used for the experiments.
Software Dependencies | No | The paper mentions using "the SGD optimizer with a momentum of 0.9 and a weight decay of 5×10⁻⁴" but does not specify any software versions (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | For each pruning step, we obtain the importance score from 200 batches of data. We set the batch size to 64 and the learning rate to 0.001 for the fine-tuning process, using the SGD optimizer with a momentum of 0.9 and a weight decay of 5×10⁻⁴. The learning rate is divided by 10 at the 80th and 120th epochs. We set the maximum epochs to 150 for the CIFAR10 and CIFAR100 datasets. For ImageNet experiments, we use the same settings as Fang et al. (2023) for ResNet50 and ViT-B/16, with maximum epochs set to 100.
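The fine-tuning schedule reported above (base learning rate 0.001, divided by 10 at the 80th and 120th epochs over 150 epochs) corresponds to a standard step-decay policy. The sketch below illustrates that schedule as a plain function; the function name and defaults are illustrative, not taken from the paper's released code:

```python
def step_decay_lr(epoch, base_lr=0.001, milestones=(80, 120), gamma=0.1):
    """Step-decay schedule: multiply the learning rate by `gamma`
    at each milestone epoch. With the defaults this reproduces the
    reported CIFAR fine-tuning schedule: LR 0.001 for epochs 0-79,
    0.0001 for epochs 80-119, and 0.00001 from epoch 120 onward.
    """
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

In a PyTorch setup this is equivalent to wrapping the SGD optimizer (momentum 0.9, weight decay 5×10⁻⁴) with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[80, 120], gamma=0.1)`.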