Optimal Brain Apoptosis
Authors: Mingyuan Sun, Zheng Fang, Jiaxu Wang, Junjie Jiang, Delei Kong, Chenming Hu, Yuetong Fang, Renjing Xu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our code is available at https://github.com/NEU-REAL/OBA. This approach allows for a more precise pruning process, particularly in the context of CNNs and Transformers, as validated in our experiments including VGG19, ResNet32, ResNet50, and ViT-B/16 on CIFAR10, CIFAR100 and ImageNet datasets. |
| Researcher Affiliation | Academia | Mingyuan Sun1, Zheng Fang1, Jiaxu Wang2, Junjie Jiang1, Delei Kong3, Chenming Hu1, Yuetong Fang2 & Renjing Xu2, 1Northeastern University 2The Hong Kong University of Science and Technology (Guangzhou) 3Hunan University |
| Pseudocode | Yes | Algorithm 1 Importance Score Acquisition of OBA |
| Open Source Code | Yes | Our code is available at https://github.com/NEU-REAL/OBA. |
| Open Datasets | Yes | as validated in our experiments including VGG19, ResNet32, ResNet50, and ViT-B/16 on CIFAR10, CIFAR100 and ImageNet datasets. |
| Dataset Splits | Yes | We set the maximum epochs to 150 for CIFAR10 and CIFAR100 datasets. For ImageNet experiments, we use the same settings as Fang et al. (2023) for ResNet50 and ViT-B/16. We set the maximum epochs to 100 for ImageNet experiments. |
| Hardware Specification | No | The paper mentions "GPUs and TPUs" generally but does not specify the exact models or configurations used for their experiments. |
| Software Dependencies | No | The paper mentions using "the SGD optimizer with a momentum of 0.9 and a weight decay of 5×10⁻⁴" but does not specify any software versions (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For each pruning step, we calculate the importance score from 200 batches of data. We set the batch size to 64 and the learning rate to 0.001 for the fine-tuning process. We use the SGD optimizer with a momentum of 0.9 and a weight decay of 5×10⁻⁴. The learning rate is divided by 10 at the 80th and 120th epochs. We set the maximum epochs to 150 for CIFAR10 and CIFAR100 datasets. For ImageNet experiments, we use the same settings as Fang et al. (2023) for ResNet50 and ViT-B/16. We set the maximum epochs to 100 for ImageNet experiments. |
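The fine-tuning recipe quoted in the Experiment Setup row can be collected into a small configuration sketch. This is an illustrative reconstruction only, assuming a standard step-decay schedule; the names `FINETUNE` and `lr_at_epoch` are mine, not from the paper's released code (see the OBA repository for the authors' implementation).

```python
# Hypothetical summary of the fine-tuning hyperparameters reported in the
# paper's excerpt; names and structure are illustrative, not the authors' API.
FINETUNE = {
    "optimizer": "SGD",
    "momentum": 0.9,
    "weight_decay": 5e-4,
    "batch_size": 64,
    "base_lr": 0.001,
    "lr_milestones": (80, 120),  # epochs at which the lr is divided by 10
    "max_epochs": {"CIFAR10": 150, "CIFAR100": 150, "ImageNet": 100},
}

def lr_at_epoch(epoch: int,
                base_lr: float = FINETUNE["base_lr"],
                milestones: tuple = FINETUNE["lr_milestones"],
                gamma: float = 0.1) -> float:
    """Step-decay schedule: divide the base learning rate by 10 at each
    milestone epoch that has been reached."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops
```

In a PyTorch training loop this would typically correspond to `torch.optim.SGD` combined with a `MultiStepLR` scheduler using the same milestones and `gamma=0.1`.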