Learned Thresholds Token Merging and Pruning for Vision Transformers

Authors: Maxim Bonnaerens, Joni Dambre

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate our approach with extensive experiments on vision transformers on the ImageNet classification task. Our results demonstrate that LTMP achieves state-of-the-art accuracy across reduction rates while requiring only a single fine-tuning epoch, which is an order of magnitude faster than previous methods.
Researcher Affiliation | Academia | Maxim Bonnaerens (IDLab-AIRO, Ghent University, imec); Joni Dambre (IDLab-AIRO, Ghent University, imec)
Pseudocode | No | The paper describes its algorithms and mathematical formulations using equations (e.g., Equations 1, 2, 5-8, 10-12, and 14-17) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/Mxbonn/ltmp.
Open Datasets | Yes | In this section, we demonstrate our approach through extensive experiments on the ImageNet-1k classification task (Deng et al., 2009) using various ViT variants (Steiner et al., 2022; Touvron et al., 2021).
Dataset Splits | Yes | To select these hyperparameters, we used a separate held-out validation set containing 2% of the original ImageNet training set.
Hardware Specification | Yes | The benchmark is conducted on a Google Pixel 7 and averaged over 200 runs (with 50 warm-up runs prior).
Software Dependencies | No | All pretrained models are taken from the timm PyTorch library (Wightman, 2022). [...] benchmarked it on a mobile device using PyTorch's optimize_for_mobile function and speed_benchmark Android binary.
Experiment Setup | Yes | All our experiments are trained for a single epoch, using SGD without momentum and a batch size of 128. The remaining training settings, such as augmentations, are set to the default values of timm. The hyperparameters introduced in LTMP are set to τ = 0.1 and λ = 10. As the importance scores and similarity scores have values in different ranges, we use separate learning rates for the thresholds in the pruning modules and the merging modules: 5 × 10⁻⁶ for the pruning thresholds and 5 × 10⁻³ for the merging thresholds.
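The 2% held-out validation split noted in the Dataset Splits row can be reproduced in spirit with a small deterministic helper. This is a minimal sketch: the function name, seed, and toy dataset size are illustrative assumptions, not the authors' actual split code.

```python
import random

def heldout_split(n_train, frac=0.02, seed=0):
    """Hold out `frac` of the training indices as a validation set.

    Hypothetical helper: the paper only states that 2% of the ImageNet
    training set was held out, not how the split was drawn.
    """
    indices = list(range(n_train))
    random.Random(seed).shuffle(indices)  # deterministic for a fixed seed
    n_val = int(n_train * frac)
    return indices[n_val:], indices[:n_val]

# Toy example with 1,000 images:
train_idx, val_idx = heldout_split(1000)
# 2% of 1,000 -> 20 validation indices, 980 training indices
```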
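The Experiment Setup row can be summarized as a plain configuration sketch, including the separate learning rates for pruning and merging thresholds. The dictionary keys and the name-based learning-rate dispatch below are hypothetical conveniences; see the GitHub repository for the authors' actual configuration.

```python
# Hedged sketch of the LTMP fine-tuning setup reported in the paper.
# Key names are hypothetical assumptions, not the repository's API.
LTMP_HPARAMS = {
    "epochs": 1,
    "optimizer": "SGD",          # without momentum
    "batch_size": 128,
    "tau": 0.1,                  # τ as reported in the paper
    "lambda_": 10,               # λ as reported in the paper
    "lr_prune_threshold": 5e-6,  # importance scores: small value range
    "lr_merge_threshold": 5e-3,  # similarity scores: larger value range
}

def threshold_lr(param_name: str) -> float:
    """Pick the learning rate for a learned threshold by module type.

    Hypothetical name-matching scheme: the paper uses separate learning
    rates for pruning and merging thresholds because importance scores
    and similarity scores live in different value ranges.
    """
    if "prune" in param_name:
        return LTMP_HPARAMS["lr_prune_threshold"]
    return LTMP_HPARAMS["lr_merge_threshold"]
```

In a PyTorch training loop this dispatch would typically map onto optimizer parameter groups, one group per threshold type, each with its own `lr`.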