An Efficient Training Algorithm for Models with Block-wise Sparsity

Authors: Ding Zhu, Zhiqun Zuo, Mohammad Mahdi Khalili

TMLR 2025

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our extensive empirical and theoretical analyses show that our algorithms can decrease the computation and memory costs significantly without a performance drop compared to baselines. Through extensive empirical study, we show that in some cases, our proposed method can reduce the number of training parameters and training FLOPs by 97% with a minimal accuracy drop." |
| Researcher Affiliation | Academia | Ding Zhu (EMAIL), Zhiqun Zuo (EMAIL), and Mohammad Mahdi Khalili (EMAIL), all with the Department of Computer Science and Engineering, The Ohio State University. |
| Pseudocode | No | The paper includes mathematical formulations, propositions, and proofs, but does not contain any explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | The paper does not state that source code for the described methodology is released, nor does it link to a code repository; it only mentions the use of the PyTorch package. |
| Open Datasets | Yes | "The MNIST dataset contains 60000 training images and 10000 testing images which are gray pictures of digits 0 to 9. The size of each image is 28×28 pixels. ... We conduct an experiment with our approach with the ViT-tiny, ViT-base (Dosovitskiy et al., 2021; Touvron et al., 2021) and Swin-Transformer Tiny (Liu et al., 2021b) on CIFAR-100 image classification dataset (He et al., 2016)." |
| Dataset Splits | Yes | "The MNIST dataset contains 60000 training images and 10000 testing images which are gray pictures of digits 0 to 9. ... The dataset has 60 thousand pictures of 100 different categories. The model is trained using this dataset for 300 epochs." |
| Hardware Specification | Yes | "we used a server with 64 CPUs of AMD EPYC 7313 16-Core Processor. The server has 8 RTX A5000 GPUs, with 24GB memory for each one." |
| Software Dependencies | No | "We used PyTorch package ptflops to calculate the number of flops." While the paper mentions PyTorch and ptflops, it does not specify version numbers for these software components. |
| Experiment Setup | Yes | "We keep the rank of our decomposition equal to 2. ... We set λ1 = λ2 = 0.01 and increase these parameters by 0.002 every 5 epochs. We continue training for 50 epochs. ... The rank of the decomposition under our algorithm is 5 for all the layers in all the experiments. ... The model is trained using this dataset for 300 epochs. We keep the rank of our algorithm equal to 4." |
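The Experiment Setup row reports rank-2 to rank-5 decompositions and a sparsity-penalty schedule (λ1 = λ2 = 0.01, increased by 0.002 every 5 epochs). Since the paper provides no pseudocode, the following is a minimal illustrative sketch, not the authors' implementation: it computes the parameter saving of a rank-r factorization W ≈ UV of a dense layer, and the penalty schedule as quoted above. The layer dimensions and function names are hypothetical.

```python
def lowrank_params(d_in, d_out, rank):
    """Parameter counts for a dense layer W (d_in x d_out) versus its
    rank-r factorization W ~ U @ V with U (d_in x r) and V (r x d_out).
    Returns (dense_count, factored_count, fractional_saving)."""
    dense = d_in * d_out
    factored = rank * (d_in + d_out)
    return dense, factored, 1.0 - factored / dense


def lambda_schedule(epoch, lam0=0.01, step=0.002, every=5):
    """Penalty schedule quoted in the paper: start at lambda = 0.01 and
    add 0.002 every 5 epochs."""
    return lam0 + step * (epoch // every)


if __name__ == "__main__":
    # Hypothetical 4096x4096 layer at rank 2 (dimensions are illustrative,
    # not taken from the paper).
    dense, factored, saving = lowrank_params(4096, 4096, 2)
    print(dense, factored, round(saving, 4))
    print(lambda_schedule(0), lambda_schedule(10))
```

For a square 4096-wide layer at rank 2 the factorization keeps well under 3% of the dense parameters, which is consistent in spirit with the 97% reduction the abstract reports, though the paper's exact figure depends on its architectures and ranks.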