Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Dynamic Sparse Training of Diagonally Sparse Networks
Authors: Abhishek Tyagi, Arjun Iyer, William H Renninger, Christopher Kanan, Yuhao Zhu
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations on diverse neural architectures demonstrate that our method maintains accuracy on par with unstructured counterparts while benefiting from tangible computational gains. |
| Researcher Affiliation | Academia | 1Department of Computer Science, University of Rochester, Rochester, NY, USA 2The Institute of Optics, University of Rochester, Rochester, NY, USA. Correspondence to: Abhishek Tyagi <EMAIL>. |
| Pseudocode | No | For a comprehensive understanding of the implementation details, including pseudocode and additional optimizations, we direct readers to (Okanovic et al., 2024). |
| Open Source Code | Yes | Our source code is available at https://github.com/horizon-research/DynaDiag/. |
| Open Datasets | Yes | 1. CIFAR-10 (Krizhevsky & Hinton, 2009) consists of 60,000 colored images of resolution 32×32, divided into 10 classes (e.g., airplanes, cars, birds). The dataset is split into 50,000 training and 10,000 test images. 2. CIFAR-100 (Krizhevsky & Hinton, 2009) also contains 32×32 resolution images but spans 100 classes. Each class includes 500 training and 100 test images, totaling 60,000 images. 3. ImageNet-1K (Deng et al., 2009) covers 1,000 object classes, with 1.28M training, 50,000 validation, and 100,000 test images. Images are typically resized and cropped to 224×224 for processing. 4. WikiText-103 (Merity et al., 2016) comprises over 100 million tokens extracted from verified Wikipedia articles. |
| Dataset Splits | Yes | The dataset is split into 50,000 training and 10,000 test images (for CIFAR-10). Each class includes 500 training and 100 test images (for CIFAR-100). 1.28M training, 50,000 validation, and 100,000 test images (for ImageNet-1K). |
| Hardware Specification | Yes | All experiments are conducted on NVIDIA Tesla A100 GPUs with the following configuration: Model: NVIDIA A100 80GB; Memory: 80GB HBM2e; Memory Bandwidth: 2.0 TB/s (higher than the 40GB version); TDP: 400W (PCIe: 300W); Peak FP32 Performance: 19.5 TFLOPS (same as 40GB); Peak FP16 Performance: 312 TFLOPS (same as 40GB). |
| Software Dependencies | No | RigL: We use cuSPARSE... DSB and Pixelated BFly: Both methods yield block-sparse weight matrices. We use the Triton-based library from Pixelated BFly (Dao et al., 2021)... Our PyTorch implementation does not exploit CUDA kernel optimizations... |
| Experiment Setup | Yes | Table 3: Configuration of the CIFAR-10 and CIFAR-100 experiments with MLP-Mixer. Table 4: Configuration of the CIFAR-10 and CIFAR-100 experiments with ViT-Small. Table 5: Configuration of the ImageNet experiments with ViT-Base and MLP-Mixer. Table 6: Configuration of the ImageNet experiments with ViT-Large and ViT-Huge. Table 7: Configuration of the WikiText-103 experiments with GPT-2 Small. |
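For readers unfamiliar with the sparsity pattern the paper's title refers to, the sketch below illustrates one way a diagonal sparsity mask can be constructed and applied to a dense weight matrix. This is not the authors' implementation (their code is in the linked DynaDiag repository); the function name, wrapped-diagonal layout, and chosen offsets here are hypothetical, purely for illustration.

```python
import numpy as np

def diagonal_sparsity_mask(n_rows, n_cols, offsets):
    """Boolean mask keeping only the given (wrapped) diagonals of a matrix.

    Each offset k selects entries (i, (i + k) mod n_cols), so every kept
    diagonal has full length n_rows. Illustrative only, not the paper's layout.
    """
    mask = np.zeros((n_rows, n_cols), dtype=bool)
    rows = np.arange(n_rows)
    for off in offsets:
        cols = (rows + off) % n_cols  # wrap around so diagonals never truncate
        mask[rows, cols] = True
    return mask

# Keep 2 of 8 possible diagonals in an 8x8 matrix -> 75% sparsity
mask = diagonal_sparsity_mask(8, 8, offsets=[0, 3])
sparsity = 1.0 - mask.mean()
print(f"sparsity: {sparsity:.2f}")  # sparsity: 0.75

# Masking a dense weight matrix yields the structured-sparse weights
w = np.random.randn(8, 8)
w_sparse = w * mask
```

Because every surviving entry lies on a known diagonal, such a matrix can be stored as a small dense array of diagonals, which is what makes the structure amenable to efficient kernels (in contrast to unstructured sparsity).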