Sparse Neural Architectures via Deterministic Ramanujan Graphs

Authors: Suryam Arnav Kalra, Arindam Biswas, Pabitra Mitra, Biswajit Basu

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The goal of our experiments is to study the effectiveness of deterministic Ramanujan graph based sparse network construction. The datasets used for the experiments are CIFAR-10 and CIFAR-100 (Krizhevsky, 2009). The experiments are performed over a variety of architectures, including VGG13, VGG16, VGG19 (Simonyan & Zisserman, 2014), AlexNet (Krizhevsky et al., 2012), ResNet18, and ResNet34 (He et al., 2016), to show the robustness of our method. We proceed in two parts. In the first part, we prune the fully connected layers by replacing them with sparse Ramanujan graphs, which is applicable to the VGG13, VGG19, and AlexNet architectures. In the second part, we prune the whole network, including the convolutional and fully connected layers, which is applicable to all the architectures considered in our experiments. The performance of the fully dense and the pruned networks is compared in each case. Finally, we compare the performance of our method against various sparse neural networks obtained by state-of-the-art pruning-at-initialisation algorithms for the VGG16 and ResNet34 architectures. Training parameters for all of the architectures are the same and are summarized in Table 1. We report accuracy on a randomly split 16% test set for all the experiments.
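The pruning described in the first part replaces a dense fully connected layer with a fixed sparse bipartite graph. A minimal sketch, assuming a simple deterministic circulant biregular mask as a stand-in for the paper's actual Ramanujan graph construction (all function names here are hypothetical, not from the paper):

```python
import numpy as np

def circulant_biregular_mask(n_out, n_in, d):
    # Deterministic bipartite mask: input unit j connects to outputs
    # j, j+1, ..., j+d-1 (mod n_out), so every input unit has exactly
    # d outgoing edges. A Ramanujan construction would instead pick an
    # expander with near-optimal spectral gap; the masking mechanics
    # are the same.
    mask = np.zeros((n_out, n_in), dtype=np.float32)
    for j in range(n_in):
        for k in range(d):
            mask[(j + k) % n_out, j] = 1.0
    return mask

def sparse_fc_forward(x, W, b, mask):
    # Forward pass of a fully connected layer pruned by a fixed binary
    # mask: only the masked-in weights participate in the computation.
    return (W * mask) @ x + b
```

With `n_in = n_out = n`, the mask keeps `d * n` of the `n * n` weights, i.e. a density of `d / n`, so small `d` gives a highly sparse layer.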
Researcher Affiliation | Collaboration | Suryam Arnav Kalra (EMAIL), Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur; Arindam Biswas (EMAIL), Polynom, Paris, France; Pabitra Mitra (EMAIL), Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur; Biswajit Basu (EMAIL), School of Engineering, Trinity College Dublin, Dublin 2, Ireland
Pseudocode | No | The paper provides detailed mathematical formulations and construction steps (e.g., in Sections 4.2 and 4.4), but it does not present any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing code, nor does it include a link to a code repository. The OpenReview link is for the peer-review process, not code.
Open Datasets | Yes | The datasets used for the experiments are CIFAR-10 and CIFAR-100 (Krizhevsky, 2009).
Dataset Splits | Yes | We report accuracy on a randomly split 16% test set for all the experiments.
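The paper only states the 16% test fraction, not how the split is drawn. A minimal sketch of one plausible realization (function name and fixed seed are assumptions, not from the paper):

```python
import numpy as np

def train_test_split_indices(n, test_frac=0.16, seed=0):
    # Shuffle all sample indices once, then carve off the first
    # test_frac fraction as the held-out test set.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_test = int(round(n * test_frac))
    return idx[n_test:], idx[:n_test]  # (train indices, test indices)
```

For example, on the 50,000 CIFAR training images this yields 42,000 training and 8,000 test indices with no overlap.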
Hardware Specification | No | The paper mentions running experiments but does not specify any particular hardware, such as CPU or GPU models or memory specifications.
Software Dependencies | No | The paper lists 'Weight Initialization: Kaiming Uniform' in Table 1, which is a method, not a specific software library with a version number. No other software dependencies with versions are mentioned.
Experiment Setup | Yes | Training parameters for all of the architectures are the same and are summarized in Table 1.

Table 1: Training parameters for the experiment
Epochs: 200
Train Batch Size: 256
Test Batch Size: 128
Learning Rate: 0.1
LR Decay, Epoch: 10x at [100, 150]
Optimizer: SGD
Weight Decay: 0.0005
Momentum: 0.9
Weight Initialization: Kaiming Uniform
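The learning-rate row of Table 1 ("10x decay at epochs 100 and 150") can be sketched as a step schedule; the function name is hypothetical, but the numbers match Table 1:

```python
def learning_rate(epoch, base_lr=0.1, decay=10.0, milestones=(100, 150)):
    # Step schedule from Table 1: start at 0.1 and divide the learning
    # rate by 10 at each milestone epoch (100 and 150).
    drops = sum(epoch >= m for m in milestones)
    return base_lr / decay ** drops
```

So training runs at 0.1 for epochs 0-99, 0.01 for epochs 100-149, and 0.001 for epochs 150-199.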