Sparse Neural Architectures via Deterministic Ramanujan Graphs
Authors: Suryam Arnav Kalra, Arindam Biswas, Pabitra Mitra, Biswajit Basu
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The goal of our experiments is to study the effectiveness of deterministic Ramanujan graph based sparse network construction. The datasets used for the experiments are CIFAR-10 and CIFAR-100 (Krizhevsky, 2009). The experiments are performed over a variety of architectures including VGG13, VGG16, VGG19 (Simonyan & Zisserman, 2014), AlexNet (Krizhevsky et al., 2012), ResNet18 and ResNet34 (He et al., 2016) to show the robustness of our method. We proceed in two parts. In the first part, we prune the fully connected layers by replacing them with a sparse Ramanujan graph, which is applicable to the VGG13, VGG19 and AlexNet architectures. In the second part, we prune the whole network, including the convolutional and fully connected layers, which is applicable to all the architectures considered in our experiment. The performance of the fully dense and the pruned networks is compared in each case. Finally, we compare the performance of our method against various sparse neural networks obtained by state-of-the-art pruning-at-initialisation algorithms for the VGG16 and ResNet34 architectures. Training parameters for all of the architectures are the same and are summarized in Table 1. We report accuracy on a randomly split 16% test set for all the experiments. |
| Researcher Affiliation | Collaboration | Suryam Arnav Kalra (EMAIL), Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur; Arindam Biswas (EMAIL), Polynom, Paris, France; Pabitra Mitra (EMAIL), Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur; Biswajit Basu (EMAIL), School of Engineering, Trinity College Dublin, Dublin 2, Ireland |
| Pseudocode | No | The paper provides detailed mathematical formulations and construction steps (e.g., in Section 4.2 and 4.4), but it does not present any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing code, nor does it include a link to a code repository. The Open Review link is for the peer review process, not code. |
| Open Datasets | Yes | The datasets used for the experiments are CIFAR-10 and CIFAR-100 (Krizhevsky, 2009). |
| Dataset Splits | Yes | We report accuracy on a randomly split 16% test set for all the experiments. |
| Hardware Specification | No | The paper mentions running experiments but does not specify any particular hardware components like CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper lists 'Weight Initialization Kaiming Uniform' in Table 1, which is a method, not a specific software or library with a version number. No other specific software dependencies with versions are mentioned. |
| Experiment Setup | Yes | Training parameters for all of the architectures are the same and are summarized in Table 1. Table 1 (Training Parameters): Epochs 200; Train Batch Size 256; Test Batch Size 128; Learning Rate 0.1; LR Decay 10x at epochs [100, 150]; Optimizer SGD; Weight Decay 0.0005; Momentum 0.9; Weight Initialization Kaiming Uniform. |
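The quoted setup prunes fully connected layers by replacing their dense connectivity with a sparse bipartite graph mask. The sketch below illustrates only that masking step, using a simple deterministic circulant d-regular biadjacency as a stand-in; this is not the paper's Ramanujan construction (which targets the spectral bound |λ| ≤ 2√(d−1)), and the function name `circulant_biadjacency` is an illustrative assumption.

```python
import numpy as np

def circulant_biadjacency(n, d):
    """Deterministic d-regular bipartite biadjacency mask of shape (n, n).

    Left node i connects to right nodes (i + k) mod n for k = 0..d-1.
    NOTE: a simple circulant stand-in for illustration only, NOT the
    paper's Ramanujan graph construction.
    """
    mask = np.zeros((n, n), dtype=np.float32)
    for i in range(n):
        for k in range(d):
            mask[i, (i + k) % n] = 1.0
    return mask

# Mask a dense fully connected layer's weight matrix: only the d*n
# edges of the bipartite graph survive, the rest are zeroed out.
rng = np.random.default_rng(0)
n, d = 64, 4
W = rng.standard_normal((n, n)).astype(np.float32)
mask = circulant_biadjacency(n, d)
W_sparse = W * mask

# Density of the pruned layer is d/n (here 4/64 = 6.25%).
density = mask.sum() / mask.size
```

In the actual method the mask would come from a deterministic Ramanujan graph rather than a circulant, but the layer-masking mechanics are the same: elementwise multiplication of the weight matrix by a fixed 0/1 biadjacency.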
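The learning-rate schedule in Table 1 (base rate 0.1, decayed 10x at epochs 100 and 150) can be written as a small step-decay helper; `learning_rate` is a hypothetical name for illustration, not code from the paper.

```python
def learning_rate(epoch, base_lr=0.1, milestones=(100, 150), gamma=0.1):
    """Step LR schedule matching Table 1: multiply the rate by `gamma`
    (i.e., divide by 10) once each listed milestone epoch is reached."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

With these defaults the rate is 0.1 for epochs 0-99, 0.01 for epochs 100-149, and 0.001 from epoch 150 onward, matching the "10x, [100, 150]" entry in Table 1.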