Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee
Authors: Jincheng Bai, Qifan Song, Guang Cheng
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Notably, our empirical results demonstrate that this variational procedure provides uncertainty quantification in terms of Bayesian predictive distribution and is also capable to accomplish consistent variable selection by training a sparse multi-layer neural network. |
| Researcher Affiliation | Academia | Jincheng Bai, Qifan Song, and Guang Cheng, all with the Department of Statistics, Purdue University, West Lafayette, IN 47906 (email addresses redacted). |
| Pseudocode | Yes | Algorithm 1 Variational inference for sparse BNN with normal slab distribution. |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | We evaluate the empirical performance of the proposed variational inference through simulation study and MNIST data application. |
| Dataset Splits | No | The paper does not explicitly provide details about a validation dataset split with percentages or sample counts. It mentions 'validation' in the context of 'validity of Theorem 4.1' and 'training/testing RMSE' but not as a separate data split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions PyTorch as a software package but does not provide version numbers for any software dependencies. |
| Experiment Setup | Yes | For all the numerical studies, we let σ₀² = 2; the choice of λ follows Theorem 4.1 (denoted by λ_opt): log(λ_opt⁻¹) = log(T) − 0.1r(L−1) log N − log(√(n·s)). The remaining details of implementation (such as initialization, choices of K, m and learning rate) are provided in the supplementary material. ... The sparsity levels specified for AGP are 30% and 5%, and for LOT are 10% and 5%, respectively for the two cases. ... using the same batch size |
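The λ_opt expression quoted in the setup row can be evaluated directly. The sketch below simply transcribes that formula; the function name, symbol interpretations (total parameter count T, rate factor r, depth L, width N, sample size n, sparsity level s), and the example values are illustrative assumptions, not taken from the paper:

```python
import math

def lambda_opt(T, r, L, N, n, s):
    """Evaluate the quoted hyperparameter rule from Theorem 4.1:
    log(1/lambda) = log(T) - 0.1 * r * (L - 1) * log(N) - log(sqrt(n * s)).
    Symbol roles (T, r, L, N, n, s) are assumed from context, not the paper.
    """
    log_inv_lambda = (
        math.log(T)
        - 0.1 * r * (L - 1) * math.log(N)
        - math.log(math.sqrt(n * s))
    )
    return math.exp(-log_inv_lambda)

# Illustrative values only (not reported in the paper).
lam = lambda_opt(T=10_000, r=1.0, L=3, N=100, n=500, s=50)
```

For any positive inputs the rule yields a positive prior inclusion probability λ, which is what the spike-and-slab construction in the paper requires.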