Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations
Authors: Yichi Zhang, Ritchie Zhao, Weizhe Hua, Nayun Xu, G. Edward Suh, Zhiru Zhang
ICLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments indicate that PG achieves excellent results on CNNs, including statically compressed mobile-friendly networks such as ShuffleNet. Compared to the state-of-the-art prediction-based quantization schemes, PG achieves the same or higher accuracy with 2.4× less compute on ImageNet. PG furthermore applies to RNNs. Compared to 8-bit uniform quantization, PG obtains a 1.2% improvement in perplexity per word with 2.7× computational cost reduction on LSTM on the Penn Tree Bank dataset. |
| Researcher Affiliation | Academia | Yichi Zhang Cornell University EMAIL Ritchie Zhao Cornell University EMAIL Weizhe Hua Cornell University EMAIL Nayun Xu Cornell Tech EMAIL G. Edward Suh Cornell University EMAIL Zhiru Zhang Cornell University EMAIL |
| Pseudocode | No | No pseudocode or algorithm blocks found. The paper describes the method using equations and explanatory text. |
| Open Source Code | No | We will release the source code on the authors' website. |
| Open Datasets | Yes | We evaluate PG using ResNet-20 (He et al., 2016a) and ShiftNet-20 (Wu et al., 2018a) on CIFAR-10 (Krizhevsky & Hinton, 2009), and ShuffleNet V2 (Ma et al., 2018) on ImageNet (Deng et al., 2009). We also test an LSTM model (Hochreiter & Schmidhuber, 1997) on the Penn Tree Bank (PTB) (Marcus et al., 1993) corpus. |
| Dataset Splits | No | The paper mentions training details but does not provide explicit dataset splits for train/validation/test. For example, "On CIFAR-10, the batch size is 128, and the models are trained for 200 epochs." does not specify the splits. |
| Hardware Specification | Yes | All experiments are conducted on Tensorflow (Abadi et al., 2016) with NVIDIA GeForce 1080Ti GPUs. One of the Titan Xp GPUs used for this research was donated by the NVIDIA Corporation. Intel Xeon Silver 4114 CPU (2.20GHz). |
| Software Dependencies | No | All experiments are conducted on Tensorflow (Abadi et al., 2016) with NVIDIA GeForce 1080Ti GPUs. We implement the SDDMM kernel in Python leveraging a high performance JIT compiler Numba (Lam et al., 2015). No version numbers for TensorFlow or Numba are specified. |
| Experiment Setup | Yes | On CIFAR-10, the batch size is 128, and the models are trained for 200 epochs. The initial learning rate is 0.1 and decays by a factor of 0.1 at epochs 100, 150, and 200. On ImageNet, the batch size is 512 and the models are trained for 120 epochs. The learning rate decays linearly from an initial value of 0.5 to 0. The number of hidden units in the LSTM cell is set to 300, and the number of layers is set to 1. The full bitwidth B... The prediction bitwidth Bhb... The penalty factor σ... The gating target δ... The coefficient α in the backward pass... |
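The learning-rate schedules quoted in the Experiment Setup row can be sketched as plain functions; this is a minimal illustration of the schedules as described, not the authors' released code, and the function names are hypothetical.

```python
def cifar10_lr(epoch, base_lr=0.1):
    """CIFAR-10 schedule as described: start at 0.1, decay by a
    factor of 0.1 at each of epochs 100, 150, and 200."""
    lr = base_lr
    for milestone in (100, 150, 200):
        if epoch >= milestone:
            lr *= 0.1
    return lr

def imagenet_lr(epoch, base_lr=0.5, total_epochs=120):
    """ImageNet schedule as described: linear decay from 0.5 to 0
    over the 120 training epochs."""
    return base_lr * (1.0 - epoch / total_epochs)
```

For example, `cifar10_lr(120)` yields 0.01 (one decay step applied), and `imagenet_lr(60)` yields 0.25 (halfway through the linear decay).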