A Quantization Method for Deep Neural Networks Based on Elastic Significant Bits
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NANKAI UNIV
- Publication Date
- 2021-06-22
Smart Images

Figure 1 
Figure 2 
Figure 3
Abstract
Description
technical field
[0001] The invention belongs to the technical field of deep neural network compression, in particular to a deep neural network quantization method based on elastic valid bits. Background technique
[0002] Deep Neural Networks (DNNs) quantization is an effective way to compress DNNs, which can significantly improve the computational efficiency of DNNs and enable the network to be deployed on edge computing platforms with limited resources. One of the most common ways to achieve quantization in DNNs is to project high-precision floating-point values into low-precision quantized values. For example, DNNs with 32-bit floating point can achieve 32x model size compression by replacing weights with only one bit, and even reduce complex multiplication operations to simple bit operations on hardware. Therefore, in the case of fewer bit values, DNN quantization can significantly reduce the computational scale or memory footprint, thereby improving computational eff...