A Quantization Method for Deep Neural Networks Based on Elastic Significant Bits

A deep neural network quantization method based on elastic significant bits, applied in the field of deep neural network quantization. It addresses difficulties with the step function, decreased precision, and reduced computational efficiency of quantized models, and it achieves efficient multiplication, improved accuracy, and low quantization loss.
CN111768002B · Active · Publication Date: 2021-06-22 · NANKAI UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NANKAI UNIV
Publication Date
2021-06-22


Abstract

The present invention provides a deep neural network quantization method based on elastic significant bits. It quantizes fixed-point or floating-point numbers into quantized values with elastic significant bits, discards the redundant mantissa part, and uses a feasible solution method to quantitatively evaluate the difference between the distribution of the quantized values and that of the original data. Because the number of significant bits is flexible, the distribution of quantized values can cover a series of bell-shaped distributions from long-tailed to uniform, adapting to the weight/activation distributions of DNNs and thereby ensuring low precision loss. Multiplication can be implemented in hardware as multiple shift-and-add operations, which improves the overall efficiency of the quantized model. The distribution difference function quantitatively estimates the quantization loss caused by different quantization schemes, so the optimal scheme can be selected under different conditions, achieving lower quantization loss and improving the accuracy of the quantized model.
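The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure: the function names, the rounding rule (keep the leading 1 plus `k` fraction bits), and the use of mean squared error as a stand-in for the distribution difference function are all assumptions made for illustration, since this excerpt does not give the exact number format or loss function.

```python
import math


def round_to_significant_bits(x: float, k: int) -> float:
    """Keep the leading 1 and k fraction bits of |x|; round away the rest.

    Hypothetical helper: k = 0 yields powers of two, larger k yields
    values that are closer to uniformly spaced.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(abs(x))        # abs(x) == m * 2**e with 0.5 <= m < 1
    scale = 2 ** (k + 1)             # leading bit plus k fraction bits
    q = round(m * scale) / scale     # mantissa truncated to k + 1 bits
    return math.copysign(q * 2.0 ** e, x)


def shift_add_multiply(a: int, w: int) -> int:
    """Multiply a by a non-negative integer w using only shifts and adds.

    One add is issued per set bit of w, so a weight with k + 1
    significant bits needs at most k + 1 adds.
    """
    acc = 0
    shift = 0
    while w:
        if w & 1:
            acc += a << shift
        w >>= 1
        shift += 1
    return acc


def quantization_mse(values, k: int) -> float:
    """Mean squared error between values and their k-bit roundings.

    Assumed stand-in for the distribution difference function.
    """
    errors = [(v - round_to_significant_bits(v, k)) ** 2 for v in values]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    weights = [0.8125, 0.3, -0.55]
    for k in (0, 1, 3):
        rounded = [round_to_significant_bits(w, k) for w in weights]
        print(k, rounded, quantization_mse(weights, k))
    print(shift_add_multiply(7, 0b1100))  # two set bits, so two adds
```

In this sketch, comparing `quantization_mse` across candidate values of `k` plays the role of choosing among quantization schemes; a real implementation would also have to account for the total bit budget, which is omitted here.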

Description

Technical field

[0001] The invention belongs to the technical field of deep neural network compression, and in particular relates to a deep neural network quantization method based on elastic significant bits.

Background art

[0002] Quantization of deep neural networks (DNNs) is an effective way to compress them: it can significantly improve their computational efficiency and allows a network to be deployed on resource-limited edge computing platforms. One of the most common ways to quantize a DNN is to project high-precision floating-point values onto low-precision quantized values. For example, a DNN with 32-bit floating-point weights can achieve 32x model size compression by replacing each weight with a single bit, and can even reduce complex multiplications to simple bit operations in hardware. Therefore, with fewer bits per value, DNN quantization can significantly reduce the computational scale or memory footprint, thereby improving computational eff...
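The 32x figure mentioned in the background can be checked with a small sketch. The helper names below are hypothetical, and sign-based binarization is only one assumed way of replacing a weight with a single bit; the text does not specify which binarization rule is meant.

```python
def binarize(weights):
    """Map each weight to one sign bit: 1 for w >= 0, 0 for w < 0."""
    return [1 if w >= 0 else 0 for w in weights]


def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, eight weights per byte."""
    packed = bytearray((len(bits) + 7) // 8)
    for i, b in enumerate(bits):
        if b:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def compression_ratio(num_weights, original_bits=32, quantized_bits=1):
    """Ratio of storage before quantization to storage after it."""
    return (num_weights * original_bits) / (num_weights * quantized_bits)


if __name__ == "__main__":
    weights = [0.3, -1.2, 0.0, -0.1] * 16     # 64 illustrative weights
    packed = pack_bits(binarize(weights))
    print(len(weights) * 4, "bytes as float32 ->", len(packed), "bytes packed")
    print("compression ratio:", compression_ratio(len(weights)))
```

Sixty-four 32-bit weights occupy 256 bytes, while their packed sign bits occupy 8 bytes, which is the 32x reduction described above.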

Claims
