Neural network quantization method, device, storage medium and electronic equipment

A neural network quantization technology, applied in the field of neural network quantization methods, devices, storage media and electronic equipment. It addresses problems such as an overly large quantization bit width slowing down the running speed of the neural network processor chip, and achieves the effect of improving data transmission efficiency.

Active Publication Date: 2022-05-24
SHENZHEN MICROBT ELECTRONICS TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] For the quantization of a neural network, the larger the quantization bit width of the quantized fixed-point integers, the higher the accuracy of the neural network model; but an overly large quantization bit width also slows down the running speed of the neural network processor chip. By contrast, the operations inside the neural network processor chip place no restrictive requirements on the quantization bit width.
[0005] Therefore, for the quantization of neural networks, there is still room to optimize the quantized fixed-point integers so as to improve quantization accuracy.




Embodiment Construction

[0040] In order to make the objectives, technical solutions and advantages of the present disclosure clearer, the present disclosure is described in further detail below with reference to the accompanying drawings and embodiments.

[0041] Commonly used quantization methods generally adopt aligned quantization bit widths such as 16 bits (2^4), 8 bits (2^3), 4 bits (2^2), 2 bits (2^1) or 1 bit (2^0), and generally do not choose a non-2^n (i.e., unaligned) quantization bit width, which would reduce the efficiency of bus access. Inside the computing unit of the NPU, however, these differences can be ignored: for example, 8-bit data can be converted into 9-bit data for operations, and 9-bit operations obviously have higher precision than 8-bit operations. Based on this consideration, the bit width of the quantized data can be increased inside the NPU before the neural network layer operation is performed, so as to improve the operation accuracy.
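The precision claim in [0041] can be checked numerically: quantizing the same data at 9 bits yields a smaller worst-case round-off error than at 8 bits. The symmetric linear quantizer below is an assumption for illustration; the patent does not prescribe a particular quantizer.

```python
import numpy as np

def quant_error(x: np.ndarray, bits: int) -> float:
    """Worst-case round-off error of symmetric linear quantization at `bits`."""
    qmax = 2 ** (bits - 1) - 1
    scale = float(np.max(np.abs(x))) / qmax     # step size of the grid
    q = np.round(x / scale)                     # fixed-point representation
    return float(np.max(np.abs(q * scale - x))) # max reconstruction error

rng = np.random.default_rng(0)
x = rng.standard_normal(1000).astype(np.float32)
# One extra bit roughly halves the quantization step, so the error drops.
assert quant_error(x, 9) < quant_error(x, 8)
```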



Abstract

The present disclosure relates to a neural network quantization method, device, storage medium, and electronic equipment. The quantization method includes: receiving input data, output by the previous neural network layer, that lies in a first value range interval corresponding to a first quantization bit width; mapping the input data to a second value range interval corresponding to at least a second quantization bit width to obtain quantized input data; performing operations according to the quantization bit width corresponding to the quantized input data to obtain quantized output data; and inversely mapping the quantized output data to the first value range interval corresponding to the first quantization bit width to obtain the output data. The disclosure improves the inference accuracy of the quantized neural network without reducing the data transmission efficiency between hardware.
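The four steps of the abstract can be sketched as below. All function names and the 8-bit to 9-bit choice are assumptions for illustration; the abstract only requires the second bit width's range to at least contain the capability of the first.

```python
import numpy as np

B1, B2 = 8, 9                                      # first / second bit widths
SCALE = (2 ** (B2 - 1) - 1) / (2 ** (B1 - 1) - 1)  # 255 / 127

def map_up(x_int8: np.ndarray) -> np.ndarray:
    """Step 2: map input from the first value range interval to the second."""
    return np.round(x_int8.astype(np.float64) * SCALE).astype(np.int32)

def layer_op(x_int9: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Step 3: perform the layer operation at the second (wider) bit width."""
    return x_int9 * w                              # toy elementwise layer

def map_down(y: np.ndarray) -> np.ndarray:
    """Step 4: inversely map the output back to the first value range."""
    return np.clip(np.round(y / SCALE), -128, 127).astype(np.int8)

x = np.array([10, -20], dtype=np.int8)             # from the previous layer
out = map_down(layer_op(map_up(x), np.array([1, 2])))
print(out)  # [ 10 -40]
```

Because the mapping and its inverse share the same scale, the layer's result round-trips back to the 8-bit range while the intermediate computation enjoys the 9-bit headroom.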

Description

Technical field

[0001] The present disclosure relates to the field of computer technology, and in particular to a quantization method, device, storage medium and electronic device for a neural network.

Background technique

[0002] Neural networks generally use FP32 floating-point numbers for inference and training. Although floating-point operations have higher precision, they consume substantial computing resources, resulting in low efficiency of the neural network.

[0003] Quantization is a method of converting floating-point parameters into fixed-point parameters, trading operation precision for operation speed. Within the range of the quantized parameters, converting floating-point operations to integer fixed-point operations for forward inference does not significantly reduce the accuracy of the neural network model, while it speeds up execution and significantly reduces the resource consumption of neural network inference and training, re...
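The float-to-fixed-point conversion described in [0003] can be sketched with a symmetric linear quantizer (an assumption; the patent does not fix a particular scheme):

```python
import numpy as np

def quantize(x_fp32: np.ndarray, bits: int = 8):
    """Convert float32 parameters to fixed-point integers plus a scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = float(np.max(np.abs(x_fp32))) / qmax   # grid step size
    q = np.clip(np.round(x_fp32 / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from the fixed-point form."""
    return q.astype(np.float32) * scale

x = np.array([0.5, -1.2, 3.3], dtype=np.float32)
q, s = quantize(x)
# Round-trip error is bounded by half a quantization step.
assert np.all(np.abs(dequantize(q, s) - x) <= s / 2 + 1e-6)
```

Inference then runs on the integer tensor `q`, with the per-tensor `scale` folded into the surrounding operations, which is what makes the fixed-point forward pass cheap.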

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06N3/063, G06N3/04
CPC: G06N3/063, G06N3/045
Inventors: 徐祥, 艾国, 杨作兴, 房汝明, 向志宏
Owner SHENZHEN MICROBT ELECTRONICS TECH CO LTD