Deep neural network quantization method, system, device, and medium

A deep neural network quantization technology, applied in neural learning methods, biological neural network models, inference methods, etc. It addresses problems such as reduced inference accuracy after quantization, and achieves high efficiency, simple model evaluation criteria, and strong universality.

Status: Inactive; Publication Date: 2022-01-18

CHENGDU SHULIANYUNSUAN TECH CORP

Cites: 0; Cited by: 2


AI Technical Summary

Problems solved by technology

Compared with fp16 quantization, int8 quantization uses fewer bits and therefore gives faster inference. However, int8 quantization maps the feature-layer values and the weight parameters uniformly onto the integer interval [-127, 127], so these parameters lose more precision, and the inference accuracy of some models decreases after int8 quantization.
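The precision loss described above can be illustrated with a minimal sketch. It assumes symmetric, per-tensor scaling (the largest absolute value maps to 127), which is one common way to map values uniformly onto [-127, 127]; the source does not specify the scaling scheme. The function names and the sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats uniformly onto the integer interval [-127, 127].

    Assumption (not from the source): symmetric per-tensor scaling,
    where the largest absolute value is mapped to 127.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integer codes."""
    return [qi * scale for qi in q]


# Hypothetical weights: one is much smaller than the quantization step.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
errors = [abs(w - r) for w, r in zip(weights, restored)]
print(q, scale, errors)
```

In this toy case the step size is about 0.01, so the weight 0.003 is rounded to the integer 0 and its value is lost entirely. Errors of this kind, accumulated over many layers, are what can reduce inference accuracy.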



Examples


Embodiment 1

[0048] Please refer to Figure 1, which is a schematic flow chart of a deep neural network quantization method. Embodiment 1 of the present invention provides a deep neural network quantization method, the method comprising:

[0049] Obtain a first deep neural network comprising n neural network layers, the neural network layers being divided into quantized layers and non-quantized layers; obtain the accuracy of the first deep neural network; and set the highest acceptable accuracy-loss threshold for the quantized deep neural network;

[0050] Based on the accuracy and the threshold, use binary search (dichotomy) to find all quantized layers among the n neural network layers, and quantize the quantized layers thus obtained.
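The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patent's actual procedure, whose details are truncated in this excerpt. It assumes the layers are held in a fixed order, that the candidate quantized set is always a prefix of that order, and that accuracy does not increase as more layers are quantized, so binary search over the prefix length is valid. The names `search_quantized_layers`, `evaluate`, and the per-layer costs are hypothetical.

```python
from typing import Callable, List, Set


def search_quantized_layers(
    n_layers: int,
    baseline_acc: float,
    max_loss: float,
    evaluate: Callable[[Set[int]], float],
) -> List[int]:
    """Binary-search the largest prefix of layers that can be quantized.

    Assumptions (not from the source): candidates are prefixes
    {0, ..., k-1} of a fixed layer order, and accuracy is non-increasing
    in k. `evaluate` returns model accuracy for a given quantized set.
    """
    lo, hi = 0, n_layers  # k = 0 (nothing quantized) is always acceptable
    while lo < hi:
        mid = (lo + hi + 1) // 2
        loss = baseline_acc - evaluate(set(range(mid)))
        if loss <= max_loss:
            lo = mid       # quantizing the first `mid` layers is acceptable
        else:
            hi = mid - 1   # too much accuracy lost; try fewer layers
    return list(range(lo))


# Toy stand-in for a real evaluation: each quantized layer costs some accuracy.
BASELINE = 0.90
COSTS = [0.001, 0.002, 0.001, 0.010, 0.020, 0.001]  # hypothetical values
calls = []


def toy_evaluate(quantized: Set[int]) -> float:
    calls.append(len(quantized))
    return BASELINE - sum(COSTS[i] for i in quantized)


chosen = search_quantized_layers(len(COSTS), BASELINE, 0.005, toy_evaluate)
print(chosen, len(calls))
```

With these toy numbers the search evaluates three candidate sets rather than all six prefixes, which illustrates how a dichotomy can reduce the number of accuracy evaluations. The result is only the best choice within the searched family of layer sets, not a global optimum over all layer combinations.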

[0051] The following is a detailed introduction to this method in combination with specific examples and existing deep neural network optimization methods:

[0052] This embodiment first introduces the prior art relevant to the present invention; the purpose is to highlight the differe...

Embodiment 2

[0131] Please refer to Figure 2, which is a schematic diagram of the composition of a deep neural network quantization system. Embodiment 2 of the present invention provides a deep neural network quantization system, the system comprising:

[0132] A network-accuracy and accuracy-loss-threshold acquisition unit, used to obtain a first deep neural network comprising n neural network layers, the neural network layers being divided into quantized layers and non-quantized layers; to obtain the accuracy of the first deep neural network; and to set the highest acceptable accuracy-loss threshold for the quantized deep neural network;

[0133] A quantization unit, used to find all quantized layers among the n neural network layers by binary search (dichotomy), based on the accuracy and the threshold, and to quantize the quantized layers thus obtained.

Embodiment 3

[0135] Embodiment 3 of the present invention provides a deep neural network quantization device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the steps of the deep neural network quantization method.


Abstract

The invention discloses a deep neural network quantization method, system, device, and medium, and relates to the field of neural network quantization. The method comprises: acquiring a first deep neural network that comprises n neural network layers, the layers being divided into quantized layers and non-quantized layers, together with the accuracy of the first deep neural network; setting the highest acceptable accuracy-loss threshold for the quantized deep neural network; and, on the basis of the accuracy and the threshold, searching for all quantized layers among the n neural network layers by binary search (dichotomy) and quantizing the quantized layers thus obtained. The invention reduces the time complexity of quantization while also finding a locally optimal combination of quantized layers.

Description

Technical field

[0001] The present invention relates to the field of neural network quantization, and in particular to a deep neural network quantization method, system, device, and medium.

Background

[0002] Deep neural network models are widely used in machine vision tasks such as image classification and object detection, and have achieved great success. However, because storage and computing resources are limited, storing and running deep neural network models on mobile terminals or embedded devices remains very challenging, so compressing deep neural networks and making them lightweight is an urgent problem. In recent years, researchers have produced many results on the compression of deep neural networks, and quantization is one of the compression methods.

[0003] Generally, deep neural networks use parameters represented as 32-bit floating-point (float32) numbers to perform calculations such as convolu...


Application Information


IPC(8): G06N3/08, G06N3/063, G06N3/04

CPC: G06N3/08, G06N3/04, G06N5/04, G06F18/24

Inventor: not disclosed (inventor name withheld from publication)

Owner: CHENGDU SHULIANYUNSUAN TECH CORP
