Neural network low-bit quantization method

A neural network quantization method, applied in the field of data compression, that addresses the large memory footprint limiting the application of neural networks, and achieves strong practicability, improved quantization efficiency, and improved accuracy.

Pending Publication Date: 2021-02-19

BEIJING TSINGMICRO INTELLIGENT TECH CO LTD

6 Cites, 9 Cited by

Problems solved by technology

[0003] As a result, higher requirements are placed on the computing performance of the device, and the increasing network scale and power consumption have gradually become the main obstacles that limit the application of neural networks.

The ever-growing scale of neural networks requires more and more memory for computation, and larger network models also require greater bandwidth, which greatly limits the application of neural networks in embedded devices.

Embodiment Construction

[0023] The specific embodiments of the present invention will be described below in conjunction with the drawings.

[0024] As shown in Figure 1, an embodiment of the present invention is a neural network low-bit quantization method comprising steps S101 to S114.

[0025] S101: Obtain an initial neural network, and acquire the weight values and biases of its C channels, where each channel includes a convolutional layer.

[0026] The initial neural network can be applied to tasks such as image classification, target detection, and natural language processing. The initial neural network has already been trained and uses floating-point storage and operations; that is, each weight is originally represented as a float32 value.

[0027] Quantizing the initial neural network means converting floating-point operations into integer storage operations, which realizes the compression technology of the model's ...
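
As a rough illustration of moving from float32 storage to integer storage, the sketch below applies symmetric linear quantization to a few weights. The patent text is truncated at this point, so the scheme (scaling by the maximum absolute value, 8-bit width), the sample weights, and the names `quantize_symmetric` and `dequantize` are assumptions made for illustration, not the method claimed.

```python
from typing import List, Tuple


def quantize_symmetric(values: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Map floats to signed integers of the given bit width.

    Assumed scheme (not taken from the patent): scale = max|v| / qmax.
    """
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # float value of one integer step
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integers."""
    return [x * scale for x in q]


weights = [0.5, -1.0, 0.25, 0.0]   # hypothetical float32 weights
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
print(q, restored)
```

Each float32 weight occupies 4 bytes, while an 8-bit integer occupies 1 byte, so this kind of conversion reduces weight storage by roughly a factor of four, at the cost of a rounding error of at most half a quantization step per weight.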

Abstract

The invention relates to a neural network low-bit quantization method. The weight values of each channel of a neural network are quantized to low-bit fixed-point weights. The method also includes: obtaining an input quantization coefficient of the current convolution layer according to the target quantization threshold and the floating-point quantization interval; taking the input quantization coefficient of the next convolution layer as the output quantization coefficient of the current convolution layer; quantizing the input floating-point data to obtain input fixed-point data; quantizing the output floating-point data to obtain output fixed-point data; converting the scaling factor and the bias into a scaling factor fixed-point value and a bias fixed-point value, respectively; obtaining a quantized neural network according to the low-bit fixed-point weights, the input fixed-point data, the output fixed-point data, the scaling factor fixed-point value, and the bias fixed-point value; and applying the quantized neural network model to embedded devices.
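
The data flow in the abstract can be sketched for a single channel, with the convolution reduced to a dot product. The available text does not give the formulas, so everything concrete here is an assumption: the coefficient form `QMAX / threshold`, the 8-bit width, the 16 fractional bits, the thresholds, and all function and variable names are hypothetical and only show one plausible way the pieces fit together.

```python
from typing import List, Tuple

BITS = 8                      # assumed bit width
QMAX = 2 ** (BITS - 1) - 1    # 127
FRAC_BITS = 16                # assumed fractional bits for fixed-point values


def clamp(v: int) -> int:
    """Saturate to the signed low-bit range."""
    return max(-QMAX, min(QMAX, v))


def quantize_channel(weights: List[float]) -> Tuple[List[int], float]:
    """Quantize one channel's weights with its own coefficient (assumed symmetric)."""
    max_abs = max(abs(w) for w in weights) or 1.0
    coeff = QMAX / max_abs
    return [clamp(round(w * coeff)) for w in weights], coeff


def coefficient_from_threshold(threshold: float) -> float:
    """Assumed form: map [-threshold, threshold] onto [-QMAX, QMAX]."""
    return QMAX / threshold


def to_fixed(x: float) -> int:
    """Convert a float to a fixed-point integer with FRAC_BITS fractional bits."""
    return round(x * (1 << FRAC_BITS))


# Hypothetical single-channel layer and input.
weights = [0.5, -0.25, 1.0]
bias = 0.1
x = [1.0, 2.0, -0.5]
y_float = sum(w * v for w, v in zip(weights, x)) + bias

s_in = coefficient_from_threshold(2.0)   # input coefficient of the current layer
s_out = coefficient_from_threshold(1.0)  # next layer's input coefficient, reused as output coefficient

q_w, s_w = quantize_channel(weights)               # low-bit fixed-point weights
q_x = [clamp(round(v * s_in)) for v in x]          # input fixed-point data

scale_fixed = to_fixed(s_out / (s_in * s_w))       # scaling factor fixed-point value
bias_fixed = to_fixed(bias * s_out)                # bias fixed-point value

# Integer-only inference: accumulate, rescale, add bias, round, shift, saturate.
acc = sum(a * b for a, b in zip(q_w, q_x))
q_y = clamp((acc * scale_fixed + bias_fixed + (1 << (FRAC_BITS - 1))) >> FRAC_BITS)

print(y_float, q_y / s_out)   # float reference vs. dequantized integer result
```

Reusing the next layer's input coefficient as the current layer's output coefficient means the integer output can be fed to the next layer directly, without an intermediate conversion back to floating point.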

Description

Technical field

[0001] The invention relates to the field of data compression, in particular to a neural network low-bit quantization method.

Background technique

[0002] Neural network technology has achieved good results in tasks including image classification, target detection, and natural language processing. In order to improve recognition accuracy, the scale of neural network models has been increasing, and the complexity of the model has been increasing.

[0003] Subsequently, higher requirements have been placed on the computing performance of the equipment, and the increasing network scale and power consumption have gradually become the main obstacles limiting the application of neural networks. The ever-increasing neural network has led to the need for more and more memory when computing the neural network. At the same time, the increase of the network model also requires greater bandwidth, which greatly limits the application of neural networks in embedded devices.

Summary of the in...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06N3/04, G06N3/06

CPC: G06N3/06, G06N3/045

Inventor: 张书瑞, 欧阳鹏, 尹首一

Owner: BEIJING TSINGMICRO INTELLIGENT TECH CO LTD
