Method for quantizing weight by channels

A technique for quantizing weights by channel, applied in the field of neural network acceleration, which can solve the problems of reduced model accuracy and insufficient utilization of low-bit data, and achieve the effects of improving utilization, increasing convergence speed, and fully utilizing

Pending Publication Date: 2021-12-07

合肥君正科技有限公司

Problems solved by technology

[0010] This application proposes a method for quantizing weights by channels. It aims to overcome the defects of the above-mentioned prior art: when existing low-bit models are quantized, the low-bit data is insufficiently utilized and concentrates on a small number of values, which reduces the precision of the model.

Embodiment Construction

[0032] In order to understand the technical content and advantages of the present invention more clearly, the present invention will be further described in detail in conjunction with the accompanying drawings.

[0033] As shown in Figure 1, the method for quantizing weights by channels of the present invention specifically includes the following steps:

[0034] S1, convolutional neural network training: train the model with a full-precision algorithm. The full-precision algorithm is an image classification algorithm that uses Resnet-50 as the neural network structure. Training yields a network for target classification, that is, the relevant parameters used in the model inference process. The relevant parameters include the weights of the convolution, the bias of the BiasAdd operator, and the gamma, beta, mean and variance of the BatchNormal operator;

[0035] S2, fine-tuning the quantized model:

[0036] S2.1, for the model obtained from S1, quantize the weight according to the r...

Abstract

The invention provides a method for quantizing weights by channels, in which the weights are quantized according to the number of output channels of the model. When the weight of a convolutional neural network is four-dimensional (height, width, input_channel and output_channel), the extreme values over the other three dimensions are computed separately for each output channel, and the data are then quantized to low bits according to the distribution of each channel. The invention aims to overcome the defects of the prior art, in which the low-bit data is not fully utilized and concentrates on a small number of values when an existing low-bit model is quantized, reducing the precision of the model.
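The per-channel step described in the abstract can be sketched roughly as follows. This is a minimal illustration in NumPy, not the patent's implementation: the function name `quantize_per_output_channel`, the 4-bit width, the asymmetric min/max mapping to unsigned codes, and the synthetic weights are all assumptions, since the visible text only says that extreme values are taken per output channel before quantizing to low bits.

```python
import numpy as np

def quantize_per_output_channel(weights, num_bits=4):
    """Quantize an (H, W, C_in, C_out) tensor with one min/max range per output channel.

    Assumption: asymmetric affine quantization to unsigned integer codes.
    The source does not state the exact mapping.
    """
    levels = 2 ** num_bits - 1
    # Extremes over height, width and input_channel, kept separately per output channel.
    w_min = weights.min(axis=(0, 1, 2), keepdims=True)
    w_max = weights.max(axis=(0, 1, 2), keepdims=True)
    # Guard against a constant channel, where max == min.
    scale = np.maximum(w_max - w_min, 1e-12) / levels
    codes = np.clip(np.round((weights - w_min) / scale), 0, levels).astype(np.int32)
    # Map the integer codes back to floats to inspect the quantization error.
    dequantized = codes * scale + w_min
    return codes, dequantized, scale, w_min

# Synthetic weights whose three output channels have very different spreads.
rng = np.random.default_rng(0)
channel_spread = np.array([1.0, 0.1, 0.01])
weights = rng.normal(size=(3, 3, 8, 3)) * channel_spread

codes, dequantized, scale, w_min = quantize_per_output_channel(weights)
```

Because each output channel gets its own range, every channel reaches both the lowest and the highest code, so channels with a narrow value range still spread across the available low-bit levels.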

Description

Technical field

[0001] The invention relates to the technical field of neural network acceleration, in particular to a method for channel-by-channel quantization of weights.

Background technique

[0002] In recent years, with the rapid development of science and technology, the era of big data has arrived. Deep learning uses deep neural network (DNN) as a model, and has achieved remarkable results in many key areas of artificial intelligence, such as image recognition, reinforcement learning, and semantic analysis. As a typical DNN structure, convolutional neural network (CNN) can effectively extract hidden layer features of images and accurately classify images. It has been widely used in the field of image recognition and detection in recent years.

[0003] In particular, weights are quantized according to the global extremum: first obtain the extremum of the entire weight from the weights and then quantize the weights to low bits according to this value.

[0004] Howeve...
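The global-extremum scheme of paragraph [0003] can be illustrated with a short NumPy sketch. The names `quantize_global` and `levels_used_per_channel`, the 4-bit width, the asymmetric min/max mapping, and the synthetic weights are assumptions made for illustration; the source only states that a single extremum of the whole weight tensor drives the quantization.

```python
import numpy as np

def quantize_global(weights, num_bits=4):
    """Quantize the whole tensor with a single min/max range (assumed affine mapping)."""
    levels = 2 ** num_bits - 1
    w_min, w_max = weights.min(), weights.max()
    # Guard against a constant tensor, where max == min.
    scale = max(w_max - w_min, 1e-12) / levels
    codes = np.clip(np.round((weights - w_min) / scale), 0, levels).astype(np.int32)
    return codes, scale, w_min

def levels_used_per_channel(codes):
    """Count how many distinct integer codes each output channel actually uses."""
    return [int(np.unique(codes[..., c]).size) for c in range(codes.shape[-1])]

# Synthetic (H, W, C_in, C_out) weights whose output channels differ widely in spread.
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 3, 8, 3)) * np.array([1.0, 0.1, 0.01])

codes, scale, w_min = quantize_global(weights)
used = levels_used_per_channel(codes)
print("distinct codes per output channel:", used)
```

With one shared range, the channel with the widest spread sets the step size, so a narrow channel collapses onto only a few codes. This is the underutilization of low-bit data that the application describes as the defect of the prior art.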

Application Information

IPC(8): G06N3/08, G06N3/04, G06F17/18

CPC: G06N3/08, G06F17/18, G06N3/045, Y02D10/00

Inventor: 张东

Owner: 合肥君正科技有限公司
