
Deep neural network model compression method based on asymmetric ternary weight quantization

A deep neural network model compression technique based on asymmetric ternary weight quantization. It addresses the problem that existing quantization methods limit the expressive ability of ternary weight networks, thereby improving recognition accuracy, reducing quantization loss, and enhancing expressiveness.

Inactive Publication Date: 2018-12-11
SUZHOU INST FOR ADVANCED STUDY USTC

AI Technical Summary

Problems solved by technology

[0006] In the selection of quantization methods, existing approaches assume that after training the positive and negative weights of the network follow the same distribution; this assumption greatly limits the expressive ability of a ternary weight network.


Image

  • Deep neural network model compression method based on asymmetric ternary weight quantization

Examples


Embodiment

[0059] Deep neural networks usually contain millions of parameters, which makes them difficult to deploy on devices with limited resources. However, most of a network's parameters are typically redundant, so the main purpose of this invention is to remove redundant parameters to achieve model compression. The technique is implemented in three main steps:

[0060] (1): Asymmetric ternary weight quantization process:

[0061] The asymmetric ternary weight quantization method quantizes the conventional floating-point network weights to ternary values during network training. Quantization is performed by threshold setting, according to the following formula:

[0062]
    w^t = +α_p,  if w > Δ_p
    w^t = 0,     if −Δ_n ≤ w ≤ Δ_p
    w^t = −α_n,  if w < −Δ_n

[0063] In the formula, Δ_p and Δ_n are the thresholds used in the quantization process, and any floating-point weight is assigned to one of the three ternary values according to the range it falls in. α_p and α_n are the corresponding scaling factors, which are used to reduce the loss caused by the quantization process.
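As a concrete illustration, the threshold-based asymmetric ternarization above can be sketched in NumPy. The threshold heuristic (a fraction of the per-sign extrema) and the scale choice (the mean of the weights mapped to each side) are illustrative assumptions, not the patent's exact formulas:

```python
import numpy as np

def asymmetric_ternary_quantize(w, t=0.05):
    """Quantize a float weight tensor to {-alpha_n, 0, +alpha_p}.

    Separate thresholds and scales are used for the positive and
    negative sides (the 'asymmetric' part). The threshold fraction
    t and the scale rule are illustrative heuristics.
    """
    delta_p = t * np.max(w)            # positive-side threshold (assumed heuristic)
    delta_n = t * np.abs(np.min(w))    # negative-side threshold (assumed heuristic)
    pos = w > delta_p
    neg = w < -delta_n
    # Scale each side by the mean magnitude of the weights it absorbs,
    # which reduces the L2 error introduced by quantization.
    alpha_p = w[pos].mean() if pos.any() else 0.0
    alpha_n = -w[neg].mean() if neg.any() else 0.0
    wq = np.zeros_like(w)
    wq[pos] = alpha_p
    wq[neg] = -alpha_n
    return wq, (delta_p, delta_n, alpha_p, alpha_n)

w = np.array([0.9, 0.4, 0.01, -0.02, -0.5, -1.2])
wq, params = asymmetric_ternary_quantize(w)
```

Note that because α_p and α_n are computed independently, the positive and negative weights need not share a distribution, which is exactly the asymmetry the invention exploits.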


Abstract

The invention discloses a deep neural network model compression method based on asymmetric ternary weight quantization. The method comprises the following steps: when a deep neural network is trained, before each forward computation the floating-point weights of each layer of the network are converted into asymmetric ternary values; the original floating-point weights are used in the parameter-update stage; and the trained deep neural network is compressed and stored. The redundant parameters of the deep neural network are removed and the network model is compressed, which effectively improves the recognition accuracy of the quantization method on large datasets.
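The train-time pattern described in the abstract — quantize before each forward pass, but update the full-precision master weights — can be sketched on a toy linear model. The loss, learning rate, and straight-through gradient here are illustrative assumptions, not the patent's exact training scheme:

```python
import numpy as np

def ternarize(w, t=0.05):
    # Simplified asymmetric ternarization: separate positive/negative
    # thresholds, each side scaled by the mean of the weights it absorbs.
    dp, dn = t * w.max(), t * abs(w.min())
    pos, neg = w > dp, w < -dn
    wq = np.zeros_like(w)
    if pos.any():
        wq[pos] = w[pos].mean()
    if neg.any():
        wq[neg] = w[neg].mean()   # mean of negative weights, already signed
    return wq

rng = np.random.default_rng(0)
w_float = rng.normal(size=4)             # full-precision master weights
x, y = rng.normal(size=(16, 4)), rng.normal(size=16)

lr = 0.05
for step in range(30):
    w_t = ternarize(w_float)             # 1) quantize before the forward pass
    pred = x @ w_t                       # 2) forward with ternary weights
    grad = 2 * x.T @ (pred - y) / len(y) # straight-through: gradient applied to
    w_float -= lr * grad                 # 3) ... the FLOAT weights, not w_t
```

Keeping the float weights as the update target lets small gradient steps accumulate even when they are too small to flip a ternary value; at deployment only the ternary weights and the two scale factors per layer need to be stored.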

Description

technical field

[0001] The invention relates to the technical field of convolutional neural network compression, and in particular to a deep neural network model compression method based on asymmetric ternary weight quantization.

Background technique

[0002] In recent years, with the rapid development of deep learning algorithms, deep neural networks have achieved state-of-the-art results in a range of machine learning tasks such as speech recognition, image classification, and natural language processing. However, a typical deep neural network usually has millions of parameters, making it difficult to deploy in embedded devices with limited storage and computing resources. How to compress deep neural network models has therefore become an important research direction in current deep learning. [0003] At present, there are two typical model compression methods. One is to optimize the structure of the network to reduce the number of network parameters. ICLR2016...

Claims


Application Information

IPC(8): G06N 3/08; H03M 7/30
CPC: G06N 3/08; H03M 7/30
Inventors: 吴俊敏 (Wu Junmin), 丁杰 (Ding Jie), 吴焕 (Wu Huan)
Owner: SUZHOU INST FOR ADVANCED STUDY USTC