
Deep neural network model compression method based on asymmetric ternary weight quantization

A deep neural network model compression technique based on asymmetric ternary weight quantization. It addresses the problem that existing quantization methods limit the expressive ability of ternary weight networks, thereby improving recognition accuracy, reducing quantization loss, and enhancing expressiveness.

Inactive Publication Date: 2018-12-11
SUZHOU INST FOR ADVANCED STUDY USTC

AI Technical Summary

Problems solved by technology

[0006] In the selection of quantization methods, existing approaches assume that after training the positive and negative weights of the network follow the same distribution; this assumption greatly limits the expressive ability of a ternary weight network.


Image

  • Deep neural network model compression method based on asymmetric ternary weight quantization

Examples


Embodiment

[0059] Deep neural networks usually contain millions of parameters, which makes them difficult to deploy on devices with limited resources. However, most of a network's parameters are typically redundant, so the main purpose of this invention is to remove redundant parameters to achieve model compression. The technique is implemented in three main steps:

[0060] (1): Asymmetric ternary weight quantization process:

[0061] The asymmetric ternary weight quantization method quantizes the conventional floating-point network weights to ternary values during network training. Quantization is performed by threshold setting, according to the following formula:

[0062]
    w^t = +α_p,  if w > Δ_p
    w^t = 0,     if −Δ_n ≤ w ≤ Δ_p
    w^t = −α_n,  if w < −Δ_n

[0063] In the formula, Δ_p and Δ_n are the thresholds used in the quantization process, and any floating-point weight is assigned to one of the three ternary values according to the range it falls in. α_p and α_n are the corresponding scaling factors, which are used to reduce the loss caused by the quantization process.
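As a concrete illustration, the threshold-based asymmetric ternarization above can be sketched in NumPy. The threshold heuristic (a fraction of the per-sign extrema) and the scale choice (the mean of the weights mapped to each side) are illustrative assumptions, not the patent's exact formulas:

```python
import numpy as np

def asymmetric_ternary_quantize(w, t=0.05):
    """Quantize a float weight tensor to {-alpha_n, 0, +alpha_p}.

    Separate thresholds and scales are used for the positive and
    negative sides (the 'asymmetric' part). The threshold fraction
    t and the scale rule are illustrative heuristics.
    """
    delta_p = t * np.max(w)            # positive-side threshold (assumed heuristic)
    delta_n = t * np.abs(np.min(w))    # negative-side threshold (assumed heuristic)
    pos = w > delta_p
    neg = w < -delta_n
    # Scale each side by the mean magnitude of the weights it absorbs,
    # which reduces the L2 error introduced by quantization.
    alpha_p = w[pos].mean() if pos.any() else 0.0
    alpha_n = -w[neg].mean() if neg.any() else 0.0
    wq = np.zeros_like(w)
    wq[pos] = alpha_p
    wq[neg] = -alpha_n
    return wq, (delta_p, delta_n, alpha_p, alpha_n)

w = np.array([0.9, 0.4, 0.01, -0.02, -0.5, -1.2])
wq, params = asymmetric_ternary_quantize(w)
```

Note that because α_p and α_n are computed independently, the positive and negative weights need not share a distribution, which is exactly the asymmetry the invention exploits.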


Abstract

The invention discloses a deep neural network model compression method based on asymmetric ternary weight quantization. The method comprises the following steps: when a deep neural network is trained, before each forward computation the floating-point weights of each layer of the network are converted into asymmetric ternary values; the original floating-point weights are used in the parameter-update stage; and the trained deep neural network is compressed and stored. The redundant parameters of the deep neural network are removed and the network model is compressed, which effectively improves the recognition accuracy of the quantization method on large datasets.
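The train-time pattern described in the abstract — quantize before each forward pass, but update the full-precision master weights — can be sketched on a toy linear model. The loss, learning rate, and straight-through gradient here are illustrative assumptions, not the patent's exact training scheme:

```python
import numpy as np

def ternarize(w, t=0.05):
    # Simplified asymmetric ternarization: separate positive/negative
    # thresholds, each side scaled by the mean of the weights it absorbs.
    dp, dn = t * w.max(), t * abs(w.min())
    pos, neg = w > dp, w < -dn
    wq = np.zeros_like(w)
    if pos.any():
        wq[pos] = w[pos].mean()
    if neg.any():
        wq[neg] = w[neg].mean()   # mean of negative weights, already signed
    return wq

rng = np.random.default_rng(0)
w_float = rng.normal(size=4)             # full-precision master weights
x, y = rng.normal(size=(16, 4)), rng.normal(size=16)

lr = 0.05
for step in range(30):
    w_t = ternarize(w_float)             # 1) quantize before the forward pass
    pred = x @ w_t                       # 2) forward with ternary weights
    grad = 2 * x.T @ (pred - y) / len(y) # straight-through: gradient applied to
    w_float -= lr * grad                 # 3) ... the FLOAT weights, not w_t
```

Keeping the float weights as the update target lets small gradient steps accumulate even when they are too small to flip a ternary value; at deployment only the ternary weights and the two scale factors per layer need to be stored.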

Description

technical field

[0001] The invention relates to the technical field of convolutional neural network compression, and in particular to a deep neural network model compression method based on asymmetric ternary weight quantization.

Background technique

[0002] In recent years, with the rapid development of deep learning algorithms, deep neural networks have achieved state-of-the-art results in a range of machine learning tasks such as speech recognition, image classification, and natural language processing. However, a typical deep neural network usually has millions of parameters, making it difficult to deploy in embedded devices with limited storage and computing resources. How to compress deep neural network models has therefore become an important research direction in current deep learning. [0003] At present, there are two typical model compression methods. One is to optimize the structure of the network to reduce the number of network parameters. ICLR2016...

Claims


Application Information

IPC(8): G06N 3/08; H03M 7/30
CPC: G06N 3/08; H03M 7/30
Inventors: 吴俊敏 (Wu Junmin), 丁杰 (Ding Jie), 吴焕 (Wu Huan)
Owner: SUZHOU INST FOR ADVANCED STUDY USTC