Convolution kernel similarity pruning-based recurrent neural network model compression method

A recurrent neural network model compression technology, applied in the field of recurrent neural network model compression based on convolution kernel similarity pruning, which addresses problems such as difficulty in reducing model size and loss of accuracy, and achieves the effects of improving inference speed, maintaining structural regularity, and reducing accuracy loss.

Inactive Publication Date: 2020-05-08
ZHEJIANG UNIV

Problems solved by technology

[0004] Model compression is a rapidly developing field. By simplifying a model's network structure or weight representation, the computation and memory requirements of a deep learning model can be reduced while minimizing the loss of accuracy, addressing the problem that pervasive devices with limited memory and computing resources have difficulty handling large and complex RNN models.
[0005] Existing convolutional neural network compression and acceleration methods fall roughly into three categories: pruning, quantization, and low-rank decomposition. The commonly used pruning methods remove unimportant weights from the network according to certain rules and are mainly divided into unstructured pruning and structured pruning. Unstructured pruning prunes individual weights independently and has little impact on accuracy, but it is hardware-unfriendly at inference and prediction time and is difficult for hardware devices to exploit efficiently. Structured sparse pruning methods instead impose an internal sparse structure: each pruning step reduces the size of all the basic structures in the recurrent unit by one, so that dimensional consistency is always maintained, the model size shrinks, and inference is accelerated. However, because the sparsity granularity of structured sparse pruning is very coarse, it is difficult to reach high sparsity without loss of accuracy.
[0006] Most pruning methods follow different rules, adding regularization constraints that act on the parameters during back-propagation to induce sparsity. When designing pruning rules, most existing pruning strategies rely on norm pruning and its variants, which mainly consider the importance of each convolution kernel and ideally assume that the norm distribution is spread out with a sufficiently large standard deviation. They give little consideration to the similarity between convolution kernels, which leads to large accuracy losses during pruning.
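As a toy illustration (not part of the patent), the following sketch shows why norm-based ranking alone can miss redundancy: two kernels may both have large L2 norms and therefore survive norm pruning even though they are nearly identical to each other. The kernel values here are invented for the example.

```python
import torch
import torch.nn.functional as F

# Toy data (assumed, not from the patent): three "convolution kernels"
# flattened to vectors.
kernels = torch.tensor([[1.00, 0.00, 0.00],
                        [0.99, 0.05, 0.00],   # nearly a duplicate of kernel 0
                        [0.00, 0.00, 0.20]])  # small norm, but a unique direction

# Norm-based importance: pruning by L2 norm would remove kernel 2 first.
norms = kernels.norm(p=2, dim=1)

# Pairwise cosine similarity: kernels 0 and 1 are ~0.999 similar, i.e. one of
# them is redundant, which norm pruning alone does not detect.
cos_sim = F.cosine_similarity(kernels.unsqueeze(1), kernels.unsqueeze(0), dim=-1)

print(norms)    # tensor([1.0000, 0.9913, 0.2000])
print(cos_sim)
```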


Embodiment Construction

[0045] The present invention will be further described below in conjunction with specific examples.

[0046] As shown in Figure 1, step 1: load the pre-trained recurrent neural network model into the compressed recurrent neural network for training, set the parameters of the pre-trained recurrent neural network model to be consistent with the parameters of the compressed recurrent neural network, and obtain the recurrent neural network model initialized with the weight matrix.

[0047] Prepare the pre-trained recurrent neural network model to be compressed. Set the data set, the configuration parameters, the given norm pruning rate P1, and the similarity pruning rate P2 to be consistent with the parameters of the compressed recurrent neural network, and load the pre-trained recurrent neural network model to be compressed into the compressed recurrent neural network for training.
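A minimal sketch of this initialization step, assuming a PyTorch LSTM language model and a hypothetical checkpoint file; the layer sizes, file name, and checkpoint keys are illustrative assumptions rather than values given by the patent.

```python
import torch
import torch.nn as nn

# Hypothetical configuration; P1 and P2 follow the patent's notation, the
# remaining values are placeholders for this sketch.
config = {
    "norm_prune_rate_P1": 0.2,
    "similarity_prune_rate_P2": 0.1,
    "hidden_size": 650,
    "num_layers": 2,
}

# Build the compressed model with the same hyper-parameters as the pre-trained
# model, then initialize it from the pre-trained weights, yielding the
# weight-matrix-initialized recurrent neural network model.
model = nn.LSTM(input_size=config["hidden_size"],
                hidden_size=config["hidden_size"],
                num_layers=config["num_layers"])
checkpoint = torch.load("pretrained_rnn.pt")     # assumed checkpoint path
model.load_state_dict(checkpoint["state_dict"])  # assumed key layout
```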

[0048] As shown in Figure 2, the data set used is the WikiText-2 English corpus, and each vocabulary also retains t...


Abstract

The invention discloses a convolution kernel similarity pruning-based recurrent neural network model compression method and belongs to the technical field of computer and electronic information. The method comprises the steps of: loading a pre-trained recurrent neural network model into a compressed recurrent neural network and training it, so as to obtain a weight-matrix-initialized recurrent neural network model; calculating the L2 norm of each convolution kernel in the recurrent neural network model, sorting the L2 norms, and selecting and pruning the convolution kernels within the norm pruning rate P1 range; and calculating the weight center of the convolution kernels of the pruned pre-trained recurrent neural network model, selecting and pruning the convolution kernels within the similarity pruning rate P2 range, performing a gradient update on the weight matrix corresponding to the convolution kernels, and pruning the parameters in the updated weight matrix to obtain a compressed recurrent neural network model. According to the recurrent neural network model compression method provided by the invention, large recurrent neural network models are effectively compressed while the accuracy loss of the pruning process is reduced.
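The two pruning criteria summarized in the abstract can be sketched as follows, treating each row of an RNN weight matrix as a "convolution kernel" and using the mean of the surviving kernels as the weight center; the function name, masking scheme, and the choice of the mean as the center are illustrative assumptions, not the patent's exact procedure.

```python
import torch

def prune_by_norm_and_similarity(weight: torch.Tensor, p1: float, p2: float):
    """Sketch of the two-stage criterion: first prune the p1 fraction of
    kernels with the smallest L2 norms, then prune the p2 fraction of the
    remaining kernels that lie closest to the weight center (most redundant)."""
    num_kernels = weight.shape[0]
    mask = torch.ones(num_kernels, dtype=torch.bool)

    # Stage 1: L2-norm pruning within the norm pruning rate P1.
    norms = weight.norm(p=2, dim=1)
    n_norm = int(p1 * num_kernels)
    if n_norm > 0:
        mask[norms.argsort()[:n_norm]] = False

    # Stage 2: similarity pruning within the similarity pruning rate P2 --
    # kernels nearest to the center of the surviving kernels are pruned.
    kept = mask.nonzero(as_tuple=True)[0]
    center = weight[kept].mean(dim=0)
    dists = (weight[kept] - center).norm(p=2, dim=1)
    n_sim = int(p2 * num_kernels)
    if n_sim > 0:
        mask[kept[dists.argsort()[:n_sim]]] = False

    # Zero out pruned kernels; per the abstract, this would be followed by a
    # gradient update on the masked weight matrix before the final pruning.
    return weight * mask.unsqueeze(1).to(weight.dtype), mask
```

For example, the sketch could be applied to the hidden-to-hidden weight matrix of an LSTM layer as `prune_by_norm_and_similarity(model.weight_hh_l0.data, p1=0.2, p2=0.1)`, with the rates matching the configured P1 and P2.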

Description

Technical Field

[0001] The invention relates to the technical fields of computer and electronic information, and in particular to a method for compressing a recurrent neural network model based on convolution kernel similarity pruning.

Background Technique

[0002] Recurrent neural networks (RNNs) can process sequential data and have greatly improved the accuracy of speech recognition, natural language processing, and machine translation. However, this improvement comes at the cost of high computational complexity, and the resource demands are too heavy for embedded and mobile devices. Thus, widespread deployment of recurrent neural network models is limited by inference performance, power consumption, and memory requirements.

[0003] Pervasive devices such as smartphones, tablets, and laptops have limited memory and computing resources, and it is difficult for them to handle large and complex RNN models. However, one of the basic requirements for th...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/08, G06N3/04
CPC: G06N3/082, G06N3/045
Inventors: 王曰海, 李忻瑶, 奚永新, 杨建义
Owner: ZHEJIANG UNIV