Convolution kernel similarity pruning-based recurrent neural network model compression method

A recurrent neural network model compression technology, applied in the field of recurrent neural network model compression based on convolution kernel similarity pruning, which addresses problems such as difficulty in reducing model size and loss of accuracy, and achieves the effects of improving inference speed, maintaining structural regularity, and reducing accuracy loss.

Inactive Publication Date: 2020-05-08
ZHEJIANG UNIV

Problems solved by technology

[0004] Model compression is a rapidly developing field. By simplifying a model's network structure or weight representation, the computation and memory requirements of a deep learning model can be reduced while minimizing the loss of accuracy, addressing the problem that pervasive devices with limited memory and computing resources have difficulty handling large and complex RNN models.
[0005] Existing convolutional neural network compression and acceleration methods fall roughly into three categories: pruning, quantization, and low-rank decomposition. The commonly used pruning methods remove unimportant weights from the network according to certain rules and are mainly divided into unstructured pruning and structured pruning. Unstructured pruning prunes individual weights independently and has little impact on accuracy, but it is hardware-unfriendly at inference and prediction time and is difficult for hardware devices to exploit efficiently. Structured sparse pruning methods instead impose an internal sparse structure: each pruning step reduces the size of all the basic structures in the recurrent unit by one, so that dimensional consistency is always maintained, the model size shrinks, and inference is accelerated. However, because the sparsity granularity of structured sparse pruning is very coarse, it is difficult to reach high sparsity without loss of accuracy.
[0006] Most pruning methods follow different rules, adding regularization constraints that act on the parameters during back-propagation to induce sparsity. When designing pruning rules, most existing pruning strategies rely on norm pruning and its variants, which mainly consider the importance of each convolution kernel and ideally assume that the norm distribution is spread out with a sufficiently large standard deviation. They give little consideration to the similarity between convolution kernels, which leads to large accuracy losses during pruning.
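As a toy illustration (not part of the patent), the following sketch shows why norm-based ranking alone can miss redundancy: two kernels may both have large L2 norms and therefore survive norm pruning even though they are nearly identical to each other. The kernel values here are invented for the example.

```python
import torch
import torch.nn.functional as F

# Toy data (assumed, not from the patent): three "convolution kernels"
# flattened to vectors.
kernels = torch.tensor([[1.00, 0.00, 0.00],
                        [0.99, 0.05, 0.00],   # nearly a duplicate of kernel 0
                        [0.00, 0.00, 0.20]])  # small norm, but a unique direction

# Norm-based importance: pruning by L2 norm would remove kernel 2 first.
norms = kernels.norm(p=2, dim=1)

# Pairwise cosine similarity: kernels 0 and 1 are ~0.999 similar, i.e. one of
# them is redundant, which norm pruning alone does not detect.
cos_sim = F.cosine_similarity(kernels.unsqueeze(1), kernels.unsqueeze(0), dim=-1)

print(norms)    # tensor([1.0000, 0.9913, 0.2000])
print(cos_sim)
```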


Embodiment Construction

[0045] The present invention will be further described below in conjunction with specific examples.

[0046] As shown in Figure 1, step 1: load the pre-trained recurrent neural network model into the compressed recurrent neural network for training, set the parameters of the pre-trained recurrent neural network model to be consistent with the parameters of the compressed recurrent neural network, and obtain the recurrent neural network model initialized with the weight matrix.

[0047] Prepare the pre-trained recurrent neural network model to be compressed. Set the data set, the configuration parameters, the given norm pruning rate P1, and the similarity pruning rate P2 to be consistent with the parameters of the compressed recurrent neural network, and load the pre-trained recurrent neural network model to be compressed into the compressed recurrent neural network for training.
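A minimal sketch of this initialization step, assuming a PyTorch LSTM language model and a hypothetical checkpoint file; the layer sizes, file name, and checkpoint keys are illustrative assumptions rather than values given by the patent.

```python
import torch
import torch.nn as nn

# Hypothetical configuration; P1 and P2 follow the patent's notation, the
# remaining values are placeholders for this sketch.
config = {
    "norm_prune_rate_P1": 0.2,
    "similarity_prune_rate_P2": 0.1,
    "hidden_size": 650,
    "num_layers": 2,
}

# Build the compressed model with the same hyper-parameters as the pre-trained
# model, then initialize it from the pre-trained weights, yielding the
# weight-matrix-initialized recurrent neural network model.
model = nn.LSTM(input_size=config["hidden_size"],
                hidden_size=config["hidden_size"],
                num_layers=config["num_layers"])
checkpoint = torch.load("pretrained_rnn.pt")     # assumed checkpoint path
model.load_state_dict(checkpoint["state_dict"])  # assumed key layout
```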

[0048] As shown in Figure 2, the data set used is the WikiText-2 English corpus, and each vocabulary also retains t...


Abstract

The invention discloses a convolution kernel similarity pruning-based recurrent neural network model compression method and belongs to the technical field of computer and electronic information. The method comprises the steps of: loading a pre-trained recurrent neural network model into a compressed recurrent neural network and training it, so as to obtain a weight-matrix-initialized recurrent neural network model; calculating the L2 norm of each convolution kernel in the recurrent neural network model, sorting the L2 norms, and selecting and pruning the convolution kernels within the norm pruning rate P1 range; and calculating the weight center of the convolution kernels of the pruned pre-trained recurrent neural network model, selecting and pruning the convolution kernels within the similarity pruning rate P2 range, performing a gradient update on the weight matrix corresponding to the convolution kernels, and pruning the parameters in the updated weight matrix to obtain a compressed recurrent neural network model. According to the recurrent neural network model compression method provided by the invention, large recurrent neural network models are effectively compressed while the accuracy loss of the pruning process is reduced.
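The two pruning criteria summarized in the abstract can be sketched as follows, treating each row of an RNN weight matrix as a "convolution kernel" and using the mean of the surviving kernels as the weight center; the function name, masking scheme, and the choice of the mean as the center are illustrative assumptions, not the patent's exact procedure.

```python
import torch

def prune_by_norm_and_similarity(weight: torch.Tensor, p1: float, p2: float):
    """Sketch of the two-stage criterion: first prune the p1 fraction of
    kernels with the smallest L2 norms, then prune the p2 fraction of the
    remaining kernels that lie closest to the weight center (most redundant)."""
    num_kernels = weight.shape[0]
    mask = torch.ones(num_kernels, dtype=torch.bool)

    # Stage 1: L2-norm pruning within the norm pruning rate P1.
    norms = weight.norm(p=2, dim=1)
    n_norm = int(p1 * num_kernels)
    if n_norm > 0:
        mask[norms.argsort()[:n_norm]] = False

    # Stage 2: similarity pruning within the similarity pruning rate P2 --
    # kernels nearest to the center of the surviving kernels are pruned.
    kept = mask.nonzero(as_tuple=True)[0]
    center = weight[kept].mean(dim=0)
    dists = (weight[kept] - center).norm(p=2, dim=1)
    n_sim = int(p2 * num_kernels)
    if n_sim > 0:
        mask[kept[dists.argsort()[:n_sim]]] = False

    # Zero out pruned kernels; per the abstract, this would be followed by a
    # gradient update on the masked weight matrix before the final pruning.
    return weight * mask.unsqueeze(1).to(weight.dtype), mask
```

For example, the sketch could be applied to the hidden-to-hidden weight matrix of an LSTM layer as `prune_by_norm_and_similarity(model.weight_hh_l0.data, p1=0.2, p2=0.1)`, with the rates matching the configured P1 and P2.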

Description

Technical Field

[0001] The invention relates to the technical fields of computer and electronic information, and in particular to a method for compressing a recurrent neural network model based on convolution kernel similarity pruning.

Background Technique

[0002] Recurrent neural networks (RNNs) can process sequential data and have greatly improved the accuracy of speech recognition, natural language processing, and machine translation. However, this improvement comes at the cost of high computational complexity, and the resource demands are too heavy for embedded and mobile devices. Thus, widespread deployment of recurrent neural network models is limited by inference performance, power consumption, and memory requirements.

[0003] Pervasive devices such as smartphones, tablets, and laptops have limited memory and computing resources, and it is difficult for them to handle large and complex RNN models. However, one of the basic requirements for th...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/08, G06N3/04
CPC: G06N3/082, G06N3/045
Inventors: 王曰海, 李忻瑶, 奚永新, 杨建义
Owner: ZHEJIANG UNIV