Method for compressing deep neural network
A neural network compression technology, applicable to neural learning methods, biological neural network models, neural architectures, etc., which can solve problems such as model deviation.
Examples
Embodiment 1
[0070] Figure 8 shows a compression method suitable for an LSTM neural network according to the first embodiment of the present application, wherein compression of the neural network is achieved through multiple iterative operations. Each iterative operation specifically includes three steps: sensitivity analysis, pruning, and retraining. Each step is described in detail below.
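As a minimal sketch (an illustrative assumption, not the patented implementation), the iterative scheme of paragraph [0070] can be outlined as follows in Python; `sensitivity_analysis`, `prune`, and `retrain` are hypothetical placeholders for the steps detailed below:

```python
# Hypothetical outline of the iterative compression scheme of Embodiment 1.
# The three helper functions are placeholders standing in for the
# sensitivity-analysis, pruning, and retraining steps described below.

def compress_network(network, num_iterations, sensitivity_analysis, prune, retrain):
    for _ in range(num_iterations):
        densities = sensitivity_analysis(network)   # pick an initial density per matrix
        network = prune(network, densities)         # zero out low-value weights
        network = retrain(network)                  # recover accuracy of the sparse network
    return network
```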
[0071] Step 8100, sensitivity analysis
[0072] In this step, sensitivity analysis is performed, for example, on all matrices in the LSTM network to determine the initial density (or initial compression ratio) of the different matrices.
[0073] Figure 9 shows the specific steps of the sensitivity analysis.
[0074] As shown in Figure 9, in step 8110, each matrix in the LSTM network is, for example, tentatively compressed at different densities (the selected densities are, for example, 0.1, 0.2, ..., 0.9; for the specific compression method for matrices, refer...
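A hedged Python sketch of this sensitivity test is given below. The trial densities 0.1 through 0.9 come from the text; the magnitude-based trial pruning, the `evaluate` error metric (for example, word error rate for a speech LSTM), and the tolerance-based choice of initial density are assumptions introduced only to make the example concrete:

```python
import numpy as np

# Hypothetical sketch of the sensitivity test of step 8110: each matrix is
# tentatively pruned at several densities and the network is re-evaluated.
# `evaluate` (returning an error metric) and the tolerance rule are assumed
# helpers, not part of the original text.

CANDIDATE_DENSITIES = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

def magnitude_prune(matrix, density):
    """Keep only the largest-magnitude weights so that `density` of them remain."""
    k = int(round(density * matrix.size))
    threshold = np.sort(np.abs(matrix), axis=None)[-k] if k > 0 else np.inf
    return np.where(np.abs(matrix) >= threshold, matrix, 0.0)

def sensitivity_analysis(matrices, evaluate, tolerance):
    """Pick an initial density per matrix: the smallest candidate density whose
    error increase over the dense baseline stays within `tolerance`."""
    baseline_error = evaluate(matrices)
    initial_density = {}
    for name, mat in matrices.items():
        chosen = 1.0                                 # fall back to no compression
        for d in CANDIDATE_DENSITIES:                # sparsest setting first
            trial = dict(matrices)
            trial[name] = magnitude_prune(mat, d)    # compress only this matrix
            if evaluate(trial) - baseline_error <= tolerance:
                chosen = d
                break
        initial_density[name] = chosen
    return initial_density
```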
Embodiment 2
[0178] Embodiment 1 above proposes a method of compressing a trained dense neural network based on a fixed-shape mask matrix to obtain a sparse neural network.
[0179] In Embodiment 2, the applicant proposes a new neural network compression method. In each round of compression, the method uses a user-designed dynamic compression strategy to compress the neural network.
[0180] Specifically, the dynamic compression strategy includes, for each pruning operation: the current pruning count, the maximum pruning count, and the pruning density. These three parameters determine the proportion of weights to be pruned in the current operation (that is, the compression rate of this pruning operation), and pruning is performed on that basis.
[0181] Therefore, in the neural network compression process according to the method of Embodiment 2, the strength of each pruning operation is a function of the pruning count (which can also be understood a...
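A minimal sketch of such a dynamic compression strategy is shown below. The three scheduling parameters (current pruning count, maximum pruning count, final density) come from paragraph [0180]; the linear form of the schedule is purely an illustrative assumption, since the text leaves the functional form to the user's design:

```python
# Hypothetical sketch of a user-designed dynamic compression strategy
# (Embodiment 2): the target density for pruning step t is a function of the
# current pruning count t, the maximum pruning count T, and the final density
# D_final. A linear schedule is shown only as one possible choice.

def target_density(t, T, D_final, D_initial=1.0):
    """Density the network should have after pruning step t (1 <= t <= T)."""
    if t >= T:
        return D_final
    return D_initial + (D_final - D_initial) * (t / T)

# Example: with T = 10 pruning steps and D_final = 0.3, the per-step targets
# decrease gradually from the dense network toward 30% density.
schedule = [target_density(t, T=10, D_final=0.3) for t in range(1, 11)]
```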
Example 2.1
[0198] Example 2.1: Neural Network with Constant Density
[0199] In this example, the target density of each pruning operation is kept constant during one round of compression of the neural network. Accordingly, the compression function is:
[0200] f_D(t) = D_final
[0201] That is, during this round of compression, the density of the neural network is kept constant throughout, while the weight magnitudes and distribution can still change.
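The sketch below illustrates the constant-density case f_D(t) = D_final. The magnitude pruning and the random update standing in for retraining are assumptions for illustration only; the point is that the measured density stays at D_final after every pruning step while the surviving weight values (and their distribution) change:

```python
import numpy as np

# Hypothetical illustration of Example 2.1: the compression function is
# constant, f_D(t) = D_final, so every pruning step targets the same density,
# while retraining (simulated here by a small random update of all weights)
# lets the weight magnitudes and their distribution change between steps.

def prune_to_density(matrix, density):
    """Keep only the largest-magnitude entries so that `density` of them remain."""
    k = int(round(density * matrix.size))
    threshold = np.sort(np.abs(matrix), axis=None)[-k]
    return np.where(np.abs(matrix) >= threshold, matrix, 0.0)

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
D_final = 0.3

for t in range(1, 6):                         # five pruning steps, one round
    W = prune_to_density(W, D_final)          # density target is always D_final
    print(t, np.count_nonzero(W) / W.size)    # stays approximately 0.3
    W = W + 0.01 * rng.normal(size=W.shape)   # stand-in for a retraining update
```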
[0202] Figure 16 shows the neural network density variation curve of Example 2.1.
[0203] Figure 17 shows the changes in the neural network weight distribution during the compression process corresponding to Figure 16.
[0204] The left side of Figure 17 shows the changes in the weight parameter distribution of each matrix in the neural network during each pruning operation, where the horizontal axis indicates the 9 matrices in each LSTM layer and the vertical axis indicates the number of pruning operations. It can be seen that, corre...