A deep neural network compression method and device

A neural network compression technology, applied in neural learning methods, biological neural network models, neural architectures, etc. It addresses the large computational load of neural networks and unsatisfactory compression results, with the effects of releasing storage resources, speeding up computation, and requiring only small changes.

Active Publication Date: 2021-08-27
XILINX TECH BEIJING LTD

AI Technical Summary

Problems solved by technology

[0014] Embodiments of the present invention provide a deep neural network compression method and device that address defects of the prior art related to neural networks, such as a large amount of calculation, a decreased compression rate, and an unsatisfactory compression effect. The method balances the load across parallel computing processing units, thereby releasing storage resources, speeding up computation, and reducing power consumption, so that the performance of FPGA and other hardware implementation structures can be fully utilized.



Examples


Embodiment Construction

[0040] The drawings are for illustration only and should not be construed as limiting the invention. The technical solutions of the present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0041] In the following, network sparsification in an LSTM neural network is used as a preferred embodiment of the present invention to describe in detail the method for compressing a neural network according to the present invention.

[0042] In an LSTM neural network, the forward computation is mainly a combination of a series of matrix-vector multiplications, as shown in the following formula:

[0043]

[0044] Two types of LSTM are given in the formula: the simplest LSTM structure on the right, and the LSTMP structure on the left, whose main feature is that it adds peephole and projection operations to the simple LSTM. Whether the structure is LSTM or LSTMP, it mainly includes four matrices: c (unit ...
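As an illustration of why the forward computation is dominated by matrix-vector multiplication, the following is a minimal sketch of one step of a plain textbook LSTM cell (no peephole, no projection). It is not the formula of paragraph [0043], which is missing from this page; the function names, the `params` layout, and the gate labels `i`, `f`, `o`, `c` are assumptions made for this example.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a plain LSTM cell.

    params[name] = (Wx, Wh, b) for name in "i", "f", "o", "c" (assumed layout).
    Each gate costs two matrix-vector products, which is where the
    computational load concentrates.
    """
    def gate(name, activation):
        Wx, Wh, b = params[name]
        pre = [a + r + bias
               for a, r, bias in zip(matvec(Wx, x), matvec(Wh, h_prev), b)]
        return [activation(p) for p in pre]

    i = gate("i", sigmoid)      # input gate
    f = gate("f", sigmoid)      # forget gate
    o = gate("o", sigmoid)      # output gate
    g = gate("c", math.tanh)    # candidate cell state
    c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c

# Tiny demo with all-zero weights: every sigmoid gate evaluates to 0.5.
n_in, n_hid = 3, 2
zero = ([[0.0] * n_in for _ in range(n_hid)],
        [[0.0] * n_hid for _ in range(n_hid)],
        [0.0] * n_hid)
params = {name: zero for name in ("i", "f", "o", "c")}
h, c = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -1.0], params)
```

Because every gate is a pair of matrix-vector products, zeroing out weights in these matrices directly reduces both storage and arithmetic, which is what makes sparsification attractive here.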


Abstract

A deep neural network compression method and device are proposed. The connection relationships between the neurons of a neural network are usually represented by multiple matrices. The neural network compression method (900) according to the present invention includes: rearranging all rows of the plurality of matrices across matrices (S910), wherein the rearranged matrix rows are sequentially divided into multiple sub-matrices; performing sensitivity analysis on the sub-matrices to determine an initial compression ratio of the neural network (S920); and compressing the plurality of sub-matrices according to the determined initial compression ratio to obtain a compressed neural network (S930). The invention can ensure load balancing of parallel computing processing units, thereby releasing storage resources, accelerating computation, and reducing power consumption.
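The three steps S910 to S930 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the round-robin interleaving order, magnitude-based pruning, the "removed weight energy" proxy used for sensitivity, and all function names are hypothetical choices made for this example.

```python
from itertools import zip_longest

def interleave_rows(matrices):
    """S910 (assumed order): pool rows from all matrices round-robin,
    tagging each row with (matrix index, row index) so it can be restored."""
    tagged = [[(m, r, row) for r, row in enumerate(mat)]
              for m, mat in enumerate(matrices)]
    pooled = []
    for group in zip_longest(*tagged):
        pooled.extend(item for item in group if item is not None)
    return pooled

def split_sequential(rows, n_sub):
    """Divide the rearranged rows sequentially into n_sub sub-matrices."""
    size, extra = divmod(len(rows), n_sub)
    subs, start = [], 0
    for k in range(n_sub):
        end = start + size + (1 if k < extra else 0)
        subs.append(rows[start:end])
        start = end
    return subs

def prune_submatrix(sub, density):
    """S930 (assumed criterion): keep the largest-magnitude fraction
    `density` of the weights in one sub-matrix and zero the rest."""
    flat = sorted((abs(w) for _, _, row in sub for w in row), reverse=True)
    keep = max(1, round(density * len(flat)))
    threshold = flat[keep - 1]
    return [(m, r, [w if abs(w) >= threshold else 0.0 for w in row])
            for m, r, row in sub]

def energy(subs):
    return sum(w * w for sub in subs for _, _, row in sub for w in row)

def initial_density(subs, candidates, tolerance):
    """S920 (assumed proxy): lowest candidate density whose removed
    weight energy stays within `tolerance` of the total."""
    total = energy(subs)
    for d in sorted(candidates):
        pruned = [prune_submatrix(s, d) for s in subs]
        if 1.0 - energy(pruned) / total <= tolerance:
            return d
    return 1.0

# Two toy 4x4 weight matrices with distinct magnitudes.
A = [[float(2 * (4 * r + c) + 2) for c in range(4)] for r in range(4)]
B = [[-float(2 * (4 * r + c) + 3) for c in range(4)] for r in range(4)]

subs = split_sequential(interleave_rows([A, B]), n_sub=2)
density = initial_density(subs, [0.25, 0.5, 0.75], tolerance=0.2)
compressed = [prune_submatrix(s, density) for s in subs]
nonzeros = [sum(1 for _, _, row in s for w in row if w != 0.0)
            for s in compressed]
```

In this sketch each sub-matrix stands for the work assigned to one processing unit. Pruning every sub-matrix to the same density leaves each unit with the same number of nonzero weights, which is the load-balancing property the abstract refers to. A real sensitivity analysis would measure the network's accuracy on validation data rather than the weight-energy proxy used here.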

Description

Technical field

[0001] The present invention relates to artificial neural networks, and more particularly to a deep neural network compression method and device.

Background technique

[0002] Artificial Neural Networks (ANNs), also referred to as Neural Networks (NNs), are mathematical computing models that imitate the behavioral characteristics of animal neural networks and perform distributed parallel information processing. In recent years, neural networks have developed rapidly and are widely used in many fields, such as image recognition, speech recognition, natural language processing, weather forecasting, gene expression, and content recommendation.

[0003] A neural network contains a large number of interconnected nodes (also called "neurons") and has two characteristics: 1) each neuron computes and processes weighted input values from other adjacent neurons through a specific output function (also called an activation function); 2) The intensity ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06N3/08, G06N3/04
CPC: G06N3/049, G06N3/082
Inventor: 李鑫, 孟通, 江帆, 韩松, 单羿
Owner: XILINX TECH BEIJING LTD