Neural network accelerator, convolution operation implementation method and device and storage medium

A neural network convolution technology, applied in the fields of biological neural network models, physical implementation, and complex mathematical operations, that addresses the problem of the long time required to complete inference.

Pending Publication Date: 2021-04-09
ZTE CORP

AI Technical Summary

Problems solved by technology

Neural network processing is an important processing technology for the realization of artificial intelligence. However, the continuous growth of the scale of neural networks has resulted in a typical neural network with dozens or even hundreds of la



Examples


Embodiment 1

[0038] To improve the efficiency of neural network processing and reduce the amount of computation in neural network convolution operations, this embodiment provides a neural network convolution operation implementation method. Before describing the method, this embodiment first introduces a neural network convolution operation implementation device capable of carrying it out; please refer to Figure 1:

[0039] The neural network convolution operation implementation system 1 includes a CPU 11, a global cache 12 and a neural network accelerator 10, wherein the neural network accelerator 10 includes an input-output cache 13, a command parsing unit 14, a calculation control unit 15 and a calculation unit array 16.

[0040] The CPU 11 is communicatively connected to the command parsing unit 14, and the CPU 11 can send control instructions to c...

Embodiment 2

[0077] This embodiment continues to introduce the neural network convolution operation implementation method and device provided by the present invention; please refer to Figure 7:

[0078] The neural network convolution operation implementation system 7 shown in Figure 7 also includes a CPU 71, a global cache 72 and a neural network accelerator 70. However, unlike Embodiment 1, in this embodiment the neural network accelerator 70 includes not only an input-output cache 73, a command parsing unit 74, a calculation control unit 75 and a calculation unit array 76, but also a Winograd conversion unit 77. The Winograd conversion unit 77 is arranged between the input-output cache 73 and the calculation unit array 76, and it is used to convert the compressed sub-convolution kernel and the original data block into the Winograd domain; the data in the compressed sub-convolution kernel and the original data block transferred to the Winogr...
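For orientation, the kind of domain conversion such a unit performs can be sketched with the standard Winograd minimal-filtering transform F(2×2, 3×3). This is a minimal sketch under stated assumptions: the tile size, the function name `winograd_2x2_3x3`, and the use of a dense 3×3 kernel are illustrative choices, and the visible text does not say which Winograd configuration the conversion unit 77 uses or how it handles the compressed sub-convolution kernel.

```python
# Standard Winograd F(2x2, 3x3) transform matrices (textbook values,
# not taken from the patent text).
BT = [[1, 0, -1, 0],
      [0, 1, 1, 0],
      [0, -1, 1, 0],
      [0, 1, 0, -1]]
G = [[1.0, 0.0, 0.0],
     [0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5],
     [0.0, 0.0, 1.0]]
AT = [[1, 1, 1, 0],
      [0, 1, -1, -1]]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def winograd_2x2_3x3(d, g):
    """Compute a 2x2 output tile from a 4x4 data block d and a 3x3 kernel g."""
    u = matmul(matmul(G, g), transpose(G))      # kernel in the Winograd domain (4x4)
    v = matmul(matmul(BT, d), transpose(BT))    # data block in the Winograd domain (4x4)
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]  # element-wise product
    return matmul(matmul(AT, m), transpose(AT))  # back to the spatial domain (2x2)


def direct_2x2_3x3(d, g):
    """Reference: sliding-window multiply-accumulate over the same block."""
    return [[sum(d[r + i][c + j] * g[i][j] for i in range(3) for j in range(3))
             for c in range(2)] for r in range(2)]


data_block = [[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12],
              [13, 14, 15, 16]]
kernel = [[1.0, 0.0, 2.0],
          [0.0, 0.0, 0.0],
          [0.0, 3.0, 0.0]]

tile = winograd_2x2_3x3(data_block, kernel)
reference = direct_2x2_3x3(data_block, kernel)
```

In this formulation the 2×2 output tile needs 16 element-wise multiplications in the Winograd domain, compared with 36 for the direct sliding-window computation.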

Embodiment 3

[0109] With reference to the neural network convolution operation implementation system shown in Figure 7 or Figure 8 and some examples, this embodiment continues to describe the neural network convolution operation implementation provided in the foregoing embodiments:

[0110] [weight conversion]

[0111] Figure 10 shows a schematic diagram of weight compression using a compiler:

[0112] The compiler is divided into a segmentation module 101 and a compression module 102. After processing by the two modules, the original convolution kernel becomes a compressed sub-convolution kernel and weight index.
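The two compiler stages described above can be sketched as follows. This is a minimal sketch, assuming a row-wise split and a 0/1 bitmap as the weight index; the visible text does not specify either format, and the names `split_kernel`, `compress_sub_kernel` and `compile_kernel` are hypothetical.

```python
from typing import List, Tuple

Kernel = List[List[float]]


def split_kernel(kernel: Kernel, rows_per_sub: int) -> List[Kernel]:
    """Segmentation step (assumed scheme): cut the kernel into groups of rows."""
    return [kernel[i:i + rows_per_sub]
            for i in range(0, len(kernel), rows_per_sub)]


def compress_sub_kernel(sub: Kernel) -> Tuple[List[float], List[int]]:
    """Compression step: keep only non-zero weights and record where they were.

    The weight index here is an assumed bitmap: 1 marks a kept weight,
    0 marks a removed zero-valued weight.
    """
    flat = [w for row in sub for w in row]
    index = [1 if w != 0 else 0 for w in flat]
    values = [w for w in flat if w != 0]
    return values, index


def compile_kernel(kernel: Kernel, rows_per_sub: int = 1):
    """Run both stages: original kernel -> list of (compressed weights, index)."""
    return [compress_sub_kernel(s) for s in split_kernel(kernel, rows_per_sub)]


original = [[1.0, 0.0, 2.0],
            [0.0, 0.0, 0.0],
            [0.0, 3.0, 0.0]]
compressed = compile_kernel(original)
```

With this example kernel, nine stored weights shrink to three, at the cost of one index bit per original weight position.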

[0113] For a convolutional neural network, since the convolution kernel is generally a three-dimensional structure, the weights in the convolution kernel can be pruned through training into a form in which the weights along the same channel C are all 0. This further reduces the content of the weight index and improves compression efficiency. The following uses the 3×3 original convolution ...



Abstract

Embodiments of the invention provide a neural network accelerator, a convolution operation implementation method and device, and a storage medium. The method comprises: obtaining a plurality of compressed sub-convolution kernels corresponding to an original convolution kernel, together with a weight index; reading, from the original data to be processed, an original data block whose size matches that of the sub-convolution kernels; controlling the calculation unit array, which comprises a plurality of multiply-accumulate calculators, to execute an iterative process according to the original data block, the compressed sub-convolution kernels and the weight index; and, after the iterative process is finished, obtaining the output result of the calculation unit array. In the scheme provided by the embodiments, compressing the weights in the convolution kernel removes the zero-valued weights of the original convolution kernel and thereby reduces the amount of computation in the convolution operation. In addition, because an array of calculation units performs the convolution on the data and the weights, the degree of data reuse in the convolution operation can be improved, and the bandwidth occupation and power consumption of neural network processing are reduced.
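As a functional illustration of the method summarized above, the sketch below computes a convolution using only the non-zero weights plus a weight index, and compares it against a dense computation. It is a software model under stated assumptions: the bitmap index format and the names `compress`, `sparse_mac` and `conv2d_sparse` are hypothetical, and the loop does not model the dataflow or parallelism of the calculation unit array.

```python
def compress(kernel):
    """Return (non-zero weights, bitmap index) for a 2D kernel (assumed format)."""
    flat = [w for row in kernel for w in row]
    return [w for w in flat if w != 0], [int(w != 0) for w in flat]


def sparse_mac(block, values, index):
    """One output value: multiply-accumulate only where the index bit is 1."""
    flat_data = [x for row in block for x in row]
    remaining = iter(values)
    acc = 0.0
    for x, bit in zip(flat_data, index):
        if bit:  # zero-valued weights are skipped, so no multiply is issued
            acc += x * next(remaining)
    return acc


def conv2d_sparse(data, kernel):
    """Slide over the data, reading blocks whose size matches the kernel."""
    k = len(kernel)
    values, index = compress(kernel)
    out = []
    for r in range(len(data) - k + 1):
        row = []
        for c in range(len(data[0]) - k + 1):
            block = [data[r + i][c:c + k] for i in range(k)]
            row.append(sparse_mac(block, values, index))
        out.append(row)
    return out


def conv2d_dense(data, kernel):
    """Reference computation that multiplies every weight, zeros included."""
    k = len(kernel)
    return [[sum(data[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(len(data[0]) - k + 1)]
            for r in range(len(data) - k + 1)]


data = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
kernel = [[1.0, 0.0, 2.0],
          [0.0, 0.0, 0.0],
          [0.0, 3.0, 0.0]]

sparse_result = conv2d_sparse(data, kernel)
dense_result = conv2d_dense(data, kernel)
```

For this kernel each output value takes three multiply-accumulate steps instead of nine, while producing the same result as the dense computation.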

Description

Technical field

[0001] The invention relates to the field of artificial intelligence, in particular to a neural network accelerator, a convolution operation implementation method and device, and a storage medium.

Background

[0002] In recent years, artificial intelligence technology has developed rapidly all over the world. The industry has invested a great deal of effort in artificial intelligence research and achieved remarkable results, especially in areas such as image detection and recognition and language recognition, where the recognition rate of artificial intelligence has surpassed that of humans. Neural network processing is an important processing technology for the realization of artificial intelligence. However, the continuous growth in the scale of neural networks has resulted in typical neural networks with dozens or even hundreds of layers and hundreds of millions of connections between neurons. The performance indicators are cons...


Application Information

IPC(8): G06N3/063, G06F17/15
CPC: G06F17/15, G06N3/063
Inventor 余金清闫盛男汪立林张鹤
Owner ZTE CORP