Neural network accelerator, convolution operation implementation method and device and storage medium

A convolution operation and neural network technology, applied to biological neural network models, physical implementations, complex mathematical operations and similar fields, that addresses the long time required to complete inference.

Pending Publication Date: 2021-04-09

ZTE CORP

Cites: 0; Cited by: 6


AI Technical Summary

Problems solved by technology

Neural network processing is a key technology for realizing artificial intelligence. However, as neural networks have continued to grow, a typical network now has dozens or even hundreds of layers and hundreds of millions of connections between neurons. As performance targets keep rising, the amount of computation keeps increasing, and the time required to complete inference grows accordingly.



Examples


Embodiment 1

[0038] To improve the efficiency of neural network processing and reduce the amount of computation in neural network convolution operations, this embodiment provides a neural network convolution operation implementation method. Before introducing the method, this embodiment first introduces a neural network convolution operation implementation device capable of implementing it; please refer to Figure 1:

[0039] The neural network convolution operation implementation system 1 includes a CPU 11 , a global cache 12 and a neural network accelerator 10 , wherein the neural network accelerator 10 includes an input and output cache 13 , a command parsing unit 14 , a calculation control unit 15 and a calculation unit array 16 .

[0040] The CPU 11 is communicatively connected to the command parsing unit 14, and the CPU 11 can send control instructions to c...

Embodiment 2

[0077] This embodiment continues to introduce the neural network convolution operation implementation method and device provided by the present invention; please refer to Figure 7:

[0078] The neural network convolution operation implementation system 7 shown in Figure 7 also includes a CPU 71, a global cache 72 and a neural network accelerator 70. However, unlike Embodiment 1, the neural network accelerator 70 in this embodiment includes not only an input-output cache 73, a command parsing unit 74, a calculation control unit 75 and a calculation unit array 76, but also a Winograd conversion unit 77. The Winograd conversion unit 77 is arranged between the input-output cache 73 and the calculation unit array 76, and is used to convert the compressed sub-convolution kernel and the original data block into the Winograd domain, the data in the compressed sub-convolution kernel and the original data block transferred to the Winogr...
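As a rough illustration of what moving a kernel and a data block into the Winograd domain means, the sketch below uses the standard one-dimensional F(2,3) Winograd transform. The excerpt does not say which tile size the Winograd conversion unit 77 uses, so F(2,3) and all function names here are assumptions for illustration only, not the patent's design.

```python
# Minimal sketch of the standard 1-D Winograd F(2,3) transform (assumed tile
# size; the source does not specify one). Two outputs of a 3-tap filter are
# computed with 4 multiplications instead of 6.

def transform_filter(g):
    """Map a 3-tap filter into the Winograd domain (4 values)."""
    g0, g1, g2 = g
    return [g0, (g0 + g1 + g2) / 2, (g0 - g1 + g2) / 2, g2]

def transform_data(d):
    """Map a 4-element data block into the Winograd domain (4 values)."""
    d0, d1, d2, d3 = d
    return [d0 - d2, d1 + d2, d2 - d1, d1 - d3]

def winograd_f23(d, g):
    """Return the two outputs of sliding the 3-tap filter g over d."""
    u = transform_filter(g)
    v = transform_data(d)
    m = [a * b for a, b in zip(u, v)]  # the only 4 multiplications
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]

def direct_f23(d, g):
    """Reference: the same two outputs computed directly (6 multiplications)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]

result = winograd_f23([1, 2, 3, 4], [1, 2, 3])
```

The filter transform depends only on the weights, so it can be applied once and reused for every data block.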

Embodiment 3

[0109] With reference to the neural network convolution operation implementation system shown in Figure 7 or Figure 8 and some examples, this embodiment continues to introduce the neural network convolution operation implementation scheme provided in the foregoing embodiments:

[0110] [Weight conversion]

[0111] Figure 10 shows a schematic diagram of weight compression using a compiler:

[0112] The compiler is divided into a segmentation module 101 and a compression module 102. After processing by the two modules, the original convolution kernel becomes compressed sub-convolution kernels and a weight index.

[0113] For a convolutional neural network, since the convolution kernel is generally a three-dimensional structure, the weights in the convolution kernel can be pruned, through training, into a form in which the same channel C is 0, which further reduces the content of the weight index and improves compression efficiency. The following uses the 3×3 original convolution ...
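The two compiler stages can be sketched as follows. The excerpt does not give the segmentation rule or the layout of the weight index, so this sketch assumes row-wise segmentation and a one-bit-per-weight mask as the index; `segment_kernel`, `compress_sub_kernel` and `compress_kernel` are hypothetical names, not the patent's.

```python
# Minimal sketch of segmentation followed by compression, under assumptions:
# - segmentation splits a 2-D kernel into groups of rows (hypothetical rule);
# - the weight index is a bitmask with 1 marking a non-zero weight (hypothetical format).

def segment_kernel(kernel, rows_per_sub=1):
    """Split a kernel (list of rows) into sub-kernels of rows_per_sub rows each."""
    return [kernel[i:i + rows_per_sub] for i in range(0, len(kernel), rows_per_sub)]

def compress_sub_kernel(sub_kernel):
    """Drop zero weights; return (non-zero values, bitmask index)."""
    flat = [w for row in sub_kernel for w in row]
    index = [1 if w != 0 else 0 for w in flat]
    values = [w for w in flat if w != 0]
    return values, index

def compress_kernel(kernel, rows_per_sub=1):
    """Run both stages on an original kernel."""
    return [compress_sub_kernel(s) for s in segment_kernel(kernel, rows_per_sub)]

def decompress_sub_kernel(values, index):
    """Inverse of compress_sub_kernel, used to check that nothing is lost."""
    it = iter(values)
    return [next(it) if bit else 0 for bit in index]

kernel = [[0, 2, 0],
          [1, 0, 3],
          [0, 0, 0]]
compressed = compress_kernel(kernel)
```

Under these assumptions an all-zero sub-kernel compresses to an empty value list, so it contributes no multiplications at all.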


Abstract

The embodiment of the invention provides a neural network accelerator, a convolution operation implementation method and device, and a storage medium. The method comprises: obtaining a plurality of compressed sub-convolution kernels corresponding to an original convolution kernel, together with a weight index; reading, from the original data to be processed, an original data block whose dimensions match those of the sub-convolution kernels; controlling a calculation unit array, which comprises a plurality of multiply-accumulate calculators, to execute an iterative process according to the original data block, the compressed sub-convolution kernels and the weight index; and, after the iterative process is finished, obtaining the output result of the calculation unit array. In the scheme provided by the embodiment, compressing the weights in the convolution kernel removes the zero-value weights from the original convolution kernel, which reduces the amount of computation in the convolution operation. Meanwhile, because an array of calculation units performs the convolution operation on the data and the weights, the degree of data reuse in the convolution operation can be improved, and the bandwidth occupation and power consumption of neural network processing are reduced.
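The saving from skipping zero-value weights can be sketched in software. This is an illustrative model only: it assumes a bitmask weight index and a single sequential multiply-accumulate loop, whereas the abstract describes an array of multiply-accumulate calculators whose scheduling is not given in this excerpt. All function names are hypothetical.

```python
# Minimal sketch: convolution driven by a compressed kernel and a weight index.
# Assumptions: bitmask index (1 = non-zero weight), stride 1, no padding,
# and no kernel flip (cross-correlation, as is usual in neural networks).

def compress(flat_weights):
    """Return (non-zero values, bitmask index) for a flattened kernel."""
    index = [1 if w != 0 else 0 for w in flat_weights]
    values = [w for w in flat_weights if w != 0]
    return values, index

def sparse_mac(data_block, values, index):
    """Multiply-accumulate only where the index marks a non-zero weight."""
    acc, macs, next_value = 0, 0, 0
    for x, bit in zip(data_block, index):
        if bit:
            acc += x * values[next_value]
            next_value += 1
            macs += 1
    return acc, macs

def conv2d_sparse(data, kernel):
    """Slide a k x k kernel over data; return (output, number of MACs executed)."""
    k = len(kernel)
    values, index = compress([w for row in kernel for w in row])
    out, total_macs = [], 0
    for r in range(len(data) - k + 1):
        out_row = []
        for c in range(len(data[0]) - k + 1):
            # Read the data block whose size matches the kernel.
            block = [data[r + i][c + j] for i in range(k) for j in range(k)]
            acc, macs = sparse_mac(block, values, index)
            out_row.append(acc)
            total_macs += macs
        out.append(out_row)
    return out, total_macs

data = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
kernel = [[0, 2, 0],
          [1, 0, 3],
          [0, 0, 0]]
output, mac_count = conv2d_sparse(data, kernel)
```

For this 3×3 kernel with three non-zero weights, the sketch performs 12 multiply-accumulate steps over the four output positions, where a dense loop would perform 36.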

Description

Technical field

[0001] The invention relates to the field of artificial intelligence, and in particular to a neural network accelerator, a convolution operation implementation method, a device and a storage medium.

Background

[0002] In recent years, artificial intelligence technology has developed rapidly all over the world. The industry has invested heavily in artificial intelligence research and achieved remarkable results, especially in image detection and recognition and in language recognition, where the recognition rate of artificial intelligence has surpassed that of humans. Neural network processing is a key technology for realizing artificial intelligence. However, as neural networks have continued to grow, a typical network now has dozens or even hundreds of layers and hundreds of millions of connections between neurons. The performance indicators are cons...


Application Information


IPC(8): G06N3/063, G06F17/15

CPC: G06F17/15, G06N3/063

Inventor: 余金清, 闫盛男, 汪立林, 张鹤

Owner ZTE CORP
