
Parallel processing method for multi-input multi-output matrix convolution

A parallel processing method for multi-input multi-output matrix convolution, applied to neural learning methods, neural architectures, biological neural network models, and related fields. It addresses problems such as the small scale of convolution kernels and the difficulty of exploiting the computing advantages of high-performance hardware, achieving high-performance computation, easy operation, and improved computational efficiency.

Active Publication Date: 2018-06-26
NAT UNIV OF DEFENSE TECH
Cites: 6 · Cited by: 18

AI Technical Summary

Problems solved by technology

[0005] Matrix convolution is one of the core modules of convolutional neural network models. It is both computationally intensive and memory-intensive, and the convolution kernels involved are generally small. Therefore, without a well-designed calculation method, it is difficult to realize the full computing advantage even of high-performance computing equipment.
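For reference, the multi-input multi-output matrix convolution discussed throughout this record can be written as a direct (unoptimized) computation. The following NumPy sketch is illustrative only and not part of the patent; the function name and array layouts are assumptions:

```python
import numpy as np

def conv_mimo(inputs, kernels, s=1):
    """Direct multi-input multi-output matrix convolution (reference sketch).

    inputs  : (M, H, W)     -- M input feature maps
    kernels : (P, M, k, k)  -- P convolution kernels, one per output map
    s       : moving step size (stride)
    Returns (P, H_out, W_out) output feature maps.
    """
    M, H, W = inputs.shape
    P, M2, k, _ = kernels.shape
    assert M == M2, "kernel channel count must match input map count"
    H_out = (H - k) // s + 1
    W_out = (W - k) // s + 1
    out = np.zeros((P, H_out, W_out))
    for p in range(P):                 # one output feature map per kernel
        for i in range(H_out):
            for j in range(W_out):
                patch = inputs[:, i*s:i*s+k, j*s:j*s+k]
                out[p, i, j] = np.sum(patch * kernels[p])
    return out
```

Because the kernel size k is small, the inner loops do little work per iteration, which is exactly the inefficiency the patent's parallel scheme targets.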



Examples


Embodiment Construction

[0029] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0030] As shown in Figure 7, the parallel processing method for multi-input multi-output matrix convolution of the present invention comprises the following steps:

[0031] S1: Determine the optimal calculation scheme for the output feature maps according to the number N of vector processing elements (VPEs) of the vector processor, the number M of input feature maps, the number P of convolution kernels, the convolution kernel size k, and the moving step size s;
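Step S1 can be sketched as a small planning helper. The function below is a hypothetical illustration of the kind of scheme selection described (the patent does not give this formula explicitly); the name `plan_output_scheme` and the pass-count heuristic are assumptions:

```python
import math

def plan_output_scheme(n_vpe, M, P, k, H, W, s):
    """Hypothetical sketch of step S1: decide how many kernels (output
    feature maps) to compute per pass given N vector processing elements.
    H, W are the input feature-map height and width (assumed parameters).
    """
    n = min(n_vpe, P)            # kernels processed in parallel (N <= P)
    passes = math.ceil(P / n)    # sweeps over the input to cover all P maps
    H_out = (H - k) // s + 1     # output feature-map height
    W_out = (W - k) // s + 1     # output feature-map width
    return n, passes, (H_out, W_out)
```

For example, with 16 VPEs, 32 kernels of size 3, and 8x8 inputs at stride 1, this plan computes 16 output maps per pass over two passes.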

[0032] S2: Store the M input feature maps in the external storage DDR in turn; splice the N input convolution kernels along the third dimension, row by row, and transfer the spliced convolution kernel matrix into the vector memory (AM) of the vector processor, where N <= P;
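One plausible reading of the splicing in S2 is that each of the N kernels becomes one column of a matrix, with its weights flattened channel by channel and row by row, so that a single matrix row can be handed to all N VPE lanes at once. This layout is an assumption, not stated verbatim in the record:

```python
import numpy as np

def splice_kernels(kernels):
    """Plausible sketch of step S2's kernel splicing.

    kernels : (N, M, k, k) -- N kernels with M channels each
    Returns a (M*k*k, N) matrix whose column p is kernel p flattened in
    (channel, row, column) order, matching the broadcast order of input
    elements in the later steps.
    """
    N, M, k, _ = kernels.shape
    return kernels.reshape(N, M * k * k).T
```

With this layout, row r of the spliced matrix holds the r-th weight of every kernel, which is what a broadcast multiply-accumulate over N lanes consumes per step.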

[0033] S3: Load the first element of input feature map 1 and broadcast it to the vector register; meanwhile, load the elements of the first row of the convolution kernels in the AM into the vector register...


PUM

No PUM

Abstract

The invention discloses a parallel processing method for multi-input multi-output matrix convolution. The method comprises the following steps: S1, an optimal calculation scheme for the output feature maps is determined according to the number N of vector processing elements (VPEs) of a vector processor and other parameters; S2, M input feature maps are stored in turn in an external storage DDR, and N input convolution kernels are spliced row by row along the third dimension; S3, the first element of the first input feature map is loaded and broadcast to a vector register, and meanwhile the elements of the first row of a convolution kernel in the AM are loaded into the vector register; S4, accumulation is performed k×k times to complete the calculation for the first input feature map, while the second input feature map is loaded; S5, the preceding steps are repeated until the first elements of the N output feature maps have been calculated; S6, all elements of the N output feature maps are calculated according to the moving step size; S7, the steps are repeated until the whole process is complete. The method is easy to implement and convenient to operate, improves the parallelism of the vector processor, and raises processor operating efficiency.
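The broadcast-and-accumulate loop of steps S3 through S7 can be simulated in software to check that it reproduces the direct convolution. In the sketch below, a NumPy vector stands in for the N VPE lanes; each input element is broadcast to all lanes and multiply-accumulated against that element's weight in every kernel. The loop structure is an interpretation of the abstract, not the patent's literal implementation:

```python
import numpy as np

def conv_broadcast_mac(inputs, kernels, s=1):
    """Sketch of the S3-S7 scheme: each input element is broadcast to all
    N 'VPE' lanes, and each lane multiply-accumulates with its own kernel's
    weight, so N output feature maps are built element by element in parallel.

    inputs  : (M, H, W)     -- M input feature maps
    kernels : (N, M, k, k)  -- N kernels, one per output feature map
    """
    M, H, W = inputs.shape
    N, _, k, _ = kernels.shape
    W_mat = kernels.reshape(N, M * k * k).T   # spliced kernel matrix (S2)
    H_out = (H - k) // s + 1
    W_out = (W - k) // s + 1
    out = np.zeros((N, H_out, W_out))
    for i in range(H_out):
        for j in range(W_out):                # S6: slide window by step s
            acc = np.zeros(N)                 # vector accumulator, one lane per map
            row = 0
            for m in range(M):                # S4/S5: one input map after another
                for r in range(k):
                    for c in range(k):        # S3: broadcast one input element
                        x = inputs[m, i*s + r, j*s + c]
                        acc += x * W_mat[row]  # N MACs in parallel
                        row += 1
            out[:, i, j] = acc
    return out
```

Each pass through the inner loops performs M·k·k broadcast multiply-accumulates and yields one element of all N output feature maps simultaneously, which is where the claimed parallelism comes from.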

Description

Technical Field

[0001] The invention mainly relates to the fields of artificial intelligence, machine learning, and convolutional neural networks, and in particular to a parallel processing method for multi-input multi-output matrix convolution.

Background Technique

[0002] With the rise of deep learning, target recognition based on convolutional neural networks has made breakthroughs and is widely used in image recognition, speech recognition, natural language processing, and other fields. Matrix convolution is a computation-intensive and memory-intensive operation, and the matrix convolution operations in a convolutional neural network model often account for more than 85% of the model's total computation, so how to accelerate matrix convolution is an important and difficult point of current research.

[0003] For computation- and memory-intensive matrix convolution operations, the curren...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/045
Inventor: 郭阳, 张军阳, 杨超, 田希, 扈啸, 李斌
Owner: NAT UNIV OF DEFENSE TECH