Multi-sample multi-channel convolutional neural network Same convolution vectorization implementation method

A vectorized implementation method for Same convolution in multi-sample, multi-channel convolutional neural networks. It addresses problems such as the uncertain size of the third dimension, wasted storage bandwidth, and a mismatch with the number of processing units, and it reduces the volume of data transferred, the bandwidth required, and the transfer time.

Active Publication Date: 2020-02-18
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

[0005] (1) It requires at least double the memory overhead;
[0006] (2) The storage locations of the zero elements are discontiguous, so padding with zeros incurs a high operation overhead;
[0007] (3) Copying the original image data costs considerable time.
[0009] Given the architectural characteristics of vector processors, various vectorized implementations of convolution have been proposed. Examples include Chinese patent application 201810687639.X, which discloses a vectorization method for convolutional neural network operations on vector processors; patent application 201810689646.3, which discloses a GPDSP-oriented multi-core parallel computing method for convolutional neural networks; and patent application 201710201589.5, which discloses a vectorized implementation of two-dimensional matrix convolution for vector processors. Such schemes load the weight data into the vector array memory (AM), and the input image feature data is loaded i...



Embodiment Construction

[0057] The present invention will be further described below in conjunction with the accompanying drawings and specific preferred embodiments, but the protection scope of the present invention is not limited thereby.

[0058] Let the number of cores of the target vector processor be q, the number of vector processing units (VPEs) per core be p, the total number of samples in the data set be M, and the mini-batch size be MB, where MB = q*p and M = num*MB for a positive integer num. The two-dimensional input image of the current layer of the convolutional neural network has size preH*preW, where preH is the image height and preW is the image width, and the number of channels is preC. The convolution kernel size is kernelH*kernelW*preC, where kernelH and kernelW are both odd, the number of convolution kernels is nextC, and the convolution stride is 1.
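The relations MB = q*p and M = num*MB can be checked with a small helper. This is only an illustrative sketch: the function names `minibatch_plan` and `core_columns` are hypothetical, and the idea that each core takes p consecutive sample columns of a mini-batch (one sample per VPE) is an assumption that the text above suggests but does not state.

```python
def minibatch_plan(q, p, M):
    """Return (MB, num, batches) for q cores, p VPEs per core, and M samples.

    batches is a list of half-open sample ranges (start, stop), one per mini-batch.
    """
    if q <= 0 or p <= 0 or M <= 0:
        raise ValueError("q, p and M must be positive")
    MB = q * p                      # mini-batch size, as defined in [0058]
    if M % MB != 0:
        raise ValueError("M must be a positive multiple of MB = q*p")
    num = M // MB                   # number of mini-batches
    batches = [(i * MB, (i + 1) * MB) for i in range(num)]
    return MB, num, batches


def core_columns(q, p):
    """ASSUMPTION (not from the source): core k handles sample columns
    [k*p, (k+1)*p) of a mini-batch, i.e. one sample per VPE."""
    return [(k * p, (k + 1) * p) for k in range(q)]


if __name__ == "__main__":
    MB, num, batches = minibatch_plan(q=4, p=16, M=256)
    print(MB, num, batches)
    print(core_columns(4, 16))
```

With q = 4 and p = 16, for example, a data set of 256 samples splits into four mini-batches of 64 samples each.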

[0059] In order to keep the output image after the convolution operation the same size as the input image, se...
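For odd kernel sizes and stride 1, the standard way to keep the output the same size as the input is to pad (kernel - 1) / 2 zeros on each side. The sketch below illustrates that general rule only; the paragraph above is truncated, so this is not a reconstruction of its exact wording, and the function names are hypothetical.

```python
def same_padding(kernelH, kernelW):
    """Zero padding per side (padH, padW) for Same convolution with stride 1."""
    if kernelH % 2 == 0 or kernelW % 2 == 0:
        raise ValueError("kernelH and kernelW must both be odd")
    return (kernelH - 1) // 2, (kernelW - 1) // 2


def output_size(preH, preW, kernelH, kernelW, padH, padW, stride=1):
    """Standard convolution output-size formula."""
    outH = (preH + 2 * padH - kernelH) // stride + 1
    outW = (preW + 2 * padW - kernelW) // stride + 1
    return outH, outW


if __name__ == "__main__":
    padH, padW = same_padding(3, 5)
    print(padH, padW)
    print(output_size(28, 32, 3, 5, padH, padW))
```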


Abstract

The invention discloses a vectorized implementation method for Same convolution in multi-sample, multi-channel convolutional neural networks. The method comprises: step 1, storing the input feature data set in sample-dimension-first order and storing the convolution kernel data in kernel-count-dimension-first order; step 2, dividing the data matrix of the input feature data set into multiple matrix blocks by columns; step 3, on each pass, transferring the convolution kernel data matrix to the SM of each core and transferring a sub-matrix formed by extracting rows from the input feature data matrix to the AM of each core, then performing vectorized and parallelized matrix multiplication, with zero padding handled during the computation; step 4, storing the computed output feature matrix in off-chip memory; and step 5, repeating steps 3 to 4 until all computations are complete. The invention achieves Same convolution vectorization with the advantages of simple implementation, high execution efficiency and precision, and low bandwidth requirements.
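The data flow in the abstract can be illustrated with a plain-Python reference sketch. It is not the patented vector-processor implementation: the row and column index formulas, the function name `same_conv_batched`, and the use of nested lists in place of SM/AM transfers are all assumptions made for illustration. What it does show is the sample-first layout (each input row holds one feature element for every sample in the block), the kernel-count-first layout (each kernel row holds one weight for every kernel), the row-extraction-plus-matrix-multiply step, and zero padding handled implicitly by skipping out-of-range rows instead of copying into a padded buffer. Like most CNN frameworks, it computes cross-correlation (no kernel flip).

```python
def same_conv_batched(x, w, preH, preW, preC, kernelH, kernelW, nextC):
    """Same convolution (stride 1) over a block of samples.

    x: preH*preW*preC rows, each a list of MB sample values;
       assumed row index = (h*preW + w)*preC + c.
    w: kernelH*kernelW*preC rows, each a list of nextC kernel values;
       assumed row index = (kh*kernelW + kw)*preC + c.
    Returns preH*preW*nextC rows of MB values; row = (h*preW + w)*nextC + n.
    """
    mb = len(x[0])
    padH, padW = (kernelH - 1) // 2, (kernelW - 1) // 2
    out = []
    for h in range(preH):
        for col in range(preW):
            acc = [[0.0] * mb for _ in range(nextC)]
            for kh in range(kernelH):
                ih = h + kh - padH
                if not 0 <= ih < preH:
                    continue  # row lies in the zero padding: contributes nothing
                for kw in range(kernelW):
                    iw = col + kw - padW
                    if not 0 <= iw < preW:
                        continue  # column lies in the zero padding
                    for c in range(preC):
                        xrow = x[(ih * preW + iw) * preC + c]      # extracted input row
                        wrow = w[(kh * kernelW + kw) * preC + c]   # weights for all kernels
                        for n in range(nextC):
                            coef = wrow[n]          # scalar weight
                            a = acc[n]
                            for s in range(mb):     # one lane per sample
                                a[s] += coef * xrow[s]
            out.extend(acc)
    return out


if __name__ == "__main__":
    # 3x3 single-channel input, two samples (all ones and all twos), one 3x3 kernel of ones.
    x = [[1.0, 2.0] for _ in range(9)]
    w = [[1.0] for _ in range(9)]
    y = same_conv_batched(x, w, 3, 3, 1, 3, 3, 1)
    print(y[0], y[1], y[4])
```

The innermost loop runs over the sample index, which is the dimension a vector processor would map onto its VPE lanes under this layout; the scalar weight `coef` is broadcast across all samples.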

Description

Technical field

[0001] The invention relates to the technical field of vector processors, in particular to a multi-sample multi-channel convolutional neural network Same convolution vectorization implementation method.

Background technique

[0002] In recent years, deep learning models based on deep convolutional neural networks have made remarkable achievements in image recognition and classification, target detection, video analysis, and other areas, alongside the rapid development of related technologies such as data processing and processors. Convolutional neural networks (CNNs) are a class of feedforward neural networks that include convolution calculations and have a deep structure; they are among the representative algorithms of deep learning. The input layer of a convolutional neural network can process multi-dimensional data. Since convolutional neural networks are most widely used in the field of computer vision, when designing the convolutional neural...


Application Information

IPC(8): G06F17/16, G06F17/15, G06N3/04, G06N3/063
CPC: G06F17/16, G06F17/15, G06N3/063, G06N3/045, Y02D10/00
Inventor 刘仲, 陈小文, 陈海燕, 田希, 鲁建壮, 王耀华, 吴立, 马媛, 曹坤
Owner NAT UNIV OF DEFENSE TECH