GPDSP-oriented multi-core parallel computing method for convolutional neural networks

A convolutional neural network parallel computing technique, applied in the field of deep learning, that addresses the problem of accelerating convolutional neural network computation on GPDSP and achieves efficient parallel computing by exploiting the platform's parallel computing power and high-bandwidth vector data loading capabilities.

Active Publication Date: 2018-11-30

NAT UNIV OF DEFENSE TECH

Cites: 12. Cited by: 24.

Problems solved by technology

The powerful computing capability of GPDSP makes it a promising platform for accelerating convolutional neural network computation. However, GPDSP is a heterogeneous multi-core processor that includes CPU cores and DSP cores, with a multi-level storage architecture comprising register files, scalar memory, on-chip vector array memory, an on-chip shared memory array, and off-chip DDR memory. Existing convolutional neural network computing methods therefore cannot be applied to GPDSP directly.

To implement convolutional neural network computation on GPDSP, open problems remain, such as how to map the computation onto the GPDSP's CPU core and its multiple DSP cores with 64-bit vector processing arrays, and how to exploit the multi-level parallelism of GPDSP. No effective GPDSP-based solution for convolutional neural network computation currently exists, so a GPDSP-oriented convolutional neural network multi-core parallel computing method is urgently needed, one that takes advantage of GPDSP's structural features and multi-level parallelism to improve the computational efficiency of convolutional neural networks.

Embodiment Construction

[0041] The present invention will be further described below in conjunction with the accompanying drawings and specific preferred embodiments, but the protection scope of the present invention is not limited thereby.

[0042] The simplified memory access structure model of the GPDSP adopted in this embodiment is shown in Figure 1. The system includes a CPU core unit and a DSP core unit. The DSP core unit includes several 64-bit vector processing array computing units together with dedicated on-chip scalar memory and vector array memory. The system also includes on-chip shared memory, shared by the CPU core unit and the DSP core unit, and large-capacity off-chip DDR memory. In other words, the GPDSP contains multiple DSP cores with 64-bit vector processing arrays, which can process data in parallel through SIMD.
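The two levels of parallelism described in [0042] (several DSP cores, each processing data through SIMD) can be illustrated with a small Python sketch. This is only a model under stated assumptions: the core count, the lane count, and the names `split_among_cores`, `simd_mac`, and `parallel_mac` are hypothetical and do not come from the patent, and the "cores" run one after another rather than truly in parallel.

```python
from typing import List

NUM_DSP_CORES = 4   # assumption: the source does not state the core count
LANES_PER_CORE = 8  # assumption: SIMD width picked only for illustration


def split_among_cores(n_items: int, n_cores: int) -> List[range]:
    """Divide n_items into n_cores contiguous, near-equal index ranges."""
    base, extra = divmod(n_items, n_cores)
    ranges, start = [], 0
    for core in range(n_cores):
        size = base + (1 if core < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def simd_mac(acc: List[float], a: List[float], b: List[float], lanes: int) -> None:
    """Multiply-accumulate in place, `lanes` elements per modelled step."""
    for i in range(0, len(acc), lanes):
        # One outer iteration models one vector instruction; the inner
        # loop stands in for the lanes that would execute in lockstep.
        for lane in range(i, min(i + lanes, len(acc))):
            acc[lane] += a[lane] * b[lane]


def parallel_mac(a: List[float], b: List[float]) -> List[float]:
    """Element-wise a*b, with the work divided across modelled DSP cores."""
    acc = [0.0] * len(a)
    for core_range in split_among_cores(len(a), NUM_DSP_CORES):
        # Each core owns one contiguous slice of the data.
        sl = slice(core_range.start, core_range.stop)
        part = acc[sl]
        simd_mac(part, a[sl], b[sl], LANES_PER_CORE)
        acc[sl] = part
    return acc
```

Contiguous slices are used here so that each modelled core touches one unbroken block of memory; the patent text available in this section does not say how the work is actually divided.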

[0043] As shown in Figure 2, the convolutional neural network multi-core parallel computing method of the present embodiment, oriented to GP...

Abstract

The invention discloses a GPDSP-oriented multi-core parallel computing method for convolutional neural networks. The method comprises the following steps: S1, the CPU core constructs two data buffers and one weight data buffer in off-chip memory; S2, the CPU core combines a designated number of convolution kernels and stores the result in the weight data buffer; S3, the CPU core reads a designated number of images to be computed, combines them, and transfers the data to a free data buffer; S4, if the DSP cores are idle and the data in a data buffer is ready, the buffer address is passed to the DSP cores; S5, all DSP cores perform the convolutional neural network computation; S6, the result of the current round of computation is output; S7, steps S3 to S6 are repeated until all computations are complete. The method fully exploits the performance and multi-level parallelism of the CPU core and the DSP cores in the GPDSP, and achieves efficient convolutional neural network computation.
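Steps S1 to S7 describe a double-buffered producer/consumer scheme: the CPU core fills one data buffer while the DSP cores compute on the other. The following Python sketch models that flow with threads and queues. It is an illustration under assumptions, not the patented implementation: `run_pipeline`, `dot`, the batch size parameter, and the per-image work split are all hypothetical, and a plain dot product stands in for the real convolution.

```python
import queue
import threading


def dot(a, b):
    """Stand-in for the real convolution kernel: a plain dot product."""
    return sum(x * y for x, y in zip(a, b))


def run_pipeline(images, kernels, batch_size, num_dsp_cores=2):
    # S1: two data buffers plus one weight buffer (plain lists stand in
    # for regions of off-chip memory).
    data_buffers = [None, None]
    free_ids = queue.Queue()
    ready = queue.Queue()
    for buf_id in (0, 1):
        free_ids.put(buf_id)

    # S2: merge the designated kernels into the weight buffer.
    weight_buffer = [list(k) for k in kernels]
    results = {}

    def cpu_core():
        for batch_no, start in enumerate(range(0, len(images), batch_size)):
            buf_id = free_ids.get()  # S3: wait for a free data buffer
            data_buffers[buf_id] = images[start:start + batch_size]
            ready.put((batch_no, buf_id))  # S4: pass the buffer "address"
        ready.put(None)  # signal that no more work is coming

    def dsp_cores():
        while True:
            item = ready.get()  # blocks while the DSP side is idle
            if item is None:
                break
            batch_no, buf_id = item
            batch = data_buffers[buf_id]
            out = [None] * len(batch)

            def work(core):
                # S5: each core handles every num_dsp_cores-th image.
                for i in range(core, len(batch), num_dsp_cores):
                    out[i] = [dot(batch[i], k) for k in weight_buffer]

            workers = [threading.Thread(target=work, args=(c,))
                       for c in range(num_dsp_cores)]
            for w in workers:
                w.start()
            for w in workers:
                w.join()
            results[batch_no] = out  # S6: output this round's result
            free_ids.put(buf_id)     # the buffer may now be refilled

    threads = [threading.Thread(target=cpu_core),
               threading.Thread(target=dsp_cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # S7: S3-S6 repeat until every batch has been processed
    return [row for n in sorted(results) for row in results[n]]
```

With two buffers, the modelled CPU core can load the next batch while the DSP side is still computing on the previous one, which is the overlap the two data buffers in S1 make possible.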

Description

Technical field

[0001] The present invention relates to the technical field of deep learning, in particular to a convolutional neural network multi-core parallel computing method oriented to GPDSP (General-Purpose Digital Signal Processor).

Background

[0002] At present, deep learning models based on Convolutional Neural Networks (CNN) have made remarkable achievements in areas such as image recognition and classification, machine translation, automatic text processing, speech recognition, automatic driving, and video analysis, and have become a research hotspot in various fields. A convolutional neural network is a deep feed-forward neural network, usually composed of several alternating convolutional layers, activation layers, and pooling layers. The convolutional layer performs feature extraction through the convolution of the convolution kernel with the input features, so as to learn features at each level. In the calculation of c...
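The layer types named in [0002] can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a single-channel image, a "valid" (no padding) convolution with stride 1, ReLU as the activation, and 2x2 max pooling with stride 2; none of these choices are specified in the source, and the function names are hypothetical. As is common in CNN practice, the "convolution" below does not flip the kernel, so it is strictly a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Activation layer: replace negative values with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Pooling layer: keep the maximum of each 2x2 block (stride 2)."""
    rows = len(feature_map) // 2
    cols = len(feature_map[0]) // 2
    return [
        [
            max(feature_map[2 * r][2 * c], feature_map[2 * r][2 * c + 1],
                feature_map[2 * r + 1][2 * c], feature_map[2 * r + 1][2 * c + 1])
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```

Chaining the three calls, `max_pool_2x2(relu(conv2d_valid(image, kernel)))`, gives one convolution, activation, and pooling stage of the kind the paragraph describes.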

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06F15/80, G06N3/04, G06N3/063

CPC: G06F15/8069, G06N3/063, G06N3/045

Inventors: 刘仲, 郭阳, 扈啸, 田希, 陈海燕, 陈跃跃, 孙永节, 王丽萍

Owner: NAT UNIV OF DEFENSE TECH
