Convolution operation optimization method and system for efficiently running deep learning tasks

A deep learning and convolution operation technology, applied in neural learning methods, computing, image data processing and related fields. It addresses the problems of low CPU operating frequency, the inability to obtain deep neural network output, and high cost.

Active Publication Date: 2020-07-07
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a convolution operation optimization method and system for efficiently running deep learning tasks, which solves the proble



Examples


Embodiment Construction

[0039] The present invention is described in detail below in conjunction with the accompanying drawings and specific embodiments. The embodiments serve to explain the present invention and do not limit it.

[0040] As shown in Figure 1, a convolution operation optimization method for efficiently running deep learning tasks is based on an embedded platform and specifically includes the following steps:

[0041] Step S1: input the picture parameters and the convolution kernel parameters into the memory of the embedded platform, and divide them into picture sub-tensors and convolution kernel sub-tensors whose sizes match the capacity of the high-speed memory, wherein the high-speed memory includes the L1 cache, L2 cache and L3 cache;

[0042] Step S2, copying the sub-tensor ob...
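The partitioning in Step S1 can be illustrated with a minimal sketch. The sizing rule below is an assumption for illustration only: the text says the sub-tensor size matches the high-speed memory capacity but does not give a formula. The function names `tile_edge` and `split_into_tiles`, the 4-byte element size, and the choice of keeping three tiles resident at once are all hypothetical.

```python
import math


def tile_edge(cache_bytes, elem_bytes=4, resident_tiles=3):
    """Largest square tile edge such that `resident_tiles` tiles fit in the cache.

    Assumption: three tiles (picture, kernel, output) are resident together;
    the source does not state the actual sizing rule.
    """
    max_elems = cache_bytes // (elem_bytes * resident_tiles)
    return max(1, math.isqrt(max_elems))


def split_into_tiles(rows, cols, edge):
    """Return (row_start, row_end, col_start, col_end) ranges covering a rows x cols matrix."""
    tiles = []
    for r in range(0, rows, edge):
        for c in range(0, cols, edge):
            # Edge tiles are clipped so they never run past the matrix bounds.
            tiles.append((r, min(r + edge, rows), c, min(c + edge, cols)))
    return tiles


if __name__ == "__main__":
    edge = tile_edge(32 * 1024)  # a hypothetical 32 KiB L1 data cache
    print(edge, split_into_tiles(100, 60, edge))
```

Deriving the tile edge from a hardware parameter rather than hard-coding it reflects the idea of adjusting the partitioning per embedded platform; a real implementation would likely also account for cache associativity and line size.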


Abstract

The invention discloses a convolution operation optimization method and system for efficiently running a deep learning task. The method comprises the following steps: obtaining picture parameters and convolution kernel parameters; segmenting the picture parameters and the convolution kernel parameters to obtain picture sub-tensors and convolution kernel sub-tensors; copying the segmented sub-tensors to a high-speed memory; performing the convolution operation on the sub-tensors stored in the L1 cache; and assembling the convolved sub-tensors according to the assembling step of a matrix partitioning algorithm to obtain the final result. Because the matrix and tensor partitioning strategies are adjusted according to the hardware parameters of each embedded platform, more of the operand data is fetched from high-speed memory rather than from low-speed storage throughout the operation, which increases the operation speed. In addition, a suitable assembly-level optimization strategy for the embedded platform lets the operation make better use of the platform's potential, further increasing the operation speed. Finally, because a matrix partitioning strategy is adopted, the implementation cost is lower.
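The flow in the abstract (partition, compute on small blocks, assemble) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: lowering the convolution to a matrix product via an `im2col` unfolding is an assumption the source does not confirm, and the names `im2col`, `blocked_matmul`, `conv2d_blocked` and `conv2d_direct` are hypothetical. The sketch handles a single-channel image with stride 1 and no padding, and computes the cross-correlation form used in deep learning.

```python
def im2col(image, kh, kw):
    """Unfold each kh x kw window of a 2-D image into one row (stride 1, no padding)."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            rows.append([image[i + di][j + dj] for di in range(kh) for dj in range(kw)])
    return rows


def blocked_matmul(a, b, edge):
    """Multiply a (m x k) by b (k x n), one edge x edge block pair at a time."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, edge):
        for j0 in range(0, n, edge):
            for p0 in range(0, k, edge):
                # Partial product of one block pair, accumulated ("assembled")
                # into the matching block of the output matrix.
                for i in range(i0, min(i0 + edge, m)):
                    for p in range(p0, min(p0 + edge, k)):
                        a_ip = a[i][p]
                        for j in range(j0, min(j0 + edge, n)):
                            c[i][j] += a_ip * b[p][j]
    return c


def conv2d_blocked(image, kernel, edge=2):
    """Convolve by unfolding the image and running a blocked matrix product."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    flat_kernel = [[v] for row in kernel for v in row]  # shape (kh*kw) x 1
    out_flat = blocked_matmul(im2col(image, kh, kw), flat_kernel, edge)
    return [[out_flat[r * out_w + c][0] for c in range(out_w)] for r in range(out_h)]


def conv2d_direct(image, kernel):
    """Reference sliding-window implementation used to check the blocked version."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj] for di in range(kh) for dj in range(kw))
            for j in range(len(image[0]) - kw + 1)
        ]
        for i in range(len(image) - kh + 1)
    ]


if __name__ == "__main__":
    img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    ker = [[1, 0], [0, -1]]
    print(conv2d_blocked(img, ker) == conv2d_direct(img, ker))
```

The block edge plays the role of the cache-sized sub-tensor: each inner triple loop touches only one small block of each operand, which is what allows the working set to stay in fast memory on real hardware.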

Description

Technical field

[0001] The invention relates to the technical field of computer performance optimization, in particular to a convolution operation optimization method and system for efficiently running deep learning tasks.

Background technique

[0002] At present, deep learning technology and Internet of Things (IoT) technology are developing rapidly, and technologies that combine the two are emerging, for example using deep learning and IoT technology together for monitoring, for intelligent analysis, and so on. The combination of deep learning technology and IoT technology is clearly becoming increasingly popular.

[0003] However, limited by the computing power of the embedded devices that IoT technology requires, deep learning technology cannot achieve satisfactory results on these devices. The main disadvantages are as follows: 1. The output of the deep n...


Application Information

IPC(8): G06F9/50, G06T7/10, G06N3/04, G06N3/08
CPC: G06F9/5011, G06T7/10, G06N3/08, G06N3/045, Y02D10/00
Inventor: 刘宁, 罗旸泽
Owner: SUN YAT SEN UNIV