Deep learning framework transplanting and optimizing method and system based on target many-core

A deep learning technology for target many-core machines, applied in the field of high-performance computing, that addresses the problem of transplanting (porting) a deep learning framework to such a machine and optimizing it there.

Active Publication Date: 2020-09-15
OCEAN UNIV OF CHINA +1

AI Technical Summary

Problems solved by technology

[0005] The present invention proposes a method and system for transplanting and optimizing a deep learning framework on a target many-core machine, addressing the problem of how to port such a framework to the target many-core architecture and then optimize it.


Examples

Embodiment Construction

[0043] Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings. However, the present invention can be implemented in many different forms and is not limited to the embodiments described here. These embodiments are provided so that the disclosure of the present invention is thorough and complete and fully conveys the scope of the present invention to those skilled in the art. The terms used in the exemplary embodiments shown in the drawings do not limit the present invention. In the drawings, the same units / elements use the same reference signs.

[0044] Unless otherwise specified, the terms (including scientific and technological terms) used herein have the meanings commonly understood by those skilled in the art. In addition, it should be understood that terms defined in commonly used dictionaries should be interpreted as having meanings consistent with the context of their related fields, and should not be understood as idealized or overly forma...


Abstract

The invention relates to a method and system for transplanting and optimizing a deep learning framework on a target many-core machine. The method comprises a transplanting process and an acceleration optimization process. In the transplanting process, the source code of the deep learning framework is transplanted to the target many-core machine, and the framework is modified and compiled according to the compiling instructions of the target many-core machine so that it meets the operating conditions of that machine. In the acceleration optimization process, the framework is used to run a deep-learning-based functional model on the domestic many-core machine, and the target many-core performance analysis tool is used to analyze the code in order to identify and extract hotspot functions. The features and function parameters of each hotspot function are analyzed and tested, and the hotspot function is accelerated using a parallel acceleration library. An optimization strategy is then determined so as to improve the speed-up ratio of the framework while ensuring its correctness. Finally, the compilation files of the deep learning framework are modified and tested according to the current master-slave core parallel code, so as to realize hybrid compilation and operation of that code.
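
The acceleration optimization loop described in the abstract (profile, pick the hotspot, swap in an accelerated version, check correctness, measure the speed-up ratio) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: Python's standard `cProfile` stands in for the target many-core performance analysis tool, the hypothetical `conv_naive` plays the hotspot function, and the hypothetical `conv_fast` stands in for a call into a parallel acceleration library on the slave cores.

```python
import cProfile
import math
import pstats
import time
from operator import mul


def make_inputs(n=5000, k=32):
    """Deterministic toy data (hypothetical; not from the patent)."""
    signal = [((i * 37) % 101) / 101.0 - 0.5 for i in range(n)]
    kernel = [1.0 / (j + 1) for j in range(k)]
    return signal, kernel


def conv_naive(signal, kernel):
    """Hypothetical hotspot: direct 1-D convolution with explicit loops."""
    k = len(kernel)
    out = []
    for i in range(len(signal) - k + 1):
        acc = 0.0
        for j in range(k):
            acc += signal[i + j] * kernel[j]
        out.append(acc)
    return out


def conv_fast(signal, kernel):
    """Stand-in for the accelerated hotspot (assumption: on the target
    machine this would call a parallel acceleration library instead)."""
    k = len(kernel)
    return [sum(map(mul, signal[i:i + k], kernel))
            for i in range(len(signal) - k + 1)]


def forward(signal, kernel, conv):
    """Toy 'functional model': one convolution followed by ReLU."""
    return [max(0.0, x) for x in conv(signal, kernel)]


def find_hotspots(func, *args, top=3):
    """Profile func(*args) and rank functions by their own (internal) time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    ranked = sorted(stats.stats.items(), key=lambda kv: kv[1][2], reverse=True)
    return [(key[2], value[2]) for key, value in ranked[:top]]


def timed(func, *args):
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def compare(baseline, candidate, *args, tol=1e-9):
    """Return (results agree within tol, speed-up ratio baseline/candidate)."""
    expected, t_base = timed(baseline, *args)
    actual, t_new = timed(candidate, *args)
    same = len(expected) == len(actual) and all(
        math.isclose(e, a, rel_tol=tol, abs_tol=tol)
        for e, a in zip(expected, actual))
    return same, t_base / max(t_new, 1e-12)


if __name__ == "__main__":
    signal, kernel = make_inputs()
    print("hotspots:", find_hotspots(forward, signal, kernel, conv_naive))
    ok, speedup = compare(conv_naive, conv_fast, signal, kernel)
    print("results match:", ok, "speed-up ratio:", round(speedup, 2))
```

The results are compared with a tolerance rather than for exact equality because an accelerated kernel may sum floating-point terms in a different order. The measured speed-up depends on the interpreter and machine, so the sketch reports it rather than assuming a value.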

Description

Technical field

[0001] The present invention relates to the technical field of high-performance computing, and more specifically, to a method and system for transplanting and optimizing a deep learning framework based on target many cores.

Background technique

[0002] With the rapid development of artificial intelligence, deep learning, as a powerful technical support of artificial intelligence, has been widely used in handwritten number recognition, speech recognition, and image understanding. With the rapid development of data and hardware equipment, convolutional neural networks have also gone from the initial 5 layers and 6 layers to the 152-layer ResidualNet proposed by MSRA, and even deeper layers. And as humans have higher and higher requirements for information processing capabilities, ordinary single-core or multi-core processors can no longer meet the demand for massive calculations in deep learning.

[0003] At present, neural networks are all relying on graphics calcu...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06N3/063, G06N3/08, G06F8/41
CPC: G06N3/063, G06N3/08, G06F8/41, Y02D10/00
Inventor: 魏志强, 孙文杰, 杨永全
Owner: OCEAN UNIV OF CHINA