Deep learning-oriented data processing method for GPU parallel computing

A deep learning and parallel computing technique, applied to neural network learning methods, electrical digital data processing, and computing, that addresses slow training speed and achieves faster execution, effective batch image processing, and faster training.

Inactive Publication Date: 2020-03-31
HARBIN ENG UNIV
2 Cites, 14 Cited by

AI Technical Summary

Problems solved by technology

Complex network models pose great challenges for GPU processors. Especially with large samples, large-scale parameters, and deep structures, training becomes slower.




Embodiment Construction

[0056] The present invention mainly comprises the following contents:

[0057] 1. Model the calculation graph

[0058] (1) Calculation graph

[0059] Construct a directed graph G = (V, E, λ, τ) for the input data, where V is the set of vertices in G; E ⊆ V × V is the set of edges in G; λ: V → (O, Bool) is a function that maps each vertex to an operation o (o ∈ O) and a Boolean indicating whether the operation is parameterized; and τ: E → (D, ACT) is a function that maps each edge to a data type D and an action in ACT, where ACT = {"read", "update", "control"}.
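
The definition in [0059] can be illustrated with a minimal Python sketch. The class name `ComputationGraph`, its methods, and the example vertex and operation names ("w", "x", "matmul", "Variable", and so on) are hypothetical and chosen only for illustration; the source does not prescribe any particular data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

# ACT from the definition: the allowed actions on an edge.
ACTIONS = {"read", "update", "control"}


@dataclass
class ComputationGraph:
    # lambda: vertex -> (operation, is_parameterized)
    ops: Dict[str, Tuple[str, bool]] = field(default_factory=dict)
    # tau: edge (u, v) -> (data type, action)
    edges: Dict[Tuple[str, str], Tuple[str, str]] = field(default_factory=dict)

    def add_vertex(self, v: str, op: str, parameterized: bool = False) -> None:
        self.ops[v] = (op, parameterized)

    def add_edge(self, u: str, v: str, dtype: str, action: str) -> None:
        # E is a subset of V x V, so both endpoints must already exist.
        if u not in self.ops or v not in self.ops:
            raise KeyError("both endpoints must be vertices of the graph")
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        self.edges[(u, v)] = (dtype, action)

    @property
    def vertices(self) -> Set[str]:
        return set(self.ops)


# Hypothetical example: a parameterized weight and an input feed a matmul.
g = ComputationGraph()
g.add_vertex("w", "Variable", parameterized=True)
g.add_vertex("x", "Placeholder")
g.add_vertex("matmul", "MatMul")
g.add_edge("w", "matmul", "float32", "read")
g.add_edge("x", "matmul", "float32", "read")
```

Storing λ and τ as dictionaries keeps the sketch close to the formal definition: looking up a vertex returns its (operation, Boolean) pair, and looking up an edge returns its (data type, action) pair.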

[0060] (2) Topological sorting

[0061] Given a computational graph G = (V, E, λ, τ), let N be the number of vertices in the graph. A topological sort is a mapping from vertices to integers, γ: V → {0, 1, ..., N-1}, satisfying

[0062] ·

[0063] ·

[0064] The topological sort represents the execution order of the operations in the graph. Given two operations u and v, if γ(u) < γ(v), u is executed before v. If γ(...
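
As a rough illustration of how such a mapping γ might be computed, the sketch below uses Kahn's algorithm, a standard technique that is not named in the source. It assumes only the usual property that γ(u) < γ(v) for every edge (u, v); the two conditions listed in [0062] and [0063] are not available here, so this is not necessarily the exact ordering the invention defines. The function name `topological_order` and the example vertices are hypothetical.

```python
from collections import deque


def topological_order(vertices, edges):
    """Return gamma: vertex -> integer in {0, ..., N-1} such that
    gamma[u] < gamma[v] for every edge (u, v) (assumed property)."""
    successors = {v: [] for v in vertices}
    indegree = {v: 0 for v in vertices}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Start from the vertices with no incoming edges.
    ready = deque(sorted(v for v in vertices if indegree[v] == 0))
    gamma = {}
    while ready:
        u = ready.popleft()
        gamma[u] = len(gamma)  # next unused integer
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    if len(gamma) != len(vertices):
        raise ValueError("graph contains a cycle; no topological order exists")
    return gamma


# Hypothetical example graph: x and w feed matmul, which feeds relu.
vertices = ["x", "w", "matmul", "relu"]
edges = [("x", "matmul"), ("w", "matmul"), ("matmul", "relu")]
gamma = topological_order(vertices, edges)
```

Executing operations in increasing order of `gamma` then guarantees that every operation runs after all of its inputs have been produced.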


Abstract

The invention provides a deep learning-oriented data processing method for GPU parallel computing. The method first models the input data as a computation graph: (1) operation rules are constructed for the vertices and edges of a directed graph; (2) topological sorting defines the execution order of the operations in the graph; and (3) parameters are updated by training the model, a tensor life cycle is introduced, and the computation graph is rewritten based on the cost of data operations to obtain an optimal operation strategy. The rewriting mainly comprises the following steps: first, the computation graph is modeled based on cost and the operation functions on the CPU are redefined; next, swap-out operations on the same tensor are fused into a single swap-out operation; finally, a traversal order is obtained by applying a tensor replacement strategy based on computation and transfer costs. A computation graph modeling method based on formalized rules is thereby constructed. Combining the extensible neural network with the computation graph increases the model's training speed and effectively improves image processing results.
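
The fusion step mentioned in the abstract can be pictured with a small Python sketch. It assumes that each swap-out request is a (tensor, consumer) pair and that fusing means keeping one swap-out per tensor shared by all of its consumers; the function name `fuse_swap_outs` and the tensor and operation names are hypothetical, and the patent's actual graph-rewriting rules are not reproduced here.

```python
from collections import OrderedDict


def fuse_swap_outs(swap_requests):
    """Merge swap-out requests that target the same tensor.

    swap_requests: iterable of (tensor_name, consumer_op) pairs.
    Returns a list of (tensor_name, [consumer_op, ...]) with one
    entry per tensor, in order of first appearance.
    """
    fused = OrderedDict()
    for tensor, consumer in swap_requests:
        fused.setdefault(tensor, []).append(consumer)
    return list(fused.items())


# Hypothetical requests: conv1_out is needed by two later operations.
requests = [
    ("conv1_out", "grad_conv2"),
    ("conv2_out", "grad_conv3"),
    ("conv1_out", "grad_conv1"),
]
fused = fuse_swap_outs(requests)
```

Under these assumptions, the three requests collapse into two swap-out operations, so `conv1_out` is copied to host memory once rather than twice.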

Description

Technical field

[0001] The invention relates to a data processing method for GPU parallel computing, in particular to a computational graph modeling method based on formalized rules.

Background

[0002] With the rapid development of artificial intelligence, deep neural networks have penetrated many fields of scientific research. At the same time, neural network models are becoming increasingly complex, so GPUs are being applied to deep learning. Compared with CPUs, GPUs show excellent performance in accelerating matrix calculation. For example, the face recognition accuracy of the FaceNet network model developed by Google can reach 99.63%. Optasia, developed by Microsoft, has shown high accuracy and performance in relational queries over metropolitan traffic cameras. However, complex network models pose great challenges for GPU processors. Especially with large samples, large-scale parameters, and deep structures, training becomes slower. Therefor...


Application Information

IPC(8): G06F9/50; G06N3/08
CPC: G06F9/5027; G06N3/08
Inventor: 吴艳霞, 任宁, 李晓松, 张硕, 王旭
Owner: HARBIN ENG UNIV