Heterogeneous-network-aware model partitioning and task placement method in pipelined distributed deep learning

A deep learning model partitioning technology, applied in the field of distributed computing, which solves problems such as the inability of existing methods to adapt to the heterogeneity of GPU cluster networks, achieving the effect of improved training speed.

Active Publication Date: 2019-12-03
SOUTHEAST UNIV

Problems solved by technology

[0006] The present invention mainly aims at the fact that the model division and task placement of distributed deep learning in the current pipe...




Embodiment Construction

[0022] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0023] The present invention is mainly carried out in a GPU cluster environment with a heterogeneous network topology.

[0024] Figure 1 shows the overall architecture, which mainly comprises GPU server nodes connected by a heterogeneous network. The heterogeneity is reflected in two aspects: the heterogeneity of GPU connection methods between and within nodes, and the heterogeneity of connection bandwidth between nodes. Typically, GPUs within a node are connected through PCIe, while nodes are connected through Ethernet / InfiniBand. The CUDA and cuDNN libraries are installed on each GPU server, and the PyTorch framework is used to perform the computation.
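The two-level interconnect described above (PCIe within a node, Ethernet / InfiniBand between nodes) implies a simple communication-cost model. The sketch below is a hypothetical illustration, not code from the patent; the function name `transfer_time` and the bandwidth figures are assumptions (real values would be measured on the cluster):

```python
def transfer_time(nbytes, src, dst, node_of, inter_node_gbps, pcie_gbps=12.0):
    """Estimate seconds to move nbytes between two GPUs in a heterogeneous cluster.

    node_of maps a GPU id to its host node; inter_node_gbps maps an unordered
    node pair to that link's bandwidth in GB/s. GPUs on the same node use PCIe;
    GPUs on different nodes use the (possibly slower, heterogeneous) network link.
    """
    if node_of[src] == node_of[dst]:
        gbps = pcie_gbps                       # intra-node: PCIe
    else:
        pair = frozenset((node_of[src], node_of[dst]))
        gbps = inter_node_gbps[pair]           # inter-node: Ethernet / InfiniBand
    return nbytes / (gbps * 1e9)
```

Such a lookup lets the partitioner cost an inter-stage activation transfer differently depending on whether the cut falls within a node or across the network.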

[0025] Figure 2 shows the overall flow chart. First, for a neural network application, characterization is performed layer by layer, and the cu...
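The abstract names the per-layer indexes this characterization step records: computation time, intermediate-result communication volume, and parameter synchronization volume. A minimal sketch of such a record, under the assumption (the class and field names are hypothetical) that each layer is profiled independently:

```python
from dataclasses import dataclass

@dataclass
class LayerProfile:
    """Per-layer indexes gathered during characterization (names assumed)."""
    name: str
    compute_time: float    # forward + backward time on one GPU, seconds
    activation_bytes: int  # intermediate result sent to the next stage
    param_bytes: int       # parameters to synchronize across replicas

def total_compute(profiles):
    """Sum of per-layer compute times; the quantity a partitioner balances."""
    return sum(p.compute_time for p in profiles)
```

A list of such records, together with the cluster topology, forms the input to the partitioning step.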



Abstract

The invention provides a heterogeneous-network-aware model partitioning and task placement method for pipelined distributed deep learning, which mainly comprises three parts: deep learning model description, model partitioning and task placement, and pipelined distributed training. The method first characterizes the resource requirements of the deep learning application during GPU training, describing indexes such as the computation time, intermediate-result communication volume, and parameter synchronization volume observed during training execution; these serve as the input to model partitioning and task placement. Then, given the model-description indexes and the heterogeneous network connection topology of the GPU cluster, a dynamic programming algorithm based on min-max is designed to perform model partitioning and task placement, with the goal of minimizing the maximum task execution time among the stages after partitioning so as to ensure load balance. Finally, according to the partitioning and placement result, distributed training is performed on the basis of model parallelism by injecting data into the pipeline in a time-shared manner, thereby effectively guaranteeing both training speed and accuracy.
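The min-max objective described here can be sketched as a classic dynamic program: split the layers into contiguous stages so that the largest stage time is minimized. The sketch below is a simplified illustration (the function name is hypothetical, and inter-stage communication cost, which the patent's algorithm also weighs against link bandwidth, is omitted):

```python
def minmax_partition(layer_times, num_stages):
    """Split layers into num_stages contiguous stages, minimizing the max stage time.

    dp[k][i] is the best achievable maximum stage time when the first i layers
    are split into k stages; cut[k][i] records the split point for backtracking.
    """
    n = len(layer_times)
    prefix = [0.0]
    for t in layer_times:
        prefix.append(prefix[-1] + t)

    INF = float("inf")
    dp = [[INF] * (n + 1) for _ in range(num_stages + 1)]
    cut = [[-1] * (n + 1) for _ in range(num_stages + 1)]
    dp[0][0] = 0.0
    for k in range(1, num_stages + 1):
        for i in range(k, n + 1):
            for m in range(k - 1, i):
                # Last stage covers layers m..i-1; its time is the prefix-sum gap.
                cost = max(dp[k - 1][m], prefix[i] - prefix[m])
                if cost < dp[k][i]:
                    dp[k][i] = cost
                    cut[k][i] = m

    # Backtrack to recover the stage boundaries.
    stages, i = [], n
    for k in range(num_stages, 0, -1):
        m = cut[k][i]
        stages.append(list(range(m, i)))
        i = m
    stages.reverse()
    return dp[num_stages][n], stages
```

For example, layers with times [2, 3, 4, 5] split into two stages yield stages {0, 1} and {2, 3} with a bottleneck stage time of 9, which is the minimum achievable maximum.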

Description

technical field

[0001] The invention relates to a heterogeneous-network-aware model partitioning and task placement technology for pipelined distributed deep learning, and belongs to the technical field of distributed computing.

Background technique

[0002] Deep learning is a class of machine learning techniques that uses multiple layers of nonlinear processing for supervised or unsupervised feature extraction and transformation, as well as for pattern analysis and classification. Deep learning generally comprises two processes, training and inference. In the training process, the designed neural network extracts features from a large labeled training set to make predictions; a gradient is then computed from the error between the predicted value and the actual label, and the parameters are updated by gradient descent, iterating until convergence. The inference pro...
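The iterate-until-convergence training loop described in [0002] can be sketched in miniature for a single parameter (a generic illustration of gradient descent, not the patent's distributed procedure; the function name and tolerances are assumptions):

```python
def gradient_descent(grad, theta, lr=0.1, tol=1e-8, max_iter=1000):
    """Repeatedly step against the gradient until the update is negligible.

    grad: function returning the loss gradient at theta.
    Mirrors the training loop: predict, measure error, compute gradient,
    update parameters, repeat until convergence.
    """
    for _ in range(max_iter):
        theta_next = theta - lr * grad(theta)
        if abs(theta_next - theta) < tol:  # convergence test
            return theta_next
        theta = theta_next
    return theta
```

For instance, minimizing the squared error (theta - 3)^2, whose gradient is 2*(theta - 3), drives theta to 3.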


Application Information

IPC(8): G06N3/08, G06N3/04, G06N3/06
CPC: G06N3/082, G06N3/084, G06N3/06, G06N3/045
Inventors: 张竞慧, 詹隽, 金嘉晖, 罗军舟
Owner: SOUTHEAST UNIV