Heterogeneous-network-aware model partitioning and task placement method in pipelined distributed deep learning

A deep learning model partitioning technology, applied in the field of distributed computing, which solves problems such as the inability of existing methods to adapt to the heterogeneity of GPU cluster networks, achieving the effect of improved training speed.

Active Publication Date: 2019-12-03
SOUTHEAST UNIV

Problems solved by technology

[0006] The present invention mainly aims at the fact that the model division and task placement of distributed deep learning in the current pipe...




Embodiment Construction

[0022] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0023] The present invention is mainly carried out in a GPU cluster environment with a heterogeneous network topology.

[0024] Figure 1 shows the overall architecture, which mainly comprises GPU server nodes connected by a heterogeneous network. The heterogeneity is reflected in two aspects: the heterogeneity of GPU connection methods between and within nodes, and the heterogeneity of connection bandwidth between nodes. Typically, GPUs within a node are connected through PCIe, while nodes are connected through Ethernet / InfiniBand. The CUDA and cuDNN libraries are installed on each GPU server, and the PyTorch framework is used to perform the computation.
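The two-level interconnect described above (PCIe within a node, Ethernet / InfiniBand between nodes) implies a simple communication-cost model. The sketch below is a hypothetical illustration, not code from the patent; the function name `transfer_time` and the bandwidth figures are assumptions (real values would be measured on the cluster):

```python
def transfer_time(nbytes, src, dst, node_of, inter_node_gbps, pcie_gbps=12.0):
    """Estimate seconds to move nbytes between two GPUs in a heterogeneous cluster.

    node_of maps a GPU id to its host node; inter_node_gbps maps an unordered
    node pair to that link's bandwidth in GB/s. GPUs on the same node use PCIe;
    GPUs on different nodes use the (possibly slower, heterogeneous) network link.
    """
    if node_of[src] == node_of[dst]:
        gbps = pcie_gbps                       # intra-node: PCIe
    else:
        pair = frozenset((node_of[src], node_of[dst]))
        gbps = inter_node_gbps[pair]           # inter-node: Ethernet / InfiniBand
    return nbytes / (gbps * 1e9)
```

Such a lookup lets the partitioner cost an inter-stage activation transfer differently depending on whether the cut falls within a node or across the network.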

[0025] Figure 2 shows the overall flow chart. First, for a neural network application, characterization is performed layer by layer, and the cu...
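The abstract names the per-layer indexes this characterization step records: computation time, intermediate-result communication volume, and parameter synchronization volume. A minimal sketch of such a record, under the assumption (the class and field names are hypothetical) that each layer is profiled independently:

```python
from dataclasses import dataclass

@dataclass
class LayerProfile:
    """Per-layer indexes gathered during characterization (names assumed)."""
    name: str
    compute_time: float    # forward + backward time on one GPU, seconds
    activation_bytes: int  # intermediate result sent to the next stage
    param_bytes: int       # parameters to synchronize across replicas

def total_compute(profiles):
    """Sum of per-layer compute times; the quantity a partitioner balances."""
    return sum(p.compute_time for p in profiles)
```

A list of such records, together with the cluster topology, forms the input to the partitioning step.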



Abstract

The invention provides a heterogeneous-network-aware model partitioning and task placement method for pipelined distributed deep learning, which mainly comprises three parts: deep learning model description, model partitioning and task placement, and pipelined distributed training. The method first characterizes the resource requirements of the deep learning application during GPU training, describing indexes such as the computation time, intermediate-result communication volume, and parameter synchronization volume observed during training execution; these serve as the input to model partitioning and task placement. Then, given the model-description indexes and the heterogeneous network connection topology of the GPU cluster, a dynamic programming algorithm based on min-max is designed to perform model partitioning and task placement, with the goal of minimizing the maximum task execution time among the stages after partitioning so as to ensure load balance. Finally, according to the partitioning and placement result, distributed training is performed on the basis of model parallelism by injecting data into the pipeline in a time-shared manner, thereby effectively guaranteeing both training speed and accuracy.
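The min-max objective described here can be sketched as a classic dynamic program: split the layers into contiguous stages so that the largest stage time is minimized. The sketch below is a simplified illustration (the function name is hypothetical, and inter-stage communication cost, which the patent's algorithm also weighs against link bandwidth, is omitted):

```python
def minmax_partition(layer_times, num_stages):
    """Split layers into num_stages contiguous stages, minimizing the max stage time.

    dp[k][i] is the best achievable maximum stage time when the first i layers
    are split into k stages; cut[k][i] records the split point for backtracking.
    """
    n = len(layer_times)
    prefix = [0.0]
    for t in layer_times:
        prefix.append(prefix[-1] + t)

    INF = float("inf")
    dp = [[INF] * (n + 1) for _ in range(num_stages + 1)]
    cut = [[-1] * (n + 1) for _ in range(num_stages + 1)]
    dp[0][0] = 0.0
    for k in range(1, num_stages + 1):
        for i in range(k, n + 1):
            for m in range(k - 1, i):
                # Last stage covers layers m..i-1; its time is the prefix-sum gap.
                cost = max(dp[k - 1][m], prefix[i] - prefix[m])
                if cost < dp[k][i]:
                    dp[k][i] = cost
                    cut[k][i] = m

    # Backtrack to recover the stage boundaries.
    stages, i = [], n
    for k in range(num_stages, 0, -1):
        m = cut[k][i]
        stages.append(list(range(m, i)))
        i = m
    stages.reverse()
    return dp[num_stages][n], stages
```

For example, layers with times [2, 3, 4, 5] split into two stages yield stages {0, 1} and {2, 3} with a bottleneck stage time of 9, which is the minimum achievable maximum.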

Description

technical field

[0001] The invention relates to a heterogeneous-network-aware model partitioning and task placement technology for pipelined distributed deep learning, and belongs to the technical field of distributed computing.

Background technique

[0002] Deep learning is a class of machine learning techniques that uses multiple layers of nonlinear processing for supervised or unsupervised feature extraction and transformation, as well as for pattern analysis and classification. Deep learning generally comprises two processes, training and inference. In the training process, the designed neural network extracts features from a large labeled training set to make predictions; a gradient is then computed from the error between the predicted value and the actual label, and the parameters are updated by gradient descent, iterating until convergence. The inference pro...
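The iterate-until-convergence training loop described in [0002] can be sketched in miniature for a single parameter (a generic illustration of gradient descent, not the patent's distributed procedure; the function name and tolerances are assumptions):

```python
def gradient_descent(grad, theta, lr=0.1, tol=1e-8, max_iter=1000):
    """Repeatedly step against the gradient until the update is negligible.

    grad: function returning the loss gradient at theta.
    Mirrors the training loop: predict, measure error, compute gradient,
    update parameters, repeat until convergence.
    """
    for _ in range(max_iter):
        theta_next = theta - lr * grad(theta)
        if abs(theta_next - theta) < tol:  # convergence test
            return theta_next
        theta = theta_next
    return theta
```

For instance, minimizing the squared error (theta - 3)^2, whose gradient is 2*(theta - 3), drives theta to 3.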


Application Information

IPC(8): G06N3/08, G06N3/04, G06N3/06
CPC: G06N3/082, G06N3/084, G06N3/06, G06N3/045
Inventors: 张竞慧, 詹隽, 金嘉晖, 罗军舟
Owner: SOUTHEAST UNIV