Distributed training method based on hybrid parallelism

A distributed training method, applied in the field of deep learning, that addresses problems such as long training time and models too large to train, achieving the effect of improved training speed.

Pending Publication Date: 2021-03-09
西安烽火软件科技有限公司
Cites: 0 · Cited by: 11


Problems solved by technology

[0006] The technical problem to be solved by the present invention is the long training time incurred when training on mass-ID data, together with model parameters exceeding GPU memory so that the model cannot be trained at all. The invention provides a distributed training method based on hybrid parallelism, combining data parallelism and model parallelism across multiple nodes and multiple GPUs to solve these problems.
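To make the hybrid scheme concrete, here is an illustrative pure-Python sketch (not the patent's actual implementation; all function names are hypothetical): the batch is split across workers (data parallelism) while the classification weight matrix is split across workers by ID (model parallelism), so that concatenating per-shard logits reproduces the full output.

```python
# Illustrative sketch of hybrid parallelism for a classifier over many IDs.
# Data parallelism: each worker processes a slice of the batch.
# Model parallelism: each worker stores only the weight rows for its own IDs.

def split_batch(batch, n_workers):
    """Data parallelism: give each worker a contiguous slice of the batch."""
    k = (len(batch) + n_workers - 1) // n_workers
    return [batch[i * k:(i + 1) * k] for i in range(n_workers)]

def split_classes(weights, n_workers):
    """Model parallelism: give each worker the weight rows of a subset of IDs."""
    k = (len(weights) + n_workers - 1) // n_workers
    return [weights[i * k:(i + 1) * k] for i in range(n_workers)]

def partial_logits(features, weight_shard):
    """Each worker computes logits (dot products) only for its class shard."""
    return [[sum(f * w for f, w in zip(feat, row)) for row in weight_shard]
            for feat in features]

# Toy run: 4 samples, 2-dim features, 6 IDs, 2 workers.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
           [2.0, 0.0], [0.0, 2.0], [1.0, -1.0]]  # 6 classes x 2 dims

worker_batches = split_batch(features, 2)   # data parallelism: 2 samples each
shards = split_classes(weights, 2)          # model parallelism: 3 IDs each
# Concatenating each shard's partial logits rebuilds the full 6-way logits:
full = [a + b for a, b in zip(partial_logits(features, shards[0]),
                              partial_logits(features, shards[1]))]
```

In a real cluster the concatenation step would be a collective communication (e.g. an all-gather) rather than a local list join, but the partitioning logic is the same.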




Detailed Description of the Embodiments

[0041] The technical solution of the present invention is described in further detail below in conjunction with the accompanying drawings:

[0042] The invention is further described in detail in conjunction with the accompanying drawings, with at least one preferred embodiment described specifically enough that those skilled in the art can reproduce the invention or utility model from the description alone, without expending creative labor such as exploratory research and experiments.

[0043] A distributed model training method based on hybrid parallelism is introduced here in detail, using as an example a face recognition algorithm for large-scale IDs. It specifically comprises the following steps:

[0044] Step 1. Build a face recognition network model, in which the feature extraction network (backbone) can be the commonly used
[0045] Resnet50 model, and the loss function u...
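The source text truncates here, so the specific loss function is not named in this excerpt. As a hedged, purely illustrative sketch of the two-part structure Step 1 describes (a feature-extraction backbone feeding a per-ID classification head; these toy functions are hypothetical stand-ins, not the patent's code):

```python
# Toy stand-ins for the model of Step 1: backbone + classification head.

def backbone(image, feature_dim=512):
    """Stand-in for ResNet-50: reduce an image (here a flat list of pixel
    values) to a fixed-length embedding. Real backbones are deep CNNs; this
    toy just repeats the mean pixel value feature_dim times."""
    mean = sum(image) / len(image)
    return [mean] * feature_dim

def classification_head(embedding, weights):
    """One logit per identity (dot product with that ID's weight row). With
    tens of millions of IDs, this weight matrix is what outgrows one GPU."""
    return [sum(e * w for e, w in zip(embedding, row)) for row in weights]

embedding = backbone([1, 2, 3], feature_dim=4)   # -> [2.0, 2.0, 2.0, 2.0]
logits = classification_head(embedding, [[1, 0, 0, 0], [0.5, 0.5, 0.5, 0.5]])
```

The backbone's parameter count is fixed regardless of the number of IDs, which is why the patent replicates it (data parallelism) but shards only the classification head (model parallelism).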


Abstract

The invention discloses a distributed model training method based on hybrid parallelism, relates to the technical field of deep neural network model training, and solves the stated problems by combining data parallelism and model parallelism across multiple nodes and multiple GPUs. First, to address long training time, a distributed cluster performs parallel computing over mass data, increasing training speed. Second, to address the classification-layer model occupying too much GPU memory during training, a model-parallel mode segments the classification-layer model into several parts deployed on multiple GPUs across multiple nodes in the cluster; the number of nodes can also be dynamically adjusted according to the size of the classification-layer model, supporting classification-model training under large-ID conditions. By using this hybrid mode of data parallelism and model parallelism with distributed cluster training, the method greatly improves model training efficiency while preserving the original deep-learning training quality.
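The abstract's "dynamically adjust the number of nodes according to the size of the classification layer" amounts to sizing the cluster from the classification layer's memory footprint. A back-of-envelope sketch (all numbers and the function name are illustrative assumptions, not figures from the patent):

```python
def gpus_needed(num_ids, feature_dim, bytes_per_param=4,
                per_gpu_budget_bytes=8 * 2**30):
    """Classification weights hold num_ids x feature_dim parameters; split
    them by ID so that each GPU's shard fits within its memory budget."""
    total_bytes = num_ids * feature_dim * bytes_per_param
    return -(-total_bytes // per_gpu_budget_bytes)  # ceiling division

# 100 million IDs, 512-dim features, fp32 weights ~= 205 GB:
print(gpus_needed(100_000_000, 512))  # -> 24 GPUs at an 8 GiB budget each
```

The same calculation run in reverse explains why single-GPU training fails at large ID counts: the classification weights alone exceed any one device's memory, independent of batch size.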

Description

technical field
[0001] The invention relates to the technical field of deep learning for deep neural network model training, and in particular to a hybrid-parallel distributed training method.
background technique
[0002] The concept of deep learning originates from research on artificial neural networks. It is a systematic science grounded in computer neural-network theory and machine-learning theory. It extracts and represents information through multi-layer neural networks, combining low-level features into more abstract high-level features in order to learn the underlying regularity of data samples.
[0003] With continually rising industrial application requirements, large-scale model design and massive-data model training have become mainstream, which keeps increasing the complexity and cost of deep learning. For example, when training a face recognition model over a large number of IDs, it takes nearly a day to train hundreds of ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/00; G06K9/62; G06N3/04; G06N3/08
CPC: G06N3/08; G06N3/084; G06V40/172; G06N3/045; G06F18/24; G06F18/214
Inventor: 卢康, 王玮, 孙光泽, 杨赟, 王刚, 龙怡霖, 任鹏飞, 丁军峰, 刘慷, 赵智峰
Owner 西安烽火软件科技有限公司