
Distributed parallel training method and device, and readable medium

A distributed parallel training technology applied in the field of deep learning, which addresses problems such as wasted GPU computing power, large communication data volumes, and reduced GPU efficiency.

Status: Inactive | Publication Date: 2020-07-07
INSPUR SUZHOU INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

Model parallelism and data parallelism each have advantages and disadvantages. Model parallelism involves a relatively large volume of communicated data, which hinders GPU efficiency during training. Data parallelism is suitable only for relatively small models; when training large network structures, GPU memory becomes insufficient.
[0004] At present, many frameworks support neural network training, including TensorFlow and PyTorch, but most of them favor data parallelism for training neural networks, and overall training efficiency is generally not particularly high. Taking TensorFlow as an example, when standard TensorFlow was tested on 128 Pascal GPUs, both Inception V3 and ResNet-101 wasted nearly half of the available GPU computing power.
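To make the memory limitation above concrete, the following is a minimal data-parallel sketch using PyTorch's DistributedDataParallel (an illustration chosen by the editor, not code from the patent). Every process holds a complete replica of the model, so the whole network must fit in a single GPU's memory, which is exactly the constraint that rules pure data parallelism out for very large models.

```python
# Illustrative sketch, not from the patent.
# Launch with: torchrun --nproc_per_node=<num_gpus> this_script.py
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group("nccl")
device = dist.get_rank() % torch.cuda.device_count()
torch.cuda.set_device(device)

model = nn.Linear(1024, 1024).to(device)   # a FULL model replica on every GPU
ddp = DDP(model, device_ids=[device])      # DDP synchronizes gradients for us

x = torch.randn(32, 1024, device=device)   # each rank trains its own data shard
loss = ddp(x).sum()
loss.backward()                            # gradient all-reduce happens here
dist.destroy_process_group()
```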




Embodiment Construction

[0019] In order to make the objects, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to specific embodiments and the accompanying drawings.

[0020] It should be noted that the expressions "first" and "second" in the embodiments of the present invention are used to distinguish two entities with the same name, or parameters that are not identical. "First" and "second" are used merely for convenience of expression and should not be construed as limiting the embodiments of the present invention; this will not be repeated in the subsequent embodiments.

[0021] Based on the above purpose, a first aspect of the embodiments of the present invention provides an embodiment of a distributed parallel training method. Figure 1 is a schematic diagram of an embodiment of the distributed parallel training method provided by the present invention.



Abstract

The invention discloses a distributed parallel training method. The method comprises the following steps: allocating GPUs to a plurality of processes; configuring the GPUs under the plurality of processes to use the same optimizer and the same network state parameters; under the multiple processes, assigning different network layers of the training model to different GPUs; and distributing the training set among the plurality of processes so that training proceeds in parallel on the GPUs under those processes. The invention further discloses a computer device and a readable storage medium. Assigning different layers of the training model to different GPUs realizes model-parallel training; distributing the data among different processes realizes data-parallel training; and because each process uses multiple GPUs for training, hybrid model-parallel and data-parallel training is achieved and data processing efficiency is improved.
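As a concrete reading of these steps, here is a minimal PyTorch-style sketch of the hybrid scheme (the patent provides no code; the two-GPUs-per-process split, the layer sizes, and names such as SplitModel are illustrative assumptions by the editor):

```python
# Hypothetical sketch of the hybrid scheme described in the abstract:
# each process owns two GPUs, the model's layers are split across them
# (model parallelism), and gradients are averaged across processes
# (data parallelism). Launch with e.g. torchrun --nproc_per_node=<procs>.
import torch
import torch.distributed as dist
import torch.nn as nn

class SplitModel(nn.Module):
    """Front half of the network on one GPU, back half on another."""
    def __init__(self, dev0, dev1):
        super().__init__()
        self.dev0, self.dev1 = dev0, dev1
        self.front = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to(dev0)
        self.back = nn.Linear(4096, 10).to(dev1)

    def forward(self, x):
        h = self.front(x.to(self.dev0))
        return self.back(h.to(self.dev1))  # hop between the process's GPUs

def main():
    dist.init_process_group("nccl")
    rank = dist.get_rank()
    dev0, dev1 = 2 * rank, 2 * rank + 1   # disjoint GPU pair per process
    model = SplitModel(dev0, dev1)
    # Every process uses the same optimizer type and hyperparameters,
    # matching the "same optimizer and network state parameters" step.
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(100):
        x = torch.randn(32, 1024)                    # this process's shard
        y = torch.randint(0, 10, (32,)).to(dev1)
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        # Data-parallel step: average gradients across all processes.
        for p in model.parameters():
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= dist.get_world_size()
        opt.step()

if __name__ == "__main__":
    main()
```

Launched with one process per GPU pair, each process trains its own data shard through its own two-GPU model slice, and the explicit all-reduce keeps the replicas' weights in step, mirroring the hybrid model-parallel plus data-parallel training the abstract describes.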

Description

Technical field

[0001] The present invention relates to the technical field of deep learning, and in particular to a distributed parallel training method, device, and readable medium.

Background technique

[0002] With the development of scientific computing technology, neural networks have developed rapidly. At the same time, as the programmability of GPUs has continuously improved, GPU applications have extended far beyond graphics rendering into new fields. With their excellent parallel processing capability, GPUs are widely used for the training and inference of neural networks.

[0003] In GPU training there are mainly two approaches: model-parallel training and data-parallel training. Model-parallel training assigns different layers of the model to different GPUs and trains the model's weights in a pipelined manner. Data parallelism means that...
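For illustration, here is a minimal two-GPU sketch of the model-parallel mode described in paragraph [0003] (the layer shapes are invented; a genuinely pipelined schedule would also split each batch into micro-batches, which is omitted here):

```python
# Illustrative sketch, not from the patent: consecutive layers live on
# different GPUs, activations hop between devices in the forward pass,
# and autograd routes gradients back across the same hop.
import torch
import torch.nn as nn

layer0 = nn.Linear(784, 256).to("cuda:0")   # first layer on GPU 0
layer1 = nn.Linear(256, 10).to("cuda:1")    # second layer on GPU 1

x = torch.randn(64, 784, device="cuda:0")
h = torch.relu(layer0(x))        # computed on GPU 0
out = layer1(h.to("cuda:1"))     # activation copied to GPU 1, computed there
out.sum().backward()             # gradients flow back across the device hop
```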


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F9/50; G06N20/00
CPC: G06F9/5027; G06N20/00
Inventor: 孙红岩
Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD