Gradient synchronization method and device in distributed training

A gradient synchronization technology for distributed training, applied in the computer field, that addresses slow model training caused by long training times and long time spans.

Pending Publication Date: 2019-12-27
BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD +1

AI Technical Summary

Problems solved by technology

In current distributed training, gradient information must be transferred and synchronized each time a training step completes, so that the gradient is shared across the distributed training nodes and the minimum of the loss function can be found. This high-frequency transmission of large amounts of gradient information leads to long model training times spread over a long time span, which seriously slows model training.
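To make the communication cost described above concrete, the following is a minimal counting sketch, not part of the patent text. It assumes each synchronization sends one full copy of the gradient from every node; the function name `sync_cost`, the `batches_per_sync` knob, and the example sizes are all hypothetical.

```python
def sync_cost(num_batches: int, gradient_bytes: int, batches_per_sync: int = 1):
    """Return (synchronization rounds, bytes sent per node).

    Assumption: every round sends one full gradient copy per node.
    batches_per_sync=1 models synchronizing after every batch.
    """
    if num_batches < 0 or gradient_bytes < 0 or batches_per_sync < 1:
        raise ValueError("invalid arguments")
    rounds = -(-num_batches // batches_per_sync)  # ceiling division
    return rounds, rounds * gradient_bytes


# Hypothetical workload: 10,000 batches, 100 MB of gradients per round.
every_batch = sync_cost(10_000, 100 * 10**6)
every_8 = sync_cost(10_000, 100 * 10**6, batches_per_sync=8)
print(every_batch, every_8)
```

Under these assumptions the number of rounds, and therefore the traffic, scales inversely with the number of batches processed between synchronizations.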



Embodiment Construction

[0063] In the following description, numerous specific details are set forth to provide a thorough understanding of the present application. However, the present application can be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from the spirit of the present application. Therefore, the present application is not limited to the specific implementations disclosed below.

[0064] Terms used in one or more embodiments of the present application are for the purpose of describing specific embodiments only, and are not intended to limit the one or more embodiments of the present application. As used in one or more embodiments of this application and the appended claims, the singular forms "a", "said", and "the" are also intended to include the plural forms unless the context clearly dictates otherwise. It should also be understood that the term "and/or" used in one or more embodiments of th...


Abstract

The invention provides a gradient synchronization method and device in distributed training. The gradient synchronization method comprises the following steps: grouping the training data on each training node in a distributed training cluster to obtain a plurality of sub-training data on each training node, the training nodes in the cluster being connected in a ring; calculating the sub-training average gradient of each piece of sub-training data on the training nodes of the cluster; obtaining, from the sub-training average gradient, the corresponding sub-training accumulation gradient; and synchronizing the sub-training accumulation gradient to each training node of the cluster. Because the average gradient of different batches of training data is calculated on each training node, the oscillation range of the gradient is relatively small and the direction of gradient descent can be determined more accurately, which increases the training speed of the model and improves training efficiency.
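The steps in the abstract can be sketched as a single-process simulation. This is an illustration under stated assumptions, not the patented implementation: the linear model with squared-error loss, the reading of "accumulation" as the sum of the per-group average gradients, the simple neighbor-to-neighbor ring pass (rather than a bandwidth-optimal chunked ring all-reduce), and every function name are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[List[float], float]  # (features, target)


def split_into_groups(samples: List[Sample], group_size: int) -> List[List[Sample]]:
    """Group one node's training data into sub-training data."""
    return [samples[i:i + group_size] for i in range(0, len(samples), group_size)]


def average_gradient(weights: List[float], group: List[Sample]) -> List[float]:
    """Average squared-error gradient of a linear model over one group."""
    grad = [0.0] * len(weights)
    for features, target in group:
        error = sum(w * x for w, x in zip(weights, features)) - target
        for j, x in enumerate(features):
            grad[j] += 2.0 * error * x
    return [g / len(group) for g in grad]


def accumulated_gradient(weights: List[float], groups: List[List[Sample]]) -> List[float]:
    """Sum the per-group average gradients (assumed meaning of 'accumulation')."""
    total = [0.0] * len(weights)
    for group in groups:
        for j, g in enumerate(average_gradient(weights, group)):
            total[j] += g
    return total


def ring_synchronize(node_grads: List[List[float]]) -> List[List[float]]:
    """Simulate a ring: each node forwards the vector it last received to its
    successor. After P - 1 hops every node holds the sum over all nodes."""
    p = len(node_grads)
    totals = [list(g) for g in node_grads]
    in_flight = [list(g) for g in node_grads]
    for _ in range(p - 1):
        # Node i receives whatever node i - 1 was holding in flight.
        in_flight = [in_flight[(i - 1) % p] for i in range(p)]
        for i in range(p):
            totals[i] = [a + b for a, b in zip(totals[i], in_flight[i])]
    return totals


# Three simulated nodes, each with four samples of y = 2x, grouped in pairs.
node_data = [
    [([1.0], 2.0), ([2.0], 4.0), ([3.0], 6.0), ([4.0], 8.0)],
    [([1.0], 2.0), ([3.0], 6.0), ([5.0], 10.0), ([7.0], 14.0)],
    [([2.0], 4.0), ([4.0], 8.0), ([6.0], 12.0), ([8.0], 16.0)],
]
weights = [0.0]
local = [accumulated_gradient(weights, split_into_groups(d, 2)) for d in node_data]
synced = ring_synchronize(local)
print(local, synced)
```

In this sketch each node communicates once per accumulated gradient instead of once per group, which is the reduction in synchronization frequency the abstract aims at.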

Description

Technical field

[0001] The present application relates to the field of computer technology, and in particular to a method and device for gradient synchronization in distributed training, a computing device, a computer-readable storage medium, and a chip.

Background

[0002] With the rapid development of computer technology, deep learning has also advanced rapidly. As deep learning has matured, increasingly complex algorithms have been developed. These algorithms require large amounts of data and a great deal of time to complete training effectively, so distributed training was developed.

[0003] In the model optimization of deep learning, the gradient descent method is used to calculate the gradient and find the minimum of the loss function, so as to train the model and speed up its convergence. In the current distributed training, it is necessary to transfer gradient information and synchronize ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/08
CPC: G06N3/084, G06N3/045
Inventor: 李鑫, 王洪伟, 李长亮
Owner: BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD