Distributed training method and device for deep learning model, and computing equipment

A distributed computing and deep learning technology, applied in the field of data processing, that addresses the low hardware resource utilization of computing nodes and the resulting low efficiency of distributed training. It improves throughput and hardware resource utilization and reduces the communication-to-computation ratio, thereby improving training efficiency.

Pending Publication Date: 2021-11-12

ALIBABA GRP HLDG LTD

0 Cites, 7 Cited by


AI Technical Summary

Problems solved by technology

[0004] However, in existing distributed training methods, the communication-to-computation ratio of each computing node (the ratio of the time the node spends communicating with other computing nodes to the time it spends on gradient calculation) is relatively high. As a result, the hardware resource utilization of the computing node is low, which makes distributed training inefficient.
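As a rough illustration of the ratio described in [0004], the sketch below computes it from two timings. The function name `comm_compute_ratio`, its parameters, and the sample timings are hypothetical, and the model assumes that one exchange costs the same time however many local steps precede it and that communication does not overlap with computation.

```python
def comm_compute_ratio(t_comm: float, t_compute: float,
                       accumulation_steps: int = 1) -> float:
    """Communication time divided by gradient-computation time per exchange.

    Hypothetical model: one exchange costing t_comm seconds follows every
    `accumulation_steps` local steps, each costing t_compute seconds.
    """
    if t_compute <= 0 or accumulation_steps < 1:
        raise ValueError("t_compute must be positive and accumulation_steps >= 1")
    return t_comm / (accumulation_steps * t_compute)


# Illustrative timings only, not measurements from the patent.
baseline = comm_compute_ratio(t_comm=0.5, t_compute=0.25)
accumulated = comm_compute_ratio(t_comm=0.5, t_compute=0.25, accumulation_steps=4)
print(baseline, accumulated)
```

Under these assumptions, exchanging gradients once every four steps instead of every step divides the ratio by four.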


Embodiment Construction

[0039] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for a more thorough understanding of the present disclosure and to fully convey its scope to those skilled in the art.

[0040] First, the implementation environment of the distributed training method of the embodiment of the present invention is introduced.

[0041] Data center

[0042] A data center is a globally coordinated network of dedicated equipment that transmits, accelerates, displays, computes, and stores data over Internet network infrastructure. In future development, the data center will also become a...


Abstract

The invention discloses a distributed training method and device for a deep learning model and computing equipment. The method comprises the following steps: in each training step, acquiring a predetermined number of training data from a training data set as batch training data; calculating a gradient of a model parameter of the deep learning model on the batch training data, and taking the gradient as a local gradient; calculating an accumulated value of the local gradients of the preset number of training steps as an accumulated gradient; communicating with other computing nodes, and exchanging accumulated gradients of each other; and calculating a gradient average value of the accumulated gradients of all the computing nodes, and updating the model parameters based on the gradient average value.
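The steps in the abstract can be sketched as a single-process simulation. This is a minimal illustration rather than the patented implementation: the toy model `y = w * x`, the names `local_gradient` and `train`, and all hyperparameters are assumptions, and the exchange between computing nodes is replaced by a plain Python average where a real system would use a collective such as all-reduce. The abstract does not say whether the averaged gradient is also divided by the number of accumulated steps; this sketch does not divide.

```python
import random


def local_gradient(w, batch):
    """Gradient of the mean squared error of the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train(shards, w=0.0, lr=0.05, batch_size=4,
          accumulation_steps=2, rounds=200, seed=0):
    """Simulate data-parallel training with locally accumulated gradients.

    Each entry of `shards` stands in for the data held by one computing node.
    """
    rng = random.Random(seed)
    for _ in range(rounds):
        accumulated = []
        for shard in shards:
            acc = 0.0
            for _ in range(accumulation_steps):
                # Take a fixed number of samples as the batch training data.
                batch = rng.sample(shard, batch_size)
                # Local gradient of this step, added to the accumulated gradient.
                acc += local_gradient(w, batch)
            accumulated.append(acc)
        # Exchange step: every node ends up with all accumulated gradients.
        grad_avg = sum(accumulated) / len(accumulated)
        # Every node applies the same update, so one shared w is enough here.
        w -= lr * grad_avg
    return w


# Two simulated nodes holding noiseless samples of y = 3 * x.
shards = [[(i / 10, 3.0 * i / 10) for i in range(1 + 8 * k, 9 + 8 * k)]
          for k in range(2)]
w = train(shards)
print(w)  # approaches 3.0 on this toy data
```

With `accumulation_steps=2`, each simulated node communicates once per two gradient computations, which is the mechanism by which the method reduces communication relative to computation.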

Description

Technical field

[0001] The present invention relates to the technical field of data processing, in particular to a distributed training method and device for a deep learning model, and computing equipment.

Background technique

[0002] Deep learning is an increasingly popular approach to computing and machine learning in industry, and it can be used in scenarios such as images, voice, video, and machine translation. Taking machine translation as an example, the quality of neural machine translation has improved significantly and has continued to develop in recent years. At present, for some languages and scenarios, the translation quality can even reach the level of human translation.

[0003] Data parallelism (Data Parallel) is a form of distributed training for deep learning models, which divides the training data into multiple parts and trains on them on different computing nodes. If the computing nodes do not have shared public me...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06N20/00

CPC: G06N20/00

Inventor: 樊士庆, 孟晨, 王思宇, 龙国平, 杨军

Owner: ALIBABA GRP HLDG LTD
