Distributed training method and device for deep learning model, and computing equipment
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: ALIBABA GRP HLDG LTD
- Publication Date: 2021-11-12
Abstract
Description
Technical field
[0001] The present invention relates to the technical field of data processing, and in particular to a distributed training method, device, and computing equipment for a deep learning model.
Background
[0002] Deep learning is an increasingly popular approach to computing and machine learning in industry, and it is applied in scenarios such as image, speech, and video processing and machine translation. Taking machine translation as an example, neural-network-based machine translation has improved significantly and has continued to develop in recent years. At present, for some languages and scenarios, translation quality can even reach the level of human translation.
[0003] Data parallelism (Data Parallel) is a form of distributed training for deep learning models in which the training data is divided into multiple parts that are trained on different computing nodes. If the computing nodes do not have shared public me...
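As a rough illustration of the data-parallel idea in [0003], the sketch below simulates several computing nodes inside a single Python process. It is not the method claimed in this application. The one-parameter linear model, the round-robin sharding, the averaging of per-shard gradients, and every function name (shard, local_gradient, data_parallel_step, train) are assumptions introduced only for this example.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # one (x, y) training pair


def shard(data: List[Sample], num_workers: int) -> List[List[Sample]]:
    """Divide the training data into num_workers parts (round-robin)."""
    return [data[i::num_workers] for i in range(num_workers)]


def local_gradient(w: float, part: List[Sample]) -> Tuple[float, int]:
    """Gradient of the summed squared error (w*x - y)^2 over one shard.

    Returns the summed gradient and the shard size so that the caller
    can form a correctly weighted average across shards.
    """
    grad = sum(2.0 * (w * x - y) * x for x, y in part)
    return grad, len(part)


def data_parallel_step(w: float, shards: List[List[Sample]], lr: float) -> float:
    """One synchronous update: per-shard gradients, then one averaged step."""
    # On a real cluster each call below would run on a different node.
    results = [local_gradient(w, part) for part in shards]
    total_grad = sum(g for g, _ in results)
    total_n = sum(n for _, n in results)
    # Assumed aggregation scheme: average the gradient over all samples.
    return w - lr * total_grad / total_n


def train(data: List[Sample], num_workers: int = 4, lr: float = 0.01,
          steps: int = 200, w0: float = 0.0) -> float:
    """Run a fixed number of data-parallel gradient steps and return w."""
    shards = shard(data, num_workers)
    w = w0
    for _ in range(steps):
        w = data_parallel_step(w, shards, lr)
    return w


# Toy data generated from y = 3 * x, so training should drive w toward 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w_parallel = train(data, num_workers=4)
w_single = train(data, num_workers=1)
```

Because the per-shard gradients are summed and divided by the total sample count, the simulated four-node run follows the same update as a single node processing all the data, up to floating-point rounding. On real hardware the shards would sit on separate machines, and the aggregation step would need explicit communication between them.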