
Data synchronization method and device, equipment and storage medium

A data synchronization technology, applied in the field of model training, that addresses the problem of idle and wasted resources.

Pending Publication Date: 2022-08-09
LANGCHAO ELECTRONIC INFORMATION IND CO LTD

AI Technical Summary

Problems solved by technology

Modern data centers widely deploy several kinds of acceleration devices, such as GPUs and FPGAs. If each data-parallel training job can use only one type of device, resources will inevitably sit idle and be wasted.


Examples

Embodiment Construction

[0041] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, but not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

[0042] In the prior art, communication between devices of the same type usually offers higher bandwidth and lower latency, while communication between heterogeneous devices usually comes at a higher cost. For this reason, the devices used for data-parallel training are all required to be of the same type. If various heterogeneous devices are forcibly placed in the same cluster for unified synchronous data-parallel training, efficiency will inevitably be very low. In view of...


Abstract

The invention relates to the technical field of model training and discloses a data synchronization method, device, equipment and storage medium. The method comprises: constructing a first-level physical topology between acceleration devices of the same type, and constructing a second-level physical topology between acceleration devices of different types, where the acceleration devices in the second-level physical topology are connected through a cache coherence protocol; performing first processing on the data to be synchronized in the acceleration devices in the scatter_reduce communication mode according to the first-level physical topology, and performing second processing on the first-processed data according to the second-level physical topology; then performing third processing on the second-processed data in the all_gather communication mode according to the second-level physical topology, and performing fourth processing on the third-processed data according to the first-level physical topology. This enables deep learning data parallelism across multiple kinds of heterogeneous acceleration devices and improves both hardware resource utilization and data communication efficiency.
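The four processing steps in the abstract can be illustrated with a small in-memory simulation. This is a minimal sketch under stated assumptions, not the patented implementation: the function names (`split`, `scatter_reduce`, `all_gather`, `hierarchical_all_reduce`) are hypothetical, every same-type group is assumed to have the same number of devices, the second-level topology is assumed to link devices with the same index across groups, the second processing is assumed to also use scatter_reduce, and the cache coherence interconnect is not modeled. Each collective computes only its logical result rather than simulating the actual message passing.

```python
from typing import List

Vector = List[float]


def split(vec: Vector, parts: int) -> List[Vector]:
    """Split vec into `parts` contiguous, equally sized chunks."""
    if len(vec) % parts != 0:
        raise ValueError("vector length must be divisible by the number of parts")
    size = len(vec) // parts
    return [vec[i * size:(i + 1) * size] for i in range(parts)]


def scatter_reduce(buffers: List[Vector]) -> List[Vector]:
    """Logical result of scatter_reduce: rank r holds the sum of chunk r over all ranks."""
    n = len(buffers)
    chunked = [split(buf, n) for buf in buffers]
    return [
        [sum(vals) for vals in zip(*(chunked[src][r] for src in range(n)))]
        for r in range(n)
    ]


def all_gather(chunks: List[Vector]) -> List[Vector]:
    """Logical result of all_gather: every rank holds all chunks concatenated in rank order."""
    full = [x for chunk in chunks for x in chunk]
    return [list(full) for _ in chunks]


def hierarchical_all_reduce(groups: List[List[Vector]]) -> List[List[Vector]]:
    """groups[g][d] is the data to synchronize on device d of same-type group g."""
    n_groups, n_dev = len(groups), len(groups[0])
    # First processing: scatter_reduce inside each same-type group (first-level topology).
    stage1 = [scatter_reduce(group) for group in groups]
    # Second processing (assumed scatter_reduce): across groups, between devices
    # with the same index (second-level topology). stage2[d][g] is a sub-chunk.
    stage2 = [
        scatter_reduce([stage1[g][d] for g in range(n_groups)])
        for d in range(n_dev)
    ]
    # Third processing: all_gather across groups (second-level topology).
    # stage3[d][g] is the fully reduced chunk d, held by device d of group g.
    stage3 = [all_gather(stage2[d]) for d in range(n_dev)]
    # Fourth processing: all_gather inside each group (first-level topology).
    return [
        all_gather([stage3[d][g] for d in range(n_dev)])
        for g in range(n_groups)
    ]


if __name__ == "__main__":
    # Two hypothetical device types (for example GPUs and FPGAs), two devices each.
    demo = [
        [[1, 2, 3, 4], [10, 20, 30, 40]],
        [[100, 200, 300, 400], [1000, 2000, 3000, 4000]],
    ]
    print(hierarchical_all_reduce(demo)[0][0])  # prints [1111, 2222, 3333, 4444]
```

In this sketch the data length must be divisible by the number of devices per group, and each resulting chunk must be divisible by the number of groups. After the fourth step every device holds the element-wise sum over all devices, which is the result a flat all-reduce would produce; only chunks, never full vectors, cross the slower link between device types.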

Description

Technical field

[0001] The present invention relates to the technical field of model training, and in particular to a data synchronization method, device, equipment and storage medium.

Background

[0002] With the wide application of deep neural networks, model sizes keep growing. This growth makes efficient model training more important, and distributed training has emerged in response. Current distributed model training methods include data parallelism and model parallelism, of which data parallelism is the most common and widely used. The data-parallel method partitions the input training data and, in each training iteration, trains multiple batches of data simultaneously on multiple acceleration devices. Data parallelism is further divided into synchronous data parallelism and asynchronous data parallelism. In the synchronous data-parallel method, after all the acceleration devices calculate the gradient of...

Application Information

IPC(8): H04L49/90, G06N3/08, H04L41/12
CPC: H04L49/9094, H04L41/12, G06N3/08
Inventor: 曹芳, 郭振华, 王丽, 高开, 赵雅倩, 李仁刚
Owner LANGCHAO ELECTRONIC INFORMATION IND CO LTD