Decentralized distributed training topological structure, training system and method

A decentralized topology technology, applied in the field of distributed training, addresses slow training tasks, high bandwidth requirements, and long iteration times, and achieves faster training convergence, shorter iteration times, and more effective communication.

Active Publication Date: 2021-12-21
ZHEJIANG LAB

AI Technical Summary

Problems solved by technology

[0003] Traditional centralized distributed training systems suffer from long iteration times and high bandwidth requirements, and their performance depends entirely on the performance of the central training node. In addition, decentralized distributed training systems can use a variety of topological structures, and different structures determine different communication frequencies, numbers of communications, and communication volumes. These factors greatly affect the performance of a decentralized distributed training system, resulting in slower training tasks that take longer to complete.




Embodiment Construction

[0042] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below through the accompanying drawings and embodiments. However, it should be understood that the specific embodiments described herein are only used to explain the present invention, and not to limit the scope of the present invention. Also, in the following description, descriptions of well-known structures and techniques are omitted to avoid unnecessarily obscuring the concepts of the present invention.

[0043] Before the specific embodiments are explained in detail, the terminology used in the various examples of the present invention is explained first.

[0044] Distributed training: a method of using multiple training nodes for training based on multiple parallel strategies such as data parallelism and model parallelism;

[0045] Decentralized distributed training: distributed training method wi...



Abstract

The invention discloses a decentralized distributed training topological structure. The structure is an n-dimensional hypercube topology: a closed, compact, convex undirected graph composed of a finite non-empty node set and a finite edge set. The one-dimensional skeleton of the topology consists of a group of equal-length line segments, arranged in order in the space where the skeleton lies and aligned with each dimension; opposite line segments are parallel to each other, and line segments that meet at a point are orthogonal to each other. The method focuses on the performance of decentralized distributed training: the training tasks are 'homogenized', so that the load is distributed uniformly across the training nodes in the distributed training system and system performance no longer depends on the performance of any single training node. The method offers short iteration times, data localization, and more effective communication.
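
As a rough illustration of the n-dimensional cube topology described in the abstract, the sketch below builds the graph with nodes labelled 0 to 2^n - 1, where two nodes are adjacent exactly when their binary labels differ in one bit. The function names are hypothetical, and the averaging routine uses standard dimension-exchange averaging as an assumed stand-in for parameter synchronization; the source text does not say that the patent uses this procedure.

```python
def hypercube_neighbors(n):
    """Map each node of an n-dimensional hypercube to its neighbors.

    Nodes are the integers 0 .. 2**n - 1. Flipping bit d of a label
    moves along dimension d, so every node has exactly n neighbors.
    """
    if n < 0:
        raise ValueError("dimension must be non-negative")
    return {node: [node ^ (1 << d) for d in range(n)] for node in range(2 ** n)}


def hypercube_edges(n):
    """Return the undirected edge set as pairs (u, v) with u < v."""
    return {
        (u, v)
        for u, nbrs in hypercube_neighbors(n).items()
        for v in nbrs
        if u < v
    }


def dimension_exchange_average(values):
    """Average one value per node without a central node.

    Assumption: in round d every node swaps its value with the neighbor
    across dimension d and keeps the mean. After n rounds each node
    holds the global mean, having exchanged with only n neighbors.
    """
    size = len(values)
    n = size.bit_length() - 1
    if size < 1 or 2 ** n != size:
        raise ValueError("number of nodes must be a power of two")
    current = list(values)
    for d in range(n):
        current = [(current[i] + current[i ^ (1 << d)]) / 2 for i in range(size)]
    return current


if __name__ == "__main__":
    topology = hypercube_neighbors(3)
    print("nodes:", len(topology), "edges:", len(hypercube_edges(3)))
    print("neighbors of node 0:", topology[0])
    print("averaged:", dimension_exchange_average([0, 1, 2, 3, 4, 5, 6, 7]))
```

In this sketch every node has the same degree n and does the same amount of work per round, which is one way to read the "homogenized" load described above: no node plays the role of a central aggregator.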

Description

Technical field

[0001] The invention relates to the technical field of distributed training in computer technology, and in particular to a decentralized distributed training topology, a training system, and a method.

Background

[0002] With the massive growth of data and the dramatic increase in the size of deep models in artificial intelligence, effective training requires a great deal of training time and computing resources, so the field of distributed training has received a lot of attention.

[0003] Traditional centralized distributed training systems suffer from long iteration times and high bandwidth requirements, and their performance depends entirely on the performance of the central training node. In addition, decentralized distributed training systems can use various topological structures, and different structures determine different communication frequencies, numbers of communications, ...


Application Information

Patent Type & Authority: Application (China)
IPC(8): H04L29/08, G06K9/62
CPC: H04L67/10, G06F18/214
Inventors: 杨非, 陈岱渊, 石永涛, 华炜, 鲍虎军
Owner: ZHEJIANG LAB