A decentralized distributed training topology, training system and method

A decentralized training technology and method, applied in the field of distributed training, that addresses slow training tasks, high bandwidth requirements, and long iteration times, and achieves faster training convergence, shorter iteration time, and more effective communication.

Active Publication Date: 2022-04-01
ZHEJIANG LAB

AI Technical Summary

Problems solved by technology

[0003] Traditional centralized distributed training systems suffer from long iteration times and high bandwidth requirements, and their performance depends entirely on the performance of the central training node. In addition, decentralized distributed training systems can use a variety of topologies, and different topologies determine different communication frequencies, communication counts, and communication volumes. These factors strongly affect the performance of a decentralized distributed training system and can make training tasks slower and longer-running.

Examples

Embodiment Construction

[0042] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. However, it should be understood that the specific embodiments described here are only used to explain the present invention, and are not intended to limit the scope of the present invention. Also, in the following description, descriptions of well-known structures and techniques are omitted to avoid unnecessarily obscuring the concept of the present invention.

[0043] Before the specific implementations are explained in detail, the terms used in the examples of the present invention are first explained.

[0044] Distributed training: training that uses multiple training nodes, based on parallelization strategies such as data parallelism and model parallelism;

[0045] Decentralized distributed training: a distributed training...

Abstract

The invention discloses a decentralized distributed training topology. The topology is an n-dimensional hypersquare (hypercube) topology: a closed, compact, convex undirected graph consisting of a finite non-empty node set and a finite edge set. The 1-dimensional skeleton of the topology is composed of a group of equal-length line segments aligned with each dimension of its space, where opposite line segments are parallel to each other and line segments that intersect at a point are orthogonal to each other. The invention focuses on the performance of decentralized distributed training: it "homogenizes" the training tasks and evenly distributes the training load across the training nodes of the distributed training system, so that system performance no longer depends on the performance of a single training node. It has the advantages of short iteration time, data localization, and highly effective communication.
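
The graph structure described in the abstract can be illustrated with a minimal Python sketch. It assumes that the "hypersquare" is the standard n-dimensional hypercube graph and that each training node is labelled with an n-bit integer, so that two nodes are neighbours exactly when their labels differ in one bit. The labelling scheme and the function names hypercube_neighbors and hypercube_edges are illustrative assumptions and do not come from the patent text.

```python
def hypercube_neighbors(n: int) -> dict:
    """Map each node label (0 .. 2**n - 1) to its n neighbours.

    Assumption: nodes are labelled with n-bit integers, and flipping
    bit d moves along dimension d of the hypercube.
    """
    if n < 1:
        raise ValueError("dimension n must be at least 1")
    return {v: [v ^ (1 << d) for d in range(n)] for v in range(1 << n)}


def hypercube_edges(n: int) -> set:
    """Return the undirected edge set as (low, high) label pairs."""
    edges = set()
    for v, neighbours in hypercube_neighbors(n).items():
        for u in neighbours:
            # Store each undirected edge once, smaller label first.
            edges.add((min(u, v), max(u, v)))
    return edges


if __name__ == "__main__":
    n = 3
    neighbours = hypercube_neighbors(n)
    edges = hypercube_edges(n)
    print("nodes:", len(neighbours))
    print("edges:", len(edges))
    print("neighbours of node 0:", neighbours[0])
```

Under these assumptions every node has exactly n neighbours, so with 2**n training nodes no single node handles more than n links, whereas the central node of a star-shaped (centralized) layout would handle 2**n - 1.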

Description

Technical field

[0001] The invention relates to the technical field of distributed training in computer technology, and in particular to a decentralized distributed training topology, training system and method.

Background art

[0002] With the massive growth of data and the sharp increase in the size of deep models in artificial intelligence, a large amount of training time and computing resources is required to complete effective training, so the field of distributed training has received a lot of attention.

[0003] Traditional centralized distributed training systems suffer from long iteration times and high bandwidth requirements, and their performance depends entirely on the performance of the central training node. In addition, decentralized distributed training systems can use a variety of topologies, and different topologies determine different communication frequencies, communication counts and commun...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L67/10, G06K9/62
CPC: H04L67/10, G06F18/214
Inventor: 杨非, 陈岱渊, 石永涛, 华炜, 鲍虎军
Owner: ZHEJIANG LAB