Efficient inter-chip interconnect topology for distributed parallel deep learning

An inter-chip interconnect structure, applied in the field of computing, that addresses problems such as long connection line delays and the inability to divide computing nodes among multiple computing tasks.

Pending Publication Date: 2021-08-03
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

The traditional hardware interconnect structure used to implement the AllReduce algorithm is based on a ring topology, which has several significant problems, including the delay of long connection lines and the inability to divide computing nodes among multiple computing tasks.
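For readers unfamiliar with the ring-based AllReduce mentioned above, the following is a minimal software sketch of the commonly described ring algorithm (a reduce-scatter phase followed by an all-gather phase). It is an illustration only, not the hardware described in this application: the function name `ring_allreduce` is hypothetical, and the sketch assumes each node's vector has exactly one element per node so that each element acts as one chunk.

```python
def ring_allreduce(node_data):
    """Simulate a ring AllReduce (sum) over N nodes.

    Assumption for this sketch: every node holds a vector of length N,
    so each element is one "chunk". Returns (final data, step count).
    """
    n = len(node_data)
    if any(len(v) != n for v in node_data):
        raise ValueError("this sketch expects one chunk per node")
    data = [list(v) for v in node_data]
    steps = 0

    # Phase 1: reduce-scatter. In step s, node i sends chunk (i - s) mod n
    # to its ring neighbor (i + 1) mod n, which adds it to its own copy.
    for s in range(n - 1):
        # Snapshot the outgoing values so all transfers in a step are simultaneous.
        sends = [((i - s) % n, data[i][(i - s) % n]) for i in range(n)]
        for i, (chunk, value) in enumerate(sends):
            data[(i + 1) % n][chunk] += value
        steps += 1

    # Phase 2: all-gather. Node i now owns the fully reduced chunk (i + 1) mod n
    # and forwards completed chunks around the ring, overwriting stale copies.
    for s in range(n - 1):
        sends = [((i + 1 - s) % n, data[i][(i + 1 - s) % n]) for i in range(n)]
        for i, (chunk, value) in enumerate(sends):
            data[(i + 1) % n][chunk] = value
        steps += 1

    return data, steps


if __name__ == "__main__":
    result, steps = ring_allreduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
    print(result, steps)
```

In this sketch every node ends with the element-wise sum after 2(N - 1) sequential steps, so the number of steps grows with the number of nodes on the ring.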



Embodiment Construction

[0018] Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings, in which the same reference numerals in different drawings indicate the same or similar elements unless otherwise specified. The implementations set forth in the following description of the exemplary embodiments do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as recited in the appended claims.

[0019] Distributed computing is the field of computer science that studies distributed systems. A distributed system is a system whose components are located on different networked computers that communicate and coordinate their actions by passing messages to each other.

[0020] Distributed deep learning is an implementation of deep learning algorithms. Since deep ...


Abstract

The application relates to an efficient inter-chip interconnect topology for distributed parallel deep learning, and provides a system comprising: a first group of computing nodes and a second group of computing nodes, wherein the first and second groups are neighboring devices and each of the first and second groups comprises a set of computing nodes A-D and a set of intra-group interconnects, wherein the set of intra-group interconnects communicatively couples computing node A with computing nodes B and C, and computing node D with computing nodes B and C; and a set of inter-group interconnects, wherein the set of inter-group interconnects communicatively couples computing node A of the first group with computing node A of the second group, computing node B of the first group with computing node B of the second group, computing node C of the first group with computing node C of the second group, and computing node D of the first group with computing node D of the second group.
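The connectivity stated in the abstract can be written down as a small adjacency map. The sketch below is illustrative only: the group labels `G1` and `G2` and the function name `build_topology` are hypothetical, and only the links named in the abstract (A-B, A-C, D-B, D-C within a group, and A-A, B-B, C-C, D-D between the two groups) are modeled.

```python
from itertools import product

NODES = ("A", "B", "C", "D")
# Intra-group links as stated in the abstract: A-B, A-C, D-B, D-C.
INTRA_GROUP_LINKS = (("A", "B"), ("A", "C"), ("D", "B"), ("D", "C"))


def build_topology(groups=("G1", "G2")):
    """Return an undirected adjacency map for two neighboring groups.

    Keys and neighbors are (group, node) tuples; the group labels are
    hypothetical names used only for this sketch.
    """
    adjacency = {(g, n): set() for g, n in product(groups, NODES)}

    def connect(u, v):
        adjacency[u].add(v)
        adjacency[v].add(u)

    # Intra-group interconnects: the same four links inside each group.
    for g in groups:
        for a, b in INTRA_GROUP_LINKS:
            connect((g, a), (g, b))

    # Inter-group interconnects: same-named nodes of the two groups are linked.
    first, second = groups
    for n in NODES:
        connect((first, n), (second, n))

    return adjacency


if __name__ == "__main__":
    topo = build_topology()
    for node in sorted(topo):
        print(node, "->", sorted(topo[node]))
```

With only these links, each node has three neighbors (two inside its group and one in the neighboring group), and nodes A and D of the same group are not directly connected.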

Description

Background technique

[0001] Current approaches to distributed training of neural networks include applying synchronous large-minibatch stochastic gradient descent ("SGD") methods on multiple distributed computing nodes to attempt data-parallel acceleration. In this approach, the mode of communication between computing nodes is the so-called "AllReduce" algorithm. The traditional hardware interconnect structure used to implement the AllReduce algorithm is based on a ring topology, which has several significant problems, including the delay of long connection lines and the inability to divide computing nodes among multiple computing tasks.

Contents of the invention

[0002] An embodiment of the present disclosure provides a system, including a first group of computing nodes and a second group of computing nodes, wherein the first group of computing nodes and the second group of computing nodes are adjacent devices, and the first group The computing node and the second g...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/50, G06F9/54, G06N3/063
CPC: G06F9/5038, G06F9/542, G06N3/063, G06F9/505, G06N3/08, G06F9/5044
Inventor: 韩亮, 焦阳
Owner: ALIBABA GRP HLDG LTD