Cluster group synchronization optimization method and system for distributed deep neural network

A deep neural network group-synchronization technology, applied in the field of distributed optimization of deep neural networks. It addresses problems such as an increased number of parameter server requests, oscillation in the parameter update direction, and poor model convergence, and achieves high resource utilization while reducing the impact of stale gradients on the weights.

Active Publication Date: 2017-08-04
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

However, the synchronization overhead between nodes is relatively high. While a node waits for the other nodes to complete the current round of iterations, its own computing and network resources sit idle. In heterogeneous clusters and large-scale homogeneous clusters, this phenomenon is particularly serious.
In a heterogeneous cluster, large differences in node hardware configurations lead to obvious performance differences between nodes: some run fast while others run slowly. In each round of iteration, the fast nodes must wait for the slow ones, leaving the fast nodes' resources idle, so the training bottleneck is the slowest node. In a large-scale homogeneous cluster, although node performance is uniform, the large number of nodes affects the overall stability of the parameter server.

Method used



Examples

Experimental program
Comparison scheme
Effect test

Example Embodiment

[0044] In order to make the objectives, technical solutions, and advantages of the present invention clearer, the following further describes the present invention in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, but not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not conflict with each other.

[0045] The following first explains and describes the technical terms involved in the present invention:

[0046] Training data: also known as input data, that is, the processing objects that are input to the network model when training the neural network, such as images, audio, text, etc.;

[0047] Model parameters: the weights of the interconnected neurons in the neural network model and the biases on ...



Abstract

The invention discloses a cluster group synchronization optimization method and system for a distributed deep neural network. The method comprises the following steps: grouping the nodes in a cluster according to performance; allocating training data according to node performance; using a synchronous parallel mechanism within each group, an asynchronous parallel mechanism between different groups, and different learning rates for different groups. Nodes with similar performance are placed in one group, which reduces synchronization overhead; more training data is allocated to higher-performance nodes, which improves resource utilization; the synchronous parallel mechanism is used within groups, where synchronization overhead is small, so its good convergence behavior is preserved; the asynchronous parallel mechanism is used between groups, where synchronization overhead would be large, so that overhead is avoided; and different groups use different learning rates to facilitate model convergence. By applying this group synchronization method to the parameter synchronization process of a distributed deep neural network on a heterogeneous cluster, the model convergence rate is greatly increased.
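The grouping and data-allocation steps in the abstract can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the grouping threshold `gap`, the node names, and the speed scores are all assumptions introduced here for demonstration.

```python
# Sketch (assumed interfaces): group cluster nodes by measured speed and
# allocate training samples in proportion to each node's speed, so faster
# nodes receive more data and intra-group performance is similar.

def group_nodes(node_speeds, gap=0.2):
    """Group nodes whose relative speed gap to the group's fastest node
    is within `gap` (a hypothetical 20% threshold)."""
    ordered = sorted(node_speeds.items(), key=lambda kv: kv[1], reverse=True)
    groups, current = [], [ordered[0]]
    for name, speed in ordered[1:]:
        fastest = current[0][1]
        if (fastest - speed) / fastest <= gap:
            current.append((name, speed))
        else:
            groups.append(current)
            current = [(name, speed)]
    groups.append(current)
    return groups

def allocate_data(groups, total_samples):
    """Split the training set in proportion to node speed."""
    total_speed = sum(s for g in groups for _, s in g)
    return {name: int(total_samples * speed / total_speed)
            for g in groups for name, speed in g}

# Hypothetical cluster: two fast GPU nodes, two slow CPU nodes.
speeds = {"gpu-a": 1.0, "gpu-b": 0.95, "cpu-a": 0.4, "cpu-b": 0.38}
groups = group_nodes(speeds)          # -> two groups: GPUs and CPUs
shares = allocate_data(groups, 100_000)
```

With these assumed speeds, the fast and slow nodes fall into two groups, and each GPU node receives roughly 2.5x as many samples as each CPU node, matching the abstract's "allocate training data according to node performance" step.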

Description

technical field

[0001] The invention belongs to the technical field of distributed optimization of deep neural networks, and more particularly relates to a cluster group synchronization optimization method and system for distributed deep neural networks.

Background technique

[0002] At present, Deep Neural Networks (DNNs) have been applied in many fields such as image, speech, and natural language processing, and have achieved many breakthroughs. Because both the training data and the trained model parameters are large in scale, deep neural networks require substantial computing and storage resources. The traditional single-node training mode can therefore no longer meet the requirements, and distributed computing modes such as clusters must be used.

[0003] Distributed deep learning usually adopts the data-parallel mode for model training. As shown in Figure 1, data parallelism refers to splitting the training data and storing one or more split training data on eac...
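The data-parallel, parameter-server training mode described in paragraph [0003] can be illustrated with a minimal sequential simulation. This is a sketch under assumptions, not the patent's system: the `ParameterServer` class, the least-squares objective, and all hyperparameters are hypothetical, and worker steps are run in a loop rather than on real distributed nodes.

```python
# Minimal sketch of data-parallel training with a parameter server:
# each "worker" holds a data shard, pulls the current weights, computes
# a gradient on its shard, and pushes the gradient back to the server.
import numpy as np

class ParameterServer:
    def __init__(self, dim):
        self.weights = np.zeros(dim)

    def push(self, gradient, lr):
        # Apply a worker's gradient with that worker's learning rate
        # (per-group learning rates would plug in here).
        self.weights -= lr * gradient

    def pull(self):
        return self.weights.copy()

def worker_step(ps, data_shard, lr):
    w = ps.pull()
    # Gradient of a simple least-squares objective on this shard (assumed).
    X, y = data_shard
    grad = X.T @ (X @ w - y) / len(y)
    ps.push(grad, lr)

rng = np.random.default_rng(0)
ps = ParameterServer(dim=3)
true_w = np.array([1.0, -2.0, 0.5])

# Four workers, each holding its own shard of synthetic training data.
shards = []
for _ in range(4):
    X = rng.normal(size=(64, 3))
    shards.append((X, X @ true_w))

for _ in range(200):        # sequential stand-in for parallel iterations
    for shard in shards:
        worker_step(ps, shard, lr=0.1)
```

In a real deployment the `worker_step` calls run concurrently on separate nodes; running them fully synchronously (barrier each round) or fully asynchronously is exactly the trade-off the group synchronization method of the invention addresses.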

Claims


Application Information

IPC(8): H04L29/08, H04L12/24, G06N3/08
Inventor 蒋文斌金海叶阁焰张杨松马阳祝简彭晶
Owner HUAZHONG UNIV OF SCI & TECH