Multi-machine multi-card hybrid parallel asynchronous training method for convolutional neural network

The method applies multi-machine multi-card technology to convolutional neural network training, in the fields of neural learning methods, biological neural network models, and neural architectures. It addresses long synchronization waits and low GPU utilization: it reduces waiting time, improves training efficiency, and allows the number of GPUs to be increased.

Inactive Publication Date: 2018-08-28
苏州纳智天地智能科技有限公司
6 Cites, 27 Cited by

AI Technical Summary

Problems solved by technology

And in the case of multiple GPUs, the waiting time for syn

Examples

Embodiment

[0047] A multi-machine multi-card hybrid parallel asynchronous training method oriented to convolutional neural networks proposed in this embodiment includes the following steps:

[0048] Step S1: Build a CNN model, set training parameters, and ensure that each machine and each GPU can run normally, and that there is no abnormality in network communication.

[0049] Step S2: Switch the last layer from data parallelism to model parallelism. The complete model is divided into 4 slices, each computed on one of 4 GPUs, so the model parameters of this layer no longer need to be communicated. In the traditional GPU parallel method the last layer is also data parallel; the present invention instead applies the GPU All-gather algorithm so that every GPU obtains all of the data information (in this embodiment, all picture features). Each GPU then trains its own model slice on the full data, which ensures that every part of the model learns from all of the data.
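The slicing described in Step S2 can be sketched as a single-process simulation. This is an illustration under stated assumptions, not the patent's implementation: the four "GPUs" are plain Python lists, the helper names (`all_gather`, `slice_logits`, `model_parallel_softmax`, `reference_softmax`) are hypothetical, and the last layer is assumed to be a bias-free linear layer followed by softmax. The sketch shows that, once features are all-gathered, each device can work on its own class slice while exchanging only per-sample scalars rather than weights.

```python
import math
import random

NUM_GPUS = 4  # the embodiment splits the last layer into 4 slices


def all_gather(per_gpu_batches):
    """Simulated all-gather: every device receives the features of all devices."""
    gathered = [row for batch in per_gpu_batches for row in batch]
    return [list(gathered) for _ in per_gpu_batches]


def slice_logits(features, weight_slice):
    """Logits of one class slice: one dot product per (sample, class) pair."""
    return [[sum(f * w for f, w in zip(x, w_row)) for w_row in weight_slice]
            for x in features]


def model_parallel_softmax(per_gpu_features, weight_slices):
    """Return one probability block per device, covering that device's classes."""
    devices = range(len(weight_slices))
    gathered = all_gather(per_gpu_features)
    logits = [slice_logits(gathered[g], weight_slices[g]) for g in devices]
    n = len(gathered[0])
    # Only per-sample scalars cross device boundaries here, never the weights.
    row_max = [max(max(logits[g][i]) for g in devices) for i in range(n)]
    row_sum = [sum(math.exp(z - row_max[i]) for g in devices for z in logits[g][i])
               for i in range(n)]
    return [[[math.exp(z - row_max[i]) / row_sum[i] for z in logits[g][i]]
             for i in range(n)] for g in devices]


def reference_softmax(features, weight):
    """Single-device softmax over all classes, used only as a check."""
    out = []
    for row in slice_logits(features, weight):
        m = max(row)
        e = [math.exp(z - m) for z in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out


# Hypothetical toy sizes: 3 features, 2 classes per device, 2 samples per device.
rng = random.Random(0)
dim, classes_per_gpu, batch_per_gpu = 3, 2, 2
weight_slices = [[[rng.uniform(-1, 1) for _ in range(dim)]
                  for _ in range(classes_per_gpu)] for _ in range(NUM_GPUS)]
per_gpu_features = [[[rng.uniform(-1, 1) for _ in range(dim)]
                     for _ in range(batch_per_gpu)] for _ in range(NUM_GPUS)]

blocks = model_parallel_softmax(per_gpu_features, weight_slices)
# Stitch the class slices back together so they can be compared with the reference.
stitched = [[p for g in range(NUM_GPUS) for p in blocks[g][i]]
            for i in range(NUM_GPUS * batch_per_gpu)]
full_weight = [w for s in weight_slices for w in s]
all_features = [x for batch in per_gpu_features for x in batch]
expected = reference_softmax(all_features, full_weight)
```

In this toy setup the stitched slices should match the single-device result, which is the property Step S2 relies on: every slice is computed from all samples, yet no weight slice ever leaves its device.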

[0050] The further implemen...

Abstract

The invention provides a multi-machine multi-card hybrid parallel asynchronous training method for convolutional neural networks. The method comprises the following steps: a CNN model is constructed and training parameters are set; the Softmax layer is changed from data parallelism to model parallelism, so that the complete model is divided into multiple slices, each computed on one of multiple GPUs; the source code of the Softmax layer is modified so that, instead of exchanging parameter data before the result is calculated, a Ring All-reduce communication operation is carried out on the calculation result; one GPU in the multi-machine multi-card setup is selected as a parameter server, and the other GPUs are used for training; in the Parameter Server model, each server is responsible only for its assigned share of the parameters and processing tasks; each sub-node maintains its own parameters and, after an update, returns the result to the main node for a global update; the main node then transmits the new parameters to the sub-nodes, and training proceeds in this way until it is complete.
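The Ring All-reduce operation named in the abstract can be illustrated with a small single-process simulation. The sketch below is the generic textbook form of the algorithm (a scatter-reduce phase followed by an all-gather phase around a ring), not code from the patent; the function name `ring_all_reduce` and the toy integer "gradients" are assumptions made for illustration, and the vector length is assumed to be divisible by the number of workers.

```python
def ring_all_reduce(vectors):
    """Sum equal-length vectors across n simulated workers arranged in a ring.

    Each worker only ever sends one chunk to its right-hand neighbour per step.
    """
    n = len(vectors)
    length = len(vectors[0])
    assert length % n == 0, "sketch assumes the length divides evenly into chunks"
    chunk = length // n
    data = [list(v) for v in vectors]  # work on copies, leave inputs untouched

    def span(c):
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1, scatter-reduce: after n - 1 steps worker r holds the complete
    # sum for chunk (r + 1) % n.
    for step in range(n - 1):
        # Collect all messages first so that the sends are simultaneous.
        msgs = []
        for r in range(n):
            c = (r - step) % n
            msgs.append((c, [data[r][i] for i in span(c)]))
        for r, (c, payload) in enumerate(msgs):
            dst = (r + 1) % n
            for i, v in zip(span(c), payload):
                data[dst][i] += v

    # Phase 2, all-gather: completed chunks travel once around the ring.
    for step in range(n - 1):
        msgs = []
        for r in range(n):
            c = (r + 1 - step) % n
            msgs.append((c, [data[r][i] for i in span(c)]))
        for r, (c, payload) in enumerate(msgs):
            dst = (r + 1) % n
            for i, v in zip(span(c), payload):
                data[dst][i] = v

    return data


# Four simulated workers, each holding a hypothetical 8-element result vector.
grads = [[w * 10 + i for i in range(8)] for w in range(4)]
reduced = ring_all_reduce(grads)
```

Each worker sends and receives 2(n - 1) chunks of size 1/n of the vector, so the per-worker traffic stays roughly constant as workers are added, which is the usual reason for choosing a ring over a central gather.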

Description

Technical field

[0001] The invention relates to a method for improving the efficiency of deep learning training.

Background technique

[0002] The concept of deep learning originated from the study of artificial neural networks. A multi-layer perceptron with multiple hidden layers is a deep learning structure. Deep learning combines low-level features to form more abstract high-level representation attribute categories or features to discover distributed feature representations of data. At present, deep learning has made breakthroughs in several important fields, namely speech recognition, image recognition, and natural language processing. Deep learning is an intelligent learning method that is closest to the human brain. It has many model parameters, a large amount of calculation, a large scale of training data, and consumes more computing resources. Therefore, for large-scale training data and models, it is necessary to accelerate training to improve work efficiency an...

Application Information

IPC(8): G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/045
Inventor: 汪浩源, 程诚, 王旭光
Owner: 苏州纳智天地智能科技有限公司