
Multiple GPUs-based BPNN training method and apparatus

A training method and apparatus applied in the field of neural network training. The invention addresses the problems of high data synchronization overhead and low efficiency in multi-GPU training, reducing the data synchronization overhead among GPUs and improving training efficiency.

Active Publication Date: 2014-08-20
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0006] However, in the above multi-GPU BPNN training method, there is a large overhead when synchronizing weight data between the BPNN replicas on the GPUs. The weight data of a large-scale BPNN can reach hundreds of megabytes, and the communication time for transferring these weights can reach hundreds of milliseconds, whereas a training step on a single GPU usually takes only tens of milliseconds. As a result, the data synchronization overhead between multiple GPUs is high and the efficiency of multi-GPU BPNN training is low, sometimes worse than training the BPNN on a single GPU.
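A back-of-envelope calculation makes the mismatch concrete. The bandwidth and step-time figures below are illustrative assumptions for 2014-era hardware, not numbers from the patent:

```python
# Illustrative arithmetic only; the bandwidth figure is an assumed
# effective host-mediated copy rate, not a value from the patent.
weights_mb = 200                  # "hundreds of megabytes" of weights
effective_gb_per_s = 1.0          # assumed effective sync bandwidth
sync_ms = weights_mb / 1024 / effective_gb_per_s * 1000
step_ms = 30                      # "tens of milliseconds" per step
print(f"weight sync ~{sync_ms:.0f} ms vs ~{step_ms} ms compute per step")
```

Under these assumptions the synchronization alone takes several times longer than the computation it accompanies, which is the inefficiency the invention targets.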



Examples


Embodiment 1

[0041] Figure 1 is a flow chart of the multi-GPU-based BPNN training method provided by Embodiment 1 of the present invention. As shown in Figure 1, the method includes:

[0042] S101. Control each GPU to perform forward calculation, and synchronize the forward calculation output O among the GPUs.

[0043] The forward calculation and reverse error calculation of the BPNN are performed layer by layer; after each layer's calculation is completed, that layer's output data can be synchronized among the GPUs.

[0044] After the input layer passes the data to the first hidden layer, each GPU is controlled to start forward calculation from the first hidden layer. As each hidden layer's forward calculation is completed and its output O is passed to the next hidden layer, the output O of that layer is simultaneously synchronized among the GPUs, until the last hidden layer transmits the forward c...
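The layer-by-layer scheme described above can be sketched as follows. This is a minimal NumPy simulation in which Python lists stand in for GPU replicas and the per-layer "synchronization" is a simple gather; it illustrates the idea of syncing each layer's output O as it is produced, and is not the patent's implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_gpus = 2
layer_sizes = [4, 8, 8, 3]       # input, two hidden layers, output

# All "GPUs" hold identical weight replicas (data parallelism).
weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

# Each GPU gets its own mini-batch slice.
batches = [rng.standard_normal((5, layer_sizes[0])) for _ in range(n_gpus)]

# Per-layer forward pass: after each layer finishes on every GPU,
# that layer's output O is exchanged among GPUs (here: collected
# into `synced`) instead of synchronizing the full weight matrices.
synced = []                      # synced[l][g] = output of layer l on GPU g
activations = list(batches)
for W in weights:
    activations = [sigmoid(a @ W) for a in activations]  # compute per "GPU"
    synced.append([a.copy() for a in activations])       # sync layer's O

print(len(synced), synced[-1][0].shape)
```

Because a layer's output is typically far smaller than the full weight set, synchronizing O per layer (and overlapping it with the next layer's computation) is much cheaper than synchronizing hundreds of megabytes of weights after every step.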

Embodiment 2

[0058] Figure 6 is a schematic diagram of a multi-GPU-based BPNN training device provided by Embodiment 2 of the present invention. As shown in Figure 6, the device includes: a forward calculation unit 10, a reverse error calculation unit 20, and a weight update unit 30.

[0059] The forward calculation unit 10 is used for controlling each GPU to perform the forward calculation of the BPNN, and for synchronizing the forward calculation output among the GPUs.

[0060] The forward calculation and reverse error calculation of the BPNN are performed layer by layer; after each layer's calculation is completed, that layer's output data can be synchronized among the GPUs.

[0061] After the data is passed from the input layer to the first hidden layer, the forward calculation unit 10 controls each GPU to start forward calculation from the first hidden layer, and the forward calculation of each hidden layer can be completed while the o...
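The three units named above can be sketched as follows. The class and method names are hypothetical: the patent describes functional modules of an apparatus, not this Python API. Each "GPU" is an entry in a Python list, and output synchronization is implicit in sharing those lists:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ForwardCalcUnit:
    """Unit 10: forward calculation on every GPU replica; each
    layer's output O is the per-layer synchronization point."""
    def run(self, per_gpu_inputs, weights):
        acts = list(per_gpu_inputs)
        for W in weights:
            acts = [sigmoid(a @ W) for a in acts]  # compute per "GPU"
        return acts                                 # synced outputs O

class ReverseErrorCalcUnit:
    """Unit 20: output-layer error delta on every GPU (sigmoid net)."""
    def run(self, per_gpu_outputs, per_gpu_targets):
        return [(o - t) * o * (1 - o)
                for o, t in zip(per_gpu_outputs, per_gpu_targets)]

class WeightUpdateUnit:
    """Unit 30: every GPU applies the same averaged update built from
    the synchronized activations and errors, so replicas stay equal."""
    def run(self, W, per_gpu_acts, per_gpu_deltas, lr=0.1):
        g = sum(a.T @ d for a, d in zip(per_gpu_acts, per_gpu_deltas))
        W -= lr * g / len(per_gpu_acts)
        return W

# Wiring the units together for one step on a single-layer toy net.
rng = np.random.default_rng(0)
xs = [rng.standard_normal((4, 3)) for _ in range(2)]   # 2 "GPUs"
ts = [rng.integers(0, 2, (4, 2)).astype(float) for _ in range(2)]
W = rng.standard_normal((3, 2)) * 0.1
outs = ForwardCalcUnit().run(xs, [W])
deltas = ReverseErrorCalcUnit().run(outs, ts)
W = WeightUpdateUnit().run(W, xs, deltas)
```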



Abstract

The invention provides a multiple-GPU-based back-propagation neural network (BPNN) training method and apparatus. The method comprises the following steps: S1, controlling all GPUs to carry out BPNN forward calculation and synchronizing forward calculation outputs among all GPUs; S2, controlling all GPUs to carry out BPNN backward error calculation and synchronizing backward error calculation outputs among all GPUs; S3, controlling all GPUs to update the weights of the BPNN according to the synchronized forward calculation outputs and the synchronized backward error calculation outputs. According to the invention, the data synchronization cost of multiple GPUs during BPNN training can be lowered, and the multi-GPU BPNN training efficiency can be improved.
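One S1–S3 training step might look like the following NumPy sketch for a one-hidden-layer BPNN with two simulated GPUs. The all-gather of outputs and errors is modeled by plain Python lists, and every "GPU" applies the same averaged update so the replicas stay identical; this is an assumption consistent with, but not stated verbatim in, the abstract:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
n_gpus, lr = 2, 0.5
W1 = rng.standard_normal((4, 6)) * 0.1   # replicated on every "GPU"
W2 = rng.standard_normal((6, 2)) * 0.1

# One mini-batch per GPU.
xs = [rng.standard_normal((8, 4)) for _ in range(n_gpus)]
ys = [rng.integers(0, 2, (8, 2)).astype(float) for _ in range(n_gpus)]

# S1: forward calculation on each GPU, then synchronize the outputs.
h = [sigmoid(x @ W1) for x in xs]
o = [sigmoid(a @ W2) for a in h]

# S2: backward error calculation on each GPU, then synchronize errors.
d2 = [(oo - y) * oo * (1 - oo) for oo, y in zip(o, ys)]
d1 = [dd @ W2.T * hh * (1 - hh) for dd, hh in zip(d2, h)]

# S3: every GPU updates its replica from the synchronized activations
# and errors (the all-gather is simulated by the Python lists above).
gW2 = sum(hh.T @ dd for hh, dd in zip(h, d2)) / n_gpus
gW1 = sum(x.T @ dd for x, dd in zip(xs, d1)) / n_gpus
W2 -= lr * gW2
W1 -= lr * gW1
```

Because each GPU derives the identical weight update from the gathered outputs and errors, no separate synchronization of the (much larger) weight matrices is needed.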

Description

【Technical field】
[0001] The invention relates to neural network training technology, and in particular to a multi-GPU-based BPNN training method and device.
【Background technique】
[0002] BPNN (Back-Propagation Neural Network) is a multi-layer feedforward network trained by the error back-propagation algorithm, proposed by a team of scientists headed by Rumelhart and McClelland in 1986. It is currently one of the most widely used neural network models.
[0003] The topology of the BPNN model includes an input layer, a hidden layer, and an output layer. The input layer receives input data from the outside world and passes it to the hidden layer; the hidden layer is the internal information processing layer responsible for data processing, and can be designed as a single-hidden-layer or multi-hidden-layer structure; the last hidden layer passes its output to the output layer. After further p...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/08
Inventor: 欧阳剑 (Ouyang Jian), 王勇 (Wang Yong)
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD