Multiple GPUs-based BPNN training method and apparatus

A training method and training apparatus applied in the field of neural network training. They address the high data synchronization overhead and low efficiency of multi-GPU training, reducing data synchronization overhead and improving training efficiency.

Active Publication Date: 2014-08-20

BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

2 Cites, 16 Cited by


Problems solved by technology

[0006] However, the above multi-GPU BPNN training method incurs a large overhead when synchronizing weight data between the BPNNs on the GPUs. The weight data of a large-scale BPNN can reach hundreds of megabytes, and communicating these weights between GPUs can take hundreds of milliseconds, whereas the training computation on a single GPU usually takes only tens of milliseconds. Because the data synchronization overhead between GPUs is so high, training a BPNN with multiple GPUs is inefficient, and is sometimes even slower than training on a single GPU.
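The scale of this overhead can be illustrated with a back-of-the-envelope estimate. The sketch below is not from the patent: the layer sizes, mini-batch size, layer count, and bus bandwidth are all hypothetical values chosen only so that the weight data lands in the "hundreds of megabytes" range described in [0006]. Under these assumptions it compares the time to transfer all weights with the time to transfer the per-layer outputs of one mini-batch.

```python
def transfer_ms(num_bytes, bandwidth_bytes_per_s):
    """Milliseconds needed to move num_bytes at a sustained bandwidth."""
    return 1000.0 * num_bytes / bandwidth_bytes_per_s

# Hypothetical network: 6 weight layers of 4096 x 4096 float32 values,
# trained with a mini-batch of 256 samples (assumed, not from the patent).
BYTES_PER_FLOAT = 4
num_layers, n_in, n_out, batch = 6, 4096, 4096, 256

weight_bytes = num_layers * n_in * n_out * BYTES_PER_FLOAT   # 384 MiB
output_bytes = num_layers * batch * n_out * BYTES_PER_FLOAT  # 24 MiB

# Assumed effective GPU-to-GPU bandwidth of 2 GiB/s (illustrative only).
bandwidth = 2 * 1024**3

weight_ms = transfer_ms(weight_bytes, bandwidth)
output_ms = transfer_ms(output_bytes, bandwidth)
print(f"weights: {weight_ms:.1f} ms, layer outputs: {output_ms:.1f} ms")
```

With these assumed sizes the layer outputs are 16 times smaller than the weights (the ratio n_in / batch), so synchronizing outputs would be correspondingly cheaper. Real ratios depend on the actual network shape, batch size, and interconnect.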



Embodiment 1

[0041] Figure 1 is a flow chart of the multi-GPU-based BPNN training method provided by Embodiment 1 of the present invention. As shown in Figure 1, the method includes:

[0042] S101. Control each GPU to perform the forward calculation, and synchronize the forward calculation output O among the GPUs.

[0043] The forward calculation and reverse error calculation of a BPNN are performed layer by layer, so the calculation output data of a layer can be synchronized among the GPUs as soon as the calculation of that layer is completed.

[0044] After the input layer passes the data to the first hidden layer, each GPU is controlled to start the forward calculation from the first hidden layer. When the forward calculation of a hidden layer is completed and its forward calculation output O is passed to the next hidden layer, the forward calculation output O of that layer is synchronized among the GPUs at the same time, until the last layer of hidden layer transmits the forward c...
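The layer-by-layer scheme of [0043] and [0044] can be sketched in plain Python, with each "GPU" simulated as a list entry. This is a minimal illustration under stated assumptions, not the patent's implementation: the helper names (`layer_forward`, `all_gather`, `forward_with_sync`), the sigmoid activation, and the splitting of the mini-batch across GPUs are all hypothetical. The patent describes the synchronization as happening while the output is passed on to the next layer; this sequential sketch does not model that overlap.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights):
    """inputs: list of samples; weights[j][i] connects input i to unit j."""
    return [[sigmoid(sum(w * x for w, x in zip(row, sample))) for row in weights]
            for sample in inputs]

def all_gather(shards):
    """Stand-in for inter-GPU sync: every GPU receives all shards, in GPU order."""
    merged = [sample for shard in shards for sample in shard]
    return [list(merged) for _ in shards]

def forward_with_sync(batch_shards, layers):
    """Forward pass layer by layer; each layer's output O is synchronized
    among the GPUs right after that layer is computed."""
    local = batch_shards      # local[g] = activations of GPU g's own samples
    synced_outputs = []       # synced_outputs[l][g] = full-batch O of layer l on GPU g
    for weights in layers:
        local = [layer_forward(shard, weights) for shard in local]
        synced_outputs.append(all_gather(local))
    return local, synced_outputs

# Two simulated GPUs, one sample each; two hidden layers (2 -> 2 -> 1 units).
layers = [
    [[0.5, -0.5], [0.25, 0.75]],
    [[1.0, -1.0]],
]
shards = [[[1.0, 0.0]], [[0.0, 1.0]]]
local, synced = forward_with_sync(shards, layers)
```

After the call, `synced[l][g]` holds the full mini-batch output of layer `l` as seen by GPU `g`, and it is identical on every GPU.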

Embodiment 2

[0058] Figure 6 is a schematic diagram of the multi-GPU-based BPNN training device provided in Embodiment 2 of the present invention. As shown in Figure 6, the device includes a forward calculation unit 10, a reverse error calculation unit 20, and a weight update unit 30.

[0059] The forward calculation unit 10 is used for controlling each GPU to perform the forward calculation of the BPNN, and for synchronizing the forward calculation outputs among the GPUs.

[0060] The forward calculation and reverse error calculation of a BPNN are performed layer by layer, so the calculation output data of a layer can be synchronized among the GPUs as soon as the calculation of that layer is completed.

[0061] After the data is passed from the input layer to the first hidden layer, the forward calculation unit 10 controls each GPU to start forward calculation from the first hidden layer, and the forward calculation of each hidden layer can be completed and the forward calculation While the o...


Abstract

The invention provides a back-propagation neural network (BPNN) training method and apparatus based on multiple graphics processing units (GPUs). The method comprises the following steps: S1, controlling all GPUs to carry out the BPNN forward calculation and synchronizing the forward calculation outputs among all GPUs; S2, controlling all GPUs to carry out the BPNN backward error calculation and synchronizing the backward error calculation outputs among all GPUs; S3, controlling all GPUs to update the weights of the BPNN according to the synchronized forward calculation outputs and the synchronized backward error calculation outputs. The invention lowers the data synchronization cost of multiple GPUs during BPNN training and improves the BPNN training efficiency of multiple GPUs.
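Steps S1 to S3 can be illustrated for a single weight layer with simulated GPUs. The following is a sketch under several assumptions that are not in the source: a linear layer, a squared-error loss, a mini-batch split across GPUs, and the hypothetical names `matvec` and `train_step`. Here O stands for the activations feeding the layer (the forward output of the preceding layer) and E for the layer's error. The weights themselves are never exchanged; each GPU derives the same update from the synchronized O and E.

```python
def matvec(weights, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def train_step(gpu_weights, o_shards, target_shards, lr=0.1):
    """One update of a linear layer replicated on several simulated GPUs."""
    # S1: each GPU runs the forward calculation on its own shard,
    #     then the forward outputs O feeding this layer are synchronized.
    local_out = [[matvec(w, o) for o in shard]
                 for w, shard in zip(gpu_weights, o_shards)]
    all_o = [o for shard in o_shards for o in shard]
    # S2: each GPU computes the error E on its shard (assumed squared-error
    #     gradient: output minus target), then E is synchronized.
    local_err = [[[y - t for y, t in zip(out, tgt)] for out, tgt in zip(outs, tgts)]
                 for outs, tgts in zip(local_out, target_shards)]
    all_e = [e for errs in local_err for e in errs]
    # S3: every GPU applies the same full-batch update from the synced O and E.
    n = len(all_o)
    return [[[w[j][i] - lr * sum(e[j] * o[i] for e, o in zip(all_e, all_o)) / n
              for i in range(len(w[0]))] for j in range(len(w))]
            for w in gpu_weights]

# Two simulated GPUs with identical initial weights and one sample each.
init = [[0.0, 0.0]]
new_weights = train_step([init, init],
                         o_shards=[[[1.0, 2.0]], [[3.0, 4.0]]],
                         target_shards=[[[1.0]], [[2.0]]])
# Reference: one GPU processing the whole mini-batch.
single = train_step([init],
                    o_shards=[[[1.0, 2.0], [3.0, 4.0]]],
                    target_shards=[[[1.0], [2.0]]])
```

In this toy run the two GPU replicas end with identical weights, and those weights match the single-GPU update on the full mini-batch, which is the property that makes a separate weight synchronization unnecessary in the sketch.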

Description

【Technical field】

[0001] The invention relates to neural network training technology, in particular to a multi-GPU-based BPNN training method and device.

【Background art】

[0002] A BPNN (Back-Propagation Neural Network) is a multi-layer feedforward network trained by the error backpropagation algorithm, proposed in 1986 by a team of scientists headed by Rumelhart and McClelland. It is currently one of the most widely used neural network models.

[0003] The topology of the BPNN model includes an input layer, a hidden layer and an output layer. The input layer is responsible for receiving input data from the outside world and passing it to the hidden layer. The hidden layer is the internal information processing layer, responsible for data processing, and it can be designed as a single hidden layer or as a multi-hidden-layer structure; the last hidden layer is passed to the output After further p...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06N3/08

Inventors: Ouyang Jian (欧阳剑), Wang Yong (王勇)

Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
