Parallel computing method and device for natural language processing model, equipment and medium

A technology for natural language processing models and computing devices, applied in the field of parallel computing methods, devices, equipment and media for natural language processing models. It can solve the problems of modest acceleration and low applicability, and achieve the effects of reducing computation time and improving efficiency.

Pending Publication Date: 2022-04-15
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

Some relatively mature distributed training technologies already exist, but their acceleration is modest for large-scale models on large numbers of nodes, their applicability is limited, and they require a certain hardware threshold.

Embodiment Construction

[0040] It should be noted that, for deep neural network models in the field of natural language processing, the common acceleration methods for distributed training include data parallelism, model parallelism, and pipeline parallelism:

[0041] In data parallelism, each device stores a full copy of the model parameters; the training data is divided into several parts that are fed into the model replicas for training, and the training results of the devices are synchronized regularly to achieve collaborative training. The disadvantages of this method are that a large amount of data must be communicated during synchronization, and the synchronization overhead slows down training, which becomes more pronounced in distributed training on large numbers of nodes. In addition, each device's storage must be larger than the model itself, which places high demands on the hardware.
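As an illustration of the synchronization cost described above, the following is a minimal sketch of one data-parallel training step, assuming PyTorch and an already-initialized torch.distributed process group (the function names here are illustrative, not taken from the patent):

```python
import torch.distributed as dist

def data_parallel_step(model, optimizer, loss_fn, inputs, targets):
    """One data-parallel step: each rank trains its full replica on its
    own shard of the batch, then gradients are averaged across ranks."""
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    # The all-reduce below is the regular synchronization the paragraph
    # describes; its volume grows with the model size, which is why the
    # overhead becomes pronounced on large numbers of nodes.
    world_size = dist.get_world_size()
    for param in model.parameters():
        if param.grad is not None:
            dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
            param.grad /= world_size
    optimizer.step()
    return loss.item()
```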

[0042] Model parallelism is a parallel method for large-scale neural ...
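Although the text is cut off here, model parallelism in general splits the layers of one model across devices. Below is a minimal sketch of that idea, assuming PyTorch and two GPUs (the class name and layer sizes are illustrative assumptions, not from the patent):

```python
import torch.nn as nn

class TwoDeviceModel(nn.Module):
    """The first half of the layers lives on cuda:0 and the second half
    on cuda:1, so no single device has to hold all of the parameters."""
    def __init__(self, hidden=1024):
        super().__init__()
        self.part1 = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU()).to("cuda:0")
        self.part2 = nn.Linear(hidden, hidden).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        # Move the intermediate activation between devices; this transfer
        # is the communication cost characteristic of model parallelism.
        return self.part2(x.to("cuda:1"))
```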

Abstract

The invention discloses a parallel computing method and device for a natural language processing model, as well as equipment and a medium. In this scheme, the computing devices within each computing node group are trained in a pipeline-parallel mode, while different computing node groups share gradients in a data-parallel mode. Pipeline parallelism can thus be confined to a limited number of nodes, which avoids the problem of the pipeline spanning too many nodes when training on large-scale computing nodes, so the method is well suited to the parallel training of large-scale network models on large-scale computing nodes. Moreover, the scheme hides the synchronous communication between computing node groups inside the pipeline-parallel computation, so that each computing node group can enter the next iteration as soon as its current iteration finishes. In this way, the processing efficiency of the natural language processing model is improved while its processing effect is preserved, the computation time of natural language processing model training is shortened, and the efficiency of distributed training is improved.
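To make the abstract concrete, here is a hedged sketch of how such a hybrid scheme might look, assuming PyTorch process groups; it is one possible reading of the abstract, not the claimed implementation. Each rank runs one pipeline stage within its node group, and matching stages in different groups share gradients through an asynchronous all-reduce so that the synchronization can overlap with remaining computation:

```python
import torch.distributed as dist

def hybrid_parallel_step(stage_module, optimizer, micro_batches, dp_group):
    """One iteration for a single pipeline stage. `dp_group` links this
    stage with the same stage in the other computing node groups."""
    optimizer.zero_grad()
    for mb in micro_batches:
        # Pipeline parallelism: this rank computes only its own stage,
        # micro-batch by micro-batch. A real pipeline would pass
        # activations to the next stage; sum() stands in for a loss here.
        stage_module(mb).sum().backward()
    # Data parallelism across node groups: launch the gradient all-reduce
    # asynchronously so it overlaps with whatever pipeline work remains,
    # hiding the synchronization as the abstract describes.
    handles = [dist.all_reduce(p.grad, group=dp_group, async_op=True)
               for p in stage_module.parameters() if p.grad is not None]
    for h in handles:
        h.wait()  # gradients are needed only just before the update
    group_size = dist.get_world_size(dp_group)
    for p in stage_module.parameters():
        if p.grad is not None:
            p.grad /= group_size
    optimizer.step()
```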

Description

technical field

[0001] The present invention relates to the technical field of natural language processing, and more specifically to a parallel computing method, device, equipment and medium for a natural language processing model.

Background technique

[0002] A natural language processing model is a deep neural network model in the field of natural language processing, mainly used to realize functions such as language translation and question answering. As the number of model parameters continues to grow, model training has become a very time-consuming task. During the training of a neural network, the training data must be fed into the network in batches for computation, and the network's parameters are updated iteratively until its outputs reach the desired accuracy, for example: the translated language is more accurate, or the answers to questions are more reasonable.

[0003] However, ...
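As background for [0002], the following is a minimal sketch of the iterative, batched training loop it describes, assuming PyTorch (all names are illustrative):

```python
import torch

def train(model, data_loader, loss_fn, epochs=10, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for inputs, targets in data_loader:  # data enters in batches
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()   # gradients of the loss w.r.t. parameters
            optimizer.step()  # iterative parameter update
```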


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F9/50, G06F9/54, G06N3/04, G06N3/08
Inventor: 赖志权, 叶翔宇, 李东升, 黄震, 梅松竹, 乔林波
Owner: NAT UNIV OF DEFENSE TECH