Training method and device of pre-training language model, computer equipment and medium

A training method for a pre-trained language model, applied in the field of pre-trained language model training. It addresses problems such as the scarcity of sample data for low-resource languages, the resulting loss of performance, and the narrowed range of downstream tasks that pre-trained language models can serve, with the effect of improving performance, improving cross-lingual representation ability, and widening the range of applicable task categories.

Pending Publication Date: 2022-04-26
ALIBABA DAMO (HANGZHOU) TECH CO LTD
Cites: 0; Cited by: 4

AI Technical Summary

Problems solved by technology

To obtain a natural language processing model suited to a specific downstream task that covers different languages, the pre-trained language model must be fully fine-tuned on large-scale training sample data that relates to the downstream task and covers those languages. International business (for example, cross-border e-commerce, cross-border logistics, and intelligent customer service for international business) often covers different languages, and sample data for the low-resource languages (minor languages) involved is usually scarce. With so little task-related training data in low-resource languages, the pre-trained language model is fine-tuned insufficiently, which reduces the performance of the natural language processing model in low-resource language scenarios. It also means that, for downstream tasks where low-resource language sample data is scarce, fine-tuning the pre-trained language model cannot yield a natural language processing model suited to those tasks, which narrows the range of downstream tasks the pre-trained language model can serve across different language scenarios.

Method used



Examples


Detailed Description of Embodiments

[0049] The present disclosure is described below based on examples, but it is not limited to these examples. The following detailed description sets forth some specific details. Those skilled in the art can fully understand the present disclosure without these details. To avoid obscuring the essence of the present disclosure, well-known methods, processes, and procedures are not described in detail. Additionally, the drawings are not necessarily drawn to scale.

[0050] Application scenarios and application architecture of the present disclosure

[0051] The application scenarios of the embodiments of the present disclosure may include forming a natural language processing model suitable for specific downstream tasks with different language understanding capabilities, so as to use the natural language processing model to complete international business (for example, cross-bo...


Abstract

The invention provides a training method and device for a pre-trained language model, computer equipment, and a medium. The method comprises the following steps: obtaining a training data combination that expresses the same semantics in different languages; inputting the training data combination into a pre-trained language model with understanding capabilities in different languages, so that the model processes the training data combination and produces a corresponding output data combination, and calculating a loss value for the training data combination from the output data combination; and updating the model parameters of the pre-trained language model using the loss value so as to increase the similarity within the output data combination. The method improves the downstream task processing performance of the natural language processing model in low-resource language scenarios and widens the range of downstream task types to which the pre-trained language model applies in multilingual scenarios.
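The three steps in the abstract (obtain a parallel data combination, compute a loss from the model outputs, update the parameters so the outputs become more similar) can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration: the visible text does not state the model architecture or the form of the loss, so the sketch stands in a small linear map for the pre-trained language model, fixed feature vectors for the two language versions of one sentence, and a squared distance between the two outputs as the loss. The names `encode`, `pair_loss`, and `train_step` are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def encode(weights: Matrix, features: Vector) -> Vector:
    """Toy stand-in for the language model: a single linear map (assumption)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def pair_loss(weights: Matrix, src: Vector, tgt: Vector) -> float:
    """Squared distance between the outputs for the two language versions.

    The squared distance is an assumed loss; the source only says the loss
    is computed from the output data combination.
    """
    out_src = encode(weights, src)
    out_tgt = encode(weights, tgt)
    return sum((a - b) ** 2 for a, b in zip(out_src, out_tgt))


def train_step(weights: Matrix, src: Vector, tgt: Vector, lr: float = 0.05) -> float:
    """One gradient-descent update in place; returns the loss before the update."""
    diff_in = [a - b for a, b in zip(src, tgt)]
    diff_out = encode(weights, diff_in)  # equals encode(src) - encode(tgt)
    loss = sum(v * v for v in diff_out)
    # d(loss)/d(weights[i][j]) = 2 * diff_out[i] * diff_in[j]
    for i, row in enumerate(weights):
        for j in range(len(row)):
            row[j] -= lr * 2.0 * diff_out[i] * diff_in[j]
    return loss


# Hypothetical data: features of one sentence in two languages.
weights: Matrix = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.4]]
pair: Tuple[Vector, Vector] = ([1.0, 0.0, 2.0], [0.5, 1.0, 1.5])

for step in range(5):
    loss = train_step(weights, pair[0], pair[1])
    print(f"step {step}: loss before update = {loss:.4f}")
```

Each update shrinks the gap between the two outputs, which is the "improve the similarity of the output data combination" effect in miniature. A distance-only objective like this one can be trivially minimized by mapping everything to the same point, so a practical system would presumably pair it with other training objectives; the visible text does not say which.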

Description

Technical field

[0001] The present disclosure relates to the field of computer technology, and more specifically, to a training method and device for a pre-trained language model, computer equipment, and a medium.

Background

[0002] With the continuous development of deep learning technology and of globalization, the trend toward business internationalization is becoming more pronounced, and demand for artificial intelligence (AI) technology that can handle multiple languages is growing by the day. Currently, in the field of natural language processing, pre-training a language model is one way to achieve such AI. To obtain a natural language processing model suited to a specific downstream task that covers different languages, the pre-trained language model must be fully fine-tuned on large-scale training sample data that relates to the downstream task and covers those languages, while international business (for example, cross-border e-commerce business, cross-border logistics ...

Claims


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/30, G06F40/205, G06K9/62
CPC: G06F40/30, G06F40/205, G06F18/214
Inventors: 刘家豪, 罗福莉, 黄松芳
Owner: ALIBABA DAMO (HANGZHOU) TECH CO LTD