
Training method and device for multi-language semantic representation model, equipment and storage medium

A semantic-representation training technology, applied to inference methods, computing models, semantic analysis, and related fields. It solves problems such as the inability to learn semantic alignment information across different languages, and the resulting inability of multilingual semantic representation models to accurately realize information interaction between different languages, and it achieves a strong practical effect.

Pending Publication Date: 2020-11-27
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, the existing multilingual semantic representation model cannot learn semantic alignment information between different languages during pre-training, so the model cannot accurately realize information interaction between different languages.

Method used


Examples


Embodiment Construction

[0029] Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings. The description includes various details of these embodiments to facilitate understanding, and those details should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the application. Descriptions of well-known functions and constructions are omitted below for clarity and conciseness.

[0030] Figure 1 is a schematic diagram according to the first embodiment of the present application. As shown in Figure 1, this embodiment provides a training method for a multilingual semantic representation model, which may specifically include the following steps:

[0031] S101. Using several training corpora including multiple languages to train the multilingual semantic representation model...
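The remainder of the method description is truncated in this record, but step S101 (multilingual pre-training so that the model learns the semantic representation of each language) can be illustrated with a minimal sketch. The sketch below assumes a masked-language-modeling objective and a generic Transformer encoder; the class `MultilingualEncoder`, the masking ratio, and all hyperparameters are illustrative assumptions, not details taken from the patent.

```python
# Minimal sketch of step S101: masked-LM pre-training over multilingual corpora
# so the encoder learns per-language semantic representations. All names and
# hyperparameters here are assumptions for illustration, not from the patent.
import torch
import torch.nn as nn

MASK_ID, VOCAB_SIZE, HIDDEN = 0, 30_000, 768

class MultilingualEncoder(nn.Module):
    """Stand-in for the multi-language semantic representation model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
        layer = nn.TransformerEncoderLayer(d_model=HIDDEN, nhead=12, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=6)
        self.lm_head = nn.Linear(HIDDEN, VOCAB_SIZE)

    def forward(self, token_ids):
        return self.lm_head(self.encoder(self.embed(token_ids)))

def mask_tokens(token_ids, ratio=0.15):
    """Randomly replace a fraction of tokens with the [MASK] id; return inputs and labels."""
    labels = token_ids.clone()
    mask = torch.rand(token_ids.shape) < ratio
    inputs = token_ids.masked_fill(mask, MASK_ID)
    labels[~mask] = -100                      # unmasked positions are ignored by the loss
    return inputs, labels

model = MultilingualEncoder()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss(ignore_index=-100)

def train_step(batch_token_ids):
    """One masked-LM update on a batch drawn from the multilingual corpora."""
    inputs, labels = mask_tokens(batch_token_ids)
    logits = model(inputs)                    # [batch, seq_len, vocab]
    loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), labels.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```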



Abstract

The invention discloses a training method and device for a multi-language semantic representation model, electronic equipment, and a storage medium, and relates to the field of natural language processing based on artificial intelligence. According to a specific implementation scheme, the method comprises the steps of: adopting a plurality of training corpora containing multiple languages to train a multi-language semantic representation model, so that the multi-language semantic representation model learns the semantic representation capability of the various languages; for each training corpus in the plurality of training corpora, generating a corresponding hybrid language corpus, the hybrid language corpus comprising corpora of at least two languages; and training the multi-language semantic representation model with each hybrid language corpus and the corresponding training corpus, so that the multi-language semantic representation model learns semantic alignment information of different languages. According to this technical scheme, the multi-language semantic representation model can learn semantic alignment information between different languages, semantic interaction between different languages can be achieved on the basis of the model, and the practicability is very high.
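The abstract does not specify how a hybrid language corpus is produced, only that it must contain corpora of at least two languages. The following hypothetical sketch shows one plausible way to derive such a corpus from a monolingual training corpus by swapping some tokens for dictionary translations; the toy dictionary, `make_hybrid_corpus`, and the swap probability are assumptions for illustration only.

```python
# Hypothetical illustration: build a "hybrid language corpus" from a monolingual
# sentence by replacing some tokens with translations in a second language.
# The bilingual dictionary and swap probability are assumed, not from the patent.
import random

BILINGUAL_DICT = {"i": "我", "love": "爱", "music": "音乐", "you": "你"}  # toy dictionary

def make_hybrid_corpus(sentence, swap_prob=0.3, seed=None):
    """Produce a code-switched sentence that pairs with the original corpus."""
    rng = random.Random(seed)
    tokens = sentence.lower().split()
    hybrid = [
        BILINGUAL_DICT[tok] if tok in BILINGUAL_DICT and rng.random() < swap_prob else tok
        for tok in tokens
    ]
    return " ".join(hybrid)

original = "I love music"
hybrid = make_hybrid_corpus(original, swap_prob=0.5, seed=7)
# Training would then feed (hybrid, original) pairs to the model so that
# representations of aligned words in different languages are pulled together.
print(original, "->", hybrid)
```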

Description

Technical Field

[0001] The present application relates to the field of computer technology, in particular to the field of natural language processing based on artificial intelligence, and more particularly to a training method, device, equipment, and storage medium for a multilingual semantic representation model.

Background Technique

[0002] Natural Language Processing (NLP) is a very important subfield of Artificial Intelligence (AI). Most existing learning paradigms for NLP tasks adopt a pre-training and fine-tuning approach: a pre-training task is first used for preliminary modeling on unsupervised corpora, and task data is then used for fine-tuning on downstream tasks. Existing experience shows that the pre-trained model acts as a regularization constraint on the model parameters, which can greatly improve the performance of downstream tasks. Based on the above, and with the continuous development of gl...
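As a rough illustration of the pre-training and fine-tuning paradigm mentioned in the background, the sketch below adds a task-specific head on top of a pre-trained encoder (such as the `MultilingualEncoder` sketched earlier) and updates it on labelled downstream data. `DownstreamClassifier`, `num_labels`, and the use of the first token's representation for classification are assumptions, not details from the patent.

```python
# Generic sketch of the pre-train / fine-tune paradigm: reuse a pre-trained
# encoder, attach a new task head, and train on labelled downstream data.
# Names and design choices here are assumptions for illustration only.
import torch.nn as nn

class DownstreamClassifier(nn.Module):
    """Pre-trained encoder reused for a downstream task, with a new task head."""
    def __init__(self, pretrained_encoder, hidden=768, num_labels=2):
        super().__init__()
        self.encoder = pretrained_encoder              # weights come from pre-training
        self.head = nn.Linear(hidden, num_labels)      # task layer, trained from scratch

    def forward(self, token_ids):
        # Reuse the pre-trained embedding + transformer stack from the
        # MultilingualEncoder sketched earlier (the LM head is not needed here),
        # and classify from the representation of the first token.
        states = self.encoder.encoder(self.encoder.embed(token_ids))   # [B, T, H]
        return self.head(states[:, 0])

def fine_tune_step(model, optimizer, batch_ids, batch_labels):
    """One supervised fine-tuning step on labelled downstream-task data."""
    logits = model(batch_ids)
    loss = nn.functional.cross_entropy(logits, batch_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```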

Claims


Application Information

IPC(8): G06F40/30; G06F40/216; G06F40/237
CPC: G06F40/30; G06F40/237; G06F40/216; G06F40/45; G06F40/284; G06N20/00; G06F40/263; G06N5/04
Inventors: 欧阳轩, 王硕寰, 孙宇
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD