
Method, apparatus, device and computer storage medium for training a semantic representation model

A technology of semantic representation and training corpus, applied in the field of artificial intelligence, which solves the problems of high training cost and the difficulty of collecting enough corpus for training, and achieves the effects of reduced cost and higher training efficiency

Active Publication Date: 2020-12-18
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

Training a pre-trained model requires a large amount of computing resources and is expensive; each model can cost hundreds of thousands or even millions. It is therefore difficult to build enough corpus for every language for training.
For languages with very scarce corpora, such as Czech, it is difficult even to collect enough corpus for training.



Examples


Embodiment 1

[0030] Figure 1 is a flow chart of the method for training a semantic representation model provided in Embodiment 1 of the present application. The method is executed by a device for training a semantic representation model, which may be an application located in a computer system / server, or a functional unit such as a plug-in or software development kit (Software Development Kit, SDK) within such an application. As shown in Figure 1, the method may include the following steps:

[0031] In 101, acquire a semantic representation model that has been trained for a first language as a first semantic representation model.

[0032] Taking English as the first language as an example: since English is an international language, English corpus is usually abundant, so a semantic representation model such as the Transformer model can be trained easily and well using English. In this step, the train...
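
The layer-by-layer migration training of this embodiment can be illustrated with a minimal PyTorch-style sketch. This is not the patent's actual implementation: the `model.layers` structure, the `reinit_weights` initializer, and the `train_until_converged` helper are all illustrative assumptions.

```python
from torch import nn

def reinit_weights(module: nn.Module) -> None:
    """Hypothetical re-initializer applied to a layer before it is re-trained."""
    if isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

def freeze_all_but(layers: nn.ModuleList, trained: set) -> None:
    """Freeze every layer except those whose indices are in `trained`."""
    for idx, layer in enumerate(layers):
        for p in layer.parameters():
            p.requires_grad = idx in trained

def migrate(model: nn.Module, corpus_l2, train_until_converged) -> nn.Module:
    """Layer-by-layer migration of a first-language model to a second language.

    Assumes `model.layers` is an nn.ModuleList of Transformer layers and that
    `train_until_converged(model, corpus)` runs an ordinary training loop
    until the training end condition is reached.
    """
    n = len(model.layers)
    trained = {0, n - 1}                         # bottom and top layer first
    for idx in trained:
        model.layers[idx].apply(reinit_weights)  # initialize the trained layers
    freeze_all_but(model.layers, trained)
    train_until_converged(model, corpus_l2)
    for idx in range(1, n - 1):                  # add untrained layers bottom-up
        trained.add(idx)
        freeze_all_but(model.layers, trained)
        train_until_converged(model, corpus_l2)
    return model
```

Freezing via `requires_grad` keeps the other layers' parameters unchanged, as the method requires, while the trained-layer set grows from bottom to top until every layer has been trained.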

Embodiment 2

[0049] In this embodiment, building on Embodiment 1, a semantic representation model trained in the first language is additionally obtained as a second semantic representation model. The first semantic representation model serves as the basis for the layer-by-layer migration training, while the second serves as a fixed reference: during the training of the semantic representation model for the second language, the first-language results output by the first semantic representation model are aligned with the results output by the second semantic representation model.

[0050] To this end, an additional alignment model is added to assist the migration training of the first semantic representation model; this alignment model performs the alignment processing described above.

[0051] Taking the training in stage (a) of Figure 2 as an example, and as shown in Figure 3, the Chinese-English parallel corpus and the English training corpu...
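
The alignment processing of this embodiment can be sketched as follows, under assumptions the patent does not fix: a single linear projection as the alignment model, and a mean-squared-error objective between the two models' outputs on the first-language half of the parallel corpus. Names such as `AlignmentModel` and `alignment_loss` are illustrative only.

```python
import torch
from torch import nn

class AlignmentModel(nn.Module):
    """Hypothetical alignment head: maps outputs of the model under migration
    into the output space of the frozen second (reference) model."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return self.proj(hidden)

def alignment_loss(migrating_model: nn.Module,
                   reference_model: nn.Module,
                   align_head: AlignmentModel,
                   batch_l1: torch.Tensor) -> torch.Tensor:
    """MSE between aligned first-language outputs and the frozen reference."""
    with torch.no_grad():
        target = reference_model(batch_l1)        # second model, kept fixed
    pred = align_head(migrating_model(batch_l1))  # model under migration
    return nn.functional.mse_loss(pred, target)
```

This loss would be added to the ordinary second-language training objective during the migration, so that the migrated model does not drift away from the first model's first-language semantics.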

Embodiment 3

[0069] Figure 5 is a structural diagram of the device for training a semantic representation model provided in Embodiment 3 of the present application. As shown in Figure 5, the device includes a first acquisition unit 01 and a training unit 02, and may further include a second acquisition unit 03. The main functions of each component unit are as follows:

[0070] The first acquisition unit 01 is configured to acquire a semantic representation model that has been trained for the first language as the first semantic representation model.

[0071] The training unit 02 is configured to: take the bottom layer and the top layer of the first semantic representation model as the trained layers, initialize the trained layers, keep the model parameters of the other layers unchanged, and train the trained layers with the training corpus of the second language until the training end condition is reached; then add the untrained layers to the trained layers one by one from bottom to top, and respectively execute...
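
A minimal sketch of this apparatus as plain Python classes is given below. The unit boundaries follow the description, while the method names and the `load_pretrained_model` loader are hypothetical, and `migrate` refers to the helper sketched under Embodiment 1.

```python
from torch import nn

def load_pretrained_model(language: str) -> nn.Module:
    """Hypothetical loader standing in for however the stored model is read."""
    raise NotImplementedError

class FirstAcquisitionUnit:
    """Unit 01: obtain the model trained for the first language."""
    def acquire(self) -> nn.Module:
        return load_pretrained_model("first_language")

class SecondAcquisitionUnit:
    """Unit 03 (optional): obtain a frozen copy of the first-language model
    to serve as the alignment reference of Embodiment 2."""
    def acquire(self) -> nn.Module:
        model = load_pretrained_model("first_language")
        for p in model.parameters():
            p.requires_grad = False               # keep the reference fixed
        return model

class TrainingUnit:
    """Unit 02: run the bottom-and-top-first, layer-by-layer training."""
    def train(self, model: nn.Module, corpus_l2, train_until_converged) -> nn.Module:
        return migrate(model, corpus_l2, train_until_converged)  # Embodiment 1 sketch
```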



Abstract

The application discloses a method, apparatus, device and computer storage medium for training a semantic representation model, and relates to natural language processing technology within artificial intelligence. The specific implementation is as follows: obtain the semantic representation model trained for a first language as a first semantic representation model; take the bottom layer and the top layer of the first semantic representation model as the trained layers, initialize the trained layers, keep the model parameters of the other layers unchanged, and train the trained layers with the training corpus of a second language until the training end condition is reached; then add the untrained layers to the trained layers one by one from bottom to top, each time keeping the model parameters of all other layers unchanged and training the trained layers with the second-language training corpus until the respective end condition is reached; once all layers have been trained, the semantic representation model for the second language is obtained. The application reduces cost and achieves higher training efficiency.

Description

Technical field

[0001] This application relates to the field of computer application technology, and in particular to artificial intelligence technology.

Background technique

[0002] In recent years, pre-trained models, represented by the BERT (Bidirectional Encoder Representations from Transformers) model, have greatly improved the performance of NLP (Natural Language Processing) tasks. However, current mainstream semantic representation models focus on common languages such as English, Chinese, French, and German, while there are thousands of languages in the world, most of which have far less corpus than English and other common languages; these are called low-resource languages. Training a pre-trained model requires a large amount of computing resources and is expensive; each model can cost hundreds of thousands or even millions. It is therefore difficult to construct enough corpus for every language for training. And...


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06F40/30
CPC: G06F40/30; G06N3/08; G06F40/216; G06F40/284; G06N3/045; G06N3/044; G06N20/00; G06N5/04
Inventors: Wang Shuohuan, Liu Jiaxiang, Ouyang Xuan, Sun Yu, Wu Hua, Wang Haifeng
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD