
Language model training method and device and computer equipment

A language model training method, applied in computing, neural learning methods, biological neural network models, etc., which can solve problems such as the difficulty of migrating general models to vertical fields, the difficulty of labeling task data, and the difficulty of guaranteeing a model matched to specific application scenarios, achieving a better recognition effect.

Pending Publication Date: 2020-10-13
PINGAN INT SMART CITY TECH CO LTD
Cites: 0 | Cited by: 5

AI Technical Summary

Problems solved by technology

[0002] A model trained on general-domain data contains a great deal of general linguistic information, such as lexical and syntactic information, but it lacks the specific semantic information contained in domain-specific data, so it usually needs to be fine-tuned on downstream data before use; otherwise its performance will be poor. Fine-tuning in turn requires a certain amount of data, without which it is difficult to migrate the model from the general domain to a vertical domain.
There are a large number of AI applications in the field of government affairs, but task data is difficult to label, and it is difficult to guarantee that a well-matched trained model can be obtained for the specific application scenario of a real-time request.




Embodiment Construction

[0045] In order to make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.

[0046] Referring to Figure 1, the method for training a language model in this embodiment comprises:

[0047] S1: Input the modified MLM task and the modified NSP task into the first Bert model for training, and obtain the first model parameters corresponding to the first Bert model;
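
As a rough sketch of step S1: the excerpt does not spell out the patent's modifications to the MLM and NSP tasks, so the standard objectives, the bert-base-chinese checkpoint, the toy sentence pair, and the single fixed mask position below are all stand-in assumptions. Joint MLM+NSP pretraining might look like this with the HuggingFace transformers API:

```python
# Sketch of S1: pretrain a first Bert model jointly on MLM and NSP objectives.
# Standard tasks are used as stand-ins for the patent's "modified" tasks.
import torch
from transformers import BertConfig, BertForPreTraining, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
first_model = BertForPreTraining(BertConfig.from_pretrained("bert-base-chinese"))

# Toy sentence pair; next_sentence_label 0 means "sentence B follows sentence A".
enc = tokenizer("政务服务大厅今日开放。", "市民可在线预约办理。", return_tensors="pt")

# Mask a single token for the MLM head (a fixed position, for illustration only).
mlm_labels = torch.full_like(enc["input_ids"], -100)  # -100 is ignored by the loss
pos = 3
mlm_labels[0, pos] = enc["input_ids"][0, pos]         # predict the original token here
enc["input_ids"][0, pos] = tokenizer.mask_token_id    # hide it from the model

out = first_model(**enc, labels=mlm_labels, next_sentence_label=torch.tensor([0]))
out.loss.backward()  # combined MLM + NSP loss; an optimizer step would follow

first_model_parameters = first_model.state_dict()     # the "first model parameters"
```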

[0048] S2: Apply the first model parameters to the second Bert model, and train the second Bert model through the modified MLM task and the modified NSP task, wherein, compared with the first Bert model, the second Bert model compresses the parameter quantity of the FFN layer and expands ...
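
One possible reading of step S2 is sketched below, under assumptions: "compressing the parameter quantity of the FFN layer" is taken to mean shrinking BERT's intermediate_size (the value 1024 is illustrative), and "applying the first model parameters" is implemented by copying every shape-compatible weight from the first model. The truncated "expand ..." clause is not modeled.

```python
# Sketch of S2: build a second Bert model whose FFN layer is compressed, then
# apply the first model's parameters wherever the tensor shapes still match.
from transformers import BertConfig, BertForPreTraining

small_config = BertConfig.from_pretrained("bert-base-chinese")
small_config.intermediate_size = 1024  # compressed FFN width (base model: 3072)
second_model = BertForPreTraining(small_config)

donor = first_model_parameters         # state dict obtained in step S1
target = second_model.state_dict()
carried = {name: w for name, w in donor.items()
           if name in target and target[name].shape == w.shape}
target.update(carried)                 # mismatched FFN weights keep their fresh init
second_model.load_state_dict(target)

# second_model is then trained with the same modified MLM/NSP tasks as in S1.
```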


Abstract

The invention relates to the field of artificial intelligence and discloses a language model training method, which comprises the steps of: inputting a modified MLM task and a modified NSP task into a first Bert model for training, to obtain first model parameters corresponding to the first Bert model; applying the first model parameters to a second Bert model, and training the second Bert model through the modified MLM task and the modified NSP task; judging whether an output result of the second Bert model reaches a preset condition; and, if so, judging that the second Bert model reaches the use standard. The structure and training method of the Bert model are improved, the application range and accuracy of migrating the pre-trained language model to a specific application scenario are expanded, and the improved Bert model is trained on the currently optimized task data, so that the trained model is better suited to the specific application scenario or field, the recognition effect is better, and a model matched to the specific application scenario of a real-time request is ensured. The method is also suitable for the field of smart government affairs, thereby promoting the construction of smart cities.
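
The abstract's acceptance step (judging whether the second Bert model's output reaches a preset condition) might be implemented as a simple evaluation gate. In the sketch below, the preset condition, the 0.90 threshold, the eval_batches iterable, and the choice of NSP accuracy as the metric are all illustrative assumptions; the patent excerpt does not define the condition.

```python
# Sketch of the acceptance check: the "preset condition" is assumed here to be
# NSP accuracy on held-out batches reaching a threshold (both are assumptions).
import torch

def reaches_preset_condition(model, eval_batches, threshold=0.90):
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for enc, nsp_labels in eval_batches:  # (tokenized sentence pair, NSP label)
            logits = model(**enc).seq_relationship_logits
            correct += (logits.argmax(dim=-1) == nsp_labels).sum().item()
            total += nsp_labels.numel()
    return total > 0 and correct / total >= threshold

# If the check passes, the second Bert model "reaches the use standard":
# usable = reaches_preset_condition(second_model, eval_batches)
```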

Description

Technical field

[0001] This application relates to the field of artificial intelligence, and in particular to a language model training method, device, and computer equipment.

Background technique

[0002] A model trained on general-domain data contains a great deal of general linguistic information, such as lexical and syntactic information, but it lacks the specific semantic information contained in domain-specific data, so it usually needs to be fine-tuned on downstream data before use; otherwise its performance will be poor. Fine-tuning in turn requires a certain amount of data, without which it is difficult to migrate the model from the general domain to a vertical domain. There are a large number of AI applications in the field of government affairs, but task data is difficult to label, and it is difficult to guarantee that a well-matched trained model can be obtained for the specific application scenario ...


Application Information

IPC(8): G06N3/08; G06N3/04; G06F40/253; G06F40/30; G06F16/27
CPC: G06N3/08; G06F40/253; G06F40/30; G06F16/27; G06N3/044; Y02D10/00
Inventor: 江新洋
Owner: PINGAN INT SMART CITY TECH CO LTD