Language model pre-training method

A language model pre-training technology in the field of artificial intelligence. It addresses problems of the standard pre-training approach such as unbalanced samples, hindered transfer to Chinese downstream tasks, and randomness in the training outcome, achieving the effect of improving prediction accuracy and prediction results.

Inactive Publication Date: 2019-07-19
人立方智能科技有限公司

AI Technical Summary

Problems solved by technology

This pre-training method has the following problems: 1. It directly predicts the word that should appear at each position; because the vocabulary is huge and high-frequency words dominate every position, the training samples are unbalanced. 2. It models the sequence after word segmentation, which is unfriendly to Chinese, where segmentation is ambiguous, and hinders transfer to Chinese downstream applications. 3. When modeling the relationship between sentence pairs, the negative examples for the upper and lower sentences are constructed arbitrarily, which introduces randomness into the final training effect.



Examples


Embodiment 1

[0014] The company's internal job classification data is used as the corpus, and the goal is to predict the job category corresponding to each piece of work experience. In this example there are 1930 job categories. The network structure uses the Transformer from the BERT model as the backbone: a piece of work experience is input, the feature representation output by the Transformer is aggregated with an Attention mechanism, and a prediction over the 1930 classes is output. The training objective is optimized with cross-entropy, and the parameters of the Transformer are initialized with the values from the pre-trained BERT model. Three groups of experiments are compared: direct prediction, prediction after standard BERT pre-training, and prediction after pre-training with the method proposed in the present invention.
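As a rough illustration of the setup in [0014] (a sketch, not the patent's own code), the following Python snippet builds a 1930-way classifier on top of a pre-trained BERT Transformer, with attention pooling over the output features and a cross-entropy objective. The model name "bert-base-chinese", the additive attention pooling, and all layer sizes are illustrative assumptions.

# Illustrative sketch of Embodiment 1 (assumed details marked): attention pooling
# over BERT Transformer features, followed by a 1930-class classification head.
import torch
import torch.nn as nn
from transformers import BertModel

class JobClassifier(nn.Module):
    def __init__(self, num_classes=1930, bert_name="bert-base-chinese"):  # model name is an assumption
        super().__init__()
        # Transformer parameters initialized from a pre-trained BERT model
        self.encoder = BertModel.from_pretrained(bert_name)
        hidden = self.encoder.config.hidden_size
        # Simple additive attention scoring over token features (assumed form)
        self.attn_score = nn.Linear(hidden, 1)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        token_feats = out.last_hidden_state                  # (batch, seq, hidden)
        scores = self.attn_score(token_feats).squeeze(-1)    # (batch, seq)
        scores = scores.masked_fill(attention_mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)
        pooled = (weights * token_feats).sum(dim=1)          # (batch, hidden)
        return self.classifier(pooled)                       # (batch, 1930)

# Training objective: cross-entropy over the 1930 job categories
loss_fn = nn.CrossEntropyLoss()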

[0015] Here, the pre-training process carried out with the method proposed in this application is as follows:

[0016]...



Abstract

The invention discloses a language model pre-training method. The method comprises the following steps: carrying out word segmentation on the corpora in the model according to characters and sub-words; randomly selecting 15% of the generated tokens for position masking and calculating the masked semantic distribution; controlling the mixing of characters and sub-words in the model by using an independent gating unit; and training the semantic distribution and the prediction of the masked words synchronously. According to the method, the prediction result of the model after BERT pre-training can be obviously improved.
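The independent gating unit that controls the mixing of characters and sub-words is not detailed in the visible text; the following is a minimal Python sketch of one plausible form, in which a learned sigmoid gate interpolates between a character-level and a sub-word-level representation at each position. The class name, dimensions, and gate form are assumptions.

# Minimal sketch (assumed form, not the patent's exact design): a learned gate
# mixing character-level and sub-word-level embeddings position by position.
import torch
import torch.nn as nn

class CharSubwordGate(nn.Module):
    def __init__(self, hidden=768):
        super().__init__()
        # Gate computed from the concatenated character and sub-word features
        self.gate = nn.Linear(2 * hidden, hidden)

    def forward(self, char_emb, subword_emb):
        # char_emb, subword_emb: (batch, seq_len, hidden)
        g = torch.sigmoid(self.gate(torch.cat([char_emb, subword_emb], dim=-1)))
        # Element-wise interpolation controlled by the gate
        return g * char_emb + (1.0 - g) * subword_emb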

Description

Technical field

[0001] The invention belongs to the field of artificial intelligence technology, and in particular relates to a language model pre-training method based on an improved BERT model that mixes characters and sub-words.

Background technique

[0002] Natural language processing is an important branch of artificial intelligence. Pre-trained language models have been proven to be quite effective in practice. A language model (Language Model) is the probability distribution of a sequence of words. Specifically, a language model determines a probability distribution P for a text of length m, indicating the likelihood that this text exists. The more commonly used language pre-training method is pre-training based on the BERT model, which includes the following steps: 1. Prepare a text corpus with upper and lower sentences; 2. Use BPE (byte pair encoding, a simple sub-word segmentation algorithm) to tokenize the text corpus; 3. Mask/replace 15% of the wo...
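To make the definition in [0002] concrete, the probability a language model assigns to a text of length m is conventionally factorized by the chain rule (a standard formulation, not quoted from the patent):

P(w_1, w_2, \ldots, w_m) = \prod_{i=1}^{m} P(w_i \mid w_1, \ldots, w_{i-1})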

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F16/35, G06F17/27
CPC: G06F16/35, G06F40/289, G06F40/30
Inventor: 陈瑶文
Owner: 人立方智能科技有限公司