
BERT model pre-training method, computer device and storage medium

A BERT pre-training technology, applied to computer components, computing, neural-network learning methods and related fields. It addresses the problem that introducing additional pre-training tasks affects the structure and parameters of the BERT model, making the direction and magnitude of its performance changes unstable; its effects include enhanced synonym understanding and recognition, maintained performance, and reduced pre-training time.

Pending Publication Date: 2021-09-24
PING AN TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

The basic principle of the K-BERT model and the ERNIE model is to introduce additional pre-training tasks that use external knowledge during pre-training of the BERT model. However, after introducing these additional pre-training tasks, such techniques alter the structure and parameters of the BERT model itself, which makes the direction and magnitude of the resulting performance changes unstable.




Detailed Description of the Embodiments

[0047] To make the purpose, technical solutions and advantages of the present application clearer, the embodiments of the application are described in detail below with reference to the accompanying drawings. It should be noted that, where no conflict arises, the embodiments of the present application and the features of those embodiments may be combined with one another.

[0048] In this embodiment, referring to Figure 1, the pre-training method for the BERT model includes the following steps:

[0049] S1. Obtain training data;

[0050] S2. Obtain a synonym knowledge graph;

[0051] S3. Perform word-vector embedding on the synonym knowledge graph to obtain a knowledge matrix;

[0052] S4. Determine the mask matrix according to the training data;

[0053] S5. Load the BERT model; wherein, the BERT model includes multiple attention mechanism modules;

[0054] S6. For each attention mechanism module, input the training data, knowledge matrix and mask matrix into that attention mechanism module for processing, and obtain the output of each attention mechanism module (an illustrative sketch of this processing follows this list).
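The patent text does not publish the exact fusion formula, but the following minimal PyTorch sketch shows what one such knowledge-aware attention mechanism module could look like. The class name KnowledgeAttentionHead, the tensor shapes and the additive fusion of the knowledge matrix into the attention scores are assumptions made for illustration; only the inputs (training data, knowledge matrix, mask matrix) come from steps S1 to S6 above.

```python
# Hypothetical sketch only: the class name, tensor shapes and the additive
# fusion of the knowledge matrix into the attention scores are assumptions,
# not the patent's published formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KnowledgeAttentionHead(nn.Module):
    """One attention mechanism module that also receives a knowledge matrix."""

    def __init__(self, d_model: int, d_head: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_head)
        self.k = nn.Linear(d_model, d_head)
        self.v = nn.Linear(d_model, d_head)
        self.scale = d_head ** -0.5

    def forward(self, x, knowledge, mask):
        # x:         (batch, seq, d_model)  embeddings of the training data (S1)
        # knowledge: (batch, seq, seq)      pairwise synonym scores from the knowledge matrix (S3)
        # mask:      (batch, seq, seq)      mask matrix derived from the training data (S4); 0 = ignore
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = torch.matmul(q, k.transpose(-2, -1)) * self.scale
        scores = scores + knowledge                      # embed the external knowledge (assumed additive)
        scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)
        return torch.matmul(weights, v)                  # per-module output (S6)
```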


Abstract

The invention discloses a BERT model pre-training method comprising the following steps: loading a BERT model that calls a plurality of attention mechanism modules; obtaining a mask matrix; inputting the training data, a knowledge matrix and the mask matrix into each attention mechanism module for processing and obtaining the output of each attention mechanism module; performing splicing and linearization on the outputs of the attention mechanism modules to obtain a semantic vector; determining a training loss value from a comparison of the semantic vector with the mask matrix; and adjusting the network parameters of the attention mechanism modules accordingly. Because the knowledge matrix is embedded directly into the multi-head attention mechanism of the BERT model, the model's ability to understand and recognize synonyms in text-matching tasks is enhanced even though no additional pre-training task using external knowledge is introduced. The invention can be widely applied in the field of natural language technology.
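As a continuation of the earlier sketch (reusing its hypothetical KnowledgeAttentionHead class), the fragment below illustrates one way the splicing, linearization, loss computation and parameter update described in the abstract could fit together. The vocabulary-prediction head, target encoding and the AdamW optimizer are assumptions; the patent only states that the loss value is determined by comparing the semantic vector with the mask matrix.

```python
# Hypothetical continuation of the sketch above, reusing KnowledgeAttentionHead.
# The vocabulary-prediction head and the AdamW optimizer are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

heads = nn.ModuleList([KnowledgeAttentionHead(d_model=768, d_head=64) for _ in range(12)])
proj = nn.Linear(12 * 64, 768)        # linearization applied after splicing the head outputs
vocab_head = nn.Linear(768, 30522)    # predict masked tokens over a BERT-sized vocabulary
optimizer = torch.optim.AdamW(
    list(heads.parameters()) + list(proj.parameters()) + list(vocab_head.parameters()),
    lr=1e-4,
)

def pretrain_step(x, knowledge, mask, masked_token_ids):
    # masked_token_ids: (batch, seq) holding the original ids at masked positions and -100 elsewhere
    outputs = [head(x, knowledge, mask) for head in heads]    # run every attention mechanism module
    semantic = proj(torch.cat(outputs, dim=-1))               # splice and linearize -> semantic vector
    logits = vocab_head(semantic)
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        masked_token_ids.reshape(-1),
        ignore_index=-100,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                          # adjust the modules' network parameters
    return loss.item()
```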

Description

Technical field

[0001] The invention relates to the field of natural language technology, and in particular to a BERT model pre-training method, a computer device and a storage medium.

Background

[0002] BERT stands for Bidirectional Encoder Representations from Transformers; it is a deep learning model based on the encoder of the Transformer architecture. After the BERT model has been pre-trained on unlabeled training data, it needs only a small amount of labeled sample data for a specific downstream task, and a small amount of further training, before it can be applied to that task. This property makes the BERT model well suited to natural language processing (NLP) and related fields. At present, however, when the BERT model is applied to natural language processing, it still lacks the ability to understand and make use of synonyms. ...
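To make the pre-train-then-fine-tune property concrete, here is a minimal sketch using the public Hugging Face transformers library with a stock bert-base-chinese checkpoint. It shows the standard paradigm the background section refers to, not the patent's modified model; the example texts, label count and learning rate are placeholders.

```python
# Minimal sketch of the standard pre-train-then-fine-tune pattern using the
# public Hugging Face transformers library and a stock bert-base-chinese
# checkpoint; this is NOT the patent's modified model, and the example texts,
# label count and learning rate are placeholders.
import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# A small amount of labeled sample data is often enough to adapt the pre-trained encoder.
texts = ["这两句话的意思相同。", "这两句话的意思完全不同。"]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=labels)   # forward pass returns the classification loss
outputs.loss.backward()
optimizer.step()
```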


Application Information

IPC (8): G06F16/36; G06F40/247; G06F40/30; G06K9/62; G06N3/04; G06N3/08
CPC: G06F16/367; G06F40/247; G06F40/30; G06N3/04; G06N3/08; G06F18/2415; Y02D10/00
Inventor: 吴天博, 王健宗
Owner: PING AN TECH (SHENZHEN) CO LTD