
BERT model pre-training method, computer device and storage medium

A BERT pre-training technology, applied to computer components, computing, neural-network learning methods and related fields. It addresses the problem that introducing additional pre-training tasks affects the structure and parameters of the BERT model, making the direction and magnitude of its performance changes unstable; its effects include enhanced synonym understanding and recognition, maintained performance, and reduced pre-training time.

Pending Publication Date: 2021-09-24
PING AN TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

The basic principle of the K-BERT model and the ERNIE model is to introduce additional pre-training tasks that use external knowledge during pre-training of the BERT model. However, after introducing these additional pre-training tasks, such techniques alter the structure and parameters of the BERT model itself, which makes the direction and magnitude of the resulting performance changes unstable.




Detailed Description of the Embodiments

[0047] To make the purpose, technical solutions and advantages of the present application clearer, the embodiments of the application are described in detail below with reference to the accompanying drawings. It should be noted that, where no conflict arises, the embodiments of the present application and the features of those embodiments may be combined with one another.

[0048] In this embodiment, referring to Figure 1, the pre-training method for the BERT model includes the following steps:

[0049] S1. Obtain training data;

[0050] S2. Obtain a synonym knowledge graph;

[0051] S3. Perform word-vector embedding on the synonym knowledge graph to obtain a knowledge matrix;

[0052] S4. Determine the mask matrix according to the training data;

[0053] S5. Load the BERT model; wherein, the BERT model includes multiple attention mechanism modules;

[0054] S6. For each attention mechanism module, input the training data, knowledge matrix and mask matrix into that attention mechanism module for processing, and obtain the output of each attention mechanism module (an illustrative sketch of this processing follows this list).
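The patent text does not publish the exact fusion formula, but the following minimal PyTorch sketch shows what one such knowledge-aware attention mechanism module could look like. The class name KnowledgeAttentionHead, the tensor shapes and the additive fusion of the knowledge matrix into the attention scores are assumptions made for illustration; only the inputs (training data, knowledge matrix, mask matrix) come from steps S1 to S6 above.

```python
# Hypothetical sketch only: the class name, tensor shapes and the additive
# fusion of the knowledge matrix into the attention scores are assumptions,
# not the patent's published formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KnowledgeAttentionHead(nn.Module):
    """One attention mechanism module that also receives a knowledge matrix."""

    def __init__(self, d_model: int, d_head: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_head)
        self.k = nn.Linear(d_model, d_head)
        self.v = nn.Linear(d_model, d_head)
        self.scale = d_head ** -0.5

    def forward(self, x, knowledge, mask):
        # x:         (batch, seq, d_model)  embeddings of the training data (S1)
        # knowledge: (batch, seq, seq)      pairwise synonym scores from the knowledge matrix (S3)
        # mask:      (batch, seq, seq)      mask matrix derived from the training data (S4); 0 = ignore
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = torch.matmul(q, k.transpose(-2, -1)) * self.scale
        scores = scores + knowledge                      # embed the external knowledge (assumed additive)
        scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)
        return torch.matmul(weights, v)                  # per-module output (S6)
```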


Abstract

The invention discloses a BERT model pre-training method comprising the following steps: loading a BERT model that calls a plurality of attention mechanism modules; obtaining a mask matrix; inputting the training data, a knowledge matrix and the mask matrix into each attention mechanism module for processing and obtaining the output of each attention mechanism module; performing splicing and linearization on the outputs of the attention mechanism modules to obtain a semantic vector; determining a training loss value from a comparison of the semantic vector with the mask matrix; and adjusting the network parameters of the attention mechanism modules accordingly. Because the knowledge matrix is embedded directly into the multi-head attention mechanism of the BERT model, the model's ability to understand and recognize synonyms in text-matching tasks is enhanced even though no additional pre-training task using external knowledge is introduced. The invention can be widely applied in the field of natural language technology.
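As a continuation of the earlier sketch (reusing its hypothetical KnowledgeAttentionHead class), the fragment below illustrates one way the splicing, linearization, loss computation and parameter update described in the abstract could fit together. The vocabulary-prediction head, target encoding and the AdamW optimizer are assumptions; the patent only states that the loss value is determined by comparing the semantic vector with the mask matrix.

```python
# Hypothetical continuation of the sketch above, reusing KnowledgeAttentionHead.
# The vocabulary-prediction head and the AdamW optimizer are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

heads = nn.ModuleList([KnowledgeAttentionHead(d_model=768, d_head=64) for _ in range(12)])
proj = nn.Linear(12 * 64, 768)        # linearization applied after splicing the head outputs
vocab_head = nn.Linear(768, 30522)    # predict masked tokens over a BERT-sized vocabulary
optimizer = torch.optim.AdamW(
    list(heads.parameters()) + list(proj.parameters()) + list(vocab_head.parameters()),
    lr=1e-4,
)

def pretrain_step(x, knowledge, mask, masked_token_ids):
    # masked_token_ids: (batch, seq) holding the original ids at masked positions and -100 elsewhere
    outputs = [head(x, knowledge, mask) for head in heads]    # run every attention mechanism module
    semantic = proj(torch.cat(outputs, dim=-1))               # splice and linearize -> semantic vector
    logits = vocab_head(semantic)
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        masked_token_ids.reshape(-1),
        ignore_index=-100,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                          # adjust the modules' network parameters
    return loss.item()
```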

Description

Technical field

[0001] The invention relates to the field of natural language technology, and in particular to a BERT model pre-training method, a computer device and a storage medium.

Background

[0002] BERT stands for Bidirectional Encoder Representations from Transformers; it is a deep learning model based on the encoder of the Transformer architecture. After the BERT model has been pre-trained on unlabeled training data, it needs only a small amount of labeled sample data for a specific downstream task, and a small amount of further training, before it can be applied to that task. This property makes the BERT model well suited to natural language processing (NLP) and related fields. At present, however, when the BERT model is applied to natural language processing, it still lacks the ability to understand and make use of synonyms. ...
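To make the pre-train-then-fine-tune property concrete, here is a minimal sketch using the public Hugging Face transformers library with a stock bert-base-chinese checkpoint. It shows the standard paradigm the background section refers to, not the patent's modified model; the example texts, label count and learning rate are placeholders.

```python
# Minimal sketch of the standard pre-train-then-fine-tune pattern using the
# public Hugging Face transformers library and a stock bert-base-chinese
# checkpoint; this is NOT the patent's modified model, and the example texts,
# label count and learning rate are placeholders.
import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# A small amount of labeled sample data is often enough to adapt the pre-trained encoder.
texts = ["这两句话的意思相同。", "这两句话的意思完全不同。"]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=labels)   # forward pass returns the classification loss
outputs.loss.backward()
optimizer.step()
```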


Application Information

IPC (8): G06F16/36; G06F40/247; G06F40/30; G06K9/62; G06N3/04; G06N3/08
CPC: G06F16/367; G06F40/247; G06F40/30; G06N3/04; G06N3/08; G06F18/2415; Y02D10/00
Inventor: 吴天博, 王健宗
Owner: PING AN TECH (SHENZHEN) CO LTD