BERT-FLAT-based Chinese named entity recognition method

A Chinese named entity recognition technology, applied in character and pattern recognition, instruments, biological neural network models, etc. It addresses two problems of existing methods: they cannot represent the polysemy of words, and they do not use lexical information.

Pending Publication Date: 2021-01-26
CHONGQING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0004] Traditional methods share a common problem: they cannot represent the polysemy of words. In addition, character-based named entity recognition methods do not use lexical information, even though lexical boundaries usually play a crucial role in determining entity boundaries.


Examples


Embodiment Construction

[0051] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

[0052] As shown in Figure 1, a Chinese named entity recognition method based on BERT-FLAT includes, but is not limited to, the following steps:

[0053] S1. Preprocess the data set to obtain a preprocessed data set, and divide the preprocessed data set into a training set, a validation set, and a test set.

[0054] The original data set is the MSRA Chinese named entity recognition data set of Microsoft Research Asia. The data set contains 50,000 piec...
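The division in step S1 can be illustrated with a minimal sketch. The text available here does not state the split proportions, so the 8:1:1 ratio, the fixed seed, and the function name `split_dataset` below are assumptions chosen only for illustration, not the patent's settings.

```python
import random


def split_dataset(samples, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle samples and split them into training, validation and test sets.

    The 8:1:1 default is an assumed ratio, not a value taken from the patent.
    """
    if not (0 < train_frac < 1 and 0 <= val_frac < 1 and train_frac + val_frac < 1):
        raise ValueError("fractions must leave a non-empty share for the test set")

    shuffled = list(samples)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility

    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains goes to the test set
    return train, val, test


# Usage with 100 placeholder sentence ids
train_set, val_set, test_set = split_dataset(list(range(100)))
print(len(train_set), len(val_set), len(test_set))
```

Shuffling with a seeded generator keeps the three subsets disjoint and reproducible between runs, which matters when validation scores are compared across experiments.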


Abstract

The invention relates to the field of natural language processing, in particular to a BERT-FLAT-based Chinese named entity recognition method. The method comprises inputting any Chinese sentence into a trained entity recognition model and outputting a part-of-speech tagging result of each sentence in a training set to obtain a named entity recognition result. The entity recognition model, BERT-Flat-Lattice-CRF, is built on a BERT pre-trained language model and a Flat-Lattice structure. The BERT pre-trained language model, learned from a large-scale corpus, computes vector representations of words from their context, so it can represent the polysemy of words and enhance the semantic representation of sentences. The Flat-Lattice structure introduces vocabulary information and fully mines the latent information in the text, achieving a vocabulary enhancement effect and remarkably improving the accuracy of Chinese named entity recognition.
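To make the idea of introducing vocabulary information concrete, the sketch below builds a flat lattice in the commonly described form: every character and every lexicon word matched in the sentence becomes one token annotated with its head and tail character positions. This is an illustration of the general flat-lattice idea only. The function name `build_flat_lattice`, the toy lexicon, and the `max_word_len` limit are hypothetical, and the patent's actual construction may differ.

```python
def build_flat_lattice(sentence, lexicon, max_word_len=4):
    """Return a list of (token, head, tail) spans for a sentence.

    Characters come first, each with head == tail. Every multi-character
    substring found in the lexicon is appended with the indices of its first
    and last character. The lexicon here is a plain set of strings (assumed).
    """
    # One span per character
    spans = [(ch, i, i) for i, ch in enumerate(sentence)]

    # One span per lexicon word matched anywhere in the sentence
    for start in range(len(sentence)):
        longest_end = min(len(sentence), start + max_word_len)
        for end in range(start + 2, longest_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                spans.append((word, start, end - 1))
    return spans


# Usage with a toy lexicon (hypothetical entries)
toy_lexicon = {"重庆", "药店", "人和药店"}
for token, head, tail in build_flat_lattice("重庆人和药店", toy_lexicon):
    print(token, head, tail)
```

Because the matched words keep their head and tail positions, a downstream encoder can recover word boundaries even though the lattice has been flattened into a single sequence.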

Description

Technical field

[0001] The invention relates to the field of natural language processing, in particular to a Chinese named entity recognition method based on BERT-FLAT.

Background technique

[0002] Named entity recognition (NER) technology identifies specific entity information in text, such as person names, place names, and organization names. It is widely used in fields such as information extraction, information retrieval, intelligent question answering, and machine translation. Generally, the NER task is formalized as a sequence labeling task, in which the entity boundary and entity type are predicted jointly by predicting a label for each character or each word.

[0003] With the rapid development of neural networks, end-to-end solutions that do not rely on hand-crafted features have gradually become the mainstream of NER technology. The first is the LSTM-CRF model based on a unidirectional long short-term memory (LSTM) neural network. Based on the excellent sequence mode...
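The sequence labeling formulation in [0002] can be illustrated by decoding per-character labels back into entities. The text does not name a tagging scheme, so the BIO scheme (`B-` begins an entity, `I-` continues it, `O` is outside), the type names `PER` and `LOC`, and the function name `decode_bio` are assumptions made for this sketch.

```python
def decode_bio(chars, tags):
    """Turn per-character BIO tags into (text, type, start, end) entities.

    An entity opens at a B-X tag and extends over the following I-X tags of
    the same type, so boundary and type are read off the labels together.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    entities = []
    start, etype = None, None
    # A trailing "O" sentinel closes any entity still open at the end
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == etype
        if start is not None and not continues:
            entities.append(("".join(chars[start:i]), etype, start, i - 1))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


# Usage: a person name followed by a place name
print(decode_bio("小明在重庆", ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]))
```

A stray `I-` tag with no preceding `B-` tag of the same type is ignored in this sketch; other conventions treat it as the start of a new entity.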


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/295, G06F40/30, G06K9/62, G06N3/04
CPC: G06F40/295, G06F40/30, G06N3/045, G06F18/214
Inventor: 张璞, 王重阳, 刘华东, 熊安萍
Owner: CHONGQING UNIV OF POSTS & TELECOMM