Chinese named entity identification method and Chinese named entity identification device based on RoBERTa-BiGRU-LAN model

A named entity recognition technology based on RoBERTa-BiGRU-LAN, applied in neural learning methods, biological neural network models, instruments, etc. It addresses the problems of high model complexity, insufficient training time, and small training data sets, with the effect of reducing the number of model parameters, speeding up model convergence, and improving accuracy.

Active Publication Date: 2020-09-04
PLA STRATEGIC SUPPORT FORCE INFORMATION ENG UNIV PLA SSF IEU

AI Technical Summary

Problems solved by technology

[0004] (1) The word2vec model is a static word vector model and cannot handle polysemy (one word with several meanings). The differences between a word's meanings interfere with the final named entity recognition result.
[0005] (2) When the traditional BERT model is used for word embedding, static masking, small training data sets, and insufficient training time leave representation learning inadequate; in addition, models that use BERT optimize slowly and perform weakly.
[0006] (3) Compared with the traditional RNN, the BiLSTM model has too many parameters and higher model complexity.
[0007] (4) CRF adds no additional information to the sequence, and its computational complexity is high.
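
The parameter argument in item (3) can be made concrete with a rough count. The sketch below is an illustration, not the patent's own calculation: it assumes the textbook cell definitions (four weight blocks per LSTM direction, three per GRU direction) with one bias vector per block, and the sizes 768 and 256 are hypothetical. Some frameworks keep two bias vectors per block, which changes the totals slightly but not the 3:4 ratio.

```python
def rnn_params(input_size: int, hidden_size: int, gate_blocks: int) -> int:
    """Trainable parameters of one recurrent direction.

    Each gate block holds an input weight matrix (hidden x input),
    a recurrent weight matrix (hidden x hidden), and one bias vector.
    """
    per_block = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return gate_blocks * per_block


def bilstm_params(input_size: int, hidden_size: int) -> int:
    # LSTM: input, forget, and output gates plus the candidate cell state.
    return 2 * rnn_params(input_size, hidden_size, gate_blocks=4)


def bigru_params(input_size: int, hidden_size: int) -> int:
    # GRU: update and reset gates plus the candidate hidden state.
    return 2 * rnn_params(input_size, hidden_size, gate_blocks=3)


# Hypothetical sizes: 768-dimensional embeddings, 256 hidden units per direction.
d, h = 768, 256
lstm_total = bilstm_params(d, h)
gru_total = bigru_params(d, h)
print(f"BiLSTM: {lstm_total:,}  BiGRU: {gru_total:,}  ratio: {gru_total / lstm_total:.2f}")
```

Under these assumptions a BiGRU layer carries three quarters of the parameters of a BiLSTM layer of the same size, which is consistent with the reduction in parameters that the summary claims.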

Examples

Embodiment Construction

[0050] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments, without creative work, fall within the scope of protection of the present invention.

[0051] As shown in Figure 1, the Chinese named entity recognition method based on the RoBERTa-BiGRU-LAN model in this embodiment includes the following steps:

[0052] Step S1: obtain the labeled corpus and build a training data set. This specifically includes the following:

[0053] Step S11, the original sentence is segmented using a word seg...

Abstract

The invention belongs to the technical field of named entity recognition and relates in particular to a Chinese named entity recognition method and device based on a RoBERTa-BiGRU-LAN model. The method comprises the following steps: converting the Chinese corpus to be processed into a word vector sequence; inputting the word vector sequence into the first BiGRU-LAN layer of the RoBERTa-BiGRU-LAN model to obtain a coding sequence fused with local information; inputting the coding sequence into the second BiGRU-LAN layer of the model to obtain an attention distribution fused with global information; and obtaining the named entity recognition result from that attention distribution. The improved word embedding method represents Chinese text better, and replacing BiLSTM-CRF with BiGRU-LAN reduces the number of model parameters, lowers model complexity, and saves training time.
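
The final step of the abstract, reading labels off an attention distribution, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the excerpt does not say how attention is computed, so the sketch assumes single-head scaled dot-product attention between token encodings and label embeddings. The function names (`label_attention`, `decode`), the three-label tag set, and the toy vectors are hypothetical, and the BiGRU encoder is replaced by hand-written hidden states.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(hidden, label_emb):
    """Attend from each token encoding to every label embedding.

    hidden:    n tokens x d (stands in for BiGRU outputs)
    label_emb: m labels x d
    Returns (attn, context): attn is n x m, context is n x d.
    """
    d = len(label_emb[0])
    attn, context = [], []
    for h in hidden:
        scores = [sum(a * b for a, b in zip(h, lab)) / math.sqrt(d) for lab in label_emb]
        weights = softmax(scores)
        attn.append(weights)
        # Attention-weighted mix of label embeddings; in a stacked model this
        # would be passed on to the next encoder layer.
        context.append([sum(w * lab[k] for w, lab in zip(weights, label_emb)) for k in range(d)])
    return attn, context


def decode(attn, labels):
    """For each token, pick the label with the highest attention weight."""
    return [labels[max(range(len(row)), key=row.__getitem__)] for row in attn]


# Toy example: one-hot label embeddings and three hand-written token encodings.
labels = ["O", "B-PER", "I-PER"]
label_emb = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
hidden = [[0.0, 5.0, 0.0], [0.0, 0.0, 5.0], [5.0, 0.0, 0.0]]

attn, context = label_attention(hidden, label_emb)
print(decode(attn, labels))  # ['B-PER', 'I-PER', 'O']
```

Decoding here is a per-token argmax over the attention weights, so no sequence-level search over label paths is needed at inference time.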

Description

Technical field

[0001] The invention belongs to the technical field of named entity recognition, and in particular relates to a Chinese named entity recognition method and device based on the RoBERTa-BiGRU-LAN model.

Background

[0002] Entities are an important carrier of semantic information in text and the core units of a knowledge graph. Named Entity Recognition (NER) aims to extract these valuable entities (person names, place names, institution names, proper nouns, events, etc.) from text to meet the needs of various industries. Named entity recognition is one of the key steps in natural language processing, an important basis for building knowledge graphs, and one of the core technologies in intelligent search and intelligent question answering. It is of great significance for accomplishing such tasks and for realizing knowledge-supported artificial intelligence.

[0003] The current Chinese named entity recognition method widel...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/295, G06F40/284, G06N3/04, G06N3/08
CPC: G06F40/295, G06F40/284, G06N3/08, G06N3/045
Inventor: 李邵梅, 胡新棒, 黄瑞阳, 李辉, 胡楠, 郑洪浩
Owner: PLA STRATEGIC SUPPORT FORCE INFORMATION ENG UNIV PLA SSF IEU