
A Named Entity Recognition Method and System Based on a Flat-Lattice Enhanced Linear Transformer

A technique combining named entity recognition with a linear Transformer, applied to neural learning methods, instruments, biological neural network models, etc. It addresses the increased model complexity, slow operation speed, and difficulty of parallel execution found in existing approaches.

Active Publication Date: 2021-08-20
HANGZHOU YIWISE INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

Some existing classic algorithms, such as Lattice LSTM, Soft-Lexicon, LR-CNN, and LGN, are very complex. Lattice LSTM only obtains lexical information for words ending at each character and cannot be parallelized. Compared with Lattice LSTM, Soft-Lexicon introduces more word boundary information, and its weights are static statistics, which reduces the amount of computation, but its accuracy and operation speed still need improvement. LR-CNN feeds high-level features back as input to adjust the word weights and resolve word conflicts, which effectively improves accuracy but also increases model complexity and slows operation. Similarly, LGN is a graph-structured algorithm; because the order of the input sequence matters for named entity recognition, graph-neural-network-based recognition still needs an RNN as the underlying encoder to capture order. It therefore also requires a complex structure, and model training is very slow.
[0005] Most existing named entity recognition models therefore focus on recognition accuracy. Improving accuracy often means sacrificing operation speed, and the accuracy gains themselves are very limited, which is a serious drawback in practical applications.
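The lexical information these lexicon-based models rely on comes from matching dictionary words against the character sequence. The following is a minimal sketch, not the patent's implementation: the function names `match_lexicon` and `words_ending_at`, the toy lexicon, and the `max_len` limit are all assumptions made for illustration. It lists every matched word with its start and end character index, then groups the words by their last character, which is the only view a model has when it uses just the words ending at each character.

```python
def match_lexicon(chars, lexicon, max_len=4):
    """Return (word, start, end) for every lexicon word found in chars.

    start and end are inclusive character indices; only words of
    length 2..max_len are considered (hypothetical limit).
    """
    matches = []
    n = len(chars)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len)):
            word = "".join(chars[start:end + 1])
            if word in lexicon:
                matches.append((word, start, end))
    return matches


def words_ending_at(matches, n_chars):
    """Group matched words by the index of their last character."""
    grouped = [[] for _ in range(n_chars)]
    for word, _start, end in matches:
        grouped[end].append(word)
    return grouped


# Toy input: both the sentence and the lexicon are illustrative assumptions.
sentence = list("重庆人和药店")
toy_lexicon = {"重庆", "重庆人", "人和药店", "药店"}

matches = match_lexicon(sentence, toy_lexicon)
by_end = words_ending_at(matches, len(sentence))
```

Keeping the full `(word, start, end)` triples preserves both word boundaries, whereas the grouped view keeps only the end position of each word.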



Examples


Embodiment

[0103] To demonstrate the effect of the present invention, this embodiment trains the model on four Chinese named entity recognition data sets: Ontonotes NER, MSRA NER, Resume NER, and Weibo NER. The four data sets are introduced as follows:

[0104] (1) Ontonotes NER data set: Ontonotes 5.0 consists of 1745k English, 900k Chinese, and 300k Arabic text data drawn from a rich range of sources, including telephone conversations, newswire, broadcast news, broadcast conversations, and blogs. It covers 18 entity categories such as Person, Organization, and Location.

[0105] (2) MSRA NER data set: an open-source named entity recognition data set annotated by Microsoft Research Asia. It contains more than 50,000 annotated samples, with entity types such as locations, organizations, and persons.

[0106] (3) Resume NER data set: a résumé data set in which entities are uniformly annotated with the BIOES scheme, and the ent...
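The BIOES scheme mentioned for the Resume data set tags each token as Begin, Inside, End, Single, or Outside of an entity. A minimal sketch of converting entity spans to BIOES tags follows; the function name `spans_to_bioes`, the example sentence, and the label names `NAME` and `LOC` are assumptions for illustration and are not taken from the data set itself.

```python
def spans_to_bioes(n_tokens, spans):
    """Convert (start, end, label) spans with inclusive indices to BIOES tags."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        if start == end:
            # A one-token entity gets the Single tag.
            tags[start] = f"S-{label}"
        else:
            tags[start] = f"B-{label}"
            for i in range(start + 1, end):
                tags[i] = f"I-{label}"
            tags[end] = f"E-{label}"
    return tags


# Illustrative character-level example (hypothetical sentence and labels).
tokens = list("张三在北京工作")
spans = [(0, 1, "NAME"), (3, 4, "LOC")]
tags = spans_to_bioes(len(tokens), spans)
```

Compared with plain BIO tagging, the explicit E and S tags mark where an entity ends, which gives a sequence labeler a direct signal for entity boundaries.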


Abstract

The invention discloses a named entity recognition method and system based on a flat-lattice enhanced linear Transformer, and belongs to the field of named entity recognition in natural language processing. First, text sequence samples are obtained, the entity label categories are annotated, and each text sequence is converted into a lattice structure. The named entity recognition model is then trained by minimizing the negative log-likelihood loss. During recognition, the text sequence to be recognized is preprocessed and fed to the trained model, and the recognition result is the output with the maximum prediction score. The invention introduces lexical information through a more efficient flat-lattice structure for lexical enhancement, which provides the model with prior knowledge and entity word boundary information and improves its accuracy on entity boundaries and entity types. Context is modeled with a linear Transformer, which reduces model complexity, significantly improves operating efficiency, and gives the method higher practical value.
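The reduced complexity attributed to the linear Transformer can be illustrated with kernelized linear attention. This is a sketch under stated assumptions, not the patent's formulation, which this excerpt does not show: it uses the `elu(x) + 1` feature map common in linear attention, NumPy, and hypothetical function names. Instead of building the n × n attention matrix, it first summarizes keys and values into a small matrix, so cost grows linearly with sequence length n.

```python
import numpy as np


def feature_map(x):
    """elu(x) + 1, a positive feature map (an assumed choice)."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v):
    """Attention in O(n * d * d_v) time without forming an n x n matrix."""
    phi_q = feature_map(q)            # (n, d)
    phi_k = feature_map(k)            # (n, d)
    kv = phi_k.T @ v                  # (d, d_v) summary of keys and values
    z = phi_k.sum(axis=0)             # (d,) normalizer summary
    numerator = phi_q @ kv            # (n, d_v)
    denominator = phi_q @ z           # (n,)
    return numerator / denominator[:, None]


def quadratic_reference(q, k, v):
    """Same result computed the O(n^2) way, for comparison only."""
    scores = feature_map(q) @ feature_map(k).T          # (n, n)
    weights = scores / scores.sum(axis=1, keepdims=True)
    return weights @ v


# Random toy inputs: 6 positions, key dimension 4, value dimension 3.
rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(6, 4))
v = rng.normal(size=(6, 3))

out_linear = linear_attention(q, k, v)
out_quadratic = quadratic_reference(q, k, v)
```

Both functions compute the same output; the only difference is the order of the matrix products, which is where the saving over a standard softmax-attention Transformer comes from.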

Description

Technical field

[0001] The invention relates to the field of named entity recognition in natural language processing, and in particular to a named entity recognition method and system based on a flat-lattice enhanced linear Transformer.

Background

[0002] In recent years, with the rapid development of artificial intelligence technology, text data such as news content, personal blogs, intelligent customer service dialogues, and electronic medical records has come to account for an ever larger share of data, and it contains a great deal of useful information and huge potential value. Named entity recognition refers to recognizing entity nouns with specific meanings in text, such as place names, institution names, person names, and proper nouns. Named entity recognition is one of the indispensable steps in advanced natural language processing tasks such as machine translation, question answering systems, knowledge graph construction and inform...


Application Information

IPC(8): G06F40/295, G06N3/04, G06N3/08
CPC: G06F40/295, G06N3/08, G06N3/044
Inventors: 陈哲乾, 李一夫, 马一凡
Owner: HANGZHOU YIWISE INTELLIGENT TECH CO LTD