Chinese named entity recognition method based on reading comprehension

A named entity recognition and reading comprehension technology, applied in the field of natural language processing. It addresses the limitation that existing methods recognize entities only within a sentence, and achieves good guidance and good recognition performance.

Inactive Publication Date: 2020-12-18
KUNMING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0004] The invention provides a Chinese named entity recognition method based on reading comprehension. It addresses the problem that existing recognition methods can only recognize entities within a single sentence: the invention performs entity recognition across a whole document, and the recognition effect is good.



Examples


Embodiment 1

[0032] Example 1: As shown in Figures 1-3, the Chinese named entity recognition method based on reading comprehension includes the following steps:

[0033] Step 1. Collect and organize the public MSRA dataset, perform word segmentation on the document-level corpus, and obtain the document-level sequence;

[0034] In Step 1 of the present invention, the public MSRA dataset is collected and organized. For each entity type in the MSRA dataset, a retrieval label question is constructed from its "annotation description". The retrieval label questions can be constructed manually, by writing software, or in other ways known in the prior art. For example, suppose an annotator wants to label all entities whose category label is LOC (location), and the label description for LOC is "country, city, mountains"; then the retrieval label question constructed for LOC from the label description is "find the countr...
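The construction of retrieval label questions (the "retrieval label problem" above) and of the resulting triples can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the `Triple` class, the helper functions, and in particular the question template, whose wording is hypothetical because the patent's own question text is truncated above. Only the LOC description "country, city, mountains" comes from the text.

```python
from dataclasses import dataclass

# Label descriptions keyed by entity category.
# Only the LOC entry is taken from the text; others would be added the same way.
LABEL_DESCRIPTIONS = {
    "LOC": "country, city, mountains",
}


@dataclass
class Triple:
    question: str  # retrieval label question built from the label description
    answers: list  # entity strings of the queried category in the document
    context: str   # document-level sequence


def build_question(label, descriptions,
                   template="Which spans mention: {description}?"):
    # The template wording is hypothetical, not the patent's own phrasing.
    return template.format(description=descriptions[label])


def build_triple(label, document, annotations, descriptions=LABEL_DESCRIPTIONS):
    # annotations: list of (entity_text, category_label) pairs for this document
    answers = [text for text, tag in annotations if tag == label]
    return Triple(build_question(label, descriptions), answers, document)


document = "余正涛在昆明工作，昆明位于中国。"
annotations = [("余正涛", "PER"), ("昆明", "LOC"), ("中国", "LOC")]
triple = build_triple("LOC", document, annotations)
print(triple.question)
print(triple.answers)
```

One question is built per entity category, so a document with several categories would yield one triple per category, each sharing the same context.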



Abstract

The invention relates to a Chinese named entity recognition method based on reading comprehension, and belongs to the technical field of natural language processing. The method comprises the steps of: performing word segmentation on a document-level corpus to obtain a document-level sequence; obtaining a triple consisting of a retrieval label question, a document-level sequence entity, and a document-level sequence; taking the retrieval label question and the document-level sequence in the triple as input, and generating, through a BERT encoding layer, a hidden output that incorporates document-level context information; passing this hidden output through a convolutional neural network to obtain semantic features of the long-distance context, capturing the semantic information of the whole document context and compressing it into a feature map; and, using the semantic information of the whole document context, predicting all entities in the document through a prediction layer, which predicts the start index and end index of each entity and splices the start and end indexes to generate the named entities. With the invention, entity recognition can be performed at the document level, and the recognition effect is good.
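The final step, splicing predicted start and end indexes into named entities, can be sketched as follows. This is a minimal sketch under stated assumptions, not the patent's prediction layer: it assumes the layer has already produced one binary flag per token for "entity starts here" and one for "entity ends here", and it pairs each start with the nearest end at or after it, a matching heuristic the source does not specify. The function name `decode_spans` is hypothetical.

```python
def decode_spans(start_flags, end_flags, tokens):
    """Pair each predicted start index with the nearest end index at or after it.

    start_flags / end_flags: per-token 0/1 predictions (assumed given).
    tokens: the document-level sequence as a list of tokens.
    Returns a list of (start, end, entity_text) with inclusive indexes.
    """
    end_positions = [i for i, flag in enumerate(end_flags) if flag]
    entities = []
    for start, flag in enumerate(start_flags):
        if not flag:
            continue
        # Nearest predicted end index that is not before this start.
        end = next((e for e in end_positions if e >= start), None)
        if end is not None:
            entities.append((start, end, "".join(tokens[start:end + 1])))
    return entities


tokens = list("昆明位于中国")
start_flags = [1, 0, 0, 0, 1, 0]
end_flags = [0, 1, 0, 0, 0, 1]
spans = decode_spans(start_flags, end_flags, tokens)
print(spans)
```

A start with no end at or after it is dropped rather than guessed, which keeps the output limited to spans that both predictions support.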

Description

Technical field

[0001] The invention relates to a Chinese named entity recognition method based on reading comprehension, and belongs to the technical field of natural language processing.

Background art

[0002] Named Entity Recognition (NER), also known as entity identification, entity segmentation, or entity extraction, is a subtask of information extraction that aims to locate named entity mentions in unstructured text and classify them into predefined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and percentages. It is a fundamental and key NLP task that has been studied for many years. From the perspective of natural language processing, NER can be regarded as a kind of unregistered (out-of-vocabulary) word recognition in lexical analysis. It is the problem with the largest number of unregistered words, the most difficult recognition, and the greatest impact on the word segme...


Application Information

IPC(8): G06F40/295, G06F40/289, G06F40/30, G06K9/62, G06N3/04, G06N3/08
CPC: G06F40/295, G06F40/289, G06F40/30, G06N3/049, G06N3/08, G06N3/045, G06F18/24
Inventor: 余正涛, 刘奕洋, 高盛祥, 郭军军, 张亚飞, 毛存礼
Owner: KUNMING UNIV OF SCI & TECH