
Cross-document entity identification method combined with multi-task learning

A technology combining multi-task learning and entity recognition, applied in the field of natural language processing, that addresses the problem of inconsistent labels between documents and improves entity recognition accuracy.

Pending Publication Date: 2021-02-09
湖南国发控股有限公司
0 cites, 2 cited by

AI Technical Summary

Problems solved by technology

The above two types of methods ignore the relationships between sentences in different documents, which can easily lead to inconsistent tags across documents.



Examples


Embodiment 1

[0034] Referring to Figures 1-5, this embodiment uses the medical text corpus NCBI-disease as an example. Figure 5 shows part of the prediction results: the first column is the original text, with each row corresponding to one token; the second column is the true label (such as B-Disease); and the third column is the predicted label. The experimental evaluation uses the F1 value, and the evaluation script is conlleval.pl.
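The three-column layout described above (token, true label, predicted label) lends itself to an entity-level F1 computation. The following is a minimal sketch in the spirit of conlleval.pl, not a port of that script: the function names `extract_spans` and `entity_f1` are hypothetical, the sample rows are invented for illustration and are not taken from Figure 5, and sentence boundaries are ignored for brevity.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, entity type)
Row = Tuple[str, str, str]   # (token, true label, predicted label)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and label == etype
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        # A stray I- tag with no open entity is treated as a new entity.
        if prefix == "B" or (prefix == "I" and start is None):
            start, etype = i, label
    return spans


def entity_f1(rows: List[Row]) -> float:
    """Entity-level F1: a prediction counts only if span and type both match."""
    gold = extract_spans([g for _, g, _ in rows])
    pred = extract_spans([p for _, _, p in rows])
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical rows for illustration only.
rows = [
    ("colorectal", "B-Disease", "B-Disease"),
    ("cancer", "I-Disease", "I-Disease"),
    ("and", "O", "O"),
    ("ataxia", "B-Disease", "O"),
    ("telangiectasia", "I-Disease", "B-Disease"),
]
print(entity_f1(rows))
```

In this sample, one of the two true entities is predicted exactly and the other is only partially covered, so precision and recall are both 1/2. Scoring whole spans rather than individual tokens is what makes partial matches count as errors.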


Abstract

The invention discloses a cross-document entity recognition method combined with multi-task learning. The overall architecture of the method comprises a data preprocessing module, a word embedding and character embedding module, a sentence-level BiLSTM, a joint cross-document CRF module, a cross-document attention module, and a multi-classification and loss calculation module based on multi-task learning. The method uses an attention mechanism to generate a cross-document semantic representation of each token, and uses multi-task learning to design an auxiliary task that improves entity recognition accuracy. No extra features such as part-of-speech tags are needed; repeated occurrences of the same token in different documents are exploited to establish cross-document semantic associations.
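One way to picture the cross-document attention idea is the sketch below. It is illustrative only and makes several assumptions the abstract does not state: scaled dot-product scoring, exact string matching to find repeated tokens, and concatenation of the local vector with the attended context. The function name `cross_document_attention` and the toy vectors are hypothetical; in the described architecture the per-token vectors would come from the sentence-level BiLSTM.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Document = List[Tuple[str, Vector]]  # (token, local representation)


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def cross_document_attention(docs: List[Document]) -> List[List[Vector]]:
    """Append to each token vector a context vector attended over
    occurrences of the same token in the other documents."""
    # Index every occurrence by surface form (assumption: exact match).
    index: Dict[str, List[Tuple[int, Vector]]] = {}
    for d, doc in enumerate(docs):
        for token, vec in doc:
            index.setdefault(token, []).append((d, vec))

    output: List[List[Vector]] = []
    for d, doc in enumerate(docs):
        doc_out: List[Vector] = []
        for token, vec in doc:
            others = [v for (od, v) in index[token] if od != d]
            if others:
                scale = math.sqrt(len(vec))
                weights = softmax([dot(vec, v) / scale for v in others])
                context = [
                    sum(w * v[k] for w, v in zip(weights, others))
                    for k in range(len(vec))
                ]
            else:
                # Token never appears elsewhere: no cross-document signal.
                context = [0.0] * len(vec)
            doc_out.append(vec + context)
        output.append(doc_out)
    return output


# Toy input: "BRCA1" appears in both documents, "is" only in the first.
docs = [
    [("BRCA1", [1.0, 0.0]), ("is", [0.0, 1.0])],
    [("BRCA1", [0.0, 2.0])],
]
enriched = cross_document_attention(docs)
print(enriched[0][0])
```

Each enriched vector has twice the input dimension: the first half is the token's own representation and the second half summarizes how the same token is represented in other documents, which is the signal a downstream tagger could use to keep labels consistent.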

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, and in particular to a cross-document entity recognition method combined with multi-task learning.

Background

[0002] Named Entity Recognition (NER), also known as "proper name recognition", refers to identifying entities with specific meanings in text, mainly names of people, places, and institutions, and other proper nouns. NER is an important basic tool in application fields such as information extraction, question answering, syntactic analysis, machine translation, and metadata annotation for the Semantic Web, and it plays an important role in making natural language processing technology practical. There are many entity recognition methods. Among them, rule-based methods require high-quality dictionaries, and traditional machine learning methods need to use NLP tools to manually set effective rules, which is tim...


Application Information

IPC(8): G06F40/295, G06F40/30, G06N3/04
CPC: G06F40/295, G06F40/30, G06N3/044, G06N3/045
Inventors: 王东升, 范红杰, 胡振宇, 柳军飞
Owner: 湖南国发控股有限公司