Method for automatically labeling entities in medical text

Automatic entity labeling technology, applied in medical informatics, medical data mining, and related fields, addresses the problems of manual labeling: heavy demands on manpower and material resources, few entity types, and labeling errors.

Pending Publication Date: 2019-10-18
北京百奥知信息科技有限公司

AI Technical Summary

Problems solved by technology

[0002] In medical named entity recognition, the usual approach is to train a model and then use that model to label entities. Most of the training data is labeled by hand, which requires a lot of manpower and material resources. As a result, the labeled data covers few entity types, the overall number of labels is small, and some errors occur. An automatic labeling method is therefore needed to make the labeling process simpler and more intelligent and the labeling results more accurate.



Examples


Embodiment Construction

[0016] The present invention will be further described below in combination with specific embodiments and accompanying drawings.

[0017] As shown in Figure 1, a method for automatically labeling entities in medical texts comprises a processing method for the automatic labeling of medical entities and a combined method for handling plural entities, nested entities, and the like. The corpus labeled by the method is a medical corpus, so the labeling results are oriented toward medical entities such as diseases, drugs, adverse reactions, and genes. The combined method addresses plural entities, nested entities, and entities that go unrecognized because of irregular punctuation, and thereby improves labeling accuracy.

[0018] The method consists of text collection, text preprocessing, dictionary data collection, dictionary construction, text annotation, and annotation post-processing.
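The preprocessing and text annotation steps listed above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names (`preprocess`, `build_matcher`, `annotate`), the single term-to-type dictionary standing in for the abbreviation, phrase and other entity dictionaries, and the longest-match-first rule are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple

# (start offset, end offset, surface text, entity type)
Span = Tuple[int, int, str, str]


def preprocess(text: str) -> str:
    """Collapse runs of whitespace so that offsets refer to a clean string."""
    return re.sub(r"\s+", " ", text).strip()


def build_matcher(dictionary: Dict[str, str]) -> "re.Pattern[str]":
    """Compile one regex from all dictionary terms (hypothetical rule).

    Longer terms are listed first so that "breast cancer" is tried
    before "cancer" at the same position.
    """
    terms = sorted(dictionary, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def annotate(text: str, dictionary: Dict[str, str]) -> List[Span]:
    """Label every dictionary term found in the text with its entity type."""
    if not dictionary:
        return []
    lookup = {term.lower(): etype for term, etype in dictionary.items()}
    matcher = build_matcher(dictionary)
    return [
        (m.start(), m.end(), m.group(0), lookup[m.group(0).lower()])
        for m in matcher.finditer(text)
    ]


if __name__ == "__main__":
    # Illustrative dictionary; a real one would be built from collected data.
    demo_dictionary = {
        "aspirin": "drug",
        "nausea": "adverse reaction",
        "breast cancer": "disease",
        "cancer": "disease",
    }
    demo_text = preprocess("Aspirin  may cause\nnausea in breast cancer patients.")
    for span in annotate(demo_text, demo_dictionary):
        print(span)
```

Trying longer terms first is one simple way to prefer the outer entity when one dictionary term contains another; the patent handles nested entities in a separate post-processing step.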

[0019] ...



Abstract

The invention discloses a method for automatically labeling entities in medical text. The method aims to label the entities in a text correctly and comprises the following steps: collecting the text; preprocessing the text; collecting entity dictionary data; preprocessing the dictionary data to generate an abbreviation dictionary, a phrase dictionary and other entity dictionaries; labeling the text data by applying a labeling rule to obtain labeled entities; and post-processing the labels, in which the labeling result is processed with fuzzy recognition, entities that were missed are added, and nested entities are removed to obtain the final labeling result.
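The post-processing step in the abstract can be sketched in Python as follows. The abstract does not say how fuzzy recognition or nested-entity removal is carried out, so everything here is an assumption: `fuzzy_lookup` uses the standard library's `difflib.get_close_matches` with an arbitrary cutoff of 0.85 to recover near-misses such as plural forms, and `remove_nested` drops any span that lies entirely inside a span already kept.

```python
import difflib
from typing import Dict, List, Optional, Tuple

# (start offset, end offset, surface text, entity type)
Span = Tuple[int, int, str, str]


def fuzzy_lookup(token: str, dictionary: Dict[str, str],
                 cutoff: float = 0.85) -> Optional[str]:
    """Return the closest dictionary term for a token, or None.

    Assumes dictionary keys are lowercase. The cutoff is an
    illustrative value, not one taken from the patent.
    """
    candidates = difflib.get_close_matches(
        token.lower(), list(dictionary), n=1, cutoff=cutoff
    )
    return candidates[0] if candidates else None


def remove_nested(spans: List[Span]) -> List[Span]:
    """Drop spans fully contained in another span; keep the outer one.

    Partially overlapping spans are left untouched.
    """
    # Sort by start offset, then longest first, so that any containing
    # span is visited before the spans it contains.
    ordered = sorted(spans, key=lambda s: (s[0], -(s[1] - s[0])))
    kept: List[Span] = []
    for span in ordered:
        contained = any(k[0] <= span[0] and span[1] <= k[1] for k in kept)
        if not contained:
            kept.append(span)
    return kept
```

A token missed by exact matching, such as a plural form, can be passed to `fuzzy_lookup` and added to the label set if a term comes back; `remove_nested` is then applied once to the combined list.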

Description

Technical field

[0001] The invention belongs to the technical field of medical entity labeling, and in particular relates to a method for automatically labeling entities in medical texts.

Background art

[0002] In medical named entity recognition, the usual approach is to train a model and then use that model to label entities. Most of the training data is labeled by hand, which requires a lot of manpower and material resources. As a result, the labeled data covers few entity types, the overall number of annotations is small, and some errors occur. An automatic labeling method is therefore needed to make the labeling process simpler and more intelligent and the labeling results more accurate.

Contents of the invention

[0003] In view of this, the technical problem to be solved by the present invention is to provide a method for annotating entities in a medical corpus, which can automatically annotate entities and...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G16H50/70
CPC: G16H50/70, G06F40/205, G06F40/279, G06F40/295, G06F40/289
Inventor: 管仁初, 刘洪涛, 张浩, 贺宝润, 周丰丰
Owner: 北京百奥知信息科技有限公司