Named entity recognition method for Chinese medical text

A named entity recognition technology applied in the field of medical text labeling. It addresses the difficulty of dividing and determining entity boundaries, which increases the difficulty of the labeling task, and it handles large differences in entity length with good results.

Active Publication Date: 2020-04-28
HARBIN ENG UNIV
Cites: 13; Cited by: 12

AI Technical Summary

Problems solved by technology

[0006] (2) The context of an entity to be recognized and extracted often contains many different modifiers and qualifiers, which makes it difficult to divide and determine the boundary of the entity.
[0008] (4) The length of different entities can differ greatly. Some disease names and drug names are very long, and some entities contain more than 10 characters, while others contain only 2-3 characters, which increases the difficulty of the labeling task.

Examples

Embodiment 1

[0060] A method for named entity recognition of Chinese medical texts based on multi-granularity feature fusion, whose features can be summarized as:

[0061] 1) Definition of medical entity category and construction of entity annotation dictionary;

[0062] With the aid of the annotation dictionary, text preprocessing such as word segmentation and entity category annotation is performed on the original medical text (text to be recognized);

[0063] Medical terms are obtained by crawling authoritative online medical service websites. These terms are labeled with the 12 manually predefined categories from step 1) (disease entity, symptom entity, inspection entity, drug entity, operation entity, organ entity, part entity, sign entity, past information entity, condition word entity, frequency word entity, and degree word entity) to construct a medical term word segmentation and labeling dictionary.

[0064] 2) On the marked corpus, use different algor...
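The dictionary-assisted annotation in step 1) can be illustrated with a minimal sketch. The source does not state the matching strategy or the tag scheme, so forward maximum matching and BIO tags are assumptions here, and the dictionary entries, category names, and the `annotate` function are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical miniature annotation dictionary: term -> entity category.
ANNOTATION_DICT: Dict[str, str] = {
    "糖尿病": "DISEASE",
    "头痛": "SYMPTOM",
    "阿司匹林": "DRUG",
    "肝": "ORGAN",
}


def annotate(text: str, lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Label each character with a BIO tag using forward maximum matching.

    Both the matching strategy and the BIO scheme are assumptions made
    for illustration; the source does not specify them.
    """
    max_len = max(map(len, lexicon), default=0)
    tags: List[Tuple[str, str]] = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first so long entities win over short ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                match = candidate
                break
        if match is None:
            # No dictionary term starts here: the character is outside any entity.
            tags.append((text[i], "O"))
            i += 1
        else:
            label = lexicon[match]
            tags.append((match[0], f"B-{label}"))
            tags.extend((ch, f"I-{label}") for ch in match[1:])
            i += len(match)
    return tags
```

Calling `annotate("患者头痛服用阿司匹林", ANNOTATION_DICT)` tags the two-character symptom and the four-character drug name in the same pass, which is the kind of length variation the method is meant to cope with.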

Abstract

The invention belongs to the technical field of medical text labeling, and particularly relates to a method for recognizing named entities in Chinese medical texts. The method defines a set of entity categories and constructs a medical term annotation dictionary according to those categories, so that the entities in the original medical text are annotated automatically. On this basis, a multi-granularity feature fusion model is provided. The radicals of Chinese characters are used as entity recognition and classification features in medical entity recognition tasks for the first time. Features at three different granularities in the medical text (words, characters, and the radicals of the characters) are extracted, represented, and fused, and a model is trained using an ID-CNN-CRF algorithm to recognize medical entities in various medical texts. The method can be applied to various medical texts such as electronic medical records and medical periodicals. It also handles the large length differences between entities in the medical field well, and it is effective at recognizing unregistered (out-of-vocabulary) entities.
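The fusion of word, character, and radical features described above can be sketched as follows. The source does not say how the three feature types are combined, so per-character concatenation of embedding vectors is an assumption, as are the embedding sizes, the random initialisation, the small radical lookup, and the helper names `make_table` and `fuse_features`. In a trained model the vectors would be learned and then passed to the ID-CNN-CRF layers, which are not shown.

```python
import random
from typing import Dict, Iterable, List, Sequence, Tuple

# Hypothetical radical lookup for a few characters (illustrative only).
RADICAL_OF: Dict[str, str] = {"肝": "月", "炎": "火", "胃": "月", "痛": "疒"}


def make_table(keys: Iterable[str], dim: int, seed: int) -> Dict[str, List[float]]:
    """Build a toy embedding table with reproducible random vectors."""
    rng = random.Random(seed)
    return {key: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for key in keys}


def fuse_features(
    words: Sequence[str],
    char_table: Dict[str, List[float]],
    word_table: Dict[str, List[float]],
    radical_table: Dict[str, List[float]],
    dims: Tuple[int, int, int],
) -> List[List[float]]:
    """Return one fused vector per character: [char | word | radical].

    `words` is an already segmented sentence. Unknown keys fall back to
    zero vectors. Concatenation is an assumed fusion strategy.
    """
    char_dim, word_dim, rad_dim = dims
    fused: List[List[float]] = []
    for word in words:
        # Every character inherits the vector of the word that contains it.
        word_vec = word_table.get(word, [0.0] * word_dim)
        for ch in word:
            char_vec = char_table.get(ch, [0.0] * char_dim)
            rad_vec = radical_table.get(RADICAL_OF.get(ch, ""), [0.0] * rad_dim)
            fused.append(char_vec + word_vec + rad_vec)
    return fused
```

With the segmented input `["肝炎", "胃痛"]`, the characters 肝 and 胃 receive the same radical slice even though their character vectors differ. That sharing is one plausible reason radical features could help with entities absent from the dictionary.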

Description

Technical field

[0001] The invention belongs to the technical field of medical text labeling, and in particular relates to a method for named entity recognition of Chinese medical texts.

Background

[0002] The annotation of medical text is a widely studied problem in the application of natural language processing to the medical field. Medical texts mainly include medical journals and the electronic medical records produced during medical treatment. Medical texts are considered the core data of medical information systems, so it is important to use computer programs to mine knowledge from them automatically. This mainly involves applying natural language processing (NLP), information extraction (including entities and relationships), and related technologies to medical texts for analysis and mining.

[0003] Medical named entity recognition is one of the important tasks in the application of natural l...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/117, G06F40/289, G06F40/295, G06F40/279, G06N3/04
CPC: G06N3/045
Inventor: 黄少滨, 张柏嘉, 申林山, 李熔盛, 李轶, 余日昌, 颜伟, 邹长明
Owner: HARBIN ENG UNIV