Medical text named entity identification method based on pre-training model and fine tuning technology

A named entity recognition technology applied to medical text. It addresses the problem that named entity recognition performance depends on the quantity and quality of labeled data, and it aims to improve model performance, training speed, and accuracy.

Pending Publication Date: 2019-10-18
WUYI UNIV

AI Technical Summary

Problems solved by technology

Compared with the previous two methods, this method does not require strong linguistic knowledge, has high flexibil



Examples


Embodiment Construction

[0031] Specific embodiments of the present invention are further described below with reference to the accompanying drawings:

[0032] As shown in Figure 1, the present invention provides a medical text named entity recognition method based on a pre-training model and fine-tuning technology. The method first pre-trains a BERT model on large-scale unstructured medical texts, such as electronic medical records, to obtain a pre-trained model that captures the semantic representation information in the text. The resulting pre-trained model is then fine-tuned using a stacked dilated convolutional neural network to obtain a deep neural network model capable of automatically recognizing named entities in the medical field.
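The role of the stacked dilated convolutions in the fine-tuning stage can be illustrated with a minimal sketch. This is not the patent's implementation: the function names (`dilated_conv1d`, `stacked_dilated_cnn`, `receptive_field`), the kernel width of 3, the dilation schedule 1, 2, 4, and the use of one shared scalar weight per tap are all assumptions made for illustration, and the input vectors merely stand in for the per-token outputs of a pre-trained BERT encoder.

```python
from typing import List

Vector = List[float]


def dilated_conv1d(seq: List[Vector], taps: List[float], dilation: int) -> List[Vector]:
    """Dilated 1-D convolution over a token sequence, zero-padded, followed by ReLU.

    Simplification (assumption): each tap is one scalar weight shared across all
    embedding dimensions; a trained layer would learn a weight matrix per tap.
    """
    n, dim = len(seq), len(seq[0])
    half = len(taps) // 2
    out: List[Vector] = []
    for t in range(n):
        acc = [0.0] * dim
        for k, w in enumerate(taps):
            src = t + (k - half) * dilation  # neighbour skipped by `dilation`
            if 0 <= src < n:                 # positions outside the text act as zeros
                for d in range(dim):
                    acc[d] += w * seq[src][d]
        out.append([max(0.0, v) for v in acc])
    return out


def stacked_dilated_cnn(seq: List[Vector], taps: List[float], dilations: List[int]) -> List[Vector]:
    """Apply one dilated layer per entry in `dilations`; sequence length is preserved."""
    for d in dilations:
        seq = dilated_conv1d(seq, taps, d)
    return seq


def receptive_field(kernel_width: int, dilations: List[int]) -> int:
    """Number of input tokens that can influence one output position."""
    return 1 + sum((kernel_width - 1) * d for d in dilations)


# Hypothetical 17-token sentence with 1-dimensional "embeddings"; only the
# middle token is non-zero so its influence on the output can be traced.
tokens = [[0.0] for _ in range(17)]
tokens[8] = [1.0]
features = stacked_dilated_cnn(tokens, taps=[1.0, 1.0, 1.0], dilations=[1, 2, 4])
print(receptive_field(3, [1, 2, 4]))
print([i for i, v in enumerate(features) if v[0] > 0])
```

Because every output position is computed independently of the others within a layer, the positions can be processed in parallel, unlike a recurrent tagger that must step through the tokens in order. The output keeps one feature vector per token, so a per-token classifier could be placed on top to assign entity labels.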

[0033] Specifically: S1) Medical texts such as electronic medical records are preprocessed using text data mining techniques. This process includes the following steps:

[0034] S10...



Abstract

The invention provides a medical text named entity recognition method based on a pre-training model and fine-tuning technology. The method first pre-trains a BERT model on large-scale unstructured medical texts, such as electronic medical records, to obtain a pre-trained model that contains the semantic representation information in the text. The resulting pre-trained model is then fine-tuned using a stacked dilated convolutional neural network to obtain a deep neural network model capable of automatically recognizing named entities in the medical field. The pre-trained model provided by the invention captures semantic information in the text more accurately and transfers it to a specific task more effectively, which improves the model's named entity recognition accuracy. Combining the stacked dilated convolutional neural network with the pre-trained model for fine-tuning allows the named entities in medical text to be recognized while capturing the semantic information in the text well, and it permits parallel computation, which improves training speed.

Description

Technical field

[0001] The invention relates to the technical field of data mining, in particular to a medical text named entity recognition method based on a pre-training model and fine-tuning technology.

Background

[0002] Clinical medicine is a science that studies the etiology, diagnosis, treatment and prognosis of diseases, improves the level of clinical treatment, and promotes human health. Medical text data in clinical medicine, such as electronic medical records, is of great value for research in the medical field. With the popularity of the Internet, more and more online clinical medical communities and clinical medical consultation websites have emerged, and these websites also generate rich medical text data. These medical texts contain a large number of real individual cases and hold considerable clinical value. However, most of these clinical medical texts are unstructured. In order to fully tap the value...


Application Information

IPC(8): G06F17/27, G06F16/35, G06N3/04, G16H10/00
CPC: G06F16/355, G16H10/00, G06F40/295, G06F40/30, G06N3/045
Inventors: 陈涛, 杨开漠
Owner WUYI UNIV