Chinese electronic medical record entity labeling method based on BIC

An entity labeling technology for electronic medical records, applied in neural learning methods, electrical digital data processing, medical data mining, and related fields. It addresses problems such as manual annotation consuming a lot of manpower and material resources, and the limited availability of annotated corpora.

Pending Publication Date: 2020-10-13
SHANGHAI UNIV

AI Technical Summary

Problems solved by technology

In the field of biomedicine, publicly available annotated corpora are very limited, and manual annotation requires a lot of manpower and material resources. Therefore,


Examples


Embodiment 1

[0048] In this embodiment, referring to Figure 1, a BIC-based Chinese electronic medical record entity labeling method comprises the following steps:

[0049] 1) First, define the medical entity labeling specification according to actual needs, manually label a small amount of data, and convert the manually labeled data into the data format required by the model to form the training data;

[0050] 2) Train the model parameters to generate a sequence labeling model comprising a BiLSTM, an Iterative Dilated Convolutional Neural Network (IDCNN), and a Conditional Random Field (CRF), where the BiLSTM and IDCNN serve as the encoding end of the model and the CRF serves as the decoding end;

[0051] 3) Input the data to be labeled into the sequence labeling model and take its output as the machine-labeled data;

[0052] 4) Manually review the machine-labeled data and correct some of the labeling errors, then apply data processing operations to obtain the training da...
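
The four steps above can be read as a loop that alternates machine labeling with manual review. The following is a minimal sketch of that loop under stated assumptions, not the patented implementation: `annotation_loop`, `toy_train`, `toy_review`, and `chunk_size` are hypothetical names, and the toy trainer merely memorizes one tag per character in place of the BiLSTM-IDCNN-CRF model.

```python
from typing import Callable, Dict, List, Tuple

Sentence = List[str]                   # one sentence as a list of characters
Labeled = Tuple[Sentence, List[str]]   # (characters, one tag per character)
Tagger = Callable[[Sentence], List[str]]


def annotation_loop(
    seed: List[Labeled],
    unlabeled: List[Sentence],
    train: Callable[[List[Labeled]], Tagger],
    review: Callable[[Sentence, List[str]], List[str]],
    chunk_size: int,
) -> List[Labeled]:
    """Grow the training set by alternating machine labeling and manual review."""
    training_data = list(seed)                        # step 1: small hand-labeled set
    for start in range(0, len(unlabeled), chunk_size):
        model = train(training_data)                  # step 2: (re)train the tagger
        for sentence in unlabeled[start:start + chunk_size]:
            predicted = model(sentence)               # step 3: machine labeling
            corrected = review(sentence, predicted)   # step 4: manual correction
            training_data.append((sentence, corrected))
    return training_data


def toy_train(data: List[Labeled]) -> Tagger:
    # Hypothetical stand-in for model training: remember the last tag seen
    # for each character and tag unknown characters as "O".
    seen: Dict[str, str] = {}
    for chars, tags in data:
        for ch, tag in zip(chars, tags):
            seen[ch] = tag
    return lambda sentence: [seen.get(ch, "O") for ch in sentence]


def toy_review(sentence: Sentence, predicted: List[str]) -> List[str]:
    # Hypothetical reviewer that accepts every machine label unchanged;
    # a real reviewer would correct the wrong tags here.
    return list(predicted)


seed_data = [(["a", "b"], ["B-X", "O"])]
grown = annotation_loop(seed_data, [["a", "c"], ["b", "a"]], toy_train, toy_review, 1)
```

Retraining once per chunk rather than once per sentence is a design choice of this sketch; the source only states that the model is trained again after review.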

Embodiment 2

[0055] This embodiment is basically the same as Embodiment 1, with the following distinguishing features:

[0056] In this embodiment, referring to Figure 1, the sequence labeling model in step 2) is generated as follows:

[0057] a. The input of the model is Chinese text. The text is divided into training batches according to sentence length, with 20 sentences per batch. Each batch of training text is converted into a tensor by the embedding layer, and the sentences within a batch are padded to the same length;

[0058] b. The input tensor produced by the embedding layer is processed by the encoding end, which combines a BiLSTM and an IDCNN. The number of neurons in the hidden layer of the BiLSTM is set, and the BiLSTM layer outputs a corresponding tensor;

[0059] c. The output of the BiLSTM layer is fed into the IDCNN layer to extract local detail features of the text;...
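
Step a can be illustrated with a short sketch of length-based batching and padding. It assumes that "divided according to length" means sorting sentences by length before slicing them into batches, and it produces padded integer id matrices that an embedding layer would then look up. `PAD_ID`, `UNK_ID`, `build_vocab`, and `make_batches` are hypothetical names; only the batch size of 20 comes from the text.

```python
from typing import Dict, List

PAD_ID = 0        # assumed id reserved for padding positions
UNK_ID = 1        # assumed id for characters missing from the vocabulary
BATCH_SIZE = 20   # from the source: 20 sentences per training batch


def build_vocab(sentences: List[str]) -> Dict[str, int]:
    """Assign an integer id to every character, after the two reserved ids."""
    vocab = {"<pad>": PAD_ID, "<unk>": UNK_ID}
    for sentence in sentences:
        for ch in sentence:
            vocab.setdefault(ch, len(vocab))
    return vocab


def make_batches(
    sentences: List[str],
    vocab: Dict[str, int],
    batch_size: int = BATCH_SIZE,
) -> List[List[List[int]]]:
    """Group sentences of similar length and pad each batch to a common width."""
    ordered = sorted(sentences, key=len)       # similar lengths end up together
    batches = []
    for start in range(0, len(ordered), batch_size):
        group = ordered[start:start + batch_size]
        width = len(group[-1])                 # longest sentence in this batch
        batch = []
        for sentence in group:
            ids = [vocab.get(ch, UNK_ID) for ch in sentence]
            batch.append(ids + [PAD_ID] * (width - len(ids)))
        batches.append(batch)
    return batches


demo_sentences = ["ab", "a", "abc"]
demo_vocab = build_vocab(demo_sentences)
demo_batches = make_batches(demo_sentences, demo_vocab, batch_size=2)
```

Sorting before batching keeps the amount of padding per batch small, since each batch is only padded to the length of its own longest sentence.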

Embodiment 3

[0063] This embodiment is basically the same as Embodiment 1, with the following distinguishing features:

[0064] In this embodiment, a BIC-based Chinese electronic medical record entity labeling method proceeds as shown in Figure 1. First, the medical entity labeling specification is defined according to actual needs; a small amount of data is then manually labeled and converted into the data format required by the model to form the training data. The model parameters are then trained to generate a sequence labeling model. The data to be labeled is input into the sequence labeling model, whose output is the machine-labeled data. Some of the labeling errors are then corrected through manual review, and data processing operations are applied to obtain the training data required by the model, on which the model is trained again. Since the deep learning model used becomes better and better with the increase of ...
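
The text does not specify the "data format required by the model". As an illustration only, the sketch below assumes the common character-level BIO tagging scheme, in which the first character of an entity is tagged `B-<type>`, the following characters `I-<type>`, and everything else `O`. The function name `spans_to_bio` and the entity type `SYMPTOM` are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, entity type)


def spans_to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Convert manually marked entity spans into one (character, tag) pair per character."""
    tags = ["O"] * len(text)               # characters outside any entity
    for start, end, label in spans:
        tags[start] = "B-" + label         # first character of the entity
        for i in range(start + 1, end):
            tags[i] = "I-" + label         # remaining characters of the entity
    return list(zip(text, tags))


# Hypothetical annotated sentence: characters 2..3 form a SYMPTOM entity.
bio_rows = spans_to_bio("患者头痛", [(2, 4, "SYMPTOM")])
```

The same conversion could be applied to the manually corrected machine output before retraining, so that both rounds of training data share one format.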


Abstract

The invention discloses a new BIC-based Chinese electronic medical record entity labeling method. It belongs to the technical field of natural language processing and addresses the problems of entity recognition and labeling in Chinese electronic medical records. The method comprises the following steps: defining a medical entity labeling specification according to actual needs, manually labeling a small amount of data, and converting the manually labeled data into the data format required by the model to form training data; training the model parameters to generate a sequence labeling model, wherein the model comprises a bidirectional long short-term memory network, an iterative dilated convolutional neural network, and a conditional random field, the first two serving as the encoding end of the model and the conditional random field as the decoding end; inputting the data to be labeled into the sequence labeling model and outputting the result to obtain machine-labeled data; and then correcting some of the labeling errors through manual review, obtaining the training data required by the model through data processing operations, and training the model again. With this method, Chinese electronic medical record data can be labeled automatically and with high accuracy.

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a BIC-based Chinese electronic medical record entity labeling method.

Background

[0002] In the field of biomedicine, a large amount of data, such as electronic medical records, is generated every day. Electronic medical records are the digital information, including text, symbols, charts, graphics, data, and images, that medical staff generate through the information systems of medical institutions in the course of medical activities; they allow medical records to be stored, managed, transmitted, and reproduced. ... entities, and there are many entity types. At present, most research on information extraction from electronic medical records targets English electronic medical records. Research on Chinese electronic medical records started late, has not yet formed a clear and systematic research task, and lacks a public annotati...


Application Information

IPC(8): G06F40/284, G06F40/295, G16H50/70, G06N3/04, G06N3/08
CPC: G06F40/284, G06F40/295, G16H50/70, G06N3/049, G06N3/08, G06N3/045
Inventor: 滕国伟, 王逸凡
Owner: SHANGHAI UNIV