Automatic labeling method for medical text data

An automatic labeling technology for text data, applied in the medical field. It addresses labeling noise that expands during labeling and the model's inability to learn data patterns absent from the seed data, both of which make labeling results inaccurate, and thereby improves labeling accuracy.

Active Publication Date: 2020-07-28
BEIJING UNISOUND INFORMATION TECH

AI Technical Summary

Problems solved by technology

[0003] For the labeling noise in the seed data, the noise will continue to expand when using the above method for labeling, which will cause inaccurate data la...

Embodiment Construction

[0055] The preferred embodiments of the present invention will be described below in conjunction with the accompanying drawings. It should be understood that the preferred embodiments described here are only used to illustrate and explain the present invention, and are not intended to limit the present invention.

[0056] Figure 1 is a flowchart of a method for automatically labeling medical text data in an embodiment of the present invention. As shown in Figure 1, the method can be implemented as the following steps S101-S110:

[0057] In step S101, the original medical text data is preprocessed to obtain preprocessed medical text data, which includes test data, source data and unlabeled data. Here, preprocessing means manually labeling the original medical text data to obtain the labeled test data T and the labeled source data S, and the rest of the original medical text data except for these two parts o...
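The three-way partition named in step S101 can be sketched as follows. This is only an illustration: the function name `split_records`, the random selection of records, and the size parameters are assumptions, since the visible text does not say how the records for T and S are chosen. The manual labeling itself happens outside the code.

```python
import random


def split_records(records, n_test, n_source, seed=0):
    """Partition records into test (T), source (S) and unlabeled parts.

    Assumption: records for T and S are picked at random; the patent text
    shown here does not state the selection rule. T and S would then be
    labeled manually, and the remainder stays unlabeled.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    test = shuffled[:n_test]                      # to be labeled by hand -> T
    source = shuffled[n_test:n_test + n_source]   # to be labeled by hand -> S
    unlabeled = shuffled[n_test + n_source:]      # left without labels
    return test, source, unlabeled
```

Keeping the split in one function with a fixed seed makes the partition reproducible, which matters because T is later used to score candidate labelings.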

Abstract

The invention discloses an automatic labeling method for medical text data. The method comprises the following steps: obtaining preprocessed medical text data from original medical text data; performing an initialization operation on preset parameters to obtain an initialization result; obtaining a primary population according to the initialization result and the preprocessed medical text data; performing fitness calculation on the primary population to obtain the fitness of each of a first preset number of primary individuals; determining a second preset number of pairs of parents from the primary population; obtaining a second preset number of crossover individuals from the pairs of parents; obtaining a second preset number of mutated (variant) individuals from the crossover individuals; obtaining a second preset number of candidate individuals from the mutated individuals; determining a new population according to the candidate individuals; and obtaining a final labeling result according to the new population. With this technical scheme, the labeling result obtained has high accuracy.
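The sequence of steps in the abstract follows the shape of a genetic algorithm. A minimal sketch of that loop is given below under several assumptions that are not taken from the patent: an individual is modeled as a list holding one label per unlabeled record, parents are chosen by two-way tournament, crossover is single-point, and the new population keeps the fittest of the old individuals plus the candidates. The names `evolve_labels`, `pop_size`, `n_pairs` and `mutation_rate` are hypothetical, and `fitness` is a caller-supplied function.

```python
import random


def evolve_labels(n_items, labels, fitness, pop_size=20, n_pairs=10,
                  mutation_rate=0.05, generations=50, seed=0):
    """Toy genetic search over label assignments (requires n_items >= 2).

    Returns the best individual found and the best fitness per generation.
    """
    rng = random.Random(seed)
    # Primary population: pop_size random label assignments (assumption).
    population = [[rng.choice(labels) for _ in range(n_items)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Fitness of every individual in the current population.
        scores = [fitness(ind) for ind in population]
        candidates = []
        for _ in range(n_pairs):
            # Parent selection: two-way tournament for each parent.
            p1 = max(rng.sample(range(pop_size), 2), key=lambda i: scores[i])
            p2 = max(rng.sample(range(pop_size), 2), key=lambda i: scores[i])
            # Crossover: single cut point, head of p1 plus tail of p2.
            cut = rng.randrange(1, n_items)
            child = population[p1][:cut] + population[p2][cut:]
            # Mutation: resample each label with a small probability.
            child = [rng.choice(labels) if rng.random() < mutation_rate else g
                     for g in child]
            candidates.append(child)
        # New population: the fittest pop_size of old individuals + candidates.
        merged = population + candidates
        merged.sort(key=fitness, reverse=True)
        population = merged[:pop_size]
        history.append(fitness(population[0]))
    return population[0], history
```

In a setting like the one described, `fitness` would plausibly score a labeling by training a model on the source data plus the proposed labels and evaluating it on the test data T; that reading is an assumption, and any function mapping an individual to a number works with this sketch. Because the best old individual always survives selection, the recorded best fitness never decreases between generations.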

Description

Technical field

[0001] The invention relates to the field of medical technology, in particular to an automatic labeling method for medical text data.

Background technique

[0002] Seed data is used to train a model. The model then predicts labels for the unlabeled data, so that labels are obtained for all data.

[0003] If the seed data contains labeling noise, that noise keeps expanding when the above method is used for labeling, which makes the labeling results inaccurate. In addition, the model cannot learn data patterns that do not appear in the seed data, which also makes the labeling results inaccurate.

Contents of the invention

[0004] The present invention provides a method for automatically labeling medical text data, including:

[0005] Preprocessing the original medical text data to obtain preprocessed medical text data, wherein the preprocessed medical text data includes: test data, source data and unlabeled ...

Application Information

IPC(8): G16H50/70, G06N3/12
CPC: G16H50/70, G06N3/126, Y02A90/10
Inventor 王晔晗
Owner BEIJING UNISOUND INFORMATION TECH