Simulation text medical record generation method and system

A technology for text medical records, applied in the field of machine learning. It addresses problems such as the imbalance of positive and negative samples, an unclear scope of application, and degraded performance of machine learning algorithms, and it produces high-quality, diverse simulated records without exposing patient privacy.

Active Publication Date: 2018-12-14
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

However, the acquisition and use of electronic medical record data face two obstacles. On the one hand, because of issues such as patient privacy, access may be restricted by the patient's personal wishes and by laws and regulations, which limits the use of big-data-based algorithms such as machine learning. On the other hand, because medical record data vary widely, certain types of diseases may have an imbalance of positive and negative samples (disease samples and non-disease samples), which degrades the performance of machine learning algorithms.
Given these problems, generating simulated medical record data that reproduces the distribution of real medical record samples as closely as possible is an effective solution. However, few existing technologies attempt this.
The few existing technologies related to medical record generation and text generation also have the following problems:
1. They only assist in producing formatted medical records that meet standard format requirements, reducing doctors' handwriting and typesetting work, and do not involve automatic generation of simulated medical records.
2. They can merge existing texts to generate new texts, but they do not involve machine learning algorithms, and the diversity of the generated texts is very limited.
3. The existing artificial-intelligence-based text generation methods have a limited scope of action (they can only expand text, not generate full text), an unclear scope of application, and are not closely integrated with the medical field.


Examples

Embodiment Construction

[0069] Embodiments of the method and system for generating a simulated text medical record according to the present invention will be described below with reference to the accompanying drawings. Those skilled in the art would recognize that the described embodiments can be modified in various ways or combinations thereof without departing from the spirit and scope of the invention. Accordingly, the drawings and description are illustrative in nature and not intended to limit the scope of the claims. Also, in this specification, the drawings are not drawn to scale, and like reference numerals denote like parts.

[0070] The method for generating simulated text medical records proposed by the present invention is mainly based on a Generative Adversarial Network (GAN). A typical generative adversarial network solves a two-player minimax game of the following form:

[0071]

[0072] where p_data(x) is the real data distribution, ...
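
The formula for paragraph [0071] is not reproduced in this extract. For orientation, the standard GAN objective from the literature is given here; it uses the same real data distribution p_data(x) mentioned in [0072]. The noise prior p_z(z) and the symbols D (discriminator) and G (generator) follow the usual formulation and are assumptions about what the patent's own equation shows.

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{z \sim p_{z}(z)} \bigl[ \log \bigl( 1 - D(G(z)) \bigr) \bigr]
```

The discriminator D maximizes V by scoring real samples high and generated samples low, while the generator G minimizes V by producing samples that D scores as real.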

Abstract

The invention provides a method and system for generating simulated text medical records. Original medical records serve as positive samples. In each cycle, the generator takes the word vector it output in the previous cycle together with a disease label vector as inputs and outputs a new word vector; repeating this many times produces a sentence composed of multiple word vectors. Each time a word vector is generated, the word vector sequence generated so far is taken as the initial state, generator sampling is run repeatedly to produce multiple complete sentences, and the discriminator's reward values for all of these sentences are averaged to give the reward value of the current word vector. The generator is updated according to the reward values obtained for the sentences and word vectors, and the process is repeated until convergence. The converged generator then generates negative samples, which are combined with the positive samples to form a mixed medical record data set. The discriminator takes the disease label vector and the word vector sequence as inputs and outputs the probability that each medical record comes from the real medical records; the discriminator is updated, and the process is repeated until convergence. Because the simulated text medical records do not involve patient privacy, they can assist other machine learning tasks and thereby facilitate research on diseases.
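
The reward step described in the abstract (complete the partial sentence several times, score each completion with the discriminator, and average) can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: discrete tokens stand in for word vectors, and `VOCAB`, `MAX_LEN`, `generator_step`, and `discriminator` are hypothetical toy stand-ins for trained networks. The generator update itself is not shown.

```python
import random

# Hypothetical toy vocabulary and sentence length (not from the patent).
VOCAB = ["fever", "cough", "normal", "pain", "stable"]
SYMPTOMS = {"fever", "cough", "pain"}
MAX_LEN = 6

def generator_step(prefix, label, rng):
    """Toy generator: sample the next token given the prefix and disease label.
    Assumption: label 1 biases sampling toward symptom words."""
    weights = [3, 3, 1, 3, 1] if label == 1 else [1, 1, 3, 1, 3]
    return rng.choices(VOCAB, weights=weights, k=1)[0]

def discriminator(sentence, label):
    """Toy discriminator: score in [0, 1] for how 'real' a full sentence looks."""
    frac = sum(tok in SYMPTOMS for tok in sentence) / len(sentence)
    return frac if label == 1 else 1.0 - frac

def rollout(prefix, label, rng):
    """Complete a partial sentence to MAX_LEN by sampling from the generator."""
    sentence = list(prefix)
    while len(sentence) < MAX_LEN:
        sentence.append(generator_step(sentence, label, rng))
    return sentence

def token_reward(prefix, label, n_rollouts, rng):
    """Average the discriminator score over several completions of the prefix."""
    scores = [discriminator(rollout(prefix, label, rng), label)
              for _ in range(n_rollouts)]
    return sum(scores) / len(scores)

def generate_with_rewards(label, n_rollouts=8, seed=0):
    """Generate one sentence token by token, recording a reward per token."""
    rng = random.Random(seed)
    sentence, rewards = [], []
    for _ in range(MAX_LEN):
        sentence.append(generator_step(sentence, label, rng))
        rewards.append(token_reward(sentence, label, n_rollouts, rng))
    return sentence, rewards

if __name__ == "__main__":
    sentence, rewards = generate_with_rewards(label=1)
    for tok, r in zip(sentence, rewards):
        print(f"{tok:8s} reward={r:.3f}")
```

In a full training loop, these per-token rewards would weight the generator's update; averaging over several completions reduces the variance of the reward that a single sampled completion would give.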

Description

Technical field

[0001] The invention relates to the technical field of machine learning, in particular to a method and a system for generating simulated text medical records.

Background

[0002] With the development of the times and the continuous improvement of informatization, electronic medical records are used more and more widely. At the same time, with the rapid development of machine learning and deep learning in recent years, people have begun to use machine learning to solve problems in the medical field and have achieved some results. However, the acquisition and use of electronic medical record data face two obstacles. On the one hand, because of issues such as patient privacy, access may be restricted by the patient's personal wishes and by laws and regulations, which limits the use of big-data-based algorithms such as machine learning. On the other hand, because medical record data vary widely, there may be an imbalance of...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16H50/70, G16H50/20, G16H10/60
CPC: G16H10/60, G16H50/20, G16H50/70
Inventor: 张学工, 关嘉麒, 闾海荣
Owner: TSINGHUA UNIV