Large-scale unbalanced diabetes electronic medical record parallel classification neighborhood evidence Spark method

A method for classifying diabetes electronic medical records, applied in the field of intelligent medical information processing. It addresses the large data volume, the excess of attributes in experimental test data, and the class imbalance that complicate parallel classification of diabetes electronic medical records, and it improves classification efficiency and accuracy.

Active Publication Date: 2021-06-22

NANTONG UNIVERSITY



Problems solved by technology

[0005] The purpose of the present invention is to provide a neighborhood evidence Spark method for parallel classification of large-scale unbalanced diabetes electronic medical records. At present, the effective way to judge the state of diabetic lesions is through pathological characteristic experiments on the etiology and pathogenesis of diabetes. These experiments produce test data with too many attributes and a large data volume, which increases the workload of doctors in judging the pathological condition of diabetic patients. The invention addresses this problem.


Examples


Embodiment 1

[0062] Referring to Figures 1 to 3, the present invention provides the following technical scheme: a neighborhood evidence Spark method for parallel classification of large-scale imbalanced diabetes electronic medical records, comprising the following steps:

[0063] Step 1. On the master node (Master), read the large-scale unbalanced diabetes electronic medical record data set through the Hadoop Distributed File System (HDFS) and divide it at a ratio of 4:1 into the training data set $S_{TR}$ and the test data set $S_{TE}$. Send the training data set $S_{TR}$ to the m child nodes, and convert the data into a four-tuple decision information system S, described as follows:

[0064] In the decision information system S, $U=\{x_1, x_2, \ldots, x_M\}$ represents the set of patient objects in the diabetes electronic medical record data set, and M represents the number of patients in the diabetes electronic medical records; $C=\{a_1, a_2, \ldots, a_n\}$ represents the non-empty finite set of pathol...
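The 4:1 split and the distribution to m child nodes in Step 1 can be sketched on a single machine in plain Python. This is only an illustrative sketch, not the patent's Spark implementation: the function names `split_4_to_1` and `partition_round_robin`, the toy records, and the round-robin distribution scheme are all assumptions, since the source does not say how the training set is divided among the child nodes.

```python
import random


def split_4_to_1(records, seed=0):
    """Shuffle the records and split them into (train, test) at a 4:1 ratio."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = (len(shuffled) * 4) // 5  # first 4/5 go to training
    return shuffled[:cut], shuffled[cut:]


def partition_round_robin(records, m):
    """Distribute records over m partitions (hypothetical stand-ins for child nodes)."""
    return [records[i::m] for i in range(m)]


# Toy, made-up records: (patient_id, label); 1 in 10 is the minority class.
records = [(i, 1 if i % 10 == 0 else 0) for i in range(100)]

train, test = split_4_to_1(records)
parts = partition_round_robin(train, m=4)
```

In PySpark, `DataFrame.randomSplit([0.8, 0.2], seed=...)` yields a split of approximately the same proportions; whether the patent's implementation uses it is not stated in the source.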


Abstract

The invention provides a neighborhood evidence Spark method for parallel classification of large-scale unbalanced diabetes electronic medical records. The method comprises the following steps: reading the diabetes data on the master node and dividing it into a training set and a test set at a ratio of 4:1; performing Spark parallel undersampling of the diabetes training set on the child nodes to obtain a plurality of new training subsets; obtaining pathological feature reduction subsets on the child nodes through a Spark parallel pathological feature reduction device, and updating the pathological feature sets of the training subsets and test subsets on each child node; obtaining the prediction category label sets of the test subsets on the child nodes through a neighborhood evidence Spark parallel classifier; and obtaining the final prediction category labels on the master node according to a voting mechanism. The method removes redundant attributes from large-scale data, improves computational efficiency, makes full use of the support information among samples, and improves the efficiency and precision of diabetes data classification.
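Two of the steps in the abstract, undersampling into several training subsets and combining per-node predictions by voting, can be illustrated with a small single-machine Python sketch. It rests on assumptions the source does not confirm: each subset keeps every minority-class sample plus an equally sized random draw from the remaining samples, the vote is a simple majority, and the names `undersample_subsets` and `majority_vote` are hypothetical. The feature reduction device and the neighborhood evidence classifier are not reproduced here.

```python
import random
from collections import Counter


def undersample_subsets(samples, k, seed=0):
    """Build k balanced subsets from (features, label) pairs.

    Assumed scheme: keep all minority-class samples and add an equally
    sized random draw (without replacement) from the other samples.
    """
    rng = random.Random(seed)
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append((features, label))
    minority_label = min(by_label, key=lambda lab: len(by_label[lab]))
    minority = by_label[minority_label]
    majority = [s for lab, group in by_label.items()
                if lab != minority_label for s in group]
    return [minority + rng.sample(majority, len(minority)) for _ in range(k)]


def majority_vote(label_sets):
    """Combine predictions; label_sets[j][i] is node j's label for test sample i.

    Ties go to the label that appears first among the votes.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*label_sets)]


# Toy, made-up data: 90 majority-class and 10 minority-class samples.
samples = [((i,), 0) for i in range(90)] + [((i,), 1) for i in range(90, 100)]
subsets = undersample_subsets(samples, k=3)

# Made-up predictions from three child nodes for three test samples.
final_labels = majority_vote([[0, 1, 1], [0, 0, 1], [1, 0, 1]])
```

Using an odd number of subsets, as in this toy call, avoids ties in a two-class majority vote.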

Description

Technical field

[0001] The invention relates to the technical field of intelligent medical information processing, and in particular to a neighborhood evidence Spark method for parallel classification of large-scale unbalanced diabetes electronic medical records.

Background technique

[0002] Diabetes mellitus (DM) refers to a series of metabolic disorder syndromes involving sugar, protein, fat, water, and electrolytes, caused by the decline of islet function and by insulin resistance that result from various pathogenic factors such as genetic factors, endocrine dysfunction, and dietary imbalance. In addition, diabetes causes many types of complications, and doctors cannot effectively and accurately determine whether a patient has diabetes by relying only on the patient's physical signs.

[0003] At present, the effective way to judge the condition of diabetes is through the pathological characteristic experiment of the etiology and pathogenesis of diabetes, but the experiment n...


Application Information


IPC(8): G16H10/60, G16H15/00, G06F16/182, G06K9/62

CPC: G16H10/60, G16H15/00, G06F16/182, G06F18/24, G06F18/214

Inventor: 丁卫平, 李铭, 孙颖, 秦廷帧, 鞠恒荣, 黄嘉爽, 高自强, 潘壬远

Owner: NANTONG UNIVERSITY
