Disease risk factor extraction method based on a combination of clustering and classification

A risk factor extraction technology applied in the fields of big data technology and medicine. It addresses problems such as a large amount of calculation and achieves the effect of reducing errors.

Inactive Publication Date: 2019-08-30
NANJING UNIV OF SCI & TECH
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Most clustering methods currently used to extract disease risk factors are hierarchical clustering methods. Dividing the clusters often requires professional medical knowledge to determine the characteristics of each cluster, and the amount of calculation is large.

Examples

Embodiment

[0051] The disease risk factor extraction method of the present invention, which combines clustering and classification, comprises the following steps:

[0052] 1. Construct the user information matrix and label vector from the user questionnaire for a given disease. In this example, a children's congenital heart disease data set is used. There are 8672 examination cases in total, and each case has 39 questionnaire questions whose answers are numerical indicators, covering the mother's mode of delivery, lifestyle, pregnancy matters, the father's lifestyle, and immediate health status. The resulting user information matrix has size 8672*40, where the first column holds the unique identification number of the case. The question features in the user information matrix in this embodiment are specifically: ['pregnancy', 'parity'...
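The construction in step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_user_matrix`, the dictionary record format, and the `label` key are assumptions made for the example, and only the two feature names visible in the text ('pregnancy', 'parity') are used.

```python
from typing import Dict, List, Sequence, Tuple


def build_user_matrix(
    records: Sequence[Dict[str, float]],
    feature_names: Sequence[str],
    label_key: str = "label",  # hypothetical key holding the diagnosis (0 or 1)
) -> Tuple[List[List[float]], List[int]]:
    """Turn questionnaire records into an information matrix and a label vector.

    Column 0 of every row is a unique case identifier (here simply 1, 2, 3, ...),
    followed by one numeric column per questionnaire feature.
    """
    matrix: List[List[float]] = []
    labels: List[int] = []
    for case_id, record in enumerate(records, start=1):
        row = [float(case_id)] + [float(record[name]) for name in feature_names]
        matrix.append(row)
        labels.append(int(record[label_key]))
    return matrix, labels


# Toy records; the values are made up for illustration only.
records = [
    {"pregnancy": 2, "parity": 1, "label": 0},
    {"pregnancy": 1, "parity": 1, "label": 1},
]
matrix, labels = build_user_matrix(records, ["pregnancy", "parity"])
print(len(matrix), len(matrix[0]), labels)
```

With 8672 cases and 39 features, the same function would yield the 8672*40 matrix described above, since the identifier occupies the extra column.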

Abstract

The invention discloses a disease risk factor extraction method based on a combination of clustering and classification. The method comprises the following steps: constructing a user information matrix and a label vector from a user questionnaire; performing attribute segmentation on the user information matrix to obtain a plurality of user information matrix subsets together with the original user information matrix; standardizing each user matrix; performing dimensionality reduction on the standardized user matrices to obtain dimension-reduced matrices; clustering the dimension-reduced matrices to obtain different categories of clustered population; and constructing a decision tree for each category of clustered population, aggregating statistics over all decision trees in an ensemble manner, and obtaining the disease risk factors according to the magnitude of a hierarchy coefficient. Compared with the regression-based statistical methods used for risk factor extraction in the medical field, the method makes full use of the original data and, by combining clustering and classification, reduces the error rate that arises when conclusions are drawn from a single decision tree.
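The overall flow of standardizing, clustering, and then classifying within each cluster can be sketched in plain Python. This is a heavily simplified sketch under stated assumptions, not the patented method: attribute segmentation and dimensionality reduction are omitted, k-means with a deterministic farthest-point initialization stands in for the clustering step, a single-split decision stump stands in for a full decision tree, and a simple vote count stands in for the hierarchy coefficient, which this excerpt does not define. All function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def standardize(rows):
    """Z-score every column; constant columns are mapped to zeros."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=20):
    """Lloyd's k-means with farthest-point initialization (deterministic)."""
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    assign = []
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: sq_dist(r, centers[j])) for r in rows]
        for j in range(k):
            members = [r for r, a in zip(rows, assign) if a == j]
            if members:
                centers[j] = [mean(col) for col in zip(*members)]
    return assign


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0


def best_stump_feature(rows, labels):
    """Index of the feature whose best threshold split lowers Gini impurity most."""
    best_score, best_feature = gini(labels), None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best_score, best_feature = score, f
    return best_feature


def rank_features(rows, labels, k=2):
    """Cluster standardized rows, fit one stump per cluster, count root features."""
    assign = kmeans(standardize(rows), k)
    votes = Counter()
    for j in range(k):
        idx = [i for i, a in enumerate(assign) if a == j]
        if not idx:
            continue
        f = best_stump_feature([rows[i] for i in idx], [labels[i] for i in idx])
        if f is not None:
            votes[f] += 1
    return votes.most_common()


# Toy data: features 0-2 define two populations, feature 3 drives the label.
rows = [[0.0, 0.0, 0.0, float(b)] for b in (0, 1, 0, 1)] + \
       [[10.0, 10.0, 10.0, float(b)] for b in (0, 1, 0, 1)]
labels = [0, 1, 0, 1, 0, 1, 0, 1]
ranking = rank_features(rows, labels, k=2)
print(ranking)
```

On the toy data, both clusters select feature 3 as the splitting feature, so it receives one vote per cluster. Counting how often a feature is chosen across clusters is one simple way to pool the per-cluster trees; a depth-weighted score over full trees would be closer in spirit to a hierarchy coefficient, but that is a guess about the source rather than something the excerpt confirms.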

Description

Technical field

[0001] The invention belongs to the field of big data technology and medicine, and in particular relates to a method for extracting disease risk factors based on a combination of clustering and classification.

Background

[0002] Gastroesophageal reflux disease is a disease in which gastric contents flow back into the esophagus, causing discomfort and complications. As a common clinical disease of the digestive system, it is widespread in Asian and Western countries, and its incidence is rising year by year. According to research, gastroesophageal reflux disease is related to various factors such as personal lifestyle, eating habits, and mental status, and the condition is prone to change. Therefore, exploring the risk factors of gastroesophageal reflux disease through big data technology is of great significance for the treatment and prevention of the disease.

[0003] At present, for the risk factors of gastroesophagea...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16H10/20, G16H50/70
CPC: G16H10/20, G16H50/70
Inventor: 沈兴鑫, 姚澜, 徐雷
Owner: NANJING UNIV OF SCI & TECH