Sample imbalance-oriented multi-disease classifier design method

A multi-disease classifier design method, applied in fields such as instruments, computing, and computer components, that addresses the problem of imbalanced data sets and achieves small feature subsets with good results.

Active Publication Date: 2021-03-26
TONGJI UNIV

AI Technical Summary

Problems solved by technology

Only by solving the problem of imbalanced data sets can the accuracy of small-sample disease prediction be improved and artificial intelligence become more widely adopted.



Examples


Embodiment Construction

[0034] To aid understanding, the present invention is described in further detail below with reference to an embodiment and Figure 1. The embodiment only serves to explain the present invention and does not limit its scope of protection.

[0035] This application describes a multi-disease classification method for imbalanced samples. The process is shown in Figure 1 and comprises the following five steps:

[0036] Step 1, divide the imbalanced samples into sample subsets according to their disease categories;

[0037] Step 2, perform feature selection based on disease association rules;

[0038] Step 3, perform random iterative balanced sampling, with the imbalance degree as the upper limit;

[0039] Step 4, train the weak classifiers and calculate their classification performance;

[0040] Step 5, complete the disease classification prediction by judging whether the difference of macro-F1 me...
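Steps 1 and 3 can be illustrated with a minimal Python sketch. The listing does not give the exact sampling rule, so everything here is an assumption made for illustration: the function names are hypothetical, the imbalance degree is taken to be the ratio of the largest to the smallest subset size, and balancing is done by plain resampling with replacement up to the size of the largest subset. It is not the patented procedure.

```python
import random
from collections import defaultdict


def split_by_category(samples):
    """Step 1: group (features, label) pairs into one subset per disease label."""
    subsets = defaultdict(list)
    for features, label in samples:
        subsets[label].append((features, label))
    return dict(subsets)


def imbalance_degree(subsets):
    """Assumed definition: largest subset size divided by smallest subset size."""
    sizes = [len(subset) for subset in subsets.values()]
    return max(sizes) / min(sizes)


def balanced_sample(subsets, rng):
    """Step 3 (sketch): draw the same number of cases from every subset.

    Each subset is resampled with replacement up to the size of the largest
    subset, so the expected number of copies of any single case is at most
    the imbalance degree defined above.
    """
    target = max(len(subset) for subset in subsets.values())
    balanced = []
    for subset in subsets.values():
        balanced.extend(rng.choices(subset, k=target))
    rng.shuffle(balanced)
    return balanced


# Toy data: six cases of disease "a" and two cases of disease "b".
samples = [([0.1 * i], "a") for i in range(6)] + [([1.0 + i], "b") for i in range(2)]
subsets = split_by_category(samples)
balanced = balanced_sample(subsets, random.Random(0))
```

In this toy run the imbalance degree is 3.0 and the balanced set holds six cases per disease. A fixed `random.Random` seed keeps each sampling round reproducible.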


Abstract

The invention aims to overcome the defects of the prior art and provides a sample imbalance-oriented multi-disease classifier design method comprising the following steps: dividing medical case data into several case sample subsets according to disease category; applying feature selection based on disease association rules to each sample subset to select the feature vector of that subset; iteratively and randomly updating the sampling probability, with the imbalance degree as the upper threshold, to balance the case sample subsets; training a weak classifier for each sample subset and calculating its classification performance; and finally deciding whether iterative generation of the multi-disease classifier is complete by checking whether the difference in macro-F1 meets the iteration convergence threshold.
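The stopping rule based on macro-F1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `macro_f1` follows the standard definition (the unweighted mean of per-class F1 scores), while `has_converged` and its default `threshold` of 0.001 are hypothetical, since the listing does not give the threshold value or state which rounds are compared.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (standard macro-F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); treat an empty class as 0.0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def has_converged(previous_f1, current_f1, threshold=1e-3):
    """Assumed rule: stop when macro-F1 changes by less than `threshold`."""
    return abs(current_f1 - previous_f1) < threshold


# Toy example: two diseases, one misclassified case.
score = macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

Macro-F1 weights every disease equally regardless of how many cases it has, which is why it suits imbalanced data better than plain accuracy.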

Description

Technical field

[0001] The invention relates to the field of machine learning, in particular to imbalanced samples and ensemble learning algorithms.

Background

[0002] Domestic machine learning models have gradually come into use for multi-disease classification, but in the medical field it is difficult to build multi-disease classification models directly for medical cases with few training samples. As the diagnostic capability of a model improves, the number of features it requires keeps growing and the imbalance among case samples gradually increases. This eventually leads to the curse of dimensionality in the feature matrix, excessive computation, low classification accuracy, and low training accuracy. Problems such as sample sparsity and overfitting ultimately degrade the classification quality of the classifier.

[0003] In order to overcome the problem of unbalanced case samples in these medical fie...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N5/02, G06N20/20, G16H50/20, G16H50/70
CPC: G06N5/025, G06N20/20, G16H50/20, G16H50/70, G06F18/2431, G06F18/214, Y02A90/10
Inventor: 方钰, 徐蔚, 曲艺, 陆明名, 黄欣, 翟鹏珺
Owner: TONGJI UNIV