Classification method for imbalance problem based on cost local generalization error

A classification method and generalization-error technique, applied to the classification of imbalanced datasets

Active Publication Date: 2019-08-09
SOUTH CHINA UNIV OF TECH
AI Technical Summary

Problems solved by technology

However, the backpropagation algorithm has a disadvantage: the order in which training data is presented strongly affects the result of model training. Samples presented earlier have less influence on the final model than samples presented later.

Embodiment Construction

[0065] To make the object, technical solution and advantages of the present invention clearer, the invention is described in detail below in conjunction with the embodiments. It should be understood that the specific embodiments described here only explain the present invention and are not intended to limit its scope of application. In addition, where processes or symbols are not described in detail below, those skilled in the art can implement or understand them with reference to the prior art.

[0066] Figure 1 shows a flow chart of a preferred embodiment of the classification method for the imbalance problem based on the cost local generalization error of the present invention. For any input data, the text features in the data are first converted into numerical form through one-hot encoding; next, the input data is normalized, limiting the value range of each dimension feature of the ...
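
The two preprocessing steps named in paragraph [0066] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the excerpt is truncated before it states the target value range, so the [0, 1] range used by `min_max_normalize` is an assumption, and the function names and the toy data (`colors`, `heights`) are hypothetical.

```python
import numpy as np

def one_hot(column):
    """Convert a list of text labels into one-hot rows.

    Categories are sorted so the column order is deterministic.
    """
    categories = sorted(set(column))
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(column), len(categories)))
    for row, value in enumerate(column):
        encoded[row, index[value]] = 1.0
    return encoded, categories

def min_max_normalize(X):
    """Rescale every feature dimension to [0, 1] (assumed range)."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns stay at 0 instead of dividing by zero
    return (X - lo) / span

# Hypothetical toy data: one text feature and one numeric feature.
colors = ["red", "green", "red", "blue"]
heights = np.array([[150.0], [180.0], [165.0], [150.0]])

encoded, categories = one_hot(colors)
features = np.hstack([encoded, min_max_normalize(heights)])
```

Here `features` has one column per category plus the rescaled numeric column, so every dimension lies in the same range before it reaches the network.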

Abstract

The invention discloses a classification method for the imbalance problem based on a cost local generalization error. The method comprises the following steps: normalizing each feature dimension of the data; calculating the misclassification cost of each class of samples according to the statistics of the data set; constructing a cost-sensitive multilayer perceptron neural network model; for each sample, calculating the stochastic sensitivity (ST-SM) corresponding to its local generalization error; and completing model training by combining cost sensitivity with the stochastic sensitivity of the local generalization error. The method avoids the drawback of data-level methods, which excessively alter the distribution of the data set, and effectively combines the improved generalization ability offered by the local generalization error with the faster training and testing times of algorithm-level methods. As a result, classification stability on imbalanced data sets and the accuracy of the classification result for each class are improved, and a more reasonable classification decision boundary is obtained.
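
The steps listed in the abstract can be illustrated with a rough sketch. The excerpt does not give the patent's formulas, so everything specific here is an assumption made for illustration: `class_costs` uses inverse class frequency as the misclassification cost, `stochastic_sensitivity` is a Monte Carlo estimate of the expected squared output change under uniform input perturbations of half-width `q`, and `cost_sensitive_objective` simply adds the two terms and weights them by class cost. All function names and parameters are hypothetical.

```python
import numpy as np

def class_costs(y):
    """Assumed rule: cost inversely proportional to class frequency, summing to 1."""
    classes, counts = np.unique(y, return_counts=True)
    inv = 1.0 / counts
    return dict(zip(classes.tolist(), (inv / inv.sum()).tolist()))

def mlp_forward(X, W1, b1, W2, b2):
    """One-hidden-layer perceptron with tanh hidden units and a sigmoid output."""
    hidden = np.tanh(X @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))

def stochastic_sensitivity(X, params, q=0.1, n_samples=50, rng=None):
    """Monte Carlo estimate of E[(f(x + dx) - f(x))^2] per sample, dx ~ U(-q, q)."""
    rng = np.random.default_rng(0) if rng is None else rng
    base = mlp_forward(X, *params)
    total = np.zeros(X.shape[0])
    for _ in range(n_samples):
        noise = rng.uniform(-q, q, size=X.shape)
        diff = mlp_forward(X + noise, *params) - base
        total += (diff ** 2).sum(axis=1)
    return total / n_samples

def cost_sensitive_objective(X, y, params, costs, q=0.1):
    """Cost-weighted sum of squared training error and stochastic sensitivity."""
    out = mlp_forward(X, *params)[:, 0]
    weights = np.array([costs[label] for label in y.tolist()])
    error = (out - y) ** 2
    sens = stochastic_sensitivity(X, params, q=q)
    return float(np.sum(weights * (error + sens)))

# Hypothetical imbalanced toy set: 8 majority samples, 2 minority samples.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(10, 3))
y = np.array([0] * 8 + [1] * 2)
params = (rng.normal(size=(3, 4)), np.zeros(4), rng.normal(size=(4, 1)), np.zeros(1))

costs = class_costs(y)
loss = cost_sensitive_objective(X, y, params, costs)
```

With this assumed cost rule the minority class receives four times the weight of the majority class, so an error or an unstable output on a minority sample contributes more to the objective that training would minimize.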

Description

Technical field

[0001] The invention relates to the field of imbalanced data set classification, in particular to a classification method for the imbalance problem based on a cost local generalization error.

Background

[0002] With the development of computer science and technology, machine learning has come to play an extremely important role in many fields. In recent years, imbalanced data sets have appeared in most of these fields, cannot be ignored, and have become a hindrance to the development of machine learning. For example, in a physical examination, the number of healthy people is generally much greater than the number of actual patients. If a healthy person is misclassified as sick, a manual re-test is all that is needed; however, if a patient's disease is missed, the patient will not receive timely treatment. Ordinary machine learning assumes that the data set distribution is balanced and that every misclassification incurs the same loss. Therefore, the mo...

Application Information

IPC(8): G06K9/62, G06N3/04, G06N3/063
CPC: G06N3/063, G06N3/045, G06F18/214
Inventors: 吴永贤, 刘政锡, 张建军
Owner: SOUTH CHINA UNIV OF TECH