Classification method for imbalanced problems based on cost local generalization error

A technique based on cost and local generalization error, applied to the classification of imbalanced data sets, that achieves fast training, improved stability, and reasonable classification decision boundaries.

Active Publication Date: 2022-03-29
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

The backpropagation algorithm has a drawback: the order in which the data are fed in strongly affects the result of model training. Data presented earlier have less influence on the model than data presented later.

Embodiment Construction

[0065] To make the object, technical solution, and advantages of the present invention clearer, the invention is described in detail below in conjunction with the embodiments. The specific embodiments described here only explain the invention and are not intended to limit its scope of application. Any process or symbol not described in detail below can be implemented or understood by those skilled in the art with reference to the prior art.

[0066] Figure 1 shows a flow chart of a preferred embodiment of the classification method for imbalanced problems based on cost local generalization error. For any input data, the text features are first converted into numerical form through one-hot encoding; the input data are then normalized, limiting the value range of each feature dimension of the ...
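
The two preprocessing steps named in [0066] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the target range of the normalization is cut off in the text above, so the [0, 1] range used here is an assumption, and the helper names `one_hot_encode` and `min_max_normalize` and the toy records are hypothetical.

```python
import numpy as np

def one_hot_encode(column):
    """Map a list of category labels to one-hot rows (categories sorted alphabetically)."""
    categories = sorted(set(column))
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(column), len(categories)))
    for row, value in enumerate(column):
        encoded[row, index[value]] = 1.0
    return encoded, categories

def min_max_normalize(X):
    """Scale every column to [0, 1] (assumed range); constant columns map to 0."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return (X - lo) / span

# Hypothetical toy records: one text feature and two numeric features.
colors = ["red", "green", "red", "blue"]
numeric = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 300.0], [5.0, 200.0]])

encoded, categories = one_hot_encode(colors)
features = min_max_normalize(np.hstack([encoded, numeric]))
```

Encoding before normalizing keeps the pipeline simple: one-hot columns already lie in {0, 1}, so the scaling leaves them unchanged and only the numeric columns are rescaled.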

Abstract

The invention discloses a classification method for imbalanced problems based on cost local generalization error, comprising the following steps: normalizing each feature dimension of the data; calculating the misclassification costs of the different classes of samples from the statistics of the data set; constructing a cost-sensitive multilayer perceptron neural network model; calculating, for each sample, the stochastic sensitivity (ST‑SM) corresponding to its local generalization error; and completing model training by combining cost sensitivity with the stochastic sensitivity of the local generalization error. The invention combines the improved generalization ability offered by the local generalization error with the faster training and testing of algorithm-level methods, while avoiding the drawback of data-level methods, which alter the data set distribution too much. As a result, the method improves the stability of classification on imbalanced data sets and the accuracy of the results for each class, and yields a more reasonable classification decision boundary.
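
The steps listed in the abstract can be illustrated with a rough sketch. The excerpt does not give the patent's formulas, so everything specific here is an assumption: the inverse-frequency cost rule, the one-hidden-layer network, the Monte Carlo estimate of stochastic sensitivity as the expected squared output change under uniform perturbations in [-q, q], and the way the two terms are summed. All function names and the toy data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def class_costs(y):
    """Assumed rule: cost inversely proportional to class frequency,
    scaled so that the majority class has cost 1."""
    classes, counts = np.unique(y, return_counts=True)
    return dict(zip(classes.tolist(), (counts.max() / counts).tolist()))

def mlp_forward(X, W1, b1, W2, b2):
    """One-hidden-layer perceptron with a sigmoid output in (0, 1)."""
    hidden = np.tanh(X @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))

def stochastic_sensitivity(X, params, q=0.1, n_perturb=50):
    """Monte Carlo estimate, per sample, of E[(f(x + dx) - f(x))^2]
    with dx drawn uniformly from [-q, q] in every dimension."""
    base = mlp_forward(X, *params)
    total = np.zeros_like(base)
    for _ in range(n_perturb):
        dx = rng.uniform(-q, q, size=X.shape)
        total += (mlp_forward(X + dx, *params) - base) ** 2
    return total / n_perturb

def cost_sensitive_objective(X, y, params, costs, q=0.1):
    """Assumed combination: cost-weighted mean of squared error plus sensitivity."""
    pred = mlp_forward(X, *params)
    weights = np.array([costs[label] for label in y.tolist()])
    sq_err = (pred - y) ** 2
    return float(np.mean(weights * (sq_err + stochastic_sensitivity(X, params, q))))

# Hypothetical imbalanced toy set: nine majority samples, one minority sample.
X = rng.uniform(0.0, 1.0, size=(10, 3))
y = np.array([0] * 9 + [1])
params = (rng.normal(size=(3, 4)), np.zeros(4), rng.normal(size=4), 0.0)

costs = class_costs(y)
sensitivity = stochastic_sensitivity(X, params)
objective = cost_sensitive_objective(X, y, params, costs)
```

In this sketch the single minority sample receives nine times the weight of a majority sample, so both its training error and its sensitivity to small input perturbations count more heavily in the objective. A training loop would minimize `objective` with respect to `params`.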

Description

Technical field

[0001] The invention relates to the classification of imbalanced data sets, and in particular to a classification method for imbalanced problems based on cost local generalization error.

Background art

[0002] With the development of computer science and technology, machine learning has come to play an extremely important role in many fields. In recent years, imbalanced data sets have become impossible to ignore in most fields and now hinder the development of machine learning. For example, in a physical examination, healthy people generally outnumber actual patients. If a healthy person is misclassified as sick, that person only needs to be tested again manually; if a patient's disease is missed, however, the patient will not receive timely treatment. Ordinary machine learning assumes that the data set distribution is balanced and that every misclassification incurs the same loss. Therefore, the mo...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06V10/764, G06V10/774, G06K9/62, G06N3/04, G06N3/063
CPC: G06N3/063, G06N3/045, G06F18/214
Inventors: 吴永贤, 刘政锡, 张建军
Owner: SOUTH CHINA UNIV OF TECH