Classification method based on conversion from majority class to minority class under unbalanced data set

A classification method based on converting majority-class samples to the minority class, applied to still image data clustering/classification, still image data retrieval, etc. It addresses problems such as increased computational overhead, lack of flexibility, and difficulty in migrating between different datasets.

Pending Publication Date: 2021-07-06
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

The article "Neighborhood-based undersampling approach for handling imbalanced and overlapped data" clearly points out that although data-level improvement methods are conceptually simple, they increase computational overhead, and although algorithm-level improvement methods are faster, they lack flexibility and are difficult to migrate between different datasets.


Examples


Embodiment Construction

[0047] In order to make the purpose, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0048] This part describes the specific embodiments of the present invention in detail, and the preferred embodiments of the present invention are shown in the accompanying drawings. The drawings illustrate each technical feature and the overall technical solution of the invention, but they should not be understood as limiting the protection scope of the present invention.

[0049] In the description of the present invention, unless otherwise clearly defined, words such as setting, installation, and connection should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanin...



Abstract

The invention discloses a classification method based on converting majority-class samples to the minority class on an unbalanced data set. The method comprises the following steps: preprocessing the overall training data; for each minority-class sample, selecting a certain number of majority-class samples as partners, and determining the value range of that number; selecting suitable samples from the majority class to form a new sample set; changing the labels of the samples in that set; boosting weak classifiers by applying a new loss function to a forward additive model, where during training the final classifier distribution is solved and, at each step, the objective function is solved for the weights of the optimal base classifier and the modified base classifier; and performing pre-training with a classifier to determine the final number of majority-class samples converted to the minority class. The method can be applied not only to image classification and image recognition, but also to natural language processing and other scenarios that require classification.
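The partner-selection and relabeling steps of the abstract can be illustrated with a minimal sketch. The excerpt does not state how partners are chosen, so the rule below (the k nearest majority-class samples by Euclidean distance) is an assumption, and `convert_majority_partners` is a hypothetical helper name rather than anything defined in the patent. The boosting stage is not sketched because the new loss function is not given in this excerpt.

```python
import math


def convert_majority_partners(X, y, k, minority=1, majority=0):
    """Relabel the k nearest majority samples of each minority sample.

    Hypothetical helper: nearest-neighbour selection by Euclidean distance
    is an assumption, not the selection rule stated in the patent.
    """
    majority_idx = [i for i, label in enumerate(y) if label == majority]
    minority_idx = [i for i, label in enumerate(y) if label == minority]

    partners = set()
    for i in minority_idx:
        # Rank majority samples by distance to this minority sample.
        ranked = sorted(majority_idx, key=lambda j: math.dist(X[i], X[j]))
        partners.update(ranked[:k])

    # Change the labels of the selected partner set; leave the rest as-is.
    new_y = [minority if i in partners else label for i, label in enumerate(y)]
    return new_y, sorted(partners)


# Toy data: one minority sample (label 1) and five majority samples (label 0).
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [5.0, 5.0], [5.1, 4.9], [6.0, 6.0]]
y = [1, 0, 0, 0, 0, 0]
new_y, partners = convert_majority_partners(X, y, k=2)
print(partners, new_y)
```

In this sketch `k` plays the role of the per-sample partner count whose value range the method determines; in practice it would be tuned, for example through the pre-training step the abstract mentions.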

Description

Technical field

[0001] The invention relates to a classification method, and more specifically to a classification method based on converting majority-class samples to the minority class on an unbalanced data set.

Background art

[0002] In image classification or recognition, many algorithms rest on the basic assumption that the data are uniformly distributed. When these algorithms are applied directly to real data, such as medical or fraud data, in most cases they cannot achieve ideal results. Because real data are often very unevenly distributed, a "long tail phenomenon" appears, which is the problem of unbalanced classification. Generally speaking, a data set is called an imbalanced data set when it meets two conditions: an imbalance in the number of samples per class and an imbalance in misclassification costs. Taking the binary classification problem as an example, assuming that the number of samples of the negative class is much larger than that of the positive class, ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/55
CPC: G06F16/55
Inventor: 何克晶, 王高山
Owner: SOUTH CHINA UNIV OF TECH