A method and system for classifying unbalanced data sets

The invention relates to unbalanced data and classification methods, applied to methods and systems for classifying unbalanced data sets. It addresses problems such as poor accuracy and low efficiency, with the stated effects of reducing imbalance, improving accuracy, and reducing impact.

Active Publication Date: 2022-07-08
TAIYUAN UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a method and system for classifying unbalanced data sets, so as to solve the problems of low efficiency and poor accuracy in classifying unbalanced data sets in the prior art.

Embodiment Construction

[0067] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

[0068] The purpose of the present invention is to provide a method and system for classifying unbalanced data sets, so as to solve the problems of low efficiency and poor accuracy when classifying unbalanced data sets in the prior art.

[0069] In order to make the above objects, features and advantages of the present invention more clearly understood, the present invention will be described in further detail below with reference to t...

Abstract

The method and system for classifying unbalanced data sets of the present invention calculate the class centers c1 and c2 of the positive and negative training sets, determine the distance T between the two class centers, the positive class hyperplane, the negative class hyperplane, and the first, second, third and fourth distances, and then determine the fuzzy membership function. The classification model is determined from the fuzzy membership function and a fuzzy dual support vector machine. An optimized classification model is obtained using a grid search algorithm and cross-validation. The unbalanced data to be classified are input into the optimized classification model to obtain their classification result. By determining the classification model based on the fuzzy membership function, the method or system assigns different membership values to the sample points according to differences in their contribution to the classification hyperplane and the difference in the unbalanced rate of the two classes of samples. The accuracy of the classification results is thereby improved.
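The first steps of the abstract (class centers and the distance T between them) can be sketched in a few lines of Python. The listing does not give the patent's fuzzy membership function or define the four distances, so the `center_distance_membership` helper below is only a common distance-to-center stand-in, not the claimed function. All function names, the `delta` parameter, and the toy data are assumptions made for illustration.

```python
import math


def class_center(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dim)]


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def center_distance_membership(points, center, delta=1e-6):
    """Hypothetical stand-in membership: 1 - d_i / (r + delta).

    d_i is the distance of sample i to its class center and r is the
    largest such distance. This is NOT the patent's membership function,
    which is not given in this listing.
    """
    dists = [euclidean(p, center) for p in points]
    r = max(dists)
    return [1.0 - d / (r + delta) for d in dists]


# Toy data (assumed): positive = minority class, negative = majority class.
positive = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
negative = [[3.0, 4.0], [5.0, 4.0], [3.0, 6.0], [5.0, 6.0], [4.0, 5.0], [4.0, 5.0]]

c1 = class_center(positive)   # class center of the positive training set
c2 = class_center(negative)   # class center of the negative training set
T = euclidean(c1, c2)         # distance between the two class centers

mu_neg = center_distance_membership(negative, c2)
print(c1, c2, T)
```

In this stand-in, samples close to their class center receive a membership near 1 and the farthest samples receive a membership near 0, which is one simple way to down-weight likely outliers before training.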

Description

Technical field

[0001] The invention relates to the technical field of unbalanced data processing, in particular to a method and system for classifying unbalanced data sets.

Background

[0002] Data in many industries often have an imbalanced distribution. Taking the binary classification problem as an example, if the proportion of samples of one class is much larger than that of the other class, the data set is an unbalanced data set. The majority class samples are also called negative class samples, the minority class samples are called positive class samples, and the ratio of the number of negative class samples to the number of positive class samples is called the Imbalanced Rate (IR). Typical examples include fault diagnosis data, credit fraud data, and medical diagnosis data. Since the classification prediction accuracy of the minority class is more important in practice when classifying and predicting an unbalanced dataset, the commonly used cl...
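As a small illustration of the Imbalanced Rate defined in the background, the following Python sketch computes IR as the number of negative (majority) samples divided by the number of positive (minority) samples. The function name `imbalance_rate`, the label encoding (1 for positive, 0 for negative), and the toy label list are assumptions made for this example.

```python
from collections import Counter


def imbalance_rate(labels, positive_label=1):
    """IR = (number of negative samples) / (number of positive samples).

    Assumes a binary problem in which every label other than
    `positive_label` belongs to the negative (majority) class.
    """
    counts = Counter(labels)
    n_pos = counts[positive_label]
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("no positive (minority) samples in labels")
    return n_neg / n_pos


# Toy label list (assumed): 95 negative samples and 5 positive samples.
labels = [0] * 95 + [1] * 5
ir = imbalance_rate(labels)
print(ir)  # 95 / 5 = 19.0
```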

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/62
CPC: G06F18/2411, G06F18/214
Inventor: 张雪英, 李凤莲, 陈桂军, 张波, 魏鑫, 焦江丽
Owner: TAIYUAN UNIV OF TECH