Hybrid framework-based unbalanced classification method, system and equipment and storage medium

A classification method and system, applied in the field of data processing, that addresses problems such as data imbalance.

Pending Publication Date: 2021-09-10
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

However, existing classification algorithms are mainly aimed at relatively ...


Examples

Detailed Description of the Embodiments

[0049] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

[0050] In one embodiment, as shown in Figure 1, a hybrid framework-based unbalanced classification method is provided, comprising the following steps:

[0051] Step 101: obtain a given initial data set D that contains a majority-class training data set D_majority and a minority-class training data set D_minority;

[0052] Step 102: remove majority-class data samples from the initial data set D by random undersampling to generate a new majority-class data set, denoted D_majority_reduced. The dataset represents the ...
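
Steps 101 and 102 can be illustrated with a minimal sketch of random undersampling. This is not the patented implementation: the helper name random_undersample, the toy data, the target size of 30 and the fixed seed are all assumptions made for illustration, since the excerpt does not state how many majority samples are kept.

```python
import random


def random_undersample(d_majority, target_size, seed=0):
    """Return a random subset of the majority-class samples (no replacement).

    target_size and seed are illustrative assumptions; the source text does
    not specify the reduction ratio.
    """
    if target_size > len(d_majority):
        raise ValueError("target_size exceeds the number of majority samples")
    rng = random.Random(seed)
    return rng.sample(d_majority, target_size)


# Step 101 (toy stand-in): an initial data set D split into its two classes.
# Each sample is a (feature, label) pair; the values are made up.
d_majority = [(float(i), 0) for i in range(100)]
d_minority = [(float(i), 1) for i in range(10)]

# Step 102: discard majority samples at random to obtain D_majority_reduced.
d_majority_reduced = random_undersample(d_majority, target_size=30)
```

Sampling without replacement keeps every retained sample an unmodified member of D_majority, so the reduction only drops data and never alters it.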


Abstract

The invention relates to a hybrid framework-based unbalanced classification method, system, equipment and storage medium. The method verifies a hybrid resampling integrated model on an unbalanced network anomaly detection data set. A combination of resampling methods reduces the number of majority-class samples, which increases processing speed. The unbalanced data set is processed at the data level and converted toward a balanced distribution using resampling. An integrated model comprising 12 different classifiers is established, offering more choices than the 5 classifiers used in previous work. The slightly balanced data obtained after resampling is then classified by the integrated model. The result is a novel combination of undersampling and oversampling that balances the different data categories and increases processing speed with less memory overhead.
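
The pipeline described in the abstract, hybrid resampling followed by classification with a combined model, can be sketched roughly as follows. The abstract names neither the oversampling technique nor the rule used to combine the classifiers, so this sketch substitutes random duplication for oversampling and a simple majority vote for the combination. The function names, target sizes and the three threshold classifiers are hypothetical stand-ins, not the 12 classifiers of the invention.

```python
import random
from collections import Counter


def hybrid_resample(majority, minority, majority_target, minority_target, seed=0):
    """Undersample the majority class and oversample the minority class.

    Assumption: oversampling is plain random duplication; the abstract does
    not say which oversampling method the invention uses.
    """
    rng = random.Random(seed)
    # Undersampling: keep a random subset of majority samples (no replacement).
    reduced = rng.sample(majority, min(majority_target, len(majority)))
    # Oversampling: append random duplicates of minority samples.
    extra = max(0, minority_target - len(minority))
    grown = list(minority) + [rng.choice(minority) for _ in range(extra)]
    return reduced, grown


def majority_vote(classifiers, x):
    """Combine classifier outputs by simple majority vote (an assumed rule)."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Toy data: 1000 majority samples (label 0) and 20 minority samples (label 1).
majority = [(float(i), 0) for i in range(1000)]
minority = [(float(i), 1) for i in range(1000, 1020)]

# Targets chosen so the result is only "slightly balanced" (200 vs 100).
reduced, grown = hybrid_resample(majority, minority, 200, 100)

# Three hypothetical threshold classifiers standing in for trained models.
classifiers = [
    lambda x: int(x >= 990.0),
    lambda x: int(x >= 1000.0),
    lambda x: int(x >= 1010.0),
]
prediction = majority_vote(classifiers, 1005.0)
```

Shrinking the majority class before growing the minority class keeps the resampled set small, which is consistent with the abstract's stated goal of faster processing with less memory overhead.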

Description

Technical field

[0001] The present application relates to the field of data processing, in particular to a hybrid framework-based imbalance classification method, system, device and storage medium.

Background technique

[0002] In the current era of big data, data mining and analysis occupy an increasingly important position in effective decision-making. Among various data mining techniques, classification analysis is one of the most widely used techniques, which can be applied to various business and engineering problems, such as cancer prediction, churn prediction, deception detection, face detection, fraud detection, etc. Classification analysis is a supervised classifier learning problem for predicting a variable consisting of a limited number of categories. Typically, classifier learning methods are designed to be used with reasonably balanced datasets. However, in many practical situations, datasets are often unbalanced.

[0003] Currently, there are two mainstream ...


Application Information

IPC(8): G06K9/62
CPC: G06F18/241, G06F18/214
Inventor: 郭得科, 陈锐, 罗来龙, 陈颖文
Owner: NAT UNIV OF DEFENSE TECH