Data equalization method, system and equipment

A data equalization technology, applied in the fields of instruments, character and pattern recognition, and computer components. It addresses the problems of existing approaches, namely the loss of important sample information, damage to the distribution characteristics of the original data set, and unsatisfactory mining results on unbalanced data, and thereby improves the mining effect.

Inactive Publication Date: 2019-10-01
GUANGDONG UNIV OF TECH
0 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

However, doing so will easily lead to the loss of important sample information, destroy the distribution ...

Embodiment Construction

[0058] The core of the present application is to provide a data equalization method, system, device and computer-readable storage medium for improving the mining effect of unbalanced data.

[0059] To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only a part of the embodiments of this application, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of the embodiments in this application, without creative effort, fall within the scope of protection of this application.

[0060] Please refer to Figure 1, which is a flow chart of a data equalization method provided by an embodiment of the present application.

[0061] It ...

Abstract

The invention discloses a data equalization method. The method comprises the following steps: receiving an input unbalanced data set; calculating a boundary judgment factor for each sample in the minority class sample set; sampling the boundary set of the majority class sample set through a sparse sampling strategy to obtain a new majority class sample set; interpolating through a partition neighborhood interpolation strategy to construct a new minority class sample set; and combining the new majority class sample set with the new minority class sample set to obtain the processed unbalanced data set. The invention shrinks the decision space of the majority class boundary samples, so that the boundary between majority class and minority class samples becomes clearer; the partition neighborhood interpolation strategy further sharpens the sample boundary, which improves the mining effect on unbalanced data. The invention further provides a data equalization system, equipment and a computer-readable storage medium that have the same beneficial effects.
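
The steps listed in the abstract can be sketched in Python with NumPy. This is only an illustrative sketch, not the patented procedure: the abstract does not define the boundary judgment factor, the sparse sampling rule, or the partition rule, so every definition below is an assumption. Here the factor is assumed to be the share of majority class points among a minority sample's k nearest neighbours, the majority boundary set is assumed to be the majority neighbours of boundary minority samples, sparse sampling is assumed to be random thinning, and the interpolation is a SMOTE-style linear interpolation with a shorter step for boundary samples. The names `boundary_factors`, `equalize`, `keep_ratio` and `threshold` are hypothetical, and binary labels are assumed.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distance between every row of a and every row of b."""
    return np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2))

def boundary_factors(X_min, X_maj, k=5):
    """ASSUMED factor: share of majority points among each minority sample's k nearest neighbours."""
    X_all = np.vstack([X_min, X_maj])
    is_maj = np.r_[np.zeros(len(X_min), bool), np.ones(len(X_maj), bool)]
    d = pairwise_dist(X_min, X_all)
    idx = np.arange(len(X_min))
    d[idx, idx] = np.inf                      # a sample is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]         # neighbour indices into X_all
    return is_maj[nn].mean(axis=1), nn

def equalize(X, y, minority_label=1, k=5, keep_ratio=0.5, threshold=0.5, seed=0):
    rng = np.random.default_rng(seed)
    min_mask = y == minority_label
    X_min, X_maj, y_maj = X[min_mask], X[~min_mask], y[~min_mask]
    factors, nn = boundary_factors(X_min, X_maj, k)
    boundary_min = factors >= threshold       # ASSUMED partition of the minority class

    # ASSUMED majority boundary set: majority neighbours of boundary minority samples.
    nn_b = nn[boundary_min].ravel()
    maj_boundary = np.unique(nn_b[nn_b >= len(X_min)] - len(X_min))
    # ASSUMED sparse sampling: keep a random keep_ratio share of that set.
    n_keep = int(round(keep_ratio * len(maj_boundary)))
    kept = rng.permutation(maj_boundary)[:n_keep]
    drop = np.setdiff1d(maj_boundary, kept)
    X_maj_new, y_maj_new = np.delete(X_maj, drop, axis=0), np.delete(y_maj, drop)

    # SMOTE-style interpolation between minority neighbours until the classes match.
    n_new = len(X_maj_new) - len(X_min)
    X_min_new = X_min
    if n_new > 0 and len(X_min) > 1:
        d_mm = pairwise_dist(X_min, X_min)
        np.fill_diagonal(d_mm, np.inf)
        k_min = min(k, len(X_min) - 1)
        nn_min = np.argsort(d_mm, axis=1)[:, :k_min]
        seeds = rng.integers(0, len(X_min), size=n_new)
        mates = nn_min[seeds, rng.integers(0, k_min, size=n_new)]
        # Boundary seeds take shorter steps so new points stay near the seed.
        step = rng.random(n_new) * np.where(boundary_min[seeds], 0.5, 1.0)
        synth = X_min[seeds] + step[:, None] * (X_min[mates] - X_min[seeds])
        X_min_new = np.vstack([X_min, synth])

    X_out = np.vstack([X_maj_new, X_min_new])
    y_out = np.concatenate([y_maj_new, np.full(len(X_min_new), minority_label)])
    return X_out, y_out

# Toy data: 40 majority points (label 0) and 10 minority points (label 1).
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0.0, 1.0, (40, 2)), data_rng.normal(2.0, 1.0, (10, 2))])
y = np.r_[np.zeros(40, int), np.ones(10, int)]
X_out, y_out = equalize(X, y)
```

In this sketch the majority class is thinned only near the assumed boundary, and the minority class is then topped up until both classes have the same number of samples. A production version would replace the dense distance matrices with a nearest-neighbour index.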

Description

Technical field

[0001] The present application relates to the field of data mining, in particular to a data equalization method, system, device and computer-readable storage medium.

Background

[0002] Unbalanced data mining has become one of the most challenging problems in the field of data analysis. Practical applications produce large amounts of unbalanced data, such as hospital patient diagnosis data, network intrusion data and telecom fraud data. The minority class samples usually contain important information and are an important target of data mining.

[0003] At present, research on the classification of unbalanced data is mainly carried out at two levels: data and algorithms. At the algorithm level, the model's ability to handle unbalanced data sets is improved by optimizing traditional classification algorithms or proposing new ones, such as active learning, ensemble learning methods, one-class learning methods, and co...

Application Information

IPC(8): G06K9/62
CPC: G06F18/24147, G06F18/214
Inventor: 蔡延光, 林枫, 蔡颢
Owner: GUANGDONG UNIV OF TECH