Hierarchy gravity model based imbalanced data classification method and system therefor

A gravitational model and data classification technology, applied in text database clustering/classification, electronic digital data processing, special data processing applications, etc.

Active Publication Date: 2016-02-10
DISCOVERY TECH SHENZHEN

AI Technical Summary

Problems solved by technology

However, this method reduces the accuracy of the computed data gravitational field, especially near the centroid of a data particle. In the neighborhood of the centroid the original data are relatively dense, so the gradient of the data gravitational field changes quickly and the field is comparatively complex. Once the data particles have been created, the field calculated from them loses some information of the original gravitational field, which inevitably affects classification accuracy.
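The information loss near a particle centroid can be illustrated with a minimal sketch. It assumes a one-dimensional space, unit sample masses, and an inverse-square field strength; none of these details are given in this excerpt, and the names `field_strength` and `relative_error` are hypothetical.

```python
def field_strength(x, sources):
    """Sum of inverse-square attractions at position x.

    sources is a list of (position, mass) pairs. The inverse-square law
    is an assumption made for this illustration.
    """
    return sum(mass / (x - pos) ** 2 for pos, mass in sources)


# Two original samples of unit mass, and the single data particle that
# replaces them: located at their centroid, carrying their combined mass.
points = [(-0.1, 1.0), (0.1, 1.0)]
particle = [(0.0, 2.0)]


def relative_error(x):
    """Relative error of the particle-based field against the exact field."""
    exact = field_strength(x, points)
    approx = field_strength(x, particle)
    return abs(exact - approx) / exact


near = relative_error(0.2)  # close to the centroid
far = relative_error(5.0)   # far from the centroid
print(near, far)
```

In this toy setting the approximation error is large close to the centroid and nearly vanishes far away, which matches the behavior described above.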



Examples


Embodiment Construction

[0067] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention.

[0068] The invention provides a method for classifying unbalanced data based on a hierarchical gravity model, comprising the following steps:

[0069] The first step is to obtain the sample set Z to be classified. Every sample in Z contains D attributes, where D is a positive integer.

[0070] The second step is to divide the attribute values of the samples into intervals. Each attribute is divided into L intervals, where L is a positive integer, so the samples are divided into L^D D-dimensional cubes, ...
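The interval division can be sketched as follows. This is a minimal illustration that assumes equal-width intervals between each attribute's minimum and maximum; the excerpt does not say how the interval boundaries are chosen, and the function name `cube_indices` is hypothetical.

```python
def cube_indices(samples, L):
    """Map each sample to the index tuple of the D-dimensional cube it falls in.

    Assumption: every attribute is split into L equal-width intervals
    between its observed minimum and maximum.
    """
    D = len(samples[0])
    lows = [min(s[d] for s in samples) for d in range(D)]
    highs = [max(s[d] for s in samples) for d in range(D)]
    cells = []
    for s in samples:
        cell = []
        for d in range(D):
            span = highs[d] - lows[d]
            if span == 0:
                # A constant attribute has only one meaningful interval.
                cell.append(0)
                continue
            k = int((s[d] - lows[d]) / span * L)
            # The maximum value belongs to the last interval.
            cell.append(min(k, L - 1))
        cells.append(tuple(cell))
    return cells


samples = [(0.0, 0.0), (0.5, 1.0), (1.0, 2.0), (0.26, 1.9)]
cells = cube_indices(samples, L=4)
print(cells)
```

With D = 2 attributes and L = 4 intervals there are 4^2 = 16 possible cubes, and each sample receives one index per attribute in the range 0 to L - 1.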



Abstract

The present invention discloses an imbalanced data classification method based on a hierarchy gravity model, and a system therefor. The method comprises the following steps: S1, acquiring a to-be-classified sample set Z, wherein each sample in Z comprises D attributes and D is a positive integer; S2, performing interval division on the attribute values of the samples, dividing each attribute into L intervals so that the samples are divided into L^D D-dimensional cubes, and calculating an attribute weight, wherein L is a positive integer; and S3, dividing the feature space, placing the to-be-classified sample set Z into the corresponding hierarchy D-dimensional cubes, and performing label classification on the samples in conjunction with a gravity model. The method and system have the following advantages: the attributes are weighted at different resolutions for a multi-hierarchy model, which improves the classification performance of the hierarchy model; classification efficiency is improved by dividing the attributes at different resolutions and establishing a hierarchy classification model; and the classification precision of data in overlapping regions of the space is optimized by using a partial gravity model.
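The idea of labelling a sample with a gravity model can be sketched in a few lines. This is not the patented procedure: it is a generic data-gravitation classifier that assumes unit masses, an inverse-square attraction, and no attribute weights or hierarchy, none of which are specified in the abstract. The names `gravitation` and `classify` are hypothetical.

```python
def gravitation(sample, points, eps=1e-12):
    """Total attraction that a set of training points exerts on a sample.

    Assumption: each point has unit mass and attracts with strength
    1 / distance^2; eps guards against division by zero.
    """
    total = 0.0
    for p in points:
        d2 = sum((a - b) ** 2 for a, b in zip(sample, p))
        total += 1.0 / (d2 + eps)
    return total


def classify(sample, train_by_label):
    """Assign the label whose training points attract the sample most strongly."""
    return max(train_by_label,
               key=lambda label: gravitation(sample, train_by_label[label]))


# Toy imbalanced training set: two positive samples, four negative samples.
train = {
    "positive": [(0.0, 0.0), (0.1, 0.0)],
    "negative": [(1.0, 1.0), (1.1, 1.0), (1.0, 1.2), (0.9, 1.1)],
}
label_a = classify((0.2, 0.1), train)
label_b = classify((1.0, 1.05), train)
print(label_a, label_b)
```

Because the attraction falls off with squared distance, the nearby minority samples dominate the decision for the first query even though the negative class has more members.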

Description

Technical field

[0001] The invention relates to the field of computer data analysis and mining, in particular to an unbalanced data classification method and system based on a hierarchical gravity model.

Background technique

[0002] An unbalanced data set is a data set in which the number of samples differs greatly between classes. In binary classification of unbalanced data sets, the class with few samples is usually called the positive class, and the class with many samples is called the negative class. Data imbalance is very common in current applications, such as medical diagnosis, intrusion detection, fraud prevention, and classification of objects in satellite images. The accuracy on the positive class is the main concern. For example, in disease diagnosis, a healthy person who is misdiagnosed will be cleared on re-examination, but a cancer patient may be misdiagnosed as normal, which may cause i...
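A small worked example shows why the accuracy on the positive class matters more than overall accuracy here. The data below are made up for illustration (5 positive and 95 negative samples), and the helper names `accuracy` and `positive_recall` are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def positive_recall(y_true, y_pred, positive=1):
    """Fraction of true positive-class samples that were predicted positive."""
    hits = sum(1 for t, p in zip(y_true, y_pred)
               if t == positive and p == positive)
    total = sum(1 for t in y_true if t == positive)
    return hits / total if total else 0.0


# Illustrative imbalanced data: 5 positive (1) and 95 negative (0) samples.
y_true = [1] * 5 + [0] * 95
# A trivial classifier that always predicts the negative class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
rec = positive_recall(y_true, y_pred)
print(acc, rec)
```

The trivial classifier reaches 95% overall accuracy while finding none of the positive samples, which is the failure mode the background section warns about.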


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/35
Inventor: 古平, 董振波, 王春元, 田洪泽, 杨炀, 张程, 李佳
Owner: DISCOVERY TECH SHENZHEN