Unbalanced data classification system based on geometric structure integration

A geometric-structure-based data classification technology, applied in the field of data processing. It addresses the lack of diversity among classifiers, the difficulty of determining the number of weak classifiers, and the complex technical structure of existing methods, and it achieves good geometric intuition and high learning efficiency.

Inactive Publication Date: 2019-03-19
EAST CHINA UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0006] Existing ensemble learning techniques lack intuition, have a complex technical structure, lack diversity among classifiers, and make it difficult to determine the number of weak classifiers.



Examples


Embodiment Construction

[0013] The present invention is further described below with reference to the accompanying drawings and embodiments. The method is divided into three modules.

[0014] Part 1: Input Module

[0015] The input module converts real-world data from the imbalanced problem into a data set in vector form for the subsequent modules to process.

[0016] For an input sample $i$, its vector representation is $x_i$, with dimension $d$, as follows:

[0017] $x_i = [x_{i,1}, x_{i,2}, \ldots, x_{i,d}] \in \mathbb{R}^d$

[0018] The input to the system consists of a set of minority class samples and a set of majority class samples. The minority class sample set is expressed as

[0019] where $n_{min}$ is the number of minority class samples. The majority class sample set is expressed as

[0020] where $n_{maj}$ is the number of majority class samples.
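The input step described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `split_by_class`, the integer label convention, and the `minority_label` parameter are assumptions made for illustration only. It checks that every sample is a vector of the same dimension d and separates the samples into a minority class set and a majority class set.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def split_by_class(
    samples: Sequence[Sequence[float]],
    labels: Sequence[int],
    minority_label: int = 1,  # hypothetical convention: 1 marks the minority class
) -> Tuple[List[Vector], List[Vector]]:
    """Return (minority_set, majority_set) as lists of d-dimensional vectors."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    if not samples:
        raise ValueError("at least one sample is required")

    d = len(samples[0])  # dimension shared by every sample vector
    minority: List[Vector] = []
    majority: List[Vector] = []
    for x, y in zip(samples, labels):
        if len(x) != d:
            raise ValueError("all samples must have the same dimension")
        vec = [float(v) for v in x]
        if y == minority_label:
            minority.append(vec)
        else:
            majority.append(vec)
    return minority, majority
```

With this sketch, `len(minority)` and `len(majority)` play the roles of the minority and majority sample counts named in the text.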

[0021] Part 2: Training Module

[0022] In this module, the collected minor...


Abstract

The invention discloses an unbalanced data classification system based on geometric structure integration. The system comprises the following modules: an input module, which converts the collected samples according to the specific description of the imbalanced problem to obtain a sample set in vector form, comprising minority class samples and majority class samples; a training module, which trains on the vector-form sample set to obtain the minority class decision region of the system; and a test module, which takes a sample to be classified and judges whether it lies in the minority class decision region of the system, thereby obtaining the class to which the sample belongs. The method designs weak classifiers using the supporting hyperplane principle, so that each weak classifier can identify different majority class samples and there is a division of labor among the weak classifiers. By combining the corresponding decision regions, the designed integration strategy can effectively identify both minority class and majority class samples, so that the imbalance problem is effectively addressed.
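The abstract names the principle (weak classifiers built from supporting hyperplanes, combined into a minority class decision region) but this excerpt does not give the construction. The following Python sketch is therefore only one possible reading, under loudly stated assumptions: each weak classifier is a half-space whose boundary touches the minority set, the direction of each hyperplane is chosen from a not-yet-excluded majority sample toward the minority centroid, and the minority class decision region is taken to be the intersection of all half-spaces. The names `train_half_spaces`, `classify`, and `max_classifiers` are hypothetical.

```python
from typing import List, Sequence, Tuple

HalfSpace = Tuple[List[float], float]  # (w, b); the region is w.x + b >= 0


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def train_half_spaces(
    minority: Sequence[Sequence[float]],
    majority: Sequence[Sequence[float]],
    max_classifiers: int = 50,  # assumed cap, not from the source
) -> List[HalfSpace]:
    """Greedy sketch: each half-space keeps every minority sample and
    excludes some of the majority samples that are still unexplained."""
    d = len(minority[0])
    center = [sum(x[j] for x in minority) / len(minority) for j in range(d)]
    remaining = [list(x) for x in majority]
    half_spaces: List[HalfSpace] = []
    while remaining and len(half_spaces) < max_classifiers:
        target = remaining[0]
        # Assumed heuristic: normal vector points from the majority sample
        # toward the minority centroid.
        w = [c - t for c, t in zip(center, target)]
        # Shift the hyperplane until it touches the minority set, so it is
        # a supporting hyperplane of that set.
        b = -min(dot(w, x) for x in minority)
        if dot(w, target) + b >= 0:
            # This direction cannot separate the target; skip it.
            remaining = remaining[1:]
            continue
        half_spaces.append((w, b))
        remaining = [x for x in remaining if dot(w, x) + b >= 0]
    return half_spaces


def classify(half_spaces: Sequence[HalfSpace], x: Sequence[float]) -> str:
    """Test step: a sample inside every half-space is labeled minority."""
    inside = all(dot(w, x) + b >= 0 for w, b in half_spaces)
    return "minority" if inside else "majority"
```

In this sketch the number of weak classifiers is not fixed in advance: the loop stops once no majority sample remains to be excluded, and each half-space handles a different subset of majority samples. Majority samples that lie inside the convex hull of the minority set cannot be excluded by any supporting hyperplane and would be labeled minority here; how the patented system treats such cases is not stated in this excerpt.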

Description

Technical field

[0001] The invention relates to the field of data processing, and in particular to an ensemble classification system based on the geometric structure of the sample distribution, designed to classify data whose sample distribution is imbalanced.

Background

[0002] The world is entering a new round of technological development and transformation, and artificial intelligence will be an important force driving it. Pattern recognition studies how to use computers to imitate or realize the recognition ability of humans or other animals, so that the objects under study can be recognized automatically. The concept of a linear (vector) space appears widely in many scientific fields. Using the concept of "space" establishes a close relationship with geometry. In the field of pattern recognition, many algorithms are likewise based on spatial projection and mapping. When traditional...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F16/35
Inventor: 王喆, 李冬冬, 朱宗海, 杜文莉
Owner: EAST CHINA UNIV OF SCI & TECH