
Self-adaptive integrated unbalanced data classification method based on Euclidean distance

A Euclidean-distance-based data classification technology, applied in character and pattern recognition, instruments, computer parts, and related fields. It addresses problems such as poorly performing base classifiers, poorly designed ensemble rules, insufficient classifier diversity, and low classification accuracy on unbalanced data. Its effect is to improve the accuracy of the ensemble output and to avoid degradation of generalization performance.

Status: Inactive
Publication Date: 2019-12-03
DALIAN UNIV
0 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

[0005] In ensemble learning, the sub-classifiers often lack diversity, and neither poorly performing base classifiers nor the design of the ensemble rule is taken into account. To solve these problems, this application proposes an unbalanced data classification method based on Euclidean-distance self-adaptive integration (ensembling), which improves classification accuracy on unbalanced data.




Embodiment Construction

[0049] Referring to Figure 1, which is a flowchart of the steps of the present invention, the implementation process is described in detail below. The embodiments are implemented on the premise of the technical solution of the present invention, and detailed implementation methods and specific operating procedures are given, but the scope of protection of the present invention is not limited to the following embodiments.

[0050] An unbalanced data classification method based on Euclidean-distance adaptive integration comprises generating a pool of candidate classifiers, dynamically selecting a set of base classifiers with strong classification ability, and adaptively integrating the outputs of those base classifiers. It includes the following steps in order:

[0051] (1) Preprocess the data to obtain a training set, a validation set, and a test set; and apply the random balance method to the training set to...
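The text of step (1) is cut off in this excerpt, so the following is only a minimal sketch of how a set of diversified, class-balanced subsets might be drawn from an unbalanced training set. It relies solely on the abstract's description ("a plurality of diversified balance subsets") and is a simplified stand-in, not the patent's random balance procedure. The function name `make_balanced_subsets`, the per-class target size, and resampling with replacement are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def make_balanced_subsets(samples, labels, n_subsets, seed=0):
    """Draw n_subsets class-balanced subsets from an unbalanced data set.

    Assumption: each subset picks one random per-class size between the
    smallest and largest class size, then under- or over-samples every
    class to that size. This is a simplification for illustration only.
    """
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    sizes = [len(xs) for xs in by_class.values()]
    smallest, largest = min(sizes), max(sizes)

    subsets = []
    for _ in range(n_subsets):
        # A different random target size per subset gives the diversity.
        per_class = rng.randint(smallest, largest)
        subset = []
        for y, xs in by_class.items():
            if per_class <= len(xs):
                # Undersample without replacement.
                picked = rng.sample(xs, per_class)
            else:
                # Oversample with replacement.
                picked = [rng.choice(xs) for _ in range(per_class)]
            subset.extend((x, y) for x in picked)
        rng.shuffle(subset)
        subsets.append(subset)
    return subsets
```

One base classifier would then be trained on each returned subset to form the candidate classifier pool. Plain duplication is used for oversampling here only to keep the sketch dependency-free.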



Abstract

The invention discloses a self-adaptive integrated unbalanced data classification method based on Euclidean distance. First, a random balance method is used to obtain several diversified balanced subsets, and a base classifier is built on each balanced subset. A classifier pre-selection algorithm is then applied before dynamic selection. Once the screened base classifiers are obtained, a new dynamic selection algorithm evaluates how each classifier performs on the samples in the region surrounding the sample to be classified; a classifier is considered more capable the more minority-class samples in that region it classifies correctly. Finally, the predictions of the selected base classifiers are combined using a distance-based adaptive integration rule, and the result is output. The method builds base classifiers on the generated diversified subsets, selects the sub-classifiers with the strongest classification ability through dynamic selection, and combines them with an integration rule designed to give a better output, thereby effectively improving classification accuracy on unbalanced data.
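The dynamic selection and distance-based integration described above can be illustrated with a small sketch. The abstract does not give the formulas, so everything specific here is an assumption: the function name `select_and_vote`, the neighbourhood size `k`, the inverse-distance weight `1 / (1 + d)`, the doubled credit for correctly classified minority-class neighbours, and the use of competence scores as vote weights. Classifiers are modelled as plain Python callables that map a feature tuple to a label.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def select_and_vote(classifiers, val_X, val_y, query, k=5, minority=1, top=2):
    """Pick the locally most competent classifiers and combine their votes.

    Assumed scoring: a classifier earns credit for every one of the k
    nearest validation samples it labels correctly; closer neighbours
    count more, and minority-class neighbours count double.
    """
    # Indices of the k validation samples closest to the query.
    neighbours = sorted(
        range(len(val_X)), key=lambda i: euclidean(val_X[i], query)
    )[:k]

    scores = []
    for clf in classifiers:
        score = 0.0
        for i in neighbours:
            if clf(val_X[i]) == val_y[i]:
                weight = 1.0 / (1.0 + euclidean(val_X[i], query))
                if val_y[i] == minority:
                    weight *= 2.0  # assumed bonus for minority-class hits
                score += weight
        scores.append(score)

    # Keep the `top` classifiers with the highest local competence.
    chosen = sorted(
        range(len(classifiers)), key=lambda j: scores[j], reverse=True
    )[:top]

    # Weighted vote: each chosen classifier votes with its competence score.
    votes = {}
    for j in chosen:
        label = classifiers[j](query)
        votes[label] = votes.get(label, 0.0) + scores[j]
    return max(votes, key=votes.get), chosen
```

A usage example with three toy classifiers on one-dimensional data:

```python
val_X = [(0.0,), (1.0,), (2.0,), (8.0,), (9.0,), (10.0,)]
val_y = [0, 0, 0, 1, 1, 1]
pool = [lambda x: 1 if x[0] > 5 else 0, lambda x: 0, lambda x: 1]
label, chosen = select_and_vote(pool, val_X, val_y, (9.5,), k=3)
```

Here the classifier that always predicts the majority class earns no credit near the query, so it is left out of the vote.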

Description

Technical field

[0001] The invention belongs to the field of artificial intelligence, and in particular relates to an unbalanced data classification method based on Euclidean-distance self-adaptive integration.

Background

[0002] Imbalanced data refers to training data in which the number of samples in one or more classes differs greatly from the number of samples in the other classes. According to research reports, the class imbalance problem occurs in a variety of real-world domains, such as facial age estimation, detecting oil spills in satellite images, anomaly detection, identifying fraudulent credit card transactions, software defect prediction, and image annotation. Researchers have therefore attached great importance to the problem of data imbalance and have held several symposiums and conferences on it, such as the Association for the Advancement of Artificial Intelligence (AAAI) 2000, the International Confer...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06K9/62
CPC: G06F18/285, G06F18/24147, G06F18/214
Inventor: 王宾, 陈东, 张强, 魏小鹏, 周昌军
Owner: DALIAN UNIV