
Data imbalance target identification method, system and device and storage medium

A target recognition technology for imbalanced data, applied in the field of machine learning. It addresses the problem that artificial samples cannot be guaranteed to be as authentic as real samples, and it improves the credibility of predictions.

Pending Publication Date: 2022-05-24
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

When the overall sample set is small, preprocessing the data by oversampling or undersampling may not achieve good results, because such preprocessing often generates many artificial samples through interpolation. On a small-scale problem, there is no guarantee that the artificial samples are as authentic as the real ones.
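To make the concern concrete, the following minimal Python sketch shows how interpolation-based oversampling produces artificial samples. It is an illustration only: the function names are hypothetical, and pairing random minority samples is a simplification of SMOTE-style methods, which interpolate toward nearest neighbours. Every synthetic point lies on a segment between two real samples, so with very few real samples there is little evidence that such points resemble genuine data.

```python
import random


def interpolate_sample(a, b, t):
    """Return the point a + t * (b - a) on the segment between a and b."""
    return [x + t * (y - x) for x, y in zip(a, b)]


def oversample_by_interpolation(minority, n_new, rng=None):
    """Create n_new artificial samples from random pairs of minority samples."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)  # two distinct real samples
        synthetic.append(interpolate_sample(a, b, rng.random()))
    return synthetic


# Three real minority samples (made-up values for illustration).
minority = [[0.0, 0.0], [1.0, 2.0], [2.0, 1.0]]
new_points = oversample_by_interpolation(minority, n_new=5)
```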



Examples


Detailed Description of Embodiments

[0061] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

[0062] Referring to Figure 1(a) and Figure 1(b): ensemble learning, an important branch of machine learning, has long been a research hotspot in the field. It is widely used in practical scenarios, with strong results. For classification, an ensemble learning method trains multiple base classifiers for the same problem, exploits their individual strengths and complementary characteristics, and combines their prediction results to predict the category of unknown samples. Ensemble learning thus offers another route to improving the generalization ability of an algorithm. In classification problems, ensemble learning ...
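As a minimal sketch of combining the predictions of several base classifiers, the Python snippet below uses plain majority voting. This is the simplest common combination rule, shown only for orientation; it is not the combination rule of the invention, and the function name and labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most base classifiers.

    Ties are broken in favour of the label that appears first in the list.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Outputs of three hypothetical base classifiers for one unknown sample.
combined = majority_vote(["cat", "dog", "cat"])
```

Majority voting weights every classifier equally, which is one reason it can favour the majority class on imbalanced data.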


Abstract

The invention discloses a data imbalance target recognition method, system, device and storage medium. The method comprises: generating a set of base classifiers from the training set samples; jointly applying the base classifiers generated from the training set to a validation set and computing the confusion probability matrix of each base classifier; dynamically selecting a preferred set of base classifiers for each sample to be predicted; and combining the outputs of the selected base classifiers based on Bayesian theory to obtain the final predicted category of the sample. By adding a dynamic selection step to a multi-classifier system, the method can find, for minority class samples, a set of classifiers better suited to predicting their class, which improves the credibility of minority class predictions. Bayesian theory is introduced in the classifier combination step: the confidence of each prediction result is assigned according to the base classifier's previous performance, so the resulting combined confidence vector better reflects detailed information about the category to which the sample belongs, which further improves the prediction confidence for minority class samples.
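The steps named in the abstract can be sketched roughly as follows. This is a hedged Python illustration, not the claimed implementation: the excerpt does not give the selection criterion or the exact combination formula, so the sketch assumes (a) Laplace-smoothed confusion probabilities, (b) selection by accuracy on the k nearest validation samples, and (c) a naive Bayes product that treats classifier outputs as conditionally independent given the true class. All function names, the toy data and the uniform priors are hypothetical.

```python
def confusion_probabilities(true_labels, predicted_labels, n_classes, alpha=1.0):
    """Estimate P(output = j | true class = i) from validation results.

    alpha is a Laplace smoothing constant (an assumption of this sketch).
    """
    counts = [[alpha] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    return [[c / sum(row) for c in row] for row in counts]


def select_classifiers(x, val_X, val_y, val_preds, k=3):
    """Keep the classifiers with the best accuracy on the k validation
    samples nearest to x (a stand-in for the unspecified selection rule)."""
    order = sorted(
        range(len(val_X)),
        key=lambda n: sum((a - b) ** 2 for a, b in zip(x, val_X[n])),
    )
    neighbours = order[:k]
    local_hits = [sum(preds[n] == val_y[n] for n in neighbours) for preds in val_preds]
    best = max(local_hits)
    return [c for c, hits in enumerate(local_hits) if hits == best]


def bayes_combine(outputs, matrices, priors):
    """Return a normalized confidence vector over the true class."""
    scores = list(priors)
    for out, matrix in zip(outputs, matrices):
        for i in range(len(scores)):
            scores[i] *= matrix[i][out]  # P(this output | true class i)
    total = sum(scores)
    return [s / total for s in scores]


# Toy validation set: class 1 is the minority class.
val_X = [[0.0], [0.1], [0.2], [1.0]]
val_y = [0, 0, 0, 1]
val_preds = [
    [0, 0, 0, 0],  # classifier 0 always predicts the majority class
    [0, 1, 0, 1],  # classifier 1 recognizes the minority sample
]
matrices = [confusion_probabilities(val_y, p, n_classes=2) for p in val_preds]

x = [0.95]  # sample to be predicted
selected = select_classifiers(x, val_X, val_y, val_preds, k=1)
outputs = {0: 0, 1: 1}  # hypothetical output of each classifier on x
confidence = bayes_combine(
    [outputs[c] for c in selected],
    [matrices[c] for c in selected],
    priors=[0.5, 0.5],  # uniform priors, assumed for illustration
)
```

The index of the largest entry in `confidence` is the predicted category. Because each output is weighted by how that classifier behaved on the validation set, a classifier that rarely outputs the minority class carries strong evidence when it does.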

Description

Technical field

[0001] The invention belongs to the technical field of machine learning, and in particular relates to a data imbalance target identification method, system, device and storage medium.

Background art

[0002] In recent years, research on machine learning and data mining has become increasingly popular, and related applications are bringing more and more practical value to the world. Supervised learning, one of the major problems in machine learning, is often used for target classification or identity recognition. It works by training a machine learning model on data samples of known categories and then predicting the category of unknown samples with the resulting model. In classification problems, this trained model is called a "classifier". As more and more models and algorithms move from academia to industry, many difficulties have gradually emerged and affected the deployment of these algorithms. Data imbalance can be said ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N20/20
CPC: G06N20/20, G06F18/24155, G06F18/214
Inventors: 宋楠, 朱洪艳
Owner: XI AN JIAOTONG UNIV