Unbalanced data integration and classification method combining resampling and cost-sensitive learning
The technology combines cost-sensitive learning with data integration and is applied in instruments, character and pattern recognition, computer components, and related fields. It addresses the problems of ignoring the information carried by different test samples, loss, and high sensitivity to outliers and noise points.
Examples
Embodiment
[0079] This embodiment provides an unbalanced data integration classification method that combines resampling technology and cost-sensitive learning. The flow chart is shown in Figure 1. The method includes the following steps:
[0080] Step 1. Input training data set
[0081] Input an unbalanced data set X to be classified. Each row of X corresponds to a sample, and each column corresponds to an attribute. X is randomly divided into a training set (66% of the samples) and a test set (the remaining 34%).
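The random 66% / 34% split in Step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_dataset`, the fixed seed, and the toy data are assumptions, and class labels are left out for brevity.

```python
import random


def split_dataset(X, train_fraction=0.66, seed=0):
    """Randomly split the rows (samples) of X into a training and a test set.

    Hypothetical helper for illustration; the source only states that X is
    randomly divided into 66% training data and 34% test data.
    """
    indices = list(range(len(X)))
    rng = random.Random(seed)  # seeded so the split is reproducible
    rng.shuffle(indices)
    n_train = round(train_fraction * len(X))
    train = [X[i] for i in indices[:n_train]]
    test = [X[i] for i in indices[n_train:]]
    return train, test


# Toy data set (assumed): 100 samples (rows), 2 attributes (columns).
X = [[float(i), float(i % 7)] for i in range(100)]
train, test = split_dataset(X)
print(len(train), len(test))
```

A purely random split like this does not guarantee that the minority class appears in both parts in the same proportion. The source does not say whether the split is stratified.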
[0082] Step 2. Calculate the relative density of the spatial distribution of training samples
[0083] Define the class with the larger sample size as the negative class; its data points in the training set form the set $T_n = \{x_1, x_2, \ldots, x_l\}$. The class with the smaller sample size is the positive class; its data points in the training set form the set $T_p = \{x_{l+1}, x_{l+2}, \ldots, x_n\}$, where $l \gg n - l$, i.e. the negative class is far larger than the positive class;
[0084] Starting from a particular data point $x_i$ in $T_n$, calculate its differe...
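The class definition in paragraph [0083] can be sketched as follows: the larger class becomes the negative set T_n and the smaller class the positive set T_p. This is a minimal illustration that assumes exactly two classes; the function name `split_by_class_size` and the toy labels are hypothetical. It covers only the partition into T_n and T_p, not the relative density calculation.

```python
from collections import Counter


def split_by_class_size(samples, labels):
    """Return (T_n, T_p): majority-class and minority-class samples.

    Hypothetical helper; assumes a two-class problem, as the source
    describes one negative (large) and one positive (small) class.
    """
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    # most_common sorts by count, so the majority label comes first.
    (neg_label, _), (pos_label, _) = counts.most_common(2)
    T_n = [x for x, y in zip(samples, labels) if y == neg_label]
    T_p = [x for x, y in zip(samples, labels) if y == pos_label]
    return T_n, T_p


# Toy training set (assumed): four samples of class "a", one of class "b".
samples = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [0.5, 0.5], [0.9, 0.8]]
labels = ["a", "a", "a", "a", "b"]
T_n, T_p = split_by_class_size(samples, labels)
print(len(T_n), len(T_p))
```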