Characteristic selection method based on sample characteristic distribution confusion degree

A feature selection method based on sample feature distributions, applied in the field of dimensionality reduction, addresses problems such as poor generalization ability and high computational complexity, and achieves fast calculation speed and strong recognition ability.

Inactive Publication Date: 2017-10-24
DALIAN MARITIME UNIVERSITY

AI Technical Summary

Problems solved by technology

The Filter feature selection algorithm has high selection efficiency, a relatively small amount of calculation, fast speed, and strong versatility, and it can evaluate features quickly, but the performance of the selected feature subset is usually poor.
The feature subset selected by the Wrapper feature selection algorithm has a good classification effect and a relatively small feature dimension, but the computational complexity is high, the speed is slower than that of the Filter method, and the generalization ability is poor.
The Embedded feature selection method embeds the selection process into the learning algorithm, so it can only be adapted to certain types of algorithms.

Embodiment Construction

[0047] To make the purpose, technical solutions, and advantages of the present invention clearer, the technical solutions are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

[0048] The present invention designs a staged hybrid feature selection method that combines the Filter and Wrapper feature selection algorithms and completes feature selection in stages: the Filter method is first used to quickly evaluate the features, and feature selection is then completed with a specific classifier. Specifically, the method includes ...
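
The staged idea in [0048] can be illustrated with a minimal Python sketch. The source text is truncated before it specifies the search strategy, so the greedy forward pass below is an assumption, and the names `staged_selection`, `ranked_features`, and `evaluate` are hypothetical. The sketch takes a feature ranking produced by some Filter stage and a caller-supplied function that scores a candidate subset with a classifier.

```python
def staged_selection(ranked_features, evaluate):
    """Wrapper stage over a Filter ranking (assumed greedy forward pass).

    ranked_features: feature indices, best-ranked first (output of a Filter stage).
    evaluate: hypothetical callback returning a classifier score for a subset.
    """
    selected, best = [], float("-inf")
    for f in ranked_features:
        candidate = selected + [f]
        score = evaluate(candidate)
        # Keep the feature only if it improves the classifier score.
        if score > best:
            selected, best = candidate, score
    return selected, best
```

With this strategy the classifier is evaluated once per ranked feature, so the number of evaluations grows linearly with the number of features rather than with the number of possible subsets.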

Abstract

The present invention discloses a feature selection method based on the confusion degree of the sample feature distribution, comprising the following steps: 1) sorting, in ascending order, the set of values taken by each feature fi for each class of samples in data set X; 2) determining the bounded value range of feature fi for each class of samples; 3) determining the number of confused samples for the i-th feature for the M classes of samples in the data set, calculating the feature distribution confusion degree (Confusion value) of the i-th feature in data set X, and obtaining the Confusion value of every feature in data set X in the same way; 4) ranking the features in data set X by importance according to the obtained Confusion values to obtain an ordered feature set F; and 5) using a classifier, under a preset subset search strategy, to search the ordered feature set F, or a subset Fsub formed from some of the features in F, to obtain the desired feature subset D. The method selects feature subsets with better performance, improves the recognition ability of the selected feature subset, and reduces the number of searches in the subset search process.
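
Steps 1) to 4) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented formula, which the abstract does not give. It assumes that a class's bounded range is the [min, max] interval of its values for the feature, that a sample counts as confused when its value falls inside another class's range, that the Confusion value is the fraction of confused samples, and that a lower Confusion value means a more important feature. All function names are hypothetical.

```python
from collections import defaultdict

def class_ranges(values, labels):
    """Steps 1-2 (assumed form): sort each class's values, take (min, max) as its range."""
    by_class = defaultdict(list)
    for v, y in zip(values, labels):
        by_class[y].append(v)
    ranges = {}
    for y, vs in by_class.items():
        vs.sort()  # ascending order, as in step 1
        ranges[y] = (vs[0], vs[-1])
    return ranges

def confusion(values, labels):
    """Step 3 (assumed form): fraction of samples lying inside another class's range."""
    ranges = class_ranges(values, labels)
    confused = 0
    for v, y in zip(values, labels):
        if any(lo <= v <= hi for c, (lo, hi) in ranges.items() if c != y):
            confused += 1
    return confused / len(values)

def rank_features(X, labels):
    """Step 4 (assumed direction): order feature indices by ascending Confusion value."""
    n_features = len(X[0])
    scores = [confusion([row[i] for row in X], labels) for i in range(n_features)]
    order = sorted(range(n_features), key=lambda i: scores[i])
    return order, scores
```

For example, with `X = [[1.0, 5.0], [2.0, 1.0], [8.0, 4.0], [9.0, 2.0]]` and `labels = [0, 0, 1, 1]`, the first feature separates the two classes completely and gets a Confusion value of 0.0, while the second feature's class ranges overlap and it gets 0.5, so the first feature is ranked ahead of the second.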

Description

Technical field

[0001] The invention relates to a dimensionality reduction method, in particular to a feature selection method based on the confusion degree of sample feature value distribution.

Background

[0002] Feature selection is a method of feature dimensionality reduction. Unlike dimensionality reduction methods such as principal component analysis, feature selection solves a combinatorial optimization problem whose amount of computation increases exponentially with the number of features. Without prior knowledge of the specific research field, selecting a feature subset that contains all important information from the original feature set requires exhaustively traversing all possible feature subsets. In that case, even a moderately large number of features makes the amount of calculation very large. Feature selection mainly includes four basic steps: generation of candidate fe...
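
The exponential growth described in [0002] follows from a standard counting fact: a set of n features has 2^n - 1 non-empty subsets. The short Python sketch below enumerates them for a small n and counts them for a larger one. The function names are hypothetical and serve only as illustration.

```python
from itertools import combinations

def all_nonempty_subsets(features):
    """Yield every non-empty subset of the given features (exhaustive search space)."""
    for k in range(1, len(features) + 1):
        yield from combinations(features, k)

def subset_count(n):
    """Number of non-empty subsets of n features."""
    return 2 ** n - 1
```

Four features give 15 candidate subsets, while 30 features already give 1,073,741,823, which is why exhaustive traversal quickly becomes impractical.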

Application Information

IPC(8): G06K9/62
CPC: G06F18/28, G06F18/214, G06F18/24
Inventor: 王演邱东杰史晓非于丽丽巴海木祖成玉
Owner: DALIAN MARITIME UNIVERSITY