Feature selection method based on hierarchical progression of filter-type and wrapper-type methods

A feature selection method combining filter and wrapper techniques, applied in character and pattern recognition, complex mathematical operations, instruments, etc. It addresses the problems of redundant and irrelevant features in data sets, so as to reduce consumption, improve model performance, and give a good evaluation effect.

Inactive Publication Date: 2021-08-10
HARBIN UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0004] To address irrelevant or redundant features in data sets, the present invention discloses a feature selection method based on filter and wrapper (encapsulation) approaches.


Examples


Embodiment Construction

[0030] In order to clearly and completely describe the technical solutions in the embodiments of the present invention, the present invention will be further described in detail below in conjunction with the drawings in the embodiments.

[0031] Taking feature selection on a breast cancer data set collected by a cancer hospital as an example, the process of the filter- and wrapper-based feature selection method in this embodiment is shown in Figure 1 and includes the following steps.

[0032] Step 1: Features are ranked using the filter-type variance ranking and information gain methods and the wrapper-type Boruta ranking method, as follows:

[0033] Step 1-1: The variance ranking method and the information gain method (both filter-type) and the Boruta ranking method (wrapper-type) are each used to rank the features by importance, from largest to smallest;
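The two filter-type rankings in Step 1-1 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy columns and labels are hypothetical rather than taken from the breast cancer data set, the features are treated as discrete so that information gain can be computed from simple counts, the variance is the population variance (the patent's own formula is not reproduced here), and the wrapper-type Boruta ranking is not implemented.

```python
import math
from collections import Counter


def variance(values):
    """Population variance of one feature column."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Label entropy minus the conditional entropy given a discrete feature."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(scores):
    """Return feature names sorted by score, largest first."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: three discrete features and a binary label.
columns = {
    "f_informative": [0, 0, 1, 1, 0, 1],
    "f_constant": [1, 1, 1, 1, 1, 1],
    "f_noisy": [0, 1, 0, 1, 1, 0],
}
labels = [0, 0, 1, 1, 0, 1]

variance_scores = {name: variance(col) for name, col in columns.items()}
gain_scores = {name: information_gain(col, labels) for name, col in columns.items()}

variance_ranking = rank_features(variance_scores)
gain_ranking = rank_features(gain_scores)
print("variance ranking:", variance_ranking)
print("information gain ranking:", gain_ranking)
```

In this toy case the constant feature lands last in both rankings, since it has zero variance and carries no information about the label.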

[0034] Among them, the calculation formula of the varianc...


Abstract

The invention relates to a feature selection method based on a hierarchical progression of filter-type and wrapper-type methods. The method comprises the following steps: first, the features are ranked using the filter-type variance ranking method, the filter-type information gain ranking method, and the wrapper-type Boruta ranking method; ranks are assigned to the sorted features according to their importance, and the results of the three ranking methods are fused to obtain a feature fusion result; next, the correlation between every pair of features is calculated using Pearson's correlation coefficient; a threshold on the Pearson correlation coefficient is then set, and some features are selectively deleted according to the correlation between features; finally, the best feature combination is found using the wrapper-type sequential forward selection method in combination with a random forest model, thereby obtaining the optimal feature subset. The method is effective at selecting the optimal feature subset for a data set and provides relatively accurate feature information to the learning model, so that the accuracy of the learning model is improved.
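The later stages named in the abstract (rank fusion, Pearson-threshold pruning, sequential forward selection) can be sketched as follows. Several details are assumptions, because this record does not specify them: ranks are fused by summing positions, the lower-ranked feature of a highly correlated pair is the one dropped, the threshold of 0.9 is arbitrary, and the `toy_score` function is a hypothetical stand-in for the cross-validated random forest score the method would use.

```python
import math


def fuse_ranks(rankings):
    """Sum each feature's 1-based position across rankings (assumed fusion rule)."""
    totals = {}
    for ranking in rankings:
        for position, name in enumerate(ranking, start=1):
            totals[name] = totals.get(name, 0) + position
    return sorted(totals, key=totals.get)  # lowest total = most important


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either column is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def drop_correlated(ordered, columns, threshold):
    """Keep a feature only if |r| with every already-kept feature is <= threshold."""
    kept = []
    for name in ordered:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def sequential_forward_selection(candidates, score):
    """Greedily add the feature that most improves score; stop when none does."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        trial = {f: score(selected + [f]) for f in remaining}
        top = max(trial, key=trial.get)
        if trial[top] <= best:
            break
        selected.append(top)
        remaining.remove(top)
        best = trial[top]
    return selected, best


# Hypothetical toy data and rankings (three rankings, as in the method).
columns = {
    "a": [1, 2, 3, 4, 5],
    "a_copy": [2, 4, 6, 8, 10],  # perfectly correlated with "a"
    "b": [5, 1, 4, 2, 3],
    "c": [1, 1, 2, 1, 2],
}
rankings = [
    ["a", "a_copy", "b", "c"],
    ["a", "b", "a_copy", "c"],
    ["a_copy", "a", "b", "c"],
]

# Stand-in for a random forest score: made-up usefulness minus a size penalty.
usefulness = {"a": 0.6, "b": 0.2, "c": 0.02}


def toy_score(subset):
    return sum(usefulness[f] for f in subset) - 0.05 * len(subset)


fused = fuse_ranks(rankings)
pruned = drop_correlated(fused, columns, threshold=0.9)
selected, best = sequential_forward_selection(pruned, toy_score)
print(fused, pruned, selected)
```

Passing the score as a function keeps the search independent of the model, so a real evaluation (for example, cross-validated accuracy of a random forest on the candidate subset) could be swapped in for `toy_score` without changing the selection loop.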

Description

Technical field:

[0001] The invention relates to a feature selection method based on filter and wrapper (encapsulation) approaches, which is well suited to feature selection on data sets.

Background technique:

[0002] Data sets contain large numbers of redundant and irrelevant features, which pose great challenges for data mining and seriously affect the accuracy and scientific validity of its results. Therefore, irrelevant and redundant features in the data set need to be processed before data mining.

[0003] Feature selection is also called feature subset selection. It removes irrelevant or redundant features from the original features, reducing the number of features to find the optimal feature subset, improving model accuracy, and reducing running time. Feature selection methods can be divided into filter and wrapper types according to whether they are independent of the subsequent learning algorithm. Th...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/15, G06K9/62
CPC: G06F17/15, G06F18/24323, G06F18/25
Inventors: 李思琪, 苗世迪, 胡晓慧, 王瑞涛
Owner: HARBIN UNIV OF SCI & TECH