Improved feature evaluation method based on mutual information

A feature evaluation technology based on mutual information, applied to instruments, character and pattern recognition, computer components, etc. It addresses the inability to efficiently evaluate the validity of complex signal features and allows feature selection tasks to be completed more efficiently.

Status: Inactive
Publication date: 2018-09-21
TIANJIN UNIV
Cites: 0
Cited by: 3

AI Technical Summary

Problems solved by technology

[0023] The existing evaluation criteria based on mutual information cannot effic...

Examples

Specific example

[0050] 1) Suppose a feature subset of dimension 5 is given, S = {S_1, S_2, S_3, S_4, S_5}, where each feature contains 10 samples.

[0051] The data for the feature subset is:

[0052] Let the category labels of the data be L = [1 1 1 1 1 0 0 0 0 0]' (a column vector with one label per sample).

[0053] 2) Calculate the correlation (relevance) D(S,L) between the feature subset and the category labels as:

[0054] D(S,L) = I(S_1;L) + I(S_2;L) + I(S_3;L) + I(S_4;L) + I(S_5;L)

[0055] ≈ 0.3377 + 0.5 + 0.3377 + 0.1979 + 0.3195

[0056] ≈ 1.6929
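
The relevance step can be sketched in code. The following is a minimal illustration, not the patent's implementation: it assumes discrete-valued features, a plug-in (frequency-count) estimate of mutual information, and base-2 logarithms, none of which the source states. The function names `mutual_information` and `relevance` and the two example features are hypothetical; only the label vector L is taken from paragraph [0052], because the feature data itself is not reproduced in the source.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences.

    Assumption: probabilities are estimated by relative frequencies.
    """
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # Sum over observed (a, b) pairs of p(a,b) * log2( p(a,b) / (p(a) p(b)) ).
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def relevance(features, labels):
    """D(S, L): sum of I(S_i; L) over every feature S_i in the subset."""
    return sum(mutual_information(f, labels) for f in features)


# Category labels from paragraph [0052].
L = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Hypothetical features, NOT the patent's data.
S_copy = list(L)      # carries all label information: I = 1 bit
S_const = [0] * 10    # carries no label information: I = 0 bits

print(relevance([S_copy, S_const], L))  # 1.0
```

With real-valued signal features, a discretization step or a different estimator would be needed before this calculation applies; the source does not say which one the authors used.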

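[0057] 3) Calculate the redundancy R among the features in the feature subset as:

[0058] R ≈ 0.4492 (the value implied by D(S,L) - Eva = 1.6929 - 1.2437; the expression for R is not reproduced in the source)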

[0059] 4) Calculate the evaluation value Eva of the feature subset as:

[0060] Eva = D(S,L) - R = 1.2437

[0061] From the above calculation, the evaluation value of the feature subset S = {S_1, S_2, S_3, S_4, S_5} is 1.2437.
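
The three steps of the example (relevance, redundancy, evaluation value) can be put together in a short sketch. It rests on several assumptions: the source does not reproduce the redundancy formula, so R is taken here as the mean mutual information over all distinct feature pairs, which is one reading of "the average value of the mutual information of all the features" in the abstract. The plug-in estimator, the base-2 logarithm, the names `mi` and `evaluate_subset`, and the example features are likewise hypothetical and do not come from the patent.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mi(x, y):
    """Plug-in estimate of I(X;Y) in bits for two discrete sequences (assumed estimator)."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def evaluate_subset(features, labels):
    """Return (D, R, Eva) for a feature subset.

    D   = sum of I(S_i; L) over the features (relevance).
    R   = mean of I(S_i; S_j) over distinct pairs i < j (ASSUMED normalization).
    Eva = D - R, as in paragraph [0060].
    """
    d = sum(mi(f, labels) for f in features)
    pairs = list(combinations(features, 2))
    r = sum(mi(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return d, r, d - r


# Category labels from paragraph [0052]; the features below are hypothetical.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
f1 = list(labels)   # perfectly informative feature
f2 = list(labels)   # exact duplicate of f1

print(evaluate_subset([f1], labels))      # (1.0, 0.0, 1.0)
print(evaluate_subset([f1, f2], labels))  # (2.0, 1.0, 1.0)
```

In this toy case, adding an exact duplicate of an informative feature raises D and R by the same amount, so Eva does not change. That is the behavior the method aims for when it weighs redundancy against relevance.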

Abstract

The invention discloses an improved feature evaluation method based on mutual information. The input is a data set whose feature subset has dimension m, where each feature contains a plurality of samples. The relevance of the feature subset is calculated as the sum of the mutual information between each feature in the subset and the target category label. The redundancy of the feature subset is calculated as the average of the mutual information among the features in the subset. The evaluation value of the feature subset is then calculated from these two quantities. Because the method considers both redundancy and relevance when evaluating the effectiveness of complex signal features in practical applications, it addresses the difficulty that existing feature selection evaluation criteria have in measuring feature effectiveness, allows the feature selection task to be completed more efficiently, and ultimately improves the efficiency of data mining and pattern recognition.

Description

Technical field

[0001] The invention relates to a feature evaluation method. In particular, it relates to an improved feature evaluation method based on mutual information, addressing the problem that the effectiveness of complex signal features cannot be evaluated efficiently in feature selection.

Background art

[0002] 1. The concept of feature selection

[0003] With the development of data acquisition and storage technology, high-dimensional data exist widely in fields such as nature, finance, industry, and biomedicine, and contain complex nonlinear relationships among multiple features. Finding potentially useful information and building predictive models from high-dimensional data has become one of the most important aspects of data mining and knowledge discovery. Although high-dimensional data can provide rich information, it becomes increasingly difficult to build accurate predictive models as the dimensionality and scale of datasets continue to increase. At the same time, the exi...

Application Information

IPC(8): G06K9/00
CPC: G06F2218/08, G06F2218/12
Inventors: 张涛, 丁碧云, 赵鑫
Owner: TIANJIN UNIV