
Feature selection method based on attribute condition redundancy

A feature selection method, applicable to fields such as computer components, instruments, and character and pattern recognition, that addresses two problems: redundant information in the selected feature subset and the inability to measure three-way feature interaction.

Pending Publication Date: 2021-08-06
XIAN UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a feature selection method based on attribute condition redundancy. It addresses two shortcomings of existing feature selection frameworks: they cannot measure three-way feature interaction, and the feature subsets they select contain more redundant information.



Embodiment Construction

[0028] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0029] Definition 1 Entropy: In statistics, entropy is a measure of the uncertainty of random variables. The greater the degree of uncertainty of an event, the greater the entropy and the greater the amount of information. Entropy is defined as follows:

[0030] $$H(Y) = -\sum_{y} p(y)\,\log p(y)$$

[0031] where Y represents a random variable, y is a possible value of Y, and p(y) is the probability mass function of Y. If Y is regarded as the class attribute, then feature selection based on mutual information aims to reduce the uncertainty of the class by selecting some features, so it is necessary to study the impact of features on the class.
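As an illustration of Definition 1, entropy can be estimated from a sample by replacing p(y) with relative frequencies. The sketch below is not taken from the patent: the function name `entropy`, the `base` parameter, and the choice of base-2 logarithms are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(values, base=2.0):
    """Empirical Shannon entropy of a sequence of discrete values.

    Assumption: p(y) is estimated by the relative frequency of y in `values`.
    """
    n = len(values)
    counts = Counter(values)
    # Sum -p(y) * log p(y) over the values that actually occur.
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())


if __name__ == "__main__":
    print(entropy([0, 1, 0, 1]))   # two equally likely classes
    print(entropy(["a"] * 4))      # a single class, no uncertainty
```

A variable that always takes the same value has zero entropy, while a variable that is uniform over k values reaches the maximum, log k.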

[0032] Definition 2 Conditional entropy: Conditional entropy measures the uncertainty of random variables based on the premise that a certain variable is known. Conditional entropy is defined as follows:

[0033] $$H(Y \mid X) = -\sum_{x}\sum_{y} p(x, y)\,\log p(y \mid x)$$

[0034] where...
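Conditional entropy as described in Definition 2 can be estimated with the chain-rule identity H(Y|X) = H(X, Y) - H(X). The following is a minimal sketch, not the patent's own code: the symbol X for the known variable, the helper names, and the base-2 logarithm are all assumptions.

```python
import math
from collections import Counter


def entropy(values, base=2.0):
    """Empirical Shannon entropy of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log(c / n, base) for c in Counter(values).values())


def conditional_entropy(y, x, base=2.0):
    """Empirical H(Y | X), computed as H(X, Y) - H(X).

    `x` and `y` are equal-length sequences of discrete observations;
    the joint variable is represented by the sequence of (x, y) pairs.
    """
    return entropy(list(zip(x, y)), base) - entropy(x, base)


if __name__ == "__main__":
    x = [0, 0, 1, 1]
    print(conditional_entropy([0, 0, 1, 1], x))  # Y is fully determined by X
    print(conditional_entropy([0, 1, 0, 1], x))  # Y is independent of X
```

When knowing X removes all uncertainty about Y the result is zero; when X carries no information about Y the result equals H(Y).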


Abstract

The invention discloses a feature selection method based on attribute condition redundancy. The method comprises the following steps: step 1, preprocessing a data set; step 2, dividing the preprocessed data set into a training data set and a test data set; step 3, calculating the mutual information between each candidate feature in the training data set and the class label, selecting the feature with the largest mutual information, deleting it from the original data set, and adding it to a set S that is initially empty; then iteratively performing feature selection according to a feature selection algorithm based on attribute condition redundancy, adding the feature selected in each iteration to S, until a feature subset of size m is obtained. The method uses two measures of redundant information between features, namely redundancy and attribute condition redundancy, so that redundant features are removed more accurately and the classification accuracy on the test data is improved.
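The greedy loop in step 3 can be sketched as follows. This excerpt does not give the attribute condition redundancy criterion itself, so the sketch takes the per-iteration scoring function as a parameter and, purely as a stand-in, defaults to an mRMR-style score (relevance minus mean pairwise redundancy). That default is NOT the patent's criterion. All function names and the dict-of-lists data layout are assumptions, and features are assumed to be already discretized.

```python
import math
from collections import Counter


def entropy(values):
    """Empirical Shannon entropy (base 2) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(a, b):
    """Empirical I(A; B) = H(A) + H(B) - H(A, B)."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def mrmr_style_score(candidate, selected, labels):
    """Stand-in criterion (not the patent's): relevance minus mean redundancy."""
    relevance = mutual_information(candidate, labels)
    redundancy = sum(mutual_information(candidate, s) for s in selected) / len(selected)
    return relevance - redundancy


def greedy_select(features, labels, m, score=mrmr_style_score):
    """Select up to m feature names from `features` (dict: name -> values)."""
    remaining = dict(features)
    # First pick: the feature with the largest mutual information with the label.
    first = max(remaining, key=lambda f: mutual_information(remaining[f], labels))
    selected = [first]  # the set S from the abstract, kept in selection order
    del remaining[first]
    # Later picks: maximize the pluggable criterion given the features already in S.
    while remaining and len(selected) < m:
        chosen = [features[s] for s in selected]
        best = max(remaining, key=lambda f: score(remaining[f], chosen, labels))
        selected.append(best)
        del remaining[best]
    return selected


if __name__ == "__main__":
    labels = [0, 1, 2, 3, 0, 1, 2, 3]
    features = {
        "x1": [0, 0, 1, 1, 0, 0, 1, 1],
        "x1_copy": [0, 0, 1, 1, 0, 0, 1, 1],  # exact duplicate of x1
        "x2": [0, 1, 0, 1, 0, 1, 0, 1],
    }
    print(greedy_select(features, labels, m=2))
```

Passing the criterion as the `score` argument keeps the loop unchanged if a different redundancy measure, such as the one the abstract names, is substituted for the stand-in.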

Description

Technical field

[0001] The invention belongs to the technical field of data mining and relates to a feature selection method based on attribute condition redundancy.

Background

[0002] With the rapid development of computer technology and data storage capacity, data processing capabilities have made a qualitative leap, and big data of many kinds is now used in scientific research and social life. However, the continuous growth in data dimensionality makes data processing increasingly difficult, so research on dimensionality reduction is both urgent and important. Dimensionality reduction not only reduces the feature dimension effectively but also reduces the storage space required for the feature data. Because the key features are retained, it also makes it easier to obtain key information among the main features and reduces the time that subsequent algorithmic models spend searching for key features. Dimensionality reduction te...


Application Information

IPC(8): G06K9/62
CPC: G06F18/211, G06F18/214
Inventor: 周红芳, 朱柔柔
Owner: XIAN UNIV OF TECH