Feature selection algorithm based on dynamic programming and K-means clustering

A K-means clustering and dynamic programming technology, applied in computing, computer parts, and character and pattern recognition. It addresses three problems: K-means easily falls into a local optimum, the number of clusters cannot be determined in advance, and the selected feature subset cannot be guaranteed to be low in noise and strongly correlated.

Inactive Publication Date: 2016-10-12
SOUTH CHINA UNIV OF TECH
Cites: 0; Cited by: 13

AI Technical Summary

Problems solved by technology

However, the K-means clustering algorithm has two drawbacks of its own: the number of clusters cannot be determined in advance, and the algorithm easily falls into a local optimum. In addition, when K-means clustering is used alone for feature selection, it cannot guarantee that the selected feature subset consists of low-noise, strongly correlated features.
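
The sensitivity to the initial centers can be seen on a toy example. The following is a minimal sketch in plain Python, not code from the patent; the function names `sq_dist` and `lloyd` and the four-point data set are illustrative assumptions. It runs standard Lloyd iterations from two different starting positions and compares the resulting sum of squared errors (SSE).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd(points, centers, max_iter=100):
    """Plain K-means (Lloyd) iterations from the given initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster.
        updated = [
            tuple(sum(v) / len(cl) for v in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if updated == centers:  # converged
            break
        centers = updated
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


# Four corners of a 4 x 1 rectangle; the natural clusters are left and right.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, sse_good = lloyd(points, [(0, 0), (4, 0)])  # starts near the natural clusters
_, sse_bad = lloyd(points, [(2, 0), (2, 1)])   # converges to a top/bottom split
print(sse_good, sse_bad)
```

Both runs converge, but the second start stops at a top/bottom split whose SSE is sixteen times larger than that of the left/right split. This is the local-optimum behaviour described above.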

Examples

Embodiment Construction

[0069] The present invention will be further described below in conjunction with specific examples.

[0070] As shown in Figure 1, the feature selection algorithm based on dynamic programming and K-means clustering described in this embodiment, i.e. the DKFS (Dynamic programming and K-means clustering Feature Selection) algorithm, comprises the following steps:

[0071] 1) Apply suitable data preprocessing methods to resolve problems in the feature data such as duplicated records and missing attribute values;

[0072] 2) Following the core idea of dynamic programming, pre-select feature subsets, using the between-class and within-class distance as the performance function in the dynamic programming decision process;

[0073] 3) Improve the original K-means clustering algorithm, focusing on determining the number of clusters and selecting the initial center points, and introd...
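
Steps 1) and 2) can be illustrated with a short sketch. It is not the patent's procedure: mean imputation for missing values, the ratio form of the criterion (between-class scatter divided by within-class scatter), and the names `preprocess` and `separability` are all assumptions made for illustration, since the excerpt names the ingredients but not the formulas.

```python
from statistics import fmean


def preprocess(features, labels):
    """Drop exact duplicate records, then fill missing values (None)
    with the column mean. Mean imputation is an assumption."""
    seen, rows, kept_labels = set(), [], []
    for row, y in zip(features, labels):
        key = (tuple(row), y)
        if key not in seen:
            seen.add(key)
            rows.append(list(row))
            kept_labels.append(y)
    for j in range(len(rows[0])):
        fill = fmean(r[j] for r in rows if r[j] is not None)
        for r in rows:
            if r[j] is None:
                r[j] = fill
    return rows, kept_labels


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def separability(features, labels, subset):
    """Between-class scatter over within-class scatter, using only the
    feature columns listed in `subset` (hypothetical criterion)."""
    pts = [[row[j] for j in subset] for row in features]
    overall = [fmean(col) for col in zip(*pts)]
    between = within = 0.0
    for cls in set(labels):
        members = [p for p, y in zip(pts, labels) if y == cls]
        center = [fmean(col) for col in zip(*members)]
        between += len(members) * sq_dist(center, overall)
        within += sum(sq_dist(p, center) for p in members)
    return between / within if within > 0 else float("inf")


# Toy data: feature 0 separates the classes, feature 1 is mostly noise.
raw = [[1.0, 5.0], [1.0, 5.0], [2.0, 3.0], [9.0, None], [8.0, 3.0]]
raw_labels = ["a", "a", "a", "b", "b"]
clean, clean_labels = preprocess(raw, raw_labels)
score0 = separability(clean, clean_labels, [0])
score1 = separability(clean, clean_labels, [1])
print(len(clean), score0 > score1)
```

A higher score means the classes are far apart relative to their internal spread, so a criterion of this kind favours feature 0 over feature 1 on the toy data.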

Abstract

The invention discloses a feature selection algorithm based on dynamic programming and K-means clustering. The algorithm comprises: 1) data preprocessing, mainly to resolve duplicated records and missing attribute values in the feature data; 2) pre-selection of feature subsets following the core idea of dynamic programming, with the within-class and between-class distance used as the performance function in the dynamic programming decision process; 3) improvement of the original K-means clustering algorithm, after which the feature subsets generated in the dynamic programming stage are clustered with the improved K-means algorithm to reject redundant features and optimize the selected subsets. With this algorithm, feature subsets that are low in noise, highly correlated, and free of redundancy can be selected. This achieves effective dimensionality reduction, improves the generalization ability and learning efficiency of the machine learning algorithm, reduces its running time, and finally yields a simple, efficient, and easy-to-understand learning model.
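
One way to picture the redundancy-rejection idea in step 3) is to cluster the feature columns themselves and keep one representative per cluster. The sketch below is an assumption-laden illustration, not the patent's improved K-means: it uses plain K-means, takes the number of clusters and the initial centers from the caller (`init_idx`), and the names `zscore` and `cluster_features` are hypothetical.

```python
from math import sqrt


def zscore(col):
    """Standardize one feature column (assumes the column is not constant)."""
    m = sum(col) / len(col)
    s = sqrt(sum((v - m) ** 2 for v in col) / len(col))
    return [(v - m) / s for v in col]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_features(columns, init_idx, max_iter=50):
    """Cluster standardized feature columns with plain K-means and return
    the index of the column closest to each cluster center."""
    pts = [zscore(c) for c in columns]
    centers = [pts[i] for i in init_idx]  # caller-supplied initial centers
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for j, p in enumerate(pts):
            nearest = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(j)
        updated = [
            [sum(v) / len(g) for v in zip(*(pts[j] for j in g))] if g else centers[c]
            for c, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    # Keep one representative feature per non-empty cluster.
    reps = [
        min(g, key=lambda j: sq_dist(pts[j], centers[c]))
        for c, g in enumerate(groups) if g
    ]
    return sorted(reps)


# Feature 1 is feature 0 scaled by 2, so the two are redundant after
# standardization; feature 2 carries different information.
columns = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 1, 3, 2]]
selected = cluster_features(columns, init_idx=[0, 2])
print(selected)
```

On this toy input the two proportional columns fall into the same cluster, so only one of them survives alongside feature 2.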

Description

Technical field

[0001] The invention relates to the field of feature engineering and machine learning, in particular to a feature selection algorithm based on dynamic programming and K-means clustering.

Background

[0002] The core idea of dynamic programming is to decompose a complex original problem into several simpler sub-problems, which can also be called stages. Solving the original problem then becomes a process of solving multiple stages, and the solution to the original problem is obtained from the solutions of these sub-problems. The objective conditions at the beginning of each stage are called the state of that stage. Once the state of a stage is determined, different choices can often be made to enter the next stage; such a choice is called a decision. The basis for making a decision is the corresponding performance function, and the sequence formed by the decisions ...
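
The vocabulary of stages, states, decisions, and a performance function can be made concrete with a generic staged shortest-path problem. This is a textbook illustration of dynamic programming, not the patent's feature pre-selection step; the cost table and the name `solve_stages` are assumptions, and the performance function here is simply the accumulated cost.

```python
def solve_stages(costs):
    """Backward induction over a staged decision problem.

    costs[t][i][j] is the cost of the decision that moves from state i
    at stage t to state j at stage t + 1.
    Returns (best cost-to-go per initial state, decision table per stage).
    """
    value = [0.0] * len(costs[-1][0])  # nothing left to pay after the last stage
    policy = []
    for stage in reversed(costs):
        new_value, decisions = [], []
        for row in stage:  # one row per state of this stage
            best = min(range(len(row)), key=lambda k: row[k] + value[k])
            decisions.append(best)
            new_value.append(row[best] + value[best])
        value = new_value
        policy.insert(0, decisions)
    return value, policy


# Three stages: 1 start state -> 2 states -> 2 states -> 1 final state.
costs = [
    [[2, 3]],
    [[7, 5], [1, 5]],
    [[3], [6]],
]
value, policy = solve_stages(costs)

# Follow the stored decisions forward from the single start state.
state, path = 0, []
for decisions in policy:
    state = decisions[state]
    path.append(state)
best_cost = value[0]
print(best_cost, path)
```

The cheapest first move (cost 2) is not part of the best overall sequence, which is why each decision is judged by the performance function over the remaining stages rather than by its immediate cost.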

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/211, G06F18/23213
Inventors: 董敏, 曹丹, 刘皓熙, 毕盛
Owner: SOUTH CHINA UNIV OF TECH