Improved K-means clustering method for data mining based on linear discriminant analysis

A technique combining linear discriminant analysis with a clustering method, applied to the LKM (LDA-based K-Means) algorithm, which addresses problems such as the inability of K-means to perform cluster analysis on high-dimensional data or to process such data quickly.

Inactive Publication Date: 2014-03-26
NANJING UNIV OF POSTS & TELECOMM
4 Cites, 25 Cited by

AI Technical Summary

Problems solved by technology

[0006] Technical problem: The K-means clustering method cannot effectively perform cluster analysis on high-dimensional data, nor can it process such data quickly. To address these problems, the present invention provides an improved K-means clustering method for data mining based on linear discriminant analysis. The method uses the linear mapping of linear discriminant analysis to map each original high-dimensional data point to a corresponding point in a low-dimensional space. This completes the linear dimensionality reduction and yields low-dimensional data suitable for K-means, on which the cluster analysis is then carried out.




Embodiment Construction

[0049] Algorithm definition

[0050] The key technique of the present invention is linear discriminant analysis (LDA). LDA minimizes the intra-class distance while maximizing the inter-class distance, which yields the optimal projection direction for classification; that is, it selects the features that maximize the ratio of the samples' inter-class dispersion to their intra-class dispersion. Given a matrix $A \in \mathbb{R}^{d \times n}$ (where $\mathbb{R}^{d \times n}$ denotes the real linear space of all $d \times n$ real matrices), linear discriminant analysis produces a transformation matrix $G \in \mathbb{R}^{d \times l}$ (where $\mathbb{R}^{d \times l}$ denotes the real linear space of all $d \times l$ real matrices) that maps each column vector $a_i$ of $A$ to a corresponding vector $y_i$ in the $l$-dimensional space, namely:

[0051] $y_i = G^T a_i \in \mathbb{R}^l$ (l

[0052] Divide the matri...
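
The visible text breaks off before the scatter matrices are defined. For orientation only, the ratio criterion of [0050] is commonly written as follows in textbook treatments of LDA. The symbols $k$ (number of classes), $\Pi_j$ (index set of class $j$), $n_j$ (its size), $c_j$ (its centroid), $c$ (the global centroid), $S_w$ and $S_b$ are assumptions introduced here and may differ from the patent's own notation.

```latex
S_w = \sum_{j=1}^{k} \sum_{i \in \Pi_j} (a_i - c_j)(a_i - c_j)^T,
\qquad
S_b = \sum_{j=1}^{k} n_j \, (c_j - c)(c_j - c)^T,

G^{*} = \arg\max_{G \in \mathbb{R}^{d \times l}}
        \operatorname{trace}\!\left( (G^T S_w G)^{-1} \, G^T S_b G \right)
```

Here $S_w$ measures the intra-class dispersion and $S_b$ the inter-class dispersion, so maximizing the trace expression favors projections $y_i = G^T a_i$ in which classes are compact and well separated.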


Abstract

The invention relates to an improved K-means clustering method for data mining based on linear discriminant analysis, namely the LKM algorithm. First, LDA is used to perform linear dimensionality reduction on the original n-dimensional data set A, giving an l-dimensional data set Y. The K-means clustering algorithm is then applied to the reduced data set Y, and the final results are output. By combining dimensionality reduction with K-means clustering, the method overcomes the shortcomings of K-means on high-dimensional data. The dimensionality reduction alleviates the curse of dimensionality and eliminates irrelevant attributes in the high-dimensional space, which in turn improves the performance of K-means when processing high-dimensional data.
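
The two stages described in the abstract can be illustrated with a deliberately small sketch. It is not the patent's implementation: it assumes 2-D input, exactly two classes with labelled samples available for the LDA step (LDA needs class information, and the source's own class-division step is truncated), and a projection to a single dimension. The function names `fisher_direction`, `kmeans_1d` and `lkm` are hypothetical.

```python
def fisher_direction(class_a, class_b):
    """Two-class Fisher discriminant for 2-D points: w = S_w^{-1} (m_a - m_b)."""
    def mean(pts):
        return [sum(p[j] for p in pts) / len(pts) for j in range(2)]

    m_a, m_b = mean(class_a), mean(class_b)
    # Within-class scatter matrix S_w (2x2), accumulated over both classes.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for pts, m in ((class_a, m_a), (class_b, m_b)):
        for p in pts:
            d = [p[0] - m[0], p[1] - m[1]]
            for r in range(2):
                for c in range(2):
                    s[r][c] += d[r] * d[c]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter matrix is singular")
    diff = [m_a[0] - m_b[0], m_a[1] - m_b[1]]
    # Closed-form 2x2 inverse of S_w applied to the mean difference.
    w0 = (s[1][1] * diff[0] - s[0][1] * diff[1]) / det
    w1 = (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det
    return [w0, w1]


def kmeans_1d(values, k, iterations=100):
    """K-means on scalars; centroids start evenly spaced between min and max."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                      for v in values]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = sum(members) / len(members)
    return labels, centroids


def lkm(class_a, class_b, data, k):
    """Stage 1: project data onto the Fisher direction. Stage 2: cluster in 1-D."""
    w = fisher_direction(class_a, class_b)
    y = [w[0] * p[0] + w[1] * p[1] for p in data]
    return kmeans_1d(y, k)
```

A call such as `lkm(class_a, class_b, class_a + class_b, 2)` returns one cluster label per input point together with the 1-D centroids. The clustering itself only ever sees the projected values, which is the source of the speed-up the abstract claims for high-dimensional data.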

Description

Technical field

[0001] The present invention is an optimization method that uses linear discriminant analysis (LDA) to improve the performance of the K-means clustering method, namely the LKM (LDA-based K-Means) algorithm. It belongs to the research area of cluster analysis in data mining.

Background art

[0002] Cluster analysis is an important research field in data mining and an important means of dividing or grouping data. Current clustering algorithms are generally divided into partition-based methods, hierarchical methods, density-based methods, grid-based methods, model-based methods and fuzzy clustering. K-means is a typical partition-based clustering algorithm that uses distance as its measure of similarity: the smaller the distance between two objects, the greater their similarity. Because of its simple algorithm idea and easy realization of large-scale data clustering, K...
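
The distance-based partitioning described in the background can be sketched with the usual Lloyd iteration. This is a generic illustration of plain K-means under Euclidean distance, not code from the patent; the function name `kmeans`, the random initialization and the `seed` parameter are assumptions.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: assign points to the nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct input points as initial centroids
    labels = None
    for _ in range(iterations):
        # Assignment step: smaller Euclidean distance means greater similarity.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids
```

Every assignment step computes a distance over all coordinates of every point, so the cost per iteration grows with the dimensionality of the data.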


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/90335
Inventors: 王堃, 张玉华, 孙雁飞, 吴蒙, 郭篁, 陈思光
Owner: NANJING UNIV OF POSTS & TELECOMM