K-means initial clustering center optimization method on basis of neighborhood information and mean difference degree

A technique for selecting initial cluster centers using neighborhood information, applicable to character and pattern recognition, special data processing applications, instruments, and similar fields. It addresses problems such as a low accuracy rate, reduces sensitivity to outliers, removes the blindness and randomness of initial center selection, and improves clustering accuracy and stability.

Inactive Publication Date: 2018-01-12
中科美络科技股份有限公司

AI Technical Summary

Problems solved by technology

The traditional K-means clustering algorithm has two disadvantages. First, its clustering results are easily affected by the initial cluster centers: when the initial centers are poorly chosen, the results can be inconsistent and the algorithm can fail to converge.
Second, it cannot overcome the adverse effect of outliers on the clustering results, which leads to unstable results and low accuracy.



Examples


Embodiment Construction

[0026] In order to further illustrate the features of the present invention, please refer to the following detailed description and accompanying drawings of the present invention. The accompanying drawings are for reference and description only, and are not intended to limit the protection scope of the present invention.

[0027] As shown in Figure 1, this embodiment discloses a K-means initial cluster center optimization method based on neighborhood information and average difference degree, including the following steps:

[0028] S1. Input the sample set X = {X_1, X_2, ..., X_i, ..., X_n} of n objects, where each X_i is an m-dimensional vector; determine the number of clusters K, and initialize the number of initial cluster centers determined so far, k = 0;

[0029] S2. Calculate the distance between every pair of objects in the sample set, and form the distance matrix D;

[0030] S3. Calculate the overall average difference degree M of the sample set, and determine the neighborhood radius v...
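
Steps S2 and S3 can be illustrated with a short sketch. The text here does not give the distance measure or the formula for M, so the code below makes two assumptions: the distance is Euclidean, and M is the mean distance over all distinct pairs of samples. The function names `distance_matrix` and `mean_difference_degree` are hypothetical and chosen only for illustration.

```python
import numpy as np

def distance_matrix(X):
    """Step S2 sketch: n x n matrix of pairwise distances.

    ASSUMPTION: Euclidean distance; the source only says "distance".
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]      # shape (n, n, m)
    return np.sqrt((diff ** 2).sum(axis=-1))  # shape (n, n)

def mean_difference_degree(D):
    """Step S3 sketch: overall average difference degree M.

    ASSUMPTION: M is the mean distance over all distinct pairs (i < j).
    """
    n = D.shape[0]
    upper = np.triu_indices(n, k=1)           # indices of pairs with i < j
    return D[upper].mean()

# Three collinear 2-D samples as a tiny example.
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
D = distance_matrix(X)
M = mean_difference_degree(D)
print(D)
print(M)
```

For these three points the pairwise distances are 5, 10 and 5, so under the stated assumption M is 20/3.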


Abstract

The invention discloses a K-means initial cluster center optimization method based on neighborhood information and the mean difference degree. The method includes the following steps:

1. Input a sample set X = {X1, X2, ..., Xi, ..., Xn} of n objects, determine the number of clusters K, and initialize the number of initial cluster centers determined so far, k = 0.
2. Form the distance matrix D.
3. Determine the neighborhood radius δ.
4. Calculate the number Ni of samples in the δ-neighborhood of each sample point Xi, and form the matrix N.
5. Take the sample point Xi with the largest count Ni in N as the first cluster center C1, set k = k + 1, and set the corresponding Ni in N to 0.
6. Find the sample point Xj with the largest remaining count Nj in N, calculate the distances between Xj and the current cluster centers {C1, C2, ..., Ck}, and set the corresponding Nj in N to 0.
7. If every distance between Xj and the current cluster centers is not less than the mean difference degree M, take Xj as the next cluster center, C(k+1) = Xj, and set k = k + 1; otherwise return to step 6.
8. If the current number of cluster centers k equals the number of clusters K, output the K initial cluster centers; otherwise return to step 6.
9. Cluster the whole sample set with the K-means clustering algorithm from these initial centers and output the clustering result.
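
The center selection in steps 2 to 8 might be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `select_initial_centers` is hypothetical, the distance is assumed Euclidean, δ is simply passed in by the caller because the abstract does not say how it is determined, and M defaults to the mean distance over all distinct pairs. The neighbor count is assumed to include the point itself, so a count of 0 unambiguously marks a point as already visited, and an error is raised if all points are visited before K centers are found.

```python
import numpy as np

def select_initial_centers(X, K, delta, M=None):
    """Sketch of steps 2-8: pick K initial centers by neighborhood density.

    ASSUMPTIONS: Euclidean distance; delta supplied by the caller;
    M defaults to the mean pairwise distance; counts include the point itself.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Step 2: pairwise distance matrix D.
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    if M is None:
        M = D[np.triu_indices(n, k=1)].mean()
    # Step 4: number of samples within delta of each point (self included).
    N = (D <= delta).sum(axis=1)
    # Step 5: the densest point becomes the first center; mark it visited.
    first = int(np.argmax(N))
    chosen = [first]
    N[first] = 0
    # Steps 6-8: visit remaining points in order of decreasing density.
    while len(chosen) < K:
        if N.max() == 0:
            raise ValueError("all points visited before K centers were found")
        j = int(np.argmax(N))
        N[j] = 0
        # Step 7: accept X[j] only if it is at least M away from every center.
        if np.all(D[j, chosen] >= M):
            chosen.append(j)
    return X[chosen]

# Two tight groups of four points plus one isolated point at (5, 5).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11],
              [5, 5]], dtype=float)
centers = select_initial_centers(X, K=2, delta=1.5)
print(centers)
```

In this example the isolated point has the smallest neighbor count, so it is visited last and is not selected; the two chosen centers come from the two dense groups. Note that M computed as a plain mean is itself affected by extreme outliers, so a very distant point can raise M enough to reject legitimate candidates.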

Description

Technical field

[0001] The invention relates to the technical field of data mining, and in particular to a K-means initial cluster center optimization method based on neighborhood information and average difference degree.

Background technique

[0002] A clustering algorithm is an unsupervised classification algorithm that divides a group of unlabeled objects into several categories according to some similarity measure, so that the distance between objects in different categories is as large as possible and the distance between objects within a category is as small as possible. As one of the basic methods of system modeling and data mining, clustering has been widely used in fields such as text classification and image recognition.

[0003] The K-means clustering algorithm (K-means) is a partition-based dynamic clustering algorithm. Because of its simplicity, it has become one of the most popular clustering methods. The disadvantages of the ...
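
The dependence of K-means on its starting centers can be seen in a small sketch of the standard iteration (assign each sample to its nearest center, then move each center to the mean of its samples). The function name `kmeans` and the toy data are hypothetical, and Euclidean distance is assumed; this is the textbook procedure, not code from the patent.

```python
import numpy as np

def kmeans(X, init_centers, max_iter=100):
    """Plain K-means iteration started from the given initial centers."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for every sample.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        labels = dist.argmin(axis=1)
        # Update step: each center moves to the mean of its members.
        new_centers = centers.copy()
        for c in range(len(centers)):
            members = X[labels == c]
            if len(members) > 0:          # keep an empty cluster's center as is
                new_centers[c] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment so the labels match the returned centers.
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
    return centers, dist.argmin(axis=1)

# Four points forming a wide rectangle: the natural split is left vs right.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])

good_centers, good_labels = kmeans(X, [[0.0, 0.0], [10.0, 0.0]])
bad_centers, bad_labels = kmeans(X, [[0.0, 0.0], [0.0, 2.0]])
print(good_labels, bad_labels)
```

Starting from one center on each side, the iteration settles on the left/right split. Starting from two centers on the same side, it converges to a top/bottom split with a much larger within-cluster distance, which is the kind of initialization sensitivity the method aims to avoid.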


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06F17/30
Inventor: 吴仲城, 吴紫恒, 李芳, 张俊, 罗健飞
Owner: 中科美络科技股份有限公司