K-means initial clustering center optimization method on basis of neighborhood information and mean difference degree
A technique for selecting initial cluster centers using neighborhood information, applied in character and pattern recognition, special data processing applications, instruments, etc. It reduces sensitivity to outliers, avoids blind and random selection of the initial centers, and improves clustering accuracy and stability.
Inactive Publication Date: 2018-01-12
中科美络科技股份有限公司
0 Cites, 4 Cited by
AI Technical Summary
Problems solved by technology
The traditional K-means clustering algorithm has two disadvantages. First, its clustering results are easily affected by the initial clustering centers; when the initial centers are poorly chosen, the clustering results may be inconsistent and the algorithm may fail to converge.
Second, it cannot overcome the adverse effect of outliers on the clustering results, which makes the results unstable and lowers accuracy.
Embodiment Construction
[0026] In order to further illustrate the features of the present invention, please refer to the following detailed description and accompanying drawings of the present invention. The accompanying drawings are for reference and description only, and are not intended to limit the protection scope of the present invention.
[0027] As shown in Figure 1, this embodiment discloses a K-means initial cluster center optimization method based on neighborhood information and average difference degree, including the following steps:
[0028] S1. Input the sample set X = {X1, X2, ..., Xi, ..., Xn} of n objects, where each Xi is an m-dimensional vector; determine the clustering number K; and initialize the number of initial clustering centers determined so far to k = 0;
[0029] S2. Calculate the distance between every pair of objects in the sample set to form a distance matrix D;
[0030] S3. Calculate the overall average difference degree M of the sample set, and determine the neighborhood radius v...
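Steps S2 and S3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the excerpt does not say which distance is used or how M is defined, so the sketch assumes Euclidean distance and takes M to be the mean distance over all unordered pairs of samples. The function names are hypothetical. The neighborhood radius is not computed here because its definition is cut off in the text above.

```python
import numpy as np

def distance_matrix(X):
    """S2: pairwise distances between the n rows of X (shape n x m).
    ASSUMPTION: Euclidean distance; the excerpt does not name the metric."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def average_difference_degree(D):
    """S3, first half. ASSUMPTION: M is the mean distance over all
    unordered pairs (i < j); the excerpt does not give the formula."""
    n = D.shape[0]
    upper = np.triu_indices(n, k=1)  # indices of pairs with i < j
    return D[upper].mean()

# Three collinear points: the pair distances are 5, 5 and 10.
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
D = distance_matrix(X)
M = average_difference_degree(D)
```

With these three points the matrix D is symmetric with a zero diagonal, and M is the average of the three pair distances.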
Abstract
The invention discloses a K-means initial clustering center optimization method based on neighborhood information and the mean difference degree. The method includes the following steps:
1. A sample set X = {X1, X2, ..., Xi, ..., Xn} with n objects is input, the clustering number K is determined, and the number of initial clustering centers determined so far is initialized to k = 0.
2. A distance matrix D is formed.
3. The neighborhood radius value delta is determined.
4. The number Ni of samples in the delta neighborhood of each sample point Xi is calculated, and a matrix N is formed.
5. The sample point Xi corresponding to the maximum sample number Ni in N is taken as the first clustering center C1, k = k + 1, and the corresponding Ni in N is set to 0.
6. The sample point Xj corresponding to the maximum sample number Nj in N is found, the distances between Xj and the clustering centers {C1, C2, ..., Ck} are calculated, and the corresponding Nj in N is set to 0.
7. If none of the distances between Xj and the clustering centers is less than the mean difference degree M, then C(k+1) = Xj and k = k + 1; otherwise, the method returns to Step 6.
8. If the current clustering center number k equals the clustering number K, the K initial clustering centers are output; otherwise, the method returns to Step 6.
9. The whole sample set is clustered with the K-means clustering algorithm, and the clustering result is output.
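The nine steps of the abstract can be sketched in Python as below. This is a minimal illustration, not the patented implementation. Several details are assumptions because the excerpt does not specify them: Euclidean distance is used, M defaults to the mean pairwise distance, delta is passed in by the caller, and each point is counted as a member of its own neighborhood so that a count of 0 can unambiguously mark a point as already examined. The function names `select_initial_centers` and `kmeans` are hypothetical.

```python
import numpy as np

def select_initial_centers(X, K, delta, M=None):
    """Steps 1-8: return indices of up to K initial centers."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Step 2: distance matrix (ASSUMPTION: Euclidean distance).
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    if M is None:
        # ASSUMPTION: M is the mean distance over unordered pairs.
        M = D[np.triu_indices(n, k=1)].mean()
    # Step 4: neighbor counts. ASSUMPTION: the point itself is counted,
    # so every unexamined point has a count of at least 1.
    N = (D <= delta).sum(axis=1)
    # Step 5: the densest point becomes the first center.
    first = int(N.argmax())
    centers = [first]
    N[first] = 0
    # Steps 6-8: examine remaining points in order of decreasing density.
    while len(centers) < K and N.max() > 0:
        j = int(N.argmax())
        N[j] = 0  # mark as examined whether or not it is accepted
        # Step 7: accept only if no existing center is closer than M.
        if np.all(D[j, centers] >= M):
            centers.append(j)
    # Fewer than K indices are returned if the candidates run out.
    return np.array(centers)

def kmeans(X, centers, n_iter=100):
    """Step 9: plain Lloyd iterations from the given initial centers."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for c in range(len(centers)):
            members = X[labels == c]
            if len(members):  # keep the old center if a cluster empties
                new_centers[c] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Two tight groups of four points plus one isolated point.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [20, 20], [20, 21], [21, 20], [21, 21],
              [30, 30]], dtype=float)
idx = select_initial_centers(X, K=2, delta=1.5)
labels, final_centers = kmeans(X, X[idx])
```

In this toy data the isolated point has the smallest neighborhood count, so it is only considered after the denser points; the two selected centers come from the two dense groups. How delta should be derived from the data is left to the caller here, since the excerpt does not state it.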
Description
Technical field
[0001] The invention relates to the technical field of data mining, in particular to a K-means initial clustering center optimization method based on neighborhood information and average difference degree.
Background art
[0002] A clustering algorithm is an unsupervised classification algorithm that divides a group of unlabeled objects into several categories according to some similarity measure, so that the distance between objects in different categories is as large as possible and the distance between objects within a category is as small as possible. As one of the basic methods of system modeling and data mining, clustering has been widely used in fields such as text classification and image recognition.
[0003] The K-means clustering algorithm (K-means) is a partition-based dynamic clustering algorithm. Because of its simplicity, it has become one of the most popular clustering methods. The disadvantages of the ...
Application Information
Patent Timeline
Application Date: The date an application was filed.
Publication Date: The date a patent or application was officially published.
First Publication Date: The earliest publication date of a patent with the same application number.
Issue Date: The publication date of the patent grant document.
PCT Entry Date: The date of entry into the PCT national phase.
Estimated Expiry Date: The statutory expiry date of a patent right according to the Patent Law. It is the longest term of protection the patent right can achieve if it is not terminated early for other reasons (term extension factors have been taken into account).
Invalid Date: The actual expiry date, based on the effective date or publication date of the legal transaction data of the invalid patent.