Clustering result evaluation method and system

A clustering result evaluation technology, applied in the field of clustering result evaluation methods and systems, that addresses two problems: accuracy cannot be calculated directly for unlabeled data, and a method for evaluating the accuracy of clustering results is lacking.

Active Publication Date: 2019-12-17
SHENZHEN ZTE NETVIEW TECH +1
Cites: 7; Cited by: 5

AI Technical Summary

Problems solved by technology

However, in practical applications the accuracy often cannot be calculated directly. For example, most face images are unlabeled data, so the accuracy cannot be computed against ground truth. Commonly used cluster validity indices such as the DB index (Davies-Bouldin index) and the Dunn index are not suitable for real face clustering scenarios. A method for evaluating the accuracy of clustering results that can be applied to unlabeled data in real face clustering scenarios is lacking, so improvement is needed.

Examples

Embodiment 1

[0030] Figure 1 shows a flow chart of the clustering result evaluation method of this embodiment. As shown in Figure 1, the method includes:

[0031] Step 102: Extract features of the target. The target may be an object requiring pattern recognition, such as an image.

[0032] A recognition algorithm is used to extract the features of the target. For example, for face recognition, a face recognition algorithm (such as the SenseTime (Shangtang) face recognition algorithm) can be applied to the face image to be recognized to compute a 512-dimensional feature vector (the dimension of the feature vector depends on the algorithm), and the feature vector is then normalized. The data volume of the features is M.
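The normalization step can be illustrated with a minimal sketch. The source does not say which normalization is used; L2 (unit-length) normalization is assumed here because it is common for face embeddings. The function names and the toy 2-dimensional vector are hypothetical stand-ins for a real 512-dimensional feature.

```python
import math

def l2_normalize(vec):
    """Scale a feature vector to unit Euclidean length (assumed L2 normalization)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """For unit-length vectors, cosine similarity reduces to the dot product."""
    return sum(x * y for x, y in zip(a, b))

# Toy 2-D stand-in for a 512-D face feature (hypothetical data).
feat = l2_normalize([3.0, 4.0])
self_similarity = cosine_similarity(feat, feat)
```

After normalization, similarity between two features can be compared with a plain dot product, which is one reason unit-length features are convenient for clustering.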

[0033] Step 104: Perform clustering on the features to obtain the clusters C = {C_1, C_2, ..., C_K}, where K is a positive integer. The clustering results are calculated by a clustering algorithm, such as k-means ...
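As an illustration of Step 104, here is a minimal sketch of plain k-means (Lloyd's algorithm) that returns the clusters C_1, ..., C_K as lists of point indices. It is not the patent's implementation: the deterministic initialization from the first k points, the iteration cap, and the toy 2-D points are assumptions made to keep the example small and reproducible.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns K clusters as lists of point indices."""
    # Assumption: initialize centroids from the first k points (deterministic).
    centroids = [list(p) for p in points[:k]]
    assign = None
    for _step in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assign = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assign == assign:
            break  # assignments stopped changing, so the algorithm converged
        assign = new_assign
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [points[i] for i, a in enumerate(assign) if a == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return [[i for i, a in enumerate(assign) if a == j] for j in range(k)]

# Toy 2-D features (hypothetical); real face features would be 512-D.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
clusters = kmeans(points, k=2)
```

In practice a library implementation with better initialization would normally be preferred; the sketch only shows the shape of the output that the later evaluation steps consume.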

Embodiment 2

[0039] Figure 2 shows a flow chart of the clustering result evaluation method of this embodiment. As shown in Figure 2, this embodiment takes face image recognition as an example and includes:

[0040] Step 202: Extract features from the face image.

[0041] The front-end face image acquisition device transmits the face image over the network to the ES (Elasticsearch) database server. Feature extraction is performed on the face image first. Feature extraction means representing the face information as a set of numbers; these numbers are the features to be extracted. Common face features fall into two categories: geometric features and representational features. Geometric features describe the geometric relationships between facial parts such as the eyes, nose, and mouth, for example distance, area, and angle. A deep learning neural network algorithm is then used to calculate the final 512-di...
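The geometric features mentioned above (distance, area, and angle between facial parts) can be sketched as follows. The landmark names and pixel coordinates are hypothetical; the source does not specify which landmarks or measurements are used.

```python
import math

def distance(p, q):
    """Euclidean distance between two 2-D landmarks."""
    return math.dist(p, q)

def triangle_area(a, b, c):
    """Area of the triangle spanned by three landmarks (shoelace formula)."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2

def angle_at(vertex, p, q):
    """Angle in degrees at `vertex` between the rays towards p and q."""
    v1 = (p[0] - vertex[0], p[1] - vertex[1])
    v2 = (q[0] - vertex[0], q[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cos_t = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

# Hypothetical landmark positions in pixel coordinates.
left_eye, right_eye, nose = (30.0, 40.0), (70.0, 40.0), (50.0, 70.0)

eye_distance = distance(left_eye, right_eye)
face_triangle_area = triangle_area(left_eye, right_eye, nose)
nose_angle = angle_at(nose, left_eye, right_eye)
```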

Embodiment 3

[0070] Figure 3 shows a schematic diagram of the clustering result evaluation system of this embodiment, which includes an extraction module 310, a clustering module 320, a statistics module 330, a first calculation module 340, a second calculation module 350, and a third calculation module 360.

[0071] The extraction module 310 is used to extract features of the target, the data volume of the features being M. The clustering module 320 is used to cluster the features to obtain the clusters C = {C_1, C_2, ..., C_K}, where K is a positive integer. The statistics module 330 is used to count the total number C_P of pure clusters in C, where P is a positive integer and 0≤P≤K. The first calculation module 340 is used to calculate the ideal number of clusters C_I. The second calculation module 350 is used to calculate the data fusion rate correction coefficient η, wherein [formula not preserved]. The third calculation module 360 is used to calculate th...
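The counting done by the statistics module can be sketched as below. Two assumptions are made here and are not confirmed by this text: a "pure cluster" is taken to be a cluster whose members all belong to one identity, and the identity of each item is taken from a hypothetical manual audit table (`identity_of`). The formulas for C_I, η, and HI are not preserved in this text, so they are deliberately not implemented.

```python
def count_pure_clusters(clusters, identity_of):
    """Count clusters whose members all map to a single identity.

    Assumption: "pure" means every member has the same identity; the
    identities come from a hypothetical manual audit, not from the patent.
    """
    pure = 0
    for members in clusters:
        if members and len({identity_of[m] for m in members}) == 1:
            pure += 1
    return pure

# Hypothetical clustering result: item ids grouped into K = 3 clusters.
clusters = [["a1", "a2"], ["b1", "a3"], ["c1"]]
# Hypothetical audited identities for each item.
identity_of = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "c1": "C"}

c_p = count_pure_clusters(clusters, identity_of)
```

In this toy data the second cluster mixes identities A and B, so it is not counted.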

Abstract

The invention discloses a clustering result evaluation method, which comprises the steps of: extracting features of a target, the data volume of the features being M; clustering the features to obtain the clusters C = {C_1, C_2, ..., C_K}, where K is a positive integer; counting the total number C_P of pure clusters in C, where P is a positive integer and 0≤P≤K; calculating the ideal number of clusters C_I; calculating a data fusion rate correction coefficient η; and calculating a clustering evaluation index HI. The invention further discloses a clustering result evaluation system. With the method and system provided by the invention, the total number of pure clusters and the ideal number of clusters are counted and used to calculate the data fusion rate correction coefficient, so that the clustering evaluation index can be obtained quickly, and the index objectively and effectively reflects the accuracy of the clustering result.

Description

Technical field

[0001] The present application relates to the field of pattern recognition, in particular to a clustering result evaluation method and system.

Background

[0002] Clustering is an important class of algorithms in machine learning. It belongs to unsupervised learning and is mainly used to analyze the inherent characteristics of data, to find how the data is distributed, or as a preprocessing step that supports further processing. In specific applications, for example face image clustering, similar faces are grouped into the same cluster according to the similarity of their face features. Ideally, all images of the same person are clustered into one class. Clustering plays an important role in fields such as fusion, accelerated face comparison and retrieval, target deployment, and trajectory tracking.

[0003] In clustering, due to the influence of "noise" and other factors, misclassification occurs. For example, in the...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06K9/62
CPC: G06F18/2321, G06F18/2411, G06F18/25, Y02D10/00
Inventor: 何俊豪, 蔡振伟, 朱金华, 王赟, 裴卫斌
Owner: SHENZHEN ZTE NETVIEW TECH