
Clustering evaluation measurement method, system and device and storage medium

A measurement method and clustering algorithm technology applied in the field of data processing. It addresses the problem that clustering validity indices ignore the influence of per-dimension scale and of outliers on the effectiveness of the clustering result, and it reduces that influence and improves the effectiveness.

Pending Publication Date: 2021-02-09
SHENZHEN INSTITUTE OF INFORMATION TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0003] However, when calculating the inter-cluster dispersion and the intra-cluster compactness of data samples, the clustering validity index does not consider the measurement scale of each dimension of the sample or the influence of outliers, even though both have an important impact on the effectiveness of the clustering result.
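The scale problem described above can be illustrated with a minimal sketch. It assumes Z-Score standardization as the preprocessing step; the sample values and the helper name `z_score` are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, pstdev

def z_score(samples):
    """Standardize each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*samples))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    # A constant column has zero spread; map it to 0.0 instead of dividing by zero.
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in samples
    ]

# Hypothetical samples: column 0 is in the thousands, column 1 is below ten.
raw = [[1000.0, 1.0], [1100.0, 9.0], [2000.0, 1.5], [2100.0, 8.5]]
scaled = z_score(raw)

# Before scaling, column 0 dominates every Euclidean distance.
print(math.dist(raw[0], raw[1]), math.dist(raw[0], raw[2]))
# After scaling, both columns contribute on a comparable scale.
print(math.dist(scaled[0], scaled[1]), math.dist(scaled[0], scaled[2]))
```

In the raw data, sample 0 appears much closer to sample 1 than to sample 2 purely because of column 0. After standardization the large difference in column 1 counts as well, so the ordering changes.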

Method used



Examples


Specific Embodiment

[0113] Figure 3 and Figure 4 are, respectively, a schematic diagram of the classification result without standardization and a schematic diagram of the classification result obtained with the cluster evaluation measurement method provided by the technical solution of the present application.

[0114] Obtain the wine data set to be classified; the data categories of the wine data are 159, 271 and 348 respectively;

[0115] Input the wine data set into two cluster evaluation systems: one without standardization and with an unoptimized penalty term, and a CH-algorithm cluster evaluation system that introduces Z-Score standardization and adds the optimized penalty term. The clustering results are shown in Figure 3 and Figure 4.

[0116] From Figure 3 it can be seen that the optimal CH value occurs at a cluster number of 2, which does not match the actual number of categories;

[0117] From Figure 4 it can be seen that the optimal nu...
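The selection step used in this embodiment, scoring each candidate cluster number and keeping the one with the best index value, can be sketched as follows. This is only an illustration under stated assumptions: it uses the plain CH index rather than the penalized one, small synthetic one-dimensional data rather than the wine data, and a simple k-means with farthest-point initialization as a stand-in for the "preset clustering algorithm". All function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def farthest_point_init(points, k):
    """Deterministic seeding: repeatedly add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=50):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster loses all members
                centers[j] = centroid(members)
    return labels

def ch_index(points, labels):
    """Standard Calinski-Harabasz index (no standardization, no penalty term)."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    n, k = len(points), len(groups)
    overall = centroid(points)
    between = within = 0.0
    for members in groups.values():
        c = centroid(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))

# Synthetic data with three well-separated groups (an assumption for illustration).
points = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2], [20.0], [20.1], [20.2]]
scores = {k: ch_index(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

On real data with mixed scales or outliers, the same loop could be run once on the raw samples and once on standardized samples with a penalized score, which is the comparison this embodiment reports.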



Abstract

The invention discloses a clustering evaluation measurement method, system, device and storage medium. The method comprises the steps of: obtaining a data set to be clustered and processing the data set through a preset function to generate a sample set whose dimensions share the same scale and order of magnitude; generating a plurality of clusters from the sample set according to a preset clustering algorithm and a set cluster number, and obtaining an inter-cluster dispersion value and an intra-cluster compactness value; and constructing a penalty term according to a logarithmic function, and outputting a first clustering result in combination with the inter-cluster dispersion value, the intra-cluster compactness value and the penalty term. In this way, the influence of the per-dimension scale of the samples and of outliers on the clustering result is reduced, and the effectiveness of the clustering result is improved.
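The scoring step in the abstract can be sketched as below, assuming the samples have already been standardized and cluster labels are available. The text only says that the penalty term is built from a logarithmic function, so both the default `penalty` (natural log of k + 1) and the choice to divide the CH ratio by it are hypothetical placeholders, not the patented formula. The function names are likewise illustrative.

```python
import math
from collections import defaultdict

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def penalized_ch(samples, labels, penalty=lambda k: math.log(k + 1)):
    """Return (between, within, score) for one clustering.

    ASSUMPTION: `penalty` and the way it is combined with the CH ratio are
    placeholders; the source does not give the exact form.
    """
    clusters = defaultdict(list)
    for point, label in zip(samples, labels):
        clusters[label].append(point)
    n, k = len(samples), len(clusters)
    if not 1 < k < n:
        raise ValueError("need 1 < number of clusters < number of samples")

    overall = centroid(samples)
    between = within = 0.0
    for members in clusters.values():
        c = centroid(members)
        # Inter-cluster dispersion: size-weighted squared distance to the overall mean.
        between += len(members) * sq_dist(c, overall)
        # Intra-cluster compactness: squared distance of each sample to its centroid.
        within += sum(sq_dist(p, c) for p in members)
    if within == 0:
        raise ValueError("within-cluster scatter is zero; the ratio is undefined")

    ch = (between / (k - 1)) / (within / (n - k))
    return between, within, ch / penalty(k)

# Tiny hypothetical example: two clusters of two points each.
samples = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = [0, 0, 1, 1]
between, within, score = penalized_ch(samples, labels)
print(between, within, score)
```

Passing the penalty as a callable keeps the placeholder easy to swap for whatever logarithmic form the full specification defines.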

Description

Technical field

[0001] The present invention relates to the technical field of data processing, in particular to a cluster evaluation measurement method, system, device and storage medium.

Background technique

[0002] With the development of society and the advent of the intelligent age, clustering technology, as an important part of the learning process of unsupervised pattern recognition, has been widely used in important fields such as machine learning, pattern recognition and data mining. The purpose of clustering is to divide the originally scattered and seemingly irrelevant data samples into similar groups or clusters to obtain some internal data rules. A key task of clustering is to quantitatively evaluate the clustering results, especially to determine an optimal number of clusters or partition structure. The quality of the clustering results is determined by the clustering validity. The CH (Calinski-Harabasz) index is a common measurement method used to evaluate t...
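For reference, a common formulation of the CH (Calinski-Harabasz) index for n samples partitioned into k clusters is given below. The symbols (C_j for the j-th cluster, n_j for its size, c_j for its centroid, c for the overall centroid) are notation chosen here, not taken from the patent text.

```latex
\mathrm{CH}(k) = \frac{B(k)/(k-1)}{W(k)/(n-k)}, \qquad
B(k) = \sum_{j=1}^{k} n_j \,\lVert c_j - c \rVert^2, \qquad
W(k) = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - c_j \rVert^2
```

B(k) measures inter-cluster dispersion and W(k) measures intra-cluster compactness; a larger CH value indicates clusters that are better separated relative to their internal spread.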

Claims


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/2193, G06F18/23
Inventor: 赵妮, 蔡金成
Owner: SHENZHEN INSTITUTE OF INFORMATION TECHNOLOGY