Data clustering method and system and storage medium

A data clustering method and technology, applied in other database clustering/classification, other database retrieval, other database indexing, etc. It addresses the lack of integrity and universal applicability in traditional clustering algorithms, and achieves high practicality, improved expression efficiency, and complete clustering results.

Pending Publication Date: 2021-12-17
XI AN MIKESI INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

[0013] The purpose of the present invention is to provide a data clustering method, system and storage medium that solve the technical problem that traditional clustering algorithms in the prior art lack integrity and universal applicability.



Examples


Embodiment 1

[0067] Embodiment 1 of the present invention provides a data clustering method which, as shown in Figure 1, includes the following steps:

[0068] (1) Determine the data clustering conditions, including the following steps:

[0069] Identify factors that affect the similarity between data;

[0070] Determine, from these factors, the data dimensions that the clustering is concerned with;

[0071] Determine the combination relationship among the data of different dimensions;

[0072] Determine the clustering condition of the data according to the combination relationship of each dimension data.

[0073] The data clustering condition is determined by the similarity between data, which is often affected by factors in multiple dimensions. The data clustering condition in Embodiment 1 of the present invention therefore clusters the data according to the following combination relationship of data in different dimensions:

[0074] $(v_1, v_2, v_3, \ldots, v_j)$,

[007...

Embodiment 2

[0094] HSV is a color space built on the intuitive characteristics of color, also known as the hexcone model. The color parameters in this model are hue (h), saturation (s) and lightness (v), with value ranges H: 0~180, S: 0~255, V: 0~255. An image is composed of a number of data points, each of which has an h value, an s value and a v value.
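The value ranges above can be illustrated with a minimal Python sketch that converts an 8-bit RGB color into HSV scaled to H: 0~180, S: 0~255, V: 0~255. The function name `rgb_to_hsv_8bit` and the use of the standard-library `colorsys` module are assumptions made for illustration; the source does not say how the HSV values are obtained.

```python
import colorsys


def rgb_to_hsv_8bit(r, g, b):
    """Convert 8-bit RGB to HSV scaled to H: 0~180, S: 0~255, V: 0~255.

    Hypothetical helper: colorsys returns h, s, v as floats in [0, 1],
    which are rescaled here to the ranges quoted in paragraph [0094].
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(h * 180), round(s * 255), round(v * 255)


if __name__ == "__main__":
    # Pure green has a hue of 120 degrees, i.e. 60 on the 0~180 scale.
    print(rgb_to_hsv_8bit(0, 255, 0))
```

Each data point of an image would then carry one such (h, s, v) triple.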

[0095] As shown in Figure 2, Embodiment 2 of the present invention provides a data clustering method that clusters the hue h values of 12 scattered, unordered data points through the following steps:

[0096] (1) Determine the conditions for data clustering, specifically:

[0097] The similarity between the data in this embodiment is affected by only one dimension, the difference Δh between hue h values, so the data clustering condition in this embodiment is to cluster the data according to Δh:

[0098] $v_1 = \Delta h = \{a_{m1}\} = a_{11}$,...
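As a rough illustration of clustering on the single dimension Δh, the sketch below sorts the hue values and starts a new cluster whenever the gap to the previous value exceeds a threshold. The function name `cluster_by_hue_gap`, the parameter `max_gap`, and the 12 sample hue values are all hypothetical; the excerpt does not give the actual data or the candidate Δh values, and hue wrap-around is ignored for simplicity.

```python
def cluster_by_hue_gap(hues, max_gap):
    """Group hue values so that neighbouring values inside one cluster
    (after sorting) differ by at most max_gap.

    Assumption: this gap-based rule is one possible reading of
    "cluster the data according to delta-h", not the patent's exact rule.
    """
    clusters = []
    for h in sorted(hues):
        if clusters and h - clusters[-1][-1] <= max_gap:
            clusters[-1].append(h)   # close enough to the previous value
        else:
            clusters.append([h])     # gap too large: open a new cluster
    return clusters


if __name__ == "__main__":
    # 12 hypothetical, unordered hue values on the 0~180 scale.
    hues = [150, 12, 91, 40, 155, 10, 43, 95, 15, 42, 90, 152]
    for gap in (5, 30):
        print(gap, cluster_by_hue_gap(hues, gap))
```

Running the function with several values of `max_gap` yields several candidate clusterings of the same 12 points.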

Embodiment 3

[0124] As shown in Figure 7, Embodiment 3 of the present invention provides a data clustering method that clusters the data of 11 ordered data points in a Cartesian coordinate system (hue h value, x coordinate value, y coordinate value) through the following steps:

[0125] (1) Determine the conditions for data clustering, specifically:

[0126] The similarity between the data in Embodiment 3 of the present invention is jointly affected by factors in two dimensions: the difference Δh between hue h values and the difference Δx between x coordinate values. The data clustering condition in Embodiment 3 is therefore to cluster the data according to the combination relationship of Δh and Δx:

[0127] $(v_1, v_2)$,

[0128] $v_1 = \Delta h$,

[0129] $v_2 = \Delta x$;

[0130] In the third embodiment of the present invention, the dimension data concerned by data clustering is Δh, and the ...
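One way to picture a condition that combines Δh and Δx is the Python sketch below, which links two points when both differences fall within their thresholds and reports the connected groups. The linking rule, the function name `cluster_by_hue_and_x`, the thresholds `max_dh` and `max_dx`, and the 11 sample points are assumptions for illustration only; the excerpt is truncated before it states how the two dimensions are actually combined.

```python
def cluster_by_hue_and_x(points, max_dh, max_dx):
    """Cluster (h, x, y) points by a combined delta-h / delta-x condition.

    Assumed rule: two points are linked when |dh| <= max_dh AND
    |dx| <= max_dx; clusters are the connected components of that
    relation. Returns clusters as sorted lists of point indices.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dh = abs(points[i][0] - points[j][0])
            dx = abs(points[i][1] - points[j][1])
            if dh <= max_dh and dx <= max_dx:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # 11 hypothetical points (h, x, y), ordered along the x axis.
    pts = [(10, 0, 0), (11, 1, 0), (12, 2, 0), (50, 3, 0), (51, 4, 0),
           (52, 5, 0), (10, 6, 0), (11, 7, 0), (90, 8, 0), (91, 9, 0),
           (92, 10, 0)]
    print(cluster_by_hue_and_x(pts, max_dh=3, max_dx=1))
```

With a tight `max_dx`, points of similar hue that are far apart along x stay in separate clusters; relaxing `max_dx` merges them.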


Abstract

The invention discloses a data clustering method, system and storage medium. The method comprises the following steps: determining a data clustering condition; clustering the data according to the data clustering condition to obtain at least one first clustering result; calculating an entropy load for each first clustering result, where the entropy load represents the average amount of information carried by the corresponding first clustering result; and taking the first clustering result with the maximum entropy load as the data clustering result. Because clustering is performed over the data as a whole, the clustering is holistic and the result is more complete and accurate. The clustering process neither depends on nor specially processes any particular data, and places no restriction on data type, so the invention is generally applicable to clustering any data. Because the maximum average amount of information carried is used as the basis for choosing the clustering result, a computer system with a given storage space can store a larger amount of information, which improves information expression efficiency.
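The final selection step can be sketched in Python as follows. The excerpt does not give the formula for the entropy load, so this sketch substitutes the Shannon entropy of the cluster-size distribution, $H = -\sum_k p_k \log_2 p_k$ with $p_k$ the fraction of data points in cluster $k$, purely as a stand-in; the names `shannon_entropy` and `select_by_max_entropy` are likewise hypothetical.

```python
import math


def shannon_entropy(cluster_sizes):
    """Shannon entropy (in bits) of a cluster-size distribution.

    Assumption: used here as a stand-in for the patent's "entropy load",
    whose exact definition is not present in this excerpt.
    """
    total = sum(cluster_sizes)
    return -sum((n / total) * math.log2(n / total)
                for n in cluster_sizes if n > 0)


def select_by_max_entropy(candidates):
    """Return the candidate clustering with the largest stand-in score.

    candidates: list of clusterings, each a list of clusters (lists).
    """
    return max(candidates,
               key=lambda c: shannon_entropy([len(k) for k in c]))


if __name__ == "__main__":
    # Three hypothetical "first clustering results" of the same 4 points.
    candidates = [
        [[1, 2, 3, 4]],
        [[1, 2], [3, 4]],
        [[1, 2, 3], [4]],
    ]
    print(select_by_max_entropy(candidates))
```

Under this stand-in, finer partitions tend to score higher, so the sketch only illustrates the compute-then-take-the-maximum structure of the method, not the behavior of the patent's actual entropy load.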

Description

Technical field

[0001] The invention relates to the technical field of artificial intelligence, in particular to a data clustering method, system and storage medium.

Background art

[0002] In recent years, with the development and popularization of the Internet, the volume of images, videos, text and other data, and the dimensionality of the data representations, have kept increasing. To make use of this massive data, high-dimensional data must be clustered quickly and effectively, and a large number of clustering algorithms have been derived as a result.

[0003] As one of the important research topics in the field of machine learning, clustering algorithms have been widely used in data mining, face recognition, medical image analysis, image segmentation and other important fields. Image clustering divides target data with completely unknown labels and classifies them into different clusters. It is an exploratory technique for grouping data features. It can usu...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/906, G06F16/901
CPC: G06F16/906, G06F16/9027
Inventor: 邓少冬, 盛龙
Owner: XI AN MIKESI INTELLIGENT TECH CO LTD