Data clustering method and device

A data clustering technology, applied to relational databases, database models, structured data retrieval and related fields, that addresses the random determination of the initial cluster center points.

Inactive Publication Date: 2019-08-09
重庆紫光华山智安科技有限公司

AI Technical Summary

Problems solved by technology

[0005] The purpose of this application is to provide a data clustering method and device that address deficiencies in the prior art, namely that the number of cluster categories is set by subjective human judgment and that the initial cluster center points are selected at random.




Detailed Description of Embodiments

[0044] To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of this application, not all of them.

[0045] Before introducing this application, the terms used in it are explained as follows.

[0046] Kmeans algorithm: a typical partition-based clustering algorithm. It is simple to operate, uses the sum-of-squared-errors criterion function, and offers high scalability and compressibility when processing large data sets.
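
To make the squared-error criterion concrete, the following is a minimal sketch of plain kmeans in Python with NumPy. It is not taken from the application: the function names `assign`, `sse` and `kmeans`, the sample points, and the choice to pass the starting centers in explicitly are all assumptions made for illustration.

```python
import numpy as np

def assign(points, centers):
    # Squared Euclidean distance from every point to every center, shape (n, k).
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Each point is labeled with the index of its nearest center.
    return d2.argmin(axis=1)

def sse(points, centers, labels):
    # Sum of squared errors: squared distance of each point to its own center.
    return float(((points - centers[labels]) ** 2).sum())

def kmeans(points, centers, max_iter=100):
    # Hypothetical helper: starting centers are supplied by the caller.
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break  # centers stopped moving
        centers = new_centers
    labels = assign(points, centers)
    return centers, labels, sse(points, centers, labels)

# Illustrative data: two well-separated pairs of points.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centers, labels, err = kmeans(points, centers=points[[0, 2]])
```

Each pass can only lower or preserve the sum of squared errors, which is why the result depends on where the centers start.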

[0047] Cluster analysis: refers to the analysis process of grouping a collection of physical or abstract obje...


Abstract

The invention provides a data clustering method and device, and relates to the field of data mining. The method comprises: calculating a preset number of cluster center points from a to-be-clustered data set using a preset algorithm; clustering the to-be-clustered data set according to each cluster center point to obtain a clustering result; and, if the clustering result meets a preset termination condition, stopping clustering and outputting a cluster center point set, which contains all the cluster center points, together with the class labels of all clusters in the clustering result. Compared with the prior art, this addresses the problems that the number of cluster categories depends on subjective judgment and that the initial cluster center points are selected at random.
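
The flow in the abstract can be sketched as below. The abstract names neither the preset algorithm nor the termination condition, so this sketch substitutes its own: a deterministic farthest-point selection for the initial centers, and "no center moved by more than `tol`" as the stopping rule. Both choices, and the names `initial_centers` and `cluster`, are assumptions for illustration and should not be read as the method claimed in the application.

```python
import numpy as np

def initial_centers(data, k):
    # ASSUMPTION: farthest-point selection stands in for the unspecified
    # "preset algorithm". It is deterministic, so no random seeding is needed.
    first = int(((data - data.mean(axis=0)) ** 2).sum(axis=1).argmin())
    chosen = [first]
    # d2[i] = squared distance from point i to its nearest chosen center.
    d2 = ((data - data[first]) ** 2).sum(axis=1)
    while len(chosen) < k:
        nxt = int(d2.argmax())  # the point farthest from all chosen centers
        chosen.append(nxt)
        d2 = np.minimum(d2, ((data - data[nxt]) ** 2).sum(axis=1))
    return data[chosen].astype(float)

def nearest(data, centers):
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def cluster(data, k, tol=1e-6, max_iter=100):
    centers = initial_centers(data, k)
    for _ in range(max_iter):
        labels = nearest(data, centers)
        updated = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        shift = np.abs(updated - centers).max()
        centers = updated
        # ASSUMPTION: the termination condition is "centers stopped moving".
        if shift < tol:
            break
    # Output: the set of center points and a class label per data point.
    return centers, nearest(data, centers)

data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                 [10.0, 10.0], [11.0, 10.0], [10.0, 11.0]])
centers, labels = cluster(data, k=2)
```

Because nothing in this sketch is random, repeated runs on the same data return the same centers and labels, which is the property the abstract contrasts with randomly chosen starting points.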

Description

Technical field

[0001] The present application relates to the field of data mining, and in particular to a data clustering method and device.

Background

[0002] On today's fast-changing Internet, data of all kinds is everywhere, and much of the information behind it can be analyzed and mined. Data mining, the process of extracting hidden information from large amounts of data by means of data mining algorithms, has become an essential technology.

[0003] Cluster analysis groups data according to the distance between data items. It is an important branch of data mining and an unsupervised learning method, and it has been widely used in machine learning, pattern recognition, data mining, image processing and other fields. Because the partition-based kmeans clustering algorithm is simple and efficient, it is widely used.

[0004] However, the de...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/2458, G06F16/28, G06K9/62
CPC: G06F16/2465, G06F16/285, G06F18/23213
Inventor: 杨开平
Owner: 重庆紫光华山智安科技有限公司