Data clustering method and device, computer equipment and storage medium

A data clustering technology, applied in the computer field, that addresses the high label dependence and unsatisfactory results of existing clustering algorithms.

Pending Publication Date: 2020-06-19
重庆亿创西北工业技术研究院有限公司

AI Technical Summary

Problems solved by technology

[0005] The purpose of the embodiments of the present invention is to provide a data clustering method, which aims to solve the



Embodiment Construction

[0025] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0026] First, to facilitate understanding of the present invention, the technical terms and calculation formulas used in the steps are explained below before each step is discussed in detail:

[0027] ① K-nearest neighbor data:

[0028] For a given data point p, its K-nearest neighbor data KNN_p is expressed as:

[0029]

[0030] where d_pq is the Euclidean distance between samples p and q; computing the Euclidean distance is a conventional technique for those skilled in the art and will not be...
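The formula in [0029] is not reproduced in this text, so the following is only a minimal sketch of the usual reading of K-nearest neighbor data: the K samples q with the smallest Euclidean distance d_pq to p, excluding p itself. The function name `k_nearest_neighbors`, the tie-breaking by index, and the sample points are illustrative assumptions, not taken from the patent.

```python
import math


def k_nearest_neighbors(data, p_index, k):
    """Return the indices of the k samples closest to data[p_index].

    Assumption: "closest" means smallest Euclidean distance d_pq, and p
    itself is excluded. Ties are broken by the lower index.
    """
    p = data[p_index]
    # Pair each distance d_pq with the index of q so sorting keeps track of q.
    distances = [
        (math.dist(p, q), q_index)
        for q_index, q in enumerate(data)
        if q_index != p_index
    ]
    distances.sort()
    return [q_index for _, q_index in distances[:k]]


# Hypothetical two-dimensional samples.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
neighbors = k_nearest_neighbors(points, 0, 2)
```

Sorting every distance is the simplest approach and is adequate for small data sets; a spatial index would be a natural substitute for larger ones.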


Abstract

The invention belongs to the technical field of computers and provides a data clustering method and device, computer equipment, and a storage medium. The method comprises: processing each of a plurality of pieces of clustering center data to generate a corresponding core data set; processing the remaining boundary data, determining the core data set associated with each piece of boundary data, and adding that boundary data to the corresponding core data set; and dividing the to-be-clustered data into clusters according to the core data sets. The method first groups the data most likely to belong to the same cluster into a core data set and then determines the association between the remaining data and the data sets already formed. Compared with the prior art, even if a point in a set is assigned incorrectly, the assignment of the remaining boundary data is not seriously affected, which addresses the technical problem that existing clustering algorithms depend heavily on label accuracy.
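The two stages described in the abstract can be sketched roughly as follows. The abstract does not state how a core data set is generated or how the association between a boundary point and a core data set is measured, so both rules here are assumptions made only for illustration: a core set is taken to be a center plus its k nearest neighbors, and a boundary point joins the core set with the smallest mean distance to its members. All function names and sample data are hypothetical.

```python
import math


def nearest(data, i, k):
    """Indices of the k samples closest to data[i], excluding i itself."""
    others = [j for j in range(len(data)) if j != i]
    others.sort(key=lambda j: math.dist(data[i], data[j]))
    return others[:k]


def mean_distance(data, i, members):
    """Average Euclidean distance from data[i] to the given member indices."""
    return sum(math.dist(data[i], data[j]) for j in members) / len(members)


def cluster(data, center_indices, k):
    """Assign every sample a cluster id in two stages."""
    core = {}
    # Stage 1 (assumed rule): each center seeds its own core data set...
    for cluster_id, center in enumerate(center_indices):
        core.setdefault(center, cluster_id)
    # ...and its k nearest neighbors join it unless already claimed.
    for cluster_id, center in enumerate(center_indices):
        for j in nearest(data, center, k):
            core.setdefault(j, cluster_id)

    # Group core members by cluster id for the association step.
    members = {cid: [] for cid in range(len(center_indices))}
    for j, cid in core.items():
        members[cid].append(j)

    # Stage 2 (assumed rule): a boundary point is associated with the whole
    # core set, not with a single labeled neighbor.
    labels = dict(core)
    for i in range(len(data)):
        if i not in core:
            labels[i] = min(members, key=lambda cid: mean_distance(data, i, members[cid]))
    return [labels[i] for i in range(len(data))]


# Hypothetical samples: two tight groups plus two boundary points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (3, 3), (8, 8)]
result = cluster(points, [0, 3], 2)
```

Because each boundary point is compared against an entire core set, a single mislabeled member shifts the mean distance only slightly, which is one way to read the robustness claim in the abstract.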

Description

Technical field

[0001] The invention belongs to the technical field of computers, and in particular relates to a data clustering method and device, computer equipment, and a storage medium.

Background

[0002] Clustering is the process of dividing a collection of physical or abstract objects into multiple clusters (or classes) of similar objects: objects in the same cluster are highly similar, while objects in different clusters are highly dissimilar. Many clustering algorithms exist, such as the DBSCAN algorithm (Density-Based Spatial Clustering of Applications with Noise, a density-based clustering algorithm), the DPCA algorithm (Density Peaks Clustering Algorithm, a clustering algorithm based on density peaks), and the KNN algorithm (k-Nearest Neighbor, a k-nearest-neighbor clustering algorithm); among these, the DPCA algorithm is the more commonly used.

[0003] However, for th...


Application Information

IPC(8): G06K9/62
CPC: G06F18/23213
Inventor: 于会陈芦园王星南张洁董文敏杨海泽
Owner: 重庆亿创西北工业技术研究院有限公司