Data clustering method and system, and data processing equipment

A data clustering technology, applied in the field of data processing, that addresses problems such as the inability to process block data directly and the neglect of potential behavioral characteristics in the data, achieving better clustering results while avoiding information loss.

Inactive Publication Date: 2013-12-25
SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI
0 Cites, 18 Cited by

AI Technical Summary

Problems solved by technology

[0003] The purpose of the present invention is to provide a data clustering method, system and data processing equipment that solve a problem in the prior art: current machine learning algorithms cannot process block data directly and must first convert it into standard data, which may cause underlying behavioral features present in the data to be overlooked.
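The information loss described above can be illustrated with a small sketch. The purchase amounts, the variable names, and the choice of the mean as the compression step are all assumptions made for illustration; the source does not say how block data is converted into standard data in the prior art.

```python
from statistics import mean

# Hypothetical purchase amounts for two customers.
# Each list is one block: several records describing one object.
customer_a = [50.0, 50.0, 50.0, 50.0]   # steady, regular spending
customer_b = [5.0, 5.0, 5.0, 185.0]     # small purchases plus one large one


def compress(block):
    """Collapse a block into a single standard value (assumed here: the mean)."""
    return mean(block)


# After compression the two customers are indistinguishable,
# although their underlying behavior is clearly different.
print(compress(customer_a), compress(customer_b))
print(customer_a == customer_b)
```

Both customers compress to the same value, so any algorithm that only sees the compressed record cannot separate the two behaviors.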



Examples


Embodiment Construction

[0023] In order to make the object, technical solution and beneficial effects of the present invention clearer, the present invention is described in further detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it.

[0024] Figure 1 shows the implementation process of the data clustering method provided by the embodiment of the present invention, which includes the following steps:

[0025] In step S101, input a data set to be clustered, composed of n objects with block data characteristics, and an expected number of categories k;

[0026] In the embodiment of the present invention, it is assumed that the data set to be clustered is $X = \{x_1, x_2, \ldots, x_n\}$, where $x_i$ = x ...
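One possible in-memory layout for the input of step S101 is sketched below. The type aliases `Record` and `Block`, the helper `validate_input`, and the sample call records are hypothetical; the source text is truncated before it defines the structure of each object, so this is only an assumed representation in which every object is a non-empty set of records.

```python
from typing import FrozenSet, List, Tuple

# Assumed representation: a record is a tuple of attribute values,
# and a block (one object) is a set of such records.
Record = Tuple[str, ...]
Block = FrozenSet[Record]


def validate_input(dataset: List[Block], k: int) -> int:
    """Check the step S101 inputs and return n, the number of objects."""
    n = len(dataset)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if any(len(block) == 0 for block in dataset):
        raise ValueError("every object must contain at least one record")
    return n


# Hypothetical call-detail blocks; blocks may hold different numbers of records.
X: List[Block] = [
    frozenset({("local", "morning"), ("local", "evening")}),
    frozenset({("international", "night")}),
    frozenset({("local", "morning"), ("roaming", "noon"), ("local", "noon")}),
]
n = validate_input(X, k=2)
```

Note that the objects do not share a fixed length, which is what distinguishes this layout from a standard table with one row per object.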


Abstract

The invention is applicable to the field of data processing and provides a data clustering method, a data clustering system and data processing equipment. The method comprises the following steps: inputting a data set to be clustered, consisting of n objects with block data features, and an expected number of classes k; selecting k block data objects from the data set as the initial class centers; calculating the distance from each object to each class center; assigning each block data object to its closest center according to the calculated distances, forming k disjoint classes; calculating the center of each class as the new class center; repeating the assignment step and the center calculation step until the algorithm converges; and obtaining the resulting partition of the data set. With the data clustering method, system and data processing equipment, data with block features can be processed directly without compressing the block data, so that loss of information is avoided and the clustering result is better than that obtained after the block data is compressed.
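The iteration described in the abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not define the distance between block data objects or how a class center is computed, so the sketch assumes a Jaccard distance between blocks (each block being a set of records) and picks as center the member block with the smallest total distance to its class. The names `jaccard_distance` and `cluster_blocks` are hypothetical.

```python
import random


def jaccard_distance(a, b):
    """Assumed distance between two blocks, each a frozenset of records."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_blocks(blocks, k, max_iter=100, seed=0):
    """Partition blocks into k disjoint classes without compressing them."""
    rng = random.Random(seed)
    # Select k block data objects as the initial class centers.
    centers = rng.sample(blocks, k)
    labels = None
    for _ in range(max_iter):
        # Assign every block to its closest center.
        new_labels = [
            min(range(k), key=lambda j: jaccard_distance(block, centers[j]))
            for block in blocks
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Recompute each center (assumption: the member block with the
        # smallest total distance to the other members of its class).
        for j in range(k):
            members = [b for b, lab in zip(blocks, labels) if lab == j]
            if members:
                centers[j] = min(
                    members,
                    key=lambda c: sum(jaccard_distance(c, m) for m in members),
                )
    return labels, centers


# Hypothetical purchase blocks for four customers.
blocks = [
    frozenset({"milk", "bread"}),
    frozenset({"milk", "bread", "eggs"}),
    frozenset({"cpu", "ram"}),
    frozenset({"cpu", "ram", "ssd"}),
]
labels, centers = cluster_blocks(blocks, k=2)
```

On this toy input the two grocery blocks end up in one class and the two hardware blocks in the other. Swapping in a different block distance or center definition only requires changing `jaccard_distance` and the center update.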

Description

Technical field

[0001] The invention belongs to the field of data processing, and in particular relates to a data clustering method, system and data processing equipment.

Background art

[0002] With the rapid development of automatic data generation and collection technology, many fields have produced massive amounts of data that record the details of people's behavior, which makes behavior pattern mining possible. The data describing the behavior of the collected objects share a common feature: the behavior of each object is described by a collection of multiple records. We call the set of records describing the behavior of one object a block of data. For example, a customer's purchasing behavior or calling behavior is reflected by that customer's purchase details or call details over a certain period of time. In-depth mining of block data helps us analyze and predict customer behavior. However, current machine learning algorithms canno...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 曹付元, 黄哲学, 梁吉业
Owner: SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI