Rapid mass data cluster processing method for computer

A massive-data processing method, applied in the field of data processing, that reduces computational complexity, makes processing convenient, fast and effective, and has a good structure.

Inactive Publication Date: 2014-04-23

NORTH CHINA ELECTRIC POWER UNIV (BAODING)

Cites: 2 · Cited by: 15

Problems solved by technology

[0004] The object of the present invention is to overcome the disadvantages of the prior art by providing a fast massive-data clustering method with data profile analysis capability. It addresses both the efficiency of clustering and the analysis of cluster data profiles when a computer clusters a large amount of data.

Embodiment Construction

[0028] The object of the present invention is to provide a fast massive-data clustering method for computers with data profile analysis capability. For a set of data objects to be clustered, a single pass of merging calculations yields the clustering result for any number of clusters, together with the specific data objects contained in each subclass and the centroid of each subclass (that is, the arithmetic mean of the attribute values of the data objects it contains). The method is fast and has strong data analysis capability.
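To illustrate how one recorded run can serve any number of clusters, the following is a minimal sketch that replays a merge record and stops when the requested number of subclasses remains. The record format (pairs of object ids, with the i-th merge creating id `n + i`) and the function name `clusters_for_k` are assumptions made for this example; the patent text shown here does not specify them.

```python
from statistics import fmean


def clusters_for_k(points, merges, k):
    """Replay recorded merges until exactly k subclasses remain.

    points: list of attribute tuples, one per original data object.
    merges: list of (id_a, id_b) pairs in merge order (assumed format);
            the i-th merge creates a new object with id len(points) + i.
    Returns a sorted list of (member_ids, centroid) pairs.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of objects")

    # Every original object starts as its own subclass.
    members = {i: [i] for i in range(n)}

    # Going from n objects down to k subclasses takes the first n - k merges.
    for step, (a, b) in enumerate(merges[: n - k]):
        members[n + step] = members.pop(a) + members.pop(b)

    result = []
    for ids in members.values():
        # Centroid: arithmetic mean of each attribute over the members.
        centroid = tuple(fmean(col) for col in zip(*(points[i] for i in ids)))
        result.append((sorted(ids), centroid))
    return sorted(result)
```

Calling `clusters_for_k` with the same `merges` list and different values of `k` returns different partitions without recomputing any similarities, which is the property the paragraph above describes.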

[0029] In order to achieve the above object, the technical solution adopted in the present invention comprises the following steps:

[0030] Step 1. Data object preprocessing. All data objects to be analyzed are preprocessed. The specific method of preprocessing is: for any given data object of a given dimension, the data obje...

Abstract

The invention relates to a rapid mass data cluster processing method for a computer. The method comprises the following steps: first, the data objects to be analyzed are preprocessed to divide them into groups; next, the similarity matrix of the data objects in each group is calculated, and data objects are merged according to their similarity to generate new data objects; each merge and the object it generates are recorded, and the original data objects are deleted; these operations are repeated until the number of data objects equals the number of clusters expected by the user; finally, the clustering result is obtained by querying the merge records. In a single run, the method can provide, for any number of clusters, the specific data objects in each subclass, the number of data objects in each subclass, and the centroid of each subclass. The general distribution and characteristics of the data objects inside each subclass can also be queried, which greatly facilitates rapid and effective processing of mass data.
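The merge-and-record loop in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented procedure: it skips the grouping step (its details are not given in the text shown here), it takes Euclidean distance between centroids as the similarity measure, and the names `build_merge_records` and `centroid` are hypothetical.

```python
import math


def centroid(points):
    """Arithmetic mean of each attribute over a list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def build_merge_records(points, k):
    """Merge the two most similar objects repeatedly until k remain.

    Returns (records, active): records is a list of (id_a, id_b, new_id)
    triples in merge order; active maps each surviving id to its points.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    active = {i: [tuple(p)] for i, p in enumerate(points)}
    records = []
    next_id = len(points)

    while len(active) > k:
        ids = sorted(active)
        # Assumption: similarity is the Euclidean distance between
        # centroids, so the closest pair is the most similar pair.
        a, b = min(
            ((i, j) for x, i in enumerate(ids) for j in ids[x + 1:]),
            key=lambda pair: math.dist(
                centroid(active[pair[0]]), centroid(active[pair[1]])
            ),
        )
        merged = active.pop(a) + active.pop(b)  # delete the originals
        active[next_id] = merged                # generate the new object
        records.append((a, b, next_id))         # record the merge
        next_id += 1

    return records, active
```

This naive version compares every pair on every pass, which is quadratic per merge; the grouping step in the abstract presumably exists to confine these comparisons to objects within the same group.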

Description

Technical field

[0001] The invention relates to a fast massive-data analysis method with data profile analysis capability, and belongs to the technical field of data processing.

Background

[0002] When a computer processes massive data, clustering the data can improve processing speed. Clustering divides a data set into classes or clusters according to the similarity of the data itself (generally a distance criterion: the smaller the distance, the greater the similarity), so that data objects within a class are as similar as possible while data objects in different classes differ as much as possible. Clustering can help people discover latent patterns hidden in massive data, which is of great significance for information processing and knowledge discovery, and it has been widely used in many fields such as data mining, machine learning, pattern recognition, s...
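As a small illustration of the distance criterion mentioned in [0002], the sketch below builds a pairwise distance matrix and picks the most similar pair of objects. Euclidean distance is an assumption here (the text only says "distance criterion"), and the function names are hypothetical.

```python
import math


def distance_matrix(points):
    """Pairwise Euclidean distances; a smaller value means more similar."""
    return [[math.dist(p, q) for q in points] for p in points]


def most_similar_pair(points):
    """Return the index pair (i, j), i < j, with the smallest distance."""
    d = distance_matrix(points)
    n = len(points)
    return min(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: d[ij[0]][ij[1]],
    )
```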

Application Information

Patent type & authority: Applications (China)

IPC(8): G06F17/30

CPC: G06F16/284

Inventors: 李中, 杨宏, 张珂

Owner: NORTH CHINA ELECTRIC POWER UNIV (BAODING)
