Distributed-structure-based big data clustering method and device

A distributed-structure clustering technology, applied in the field of data mining, that addresses problems such as the hard division of intervals, uneven data distribution, and the failure to consider that data points in big data affect knowledge discovery tasks differently.

Active Publication Date: 2015-07-29

品尚电子商务有限公司

Cites: 2; Cited by: 26

Problems solved by technology

When processing big data, methods based on sample sampling probability are generally adopted. However, sampling does not consider the overall relative distance between data points or intervals, nor the uneven distribution of the data, which leads to the hard division of intervals.

Clustering, fuzzy concepts, and cloud models were later introduced to improve interval division and achieved good results, but these methods do not consider that individual data points in big data affect knowledge discovery tasks differently.
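
The "hard division of intervals" described above can be illustrated with a small sketch. It assumes equal-width binning as the interval scheme, which the source does not name; the function `equal_width_bin` and the bin width of 5.0 are hypothetical and chosen only for illustration.

```python
def equal_width_bin(value: float, width: float) -> int:
    """Return the index of the equal-width interval that contains value.

    Assumption: intervals are [0, width), [width, 2*width), ...
    The source does not specify an interval scheme.
    """
    return int(value // width)


points = [4.9, 5.1, 9.9]
bins = [equal_width_bin(p, width=5.0) for p in points]

# 4.9 and 5.1 are only 0.2 apart yet land in different intervals,
# while 5.1 and 9.9 are 4.8 apart and share one interval.
for p, b in zip(points, bins):
    print(f"{p} -> interval {b}")
```

The boundary at 5.0 ignores the relative distance between points, which is the kind of rigid split the passage attributes to sampling-based approaches.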

Embodiment Construction

[0098] The technical solutions of the present invention will be clearly and completely described below in conjunction with the accompanying drawings of the present invention. Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with aspects of the invention as recited in the appended claims.

[0099] Referring to Figure 1, the big data clustering method based on a distributed structure proposed by the present invention comprises:

[0100] Step S100, big data preprocessing: cleaning up real-world data by fill...

Abstract

The invention provides a distributed-structure-based big data clustering method. The distributed-structure-based big data clustering method comprises the following steps: S100, preprocessing big data; S200, segmenting and managing the big data; S300, establishing a clustering hypergraph model; S400, mapping the big data, specifically, respectively mapping segmented data blocks to the hypergraphs H=(V, E), namely mapping each data block to one hypergraph; S500, clustering all the data blocks respectively by using the hypergraphs; S600, reclustering clustering results of all the data blocks, obtained in the step S500, to obtain a final clustering result. According to the distributed-structure-based big data clustering method, by using a cloud platform in combination with a hypergraph theory, the big data is mined and clustered, so that fast, real-time and accurate analysis and processing of the big data are achieved.
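
The flow of steps S200 to S600 can be sketched in a few lines of Python. This is a minimal single-process illustration under stated assumptions, not the patented method: the abstract does not say how hyperedges are formed or how a hypergraph is clustered, so the sketch assumes one hyperedge per point's radius-neighborhood and groups vertices that are linked through shared hyperedges. All function names, the `radius` parameter, and the round-robin segmentation are hypothetical, and the loop over blocks stands in for the distributed workers of a cloud platform.

```python
from math import dist


def build_hypergraph(points, radius):
    """Build H = (V, E). Assumption: one hyperedge per point, containing
    every point within `radius` of it (the point itself included)."""
    vertices = list(range(len(points)))
    edges = [
        frozenset(j for j in vertices if dist(points[i], points[j]) <= radius)
        for i in vertices
    ]
    return vertices, edges


def cluster_hypergraph(vertices, edges):
    """Group vertices linked through shared hyperedges (union-find).
    This stands in for whatever hypergraph clustering the patent uses."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for edge in edges:
        members = list(edge)
        for v in members[1:]:
            parent[find(v)] = find(members[0])

    groups = {}
    for v in vertices:
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def cluster_big_data(data, n_blocks, radius):
    # S200: segment the data into blocks (round-robin split, an assumption).
    blocks = [data[i::n_blocks] for i in range(n_blocks)]

    # S300-S500: map each block to its own hypergraph and cluster it.
    local_clusters = []
    for block in blocks:
        if not block:
            continue
        vertices, edges = build_hypergraph(block, radius)
        for group in cluster_hypergraph(vertices, edges):
            local_clusters.append([block[i] for i in group])

    # S600: recluster the per-block results, here via their centroids.
    centers = [centroid(c) for c in local_clusters]
    vertices, edges = build_hypergraph(centers, radius)
    return [
        [p for i in group for p in local_clusters[i]]
        for group in cluster_hypergraph(vertices, edges)
    ]


data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11)]
clusters = cluster_big_data(data, n_blocks=2, radius=2.0)
for c in clusters:
    print(sorted(c))
```

Reclustering on centroids keeps step S600 cheap because it operates on one point per local cluster rather than on the full data set; whether the patent merges block results this way is not stated in the abstract.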

Description

Technical field

[0001] The invention relates to the field of data mining, in particular to a big data clustering method and device based on a distributed structure.

Background art

[0002] Over the past half century, as computer technology has become fully integrated into social life, the information explosion has accumulated to the point where it has begun to trigger change. Not only is the world flooded with more information than ever before, but the growth rate of that information is also accelerating. Disciplines at the center of the information explosion, such as astronomy and genetics, created the concept of "big data". Today, this concept is applied to almost all areas of human intelligence and development. The 21st century is an era of great development of data and information. Mobile Internet, social networks, e-commerce, and the like have greatly expanded the boundaries and application scope of the Internet, and all kinds of data are expanding rapidly. Internet (social, search, e-commerce), m...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06F17/30

CPC: G06F16/285, G06F16/35

Inventor: 马泳宇

Owner: 品尚电子商务有限公司
