A Big Data Clustering Algorithm Based on Cloud Computing Platform

A clustering algorithm technology based on a cloud computing platform, applied in computing, digital data processing, special data processing applications, and similar fields. It addresses problems such as not considering the different effects of individual big data points on knowledge discovery tasks, not considering the relative distance between data points, and uneven data distribution. It achieves the effects of reducing data processing costs, facilitating development, and improving processing capacity and speed.
CN103838863B · Inactive · Publication Date: 2017-07-18 · INNER MONGOLIA UNIV OF SCI & TECH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: INNER MONGOLIA UNIV OF SCI & TECH
Publication Date: 2017-07-18
Estimated Expiration: Not applicable · inactive patent


Abstract

The invention discloses a big data clustering algorithm based on a cloud computing platform. The raw data is pre-processed; the data is divided into M subsets and distributed to M Map functions, each of which performs local clustering on its subset; clusters with the same key are merged. If the actual number of clusters R is smaller than the number of clusters k, the number of representative points c and the shrink factor a are adjusted and clustering is carried out again until the termination condition is met. If a new data set is generated, local clustering is carried out when either of two conditions holds: the number of new data source centers K is larger than the number of clusters K obtained before updating, or the number of new data source points is larger than the number of data source points before updating. By using the parallel computing capacity of a high-performance cloud computing cluster system, the algorithm addresses the need to process massive data in clustering, so that relationships in the data can be mined rapidly and efficiently.
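
The Map and merge steps described in the abstract can be illustrated with a minimal single-process Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: the function names, the use of fixed seed points as cluster keys, round-robin partitioning into M subsets, farthest-first selection of the c representative points, and treating the factor a as a shrink factor that moves each representative point toward the cluster centroid. The re-clustering loop for R < k and the incremental update for new data sets are omitted because the text does not state how c and a are adjusted.

```python
import math
from collections import defaultdict


def centroid(points):
    """Arithmetic mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def representatives(points, c, shrink):
    """Pick up to c scattered points (farthest-first, an assumed rule),
    then move each one toward the centroid by the fraction `shrink`."""
    center = centroid(points)
    chosen = [max(points, key=lambda p: math.dist(p, center))]
    while len(chosen) < min(c, len(points)):
        chosen.append(
            max(points, key=lambda p: min(math.dist(p, q) for q in chosen))
        )
    return [
        tuple(x + shrink * (m - x) for x, m in zip(p, center)) for p in chosen
    ]


def map_local_cluster(chunk, seeds):
    """Map step: local clustering of one subset.
    The key of a point is the index of its nearest seed (an assumption)."""
    local = defaultdict(list)
    for p in chunk:
        key = min(range(len(seeds)), key=lambda i: math.dist(p, seeds[i]))
        local[key].append(p)
    return list(local.items())


def reduce_merge(mapped):
    """Merge step: combine local clusters that share the same key."""
    merged = defaultdict(list)
    for pairs in mapped:
        for key, pts in pairs:
            merged[key].extend(pts)
    return dict(merged)


def cluster(data, m, seeds, c=3, shrink=0.5):
    """Split data into m subsets, cluster each locally, merge by key,
    and return the shrunk representative points of every merged cluster."""
    chunks = [data[i::m] for i in range(m)]  # round-robin split (assumed)
    mapped = [map_local_cluster(ch, seeds) for ch in chunks if ch]
    merged = reduce_merge(mapped)
    return {k: representatives(pts, c, shrink) for k, pts in merged.items()}
```

On a real platform each call to `map_local_cluster` would run as a separate Map task and `reduce_merge` as the Reduce stage; here they run sequentially so the data flow is easy to follow.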

Description

Technical Field

[0001] The invention belongs to the technical field of data mining and relates to a big data clustering algorithm based on a cloud computing platform.

Background Art

[0002] Cluster analysis is an interdisciplinary subject spanning statistics, machine learning and data mining. It has attracted many researchers and has become a very active research topic in data mining. So far, researchers at home and abroad have proposed many clustering algorithms. The main clustering methods can be divided into partition-based methods, hierarchical methods, density-based methods, grid-based methods, and model-based methods.

[0003] At the "Sixth International Symposium on Mobile Internet" held on August 21, 2012, Deng Kan, who holds a Ph.D. in computer robotics from Carnegie Mellon, said that discovering the value in big data depends on data mining algorithms, and that there must be data mining algorithms plus cloud computing parallel...
