Sample Distributed Clustering Calculation Method and Device

A distributed clustering calculation method and device, applied in the field of information technology, that addresses the difficulty of handling large-scale data while preserving clustering quality, improving computing efficiency, and widening the range of application.
CN105095382B · Active · Publication Date: 2018-09-14 · 北京鸿享技术服务有限公司

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
北京鸿享技术服务有限公司
Publication Date
2018-09-14


Abstract

The invention provides a method and a device for distributed clustering calculation of samples. The method comprises: acquiring the characteristic values of all samples to be clustered to form a characteristic value set; estimating the calculation speed of each kind of available computing device; and repeating the following step until the similarity between any two characteristic values in the characteristic value set is less than a preset threshold: distributing all characteristic values in the set to at least one kind of computing device according to the calculation speed of each available computing device, so that each such device, with its processing time satisfying a preset condition, screens the characteristic values distributed to it until the similarity between any two of them is less than the preset threshold. The method and device address the difficulty that existing clustering algorithms have with large-scale data, enable distributed clustering calculation on data volumes that the prior art can hardly handle, and effectively improve the efficiency of the clustering calculation.
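The loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names (`split_by_speed`, `screen`, `distributed_screen`, `closeness`), the proportional split of the set by speed, the greedy screening rule, the stand-in similarity measure, and the final single-device pass used to guarantee the global condition. The devices are simulated sequentially in one process, and the processing-time condition is not modeled.

```python
from typing import Callable, Dict, List


def split_by_speed(items: List[float], speeds: Dict[str, float]) -> Dict[str, List[float]]:
    """Split items into contiguous shares roughly proportional to device speed."""
    total = sum(speeds.values())
    names = list(speeds)
    shares: Dict[str, List[float]] = {}
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = len(items)  # last device takes whatever remains
        else:
            end = min(len(items), start + int(len(items) * speeds[name] / total))
        shares[name] = items[start:end]
        start = end
    return shares


def screen(values: List[float], similarity: Callable[[float, float], float],
           threshold: float) -> List[float]:
    """Greedy screening: keep a value only if it is below the threshold
    against every value already kept."""
    kept: List[float] = []
    for v in values:
        if all(similarity(v, k) < threshold for k in kept):
            kept.append(v)
    return kept


def distributed_screen(features: List[float], speeds: Dict[str, float],
                       similarity: Callable[[float, float], float],
                       threshold: float, max_rounds: int = 10) -> List[float]:
    """Repeat distribute-and-screen rounds, then finish with one global pass."""
    current = list(features)
    for _ in range(max_rounds):
        shares = split_by_speed(current, speeds)
        survivors = [v for share in shares.values()
                     for v in screen(share, similarity, threshold)]
        if len(survivors) == len(current):
            break  # no device removed anything in this round
        current = survivors
    # Assumption: a final pass on a single device enforces the global condition,
    # since values placed in different shares are never compared above.
    return screen(current, similarity, threshold)


def closeness(a: float, b: float) -> float:
    """Hypothetical similarity for scalar features: 1.0 when equal, near 0 when far apart."""
    return 1.0 / (1.0 + abs(a - b))


result = distributed_screen([1.0, 1.2, 5.0, 5.1, 9.0, 1.1],
                            {"cpu": 1.0, "gpu": 3.0}, closeness, 0.5)
print(result)
```

Each surviving value can be read as the representative of one cluster. Because every round shrinks the set before the final pass, the expensive all-pairs comparison runs on far fewer values than the original input, which is the efficiency argument the abstract makes.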

Description

Technical Field

[0001] The invention relates to the field of information technology, and in particular to a sample distributed clustering calculation method and device.

Background Art

[0002] Clustering is the process of dividing a collection of physical or abstract objects into multiple classes composed of similar objects. It is widely used in processing many kinds of information, such as the integration and analysis of news texts, the organization of data files, and index creation. In the prior art, common clustering algorithms can be divided into partitioning methods, hierarchical methods, density-based methods, grid-based methods, and model-based methods.

[0003] For example, in the partitioning method, given a data set with N tuples or records, the method constructs K groups, each representing a cluster, with K < N. These K groups satisfy the following conditions:

[0004] (1) Each group contains at least one data re...

Claims
