Data parallel sequencing method and system

A sorting method in the field of data technology, applied to data classification, input data processing, concurrent instruction execution, etc. It addresses the need to improve sorting speed and efficiency, and it avoids extra workload and cache space while achieving high scalability and fast transmission speed.
CN103530084A · Inactive · Publication Date: 2014-01-22 · BEIJING QIHOO TECH CO LTD +1

Patent Information

Authority / Receiving Office: CN · China
Current Assignee / Owner: BEIJING QIHOO TECH CO LTD
Publication Date: 2014-01-22
Estimated Expiration: Not applicable · inactive patent


Abstract

The invention discloses a data parallel sorting method and system. The system comprises a data source, a plurality of parallel processing units connected to the data source through a network, and a communication interface. The method comprises the following steps: the data to be sorted are divided into a plurality of data blocks, and each parallel processing unit obtains a data block and samples it; a first parallel processing unit gathers and sorts the sampled data from all parallel processing units and determines a global sorting interval sequence according to the parallel processing units, where the data intervals in that sequence correspond, in order, to the parallel processing units; each parallel processing unit determines the data interval to which each item in its data block belongs and distributes the item to the corresponding parallel processing unit; each parallel processing unit sorts the data it receives; and the sorted results of the parallel processing units are combined in order. The method and system improve the sorting speed for large-scale data and scale well with data volume.
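The steps in the abstract resemble what is commonly called a sample sort. The following is a minimal single-process sketch of that idea, not the patented implementation: each "parallel processing unit" is simulated by a Python list, there is no network or communication interface, and the function name `parallel_sample_sort`, the parameter `samples_per_unit`, the round-robin block split, and the rule for choosing interval boundaries from the sorted samples are all assumptions made for illustration.

```python
import bisect
import random


def parallel_sample_sort(data, num_units, samples_per_unit=4, seed=0):
    """Sequentially simulate the abstract's steps; each unit is a plain list.

    Hypothetical sketch: block splitting, sample size, and splitter
    selection are illustrative assumptions, not taken from the patent.
    """
    rng = random.Random(seed)

    # Step 1: divide the data into one block per unit (round-robin split),
    # then let each unit draw a small random sample from its own block.
    blocks = [data[i::num_units] for i in range(num_units)]
    samples = []
    for block in blocks:
        k = min(samples_per_unit, len(block))
        samples.extend(rng.sample(block, k))

    # Step 2: the "first unit" gathers and sorts all samples, then picks
    # num_units - 1 boundaries so that interval i belongs to unit i.
    samples.sort()
    if samples:
        splitters = [samples[(i * len(samples)) // num_units]
                     for i in range(1, num_units)]
    else:
        splitters = []

    # Step 3: each unit decides which interval every item falls into and
    # sends the item to the unit that owns that interval.
    received = [[] for _ in range(num_units)]
    for block in blocks:
        for item in block:
            target = bisect.bisect_right(splitters, item)
            received[target].append(item)

    # Step 4: each unit sorts the data it received.
    for bucket in received:
        bucket.sort()

    # Step 5: combine the per-unit results in interval order.
    return [item for bucket in received for item in bucket]


if __name__ == "__main__":
    print(parallel_sample_sort([7, 2, 9, 4, 1, 8, 3, 6, 5, 0], num_units=3))
```

Because every item sent to unit i is smaller than every item sent to unit i + 1, the final combination step is a simple concatenation and needs no further comparison between units. In a real deployment the quality of the sampled boundaries would determine how evenly the load is spread across units.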

Description

Technical field

[0001] The invention relates to a data processing method and system, in particular to a data parallel sorting method and system.

Background technique

[0002] Global sorting of data is a common operation in large-scale data processing, for example in PageRank calculation. Traditional sorting algorithms can be divided into internal sorting and external sorting. Internal sorting includes insertion sort, quicksort, etc., and requires all data to be loaded into memory for computation; when the data to be sorted is large, the memory of a single machine becomes the bottleneck. External sorting is mainly based on multi-way merge; it can handle large-scale data, but it is slow. At present, PageRank calculation requires a global sort of the final results, and the data scale is hundreds of GB. Given this scale, a parallel mechanism needs to be introduced. However, the existing parallel ...
