Data parallel sequencing method and system

A data sorting method, applicable to data classification, input data processing, concurrent instruction execution, and similar fields. It addresses the limited speed and efficiency of existing approaches, and aims to avoid extra workload and cache space usage while providing high scalability and fast transmission speed.

Status: Inactive
Publication Date: 2014-01-22
BEIJING QIHOO TECH CO LTD +1

AI Technical Summary

Problems solved by technology

Existing parallel sorting methods perform the sampling step serially, so their speed and efficiency still leave room for improvement.

Detailed description of the embodiments

[0013] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

[0014] Figure 1 shows a parallel sorting method according to an embodiment of the present invention. The method is based on a system comprising a data source, a plurality of parallel processing units connected to the data source through a network, and an inter-unit communication interface, such as MPI, that enables the parallel processing units to exchange data with each other, wherein the number of parallel proc...

Abstract

The invention discloses a data parallel sorting method and system. The system comprises a data source, a plurality of parallel processing units connected to the data source through a network, and a communication interface. The method comprises the following steps: the data to be sorted are divided into a plurality of data blocks, and each parallel processing unit obtains a data block and samples it; a first parallel processing unit gathers and sorts the sampled data from all parallel processing units and determines a global sorting interval sequence according to the parallel processing units, with the data intervals in that sequence corresponding, in order, to the parallel processing units; each parallel processing unit determines the data interval to which each item in its data block belongs and distributes the item to the corresponding parallel processing unit; each parallel processing unit receives the data sent to it and sorts them; and the sorted results of the parallel processing units are merged in order. The method and system improve the sorting speed for large-scale data while scaling well with data volume.
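
The steps in the abstract resemble a sample sort. The following is a minimal single-process sketch of those steps, not the patented implementation: each "parallel processing unit" is simulated by a Python list, and the function name `parallel_sample_sort`, the parameter `samples_per_unit`, the round-robin block split, and the rule for picking interval boundaries from the sorted samples are all assumptions made for illustration. In a real deployment the data exchange would go through a communication interface such as MPI.

```python
import random
from bisect import bisect_right


def parallel_sample_sort(data, num_units, samples_per_unit=8, seed=0):
    """Simulate the abstract's steps in one process (hypothetical helper)."""
    rng = random.Random(seed)

    # Step 1: divide the data into one block per processing unit
    # (round-robin split is an assumption; the source does not specify it).
    blocks = [data[i::num_units] for i in range(num_units)]

    # Step 2: every unit samples its own block independently.
    samples = []
    for block in blocks:
        k = min(samples_per_unit, len(block))
        samples.extend(rng.sample(block, k))

    # Step 3: the first unit gathers and sorts the samples, then picks
    # num_units - 1 boundaries that define num_units ordered intervals
    # (evenly spaced boundaries are an assumption).
    samples.sort()
    splitters = [
        samples[(i * len(samples)) // num_units] for i in range(1, num_units)
    ] if samples else []

    # Step 4: every unit routes each item to the unit that owns its interval.
    inboxes = [[] for _ in range(num_units)]
    for block in blocks:
        for item in block:
            inboxes[bisect_right(splitters, item)].append(item)

    # Step 5: every unit sorts the data it received.
    sorted_parts = [sorted(inbox) for inbox in inboxes]

    # Step 6: intervals are ordered, so concatenating the per-unit results
    # in unit order yields the globally sorted output.
    return [item for part in sorted_parts for item in part]
```

Because the intervals are disjoint and ordered, the final combination needs no comparisons between units; how evenly the work is spread depends on how well the samples represent the data.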

Description

Technical field

[0001] The invention relates to a data processing method and system, in particular to a data parallel sorting method and system.

Background

[0002] Global sorting is a common operation in large-scale data processing, for example in PageRank calculation. Traditional sorting algorithms can be divided into internal sorting and external sorting. Internal sorting includes insertion sort, quicksort, and similar algorithms, and requires all data to be loaded into memory. When the data to be sorted is large, the memory of a single machine becomes the bottleneck. External sorting is mainly based on multi-way merge; it can handle large-scale data but is slow. At present, PageRank calculation requires a global sort of the final results, and the data scale is hundreds of GB. Given this scale, a parallel mechanism needs to be introduced. However, the existing parallel ...
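
For context, the external sorting based on multi-way merge mentioned in the background can be sketched as follows. This is an illustrative sketch only, assuming integer values and temporary text files for the sorted runs; the function name `external_sort` and the parameter `run_size` are hypothetical and do not come from the patent.

```python
import heapq
import os
import tempfile


def external_sort(values, run_size):
    """Sort integers via sorted runs on disk plus a k-way merge (sketch)."""
    run_paths = []

    def flush(run):
        # Sort one memory-sized run and write it to its own temporary file.
        fd, path = tempfile.mkstemp(suffix=".run")
        with os.fdopen(fd, "w") as handle:
            handle.writelines(f"{value}\n" for value in sorted(run))
        run_paths.append(path)

    run = []
    for value in values:
        run.append(value)
        if len(run) >= run_size:
            flush(run)
            run = []
    if run:
        flush(run)

    files = [open(path) for path in run_paths]
    try:
        # Each stream reads its run lazily; heapq.merge keeps only one
        # pending item per run in memory while merging.
        streams = [(int(line) for line in handle) for handle in files]
        # Returning a list keeps the sketch short; a real external sort
        # would stream the merged output to disk instead.
        return list(heapq.merge(*streams))
    finally:
        for handle in files:
            handle.close()
        for path in run_paths:
            os.remove(path)
```

Every item is written to disk and read back at least once, which is one reason this approach handles large inputs on a single machine but is slow.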

Application Information

IPC(8): G06F7/24, G06F9/38, G06F17/30
Inventor 陈建唐会军齐路
Owner BEIJING QIHOO TECH CO LTD