GPU-based distributed big data parallel computing method

A parallel computing and big data technique, applied in computing, electrical digital data processing, and resource allocation, that addresses the high cost, complex management, and insufficient number of working nodes, and thereby improves efficiency.

Inactive Publication Date: 2019-08-30

BEIJING INSTITUTE OF TECHNOLOGY

1 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

Adding working nodes by adding CPUs to the network or by increasing the number of physical CPU cores is costly and complex to manage, and the resulting number of working nodes is still far too small to reach, or even approach, parallelism at the key-value pair level.

Examples

Embodiment 1

[0035] Embodiment 1 implements the GPU-based distributed big data parallel computing method proposed by the present invention. Its data flow diagram is shown in Figure 1.

[0036] This embodiment is based on the design idea of Google MapReduce. The most direct way to improve the efficiency of the Map and Reduce phases is to increase the number of working nodes and further subdivide the parallel granularity. However, increasing the number of working nodes by adding CPUs to the network or by increasing the number of physical CPU cores is costly and complicated to manage, and the resulting number of working nodes is still far too small to reach, or even approach, parallelism at the key-value pair level.
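The granularity argument above can be illustrated with a minimal Python sketch. It assumes a strided assignment of key-value pair indices to working nodes, which the source does not specify; the function names `assign_pairs` and `rounds_needed` and the counts used are hypothetical and chosen only for illustration.

```python
def assign_pairs(num_pairs, num_workers):
    """Return, for each worker, the indices of the key-value pairs it handles.

    Assumption: pairs are dealt out in a strided fashion (worker w takes
    indices w, w + num_workers, w + 2 * num_workers, ...).
    """
    return [list(range(w, num_pairs, num_workers)) for w in range(num_workers)]


def rounds_needed(num_pairs, num_workers):
    """Sequential rounds needed when every worker handles one pair per round."""
    return -(-num_pairs // num_workers)  # ceiling division


# Hypothetical counts: 8 input pairs on 2 workers versus 8 workers.
few_workers = assign_pairs(8, 2)
one_worker_per_pair = assign_pairs(8, 8)

print(few_workers)            # each worker loops over four pairs
print(one_worker_per_pair)    # each worker holds exactly one pair
print(rounds_needed(8, 2), rounds_needed(8, 8))
```

When the number of workers equals the number of pairs, every worker holds exactly one pair and a single round suffices, which is the key-value pair level parallelism the paragraph refers to.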

[0037] A GPU is massively parallel computing hardware whose thread architecture and storage structure can be abstracted as the structure shown in Figure 2. Each computing device (Compute Device) has several computing units (Compute Unit), and...

Abstract

The invention relates to a GPU-based distributed big data parallel computing method comprising Map, Group, and Reduce steps. In the Map step, a user program is executed on each input key-value pair, converting it into intermediate key-value pairs. In the Group step, all intermediate key-value pairs are sorted and grouped. In the Reduce step, the user program processes the grouped intermediate key-value pairs to obtain the final calculation result. In the Map and Reduce steps, each working node corresponds to one GPU thread, and the input key-value pairs are submitted to different GPU threads for parallel processing. The GPU serves as a distributed working node for big data parallel computing, and device memory, thread scheduling, and data sorting are managed and optimized during the distributed computation, so that distributed computing efficiency is improved.
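The three steps can be sketched in plain Python as follows. This is a CPU-only illustration under stated assumptions, not the patented GPU implementation: each loop iteration stands in for one GPU thread, the word-count user programs are hypothetical, and every function name here is invented for the example.

```python
from itertools import groupby
from operator import itemgetter


def map_step(pairs, user_map):
    """Run the user map program on every input key-value pair.

    Each iteration is independent of the others, so on a GPU each one
    could be handed to its own thread (simulated sequentially here).
    """
    intermediate = []
    for key, value in pairs:
        intermediate.extend(user_map(key, value))
    return intermediate


def group_step(intermediate):
    """Sort the intermediate pairs by key, then group values sharing a key."""
    ordered = sorted(intermediate, key=itemgetter(0))
    return [(key, [value for _, value in group])
            for key, group in groupby(ordered, key=itemgetter(0))]


def reduce_step(groups, user_reduce):
    """Run the user reduce program on every group independently."""
    return [user_reduce(key, values) for key, values in groups]


# Hypothetical user programs: a word count.
def word_count_map(_line_number, line):
    return [(word, 1) for word in line.split()]


def word_count_reduce(word, counts):
    return (word, sum(counts))


def run(pairs):
    """Chain Map, Group, and Reduce over the input pairs."""
    intermediate = map_step(pairs, word_count_map)
    groups = group_step(intermediate)
    return reduce_step(groups, word_count_reduce)


result = run([(0, "gpu map reduce"), (1, "gpu reduce")])
print(result)  # [('gpu', 2), ('map', 1), ('reduce', 2)]
```

Grouping by sorting mirrors the abstract's statement that intermediate pairs are sorted and grouped. A hash-based grouping would give the same groups but without the key ordering.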

Description

Technical field

[0001] The invention relates to a parallel computing method, in particular to a GPU-based distributed big data parallel computing method.

Background art

[0002] MapReduce was first proposed by Google as a parallel computing model and method for large-scale data processing. Google published the basic principles and main design ideas of MapReduce in two papers. Apache Hadoop is a set of open source software utilities that is essentially an open source implementation of Google's MapReduce framework.

[0003] The idea of MapReduce is not complicated: its core is to process data at each stage in the form of key-value pairs. MapReduce is generally divided into three stages: Map, Group, and Reduce. The input, output, and processing flow of the three stages are as follows:

[0004] The input of the Map stage is a data set of key-value pairs in a prescribed form. The input phase of Map has no special requirements for...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06F9/50

CPC: G06F9/5027

Inventors: 黄天羽, 毛续锟, 丁刚毅, 李鹏

Owner: BEIJING INSTITUTE OF TECHNOLOGY
