Method for improving the computation performance of a CPU (Central Processing Unit) + GPU (Graphics Processing Unit) heterogeneous device

A heterogeneous computing technique, applied in computing, machine execution devices, complex mathematical operations, and similar fields, that improves processing speed, offers strong portability, and reduces read-write conflicts.

Active Publication Date: 2013-04-17
中原动力智能机器人有限公司

AI Technical Summary

Problems solved by technology

[0005] The problem to be solved by the present invention concerns the calculation speed of existing technology platforms: the invention proposes a heterogeneous device based on CPU+GPU, which realizes th...

Examples


Embodiment Construction

[0014] The technical solution of the present invention will be described in further detail below in conjunction with the accompanying drawings and embodiments.

[0015] As shown in Figure 3, a heterogeneous device can be a computer with a general-purpose graphics card, or an intelligent terminal with a single-chip processor that integrates a CPU and a GPU. The CPU acts as the controller and includes a GPU-side control part and a data storage part. The GPU acts as the processor and includes a global memory for storing data and multiple thread blocks. Each thread block can execute independently and in parallel, and each contains its own shared memory and multiple threads, each of which can also execute in parallel. On the heterogeneous device of Figure 3, the method follows the process in Figure 1, the flowchart of the present invention, which comprises the following major steps: (1) the CPU side accepts the multiplicands and multipliers for which the user needs to calculate M pairs of N-bit multiplicati...

Abstract

The invention relates to the field of high-performance computing and provides a method for improving the computation performance of a CPU (Central Processing Unit) + GPU (Graphics Processing Unit) heterogeneous device. The method accelerates large-scale multi-precision computation on a CPU+GPU heterogeneous device. According to the technical scheme, the method comprises the following steps: first, the CPU transfers all multipliers and multiplicands to the GPU; then, each GPU thread block independently processes one multi-precision multiplication (one multiplier and multiplicand pair) in parallel with the other blocks, while inside each thread block the computation and the shift (translation) for that multiplication are themselves executed in parallel; finally, the results are collated and returned to CPU memory, which yields the computation result. The method applies GPU parallel processing to a large number of computation tasks and greatly improves computation performance.
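The data flow in the abstract can be illustrated with a small CPU-only emulation. This is a minimal sketch under stated assumptions, not the patented implementation: the 16-bit limb width, the column-wise schoolbook scheme, the reading of the "translation" step as carry propagation, and all function names (`to_limbs`, `column_sum`, `multiply_pair`, `multiply_batch`) are illustrative choices that do not come from the source. Each call to `multiply_pair` stands in for one thread block, and each call to `column_sum` stands in for one thread inside it.

```python
# Sketch only: a sequential Python emulation of "one multiplication per
# thread block, columns computed independently, then carries resolved".
LIMB_BITS = 16                      # assumed limb width; the source states none
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(value, count):
    """Split a non-negative integer into `count` little-endian limbs."""
    return [(value >> (LIMB_BITS * i)) & LIMB_MASK for i in range(count)]


def from_limbs(limbs):
    """Recombine little-endian limbs into a Python integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def column_sum(a, b, k):
    """Work of one hypothetical thread: sum of a[i] * b[j] with i + j == k."""
    lo = max(0, k - (len(b) - 1))
    hi = min(k, len(a) - 1)
    return sum(a[i] * b[k - i] for i in range(lo, hi + 1))


def multiply_pair(a, b):
    """Work of one hypothetical thread block: one multi-precision product."""
    # Phase 1: every output column is independent of the others, so on a GPU
    # each column could be assigned to its own thread.
    columns = [column_sum(a, b, k) for k in range(len(a) + len(b) - 1)]
    # Phase 2: carry propagation. It is written sequentially here for clarity;
    # this sketch does not attempt a parallel carry scheme.
    result, carry = [], 0
    for col in columns:
        total = col + carry
        result.append(total & LIMB_MASK)
        carry = total >> LIMB_BITS
    result.append(carry)            # final carry always fits in one limb
    return result


def multiply_batch(pairs, limb_count):
    """Host-side driver: dispatch one pair per emulated block, gather results."""
    products = []
    for x, y in pairs:              # on a GPU these iterations would be concurrent
        limbs = multiply_pair(to_limbs(x, limb_count), to_limbs(y, limb_count))
        products.append(from_limbs(limbs))
    return products
```

Calling `multiply_batch(pairs, limb_count)` with a list of integer pairs, each operand below `2 ** (16 * limb_count)`, returns the products in input order, so the output can be checked directly against Python's built-in integer multiplication. Separating the column sums from the carry pass is what exposes the independent work: the column sums do not depend on one another, whereas carries do.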

Description

Technical field

[0001] The invention relates to the field of computer high-performance computing, and in particular to a method for improving the computing performance of a CPU+GPU heterogeneous device.

Background technique

[0002] Large-scale multi-precision numerical calculations are often used in fields such as data encryption and decryption, architectural simulation verification, and reliable calculations for scientific research. Because the number of digits involved is far greater than the hardware precision of current computer processors (CPUs offer at most 64-bit or 128-bit precision), the calculation has to be extended on top of the hardware precision. The traditional solution relies mainly on serial processing by the CPU. Its computing speed is constrained by the pace of processor development, so it has significant limitations and can no longer meet growing computing needs.

[0003] Therefore, large-scale data processing using GPUs in conjunction with CPUs has also gradually emerged, such as:...


Application Information

IPC(8): G06F7/523, G06F9/38, G06F17/14
Inventors: 李清都, 胡明, 杨芳艳, 唐宋, 冯鑫, 胡诗沂, 徐桂兰
Owner: 中原动力智能机器人有限公司