An adaptive parallel method for traversing neighbors within a fixed radius under a CPU-GPU heterogeneous framework

An adaptive parallel traversal technique, applied in the field of high-performance computing, that addresses the waste of computing resources and achieves low cost and less redundant computation.

Active Publication Date: 2019-03-01
EAST CHINA NORMAL UNIV
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Traditional parallel models often run a given problem on only the best-suited processor (CPU or GPU) in the host computer and leave the other devices idle, which wastes computing resources.



Examples


Embodiment Construction

[0051] The present invention will be described in detail below in conjunction with the accompanying drawings.

[0052] Outline design

[0053] As in the traditional method, the present invention treats the simulation space as a grid of disjoint cells, so every point in the space falls into exactly one cell. The side length of a cell equals the lookup radius. Each point therefore only needs to examine the points in its own cell and in the surrounding cells to find every point within the specified distance. This reduces the time complexity from the O(n^2) of the brute-force method to O(3^k wnN).
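The grid lookup described in [0053] can be sketched as follows. This is a minimal, sequential Python illustration and not the patented GPU implementation; the function names (build_grid, neighbors_grid, neighbors_brute) are hypothetical, and the sketch assumes a non-empty list of points given as coordinate tuples. Because the cell side equals the lookup radius, two points within that radius can differ by at most one cell index per axis, so checking a cell and its adjacent cells (3^k cells in k dimensions) is sufficient.

```python
from collections import defaultdict
from itertools import product
import math

def build_grid(points, radius):
    """Map each cell (tuple of integer indices) to the indices of the points inside it."""
    grid = defaultdict(list)
    for i, p in enumerate(points):
        # Cell side length equals the lookup radius.
        cell = tuple(math.floor(c / radius) for c in p)
        grid[cell].append(i)
    return grid

def neighbors_grid(points, radius):
    """For every point, list the other points within `radius`,
    looking only at the point's own cell and the adjacent cells."""
    grid = build_grid(points, radius)
    k = len(points[0])  # dimension of the space
    offsets = list(product((-1, 0, 1), repeat=k))  # 3^k cell offsets
    result = [[] for _ in points]
    for cell, members in grid.items():
        for off in offsets:
            other = tuple(c + o for c, o in zip(cell, off))
            candidates = grid.get(other, ())  # .get does not create empty cells
            for i in members:
                for j in candidates:
                    if i != j and math.dist(points[i], points[j]) <= radius:
                        result[i].append(j)
    return [sorted(r) for r in result]

def neighbors_brute(points, radius):
    """Reference O(n^2) search: compare every pair of points."""
    n = len(points)
    return [[j for j in range(n)
             if j != i and math.dist(points[i], points[j]) <= radius]
            for i in range(n)]
```

Both functions return the same neighbor lists; the grid version simply skips pairs whose cells are not adjacent.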

[0054] To overcome the load imbalance among the threads of a thread block and the inconsistent memory access of the traditional method, the present invention proposes an adaptive parallel method, so that the points handled by the threads in a thread block are in the same cell mid...
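The paragraph above is truncated in this excerpt, so the following is only an illustration of its stated goal (all points handled by one thread block come from the same cell), not the patent's actual regrouping scheme. It is written in plain Python rather than GPU code; assign_blocks, cell_of_point and block_size are hypothetical names, and a "block" is modeled as a plain list of point indices.

```python
def assign_blocks(cell_of_point, block_size):
    """Group point indices into blocks so that each block holds points
    from a single cell.

    Returns a list of (cell, [point indices]) pairs. A block may hold
    fewer than block_size points, which models idle threads in the last
    block of a cell.
    """
    by_cell = {}
    for idx, cell in enumerate(cell_of_point):
        by_cell.setdefault(cell, []).append(idx)
    blocks = []
    for cell in sorted(by_cell):
        members = by_cell[cell]
        # Split a crowded cell across several blocks; never mix cells.
        for start in range(0, len(members), block_size):
            blocks.append((cell, members[start:start + block_size]))
    return blocks
```

With cell_of_point = [2, 0, 2, 2, 1, 0, 2] and block_size = 2, the points of cell 2 fill two blocks while cell 1 occupies a block with one idle slot. Every thread of a block would then read the same set of neighboring cells, which is the property the paragraph aims for.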


Abstract

The invention discloses an adaptive parallel algorithm for traversing neighbors within a fixed radius under a CPU-GPU (Central Processing Unit-Graphics Processing Unit) heterogeneous framework. The algorithm uses a parallel model that matches the characteristics of the GPU to the characteristics of the problem. First, it introduces the concept of adaptive parallelism to regroup the threads in the GPU so that physically adjacent threads process logically similar work, which lets the algorithm exploit the locality features of the GPU. Second, it uses the CPU-GPU heterogeneous framework so that the CPU cooperatively handles some of the inefficient tasks that adaptive parallelism produces on the GPU. To demonstrate its characteristics, the algorithm is applied to the smoothed particle hydrodynamics (SPH) method and compared with existing methods; it shows a large advantage when processing large-scale, high-density particle sets.

Description

Technical field

[0001] The invention belongs to the field of high-performance computing. Specifically, it is a new parallel method for traversing neighbors within a fixed radius, based on an adaptive parallel method under a CPU-GPU heterogeneous framework. It involves SIMD architecture, GPU hardware characteristics, task scheduling and load balancing on heterogeneous platforms, data interaction strategies, and computer graphics and simulation.

Background art

[0002] The fixed-radius near neighbors (FNN) problem concerns the interactions between all points within a given distance of one another in a multi-dimensional Euclidean space. The fixed grid method is the most widely used approach, especially in numerical methods, and it is widely applied in natural environment simulation, biological simulation, behavior simulation and 3D reconstruction. With this method, the time complexity of constructing neighbor information can be reduced to O(wn) (if a non-comparison-based sorting method is used), and...
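To illustrate how neighbor information can be built without a comparison-based sort, the sketch below groups points by cell with a counting sort. This is an assumption for illustration: the excerpt does not say which non-comparison-based sort is meant, nor does it define w, and the names sort_by_cell, cell_ids and num_cells are hypothetical. Counting sort runs in time proportional to the number of points plus the number of cells.

```python
def sort_by_cell(cell_ids, num_cells):
    """Counting sort of point indices by cell id.

    Returns (order, cell_start): `order` lists point indices grouped by
    cell, and order[cell_start[c]:cell_start[c + 1]] are the points of
    cell c.
    """
    # Pass 1: count the points in each cell.
    counts = [0] * num_cells
    for c in cell_ids:
        counts[c] += 1
    # Prefix sum turns the counts into the start offset of each cell.
    cell_start = [0] * (num_cells + 1)
    for c in range(num_cells):
        cell_start[c + 1] = cell_start[c] + counts[c]
    # Pass 2: scatter each point index into its cell's slice.
    cursor = cell_start[:-1].copy()
    order = [0] * len(cell_ids)
    for idx, c in enumerate(cell_ids):
        order[cursor[c]] = idx
        cursor[c] += 1
    return order, cell_start
```

After this step, the points of any cell are a contiguous slice of `order`, so a neighbor query only has to read the slices of the adjacent cells.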


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/46, G06F9/50
CPC: G06F9/466, G06F9/5083
Inventor: 阮骥鸣, 王长波, 秦洪
Owner: EAST CHINA NORMAL UNIV