Parallel acceleration implementation method of N-body simulation in heterogeneous architecture

A parallel acceleration implementation method for heterogeneous architectures, applicable to multi-program devices, program control design, instruments, etc. It addresses the very large amount of computation and memory access and the long computation time of N-body simulation, and achieves correct calculation results, faster iteration, and reduced load imbalance.

Pending Publication Date: 2022-05-13
SHANGHAI JIAO TONG UNIV

AI Technical Summary

Problems solved by technology

The N-body problem mainly concerns computing how each particle in space is affected by all other particles. Because the force between each particle and every other particle must be calculated, the complexity is O(N^2); when the number of particles is large, the amount of computation and memory access is very large and the computation time is very long.
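To make the O(N^2) cost concrete, the following is a minimal sketch of direct pairwise summation in plain Python. It is not the method of the invention; the function name `direct_forces`, the gravitational force law, and the softening parameter `eps` are assumptions chosen for illustration.

```python
import math


def direct_forces(pos, mass, G=1.0, eps=1e-3):
    """Direct-summation forces (illustrative sketch, not the patented method).

    pos  : list of (x, y, z) tuples
    mass : list of particle masses
    G    : gravitational constant (assumed units)
    eps  : softening length that avoids division by zero (assumption)

    Returns (forces, pair_count); pair_count is N * (N - 1) / 2.
    """
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    pair_count = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Separation vector from particle i to particle j.
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + eps * eps
            # Magnitude factor G * m_i * m_j / r^3 (multiplied by d below).
            f = G * mass[i] * mass[j] / (r2 * math.sqrt(r2))
            for k in range(3):
                forces[i][k] += f * d[k]  # i is pulled towards j
                forces[j][k] -= f * d[k]  # equal and opposite force on j
            pair_count += 1
    return forces, pair_count
```

The number of visited pairs grows as N(N-1)/2, so doubling the particle count roughly quadruples the work, which is the cost the text above describes.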

Embodiment Construction

[0026] As shown in Figure 1, this embodiment includes the following steps:

[0027] Step 1. At startup, the program reads the particle position files *_xp_*.bin, the velocity files *_vp_*.bin, the files *_np_*.bin giving the number of particles in each coarse grid cell, the coarse-grid velocity field files *_vc_*.bin, and the checkpoint files z_checkpoint.txt and z_halofind.txt.

[0028] Step 2. Initialize the message passing interface (MPI) environment and the fast Fourier transform (FFT) environment according to the parameter settings, and assign each process a processing region. Specifically: treat the entire simulation space as one large cube and divide it equally into n^3 subspaces, called images, each handled by one process; each image is further divided into nc^3 coarse grid cells, and each coarse grid cell is divided into ncell^3 fine grid cells; the boundary region between adjacent images spans ncb coarse grid cells, that is, the buffe...
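The three-level decomposition in Step 2 can be sketched as an index calculation. This is a hedged illustration only: the function name `locate`, the periodic wrap of coordinates, and the x-fastest rank ordering are assumptions that the source does not specify. Only n, nc, and ncell come from the text.

```python
def locate(position, box_size, n, nc, ncell):
    """Map a position to (rank, image, coarse, fine) indices per axis.

    Assumptions (not stated in the source): coordinates wrap periodically
    into [0, box_size), and the process rank is numbered x-fastest.
    n     : images per dimension (n^3 images in total)
    nc    : coarse cells per image per dimension
    ncell : fine cells per coarse cell per dimension
    """
    cells_per_image = nc * ncell   # fine cells across one image
    total = n * cells_per_image    # fine cells across the whole box
    image, coarse, fine = [], [], []
    for x in position:
        # Global fine-cell index along this axis.
        g = int((x % box_size) / box_size * total)
        g = min(g, total - 1)      # guard against floating-point round-up
        image.append(g // cells_per_image)
        coarse.append((g % cells_per_image) // ncell)
        fine.append(g % ncell)
    rank = image[0] + n * (image[1] + n * image[2])
    return rank, tuple(image), tuple(coarse), tuple(fine)
```

For example, with `n=2`, `nc=4`, and `ncell=4` each axis holds 32 fine cells, and a particle at x = 0.5 in a unit box falls in the first fine cell of the second image along that axis.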

Abstract

A parallel acceleration implementation method for N-body simulation in a heterogeneous architecture. After initialization, the buffer area is updated and data is transferred from CPU memory to GPU video memory; the GPU then computes the short-range forces between particles in turn using a bucket algorithm and returns the result to the CPU. The CPU computes the long-range forces and accelerations of the particles, updates the particle velocities, and finally updates the velocity information of the particles in the buffer area; the simulation ends when the termination conditions are met. In this method, main-program computation, data reading, data output, and similar functions run on the CPU, while the hotspot functions of the program run on the GPU. Based on a particle-mesh algorithm, the force between particles is split into a short-range force and a long-range force, and fast computation is achieved by exploiting the computing power and architectural characteristics of both the CPU and the GPU.
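The abstract does not spell out the bucket algorithm, so the following is only a generic cell-list sketch of the idea, written in plain Python on the CPU rather than as a GPU kernel. The names `build_buckets` and `short_range_pairs` and the choice of bucket side equal to the cutoff radius are assumptions. Particles are binned into cubic buckets, and each particle is compared only with particles in its own and the 26 neighbouring buckets.

```python
from itertools import product


def dist2(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((a[k] - b[k]) ** 2 for k in range(3))


def build_buckets(pos, cutoff):
    """Bin particle indices into cubic buckets of side `cutoff` (assumed size)."""
    buckets = {}
    for idx, p in enumerate(pos):
        key = tuple(int(c // cutoff) for c in p)
        buckets.setdefault(key, []).append(idx)
    return buckets


def short_range_pairs(pos, cutoff):
    """Return the set of index pairs (i, j), i < j, closer than `cutoff`.

    Only the 27 buckets around each particle are searched, so the work
    scales with the local particle density instead of with N^2.
    """
    buckets = build_buckets(pos, cutoff)
    limit = cutoff * cutoff
    pairs = set()
    for key, members in buckets.items():
        for offset in product((-1, 0, 1), repeat=3):
            neighbour = tuple(k + o for k, o in zip(key, offset))
            for i in members:
                for j in buckets.get(neighbour, ()):
                    if i < j and dist2(pos[i], pos[j]) <= limit:
                        pairs.add((i, j))
    return pairs
```

Because two particles within the cutoff can differ by at most one bucket index along each axis, searching the neighbouring buckets finds every short-range pair. In a GPU version, each bucket would typically be handled by its own group of threads, but that mapping is not described in the source.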

Description

Technical field

[0001] The present invention relates to the field of computer simulation, in particular to a method for parallel acceleration of N-body simulation in a heterogeneous architecture, which can be applied in industrial fields such as new material research and development, medical research and development, and real-time simulation in games.

Background technique

[0002] The N-body problem is one of the most representative, challenging, and important topics in high-performance computing, and it has a wide range of applications. When the particles are celestial bodies in the macroscopic world, the formation of galaxies can be simulated; when the particles are molecules or plasma in the microscopic world, the process of nuclear fusion can be simulated. The N-body problem mainly concerns computing how each particle in space is affected by all other particles. Since it is necessary to calculate the force ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/50
CPC: G06F9/5027, G06F9/505, G06F2209/5018
Inventor: 文敏华, 胡航, 王一超, 韦建文, 林新华
Owner: SHANGHAI JIAO TONG UNIV