Method for increasing calculation speed of SMP cluster system through MPI and OpenMP in hybrid parallel mode

A technology for improving the computing speed of cluster systems, applied in the fields of concurrent instruction execution, machine execution devices, etc. It addresses problems such as the parallel optimization of algorithms and achieves the effects of reducing synchronization, reducing the number of barriers, and improving computing speed.

Active Publication Date: 2015-03-25
INST OF SOFTWARE APPL TECH GUANGZHOU & CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

However, in addition to the sparse matrix-vector multiplication, the conjugate gradient method also has calculation steps such as the multiplication and summation of multiple vectors and


Examples


Embodiment 1

[0087] In this embodiment, the method of using hybrid MPI and OpenMP parallelism to improve the calculation speed of an SMP cluster system is applied to solving large-scale systems of linear equations. The preconditioned conjugate gradient method is an iterative method for solving linear systems whose coefficient matrix is sparse, symmetric, and positive definite. It is widely used in engineering and scientific computing. Its algorithm is as follows:

[0088] Take $x^{(0)} \in \mathbb{R}^n$, calculate $r^{(0)} = b - A x^{(0)}$, and let $p^{(0)} = r^{(0)}$.

[0089] For $k = 0, 1, 2, \ldots$, calculate

[0090]

[0091] $x^{(k+1)} = x^{(k)} + \alpha_k p^{(k)}$

[0092] $r^{(k+1)} = b - A x^{(k+1)} = r^{(k)} - \alpha_k A p^{(k)}$

[0093] If the stopping criterion is satisfied, output $x' \equiv x^{(k+1)}$ and stop the calculation. Otherwise, calculate

[0094]

[0095] $p^{(k+1)} = r^{(k+1)} + \beta_k p^{(k)}$

[0096] Here, in large-scale engineering and computing problems, x is the vector to be solved for, b is a known vector, and A is the coefficient matrix, which is usually a large sparse matrix. A sparse matrix is ...
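The iteration above can be sketched serially in Python. This is a minimal illustration, not the patented implementation. The expressions for $\alpha_k$ and $\beta_k$ and the stopping test do not appear in the text, so the sketch assumes the textbook conjugate gradient forms $\alpha_k = (r^{(k)}, r^{(k)}) / (p^{(k)}, A p^{(k)})$ and $\beta_k = (r^{(k+1)}, r^{(k+1)}) / (r^{(k)}, r^{(k)})$, and it stops when the residual 2-norm falls below a tolerance. The CSR storage layout and all function and parameter names (`csr_matvec`, `conjugate_gradient`, `tol`, `max_iter`) are hypothetical choices made for the example.

```python
from math import sqrt


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a sparse matrix stored in CSR form by a dense vector x."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        # Nonzeros of row i live in values[row_ptr[i]:row_ptr[i + 1]].
        for j in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[j] * x[col_idx[j]]
        y.append(acc)
    return y


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(values, col_idx, row_ptr, b, x0, tol=1e-10, max_iter=1000):
    """Unpreconditioned CG; returns (approximate solution, iterations used)."""
    x = list(x0)
    ax = csr_matvec(values, col_idx, row_ptr, x)
    r = [bi - axi for bi, axi in zip(b, ax)]  # r(0) = b - A x(0)
    p = list(r)                               # p(0) = r(0)
    rr = dot(r, r)
    if sqrt(rr) <= tol:                       # initial guess already good enough
        return x, 0
    for k in range(max_iter):
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rr / dot(p, ap)               # assumed textbook alpha_k
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        if sqrt(rr_new) <= tol:               # assumed stopping test
            return x, k + 1
        beta = rr_new / rr                    # assumed textbook beta_k
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


if __name__ == "__main__":
    # 2x2 symmetric positive definite example: A = [[4, 1], [1, 3]], b = [1, 2].
    solution, iterations = conjugate_gradient(
        [4.0, 1.0, 1.0, 3.0], [0, 1, 0, 1], [0, 2, 4], [1.0, 2.0], [0.0, 0.0]
    )
    print(solution, iterations)
```

Each pass through the loop needs one sparse matrix-vector product, two inner products, and three vector updates, which are the operations a parallel version has to distribute.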


Abstract

The invention discloses a method for increasing the calculation speed of an SMP cluster system by using MPI and OpenMP in a hybrid parallel mode. The method comprises the following steps: the number of MPI processes that can be launched and the number of OpenMP threads are determined according to the number of computing nodes and the number of usable CPU cores in each node; each process reads in its existing sub sparse matrix, sub initial vector, block vector, and maximum calculation tolerance; each process enables the multi-thread compiler directives; all processes carry out the iterative loop of the preconditioned conjugate gradient method, with only three OpenMP barriers in the loop; if the calculation error is smaller than the allowable value, the loop ends, and otherwise it continues; the calculation results of all processes are reduced, and the solution of the problem is output. During parallel calculation, the MPI processes are started first, the problem is decomposed across multiple processes, and parallelism among the nodes begins; each MPI process is allocated to one computing node, and the processes exchange information through message passing. Then, within each MPI process, OpenMP directives are used to create a group of threads, which are allocated to different processors of the computing node and executed in parallel.
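The two-level decomposition described in the abstract can be pictured with a small serial simulation in Python. It is only a sketch under stated assumptions: it uses neither MPI nor OpenMP, it assumes one thread per usable core, and the names `plan_hybrid_layout`, `split_range`, and `hybrid_dot` are hypothetical. The outer loop stands in for MPI processes (one per node), the inner loop for the OpenMP threads inside a process, and the final sum for a global reduction such as `MPI_Allreduce` with `MPI_SUM`.

```python
def plan_hybrid_layout(num_nodes, cores_per_node):
    """Return (MPI processes, OpenMP threads per process).

    Assumed mapping: one process per node, one thread per usable core.
    """
    return num_nodes, cores_per_node


def split_range(n, parts):
    """Split range(n) into contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(n, parts)
    bounds = []
    start = 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def hybrid_dot(u, v, num_processes, threads_per_process):
    """Simulate a dot product reduced first over threads, then over processes."""
    process_partials = []
    for p_lo, p_hi in split_range(len(u), num_processes):
        # Inside one "process": each "thread" handles a sub-chunk, as an
        # OpenMP reduction over a parallel loop would.
        thread_partials = []
        for t_lo, t_hi in split_range(p_hi - p_lo, threads_per_process):
            lo, hi = p_lo + t_lo, p_lo + t_hi
            thread_partials.append(sum(a * b for a, b in zip(u[lo:hi], v[lo:hi])))
        process_partials.append(sum(thread_partials))
    # Stand-in for the global reduction across processes.
    return sum(process_partials)


if __name__ == "__main__":
    processes, threads = plan_hybrid_layout(num_nodes=2, cores_per_node=4)
    print(hybrid_dot([1, 2, 3, 4, 5], [1, 1, 1, 1, 1], processes, threads))
```

In a real hybrid code each inner product in the conjugate gradient loop follows this pattern, so every reduction is a synchronization point. That is why the number of barriers per iteration matters for speed.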

Description

Technical field

[0001] The invention relates to parallel computing technology, in particular to a method for improving computing speed through parallel computing.

Background art

[0002] Iterative methods are currently the mainstream approach to solving large sparse linear systems. Among them, the preconditioned conjugate gradient method uses preconditioning to reduce the number of iterations of the conjugate gradient method and accelerate convergence, and it has been widely used in engineering and scientific computing. The conjugate gradient method computes the numerical solution of a particular class of linear systems, namely those whose coefficient matrix is a real symmetric positive definite matrix. With the increasing scale and complexity of scientific and engineering problems, the serial conjugate gradient method has been difficult to meet the requirements of the scale and speed of solving sparse l...
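To make the role of preconditioning concrete, here is a minimal Python sketch of a preconditioned conjugate gradient iteration. The excerpt does not say which preconditioner the invention uses, so the Jacobi (diagonal) preconditioner $M = \mathrm{diag}(A)$ below is purely an assumption chosen for brevity, as are the dense list-of-lists matrix format and the names `jacobi_pcg`, `tol`, and `max_iter`.

```python
from math import sqrt


def matvec(a, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def jacobi_pcg(a, b, tol=1e-10, max_iter=200):
    """Jacobi-preconditioned CG starting from x = 0; returns (x, iterations)."""
    n = len(b)
    inv_diag = [1.0 / a[i][i] for i in range(n)]  # M^{-1} for M = diag(A)
    x = [0.0] * n
    r = list(b)                                   # r = b - A*0 = b
    if sqrt(dot(r, r)) <= tol:
        return x, 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # z = M^{-1} r
    p = list(z)
    rz = dot(r, z)
    for k in range(max_iter):
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if sqrt(dot(r, r)) <= tol:                # assumed stopping test
            return x, k + 1
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    # Badly scaled diagonal system; diagonal scaling turns it into the identity.
    a = [[1.0, 0.0, 0.0], [0.0, 100.0, 0.0], [0.0, 0.0, 10000.0]]
    print(jacobi_pcg(a, [1.0, 100.0, 10000.0]))
```

Compared with the plain method, the only extra work per iteration is applying $M^{-1}$ to the residual. For a diagonal preconditioner that is an elementwise scaling, so it parallelizes in the same way as the other vector updates.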


Application Information

IPC(8): G06F9/38
Inventor: 罗海飙, 廖俊豪
Owner INST OF SOFTWARE APPL TECH GUANGZHOU & CHINESE ACAD OF SCI