Method for optimizing finite difference algorithm in heterogeneous many-core framework

A finite difference optimization method applied in the field of high-performance computing. It addresses the low performance of finite difference numerical algorithms and the resulting limits on simulation range and simulation time, reducing pipeline bubbles and speeding up instruction execution.

Status: Inactive. Publication date: 2016-10-12

THE PLA INFORMATION ENG UNIV +2

Cites: 3. Cited by: 8.

Problems solved by technology

[0006] The technical problem addressed by the present invention is that finite difference numerical algorithms perform poorly when run on a heterogeneous many-core architecture, which limits the achievable simulation range and simulation time.

Examples

Embodiment 1

[0026] Embodiment 1: With reference to Figures 1 and 2, a method for optimizing a finite difference algorithm in a heterogeneous many-core architecture optimizes the algorithm with a three-step progressive approach. The specific steps are:

[0027] Step 1. Basic optimization: extract loop invariants to reduce computational strength, and eliminate branches inside loops to facilitate vectorization. Specifically, computational strength is reduced through basic optimizations such as loop unrolling and invariant extraction, and branch tests are eliminated by changing the initial values and exit conditions of the loop variables.
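The two transformations named in Step 1 can be sketched on a one-dimensional stencil. This is a minimal illustration, not the patent's code: the function names, the explicit update formula, and the parameters `c`, `dt`, and `dx` are all assumptions chosen for the example. The baseline recomputes a loop-invariant coefficient and tests for the boundary on every iteration; the optimized version hoists the coefficient and narrows the loop bounds so that the body contains no branch.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline (hypothetical kernel): the coefficient c*dt/(dx*dx) is
 * recomputed on every iteration, and a boundary test sits in the loop. */
void step_baseline(const double *u, double *out, size_t n,
                   double c, double dt, double dx)
{
    for (size_t i = 0; i < n; i++) {
        if (i == 0 || i == n - 1) {
            out[i] = u[i];  /* boundary points are copied through */
        } else {
            out[i] = u[i] + (c * dt / (dx * dx))
                          * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
        }
    }
}

/* Basic optimization: hoist the invariant and remove the branch by
 * changing the loop's initial value (1) and exit condition (n - 1). */
void step_optimized(const double *u, double *out, size_t n,
                    double c, double dt, double dx)
{
    assert(n >= 2);                       /* two boundary points needed */
    const double r = c * dt / (dx * dx);  /* loop invariant, computed once */

    out[0] = u[0];                        /* boundaries handled outside */
    out[n - 1] = u[n - 1];

    for (size_t i = 1; i < n - 1; i++) {  /* branch-free interior loop */
        out[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
    }
}
```

Both functions produce the same output for `n >= 2`. The second form gives the compiler a straight-line loop body, which is generally easier to vectorize.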

[0028] Step 2. Parallel optimization: using the OpenMP parallel model, pragmas are added before the core loop to achieve thread-level parallelism, and the core loop is rewritten with built-in vector instructions to achieve instruction-level parallelism. Specifically, after the loop is divided into blocks...
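A rough sketch of Step 2 on a two-dimensional five-point stencil follows. It is illustrative only: the function name `step2d_parallel`, the row-major layout, and the coefficient `r` are assumptions. The text describes rewriting the core loop with built-in vector instructions; because such intrinsics are specific to the target hardware, this sketch substitutes the portable `#pragma omp simd` directive and leaves the actual instruction selection to the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2-D five-point update on a row-major nx-by-ny grid.
 * Rows are distributed across threads; the inner loop is marked for
 * SIMD execution. Boundary cells of out are left untouched. */
void step2d_parallel(const double *restrict u, double *restrict out,
                     size_t nx, size_t ny, double r)
{
    assert(nx >= 3 && ny >= 3);  /* at least one interior point */

    /* Thread-level parallelism: each thread takes a set of rows. */
    #pragma omp parallel for schedule(static)
    for (size_t j = 1; j < ny - 1; j++) {
        const double *row  = u + j * nx;
        const double *up   = row - nx;
        const double *down = row + nx;
        double *dst = out + j * nx;

        /* Vector-level parallelism: iterations are independent because
         * u and out do not alias (restrict). */
        #pragma omp simd
        for (size_t i = 1; i < nx - 1; i++) {
            dst[i] = row[i] + r * (row[i - 1] + row[i + 1]
                                 + up[i] + down[i] - 4.0 * row[i]);
        }
    }
}
```

With GCC or Clang the pragmas take effect when compiling with `-fopenmp`; without that flag they are ignored and the function runs serially with the same result.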

Embodiment 2

[0035] Embodiment 2: With reference to Figures 1 and 2, in the finite difference numerical algorithm optimization method of the present invention for heterogeneous many-core systems, running on a hybrid heterogeneous high-performance computer system that combines many-core accelerators (MIC) with multi-core general-purpose processors (CPU), branch tests are eliminated by changing the initial values and exit conditions of the loop variables. The reason is that when the processor handles a conditional branch, its branch prediction unit uses statistical methods to predict the outcome before the actual result is available. On a misprediction, the instruction pipeline must return to the branch position, which produces pipeline bubbles and wastes clock cycles. In addition, a branch inside the loop body prevents the compiler from performing subsequent optimizations such as loop unrol...
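Once the loop body is free of branches, unrolling becomes a mechanical transformation. The sketch below shows a manually unrolled second-difference loop. It is an illustration under assumptions: the function name `second_diff_unrolled` and the unroll factor of 4 are hypothetical, and in practice a compiler may perform this transformation on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: second difference over interior points 1..n-2,
 * unrolled by 4, with a remainder loop for the leftover points.
 * d2[0] and d2[n-1] are not written. */
void second_diff_unrolled(const double *u, double *d2, size_t n)
{
    assert(n >= 2);
    size_t i = 1;

    /* Main body: four interior points per iteration. The condition
     * i + 4 <= n - 1 keeps the highest read, u[i + 4], within bounds. */
    for (; i + 4 <= n - 1; i += 4) {
        d2[i]     = u[i - 1] - 2.0 * u[i]     + u[i + 1];
        d2[i + 1] = u[i]     - 2.0 * u[i + 1] + u[i + 2];
        d2[i + 2] = u[i + 1] - 2.0 * u[i + 2] + u[i + 3];
        d2[i + 3] = u[i + 2] - 2.0 * u[i + 3] + u[i + 4];
    }

    /* Remainder: up to three interior points left over. */
    for (; i < n - 1; i++) {
        d2[i] = u[i - 1] - 2.0 * u[i] + u[i + 1];
    }
}
```

Unrolling reduces the number of loop-control instructions per computed point and exposes independent operations that the hardware can overlap.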

Abstract

The invention belongs to the technical field of high-performance computing and relates to a method for optimizing a finite difference algorithm in a heterogeneous many-core architecture. The method optimizes the finite difference algorithm on a hybrid heterogeneous high-performance computer system based on a many-core accelerator (MIC) and a multi-core general-purpose processor (CPU), using a three-step progressive optimization approach that comprises basic optimization, parallel optimization, and heterogeneous collaborative optimization. The beneficial effects are as follows. The three-step approach addresses the low computational performance and poor parallel efficiency caused by strided memory access and the lack of parallel execution when the finite difference algorithm is ported from a multi-core system to a heterogeneous many-core system. The method is efficient and scalable. Basic optimizations such as branch elimination, loop unrolling, and invariant extraction reduce computational strength and remove obstacles to vectorization. In the parallel optimization, data dependencies are analyzed, loops are blocked, and the core algorithm is rewritten with a vector instruction set, so that the multi-threading and long-vector mechanisms of the many-core processor are fully used.

Description

Technical field

[0001] The present invention belongs to the technical field of high-performance computing, specifically to the cooperative optimization of CPU and MIC in heterogeneous high-performance computer systems, and relates in particular to a method for optimizing a finite difference algorithm in a heterogeneous many-core architecture.

Background

[0002] MIC (Many Integrated Core), the "many-core architecture", has far more cores than a CPU and supports parallel computing in cooperation with CPUs.

[0003] In recent years, with the development of massively parallel architectures, heterogeneous many-core architectures have been widely used in supercomputing. The Top500 list of supercomputers, released every six months, shows that more and more MICs, which focus on parallel processing performance, are being integrated into high-performance clusters. Among them, the list released in Nov...

Application Information

IPC(8): G06F9/30, G06F9/38

CPC: G06F9/30098, G06F9/3851, G06F9/3867

Inventors: 许瑾晨, 张乾坤, 郝鑫, 单征, 戴涛, 周蓓, 郭绍忠

Owner: THE PLA INFORMATION ENG UNIV
