Automatic fine-grained two-stage parallel translation method

A fine-grained, automatic technique in the computer field that addresses low computing efficiency and achieves improved computing efficiency, simple logic branching, and high computing density.

Pending Publication Date: 2022-04-26
WUHAN UNIV

AI Technical Summary

Problems solved by technology

[0005] The present invention proposes an automatic fine-grained two-level parallel translation method to solve, or at least partially solve, the technical problem of low computational efficiency in the prior art.



Examples


Embodiment Construction

[0033] The invention relates to an automatic fine-grained two-level parallel translation method for large-data task processing. First, the source C code is parsed with ANTLR: an EBNF grammar description is generated automatically, and from that description ANTLR generates the corresponding lexical and syntactic analyzers used to build the abstract syntax tree. The loop information extracted by the parser is then analyzed. If flow dependencies are found, the loop statements containing them are flagged as non-parallelizable. If anti-dependencies or output dependencies between data are found, the loop structure is transformed to eliminate them. A loop statement with no data dependencies is parallelizable. After the anti-dependencies and output dependencies caused by variable reuse have been eliminated, array privatization is used to localize the storage units associated with each loop iteration, so that it can be separated from the interaction of the ...
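The three dependence classes named above can be illustrated with a minimal C sketch. It is not taken from the patent: the function names (`prefix_sum`, `scale_shared`, `scale_private`) and the two-element scratch buffer are hypothetical, chosen only to show a flow dependence that blocks parallelization and an anti/output dependence that privatization removes.

```c
#include <assert.h>
#include <stddef.h>

/* Flow (true) dependence: iteration i reads a[i - 1], which iteration
   i - 1 has just written. The iterations must run in order, so an
   analyzer would flag this loop as non-parallelizable. */
void prefix_sum(double *a, size_t n) {
    assert(a != NULL);
    for (size_t i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}

/* Anti- and output dependence caused by storage reuse: every iteration
   writes and then reads the same shared scratch buffer. No value flows
   from one iteration to the next; only the storage is shared. */
static double shared_scratch[2]; /* hypothetical shared work array */

void scale_shared(const double *in, double *out, size_t n, double k) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        shared_scratch[0] = in[i] * k;
        shared_scratch[1] = shared_scratch[0] + 1.0;
        out[i] = shared_scratch[1];
    }
}

/* After array privatization: each iteration owns its scratch storage,
   so iterations touch disjoint memory and may run in any order. */
void scale_private(const double *in, double *out, size_t n, double k) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        double scratch[2]; /* private to this iteration */
        scratch[0] = in[i] * k;
        scratch[1] = scratch[0] + 1.0;
        out[i] = scratch[1];
    }
}
```

`scale_shared` and `scale_private` compute the same result when run serially; only the privatized form remains correct if the iterations are distributed across threads.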


Abstract

The invention provides an automatic fine-grained two-stage parallel translation method comprising the following steps. First, the source C code is analyzed with ANTLR, an EBNF grammar description is generated automatically, and the corresponding lexical and grammar analyzers are generated. Second, the loop information extracted by the analyzer is examined: if flow dependencies are found, the loop statements containing them cannot be parallelized; if anti-dependencies or output dependencies between data are found, they are eliminated; and if no data dependencies exist, the loop statement is parallelizable. The parallel loop structure is then mapped to a structure suitable for CUDA and CPU multi-thread execution, and the corresponding CUDA code and CPU multi-thread code are generated, which saves computing resources and improves computing efficiency.
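As a rough illustration of the CPU multi-thread side of that mapping, the sketch below splits a dependence-free loop into contiguous chunks and runs each chunk on a POSIX thread. It is an assumption-laden example, not the patent's generated code: `NUM_WORKERS`, `chunk_t`, `saxpy_chunk`, and `saxpy_parallel` are hypothetical names, and the loop body (`y[i] = a * x[i] + y[i]`) was picked only because it has no loop-carried dependence.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4 /* hypothetical worker-thread count */

/* Arguments for one worker: a half-open iteration range [begin, end). */
typedef struct {
    const float *x;
    float *y;
    float a;
    size_t begin, end;
} chunk_t;

/* Body of the original serial loop, restricted to one chunk. */
static void *saxpy_chunk(void *arg) {
    chunk_t *c = (chunk_t *)arg;
    for (size_t i = c->begin; i < c->end; i++)
        c->y[i] = c->a * c->x[i] + c->y[i];
    return NULL;
}

/* Distributes iterations 0..n-1 over NUM_WORKERS threads.
   Returns 0 on success, or the pthread_create error code; on failure
   the chunks that were never started are left unprocessed. */
int saxpy_parallel(float a, const float *x, float *y, size_t n) {
    pthread_t tid[NUM_WORKERS];
    chunk_t chunk[NUM_WORKERS];
    size_t per = (n + NUM_WORKERS - 1) / NUM_WORKERS; /* ceiling division */
    int started = 0, rc = 0;

    assert(x != NULL && y != NULL);
    for (int t = 0; t < NUM_WORKERS; t++) {
        size_t b = (size_t)t * per;
        size_t e = b + per;
        if (b > n) b = n; /* clamp so trailing workers get empty ranges */
        if (e > n) e = n;
        chunk[t] = (chunk_t){x, y, a, b, e};
        rc = pthread_create(&tid[t], NULL, saxpy_chunk, &chunk[t]);
        if (rc != 0) break;
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);
    return rc;
}
```

A call such as `saxpy_parallel(2.0f, x, y, n)` updates `y` in place. In a CUDA version of the same mapping, each iteration would typically be assigned to one GPU thread through an index computed from the block and thread IDs, rather than to a contiguous chunk per worker.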

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to an automatic fine-grained two-stage parallel translation method.

Background technique

[0002] Since NVIDIA released the GeForce 256 graphics processing chip and proposed the GPU concept in 1999, the GPU has become one of the main options for accelerator components in current high-performance computing systems, owing to its powerful computing capabilities, flexible programming capabilities, and low power consumption, and it is widely used in computationally intensive programs. For the same task, running a parallel program on the GPU can greatly reduce the running time compared with executing a serial program on the CPU; the advantage of GPU parallel computing is especially clear when processing large data.

[0003] Current parallel programming methods include MPI, OpenCL, OpenMP, etc. However, it is still a big challenge to manually or semi-automatically convert a large number of serial program...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F8/41, G06F9/50
CPC: G06F8/427, G06F8/433, G06F9/5027
Inventor: 刘金硕, 黄朔, 邓娟, 刘宁, 王晨阳, 唐浩洲
Owner: WUHAN UNIV