Matrix operation-based parallel computing method

A parallel computing method based on matrix operations, applied in the field of computer parallel computing. It addresses problems such as the large amount of scheduling code that must be written and the large performance differences between mapping methods, and it offers a simple, clear way to express algorithms that reduces the difficulty of parallel programming.

Status: Inactive
Publication date: 2011-02-23
Owner: TSINGHUA UNIV
Cites: 0; cited by: 13

AI Technical Summary

Problems solved by technology

Different mapping methods often lead to very large performance differences. Achieving the best mapping therefore requires programmers to understand the underlying hardware structure of the accelerator and to optimize for it specifically.
(3) A large amount of scheduling code must be written.
Because the development tools for the various accelerators are designed to be as general-purpose as possible, their application programming interfaces are relatively low-level.

Examples

Embodiment 1

[0072] Embodiment 1: real-number matrix multiplication.

[0073] Take square matrix multiplication as an example: R = AB, where R, A, and B are all 1024×1024 square matrices. Implementing this calculation with the method described in the present invention takes only about 10 lines of code. Implementing it in the parallel programming language OpenCL, by contrast, requires nearly 1,000 lines of scheduling code and 100 lines of OpenCL kernel code. The test comparison data are shown in Table 1 and Table 2.
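To give a feel for how compact a high-level description of R = AB can be, here is a minimal sketch in plain Python. It is not the patent's code: the function name `matmul` is hypothetical, and small 2×2 matrices stand in for the 1024×1024 case so the result can be checked by hand.

```python
def matmul(A, B):
    """Return R = A B for real matrices stored as lists of rows."""
    k = len(B)
    if any(len(row) != k for row in A):
        raise ValueError("inner dimensions of A and B must match")
    # Each R[i][j] depends only on row i of A and column j of B, so all
    # entries are independent of one another and could be computed in parallel.
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
R = matmul(A, B)
print(R)  # [[19.0, 22.0], [43.0, 50.0]]
```

The comment inside `matmul` points at the property the method relies on: the formula itself exposes which pieces of work are independent, without any explicit scheduling code.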

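[0074] Table 1. Performance comparison

[0075] (table not reproduced in the source)

[0076] Table 2. Comparison of code volume

[0077] (table not reproduced in the source; only the column header "platform" survives)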

Embodiment 2

[0078] Embodiment 2: a matrix operation expression for finding the maximum of multiple elements.

[0079] The following uses finding the maximum of multiple elements as an example of how to express a specific algorithm with matrix operations. The n given elements a_1, a_2, a_3, ..., a_n are real numbers. The implementation steps are as follows:

[0080] (1) Based on the matrix operation formula defined above, express the algorithm as a formula composed of at least two matrix operations.

[0081] The goal of this embodiment is to find the maximum of multiple elements. The algorithm is expressed as C = AB, that is, as a formula that multiplies matrix A by matrix B.

[0082] (2) Define the matrices involved in the operation and how they are represented.

[0083] In this embodiment, vector A is defined as a row vector composed of the a_i, and the number of elements is ...
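The source text breaks off before it defines B and the operators, so the following Python sketch fills that gap with assumptions of its own rather than the patent's definitions: B is taken to be an all-ones column vector, the multiplication operator is ordinary real multiplication, and the addition operator is replaced by `max`. The helper name `generalized_matmul` is hypothetical. Under those assumptions, C = AB is a 1×1 matrix holding the maximum of the a_i.

```python
from functools import reduce
import operator


def generalized_matmul(A, B, add, mul):
    """Matrix product in which `add` and `mul` replace the usual + and *."""
    cols = list(zip(*B))
    return [
        [reduce(add, (mul(a, b) for a, b in zip(row, col))) for col in cols]
        for row in A
    ]


a = [3.5, -1.0, 7.25, 2.0]
A = [a]                      # 1 x n row vector of the elements a_i
B = [[1.0] for _ in a]       # n x 1 column of ones (assumption, not from the source)

# Assumed operators: "multiplication" is real multiplication, "addition" is max.
C = generalized_matmul(A, B, add=max, mul=operator.mul)
print(C[0][0])  # 7.25
```

Because `max` is associative, the reduction could be split across parallel workers in the same way as an ordinary sum, which is the reason for expressing the problem in this form.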

Abstract

The invention discloses a parallel computing method based on matrix operations, designed mainly to simplify parallel acceleration programs and to reduce the difficulty of implementing parallel algorithms. The method comprises the following steps: describing the algorithm as a matrix operation formula; abstracting the computing task as an abstract multiplication operator and/or an abstract addition operator between matrices with given element types; mapping the matrix operation onto an accelerator programming model; and executing the operation on an accelerator and outputting the result. The method describes the parallelism of an algorithm through the rules of matrix operations, which makes the parallel algorithm easier to describe. Because the parallel algorithm is implemented on the basis of generalized matrix operations, the particularities of a specific algorithm are eliminated as far as possible, so that more general parallel algorithms can be mapped onto the accelerator. Scheduling software can then be designed around the common matrix operations, which makes it applicable to a wide range of algorithms, reduces repetitive work in developing accelerator scheduling code, and simplifies the development process.
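The four steps in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: a Python thread pool stands in for the accelerator, each output row is treated as one schedulable task, and the names `run_matrix_op`, `row_task`, and `workers` are hypothetical. The min-plus operator pair is chosen here only to show that the same scheduling code serves a different algorithm; it does not come from the source.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def run_matrix_op(A, B, add, mul, workers=4):
    """Compute C = A B with abstract operators.

    Entry (i, j) is the `add`-reduction of mul(A[i][k], B[k][j]) over k.
    """
    cols = list(zip(*B))

    def row_task(row):
        # One independent unit of work per output row.
        return [reduce(add, map(mul, row, col)) for col in cols]

    # Stand-in for mapping onto an accelerator: rows are dispatched to a pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(row_task, A))


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# Ordinary arithmetic: add is +, mul is *.
ordinary = run_matrix_op(A, B, add=operator.add, mul=operator.mul)
# Same scheduling code, different operators: add is min, mul is +.
min_plus = run_matrix_op(A, B, add=min, mul=operator.add)

print(ordinary)  # [[19, 22], [43, 50]]
print(min_plus)  # [[6, 7], [8, 9]]
```

Only the two operator arguments change between the calls; the dispatch logic in `run_matrix_op` is written once, which mirrors the abstract's claim that scheduling software designed around common matrix operations can be reused across algorithms.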

Description

Technical field

[0001] The invention relates to the field of computer parallel computing.

Background art

[0002] Parallel computing has become one of the most important ways to improve computing performance. In recent years, massively parallel computing units have been widely adopted for general-purpose computing. Under the control of the computer's central processing unit, these units assist in completing computing tasks and can greatly improve the efficiency of the overall task. The graphics processing unit (GPU) is one such massively parallel computing unit (hereinafter referred to as an accelerator); it can perform not only graphics-rendering calculations but also general-purpose calculations unrelated to graphics. Compared with general-purpose processors, graphics processing units have extremely high floating-point performance. ...

Application Information

IPC(8): G06F17/16
Inventors: 汪玉 (Wang Yu), 吴天际 (Wu Tianji), 杨华中 (Yang Huazhong)
Owner: TSINGHUA UNIV