Graphics processing unit based matrix transpose optimization method

A graphics processor and matrix transposition technology, applied in the fields of processor architecture/configuration, machine execution devices, and complex mathematical operations. Its effects are improved operation speed, improved operation accuracy, and improved bandwidth utilization.

Active Publication Date: 2014-04-30
北京新松佳和电子系统股份有限公司

AI Technical Summary

Problems solved by technology

There is currently no memory-access-optimized opera



Embodiment Construction

[0029] To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.

[0030] Fig. 1 shows the flow chart of the graphics-processor-based parallel matrix transposition optimization method of the present invention. The hardware platform used in the embodiment is an ASUS motherboard and a graphics card; the software platform is a Microsoft operating system and a Microsoft development kit, although the invention is not limited to these.

[0031] The graphics-processor-based matrix transposition optimization method of the present invention comprises the following steps:

[0032] Step S1: The matrix to be transposed has R rows and S columns, and each matrix element is a complex number, that is, it comprises a real part and an imaginary part. Since there is no direct operation function for a two-dimensional ar...
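As a rough illustration of the conversion that Step S1 begins to describe, the following Python sketch flattens an R-row, S-column complex matrix into a one-dimensional array. The row-major order and the interleaved (real, imaginary) layout are assumptions made for this example, and the function names `flatten_complex` and `element_at` are hypothetical; the source text is truncated before it specifies the layout.

```python
def flatten_complex(matrix):
    """Flatten an R x S matrix of complex numbers into a 1-D list of floats.

    Assumed layout (not confirmed by the source): row-major order, with the
    real and imaginary parts of each element stored in adjacent slots.
    """
    flat = []
    for row in matrix:
        for z in row:
            flat.append(z.real)
            flat.append(z.imag)
    return flat


def element_at(flat, S, r, c):
    """Read element (r, c) back from the flattened array of an S-column matrix."""
    base = 2 * (r * S + c)  # two floats per complex element
    return complex(flat[base], flat[base + 1])


if __name__ == "__main__":
    m = [[1 + 2j, 3 + 4j, 5 + 6j],
         [7 + 8j, 9 + 10j, 11 + 12j]]  # R = 2, S = 3
    flat = flatten_complex(m)
    print(flat)
    print(element_at(flat, 3, 1, 2))
```

With this layout, element (r, c) always starts at offset 2 * (r * S + c), so a single linear buffer can be copied to the device and indexed without any two-dimensional array support.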


Abstract

The invention relates to a graphics processing unit based matrix transpose optimization method. The method includes: converting an R-row, S-column input matrix into a one-dimensional array, allocating storage space and copying the data; setting a two-dimensional index space; computing the global identifiers, work-group identifiers and local identifiers of the work items; partitioning the matrix into blocks and mapping the blocks to work-groups; allocating local memory in each work-group, copying the data into local memory and synchronizing until the copy completes; computing the column and row indexes of the transposed data in global memory; computing the locations of the output data in global memory and local memory; assigning the data in local memory to the one-dimensional array in global memory so as to achieve conflict-free coalesced memory access; and copying the one-dimensional array back to main memory to form an S-row, R-column transposed matrix. The method combines coalesced access with parallel computation of the matrix transpose and improves the execution efficiency of programs.
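The steps listed in the abstract can be sketched as a sequential Python simulation of a tiled transpose. This is a minimal sketch under stated assumptions, not the patented kernel: the tile size `TILE`, the function name `transpose_tiled`, and the bounds handling for partial tiles are all hypothetical, and each complex element occupies one list slot for brevity. The nested loops stand in for work-groups (`gy`, `gx`) and work-items (`ly`, `lx`); on a real device the two phases would be separated by a barrier.

```python
TILE = 4  # hypothetical work-group edge length, chosen for illustration


def transpose_tiled(src, R, S, tile=TILE):
    """Transpose an R x S row-major 1-D array into an S x R row-major array.

    Simulates the work-group scheme: each group stages one tile in a
    'local memory' buffer, then writes it out with swapped block indexes.
    """
    dst = [None] * (R * S)
    groups_y = (R + tile - 1) // tile  # work-groups along the rows
    groups_x = (S + tile - 1) // tile  # work-groups along the columns
    for gy in range(groups_y):
        for gx in range(groups_x):
            local = [[None] * tile for _ in range(tile)]
            # Phase 1: each work-item copies one element into local memory.
            # Consecutive lx values read consecutive global addresses.
            for ly in range(tile):
                for lx in range(tile):
                    row, col = gy * tile + ly, gx * tile + lx
                    if row < R and col < S:
                        local[ly][lx] = src[row * S + col]
            # (A device kernel would synchronize the work-group here.)
            # Phase 2: write with block indexes swapped, so consecutive lx
            # values also write consecutive global addresses.
            for ly in range(tile):
                for lx in range(tile):
                    out_row, out_col = gx * tile + ly, gy * tile + lx
                    if out_row < S and out_col < R:
                        dst[out_row * R + out_col] = local[lx][ly]
    return dst


if __name__ == "__main__":
    R, S = 3, 5
    src = [complex(i, -i) for i in range(R * S)]
    print(transpose_tiled(src, R, S))
```

The point of staging through the tile is that both the global read and the global write walk contiguous addresses along `lx`; the strided access is confined to the local buffer. How the patent avoids local-memory conflicts is not visible in this excerpt, so the sketch does not model it.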

Description

Technical field

[0001] The invention belongs to the technical field of general-purpose computing on graphics processors, and mainly relates to a matrix transposition optimization method based on graphics processors.

Background technique

[0002] General Purpose Computing on Graphics Processing Units (GPGPU) is a technology that uses the graphics processor of a graphics card to handle general computing tasks. The graphics processing unit shares the computing tasks of the central processing unit, increasing the processing speed of the computer by hundreds or thousands of times, or even more. This gave rise to the Open Computing Language (OpenCL). Managed by a computing working group made up of representatives from various processor and software manufacturers, the Open Computing Language provides a set of standard application programming interfaces that make programming graphics processors easier for programme...


Application Information

IPC(8): G06F17/16, G06F9/38, G06T1/20
Inventor: 田卓, 樊双丽
Owner: 北京新松佳和电子系统股份有限公司