Block matrix multiplication vectorization method supporting vector processor with multiple MAC (multiply accumulate) operational units

Vector processor and computing component technology applied in the field of data processing. It achieves high-performance computing capability, is easy to operate, and improves the ratio of computation to memory access.

Active Publication Date: 2013-09-11
NAT UNIV OF DEFENSE TECH
3 Cites, 27 Cited by

AI Technical Summary

Problems solved by technology

The vector data access unit supports Load/Store of vector data and provides a large-capacity dedicated vector memory instead of the Cach




Embodiment Construction

[0020] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0021] As shown in Figure 2, the block matrix multiplication vectorization method of the present invention, which supports a vector processor with multiple MAC operation units, proceeds as follows:

[0022] (1) First, according to the number p of vector processing elements (VPEs) in the vector processor, the number m of MAC operation units in a VPE, the capacity s of the vector memory, and the data size d of the matrix elements, determine the optimal sub-matrix block size blocksize, the numbers of rows and columns of the sub-matrix of the multiplier matrix B, and the numbers of rows and columns of the sub-matrix of the multiplicand matrix A.
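The excerpt does not give the formula that step (1) uses, so the following is only a rough sketch of how p, m, s and d might interact. It assumes, hypothetically, that a sub-matrix of B is p*m columns wide (one column per MAC unit) and must fit in one half of the vector memory. The function name `block_shape` and this sizing rule are illustrative assumptions, not the patent's method.

```python
def block_shape(p, m, s, d):
    """Hypothetical sizing rule (NOT the patent's formula).

    p: number of VPEs, m: MAC units per VPE,
    s: vector memory capacity in bytes, d: bytes per matrix element.
    Returns (rows, cols) of a B sub-matrix that fits in half of the memory.
    """
    lanes = p * m                       # assumed sub-matrix width: one column per MAC unit
    elems_per_buffer = (s // 2) // d    # elements that fit in one of the two equal halves
    rows = elems_per_buffer // lanes    # full rows of `lanes` elements that fit
    if rows == 0:
        raise ValueError("vector memory too small for one row of the sub-matrix")
    return rows, lanes


# Example: 16 VPEs with 4 MAC units each, 1 MiB vector memory, 4-byte elements.
print(block_shape(16, 4, 1024 * 1024, 4))  # (2048, 64)
```

Under these assumptions, a wider processor (larger p*m) gives shorter sub-matrices for the same memory capacity.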

[0023] (2) Divide the capacity s of the vector memory into two storage areas with equal capacity, Buffer0 and Buffer1, and realize the multiplication of the sub-matrix between ...
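The ping-pong idea in step (2) can be illustrated in plain Python as a minimal sketch. It streams row blocks of B through two buffers that alternate between "being computed on" and "being loaded". The names `matmul_pingpong`, `load` and `block_rows` are hypothetical. On real hardware the load of the next block would overlap with computation on the current one; here the two run sequentially, so only the buffer alternation is modeled.

```python
def matmul_pingpong(A, B, block_rows):
    """Compute C = A * B for non-empty list-of-lists matrices,
    consuming B in row blocks that alternate between two buffers."""
    n, k, q = len(A), len(B), len(B[0])
    C = [[0] * q for _ in range(n)]
    buffers = [None, None]               # stand-ins for Buffer0 and Buffer1
    starts = list(range(0, k, block_rows))

    def load(start):
        # Models copying one sub-matrix of B into vector memory.
        return [row[:] for row in B[start:start + block_rows]]

    buffers[0] = load(starts[0])         # prime Buffer0 with the first block
    for i, start in enumerate(starts):
        cur = i % 2                      # buffer holding the block to compute on
        if i + 1 < len(starts):
            # Fill the other buffer with the next block (would overlap
            # with the computation below on real hardware).
            buffers[1 - cur] = load(starts[i + 1])
        for r in range(n):
            for t, b_row in enumerate(buffers[cur]):
                a = A[r][start + t]
                for c in range(q):
                    C[r][c] += a * b_row[c]   # multiply-accumulate
    return C


A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 2], [3, 4], [5, 6]]
print(matmul_pingpong(A, B, block_rows=2))  # [[22, 28], [49, 64]]
```

Each block contributes a partial sum to C, so the result does not depend on `block_rows`; only the number of buffer swaps changes.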


Abstract

A block matrix multiplication vectorization method supporting a vector processor with multiple MAC (multiply-accumulate) operation units includes the following steps: (1) determining the optimal sub-matrix block size, the numbers of rows and columns of the sub-matrix of the multiplier matrix B, and the numbers of rows and columns of the sub-matrix of the multiplicand matrix A, according to the number p of vector processing elements (VPEs) in the vector processor, the number m of MAC operation units in a VPE, the capacity s of the vector memory, and the data size d of the matrix elements; (2) dividing the capacity s of the vector memory equally into two storage areas, Buffer 0 and Buffer 1, and multiplying the sub-matrices in Buffer 0 and Buffer 1 in ping-pong mode until the multiplication of the whole matrix is complete. The method is easy to implement and convenient to operate, and it improves the parallelism of the vector processor and increases its operating efficiency.

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to a block matrix multiplication vectorization method supporting a vector processor with multiple MAC operation units.

Background

[0002] With the growing demand for high-performance computing in compute-intensive applications such as solving large-scale dense linear equations, 4G wireless communications, radar signal processing, high-definition video, and digital image processing, computer architectures have changed significantly. Many new architectures have emerged, such as many-core, heterogeneous multi-core, stream processor, and vector processor architectures. These new architectures integrate multiple processor cores on a single chip, and each core contains a wealth of computing components, which greatly improves the computing performance of the chip; at the same time, it also poses new challenges to softw...


Application Information

IPC(8): G06F17/16
Inventor 刘仲, 陈书明, 窦强, 郭阳, 刘衡竹, 田希, 龚国辉, 陈海燕, 彭元喜, 万江华, 刘胜, 陈跃跃, 扈啸, 吴家铸
Owner NAT UNIV OF DEFENSE TECH