Vectorized implementation method for large-matrix multiplication on vector processors

A vector-processor and large-matrix technology, applied in the fields of vector processors and data processing. It addresses the difficulty of obtaining good performance on large matrices, and achieves easy implementation, simple steps, and improved efficiency.

Active Publication Date: 2012-04-11
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

This method can achieve better results when the size of the matrix is small, but as the size of the matrix gradually increases, it is difficult to ...

Examples

Embodiment 1

[0027] As shown in Figure 3, the vectorized implementation method for vector-processor-oriented large-matrix multiplication of the present invention comprises the following steps:

[0028] 1. Input the multiplicand matrix A and the multiplier matrix B. The DMA controller transfers the multiplicand matrix A and the multiplier matrix B to the vector storage unit. During the transfer, as shown in Figure 4, the multiplier matrix B is reordered: its 1st to nth rows are rearranged, in order, into the 1st to nth columns.
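The reordering in step 1 can be read as rearranging B during the transfer so that the elements of each column end up contiguous. The following is a minimal sketch of that reading in plain Python. The function name `reorder_multiplier` and the list-of-lists layout are assumptions made for illustration; the patent performs this step in the DMA controller, which is not modeled here.

```python
def reorder_multiplier(B):
    """Return B rearranged so that each column of B is one contiguous list.

    Hypothetical helper: it only models the data layout after the transfer,
    not the DMA controller that produces it.
    """
    n_rows = len(B)
    n_cols = len(B[0])
    # Frame c collects B[0][c], B[1][c], ..., i.e. the c-th column of B.
    return [[B[r][c] for r in range(n_rows)] for c in range(n_cols)]


B = [[1, 2],
     [3, 4],
     [5, 6]]
B_frames = reorder_multiplier(B)
print(B_frames)  # [[1, 3, 5], [2, 4, 6]]
```

With this layout, a row of A and a column of B can both be read as consecutive elements, which is what the later row-by-column loading step relies on.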

[0029] By configuring the DMA controller, each row of the multiplicand matrix A can be organized into a data frame, each column of the multiplier matrix B can be organized into a data frame, and the entire multiplier matrix B can be divided into p data frames. When the number of elements in a data frame is not a multiple of the number K of parallel processing u...

Embodiment 2

[0034] As shown in Figure 7, the vectorized implementation method for vector-processor-oriented large-matrix multiplication of the present invention is applied to multiplying a 16×16 matrix by a 16×16 matrix (the number of vector processing units K is 8), and includes the following steps:

[0035] 1. As shown in Figure 6, input the multiplicand matrix A (16×16) and the multiplier matrix B (16×16). The multiplicand matrix A and the multiplier matrix B are transferred to the vector storage unit through DMA, and the multiplier matrix B is reordered during this process (the reordering method is the same as in Embodiment 1). The storage layout of the multiplicand matrix A and the multiplier matrix B in the vector unit is shown in Figure 5(1) and Figure 5(2).

[0036] 2. Load the elements of one row of the multiplicand matrix A and the elements of one column of the multiplier matrix B into the vector processing unit. S...
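For the 16-element rows and columns of this embodiment with K = 8, the row-by-column step can be sketched as a lane-wise multiply followed by a reduction. This is an illustrative Python model, not the patent's instruction sequence: the name `lane_dot` and the choice to keep one running accumulator per lane before reducing are assumptions, since the source text is truncated at this point.

```python
K = 8  # number of parallel processing units, as stated in Embodiment 2


def lane_dot(row, col, k=K):
    """Multiply row and col element by element on k lanes, then reduce.

    Assumes len(row) == len(col) and that the length is a multiple of k.
    """
    assert len(row) == len(col) and len(row) % k == 0
    acc = [0] * k  # one partial sum per lane (an assumed accumulation scheme)
    for start in range(0, len(row), k):
        for lane in range(k):  # conceptually, these k products run in parallel
            acc[lane] += row[start + lane] * col[start + lane]
    # Reduction: combine the k partial sums into one result matrix element.
    return sum(acc)


row = list(range(16))   # one 16-element row of A
col = [1] * 16          # one 16-element column of B
print(lane_dot(row, col))  # 120
```

With 16 elements and K = 8, each row-column pair takes two passes over the lanes before the reduction.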

Embodiment 3

[0041] As shown in Figure 10, the vectorized implementation method for vector-processor-oriented large-matrix multiplication of the present invention is applied to multiplying a 26×22 matrix by a 22×27 matrix (the number of vector processing units K is 8), and includes the following steps:

[0042] 1. As shown in Figure 9, the multiplicand matrix A and the multiplier matrix B are transferred to the vector storage unit by DMA. During this process the multiplier matrix B is reordered (the reordering method is the same as in Embodiment 1), and the multiplicand matrix A and the multiplier matrix B are padded with zeros. The storage layout of the multiplicand matrix A and the multiplier matrix B in the vector unit is shown in Figure 8(1) and Figure 8(2).

[0043] 2. Load the elements of one row of the multiplicand matrix A and the elements of one column of the multiplier matrix B into the vector processing unit. Here, the rows...
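The zero padding in this embodiment can be illustrated on a single data frame. With K = 8, a 22-element row of A or column of B is not a multiple of K, so one plausible reading is that each frame is extended with zeros to 24 elements. The helper `pad_frame` below is hypothetical, and padding to the next multiple of K is an assumption; the source only states that the matrices are padded with zeros.

```python
def pad_frame(frame, k):
    """Append zeros so that the frame length becomes a multiple of k."""
    remainder = len(frame) % k
    if remainder == 0:
        return list(frame)
    return list(frame) + [0] * (k - remainder)


K = 8
row = [2] * 22   # one row of the 26x22 multiplicand matrix A
col = [3] * 22   # one column of the 22x27 multiplier matrix B

padded_row = pad_frame(row, K)
padded_col = pad_frame(col, K)
print(len(padded_row))  # 24

# The padded positions contribute 0 * 0, so the dot product is unchanged.
print(sum(a * b for a, b in zip(padded_row, padded_col)))  # 132
```

Because the padding elements multiply to zero, the result matrix element is the same as for the unpadded frames.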

Abstract

The invention discloses a vectorized implementation method for large-matrix multiplication on vector processors, comprising the following steps: (1) inputting a multiplicand matrix A and a multiplier matrix B; transferring the multiplicand matrix A and the multiplier matrix B to a vector storage unit through a DMA (direct memory access) controller; during the transfer, rearranging the 1st to nth rows of the multiplier matrix B into the 1st to nth columns; (2) loading the elements of one row of the multiplicand matrix A and of one column of the multiplier matrix B into K parallel processing units and multiplying the elements in one-to-one correspondence; reducing and summing the products in a designated parallel processing unit; storing the sum in the vector storage unit as one element of the result matrix; and (3) moving to the next row of the multiplicand matrix A and the next column of the multiplier matrix B, and repeating step (2) until all data frames have been processed, yielding a result matrix C composed of the result matrix elements. The vectorized implementation method has the advantages of a simple principle, convenient operation, and improved computational efficiency.
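The three steps of the abstract can be put together in a short software model. This is a sketch under stated assumptions, not the patented implementation: the name `vectorized_matmul`, the zero padding of the inner dimension to a multiple of K, the per-lane accumulators, and the loop order over row and column frames are all illustrative choices, and the DMA controller and vector storage unit are modeled as plain Python lists.

```python
def vectorized_matmul(A, B, k):
    """Model of the three-step method: reorder, lane-wise multiply, reduce."""
    inner = len(B)                   # shared dimension of A and B
    padded = -(-inner // k) * k      # next multiple of k (assumed padding rule)
    pad = [0] * (padded - inner)

    # Step 1: organize rows of A and columns of B into zero-padded data frames.
    a_frames = [list(row) + pad for row in A]
    b_frames = [[B[r][c] for r in range(inner)] + pad
                for c in range(len(B[0]))]

    # Steps 2 and 3: for every row frame and column frame, multiply on k
    # lanes, reduce the lane sums, and store one element of C.
    C = []
    for a in a_frames:
        c_row = []
        for b in b_frames:
            acc = [0] * k
            for start in range(0, padded, k):
                for lane in range(k):
                    acc[lane] += a[start + lane] * b[start + lane]
            c_row.append(sum(acc))   # reduction into one result element
        C.append(c_row)
    return C


A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7, 8],
     [9, 10],
     [11, 12]]
print(vectorized_matmul(A, B, k=4))  # [[58, 64], [139, 154]]
```

The model returns the same values as a textbook triple-loop product; what it illustrates is the data layout and the lane-wise accumulation, not any hardware speedup.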

Description

Technical field

[0001] The invention mainly relates to the fields of vector processors and data processing, and in particular to a vectorized implementation method for the multiplication of large matrices.

Background

[0002] Matrix multiplication is involved in many scientific computing tasks and applications, such as image processing and signal encoding and decoding in communication systems. Large-scale matrix multiplication involves a large number of multiply and add operations and therefore takes a great deal of computing time. How to implement matrix multiplication on processors simply and efficiently has long been a research focus in the industry.

[0003] On traditional scalar processors, researchers have proposed a variety of effective matrix multiplication implementation methods to reduce the impact of data sorting operations on the completion of the entire matrix multiplication operation. Howeve...

Application Information

IPC(8): G06F17/16
Inventor: 刘仲, 陈书明, 陈跃跃, 曾咏涛, 刘衡竹, 陈海燕, 龚国辉, 彭元喜, 陈胜刚
Owner NAT UNIV OF DEFENSE TECH