Sparse matrix vector multiplication vectorization implementation method

A sparse matrix-vector multiplication technique, applied in the field of multi-core vector processors, that addresses the problems of wasted computing resources and hardware overhead (large area, power consumption, etc.). Its effects are fewer data reads, shorter computing time, and higher computing efficiency.

Active Publication Date: 2020-10-30
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

With this type of method, however, maxnonzeros is not necessarily an integer multiple of the number of VPEs (for example, maxnonzeros = 27 with 16 VPEs). When maxnonzeros is not an integer multiple of the number of VPEs, it will lead to waste of computi...
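The 27-versus-16 example can be worked through numerically. The sketch below assumes that the method being criticized spreads the non-zeros of one row across the VPEs in chunks of `num_vpes` elements; that chunking model and the helper name `lane_utilization` are illustrative assumptions, not details taken from the patent text.

```python
import math

def lane_utilization(maxnonzeros, num_vpes):
    """Return (passes, idle_slots, utilization) when maxnonzeros elements
    are processed num_vpes at a time (assumed chunking model)."""
    passes = math.ceil(maxnonzeros / num_vpes)   # vector passes needed
    total_slots = passes * num_vpes              # lane slots consumed
    idle_slots = total_slots - maxnonzeros       # slots doing no useful work
    return passes, idle_slots, maxnonzeros / total_slots

passes, idle, util = lane_utilization(27, 16)
print(passes, idle, round(util, 3))  # 2 5 0.844
```

Under this assumption, 27 non-zeros on 16 VPEs need two passes, and 5 of the 32 lane slots stay idle.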



Embodiment Construction

[0046] The present invention will be further described below in conjunction with the accompanying drawings and specific preferred embodiments, but the protection scope of the present invention is not limited thereby.

[0047] This embodiment assumes that the problem solved by the SPMV algorithm is Ax = b, where A is the sparse matrix to be calculated, with n rows and at most maxnonzeros non-zero elements in each row; x is the dense vector to be calculated, with n elements; and b is the result vector, with n elements. As shown in Figure 3, the steps of the vectorized implementation method of sparse matrix-vector multiplication for multi-core vector processors in this embodiment are as follows:

[0048] Step 1. Store the data in the TELL (Transposed ELLPACK) data format: construct the first matrix AV and the second matrix AC in DDR, read the data values of non-ze...
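A minimal sketch of the TELL layout may help here. It follows the abstract's description (non-zero values are read row by row and stored by column in AV, with the matching column indices stored by column in AC). The function name `to_tell`, the dense list-of-lists input, and the padding convention (value 0.0, column index 0 for short rows) are assumptions made for illustration; the truncated text above does not specify them.

```python
def to_tell(dense_rows):
    """Convert a dense matrix (list of rows) into TELL-style AV and AC.

    AV[k][i] holds the k-th non-zero value of row i of the input.
    AC[k][i] holds the column index of that value.
    Padding (assumed here): value 0.0 and column index 0.
    """
    n = len(dense_rows)
    # Collect (value, column) pairs for the non-zeros of each row.
    per_row = [[(v, j) for j, v in enumerate(row) if v != 0]
               for row in dense_rows]
    maxnonzeros = max((len(r) for r in per_row), default=0)
    AV = [[0.0] * n for _ in range(maxnonzeros)]
    AC = [[0] * n for _ in range(maxnonzeros)]
    for i, entries in enumerate(per_row):
        for k, (v, j) in enumerate(entries):
            AV[k][i] = float(v)   # row i of A becomes column i of AV
            AC[k][i] = j
    return AV, AC, maxnonzeros

A = [
    [4, 0, 1],
    [0, 3, 0],
    [2, 0, 5],
]
AV, AC, maxnonzeros = to_tell(A)
print(AV)  # [[4.0, 3.0, 2.0], [1.0, 0.0, 5.0]]
print(AC)  # [[0, 1, 0], [2, 0, 2]]
```

Because each row of AV holds one element from every matrix row, the number of elements processed together depends on n rather than on maxnonzeros.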


Abstract

The invention discloses a vectorized implementation method for sparse matrix-vector multiplication, comprising the following steps: step 1, constructing a first matrix and a second matrix in DDR (Double Data Rate) memory, reading the non-zero element data values of the sparse matrix to be calculated row by row and storing them in the first matrix by column, and storing the corresponding column index values in the second matrix by column; step 2, configuring a data buffer area in AM; step 3, transferring the dense vector to be calculated and the data in the second matrix from DDR to GSM; step 4, sequentially transferring the data in the first matrix from DDR to the data buffer area; step 5, reading a column index value from the second matrix, reading the data value of the dense vector to be calculated at that column index, and transferring the data value to the data buffer area; and step 6, performing vectorized calculation on the data in the data buffer area. The method has the advantages of simple implementation, high resource utilization, high calculation efficiency, low hardware overhead, and the like.
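Steps 5 and 6 can be illustrated with a small sketch: gather the dense-vector values selected by the column indices, then multiply and accumulate lane by lane. This is only a functional model under stated assumptions. Plain Python lists stand in for DDR, GSM, and the AM data buffer, the inner loop stands in for the VPE lanes, and the name `spmv_tell` and the zero-padded AV and AC inputs are hypothetical.

```python
def spmv_tell(AV, AC, x):
    """Compute b = A x from TELL-style matrices AV (values) and AC (column indices).

    Each row of AV/AC holds one non-zero slot for every row of A, so the
    inner loop over i models the work done in parallel by the vector lanes.
    """
    n = len(x)
    b = [0.0] * n
    for av_row, ac_row in zip(AV, AC):
        # Step 5 (modeled): gather x values by column index into a buffer.
        gathered = [x[j] for j in ac_row]
        # Step 6 (modeled): lane-wise multiply-accumulate.
        for i in range(n):
            b[i] += av_row[i] * gathered[i]
    return b

# TELL form of A = [[4, 0, 1], [0, 3, 0], [2, 0, 5]], padded with 0.0 / index 0.
AV = [[4.0, 3.0, 2.0], [1.0, 0.0, 5.0]]
AC = [[0, 1, 0], [2, 0, 2]]
x = [1.0, 2.0, 3.0]
print(spmv_tell(AV, AC, x))  # [7.0, 6.0, 17.0]
```

Padded slots contribute nothing to the result because their stored value is 0.0, whichever element of x they gather.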

Description

Technical Field

[0001] The invention relates to the technical field of multi-core vector processors, in particular to a vectorized implementation method of sparse matrix-vector multiplication for multi-core vector processors.

Background Art

[0002] HPL (High Performance Linpack) is the most important benchmark program for evaluating the performance of high-performance computer systems, and it is also the main evaluation index. HPL solves a large-scale dense system of linear equations, which has good data locality and makes it easy to exploit parallelism and locality. However, with the continuous development of high-performance applications, a large number of scientific computing applications, such as large-scale marine weather forecasting, have characteristics that do not match HPL. Therefore, since 2013, HPCG (High Performance Conjugate Gradient) has been used as a new benchmark program to supplement the performance evaluation of high-perfor...


Application Information

IPC(8): G06F17/16
CPC: G06F17/16
Inventor: 刘仲, 郭阳, 鲁建壮, 田希, 李程, 陈海燕, 刘胜, 陈小文, 雷元武, 王丽萍
Owner: NAT UNIV OF DEFENSE TECH