A sparse matrix storage method on a SIMD many-core processor with multi-level cache

A sparse matrix storage technique for many-core processors, applied in the field of parallel programming. It addresses cache misses, high memory-access latency, low reuse of x-vector data, and low SIMD utilization, and it yields denser sub-blocks, higher SIMD utilization, and improved computing efficiency.
CN104636273B · Active · Publication Date: 2017-07-25 · UNIV OF SCI & TECH OF CHINA

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: UNIV OF SCI & TECH OF CHINA
Publication Date: 2017-07-25

Abstract

The invention discloses a method for storing a sparse matrix on a SIMD many-core processor with a multi-level cache. The method includes the following steps: first, the maximum number a of non-zero elements in any row of a matrix A and the number b of non-zero elements that the processor's SIMD unit can process at the same time are acquired, and the smallest multiple of b that is larger than a is taken as a temporary row width; second, for the matrix A, the arrays Value and Colidx store, row by row and in order, the values of the non-zero elements and their column coordinates, and every row whose number of non-zero elements falls short of the temporary row width is padded at its tail with 0 in Value and -1 in Colidx; third, Colidx and Value are partitioned into column blocks of b columns each; fourth, each column block is compressed by rows, so that the rows containing non-zero elements are gathered in the upper portion of the column block; fifth, each column block is partitioned every b rows to obtain sub-blocks, all-zero sub-blocks are removed, and the remaining sub-blocks are stored row by row. The advantages of the method are that the sparse matrix is divided into dense sub-blocks, the utilization of the processor's SIMD processing units and registers is increased while the locality of the non-zero elements is preserved, and sparse matrix-vector multiplication performance is improved.
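The steps in the abstract can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names, the `strict` flag, the dictionary layout of a sub-block, and the way original row numbers are carried along after compression are all hypothetical, since the abstract does not specify them. The abstract says the temporary row width is "larger than" a, so `strict=True` follows that wording literally; `strict=False` shows the alternative "at least a" reading.

```python
def temp_row_width(a, b, strict=True):
    """Step 1: smallest multiple of b that is larger than a.

    With strict=False the result is the smallest multiple that is
    at least a (an alternative reading, labeled as an assumption).
    """
    if strict:
        return (a // b + 1) * b
    return -(-a // b) * b  # ceiling division, then scale back up


def pad_rows(dense, b, strict=True):
    """Steps 1-2: build the padded Value and Colidx arrays from a dense matrix."""
    rows = [[(j, v) for j, v in enumerate(r) if v != 0] for r in dense]
    a = max(len(r) for r in rows)
    width = temp_row_width(a, b, strict)
    # Pad Value with 0 and Colidx with -1 up to the temporary row width.
    value = [[v for _, v in r] + [0] * (width - len(r)) for r in rows]
    colidx = [[j for j, _ in r] + [-1] * (width - len(r)) for r in rows]
    return value, colidx


def build_subblocks(value, colidx, b):
    """Steps 3-5: b-column blocks, row compression, b-row sub-blocks."""
    n_rows, width = len(value), len(value[0])
    blocks = []
    for c0 in range(0, width, b):  # step 3: one column block per b columns
        # Step 4: keep only rows holding a real entry, packed to the top.
        live = [i for i in range(n_rows)
                if any(j != -1 for j in colidx[i][c0:c0 + b])]
        # Step 5: cut the packed rows into groups of b; groups made only of
        # padding never appear, which stands in for "remove all-zero sub-blocks".
        for r0 in range(0, len(live), b):
            ids = live[r0:r0 + b]
            blocks.append({
                "rows": ids,  # original row numbers (assumed bookkeeping)
                "value": [value[i][c0:c0 + b] for i in ids],
                "colidx": [colidx[i][c0:c0 + b] for i in ids],
            })
    return blocks


def spmv_blocks(blocks, x, y):
    """Accumulate y = y + A x from the sub-blocks; -1 marks padding."""
    for blk in blocks:
        for i, vals, cols in zip(blk["rows"], blk["value"], blk["colidx"]):
            for v, j in zip(vals, cols):
                if j != -1:
                    y[i] += v * x[j]
    return y
```

In this sketch the last sub-block of a column block may hold fewer than b rows; a SIMD implementation would presumably pad it to a full b-by-b tile, but the abstract does not say so.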

Description

Technical field

[0001] The invention relates to the field of parallel program design, in particular to a sparse matrix storage method on a SIMD many-core processor with multi-level Cache.

Background technique

[0002] Sparse matrix-vector multiplication (SpMV) is an important computational kernel of many scientific and engineering applications, and its efficiency is key to the overall performance of those applications. The main function of the algorithm is to calculate y=y+Ax, where A is a two-dimensional sparse matrix, and both x and y are one-dimensional dense vectors. However, on modern SIMD many-core processors with multi-level Cache, the irregular distribution of the sparse matrix's non-zero elements makes the SIMD utilization rate very low, resulting in poor SpMV performance. To improve the performance of the algorithm, we often need to comprehensively consider the characteristics of th...
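To make the y=y+Ax kernel concrete, here is a minimal baseline sketch in Python. It assumes the matrix is held in the common compressed sparse row (CSR) layout, which the text above does not name; `spmv_csr`, `row_ptr`, `col_idx`, and `values` are illustrative names only.

```python
def spmv_csr(row_ptr, col_idx, values, x, y):
    """Accumulate y = y + A x for a matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i + 1] is the slice of col_idx / values that
    belongs to row i.
    """
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        # The trip count differs from row to row, and x is read at
        # scattered positions given by col_idx.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] += acc
    return y


# A 4x4 example with 2, 0, 1 and 3 non-zero elements per row.
row_ptr = [0, 2, 2, 3, 6]
col_idx = [0, 3, 1, 0, 1, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
result = spmv_csr(row_ptr, col_idx, values, [1.0, 2.0, 3.0, 4.0], [1.0] * 4)
```

The uneven inner-loop lengths and the indirect reads of x in this baseline are the kind of irregularity that the paragraph above blames for low SIMD utilization.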

Claims
