A sparse matrix storage method on SIMD many-core processors with multi-level cache

The technology concerns many-core processors and sparse matrices and is applied in the field of parallel programming. It addresses cache misses, high memory-access latency overhead, a low reuse rate of x-vector data, and low SIMD utilization, with the effects of improved utilization, high density, and improved computing efficiency.

Active Publication Date: 2017-07-25

UNIV OF SCI & TECH OF CHINA


Problems solved by technology

To obtain high sparse matrix-vector multiplication performance on this kind of SIMD processor with multi-level Cache, the computing bottlenecks caused by the irregular distribution of the sparse matrix's non-zero elements must be overcome:

[0005] (1) SIMD utilization is low;

[0006] (2) The reuse rate of data in the x vector is low, which makes the cache-miss and memory-access latency overhead very large;

Embodiment Construction

[0027] This section applies the present invention to a typical sparse matrix-vector multiplication on a SIMD many-core processor with multi-level Cache, thereby further describing the objects, advantages, and key technical features of the invention. This implementation is only a typical example of the solution; any technical solution formed by replacement or equivalent transformation falls within the scope of protection claimed by the present invention.

[0028] For a sparse matrix A to be computed:

[0029]

[0030] First, following the optimization method proposed by the present invention, the matrix A undergoes feature extraction, matrix scanning, column blocking, column compression, row blocking, and row-by-row storage, converting it to the ERB storage format shown in Figure 5; alternatively, the sparse matrix to be computed is stored directly in the Figure 5 storage format and saved to a file.
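The feature-extraction and scanning stage can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `pad_rows`, the dense NumPy input, and the small test matrix are assumptions made here. It follows the abstract's wording literally, taking the temporary row width as the smallest multiple of b strictly larger than a; a reading that allows equality would pad less when a is already a multiple of b.

```python
import numpy as np

def pad_rows(dense, b):
    """Build padded Value/Colidx arrays from a dense 2-D array (illustrative input)."""
    nz_cols = [np.flatnonzero(row) for row in dense]
    a = max(len(idx) for idx in nz_cols)      # max number of nonzeros in any row
    width = (a // b + 1) * b                  # smallest multiple of b larger than a
    value = np.zeros((len(nz_cols), width), dtype=dense.dtype)
    colidx = np.full((len(nz_cols), width), -1, dtype=np.int64)
    for i, idx in enumerate(nz_cols):
        value[i, :len(idx)] = dense[i, idx]   # nonzero values, in column order
        colidx[i, :len(idx)] = idx            # their column coordinates
    return value, colidx, width               # unused tail stays 0 / -1

# Hypothetical 4x4 matrix; b = 2 stands in for the SIMD width.
A = np.array([[5, 0, 0, 1],
              [0, 0, 0, 0],
              [0, 2, 3, 4],
              [0, 0, 6, 0]], dtype=float)
value, colidx, width = pad_rows(A, b=2)
```

Here the longest row has a = 3 non-zeros, so with b = 2 every row is padded to a width of 4.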

[0031] During computation, read from the file the Figure 5 sparse matr...

Abstract

The invention discloses a storage method for a sparse matrix on a SIMD many-core processor with a multi-level cache. The method includes the following steps: first, the maximum number a of non-zero elements in any row of matrix A and the number b of non-zero elements that the processor's SIMD unit can process at the same time are obtained, and the smallest multiple of b that is larger than a is computed as the temporary row width; second, for matrix A, the arrays Value and Colidx store, in order, the non-zero element values and column coordinates of each row, and 0 and -1 respectively are appended to the end of each row whose number of non-zero elements does not reach the temporary row width; third, Colidx and Value are partitioned into column blocks of b columns; fourth, each column block is compressed by rows, so that the rows containing non-zero elements are concentrated in the upper portion of the column block; sixth, the column blocks are partitioned into groups of b rows to obtain sub-blocks, the all-zero sub-blocks are removed, and the sub-blocks are stored row by row. The method has the advantages that the sparse matrix is divided into dense sub-blocks, the utilization of the processor's SIMD processing units and registers is increased while the locality of the non-zero elements is preserved, and sparse matrix-vector multiplication performance is improved.
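The blocking steps can be sketched as below, starting from already padded Value and Colidx arrays. This is a sketch under stated assumptions, not the patented implementation: it reads the abstract's blocks as column blocks of the padded arrays (in line with the "column block ... row block" sequence in [0030]), it requires the row count to be a multiple of b, and it keeps the original row indices next to each sub-block, which the source does not describe. The names `to_subblocks` and `spmv_blocks` are hypothetical, and `spmv_blocks` exists only to check the result.

```python
import numpy as np

def to_subblocks(value, colidx, b):
    """Cut padded Value/Colidx arrays into b-by-b sub-blocks, dropping empty ones."""
    n_rows, width = colidx.shape
    assert n_rows % b == 0 and width % b == 0   # simplifying assumption of this sketch
    blocks = []
    for c0 in range(0, width, b):               # column blocks of b columns
        v_blk, c_blk = value[:, c0:c0 + b], colidx[:, c0:c0 + b]
        has_nz = (c_blk >= 0).any(axis=1)
        # Row compression: a stable sort moves rows with nonzeros to the top.
        order = np.argsort(~has_nz, kind="stable")
        for r0 in range(0, n_rows, b):          # sub-blocks of b rows
            rid = order[r0:r0 + b]              # original row index of each sub-block row
            if (c_blk[rid] >= 0).any():         # skip all-zero sub-blocks
                blocks.append((rid, v_blk[rid], c_blk[rid]))
    return blocks

def spmv_blocks(blocks, x, y):
    """Check helper (not from the source): accumulate y += A @ x from sub-blocks."""
    for rid, v, c in blocks:
        xs = x[np.where(c >= 0, c, 0)]          # padded slots read x[0] but multiply by 0
        y[rid] += (v * xs).sum(axis=1)
    return y

# Hypothetical padded arrays for a 4x4 matrix, with b = 2.
value = np.array([[5, 1, 0, 0], [0, 0, 0, 0], [2, 3, 4, 0], [6, 0, 0, 0]], dtype=float)
colidx = np.array([[0, 3, -1, -1], [-1, -1, -1, -1], [1, 2, 3, -1], [2, -1, -1, -1]])
blocks = to_subblocks(value, colidx, b=2)
y = spmv_blocks(blocks, np.array([1.0, 2.0, 3.0, 4.0]), np.zeros(4))
```

In this example four candidate sub-blocks are formed and one, containing only padding, is dropped. Each remaining sub-block is a full b-by-b tile, which is what lets a SIMD unit of width b process one sub-block row per operation.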

Description

Technical field

[0001] The invention relates to the field of parallel program design, in particular to a sparse matrix storage method on a SIMD many-core processor with multi-level Cache.

Background

[0002] Sparse matrix-vector multiplication (SpMV) is an important computational kernel in many scientific and engineering applications, and its efficiency is key to the performance of those applications. The kernel computes y=y+Ax, where A is a two-dimensional sparse matrix and both x and y are one-dimensional dense vectors. However, on modern SIMD many-core processors with multi-level Cache, the irregular distribution of the sparse matrix's non-zero elements makes SIMD utilization very low, resulting in poor SpMV performance. To improve the performance of the algorithm, we often need to comprehensively consider the characteristics of th...
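For reference, the y=y+Ax kernel can be written against a conventional compressed-row layout. The source does not name a baseline format, so the CSR arrays (`rowptr`, `col`, `val`), the function name `spmv_csr`, and the sample data are assumptions chosen only to make the irregular access pattern visible.

```python
import numpy as np

def spmv_csr(rowptr, col, val, x, y):
    """Accumulate y += A @ x for A stored as CSR arrays (rowptr, col, val)."""
    for i in range(len(rowptr) - 1):
        # Row lengths differ, so a fixed-width SIMD unit is often only partly filled.
        for k in range(rowptr[i], rowptr[i + 1]):
            y[i] += val[k] * x[col[k]]   # indirect read of x: poor locality and reuse
    return y

# Hypothetical 4x4 matrix with rows holding 2, 0, 3 and 1 nonzeros.
rowptr = np.array([0, 2, 2, 5, 6])
col = np.array([0, 3, 1, 2, 3, 2])
val = np.array([5.0, 1.0, 2.0, 3.0, 4.0, 6.0])
x = np.array([1.0, 2.0, 3.0, 4.0])
y = spmv_csr(rowptr, col, val, x, np.ones(4))
```

The inner loop length changes from row to row and each access to x goes through `col[k]`, which are the two effects the background section identifies as the source of low SIMD utilization and cache misses.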

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06F12/0811, G06F9/38, G06F12/1009

Inventor: 韩文廷, 张爱民, 江霞, 安虹, 陈俊仕, 孙荪, 汪朝辉

Owner: UNIV OF SCI & TECH OF CHINA
