A heterogeneous many-core implementation method for sparse matrix-vector multiplication based on the Shenwei 26010 processor

A sparse matrix-vector multiplication implementation method, applied in the fields of electrical digital data processing, instruments, and machine execution devices. It addresses problems in sparse matrix computation such as load imbalance and low memory access bandwidth utilization, improving memory access bandwidth utilization and overall performance while reducing scheduling overhead.

Active Publication Date: 2019-03-19
INST OF SOFTWARE - CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0013] The technical problem solved by the present invention is as follows: for the new Shenwei processor, which has independent intellectual property rights, a heterogeneous many-core implementation method for sparse matrix-vector multiplication (SpMV) based on the Shenwei 26010 processor is proposed. It addresses problems that arise in CSR-format SpMV computation, such as load imbalance among tasks across cores, memory access bandwidth utilization, and adaptive performance optimization for different types of sparse matrices, thereby improving the overall performance of sparse matrix-vector multiplication.

Embodiment Construction

[0043] The present invention will be described in detail below in conjunction with examples.

[0044] As shown in Figure 1, the main-core version of SpMV is implemented as follows:

[0045] (1) Loop over the rows of the sparse matrix. First obtain the current row number and check it; if it is less than the total number of rows of the sparse matrix, proceed to the next step.

[0046] (2) Traverse all non-zero elements in the current row. For each one, read its value and its column index by array access, use the column index to fetch the corresponding element of the vector x, multiply the two, and accumulate the products to obtain the result for the current row.

[0047] (3) Assign the result to the corresponding element of the vector y.
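
Steps (1) to (3) correspond to the standard CSR-format SpMV loop. The following is a minimal sketch in C, not the patent's own code; the function name `spmv_csr` and the array names `row_ptr`, `col_idx`, and `val` are assumptions chosen for illustration.

```c
#include <stddef.h>

/* Illustrative sketch: y = A * x for a matrix A with n_rows rows in CSR format.
 * Names are hypothetical. row_ptr has n_rows + 1 entries; row i owns the
 * non-zeros val[row_ptr[i]] .. val[row_ptr[i + 1] - 1], and col_idx[k] is the
 * column index of val[k]. */
void spmv_csr(size_t n_rows, const size_t *row_ptr, const size_t *col_idx,
              const double *val, const double *x, double *y)
{
    /* Step (1): loop while the current row number is below the row count. */
    for (size_t row = 0; row < n_rows; ++row) {
        double sum = 0.0;
        /* Step (2): for each non-zero in the row, fetch x by column index,
         * multiply, and accumulate. */
        for (size_t k = row_ptr[row]; k < row_ptr[row + 1]; ++k) {
            sum += val[k] * x[col_idx[k]];
        }
        /* Step (3): store the row result in y. */
        y[row] = sum;
    }
}
```

The indirect access `x[col_idx[k]]` in the inner loop is the irregular access to the vector x that makes this kernel memory-bound.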

[0048] As shown in Figure 2, the SpMV of the present invention is implemented as follows:

[0049] ...

Abstract

The invention discloses a heterogeneous many-core implementation method for sparse matrix-vector multiplication based on the domestic SW26010 processor. Because the non-zero elements of a sparse matrix are distributed very irregularly, the method designs two task partitioning schemes, one static and one dynamic, to suit different sparse matrices. It provides a combined dynamic and static cache mechanism to raise the access hit rate for the vector x, and an adaptive optimization method that dynamically selects the optimal execution parameters for a given sparse matrix to improve runtime performance. In tests on 16 sparse matrices from the Matrix Market collection, this SpMV implementation achieves a maximum speedup of about 10x over a single main core of the domestic SW processor, with an average speedup of 6.51.
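
This excerpt does not describe how the static task partitioning works. As one plausible illustration only, the sketch below splits the rows of a CSR matrix into contiguous blocks with roughly equal non-zero counts, which is a common way to reduce load imbalance between cores. The function name `partition_rows_by_nnz` and its parameters are hypothetical, and the patent's actual scheme may differ.

```c
#include <stddef.h>

/* Illustrative sketch, not the patented scheme: split n_rows rows into
 * n_parts contiguous blocks holding roughly equal numbers of non-zeros.
 * row_ptr is the CSR row-pointer array (n_rows + 1 entries).
 * part_start must have n_parts + 1 entries; part p is assigned rows
 * part_start[p] .. part_start[p + 1] - 1. */
void partition_rows_by_nnz(size_t n_rows, const size_t *row_ptr,
                           size_t n_parts, size_t *part_start)
{
    size_t nnz = row_ptr[n_rows];   /* total number of non-zeros */
    size_t row = 0;

    part_start[0] = 0;
    for (size_t p = 1; p < n_parts; ++p) {
        /* Non-zeros that the first p parts should hold in total. */
        size_t target = nnz * p / n_parts;
        /* row_ptr[row] counts the non-zeros in all rows before `row`;
         * advance until that count reaches the target. */
        while (row < n_rows && row_ptr[row] < target) {
            ++row;
        }
        part_start[p] = row;
    }
    part_start[n_parts] = n_rows;
}
```

For a 5-row matrix whose rows hold 4, 1, 1, 1, and 1 non-zeros, splitting into two parts assigns row 0 to the first part and rows 1 to 4 to the second, so each part holds 4 non-zeros. An equal-row split would instead give the first part 6 non-zeros and the second only 2.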

Description

Technical field

[0001] The invention relates to a method for implementing sparse matrix-vector multiplication (SpMV), the core computation on sparse matrices, on a Shenwei many-core processor. It is applicable to scientific computing and practical applications such as astrophysics and reservoir simulation.

Background

[0002] Sparse matrix-vector multiplication (SpMV), y = A*x, is a very important computing kernel in scientific and engineering computing, and its performance often has a great impact on the overall performance of an application. SpMV is memory-intensive: the ratio of floating-point operations to memory accesses in the algorithm is very low, and the distribution of non-zero elements in the sparse matrix is very irregular. In the traditional SpMV implementation for the CSR (Compressed Sparse Row) format, the vector x is accessed indirectly and irregularly, with poor reuse, which brings great challenges to the efficient ...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F9/30
CPC: G06F9/30007
Inventor: 刘芳芳, 杨超, 吴长茂
Owner: INST OF SOFTWARE - CHINESE ACAD OF SCI