Self-adaptive multi-row folding storage method suitable for GPU (Graphics Processing Unit)

A self-adaptive row-folding technology, applied in multi-program arrangements, resource allocation, program control design, etc., that addresses the reduced performance caused by large differences in the number of non-zero elements between folded rows.

Pending Publication Date: 2021-10-01
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0005] Problems with CMRS: CMRS always folds a fixed k rows of the matrix together, which may leave different folded rows with very different numbers of non-zero elements. If the original matrix is irregular, that is, the number of non-zero elements differs greatly between rows, fixed multi-row folding exacerbates this irregularity, causing more serious load imbalance and reducing overall performance.

Examples

Embodiment 1

[0058] Embodiment 1: For irregular sparse matrices in which the number of non-zero elements differs greatly between rows, fixed multi-row folding still suffers from serious load imbalance. The present invention therefore proposes an adaptive multi-row folding storage format, AMF-CSR. Unlike CMRS, which always folds every k consecutive rows, the new method lets the number of matrix rows folded together vary. The flow chart of the conversion from CSR to AMF-CSR format is shown in Figure 1. The present invention refers to each folded row as an f-row.

[0059] Figure 1 shows an 8×8 sparse matrix. A fixed two-row fold generates 4 f-rows whose non-zero element counts are {3, 3, 10, 4}. The third f-row clearly has far more non-zero elements than the others and thus becomes the performance bottleneck of SpMV. Taking T_s = 4, adaptive multi-row folding generates 5 f-rows whose non-zero element counts are {3, 3, 7, 3, 4}. O...
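
The counts in this example can be reproduced with a small Python sketch. Two things here are assumptions and not taken from the source: the per-row counts (the matrix in the figure is not reproduced in this text, so the hypothetical row lengths below were chosen only to match the stated totals), and the folding rule itself (an f-row absorbs the next row while it holds fewer than k rows and its running count would stay at or below T_s). The function names `fixed_fold` and `adaptive_fold` are illustrative.

```python
def fixed_fold(row_ptr, k):
    """Non-zero count of each f-row when every k consecutive rows are folded."""
    n_rows = len(row_ptr) - 1
    return [row_ptr[min(r + k, n_rows)] - row_ptr[r] for r in range(0, n_rows, k)]


def adaptive_fold(row_ptr, k, t_s):
    """Greedy variable-size folding over a CSR row-pointer array (assumed rule).

    Returns (starts, counts): the first original row of each f-row and the
    number of non-zero elements it holds.
    """
    n_rows = len(row_ptr) - 1
    starts, counts = [], []
    r = 0
    while r < n_rows:
        start = r
        nnz = row_ptr[r + 1] - row_ptr[r]  # a single row is always accepted
        r += 1
        # Absorb further rows while under the row limit and the threshold.
        while r < n_rows and r - start < k:
            nxt = row_ptr[r + 1] - row_ptr[r]
            if nnz + nxt > t_s:
                break
            nnz += nxt
            r += 1
        starts.append(start)
        counts.append(nnz)
    return starts, counts


# Hypothetical per-row non-zero counts for an 8-row matrix: 1, 2, 2, 1, 7, 3, 2, 2
row_ptr = [0, 1, 3, 5, 6, 13, 16, 18, 20]

print(fixed_fold(row_ptr, k=2))            # [3, 3, 10, 4]
print(adaptive_fold(row_ptr, k=2, t_s=4))  # ([0, 2, 4, 5, 6], [3, 3, 7, 3, 4])
```

Under these assumptions the heaviest f-row drops from 10 non-zero elements to 7, at the cost of one extra f-row.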

Abstract

The invention discloses a self-adaptive multi-row folding storage method and system suitable for a GPU (Graphics Processing Unit). The method comprises the following steps: obtaining the CSR encoding of a sparse matrix, and deriving a non-zero element threshold for the sparse matrix from a set folding granularity and threshold value; traversing the row pointer array of the sparse matrix and performing multi-row folding based on the non-zero element threshold to obtain the row index and row pointer position of each folded row; partitioning the sparse matrix into blocks according to the row index and row pointer position, and collecting the row index and row pointer position of the folded rows belonging to each folding block; and traversing the column index and value arrays of the sparse matrix and, based on the column index, adding the non-zero elements to the blocked folded rows, thereby constructing an adaptive multi-row folding storage structure that realizes multi-row folding storage of the sparse matrix. In the system, the modules cooperate to implement the self-adaptive multi-row folding storage method, solving the load imbalance of CMRS-based SpMV on irregular matrices.
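
To make the last two steps more concrete, here is a minimal Python sketch of one way a folded structure could be derived from CSR arrays and then used for SpMV. It is not the layout claimed by the invention, whose details are not given in this text. It assumes the f-row boundaries (`f_starts`) have already been chosen, leaves the CSR column-index and value arrays in place, and adds a per-element local row offset so each product can be routed to its original row. It also omits the blocking step. All names are hypothetical.

```python
def build_folded(row_ptr, f_starts):
    """Derive f-row element ranges and per-element local row offsets.

    row_ptr  : CSR row-pointer array
    f_starts : first original row of each f-row, in increasing order
    """
    n_rows = len(row_ptr) - 1
    bounds = list(f_starts) + [n_rows]
    # Because folded rows are consecutive, an f-row is a contiguous CSR range.
    f_ptr = [row_ptr[b] for b in bounds]
    local_row = []
    for f in range(len(f_starts)):
        for r in range(bounds[f], bounds[f + 1]):
            local_row.extend([r - bounds[f]] * (row_ptr[r + 1] - row_ptr[r]))
    return f_ptr, local_row


def spmv_folded(f_starts, f_ptr, local_row, col_idx, vals, x, n_rows):
    """Compute y = A*x by walking f-rows; each f-row is one unit of work."""
    y = [0.0] * n_rows
    for f, first in enumerate(f_starts):
        for e in range(f_ptr[f], f_ptr[f + 1]):
            y[first + local_row[e]] += vals[e] * x[col_idx[e]]
    return y


# Small 4x4 example; rows 0 and 1 are folded together, rows 2 and 3 stay alone.
row_ptr = [0, 2, 3, 6, 7]
col_idx = [0, 2, 1, 0, 1, 3, 3]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
x = [1.0, 2.0, 3.0, 4.0]
f_starts = [0, 2, 3]

f_ptr, local_row = build_folded(row_ptr, f_starts)
y = spmv_folded(f_starts, f_ptr, local_row, col_idx, vals, x, n_rows=4)
print(f_ptr, local_row, y)
```

On a GPU, each f-row would presumably map to a thread or thread group, which is why balancing the element count per f-row matters; the sequential loop above only illustrates the indexing.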

Description

Technical field

[0001] The invention relates to the field of sparse linear algebra data optimization, and in particular to an adaptive multi-row folding storage method suitable for GPUs.

Background

[0002] Sparse matrix-vector multiplication (SpMV) is the core computation of many iterative algorithms for solving large-scale sparse linear systems, and it is widely used in research fields such as linear algebra, data mining, and graph analysis. The general form of SpMV is y=A·x, where A is an m×n sparse matrix, and x and y are dense vectors of size n×1 and m×1 respectively. In recent years, optimizing SpMV kernels on modern hardware architectures has attracted extensive attention from researchers. Compared with traditional CPUs, modern general-purpose graphics processing units (GPUs) tend to offer higher peak floating-point performance thanks to their large number of computing cores. However, optimizing SpMV on GPUs is challenging. This is mainly b...
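
For reference, the operation y = A·x on a matrix stored in the standard CSR format can be written as the following Python sketch. The function name `spmv_csr` and the example matrix are illustrative and not from the source. The inner loop length equals the number of non-zero elements in the row, which is where uneven rows turn into uneven work when rows are assigned to parallel threads.

```python
def spmv_csr(row_ptr, col_idx, vals, x):
    """Compute y = A*x for an m-by-n matrix A stored in CSR format."""
    m = len(row_ptr) - 1
    y = [0.0] * m
    for i in range(m):
        acc = 0.0
        # Elements of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for e in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[e] * x[col_idx[e]]
        y[i] = acc
    return y


# 4x4 example with 2, 1, 3 and 1 non-zero elements per row.
row_ptr = [0, 2, 3, 6, 7]
col_idx = [0, 2, 1, 0, 1, 3, 3]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
x = [1.0, 2.0, 3.0, 4.0]

print(spmv_csr(row_ptr, col_idx, vals, x))  # [7.0, 6.0, 38.0, 28.0]
```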

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06T1/20, G06T1/60, G06F9/50
CPC: G06T1/20, G06T1/60, G06F9/505
Inventor: 计卫星, 高建花, 王一拙, 石峰
Owner: BEIJING INSTITUTE OF TECHNOLOGY