Methods and systems for optimizing sparse matrix vector multiplication of a high-performance computing framework

A technique for optimizing sparse matrix-vector multiplication on high-performance computing architectures. It addresses problems such as high memory access delay and load imbalance, which hinder large-scale applications.

Inactive Publication Date: 2020-07-17
HUNAN UNIV

AI Technical Summary

Problems solved by technology

[0005] Aiming at the above defects or improvement needs of the prior art, the present invention provides a method and system for optimizing the sparse matrix-vector multiplication of the high-performance computing framework. The technical problem of high memory access delay in the parallel sparse matri

Example Embodiment

[0055] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not constitute a conflict with each other.

[0056] Aiming at the three main design challenges of parallel sparse matrix multiplication in the "Sunway TaihuLight" architecture, we propose an optimized implementation method for sparse matrix-vector multiplication oriented to the "Sunway TaihuLight" architecture. In order to solve the problem of high memory access delay, we divide the sparse matrix multiplication operation into two parts: Co...
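The abstract names the two parts of this split as a column multiplication operation and a row addition operation. The following is a minimal sketch of that idea in plain Python, not the patented implementation. It assumes the matrix is stored in compressed sparse column (CSC) form, and the function name `spmv_two_phase` and the use of a sorted index list to group products by row are illustrative choices that do not come from the source.

```python
def spmv_two_phase(n_rows, col_ptr, row_idx, vals, x):
    """Compute y = A @ x for a CSC matrix in two phases (illustrative sketch)."""
    n_cols = len(col_ptr) - 1

    # Phase 1: column multiplication.
    # Each x[j] is read exactly once, in increasing j, so access to x is sequential.
    products = [0.0] * len(vals)
    for j in range(n_cols):
        xj = x[j]
        for k in range(col_ptr[j], col_ptr[j + 1]):
            products[k] = vals[k] * xj

    # Phase 2: row addition.
    # Assumption: products are grouped by row through a sorted index list;
    # the excerpt does not say how the patent arranges the intermediate values.
    order = sorted(range(len(vals)), key=lambda k: row_idx[k])
    y = [0.0] * n_rows
    pos = 0
    for i in range(n_rows):  # y[i] is written once, in increasing i
        acc = 0.0
        while pos < len(order) and row_idx[order[pos]] == i:
            acc += products[order[pos]]
            pos += 1
        y[i] = acc
    return y


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]] stored column by column
col_ptr = [0, 2, 3, 5]
row_idx = [0, 2, 1, 0, 2]
vals = [1.0, 4.0, 3.0, 2.0, 5.0]
y = spmv_two_phase(3, col_ptr, row_idx, vals, [1.0, 2.0, 3.0])
```

In this sketch the irregular indexing moves out of the accesses to `x` and `y` and into the intermediate `products` array. A real implementation would have to lay that array out carefully as well.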

Abstract

The invention discloses a method for optimizing sparse matrix-vector multiplication of a high-performance computing framework. The method divides the sparse matrix-vector multiplication into a column multiplication operation and a row addition operation. The column multiplication is carried out first, column by column over the sparse matrix, which changes data access to the input vector x from irregular and discrete to continuous. The row addition is then carried out, which likewise changes data access to the output vector y from irregular and discrete to continuous. This avoids the high memory access delay caused by irregular data access. The invention further designs a four-layer division mechanism, comprising core group layer division, customized division, slave core layer division, and local memory layer division, so that the multi-level computing architecture and memory structure of the Sunway TaihuLight are fully utilized and the problems of limited local memory and load imbalance on the computing cores are avoided.
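A layered division of this kind can be sketched as nested splits of the column range. The sketch below is hypothetical and covers only three of the four layers (core group, slave core, local memory); it omits the customized division because this excerpt does not describe it. Balancing by nonzero count, the `capacity` parameter, and all function names are assumptions made for illustration.

```python
def split_balanced(weights, start, end, parts):
    """Split [start, end) into `parts` contiguous ranges of similar total weight."""
    total = sum(weights[start:end])
    ranges, lo, acc = [], start, 0
    for p in range(1, parts):
        target = total * p / parts  # cumulative weight allowed after p parts
        hi = lo
        while hi < end and acc + weights[hi] <= target:
            acc += weights[hi]
            hi += 1
        ranges.append((lo, hi))
        lo = hi
    ranges.append((lo, end))
    return ranges


def chunk_by_capacity(weights, start, end, capacity):
    """Cut [start, end) into chunks whose total weight stays within `capacity`.

    A single item heavier than `capacity` still gets a chunk of its own.
    """
    chunks, lo, acc = [], start, 0
    for j in range(start, end):
        if acc + weights[j] > capacity and j > lo:
            chunks.append((lo, j))
            lo, acc = j, 0
        acc += weights[j]
    if lo < end:
        chunks.append((lo, end))
    return chunks


def partition(nnz_per_col, core_groups=4, slave_cores=64, capacity=1024):
    """Return (core_group, slave_core, chunks) triples covering all columns."""
    plan = []
    n = len(nnz_per_col)
    for g, (gs, ge) in enumerate(split_balanced(nnz_per_col, 0, n, core_groups)):
        for s, (ss, se) in enumerate(split_balanced(nnz_per_col, gs, ge, slave_cores)):
            plan.append((g, s, chunk_by_capacity(nnz_per_col, ss, se, capacity)))
    return plan


# Tiny example: 8 columns, 2 core groups, 2 slave cores each, 3 nonzeros per chunk.
plan = partition([2, 1, 1, 2, 1, 1, 3, 1], core_groups=2, slave_cores=2, capacity=3)
```

Splitting on nonzero counts rather than on column counts is one common way to address load imbalance, and the final chunking step stands in for the local memory limit mentioned above.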

Description

Technical field

[0001] The invention belongs to the field of parallel computing, and more specifically relates to a method and system for optimizing sparse matrix-vector multiplication of a high-performance computing framework.

Background

[0002] High-performance computing architectures are increasingly common in industrial applications. A typical example is the "Sunway TaihuLight", which is based on the SW26010 multi-core heterogeneous processor and was independently developed by the National Parallel Computer Engineering Technology Research Center. The supercomputer, installed at the National Supercomputing Wuxi Center, is equipped with 40960 SW26010 processors. Each SW26010 processor has 4 core groups, and each core group has one master core and 8*8 slave cores. The master core is responsible for preprocessing, slave core computing task allocation, and some ...
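The core counts quoted in the background can be tallied directly. The short calculation below uses only the figures stated above; the variable names are illustrative.

```python
# Figures as stated in the background paragraph.
PROCESSORS = 40960
CORE_GROUPS_PER_PROCESSOR = 4
MASTER_CORES_PER_GROUP = 1
SLAVE_CORES_PER_GROUP = 8 * 8

cores_per_group = MASTER_CORES_PER_GROUP + SLAVE_CORES_PER_GROUP      # 65
cores_per_processor = CORE_GROUPS_PER_PROCESSOR * cores_per_group     # 260
total_cores = PROCESSORS * cores_per_processor                        # 10,649,600
total_slave_cores = PROCESSORS * CORE_GROUPS_PER_PROCESSOR * SLAVE_CORES_PER_GROUP

print(f"{total_cores:,} cores in total, {total_slave_cores:,} of them slave cores")
```

Nearly all of the machine's cores are slave cores, which is why the distribution of work across them matters so much for load balance.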

Application Information

IPC(8): G06F17/16
CPC: G06F17/16
Inventor: 李肯立, 陈玥丹, 肖国庆, 阳王东, 唐卓, 周旭, 刘楚波
Owner: HUNAN UNIV