Automatic tuning method and device for the parallel task granularity parameter of sparse matrix-vector multiplication

A technique concerning sparse matrices and task granularity, applied in the field of parallel program task assignment. It addresses the problem that the choice of task granularity affects thread load balance, and it improves load balance and running performance.

Active Publication Date: 2020-11-24
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

Using different task granularity will result in different allocatio

Examples

Embodiment 1

[0033] Embodiment 1: Automatic tuning method of sparse matrix vector multiplication parallel task granularity parameters.

[0034] Figure 1 is a flowchart of the automatic tuning method for sparse matrix-vector multiplication task granularity parameters of the present invention. The method includes a prediction model construction step, a statistical feature value acquisition step, an optimal task granularity parameter prediction step, and a configuration step.

[0035] S1, the prediction model construction step: a prediction model is constructed using machine learning methods. Between the statistical feature value space X and the space Y of optimal parallel task granularity values, a prediction model $f: X \to Y$ is constructed, where $x = (x_1, x_2, \ldots, x_i, \ldots, x_n)$ denotes the n-dimensional statistical feature vector of the sparse matrix, $x_i$ denotes a statistical feature value, and $y$ denotes the task granularity. In the statistical feature vector $x = (x_1, x_2, \ldots, x_i, \ldots, x_n)$ co...
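As a rough illustration of what a model f: X→Y could look like, the sketch below maps a feature vector to a granularity value with a 1-nearest-neighbor lookup. The source text does not name the learning algorithm, so nearest-neighbor is purely an assumption chosen for brevity; the function name `fit_nearest_neighbor`, the three features, and all training values are hypothetical.

```python
import math


def fit_nearest_neighbor(samples):
    """Build a predictor f: X -> Y from (feature_vector, best_granularity) pairs.

    Stand-in model for illustration only: the source does not name the
    learning algorithm, so 1-nearest-neighbor is an assumption.
    """
    training = [(tuple(x), y) for x, y in samples]
    if not training:
        raise ValueError("at least one training sample is required")

    def predict(x):
        # Return the granularity of the closest training vector (Euclidean distance).
        _, best_y = min(training, key=lambda pair: math.dist(pair[0], x))
        return best_y

    return predict


# Hypothetical training data:
# (rows, mean nonzeros per row, std of nonzeros per row) -> best granularity.
training_set = [
    ((1000.0, 5.0, 0.5), 64),
    ((1000.0, 5.0, 40.0), 4),
    ((50.0, 200.0, 10.0), 1),
]

f = fit_nearest_neighbor(training_set)
print(f((900.0, 6.0, 1.0)))  # prints 64: the first sample is the nearest neighbor
```

In practice the features would likely need scaling before any distance is computed, since raw row counts dominate the other dimensions here.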

Embodiment 2

[0053] Embodiment 2: an automatic tuning device for sparse matrix vector multiplication parallel task granularity parameters.

[0054] Figure 3 is a module diagram of the automatic tuning device for sparse matrix-vector multiplication task granularity parameters of the present invention. The device includes a prediction model construction module, a statistical feature value acquisition module, an optimal task granularity parameter prediction module, and a configuration module.

[0055] The prediction model construction module constructs a prediction model using machine learning methods. This module constructs a prediction model $f: X \to Y$ between the statistical feature value space X and the space Y of optimal parallel task granularity values, where $x = (x_1, x_2, \ldots, x_i, \ldots, x_n)$ denotes the n-dimensional statistical feature vector of the sparse matrix, $x_i$ denotes a statistical feature value, and $y$ denotes the task granularity. In the statistical feature val...

Abstract

The invention belongs to the field of parallel computing and discloses an automatic tuning method and device for the parallel task granularity parameter of sparse matrix-vector multiplication (SpMV). The method comprises: a prediction model construction step, in which a prediction model is constructed by a machine learning method; a statistical feature value acquisition step, in which the original data file of the matrix is analyzed to obtain the statistical feature values of the matrix; an optimal task granularity parameter prediction step, in which the obtained statistical feature values are input into the prediction model to predict the optimal parallel task granularity parameter value of the SpMV program for that input matrix; and a configuration step, in which the system task granularity used during parallel execution is adjusted according to the prediction result. The device comprises a prediction model construction module, a statistical feature value acquisition module, an optimal task granularity parameter prediction module, and a configuration module. The method adaptively selects the parallel task granularity of SpMV for different input matrices, thereby improving the load balance and the overall computing performance of the parallel program.
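The statistical feature value acquisition step can be pictured with the following minimal sketch. The text available here does not list which features the method uses, so the per-row nonzero statistics below are assumptions, as are the function name `row_nnz_features` and the choice to represent the matrix as a list of (row, column, value) triples rather than parsing a specific file format.

```python
import statistics


def row_nnz_features(n_rows, entries):
    """Compute simple per-row nonzero statistics for a sparse matrix.

    `entries` is a list of (row, col, value) triples with 0-based indices.
    The feature set is hypothetical; the source does not enumerate it.
    """
    counts = [0] * n_rows
    for row, _col, _value in entries:
        counts[row] += 1
    return {
        "n_rows": n_rows,
        "nnz": len(entries),
        "mean_nnz_per_row": statistics.fmean(counts),
        "max_nnz_per_row": max(counts),
        "std_nnz_per_row": statistics.pstdev(counts),
    }


# A 4x4 example with 6 nonzeros; row 2 is empty.
entries = [
    (0, 0, 1.0), (0, 2, 2.0), (0, 3, 3.0),
    (1, 1, 4.0),
    (3, 0, 5.0), (3, 3, 6.0),
]
features = row_nnz_features(4, entries)
print(features)
```

A large spread in nonzeros per row is the kind of signal that suggests equal-sized row blocks would carry unequal work.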

Description

Technical field

[0001] The invention relates to parallel program task allocation technology, and in particular to an automatic tuning method and device for the task granularity parameter of a sparse matrix-vector multiplication parallel program.

Background

[0002] In scientific computing and artificial intelligence, Sparse Matrix-Vector Multiplication (SpMV) is widely used as a basic operator, and the module that implements it is one of the most time-consuming modules in software in these fields. Unlike dense matrices, sparse matrices have only a small number of nonzero elements; most of their elements are zero. These zero elements do not affect the result of the operation, but accessing them and operating on them adds overhead and lowers efficiency.

[0003] To this end, researchers use the sparseness of the matrix to store only the non-zero elements in the sparse matrix in ...
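To make the link between task granularity and load balance concrete, here is a small sketch of SpMV over a compressed sparse row (CSR) layout, with rows grouped into fixed-size tasks. Both the CSR layout and the definition of granularity as "rows per task" are assumptions for illustration, since the surviving text is cut off before it names a storage format; the function names are hypothetical, and the tasks run sequentially here rather than on threads.

```python
def spmv_csr_rows(row_ptr, col_idx, values, x, start, stop):
    """Multiply rows [start, stop) of a CSR matrix by the dense vector x."""
    out = []
    for r in range(start, stop):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        out.append(acc)
    return out


def split_into_tasks(n_rows, granularity):
    """Cut the row range into consecutive tasks of `granularity` rows each."""
    return [(s, min(s + granularity, n_rows)) for s in range(0, n_rows, granularity)]


def spmv_chunked(row_ptr, col_idx, values, x, granularity):
    """Run SpMV task by task; also report the nonzeros handled by each task."""
    n_rows = len(row_ptr) - 1
    y, work_per_task = [], []
    for start, stop in split_into_tasks(n_rows, granularity):
        y.extend(spmv_csr_rows(row_ptr, col_idx, values, x, start, stop))
        work_per_task.append(row_ptr[stop] - row_ptr[start])
    return y, work_per_task


# CSR form of [[1,0,2,3],[0,4,0,0],[0,0,0,0],[5,0,0,6]]
row_ptr = [0, 3, 4, 4, 6]
col_idx = [0, 2, 3, 1, 0, 3]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 1.0, 1.0, 1.0]

y_fine, work_fine = spmv_chunked(row_ptr, col_idx, values, x, granularity=1)
y_coarse, work_coarse = spmv_chunked(row_ptr, col_idx, values, x, granularity=2)
print(y_coarse, work_fine, work_coarse)
```

The result vector is the same for every granularity, but the work per task is not: with one row per task the tasks carry 3, 1, 0, and 2 nonzeros, while with two rows per task they carry 4 and 2. That difference is what a tuned granularity parameter is meant to manage.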

Application Information

IPC(8): G06F9/50, G06F9/48, G06F17/16, G06N20/00, G06N20/10
CPC: G06F9/5083, G06F9/4806, G06F17/16, G06N20/00, G06N20/10, Y02D10/00
Inventor 方建滨, 黄春, 唐滔, 彭林, 张鹏, 范小康, 崔英博
Owner NAT UNIV OF DEFENSE TECH