Matrix Multiplication Acceleration Method for Heterogeneous Fusion Architecture

A matrix multiplication acceleration technique for heterogeneous fusion architectures. It addresses the problem that conventional matrix multiplication implementations struggle to meet the performance requirements of many-core accelerators.

Active Publication Date: 2020-05-22
NAT UNIV OF DEFENSE TECH
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0007] Because many-core accelerators differ in design goals and instruction set architectures, traditional matrix multiplication implementations written for general-purpose host processors struggle to meet the performance requirements of many-core accelerators designed for specific applications. Matrix multiplication must therefore be accelerated with the target many-core accelerator architecture in mind, so as to raise its operation speed and meet the design goals of the heterogeneous system as fully as possible.

Embodiment Construction

[0167] Figure 1 is an overall flow chart of the matrix multiplication acceleration method for general multi-core DSP of the present invention.

[0168] The steps of the present invention are as follows:

[0169] The first step is to design the block matrix multiplication versions for the heterogeneous fusion architecture, obtaining six versions: v_cpu, v_gpu, v_mic, v_target, v_coi and v_scif. The specific steps are as follows:

[0170] 1.1 Configure and initialize the heterogeneous fusion architecture. The specific method is:

[0171] Define matrix A with dimensions M×K and matrix B with dimensions K×N; the matrix C obtained by multiplying A and B has dimensions M×N, where M, K, and N are all positive integers. The element in row p and column q of A is a_pq, 0≤p≤M-1, 0≤q≤K-1; the element in row q and column t of B is b_qt, 0≤t≤N-1;

[0172] 1.2 If the heterogeneous fusion architecture consists only of CPUs, initialize the CPU (using MPI progr...
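The matrix definitions in [0171] can be illustrated with a minimal block (tiled) matrix multiplication sketch. It is not the patented implementation: the function name `blocked_matmul`, the `block` tile-size parameter, and the use of nested Python lists are assumptions made only for illustration. The indices p, q, and t follow the element notation a_pq and b_qt above.

```python
def blocked_matmul(A, B, block=2):
    """Multiply A (M x K) by B (K x N) tile by tile and return C (M x N).

    Hypothetical illustration only: `block` is an assumed tile size,
    not a parameter named in the source text.
    """
    M, K = len(A), len(A[0])
    if len(B) != K:
        raise ValueError("inner dimensions of A and B must match")
    N = len(B[0])
    C = [[0.0] * N for _ in range(M)]

    # Walk over tiles of C (p0, t0) and over tiles of the shared dimension (q0).
    for p0 in range(0, M, block):
        for t0 in range(0, N, block):
            for q0 in range(0, K, block):
                # Accumulate the contribution of one tile pair into C.
                for p in range(p0, min(p0 + block, M)):
                    for t in range(t0, min(t0 + block, N)):
                        acc = C[p][t]
                        for q in range(q0, min(q0 + block, K)):
                            acc += A[p][q] * B[q][t]  # c_pt += a_pq * b_qt
                        C[p][t] = acc
    return C
```

Tiling leaves the result unchanged and only alters the order in which the products a_pq * b_qt are accumulated, which is what lets each tile be assigned to a different core or accelerator in a real implementation.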

Abstract

The invention discloses a matrix multiplication acceleration method for heterogeneous fusion architectures. It aims to provide a general matrix multiplication acceleration method that works across different many-core accelerator target architectures and improves the utilization efficiency of heterogeneous systems. In the technical scheme, block matrix multiplication versions for the heterogeneous fusion architecture are designed first, namely v_cpu, v_gpu, v_mic, v_scif, v_coi and v_target; these versions are then integrated and packaged into a heterogeneous fusion library file, HU-xgemm; finally, HU-xgemm is used to adapt to the accelerators in the heterogeneous fusion architecture. The invention adapts to different target accelerators and processors: matrix multiplication is carried out adaptively for different heterogeneous fusion architectures, according to the topology of the CPUs or accelerators in each architecture, and with FMA parallel computation, which increases matrix multiplication speed and improves the utilization efficiency of the heterogeneous system.
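The idea of packaging several versions behind one adaptive entry point can be sketched as a small dispatcher. Only the six version names and the library name HU-xgemm come from the abstract. The device labels, the `DEVICE_TO_VERSION` mapping, and the functions `select_version` and `hu_xgemm` are hypothetical, and every version is bound to the same plain reference routine as a stand-in for a tuned kernel.

```python
# Version names taken from the abstract; everything else here is hypothetical.
VERSIONS = ("v_cpu", "v_gpu", "v_mic", "v_scif", "v_coi", "v_target")


def reference_gemm(A, B):
    """Plain triple-loop product, used as a stand-in for every kernel."""
    K = len(B)
    if any(len(row) != K for row in A):
        raise ValueError("inner dimensions of A and B must match")
    N = len(B[0])
    return [[sum(row[q] * B[q][t] for q in range(K)) for t in range(N)]
            for row in A]


# Placeholder registry: a real library would bind a tuned kernel per version.
KERNELS = {name: reference_gemm for name in VERSIONS}

# Assumed mapping from a detected device label to a version name.
DEVICE_TO_VERSION = {"gpu": "v_gpu", "mic": "v_mic"}


def select_version(devices):
    """Return the version for the first recognised accelerator, else v_cpu."""
    for device in devices:
        if device in DEVICE_TO_VERSION:
            return DEVICE_TO_VERSION[device]
    return "v_cpu"


def hu_xgemm(A, B, devices=()):
    """Pick a version for the given devices and run its kernel."""
    version = select_version(devices)
    return version, KERNELS[version](A, B)
```

Under these assumptions, calling `hu_xgemm(A, B, ["gpu"])` would route to the `v_gpu` entry, while an empty device list falls back to `v_cpu`. How the actual library detects devices and chooses among `v_scif`, `v_coi`, and `v_target` is not stated in this excerpt.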

Description

Technical field

[0001] The invention relates to a matrix multiplication acceleration method, in particular to a matrix multiplication acceleration method for heterogeneous fusion architectures in heterogeneous systems.

Background

[0002] With the continuous improvement of the computing performance of general-purpose accelerators and their wide application, many-core accelerators are set to become an important development direction of high-performance computing, and accelerators such as GPU, MIC (Xeon Phi), and Matrix2000 have emerged to meet the needs of various fields. As heterogeneous systems have become widely used, many different types of heterogeneous architectures have appeared, such as CPU+GPU, CPU+MIC, and CPU+Matrix2000.

[0003] The design goals and design principles of an accelerator determine its specificity and limitations. Different accelerator manufacturers have developed programming models that adapt to them, suc...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/16, G06F7/523
Inventor: 甘新标, 曾瑞庚, 杨志辉, 孙泽文, 吴涛, 刘杰, 龚春叶, 李胜国, 杨博, 徐涵, 晏益慧
Owner: NAT UNIV OF DEFENSE TECH