
Matrix multiplication accelerating method for CPU+DSP (Central Processing Unit + Digital Signal Processor) heterogeneous system

A heterogeneous-system matrix multiplication technology, applied in the field of matrix multiplication acceleration, which addresses the problem that existing heterogeneous-system matrix multiplication methods cannot meet the design goals of a CPU+DSP heterogeneous computing system.

Active Publication Date: 2015-01-28
Owner: NAT UNIV OF DEFENSE TECH
Cites: 3 | Cited by: 17

AI Technical Summary

Problems solved by technology

However, this data division and transmission method and control strategy are only applicable to NVIDIA's unified-architecture GPU platform.
[0006] In summary, traditional heterogeneous-system matrix multiplication methods cannot meet the design goals of a CPU+DSP heterogeneous computing system.

Method used




Embodiment Construction

[0077] Figure 1 is a schematic diagram of the architecture of a heterogeneous computing system composed of a main-processor CPU and an accelerator DSP communicating over PCIE. The main-processor end has memory and a cache, and the accelerator end has a global storage space and an array memory; communication and data transmission between the main processor and the accelerator can only be carried out through the PCIE bus.
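As a rough illustration of this host-side view of the system, the C sketch below collects the storage resources named above into a configuration record of the kind the initialization step could fill in. All identifiers and the example capacities are assumptions for illustration, not names or values taken from the patent.

```c
/* Host-side sketch of the CPU+DSP system described above (PCIe-attached
 * accelerator).  All identifiers and values are illustrative assumptions. */
#include <stddef.h>
#include <stdio.h>

typedef struct {
    size_t host_mem_mib;        /* main-processor (CPU) memory, in MiB      */
    size_t host_cache_kib;      /* CPU cache, in KiB                        */
    size_t dsp_global_mem_mib;  /* accelerator global storage space, in MiB */
    size_t dsp_array_mem_kib;   /* accelerator on-chip array memory, in KiB */
    int    pcie_lanes;          /* width of the only host<->DSP link        */
} hetero_sys_t;

int main(void) {
    /* Example configuration; real values would come from the parameter
     * initialization / information-configuration step. */
    hetero_sys_t sys = {
        .host_mem_mib       = 32768,  /* 32 GiB host DRAM         */
        .host_cache_kib     = 20480,  /* 20 MiB last-level cache  */
        .dsp_global_mem_mib = 4096,   /* 4 GiB DSP global memory  */
        .dsp_array_mem_kib  = 2048,   /* 2 MiB DSP array memory   */
        .pcie_lanes         = 16,
    };
    printf("DSP global memory: %zu MiB over a x%d PCIe link\n",
           sys.dsp_global_mem_mib, sys.pcie_lanes);
    return 0;
}
```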

[0078] Figure 2 is a schematic diagram of the division and merging of the matrix data. In the figure, the M*K matrix A is divided into m*K block matrices (row vectors) A_i (i = 1, ..., ⌈M/m⌉), and the K*N matrix B is divided into K*n block matrices (column vectors) B_j (j = 1, ..., ⌈N/n⌉). Each B_j is further divided into two submatrices, 1B_j and 2B_j, where the computation involving 1B_j is completed on the CPU side while, at the same time, the computation involving 2B_j is completed on the DSP side.
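As a rough illustration of this partitioning, the short C program below computes the block counts ⌈M/m⌉ and ⌈N/n⌉ and splits each B_j column block into a CPU share (1B_j) and a DSP share (2B_j). The matrix sizes, block sizes, and split ratio r are assumed example values; the patent derives the actual split from the design target and the measured performance difference between the CPU and the DSP.

```c
/* Sketch of the data partitioning described above (illustrative values).
 * A (M x K) is cut into ceil(M/m) row blocks A_i of size m x K;
 * B (K x N) is cut into ceil(N/n) column blocks B_j of size K x n;
 * each B_j is then split column-wise into 1B_j (CPU) and 2B_j (DSP). */
#include <stdio.h>

static int ceil_div(int a, int b) { return (a + b - 1) / b; }

int main(void) {
    int M = 8192, K = 4096, N = 8192;   /* whole-matrix sizes (example)     */
    int m = 1024, n = 1024;             /* block sizes (example)            */
    double r = 0.2;                     /* assumed fraction of each B_j
                                           handled by the CPU               */

    int row_blocks = ceil_div(M, m);    /* number of A_i blocks             */
    int col_blocks = ceil_div(N, n);    /* number of B_j blocks             */
    int n_cpu = (int)(n * r);           /* columns of B_j computed on CPU   */
    int n_dsp = n - n_cpu;              /* columns of B_j computed on DSP   */

    (void)K;  /* K only sets the inner dimension of every block product */
    printf("A_i blocks: %d, B_j blocks: %d, C blocks: %d\n",
           row_blocks, col_blocks, row_blocks * col_blocks);
    printf("each B_j: %d columns on CPU (1B_j), %d on DSP (2B_j)\n",
           n_cpu, n_dsp);
    return 0;
}
```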

[0079] The concrete implementation steps of the present invention are as follows:

[0080] The fi...



Abstract

The invention discloses a matrix multiplication accelerating method for a CPU+DSP (Central Processing Unit + Digital Signal Processor) heterogeneous system. It aims to provide an efficient cooperative matrix multiplication accelerating method for the CPU+DSP heterogeneous system, so as to increase the speed of matrix multiplication and maximize the computing efficiency of the CPU+DSP heterogeneous system. According to the technical scheme, the method comprises the following steps: firstly, initializing parameters and performing information configuration on the CPU+DSP heterogeneous system; secondly, partitioning the to-be-processed data allocated to a computing node between the CPU and the DSP for cooperative processing, according to the design target and the difference in computing performance between the main-processor CPU and the accelerator DSP; thirdly, having the CPU and the DSP concurrently perform data transmission and cooperative computation to obtain ⌈M/m⌉ * ⌈N/n⌉ block matrices C(i-1)(j-1); finally, merging the block matrices C(i-1)(j-1) to form the M*N result matrix C. With this method, the CPU, while in charge of data transmission and program control, actively cooperates with the DSP to complete the matrix multiplication computation; moreover, data transmission and cooperative computation are overlapped, so the matrix multiplication speed of the CPU+DSP heterogeneous system is increased and the utilization rate of its computing resources is improved.
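A minimal control-flow sketch of these four steps is shown below. Every function is an empty stub with an assumed name standing in for host/DSP runtime calls; none of the identifiers come from the patent, and the pipelining that actually overlaps transfers with computation is only noted in a comment.

```c
/* Control-flow sketch of the abstract's four steps (assumed identifiers). */

static void send_block_to_dsp(int i, int j)    { (void)i; (void)j; }  /* PCIe host->DSP transfer of A_i and 2B_j   */
static void dsp_gemm(int i, int j)             { (void)i; (void)j; }  /* DSP computes its share, A_i * 2B_j        */
static void cpu_gemm(int i, int j)             { (void)i; (void)j; }  /* CPU computes its share, A_i * 1B_j        */
static void fetch_block_from_dsp(int i, int j) { (void)i; (void)j; }  /* PCIe DSP->host transfer of the DSP result */
static void merge_block_into_c(int i, int j)   { (void)i; (void)j; }  /* place block C(i-1)(j-1) into the M*N C    */

void hetero_matmul(int row_blocks, int col_blocks)
{
    /* One pass per (A_i, B_j) pair.  The abstract states that data
     * transmission and cooperative computation are overlapped; the
     * pipelining needed for that (e.g. issuing the next pair's transfer
     * while the current pair computes) is omitted from this sketch. */
    for (int i = 0; i < row_blocks; i++) {
        for (int j = 0; j < col_blocks; j++) {
            send_block_to_dsp(i, j);
            dsp_gemm(i, j);
            cpu_gemm(i, j);
            fetch_block_from_dsp(i, j);
            merge_block_into_c(i, j);
        }
    }
}
```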

Description

Technical field

[0001] The invention relates to a matrix multiplication acceleration method, in particular to a matrix multiplication acceleration method oriented to CPU+DSP heterogeneous computing systems.

Background technique

[0002] A heterogeneous computing system is a computer system built from processors with two different architectures: a main processor and an accelerator. At present, common heterogeneous computing systems are composed of CPU+GPU or CPU+MIC. With the continuous improvement of general-purpose DSP computing performance and the wide application of general-purpose DSPs, CPU+DSP will become an important development direction for heterogeneous computing systems.

[0003] Matrix multiplication is the most commonly used type of operation in numerical calculations, and many applications include a matrix multiplication step. Designing efficient matrix multiplication methods for CPU+DSP heterogeneous systems can effectively improve the calculation sp...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F15/16; G06F17/16
Inventors: 刘杰, 迟利华, 甘新标, 晏益慧, 徐涵, 胡庆丰, 蒋杰, 李胜国, 王庆林, 皇甫永硕, 崔显涛, 周陈
Owner: NAT UNIV OF DEFENSE TECH