
DMA broadcast data transmission method based on host count in GPDSP

A broadcast data transmission technology, applied in digital transmission systems, electrical digital data processing, transmission systems, etc. It addresses reduced data transmission speed, an increased number of DDR page changes, and increased memory access delay, with the effects of reducing power consumption and startup overhead, improving read efficiency, and reducing the number of page changes.

Active Publication Date: 2018-06-29
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

Because DMA overlaps the core's computation with data movement between storage units, it reduces, to some extent, the impact of the transfer speed between internal and external storage on GPDSP processing performance.
However, as the number of processing cores integrated in a GPDSP keeps increasing, existing DMA data transmission methods can no longer meet the data volume requirements of multi-core parallel processing. Efficient multi-core DMA depends on both the memory access requirements of applications and the hardware structural characteristics of the multi-core GPDSP.
[0005] When common algorithms and applications such as matrix multiplication, fast Fourier transform, and HPL (High Performance Linpack) run in parallel on a multi-core GPDSP, all cores may access the same storage space for a period of time. For example, in the GEMM matrix multiplication (C += AB), A is a shared matrix that every DSP core needs. With the traditional DMA transmission method, each DSP core initiates a point-to-point transfer to read the data block at the same location in DDR. Because each core is at a different distance from the DDR, the data being read by different cores may lie on different DDR pages, which causes DDR page misses, increases the number of DDR page changes and the memory access delay, and greatly reduces DDR read efficiency. If several or all cores start DMA transfer transactions, this not only consumes a lot of power but also puts pressure on the network, and accesses to the off-core DDR storage space suffer from contention, page misses, and so on.
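The page-change effect described in [0005] can be illustrated with a toy model. The sketch below is not taken from the patent: it assumes a single-bank, open-page DDR, a fixed page size, round-robin servicing of the cores' requests, and a hypothetical `skew_blocks` parameter standing in for the different core-to-DDR distances. All function names and numbers are illustrative.

```python
def count_page_switches(addresses, page_size):
    """Count row activations in a single-bank, open-page DDR model (assumed)."""
    switches = 0
    open_page = None
    for addr in addresses:
        page = addr // page_size
        if page != open_page:  # page miss: the open row must be changed
            switches += 1
            open_page = page
    return switches


def point_to_point_stream(num_cores, num_blocks, block_size, skew_blocks):
    """Every core reads the same blocks itself; core k lags by k * skew_blocks.

    Requests from the active cores are interleaved round-robin, which is an
    assumption about how the DDR controller sees them.
    """
    stream = []
    total_steps = num_blocks + (num_cores - 1) * skew_blocks
    for step in range(total_steps):
        for core in range(num_cores):
            block = step - core * skew_blocks
            if 0 <= block < num_blocks:
                stream.append(block * block_size)
    return stream


def broadcast_stream(num_blocks, block_size):
    """One host request stream reads each block once for all cores."""
    return [block * block_size for block in range(num_blocks)]


if __name__ == "__main__":
    page_size = 1024  # bytes per DDR page (illustrative value)
    p2p = point_to_point_stream(num_cores=4, num_blocks=64,
                                block_size=64, skew_blocks=16)
    bc = broadcast_stream(num_blocks=64, block_size=64)
    print("point-to-point:", len(p2p), "reads,",
          count_page_switches(p2p, page_size), "page switches")
    print("broadcast:", len(bc), "reads,",
          count_page_switches(bc, page_size), "page switches")
```

In this model the point-to-point case issues one read per core per block and keeps bouncing between pages, while the broadcast case reads each block once and walks the pages in order. Real DDR controllers have multiple banks and request reordering, so the absolute numbers mean little; only the direction of the comparison is the point.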



Embodiment Construction

[0033] The present invention will be further described below in conjunction with the accompanying drawings and specific preferred embodiments, but the protection scope of the present invention is not limited thereby.

[0034] As shown in Figures 1 to 5, the DMA broadcast data transmission method based on host counting in the GPDSP of this embodiment includes: the host DMA starts a DMA broadcast data transmission, generates a broadcast read request, and sends it to an external storage space through the on-chip network; in response to the request, the read return data is sent to the on-chip network; each core in the GPDSP receives the read return data from the on-chip network and writes it into its in-core storage space; and the host DMA receives the read return data and counts it to confirm whether the data transmission is complete. That is, the DSP core that initiates the DMA transfer transaction is responsible for generating the read request as the host in the broadcast transmis...
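The flow in [0034] can be sketched as a small software simulation. This is only an illustration under stated assumptions: the class and function names (`Core`, `HostDMA`, `broadcast_read`) are hypothetical, the on-chip network is reduced to a loop that delivers each returned packet to every core, and the external storage space is a plain Python list.

```python
class Core:
    """A DSP core with its own in-core storage (modelled as a dict)."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.local_mem = {}

    def receive(self, offset, data):
        # Write read return data taken from the on-chip network into local storage.
        self.local_mem[offset] = data


class HostDMA:
    """The DMA of the initiating core; it counts returns to detect completion."""

    def __init__(self, expected_packets):
        self.expected = expected_packets
        self.received = 0

    def on_return(self):
        self.received += 1
        return self.received == self.expected


def broadcast_read(external_mem, base, num_packets, cores):
    """Simulate one broadcast transfer started by a single host request."""
    host = HostDMA(num_packets)
    requests_issued = 1  # only the host generates a read request
    done = False
    for i in range(num_packets):
        data = external_mem[base + i]   # external storage answers the request
        for core in cores:              # the network delivers data to every core
            core.receive(i, data)
        done = host.on_return()         # the host counts each return packet
    return requests_issued, done


if __name__ == "__main__":
    memory = list(range(100, 132))
    cores = [Core(i) for i in range(4)]
    issued, done = broadcast_read(memory, base=8, num_packets=16, cores=cores)
    print("requests issued:", issued, "transfer complete:", done)
```

The design point the sketch tries to capture is that only one core issues a request, while completion is decided in one place by a simple counter rather than by every core tracking the transfer independently.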


Abstract

The invention discloses a DMA broadcast data transmission method based on host count in GPDSP. The method comprises the following steps: a host DMA starts DMA broadcast data transmission and generates a broadcast read request, which is sent off-core through an on-chip network; the host DMA receives read return data from each slave DMA and counts it to confirm whether the data transmission is finished; when the host DMA confirms that the data transmission is finished, it sends a cache clearing command to all slave DMAs, and each slave DMA ends the broadcast transmission after receiving the cache clearing command and executing the cache clearing operation. DMA broadcast data transmission is thus achieved by starting a single DMA transfer transaction. The method is simple in principle and low in cost, has low DMA transmission power consumption and startup overhead, achieves high data transmission efficiency and DDR read efficiency, and provides large transmission bandwidth.
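The termination handshake in the abstract (count, then cache clearing command, then end of broadcast) can be sketched as a small state machine. All names here are hypothetical, and two things are assumptions rather than statements from the patent: the slave's "cache" is modelled as a staging buffer for return data, and the path by which return data reaches the host is abstracted away.

```python
from enum import Enum, auto


class SlaveState(Enum):
    IDLE = auto()
    RECEIVING = auto()
    FINISHED = auto()


class SlaveDMA:
    """Slave side: accept return data, then end on the cache clearing command."""

    def __init__(self):
        self.state = SlaveState.IDLE
        self.cache = []      # assumed staging buffer; the abstract gives no details
        self.local_mem = []

    def on_return_data(self, data):
        self.state = SlaveState.RECEIVING
        self.cache.append(data)
        self.local_mem.append(data)

    def on_cache_clear(self):
        self.cache.clear()                 # execute the cache clearing operation
        self.state = SlaveState.FINISHED   # broadcast transmission ends here


class HostDMA:
    """Host side: count return data and notify all slaves when the count is met."""

    def __init__(self, expected, slaves):
        self.expected = expected
        self.slaves = slaves
        self.count = 0

    def on_return_data(self):
        self.count += 1
        if self.count == self.expected:
            for slave in self.slaves:
                slave.on_cache_clear()     # cache clearing command to every slave
            return True
        return False


if __name__ == "__main__":
    slaves = [SlaveDMA() for _ in range(3)]
    host = HostDMA(expected=4, slaves=slaves)
    for word in [10, 11, 12, 13]:
        for slave in slaves:
            slave.on_return_data(word)
        finished = host.on_return_data()
    print("finished:", finished, [s.state.name for s in slaves])
```

Keeping the completion decision in the host means the slaves need no counters of their own; they only react to data and to the final command.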

Description

Technical field

[0001] The invention relates to the technical field of GPDSP (General Purpose Digital Signal Processor), and in particular to a DMA (Direct Memory Access) broadcast data transmission method based on host counting in a GPDSP.

Background

[0002] GPDSP is a new architecture that retains the basic characteristics of embedded DSPs and their advantages of high performance and low power consumption, while also efficiently supporting general scientific computing. It efficiently supports 64-bit high-performance computers and embedded high-precision signal processing. The structure has the following characteristics: ① it directly represents double-precision floating-point and 64-bit fixed-point data; general-purpose registers, data buses, and instruction bits are more than 64 bits wide, and the address bus is more than 40 bits; ② CPU and DSP heterogeneous multi-core are tightly coupled, CPU ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L12/18, G06F13/28, G06F13/40
CPC: G06F13/287, G06F13/4068, G06F2213/28, H04L12/18, H04L12/1863, Y02D10/00
Inventor: 马胜, 雷元武, 张美迪, 万江华, 陈胜刚, 李勇, 彭元喜, 孙书为
Owner: NAT UNIV OF DEFENSE TECH