A method for data distribution in CPU-FPGA heterogeneous multi-core system

A data distribution technique for CPU-FPGA heterogeneous multi-core systems that addresses the suboptimal system performance caused by manually chosen data allocations.

Active Publication Date: 2019-03-15
SHANDONG UNIV

AI Technical Summary

Problems solved by technology

Due to the complexity of the design space, manual data allocations made by the system programmer can lead to suboptimal system performance.



Examples


Embodiment 1

[0093] To understand the memory-level performance of the CPU-FPGA HMPSoC, a memory access latency model is constructed.

[0094] To measure the access latency of each level of the memory hierarchy on a Xilinx Zynq-7020 SoC at a clock frequency of 100 MHz, a microbenchmark FPGA kernel was designed to perform millions of memory accesses of a specific type. For example, a kernel that repeatedly accesses a huge matrix in column order via the ACP can be used to measure the average ACP read-miss (i.e., L2 cache read-miss) latency.

[0095] Sizing the matrix according to the size of the given L2 cache ensures that all column-ordered accesses result in a cache miss.
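The sizing argument in [0095] can be checked with a small simulation. The following is a minimal sketch, assuming a set-associative cache with LRU replacement, a row-major matrix of 4-byte elements, and a scaled-down, hypothetical cache geometry (4 KiB, 32-byte lines, 4 ways). None of these parameters, nor the function names, come from the patent text; they are chosen only to illustrate why a large enough matrix makes every column-order access miss.

```python
from collections import OrderedDict

def count_misses(addresses, cache_bytes, line_bytes, ways):
    """Count misses of a set-associative LRU cache over a byte-address stream."""
    num_sets = cache_bytes // (line_bytes * ways)
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        cache_set = sets[line % num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1
            cache_set[line] = True
            if len(cache_set) > ways:
                cache_set.popitem(last=False)  # evict the least recently used line
    return misses

def column_order_addresses(rows, cols, elem_bytes):
    """Byte addresses of a row-major matrix visited column by column."""
    for j in range(cols):
        for i in range(rows):
            yield (i * cols + j) * elem_bytes

# Hypothetical, scaled-down geometry (assumption, not the Zynq-7020 L2).
CACHE_BYTES, LINE_BYTES, WAYS = 4096, 32, 4

# 256 rows x 8 columns: each column walk touches 256 lines, twice the cache
# capacity of 128 lines, so lines are evicted before they can be reused.
large = count_misses(column_order_addresses(256, 8, 4), CACHE_BYTES, LINE_BYTES, WAYS)

# 64 rows x 8 columns: the whole matrix fits, so only the first column misses.
small = count_misses(column_order_addresses(64, 8, 4), CACHE_BYTES, LINE_BYTES, WAYS)
```

With these assumed parameters, every access to the larger matrix misses, while the smaller matrix misses only on its first column. This is the effect the microbenchmark relies on to isolate read-miss latency.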

[0096] The experimental results are shown in Table 1:

[0097] Table 1

[0098]

[0099] These parameters are used in the formulas of step (3), for example:

[0100] In the definition of the execution time variable of the basic block:

[0101]

[0102] If a node represents a memory load instruction for array a_i...

Embodiment 2

[0118] The purpose of this example is to show that the allocation of data has a significant impact on the system performance of the Zedboard platform based on the Zynq-7020 SoC.

[0119] The experiment uses general matrix multiplication (GEMM) from PolyBench as the motivating example:

[0120] Algorithm 1 Matrix Multiplication Algorithm

[0121]
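The listing for Algorithm 1 is not reproduced in this text. As a rough sketch only, the GEMM kernel in PolyBench computes C = alpha·A·B + beta·C. The Python rendering below, including the loop order and the function name `gemm`, is an assumption made for illustration; the kernel evaluated in the patent would be C/C++ code passed through an HLS tool.

```python
def gemm(alpha, beta, A, B, C):
    """Update C in place to alpha * A @ B + beta * C (nested lists of numbers)."""
    ni, nk, nj = len(A), len(B), len(B[0])
    for i in range(ni):
        # Scale the existing row of C by beta.
        for j in range(nj):
            C[i][j] *= beta
        # Accumulate alpha * A[i][k] * B[k][j] into C[i][j].
        for k in range(nk):
            for j in range(nj):
                C[i][j] += alpha * A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 1], [1, 1]]
result = gemm(2, 3, A, B, C)
```

In this form, every inner-loop iteration reads one element each of A and B and both reads and writes an element of C, which is why the placement of the three arrays can affect performance differently.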

[0122] We assume that at most 1 of the 3 arrays A, B, and C can fit in the on-chip BRAMs.

[0123] The experimental results are shown in Figure 4, where A_a B_b C_h denotes the allocation scheme in which arrays A, B, and C are allocated to the ACP, BRAM, and HP ports, respectively.
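To make the notation concrete, the candidate schemes can be enumerated directly. The sketch below writes the subscripts a, b, and h as suffixes (for example `A_a B_b C_h`) and applies the constraint from [0122] that at most one array fits in BRAM. The function name and the string format are illustrative assumptions, not part of the patent.

```python
from itertools import product

# Subscript letter -> memory interface, following the notation in [0123].
PORTS = {"a": "ACP", "b": "BRAM", "h": "HP"}

def candidate_allocations(arrays=("A", "B", "C"), max_in_bram=1):
    """List every allocation scheme with at most `max_in_bram` arrays in BRAM."""
    schemes = []
    for combo in product(PORTS, repeat=len(arrays)):
        if combo.count("b") <= max_in_bram:
            schemes.append(" ".join(f"{name}_{port}" for name, port in zip(arrays, combo)))
    return schemes

schemes = candidate_allocations()
```

Of the 27 unconstrained assignments, 20 satisfy the BRAM constraint: 8 that use no BRAM and 12 that place exactly one array there. Even for three arrays the design space is large enough that hand-picking an allocation is unreliable.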

[0124] The results show that the best allocation (A_b C_a B_h) is 3.14 times faster than the worst allocation (A_h C_h B_b).

[0125] In addition, it can be seen that some results may run counter to the traditional CPU-based SPM allocation scheme.

[0126] (1) Compared with the other two arrays, array C has significantly higher read and...


Abstract

A method for data distribution in a CPU-FPGA heterogeneous multi-core system is disclosed, comprising the following steps. Source code is compiled by the Clang front end into the intermediate code of the Low Level Virtual Machine (LLVM). The LLVM intermediate code is executed by LLVM on the input data to obtain a data access trace and an instruction trace. A dynamic data dependency graph (DDDG) is generated from the instruction trace to represent the control flow and data flow of the FPGA kernel. The data access trace is fed to a cache simulator to obtain a cache conflict graph (CCG). An integer linear programming (ILP) formulation is constructed from the DDDG and the CCG and solved to obtain the optimal data allocation scheme.
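The step that turns the data access trajectory (trace) into a cache conflict graph can be illustrated with a toy simulator. This excerpt does not give the patent's exact CCG definition, so the sketch below makes its own assumptions: a direct-mapped cache, a trace of `(array_name, byte_address)` pairs, and an edge `(X, Y)` whose weight counts how often an access to array X evicted a cached line of array Y. All names are hypothetical.

```python
from collections import Counter

def cache_conflict_graph(trace, num_lines, line_bytes):
    """Build weighted conflict edges from a trace on a direct-mapped cache.

    trace: iterable of (array_name, byte_address) pairs.
    Returns a Counter mapping (evicting_array, evicted_array) -> eviction count.
    """
    resident = [None] * num_lines  # per cache line: (owner array, memory block)
    edges = Counter()
    for name, addr in trace:
        block = addr // line_bytes
        index = block % num_lines
        entry = resident[index]
        if entry is not None and entry[1] == block:
            continue                      # hit: nothing is evicted
        if entry is not None and entry[0] != name:
            edges[(name, entry[0])] += 1  # this access evicts another array's line
        resident[index] = (name, block)
    return edges

# Arrays A (at address 0) and B (at address 64) map to the same line of a
# 4-line, 16-byte-per-line cache, so alternating accesses keep evicting each other.
conflicts = cache_conflict_graph(
    [("A", 0), ("B", 64), ("A", 0), ("B", 64)], num_lines=4, line_bytes=16
)

# Placing B at address 16 instead removes the conflict.
no_conflicts = cache_conflict_graph(
    [("A", 0), ("B", 16), ("A", 0), ("B", 16)], num_lines=4, line_bytes=16
)
```

Edge weights of this kind are the sort of information an allocation formulation could use to estimate how many cache misses disappear when one of two conflicting arrays is moved out of the cached address space, for example into BRAM.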

Description

Technical field

[0001] The disclosure relates to a data allocation method for a CPU-FPGA heterogeneous multi-core system.

Background

[0002] The statements in this section merely provide background information related to the present disclosure and do not necessarily constitute prior art.

[0003] Field-programmable gate arrays (FPGAs) are becoming an increasingly popular design choice in computer systems ranging from low-power embedded systems to high-performance computing architectures. Traditional FPGA design using register-transfer level (RTL) programming requires extensive architecture and circuit expertise and is error-prone and time-consuming. A high-level synthesis (HLS) tool compiles a C/C++ kernel into a corresponding hardware description language (HDL) module. In recent years, HLS tools have been widely used in the design of complex FPGA-based heterogeneous systems, shortening time to...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/50
CPC: G06F9/5027
Inventor: 鞠雷, 荣雅洁, 李世清
Owner: SHANDONG UNIV