A method for data distribution in CPU-FPGA heterogeneous multi-core system

A CPU-FPGA heterogeneous multi-core technology, applied in the field of data allocation for CPU-FPGA heterogeneous multi-core systems, that addresses problems such as suboptimal system performance.

Active Publication Date: 2019-03-15

SHANDONG UNIV



AI Technical Summary

Problems solved by technology

Because the design space is complex, data allocations made manually by the system programmer can lead to suboptimal system performance.


Examples


Embodiment 1

[0093] In order to understand the memory-level performance of CPU-FPGA HMPSoC, a memory access latency model is constructed.

[0094] To measure the access latency of each level of the memory hierarchy on a Xilinx Zynq-7020 SoC at a clock frequency of 100 MHz, a microbenchmark FPGA kernel was designed to perform millions of memory accesses of a specific type. For example, a kernel that repeatedly accesses a large matrix in column order via the ACP can be used to measure the average ACP read-miss (i.e., L2 cache read-miss) latency.

[0095] Sizing the matrix relative to the capacity of the given L2 cache ensures that every column-order access results in a cache miss.
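The effect described in [0094] and [0095] can be illustrated with a small cache simulation. The sketch below is not the patent's microbenchmark: it models a generic set-associative LRU cache in software, and the cache geometry (32-byte lines, 64 sets, 8 ways), the 4-byte element size, and the function names are all illustrative assumptions rather than Zynq-7020 parameters. It shows that walking a row-major matrix in column order misses on every access once the matrix is large enough relative to the cache, while a row-order walk of the same matrix mostly hits.

```python
from collections import OrderedDict


def miss_rate(addresses, line_size=32, num_sets=64, ways=8):
    """Fraction of accesses that miss in a set-associative LRU cache.

    All geometry parameters are illustrative assumptions.
    """
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    total = 0
    for addr in addresses:
        line = addr // line_size          # cache line holding this address
        cache_set = sets[line % num_sets]
        total += 1
        if line in cache_set:
            cache_set.move_to_end(line)   # hit: mark as most recently used
        else:
            misses += 1
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict least recently used
            cache_set[line] = True
    return misses / total


def column_order(rows, cols, elem_size=4):
    """Byte addresses of a row-major matrix visited column by column."""
    for j in range(cols):
        for i in range(rows):
            yield (i * cols + j) * elem_size


def row_order(rows, cols, elem_size=4):
    """Byte addresses of a row-major matrix visited row by row."""
    for i in range(rows):
        for j in range(cols):
            yield (i * cols + j) * elem_size


if __name__ == "__main__":
    # A 1024 x 64 matrix (256 KB) against a 16 KB toy cache.
    print("column order:", miss_rate(column_order(1024, 64)))
    print("row order:   ", miss_rate(row_order(1024, 64)))
```

With these toy parameters each column walk cycles through far more lines per set than the set can hold, so every line is evicted before it is touched again. On real hardware the matrix dimensions would have to be chosen against the actual L2 geometry in the same way.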

[0096] The experimental results are shown in Table 1:

[0097] Table 1

[0098]

[0099] These measured parameters are used in the formulas of step (3), for example:

[0100] In the definition of the execution time variable of the basic block:

[0101]

[0102] If a node represents a memory load instruction on array a_i...

Embodiment 2

[0118] The purpose of this example is to show that the allocation of data has a significant impact on the system performance of the Zedboard platform based on the Zynq-7020 SoC.

[0119] The experiment uses the General Matrix Multiplication (GEMM) kernel from PolyBench as a motivating example:

[0120] Algorithm 1 Matrix Multiplication Algorithm

[0121]
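The body of Algorithm 1 is not reproduced in this text. As a rough illustration only, the sketch below follows the usual PolyBench definition of GEMM, C = alpha * A * B + beta * C. The loop order, the function name `gemm`, and the source-level access counting are assumptions made here, not taken from the patent; an HLS tool may hoist or buffer some of these accesses.

```python
def gemm(alpha, beta, A, B, C):
    """Compute C = alpha * A * B + beta * C in place on lists of lists.

    Also returns the number of source-level accesses to each array.
    """
    ni, nk, nj = len(A), len(B), len(B[0])
    counts = {"A": 0, "B": 0, "C": 0}
    for i in range(ni):
        for j in range(nj):
            C[i][j] *= beta                       # one read and one write of C
            counts["C"] += 2
        for k in range(nk):
            for j in range(nj):
                C[i][j] += alpha * A[i][k] * B[k][j]
                counts["A"] += 1                  # one read of A
                counts["B"] += 1                  # one read of B
                counts["C"] += 2                  # one read and one write of C
    return C, counts


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    C = [[1, 1], [1, 1]]
    result, counts = gemm(1, 2, A, B, C)
    print(result)
    print(counts)
```

Under this counting, array C is both read and written in the innermost loop, so it is touched about twice as often as A or B.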

[0122] We assume that at most 1 of the 3 arrays A, B, and C can fit in the on-chip BRAMs.

[0123] The experimental results are shown in Figure 4, where A_a B_b C_h denotes the allocation scheme in which arrays A, B, and C are allocated to the ACP, BRAM, and HP ports, respectively.

[0124] The results show that the optimal allocation (A_b C_a B_h) is 3.14 times faster than the worst allocation (A_h C_h B_b).
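To make the size of this design space concrete, the sketch below enumerates every way of assigning arrays A, B, and C to the ACP, BRAM, and HP ports while placing at most one array in BRAM, as assumed in [0122]. The function name and the label format (array name, underscore, then a for ACP, b for BRAM, h for HP) are illustrative choices; the set of schemes actually plotted in Figure 4 is not stated in this text and may differ.

```python
from itertools import product

# Port name -> subscript letter used in the scheme labels.
PORT_SUFFIX = {"ACP": "a", "BRAM": "b", "HP": "h"}


def feasible_schemes(arrays=("A", "B", "C"), max_in_bram=1):
    """List labels of all allocations with at most max_in_bram arrays in BRAM."""
    schemes = []
    for ports in product(PORT_SUFFIX, repeat=len(arrays)):
        if ports.count("BRAM") <= max_in_bram:
            label = " ".join(
                f"{name}_{PORT_SUFFIX[port]}" for name, port in zip(arrays, ports)
            )
            schemes.append(label)
    return schemes


if __name__ == "__main__":
    schemes = feasible_schemes()
    print(len(schemes), "feasible schemes")
    for label in schemes:
        print(label)
```

Of the 27 unconstrained assignments, the BRAM limit leaves 20. The space grows exponentially with the number of arrays, which is why tuning the allocation by hand does not scale.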

[0125] In addition, it can be seen that some results may run counter to the traditional CPU-based SPM allocation scheme.

[0126] (1) Compared with the other two arrays, array C has significantly higher read and...


Abstract

A data allocation method for a CPU-FPGA heterogeneous multi-core system is disclosed, comprising the following steps: compiling the source code into Low Level Virtual Machine (LLVM) intermediate representation (IR) through the Clang front end; executing the LLVM IR on the input data to obtain a data access trace and an instruction trace; generating a dynamic data dependency graph (DDDG) from the instruction trace to represent the control flow and data flow of the FPGA kernel; feeding the data access trace to a cache simulator to obtain a cache conflict graph (CCG); and constructing an integer linear programming (ILP) formulation and solving it according to the DDDG and the CCG to obtain the optimal data allocation scheme.
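The final step can be pictured as a 0/1 assignment problem: each array is assigned to exactly one port, the total size of the arrays placed in BRAM must not exceed its capacity, and the estimated cost is minimized. The sketch below is a heavily simplified stand-in, not the patent's formulation. It replaces the DDDG and CCG terms with a fixed per-access latency for each port, and it replaces an ILP solver with exhaustive search over the assignment variables. Every number and name in it is a hypothetical placeholder.

```python
from itertools import product

PORTS = ("ACP", "BRAM", "HP")


def solve_allocation(arrays, latency, bram_capacity):
    """Return (cost, assignment) minimizing total access latency.

    arrays:        name -> (size_in_bytes, access_count)
    latency:       port -> cycles per access (placeholder values)
    bram_capacity: bytes available in on-chip BRAM
    """
    names = sorted(arrays)
    best = None
    for choice in product(PORTS, repeat=len(names)):
        # Capacity constraint: arrays mapped to BRAM must fit on chip.
        bram_bytes = sum(
            arrays[n][0] for n, p in zip(names, choice) if p == "BRAM"
        )
        if bram_bytes > bram_capacity:
            continue
        # Objective: access count times per-access latency, summed over arrays.
        cost = sum(arrays[n][1] * latency[p] for n, p in zip(names, choice))
        if best is None or cost < best[0]:
            best = (cost, dict(zip(names, choice)))
    return best


if __name__ == "__main__":
    # Hypothetical sizes, access counts, and latencies (not measured values).
    arrays = {"A": (4096, 1000), "B": (4096, 1000), "C": (4096, 2000)}
    latency = {"ACP": 10, "BRAM": 2, "HP": 20}
    print(solve_allocation(arrays, latency, bram_capacity=4096))
```

A frequency-only cost model like this one always sends the most frequently accessed array to the fastest memory. It ignores cache conflicts between arrays and dependencies between instructions, which are the effects the CCG and DDDG are meant to capture in the disclosed method.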

Description

Technical field

[0001] The disclosure relates to a data allocation method for a CPU-FPGA heterogeneous multi-core system.

Background

[0002] The statements in this section merely provide background related to the present disclosure and do not necessarily constitute prior art.

[0003] Field-programmable gate arrays (FPGAs) are becoming an increasingly popular design choice in computer systems ranging from low-power embedded systems to high-performance computing architectures. Traditional FPGA design with register-transfer level (RTL) programming requires extensive architecture and circuit experience, and is error-prone and time-consuming. A high-level synthesis (HLS) tool compiles a C/C++ kernel into a corresponding hardware description language (HDL) module. In recent years, HLS tools have been widely used in complex FPGA heterogeneous system design, shortening time to...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06F9/50

CPC: G06F9/5027

Inventor: 鞠雷, 荣雅洁, 李世清

Owner: SHANDONG UNIV
