A communication architecture construction method for inter-task communication of a slotless DPR system
By employing a chained communication strategy that coordinates bandwidth and location in a slotless DPR system, and adaptively selecting communication methods and task layouts, the problem of optimizing the inter-task communication architecture in a slotless DPR system is solved, and an efficient and flexible communication architecture is constructed.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NANJING UNIV OF AERONAUTICS & ASTRONAUTICS
- Filing Date
- 2022-04-24
- Publication Date
- 2026-06-12
AI Technical Summary
Existing slotless DPR systems cannot adaptively select communication methods and build communication architectures when communicating between tasks, and cannot optimize based on the bandwidth requirements and dependencies of inter-task communication.
A chain-like communication strategy that coordinates bandwidth and location is adopted. Through ICAP virtual channels, fast transmission links and shared memory communication methods, combined with the inter-task communication bandwidth requirements and dependencies, an adaptive communication architecture is constructed.
It enables adaptive selection of communication methods based on actual needs, optimizes task layout, improves communication efficiency and resource utilization efficiency, and forms a flexible and efficient communication architecture.
Figure CN116974960B_ABST (abstract figure)
Description
Technical Field
[0001] This invention belongs to the technical field of FPGA reconfigurable system design and relates to the task communication architecture of dynamically reconfigurable systems, in particular to a method for constructing a communication architecture for inter-task communication in slotless dynamic partial reconfiguration (DPR) systems.
Background Technology
[0002] FPGA-based DPR systems can execute multiple tasks in parallel and allow for online real-time loading or unloading of each task, showing broad application prospects. Compared to traditional slot-based DPR systems, slotless DPR systems have stronger adaptability and higher resource utilization efficiency, allowing for online adjustment of the number, position, and size of task nodes, and enabling online construction of communication links based on the actual task layout.
[0003] Currently, slotless DPR systems can construct inter-task communication links online using ICAP virtual channel communication, fast transmission link communication, and shared memory communication. However, existing research focuses only on the principle and verification of each individual communication method. It does not adaptively select a communication method or construct a communication architecture based on inter-task bandwidth requirements and dependencies. This invention addresses these issues by proposing a method for constructing a communication architecture for inter-task communication in slotless DPR systems.
Summary of the Invention
[0004] The purpose of this invention is to address the aforementioned problems and shortcomings by proposing a communication architecture construction method for inter-task communication in slotless DPR systems. Based on ICAP virtual channel communication, fast transmission link-based communication, and shared memory-based communication, this invention proposes a bandwidth and location-coordinated chained communication strategy according to the communication bandwidth requirements and dependencies between tasks, thereby enabling the construction of an adaptive communication architecture between nodes.
[0005] To achieve the above objectives, the technical solution adopted by the present invention is: a method for constructing a communication architecture for inter-task communication in a dynamically partially reconfigurable system, characterized by comprising the following steps:
[0006] (1) Model the task set $T = \{T_1, T_2, \ldots, T_N\}$ in the system: use $RE_{T_i} = \{CRS_{T_i}, CTGS_{T_i}, CTGSD_{T_i}\}$ to describe the resource-region characteristics of the i-th task $T_i$, where $CRS_{T_i}$ describes the size of the resource region of task $T_i$, $CTGS_{T_i}$ describes the distribution pattern of the BRAM and DSP columns in the resource region of task $T_i$, and $CTGSD_{T_i}$ describes the distance characteristics between the BRAM and DSP columns in the resource region of task $T_i$;
[0007] (2) Use $COM_{T_i-T_j}$ to describe the communication method between two tasks $T_i$ and $T_j$ in the task graph, take it as the connection weight between the two task nodes, and initially determine $COM_{T_i-T_j}$ based on the inter-task communication bandwidth requirements;
[0008] (3) Based on the communication relationship between tasks in the task graph, the task graph is decomposed into sub-task graphs that have no communication dependency with each other, and the connection weights between tasks with one-to-many communication and many-to-one communication in each sub-task graph are adjusted.
[0009] (4) According to the task layout constraints of different communication methods, merge the tasks in each subtask graph described in step (3) to obtain the set of virtual tasks to be laid out.
[0010] (5) Use the layout algorithm to determine the virtual layout position of each virtual task in the virtual task set to be laid out in step (4), and determine the actual layout position of each task in the original task graph based on the virtual layout position of each virtual task.
[0011] (6) Use bitstream relocation technology to configure the functions of each task node in the original task graph to the actual layout position described in step (5), and construct a chain communication architecture according to the connection weight described in step (3).
[0012] Furthermore, the specific implementation method of step (2) is as follows:
[0013] (2.1) If the communication bandwidth requirement between tasks $T_i$ and $T_j$ is high, the shared-memory-based communication method is chosen, and $COM_{T_i-T_j}$ = share_mem;
[0014] (2.2) If the communication bandwidth requirement between tasks $T_i$ and $T_j$ is low, the ICAP-based communication method is chosen, and $COM_{T_i-T_j}$ = icap;
[0015] (2.3) If the communication bandwidth requirement between tasks $T_i$ and $T_j$ lies between those of (2.1) and (2.2), the communication method based on a fast transmission link is chosen, and $COM_{T_i-T_j}$ = qtl.
[0016] Furthermore, the specific implementation method of step (3) is as follows:
[0017] (3.1) For each sub-task graph, scan from the root node following the communication successor relationship. For one-to-one communication tasks, retain the initial connection weight.
[0018] (3.2) For one-to-many and many-to-one communication tasks, change the connection weight to $COM_{T_i-T_j}$ = icap.
[0019] Furthermore, the specific implementation method of step (4) is as follows:
[0020] (4.1) For each sub-task graph, scan from the root node following the communication successor relationship. If the connection weight between a task $T_i$ and its successor task $T_j$ is $COM_{T_i-T_j}$ = icap, place task $T_i$ directly into the set of virtual tasks to be laid out;
[0021] (4.2) If the connection weight between a task $T_i$ and its successor task $T_j$ is $COM_{T_i-T_j}$ = share_mem, merge them into a single virtual task $Tv_{T_i-T_j}$ and place it into the set of virtual tasks to be laid out. The resource-region characteristic $RE_{T_i-T_j}$ of $Tv_{T_i-T_j}$ is $RE_{T_i}$ and $RE_{T_j}$ stacked horizontally from left to right, sharing the BRAM resource that $RE_{T_i}$ uses for output data buffering;
[0022] (4.3) If the connection weight between a task $T_i$ and its successor task $T_j$ is $COM_{T_i-T_j}$ = qtl, merge them into a single virtual task $Tv_{T_i-T_j}$ and place it into the set of virtual tasks to be laid out. The resource-region characteristic $RE_{T_i-T_j}$ of $Tv_{T_i-T_j}$ is $RE_{T_i}$ and $RE_{T_j}$ stacked vertically from bottom to top, with one column of CLB resources added at the edge of the BRAM that $RE_{T_i}$ uses for output data buffering.
[0023] Furthermore, the specific implementation method of step (5) is as follows:
[0024] (5.1) If the virtual task is the original task described in step (4.1), then the virtual layout position is its final layout position;
[0025] (5.2) If the virtual task is a virtual task $Tv_{T_i-T_j}$ using the shared memory communication method described in step (4.2), then the layout position of task $T_i$ is left-aligned with the virtual layout position, and the layout position of task $T_j$ is right-aligned with the virtual layout position;
[0026] (5.3) If the virtual task is a virtual task $Tv_{T_i-T_j}$ using the fast-transmission-link communication method described in step (4.3), then the layout position of task $T_i$ is bottom-aligned with the virtual layout position, and the layout position of task $T_j$ is top-aligned with the virtual layout position, consistent with the bottom-to-top stacking of step (4.3).
[0027] Compared with the prior art, the beneficial effects of the present invention are as follows: it adaptively selects a suitable communication method and constructs a communication architecture according to the communication bandwidth requirements and communication dependencies between tasks in the actual application system, taking both the bandwidth requirements and the task placement constraints into account, and it is practical, efficient, and flexible.
Description of the Drawings
[0028] Figure 1 is a flowchart of the method.
[0029] Figure 2 shows the original task graph.
[0030] Figure 3 shows the task graph after the initial determination of connection weights.
[0031] Figure 4 shows the task graph after merging virtual tasks.
[0032] Figure 5 shows the chain communication architecture.
Detailed Implementation
[0033] The embodiments of the present invention are described in detail below. These embodiments are exemplary, serve only to explain the present invention, and should not be construed as limiting it. The method for constructing a communication architecture for inter-task communication in a slotless DPR system is described below with reference to the accompanying drawings. As shown in Figure 1, the method of this embodiment includes the following steps.
[0034] (1) Model the task set $T = \{T_1, T_2, \ldots, T_N\}$ in the system: use $RE_{T_i} = \{CRS_{T_i}, CTGS_{T_i}, CTGSD_{T_i}\}$ to describe the resource-region characteristics of the i-th task $T_i$, where $CRS_{T_i}$ describes the size of the resource region of task $T_i$, $CTGS_{T_i}$ describes the distribution pattern of the BRAM and DSP columns in the resource region of task $T_i$, and $CTGSD_{T_i}$ describes the distance characteristics between the BRAM and DSP columns in the resource region of task $T_i$.
[0035] (2) Use $COM_{T_i-T_j}$ to describe the communication method between two tasks $T_i$ and $T_j$ in the task graph, take it as the connection weight between the two task nodes, and initially determine $COM_{T_i-T_j}$ based on the inter-task communication bandwidth requirements.
[0036] (2.1) If the communication bandwidth requirement between tasks $T_i$ and $T_j$ is high, the shared-memory-based communication method is chosen, and $COM_{T_i-T_j}$ = share_mem. This communication method is the fastest, but it requires the two communicating tasks to be horizontally adjacent, with the output data buffer (NDOB) of the preceding task coinciding with the input data buffer (NDIB) of the following task.
[0037] (2.2) If the communication bandwidth requirement between tasks $T_i$ and $T_j$ is low, the ICAP-based communication method is chosen, and $COM_{T_i-T_j}$ = icap. This communication method places no constraint on task layout position, but its bandwidth is limited by ICAP throughput.
[0038] (2.3) If the communication bandwidth requirement between tasks $T_i$ and $T_j$ lies between those of (2.1) and (2.2), the communication method based on a fast transmission link is chosen, and $COM_{T_i-T_j}$ = qtl. This communication method requires the two communicating tasks to be vertically adjacent, and requires one CLB column to be reserved beside the BRAM column used for communication so that the communication link can be constructed.
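The three-way choice in steps (2.1) to (2.3) can be sketched as a small selection function. This is only an illustration: the source describes the bandwidth requirement as "high", "low", or in between and gives no numeric limits, so the `low_threshold` and `high_threshold` parameters, the `Comm` enum, and the function name are all hypothetical.

```python
from enum import Enum


class Comm(Enum):
    """Connection-weight values named in the text."""
    SHARE_MEM = "share_mem"  # fastest; tasks must be horizontally adjacent
    QTL = "qtl"              # intermediate; tasks must be vertically adjacent
    ICAP = "icap"            # no placement constraint; limited by ICAP throughput


def select_comm(bandwidth, low_threshold, high_threshold):
    """Pick the initial connection weight for one task pair.

    Both thresholds are assumed tuning parameters, not values from the source.
    """
    if low_threshold > high_threshold:
        raise ValueError("low_threshold must not exceed high_threshold")
    if bandwidth >= high_threshold:
        return Comm.SHARE_MEM
    if bandwidth <= low_threshold:
        return Comm.ICAP
    return Comm.QTL
```

With assumed thresholds of 100 and 500 (in any consistent bandwidth unit), a requirement of 900 maps to share_mem, 50 to icap, and 300 to qtl.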
[0039] For the task graph shown in Figure 2, the task graph obtained after initially determining $COM_{T_i-T_j}$ from the communication bandwidth requirements is shown in Figure 3.
[0040] (3) Based on the communication relationship between tasks in the task graph, the task graph is decomposed into sub-task graphs that have no communication dependency between them, and the connection weights between tasks with one-to-many communication and many-to-one communication in each sub-task graph are adjusted.
[0041] (3.1) For each sub-task graph, scan from the root node following the communication successor relationship. For one-to-one communication tasks, retain the initial connection weight.
[0042] (3.2) For one-to-many and many-to-one communication tasks, change the connection weight to $COM_{T_i-T_j}$ = icap.
[0043] The task graph shown in Figure 3 is a single independent sub-task graph. The connection weights between T2 and T3/T4 (one-to-many communication) and between T5/T4 and T6 (many-to-one communication) need to be adjusted. The adjusted task graph is shown in Figure 4.
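Step (3) can be sketched as two small graph routines: one that splits the task graph into sub-task graphs with no communication between them, and one that resets one-to-many and many-to-one edges to icap. The example topology below is inferred from the task names in the text, and the initial weights are invented for illustration, since the figures are not reproduced here. The function names and the dictionary representation are assumptions.

```python
from collections import defaultdict


def split_subgraphs(nodes, edges):
    """Group tasks into sub-task graphs that share no communication edge."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)
    seen, parts = set(), []
    for node in nodes:
        if node in seen:
            continue
        seen.add(node)
        stack, component = [node], []
        while stack:  # depth-first walk over undirected adjacency
            current = stack.pop()
            component.append(current)
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        parts.append(sorted(component))
    return parts


def adjust_weights(edges):
    """Keep one-to-one weights; set one-to-many / many-to-one edges to icap."""
    out_degree, in_degree = defaultdict(int), defaultdict(int)
    for src, dst in edges:
        out_degree[src] += 1
        in_degree[dst] += 1
    adjusted = {}
    for (src, dst), method in edges.items():
        one_to_one = out_degree[src] == 1 and in_degree[dst] == 1
        adjusted[(src, dst)] = method if one_to_one else "icap"
    return adjusted


# Hypothetical initial weights on an assumed topology.
edges = {
    ("T1", "T2"): "share_mem", ("T2", "T3"): "qtl", ("T2", "T4"): "share_mem",
    ("T3", "T5"): "qtl", ("T4", "T6"): "qtl", ("T5", "T6"): "share_mem",
}
adjusted = adjust_weights(edges)
```

Under these assumptions only the T1-T2 and T3-T5 edges keep their initial weights, and every edge touching the fan-out at T2 or the fan-in at T6 becomes icap.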
[0044] (4) Based on the task layout constraints of different communication methods, merge the tasks in each subtask graph described in step (3) to obtain a set of virtual tasks to be laid out.
[0045] (4.1) For each sub-task graph, scan from the root node following the communication successor relationship. If the connection weight between a task $T_i$ and its successor task $T_j$ is $COM_{T_i-T_j}$ = icap, place task $T_i$ directly into the set of virtual tasks to be laid out.
[0046] (4.2) If the connection weight between a task $T_i$ and its successor task $T_j$ is $COM_{T_i-T_j}$ = share_mem, merge them into a single virtual task $Tv_{T_i-T_j}$ and place it into the set of virtual tasks to be laid out.
[0047] Shared memory communication requires the communicating tasks to be horizontally adjacent, with the output data buffer (NDOB) of the preceding task coinciding with the input data buffer (NDIB) of the following task. Therefore, the resource-region characteristic $RE_{T_i-T_j}$ of the virtual task $Tv_{T_i-T_j}$ is the resource-region characteristic $RE_{T_i}$ of task $T_i$ and the resource-region characteristic $RE_{T_j}$ of task $T_j$ stacked horizontally from left to right, sharing the BRAM resource of the NDOB of $RE_{T_i}$. For example, suppose the GS sequence of task $T_i$ is {R, M, L, R} (the rightmost BRAM is the NDOB) and the GS sequence of task $T_j$ is {R, M, D, R} (the leftmost BRAM is the NDIB). The GS sequence of the merged virtual task $Tv_{T_i-T_j}$ is then {R, M, L, R, M, D, R}.
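The horizontal merge in this example amounts to concatenating the two GS sequences while counting the shared BRAM column once. A minimal sketch, assuming GS sequences are held as lists of single-letter column codes with R marking a BRAM column as in the example (the function name is hypothetical):

```python
def merge_horizontal(gs_i, gs_j):
    """Merge two GS sequences left to right, sharing one BRAM column.

    Assumes T_i's NDOB is its rightmost column and T_j's NDIB is its
    leftmost column, as in the example in the text.
    """
    if gs_i[-1] != "R" or gs_j[0] != "R":
        raise ValueError("both tasks must expose a BRAM column at the shared edge")
    # Drop T_j's first column because it coincides with T_i's last column.
    return list(gs_i) + list(gs_j[1:])


merged = merge_horizontal(["R", "M", "L", "R"], ["R", "M", "D", "R"])
```

For the sequences in the text this yields {R, M, L, R, M, D, R}, seven columns instead of eight because the boundary BRAM is shared.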
[0048] (4.3) If the connection weight between a task $T_i$ and its successor task $T_j$ is $COM_{T_i-T_j}$ = qtl, merge them into a single virtual task $Tv_{T_i-T_j}$ and place it into the set of virtual tasks to be laid out.
[0049] Communication based on a fast transmission link requires the communicating tasks to be vertically adjacent, and requires at least one CLB column to be left free beside the boundary BRAM resource column. Therefore, the resource-region characteristic $RE_{T_i-T_j}$ of the virtual task $Tv_{T_i-T_j}$ is the resource-region characteristic $RE_{T_i}$ of task $T_i$ and the resource-region characteristic $RE_{T_j}$ of task $T_j$ stacked vertically from bottom to top, with one column of CLB resources added at the edge of the NDOB BRAM of $RE_{T_i}$. For example, suppose the GS sequence of task $T_i$ is {R, M, L, R, M, D, R} (the rightmost BRAM is the NDOB) and the GS sequence of task $T_j$ is {R, M, D, R} (the rightmost BRAM is the NDIB). The GS sequence of the merged virtual task $Tv_{T_i-T_j}$ is then {R, M, L, R, M, D, R, M}, where the additional CLB column on the far right is used to construct the fast transmission link.
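The vertical merge in this example can be sketched as follows. The sketch reproduces only the case shown in the text: it assumes the upper task is no wider than the lower one, that the lower task's NDOB is its rightmost column, and that M denotes the extra CLB column, as the example implies. The source does not state a general rule for other shapes. The function name and the explicit height arguments are hypothetical.

```python
def merge_vertical(gs_i, height_i, gs_j, height_j):
    """Stack T_j on top of T_i and reserve one CLB column for the link.

    Returns the merged GS sequence and the combined region height.
    """
    if gs_i[-1] != "R":
        raise ValueError("expected T_i's NDOB BRAM column at its right edge")
    if len(gs_j) > len(gs_i):
        raise ValueError("this sketch assumes T_j is no wider than T_i")
    # The merged footprint is T_i's columns plus one CLB column ("M").
    return list(gs_i) + ["M"], height_i + height_j


merged_gs, merged_height = merge_vertical(
    ["R", "M", "L", "R", "M", "D", "R"], 1, ["R", "M", "D", "R"], 1
)
```

With both heights assumed to be 1, the merged region is eight columns wide and two rows tall.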
[0050] For the task graph shown in Figure 4, merging yields the virtual task set $Tv = \{Tv_{T_1-T_2}, Tv_{T_3-T_5}, T_4, T_6\}$.
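The merging pass of step (4) can be sketched as a scan that pairs each task with a successor reached over a share_mem or qtl edge and leaves all other tasks on their own. The edge weights below are assumed values chosen to be consistent with the virtual task set above, because the figures are not reproduced here. The sketch merges pairs only; the text does not say how a chain of three or more one-to-one tasks would be handled.

```python
def build_virtual_tasks(scan_order, edges):
    """Return virtual task names in scan order.

    scan_order: task names listed from the root along successor links.
    edges: {(src, dst): method} after the weight adjustment of step (3).
    """
    virtual, consumed = [], set()
    for task in scan_order:
        if task in consumed:
            continue  # already absorbed into an earlier merged pair
        partner = next(
            (dst for (src, dst), method in edges.items()
             if src == task and method in ("share_mem", "qtl")),
            None,
        )
        if partner is None:
            virtual.append(task)  # icap or no successor: place directly
        else:
            virtual.append(f"Tv_{task}-{partner}")
            consumed.add(partner)
    return virtual


# Assumed adjusted weights (hypothetical values).
adjusted_edges = {
    ("T1", "T2"): "share_mem", ("T2", "T3"): "icap", ("T2", "T4"): "icap",
    ("T3", "T5"): "qtl", ("T4", "T6"): "icap", ("T5", "T6"): "icap",
}
virtual_tasks = build_virtual_tasks(["T1", "T2", "T3", "T4", "T5", "T6"], adjusted_edges)
```

Under these assumptions the result lists the merged pairs T1-T2 and T3-T5 followed by the standalone tasks T4 and T6.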
[0051] (5) Use the layout algorithm to determine the virtual layout position of each virtual task in the virtual task set to be laid out in step (4), and determine the actual layout position of each task in the original task graph based on the virtual layout position of each virtual task.
[0052] In the virtual task set described in step (4), all tasks communicate using an ICAP-based method, and there are no longer constraints on their layout positions. Therefore, a layout algorithm can be used first to implement the layout of tasks in the virtual task set, and then the layout positions of each actual task in the original task graph can be determined according to the following method.
[0053] (5.1) If the virtual task is the original task described in step (4.1), then the virtual layout position is its final layout position.
[0054] (5.2) If the virtual task is a virtual task $Tv_{T_i-T_j}$ using the shared memory communication method described in step (4.2), then the layout position of task $T_i$ is left-aligned with the virtual layout position, and the layout position of task $T_j$ is right-aligned with the virtual layout position. For example, for the shared-memory-based tasks $T_i$ (GS {R, M, L, R}) and $T_j$ (GS {R, M, D, R}) in step (4.2), if the position of the merged $Tv_{T_i-T_j}$ is (x, y), then the position of $T_i$ is (x, y) and the position of $T_j$ is (x+3, y).
[0055] (5.3) If the virtual task is a virtual task $Tv_{T_i-T_j}$ using the fast-transmission-link communication method described in step (4.3), then the layout position of task $T_i$ is bottom-aligned with the virtual layout position, and the layout position of task $T_j$ is top-aligned with the virtual layout position. For example, for the fast-transmission-link-based tasks $T_i$ (GS {R, M, L, R, M, D, R}, assuming a task region height of 1) and $T_j$ (GS {R, M, D, R}) in step (4.3), if the position of the merged $Tv_{T_i-T_j}$ is (x, y), then the position of $T_i$ is (x, y) and the position of $T_j$ is (x, y+1).
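The two position rules in steps (5.2) and (5.3) can be sketched as one helper that derives both actual positions from the virtual position. It generalizes the two worked examples under the assumption that x counts resource columns and y counts region rows. The function name and argument layout are hypothetical.

```python
def actual_positions(method, virtual_pos, gs_i, height_i=1):
    """Return (position of T_i, position of T_j) for a merged virtual task."""
    x, y = virtual_pos
    if method == "share_mem":
        # T_j starts on the shared BRAM column, the last column of T_i.
        return (x, y), (x + len(gs_i) - 1, y)
    if method == "qtl":
        # T_j sits directly above T_i.
        return (x, y), (x, y + height_i)
    raise ValueError("only merged virtual tasks (share_mem or qtl) are split")


pos_share = actual_positions("share_mem", (10, 4), ["R", "M", "L", "R"])
pos_qtl = actual_positions("qtl", (10, 4), ["R", "M", "L", "R", "M", "D", "R"])
```

With a virtual position of (10, 4), the shared memory pair lands at (10, 4) and (13, 4), matching the (x, y) and (x+3, y) example, and the fast-transmission-link pair lands at (10, 4) and (10, 5).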
[0056] With the above strategy, the placement positions of the tasks in the task graph shown in Figure 4 form a chain structure, as shown in Figure 5.
[0057] The above description is merely an embodiment of the present invention and does not limit the patent scope of the present invention. Any equivalent structural or procedural transformations made based on the content of the present invention's specification and drawings, or direct or indirect applications in other related technical fields, are similarly included within the patent protection scope of the present invention.
Claims
1. A method for constructing a communication architecture for inter-task communication in a slotless DPR system, characterized in that it comprises the following steps:
(1) Model the task set $T = \{T_1, T_2, \ldots, T_N\}$ in the system: use $RE_{T_i} = \{CRS_{T_i}, CTGS_{T_i}, CTGSD_{T_i}\}$ to describe the resource-region characteristics of the i-th task $T_i$, where $CRS_{T_i}$ describes the size of the resource region of task $T_i$, $CTGS_{T_i}$ describes the distribution pattern of the BRAM and DSP columns in the resource region of task $T_i$, and $CTGSD_{T_i}$ describes the distance characteristics between the BRAM and DSP columns in the resource region of task $T_i$;
(2) Use $COM_{T_i-T_j}$ to describe the communication method between two tasks $T_i$ and $T_j$ in the task graph, take it as the connection weight between the two task nodes, and initially determine $COM_{T_i-T_j}$ based on the inter-task communication bandwidth requirements:
(2.1) If the communication bandwidth requirement between tasks $T_i$ and $T_j$ is high, the shared-memory-based communication method is chosen, and $COM_{T_i-T_j}$ = share_mem, where share_mem denotes the communication method based on shared memory;
(2.2) If the communication bandwidth requirement between tasks $T_i$ and $T_j$ is low, the ICAP-based communication method is chosen, and $COM_{T_i-T_j}$ = icap, where icap denotes the communication method based on ICAP;
(2.3) If the communication bandwidth requirement between tasks $T_i$ and $T_j$ lies between those of (2.1) and (2.2), the communication method based on a fast transmission link is chosen, and $COM_{T_i-T_j}$ = qtl, where qtl denotes the communication method based on a fast transmission link;
(3) Based on the communication relationships between tasks in the task graph, decompose the task graph into sub-task graphs that have no communication dependency on each other, and adjust the connection weights between tasks with one-to-many and many-to-one communication in each sub-task graph;
(4) According to the task layout constraints of the different communication methods, merge the tasks in each sub-task graph described in step (3) to obtain the set of virtual tasks to be laid out;
(5) Use a layout algorithm to determine the virtual layout position of each virtual task in the set of virtual tasks to be laid out in step (4), and determine the actual layout position of each task in the original task graph based on the virtual layout position of each virtual task;
(6) Use bitstream relocation technology to configure the function of each task node in the original task graph at the actual layout position described in step (5), and construct a chain communication architecture according to the connection weights described in step (3).
2. The method for constructing a communication architecture for inter-task communication in a slotless DPR system according to claim 1, characterized in that, in step (3), the connection weights between one-to-many and many-to-one communication tasks in each sub-task graph are adjusted as follows: for each sub-task graph, scan from the root node following the communication successor relationship; for one-to-one communication tasks, retain the initial connection weight; for one-to-many and many-to-one communication tasks, change the connection weight to $COM_{T_i-T_j}$ = icap.
3. The method for constructing a communication architecture for inter-task communication in a slotless DPR system according to claim 1, characterized in that, in step (4), the tasks in each sub-task graph described in step (3) are merged to obtain the set of virtual tasks to be laid out as follows:
(4.1) For each sub-task graph, scan from the root node following the communication successor relationship. If the connection weight between a task $T_i$ and its successor task $T_j$ is $COM_{T_i-T_j}$ = icap, place task $T_i$ directly into the set of virtual tasks to be laid out;
(4.2) If the connection weight between a task $T_i$ and its successor task $T_j$ is $COM_{T_i-T_j}$ = share_mem, merge them into a single virtual task $Tv_{T_i-T_j}$ and place it into the set of virtual tasks to be laid out, where the resource-region characteristic $RE_{T_i-T_j}$ of $Tv_{T_i-T_j}$ is $RE_{T_i}$ and $RE_{T_j}$ stacked horizontally from left to right, sharing the BRAM resource that $RE_{T_i}$ uses for output data buffering;
(4.3) If the connection weight between a task $T_i$ and its successor task $T_j$ is $COM_{T_i-T_j}$ = qtl, merge them into a single virtual task $Tv_{T_i-T_j}$ and place it into the set of virtual tasks to be laid out, where the resource-region characteristic $RE_{T_i-T_j}$ of $Tv_{T_i-T_j}$ is $RE_{T_i}$ and $RE_{T_j}$ stacked vertically from bottom to top, with one column of CLB resources added at the edge of the BRAM that $RE_{T_i}$ uses for output data buffering.
4. The method for constructing a communication architecture for inter-task communication in a slotless DPR system according to claim 3, characterized in that, in step (5), the actual layout position of each task in the original task graph is determined from the virtual layout position of each virtual task as follows:
(5.1) If the virtual task is an original task described in step (4.1), then the virtual layout position is its final layout position;
(5.2) If the virtual task is a virtual task $Tv_{T_i-T_j}$ using the shared memory communication method described in step (4.2), then the layout position of task $T_i$ is left-aligned with the virtual layout position, and the layout position of task $T_j$ is right-aligned with the virtual layout position;
(5.3) If the virtual task is a virtual task $Tv_{T_i-T_j}$ using the fast-transmission-link communication method described in step (4.3), then the layout position of task $T_i$ is bottom-aligned with the virtual layout position, and the layout position of task $T_j$ is top-aligned with the virtual layout position.