Route planning method and device, computer device, storage medium and product
By performing full permutation and interval factor calculations on the many-core 4D architecture, the problem of inter-core routing planning is solved, and the efficiency and reliability of inter-core routing planning are improved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TSINGHUA UNIVERSITY
- Filing Date
- 2023-12-19
- Publication Date
- 2026-06-12
AI Technical Summary
How to perform inter-core routing planning when physically mapping a many-core 4D architecture to a 2D architecture to improve power consumption?
By performing a full permutation of the height, width, input depth, and output depth of the many-core 4D architecture, the first permutation method is determined, and the routing plan of the many-core 2D architecture is determined based on the interval factor and objective function of the computing cores.
It achieves better inter-core routing planning with lower computational complexity, improving planning efficiency and reliability.
Figure CN117729145B_ABST
Description
Technical Field
[0001] This application relates to the field of many-core chip technology, and in particular to a routing planning method, apparatus, computer device, storage medium and product.
Background Technology
[0002] With the development of many-core chip technology, the many-core 4D architecture has emerged. To improve the power consumption of many-core chips, the many-core 4D architecture must be physically mapped onto a 2D architecture. This mapping requires inter-core routing planning, and how to perform that planning is a problem that urgently needs to be solved.
Summary of the Invention
[0003] Therefore, it is necessary to provide a routing planning method, apparatus, computer device, computer-readable storage medium, and product to address the aforementioned technical problems and solve the problem of inter-core routing planning when physically mapping a many-core 4D architecture onto a 2D architecture.
[0004] Firstly, this application provides a route planning method. The method includes:
[0005] The height, width, input depth, and output depth of the many-core 4D architecture are fully permuted to obtain the first permutation method;
[0006] For each of the first arrangement methods, the interval factor of each computing core in the many-core 4D architecture is determined when the many-core 4D architecture is physically mapped to the many-core 2D architecture under the first arrangement method.
[0007] Based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement, the routing plan of the many-core 2D architecture is determined.
[0008] In one embodiment, determining the interval factor of each computing core in the many-core 4D architecture under the first arrangement, when the many-core 4D architecture is physically mapped to the many-core 2D architecture, includes:
[0009] Based on the base coordinates of each computing core in the many-core 4D architecture under the first arrangement, and the height, width, input depth, and output depth corresponding to the first arrangement, the sequence number of each computing core in the many-core 4D architecture under the first arrangement is determined when the many-core 4D architecture is physically mapped to the many-core 2D architecture.
[0010] Based on the principles of unique serial number and unique remainder, the interval factor of each computing core in the many-core 4D architecture under the first arrangement is determined according to the serial number of each computing core and the interval factor calculation formula.
[0011] In one embodiment, determining the routing plan for the many-core 2D architecture based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement includes:
[0012] The computing cores in the many-core 4D architecture described in the first arrangement are divided into multiple sub-parts;
[0013] For each of the sub-parts, the minimum physical routing time for each sub-part is determined based on the interval factor of each computing core in the many-core 4D architecture under the first arrangement and the objective function.
[0014] The minimum physical routing time under the first arrangement is determined based on the minimum physical routing time of each of the sub-parts;
[0015] The routing plan for the many-core 2D architecture is determined based on the minimum physical routing time under each of the first arrangement methods.
[0016] In one embodiment, determining the minimum physical routing time for each sub-part based on the interval factor of each computing core in the many-core 4D architecture under the first arrangement and the objective function includes:
[0017] The computing cores of each of the sub-parts are fully permuted to obtain the computing cores under different second permutation methods;
[0018] The minimum physical routing time for each sub-part is determined based on the interval factors corresponding to the computing cores under the different second permutation methods and the objective function.
[0019] In one embodiment, determining the minimum physical routing time for each of the first permutations based on the minimum physical routing time for each of the sub-parts includes:
[0020] The minimum physical routing time of each of the sub-parts is summed to obtain the minimum physical routing time of the first arrangement.
[0021] In one embodiment, determining the routing plan for the many-core 2D architecture based on the minimum physical routing time under each of the first arrangement methods includes:
[0022] Routing planning for the many-core 2D architecture is performed based on the first arrangement corresponding to the smallest of the minimum physical routing times under the first arrangement methods.
[0023] Secondly, this application also provides a route planning device. The device includes:
[0024] The permutation module is used to perform a full permutation of the height, width, input depth, and output depth of the many-core 4D architecture to obtain each first permutation method.
[0025] The calculation module is used to determine, for each of the first arrangement methods, the interval factor of each calculation core in the many-core 4D architecture when the many-core 4D architecture is physically mapped to the many-core 2D architecture;
[0026] The planning module is used to determine the routing plan of the many-core 2D architecture based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement.
[0027] Thirdly, this application also provides a computer device. The computer device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program to implement the steps of the method described in any of the first aspects above.
[0028] Fourthly, this application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program thereon, which, when executed by a processor, implements the steps of the method described in any of the first aspects above.
[0029] Fifthly, this application also provides a computer program product. The computer program product includes a computer program that, when executed by a processor, implements the steps of the method described in any of the first aspects above.
[0030] In the routing planning method, apparatus, computer device, storage medium, and product described above, the routing plan of the many-core 2D architecture is obtained based on the interval factor of each computing core in the many-core 4D architecture under the first arrangement and the objective function. This can yield a better routing plan with lower computational complexity, improving the efficiency and reliability of inter-core routing planning.
Description of the Drawings
[0031] Figure 1 is a diagram illustrating the application environment of the routing planning method in one embodiment;
[0032] Figure 2 is a flowchart illustrating a routing planning method in one embodiment;
[0033] Figure 3 is a schematic diagram illustrating the physical mapping of a many-core 4D architecture to a many-core 2D architecture in one embodiment;
[0034] Figure 4 is a flowchart illustrating, in one embodiment, the determination of the interval factor of each computing core in the many-core 4D architecture under a first arrangement when the many-core 4D architecture is physically mapped to a many-core 2D architecture;
[0035] Figure 5 is a schematic diagram illustrating the two unfolding methods used when unfolding the many-core 4D architecture in one embodiment;
[0036] Figure 6 is a flowchart illustrating, in one embodiment, the process of determining the routing plan for a many-core 2D architecture based on the interval factor of each computing core in the many-core 4D architecture under a first arrangement and the objective function;
[0037] Figure 7 is a flowchart illustrating a routing planning method in one exemplary embodiment;
[0038] Figure 8 is a schematic diagram of a routing planning device in one embodiment;
[0039] Figure 9 is a diagram of the internal structure of a server in one embodiment;
[0040] Figure 10 is a diagram of the internal structure of a terminal in one embodiment.
Detailed Implementation
[0041] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0042] The routing planning method provided in the embodiments of this application can be applied in the application environment shown in Figure 1. Computer device 102 performs a full permutation of the height, width, input depth, and output depth of the many-core 4D architecture to obtain the first permutation methods. For each first permutation method, it determines the interval factor of each computing core in the many-core 4D architecture under that first permutation method when the many-core 4D architecture is physically mapped to the many-core 2D architecture. Based on the interval factor of each computing core in the many-core 4D architecture under the first permutation method and the objective function, it determines the routing plan for the many-core 2D architecture. Computer device 102 can be, but is not limited to, various personal computers, laptops, smartphones, tablets, IoT devices, and portable wearable devices. IoT devices can be smart speakers, smart TVs, smart air conditioners, smart in-vehicle devices, etc. Portable wearable devices can be smartwatches, smart bracelets, head-mounted devices, etc.
[0043] In one embodiment, as shown in Figure 2, a routing planning method is provided. Taking the application of the method to computer device 102 in Figure 1 as an example, the method includes the following steps:
[0044] Step 202: Perform a full permutation of the height, width, input depth, and output depth of the many-core 4D architecture to obtain each first permutation method.
[0045] The physical mapping from the many-core 4D architecture to the many-core 2D architecture is shown schematically in Figure 3. The upper part is the many-core 4D architecture, where I, J, M, and N represent the height, width, input depth, and output depth of the many-core 4D architecture, respectively. IA represents a cluster of computing cores, i.e., a group of computing cores. H represents the height of IA, W represents the width of IA, Cin represents the input data, Cout represents the output data, and core_num represents the number of computing cores. Each small cube in the upper part represents a core under the many-core 4D architecture, i.e., a computing core. The lower part is the many-core 2D architecture, where each small square represents a core under the many-core 2D architecture, i.e., a computing core.
[0046] Optionally, assuming the height of the many-core 4D architecture is 2, the width is 3, the input depth is 4, and the output depth is 8, the full permutation of (2,3,4,8) can yield 24 first permutations, such as (2,3,4,8), (2,3,8,4)...(8,4,3,2).
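The enumeration in this example can be reproduced with a short Python sketch. The variable names are illustrative and are not taken from the application.

```python
from itertools import permutations

# Example sizes from the text: height, width, input depth, output depth.
dims = (2, 3, 4, 8)

# Full permutation of the four sizes; each tuple is one candidate
# "first permutation method".
first_arrangements = list(permutations(dims))

print(len(first_arrangements))   # 24
print(first_arrangements[0])     # (2, 3, 4, 8)
print(first_arrangements[-1])    # (8, 4, 3, 2)
```

Because only four values are permuted, the outer search never exceeds 4! = 24 candidates regardless of how many computing cores the architecture contains.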
[0047] Step 204: For each first arrangement method, determine the interval factor of each computing core in the many-core 4D architecture under the first arrangement method when the many-core 4D architecture is physically mapped to the many-core 2D architecture.
[0048] The interval factor of a computing core may include, but is not limited to, the cross-core interval factor for row-overlapping data, the cross-core interval factor for column-overlapping data, the cross-core interval factor for partial kernel data, the cross-core interval factor for dimension-adjusted data, and the cross-core interval factor for core cluster data. This embodiment does not limit this.
[0049] Optionally, the interval factor of each computing core in the many-core 4D architecture under the first arrangement can be determined based on the height, width, input depth, output depth corresponding to the first arrangement, the base coordinates of each computing core in the many-core 4D architecture under the first arrangement, the interval factor calculation formula, and the preset calculation principles. The preset calculation principles here include, but are not limited to, the unique sequence number principle and the unique remainder principle, which are not limited in this embodiment.
[0050] Step 206: Determine the routing plan for the many-core 2D architecture based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement.
[0051] Optionally, the interval factor of each computing core in the many-core 4D architecture under the first arrangement is substituted into the objective function for calculation to obtain the physical routing time under the first arrangement. Then, the routing plan of the many-core 2D architecture is determined based on the physical routing time under the first arrangement.
[0052] In the aforementioned routing planning method, the height, width, input depth, and output depth of the many-core 4D architecture are permuted to obtain various first permutation methods. For each first permutation method, the interval factor of each computing core in the many-core 4D architecture under the first permutation method is determined when the many-core 4D architecture is physically mapped to the many-core 2D architecture. Based on the interval factor of each computing core in the many-core 4D architecture under the first permutation method and the objective function, the routing plan for the many-core 2D architecture is determined. The routing plan for the many-core 2D architecture is obtained based on the interval factor of each computing core in the many-core 4D architecture under the first permutation method, which allows for a better routing plan with lower computational complexity, improving the efficiency and reliability of inter-core routing planning.
[0053] In one embodiment, the process of determining the interval factor of each computing core in the many-core 4D architecture under the first arrangement, when the many-core 4D architecture is physically mapped to the many-core 2D architecture, is shown in Figure 4 and includes:
[0054] Step 402: Based on the base coordinates of each computing core in the many-core 4D architecture under the first arrangement method, and the height, width, input depth, and output depth corresponding to the first arrangement method, determine the sequence number of each computing core in the many-core 4D architecture under the first arrangement method when the many-core 4D architecture is physically mapped to the many-core 2D architecture.
[0055] Optionally, if it is determined that the many-core 4D architecture will be physically mapped to the many-core 2D architecture, the many-core 4D architecture needs to be planned and deployed in stages. The staged planning can be divided into 5 stages: splitting in the height direction, splitting in the width direction, splitting in the input depth direction, splitting in the output depth direction, and pipeline planning between core clusters of computing cores. These 5 stages are not concurrent, and the execution order of each stage can be arbitrarily adjusted.
[0056] When unfolding the many-core 4D architecture, there are two unfolding methods, as shown in Figure 5. I, J, M, and N represent the height, width, input depth, and output depth of the many-core 4D architecture, respectively, and the size of the unfolded plane is $(\Delta_{IA}\ \max(x_{core}, y_{core}))$, where $\Delta_{IA}$ is the cross-core interval factor for core cluster data, $x_{core}$ is the number of computing cores in the x direction, and $y_{core}$ is the number of computing cores in the y direction. The arrows in Figure 5 indicate the arrangement direction of the computing cores. In the first unfolding method, shown on the left of Figure 5, the unfolding is done first in the Y direction and then in the X direction. That is, the computing cores in the many-core 4D architecture are arranged first in the Y direction and then in the X direction. In the second unfolding method, shown on the right of Figure 5, the unfolding is done first in the X direction and then in the Y direction. That is, the computing cores in the many-core 4D architecture are arranged first in the X direction and then in the Y direction.
[0057] Assuming that the height, width, input depth, and output depth corresponding to the first arrangement are I, J, M, and N respectively, and the base coordinates of the computing cores in the many-core 4D architecture under the first arrangement are (i,j,m,n), taking the execution order of splitting in the input depth direction, splitting in the output depth direction, splitting in the width direction, and splitting in the height direction, and taking X-first and then Y-second as an example, the formula for calculating the sequence number of each computing core in the many-core 4D architecture under the first arrangement is shown in formula (1).
[0058] $\mathrm{core\_num} = m + M \cdot n + j \cdot MN + i \cdot JMN = x \cdot X + y \cdot Y \quad (1)$
[0059] In formula (1), core_num is the index of the computing core in the many-core 4D architecture under the first arrangement, and i, j, m, and n are the base coordinates of the computing core in the height direction, width direction, input depth direction, and output depth direction, respectively. x is the base coordinate of the computing core in the X direction, y is the base coordinate of the computing core in the Y direction, X is the width of the many-core 2D architecture, and Y is the height of the many-core 2D architecture.
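The left-hand side of formula (1) can be checked with a minimal Python sketch. It assumes zero-based base coordinates (0 ≤ i < I, and likewise for j, m, n), which the text does not state explicitly. The function name `core_num` mirrors the symbol in the formula, and the example sizes are the ones used elsewhere in the description.

```python
from itertools import product

def core_num(i, j, m, n, J, M, N):
    """Sequence number per formula (1): m + M*n + j*M*N + i*J*M*N."""
    return m + M * n + j * M * N + i * J * M * N

# Example sizes (height, width, input depth, output depth).
I, J, M, N = 2, 3, 4, 8

numbers = [
    core_num(i, j, m, n, J, M, N)
    for i, j, m, n in product(range(I), range(J), range(M), range(N))
]

# Under the zero-based assumption every core gets a distinct number
# in 0 .. I*J*M*N - 1, consistent with the unique sequence number principle.
assert sorted(numbers) == list(range(I * J * M * N))

print(core_num(1, 2, 3, 7, J, M, N))  # 3 + 4*7 + 2*32 + 1*96 = 191
```

The sketch deliberately leaves out the right-hand side of formula (1), since the relation between the sequence number and the 2D coordinates (x, y) depends on the unfolding method chosen.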
[0060] Step 404: Based on the principles of unique serial number and unique remainder, determine the interval factor of each computing core in the many-core 4D architecture under the first arrangement method according to the serial number and interval factor calculation formula of each computing core in the many-core 4D architecture under the first arrangement method.
[0061] Optionally, assuming that the height, width, input depth, and output depth corresponding to the first arrangement are I, J, M, and N respectively, then, based on the unique sequence number principle and the unique remainder principle, the interval factors of the computing cores in the many-core 4D architecture under the first arrangement are obtained from the sequence number of each computing core, as shown in formulas (2), (3), (4), (5), and (6).
[0062] $\Delta_I = 1 \quad (2)$
[0063] $\Delta_J = I \quad (3)$
[0064] $\Delta_M = IJ \quad (4)$
[0065] $\Delta_N = IJM \quad (5)$
[0066] (6)
[0067] In formula (6), $x_{core}$ is the number of computing cores in the x direction, and $y_{core}$ is the number of computing cores in the y direction.
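As a worked illustration of formulas (2) to (5), the sketch below evaluates the four interval factors for the example sizes I=2, J=3, M=4. The function name and the dictionary layout are hypothetical. The cluster factor of formula (6) is omitted because that formula is not reproduced in the text.

```python
def interval_factors(I, J, M):
    """Interval factors per formulas (2)-(5) for an arrangement (I, J, M, N)."""
    return {
        "I": 1,          # formula (2)
        "J": I,          # formula (3)
        "M": I * J,      # formula (4)
        "N": I * J * M,  # formula (5)
    }

factors = interval_factors(2, 3, 4)
print(factors)  # {'I': 1, 'J': 2, 'M': 6, 'N': 24}
```

Each factor is the product of the sizes that precede its dimension, so the factors change whenever the first arrangement reorders the four sizes.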
[0068] In this embodiment, based on the base coordinates of each computing core in the many-core 4D architecture under the first arrangement, and the height, width, input depth, and output depth corresponding to the first arrangement, the sequence number of each computing core in the many-core 4D architecture under the first arrangement is determined when the many-core 4D architecture is physically mapped to the many-core 2D architecture. Based on the principles of unique sequence number and unique remainder, the interval factor of each computing core in the many-core 4D architecture under the first arrangement is determined according to the sequence number and the interval factor calculation formula. The interval factor is obtained based on the principles of unique sequence number and unique remainder, and the sequence number and interval factor calculation formula of each computing core in the many-core 4D architecture under the first arrangement; this method is relatively universal and simplifies the calculation.
[0069] In one embodiment, the process of determining the routing plan for the many-core 2D architecture based on the interval factor of each computing core in the many-core 4D architecture under the first arrangement and the objective function is shown in Figure 6 and includes:
[0070] Step 602: Divide each computing core in the many-core 4D architecture under the first arrangement into multiple sub-parts.
[0071] Optionally, assume that the height, width, input depth, and output depth corresponding to the first arrangement are I, J, M, and N, respectively, where I=2, J=3, M=4, and N=8. When physically mapping the many-core 4D architecture to a many-core 2D architecture, arranging first from the I direction and then from the M direction, it is necessary to divide I×M computing cores into one sub-part, i.e., every 8 computing cores form one sub-part. Based on this division rule, each computing core in the many-core 4D architecture under this first arrangement is divided into multiple sub-parts.
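The division rule in this example can be sketched as follows. The sketch assumes that a sub-part consists of consecutive sequence numbers, which the text does not state. The helper `split_into_subparts` is a hypothetical name.

```python
def split_into_subparts(core_numbers, size):
    """Group core sequence numbers into consecutive chunks of `size` cores."""
    return [core_numbers[k:k + size] for k in range(0, len(core_numbers), size)]

# Example sizes from the text: I=2, J=3, M=4, N=8, i.e. 192 computing cores.
I, J, M, N = 2, 3, 4, 8
cores = list(range(I * J * M * N))

# I*M = 8 computing cores per sub-part, as in the example.
subparts = split_into_subparts(cores, I * M)

print(len(subparts), len(subparts[0]))  # 24 8
```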
[0072] Step 604: For each sub-part, determine the minimum physical routing time for each sub-part based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement method.
[0073] The interval factor of a computing core includes the cross-core interval factor for row-overlapping data, the cross-core interval factor for column-overlapping data, the cross-core interval factor for partial kernel data, the cross-core interval factor for dimension-adjusted data, and the cross-core interval factor for core cluster data.
[0074] Optionally, for a certain sub-part, the interval factor of each computing core in the sub-part is substituted into the objective function for calculation to obtain the physical routing time of the sub-part. The objective function is shown in formula (7).
[0075] $t_{add\_2} = k \cdot (\Delta_M S_{p\_add} + \Delta_N S_{res\_add} + \Delta_I S_{row\_add} + \Delta_J S_{col\_add} + \Delta_{IA} S_{IA}) \quad (7)$
[0076] In formula (7), $\Delta_I$ is the cross-core interval factor for row-overlapping data, $\Delta_J$ is the cross-core interval factor for column-overlapping data, $\Delta_M$ is the cross-core interval factor for partial kernel data, $\Delta_N$ is the cross-core interval factor for dimension-adjusted data, and $\Delta_{IA}$ is the cross-core interval factor for core cluster data. $S_{p\_add}$ is the newly added partial-sum data, $S_{res\_add}$ is the additional data for graph reshaping, $S_{row\_add}$ is the row-overlapping data, $S_{col\_add}$ is the column-overlapping data, $S_{IA}$ is the data of the computing core cluster, k is a coefficient, and $t_{add\_2}$ is the physical routing time of the computing cores.
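Formula (7) is a weighted sum, which the following Python sketch evaluates directly. The dictionary keys, the coefficient, the data volumes, and in particular the value used for the cluster factor are made-up inputs for illustration only. The text gives no numeric values, and formula (6) is not reproduced.

```python
def physical_routing_time(k, delta, s):
    """Objective function per formula (7).

    delta: interval factors keyed by "I", "J", "M", "N", "IA".
    s:     data volumes keyed by "p_add", "res_add", "row_add", "col_add", "IA".
    """
    # Pairing of interval factor and data term, in the order of formula (7).
    terms = [("M", "p_add"), ("N", "res_add"), ("I", "row_add"),
             ("J", "col_add"), ("IA", "IA")]
    return k * sum(delta[d] * s[v] for d, v in terms)

# Hypothetical inputs: the "IA" factor and all data volumes are assumptions.
delta = {"I": 1, "J": 2, "M": 6, "N": 24, "IA": 4}
s = {"p_add": 1, "res_add": 1, "row_add": 1, "col_add": 1, "IA": 1}

print(physical_routing_time(0.5, delta, s))  # 0.5 * (6 + 24 + 1 + 2 + 4) = 18.5
```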
[0077] Optionally, the computational cores in the sub-part are arranged, and the physical routing time of the sub-part under different arrangements is obtained according to formula (7). Then, the minimum physical routing time of the sub-part is determined according to the physical routing time of the sub-part under different arrangements.
[0078] Step 606: Determine the minimum physical routing time under the first arrangement based on the minimum physical routing time of each sub-part.
[0079] Optionally, the minimum physical routing time of each sub-part is weighted and summed, and the result of the weighted sum is used as the minimum physical routing time under the first arrangement.
[0080] Step 608: Determine the routing plan for the many-core 2D architecture based on the minimum physical routing time under each first arrangement method.
[0081] Optionally, let the minimum physical routing times under the first permutation methods be $t_1, t_2, \ldots, t_n$, respectively. The routing plan for the many-core 2D architecture can be determined based on the first permutation corresponding to the maximum value among $t_1, t_2, \ldots, t_n$. Alternatively, it can be determined based on the first permutation corresponding to the minimum value among $t_1, t_2, \ldots, t_n$.
[0082] In this embodiment, the computing cores in the many-core 4D architecture under the first arrangement are divided into multiple sub-parts. For each sub-part, the minimum physical routing time is determined based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement. Based on the minimum physical routing time of each sub-part, the minimum physical routing time under the first arrangement is determined. Based on the minimum physical routing time under each of the first arrangement methods, the routing plan for the many-core 2D architecture is determined. The routing plan for the many-core 2D architecture is obtained based on the minimum physical routing time under the first arrangement method, which better considers the physical routing time between cores, thus achieving a more reasonable inter-core routing plan from the dimension of physical routing time.
[0083] In one embodiment, the minimum physical routing time for each sub-part is determined based on the interval factor of each computing core in the many-core 4D architecture under the first arrangement and the objective function, including:
[0084] The computing cores of each sub-part are fully permuted to obtain the computing cores under different second permutation methods;
[0085] The minimum physical routing time for each sub-part is determined based on the interval factors corresponding to the computing cores under the different second permutation methods and the objective function.
[0086] Optionally, assuming a certain sub-part has 8 computing cores, the 8 computing cores are fully permuted to obtain the computing cores under different second permutation methods. The interval factors corresponding to the computing cores under each second permutation method are substituted into formula (7) to obtain the physical routing time under that second permutation method. The smallest of these physical routing times is taken as the minimum physical routing time of the sub-part.
[0087] In this embodiment, the computing cores of each sub-part are fully permuted to obtain the computing cores under different second permutation methods. Based on the interval factors corresponding to the computing cores under the different second permutation methods and the objective function, the minimum physical routing time of each sub-part is determined. Considering the permutation method of each sub-part from the perspective of physical routing time allows a reasonable arrangement of the computing cores of each sub-part.
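The exhaustive search over second permutation methods can be sketched as below. The text does not specify how the interval factors change with the ordering of cores inside a sub-part, so the cost function `toy_time` is a purely hypothetical stand-in for formula (7). Only the brute-force minimum over all orderings reflects the described step.

```python
from itertools import permutations

def min_routing_time(subpart, routing_time):
    """Evaluate every ordering of the cores in a sub-part and keep the cheapest."""
    best_order = min(permutations(subpart), key=routing_time)
    return routing_time(best_order), best_order

def toy_time(order):
    # Hypothetical cost (NOT formula (7)): total gap between the sequence
    # numbers of neighbouring cores in the ordering.
    return sum(abs(a - b) for a, b in zip(order, order[1:]))

best_time, best_order = min_routing_time((0, 4, 1, 3), toy_time)
print(best_time, best_order)  # 4 (0, 1, 3, 4)
```

A sub-part of 8 computing cores has 8! = 40320 orderings, so the search stays small as long as sub-parts stay small.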
[0088] In one embodiment, determining the minimum physical routing time for each first permutation based on the minimum physical routing time for each sub-part includes:
[0089] The minimum physical routing times of each sub-part are summed to obtain the minimum physical routing time for the first arrangement.
[0090] Optionally, assume that the minimum physical routing times of the sub-parts under the first permutation are $T_1, T_2, \ldots, T_n$. $T_1, T_2, \ldots, T_n$ are summed, and the result of the summation is used as the minimum physical routing time for the first arrangement.
[0091] In this embodiment, the minimum physical routing times of each sub-part are summed to obtain the minimum physical routing time for the first arrangement. Considering the first arrangement from the perspective of physical routing time allows for a reasonable arrangement of computational cores under the first arrangement.
[0092] In one embodiment, the routing plan for the many-core 2D architecture is determined based on the minimum physical routing time under each first permutation, including:
[0093] Routing planning for the many-core 2D architecture is performed based on the first arrangement method corresponding to the smallest of the minimum physical routing times under the first arrangement methods.
[0094] Optionally, assume that the minimum physical routing times under the first permutation methods are $t_1, t_2, \ldots, t_n$. If the minimum value among $t_1, t_2, \ldots, t_n$ is $t_2$, then routing planning for the many-core 2D architecture is performed based on the first permutation method corresponding to $t_2$.
[0095] In this embodiment, routing planning is performed on the many-core 2D architecture based on the first arrangement method corresponding to the minimum physical routing time among the first arrangement methods. The routing planning for the many-core 2D architecture is obtained based on the minimum physical routing time under each of the first arrangement methods, achieving a more reasonable inter-core routing plan from the perspective of physical routing time.
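The summation over sub-parts and the final selection can be combined in a short Python sketch. All routing times below are made-up numbers chosen only to show the mechanics, and the names are illustrative.

```python
def best_first_arrangement(subpart_minima):
    """Sum the sub-part minima per arrangement, then pick the smallest total."""
    totals = {arr: sum(times) for arr, times in subpart_minima.items()}
    best = min(totals, key=totals.get)
    return best, totals

# Hypothetical minimum physical routing times of three sub-parts
# for three of the 24 first arrangements.
subpart_minima = {
    (2, 3, 4, 8): [4.0, 5.0, 3.0],
    (2, 3, 8, 4): [3.5, 3.0, 3.0],
    (8, 4, 3, 2): [4.0, 4.0, 3.0],
}

best, totals = best_first_arrangement(subpart_minima)
print(best)          # (2, 3, 8, 4)
print(totals[best])  # 9.5
```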
[0096] In one exemplary embodiment, a routing planning method is provided. Its process is shown in Figure 7 and includes:
[0097] Step 701: Perform a full permutation of the height, width, input depth, and output depth of the many-core 4D architecture to obtain each first permutation method.
[0098] Step 702: For each first arrangement, based on the base coordinates of each computing core in the many-core 4D architecture under the first arrangement, and the height, width, input depth, and output depth corresponding to the first arrangement, determine the sequence number of each computing core in the many-core 4D architecture under the first arrangement when the many-core 4D architecture is physically mapped to the many-core 2D architecture.
[0099] Step 703: Based on the principles of unique serial number and unique remainder, determine the interval factor of each computing core in the many-core 4D architecture under the first arrangement method according to the serial number and interval factor calculation formula of each computing core in the many-core 4D architecture under the first arrangement method.
[0100] Step 704: Divide each computing core in the many-core 4D architecture under the first arrangement into multiple sub-parts.
[0101] Step 705: Perform a full permutation of the computing cores of each sub-part to obtain the computing cores under different second permutation methods.
[0102] Step 706: Determine the minimum physical routing time for each sub-part based on the interval factors corresponding to the computing cores under the different second permutation methods and the objective function.
[0103] Step 707: Sum the minimum physical routing times of each sub-part to obtain the minimum physical routing time for the first arrangement.
[0104] Step 708: Perform routing planning for the many-core 2D architecture based on the first arrangement method corresponding to the minimum physical routing time among the minimum values of each first arrangement method.
[0105] In the routing planning method described above, the routing plan of the many-core 2D architecture is obtained based on the objective function and the interval factor of each computing core in the many-core 4D architecture under the first arrangement. A better routing plan can thus be obtained with lower computational complexity, which improves the efficiency and reliability of inter-core routing planning.
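The flow of steps 701 to 708 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the interval factor computation of steps 702 and 703, the division into sub-parts, and the objective function are passed in as callables because their formulas are not reproduced in this passage, and the names `plan_routing`, `min_subpart_time`, `toy_factors`, `toy_split`, and `toy_objective` are hypothetical.

```python
from itertools import permutations

DIMS = ("height", "width", "input_depth", "output_depth")


def min_subpart_time(subpart, factors, objective):
    """Steps 705-706: fully permute one sub-part, keep the smallest objective value."""
    return min(
        objective([factors[core] for core in order])
        for order in permutations(subpart)
    )


def plan_routing(interval_factors_for, split, objective):
    """Steps 701-708: return (best first arrangement, its minimum routing time)."""
    best = None
    for arrangement in permutations(DIMS):           # step 701: 4! = 24 orderings
        factors = interval_factors_for(arrangement)  # steps 702-703 (caller-supplied)
        subparts = split(sorted(factors))            # step 704
        total = sum(                                 # step 707
            min_subpart_time(part, factors, objective) for part in subparts
        )
        if best is None or total < best[1]:          # step 708
            best = (arrangement, total)
    return best


# Toy stand-ins, assumed for illustration only (not the formulas of this application).
def toy_factors(arrangement):
    scale = DIMS.index(arrangement[0]) + 1
    return {core: (core * scale) % 5 for core in range(6)}


def toy_split(cores, size=3):
    return [cores[i:i + size] for i in range(0, len(cores), size)]


def toy_objective(ordered_factors):
    return sum(abs(a - b) for a, b in zip(ordered_factors, ordered_factors[1:]))


best_arrangement, best_time = plan_routing(toy_factors, toy_split, toy_objective)
```

The outer loop visits every first arrangement once, and the inner search only permutes the cores inside one sub-part at a time, which mirrors the per-sub-part minimum and the summation described in steps 706 and 707.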
[0106] It should be understood that although the steps in the flowcharts of the above embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the above embodiments may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.
[0107] Based on the same inventive concept, this application also provides a routing planning apparatus for implementing the routing planning method described above. The solution provided by this apparatus is similar to the implementation described in the above method; therefore, the specific limitations in one or more routing planning apparatus embodiments provided below can be found in the limitations of the routing planning method described above, and will not be repeated here.
[0108] In one embodiment, as shown in Figure 8, a route planning device 800 is provided, including: an arrangement module 820, a calculation module 840, and a planning module 860, wherein:
[0109] The arrangement module 820 is used to perform a full permutation of the height, width, input depth, and output depth of the many-core 4D architecture to obtain each first arrangement method;
[0110] The calculation module 840 is used to determine, for each first arrangement method, the interval factor of each computing core in the many-core 4D architecture under the first arrangement method when the many-core 4D architecture is physically mapped to the many-core 2D architecture;
[0111] The planning module 860 is used to determine the routing plan for the many-core 2D architecture based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement.
[0112] In one embodiment, the calculation module 840 is further configured to: determine the serial number of each computing core in the many-core 4D architecture under the first arrangement method, based on the base coordinates of each computing core in the many-core 4D architecture under the first arrangement method, and the height, width, input depth, and output depth corresponding to the first arrangement method; and determine the interval factor of each computing core in the many-core 4D architecture under the first arrangement method, based on the unique serial number principle and the unique remainder principle, and according to the serial number of each computing core in the many-core 4D architecture under the first arrangement method and the interval factor calculation formula.
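As a hedged illustration of the serial number step, the sketch below enumerates the first arrangements and linearizes 4D core coordinates in the order given by one arrangement. The row-major (mixed-radix) indexing, the dictionary layout of the base coordinates, the example sizes, and the function names are all assumptions made for this example. The interval factor calculation formula and the unique remainder principle are not reproduced here.

```python
from itertools import permutations, product

DIMS = ("height", "width", "input_depth", "output_depth")


def first_arrangements():
    """All 4! = 24 orderings of the four dimension names."""
    return list(permutations(DIMS))


def serial_number(coord, sizes, arrangement):
    """Assumed mixed-radix index: the last dimension in the arrangement varies fastest."""
    index = 0
    for dim in arrangement:
        index = index * sizes[dim] + coord[dim]
    return index


# Hypothetical architecture sizes and base coordinates (one dict per computing core).
sizes = {"height": 2, "width": 2, "input_depth": 3, "output_depth": 2}
cores = [dict(zip(DIMS, c)) for c in product(*(range(sizes[d]) for d in DIMS))]

# Serial numbers of all cores under the first of the 24 arrangements.
numbers = [serial_number(core, sizes, first_arrangements()[0]) for core in cores]
```

Under this assumed indexing every core receives a distinct serial number for any arrangement, which is at least consistent with the unique serial number principle named above.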
[0113] In one embodiment, the planning module 860 is further configured to: divide the computing cores in the many-core 4D architecture under the first arrangement into multiple sub-parts; for each sub-part, determine the minimum physical routing time of the sub-part based on the objective function and the interval factor of each computing core in the many-core 4D architecture under the first arrangement; determine the minimum physical routing time under the first arrangement based on the minimum physical routing time of each sub-part; and determine the routing plan for the many-core 2D architecture based on the minimum physical routing time under each of the first arrangement methods.
[0114] In one embodiment, the planning module 860 is further configured to: perform a full permutation of the computing cores of each sub-part to obtain computing cores under different second permutation methods; and determine the minimum physical routing time of each sub-part based on the objective function and the interval factor corresponding to the computing cores under the different second permutation methods.
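A short count shows how restricting the full permutation to sub-parts limits the number of candidate orderings that must be evaluated. The core count and sub-part sizes below are illustrative assumptions, and the helper names are hypothetical.

```python
from math import factorial


def joint_orderings(n_cores):
    """Orderings to evaluate if all cores were permuted together."""
    return factorial(n_cores)


def split_orderings(subpart_sizes):
    """Orderings to evaluate when each sub-part is permuted on its own."""
    return sum(factorial(k) for k in subpart_sizes)


# Assumed example: 24 computing cores divided into six sub-parts of four cores.
per_arrangement = split_orderings([4] * 6)        # 6 * 4! = 144
all_arrangements = factorial(4) * per_arrangement  # 24 first arrangements -> 3456
joint = joint_orderings(24)                        # 24!, far larger than 3456
```

In this example the sub-part search evaluates 144 orderings per first arrangement and 3456 in total, whereas permuting all 24 cores together would require 24! orderings for a single arrangement.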
[0115] In one embodiment, the planning module 860 is further configured to: sum the minimum physical routing times of each sub-part to obtain the minimum physical routing time of the first arrangement.
[0116] In one embodiment, the planning module 860 is further configured to: perform routing planning for the many-core 2D architecture based on the first permutation corresponding to the minimum value among the minimum physical routing times under each first permutation.
[0117] Each module in the aforementioned routing planning device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the corresponding operations of each module.
[0118] In one embodiment, a computer device is provided, which may be a server, and whose internal structure may be as shown in Figure 9. The computer device includes a processor, memory, and a network interface connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The database stores data. The network interface communicates with external terminals via a network connection. When executed by the processor, the computer program implements a routing planning method.
[0119] In one embodiment, a computer device is provided, which may be a terminal, and whose internal structure may be as shown in Figure 10. The computer device includes a processor, memory, communication interface, display screen, and input devices connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The communication interface is used for wired or wireless communication with external terminals; wireless communication can be achieved through Wi-Fi, mobile cellular networks, NFC (Near Field Communication), or other technologies. When executed by the processor, the computer program implements a routing planning method. The display screen can be an LCD screen or an e-ink display. The input devices can be a touch layer covering the display screen, buttons, a trackball, or a touchpad mounted on the computer device casing, or an external keyboard, touchpad, or mouse.
[0120] Those skilled in the art will understand that the structures shown in Figure 9 and Figure 10 are merely block diagrams of the portions of the structure related to the present application and do not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figures, or combine certain components, or have different component arrangements.
[0121] In one embodiment, a computer device is provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to perform the following steps:
[0122] The height, width, input depth, and output depth of the many-core 4D architecture are fully permuted to obtain the first permutation method;
[0123] For each of the first permutation methods, the interval factor of each computing core in the many-core 4D architecture is determined when the many-core 4D architecture is physically mapped to the many-core 2D architecture.
[0124] Based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement, the routing plan for the many-core 2D architecture is determined.
[0125] In one embodiment, a computer device is also provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps in the above method embodiments.
[0126] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, the computer program performing the following steps when executed by a processor:
[0127] The height, width, input depth, and output depth of the many-core 4D architecture are fully permuted to obtain the first permutation method;
[0128] For each of the first permutation methods, the interval factor of each computing core in the many-core 4D architecture is determined when the many-core 4D architecture is physically mapped to the many-core 2D architecture.
[0129] Based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement, the routing plan for the many-core 2D architecture is determined.
[0130] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon that, when executed by a processor, implements the steps in the above method embodiments.
[0131] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, performs the following steps:
[0132] The height, width, input depth, and output depth of the many-core 4D architecture are fully permuted to obtain the first permutation method;
[0133] For each of the first permutation methods, the interval factor of each computing core in the many-core 4D architecture is determined when the many-core 4D architecture is physically mapped to the many-core 2D architecture.
[0134] Based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement, the routing plan for the many-core 2D architecture is determined.
[0135] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps in the above method embodiments.
[0136] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.
[0137] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0138] The above embodiments are merely illustrative of several implementation methods of this application, and their descriptions are relatively specific and detailed. However, they should not be construed as limiting the scope of this application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.
Claims
1. A route planning method, characterized in that the method includes: performing a full permutation of the height, width, input depth, and output depth of a many-core 4D architecture to obtain each first arrangement method; for each of the first arrangement methods, determining the interval factor of each computing core in the many-core 4D architecture under the first arrangement method when the many-core 4D architecture is physically mapped to a many-core 2D architecture; and determining the routing plan of the many-core 2D architecture based on an objective function and the interval factor of each computing core in the many-core 4D architecture under the first arrangement.
2. The method according to claim 1, characterized in that determining the interval factor of each computing core in the many-core 4D architecture under the first arrangement when the many-core 4D architecture is physically mapped to the many-core 2D architecture includes: determining, based on the base coordinates of each computing core in the many-core 4D architecture under the first arrangement, and the height, width, input depth, and output depth corresponding to the first arrangement, the serial number of each computing core in the many-core 4D architecture under the first arrangement when the many-core 4D architecture is physically mapped to the many-core 2D architecture; and determining, based on the unique serial number principle and the unique remainder principle, the interval factor of each computing core in the many-core 4D architecture under the first arrangement according to the serial number of each computing core and the interval factor calculation formula.
3. The method according to claim 1, characterized in that determining the routing plan of the many-core 2D architecture based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement includes: dividing the computing cores in the many-core 4D architecture under the first arrangement into multiple sub-parts; for each of the sub-parts, determining the minimum physical routing time of the sub-part based on the objective function and the interval factor of each computing core in the many-core 4D architecture under the first arrangement; determining the minimum physical routing time under the first arrangement based on the minimum physical routing time of each of the sub-parts; and determining the routing plan of the many-core 2D architecture based on the minimum physical routing time under each of the first arrangement methods.
4. The method according to claim 3, characterized in that determining the minimum physical routing time of each sub-part based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement includes: performing a full permutation of the computing cores of each of the sub-parts to obtain computing cores under different second permutation methods; and determining the minimum physical routing time of each sub-part based on the objective function and the interval factor corresponding to the computing cores under the different second permutation methods.
5. The method according to claim 3, characterized in that determining the minimum physical routing time under the first arrangement based on the minimum physical routing time of each of the sub-parts includes: summing the minimum physical routing times of the sub-parts to obtain the minimum physical routing time under the first arrangement.
6. The method according to claim 3, characterized in that determining the routing plan of the many-core 2D architecture based on the minimum physical routing time under each of the first arrangement methods includes: performing routing planning for the many-core 2D architecture based on the first arrangement corresponding to the minimum value among the minimum physical routing times under the first arrangement methods.
7. A route planning device, characterized in that the device includes: an arrangement module, used to perform a full permutation of the height, width, input depth, and output depth of a many-core 4D architecture to obtain each first arrangement method; a calculation module, used to determine, for each of the first arrangement methods, the interval factor of each computing core in the many-core 4D architecture when the many-core 4D architecture is physically mapped to a many-core 2D architecture; and a planning module, used to determine the routing plan of the many-core 2D architecture based on the interval factor and objective function of each computing core in the many-core 4D architecture under the first arrangement.
8. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 6.
9. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 6.
10. A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 6.