In-memory computing system and method, and server
By partitioning computing and storage resources, embedding different computing and storage logic in each region, and dynamically allocating resources through a partition mapping module, the invention addresses PCIe interface bandwidth congestion and the limitations of the SRIOV mechanism, achieving efficient management of computing and storage resources.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- XI AN UNIIC SEMICON CO LTD
- Filing Date
- 2025-11-11
- Publication Date
- 2026-07-02
Abstract
Description
An in-memory computing system, method and server
[0001] Cross-references to related applications
[0002] This application claims priority to Chinese Patent Application No. 202411895937.X, filed on December 23, 2024, the entire contents of which are incorporated herein by reference.
Technical Field
[0003] This invention relates to the field of data processing technology, and in particular to an in-memory computing system, method and server.
Background Technology
[0004] Data centers used for high-performance computing and big data analytics require a large number of high-performance computing and storage devices with virtualization and sharing capabilities. As shown in Figure 1, existing technologies typically use the PCIe (Peripheral Component Interconnect Express) interface to implement traditional storage and acceleration devices, and combine it with the SRIOV (Single Root I/O Virtualization) mechanism to achieve virtualization, partitioning storage and computing resources and allocating them to different CPU (Central Processing Unit) cores.
[0005] Because PCIe is a non-cache-coherent interface, the computation process requires frequent data exchange between acceleration devices, main memory, and storage devices via DMA (Direct Memory Access), causing bandwidth congestion. Furthermore, the SRIOV mechanism struggles to achieve dynamic online resource allocation and cannot achieve resource sharing across CPUs.
Summary of the Invention
[0006] In view of the above problems, the present invention provides an in-memory computing system, method, server, and in-memory computing network. By partitioning computing and storage resources and embedding different in-memory computing logic in each region, on-demand in-memory computing is achieved. Simultaneously, the required access regions are dynamically allocated and switched according to the target software application scenario, improving the flexibility and scalability of computing and storage, and reducing the overhead and resource idle rate caused by data movement.
[0007] According to a first aspect of the present invention, an in-memory computing system is provided, comprising:
[0008] Multiple interfaces;
[0009] A partition mapping module is connected to the interface;
[0010] A computing memory module is connected to the partition mapping module, and the computing memory module includes multiple computing units and multiple memory units;
[0011] Multiple storage modules are connected to the computing memory module;
[0012] The partition mapping module is configured with interface path information. Based on the interface path information, the partition mapping module determines the target interface to be routed from the plurality of interfaces, and allocates corresponding computing units and storage modules to process the data requests sent by the target interface. The target interface is used to send data requests of the target software application corresponding to the interface path information.
[0013] Optionally, the partition mapping module includes multiple partition mapping units, and at least one partition mapping unit serves as the target mapping unit for configuring interface path information.
[0014] Optionally, the interface path information includes interface parameters and path parameters. The interface parameters are used to determine the target interface that the target mapping unit needs to route to; the path parameters are used to determine the computing unit and storage module allocated for processing the data request issued by the target interface.
[0015] Optionally, the target mapping unit is a partition mapping unit that is hit by the interface path information.
[0016] Optionally, an external scheduler determines the corresponding interface path information based on the data request of the target software application and sends it to the partition mapping module through a dynamic configuration path.
[0017] Optionally, the data requests issued by each target interface are data requests from the same target software application.
[0018] Optionally, the partition mapping module may also allocate corresponding computing units and storage modules to process data requests issued by each target interface according to the interface path information.
[0019] Optionally, the partition mapping module determines the computing unit allocated to the data sent by each target interface from the currently idle computing units.
[0020] According to a second aspect of the present invention, an in-memory computing method is provided for the aforementioned in-memory computing system, the method comprising:
[0021] The partition mapping module determines the target interface that needs to be routed from multiple interfaces based on the configured interface path information, and allocates corresponding computing units and storage modules to the data requests of the target software application issued by the target interface.
[0022] The target interface sends a data request for the target software application to the partition mapping module, and the data request includes an access address.
[0023] The partition mapping module allocates the data request to the corresponding computing unit and storage module for processing.
[0024] Optionally, the partition mapping module includes multiple partition mapping units, and at least one partition mapping unit serves as a target mapping unit for configuring interface path information; the interface path information includes interface parameters and path parameters.
[0025] The partition mapping module determines the target interface to be routed from multiple interfaces based on the configured interface path information, and allocates corresponding computing units and storage modules to the data requests of the target software application issued by the target interface, including:
[0026] The target mapping unit determines the target interface to be routed from multiple interfaces based on the interface parameters.
[0027] The target mapping unit allocates a computing unit and a storage module to each target interface for processing the data request, based on the path parameters.
[0028] Optionally, the path parameters are determined by the currently released computing unit and the access address of the data request from the target software application.
[0029] Optionally, the method further includes:
[0030] After all data request processing for the target software application is completed, the corresponding computing unit and storage module are released.
[0031] According to a third aspect of the present invention, a server is provided, including the aforementioned in-memory computing system.
[0032] According to a fourth aspect of the present invention, an in-memory computing network is provided, comprising a plurality of the aforementioned in-memory computing systems and a switch, wherein the switch connects the plurality of in-memory computing systems.
[0033] The above-described one or more technical solutions in the embodiments of this specification have at least the following technical effects:
[0034] This specification provides an in-memory computing system, method, server, and in-memory computing network. The in-memory computing system includes multiple interfaces, a partition mapping module, a computing memory module, and multiple storage modules. The partition mapping module is configured with interface path information. Based on the interface path information, the partition mapping module determines the target interface to be routed from the multiple interfaces and allocates corresponding computing units and storage modules to process data requests issued by the target interface. The target interface is used to issue data requests from a target software application corresponding to the interface path information. Thus, by partitioning computing and storage resources and embedding different in-memory computing logic in each region, on-demand in-memory computing is achieved. Simultaneously, dynamically allocating and switching the required access regions according to the target software application scenario improves the flexibility and scalability of computing and storage, and reduces the overhead and resource idle rate caused by data movement.
[0035] The above description is merely an overview of the technical solution of the present invention. In order to better understand the technical means of the present invention and to implement it in accordance with the contents of the specification, and in order to make the above and other objects, features and advantages of the present invention more apparent and understandable, specific embodiments of the present invention are described below.
Attached Figure Description
[0036] Various other advantages and benefits will become apparent to those skilled in the art upon reading the following detailed description of preferred embodiments. The accompanying drawings are for illustrative purposes only and are not intended to limit the invention. Furthermore, the same reference numerals denote the same parts throughout the drawings. In the drawings:
[0037] Figure 1 shows a PCIe-based system architecture diagram in the prior art.
[0038] Figure 2 shows an architecture diagram of an in-memory computing system according to an embodiment of the present invention.
[0039] Figure 3 shows a flowchart of an in-memory computing method according to an embodiment of the present invention.
[0040] Figure 4 shows an architecture diagram of a server according to an embodiment of the present invention.
[0041] Figure 5 shows an architecture diagram of an in-memory computing network according to an embodiment of the present invention.
[0042] Legend:
[0043] A-Interface, B-Data Path, C-Partition Mapping Module, D-Computing Memory Module, E-Memory Unit, G-Computing Unit, H-Dynamic Configuration Path, F1-Storage Module, F2-Storage Module, F3-Storage Module, Server 40, In-Memory Computing System 41, In-Memory Computing Network 50, Switch 51.
Embodiments of the present invention
[0044] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. The components of the embodiments of the present invention described and shown in the accompanying drawings can generally be arranged and designed in various different configurations.
[0045] Therefore, the following detailed description of the embodiments of the invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely to illustrate selected embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort are within the scope of protection of the invention.
[0046] It should be noted that similar labels and letters in the following figures indicate similar items. Therefore, once an item is defined in one figure, it does not need to be further defined and explained in subsequent figures.
[0047] In the description of this invention, it should also be noted that, unless otherwise explicitly specified and limited, the terms "set," "install," "connect," and "link" should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral connection; they can refer to a mechanical connection or an electrical connection; they can refer to a direct connection or an indirect connection through an intermediate medium; and they can refer to the internal connection of two components. Those skilled in the art can understand the specific meaning of the above terms in this invention based on the specific circumstances.
[0048] Referring to Figure 2, this embodiment of the invention provides an in-memory computing system. The in-memory computing system in Figure 2 includes multiple interfaces A, a partition mapping module C, a computing memory module D, and multiple storage modules F1, F2, and F3.
[0049] In this embodiment, partition mapping module C is connected to multiple interfaces A; as shown in Figure 2, partition mapping module C is connected to multiple interfaces A via data path B. Computational memory module D is connected to partition mapping module C; as shown in Figure 2, computational memory module D is connected to partition mapping module C via data path B. Computational memory module D is also connected to multiple storage modules F1, F2, and F3 (which can also be referred to as storage media).
[0050] The computing memory module D includes multiple computing units G and multiple memory units E. Memory units E refer to on-chip memory, cache, lookup tables, and other hardware resources used to provide fast memory access and temporary storage during computation, achieving high-performance computing. Computing units G refer to the operator logic used for computation and the control logic for accessing on-chip memory units E or storage modules. The operator logic can be custom-designed hardware logic or general-purpose computing IP, and the computational operations it can implement include searching, comparing, sorting, encryption / decryption, decompression, and general matrix operations. The partition mapping module C provides flexible data flow routing management functions. The management process can include: an external scheduler configuring the partition mapping module C through dynamic configuration path H, distributing data requests from different interfaces A to the same (or different) computing units and storage modules.
[0051] For example, multiple unrelated software applications send data requests through multiple servers respectively. For the in-memory computing system of this embodiment, this is equivalent to traffic from interface A1 and traffic from interface A2 being routed to different computing units G and storage modules.
[0052] For example, when the same target software application runs on multiple computing nodes, for the in-memory computing system of this embodiment, traffic from interface A1 and traffic from interface A2 are considered to be from the same source and will be routed to the same computing unit G and storage module.
[0053] In detail, the external scheduler determines the corresponding interface path information based on the data request from the target software application. This interface path information is sent to the partition mapping module C via the dynamic configuration path H. The interface path information is used to configure the partition mapping module C. The configuration mainly includes two parts: configuring the interface and configuring the computing and storage paths.
[0054] Based on the interface path information, the partition mapping module C determines the target interface that needs to be routed from multiple interfaces, and allocates the corresponding computing units and storage modules to process the data requests sent by the target interface.
[0055] The target interface is used to send data requests from the target software application corresponding to the interface path information.
[0056] In this embodiment, there can be one or more target interfaces. It should be noted that each target interface is used to send data requests from the same target software application. For the partition mapping module C, the data streams sent by the aforementioned target interfaces belong to the same source.
[0057] The partition mapping module C also allocates corresponding computing units G and storage modules to process data requests issued by each target interface based on the interface path information. For target interfaces from the same source (i.e., the same target software application), the allocated computing units G and storage modules are consistent.
[0058] In this embodiment, there are multiple computing units G. When determining interface path information, some computing units G may be idle, while others may be active. An idle computing unit G indicates that there are currently no computing tasks to process, while an active computing unit G indicates that it is currently occupied and has unfinished computing tasks to process. Once the computing tasks of computing unit G are completed, computing unit G will be released, and then computing unit G will be idle.
[0059] Therefore, in this embodiment, the computing unit G that the partition mapping module C allocates, according to the interface path information, to the data requests sent by each target interface is selected from the currently idle computing units G.
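The idle/active bookkeeping described above can be sketched as a small pool. This is a minimal illustration under stated assumptions, not the claimed hardware: the class name `ComputePool`, the unit names G1-G3, the application labels, and the first-idle selection order are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ComputePool:
    """Tracks which computing units are idle and which are occupied."""

    idle: list[str] = field(default_factory=lambda: ["G1", "G2", "G3"])
    busy: dict[str, str] = field(default_factory=dict)  # unit -> application

    def allocate(self, app: str) -> str:
        # Only currently idle units are candidates for a new application.
        if not self.idle:
            raise RuntimeError("no idle computing unit available")
        unit = self.idle.pop(0)  # assumption: take the first idle unit
        self.busy[unit] = app
        return unit

    def release(self, unit: str) -> None:
        # Called once the unit's computing tasks have completed.
        del self.busy[unit]
        self.idle.append(unit)


pool = ComputePool()
first = pool.allocate("app_a")
second = pool.allocate("app_b")
pool.release(first)  # the released unit becomes idle again
```

After the release, the unit held by `app_a` is back in the idle list and can be assigned to the next application that is configured.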
[0060] When processing data requests from a target software application, the in-memory computing system may simultaneously process data requests from other software applications. Therefore, in this embodiment, the partition mapping module C may include multiple partition mapping units, and at least one partition mapping unit serves as the target mapping unit for configuring interface path information. The interface path information includes interface parameters and path parameters, wherein the interface parameters are used to determine the target interface that the target mapping unit needs to route to; and the path parameters are used to determine the computing units and storage modules allocated for processing the data requests issued by the target interface.
[0061] It should be noted that the target mapping unit is the partition mapping unit hit by the interface path information. When configuring for the target software application, the external scheduler selects the target mapping unit from among the partition mapping units that have not been hit by other software applications.
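The selection of a target mapping unit by the external scheduler can be sketched as follows. The table layout, the unit names C1-C3, and the first-free policy are assumptions made for illustration; the specification does not state how the scheduler chooses among unhit units.

```python
from typing import Dict, Optional

# Hypothetical table: mapping unit name -> application that currently hits it.
mapping_units: Dict[str, Optional[str]] = {
    "C1": "other_app",
    "C2": None,
    "C3": None,
}


def select_target_unit(units: Dict[str, Optional[str]], app: str) -> str:
    """Claim the first partition mapping unit not hit by any application."""
    for name, owner in units.items():
        if owner is None:
            units[name] = app  # this unit is now hit by `app`
            return name
    raise RuntimeError("all partition mapping units are in use")


target = select_target_unit(mapping_units, "target_app")
```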
[0062] To enable a target mapping unit, the following steps are taken: based on the target software application, target mapping unit C1 is enabled; based on the interface parameters, the cache-coherent target interfaces A1, A2...An that target mapping unit C1 needs to route are set, together with the access address range (A1_Address_low, A1_Address_high), (A2_Address_low, A2_Address_high)...(An_Address_low, An_Address_high) that each target interface needs to route; based on the path parameters, the route of target mapping unit C1 is set, for example G2-F3, indicating routing to computing unit G2 for computation and storage module F3 for storage.
[0063] For example:
[0064] Based on the data request from the target software application, the target mapping unit C1 is enabled, and its target interface is set to A1-A3.
[0065] The access address ranges for target interfaces A1-A3 of target mapping unit C1 are set as follows: A1_Address_low=0, A1_Address_high=1000; A2_Address_low=1000, A2_Address_high=4000; and A3_Address_low=5000, A3_Address_high=6000.
[0066] Set the routing path of target mapping unit C1 to G1-F1;
[0067] When a data request with access address 500 is received through the target interface A1, the computing unit allocated for processing is G1, and the storage module is F1.
[0068] When a data request with access address 1500 is received through target interface A2, the computing unit allocated for processing is G1, and the storage module is F1.
[0069] When a data request with access address 5200 is received through the target interface A3, the computing unit allocated for processing is G1, and the storage module is F1.
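The worked example above can be expressed as a lookup over per-interface address windows. This is a sketch only: the `MappingUnit` class and its `lookup` method are hypothetical names, and treating the low bound as inclusive and the high bound as exclusive is an assumption, since the specification does not say how the boundaries are handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MappingUnit:
    """One partition mapping unit: per-interface address windows plus a route."""

    name: str
    windows: dict  # interface -> (address_low, address_high)
    route: tuple   # (computing unit, storage module)

    def lookup(self, interface: str, address: int) -> Optional[tuple]:
        window = self.windows.get(interface)
        if window is None:
            return None  # interface is not routed by this unit
        low, high = window
        # Assumption: low is inclusive, high is exclusive.
        return self.route if low <= address < high else None


c1 = MappingUnit(
    name="C1",
    windows={"A1": (0, 1000), "A2": (1000, 4000), "A3": (5000, 6000)},
    route=("G1", "F1"),
)

for interface, address in [("A1", 500), ("A2", 1500), ("A3", 5200)]:
    print(interface, address, c1.lookup(interface, address))
```

All three requests resolve to computing unit G1 and storage module F1, matching the example; a request outside an interface's window returns `None` in this sketch.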
[0070] In this embodiment, the data request can be a read data request or a write data request.
[0071] For write data requests, the write data request of the target software application is issued through target interface A. After passing through data path B, it enters the target mapping unit and is then routed to the pre-configured computing unit G. Subsequently, depending on the different logical function implementation, the on-chip memory unit E is called to perform calculations, and the calculation results are returned to computing unit G. Finally, it is routed to the corresponding storage module (F1 / F2 / F3). At the same time, the write completion indication is returned to the target software application side through the target mapping unit via target interface A.
[0072] For read data requests, the read data request of the target software application is issued through target interface A. The request is passed directly to the corresponding storage module (F1 / F2 / F3) via data path B, the target mapping unit, and the computing memory module D. After the read data is returned, the configuration determines whether it passes through computing unit G again before it is finally returned to the target software application side.
[0073] For example:
[0074] When a read data request is issued, if the data was encrypted / compressed during writing, it needs to be decrypted / decompressed again by the computing unit G before being returned to the target software application.
[0075] When multiple write data requests are issued and the configured operation is sorting, computing unit G writes the data of those requests into the storage module in order according to the specified rules. When multiple read data requests are then issued to read back the sorted data, the data does not need to pass through computing unit G again and can be returned directly to the target software application side.
[0076] When multiple write data requests are issued, configured as matrix multiplication and addition operations, the computation unit G will write the calculation results to the corresponding storage modules for storage. When read data requests are issued, they do not need to go through the computation unit G and are directly returned to the target software application side.
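The compress-on-write, decompress-on-read case can be sketched as below. The dictionary `storage_f1`, the function names, and the boolean flags are hypothetical, and `zlib` merely stands in for whatever compression logic a computing unit would implement; the specification does not name an algorithm.

```python
import zlib
from typing import Dict

storage_f1: Dict[int, bytes] = {}  # stand-in for storage module F1


def write(address: int, payload: bytes, compress: bool) -> str:
    """Write path: the computing unit transforms data before it is stored."""
    stored = zlib.compress(payload) if compress else payload
    storage_f1[address] = stored
    return "write-complete"  # indication returned to the application side


def read(address: int, decompress: bool) -> bytes:
    """Read path: pass through the computing unit only if configured to."""
    stored = storage_f1[address]
    return zlib.decompress(stored) if decompress else stored


status = write(500, b"hello" * 100, compress=True)
restored = read(500, decompress=True)
```

For the sorting and matrix cases, the read would be issued with the computing-unit pass disabled, because the stored data is already in its final form.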
[0077] In summary, the in-memory computing system provided in this specification includes multiple interfaces, a partition mapping module, a computing memory module, and multiple storage modules. The partition mapping module is configured with interface path information. Based on this information, the partition mapping module determines the target interface to be routed from among the multiple interfaces and allocates corresponding computing units and storage modules to process data requests issued by the target interface. The target interface is used to issue data requests from the target software application corresponding to the interface path information. Thus, by partitioning computing and storage resources and embedding different in-memory computing logic in each region, on-demand in-memory computing is achieved. Simultaneously, dynamically allocating and switching the required access regions according to the target software application scenario improves the flexibility and scalability of computing and storage, and reduces the overhead and resource idle rate caused by data movement.
[0078] Based on the same inventive concept, and referring to Figure 3, this embodiment of the invention also provides an in-memory computing method for the aforementioned in-memory computing system, the method comprising steps 101-103:
[0079] Step 101: The partition mapping module determines the target interface to be routed from multiple interfaces based on the configured interface path information, and allocates corresponding computing units and storage modules to the data requests of the target software application sent by the target interface.
[0080] Step 102: The target interface sends a data request from the target software application to the partition mapping module. The data request includes the access address.
[0081] Step 103: The partition mapping module allocates data requests to the corresponding computing units and storage modules for processing.
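Steps 101-103, followed by the optional release once the application has finished, can be sketched in order. The `PartitionMapper` class, its method names, and the example route G2-F3 on interfaces A1 and A2 are illustrative assumptions rather than the claimed implementation.

```python
class PartitionMapper:
    """Hypothetical model of the partition mapping module for one application."""

    def __init__(self):
        self.config = None  # interface path information, once configured
        self.log = []       # (interface, address, unit, storage) per request

    def configure(self, interfaces, unit, storage):
        # Step 101: fix the target interfaces and the allocated resources.
        self.config = {
            "interfaces": set(interfaces),
            "unit": unit,
            "storage": storage,
        }

    def handle(self, interface, address):
        # Steps 102-103: accept a request carrying an access address and
        # dispatch it to the resources chosen during configuration.
        if self.config is None or interface not in self.config["interfaces"]:
            raise LookupError(f"{interface} is not a configured target interface")
        dispatched = (interface, address, self.config["unit"], self.config["storage"])
        self.log.append(dispatched)
        return dispatched

    def release(self):
        # After all requests of the application complete, free the resources.
        self.config = None


mapper = PartitionMapper()
mapper.configure(["A1", "A2"], unit="G2", storage="F3")
result = mapper.handle("A2", 1500)
mapper.release()
```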
[0082] In this embodiment, referring to Figure 2, the in-memory computing system includes multiple interfaces A. Interface A is connected to a partition mapping module C via a data path B. The partition mapping module C is connected to a computing memory module D via a data path B. The computing memory module D is connected to multiple storage modules F1, F2, and F3 (also referred to as storage media).
[0083] The computing memory module comprises multiple computing units G and multiple memory units E. Memory units E refer to on-chip memory, cache, lookup tables, and other hardware resources used to provide fast memory access and temporary storage during computation, achieving high-performance computing. Computing units G refer to the operator logic used for computation and the control logic for accessing on-chip memory units or storage media. The operator logic can be custom-designed hardware logic or general-purpose computing IP, and the computational operations it can implement include, but are not limited to, searching, comparing, sorting, encryption / decryption, decompression, and general matrix operations. The partition mapping module C provides flexible data flow routing management. The management process can be understood as follows: an external scheduler configures the partition mapping module C through dynamic configuration paths H, distributing data requests from different interfaces to the same (or different) computing units and storage modules.
[0084] For example, multiple unrelated software applications send data requests through multiple servers. For the in-memory computing system of this embodiment, this is equivalent to traffic from interface A1 and traffic from interface A2 being routed to different computing units and storage modules.
[0085] For example, when the same target software application runs on multiple computing nodes, for the in-memory computing system of this embodiment, traffic from interface A1 and traffic from interface A2 are considered to have the same source and will be routed to the same computing unit and storage module.
[0086] In detail, the external scheduler determines the corresponding interface path information based on the data request of the target software application. This interface path information is sent to the partition mapping module C through the dynamic configuration path H. The interface path information is used to configure the partition mapping module C. The configuration mainly includes two parts: configuring the interface and configuring the computing and storage paths.
[0087] Based on the interface path information, the partition mapping module C determines the target interface that needs to be routed from multiple interfaces, and allocates the corresponding computing units and storage modules to process the data requests sent by the target interface.
[0088] The target interface is used to send data requests from the target software application corresponding to the interface path information.
[0089] In this embodiment, there may be one or more target interfaces. It should be noted that each target interface is used to send data requests from the same target software application. For the partition mapping module C, the data streams sent by the aforementioned target interfaces belong to the same source.
[0090] The partition mapping module C also allocates corresponding computing units G and storage modules to process data requests issued by each target interface based on the interface path information. For target interfaces from the same source, the allocated computing units G and storage modules are consistent.
[0091] In this embodiment, there are multiple computing units. When determining the interface path information, some computing units may be idle, while others may be active. An idle computing unit indicates that there are currently no computing tasks to process, while an active computing unit indicates that the unit is currently occupied and has unfinished computing tasks to process. Once the computing tasks of a computing unit are completed, the unit is released and becomes idle.
[0092] Therefore, it is easy to understand that the computing units allocated according to the interface path information in this embodiment are determined from the currently idle computing units.
[0093] While processing data requests from the target software application, the in-memory computing system may also simultaneously process data requests from other software applications. Therefore, in this embodiment, the partition mapping module may include multiple partition mapping units, and at least one partition mapping unit serves as the target mapping unit for configuring interface path information. The interface path information includes interface parameters and path parameters, wherein the interface parameters are used to determine the target interface that the target mapping unit needs to route to; and the path parameters are used to determine the computing units and storage modules allocated for processing the data requests issued by the target interface.
[0094] It should be noted that the target mapping unit is the partition mapping unit hit by the interface path information. When configuring for the target software application, the external scheduler selects the target mapping unit from among the partition mapping units that have not been hit by other software applications.
[0095] To enable a target mapping unit, the following steps are taken: based on the target software application, target mapping unit C1 is enabled; based on the interface parameters, the cache-coherent target interfaces A1, A2...An that target mapping unit C1 needs to route are set, together with the access address range (A1_Address_low, A1_Address_high), (A2_Address_low, A2_Address_high)...(An_Address_low, An_Address_high) that each target interface needs to route; based on the path parameters, the route of target mapping unit C1 is set, for example G2-F3, indicating routing to computing unit G2 for computation and storage module F3 for storage.
[0096] For example:
[0097] Based on the data request from the target software application, the target mapping unit C1 is enabled, and its target interface is set to A1-A3.
[0098] The access address ranges for target interfaces A1-A3 of target mapping unit C1 are set as follows: A1_Address_low = 0, A1_Address_high = 1000; A2_Address_low = 1000, A2_Address_high = 4000; and A3_Address_low = 5000, A3_Address_high = 6000.
[0099] Set the routing path of target mapping unit C1 to G1-F1;
[0100] When a data request with access address 500 is received through the target interface A1, the computing unit allocated for processing is G1, and the storage module is F1.
[0101] When a data request with access address 1500 is received through target interface A2, the computing unit allocated for processing is G1, and the storage module is F1.
[0102] When a data request with access address 5200 is received through the target interface A3, the computing unit allocated for processing is G1, and the storage module is F1.
[0103] In this embodiment, the data request can be a read data request or a write data request.
[0104] For write data requests, the write data request of the target software application is issued through target interface A. After passing through data path B, it enters the target mapping unit and is then routed to the pre-configured computing unit G. Subsequently, depending on the logic function implemented, the on-chip memory unit E is called to perform calculations, and the calculation results are returned to computing unit G. Finally, the data is routed to the corresponding storage module (F1 / F2 / F3). At the same time, the write completion indication is returned to the target software application side through the target mapping unit via target interface A.
[0105] For read data requests, the read data request of the target software application is issued through target interface A. The request is then passed directly to the corresponding storage module (F1 / F2 / F3) via the data path B, the target mapping unit, and the computing memory module C. After the read data is returned, the configuration determines whether it passes through computing unit G again before it is finally returned to the target software application side.
[0106] For example:
[0107] When a read data request is issued, if the data was encrypted / compressed during writing, it needs to be decrypted / decompressed again by the computing unit G before being returned to the target software application.
[0108] When multiple write data requests are issued and the configuration specifies a sorting operation, computing unit G writes the data of the multiple write data requests into the storage module in order according to the specified rules. When multiple read data requests are subsequently issued to read back the sorted data, the data does not need to pass through computing unit G again and can be returned directly to the target software application side.
[0109] When multiple write data requests are issued and the configuration specifies matrix multiply-add operations, computing unit G writes the calculation results to the storage module for storage. When read data requests are issued, the data does not need to pass through computing unit G and is returned directly to the target software application side.
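The asymmetry between the write path and the read path in these examples can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the implementation described here: the class `Route` and its `on_write` / `on_read` hooks are hypothetical, `zlib` stands in for whatever compression logic computing unit G might embed, a dictionary stands in for storage module F, and sorting is applied to the bytes of a single request rather than across multiple requests.

```python
import zlib
from typing import Callable, Dict, Optional

Transform = Callable[[bytes], bytes]


class Route:
    """Hypothetical stand-in for one configured G-F route."""

    def __init__(self, on_write: Transform, on_read: Optional[Transform] = None):
        self.on_write = on_write   # logic applied by computing unit G on writes
        self.on_read = on_read     # None means reads bypass computing unit G
        self.storage: Dict[int, bytes] = {}  # stand-in for storage module F

    def write(self, address: int, data: bytes) -> str:
        # Write path: compute first, then store, then report completion.
        self.storage[address] = self.on_write(data)
        return "write complete"

    def read(self, address: int) -> bytes:
        # Read path: fetch from storage, pass through G only if configured.
        stored = self.storage[address]
        return self.on_read(stored) if self.on_read else stored


# Compression case: data is transformed on write, so reads must pass
# through the computing unit again to restore it.
compressed = Route(on_write=zlib.compress, on_read=zlib.decompress)
compressed.write(0, b"abc" * 100)

# Sorting case: the stored form is already the desired result, so reads
# are returned directly without a second pass through the computing unit.
sorted_route = Route(on_write=lambda d: bytes(sorted(d)))
sorted_route.write(0, b"dcba")
```

The design point the sketch tries to capture is that the read-side hook is optional: whether a read passes through computing unit G depends on whether the write-side operation left the data in a form the application can use directly.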
[0110] In summary, the in-memory computing method provided in this specification is applied to an in-memory computing system that includes multiple interfaces, a partition mapping module, a computing memory module, and multiple storage modules. The partition mapping module is configured with interface path information. Based on this information, the partition mapping module determines the target interface to be routed from among the multiple interfaces and allocates corresponding computing units and storage modules to process data requests issued by the target interface. The target interface is used to issue data requests from a target software application corresponding to the interface path information. Thus, by partitioning computing and storage resources and embedding different in-memory computing logic in each region, on-demand in-memory computing is achieved. Simultaneously, dynamically allocating and switching the required access regions according to the target software application scenario improves the flexibility and scalability of computing and storage, and reduces the overhead and resource idle rate caused by data movement.
[0111] Based on the same inventive concept and referring to Figure 4, this embodiment of the invention also provides a server, server 40 including an in-memory computing system 41. The in-memory computing system 41 is shown in Figure 2, and its details can be found above, and will not be repeated here.
[0112] In summary, the server provided in this embodiment includes an in-memory computing system. This in-memory computing system comprises multiple interfaces, a partition mapping module, a computing memory module, and multiple storage modules. The partition mapping module is configured with interface path information. Based on the interface path information, the partition mapping module determines the target interface to be routed from the multiple interfaces and allocates corresponding computing units and storage modules to process data requests issued by the target interface. The target interface is used to issue data requests from the target software application corresponding to the interface path information. Thus, by partitioning computing and storage resources and embedding different in-memory computing logic in each region, on-demand in-memory computing is achieved. Simultaneously, dynamically allocating and switching the required access regions according to the target software application scenario improves the flexibility and scalability of computing and storage, and reduces the overhead and resource idle rate caused by data movement.
[0113] Based on the same inventive concept, and in conjunction with Figure 5, this embodiment of the invention also provides an in-memory computing network, the in-memory computing network 50 including a plurality of the aforementioned in-memory computing systems 41 and switches 51.
[0114] Switch 51 includes at least one of a top-of-rack switch, a core switch, and a CXL switch.
[0115] In one embodiment, the top-of-rack switch and the core switch can connect different links into an in-memory computing network. The server rack includes the aforementioned in-memory computing system. Through a standard CXL (Compute Express Link) switch network, resource sharing can be achieved across CPUs, server racks, and networks.
[0116] In summary, the embodiments of this specification provide an in-memory computing network. Each in-memory computing system in the network includes multiple interfaces, a partition mapping module, a computing memory module, and multiple storage modules. The partition mapping module is configured with interface path information. Based on the interface path information, the partition mapping module determines the target interface to be routed from the multiple interfaces and allocates corresponding computing units and storage modules to process data requests issued by the target interface. The target interface is used to issue data requests from the target software application corresponding to the interface path information. Thus, by partitioning computing and storage resources and embedding different in-memory computing logic in each region, on-demand in-memory computing is achieved. Simultaneously, dynamically allocating and switching the required access regions according to the target software application scenario improves the flexibility and scalability of computing and storage, and reduces the overhead and resource idle rate caused by data movement.
[0117] The above are merely various embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in the present invention should be included within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.
Claims
1. An in-memory computing system, comprising: multiple interfaces; a partition mapping module connected to the interfaces; a computing memory module connected to the partition mapping module, the computing memory module including multiple computing units and multiple memory units; and multiple storage modules connected to the computing memory module; wherein the partition mapping module is configured with interface path information; based on the interface path information, the partition mapping module determines the target interface to be routed from the plurality of interfaces, and allocates corresponding computing units and storage modules to process the data requests sent by the target interface; and the target interface is used to send data requests of the target software application corresponding to the interface path information.
2. The in-memory computing system of claim 1, wherein, The partition mapping module includes multiple partition mapping units, and at least one partition mapping unit serves as the target mapping unit for configuring interface path information.
3. The in-memory computing system of claim 2, wherein, The interface path information includes interface parameters and path parameters. The interface parameters are used to determine the target interface that the target mapping unit needs to route to. The path parameters are used to determine the computing unit and storage module allocated for processing the data request issued by the target interface.
4. The in-memory computing system of claim 2, wherein, The target mapping unit is the partition mapping unit that is hit by the interface path information.
5. The in-memory computing system of claim 1, wherein, An external scheduler determines the corresponding interface path information based on the data request of the target software application and sends it to the partition mapping module through a dynamic configuration path.
6. The in-memory computing system of claim 1, wherein, The data requests issued by each target interface are data requests from the same target software application.
7. The in-memory computing system of claim 1, wherein, The partition mapping module also allocates the corresponding computing unit and storage module to process each data request sent by the target interface according to the interface path information.
8. The in-memory computing system of claim 7, wherein, The partition mapping module selects, from the currently idle computing units, the computing unit allocated to the data requests sent by each target interface.
9. An in-memory computing method, wherein, The in-memory computing method is applied to an in-memory computing system, the in-memory computing system comprising: Multiple interfaces; A partition mapping module is connected to the interface; A computing memory module is connected to the partition mapping module, and the computing memory module includes multiple computing units and multiple memory units; Multiple storage modules are connected to the computing memory module; The partition mapping module is configured with interface path information. Based on this information, the module determines the target interface to be routed from among multiple interfaces, and allocates corresponding computing units and storage modules to process data requests sent from the target interface. The target interface is used to send data requests from a target software application corresponding to the interface path information. The in-memory computing method includes: The partition mapping module determines the target interface that needs to be routed from multiple interfaces based on the configured interface path information, and allocates corresponding computing units and storage modules to the data requests of the target software application issued by the target interface. The target interface sends a data request for the target software application to the partition mapping module, and the data request includes an access address. The partition mapping module allocates the data request to the corresponding computing unit and storage module for processing.
10. The in-memory computing method of claim 9, wherein, The partition mapping module includes multiple partition mapping units, and at least one partition mapping unit serves as the target mapping unit for configuring interface path information; the interface path information includes interface parameters and path parameters. The partition mapping module determines the target interface to be routed from multiple interfaces based on the configured interface path information, and allocates corresponding computing units and storage modules to the data requests of the target software application issued by the target interface, including: The target mapping unit determines the target interface to be routed from multiple interfaces based on the interface parameters. The target mapping unit allocates a computing unit and a storage module to each target interface for processing the data request, based on the path parameters.
11. The in-memory computing method of claim 10, wherein, The path parameters are determined by the currently released computing unit and the access address of the data request from the target software application.
12. The in-memory computing method of claim 9, wherein, The method further includes: After all data request processing for the target software application is completed, the corresponding computing unit and storage module are released.
13. A server, comprising an in-memory computing system, the in-memory computing system including: multiple interfaces; a partition mapping module connected to the interfaces; a computing memory module connected to the partition mapping module, the computing memory module including multiple computing units and multiple memory units; and multiple storage modules connected to the computing memory module; wherein the partition mapping module is configured with interface path information; based on the interface path information, the partition mapping module determines the target interface to be routed from the plurality of interfaces, and allocates corresponding computing units and storage modules to process the data requests sent by the target interface; and the target interface is used to send data requests of the target software application corresponding to the interface path information.