System, method and apparatus for invoking running resources of server, and storage medium

By constructing a CXL FABRIC network and using resource switches to dynamically expand server memory resources, the low upper limit on server memory resources is overcome, achieving efficient memory resource allocation and improved computing performance.

WO2026129622A1 · PCT designated stage · Publication Date: 2026-06-25 · INSPUR SUZHOU INTELLIGENT TECH CO LTD


Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD
Filing Date: 2025-07-04
Publication Date: 2026-06-25

Application Information

Patent Timeline

04 Jul 2025: Application
25 Jun 2026: Publication (WO2026129622A1)

IPC: G06F9/50

CPC: G06F9/5027

AI Tagging

Application Domain

Resource allocation

Technology Topics

Engineering
Resource exchange


AI Technical Summary

Technical Problem

The upper limit of the server's extended memory resources is limited by the number of DIMM SLOTs that the MXC expands, which leads to insufficient resources for services with high memory requirements. Furthermore, because CXL runs over PCIe physical links, expanding memory this way reduces the number of PCIe ports available for GPU connections, which hurts performance.

Method used

The CXL FABRIC network is constructed using multiple resource switches. By routing between resource switches, the network can access the running resources connected to other resource switches, thereby enabling dynamic expansion and path optimization of resource devices.

Benefits of technology

It increases the upper limit of the amount of resources that the server can call to run, realizes efficient expansion and stable access to memory resources, reduces the transmission error rate, and improves computing efficiency.


Patent Text Reader

Abstract

The present application discloses a system, method and apparatus for invoking running resources of a server, and a storage medium. The system for invoking running resources of a server comprises: a target resource switch connected to a target server, the target resource switch being used for sending an invocation request to a candidate resource switch, sending, to the target server, target invocation information corresponding to a reference resource device having passed verification, and establishing a target route; and the candidate resource switch, used for responding to the invocation request, verifying a candidate resource device on the basis of the use state of the candidate resource device, notifying the target resource switch of information indicating that the reference resource device has passed the verification, and switching the use state of the reference resource device to an occupied state.


Description

Server operation resource access system, method and apparatus, storage medium

[0001] Cross-references to related applications

[0002] This application claims priority to Chinese Patent Application No. 202411870240.7, filed on December 18, 2024, entitled "System, Method and Apparatus for Calling Server Operating Resources and Storage Medium", the entire contents of which are incorporated herein by reference.

Technical Field

[0003] This application relates to a system, method, and apparatus for invoking running resources of a server, and a storage medium.

Background Technology

[0004] Currently, the size of server expansion hardware resources has become a limiting factor for improving server performance. Usually, in order to improve server performance, more hardware resources can be added to the server.

[0005] In related technologies, servers expand hardware resources through specific connection methods. Taking memory resources as an example, the server directly connects to an MXC (Memory Expansion Controller) to directly expand memory resources. The MXC can expand a single memory port into four DIMM SLOTs (DIMM, Dual In-line Memory Modules), thereby allowing each of the four DIMM SLOTs to access the connected DIMM memory resources.

[0006] However, the upper limit of the memory resources that the server can expand using the above method is limited by the number of DIMM SLOTs expanded by MXC. This number is usually fixed. When the server is running services that have a large demand for memory resources, there may still be situations where the expanded memory resources do not meet the service operation requirements.

[0007] The inventors realized that, in related technologies, there is still no effective solution to the problem of the low upper limit of the amount of resources that servers can call to run.

Summary of the Invention

[0008] According to an embodiment of this application, in a first aspect, a server runtime resource invocation system is provided, comprising: multiple resource switches, at least one resource switch having a switch port, a server port and a resource device port deployed on it, at least one resource switch being connected to at least one reference resource switch through the switch port, at least one resource switch being allowed to invoke runtime resources connected to other resource switches through routing, the server port being used to connect to a server, and the resource device port being used to connect to a resource device.

[0009] A target resource switch connected to the target server is used to send an invocation request to a candidate resource switch, where the candidate resource switch is connected to a candidate resource device that meets the target server's target requirement information, and the target requirement information indicates the need to invoke running resources connected to other resource switches. The target resource switch is further used to send, to the target server, the target invocation information corresponding to the verified reference resource device, and to establish a target route. The target invocation information indicates that the running resources on the reference resource device are currently allowed to be invoked, and the target route indicates the resource invocation path between the reference resource device and the target server.

[0010] The candidate resource switch is used to respond to call requests, verify candidate resource devices based on their usage status, notify the target resource switch that the reference resource device has passed verification, and switch the usage status of the reference resource device to occupied status.

[0011] According to an embodiment of this application, in a second aspect, a method for invoking server runtime resources is provided, comprising:

[0012] Send a call request to the candidate resource switch, wherein the candidate resource switch is connected to the candidate resource device that meets the target requirement information of the target server, the target server is connected to the target resource switch, the target resource switch is allowed to call the running resources connected to other resource switches through routing, and the target requirement information is used to indicate that the running resources connected to other resource switches need to be called.

[0013] The process involves obtaining information indicating that a reference resource device has passed verification. Specifically, the candidate resource switch responds to the call request, verifies the candidate resource device based on its usage status, notifies the target resource switch that the reference resource device has passed verification, and switches the usage status of the reference resource device to an occupied state.

[0014] The target call information corresponding to the verified reference resource device is sent to the target server, and a target route is established. The target call information is used to indicate that the running resources on the reference resource device are currently allowed to be called, and the target route is used to indicate the resource call path between the reference resource device and the target server.

[0015] According to an embodiment of this application, a third aspect provides a server runtime resource invocation apparatus, comprising:

[0016] The first sending module is used to send a call request to the candidate resource switch, wherein the candidate resource switch is connected to the candidate resource device that meets the target requirement information of the target server, the target server is connected to the target resource switch, the target resource switch is allowed to call the running resources connected to other resource switches through routing, and the target requirement information is used to indicate that the running resources connected to other resource switches need to be called.

[0017] The first acquisition module is used to acquire information indicating that a reference resource device has passed verification. The candidate resource switch responds to the call request, verifies the candidate resource device based on its usage status, notifies the target resource switch that the reference resource device has passed verification, and switches the usage status of the reference resource device to an occupied state.

[0018] The second sending module is used to send the target call information corresponding to the verified reference resource device to the target server and establish a target route. The target call information is used to indicate that the running resources on the reference resource device are currently allowed to be called, and the target route is used to indicate the resource call path between the reference resource device and the target server.

[0019] According to an embodiment of this application, in a fourth aspect, a computer program product is also provided, including computer-readable instructions that, when executed by a processor, implement the steps of any of the above method embodiments.

[0020] According to an embodiment of this application, in a fifth aspect, a non-transitory computer-readable storage medium is also provided, wherein computer-readable instructions are stored in the non-transitory computer-readable storage medium, wherein the computer-readable instructions are configured to perform the steps in any of the above method embodiments at runtime.

[0021] According to an embodiment of this application, in a sixth aspect, an electronic device is also provided, including a memory and a processor, wherein the memory stores computer-readable instructions, and the processor is configured to execute the computer-readable instructions to perform the steps in any of the above method embodiments.

[0022] Details of one or more embodiments of this application are set forth in the following drawings and description. Other features and advantages of this application will become apparent from the specification, drawings, and claims.

Attached Figure Description

[0023] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0024] Figure 1 is a schematic diagram of server direct connection to MXC extended memory in related technologies;

[0025] Figure 2 is a schematic diagram of the server's resource call system according to an embodiment of this application;

[0026] Figure 3 is a schematic diagram of the operation of the target encoder according to an embodiment of this application;

[0027] Figure 4 is a schematic diagram of the physical ports in the server's resource call system according to an embodiment of this application;

[0028] Figure 5 is a schematic diagram of the operation of the target decoder according to an embodiment of this application;

[0029] Figure 6 is a schematic diagram of the target resource switch according to an embodiment of this application;

[0030] Figure 7 is a hardware structure block diagram of a computer device for the server's resource call system according to an embodiment of this application;

[0031] Figure 8 is a flowchart of a server resource invocation method according to an embodiment of this application;

[0032] Figure 9 is a structural block diagram of a server operation resource invocation device according to an embodiment of this application;

[0033] Figure 10 is a schematic diagram of an electronic device according to an embodiment of this application;

[0034] Figure 11 is a schematic diagram of a computer program product according to an embodiment of this application;

[0035] Figure 12 is a schematic diagram of a non-transitory computer-readable storage medium according to an embodiment of the present application.

Detailed Implementation

[0036] The embodiments of this application will be described in detail below with reference to the accompanying drawings and examples.

[0037] It should be noted that the terms "first," "second," etc., in the specification, claims, and drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.

[0038] This application proposes a server runtime resource invocation system. Before describing the optional embodiments of this application, in order to better understand the inventive concept and the originality of this solution, related technologies will first be explained:

[0039] Figure 1 is a schematic diagram of a server directly connected to an MXC for memory expansion in related technologies. As shown in Figure 1, in related technologies, the server (HOST) is directly connected to the MXC in the MXC BOARD (memory expansion control board) via CXL (Compute Express Link). The MXC can expand the memory port of a single HOST into four DIMM SLOTs (DIMM, Dual In-line Memory Module), thereby allowing the four DIMM SLOTs to access the connected DIMM memory resources respectively.

[0040] However, the method of directly connecting the server to MXC extended memory in related technologies has the following technical problems:

[0041] 1) The memory that can be expanded by MXC direct connection to HOST is very limited. The upper limit of the amount of memory resources that the server can expand is limited by the number of DIMM SLOTs expanded by MXC. This number is usually fixed. When the server is running a business that has a large demand for memory resources, there may still be a situation where the expanded memory resources do not meet the business operation requirements.

[0042] 2) MXC and HOST are connected via CXL. The physical link for CXL transmission is a PCIe physical link. Current mainstream servers require a large number of GPU connections. Reducing the number of PCIe ports connected to GPUs in order to expand memory would be counterproductive.

[0043] To address the aforementioned technical problems, this embodiment provides a server runtime resource invocation system, comprising: multiple resource switches, at least one of which is equipped with switch ports, server ports, and resource device ports; at least one resource switch is connected to at least one reference resource switch via a switch port; at least one resource switch is allowed to invoke runtime resources connected to other resource switches via routing; the server port is used to connect to a server; and the resource device port is used to connect to resource devices; a target resource switch connected to a target server is used to send invocation requests to candidate resource switches, wherein the candidate resource switches are connected to candidate resource devices that meet the target server's target requirement information, and the target requirement information indicates that runtime resources connected to other resource switches need to be invoked; target invocation information corresponding to verified reference resource devices is sent to the target server, and a target route is established, wherein the target invocation information indicates that runtime resources on the reference resource devices are currently allowed to be invoked, and the target route indicates the resource invocation path between the reference resource devices and the target server; and candidate resource switches are used to respond to invocation requests, verify candidate resource devices according to their usage status, notify the target resource switch that the reference resource devices have passed verification, and switch the usage status of the reference resource devices to an occupied state.

[0044] This application proposes a server runtime resource invocation system, comprising: multiple resource switches, at least one of which is equipped with a switch port, a server port, and a resource device port; at least one resource switch is connected to at least one reference resource switch via a switch port; at least one resource switch is allowed to invoke runtime resources connected to other resource switches via routing; the server port is used to connect to a server; and the resource device port is used to connect to resource devices; a target resource switch connected to a target server is used to send invocation requests to candidate resource switches, wherein the candidate resource switches are connected to candidate resource devices that meet the target server's target requirement information, and the target requirement information indicates that runtime resources connected to other resource switches need to be invoked; target invocation information corresponding to verified reference resource devices is sent to the target server, and a target route is established, wherein the target invocation information indicates that runtime resources on the reference resource devices are currently allowed to be invoked, and the target route indicates the resource invocation path between the reference resource device and the target server; and a candidate resource switch is used to respond to invocation requests, verify candidate resource devices based on their usage status, notify the target resource switch that the reference resource device has passed verification, and switch the usage status of the reference resource device to an occupied state. That is, when the target server needs to invoke runtime resources connected to other resource switches, it can send an invocation request to the candidate resource switch through the target resource switch in the runtime resource invocation system. The candidate resource switch responds to the invocation request, verifies the candidate resource device based on its usage status, and notifies the target resource switch that the reference resource device has passed verification. Then, the target resource switch sends the target invocation information corresponding to the verified reference resource device to the target server and establishes a target route. The target resource switch can then invoke runtime resources on the reference resource devices connected to the candidate resource switches through the target route. This technical solution solves the problem of a low upper limit on the amount of runtime resources that a server can invoke in related technologies, achieving the technical effect of increasing that upper limit.
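The request, verification, and route-establishment sequence described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the class `ResourceSwitch`, its methods, and the in-memory `routes` dictionary are hypothetical names invented for the sketch, and direct method calls stand in for messages that the real system would carry over the CXL FABRIC.

```python
from dataclasses import dataclass, field
from enum import Enum


class UseState(Enum):
    IDLE = "idle"
    OCCUPIED = "occupied"


@dataclass
class ResourceSwitch:
    """Hypothetical model of one resource switch (not the patent's implementation)."""
    name: str
    # Use state of each resource device attached to this switch.
    devices: dict[str, UseState] = field(default_factory=dict)
    # Established routes: server name -> (remote switch name, device name).
    routes: dict[str, tuple[str, str]] = field(default_factory=dict)

    def handle_invocation_request(self, device: str) -> bool:
        """Candidate-switch side: verify the device by its use state."""
        if self.devices.get(device) is not UseState.IDLE:
            return False  # occupied or unknown device: verification fails
        self.devices[device] = UseState.OCCUPIED  # reserve it for the requester
        return True

    def invoke_remote(self, server: str, candidate: "ResourceSwitch", device: str) -> bool:
        """Target-switch side: send the request, record the route on success."""
        if not candidate.handle_invocation_request(device):
            return False
        # Stands in for sending the target invocation information to the
        # server and establishing the target route.
        self.routes[server] = (candidate.name, device)
        return True


sw1 = ResourceSwitch("PBR CXL SWITCH 1")
sw2 = ResourceSwitch("PBR CXL SWITCH 2", devices={"DEVICE 4": UseState.IDLE})

print(sw1.invoke_remote("HOST 1", sw2, "DEVICE 4"))  # True: device was idle
print(sw1.invoke_remote("HOST 1", sw2, "DEVICE 4"))  # False: device is now occupied
```

The second call fails because the first one switched the device to the occupied state, which is the behavior that prevents two servers from being granted the same device.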

[0045] Optionally, in this embodiment, the server's operating resources may include, but are not limited to, computing resources provided by GPUs (Graphics Processing Units), FPGAs (Field-Programmable Gate Arrays), and memory resources provided by storage devices. The following embodiment uses memory resources as an example to describe the system and methods for accessing the server's operating resources, without limiting the type of server operating resources.

[0046] Optionally, in this embodiment, Figure 2 is a schematic diagram of a server operation resource invocation system according to an embodiment of this application. As shown in Figure 2, the operation resource invocation system includes multiple resource switches (for example, four resource switches: PBR CXL SWITCH 1 to 4). At least one resource switch is equipped with a switch port (for example, P2; only one is shown, the rest of the switch ports are not shown), a server port (for example, P2; only one is shown, the rest of the server ports are not shown), and a resource device port (for example, P3; only one is shown, the rest of the resource device ports are not shown). At least one resource switch is connected to at least one reference resource switch through the switch port. At least one resource switch is allowed to invoke the operation resources connected to other resource switches through routing. The server port is used to connect to the server (for example, server port P2 is connected to server HOST 1), and the resource device port is used to connect to the resource device (for example, resource device port P3 is connected to resource device TYPE1 / 2 / 3 CXL DEVICE 2).

[0047] CXL (Compute Express Link) is a high-speed interconnect technology that provides high-bandwidth, low-latency connections between processors, accelerators, memory buffers, and intelligent input / output devices within a computer host. CXL can be used for high-speed, low-latency connections within a computer host, such as between the CPU and GPU, hard drives (e.g., solid-state drives, hybrid drives), and network interface cards (NICs, standard NICs, smart NICs). It can also be used for high-speed, low-latency connections between computer hosts, or between a computer host and external devices such as memory resource pools (e.g., external hard drives) or GPU resource pools. For example, multiple processors, accelerators, storage, and other devices can be connected via a high-speed channel, providing higher bandwidth and lower latency, thus improving computing efficiency.

[0048] A PBR CXL SWITCH is a resource switch that supports PBR (Port Based Routing). Multiple PBR CXL SWITCHes are interconnected to expand server operating resources. The memory resources of multiple servers (TYPE1 / 2 / 3 CXL DEVICE 1 to 8) are interconnected to form a CXL FABRIC (CXL network architecture) memory pool, thereby achieving memory resource expansion. A single server shares the memory resources of the entire CXL FABRIC.

[0049] The runtime resource invocation system expands memory resources across multiple hosts. At least one host is connected into the CXL FABRIC via a PBR CXL SWITCH and can thereby acquire memory resources anywhere in the memory pool. Compared to a tree topology, the topology of the runtime resource invocation system in this application supports a large number of external connection ports, and the access paths (i.e., routes) between hosts and devices can be defined by policy, which increases the stability of cross-switch access. The topology shown in Figure 2 is the simplest ring topology; a more complex mesh topology is also possible. Access between hosts and devices (short for TYPE1 / 2 / 3 CXL devices) connected to the same PBR CXL SWITCH can be achieved through a single PBR CXL SWITCH. Any host and device connected to different PBR CXL SWITCHes only need to cross two PBR CXL SWITCHes to access each other.

[0050] Under the established firmware access strategy, during high-speed transmission, multiple hosts will have the optimal and shortest access path to access the memory in the entire memory pool (composed of TYPE1 / 2 / 3CXL DEVICE 1 to 8), which improves the efficiency of memory access during high-speed transmission and reduces the possibility of bit errors during transmission.
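The text does not say how the firmware access strategy picks the shortest path. One common way to find a fewest-hop route over a switch graph is breadth-first search, sketched below purely as an illustration. The function name `shortest_route` and the four-switch ring adjacency are assumptions made for this example, not details taken from the application.

```python
from collections import deque


def shortest_route(links, src, dst):
    """Breadth-first search over switch-to-switch links.

    Returns the list of switches crossed from src to dst, or None if
    dst is unreachable. `links` maps a switch ID to its neighbours.
    """
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in links.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical adjacency for four switches wired as a ring.
ring = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
print(shortest_route(ring, 1, 2))  # [1, 2]
```

Because breadth-first search explores routes in order of hop count, the first route that reaches the destination crosses the fewest switches.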

[0051] Optionally, in this embodiment, a target resource switch (e.g., PBR CXL SWITCH 1) connected to a target server (e.g., HOST 1) is used to send a call request to a candidate resource switch (e.g., PBR CXL SWITCH 2). The candidate resource switch is connected to a candidate resource device (e.g., TYPE1 / 2 / 3CXL DEVICE 4) that meets the target server's target requirement information. The target requirement information is used to indicate that the running resources connected to other resource switches need to be called. The target call information corresponding to the verified reference resource device (e.g., the verified TYPE1 / 2 / 3CXL DEVICE 4) is sent to the target server, and a target route is established. The target call information is used to indicate that the running resources on the reference resource device are currently allowed to be called, and the target route is used to indicate the resource call path between the reference resource device and the target server.

[0052] The candidate resource switch is used to respond to call requests, verify candidate resource devices based on their usage status, notify the target resource switch that the reference resource device has passed verification, and switch the usage status of the reference resource device to occupied status.

[0053] When the candidate resource switch confirms that the current usage status of the candidate resource device is occupied, it determines that the candidate resource device has failed the verification; when it confirms that the current usage status of the candidate resource device is idle, it determines that the candidate resource device has passed the verification.

[0054] As one or more optional solutions, a target encoder is also deployed on the target resource switch. The target encoder pre-constructs a device mapping table based on the topology connection relationship between multiple resource switches in the runtime resource call system. The device mapping table records one or more alternative resource devices that are allowed to provide runtime resources to the target server. The target encoder is used to filter out candidate resource devices that meet the target requirements from one or more alternative resource devices recorded in the device mapping table. The target server address, target switch address, and candidate device address are encoded to obtain the call request, where the target server address is the address of the target server, the target switch address is the address of the target resource switch, and the candidate device address is the address of the candidate resource device.

[0055] Optionally, in this embodiment, Figure 3 is a schematic diagram of the operation of a target encoder according to an embodiment of this application. As shown in Figure 3, a target encoder (Message Encoder) is also deployed on the target resource switch (e.g., PBR CXL SWITCH 1). The target encoder receives target demand information REQ 1 from a server (e.g., HOST 1). REQ 1 carries the HPA (Host Physical Address) of server HOST 1. The target encoder pre-constructs a device mapping table (also known as an interleave DPID table) based on the topology connection relationship between multiple resource switches in the resource call system. The target encoder is used to filter candidate resource devices (e.g., TYPE1 / 2 / 3CXL DEVICE 4) that meet the target demand information from one or more alternative resource devices recorded in the device mapping table. The target server address HPA, the target switch address SPID (Source PBR ID, also known as the source port-based routing identifier), and the candidate device address DPID (Destination PBR ID, also known as the destination port-based routing identifier) are encoded to obtain the call request REQ 2.
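As a rough illustration of the encoding step, the sketch below packs an HPA, SPID, and DPID into a single request. The field widths, byte order, the `MappingEntry` structure, and the `resource_type` matching rule are all assumptions made for the example; the application does not specify the layout of REQ 2 or the contents of the interleave DPID table.

```python
import struct
from typing import NamedTuple, Optional

# Hypothetical wire layout: 64-bit HPA, 16-bit SPID, 16-bit DPID, big-endian.
REQ_LAYOUT = ">QHH"


class MappingEntry(NamedTuple):
    """One row of a simplified device mapping table (assumed structure)."""
    dpid: int
    resource_type: str


def build_req2(hpa: int, spid: int, wanted_type: str,
               table: list[MappingEntry]) -> Optional[bytes]:
    """Pick the first alternative device that matches the demand and encode REQ 2."""
    for entry in table:
        if entry.resource_type == wanted_type:
            return struct.pack(REQ_LAYOUT, hpa, spid, entry.dpid)
    return None  # no alternative device meets the demand


table = [MappingEntry(0x021, "gpu"), MappingEntry(0x024, "memory")]
req2 = build_req2(hpa=0x1_0000_0000, spid=0x001, wanted_type="memory", table=table)
print(struct.unpack(REQ_LAYOUT, req2))  # (4294967296, 1, 36)
```

Carrying the SPID in the request is what later lets the candidate switch address its verification reply back to the originating switch.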

[0056] As one or more optional solutions, at least one resource switch is also deployed with a processor port, and the resource call system further includes a processor. The at least one resource switch is connected to the processor through the processor port. The at least one resource switch is used to collect the resource supply of the resource devices connected to the resource device ports, obtaining a first correspondence between the resource devices and resource supply on the at least one resource switch, wherein the resource supply is the amount of operating resources currently allowed to be provided by the corresponding resource device. The processor is used to obtain the first correspondence collected by at least one of the multiple resource switches from the processor port, obtaining multiple first correspondences; generate a second correspondence between the resource devices and resource supply on the resource call system based on the multiple first correspondences; and send the second correspondence to the target resource switch through the target processor port. The target encoder is used to match the candidate resource supply corresponding to at least one candidate resource device from the second correspondence, obtaining a third correspondence between the candidate resource device and the candidate resource supply; and filter candidate resource devices whose candidate resource supply is greater than or equal to the target resource demand of the target server from the third correspondence, wherein the target resource demand indicates the amount of operating resources currently needed by the target server.
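The three correspondences in this paragraph can be modelled with plain dictionaries. The following is a minimal sketch, assuming resource supply is a single number per device; the function names, device labels, and GiB figures are hypothetical.

```python
def merge_supplies(first_correspondences):
    """Processor side: merge per-switch {device: supply} maps into one
    system-wide map (the second correspondence)."""
    second = {}
    for per_switch in first_correspondences:
        second.update(per_switch)
    return second


def filter_candidates(second, alternatives, demand):
    """Encoder side: build the third correspondence for the alternative
    devices, then keep those whose supply is >= the demand."""
    third = {dev: second[dev] for dev in alternatives if dev in second}
    return [dev for dev, supply in third.items() if supply >= demand]


# Hypothetical supplies in GiB, one dictionary per resource switch.
first = [
    {"DEVICE 1": 64, "DEVICE 2": 128},
    {"DEVICE 3": 256, "DEVICE 4": 512},
]
second = merge_supplies(first)
print(filter_candidates(second, ["DEVICE 2", "DEVICE 3", "DEVICE 4"], demand=256))
# ['DEVICE 3', 'DEVICE 4']
```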

[0057] Optionally, in this embodiment, Figure 4 is a schematic diagram of the physical ports in a server's runtime resource call system according to an embodiment of this application. As shown in Figure 4, the server includes multiple server ports, such as PE0-5. PE stands for PCIE PORT (Peripheral Component Interconnect Express), which is the core computing resource port of a host. Each port is PCIE x16 and supports bifurcation.

[0058] In addition, at least one resource switch is a PBR CXL switch, i.e., a CXL switch with PBR functionality. At least one resource switch may include, but is not limited to, 10 physical ports, PORT0-9, each with a PCIe x16 physical channel supporting bifurcation. PORT0 is the default uplink management port; it manages PBR CXL switch information and, from a software perspective, fills in the registers. The remaining ports are configurable and can be defined based on PORT0. In this topology, PORT1-4 are used as uplink data ports to receive CXL data from the host. PORT5-6 are used to connect downlink CXL devices, which in this topology are CXL TYPE3 memory expansion devices. PORT7-9 are used to connect the remaining CXL switches to form a CXL FABRIC.
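The port assignment in this paragraph can be written down as a small configuration table. The sketch below covers only the roles named in this paragraph; the `PortRole` enum and `default_port_roles` function are hypothetical names, and a real switch would hold this configuration in registers programmed through the management port rather than in a Python dictionary.

```python
from enum import Enum


class PortRole(Enum):
    MANAGEMENT = "uplink management"
    HOST_UPLINK = "uplink data (host)"
    DEVICE_DOWNLINK = "downlink CXL device"
    FABRIC = "inter-switch fabric"


def default_port_roles():
    """Port-to-role map for the 10-port example topology (illustrative only)."""
    roles = {0: PortRole.MANAGEMENT}
    roles.update({p: PortRole.HOST_UPLINK for p in range(1, 5)})
    roles.update({p: PortRole.DEVICE_DOWNLINK for p in (5, 6)})
    roles.update({p: PortRole.FABRIC for p in range(7, 10)})
    return roles


roles = default_port_roles()
print([p for p, r in roles.items() if r is PortRole.HOST_UPLINK])  # [1, 2, 3, 4]
```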

[0059] PORT 6 connects to the DIMM BOARD (a memory expansion card with CXL memory resources placed in DDR5 form) via MXC (Type 3 memory expansion controller).

[0060] PORT 8 may be, but is not limited to, the processor port mentioned above. At least one resource switch PBR CXL SWITCH is connected to the processor MANAGEMENT MODULE (the module that manages the CXL SWITCH, which may be a small CPU (Central Processing Unit) or a combination module of BMC (Baseboard Management Controller) and PCIE SWITCH) through processor port PORT 8.

[0061] PORT 5 and 6 can be, but are not limited to, the ports for the resource devices mentioned above.

[0062] PORT 0, 7 and 9 can be, but are not limited to, the switch ports mentioned above.

[0063] PORT 1 to 4 can be, but are not limited to, the server ports mentioned above.
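
The port assignment of paragraph [0058] can be written down as a small lookup table. This is only an illustrative sketch: the `PortRole` names are hypothetical labels, and the table follows the topology of paragraph [0058], whereas paragraph [0060] describes PORT 8 as one possible processor port.

```python
from enum import Enum
from typing import Dict, List


class PortRole(Enum):
    MANAGEMENT = "default uplink management port"
    HOST_UPLINK = "uplink data port (host)"
    DEVICE_DOWNLINK = "downlink CXL device port"
    FABRIC = "link to another CXL switch"


# PORT0 is fixed; the other roles are one possible configuration.
PORT_ROLES: Dict[int, PortRole] = {0: PortRole.MANAGEMENT}
PORT_ROLES.update({p: PortRole.HOST_UPLINK for p in range(1, 5)})    # PORT1-4
PORT_ROLES.update({p: PortRole.DEVICE_DOWNLINK for p in (5, 6)})     # PORT5-6
PORT_ROLES.update({p: PortRole.FABRIC for p in range(7, 10)})        # PORT7-9


def ports_with_role(role: PortRole) -> List[int]:
    """Return the port numbers configured with the given role, in ascending order."""
    return sorted(p for p, r in PORT_ROLES.items() if r is role)
```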

[0064] The basic architecture of a CXL FABRIC-based server (which can be, but is not limited to, an AI (Artificial Intelligence) server) is thus formed by connecting resource devices to physical ports in the system through resource calls. After the host and GPU are powered on, the GPU receives PCIe resources and begins AI-related tasks. The CXL switch also simultaneously interacts with the host and MXC via uplink and downlink ports, converting this data into memory resources under the management of the MANAGEMENT MODULE. The memory resources configured for GPU acceleration according to system policies are not limited to the native memory resources directly connected to the host, but can also be synchronously obtained from the CXL extended memory pool (DIMM BOARD 1 to 8), accelerating the acquisition of memory addresses, improving data exchange efficiency, and reducing the data exchange error rate. The DIMM BOARD in Figure 4 can be, but is not limited to, equivalent to TYPE1 / 2 / 3 CXL DEVICE in Figure 2.

[0065] As one or more optional solutions, interconnected target decoders and target converters are deployed on the candidate resource switch. The target decoder is used to decode the candidate device address of the candidate resource device to be verified, the target server address of the target server, and the target switch address of the target resource switch from the call request; detect the current usage status of the candidate resource device; if the usage status is idle, transmit a reference verification message carrying the candidate device address, target switch address, and target server address to the target converter; and switch the current usage status of the candidate resource device to occupied status. The target converter is used to extract the target switch address from the reference verification message upon receiving it; convert the reference verification message into a target verification message; and send the target verification message to the target switch address, wherein the target verification message is used to indicate that the candidate resource device is a verified reference resource device.
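
The cooperation between the target decoder and the target converter can be sketched as follows. The class and field names (`CallRequest`, `VerificationMessage`, `CandidateSwitch`) are hypothetical, addresses are modelled as plain integers, and the status change is applied inside the decoder for brevity; the sketch does not reproduce any actual message format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Status(Enum):
    IDLE = "idle"
    OCCUPIED = "occupied"


@dataclass(frozen=True)
class CallRequest:
    device_addr: int   # candidate device address
    switch_addr: int   # target switch address
    server_addr: int   # target server address


@dataclass(frozen=True)
class VerificationMessage:
    destination: int   # target switch address the message is sent to
    device_addr: int
    server_addr: int


class CandidateSwitch:
    def __init__(self, devices: Dict[int, Status]):
        self.devices = devices  # device address -> current usage status

    def decode(self, request: CallRequest) -> Optional[CallRequest]:
        """Target decoder: pass the request on only if the device is idle."""
        if self.devices.get(request.device_addr) is not Status.IDLE:
            return None  # occupied or unknown device: verification fails
        self.devices[request.device_addr] = Status.OCCUPIED
        return request  # stands in for the reference verification message

    def convert(self, reference: CallRequest) -> VerificationMessage:
        """Target converter: address the message to the target switch."""
        return VerificationMessage(
            destination=reference.switch_addr,
            device_addr=reference.device_addr,
            server_addr=reference.server_addr,
        )

    def handle(self, request: CallRequest) -> Optional[VerificationMessage]:
        reference = self.decode(request)
        return self.convert(reference) if reference is not None else None
```

A second request for the same device returns `None` in this sketch, because the first request has already switched the device to the occupied status.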

[0066] Optionally, in this embodiment, Figure 5 is a schematic diagram of the operation of a target decoder according to an embodiment of this application. As shown in Figure 5, the GFD (Global Fabric Access Memory Device, where "global" refers to the entire system or network scope, "Fabric" usually refers to the network structure, and "access memory device" refers to the device used to access memory resources) initiates the target request information REQ 1 and sends the host physical address HPA to the target encoder. The target encoder encodes the data according to the Fabric Address Segment Table based on the received information and re-enters the target encoder. Then, it encodes the data according to the Interleave DPID Table. Thus, the output call request REQ 2 field contains DPID (candidate device address), SPID (target switch address), and HPA. The source PBR ID enters the GFD decoder, awaiting the next step. The GFD decoder receives the destination PBR ID, and the requested content enters the SPID decoder for Dynamic Capacity Protection. Then, based on the DPA consistency parameter, the data is calculated and enters the Snoop filter. The filter identifies and matches the relationship group DPID = PID x DPA, and enters the GFD Decoder Table to match it with the previous SPID. This result is then sent to the SPID decoder (the target decoder). After decoding, the SPID decoder sends an invalid acknowledgment message to the target converter. The target converter modifies the received information according to a standard format and outputs an invalid HOST PBR address back to the host. This completes the process of the host acquiring memory via PBR SWITCH.

[0067] As one or more alternative solutions, the operating resource call system also includes a resource expansion device, wherein at least one resource switch in the operating resource call system allows a resource expansion device to be connected through a resource device port; at least one resource expansion device is used to expand a connected resource device port into multiple ports that allow connection of resource devices.

[0068] Optionally, in this embodiment, the resource expansion device may be, but is not limited to, MXC.

[0069] As one or more optional solutions, an expansion controller and multiple device expansion ports are deployed on the resource expansion device. At least one of the multiple device expansion ports on the resource expansion device is connected to a resource device port through the expansion controller. At least one device expansion port is used to connect a resource device. The expansion controller is used to call the running resources of the resource device connected to the at least one device expansion port for the resource switch to which the connected resource device port belongs.

[0070] Optionally, in this embodiment, a resource device port can be expanded into multiple device expansion ports through the expansion controller of the resource expansion device, thereby achieving further resource device expansion.

[0071] As one or more optional solutions, the runtime resource invocation system also includes a processor, a target resource switch with a target resource device port connected to a target resource device, a target processor port deployed on the target resource switch, and interconnected resource detectors and request generators. The resource detector is connected to the processor through the target processor port. The resource detector is used to receive reference demand information from the target server through the target server port, wherein the target resource switch is connected to the target server through the target server port, and the reference demand information is used to indicate that the target server needs to invoke runtime resources. It detects whether the runtime resources allowed to be provided by the target resource device meet the reference demand information. If it is detected that the runtime resources allowed to be provided by the target resource device do not meet the reference demand information, it sends a target generation instruction to the request generator. If it is detected that the runtime resources allowed to be provided by the target resource device meet the reference demand information, it sends reference invocation information to the processor through the target processor port, wherein the reference invocation information is used to indicate that the runtime resources on the target resource device are allowed to be invoked by the target server. The request generator is used to generate target demand information upon receiving the target generation instruction. The processor is used to control the target server to invoke the runtime resources on the target resource device upon receiving the reference invocation information.

[0072] Optionally, in this embodiment, as shown in Figure 4, the runtime resource call system further includes a processor (e.g., the management CPU), a target resource switch (e.g., PBR CXL SWITCH 1) whose target resource device port (e.g., PORT 5) is connected to the target resource device (e.g., DIMM BOARD 1), a target processor port (e.g., PORT 8) deployed on the target resource switch, and an interconnected resource detector and request generator. The resource detector is connected to the processor through the target processor port. The resource detector is used to receive reference requirement information from the target server (e.g., HOST 1) through the target server port (e.g., PORT 2), wherein the target resource switch connects to the target server through the target server port, and the reference requirement information is used to indicate that the target server needs to call runtime resources. The resource detector detects whether the runtime resources allowed to be provided by the target resource device meet the reference requirement information. If they do not, it sends a target generation instruction to the request generator. If they do, it sends reference call information to the processor through the target processor port, wherein the reference call information is used to indicate that the runtime resources on the target resource device are allowed to be called by the target server. The request generator is used to generate target requirement information upon receiving the target generation instruction. The processor is used to control the target server to call runtime resources on the target resource device upon receiving the reference call information.

[0073] The above method sets the priority with which the target server accesses runtime resources. Target server HOST 1 needs runtime resources to run services; for example, but not limited to, the GPU on HOST 1 needs runtime resources to run AI services. The GPU on HOST 1 preferentially uses the memory resources (one type of runtime resource) on the SSD (Solid State Drive) directly connected to HOST 1. If the memory resources on that SSD are insufficient for the GPU to run AI services, resource devices on the directly connected resource switch are called (e.g., the target resource device DIMM BOARD 1 on the target resource switch PBR CXL SWITCH 1). If the memory resources on the target resource device DIMM BOARD 1 are also insufficient for the GPU to run AI services, resource devices on other resource switches are further called.
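
The three-level priority described above can be sketched as a simple selection function. The function name `choose_source`, the tier labels, and the use of a single number per tier are assumptions made for illustration only.

```python
from typing import Dict, Optional


def choose_source(demand: int, local_free: int, direct_free: int,
                  remote_free: Dict[str, int]) -> Optional[str]:
    """Return the first tier, in priority order, that can satisfy the demand.

    local_free:  resources directly connected to the host
    direct_free: resources on the device attached to the host's own switch
    remote_free: resources on devices attached to other switches
    """
    if local_free >= demand:
        return "local"
    if direct_free >= demand:
        return "direct"
    for device, free in remote_free.items():
        if free >= demand:
            return device  # a device on another resource switch
    return None  # no tier can satisfy the demand
```

For example, with a demand of 64 units, no local capacity, 32 units on the directly connected device, and 128 units on a hypothetical remote device "DIMM BOARD 4", the function returns "DIMM BOARD 4".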

[0074] As one or more optional solutions, the target resource switch also deploys multiple virtual switches and multiple conversion ports. At least one virtual switch allows connection to a corresponding server through a server port. At least one virtual switch deploys multiple virtual ports, and at least one virtual port allows the establishment of a resource call path with one or more conversion ports. At least one conversion port is used to convert the connected at least one virtual port into a physical port, wherein at least one physical port is connected to a corresponding resource device port, and the resource call path is the path for calling and running resources. A processor is used to, upon receiving reference call information, locate the target virtual switch connected to the target server port from among the multiple virtual switches; filter the target conversion port connected to the target resource device port from among the multiple conversion ports; and control the establishment of a target resource call path between the target conversion port and the target virtual port in the target virtual switch.

[0075] Optionally, in this embodiment, Figure 6 is a schematic diagram of a target resource switch according to an embodiment of this application. As shown in Figure 6, the target resource switch also deploys multiple virtual switches (VCS, Virtual Component Switch) and multiple conversion ports (PHY PORT, PPB0, and PPB1). At least one conversion port is used to convert at least one connected virtual port into a physical port for connection to a resource device (e.g., PCIE / CXL DEVICE, TYPE3 POOLED0, and TYPE3 POOLED1). The resource device can be an SLD (Single Logical Device) or an MLD (Multi Logical Device). At least one virtual switch allows connection to a corresponding server HOST through a server port ROOT PORT. At least one virtual switch deploys multiple virtual ports (VPPB1 and VPPB2). The ROOT PORT is the data port connecting the HOST, serving as an uplink port. A VPPB is a virtual PCI-to-PCI bridge within the CXL switch that is owned by the host. It can be bound to a port that is disconnected, connected to a PCIe component, or connected to a CXL component. A VCS comprises the entities within a physical switch that belong to a single VH (virtual hierarchy) and is identified by a VCS ID (e.g., VCS 0 and 1).
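
Establishing a resource call path between a virtual port and a conversion port can be sketched as a binding table. The class `ResourceSwitch` and its methods are hypothetical, and the sketch assumes the simplest case in which a conversion port is bound to at most one virtual port at a time (it does not model an MLD shared by several virtual switches).

```python
from typing import Dict, List, Optional, Tuple


class ResourceSwitch:
    def __init__(self, vcs_ports: Dict[int, List[str]], ppbs: List[str]):
        # (VCS ID, vPPB name) -> bound PPB name, or None when unbound
        self.bindings: Dict[Tuple[int, str], Optional[str]] = {
            (vcs_id, vppb): None
            for vcs_id, vppbs in vcs_ports.items()
            for vppb in vppbs
        }
        self.ppbs = set(ppbs)

    def bind(self, vcs_id: int, vppb: str, ppb: str) -> None:
        """Establish a resource call path between a vPPB and a PPB."""
        key = (vcs_id, vppb)
        if key not in self.bindings:
            raise KeyError(f"unknown virtual port {key}")
        if ppb not in self.ppbs:
            raise KeyError(f"unknown conversion port {ppb}")
        if self.bindings[key] is not None:
            raise ValueError(f"{key} is already bound")
        if ppb in self.bindings.values():
            raise ValueError(f"{ppb} is already bound to another virtual port")
        self.bindings[key] = ppb

    def unbind(self, vcs_id: int, vppb: str) -> None:
        """Release the resource call path of a vPPB."""
        self.bindings[(vcs_id, vppb)] = None
```

With two virtual switches (VCS 0 and VCS 1), each holding VPPB1 and VPPB2, calling `bind(0, "VPPB2", "PPB1")` would record the path from VPPB2 of VCS 0 to PPB1.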

[0076] After the PBR CXL SWITCH starts and performs its initialization, the processor port manages it via the PCIe port. Management primarily involves configuring the PPB, including static and dynamic configuration. Management is an external logical process that uses standardized commands to query and configure the system's operational status. The processor determines when reconfiguration is needed and initiates commands to execute the configuration. This can take any form, including but not limited to software running on the host, embedded software running on the BMC, embedded firmware running on another CXL device or CXL switch, or a state machine running within the CXL device itself. The processor transmits its management commands through the CXL SWITCH's CCI (Component Command Interface) channel, performing tasks such as memory allocation, telemetry management, logical device erasure, and error handling. Dynamic management allows monitoring of the system's expanded memory pool allocation, observing which GPU is allocating memory resources in real time, and enabling the allocation of additional memory.

[0077] The methods and embodiments provided in this application can be executed in a server device or a similar computing device. Taking a server device as an example, Figure 7 is a hardware structure block diagram of a computer device for a server's resource retrieval system according to an embodiment of this application. As shown in Figure 7, the server device may include one or more (only one is shown in Figure 7) processors 102 (processor 102 may include, but is not limited to, microprocessors MCU or programmable logic devices FPGA, etc.) and a memory 104 for storing data. The server device may also include a transmission device 106 for communication functions and an input / output device 108. Those skilled in the art will understand that the structure shown in Figure 7 is only illustrative and does not limit the structure of the server device. For example, the server device may also include more or fewer components than shown in Figure 7, or have a different configuration than shown in Figure 7.

[0078] The memory 104 can be used to store computer-readable instructions, such as software programs and modules of application software, such as the computer-readable instructions corresponding to the server's runtime resource call system in this embodiment. The processor 102 executes various functional applications and data processing by running the computer-readable instructions stored in the memory 104, thereby implementing the above-described method. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory 104 may further include memory remotely located relative to the processor 102, and these remote memories can be connected to the server device via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.

[0079] The transmission device 106 is used to receive or send data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider for the server device. In one example, the transmission device 106 includes a Network Interface Controller (NIC), which can connect to other network devices via a base station to communicate with the Internet. In another example, the transmission device 106 may be a Radio Frequency (RF) module used for wireless communication with the Internet.

[0080] This embodiment provides a method for invoking server runtime resources. Figure 8 is a flowchart of a method for invoking server runtime resources according to an embodiment of this application. As shown in Figure 8, the process includes the following steps:

[0081] Step S12: Send a call request to the candidate resource switch. The candidate resource switch is connected to candidate resource devices that meet the target requirements information of the target server. The target server is connected to the target resource switch. The target resource switch is allowed to call the running resources connected to other resource switches through routing. The target requirements information is used to indicate that the running resources connected to other resource switches need to be called.

[0082] Optionally, in this embodiment, the server's runtime resource invocation method can be applied to the target resource switch in the runtime resource invocation system, and the target resource switch is a resource switch directly connected to the target server. As shown in Figure 4, when HOST 1 is the target server, PBR CXL SWITCH 1 is the target resource switch, and the other resource switches (e.g., PBR CXL SWITCH 2, 3 and 4) are candidate resource switches.

[0083] Step S14: Obtain the information that the reference resource device has passed verification. The candidate resource switch is used to respond to the call request, verify the candidate resource device according to the usage status of the candidate resource device, notify the target resource switch that the reference resource device has passed verification, and switch the usage status of the reference resource device to the occupied status.

[0084] Optionally, in this embodiment, taking DIMM BOARD 4 as a candidate resource device as an example, DIMM BOARD 4 is connected to PBR CXL SWITCH 2. Therefore, PBR CXL SWITCH 2, as a candidate resource switch, responds to the call request sent by PBR CXL SWITCH 1 and verifies the candidate resource device based on its usage status. For example, if the current usage status of candidate resource device DIMM BOARD 4 is idle, then candidate resource device DIMM BOARD 4 is determined to be a verified reference resource device. If the current usage status of candidate resource device DIMM BOARD 4 is occupied, then candidate resource device DIMM BOARD 4 has failed verification.

[0085] Step S16: Send the target call information corresponding to the verified reference resource device to the target server and establish a target route. The target call information is used to indicate that the running resources on the reference resource device are currently allowed to be called, and the target route is used to indicate the resource call path between the reference resource device and the target server.

[0086] Optionally, in this embodiment, all resource switches in the running resource call system are PBR CXL SWITCHes, that is, CXL switches with the PBR function. The PBR (port-based routing) function allows running resources connected to other resource switches to be called through routing.

[0087] As shown in Figure 4, the system needs to power on the CXL DEVICE and GPU first, following the same power-on sequence as standard PCIe devices. It then waits for the HOST to power on for information transmission. In this ring topology, the CXL DEVICE and GPU wait for the PBR CXL SWITCH to power on before initialization. At the same time, the HOST and the MANAGEMENT CPU start up and establish data transmission channels with their uplink and downlink devices according to the standard PCIe TRAINING process.

[0088] Before the host acquires extended memory (running resources) from the PBR CXL SWITCH for the first time, the processor needs to establish a TRAINING management link with the PBR CXL SWITCH. This management link is established after the PBR CXL SWITCH performs its initial boot. The processor first acquires static PBR CXL SWITCH information, including port configuration (direction, upstream or downstream), bandwidth, supported speeds, the number of vPPBs for at least one VCS, initial port binding configuration, CCI (Component Command Interface) access settings, and any vendor-defined management permission settings. This acquisition of information constitutes the processor's static management of the PBR CXL SWITCH.

[0089] The GPU begins its task by first accessing the memory resources directly connected to the CPU, and then retrieving memory resources from the system's extended memory pool via the PBR CXL SWITCH. The process of retrieving runtime resources from the PBR CXL SWITCH extended memory pool is shown in Figure 5. The host first sends a memory address access request (REQ 1). After receiving the message, the uplink port of the CXL PBR SWITCH encodes it using the target encoder (Message Encoder). First, it converts the initial HOST PHYSICAL ADDRESS according to the Fabric Address Segment Table format. Then, it converts the converted message according to the Interleave DPID Table format and sends it to the target decoder (SPID Decoder). The request sent to the target decoder includes the Destination PBR ID, SOURCE PBR ID, and HOST PHYSICAL ADDRESS. The SPID is sent to the Global Fabric Access Memory Device (GFD Decoder Table). Following the standard format of the G-FAM Table, it waits for the DPID to enter the decoder's response confirmation before proceeding with routing within the PBR FABRIC. After the packet containing the SPID and DPA is sent to the target decoder, the dynamic capacity information is parsed and asset information protection is performed. Simultaneously, based on the DPA, continuous calculations are performed and sent to the Snoop filter. The Snoop filter forms a DPID=(PIDx, DPA) group from the received TAG-prefixed packets and sends it to the DPID decoder and GDT. In the GDT, the packet is encoded and matched with the previous SPID. If a match is found, routing within the CXL FABRIC is successful, and the next step can proceed. If no match is found, the DPID group information is resent to the SPID decoder for repeated processing. If a match is found within a certain number of attempts, the CXL FABRIC allows continued memory access; otherwise, the system retains the request and reports an error to the upper-layer software. If routing is successful after matching within the GDT, the DPID decoder further decodes the packet and sends it, containing the DPID, SPID, and HPA, in BISnp format to the converter. The converter filters the packet information according to certain conversion rules and sends the HOST PHYSICAL ADDRESS packet in BISnp format to the host. This allows the host to obtain memory resources based on the PBR CXL FABRIC extension. The GPU then obtains this memory from the host via the PCIe channel to accelerate business operations (AI training).
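
The bounded retry described above, in which matching is repeated a certain number of times before an error is reported, can be sketched as follows. The callable `match_once` is a hypothetical stand-in for one matching pass in the GDT, and the default of three attempts is an arbitrary assumption; this application does not specify the limit.

```python
from typing import Callable


def route_with_retry(match_once: Callable[[], bool], max_attempts: int = 3) -> int:
    """Repeat the matching pass until it succeeds or the attempt limit is reached.

    Returns the number of the attempt that succeeded. Raises RuntimeError when
    no attempt succeeds, standing in for the error reported to upper-layer software.
    """
    for attempt in range(1, max_attempts + 1):
        if match_once():
            return attempt
    raise RuntimeError(f"no match after {max_attempts} attempts")
```

A caller would keep the pending request and handle the `RuntimeError` as the error report.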

[0090] After the host first acquires the extended memory from the CXL FABRIC, the processor will dynamically manage it. Dynamic management means that once the CCI is operational, the processor can send management commands to the PBR CXL SWITCH. The processor can perform actions on the PBR CXL SWITCH, such as querying PBR CXL SWITCH information and configuration details, binding or unbinding ports, and registering to receive and process event notifications from the PBR CXL SWITCH (e.g., hot-plugging, accidental removal, and failure). The DIMM BOARD is an MLD device; the processor can perform MLD discovery, LD binding / unbinding, route management commands, and connect to the MLD by transmitting its management commands through the CCI channel of the PBR CXL SWITCH, either directly connected or connected via a device. The processor can perform memory allocation and QoS (Quality of Service) telemetry management, security (e.g., LD erasure after unbinding), error handling, etc. When the memory extended by the PBR CXL SWITCH is heavily used, the processor can monitor which port requires a large amount of memory resources and infer that the GPU connected to the host is under heavy workload and memory resources are scarce. Then, it can transmit corresponding commands in the processor's CCI channel to change the route for the host to acquire memory within the CXL FABRIC. For example, as shown in Figure 6, VPPB2 in VCS0 of the CXL FABRIC, which is connected to the virtual physical converter, can be connected to PPB1 to acquire memory resources from VCS1. This achieves dynamic allocation of the memory extended by the CXL FABRIC in AI work, improving memory utilization and the efficiency of AI inference and training.

[0091] In a ring topology with four PBR CXL SWITCHes, different hosts can dynamically access memory by crossing only two PBR CXL SWITCHes. Other hosts can access non-DOMAIN memory, while the GPU still uses the memory accessed by its own host. This greatly increases the amount of memory resources available to the GPU, and accessing memory via the CXL FABRIC formed by crossing PBR CXL SWITCHes does not increase latency.
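
The bound of two for a four-switch ring can be checked with a short calculation. The sketch assumes that the switches are numbered 0 to 3 around the ring, that each is linked only to its two neighbours, and that "crossing" is counted as the number of switch-to-switch links on the shortest path; this counting convention is an interpretation, not a statement of this application.

```python
def ring_hops(src: int, dst: int, n: int = 4) -> int:
    """Shortest number of inter-switch links between two switches on an n-switch ring."""
    if not (0 <= src < n and 0 <= dst < n):
        raise ValueError("switch index out of range")
    clockwise = (dst - src) % n
    return min(clockwise, n - clockwise)


# Worst case over all pairs of switches in a four-switch ring.
worst_case = max(ring_hops(a, b) for a in range(4) for b in range(4))
```

Under these assumptions, the worst case occurs between opposite switches on the ring, such as switch 0 and switch 2.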

[0092] As one or more optional schemes, the target invocation information corresponding to the verified reference resource device is sent to the target server, including:

[0093] S21, Detect the currently received message;

[0094] S22, upon detecting a target verification message sent by the candidate resource switch, the target call information corresponding to the reference resource device is sent to the target server. The candidate resource switch is configured to send a target verification message to the target resource switch after the candidate resource device has been verified. The target verification message is used to indicate that the candidate resource device is a reference resource device that has been verified.

[0095] Optionally, in this embodiment, the target verification message may be, but is not limited to, a BISnp format message sent by the aforementioned converter.

[0096] As one or more alternatives, before sending a call request to the candidate resource switch, the method further includes:

[0097] S31, Obtain the device mapping table, wherein the device mapping table is pre-constructed based on the topology connection relationship between multiple resource switches in the running resource call system, including the target resource switch, and the device mapping table records one or more alternative resource devices that are allowed to provide running resources to the target server;

[0098] S32, select candidate resource devices that meet the target requirements from one or more alternative resource devices recorded in the device mapping table;

[0099] S33, the target server address, target switch address and candidate device address are encoded to obtain the call request, where the target server address is the address of the target server, the target switch address is the address of the target resource switch, and the candidate device address is the address of the candidate resource device.

[0100] Optionally, in this embodiment, the device mapping table may be, but is not limited to, the Interleave DPID Table described above, i.e., an interleaved DPID table. The target server address, target switch address, and candidate device address may be encoded using a target encoder (Message Encoder) to obtain the call request REQ 2.

[0101] As one or more alternative solutions, candidate resource devices that meet the target requirements are selected from one or more alternative resource devices recorded in the device mapping table, including:

[0102] S41, obtain the second correspondence between resource devices and resource supply on the running resource call system, wherein the second correspondence is generated based on multiple first correspondences, at least one of the multiple first correspondences is obtained by one of the multiple resource switches collecting the resource supply of the resource devices connected to its resource device ports, at least one first correspondence records the correspondence between resource devices and resource supply on a corresponding resource switch, and the resource supply is the amount of running resources that the corresponding resource device is currently allowed to provide;

[0103] S42, match, from the second correspondence, the alternative resource supply corresponding to at least one alternative resource device, to obtain the third correspondence between the alternative resource devices and the alternative resource supply;

[0104] S43, select candidate resource devices from the third correspondence relationship whose resource supply is greater than or equal to the target resource demand of the target server, wherein the target resource demand is used to indicate the amount of operating resources currently required by the target server.
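
Steps S42 and S43 can be sketched as a lookup followed by a threshold filter. The function name `select_candidates` and the device names are hypothetical, and supply and demand are assumed to be expressed in the same unit.

```python
from typing import Dict, List


def select_candidates(second: Dict[str, int], alternatives: List[str],
                      demand: int) -> List[str]:
    """Return the alternative devices whose supply meets or exceeds the demand."""
    # S42: restrict the system-wide table to the alternative devices,
    # giving the third correspondence.
    third = {dev: second[dev] for dev in alternatives if dev in second}
    # S43: keep devices whose supply is greater than or equal to the demand.
    return [dev for dev, supply in third.items() if supply >= demand]
```

A device listed in the device mapping table but absent from the second correspondence is skipped in this sketch rather than treated as an error.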

[0105] Optionally, in this embodiment, the second correspondence between resource devices and resource supply on the running resource call system may be obtained from, but is not limited to, the processor.

[0106] As one or more optional schemes, before encoding the target server address, target switch address, and candidate device address to obtain the invocation request, the method further includes:

[0107] S51, Receive the initial server address in reference address format sent by the target server;

[0108] S52, the initial server address is encoded from the reference address format to the target address format to obtain the target server address, wherein the target address format is an address format that the resource switch in the running resource call system is allowed to recognize;

[0109] S53, add the target server address, target switch address, and candidate device address to the initial request message to obtain the call request.

[0110] Optionally, in this embodiment, as mentioned above, when the GPU starts performing business operations, it first obtains the memory resources directly connected to the CPU, and then obtains memory resources from the system-extended memory pool through the PBR CXL SWITCH. The process of obtaining running resources from the memory pool extended by the PBR CXL SWITCH is shown in Figure 5. The HOST first sends a memory address access request REQ 1. After receiving the message, the uplink port of the CXL PBR SWITCH encodes it through the target encoder (Message Encoder). First, it converts the initial HOST PHYSICAL ADDRESS according to the Fabric Address Segment Table format; then it converts the converted message according to the Interleave DPID Table format and sends it to the target decoder (SPID Decoder). The request sent to the target decoder will include the Destination PBR ID, SOURCE PBR ID, and HOST PHYSICAL ADDRESS.

[0111] This can be achieved by, but is not limited to, using the target encoder to encode the initial server address from the reference address format to the target address format based on the Fabric Address Segment Table (i.e., first converting the initial HOST PHYSICAL ADDRESS according to the format of the Fabric Address Segment Table).

[0112] Add the target server address, target switch address, and candidate device address to the initial request message to obtain the call request. The target server address can be the HOST PHYSICAL ADDRESS, the target switch address can be the SOURCE PBR ID, and the candidate device address can be the Destination PBR ID.
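
Adding the three addresses to a request can be sketched as fixed-width bit packing. The field widths used here (12 bits for each PBR ID and 64 bits for the HOST PHYSICAL ADDRESS), the field order, and the function names are assumptions made for illustration; the sketch does not reproduce the actual REQ 2 layout or the address conversion based on the Fabric Address Segment Table.

```python
from typing import Tuple

PID_BITS = 12   # assumed width of the Destination / SOURCE PBR ID
HPA_BITS = 64   # assumed width of the HOST PHYSICAL ADDRESS
TOTAL_BYTES = (2 * PID_BITS + HPA_BITS + 7) // 8


def encode_call_request(dpid: int, spid: int, hpa: int) -> bytes:
    """Pack DPID, SPID and HPA into one big-endian byte string."""
    for name, value, bits in (("dpid", dpid, PID_BITS),
                              ("spid", spid, PID_BITS),
                              ("hpa", hpa, HPA_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    word = (dpid << (PID_BITS + HPA_BITS)) | (spid << HPA_BITS) | hpa
    return word.to_bytes(TOTAL_BYTES, "big")


def decode_call_request(raw: bytes) -> Tuple[int, int, int]:
    """Recover (dpid, spid, hpa) from a packed request."""
    word = int.from_bytes(raw, "big")
    hpa = word & ((1 << HPA_BITS) - 1)
    spid = (word >> HPA_BITS) & ((1 << PID_BITS) - 1)
    dpid = (word >> (PID_BITS + HPA_BITS)) & ((1 << PID_BITS) - 1)
    return dpid, spid, hpa
```

The decode function mirrors the role of the target decoder on the candidate resource switch, which recovers the same three addresses from the call request.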

[0113] As one or more alternatives, before sending the call request to the candidate resource switch, the method further includes:

[0114] S61, receive reference requirement information sent by the target server, wherein the reference requirement information is used to indicate that the target server needs to call runtime resources;

[0115] S62, in response to the reference requirement information, detect whether the runtime resources allowed to be provided by the target resource device connected to the target resource switch meet the reference requirement information;

[0116] S63, if it is detected that the runtime resources allowed to be provided by the target resource device do not meet the reference requirement information, generate the call request.

[0117] Optionally, in this embodiment, the target server sets a priority order for calling runtime resources. The runtime resources required by target server HOST 1 to run services may include, but are not limited to, the runtime resources needed by the GPU on HOST 1 to run AI services. The GPU on HOST 1 preferentially uses the memory resources (a type of runtime resource) on the SSD (Solid State Drive) directly connected to HOST 1. If those memory resources are insufficient for the GPU to run AI services, the resource devices on the directly connected resource switch are called (e.g., the target resource device DIMM BOARD1 on the target resource switch PBR CXL SWITCH 1). If the memory resources on the target resource device DIMM BOARD1 are also insufficient for the GPU to run AI services, resource devices on other resource switches are further called.
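The priority order described above can be sketched as a simple tiered lookup. This is an illustrative sketch only: the function name, the tier labels, and the use of plain integer capacities are assumptions and are not taken from this application.

```python
from typing import Optional


def pick_source(demand: int, local: int, direct: int,
                remote: dict[str, int]) -> Optional[str]:
    """Return the first tier able to supply `demand` units, in priority order.

    local  -- capacity directly connected to the host (e.g. HOST 1)
    direct -- capacity on the directly connected switch (e.g. DIMM BOARD1)
    remote -- capacity on other switches, keyed by a hypothetical switch name
    """
    if local >= demand:
        return "local"
    if direct >= demand:
        return "direct-switch"
    for switch, supply in remote.items():  # other switches, in listed order
        if supply >= demand:
            return switch
    return None  # no single tier can satisfy the demand
```

The sketch picks a single source per request; whether capacity from several tiers may be combined is not stated in this embodiment.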

[0118] As one or more optional solutions, receiving the reference requirement information sent by the target server includes:

[0119] S71, receive the reference resource demand from the target server, wherein: the target server includes a service operation device and a resource provider device; the reference resource demand is used to indicate the amount of operation resources required by the service operation device to currently run the service; the resource provider device is configured to provide operation resources to the service operation device; and the target server is configured to send the reference resource demand to the target resource switch when the amount of operation resources currently allowed to be provided by the resource provider device does not meet the amount of resources required by the service operation device to currently run the service;

[0120] S72, add the reference resource requirement to the initial requirement information to obtain the reference requirement information.

[0121] Optionally, in this embodiment, the target server may be, but is not limited to, HOST 1 in Figure 4, and the corresponding service running device may be, but is not limited to, the GPU on HOST 1.

[0122] As one or more optional solutions, detecting whether the operating resources allowed to be provided by the target resource device connected to the target resource switch meet the reference requirement information includes:

[0123] S81, detect the target resource supply of the target resource equipment;

[0124] S82, extract the reference resource demand from the reference demand information;

[0125] S83, when the target resource supply is greater than or equal to the reference resource demand, determine that the operating resources allowed to be provided by the target resource equipment meet the reference demand information;

[0126] S84, when the target resource supply is less than the reference resource demand, determine that the operating resources allowed to be provided by the target resource equipment do not meet the reference demand information.

[0127] Optionally, in this embodiment, the target resource switch determines whether the operating resources allowed to be provided by the target resource device meet the reference requirement information by detecting whether the target resource supply of the directly connected target resource device is greater than or equal to the reference resource requirement.
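Steps S72 and S81 to S84, together with the decision in S62 and S63, can be summarized in a short Python sketch. The dictionary key and the returned labels are hypothetical names chosen for illustration; only the greater-than-or-equal comparison comes from the steps above.

```python
def build_requirement(initial_info: dict, reference_demand: int) -> dict:
    """S72: add the reference resource demand to the initial requirement info."""
    info = dict(initial_info)  # copy so the caller's dict is not mutated
    info["reference_resource_demand"] = reference_demand
    return info


def supply_meets_demand(target_supply: int, requirement: dict) -> bool:
    """S81 to S84: compare the detected supply with the extracted demand."""
    demand = requirement["reference_resource_demand"]  # S82
    return target_supply >= demand                     # S83 / S84


def handle_requirement(target_supply: int, requirement: dict) -> str:
    """S62 and S63: decide whether a call request must be generated."""
    if supply_meets_demand(target_supply, requirement):
        return "serve-from-target-device"
    return "generate-call-request"
```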

[0128] The server operation resource call method proposed in this application uses four identical CXL SWITCHes with PBR functionality (PBR CXL SWITCH) to form a CXL FABRIC that works with an AI server, thereby accelerating AI operations. The four CXL SWITCHes are connected to the host via a PCIe physical link and interconnected to form a ring FABRIC. Using certain encoding and decoding, cross-host memory calls are performed within the FABRIC composed of CXL SWITCHes, forming a memory pool with a deeper pooling depth. The management CPU manages the entire CXL Fabric via a high-speed interface, employing a combination of dynamic and static management. It reads CXL switch information and configuration details, binds or unbinds ports, registers to receive and process event notifications from the CXL switch, handles device hot-plugging, and handles errors. It can define basic parameters of the switch and CXL Fabric, monitor memory resource usage to infer which host is under heavy workload, and use the CCI interface to feed this back to the Fabric composed of CXL switches so that resource allocation within the PBR Fabric can be optimized. This improves host memory efficiency, directly increasing the memory capacity available for GPU AI tasks and accelerating AI services.

[0129] It is worth noting that the server resource allocation method proposed in this application does not use a single switch connected to MXC and DIMM BOARD to expand memory, but instead uses four interconnected CXL switches for memory expansion. Using CXL switches with PBR functionality for memory expansion allows for cross-host memory access. The previous tree-like memory expansion topology is improved by adopting a ring topology to increase the flexibility of system memory access. High-speed management is achieved by a management module via the CCI interface, adding dynamic management to the previous static management. Previously, acceleration of AI services relied entirely on the native memory resources of the connected hosts; now, memory resources in the CXL extended memory pool can be accessed through the CXL FABRIC. During the dynamic management of the CXL FABRIC, the system's GPU load can be assessed based on the real-time memory access volume of each host, thereby improving the system's processing speed.
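As an illustration of assessing load from per-host memory access volume, the following Python sketch ranks hosts and flags those that account for a large share of total accesses. The function names, the share-based criterion, and the 0.5 default threshold are assumptions; this application does not state how the management CPU derives load from access volume.

```python
def rank_hosts_by_load(access_bytes: dict[str, int]) -> list[str]:
    """Order hosts from heaviest to lightest by memory access volume."""
    # Ties are broken by host name so the ordering is deterministic.
    return sorted(access_bytes, key=lambda host: (-access_bytes[host], host))


def heavy_hosts(access_bytes: dict[str, int],
                threshold_ratio: float = 0.5) -> list[str]:
    """Return hosts whose share of total access volume meets the threshold."""
    total = sum(access_bytes.values())
    if total == 0:
        return []  # nothing observed yet, so no host is flagged
    return [host for host in rank_hosts_by_load(access_bytes)
            if access_bytes[host] / total >= threshold_ratio]
```

A management component could use such a ranking as one input when deciding which hosts should receive capacity from the extended memory pool first.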

[0130] This solution effectively addresses the following issues: low memory utilization in tree-structured memory expansion topologies; the inability of hosts to access memory across switches; and the inability of servers to allocate runtime resources based on current business load.

[0131] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods according to the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, or network device, etc.) to execute the methods of the various embodiments of this application.

[0132] This embodiment also provides a server runtime resource allocation device for implementing the above embodiments and preferred embodiments; details already described will not be repeated. As used below, the term "module" can refer to a combination of software and / or hardware that performs a predetermined function. Although the device described in the following embodiments is preferably implemented in software, hardware implementation, or a combination of software and hardware, is also possible and contemplated.

[0133] Figure 9 is a structural block diagram of a server operation resource invocation device according to an embodiment of this application; as shown in Figure 9, it includes:

[0134] The first sending module 902 is used to send a call request to the candidate resource switch, wherein the candidate resource switch is connected to candidate resource devices that meet the target requirement information of the target server, the target server is connected to the target resource switch, the target resource switch is allowed to call the running resources connected to other resource switches through routing, and the target requirement information is used to indicate that the running resources connected to other resource switches need to be called.

[0135] The first acquisition module 904 is used to acquire information that the reference resource device has passed verification. The candidate resource switch is used to respond to the call request, verify the candidate resource device according to the usage status of the candidate resource device, notify the target resource switch that the reference resource device has passed verification, and switch the usage status of the reference resource device to the occupied status.

[0136] The second sending module 906 is used to send the target call information corresponding to the verified reference resource device to the target server and establish a target route. The target call information is used to indicate that the running resources on the reference resource device are currently allowed to be called, and the target route is used to indicate the resource call path between the reference resource device and the target server.
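The interaction among the three modules above can be sketched in Python as follows. The class and attribute names are hypothetical, and treating "idle" as the only status that passes verification is an assumption based on the usage-status check described for the candidate resource switch.

```python
from enum import Enum


class Status(Enum):
    IDLE = "idle"
    OCCUPIED = "occupied"


class CandidateSwitch:
    def __init__(self, devices):
        self.devices = dict(devices)  # device id -> Status

    def verify(self, device_id):
        """Pass verification only for an idle device, then mark it occupied."""
        if self.devices.get(device_id) is Status.IDLE:
            self.devices[device_id] = Status.OCCUPIED
            return True
        return False


class TargetSwitch:
    def __init__(self):
        self.routes = {}  # server -> device id (the target route)

    def call(self, server, candidate, device_id):
        """Send the call request; on success record the route and return
        the target call information, otherwise return None."""
        if not candidate.verify(device_id):
            return None
        self.routes[server] = device_id
        return {"server": server, "device": device_id, "callable": True}
```

Marking the device occupied inside `verify` means a second request for the same device is rejected until the device is released; releasing is outside the scope of this sketch.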

[0137] In one or more exemplary embodiments, the second sending module includes:

[0138] The first detection unit is used to detect the currently received message;

[0139] The sending unit is used to send the target call information corresponding to the reference resource device to the target server when a target verification message is detected sent by the candidate resource switch. The candidate resource switch is configured to send a target verification message to the target resource switch after the candidate resource device has been verified. The target verification message is used to indicate that the candidate resource device is a reference resource device that has been verified.

[0140] In one or more exemplary embodiments, the apparatus further includes:

[0141] The second acquisition module is used to acquire a device mapping table before sending a call request to the candidate resource switch. The device mapping table is pre-constructed based on the topological connection relationship between multiple resource switches in the running resource call system. The multiple resource switches include the target resource switch. The device mapping table records one or more alternative resource devices that are allowed to provide running resources to the target server.

[0142] The filtering module is used to filter out candidate resource devices that meet the target requirements from one or more alternative resource devices recorded in the device mapping table.

[0143] The first encoding module is used to encode the target server address, the target switch address, and the candidate device address to obtain the call request. The target server address is the address of the target server, the target switch address is the address of the target resource switch, and the candidate device address is the address of the candidate resource device.
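One way to pre-construct a device mapping table from the switch topology is a breadth-first traversal starting at the target resource switch, sketched below in Python. The data layout, the nearest-first ordering, and the exclusion of devices on the target resource switch itself are assumptions made for illustration; this application only states that the table is built from the topological connection relationship.

```python
from collections import deque


def build_device_map(links, devices, target):
    """Collect devices on switches reachable from `target`, nearest first.

    links   -- switch name -> list of directly connected switch names
    devices -- switch name -> list of resource device names
    target  -- name of the target resource switch
    """
    seen, order, queue = {target}, [], deque([target])
    while queue:
        switch = queue.popleft()
        if switch != target:  # only devices on other switches are listed
            order.extend(devices.get(switch, []))
        for neighbour in links.get(switch, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order
```

For a ring of four switches, the two neighbours of the target switch are listed before the switch on the opposite side of the ring.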

[0144] In one or more exemplary embodiments, the filtering module includes:

[0145] The acquisition unit is used to acquire a second correspondence between resource devices and resource supply on the running resource call system. The second correspondence is generated based on multiple first correspondences. At least one first correspondence is obtained by one of the multiple resource switches collecting the resource supply of the resource devices connected to the resource device ports. At least one first correspondence records the correspondence between resource devices and resource supply on a corresponding resource switch. The resource supply is the amount of running resources that the corresponding resource device is currently allowed to provide.

[0146] The matching unit is used to match at least one alternative resource supply quantity corresponding to an alternative resource device from the second correspondence relationship, so as to obtain a third correspondence relationship between the alternative resource device and the alternative resource supply quantity;

[0147] The filtering unit is used to filter out candidate resource devices from the third correspondence relationship whose alternative resource supply is greater than or equal to the target resource demand of the target server, wherein the target resource demand is used to indicate the amount of operating resources currently required by the target server.
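The three correspondences used by the filtering module can be illustrated with plain dictionaries. This is a minimal sketch under assumed names: each first correspondence is modeled as a per-switch mapping from device to supply, and device names are assumed to be unique across switches.

```python
def merge_first_correspondences(firsts):
    """Build the second correspondence (device -> supply, system-wide)
    from the first correspondences collected by each switch."""
    second = {}
    for per_switch in firsts:
        second.update(per_switch)
    return second


def select_candidates(firsts, alternatives, demand):
    """Return alternative devices whose supply is at least `demand`."""
    second = merge_first_correspondences(firsts)
    # Third correspondence: only the alternative devices and their supply.
    third = {dev: second[dev] for dev in alternatives if dev in second}
    return [dev for dev, supply in third.items() if supply >= demand]
```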

[0148] In one or more exemplary embodiments, the apparatus further includes:

[0149] The first receiving module is used to receive the initial server address in reference address format sent by the target server before encoding the target server address, target switch address and candidate device address to obtain the call request;

[0150] The second encoding module is used to encode the initial server address from the reference address format to the target address format to obtain the target server address, wherein the target address format is an address format that the resource switch in the running resource call system is allowed to recognize;

[0151] The adding module is used to add the target server address, target switch address, and candidate device address to the initial request message to obtain the call request.

[0152] In one or more exemplary embodiments, the apparatus further includes:

[0153] The second receiving module is used to receive reference demand information sent by the target server before sending a call request to the candidate resource switch. The reference demand information is used to indicate that the target server needs to call the running resources.

[0154] The response module is used to respond to the reference requirement information and detect whether the operating resources allowed to be provided by the target resource device connected to the target resource switch meet the reference requirement information.

[0155] The generation module is used to generate a call request when it is detected that the runtime resources allowed to be provided by the target resource device do not meet the reference requirements.

[0156] In one or more exemplary embodiments, the second receiving module includes:

[0157] The receiving unit is used to receive the reference resource requirement issued by the target server. The target server includes a service operation device and a resource providing device. The reference resource requirement is used to indicate the amount of operating resources required by the service operation device to currently run the service. The resource providing device is configured to provide operating resources to the service operation device. The target server is configured to send the reference resource requirement to the target resource switch when the amount of operating resources that the resource providing device is currently allowed to provide does not meet the amount of resources required by the service operation device to currently run the service.

[0158] The add unit is used to add the reference resource requirement to the initial requirement information to obtain the reference requirement information.

[0159] In one or more exemplary embodiments, the response module includes:

[0160] The second detection unit is used to detect the target resource supply of the target resource equipment.

[0161] The extraction unit is used to extract the reference resource demand from the reference demand information;

[0162] The first determining unit is used to determine, when the target resource supply is greater than or equal to the reference resource demand, that the operating resources allowed to be provided by the target resource equipment meet the reference demand information.

[0163] The second determining unit is used to determine, when the supply of target resources is less than the demand for reference resources, that the operating resources allowed to be provided by the target resource equipment do not meet the reference demand.

[0164] It should be noted that the above modules can be implemented by software or hardware. For the latter, they can be implemented in the following ways, but are not limited to: all the above modules are located in the same processor; or, the above modules are located in different processors in any combination.

[0165] Referring to Figure 11, embodiments of this application also provide a computer program product, including computer-readable instructions that, when executed by a processor, implement the steps of the methods in various embodiments of this application; the computer program product further includes a non-volatile, non-transitory computer-readable storage medium that stores computer-readable instructions that, when executed by a processor, implement the steps of the methods in various embodiments of this application.

[0166] Referring to Figure 12, an embodiment of this application also provides a non-transitory computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions are configured to execute the steps in any of the above method embodiments at runtime.

[0167] In one or more exemplary embodiments, the aforementioned non-transitory computer-readable storage medium may include, but is not limited to, various media capable of storing computer-readable instructions, such as USB flash drives, read-only memory (ROM), random access memory (RAM), portable hard drives, magnetic disks, or optical disks.

[0168] Embodiments of this application also provide an electronic device. Figure 10 is a schematic diagram of an electronic device according to an embodiment of this application. As shown in Figure 10, the electronic device includes a memory and a processor. The memory stores computer-readable instructions, and the processor is configured to run the computer-readable instructions to perform the steps in any of the above method embodiments.

[0169] In one or more exemplary embodiments, the electronic device may further include a transmission device and an input / output device, wherein the transmission device is connected to the processor and the input / output device is connected to the processor.

[0170] Specific examples in this embodiment can be found in the examples described in the above embodiments and exemplary implementations, and will not be repeated here.

[0171] Obviously, those skilled in the art should understand that the modules or steps of this application described above can be implemented using general-purpose computing devices. They can be centralized on a single computing device or distributed across a network of multiple computing devices. They can be implemented using computer-executable program code, and thus can be stored in a storage device for execution by a computing device. In some cases, the steps shown or described can be performed in a different order than those presented here, or they can be fabricated as separate integrated circuit modules, or multiple modules or steps can be fabricated as a single integrated circuit module. Thus, this application is not limited to any particular combination of hardware and software.

[0172] The above description is merely a preferred embodiment of this application and is not intended to limit this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the principles of this application should be included within the protection scope of this application.

Claims

1. A running resource calling system of a server, characterized by comprising: multiple resource switches, wherein at least one of the resource switches is deployed with switch ports, server ports and resource device ports, at least one of the resource switches is connected to at least one reference resource switch through the switch ports, at least one of the resource switches is allowed to call the running resources connected to other resource switches through routing, the server port is used to connect to a server, and the resource device port is used to connect to a resource device; a target resource switch connected to the target server, configured to send a call request to a candidate resource switch, wherein the candidate resource switch is connected to candidate resource devices that meet the target requirement information of the target server, and the target requirement information is used to indicate that the running resources connected to other resource switches need to be called; and to send target call information corresponding to the verified reference resource device to the target server and establish a target route, wherein the target call information is used to indicate that the running resources on the reference resource device are currently allowed to be called, and the target route is used to indicate the resource call path between the reference resource device and the target server; and the candidate resource switch, configured to respond to the call request, verify the candidate resource device according to its usage status, notify the target resource switch that the reference resource device has passed verification, and switch the usage status of the reference resource device to an occupied status.

2. The system according to claim 1, characterized in that the target resource switch is also deployed with a target encoder; the target encoder pre-constructs a device mapping table based on the topology connection relationship between multiple resource switches in the running resource call system, wherein the device mapping table records one or more alternative resource devices that are allowed to provide running resources to the target server; and the target encoder is used to filter out the candidate resource devices that meet the target requirement information from the one or more alternative resource devices recorded in the device mapping table, and to encode the target server address, target switch address, and candidate device address to obtain the call request, wherein the target server address is the address of the target server, the target switch address is the address of the target resource switch, and the candidate device address is the address of the candidate resource device.

3. The system according to claim 2, characterized in that at least one of the resource switches is also equipped with a processor port, the running resource call system further includes a processor, and at least one of the resource switches is connected to the processor through the processor port; at least one of the resource switches is used to collect the resource supply of the resource devices connected to the resource device ports to obtain a first correspondence between the resource devices and the resource supply on that resource switch, wherein the resource supply is the amount of operating resources that the corresponding resource device is currently allowed to provide; the processor is configured to obtain, through the processor port, the first correspondence collected by at least one of the resource switches, thereby obtaining multiple first correspondences; generate a second correspondence between the resource devices and the resource supply on the running resource call system based on the multiple first correspondences; and send the second correspondence to the target resource switch through the target processor port; and the target encoder is used to match at least one candidate resource supply corresponding to the candidate resource device from the second correspondence to obtain a third correspondence between the candidate resource device and the candidate resource supply; and to filter out, from the third correspondence, candidate resource devices whose candidate resource supply is greater than or equal to the target resource requirement of the target server, wherein the target resource requirement is used to indicate the amount of operating resources currently needed by the target server.

4. The system according to claim 1, characterized in that the candidate resource switch is equipped with a target decoder and a target converter that are interconnected; the target decoder is used to decode, from the call request, the candidate device address of the candidate resource device to be verified, the target server address of the target server, and the target switch address of the target resource switch; detect the current usage status of the candidate resource device; and, if the usage status is idle, transmit a reference verification message carrying the candidate device address, the target switch address, and the target server address to the target converter and switch the current usage status of the candidate resource device to the occupied status; and the target converter is configured to, upon receiving the reference verification message, extract the target switch address from the reference verification message, convert the reference verification message into a target verification message, and send the target verification message to the target switch address, wherein the target verification message is used to indicate that the candidate resource device is the verified reference resource device.

5. The system according to claim 1, characterized in that the running resource call system also includes a resource expansion device, and at least one of the resource switches in the running resource call system allows a resource expansion device to be connected through the resource device port; and at least one of the resource expansion devices is used to expand a connected resource device port into multiple ports that allow connection to the resource device.

6. The system according to claim 5, characterized in that, The resource expansion device is equipped with an expansion controller and multiple device expansion ports, and at least one of the multiple device expansion ports on the resource expansion device is connected to a resource device port through the expansion controller; At least one of the device expansion ports is used to connect one of the resource devices; and The extension controller is used to invoke the operating resources of at least one of the resource devices connected to the extended port of the resource device for the resource switch to which the connected resource device port belongs.

7. The system according to claim 1, characterized in that, The running resource call system also includes a processor, the target resource device port of the target resource switch is connected to the target resource device, the target resource switch deploys a target processor port, and resource detectors and request generators are interconnected, the resource detector is connected to the processor through the target processor port; The resource detector is configured to receive reference demand information from the target server via the target server port, wherein the target resource switch is connected to the target server via the target server port, and the reference demand information indicates that the target server needs to call runtime resources; detect whether the runtime resources allowed to be provided by the target resource device meet the reference demand information; if the runtime resources allowed to be provided by the target resource device do not meet the reference demand information, send a target generation instruction to the request generator; if the runtime resources allowed to be provided by the target resource device meet the reference demand information, send reference call information to the processor via the target processor port, wherein the reference call information indicates that the runtime resources on the target resource device are allowed to be called by the target server; The request generator is configured to generate the target requirement information upon receiving the target generation instruction; and The processor is configured to, upon receiving the reference call information, control the target server to call the running resources on the target resource device.

8. The system according to claim 7, characterized in that the target resource switch also deploys multiple virtual switches and multiple conversion ports; at least one of the virtual switches allows connection to a corresponding server through a server port; at least one of the virtual switches deploys multiple virtual ports; at least one of the virtual ports allows resource call paths to be established with one or more of the conversion ports; at least one of the conversion ports is used to convert at least one of the connected virtual ports into a physical port, wherein at least one of the physical ports is connected to a corresponding resource device port, and the resource call path is the path for calling running resources; and the processor is configured to, upon receiving the reference call information, locate the target virtual switch connected to the target server port from among the plurality of virtual switches; select the target conversion port connected to the target resource device port from among the plurality of conversion ports; and control the establishment of a target resource call path between the target conversion port and the target virtual port in the target virtual switch.

9. A method for invoking running resources of a server, characterized by comprising: sending a call request to a candidate resource switch, wherein the candidate resource switch is connected to candidate resource devices that meet the target requirement information of the target server, the target server is connected to the target resource switch, the target resource switch is allowed to call the running resources connected to other resource switches through routing, and the target requirement information is used to indicate that the running resources connected to other resource switches need to be called; obtaining information indicating that a reference resource device has passed verification, wherein the candidate resource switch is configured to respond to the call request, verify the candidate resource device based on its usage status, notify the target resource switch that the reference resource device has passed verification, and switch the usage status of the reference resource device to an occupied status; and sending the target call information corresponding to the verified reference resource device to the target server and establishing a target route, wherein the target call information is used to indicate that the running resources on the reference resource device are currently allowed to be called, and the target route is used to indicate the resource call path between the reference resource device and the target server.

10. The method according to claim 9, characterized in that, The step of sending the target invocation information corresponding to the verified reference resource device to the target server includes: Detect the currently received message; and Upon detecting a target verification message sent by the candidate resource switch, the target call information corresponding to the reference resource device is sent to the target server. The candidate resource switch is configured to send the target verification message to the target resource switch after the candidate resource device has been verified. The target verification message is used to indicate that the candidate resource device is the reference resource device that has been verified.

11. The method according to claim 9, characterized in that before sending the call request to the candidate resource switch, the method further comprises: obtaining a device mapping table, wherein the device mapping table is pre-constructed based on the topological connection relationship between multiple resource switches in a running resource call system, the multiple resource switches include the target resource switch, and the device mapping table records one or more alternative resource devices that are allowed to provide running resources to the target server; selecting, from the device mapping table, the candidate resource device that meets the target requirement information; and encoding a target server address, a target switch address, and a candidate device address to obtain the call request, wherein the target server address is the address of the target server, the target switch address is the address of the target resource switch, and the candidate device address is the address of the candidate resource device.
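As a hedged illustration of how a device mapping table might be pre-constructed from the switch topology, the sketch below walks the switch graph breadth-first from the target resource switch. The function name, the `links` and `attached` inputs, the recorded hop count, and the choice to leave out devices attached to the target resource switch itself are assumptions for this example and are not taken from the claims.

```python
from collections import deque


def build_device_mapping_table(links, attached, target_switch):
    """Map each remote device to (owning switch, hop count from target_switch).

    links:    switch name -> list of directly connected switch names
    attached: switch name -> list of resource device names on that switch
    """
    table = {}
    seen = {target_switch}
    queue = deque([(target_switch, 0)])
    while queue:
        switch, hops = queue.popleft()
        # Assumption: only devices on *other* switches are recorded.
        if switch != target_switch:
            for device in attached.get(switch, []):
                table[device] = (switch, hops)
        for neighbor in links.get(switch, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return table
```

Recording the hop count is a design choice of the sketch: it would let a later selection step prefer nearer devices, which the claims neither require nor exclude.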

12. The method according to claim 11, characterized in that selecting, from the one or more alternative resource devices recorded in the device mapping table, the candidate resource device that meets the target requirement information comprises: obtaining a second correspondence between resource devices and resource supplies in the running resource call system, wherein the second correspondence is generated based on multiple first correspondences, at least one of the multiple first correspondences is obtained by one of the resource switches collecting the resource supplies of the resource devices connected to its resource device ports, each first correspondence records the correspondence between resource devices and resource supplies on the corresponding resource switch, and the resource supply is the amount of running resources that the corresponding resource device is currently allowed to provide; matching, from the second correspondence, the alternative resource supply corresponding to each alternative resource device to obtain a third correspondence between the alternative resource devices and the alternative resource supplies; and selecting, from the third correspondence, as the candidate resource device, an alternative resource device whose alternative resource supply is greater than or equal to a target resource requirement of the target server, wherein the target resource requirement is used to indicate the amount of running resources currently required by the target server.
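The selection in claim 12 can be sketched as three dictionary operations, under the assumption that each correspondence is a plain mapping from device name to resource supply. The function and parameter names are hypothetical.

```python
def select_candidates(first_correspondences, table_devices, demand):
    """Return the devices from the mapping table whose supply covers demand.

    first_correspondences: list of per-switch dicts, device -> supply
    table_devices:         devices recorded in the device mapping table
    demand:                target resource requirement of the target server
    """
    # Second correspondence: merge the per-switch (first) correspondences.
    second = {}
    for per_switch in first_correspondences:
        second.update(per_switch)

    # Third correspondence: restrict to devices listed in the mapping table.
    third = {dev: second[dev] for dev in table_devices if dev in second}

    # Keep devices whose supply is greater than or equal to the demand.
    return [dev for dev, supply in third.items() if supply >= demand]
```

For example, with supplies of 64 and 16 units on two table devices, a demand of 32 selects only the first, while a demand of 16 selects both.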

13. The method according to claim 11, characterized in that before encoding the target server address, the target switch address, and the candidate device address to obtain the call request, the method further comprises: receiving an initial server address in a reference address format sent by the target server; converting the initial server address from the reference address format into a target address format to obtain the target server address, wherein the target address format is an address format that the resource switches in the running resource call system are able to recognize; and adding the target server address, the target switch address, and the candidate device address to an initial request message to obtain the call request.
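The address handling above can be illustrated with a minimal sketch. The claims do not name either address format, so the example simply assumes a dotted IPv4 string as the reference address format and an unsigned integer as the target address format; the function names and message field names are likewise hypothetical.

```python
import ipaddress


def to_target_format(initial_server_address: str) -> int:
    """Convert the assumed reference format (dotted IPv4 string) to an integer."""
    return int(ipaddress.IPv4Address(initial_server_address))


def build_call_request(initial_request: dict, server_addr: int,
                       switch_addr: int, device_addr: int) -> dict:
    """Add the three addresses to a copy of the initial request message."""
    request = dict(initial_request)  # leave the initial message untouched
    request.update(
        target_server=server_addr,
        target_switch=switch_addr,
        candidate_device=device_addr,
    )
    return request
```

A call such as `build_call_request({"type": "call"}, to_target_format("10.0.0.7"), 1, 2)` would produce a request carrying all three addresses alongside the original message fields.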

14. The method according to claim 9, characterized in that before sending the call request to the candidate resource switch, the method further comprises: receiving reference requirement information sent by the target server, wherein the reference requirement information is used to indicate that the target server needs to call running resources; in response to the reference requirement information, detecting whether the running resources that a target resource device connected to the target resource switch is allowed to provide meet the reference requirement information; and generating the call request upon detecting that the running resources that the target resource device is allowed to provide do not meet the reference requirement information.

15. The method according to claim 14, characterized in that receiving the reference requirement information sent by the target server comprises: receiving a reference resource requirement sent by the target server, wherein the target server includes a service running device and a resource providing device, the reference resource requirement is used to indicate the amount of running resources that the service running device needs in order to run its current service, the resource providing device is configured to provide running resources to the service running device, and the target server is configured to send the reference resource requirement to the target resource switch when the amount of running resources that the resource providing device is currently allowed to provide does not meet the amount that the service running device needs in order to run its current service; and adding the reference resource requirement to initial requirement information to obtain the reference requirement information.

16. The method according to claim 15, characterized in that detecting whether the running resources that the target resource device connected to the target resource switch is allowed to provide meet the reference requirement information comprises: detecting a target resource supply of the target resource device; extracting the reference resource requirement from the reference requirement information; if the target resource supply is greater than or equal to the reference resource requirement, determining that the running resources that the target resource device is allowed to provide meet the reference requirement information; and if the target resource supply is less than the reference resource requirement, determining that the running resources that the target resource device is allowed to provide do not meet the reference requirement information.
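The local check of claims 14 to 16 reduces to a single comparison. The sketch below assumes the reference requirement information is a dictionary carrying the requirement under a hypothetical `reference_resource_requirement` key, and that supply and requirement are expressed in the same unit.

```python
def local_supply_meets(target_supply: int, reference_requirement_info: dict) -> bool:
    """True when the locally attached device can cover the requirement."""
    # Extract the reference resource requirement from the requirement information.
    requirement = reference_requirement_info["reference_resource_requirement"]
    # Supply greater than or equal to the requirement counts as sufficient.
    return target_supply >= requirement


def needs_call_request(target_supply: int, reference_requirement_info: dict) -> bool:
    """A call request to another switch is generated only when local supply falls short."""
    return not local_supply_meets(target_supply, reference_requirement_info)
```

With a local supply of 32 units, a requirement of 32 is met locally, whereas a requirement of 48 would trigger a call request.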

17. An apparatus for invoking running resources of a server, characterized in that it comprises: a first sending module, configured to send a call request to a candidate resource switch, wherein the candidate resource switch is connected to a candidate resource device that meets target requirement information of a target server, the target server is connected to a target resource switch, the target resource switch is allowed to call, through routing, running resources connected to other resource switches, and the target requirement information is used to indicate that running resources connected to other resource switches need to be called; a first acquisition module, configured to obtain information indicating that a reference resource device has passed verification, wherein the candidate resource switch is configured to, in response to the call request, verify the candidate resource device based on its usage status, notify the target resource switch of the reference resource device that has passed verification, and switch the usage status of the reference resource device to an occupied status; and a second sending module, configured to send target call information corresponding to the verified reference resource device to the target server and establish a target route, wherein the target call information is used to indicate that the running resources on the reference resource device are currently allowed to be called, and the target route is used to indicate the resource call path between the reference resource device and the target server.

18. A computer program product comprising computer-readable instructions, characterized in that when the computer-readable instructions are executed by a processor, the steps of the method according to any one of claims 9 to 16 are implemented.

19. A non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer-readable instructions, and when the computer-readable instructions are executed by a processor, the steps of the method according to any one of claims 9 to 16 are implemented.

20. An electronic device comprising a target memory, a processor, and computer-readable instructions stored in the target memory and executable on the processor, characterized in that when the processor executes the computer-readable instructions, the steps of the method according to any one of claims 9 to 16 are implemented.