Graphics processor access system, access method, electronic device

By employing a non-PCIe protocol communication method between the host, communication module, and graphics processor, the problem of access difficulties caused by PCIe link anomalies is solved, enabling flexible access to the GPU configuration space and anomaly analysis, thereby improving system stability and efficiency.

CN119576826B · Active · Publication Date: 2026-06-30 · MOORE THREADS TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
MOORE THREADS TECH CO LTD
Filing Date
2024-11-29
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

In existing technologies, accessing the GPU's configuration space via PCIe links offers limited flexibility, and data enumeration and transmission are difficult when PCIe links malfunction, leading to access difficulties.

Method used

By employing communication protocols other than PCIe, such as Universal Serial Bus protocol and Joint Test Action Group protocol, access to the GPU configuration space is achieved through communication between the host, communication module and graphics processor, forming an access path that does not require a PCIe link.

Benefits of technology

It improves the flexibility of accessing the GPU configuration space, so that when the PCIe link is abnormal the configuration space can still be accessed and the cause of the anomaly analyzed, thereby improving the efficiency of anomaly handling and system stability.

✦ Generated by Eureka AI based on patent content.


Abstract

This disclosure relates to the field of communications, proposing a graphics processor (GPU) access system, access method, and electronic device. The system includes a host and a communication module. The GPU to be accessed includes a first central processing unit (CPU). Communication between the host and the communication module, and between the communication module and the first CPU, employs protocols other than PCIe. The host, in response to receiving a first command, generates an access request and transmits the access request to the communication module. The communication module, in response to receiving the access request, transmits the access request to the first CPU. The first CPU, in response to the access request, accesses the GPU's configuration space and transmits the access result to the host via the communication module. This system can access the GPU's configuration space without using a PCIe link, improving the flexibility of accessing the GPU's configuration space.

Description

Technical Field

[0001] This disclosure relates to the field of communications, and more particularly to a graphics processor access system, access method, and electronic device.

Background Technology

[0002] Peripheral Component Interconnect Express (PCIe) is a high-speed serial computer expansion bus used to connect major hardware devices on the motherboard, such as graphics processing units (GPUs), memory, and network cards. These hardware devices connected via PCIe are also called PCIe devices.

[0003] Existing technologies use a root complex on the central processing unit (CPU) to manage PCIe devices and coordinate data transfer between them. To access the configuration space of a PCIe device, the root complex first enumerates the PCIe devices, establishes a PCIe topology, and then, after successful enumeration, sends a configuration request to the PCIe device via a PCIe link to access the configuration space. However, using only PCIe links to access the GPU's configuration space has low fault tolerance; if the PCIe link fails, accessing the GPU's configuration space becomes difficult. Improving the flexibility of accessing the GPU's configuration space and reducing dependence on PCIe links has become a pressing technical problem in this field.

Summary of the Invention

[0004] In view of this, this disclosure proposes a graphics processor access system, access method, and electronic device. This system can access the GPU's configuration space without using a PCIe link, improving the flexibility of accessing the GPU's configuration space.

[0005] According to one aspect of this disclosure, a graphics processor access system is provided. The system includes a host and a communication module. The graphics processor to be accessed includes a first central processing unit (CPU). The host and the communication module, and the communication module and the first CPU, communicate using a protocol other than PCIe. The host is configured to, in response to receiving a first command, generate an access request and transmit the access request to the communication module. The communication module is configured to, in response to receiving the access request, transmit the access request to the first CPU. The first CPU is configured to, in response to the access request, access the configuration space of the graphics processor and transmit the access result of the configuration space to the host through the communication module.

[0006] In one possible implementation, the graphics processor is also connected to a second central processing unit via a PCIe link, and the first command is generated when the PCIe link malfunctions and transmitted by the second central processing unit to the host.

[0007] In one possible implementation, the first command is input by the user into the host.

[0008] In one possible implementation, the host and the communication module communicate using a Universal Serial Bus protocol, and the communication module and the first central processing unit communicate using a Serial Peripheral Interface protocol.

[0009] In one possible implementation, the host and the communication module communicate using the Ethernet protocol, and the communication module and the first central processing unit communicate using the Joint Test Action Group Protocol.

[0010] In one possible implementation, the access result is used to determine the cause of the anomaly in the PCIe link.

[0011] In one possible implementation, the access result is in machine language, and the host is further configured to parse the access result and display the parsing result, which is a string, and the parsing result is used to determine the cause of the PCIe link anomaly.

[0012] In one possible implementation, the host is either a physical host or a virtual host.

[0013] In one possible implementation, the communication module is a hardware module.

[0014] In one possible implementation, the host is equipped with a Linux system, the host runs a first application, the access request is generated by the first application, and the access result is parsed by the first application.

[0015] According to another aspect of this disclosure, a graphics processor access method is provided, the method being applied to a graphics processor access system, the system including a host and a communication module, the graphics processor to be accessed including a first central processing unit (CPU), the host and the communication module communicating, and the communication module and the first CPU communicating, using a protocol other than PCIe, the method comprising: the host, in response to receiving a first command, generating an access request and transmitting the access request to the communication module; the communication module, in response to receiving the access request, transmitting the access request to the first CPU; and the first CPU, in response to the access request, accessing the configuration space of the graphics processor, and transmitting the access result of the configuration space to the host through the communication module.

[0016] In one possible implementation, the graphics processor is also connected to a second central processing unit via a PCIe link, and the first command is generated when the PCIe link malfunctions and transmitted by the second central processing unit to the host.

[0017] In one possible implementation, the first command is input by the user into the host.

[0018] In one possible implementation, the host and the communication module communicate using a Universal Serial Bus protocol, and the communication module and the first central processing unit communicate using a Serial Peripheral Interface protocol.

[0019] In one possible implementation, the host and the communication module communicate using the Ethernet protocol, and the communication module and the first central processing unit communicate using the Joint Test Action Group Protocol.

[0020] In one possible implementation, the access result is used to determine the cause of the anomaly in the PCIe link.

[0021] In one possible implementation, the access result is expressed in machine language, and the method further includes: the host parses the access result and displays the parsing result, wherein the parsing result is a string, and the parsing result is used to determine the cause of the PCIe link anomaly.

[0022] In one possible implementation, the host is either a physical host or a virtual host.

[0023] In one possible implementation, the communication module is a hardware module.

[0024] In one possible implementation, the host is equipped with a Linux system, the host runs a first application, the access request is generated by the first application, and the access result is parsed by the first application.

[0025] According to another aspect of this disclosure, an electronic device is provided, including the graphics processor access system described above.

[0026] A graphics processor access system according to an embodiment of this disclosure includes a host, a communication module, and a first central processing unit (CPU) of the graphics processor to be accessed. In response to receiving a first command, the host generates an access request and transmits the access request to the communication module. In response to receiving the access request, the communication module transmits the access request to the first CPU. In response to the access request, the first CPU accesses the configuration space of the graphics processor and transmits the access result of the configuration space to the host via the communication module. Since the host, communication module, and graphics processor form a pathway for accessing the configuration space of the graphics processor, and the host and communication module, and the communication module and first CPU communicate using protocols other than PCIe, this system can access the GPU's configuration space without using a PCIe link, thus improving the flexibility of accessing the GPU's configuration space.

[0027] Other features and aspects of this disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.

Attached Figure Description

[0028] The accompanying drawings, which are included in and form part of this specification, illustrate exemplary embodiments, features, and aspects of this disclosure together with the specification and serve to explain the principles of this disclosure.

[0029] Figure 1 illustrates how existing technologies access the configuration space of a GPU.

[0030] Figure 2 illustrates an exemplary application scenario of a graphics processor access system according to an embodiment of the present disclosure.

[0031] Figure 3 is a schematic diagram showing the structure of a graphics processor access system according to an embodiment of the present disclosure.

[0032] Figure 4a is a schematic diagram showing a graphics processor access system connected to multiple graphics processors according to an embodiment of the present disclosure.

[0033] Figure 4b is a schematic diagram showing a graphics processor access system connected to multiple graphics processors according to an embodiment of the present disclosure.

[0034] Figure 5 is a schematic diagram illustrating the flow of a graphics processor access method according to an embodiment of the present disclosure.

Detailed Implementation

[0035] Various exemplary embodiments, features, and aspects of this disclosure will now be described in detail with reference to the accompanying drawings. The same reference numerals in the drawings denote elements that have the same or similar functions. Although various aspects of the embodiments are shown in the drawings, they are not necessarily drawn to scale unless specifically indicated otherwise.

[0036] The term “exemplary” as used herein means “serving as an example, embodiment, or illustration.” Any embodiment illustrated herein as “exemplary” is not necessarily to be construed as superior to or better than other embodiments.

[0037] Furthermore, to better illustrate this disclosure, numerous specific details are set forth in the following detailed description. Those skilled in the art will understand that this disclosure can be practiced without certain specific details. In some instances, methods, means, components, and circuits well known to those skilled in the art have not been described in detail in order to highlight the main points of this disclosure.

[0038] Figure 1 illustrates how existing technologies access the configuration space of a GPU.

[0039] As shown in Figure 1, the root complex is located on the CPU (not shown). GPU1-GPU3 and the PCIe switch are all PCIe devices. GPU1 is directly connected to root port P2 of the root complex through upstream port P4. GPU2 is connected to downstream port P5 of the PCIe switch through upstream port P7. GPU3 is connected to downstream port P6 of the PCIe switch through upstream port P8. Upstream port P3 of the PCIe switch is connected to root port P1 of the root complex.

[0040] The root complex enumerates the PCIe devices and determines the topology connections between them based on the enumeration results. In the example of Figure 1, if enumeration proceeds normally, the enumeration result indicates that the PCIe devices include GPU1-GPU3 and the PCIe switch. The topology connection of each PCIe device is as follows:

[0041] The PCIe link from GPU1 to the root complex is: GPU1 (P4) - root complex (P2);

[0042] The PCIe link from GPU2 to the root complex is: GPU2 (P7) - PCIe switch (P5) - PCIe switch (P3) - root complex (P1);

[0043] The PCIe link from GPU3 to the root complex is: GPU3 (P8) - PCIe switch (P6) - PCIe switch (P3) - root complex (P1).
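The three links listed above can be reproduced from the port connections of Figure 1. The following is a minimal sketch; the table names and the `path_to_root` helper are illustrative and do not come from the disclosure.

```python
# Port-to-port connections from Figure 1: upstream port -> port it plugs into.
LINKS = {
    "P4": "P2",  # GPU1 upstream port -> root port P2
    "P7": "P5",  # GPU2 upstream port -> switch downstream port P5
    "P8": "P6",  # GPU3 upstream port -> switch downstream port P6
    "P3": "P1",  # switch upstream port -> root port P1
}

# Which device owns each port.
PORT_OWNER = {
    "P1": "root complex", "P2": "root complex",
    "P3": "PCIe switch", "P5": "PCIe switch", "P6": "PCIe switch",
    "P4": "GPU1", "P7": "GPU2", "P8": "GPU3",
}

# The single upstream port of every device below the root complex.
UPSTREAM_PORT = {"GPU1": "P4", "GPU2": "P7", "GPU3": "P8", "PCIe switch": "P3"}


def path_to_root(device):
    """Return the (device, port) hops from `device` up to the root complex."""
    hops = []
    current = device
    while current != "root complex":
        up = UPSTREAM_PORT[current]
        hops.append((current, up))       # leave the device through its upstream port
        peer = LINKS[up]
        current = PORT_OWNER[peer]
        hops.append((current, peer))     # arrive at the port on the next device
    return hops


for gpu in ("GPU1", "GPU2", "GPU3"):
    print(gpu, "->", " - ".join(f"{dev} ({port})" for dev, port in path_to_root(gpu)))
```

Every hop in such a path is a PCIe link, so a fault anywhere along it cuts off the GPU at the far end.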

[0044] The connectivity status of a PCIe link falls into three types: connected and stable, connected but unstable, and disconnected. When a PCIe link is connected and stable, the root complex can enumerate normally and the link can transmit information normally. When a PCIe link is connected but unstable, the root complex has only a small probability of enumerating normally, and the link has only a small probability of transmitting information normally. When a PCIe link is disconnected, the root complex cannot enumerate normally and the link cannot transmit information.

[0045] The GPU's configuration space can store GPU-related information. For example, the GPU2 configuration space can store the GPU2 manufacturer's identifier, the GPU2 device identifier, and so on. The GPU's configuration space can be accessed and data read / write operations can be performed. Taking GPU2 as an example, when access to GPU2's configuration space is needed, the user can use a configuration tool (lspci or setpci) running on the CPU to generate a configuration request. This configuration request can be encapsulated as a Transaction Layer Packet (TLP) that can be transmitted via PCIe. This configuration request can indicate that the GPU to be accessed is GPU2 and indicate whether the access type is a read or write access.
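As an illustration of the manufacturer and device identifiers mentioned above: in the standard PCI configuration header, the 16-bit vendor ID sits at offset 0x00 and the 16-bit device ID at offset 0x02, both little-endian. The sketch below decodes those two fields from raw bytes into a readable string. The function name and the sample bytes are made up for illustration and do not correspond to any real device.

```python
import struct


def parse_config_header(raw: bytes) -> str:
    """Decode vendor ID (offset 0x00) and device ID (offset 0x02) from raw header bytes."""
    if len(raw) < 4:
        raise ValueError("need at least 4 bytes of configuration space")
    # "<HH" = two little-endian unsigned 16-bit integers.
    vendor_id, device_id = struct.unpack_from("<HH", raw, 0)
    return f"vendor=0x{vendor_id:04x} device=0x{device_id:04x}"


# Placeholder bytes, not a real vendor/device pair.
sample = bytes([0x34, 0x12, 0x78, 0x56])
print(parse_config_header(sample))
```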

[0046] If the PCIe link from GPU2 to the root complex is connected and stable, the configuration request is routed to GPU2 via the PCIe link and received by the PCIe controller (not shown) on GPU2. The PCIe controller on GPU2 generates a response message after completing the data read/write operation in response to the configuration request. This response message is also encapsulated as a TLP and indicates that the configuration request has been completed. If the configuration request indicates a read access, the response message also includes the read data. GPU2 can return the response message to the CPU via the PCIe link. After the response message is received, the user can continue to use the configuration tools to parse it, making the parsed content recognizable and displayable for the user to view.

[0047] The drawback of existing GPU configuration space access methods is that access to the GPU can only be achieved via a PCIe link. If the PCIe link is disconnected before enumeration, the CPU cannot enumerate properly, and the root complex cannot access the corresponding PCIe device; if the PCIe link is disconnected after enumeration, the issued configuration request cannot reach the GPU. If the PCIe link is unstable, configuration requests and response messages transmitted via the PCIe link may be lost or corrupted. Corrupted configuration requests cannot be responded to, and corrupted response messages cannot be parsed, so after issuing a configuration request the CPU may receive no response message or be unable to parse the one it receives. In other words, access to the GPU's configuration space has poor flexibility and relies too heavily on the PCIe link.

[0048] Furthermore, in some application scenarios, when the PCIe link between the GPU and the root complex experiences an anomaly (disconnection or instability), GPU-related information is also needed to analyze the cause of the anomaly. Because the PCIe link is abnormal, accessing the GPU configuration space through that link to obtain GPU-related information becomes extremely difficult. If GPU-related information from the GPU configuration space is not used for analysis, the accuracy of the analyzed cause of the PCIe link anomaly may be reduced, decreasing the efficiency of resolving PCIe link anomalies.

[0049] In view of this, this disclosure proposes a graphics processor access system, access method, and electronic device. This system can access the GPU's configuration space without using a PCIe link, improving the flexibility of accessing the GPU's configuration space.

[0050] Furthermore, even if the PCIe link malfunctions, the system disclosed herein can still access the GPU's configuration space normally. Analyzing the cause of the PCIe link malfunction based on the access results can ensure the accuracy of the analysis and improve the efficiency of resolving PCIe link malfunctions.

[0051] Figure 2 illustrates an exemplary application scenario of a graphics processor access system according to an embodiment of the present disclosure.

[0052] As shown in Figure 2, the graphics processor access system (hereinafter referred to as the system) can connect to the graphics processor GPU_0 to be accessed. GPU_0 can be connected to the second central processing unit CPU_2 via a PCIe link. The second central processing unit CPU_2 can use the existing technical solution shown in Figure 1 to access the configuration space of GPU_0 via the PCIe link. The graphics processor access system can also access the configuration space of GPU_0 without using a PCIe link.

[0053] GPU_0 and CPU_2 can be located on the same electronic device. The graphics processor access system can be located on one or more other electronic devices, or it can be located on the same electronic device as GPU_0 and CPU_2, or it can be partially located on the same electronic device as GPU_0 and CPU_2. This disclosure does not limit the physical location relationship between the graphics processor access system, GPU_0, and CPU_2.

[0054] Figure 3 is a schematic diagram showing the structure of a graphics processor access system according to an embodiment of the present disclosure.

[0055] As shown in Figure 3, in one possible implementation, the system includes a host PC and a communication module HW1, and the graphics processor GPU_0 to be accessed includes a first central processing unit CPU_1. Communication between the host PC and the communication module HW1, and between the communication module HW1 and the first central processing unit CPU_1, uses protocols other than PCIe.

[0056] The host is used to generate an access request and transmit the access request to the communication module in response to receiving the first command;

[0057] The communication module is used to transmit the access request to the first central processing unit in response to receiving the access request;

[0058] The first central processing unit is used to access the configuration space of the graphics processor in response to an access request, and to transmit the access result of the configuration space to the host through the communication module.

[0059] For example, the system may include a user-controllable host. User control over the host includes controlling the host to generate access requests for the graphics processor to be accessed, controlling the host to parse access results, and so on. In practical applications, the host may also automatically generate access requests for the graphics processor to be accessed and automatically parse access results when certain conditions are met. This disclosure does not limit the triggering method for the host to generate access requests for the graphics processor to be accessed and to parse access results.

[0060] The graphics processor to be accessed may include a first central processing unit (CPU), meaning the first CPU may be a local CPU of the graphics processor to be accessed. The first CPU can access the graphics processor's configuration space via the graphics processor's internal bus.

[0061] The system may also include a communication module, which can communicate with the host and the first central processing unit using protocols other than PCIe, thereby enabling communication between the host and the first central processing unit. Various protocols can be selected for communication between the communication module and the host, and examples of the protocols used are given later.

[0062] The host can be used to generate an access request and transmit it to the communication module in response to receiving a first command. Different graphics processors (GPUs) may correspond to different first commands. After receiving the first command, the host can determine that the GPU to be accessed is the one corresponding to the first command. The access request can indicate the GPU to be accessed. For example, when the host receives a first command corresponding to GPU_0, it can determine that it needs to access GPU_0. In this case, the access request generated by the host also indicates access to GPU_0.

[0063] The access request also indicates the type of access. Access types include read access and write access. When the access request indicates a read access, it can also indicate the type of data to be read. When the access request indicates a write access, it can also include information about the data to be written to the graphics processor.
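The fields an access request carries (target GPU, access type, what to read or what to write) can be sketched as a small record with a byte encoding. The disclosure does not define a wire format; the field widths, the `offset` field, and all names below are assumptions chosen only for illustration.

```python
import struct
from dataclasses import dataclass

READ, WRITE = 0, 1  # hypothetical access-type codes


@dataclass
class AccessRequest:
    gpu_id: int            # which graphics processor to access
    access_type: int       # READ or WRITE
    offset: int            # assumed: position in the configuration space
    payload: bytes = b""   # data to write; empty for a read

    def encode(self) -> bytes:
        # Header layout (assumed): u8 gpu, u8 type, u16 offset, u16 payload length.
        header = struct.pack("<BBHH", self.gpu_id, self.access_type,
                             self.offset, len(self.payload))
        return header + self.payload

    @classmethod
    def decode(cls, raw: bytes) -> "AccessRequest":
        gpu_id, access_type, offset, length = struct.unpack_from("<BBHH", raw, 0)
        return cls(gpu_id, access_type, offset, raw[6:6 + length])


request = AccessRequest(gpu_id=0, access_type=WRITE, offset=4, payload=b"\x07\x00")
assert AccessRequest.decode(request.encode()) == request
```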

[0064] Different graphics processing units (GPUs) can connect to different communication modules, and each GPU can connect to only one communication module. Different GPUs can also connect to different ports of the same communication module. This disclosure does not impose any restrictions on the connection relationship between GPUs and communication modules.

[0065] Figure 4a and Figure 4b are schematic diagrams showing a graphics processor access system connected to multiple graphics processors according to an embodiment of the present disclosure.

[0066] As shown in Figure 4a, GPU_0 can be connected to communication module HW1, and GPU_3 can be connected to communication module HW3.

[0067] As shown in Figure 4b, GPU_0 and GPU_3 can be connected to ports P_1 and P_3 of communication module HW1, respectively.

[0068] The host can pre-store the connection relationships between communication modules and graphics processors (GPUs). In this case, after the host determines which GPU needs to be accessed, it can find the communication module connected to the GPU based on the connection relationship and send the access request to the found communication module. For example, assuming GPU_0 is connected to communication module HW1, when the host determines that it needs to access GPU_0, it can find the communication module HW1 connected to GPU_0 based on the connection relationship and send the resulting access request to communication module HW1.
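The pre-stored connection relationship can be as simple as a lookup table on the host. A minimal sketch using the Figure 4b arrangement follows; the table layout and the function name are hypothetical.

```python
# Hypothetical pre-stored table: GPU -> (communication module, port on that module).
CONNECTIONS = {
    "GPU_0": ("HW1", "P_1"),
    "GPU_3": ("HW1", "P_3"),
}


def find_module(gpu: str):
    """Return the (module, port) pair recorded for `gpu`."""
    try:
        return CONNECTIONS[gpu]
    except KeyError:
        raise LookupError(f"no communication module recorded for {gpu}") from None


module, port = find_module("GPU_0")
print(f"send access request for GPU_0 to {module} (port {port})")
```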

[0069] The communication module can be used to transmit an access request to a first central processing unit (CPU) in response to receiving such an access request. If the communication module is connected to only one graphics processor (GPU), it can directly forward the access request to the CPU on the connected GPU. If the communication module is connected to multiple GPUs via multiple ports, it can determine the GPU to be accessed based on the indication of the access request and send the access request through the port connected to that GPU. Upon receiving the access request, the port on the GPU connected to the communication module can directly transmit it to the local CPU on the GPU.
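The forwarding rule described above (forward directly when only one GPU is attached, otherwise pick the port from the request's indication) might look like the following sketch. The class name, the `attach` helper, and the dictionary-shaped request are all illustrative assumptions.

```python
class PortedModule:
    """Toy communication module: one delivery callback per attached GPU."""

    def __init__(self):
        self.ports = {}  # GPU name -> callable that hands the request to that GPU's local CPU

    def attach(self, gpu, deliver):
        self.ports[gpu] = deliver

    def forward(self, request):
        if len(self.ports) == 1:
            # Only one GPU attached: forward without inspecting the request.
            (deliver,) = self.ports.values()
            return deliver(request)
        # Several GPUs attached: choose the port named by the request.
        return self.ports[request["gpu"]](request)


hw1 = PortedModule()
hw1.attach("GPU_0", lambda req: ("CPU of GPU_0", req["type"]))
hw1.attach("GPU_3", lambda req: ("CPU of GPU_3", req["type"]))
print(hw1.forward({"gpu": "GPU_3", "type": "read"}))
```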

[0070] The first central processing unit (CPU) is used to access the configuration space of the graphics processor in response to an access request, and transmit the access result of the configuration space to the host via a communication module. The data content of the access result can be the same as that of the response message in the prior art. When the access request indicates a read access, the access result includes the read data and indicates that the read has been completed. When the access request indicates a write access, the access result indicates that the write has been completed.
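The read and write cases described above can be sketched as a handler running on the first CPU. Here the configuration space is modeled as a plain `bytearray`; the request keys (`type`, `offset`, `length`, `data`) and the result shape are assumptions, not a format given in the disclosure.

```python
def handle_access(config_space: bytearray, request: dict) -> dict:
    """Serve one access request against an in-memory model of the configuration space."""
    offset = request["offset"]
    if request["type"] == "read":
        length = request["length"]
        if offset + length > len(config_space):
            raise IndexError("read past end of configuration space")
        # A read result carries the data and signals completion.
        return {"completed": True, "data": bytes(config_space[offset:offset + length])}
    if request["type"] == "write":
        payload = request["data"]
        if offset + len(payload) > len(config_space):
            raise IndexError("write past end of configuration space")
        config_space[offset:offset + len(payload)] = payload
        # A write result only signals completion.
        return {"completed": True}
    raise ValueError(f"unknown access type: {request['type']!r}")


space = bytearray(8)
handle_access(space, {"type": "write", "offset": 2, "data": b"\xaa\xbb"})
print(handle_access(space, {"type": "read", "offset": 2, "length": 2}))
```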

[0071] The first central processing unit (CPU) can transmit the access result to the host via the communication module. Since each graphics processor (GPU) is connected to only one communication module, and each communication module is connected to only one host, there is only one transmission path from the first CPU to the host. In this case, the communication module only needs to forward the access result to the connected host upon receiving it. Once the host receives the access result, it can determine that the access to the GPU has been completed.

[0072] A graphics processor access system according to an embodiment of this disclosure includes a host, a communication module, and a first central processing unit (CPU) of the graphics processor to be accessed. In response to receiving a first command, the host generates an access request and transmits the access request to the communication module. In response to receiving the access request, the communication module transmits the access request to the first CPU. In response to the access request, the first CPU accesses the configuration space of the graphics processor and transmits the access result of the configuration space to the host via the communication module. Since the host, communication module, and graphics processor form a pathway for accessing the configuration space of the graphics processor, and the host and communication module, and the communication module and first CPU communicate using protocols other than PCIe, this system can access the GPU's configuration space without using a PCIe link, thus improving the flexibility of accessing the GPU's configuration space.

[0073] In one possible implementation, the host can be a physical host or a virtual host.

[0074] For example, both physical hosts and virtual hosts can generate access requests; therefore, a host can be either a physical host or a virtual host. When the host is a physical host, it can operate independently of the second central processing unit (CPU) or run on the second CPU. When the host is a virtual host, it can be run by the second CPU or by other independent CPUs. When the host is run by other independent CPUs, the operation of the graphics processor (GPU) access system is unaffected regardless of whether the second CPU crashes, thus improving the stability of the GPU access system. When the host runs on the second CPU, the GPU access system can still function normally as long as the second CPU does not crash, because the link between the host, the communication module, and the GPU is not a PCIe link; this reduces the hardware cost of using the GPU access system to access the GPU. This disclosure does not limit the relationship between the host and the second CPU.

[0075] In one possible implementation, the communication module is a hardware module.

[0076] For example, the communication module can be a hardware module, equipped with hardware ports for communication with both the host and the graphics processing unit (GPU). When the host is a physical host, the port on the communication module that communicates with the host can be directly connected to the host. When the host is a virtual host, the port on the communication module that communicates with the host can be connected to the central processing unit (CPU) running the host. If a communication module connects to only one GPU, it only needs one port for communication with the GPU. If a communication module connects to multiple GPUs, it can reserve multiple ports, with each GPU connected to one port on the communication module. This disclosure does not limit the specific structure of the communication module.

[0077] When the communication module communicates with the graphics processor, it communicates with the first central processing unit (CPU) on the graphics processor. Therefore, the communication module can be configured to send data packets that the first CPU can parse. Any hardware module that meets this function can be used as the communication module in the graphics processor access system of this disclosure. The specific structure of the communication module is not limited in the embodiments of this disclosure.

[0078] The following describes an example of the communication protocol used between the host, communication module, and first central processing unit of this disclosure.

[0079] In one possible implementation, the host and the communication module communicate using the Ethernet protocol, and the communication module and the first central processing unit communicate using the Joint Test Action Group Protocol.

[0080] For example, the host and communication module can communicate using the Ethernet protocol, while the communication module and the first central processing unit (CPU) can communicate using the Joint Test Action Group (JTAG) protocol. In this scenario, the host can package the access request into an Ethernet packet and transmit it to the communication module. Upon receiving the Ethernet packet, the communication module first parses out the access request and then packages it into a JTAG packet before transmitting it to the first CPU. Similarly, the first CPU can package the access result into a JTAG packet and transmit it to the communication module. Upon receiving the JTAG packet, the communication module first parses out the access result and then packages the access result into an Ethernet packet before transmitting it to the host.

[0081] In one possible implementation, the host and the communication module communicate using a universal serial bus protocol, and the communication module and the first central processing unit communicate using a serial peripheral interface protocol.

[0082] For example, the host and the communication module can communicate using the Universal Serial Bus (USB) protocol, while the communication module and the first central processing unit (CPU) can communicate using the Serial Peripheral Interface (SPI) protocol. In this case, the host can package the access request into a USB protocol data packet and transmit it to the communication module. After receiving the USB protocol data packet, the communication module first parses out the access request and then packages the access request into an SPI protocol data packet and transmits it to the first CPU. Similarly, the first CPU can package the access result into an SPI protocol data packet and transmit it to the communication module. After receiving the SPI protocol data packet, the communication module first parses out the access result and then packages the access result into a USB protocol data packet and transmits it to the host.
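The two examples above follow the same relay pattern: the communication module removes the host-side framing, recovers the payload, and re-frames it for the other side. The following Python sketch illustrates only that pattern. The framing format (a 2-byte marker plus a 2-byte length) and the names `HOST_MAGIC`, `DEV_MAGIC`, `wrap`, `unwrap`, `relay_to_cpu`, and `relay_to_host` are assumptions made for illustration; they do not represent real Ethernet, USB, JTAG, or SPI framing, and the disclosure does not specify any packet layout.

```python
import struct

# Hypothetical frame markers. They stand in for host-side framing
# (Ethernet or USB) and CPU-side framing (JTAG or SPI) respectively.
HOST_MAGIC = b"HS"
DEV_MAGIC = b"DV"


def wrap(magic: bytes, payload: bytes) -> bytes:
    """Frame a payload as: 2-byte marker, 2-byte big-endian length, payload."""
    return magic + struct.pack(">H", len(payload)) + payload


def unwrap(magic: bytes, frame: bytes) -> bytes:
    """Recover the payload from a frame after checking marker and length."""
    if frame[:2] != magic:
        raise ValueError("unexpected framing")
    (length,) = struct.unpack(">H", frame[2:4])
    payload = frame[4:]
    if len(payload) != length:
        raise ValueError("length mismatch")
    return payload


def relay_to_cpu(host_frame: bytes) -> bytes:
    """Host -> first CPU direction: parse the request, then re-package it."""
    return wrap(DEV_MAGIC, unwrap(HOST_MAGIC, host_frame))


def relay_to_host(dev_frame: bytes) -> bytes:
    """First CPU -> host direction: parse the result, then re-package it."""
    return wrap(HOST_MAGIC, unwrap(DEV_MAGIC, dev_frame))
```

In this sketch the payload passes through unchanged in both directions, which matches the description: the communication module converts framing only and does not interpret the access request or the access result.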

[0083] Those skilled in the art should understand that more communication protocols can be used for communication between the host and the communication module, and between the communication module and the first central processing unit. This disclosure does not limit the specific communication protocols used when the communication module communicates with the host or with the first central processing unit.

[0084] This disclosure uses a first command to trigger the host to generate an access request. The following example, using the first command corresponding to GPU_0, illustrates an exemplary method for generating the first command of this disclosure.

[0085] In one possible implementation, the graphics processor is also connected to a second central processing unit (CPU_2) via a PCIe link. The first command is generated when the PCIe link fails and is transmitted to the host by the second central processing unit.

[0086] For example, GPU_0 is also connected to the second central processing unit (CPU_2) via a PCIe link, allowing CPU_2 to obtain response messages from GPU_0 in accordance with existing techniques. If the root complex on CPU_2 fails to enumerate GPU_0 normally, fails to receive a response message from GPU_0, or receives a response message from GPU_0 but fails to parse it successfully, the PCIe link between CPU_2 and GPU_0 is considered faulty. CPU_2 can then generate a first command that designates GPU_0 as the GPU to be accessed and transmit this first command to the host. In this case, the first command corresponds to GPU_0.
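The three fault conditions described above can be sketched as a simple check on the CPU_2 side. This is a minimal illustration, not the disclosed implementation: the `ProbeResult` type, the function names, and the dictionary form of the first command are all hypothetical, since the disclosure does not define how the probe outcome or the first command is represented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Hypothetical summary of what CPU_2 observed over the PCIe link."""
    enumerated: bool            # did the root complex enumerate the GPU?
    response: Optional[bytes]   # raw response message, or None if none arrived
    parsed_ok: bool             # was the response parsed successfully?


def link_is_faulty(probe: ProbeResult) -> bool:
    """Apply the three conditions: no enumeration, no response, parse failure."""
    if not probe.enumerated:
        return True
    if probe.response is None:
        return True
    return not probe.parsed_ok


def make_first_command(gpu_id: str, probe: ProbeResult) -> Optional[dict]:
    """Return a first command naming the GPU to access, or None if the link is healthy."""
    if link_is_faulty(probe):
        return {"type": "first_command", "target_gpu": gpu_id}
    return None
```

For instance, `make_first_command("GPU_0", ProbeResult(False, None, False))` yields a command targeting GPU_0, while a probe with successful enumeration, a response, and successful parsing yields `None`.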

[0087] In this situation, the host can determine that there is an anomaly in the PCIe link between CPU_2 and GPU_0 upon receiving the first command, and generate an access request.

[0088] In this way, when a PCIe link malfunctions, the graphics processor access system can automatically generate an access request and transmit it to the first central processing unit, thereby enabling access to the graphics processor's configuration space.

[0089] In one possible implementation, the access results are used to determine the cause of the PCIe link anomaly.

[0090] For example, since the first command is generated when an anomaly occurs in the PCIe link between CPU_2 and GPU_0, the access result obtained after the first command triggers the host to generate an access request can be used to determine the cause of the anomaly in the PCIe link between CPU_2 and GPU_0. The following describes an exemplary method for determining the cause of the PCIe link anomaly based on the access result.

[0091] In one possible implementation, the access result is in machine language. The host is also used to parse the access result and display the parsed result as a string. The parsed result is used to determine the cause of the PCIe link anomaly.

[0092] For example, if the access results are in machine language, users cannot directly extract useful information from them. Therefore, the host is also used to parse the access results. The parsed results can be strings, allowing users to directly extract useful information from them. The host can then display the parsed results to the user. Parsing access results can be implemented using existing technologies, which will not be elaborated upon here.
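As a rough illustration of turning a machine-language access result into strings, the sketch below decodes the first four 16-bit registers of the standard PCI configuration header (Vendor ID, Device ID, Command, Status). It assumes that the access result is the raw little-endian byte content of the configuration space starting at offset 0; the disclosure does not specify the result format, and the function name `parse_access_result` and the sample values are hypothetical.

```python
import struct


def parse_access_result(raw: bytes) -> str:
    """Render the first 8 bytes of a PCI configuration header as text."""
    if len(raw) < 8:
        raise ValueError("need at least 8 bytes of configuration data")
    # Offsets 0x00-0x07: Vendor ID, Device ID, Command, Status (little-endian).
    vendor, device, command, status = struct.unpack_from("<HHHH", raw, 0)
    memory = "on" if command & 0x0002 else "off"   # Memory Space Enable bit
    master = "on" if command & 0x0004 else "off"   # Bus Master Enable bit
    return "\n".join([
        f"Vendor ID: 0x{vendor:04x}",
        f"Device ID: 0x{device:04x}",
        f"Command: 0x{command:04x} (memory space {memory}, bus master {master})",
        f"Status: 0x{status:04x}",
    ])


# Made-up sample bytes, not taken from any real device.
print(parse_access_result(bytes.fromhex("3412785606001000")))
```

A user reading such output could, for example, notice that memory space decoding or bus mastering is disabled, which is the kind of information that is hard to extract from the raw bytes directly.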

[0093] Analyzing the parsing results can determine the cause of the PCIe link anomaly. The analysis of the parsing results can be implemented using existing technology and can be performed by the user or the host; this disclosure does not limit the specific method for performing the analysis of the parsing results.

[0094] In one possible implementation, a Linux system is installed on the host, the host runs a first application, access requests are generated by the first application, and access results are parsed by the first application.

[0095] For example, the host can run a first application, which functions similarly to a configuration tool in existing technologies. Access requests can be generated by the first application, and access results can be parsed by the first application.

[0096] This approach ensures that the GPU's configuration space remains accessible even when a PCIe link malfunctions. Determining the cause of the PCIe link malfunction from the access results supports an accurate diagnosis and improves the efficiency of resolving PCIe link issues. Furthermore, because configuration request and response messages do not need to be transmitted over the PCIe link, access efficiency is higher.

[0097] The following describes another exemplary method for generating the first command of this disclosure.

[0098] In one possible implementation, the first command is entered by the user into the host.

[0099] For example, apart from handling PCIe link anomalies, a user may also need to read data stored in the GPU's configuration space or write new data to it. In this case, the user can write the first command and input it into the host. The host can still run the first application, which generates the access request and parses the access result.

[0100] This approach provides more flexibility in triggering the host to generate access requests.

[0101] This disclosure also proposes a method for accessing a graphics processor. Figure 5 shows a schematic flow diagram of a graphics processor access method according to an embodiment of the present disclosure.

[0102] As shown in Figure 5, in one possible implementation, the method is applied to a graphics processor access system. The system includes a host and a communication module, the graphics processor to be accessed includes a first central processing unit (CPU), and the host and the communication module, and the communication module and the first CPU, communicate using protocols other than PCIe. The method includes:

[0103] Step S51: Upon receiving the first command, the host generates an access request and transmits the access request to the communication module;

[0104] Step S52: In response to receiving the access request, the communication module transmits the access request to the first central processing unit;

[0105] Step S53: In response to the access request, the first central processing unit accesses the configuration space of the graphics processor and transmits the access result of the configuration space to the host through the communication module.
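Steps S51 to S53 can be pictured with a small in-memory simulation. This is only a sketch under stated assumptions: the classes `Host`, `CommunicationModule`, and `FirstCPU`, the dictionary layout of the first command and the access request, and the modelling of the configuration space as a byte buffer are all hypothetical. The transport between the components is replaced by direct method calls, so no real non-PCIe protocol is involved.

```python
class FirstCPU:
    """Stand-in for the first CPU on the GPU (hypothetical model)."""

    def __init__(self, config_space: bytes):
        # The configuration space is modelled as a plain byte buffer.
        self.config_space = bytearray(config_space)

    def handle(self, request: dict) -> bytes:
        # Step S53: access the configuration space and return the result.
        offset, length = request["offset"], request["length"]
        if offset < 0 or offset + length > len(self.config_space):
            raise ValueError("access outside configuration space")
        if request["op"] == "write":
            data = request["data"]
            if len(data) != length:
                raise ValueError("data length does not match request length")
            self.config_space[offset:offset + length] = data
        return bytes(self.config_space[offset:offset + length])


class CommunicationModule:
    """Stand-in for the communication module between host and first CPU."""

    def __init__(self, cpu: FirstCPU):
        self.cpu = cpu

    def forward(self, request: dict) -> bytes:
        # Step S52: pass the request on and carry the result back.
        return self.cpu.handle(request)


class Host:
    """Stand-in for the host that receives the first command."""

    def __init__(self, module: CommunicationModule):
        self.module = module

    def on_first_command(self, command: dict) -> bytes:
        # Step S51: turn the first command into an access request.
        request = {
            "op": command.get("op", "read"),
            "offset": command["offset"],
            "length": command["length"],
            "data": command.get("data", b""),
        }
        return self.module.forward(request)


host = Host(CommunicationModule(FirstCPU(bytes(range(16)))))
result = host.on_first_command({"offset": 4, "length": 4})
```

Here `result` holds the four bytes read from offsets 4 to 7 of the simulated configuration space; a command with `"op": "write"` and a matching `"data"` field would update the buffer and return the bytes written.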

[0106] In one possible implementation, the graphics processor is also connected to a second central processing unit via a PCIe link, and the first command is generated when the PCIe link malfunctions and transmitted by the second central processing unit to the host.

[0107] In one possible implementation, the first command is input by the user into the host.

[0108] In one possible implementation, the host and the communication module communicate using a Universal Serial Bus protocol, and the communication module and the first central processing unit communicate using a Serial Peripheral Interface protocol.

[0109] In one possible implementation, the host and the communication module communicate using the Ethernet protocol, and the communication module and the first central processing unit communicate using the Joint Test Action Group Protocol.

[0110] In one possible implementation, the access result is used to determine the cause of the anomaly in the PCIe link.

[0111] In one possible implementation, the access result is expressed in machine language, and the method further includes: the host parses the access result and displays the parsing result, wherein the parsing result is a string, and the parsing result is used to determine the cause of the PCIe link anomaly.

[0112] In one possible implementation, the host is either a physical host or a virtual host.

[0113] In one possible implementation, the communication module is a hardware module.

[0114] In one possible implementation, the host is equipped with a Linux system, the host runs a first application, the access request is generated by the first application, and the access result is parsed by the first application.

[0115] This disclosure also proposes an electronic device including the graphics processor access system described above, wherein the host in the graphics processor access system can be a virtual host.

[0116] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of instructions that contains one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions marked in the blocks may occur in a different order than that shown in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, or they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.

[0117] The various embodiments of this disclosure have been described above. These descriptions are exemplary, not exhaustive, and are not limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies available in the market, or to enable others skilled in the art to understand the embodiments disclosed herein.

Claims

1. A graphics processor access system, characterized in that the system includes a host and a communication module. The graphics processor to be accessed includes a first central processing unit. The host and the communication module, and the communication module and the first central processing unit, communicate using protocols other than PCIe. The host is configured to, in response to receiving a first command, generate an access request and transmit the access request to the communication module. The communication module is configured to transmit the access request to the first central processing unit in response to receiving the access request. The first central processing unit is configured to, in response to the access request, access the configuration space of the graphics processor through the bus inside the graphics processor, and transmit the access result of the configuration space to the host through the communication module. The graphics processor is also connected to a second central processing unit via a PCIe link. The host and the second central processing unit are independent of each other. The first command is generated when the PCIe link fails and is transmitted from the second central processing unit to the host. Either different graphics processors are connected to different communication modules, with each graphics processor connected to only one communication module, the host is specifically configured to, after determining the graphics processor to be accessed, identify the communication module connected to that graphics processor based on the connection relationship between the communication modules and the graphics processors and transmit the access request to the identified communication module, and the communication module is specifically configured to transmit the access request to the first central processing unit on the connected graphics processor; or the communication module connects to multiple graphics processors through multiple ports, the access request indicates the graphics processor to be accessed, and the communication module is specifically configured to determine the graphics processor to be accessed based on the access request and send the access request through the port connected to the graphics processor to be accessed.

2. The system according to claim 1, characterized in that the first command is input by the user into the host.

3. The system according to claim 1, characterized in that the host and the communication module communicate using a Universal Serial Bus protocol, and the communication module and the first central processing unit communicate using a Serial Peripheral Interface protocol.

4. The system according to claim 1, characterized in that the host and the communication module communicate using the Ethernet protocol, and the communication module and the first central processing unit communicate using the Joint Test Action Group protocol.

5. The system according to claim 1, characterized in that the access result is used to determine the cause of the anomaly in the PCIe link.

6. The system according to claim 1, characterized in that the access result is in machine language. The host is also configured to parse the access result and display the parsing result. The parsing result is a string and is used to determine the cause of the anomaly in the PCIe link.

7. The system according to claim 1, characterized in that the host is a physical host or a virtual host.

8. The system according to claim 1, characterized in that the communication module is a hardware module.

9. The system according to claim 6, characterized in that the host is equipped with a Linux system and runs a first application. The access request is generated by the first application, and the access result is parsed by the first application.

10. A method for accessing a graphics processor, characterized in that the method is applied to a graphics processor access system, the system including a host and a communication module, the graphics processor to be accessed including a first central processing unit, and the host and the communication module, and the communication module and the first central processing unit, communicating using protocols other than PCIe. The method includes: upon receiving the first command, the host generates an access request and transmits the access request to the communication module; in response to receiving the access request, the communication module transmits the access request to the first central processing unit; in response to the access request, the first central processing unit accesses the configuration space of the graphics processor through the bus inside the graphics processor, and transmits the access result of the configuration space to the host through the communication module. The graphics processor is also connected to a second central processing unit via a PCIe link. The host and the second central processing unit are independent of each other. The first command is generated when the PCIe link fails and is transmitted from the second central processing unit to the host. Either different graphics processors are connected to different communication modules, with each graphics processor connected to only one communication module, the host, after determining the graphics processor to be accessed, identifies the communication module connected to that graphics processor based on the connection relationship between the communication modules and the graphics processors and transmits the access request to the identified communication module, and the communication module transmits the access request to the first central processing unit on the connected graphics processor; or the communication module connects to multiple graphics processors through multiple ports, the access request indicates the graphics processor to be accessed, and the communication module determines the graphics processor to be accessed based on the access request and sends the access request through the port connected to the graphics processor to be accessed.

11. An electronic device, characterized in that the electronic device includes the graphics processor access system according to any one of claims 1-9.