A PCIE device state detection method and system, an electronic device and a medium

By scanning and disabling/enabling the connection status of PCIe devices, combined with link training and register reading, the problem of insufficient PCIe device detection is solved, and efficient fault location and diagnosis are achieved.

CN117093427B · Active · Publication Date: 2026-06-23 · INSPUR SUZHOU INTELLIGENT TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD
Filing Date: 2023-07-25
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

In existing technologies, the number of tests and the stress level during PCIe device testing are insufficient, resulting in inadequate test results. Furthermore, the limited number of link training iterations makes it difficult to effectively locate the cause of the fault.

Method used

By scanning the PCIe device tree of the server motherboard, obtaining device identifiers, disabling and enabling the connection status between PCIe devices and the central processing unit, and performing PCIe link training after each connection status change, the system traverses and reads the relevant register results to analyze the device status, thereby achieving stress testing of the link.

Benefits of technology

It improves the efficiency of PCIe device detection and diagnosis, shortens the detection time, can obtain more comprehensive register information to locate the cause of the fault, and saves the time cost of system hot reset.



Abstract

The application discloses a PCIe device state detection method and system, an electronic device, and a medium, and relates to the technical field of PCIe devices. The method comprises the following steps: scanning a PCIe device tree on a server mainboard and obtaining the PCIe device identifiers on the PCIe device tree; disabling and enabling the connection state between the PCIe device and a central processor according to the device identifier; after each disabling and enabling of the connection state of the PCIe device, performing PCIe link training once within a preset time; and after each link training is completed, traversing and reading the register results associated with the PCIe device to analyze the PCIe device state. The application can stress-test the PCIe device and effectively improves the efficiency of PCIe device detection and diagnosis; if the PCIe device fails, the register information can be queried to analyze and locate the cause of the failure.

Description

Technical Field

[0001] This application relates to the technical field of PCIe devices, and specifically to a PCIe device status detection method, system, electronic device, and medium.

Background Technology

[0002] PCIe (Peripheral Component Interconnect Express) buses are widely used in servers and storage controllers. For example, the controller chips for SAS and SATA hard drives connect to the CPU (Central Processing Unit) via the PCIe bus. Currently, the following issues exist in the testing process for PCIe devices:

[0003] (1) With the lspci query method, the number of test iterations and the stress applied are insufficient. Although many problems occur during such testing, such as endpoint loss, speed reduction, and lane reduction, their causes cannot be confirmed.

[0004] (2) With the lspci query method, link training is completed only once per operating system restart. Because a hot restart of the operating system requires many initialization operations, each hot restart takes a long time, so the overall detection time is long. Yet the number of link training iterations remains very small, which puts no stress on the link and yields insufficient detection results.

[0005] (3) The existing query detection method can only query a limited number of status registers, and some important register status information cannot be displayed. Therefore, it cannot provide more effective information for locating the cause of the fault.

Summary of the Invention

[0006] To address at least one of the problems mentioned in the background art, this application provides a PCIe device status detection method, system, electronic device, and medium, which can perform stress testing on PCIe devices, effectively improving the detection and diagnosis efficiency of PCIe devices. When a fault occurs, the cause of the fault can be analyzed and located by querying register information.

[0007] The specific technical solutions provided in this application are as follows:

[0008] Firstly, a method for detecting the status of a PCIe device is provided, the method comprising:

[0009] Scan the PCIe device tree on the server motherboard to obtain the PCIe device identifiers on the PCIe device tree;

[0010] Disable and enable the connection status between the PCIe device and the central processing unit according to the device identifier;

[0011] Each time the connection status of a PCIe device is disabled and enabled, a PCIe link training is performed once within a preset time.

[0012] After each link training is completed, the register results associated with the PCIe device are read and analyzed to determine the PCIe device status.

[0013] In one specific embodiment, the method further includes:

[0014] After traversing and reading the register results associated with the PCIe device, determine whether the link training count has reached the preset number;

[0015] If the link training count is less than the preset number, continue to disable and enable the connection between the PCIe device and the central processing unit according to the device identifier, and after each disabling and enabling of the PCIe device connection status, perform PCIe link training once within a preset time;

[0016] The process continues until the preset number of link training iterations is completed.

[0017] In one specific embodiment, disabling and enabling the connection status between the PCIe device and the central processing unit based on the device identifier specifically includes:

[0018] The device identifier includes a downstream device identifier and an upstream device identifier;

[0019] The connection status between the downstream PCIe device and the central processing unit is disabled and enabled based on the downstream device identifier;

[0020] Alternatively, the connection status between the upstream PCIe device and the central processing unit is disabled and enabled based on the upstream device identifier.

[0021] A buffer time is set between the disable and enable operations.

[0022] In one specific embodiment, after each link training is completed, the method includes:

[0023] The success of the link training is determined by reading the state value of the register through the link training state machine.

[0024] If the state value of the register is equal to the preset state value, the link training is successful and testing continues;

[0025] Alternatively, if the state value of the register is not equal to the preset state value, the link training fails and the test stops.

[0026] In one specific embodiment, the method further includes:

[0027] If the link training is successful, then the register results within the first preset range are read.

[0028] Alternatively, if the link training fails, the register results within a second preset range are read.

[0029] In one specific embodiment, the registers in the first preset range and the registers in the second preset range each include downstream device-related registers and upstream device-related registers, respectively.

[0030] The downstream device-related registers and the upstream device-related registers may be of the same or different types.

[0031] In one specific embodiment, before iterating through and reading the register results associated with the PCIe device for analyzing the PCIe device status, the method further includes:

[0032] Obtain the address information of the register to be traversed;

[0033] This allows for the execution of a script to iterate through and read register results based on the address information of the registers.

[0034] Secondly, a PCIe device status detection system is provided, the system comprising:

[0035] The scanning unit is used to scan the PCIe device tree on the server motherboard and obtain the PCIe device identifiers on the PCIe device tree.

[0036] The connection unit disables and enables the connection status between the PCIe device and the central processing unit based on the device identifier;

[0037] The training unit performs PCIe link training once within a preset time after each time the connection status of a PCIe device is disabled and enabled.

[0038] The analysis unit reads the register results associated with the PCIe device after each link training is completed to analyze the PCIe device status.

[0039] Thirdly, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to perform the following steps:

[0040] Step A: Scan the PCIe device tree on the server motherboard to obtain the PCIe device identifiers on the PCIe device tree;

[0041] Step B: Disable and enable the connection status between the PCIe device and the central processing unit according to the device identifier;

[0042] Step C: After each time the connection status of a PCIe device is disabled and enabled, perform PCIe link training once within a preset time.

[0043] Step D: After each link training is completed, the register results associated with the PCIe device are read and traversed to analyze the PCIe device status.

[0044] Fourthly, a computer-readable storage medium is provided, on which a computer program is stored, wherein the computer program, when executed by a processor, performs the following steps:

[0045] Step A: Scan the PCIe device tree on the server motherboard to obtain the PCIe device identifiers on the PCIe device tree;

[0046] Step B: Disable and enable the connection status between the PCIe device and the central processing unit according to the device identifier;

[0047] Step C: After each time the connection status of a PCIe device is disabled and enabled, perform PCIe link training once within a preset time.

[0048] Step D: After each link training is completed, the register results associated with the PCIe device are read and traversed to analyze the PCIe device status.

[0049] The embodiments of this application have the following beneficial effects:

[0050] 1. The embodiment of this application adjusts the PCIe link by scanning the PCIe device tree on the server motherboard, traversing and querying the BDF numbers of all PCIe devices in the tree, and disabling and enabling the connection status between the PCIe devices and the ports on the central processing unit based on the BDF number. After each disabling and enabling of the PCIe device connection status, PCIe link training is performed once within a preset time. After each link training, the register results associated with the PCIe device are traversed and read to analyze the PCIe device status. By disabling and enabling the PCIe device connection and performing link training after each disabling and enabling of the connection between the PCIe device and the CPU, the PCIe link is trained under stress and fully verified. Compared with a hot reset of the entire system, this saves a great deal of time and effectively improves the efficiency of PCIe device detection and diagnosis. At the same time, the solution in this application can obtain detection information from more relevant registers for analyzing and locating fault causes.

[0051] 2. After traversing and reading the register results associated with the PCIe device, determine whether the link training count has reached the preset number. Through the preset number of training and connection processes, stress test on the link is achieved, effectively improving the detection and diagnosis efficiency of PCIe.

[0052] 3. After the link training is completed, the state value of the register is read through the link training state machine to determine whether the link training was successful. If the training is successful, the results in the relevant registers within the first preset range are obtained; if the training fails, the results in the relevant registers within the second preset range are obtained. Thus, regardless of whether the link training succeeds, the important register results are obtained for fault analysis, making it possible to locate the cause of a device failure and query its specific failure status.

Attached Figure Description

[0053] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0054] Figure 1 is a schematic diagram of the application environment according to this application;

[0055] Figure 2 is a schematic diagram of the PCIe topology connection structure according to this application;

[0056] Figure 3 is a schematic diagram of the PCIe device status detection method according to this application;

[0057] Figure 4 is a schematic diagram of the state transitions of the LTSSM state machine according to an embodiment of this application;

[0058] Figure 5 is a schematic diagram of the Recovery state according to Embodiment 1 of this application;

[0059] Figure 6 is a schematic diagram of the sub-states in the retrain process according to Embodiment 1 of this application;

[0060] Figure 7 is a schematic diagram of a PCIe device status detection system according to this application;

[0061] Figure 8 is a schematic diagram of an electronic device according to this application.

Detailed Implementation

[0062] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0063] This application provides a PCIe device status detection method, which can be applied to, for example, the application environment shown in Figure 1, in which terminal 102 communicates with server 104 via a network. The server scans the PCIe device tree on the motherboard to obtain PCIe device identifiers, where the PCIe device tree connects to terminal devices. Specifically, terminal 102 can be, but is not limited to, various personal computers, laptops, smartphones, tablets, and portable wearable devices. Server 104 can be a standalone server or a server cluster consisting of multiple servers.

[0064] BDF (Bus, Device, Function) is a unique identifier for each function on the PCIe bus. LTSSM (Link Training and Status State Machine) refers to the link training state machine. In the PCIe protocol, the two devices interconnected via PCIe are the CPU and the controller chip. Figure 2 shows the PCIe topology, in which the CPU is connected to several PCIe devices, and the other end of the CPU is also connected to a terminal device.

[0065] Example 1

[0066] A PCIe device status detection method, applied to servers, includes the following steps, as shown in Figure 3:

[0067] Step S1: Scan the PCIe device tree of the server motherboard to obtain the PCIe device identifiers on the PCIe device tree.

[0068] On the server motherboard, the lspci command is used to scan all devices on the PCIe link tree of the motherboard and obtain the identifiers of all PCIe link devices on the tree. Based on the device identifiers (BDF numbers), lspci is then used to traverse and query the PCIe link status of all endpoints on the left side of the tree.

[0069] Specifically, the topology of the PCIe tree in this embodiment includes, but is not limited to, the structure shown in Figure 2; the scheme in this embodiment is equally applicable to other PCIe topology connection structures.
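The scan in step S1 can be sketched as follows. This is a minimal illustration, assuming the default lspci text output in which each device line begins with its BDF in bus:device.function form, optionally prefixed by a PCI domain. The function name `parse_bdfs` and the sample text are hypothetical and are not part of this application.

```python
import re

# Optional 4-hex-digit domain, then bus:device.function at the start of a line.
BDF_RE = re.compile(
    r"^(?:[0-9a-fA-F]{4}:)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])\s"
)

def parse_bdfs(lspci_text):
    """Return the BDF identifiers found at the start of each lspci output line."""
    bdfs = []
    for line in lspci_text.splitlines():
        m = BDF_RE.match(line)
        if m:
            bus, dev, fn = m.groups()
            bdfs.append(f"{bus}:{dev}.{fn}")
    return bdfs

# Hypothetical sample resembling lspci output; real output varies by platform.
SAMPLE = """\
00:00.0 Host bridge: Example Vendor Root Complex
00:01.0 PCI bridge: Example Vendor Root Port
01:00.0 Serial Attached SCSI controller: Example Vendor SAS Controller
"""

print(parse_bdfs(SAMPLE))
```

On a real server the text would come from running lspci itself; the sample string only keeps the sketch independent of hardware.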

[0070] Step S2: Disable and enable the connection status between the PCIe device and the central processing unit according to the device identifier;

[0071] Specifically, the device identifier includes a downstream device identifier and an upstream device identifier;

[0072] The connection between a downstream PCIe device and the central processing unit is disabled and enabled based on the downstream device identifier; or the connection between an upstream PCIe device and the central processing unit is disabled and enabled based on the upstream device identifier; wherein a buffer time is reserved between the disable operation and the enable operation.
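The disable, buffer, enable sequence described above might look like the following sketch. The hook `write_link_disable` and the buffer value are assumptions made for illustration; this application does not state the length of the buffer time.

```python
import time

def toggle_link(write_link_disable, buffer_s=1.0, sleep=time.sleep):
    """Disable the link, wait for the buffer time, then re-enable it.

    write_link_disable(flag) is a caller-supplied (hypothetical) hook that sets
    or clears the link-disable control on the chosen upstream or downstream port.
    """
    write_link_disable(True)    # disable the connection
    sleep(buffer_s)             # buffer time between disable and enable
    write_link_disable(False)   # enable the connection so training can restart

# Dry run with a recording hook instead of real hardware access.
events = []
toggle_link(events.append, buffer_s=0.5,
            sleep=lambda s: events.append(f"sleep {s}"))
print(events)
```

Passing the hardware write and the sleep function as parameters keeps the sequence testable without touching a real device.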

[0073] In this embodiment, the downstream Link Disable register bit, LinkDisable (bit 4), is used to disable and enable the connection status of the PCIe device. Specifically, the command for disabling the connection is "setpci -s BDF 50.b=0x10", and the corresponding command for enabling the connection is "setpci -s BDF 50.b=0x00", where BDF is the BDF number of the device. The connection of the PCIe device is thus disabled and enabled by these two commands, which apply to the connection of downstream devices.
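As a hedged illustration of how such commands could be assembled in a script, the sketch below builds the setpci argument lists without executing them. The offset 0x50 and bit 4 come from the text above and are likely platform-specific; the helper names and the example BDF are hypothetical. setpci interprets values as hexadecimal, so the sketch omits the 0x prefix.

```python
LINK_DISABLE_BIT = 4  # Link Disable bit position named in the text

def setpci_args(bdf, offset, value):
    """Build the argument list for a byte-wide setpci write (hex, no 0x prefix)."""
    return ["setpci", "-s", bdf, f"{offset:x}.b={value:02x}"]

def link_disable_cmds(bdf, offset=0x50):
    """Return the (disable, enable) commands for one downstream port."""
    disable = setpci_args(bdf, offset, 1 << LINK_DISABLE_BIT)
    enable = setpci_args(bdf, offset, 0x00)
    return disable, enable

# Example BDF chosen for illustration only.
disable, enable = link_disable_cmds("00:01.0")
print(" ".join(disable))
print(" ".join(enable))
```

The lists can be passed to a process launcher on a machine where setpci is installed and the caller has sufficient privileges.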

[0074] Step S3: After each time the connection status of a PCIe device is disabled and enabled, perform PCIe link training once within a preset time.

[0075] Step S4: After each link training is completed, the register results associated with the PCIe device are read and traversed to analyze the PCIe device status.

[0076] In one specific embodiment, after PCIe link training is performed once within the preset time, the method waits for a certain period to ensure that the link training has finished.

[0077] Specifically, after each link training is completed, the following steps are also taken: read the state value of the register through the link training state machine to determine whether the link training was successful; if the state value of the register is equal to the preset state value, the link training is successful and the test continues; or, if the state value of the register is not equal to the preset state value, the link training fails and the test stops.

[0078] Specifically, as shown in Figures 4 and 5, the LTSSM state machine can display the transition of each sub-state during training. During the disable and enable operations in this application, the link passes through the sub-states from Detect, through Recovery and Equalization, and finally to the L0 state. Therefore, the LTSSM state machine is used to display the state transitions and to help quickly locate the training state when a fault occurs or the training stops midway.

[0079] In one specific embodiment, after each disabling and enabling of the corresponding link, and after a 1-second link training period, the result in the register is read through the downstream LTSSM. Specifically, the current state value displayed by the LTSSM is in bits 7:0. That is, after each disabling and enabling, this register is checked to determine whether training was successful. If the state value in bits 7:0 equals 0x30, the next loop is executed; otherwise, an error is reported and the test stops. A state value of L0 read through the LTSSM indicates that the link training was successful, and the testing process continues. Regardless of whether training succeeds, the status values of the important downstream and upstream PCIe registers are read after each disabling and enabling. For example, the registers read include the Equalization register, the channel status register, and the LTSSM monitoring register.
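The success check on bits 7:0 can be sketched as a simple mask and compare. The value 0x30 is the preset state value given in the text for this embodiment and is likely specific to the platform's LTSSM monitoring register; the function name and sample register values are hypothetical.

```python
L0_STATE = 0x30    # preset state value from the text
STATE_MASK = 0xFF  # bits 7:0 hold the current LTSSM state

def training_succeeded(ltssm_reg_value, expected=L0_STATE):
    """Compare bits 7:0 of the LTSSM monitoring register with the preset value."""
    return (ltssm_reg_value & STATE_MASK) == expected

print(training_succeeded(0x0000_0130))  # low byte is 0x30
print(training_succeeded(0x0000_0112))  # low byte is 0x12
```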

[0080] Before traversing and reading the register results associated with the PCIe device to analyze the PCIe device status, the method further includes: obtaining the address information of the registers to be traversed, so as to realize the traversal and reading of register results based on the address information of the registers by executing a script.

[0081] Specifically, the address information of the relevant registers is obtained and added to the script to be executed for traversal, so as to realize the result of traversing and reading the registers based on the address information of the registers by executing the script.
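A script of the kind described above could be organized around an address map, as in the following sketch. The accessor `read_reg`, the register names chosen, and the offsets are assumptions for illustration; real offsets depend on the capability layout of the device.

```python
def read_registers(read_reg, address_map):
    """Read every register in address_map and return {name: value}.

    read_reg(offset) is a caller-supplied (hypothetical) accessor; address_map
    maps register names to configuration-space offsets.
    """
    return {name: read_reg(offset) for name, offset in address_map.items()}

# Illustrative offsets only.
ADDRESS_MAP = {"Link Control": 0x50, "Link Status": 0x52}

# Fake configuration space standing in for real hardware reads.
fake_config_space = {0x50: 0x0040, 0x52: 0x1043}
results = read_registers(fake_config_space.__getitem__, ADDRESS_MAP)
for name, value in results.items():
    print(f"{name}: 0x{value:04x}")
```

Keeping the addresses in one table means that adding a register to the traversal only requires a new entry, not a change to the reading logic.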

[0082] To ensure that the specific cause of the fault can be located through the results in the registers, if the link training is successful, the register results within a first preset range are read; or, if the link training fails, the register results within a second preset range are read.

[0083] The registers in the first preset range and the registers in the second preset range both include downstream device-related registers and upstream device-related registers; wherein the downstream device-related registers and the upstream device-related registers may be of the same or different types.
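Selecting which registers to read from the training outcome can be sketched as a lookup. The groupings below are hypothetical: the text does not enumerate the contents of either preset range, so the names are borrowed from the register lists in this embodiment purely for illustration.

```python
# Hypothetical groupings; the text does not enumerate either range.
PRESET_RANGES = {
    True: {   # first preset range: training succeeded
        "downstream": ["Link Status", "Lane Error Status"],
        "upstream": ["Link Status"],
    },
    False: {  # second preset range: training failed
        "downstream": ["Link Status", "Lane Error Status",
                       "LTSSM Logger Read Control"],
        "upstream": ["Link Status", "Link Control"],
    },
}

def registers_to_read(training_ok):
    """Return the upstream and downstream register names for this outcome."""
    return PRESET_RANGES[bool(training_ok)]

print(registers_to_read(False)["downstream"])
```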

[0084] In a specific embodiment, when the link training process fails, the corresponding PCIe link is queried with lspci based on the BDF number, and the state displayed by the LTSSM state machine shows where the link stopped when it failed. At the same time, the transition phase of each sub-state is confirmed by querying the sub-state register. The TX Preset values of the upstream and downstream devices are queried by querying the "Register" register. The cause of the failure is located using the above information.

[0085] In one specific implementation, when training completes normally but an AER (Advanced Error Reporting) type error may still have occurred, lspci is used to query the channel status register of the link based on the BDF number to identify the lane with the error. The error type is then obtained from the register result, and the above information is printed to effectively locate the cause of the fault.
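Identifying the affected lanes can be sketched as decoding a bitmask. This assumes the register queried is a per-lane bitmask in which bit n corresponds to lane n, as with the Lane Error Status register named among the downstream registers; the function name and the sample value are hypothetical.

```python
def lanes_with_errors(lane_error_status, lane_count=16):
    """Return the lane numbers whose bit is set in a per-lane status value."""
    return [lane for lane in range(lane_count)
            if lane_error_status & (1 << lane)]

# Sample value with the bits for lanes 0 and 2 set.
print(lanes_with_errors(0b0000_0000_0000_0101))
```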

[0086] Specifically, the upstream device-related registers at least include: Vendor ID, Device ID, I/O Base, I/O Limit, Memory Base, Memory Limit, Prefetchable Memory Base, Prefetchable Memory Limit, Prefetchable Memory Base Upper 32 Bits, Prefetchable Memory Limit Upper 32 Bits, Device Control, Device Status, Link Capabilities, Link Control, Link Status, Slot Capabilities, Slot Control, Slot Status, Device Capabilities 2, Device Control 2, Link Capabilities 2, Link Control 2, Link Status 2, AER Policy, Power Management Control Status, Power Management Capabilities, and Link Control 3.

[0087] Specifically, the downstream device-related registers at least include: Uncorrectable Error Status, Uncorrectable Error Mask, Uncorrectable Error Severity, Correctable Error Status, Correctable Error Mask, Advanced Error Capabilities and Control, Lane Error Status, Lane Equalization Control, 16.0 GT/s Control, 16.0 GT/s Status, 16.0 GT/s Lane Equalization Control Register, 32.0 GT/s Control, 32.0 GT/s Status, 32.0 GT/s Lane Equalization Control Register, LTSSM Logger Read Control, Vendor ID, Device ID, I/O Base, I/O Limit, Memory Base, Memory Limit, Prefetchable Memory Base, Prefetchable Memory Limit, Prefetchable Memory Base Upper 32 Bits, Prefetchable Memory Limit Upper 32 Bits, Device Control, Device Status, Link Capabilities, Link Control, Link Status, Slot Capabilities, Slot Control, Slot Status, Device Capabilities 2, Device Control 2, Link Capabilities 2, Link Control 2, Link Status 2, AER Policy, Power Management Control Status, Power Management Capabilities, and Link Control 3.

[0088] Step S5: Perform the link training for a preset number of times.

[0089] Specifically, this includes: after traversing and reading the register results, determining whether the number of link training iterations has reached the preset number;

[0090] If the number of link training iterations is less than the preset number, continue to disable and enable the connection between the PCIe device and the central processing unit based on the device identifier; and after each disabling and enabling of the PCIe device's connection status, perform a PCIe link training step once within a preset time period; until the preset number of link training iterations is completed.
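The loop in step S5 can be sketched as below. The callable `cycle` is a hypothetical stand-in for one full disable/enable, training wait, and register read; the early stop on failure follows the behavior described for a failed link training.

```python
def run_stress_test(cycle, preset_count):
    """Run cycle() up to preset_count times; stop early if a cycle fails.

    cycle() is a caller-supplied (hypothetical) callable that performs one
    disable/enable, waits for link training, reads the registers, and returns
    True when training succeeded.
    """
    completed = 0
    while completed < preset_count:
        if not cycle():
            return completed, False   # training failed: report and stop
        completed += 1
    return completed, True

# Dry run: the simulated link fails on the fourth cycle.
outcomes = iter([True, True, True, False, True])
print(run_stress_test(lambda: next(outcomes), preset_count=200))
```

The function returns how many cycles completed together with a pass/fail flag, so a caller can report the iteration at which training stopped.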

[0091] In one specific embodiment, the preset number of tests is set to 200. The actual test time for completing 200 link training tests using the scheme in this embodiment is 40 minutes, while the detection process using the restart process in the prior art takes 50 hours. This shows that the scheme in this embodiment saves a lot of time and improves testing efficiency.
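The figures above imply the following per-iteration costs. The calculation assumes that the 50-hour figure for the restart-based process also covers 200 link trainings, which the text suggests but does not state explicitly.

```python
ITERATIONS = 200
PROPOSED_TOTAL_S = 40 * 60        # 40 minutes for the scheme in this embodiment
REBOOT_TOTAL_S = 50 * 60 * 60     # 50 hours for the restart-based process

per_cycle_s = PROPOSED_TOTAL_S / ITERATIONS    # seconds per disable/enable cycle
per_reboot_s = REBOOT_TOTAL_S / ITERATIONS     # seconds per restart-based iteration
speedup = REBOOT_TOTAL_S / PROPOSED_TOTAL_S    # ratio of total test times

print(f"{per_cycle_s:.0f} s per disable/enable cycle")
print(f"{per_reboot_s / 60:.0f} min per restart-based iteration")
print(f"{speedup:.0f}x shorter total test time")
```

Under that assumption, each cycle takes about 12 seconds against about 15 minutes per restart, a 75-fold reduction in total test time.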

[0092] In one specific embodiment, link retraining (retrain) is used during PCIe link training; the retrain method uses the downstream retrain register bit (bit 5) to retrain the link. This is accomplished with `setpci -s BDF 50.b=0x20`. Figure 6 shows the sub-states traversed during retraining. With retrain, the link need not pass through all of the sub-states; only the transition from Recovery to L0 is executed, which shortens the state transitions and further speeds up testing.
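The value 0x20 is simply bit 5 set in the register byte. The sketch below shows that relationship and, as a variation that this application does not describe, how the bit could be set while preserving the other bits of the byte, since a direct byte-wide write of 0x20 clears them. The helper name and the sample current value are hypothetical.

```python
RETRAIN_LINK_BIT = 5  # retrain bit position named in the text

def with_bit_set(current, bit):
    """Return the register byte with one bit set and all others preserved."""
    return (current | (1 << bit)) & 0xFF

# Starting from an all-zero byte reproduces the value used in the text.
print(hex(with_bit_set(0x00, RETRAIN_LINK_BIT)))
# Starting from a byte with another bit set keeps that bit intact.
print(hex(with_bit_set(0x40, RETRAIN_LINK_BIT)))
```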

[0093] The solution in this embodiment can perform stress testing on PCIe devices, effectively improving the detection and diagnosis efficiency of PCIe devices. When a fault occurs, the cause of the fault can be analyzed and located by querying register information.

[0094] It should be understood that, although the steps in the flowchart of Figure 3 are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless otherwise specified herein, there is no strict order in which these steps are executed, and they can be performed in other orders. At least some of the steps in Figure 3 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily executed at the same time, but can be executed at different times. The execution order of these sub-steps or stages is not necessarily sequential; they can be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.

[0095] Example 2

[0096] Corresponding to the above embodiments, this application provides a PCIe device status detection system. As shown in Figure 7, the system includes:

[0097] The scanning unit is used to scan the PCIe device tree on the server motherboard and obtain the PCIe device identifiers on the PCIe device tree.

[0098] The connection unit disables and enables the connection status between the PCIe device and the central processing unit based on the device identifier;

[0099] The training unit performs PCIe link training once within a preset time after each time the connection status of a PCIe device is disabled and enabled.

[0100] The analysis unit reads the register results associated with the PCIe device after each link training is completed to analyze the PCIe device status.

[0101] In one specific embodiment, a loop unit is also included, which is used to determine whether the number of link training iterations has reached a preset number after traversing and reading the register results;

[0102] If the number of link training iterations is less than the preset number, continue to disable and enable the connection between the PCIe device and the central processing unit based on the device identifier; and after each disabling and enabling of the PCIe device's connection status, perform a PCIe link training step once within a preset time period; until the preset number of link training iterations is completed.

[0103] In one specific embodiment, the device identifier used by the connection unit includes a downstream device identifier and an upstream device identifier; the connection unit disables and enables the connection state between the downstream PCIe device and the central processing unit based on the downstream device identifier, or disables and enables the connection state between the upstream PCIe device and the central processing unit based on the upstream device identifier.

[0104] A buffer time is set between the disable and enable operations.

[0105] In a specific embodiment, the connection unit specifically includes a first reading module and a second reading module. The first reading module is used to traverse and read the register results within a first preset range if the link training is successful, and the second reading module is used to traverse and read the register results within a second preset range if the link training fails.

[0106] In one specific embodiment, the registers in the first preset range and the registers in the second preset range each include downstream device-related registers and upstream device-related registers; wherein, the downstream device-related registers and the upstream device-related registers may be of the same or different types.

[0107] In one specific embodiment, the analysis unit is further configured to obtain the address information of the registers to be traversed before traversing and reading the register results associated with the PCIe device for analyzing the PCIe device status, so as to realize the traversal and reading of register results based on the address information of the registers by executing a script.

[0108] For specific limitations regarding the PCIe device status detection system, please refer to the limitations of the PCIe device status detection method mentioned above, which will not be repeated here. Each module in the aforementioned PCIe device status detection system can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in the processor of the computer device in hardware form or independent of it, or they can be stored in the memory of the computer device in software form, so that the processor can call and execute the corresponding operations of each module.

[0109] Example 3

[0110] An electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it performs the following steps:

[0111] Step 101: Scan the PCIe device tree of the server motherboard to obtain the PCIe device identifier on the PCIe device tree;

[0112] Step 102: Disable and enable the connection status between the PCIe device and the central processing unit according to the device identifier;

[0113] Step 103: After each time the connection status of a PCIe device is disabled and enabled, perform PCIe link training once within a preset time.

[0114] Step 104: After each link training is completed, the register results associated with the PCIe device are read in turn to analyze the PCIe device status.

[0115] In a specific embodiment, the method further includes step 105: after traversing and reading the register results associated with the PCIe device, determining whether the number of link training iterations has reached a preset number.

[0116] If the number of link training attempts is less than the preset number, continue to disable and enable the connection between the PCIe device and the central processing unit according to the device identifier; and after each disable and enable of the PCIe device connection status, execute a PCIe link training step once within a preset time.

[0117] This cycle repeats until the preset number of link training iterations has been completed.
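The cycle of steps 102 to 105 can be illustrated with a minimal Python sketch. It is not the implementation of this application: the names `run_stress_test` and `CycleResult`, and the three callables passed in (`toggle_link`, `link_trained`, `read_registers`), are hypothetical placeholders for platform-specific operations. The sketch also assumes that a failed training ends the test early.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CycleResult:
    iteration: int
    trained: bool
    registers: Dict[int, int] = field(default_factory=dict)


def run_stress_test(
    device_id: str,
    preset_count: int,
    toggle_link: Callable[[str], None],                      # hypothetical: step 102
    link_trained: Callable[[str], bool],                     # hypothetical: step 103 outcome
    read_registers: Callable[[str, bool], Dict[int, int]],   # hypothetical: step 104
) -> List[CycleResult]:
    """Repeat disable/enable, training check and register read up to preset_count times."""
    results: List[CycleResult] = []
    for iteration in range(1, preset_count + 1):   # step 105: bounded by the preset number
        toggle_link(device_id)                     # disable, then enable the connection
        trained = link_trained(device_id)          # did the link training succeed?
        registers = read_registers(device_id, trained)
        results.append(CycleResult(iteration, trained, registers))
        if not trained:
            break                                  # assumed: a failed training stops the test
    return results
```

Passing the hardware operations in as callables keeps the loop itself independent of how a given platform disables a link or reads a register, so the same loop can be exercised with simulated devices.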

[0118] In one specific embodiment, in step 102 the device identifier includes a downstream device identifier and an upstream device identifier, and step 102 specifically includes: disabling and enabling the connection state between the downstream PCIe device and the central processing unit according to the downstream device identifier; or disabling and enabling the connection state between the upstream PCIe device and the central processing unit according to the upstream device identifier.

[0119] A buffer time is set between the disable and enable operations.
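The disable, buffer and enable sequence can be sketched as follows. This application does not state which mechanism performs the disabling, so the sketch makes an assumption: it toggles the Link Disable bit (bit 4) of the PCI Express Link Control register, which sits at offset 0x10 of the PCI Express capability and is defined for downstream ports, using the `setpci` utility from pciutils. The function names, the `run` and `sleep` parameters and the default buffer time are hypothetical.

```python
import time
from typing import Callable, List

LINK_CONTROL = "CAP_EXP+10.w"   # Link Control register, 16-bit access
LINK_DISABLE_BIT = 0x0010       # bit 4: Link Disable


def link_toggle_commands(port_bdf: str) -> List[List[str]]:
    """Return the setpci argument lists that set and then clear Link Disable."""
    set_bit = f"{LINK_CONTROL}={LINK_DISABLE_BIT:04x}:{LINK_DISABLE_BIT:04x}"
    clear_bit = f"{LINK_CONTROL}=0000:{LINK_DISABLE_BIT:04x}"
    return [
        ["setpci", "-s", port_bdf, set_bit],
        ["setpci", "-s", port_bdf, clear_bit],
    ]


def toggle_link(
    port_bdf: str,
    run: Callable[[List[str]], object],          # e.g. subprocess.run on a real system
    buffer_s: float = 1.0,                       # assumed buffer time in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Disable the link, wait for the buffer time, then enable it again."""
    disable_cmd, enable_cmd = link_toggle_commands(port_bdf)
    run(disable_cmd)
    sleep(buffer_s)
    run(enable_cmd)
```

On a real system `run` could be `subprocess.run`, which requires sufficient privileges. Injecting `run` and `sleep` allows the sequence to be checked without touching hardware.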

[0120] In one specific embodiment, after each link training in step 103 is completed, the method further includes:

[0121] The success of the link training is determined by reading the state value of the register through the link training state machine.

[0122] If the state value of the register is equal to the preset state value, the link training is successful and testing continues;

[0123] Alternatively, if the state value of the register is not equal to the preset state value, the link training fails and the test stops.
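The comparison against the preset state value can be sketched as a masked equality test. The register that exposes the link training state machine, its field layout and the encoding of the successful state are platform-specific and are not given here, so `LTSSM_STATE_MASK` and `PRESET_STATE` below are purely hypothetical values chosen for illustration.

```python
# Hypothetical values: the actual register layout and state encoding
# are platform-specific and are not specified in this document.
LTSSM_STATE_MASK = 0x3F   # assumed: the low six bits hold the state value
PRESET_STATE = 0x11       # assumed encoding of the successfully trained state


def link_training_succeeded(
    register_value: int,
    preset_state: int = PRESET_STATE,
    mask: int = LTSSM_STATE_MASK,
) -> bool:
    """Return True when the masked state field equals the preset state value."""
    return (register_value & mask) == preset_state


def next_action(register_value: int) -> str:
    """Map the training outcome to the decision described above."""
    return "continue" if link_training_succeeded(register_value) else "stop"
```

Masking before comparing lets the check ignore unrelated bits that may share the same register.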

[0124] In one specific embodiment, if the link training is successful, the register results of the first preset range are traversed and read.

[0125] Alternatively, if the link training fails, the register results within a second preset range are read.

[0126] In one specific embodiment, the registers in the first preset range and the registers in the second preset range each include downstream device-related registers and upstream device-related registers.

[0127] The downstream device-related registers and the upstream device-related registers may be of the same or different types.
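One way to represent the two preset ranges is a table keyed by the training outcome, with one entry for each side of the link. The sketch below is illustrative only: the offsets in `RANGES` are hypothetical, as are the names `RegisterRange` and `offsets_to_read`, and the choice of a wider range after a failure is an assumption rather than something this document specifies.

```python
from typing import Dict, List, NamedTuple


class RegisterRange(NamedTuple):
    start: int      # first offset
    end: int        # last offset, inclusive
    step: int = 4   # assumed 32-bit register stride


# Hypothetical ranges: True = first preset range (training succeeded),
# False = second preset range (training failed).
RANGES: Dict[bool, Dict[str, RegisterRange]] = {
    True: {
        "upstream": RegisterRange(0x00, 0x3C),
        "downstream": RegisterRange(0x00, 0x3C),
    },
    False: {
        "upstream": RegisterRange(0x00, 0xFC),
        "downstream": RegisterRange(0x100, 0x1FC),
    },
}


def offsets_to_read(trained: bool) -> Dict[str, List[int]]:
    """List the register offsets to read on each side for a given outcome."""
    return {
        side: list(range(r.start, r.end + 1, r.step))
        for side, r in RANGES[trained].items()
    }
```

Because the upstream and downstream entries are independent, they can cover the same registers or different ones.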

[0128] In one specific embodiment, before traversing and reading the register results associated with the PCIe device in step 104, the method further includes:

[0129] Obtaining the address information of the registers to be traversed, so that a script can be executed to traverse and read the register results according to that address information.
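A script of this kind might take the address information as a plain list and read each address in turn. The following is a minimal sketch under stated assumptions: the one-address-per-line text format, the function names and the `read_register` callable are all hypothetical, and the actual read mechanism is left to the platform.

```python
from typing import Callable, Dict, List


def parse_address_list(text: str) -> List[int]:
    """Parse one hexadecimal register address per line; '#' starts a comment."""
    addresses: List[int] = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            addresses.append(int(line, 16))   # accepts "0x10" as well as "10"
    return addresses


def traverse_registers(
    addresses: List[int],
    read_register: Callable[[int], int],      # hypothetical platform-specific read
) -> Dict[int, int]:
    """Read every listed address in order and return address -> value."""
    return {addr: read_register(addr) for addr in addresses}
```

Keeping the address list separate from the read logic means the set of registers to inspect can be changed without editing the script itself.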

[0130] In one embodiment, an electronic device is provided, which may be a server, and its internal structure may be as shown in Figure 8. This electronic device includes a processor, memory, network interface, and database connected via a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and internal memory. The non-volatile storage medium stores the operating system, computer programs, and database. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage medium. The database stores register status data. The network interface communicates with external terminals via a network connection. When executed by the processor, the computer program implements a PCIe device status detection method.

[0131] Those skilled in the art will understand that the structure shown in Figure 8 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the electronic device to which the present application is applied. The specific electronic device may include more or fewer components than shown in the figure, or combine certain components, or have different component arrangements.

[0132] Example 4

[0133] In one embodiment of this invention, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, performs the following steps:

[0134] Step 201: Scan the PCIe device tree of the server motherboard to obtain the PCIe device identifier on the PCIe device tree;

[0135] Step 202: Disable and enable the connection status between the PCIe device and the central processing unit according to the device identifier;

[0136] Step 203: After each time the connection status of a PCIe device is disabled and enabled, perform PCIe link training once within a preset time.

[0137] Step 204: After each link training is completed, the register results associated with the PCIe device are read in turn to analyze the PCIe device status.

[0138] In one specific embodiment, when the computer program is executed by the processor, it further performs the following steps: after traversing and reading the register results associated with the PCIe device, it determines whether the number of link training attempts has reached a preset number;

[0139] If the number of link training attempts is less than the preset number, continue to disable and enable the connection between the PCIe device and the central processing unit according to the device identifier; and after each disable and enable of the PCIe device connection status, execute a PCIe link training step once within a preset time.

[0140] This cycle repeats until the preset number of link training iterations has been completed.

[0141] In one specific embodiment, when the computer program is executed by the processor, it further performs the following steps: the device identifier includes a downstream device identifier and an upstream device identifier; the connection state between the downstream PCIe device and the central processing unit is disabled and enabled according to the downstream device identifier; or, the connection state between the upstream PCIe device and the central processing unit is disabled and enabled according to the upstream device identifier.

[0142] A buffer time is set between the disable and enable operations.

[0143] In one specific embodiment, when the computer program is executed by the processor, it further implements the following steps after each link training is completed:

[0144] The success of the link training is determined by reading the state value of the register through the link training state machine.

[0145] If the state value of the register is equal to the preset state value, the link training is successful and testing continues;

[0146] Alternatively, if the state value of the register is not equal to the preset state value, the link training fails and the test stops.

[0147] In one specific embodiment, when the computer program is executed by the processor, it further implements the following steps: if the link training is successful, then iterate through and read the register results of the first preset range;

[0148] Alternatively, if the link training fails, the register results within a second preset range are read.

[0149] In one specific embodiment, when the computer program is executed by the processor, it further implements the following steps: the registers in the first preset range and the registers in the second preset range each include downstream device-related registers and upstream device-related registers;

[0150] The downstream device-related registers and the upstream device-related registers may be of the same or different types.

[0151] In one specific embodiment, when the computer program is executed by the processor, it further implements the following step before traversing and reading the register results associated with the PCIe device for analyzing the PCIe device status:

[0152] Obtaining the address information of the registers to be traversed, so that a script can be executed to traverse and read the register results according to that address information.

[0153] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, storage, databases, or other media used in the embodiments provided in this application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

[0154] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0155] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are relatively specific and detailed, they should not be construed as limiting the scope of the invention patent. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application should be determined by the appended claims.

Claims

1. A method for detecting the status of a PCIe device, characterized in that the method comprises: scanning the PCIe device tree of the server motherboard to obtain the PCIe device identifiers on the PCIe device tree; disabling and enabling the connection status between the PCIe device and the central processing unit according to the device identifier; after each time the connection status of the PCIe device is disabled and enabled, performing PCIe link training once within a preset time; after each link training is completed, traversing and reading the register results associated with the PCIe device to analyze the PCIe device status; after traversing and reading the register results associated with the PCIe device, determining whether the number of link trainings has reached a preset number; if the number of link trainings is less than the preset number, continuing to disable and enable the connection status between the PCIe device and the central processing unit according to the device identifier, and, after each disabling and enabling of the connection status of the PCIe device, performing the PCIe link training step once within the preset time, until the preset number of link trainings is completed, wherein a buffer time is reserved between the disabling and enabling operations; after each link training is completed, reading the state value of the register through the link training state machine to determine whether the link training was successful; if the state value of the register is equal to the preset state value, the link training is successful and the test continues; or, if the state value of the register is not equal to the preset state value, the link training fails and the test stops; if the link training is successful, reading the register results of the first preset range; or, if the link training fails, reading the register results of the second preset range; wherein the registers of the first preset range and the registers of the second preset range each include downstream device-related registers and upstream device-related registers.

2. The PCIe device status detection method according to claim 1, characterized in that disabling and enabling the connection status between the PCIe device and the central processing unit according to the device identifier specifically includes: the device identifier includes a downstream device identifier and an upstream device identifier; the connection status between the downstream PCIe device and the central processing unit is disabled and enabled according to the downstream device identifier; or, the connection status between the upstream PCIe device and the central processing unit is disabled and enabled according to the upstream device identifier.

3. The PCIe device status detection method according to claim 1, characterized in that the downstream device-related registers and the upstream device-related registers are of the same type or of different types.

4. The PCIe device status detection method according to claim 1 or 2, characterized in that, before traversing and reading the register results associated with the PCIe device for analyzing the PCIe device status, the method further includes: obtaining the address information of the registers to be traversed, so that a script can be executed to traverse and read the register results according to the address information of the registers.

5. A PCIe device status detection system based on the method of any one of claims 1 to 4, characterized in that the system includes: a scanning unit, used to scan the PCIe device tree on the server motherboard and obtain the PCIe device identifiers on the PCIe device tree; a connection unit, used to disable and enable the connection status between the PCIe device and the central processing unit according to the device identifier; a training unit, used to perform PCIe link training once within a preset time after each time the connection status of the PCIe device is disabled and enabled; and an analysis unit, used to traverse and read the register results associated with the PCIe device after each link training is completed, to analyze the PCIe device status.

6. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 4.

7. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 4.