Serial bus link status classification alarm method and electronic device

By acquiring serial bus link status information and checking it against preset judgment rules and a platform information database, the method distinguishes the types of PCIe link degradation events, addressing the low operation and maintenance efficiency of existing technologies and enabling efficient handling of abnormal degradation events.

CN122240430A · Pending · Publication Date: 2026-06-19 · INSPUR SUZHOU INTELLIGENT TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
INSPUR SUZHOU INTELLIGENT TECH CO LTD
Filing Date
2026-05-21
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In existing technologies, when a server system detects a decrease in PCIe link speed or bandwidth, it cannot effectively distinguish whether the cause is a device malfunction or a generational inconsistency in PCIe, resulting in low operational efficiency.

Method used

By acquiring the status information of the serial bus link, a preliminary judgment is made based on preset judgment rules to determine whether the drop event is expected or abnormal. After verification through the platform information database, the alarm type is finally determined and different types of alarms are generated.

Benefits of technology

It improves operation and maintenance efficiency, reduces the time spent troubleshooting false faults, and raises the intelligence level of system management.



Abstract

This application discloses a serial bus link status classification alarm method and electronic device, relating to the field of high-speed serial bus device technology. When a link experiences a rate or bandwidth decrease event, the method acquires the link's status information, pre-judges the decrease event based on the status information and preset judgment rules, initially determining whether the decrease event is expected or abnormal, and then verifies the pre-judgment result against a platform information database to ultimately determine whether the decrease event is expected or abnormal. This method can differentiate between decrease events and generate different types of alarms for different decrease events, allowing maintenance personnel to handle only alarms related to abnormal decreases, thus improving maintenance efficiency.

Description

Technical Field

[0001] This application relates to the field of high-speed serial bus device technology, and in particular to a serial bus link status classification alarm method and electronic device.

Background Technology

[0002] In server systems, the high-speed serial bus (Peripheral Component Interconnect Express, PCIe) is widely used to connect processors and peripheral devices. However, as PCIe technology is iteratively upgraded, the PCIe generations supported by a processor and its peripheral devices may differ. In that case the PCIe link negotiates down to the rate and bandwidth of the lower generation, which appears as a rate reduction event or a bandwidth reduction event on the link.

[0003] In related technologies, when a link rate or bandwidth reduction event is detected, it is directly determined that a device failure has occurred in the system, and an alarm message is generated for maintenance personnel to perform system maintenance. However, the above technologies cannot effectively distinguish whether the alarm message is caused by a device failure or by the inconsistency in PCIe generations, resulting in low maintenance efficiency.

Summary of the Invention

[0004] This application provides a serial bus link status classification alarm method and electronic device to at least solve the problem of low operation and maintenance efficiency in related technologies.

[0005] This application provides a serial bus link status classification alarm method, including: in response to a degradation event occurring on the serial bus link, acquiring the status information of the serial bus link, wherein the degradation event includes a rate degradation event and a bandwidth degradation event, and the status information is used to perform category analysis on the degradation event; generating a preliminary judgment result based on the status information and preset judgment rules, wherein the preliminary judgment result is used to initially indicate whether the degradation event belongs to an expected degradation or an abnormal degradation; and verifying the preliminary judgment result based on the platform information database to generate a target result, wherein the target result is used to indicate the alarm type corresponding to the degradation event.

[0006] This application also provides a serial bus link status classification alarm device, comprising: an acquisition module, configured to acquire status information of the serial bus link in response to a degradation event occurring on the serial bus link, wherein the degradation event includes a rate degradation event and a bandwidth degradation event, and the status information is used to perform category analysis on the degradation event; a first processing module, configured to generate a preliminary judgment result based on the status information and preset judgment rules, the preliminary judgment result being used to initially indicate whether the degradation event belongs to an expected degradation or an abnormal degradation; and a second processing module, configured to review the preliminary judgment result based on a platform information database and generate a target result, the target result being used to indicate the alarm type corresponding to the degradation event.

[0007] This application also provides an electronic device, including: a memory for storing a computer program; and a processor for implementing the steps of any of the above-described serial bus link status classification alarm methods when executing the computer program.

[0008] This application also provides a computer-readable storage medium storing a computer program, wherein when the computer program is executed by a processor, it implements the steps of any of the above-described serial bus link status classification alarm methods.

[0009] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the steps of any of the above-described serial bus link status classification and alarm methods.

[0010] The serial bus link status classification alarm method and electronic device provided in this application obtain link status information when a link rate decreases or bandwidth decreases. Based on the status information and preset judgment rules, the method makes a preliminary judgment on whether the decrease is expected or abnormal. The method then reviews the judgment result against the platform information database to ultimately determine whether the decrease is expected or abnormal. This allows for the differentiation of decrease events and the generation of different types of alarms for different decrease events. This enables maintenance personnel to handle alarms only for abnormal decreases, improving maintenance efficiency.

Attached Figure Description

[0011] To more clearly illustrate the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0012] Figure 1 is a schematic diagram of an application environment architecture provided in an embodiment of this application;

[0013] Figure 2 is a first flowchart of the serial bus link status classification alarm method provided in an embodiment of this application;

[0014] Figure 3 is a schematic diagram illustrating the visualization of a target result provided in an embodiment of this application;

[0015] Figure 4 is a second flowchart of the serial bus link status classification alarm method provided in an embodiment of this application;

[0016] Figure 5 is a schematic diagram of the structure of the serial bus link status classification alarm device provided in an embodiment of this application;

[0017] Figure 6 is a schematic diagram of the structure of the electronic device provided in an embodiment of this application.

Detailed Implementation

[0018] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the protection scope of this application.

[0019] It should be noted that, in the description of this application, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. The terms "first," "second," etc., in this application are used to distinguish similar objects and are not used to describe a specific order or sequence.

[0020] To enable those skilled in the art to better understand the present application, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0021] The application environment architecture, or hardware architecture, on which the serial bus link status classification alarm method depends is described here. Referring to Figure 1, which is a schematic diagram of an application environment architecture provided in an embodiment of the present application, the server system contains a central processing unit 10 and a connection device 11, which are connected to each other via a high-speed serial bus link.

[0022] The Central Processing Unit (CPU) is the core computing and control unit of the server. It supports specific generations of the PCIe bus standard (such as Gen4 and Gen5), and its built-in PCIe controller sets the upper limit on the link speed available to connected devices. It is the initiator of PCIe link communication and the main body of resource allocation.

[0023] Connecting device 11 is a terminal peripheral that supports the PCIe protocol, such as a graphics processing unit (GPU), a non-volatile memory express solid-state drive (NVMe SSD), a network card, or other hardware devices. Connecting device 11 has a built-in capability register (LnkCap) that can declare its maximum supported PCIe speed (e.g., Gen5) and maximum bandwidth (e.g., x4). Connecting device 11 also has a built-in status register (LnkSta) that can declare the currently operating PCIe speed and bandwidth.
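As an illustration of reading the two registers, the following minimal sketch decodes the speed and width fields from raw register values. It assumes the standard PCI Express layout in which both the Link Capabilities and Link Status registers carry the speed code in bits 3:0 and the lane width in bits 9:4; the function name and the sample values are hypothetical.

```python
def decode_link_field(raw: int):
    """Return (speed code, lane width) from a raw LnkCap or LnkSta value."""
    speed = raw & 0xF           # bits 3:0: speed code (4 is commonly Gen4, 5 is Gen5)
    width = (raw >> 4) & 0x3F   # bits 9:4: number of lanes (x1, x2, x4, ...)
    return speed, width


# Hypothetical register values for a Gen5 x4 device running at Gen4 x4.
lnkcap = 0x0045   # declared maximum: speed code 5, width 4
lnksta = 0x0044   # currently negotiated: speed code 4, width 4

max_gen, max_width = decode_link_field(lnkcap)
cur_gen, cur_width = decode_link_field(lnksta)
```

Comparing the decoded pairs shows at a glance whether the running link is below what the device declares.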

[0024] The PCIe speeds supported by servers are upgraded with each product generation: Gen3, Gen4, Gen5, Gen6, and so on. Generally, a CPU whose highest supported speed is PCIe Gen6 is paired with PCIe Gen6 connection devices, and the link negotiates to the Gen6 speed. In some cases, however, existing servers and connection devices are mixed with newly purchased ones, so that, for example, a CPU whose highest supported speed is PCIe Gen4 is paired with a PCIe Gen5 connection device. The PCIe link speed is then negotiated to the highest speed supported by both parties, Gen4; this is normal and expected speed reduction compatibility behavior. Another scenario arises with PCIe hard drives such as U.2 and E3 drives: with the advent of PCIe Gen5 and Gen6, the throughput that previously required x4 can now be delivered by a Gen5 or Gen6 x2 link, so drives that support a maximum bandwidth of x4 can be paired with x2 backplanes. The PCIe link bandwidth is negotiated to the highest bandwidth supported by both parties, x2, and the PCIe device likewise exhibits normal and expected bandwidth reduction compatibility behavior.
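The negotiation outcome described above amounts to taking the smaller of the two sides' capabilities on each dimension. A minimal sketch, with a hypothetical function name and with generations and lane counts represented as plain integers:

```python
def negotiate(cpu_max_gen: int, device_max_gen: int,
              slot_width: int, device_max_width: int):
    """Return the (generation, lane width) both sides can support."""
    return min(cpu_max_gen, device_max_gen), min(slot_width, device_max_width)


# Gen4 CPU with a Gen5 device in an x16 slot: the link runs at Gen4 x16.
gen4_case = negotiate(cpu_max_gen=4, device_max_gen=5,
                      slot_width=16, device_max_width=16)

# x4-capable Gen5 drive on an x2 backplane: the link runs at Gen5 x2.
x2_case = negotiate(cpu_max_gen=5, device_max_gen=5,
                    slot_width=2, device_max_width=4)
```

In both cases the link runs below the device's declared maximum even though nothing is faulty.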

[0025] However, in related technologies, server management systems typically report such speed reduction or bandwidth degradation as a "Bus Degraded" or other critical alarm event when they detect it. For example, when a Gen5 GPU runs on a server that only supports Gen4, even if the link status is normal, the system will still record a Critical level alarm log, causing maintenance personnel to spend a lot of time troubleshooting false faults and severely interfering with the efficiency of handling real problems. Furthermore, as server scale increases and hardware heterogeneity grows, similar problems will be further exacerbated, directly reducing system availability and increasing maintenance costs. Therefore, there is an urgent need for a method that can intelligently distinguish between expected speed / bandwidth degradation and abnormal speed / bandwidth degradation to optimize the alarm mechanism and improve the intelligence level of system management.

[0026] Based on this, this application provides a serial bus link status classification alarm method. When a link experiences a rate decrease or bandwidth decrease event, the method acquires the link status information, makes a preliminary judgment on the decrease event based on the status information and preset judgment rules, initially determines whether the decrease event is expected or abnormal, and reviews the judgment result based on the platform information database to finally determine whether the decrease event is expected or abnormal. This method can distinguish decrease events and generate different types of alarms for different decrease events, allowing maintenance personnel to handle only alarms for abnormal decreases, thereby improving maintenance efficiency.

[0027] Figure 2 is a first flowchart of the serial bus link status classification alarm method provided in an embodiment of this application. As shown in Figure 2, embodiments of this application provide a serial bus link status classification alarm method, which is described in detail below:

[0028] S101. In response to a drop event occurring on the serial bus link, obtain the status information of the serial bus link. The drop event includes a rate drop event and a bandwidth drop event. The status information is used to perform category analysis on the drop event.

[0029] The serial bus link, also known as the PCIe link, is a high-speed data transmission channel connecting the CPU and PCIe peripherals, hereinafter referred to as the link.

[0030] A rate reduction event refers to the negotiation of the PCIe link rate to the highest rate supported by both the CPU and the connected device when their supported rates differ. For example, if the highest rate supported by the CPU is PCIe Gen4 and the highest rate supported by the connected device is PCIe Gen5, the PCIe link rate will be negotiated to the highest rate supported by both, PCIe Gen4, and the PCIe device will exhibit normal and expected rate reduction compatibility behavior.

[0031] A bandwidth reduction event refers to the negotiation of PCIe link bandwidth to the highest bandwidth supported by both the CPU and the connected device when their bandwidths differ. For example, when a PCIe hard drive with a maximum bandwidth of x4 is mounted on a backplane with a bandwidth of x2, the PCIe link bandwidth is negotiated to the highest bandwidth supported by both devices, x2, and the PCIe device will exhibit normal and expected bandwidth reduction compatibility behavior.

[0032] A drop event can be a rate drop event, a bandwidth drop event, or a combination of both.

[0033] Status information consists of multi-dimensional data used to analyze the categories of degradation events. The categories of degradation events can be expected degradation and abnormal degradation. Expected degradation refers to the speed reduction and / or bandwidth reduction behavior caused by hardware incompatibility, while abnormal degradation refers to the speed reduction and / or bandwidth reduction behavior caused by link failure or equipment failure.

[0034] The status information may include the highest PCIe version supported by the CPU, the highest PCIe version and maximum bandwidth supported by the PCIe device, the actual rate and bandwidth of the link currently negotiated, the link training status, the link error count, and the bandwidth allocated by the Basic Input Output System (BIOS) to the PCIe slot or NVMe backplane.

[0035] After detecting a drop event, the BIOS can actively collect link-related data, covering multiple dimensions such as CPU, PCIe devices, link operating status and configuration allocation, to provide data support for subsequent category analysis.

[0036] For example, the status information may include first status information and second status information. When the drop event is a rate drop event, the BIOS collects rate-related data to constitute the first status information; when the drop event is a bandwidth drop event, the BIOS collects bandwidth-related data to constitute the second status information; when the drop event includes both a rate drop event and a bandwidth drop event, the BIOS collects both rate-related data and bandwidth-related data to constitute the status information.

[0037] In one possible implementation, if the drop event is a rate drop event, then acquiring the status information of the serial bus link in response to the drop event occurring on the serial bus link includes: in response to the rate drop event, obtaining first status information, which includes at least one of the following: a first rate, a second rate, a third rate, a link status, and an error count, wherein the first rate is used to indicate the maximum bus transmission rate supported by the connected device, the second rate is used to indicate the maximum bus transmission rate supported by the processor, the third rate is used to indicate the current actual bus transmission rate, the link status is used to indicate whether the serial bus link is in a normal state or a disconnected state, and the error count is used to indicate the number of hardware-level errors detected during the operation of the serial bus link.

[0038] The connection device is a PCIe device, hereinafter referred to as the device.

[0039] The first rate is the highest PCIe rate supported by the PCIe device, denoted as device_max_gen, which can be obtained by reading the device's capability register (LnkCap). The second rate is the highest PCIe rate supported by the CPU, denoted as cpu_max_gen, which can be obtained by reading the processor's registers. The third rate is the PCIe rate negotiated by the PCIe device, i.e., the current actual rate, denoted as current_speed, which can be obtained by reading the device's status register (LnkSta). The link status includes normal status (LinkUp) and disconnected status (LinkDown). The error count is the number of various errors that occur in the PCIe link during data transmission, link training, etc., including but not limited to transmission errors, checksum errors, link synchronization errors, etc., used to reflect the stability and fault status of the PCIe link. The link status and error count can be obtained by reading the Advanced Error Reporting (AER) registers.

[0040] In one possible implementation, if the drop event is a bandwidth drop event, then in response to the drop event occurring on the serial bus link, the status information corresponding to the serial bus link is obtained, including: in response to the bandwidth drop event, obtaining second status information, the second status information including at least one of the following: first bandwidth, second bandwidth, third bandwidth, link status, and error count, wherein the first bandwidth is used to indicate the maximum bus transmission bandwidth supported by the connected device, the second bandwidth is used to indicate the bus transmission bandwidth allocated to the connected device, and the third bandwidth is used to indicate the current actual bus transmission bandwidth.

[0041] The first bandwidth is the PCIe bandwidth supported by the PCIe device, denoted as device_max_width, which can be obtained by reading the device's capability register. The second bandwidth is the bandwidth allocated by the BIOS to the PCIe slot or NVMe backplane, denoted as bandwidth_allocation, which can be obtained by reading the BIOS's bandwidth allocation information. The third bandwidth is the PCIe bandwidth negotiated by the PCIe device, i.e., the current actual bandwidth, denoted as current_width, which can be obtained by reading the device's status register. The link status includes normal status (LinkUp) and disconnected status (LinkDown). The error count is the number of various errors that occur in the PCIe link during data transmission, link training, etc., including but not limited to transmission errors, verification errors, link synchronization errors, etc., used to reflect the stability and fault status of the PCIe link. The link status and error count can be obtained by reading the AER.
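The first and second status information can be pictured as two small records. The sketch below is only one possible representation: the field names device_max_gen, cpu_max_gen, current_speed, device_max_width, bandwidth_allocation, and current_width come from the text, while the class names and the link_up and error_count fields are hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass
class RateStatus:
    """First status information, collected for a rate drop event."""
    device_max_gen: int   # first rate: highest generation the device declares
    cpu_max_gen: int      # second rate: highest generation the CPU supports
    current_speed: int    # third rate: generation actually negotiated
    link_up: bool         # True for LinkUp, False for LinkDown
    error_count: int      # hardware-level errors seen on the link


@dataclass
class WidthStatus:
    """Second status information, collected for a bandwidth drop event."""
    device_max_width: int       # first bandwidth: widest link the device declares
    bandwidth_allocation: int   # second bandwidth: lanes allocated by the BIOS
    current_width: int          # third bandwidth: lanes actually negotiated
    link_up: bool
    error_count: int


# Hypothetical samples: a Gen5 device on a Gen4 CPU, and an x4 drive on an x2 backplane.
rate_sample = RateStatus(device_max_gen=5, cpu_max_gen=4, current_speed=4,
                         link_up=True, error_count=0)
width_sample = WidthStatus(device_max_width=4, bandwidth_allocation=2,
                           current_width=2, link_up=True, error_count=0)
```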

[0042] S102. Based on the status information and preset judgment rules, generate a prediction result. The prediction result is used to initially indicate whether the drop event is an expected drop or an abnormal drop.

[0043] The preset judgment rules are the algorithm logic used to initially determine the attributes of the degradation event, including the speed reduction judgment rule and the bandwidth reduction judgment rule. By comparing the key parameters in the status information, it is possible to identify whether the degradation event is a normal behavior caused by hardware incompatibility.

[0044] The prediction result is a preliminary conclusion generated after executing preset judgment rules, including expected decline and abnormal decline. The prediction result can include a first prediction result and a second prediction result. For a rate decline event, the corresponding prediction result is the first prediction result; for a bandwidth decline event, the corresponding prediction result is the second prediction result.

[0045] For example, when the drop event is a rate drop event, a first prediction result can be determined based on the first status information and the rate reduction judgment rule. The first prediction result is used to indicate whether the rate drop event is expected or abnormal. When the drop event is a bandwidth drop event, a second prediction result can be determined based on the second status information and the bandwidth reduction judgment rule. The second prediction result is used to indicate whether the bandwidth drop event is expected or abnormal.

[0046] The BIOS can perform logical operations on the collected status information according to preset judgment rules, preliminarily determine the attributes of the degradation event and generate a conclusion. The core is to output a predictive conclusion of expected degradation or abnormal degradation by comparing key parameters (such as device capabilities and CPU / BIOS allocation capabilities, link status and error count).

[0047] For example, the speed reduction judgment rule is as follows: when the first rate is greater than the second rate, and the third rate is equal to the second rate, and the link status is normal, and the error count is equal to 0, the speed reduction event is expected, specifically expected speed reduction; when the third rate is less than the first rate, or the error count is greater than 0, the speed reduction event is abnormal speed reduction; in other cases, further analysis is required, such as issuing an error alarm and sending it to the operation and maintenance personnel.
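The speed reduction judgment rule can be sketched as a three-way decision. This is a minimal illustration, not the patented implementation: the function name and the returned labels are hypothetical, and the sketch assumes the conditions are checked in the order listed, because the abnormal condition as worded (third rate less than first rate) would otherwise also match the expected case.

```python
def judge_rate_drop(device_max_gen: int, cpu_max_gen: int, current_speed: int,
                    link_up: bool, error_count: int) -> str:
    """Classify a rate drop as 'expected', 'abnormal', or 'needs_analysis'."""
    # Expected: the device is simply newer than the CPU and the link is healthy.
    if (device_max_gen > cpu_max_gen and current_speed == cpu_max_gen
            and link_up and error_count == 0):
        return "expected"
    # Abnormal: the link runs below the device maximum for some other reason,
    # or hardware-level errors have been counted.
    if current_speed < device_max_gen or error_count > 0:
        return "abnormal"
    # Anything else is passed on for further analysis.
    return "needs_analysis"


gen5_on_gen4 = judge_rate_drop(5, 4, 4, True, 0)    # Gen5 device on a Gen4 CPU
degraded = judge_rate_drop(5, 5, 3, True, 0)        # both Gen5, running at Gen3
```

The bandwidth reduction rule has the same shape, with lane widths in place of generations and the BIOS allocation in place of the CPU maximum.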

[0048] For example, the bandwidth reduction judgment rule is as follows: when the first bandwidth is greater than the second bandwidth, and the third bandwidth is equal to the second bandwidth, and the link status is normal, and the error count is equal to 0, the reduction event is expected reduction, specifically expected bandwidth reduction; when the third bandwidth is less than the first bandwidth, or the error count is greater than 0, the reduction event is abnormal reduction; in other cases, further analysis is required, such as issuing an error alarm and sending it to the operation and maintenance personnel.

[0049] During the BIOS phase, upon detecting a rate reduction or bandwidth decrease, before reporting to the Baseboard Management Controller (BMC), if a rate reduction event is identified, it not only reads the current rate but also the device's PCIe capability register to obtain its declared maximum supported version (e.g., Gen5). If a bandwidth reduction event is identified, it also needs to combine this with the BIOS's actual bandwidth allocation data for the PCIe device to calculate the maximum bandwidth corresponding to the riser slot or NVMe backplane. Based on the above judgment rules, it is determined whether the reduction event is expected or abnormal.

[0050] For example, when the drop event is a rate drop event, determining the first prediction result based on the first status information and the judgment rule includes: when the first rate is greater than the second rate, the third rate is equal to the second rate, the link status is normal, and the error count is equal to the counting threshold, setting the first status flag to a first value, the first value being used to identify the rate drop event as an expected drop; when the third rate is less than the first rate or the error count is greater than the counting threshold, setting the first status flag to a second value, the second value being used to identify the rate drop event as an abnormal drop; generating first event indication information based on the first status information, the first event indication information being used to indicate that the current event is a rate drop event; and determining the first status flag, the first event indication information, and the first status information as the first prediction result.

[0051] The counting threshold is set in advance to distinguish between expected and abnormal drops. Generally, the counting threshold can be 0. That is, when the error count is greater than 0, it indicates that there is an anomaly such as a transmission error in the link. This anomaly may be caused by non-incompatibility factors such as hardware failure or link damage. When the error count is 0, it indicates that there is no anomaly such as a transmission error in the link.

[0052] The first status flag is an identifier used to indicate the type of the current rate decline event. For example, the first status flag might be ExpectedDegradeFlag. The first value is the value of the first status flag corresponding to an expected decline; for example, 1. ExpectedDegradeFlag=1 indicates that the rate decline event is an expected decline. The second value is the value of the first status flag corresponding to an abnormal decline; for example, 0. ExpectedDegradeFlag=0 indicates that the rate decline event is an abnormal decline.

[0053] Furthermore, the first state information, flag bit, and first event indication information are used as the first prediction result and passed to the BMC so that the BMC can perform further processing based on the first prediction result. Optionally, the first prediction result may also include the device's unique hardware identifier, device name, etc., to indicate the device corresponding to the current rate decrease event.
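The assembly of the first prediction result can be sketched as follows. ExpectedDegradeFlag and the status field names come from the text; the function name, the dictionary layout, and the "event" and "device_name" keys are hypothetical. As a simplification, this sketch sets the flag to 0 whenever the expected-drop condition does not hold.

```python
ERROR_COUNT_THRESHOLD = 0  # counting threshold; the text suggests 0 as the usual value


def build_first_prediction(status: dict, device_name: str = "") -> dict:
    """Bundle flag, event indication, and status for hand-off to the BMC."""
    expected = (status["device_max_gen"] > status["cpu_max_gen"]
                and status["current_speed"] == status["cpu_max_gen"]
                and status["link_up"]
                and status["error_count"] == ERROR_COUNT_THRESHOLD)
    return {
        "ExpectedDegradeFlag": 1 if expected else 0,  # 1 = expected, 0 = abnormal
        "event": "rate_drop",                         # first event indication
        "status": status,                             # first status information
        "device_name": device_name,                   # optional device identity
    }


clean_status = {"device_max_gen": 5, "cpu_max_gen": 4, "current_speed": 4,
                "link_up": True, "error_count": 0}
prediction = build_first_prediction(clean_status, device_name="GPU-EXAMPLE-01")
```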

[0054] For example, when the drop event is a bandwidth drop event, determining the second prediction result based on the second status information and the bandwidth reduction judgment rule includes: when the first bandwidth is greater than the second bandwidth, the third bandwidth is equal to the second bandwidth, the link status is normal, and the error count is equal to the counting threshold, setting the second status flag to a third value, the third value being used to identify the bandwidth drop event as an expected drop; when the third bandwidth is less than the first bandwidth or the error count is greater than the counting threshold, setting the second status flag to a fourth value, the fourth value being used to identify the bandwidth drop event as an abnormal drop; generating second event indication information based on the second status information, the second event indication information being used to indicate that the current event is a bandwidth drop event; and determining the second status flag, the second event indication information, and the second status information as the second prediction result.

[0055] The second status flag is an identifier used to indicate the type of bandwidth degradation event. The second status flag can be the same as the first status flag, such as ExpectedDegradeFlag. The third value is the value of the second status flag corresponding to an expected degradation; for example, ExpectedDegradeFlag=1 indicates that the bandwidth degradation event is expected. The fourth value is the value of the second status flag corresponding to an abnormal degradation; for example, ExpectedDegradeFlag=0 indicates that the bandwidth degradation event is abnormal.

[0056] Furthermore, the second status information, flag bit, and second event indication information are used as the second prediction result and passed to the BMC so that the BMC can perform further processing based on the second prediction result. Optionally, the second prediction result may also include the device's unique hardware identifier, device name, etc., to indicate the device corresponding to the current bandwidth decrease event.

[0057] In one possible implementation, a rate decrease event and a bandwidth decrease event may occur simultaneously in the link. A first prediction result is generated for the rate decrease event, and a second prediction result is generated for the bandwidth decrease event. The first and second prediction results are then sent to the BMC as the final prediction result.
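When both events occur together, the two judgments can share one comparison routine and be bundled into a single message. The sketch below is an assumption-laden illustration: the function names and the dictionary layout are hypothetical, each dimension is passed as a (device maximum, platform limit, current value) triple, and any case that is not clearly expected is flagged 0.

```python
def judge(device_max: int, platform_limit: int, current: int,
          link_up: bool, error_count: int, threshold: int = 0) -> int:
    """Return 1 (expected drop) or 0 (abnormal drop) for one link dimension."""
    expected = (device_max > platform_limit and current == platform_limit
                and link_up and error_count == threshold)
    return 1 if expected else 0


def final_prediction(rate: tuple, width: tuple,
                     link_up: bool, error_count: int) -> dict:
    """Combine the first (rate) and second (bandwidth) prediction results."""
    return {
        "rate_drop": {"ExpectedDegradeFlag": judge(*rate, link_up, error_count)},
        "bandwidth_drop": {"ExpectedDegradeFlag": judge(*width, link_up, error_count)},
    }


# Gen5 device on a Gen4 CPU (expected), x4 drive allocated x2 but running x1 (abnormal).
combined = final_prediction(rate=(5, 4, 4), width=(4, 2, 1),
                            link_up=True, error_count=0)
```

Keeping the two flags separate lets the receiver treat the rate drop as benign while still raising the bandwidth drop.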

[0058] S103. Review the prediction results based on the platform information database and generate target results. The target results are used to indicate the alarm type corresponding to the drop event.

[0059] The platform information database is a data storage module deployed in BMC. It stores information such as device whitelists, CPU models and the highest PCIe version, PCIe Riser / backplane models and bandwidth limits, and historical operating behavior, which are used to provide the configuration data required for verification.

[0060] The review of the prediction results refers to the BMC performing a secondary verification of the prediction results generated by the BIOS, combining the data in the platform information database with the actual hardware status, and correcting the initial judgment deviation.

[0061] The target result is the final attribute conclusion of the decline event determined after review, which directly maps to the corresponding alarm type, including informational alarms (corresponding to expected decline) and critical alarms (corresponding to abnormal decline).

[0062] For example, the process of verifying the prediction result based on the platform information database and generating a target result includes: when the first status flag is a first value and / or the second status flag is a third value, verifying the status information based on the platform information database to obtain a first verification result, which indicates whether the drop event is an expected drop or an abnormal drop; when the first status flag is a second value and / or the second status flag is a fourth value, checking whether the connected device is in the whitelist in the platform information database to obtain a second verification result, which indicates whether the drop event is an expected drop or an abnormal drop; and generating a target result based on the first verification result and / or the second verification result.

[0063] The platform information database is the information database in BMC used for data storage. It stores core data such as server hardware configuration, device whitelist, hardware historical operation behavior, and PCIe bus-related specifications. For example, it includes CPU model and maximum PCIe version, PCIe Riser / backplane model and bandwidth limit, etc., providing authoritative data support for verification.

[0064] After receiving the event from the BIOS, the BMC first checks the flag bit. When ExpectedDegradeFlag=1, the BMC calls the hardware configuration data in the platform information database to verify the authenticity and logical rationality of the status information collected by the BIOS, and finally determines whether the degradation event is an expected event.

[0065] The method described above introduces a configurable whitelist, allowing administrators to flexibly adjust alarm policies based on actual scenarios (such as the compatibility of specific high-performance devices on a specific platform). This design endows the method with high adaptability and manageability, enabling it to meet the granular management needs of complex heterogeneous deployment environments.

[0066] For example, when the first status flag is a first value and / or the second status flag is a third value, the status information is verified according to the platform information database to obtain a first verification result, including: when the decline event is a rate decline event, determining whether the second rate in the status information is the same as the maximum rate supported by the processor recorded in the platform information database; if they are the same, determining that the first verification result is a rate decline event belonging to an expected decline; if they are different, determining that the first verification result is a rate decline event belonging to an abnormal decline; when the decline event is a bandwidth decline event, determining whether the second bandwidth in the status information is the same as the maximum bandwidth supported by the hardware for allocating bandwidth to the connected device recorded in the platform information database; if they are the same, determining that the first verification result is a bandwidth decline event belonging to an expected decline; if they are different, determining that the first verification result is a bandwidth decline event belonging to an abnormal decline.

[0067] First, based on the first or second event indication information in the prediction results, determine whether the current event is a rate decrease or a bandwidth decrease. For rate decrease events, primarily check the specific CPU model collected by the BMC to confirm the maximum PCIe version supported by the CPU. If the check passes, the event is considered an expected decrease; otherwise, it is considered an abnormal decrease. For bandwidth decrease events, use the PCIe Riser information or backplane model collected by the BMC to determine the maximum PCIe bandwidth supported by the Riser and backplane, and verify that the bandwidth allocated to the PCIe slot or backplane by the BIOS matches that maximum. If they match, the event is considered an expected decrease; otherwise, it is considered an abnormal decrease.

[0068] Optionally, when ExpectedDegradeFlag=0, the device whitelist is further checked. If the device is in the whitelist, it is determined to be an expected degradation; otherwise, it is determined to be an abnormal degradation.
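The review flow described above (flag check, verification against the platform information database, whitelist fallback) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of this application: `PlatformDB`, `review`, and the dictionary keys `second_rate` and `second_bandwidth` are hypothetical names, and rates and bandwidths are represented as plain integers (PCIe generation and lane count).

```python
from dataclasses import dataclass, field
from typing import Set

EXPECTED, ABNORMAL = "expected", "abnormal"


@dataclass
class PlatformDB:
    """Hypothetical stand-in for the BMC platform information database."""
    cpu_max_rate: int      # highest PCIe generation supported by the processor
    slot_max_width: int    # maximum lanes the riser/backplane allows per device
    whitelist: Set[str] = field(default_factory=set)


def review(event_type: str, flag: int, status: dict,
           device_id: str, db: PlatformDB) -> str:
    """Return the reviewed attribute of a decrease event."""
    if flag == 1:
        # First verification: the value reported by the BIOS must agree
        # with the record held in the platform information database.
        if event_type == "rate":
            consistent = status["second_rate"] == db.cpu_max_rate
        else:
            consistent = status["second_bandwidth"] == db.slot_max_width
        return EXPECTED if consistent else ABNORMAL
    # Second verification: the flag is 0, so only a whitelisted device
    # can still be treated as an expected decrease.
    return EXPECTED if device_id in db.whitelist else ABNORMAL
```

Keeping the whitelist inside the database object mirrors the configurable-policy idea: an administrator could change the outcome for a specific device without touching the judgment logic.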

[0069] For example, generating a target result based on a first review result and / or a second review result includes: generating a first target result when the first review result and / or the second review result indicate that the decline event is an expected decline, wherein the first target result is used to indicate that the alarm type of the decline event is an informational alarm; and generating a second target result when the first review result and / or the second review result indicate that the decline event is an abnormal decline, wherein the second target result is used to indicate that the alarm type corresponding to the decline event is a critical alarm.

[0070] When the rate drop event is expected, an Informational level log (such as "PCIe device [device name] is running at the expected rate [Gen4]") is recorded in the System Event Log (SEL) and displayed as a prompt on the Web interface to obtain the first target result.

[0071] When the rate decrease event is an abnormal decrease, a Critical level alarm is recorded in SEL, and the error is highlighted in the Web interface to obtain the second target result.

[0072] When the bandwidth drop event is expected, an Informational level log (such as "PCIe device [device name] is operating at expected bandwidth [X2]") is recorded in SEL and displayed as a prompt message on the Web interface to obtain the first target result.

[0073] When the bandwidth drop event is an abnormal drop, a Critical level alarm is recorded in SEL, and the error is highlighted in the web interface to obtain the second target result.
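The mapping from the reviewed result to an alarm record can be sketched as below. This is a hedged illustration: `SelEntry` and `make_alarm` are hypothetical names, and the wording of the Critical message is an assumption, since the description only fixes its severity. The Informational wording follows the sample log texts quoted above.

```python
from dataclasses import dataclass


@dataclass
class SelEntry:
    severity: str   # "Informational" or "Critical"
    message: str


def make_alarm(event_type: str, verdict: str,
               device_name: str, current: str) -> SelEntry:
    """Build the log entry for a reviewed decrease event."""
    is_rate = event_type == "rate"
    if verdict == "expected":
        phrase = ("running at the expected rate" if is_rate
                  else "operating at expected bandwidth")
        return SelEntry("Informational",
                        f"PCIe device [{device_name}] is {phrase} [{current}]")
    # Assumed wording: the description does not give the Critical text.
    kind = "rate" if is_rate else "bandwidth"
    return SelEntry("Critical",
                    f"PCIe device [{device_name}] abnormal {kind} decrease, "
                    f"currently [{current}]")
```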

[0074] See Figure 3. Figure 3 is a visualization diagram of a target result provided in an embodiment of this application. The left side shows an informational alarm, indicating that the current link is in a normal state, with a speed of Gen4 and a state of expected speed reduction, meaning PCIe device A is operating at the expected speed of Gen4. The right side shows a critical alarm, indicating that the current link is in an abnormal state, with a speed of Gen1 and a state of abnormal speed reduction. Error code 777 is used to indicate the error type and provides maintenance suggestions, namely, checking the connection of PCIe device A.

[0075] In the above method, by generating and displaying the target results, operation and maintenance personnel can intuitively and instantly understand the real reasons for the speed reduction and bandwidth reduction, without having to perform repetitive manual screening and investigation of a large number of known and normal compatibility events. This allows them to focus their energy on the real fault handling, greatly saving operation and maintenance time and manpower costs.

[0076] Optionally, the first and / or second review results can be adjusted based on historical operational behavior. By comparing the current event with the device's historical link status data (such as rate, bandwidth, and error count over the past week), it can be determined whether the current rate or bandwidth decrease event is an expected decrease. For example, if a GPU device has negotiated at Gen4 rate multiple times in the past week with no error count, the current event can be adjusted to an expected decrease; if the device suddenly drops to Gen1 with a surge in error count, it is determined to be an abnormal decrease.

[0077] The above solutions improve alarm reliability and avoid misjudgments caused by link status at a single point in time. For example, through historical behavior analysis, the system can identify periodic speed reduction events caused by platform compatibility (such as devices negotiating to Gen4 under specific loads), thereby reducing false alarms. Furthermore, historical behavior correlation analysis can provide data support for fault prediction, further enhancing the level of intelligent operation and maintenance.
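One possible way to apply the historical adjustment is sketched below. It is an assumption-laden illustration: `LinkSample`, `adjust_with_history`, and the `min_samples` cutoff are hypothetical, and the two rules shown (stable history at the current rate with no errors; sudden drop with an error surge) are simplified readings of the GPU example above rather than rules stated by this application.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LinkSample:
    rate: int          # negotiated PCIe generation, e.g. 4 for Gen4
    error_count: int   # hardware-level errors seen in this sample


def adjust_with_history(verdict: str, current: LinkSample,
                        history: List[LinkSample],
                        min_samples: int = 3) -> str:
    """Adjust a review verdict using recent link history (e.g. the past week)."""
    if len(history) < min_samples:
        return verdict  # too little data; keep the reviewed verdict
    # Same rate negotiated repeatedly with no errors: treat as expected.
    stable = all(s.rate == current.rate and s.error_count == 0 for s in history)
    if stable and current.error_count == 0:
        return "expected"
    # Sudden drop below the usual rate together with an error surge: abnormal.
    usual_rate = max(s.rate for s in history)
    worst_errors = max(s.error_count for s in history)
    if current.rate < usual_rate and current.error_count > worst_errors:
        return "abnormal"
    return verdict
```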

[0078] The serial bus link status classification alarm method provided in this application obtains the link status information when a rate decrease or bandwidth decrease event occurs. Based on the status information and preset judgment rules, it makes a preliminary judgment on the decrease event, initially determining whether the decrease event is expected or abnormal. The judgment result is then reviewed based on the platform information database to finally determine whether the decrease event is expected or abnormal. This method can distinguish decrease events and generate different types of alarms for different decrease events, allowing maintenance personnel to handle only alarms for abnormal decreases, thus improving maintenance efficiency.

[0079] Furthermore, the serial bus link status classification alarm method provided in this application embodiment uses a hierarchical collaborative architecture where the BIOS performs pre-judgment and the BMC makes the final decision. This architecture has clear responsibilities and the software modules are easy to implement, maintain, and upgrade independently. It possesses good forward compatibility and scalability, easily supporting future CPU platforms, PCIe standards, and even integrating more advanced intelligent judgment models. The solution is primarily implemented through upgrades at the existing firmware (BIOS / BMC) level, requiring no changes to the hardware design, resulting in low deployment costs and facilitating rapid deployment in existing and new devices.

[0080] Figure 4 is a second flowchart of the serial bus link status classification and alarm method provided in an embodiment of this application. As shown in Figure 4, embodiments of this application provide a serial bus link status classification and alarm method, which is described in detail below:

[0081] S201. Read the maximum speed and maximum bandwidth supported by the connected device from the capability register of the connected device, and read the current actual speed and current actual bandwidth from the status register of the connected device.

[0082] The highest speed supported by the connected device is the first rate in the above embodiments, the highest bandwidth supported by the connected device is the first bandwidth in the above embodiments, the current actual rate is the third rate in the above embodiments, and the current actual bandwidth is the third bandwidth in the above embodiments.

[0083] S202. Determine whether a rate decrease event or a bandwidth decrease event has been triggered.

[0084] A rate reduction event is triggered when the current actual speed is less than the maximum speed supported by the connected device, i.e., when the third rate is less than the first rate; a bandwidth reduction event is triggered when the current actual bandwidth is less than the maximum bandwidth supported by the connected device, i.e., when the third bandwidth is less than the first bandwidth.

[0085] If a rate decrease event or bandwidth decrease event is triggered, proceed to step S203; otherwise, terminate the method.
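Steps S201 and S202 can be sketched as follows. The sketch assumes that the "capability register" and "status register" are the PCI Express Link Capabilities and Link Status registers, in which the speed field occupies bits 3:0 and the width field bits 9:4; this application does not name the registers, so that mapping is an assumption. The function names are hypothetical, and the raw register values are passed in as integers rather than read from hardware.

```python
from typing import Dict, Tuple


def decode_link_fields(reg: int) -> Tuple[int, int]:
    """Extract (speed, width) from a Link Capabilities or Link Status value."""
    speed = reg & 0xF            # bits 3:0, e.g. 4 = Gen4, 5 = Gen5
    width = (reg >> 4) & 0x3F    # bits 9:4, lane count
    return speed, width


def detect_decrease(link_cap: int, link_status: int) -> Dict[str, bool]:
    """S202: compare current values against the device maximums."""
    max_speed, max_width = decode_link_fields(link_cap)       # first rate / first bandwidth
    cur_speed, cur_width = decode_link_fields(link_status)    # third rate / third bandwidth
    return {
        "rate_decrease": cur_speed < max_speed,
        "bandwidth_decrease": cur_width < max_width,
    }
```

If neither entry is true, the method terminates; otherwise processing continues with S203.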

[0086] S203. In response to a drop event occurring on the serial bus link, obtain the status information of the serial bus link. The drop event includes a rate drop event and a bandwidth drop event. The status information is used to perform category analysis on the drop event.

[0087] S204. Generate a prediction result based on the status information and the preset judgment rules.

[0088] The prediction results include a first prediction result and a second prediction result. The first prediction result is used to initially indicate that the decline event is an expected decline, and the second prediction result is used to initially indicate that the decline event is an abnormal decline.

[0089] S205. Review the prediction results based on the platform information database and generate the target results.

[0090] The serial bus link status classification alarm method provided in this application breaks through the traditional passive response mode of detection and alarm, and constructs a closed-loop intelligent decision-making method from perception to analysis to decision-making. By comprehensively utilizing multi-dimensional data for collaborative analysis, it realizes proactive cognition and intelligent decision-making of PCIe link status, promoting the evolution of server management system from passive response to proactive cognition and intelligent intervention operation and maintenance mode. By intelligently identifying and distinguishing between expected degradation caused by hardware compatibility and abnormal degradation reported by real faults, it eliminates the persistent problem of false alarms caused by these factors at the source, improves the alarm signal-to-noise ratio to a new level, and makes the operation and maintenance alarm system more reliable and trustworthy.

[0091] The following two specific examples illustrate this method.

[0092] Example 1: Taking a server equipped with a CPU of model A (supporting up to PCIe Gen4) and installing a GPU of model B (supporting PCIe Gen5) as an example, the method includes:

[0093] When the system starts up, the BIOS collects the following: the maximum bus transfer rate supported by the CPU (i.e., the second rate) is Gen4; the rate supported by the GPU (i.e., the first rate) is read through the PCIe configuration space and is Gen5; the current negotiated rate after link training (i.e., the third rate) is Gen4 x16, the link status is LinkUp, and the error count is 0.

[0094] The BIOS, based on preset judgment rules, determines that the device's supported version (Gen5) is greater than the CPU's supported version (Gen4), the current speed (Gen4) is equal to the CPU's supported version (Gen4), and the link status is normal. Therefore, it initially determines that the degradation is expected, specifically an expected speed reduction, and generates event indication information. The BIOS records the status flag bit ExpectedDegradeFlag=1, and packages it together with the device's unique hardware identifier, device name, supported version, current speed, event indication information, etc., and sends it to the BMC.

[0095] Upon receiving this event, the BMC checks the status flag, which is ExpectedDegradeFlag=1, and determines that it is a rate decrease event based on the event indication information. It queries the CPU-related information collected by the BMC and confirms that the CPU platform's upper limit is Gen4. After comprehensive evaluation and finding no other abnormal factors, the final ruling is an expected decrease, specifically an expected rate decrease.

[0096] Furthermore, an Informational record is generated in SEL: Information|PCIe LinkNormal|Device [B GPU] (Gen5 capable) operating at expected speed [Gen4] due to platform [A CPU] limit. This record indicates that the PCIe link is normal, and the B-type GPU (supporting Gen5 speed) is operating at the expected speed of Gen4 due to the platform [A CPU] limit.

[0097] On the PCIe device status page of the BMC web interface, a blue information icon "i" is displayed next to the GPU card, with a hover prompt stating "Device is running at the expected rate (Gen4)".

[0098] If the link drops to Gen1 due to a fault, the BIOS predicts an abnormal drop, and the BMC ultimately determines it as an abnormal speed reduction. The SEL logs a Critical alarm, and a red warning icon is displayed on the web interface.
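The BIOS pre-judgment in Example 1 can be traced with a short sketch. The function name `prejudge_rate` is hypothetical, generations are written as integers, the counting threshold is assumed to be 0, and an error count not exceeding the threshold is treated as clean. The expected-decrease conditions are checked first and every other case is treated as abnormal, which is one reading of the judgment rules rather than a statement of how the BIOS is implemented.

```python
def prejudge_rate(first_rate: int, second_rate: int, third_rate: int,
                  link_up: bool, error_count: int, threshold: int = 0) -> int:
    """Return ExpectedDegradeFlag: 1 for expected decrease, 0 for abnormal."""
    expected = (
        first_rate > second_rate        # device supports more than the CPU
        and third_rate == second_rate   # link settled exactly at the CPU limit
        and link_up                     # link status is LinkUp
        and error_count <= threshold    # no excess hardware-level errors
    )
    return 1 if expected else 0


# Example 1: Gen5 GPU (first rate) on a Gen4 CPU (second rate).
normal_flag = prejudge_rate(5, 4, 4, link_up=True, error_count=0)  # trained at Gen4
fault_flag = prejudge_rate(5, 4, 1, link_up=True, error_count=0)   # dropped to Gen1
```

In the normal case the flag is 1 and the BMC goes on to confirm the Gen4 platform limit; in the Gen1 case the flag is 0 because the negotiated rate no longer equals the CPU limit.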

[0099] Example 2: Taking a server equipped with a CPU (supporting up to PCIe Gen5) and installing a U.2 backplane with 14 interfaces, each supporting x2 link bandwidth (14 * x2 U.2), and fully configured with U.2 PCIe Gen5 SSDs as an example, the method includes:

[0100] When the system starts up, the BIOS collects the following: CPU PCIe Max (i.e., the second bandwidth) is Gen5; the SSD supports Gen5 and the supported bandwidth (i.e., the first bandwidth) is x4, which is read through the PCIe configuration space; the current negotiated bandwidth (i.e., the third bandwidth) after link training is x2, the link status is LinkUp, and the error count is 0.

[0101] The BIOS, based on preset judgment rules, determines that the device's supported bandwidth x4 is greater than the bandwidth allocated to the backplane by the BIOS x2 multiplied by the number of hard drives, the current bandwidth (x2) is equal to the bandwidth allocated to the backplane by the BIOS x2 multiplied by the number of hard drives, and the link status is normal. Therefore, it initially determines that the bandwidth reduction is expected, specifically an expected reduction in bandwidth, and generates event indication information. The BIOS records the status flag bit ExpectedDegradeFlag=1, and packages it together with the device's unique identifier, device name, supported version, current bandwidth, backplane model, the bandwidth actually allocated to the backplane by the BIOS, and event indication information, and sends it to the BMC.

[0102] Upon receiving this event, the BMC checks the status flag, which is ExpectedDegradeFlag=1, and determines that it is a bandwidth decrease event based on the event indication information. It queries the relevant information collected by the BMC regarding the backplane, hard drives, etc., and confirms that the maximum bandwidth supported for each hard drive under the backplane (x2) is consistent with the bandwidth allocated to the backplane by the BIOS (x2 * 14). After comprehensive assessment and finding no other abnormal factors, the final decision is an expected decrease, specifically an expected bandwidth decrease.

[0103] Furthermore, an Informational record is generated in SEL: Information|PCIe LinkNormal|Device [SSD] (width x4 capable) operating at expected width [x2] due to BP [14 U.2 BP] limit. This record indicates that the PCIe link is normal, and the device SSD (supporting x4 bandwidth) is operating at expected bandwidth x2 due to the 14 U.2 BP limit.

[0104] On the PCIe device status page of the BMC Web interface, a blue information icon “i” is displayed next to the U.2 hard drive, with a hover message “Device is operating at expected bandwidth (x2)”.

[0105] If the link drops to X1 due to a fault, the BIOS predicts an abnormal drop, and the BMC ultimately confirms it as an abnormal drop. The SEL logs a Critical alarm, and a red warning icon is displayed on the web interface.
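The lane arithmetic in Example 2 can be checked with a short sketch. The helper names `per_drive_width` and `prejudge_bandwidth` are hypothetical, the counting threshold is assumed to be 0, and the even split of backplane lanes across drives is an assumption that matches the 14 * x2 configuration described above.

```python
def per_drive_width(backplane_lanes: int, drive_count: int) -> int:
    """Lanes available to each drive when the backplane splits lanes evenly."""
    return backplane_lanes // drive_count


def prejudge_bandwidth(first_bw: int, second_bw: int, third_bw: int,
                       link_up: bool, error_count: int, threshold: int = 0) -> int:
    """Return ExpectedDegradeFlag: 1 for expected decrease, 0 for abnormal."""
    expected = (
        first_bw > second_bw            # drive supports more lanes than allocated
        and third_bw == second_bw       # link trained at exactly the allocation
        and link_up
        and error_count <= threshold
    )
    return 1 if expected else 0


# Example 2: 14 drives share 14 * 2 = 28 lanes, so each x4-capable SSD gets x2.
allocated = per_drive_width(14 * 2, 14)
normal_flag = prejudge_bandwidth(4, allocated, 2, link_up=True, error_count=0)
fault_flag = prejudge_bandwidth(4, allocated, 1, link_up=True, error_count=0)  # fell to x1
```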

[0106] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods according to the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method.

[0107] Figure 5 is a schematic diagram of the serial bus link status classification alarm device provided in an embodiment of this application. As shown in Figure 5, embodiments of this application also provide a serial bus link status classification alarm device 20, including: an acquisition module 21, used to acquire the status information of the serial bus link in response to a drop event occurring in the serial bus link, the drop event including a rate drop event and a bandwidth drop event, the status information being used to perform category analysis on the drop event; a first processing module 22, used to generate a preliminary judgment result based on the status information and preset judgment rules, the preliminary judgment result being used to initially indicate whether the drop event belongs to an expected drop or an abnormal drop; and a second processing module 23, used to review the preliminary judgment result based on the platform information database and generate a target result, the target result being used to indicate the alarm type corresponding to the drop event.

[0108] In one possible implementation, the acquisition module 21 is specifically configured to: in response to a rate decrease event, acquire first state information, the first state information including at least one of the following: a first rate, a second rate, a third rate, a link status, and an error count, wherein the first rate is used to indicate the maximum bus transmission rate supported by the connected device, the second rate is used to indicate the maximum bus transmission rate supported by the processor, the third rate is used to indicate the current actual bus transmission rate, the link status is used to indicate whether the serial bus link is in a normal state or a disconnected state, and the error count is used to indicate the number of hardware-level errors detected during the operation of the serial bus link; and / or, in response to a bandwidth decrease event, acquire second state information, the second state information including at least one of the following: a first bandwidth, a second bandwidth, a third bandwidth, a link status, and an error count, wherein the first bandwidth is used to indicate the maximum bus transmission bandwidth supported by the connected device, the second bandwidth is used to indicate the bus transmission bandwidth allocated to the connected device, and the third bandwidth is used to indicate the current actual bus transmission bandwidth.

[0109] In one possible implementation, the first processing module 22 is specifically used to: when the decrease event is a rate decrease event, determine a first prediction result based on the first state information and the rate decrease judgment rule, wherein the first prediction result is used to indicate whether the rate decrease event is an expected decrease or an abnormal decrease; and when the decrease event is a bandwidth decrease event, determine a second prediction result based on the second state information and the bandwidth decrease judgment rule, wherein the second prediction result is used to indicate whether the bandwidth decrease event is an expected decrease or an abnormal decrease.

[0110] In one possible implementation, the first processing module 22 is further configured to: set a first state flag to a first value when the first rate is greater than the second rate, the third rate is equal to the second rate, the link status is normal, and the error count is equal to the counting threshold; set the first state flag to a second value when the third rate is less than the first rate or the error count is greater than the counting threshold; generate first event indication information based on the first state information; the first event indication information is used to indicate that the current event is a rate decrease event; and determine the first state flag, the first event indication information, and the first state information as a first prediction result.

[0111] In one possible implementation, the first processing module 22 is further configured to: set the second status flag to a third value when the first bandwidth is greater than the second bandwidth, the third bandwidth is equal to the second bandwidth, the link status is normal, and the error count is equal to the counting threshold; set the second status flag to a fourth value when the third bandwidth is less than the first bandwidth or the error count is greater than the counting threshold; generate second event indication information based on the second status information; the second event indication information is used to indicate that the current event is a bandwidth decrease event; and determine the second status flag, the second event indication information, and the second status information as the second prediction result.

[0112] In one possible implementation, the second processing module 23 is specifically used to: when the first status flag is a first value and / or the second status flag is a third value, verify the status information according to the platform information database to obtain a first verification result, which is used to indicate whether the drop event is an expected drop or an abnormal drop; when the first status flag is a second value and / or the second status flag is a fourth value, check whether the connected device is in the whitelist in the platform information database to obtain a second verification result, which is used to indicate whether the drop event is an expected drop or an abnormal drop; and generate a target result based on the first verification result and / or the second verification result.

[0113] In one possible implementation, the second processing module 23 is further configured to: when the decline event is a rate decline event, determine whether the second rate in the status information is the same as the maximum rate supported by the processor recorded in the platform information database; if they are the same, determine that the first verification result is a rate decline event belonging to expected decline; if they are different, determine that the first verification result is a rate decline event belonging to abnormal decline; when the decline event is a bandwidth decline event, determine whether the second bandwidth in the status information is the same as the maximum bandwidth supported by the hardware allocated bandwidth for the connected device recorded in the platform information database; if they are the same, determine that the first verification result is a bandwidth decline event belonging to expected decline; if they are different, determine that the first verification result is a bandwidth decline event belonging to abnormal decline.

[0114] In one possible implementation, the serial bus link status classification alarm device 20 is further configured to: generate a first target result when the first verification result and / or the second verification result indicate that the drop event is an expected drop, wherein the first target result is used to indicate that the alarm type of the drop event is an informational alarm; and generate a second target result when the first verification result and / or the second verification result indicate that the drop event is an abnormal drop, wherein the second target result is used to indicate that the alarm type corresponding to the drop event is a critical alarm.

[0115] In one possible implementation, the second processing module 23 is further configured to: read the maximum speed and the maximum bandwidth supported by the connected device from the capability register of the connected device; read the current actual speed and the current actual bandwidth from the status register of the connected device; trigger a rate decrease event when the current actual speed is less than the maximum speed supported by the connected device; and trigger a bandwidth decrease event when the current actual bandwidth is less than the maximum bandwidth supported by the connected device.

[0116] For a description of the features in the embodiment of the serial bus link status classification alarm device, please refer to the relevant description of the embodiment of the serial bus link status classification alarm method, which will not be repeated here.

[0117] Figure 6 is a schematic diagram of the structure of the electronic device provided in this application. As shown in Figure 6, the electronic device 30 provided in this embodiment includes at least one processor 301 and a memory 302. Optionally, the electronic device 30 further includes a communication component 303. The processor 301, memory 302, and communication component 303 are connected via a bus.

[0118] In the specific implementation process, at least one processor 301 executes computer execution instructions stored in memory 302, causing at least one processor 301 to execute the above-described serial bus link status classification alarm method embodiment.

[0119] The specific implementation process of processor 301 can be found in the above method embodiments, and its implementation principle and technical effect are similar. It will not be repeated here.

[0120] In the above embodiments, it should be understood that the processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), etc. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in the application can be directly manifested as being executed by a hardware processor, or executed by a combination of hardware and software modules within the processor.

[0121] The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk storage device.

[0122] The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of illustration, the buses shown in the accompanying drawings are not limited to a single bus or a single type of bus.

[0123] Embodiments of this application also provide a computer-readable storage medium storing a computer program, wherein the computer program is configured to execute the steps in any of the above embodiments of the serial bus link status classification alarm method when running.

[0124] In one exemplary embodiment, the aforementioned computer-readable storage medium may include, but is not limited to, various media capable of storing computer programs, such as USB flash drives, read-only memory (ROM), random access memory (RAM), portable hard drives, magnetic disks, or optical disks.

[0125] Embodiments of this application also provide a computer program product, which includes a computer program that, when executed by a processor, implements the steps in any of the above embodiments of the serial bus link status classification alarm method.

[0126] Embodiments of this application also provide another computer program product, including a non-volatile computer-readable storage medium storing a computer program, which, when executed by a processor, implements the steps in any of the above embodiments of the serial bus link status classification alarm method.

[0127] Any of the components, modules, units, parts, methods, and operations described herein can be implemented using software, firmware, hardware (e.g., fixed logic circuitry), manual processing, or any combination thereof. Alternatively or additionally, any functionality described herein can be performed at least in part by one or more hardware logic components, such as, but not limited to, a central processing unit (CPU), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), an application-specific standard product (ASSP), a system-on-a-chip (SoC), a complex programmable logic device (CPLD), a microcontroller unit (MCU), etc. The terms "system," "computing device," or "apparatus" as used herein encompass various means, devices, and machines for processing data, including, for example, one or more programmable processors, computers, SoCs, or combinations thereof. The apparatus may also include code that creates an execution environment for the computer program in question, such as code constituting processor firmware, a protocol stack, a database management system, an operating system, a cross-platform runtime environment, a virtual machine, or a combination thereof. The aforementioned computer program (also known as a program, software, software application, app, script, or code) can be written in any form of programming language, including compiled or interpreted languages, declarative or procedural languages, and can be deployed in any form, including as a standalone program or as a module, component, subroutine, object, or other unit suitable for a computing environment.

[0128] Those skilled in the art will further recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0129] The serial bus link status classification alarm method provided in this application has been described in detail above. Specific examples have been used to illustrate the principles and implementations of this application, and the descriptions of the above embodiments are intended only to aid understanding of the method and its core ideas. Those skilled in the art can make improvements and modifications to this application without departing from its principles, and such improvements and modifications also fall within the protection scope of the claims of this application.

Claims

1. A serial bus link status classification alarm method, characterized in that it comprises: in response to a drop event occurring on a serial bus link, acquiring status information of the serial bus link, wherein the drop event includes a rate drop event and a bandwidth drop event, and the status information is used to classify the drop event; generating a prediction result based on the status information and preset judgment rules, wherein the prediction result preliminarily indicates whether the drop event is an expected drop or an abnormal drop; and reviewing the prediction result against a platform information database to generate a target result, wherein the target result indicates the alarm type corresponding to the drop event.

2. The method according to claim 1, characterized in that the status information includes first status information and second status information, and acquiring the status information of the serial bus link in response to a drop event occurring on the serial bus link comprises: in response to the rate drop event, acquiring the first status information, which includes at least one of the following: a first rate, a second rate, a third rate, a link status, and an error count, wherein the first rate indicates the maximum bus transmission rate supported by the connected device, the second rate indicates the maximum bus transmission rate supported by the processor, the third rate indicates the current actual bus transmission rate, the link status indicates whether the serial bus link is in a normal or disconnected state, and the error count indicates the number of hardware-level errors detected during operation of the serial bus link; and/or, in response to the bandwidth drop event, acquiring the second status information, which includes at least one of the following: a first bandwidth, a second bandwidth, a third bandwidth, the link status, and the error count, wherein the first bandwidth indicates the maximum bus transmission bandwidth supported by the connected device, the second bandwidth indicates the bus transmission bandwidth allocated to the connected device, and the third bandwidth indicates the current actual bus transmission bandwidth.

3. The method according to claim 2, characterized in that the judgment rules include a rate drop judgment rule and a bandwidth drop judgment rule; the prediction result includes a first prediction result and a second prediction result; and generating the prediction result based on the status information and the preset judgment rules comprises: when the drop event is a rate drop event, determining the first prediction result based on the first status information and the rate drop judgment rule, wherein the first prediction result indicates whether the rate drop event is an expected drop or an abnormal drop; and when the drop event is a bandwidth drop event, determining the second prediction result based on the second status information and the bandwidth drop judgment rule, wherein the second prediction result indicates whether the bandwidth drop event is an expected drop or an abnormal drop.

4. The method according to claim 3, characterized in that, when the drop event is a rate drop event, determining the first prediction result based on the first status information and the rate drop judgment rule comprises: when the first rate is greater than the second rate, the third rate is equal to the second rate, the link status is normal, and the error count is equal to a count threshold, setting a first status flag to a first value, which identifies the rate drop event as an expected drop; when the third rate is less than the first rate or the error count is greater than the count threshold, setting the first status flag to a second value, which identifies the rate drop event as an abnormal drop; generating first event indication information based on the first status information, wherein the first event indication information indicates that the current event is the rate drop event; and determining the first status flag, the first event indication information, and the first status information as the first prediction result.
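
The rate drop judgment rule of claim 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `RateStatus` and `prejudge_rate_drop`, the concrete flag values, and a default count threshold of 0 are all hypothetical. As literally worded, the two conditions can overlap (an expected drop also has the third rate below the first rate), so the sketch assumes the expected-drop condition is evaluated first.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical flag values; the claim only requires two distinguishable values.
EXPECTED_DROP = 1   # the "first value"
ABNORMAL_DROP = 2   # the "second value"


@dataclass
class RateStatus:
    first_rate: float    # max rate supported by the connected device (GT/s)
    second_rate: float   # max rate supported by the processor (GT/s)
    third_rate: float    # current actual rate (GT/s)
    link_normal: bool    # True if the link is normal, False if disconnected
    error_count: int     # hardware-level errors detected on the link


def prejudge_rate_drop(s: RateStatus, count_threshold: int = 0) -> Optional[dict]:
    """Return the first prediction result, or None if neither condition matches."""
    if (s.first_rate > s.second_rate
            and s.third_rate == s.second_rate
            and s.link_normal
            and s.error_count == count_threshold):
        flag = EXPECTED_DROP
    elif s.third_rate < s.first_rate or s.error_count > count_threshold:
        flag = ABNORMAL_DROP
    else:
        # Assumption: the claim does not cover this case, so nothing is reported.
        return None
    # The prediction result bundles the flag, the event indication, and the status.
    return {"flag": flag, "event": "rate_drop", "status": s}
```

For instance, a 32 GT/s device attached to a processor limited to 16 GT/s that runs at 16 GT/s with no errors would be pre-judged as an expected drop, while the same device running at 8 GT/s on a 32 GT/s processor would be pre-judged as abnormal.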

5. The method according to claim 4, characterized in that, when the drop event is a bandwidth drop event, determining the second prediction result based on the second status information and the bandwidth drop judgment rule comprises: when the first bandwidth is greater than the second bandwidth, the third bandwidth is equal to the second bandwidth, the link status is normal, and the error count is equal to the count threshold, setting a second status flag to a third value, which identifies the bandwidth drop event as an expected drop; when the third bandwidth is less than the first bandwidth or the error count is greater than the count threshold, setting the second status flag to a fourth value, which identifies the bandwidth drop event as an abnormal drop; generating second event indication information based on the second status information, wherein the second event indication information indicates that the current event is the bandwidth drop event; and determining the second status flag, the second event indication information, and the second status information as the second prediction result.
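
The bandwidth drop judgment rule of claim 5 has the same shape. The sketch below is illustrative only: the function name, the flag values, the default count threshold of 0, and the choice to express bandwidth as a lane count (for example 16 for an x16 link) are assumptions, and the expected-drop condition is again assumed to be checked before the abnormal-drop condition.

```python
from typing import Optional

# Hypothetical flag values standing in for the "third value" and "fourth value".
EXPECTED_BW_DROP = 3
ABNORMAL_BW_DROP = 4


def prejudge_bandwidth_drop(first_bw: int, second_bw: int, third_bw: int,
                            link_normal: bool, error_count: int,
                            count_threshold: int = 0) -> Optional[dict]:
    """Pre-judge a bandwidth drop event.

    first_bw:  max bandwidth supported by the connected device
    second_bw: bandwidth allocated to the connected device
    third_bw:  current actual bandwidth
    """
    if (first_bw > second_bw and third_bw == second_bw
            and link_normal and error_count == count_threshold):
        flag = EXPECTED_BW_DROP
    elif third_bw < first_bw or error_count > count_threshold:
        flag = ABNORMAL_BW_DROP
    else:
        # Assumption: cases outside both conditions are left unclassified.
        return None
    status = {"first_bw": first_bw, "second_bw": second_bw, "third_bw": third_bw,
              "link_normal": link_normal, "error_count": error_count}
    return {"flag": flag, "event": "bandwidth_drop", "status": status}
```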

6. The method according to claim 5, characterized in that reviewing the prediction result against the platform information database to generate the target result comprises: when the first status flag is the first value and/or the second status flag is the third value, verifying the status information against the platform information database to obtain a first review result, wherein the first review result indicates whether the drop event is an expected drop or an abnormal drop; when the first status flag is the second value and/or the second status flag is the fourth value, checking whether the connected device is on a whitelist in the platform information database to obtain a second review result, wherein the second review result indicates whether the drop event is an expected drop or an abnormal drop; and generating the target result based on the first review result and/or the second review result.
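
The two review branches of claim 6 can be sketched as a small dispatcher. The names below are hypothetical, and one point is an assumption rather than something the claim states: a device found on the whitelist is treated as an expected drop, and a device not found is treated as abnormal. The database check for the first branch is passed in as a callable so that the sketch stays self-contained.

```python
from typing import Callable, FrozenSet


def review_prediction(flag: str, device_id: str, whitelist: FrozenSet[str],
                      verify_against_db: Callable[[], str]) -> str:
    """Return "expected" or "abnormal" after reviewing a pre-judgment flag.

    flag is "expected" (first/third value) or "abnormal" (second/fourth value).
    """
    if flag == "expected":
        # First review: re-check the reported status against the platform database.
        return verify_against_db()
    if flag == "abnormal":
        # Second review (assumed semantics): whitelisted devices are known to run
        # below their maximum on this platform, so the drop counts as expected.
        return "expected" if device_id in whitelist else "abnormal"
    raise ValueError(f"unknown flag: {flag!r}")
```

Under these assumptions, the whitelist gives a pre-judged abnormal drop a second chance, while a pre-judged expected drop is still cross-checked against recorded platform data before it is accepted.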

7. The method according to claim 6, characterized in that, when the first status flag is the first value and/or the second status flag is the third value, verifying the status information against the platform information database to obtain the first review result comprises: when the drop event is a rate drop event, determining whether the second rate in the status information is the same as the maximum rate supported by the processor as recorded in the platform information database; if they are the same, determining that the first review result is that the rate drop event is an expected drop; if they are different, determining that the first review result is that the rate drop event is an abnormal drop; and when the drop event is a bandwidth drop event, determining whether the second bandwidth in the status information is the same as the maximum bandwidth supported by the hardware that allocates bandwidth to the connected device, as recorded in the platform information database; if they are the same, determining that the first review result is that the bandwidth drop event is an expected drop; if they are different, determining that the first review result is that the bandwidth drop event is an abnormal drop.
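
The first review of claim 7 amounts to an equality check between a reported value and the value recorded for the platform. A minimal sketch follows, assuming the platform information database can be represented as a dictionary; the key names `processor_max_rate` and `allocator_max_bandwidth` and the function name are hypothetical.

```python
def first_review(event: str, status: dict, platform_db: dict) -> str:
    """Confirm a pre-judged expected drop against recorded platform data."""
    if event == "rate_drop":
        # Compare the reported processor maximum rate with the recorded one.
        same = status["second_rate"] == platform_db["processor_max_rate"]
    elif event == "bandwidth_drop":
        # Compare the allocated bandwidth with the recorded maximum of the
        # hardware that allocates bandwidth to the connected device.
        same = status["second_bandwidth"] == platform_db["allocator_max_bandwidth"]
    else:
        raise ValueError(f"unknown event: {event!r}")
    return "expected" if same else "abnormal"
```

A mismatch means the value reported at run time disagrees with what the platform is known to support, so the pre-judgment of an expected drop is overturned.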

8. The method according to claim 6, characterized in that generating the target result based on the first review result and/or the second review result comprises: when the first review result and/or the second review result indicates that the drop event is an expected drop, generating a first target result, which indicates that the alarm type corresponding to the drop event is an informational alarm; and when the first review result and/or the second review result indicates that the drop event is an abnormal drop, generating a second target result, which indicates that the alarm type corresponding to the drop event is a critical alarm.
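
The mapping from review result to alarm type in claim 8 can be illustrated as follows. The enum, the function name, and the shape of the returned record are assumptions made for this sketch; it also assumes a single review result is available per drop event.

```python
from enum import Enum


class AlarmType(Enum):
    INFORMATIONAL = "informational"
    CRITICAL = "critical"


def build_target_result(event: str, review_result: str) -> dict:
    """Map a review result ("expected" or "abnormal") to an alarm record."""
    if review_result == "expected":
        alarm = AlarmType.INFORMATIONAL   # first target result
    elif review_result == "abnormal":
        alarm = AlarmType.CRITICAL        # second target result
    else:
        raise ValueError(f"unknown review result: {review_result!r}")
    return {"event": event, "alarm_type": alarm}
```

Maintenance tooling could then filter on `AlarmType.CRITICAL` so that only abnormal drops reach an operator.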

9. The method according to claim 1, characterized in that, before acquiring the status information of the serial bus link in response to a drop event occurring on the serial bus link, the method further comprises: reading the maximum rate and maximum bandwidth supported by the connected device from a capability register of the connected device; reading the current actual rate and current actual bandwidth from a status register of the connected device; triggering the rate drop event when the current actual rate is less than the maximum rate supported by the connected device; and triggering the bandwidth drop event when the current actual bandwidth is less than the maximum bandwidth supported by the connected device.
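
For a PCIe link, the trigger step of claim 9 could be sketched as below. The sketch assumes the capability and status registers are the PCIe Link Capabilities and Link Status registers, whose low bits hold a speed code in bits 3:0 and a link width in bits 9:4, and that bandwidth is represented by link width. The speed-code table follows the common encoding and the function names are hypothetical. How the raw register values are read from the device is outside the sketch.

```python
from typing import List, Tuple

# Common PCIe speed-code encoding (code -> GT/s); treated here as an assumption.
SPEED_GTS = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0, 6: 64.0}


def decode_link_register(value: int) -> Tuple[float, int]:
    """Extract (speed in GT/s, lane width) from a raw link register value."""
    speed_code = value & 0xF          # bits 3:0
    width = (value >> 4) & 0x3F       # bits 9:4
    return SPEED_GTS[speed_code], width


def detect_drop_events(link_cap: int, link_status: int) -> List[str]:
    """Compare current link state against device capability and list drop events."""
    max_speed, max_width = decode_link_register(link_cap)
    cur_speed, cur_width = decode_link_register(link_status)
    events = []
    if cur_speed < max_speed:
        events.append("rate_drop")
    if cur_width < max_width:
        events.append("bandwidth_drop")
    return events
```

With a capability value of `0x105` (32 GT/s, x16) and a status value of `0x84` (16 GT/s, x8), both a rate drop event and a bandwidth drop event would be triggered.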

10. An electronic device, characterized in that it comprises: a memory configured to store a computer program; and a processor configured to implement the steps of the serial bus link status classification alarm method according to any one of claims 1 to 9 when executing the computer program.