Monitoring system, method, apparatus and device for basic input / output system, and non-volatile storage medium
By monitoring multiple startup stages of the basic input/output system and using the controller to identify startup timeouts and switch to the backup system, the problem of excessively long server startup time is solved, and fast startup is achieved.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- INSPUR SUZHOU INTELLIGENT TECH CO LTD
- Filing Date
- 2025-03-12
- Publication Date
- 2026-06-25
AI Technical Summary
In existing technologies, server startup speeds are low, and the switching of primary and backup basic input/output systems requires a long wait time, resulting in excessively long startup times.
The first controller monitors multiple startup stages of the basic input/output system. When the actual startup time exceeds the predetermined time, it controls the switch to switch the first basic input/output system to the second basic input/output system, thus shortening the time for determining startup timeout.
In the event of a failure during the startup of the basic input/output system, it can quickly switch to the backup system, shortening the server startup time and improving the startup speed.
Description
Monitoring systems, methods, devices, equipment, and non-volatile storage media for basic input / output systems.
[0001] Cross-references to related applications
[0002] This application claims priority to Chinese Patent Application No. 202411877757.9, filed on December 19, 2024, entitled "Monitoring System, Method, Apparatus, Device, and Medium for Basic Input / Output Systems", the entire contents of which are incorporated herein by reference.
Technical Field
[0003] This application relates to the field of server technology, and in particular to monitoring systems, methods, apparatus, devices, and non-volatile storage media for basic input / output systems.
Background Technology
[0004] In server systems, the Basic Input Output System (BIOS) is a set of programs embedded in a read-only memory (ROM) chip on the server motherboard. It stores the computer's most important basic input / output programs, power-on self-test (POST) programs, and system startup programs. If the BIOS fails, the server will fail to boot. Currently, some server devices are designed with two BIOS memories as a primary and backup BIOS. However, when the primary BIOS fails, it takes a considerable amount of time to switch to the backup BIOS, resulting in excessively long boot times.
[0005] The related art therefore suffers from the technical problem of slow server startup.
Summary of the Invention
[0006] The purpose of this application is to provide a monitoring system, method, apparatus, device, and non-volatile storage medium for a basic input / output system to improve the startup speed of a server.
[0007] To solve the above-mentioned technical problems, the first aspect of this application provides a monitoring system for a basic input / output system, including a first controller and a first switching switch;
[0008] The first controller is configured to obtain the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on. If it is detected that the actual startup time exceeds the first startup time corresponding to the startup phase of the first basic input / output system, the controller determines that the first basic input / output system has failed to start and controls the first switch to switch the first basic input / output system to the second basic input / output system.
[0009] There are multiple startup phases, and the first startup time is determined based on the device configuration parameters corresponding to the startup phase.
[0010] In one aspect, the startup phases include at least two of the following: a central processing unit startup phase, a memory initialization phase, a peripheral loading phase, an operating system loader loading phase, and an operating system running phase.
[0011] In another aspect, the first controller monitoring the central processing unit startup phase of the first basic input / output system according to the startup status includes:
[0012] If a power-on signal of the central processing unit is detected, it is determined that the central processing unit startup phase has begun.
[0013] If a reset signal of the central processing unit is detected, it is determined that the central processing unit startup phase has been completed.
[0014] In another aspect, the first controller monitoring the memory initialization phase of the first basic input / output system according to the startup status includes:
[0015] If the reset signal of the central processing unit is detected, it is determined that the memory initialization phase has begun.
[0016] If a peripheral initialization signal is detected, it is determined that the memory initialization phase has been completed.
[0017] In another aspect, the first controller monitoring the peripheral loading phase of the first basic input / output system according to the startup status includes:
[0018] If the peripheral initialization signal is detected, it is determined that the peripheral loading phase has begun.
[0019] If an operating system loader loading signal is detected, it is determined that the peripheral loading phase has been completed.
[0020] In another aspect, the first controller monitoring the operating system loader loading phase of the first basic input / output system according to the startup status includes:
[0021] If the operating system loader loading signal is detected, it is determined that the operating system loader loading phase has begun.
[0022] If an operating system kernel initialization signal is detected, it is determined that the operating system loader loading phase has been completed.
[0023] In another aspect, the first controller monitoring the operating system running phase of the first basic input / output system according to the startup status includes:
[0024] If the operating system kernel initialization signal is detected, it is determined that the operating system running phase has begun.
[0025] If it is detected that the operating system has completed a preset number of running cycles, it is determined that the operating system running phase has been completed.
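As a rough illustration of how these phase boundaries chain together, the following Python sketch treats each detected signal as the end of one phase and the start of the next. The signal names, the `Phase` enum, and the `track_phases` function are hypothetical names chosen for this sketch; the application describes the signals only in prose.

```python
from enum import Enum


class Phase(Enum):
    CPU_STARTUP = 1
    MEMORY_INIT = 2
    PERIPHERAL_LOAD = 3
    OS_LOADER_LOAD = 4
    OS_RUNNING = 5


# Hypothetical signal names. Each signal maps to
# (phase it completes or None, phase it starts or None).
TRANSITIONS = {
    "cpu_power_on": (None, Phase.CPU_STARTUP),
    "cpu_reset": (Phase.CPU_STARTUP, Phase.MEMORY_INIT),
    "peripheral_init": (Phase.MEMORY_INIT, Phase.PERIPHERAL_LOAD),
    "os_loader_load": (Phase.PERIPHERAL_LOAD, Phase.OS_LOADER_LOAD),
    "os_kernel_init": (Phase.OS_LOADER_LOAD, Phase.OS_RUNNING),
    "os_cycles_done": (Phase.OS_RUNNING, None),
}


def track_phases(signals):
    """Turn an ordered list of detected signals into phase start/complete events."""
    current = None  # phase currently in progress, if any
    events = []
    for signal in signals:
        completes, starts = TRANSITIONS[signal]
        if completes != current:
            raise ValueError(f"signal {signal!r} is out of order")
        if completes is not None:
            events.append(("completed", completes))
        if starts is not None:
            events.append(("started", starts))
        current = starts
    return events
```

Because one signal both closes a phase and opens the next, the controller in this sketch never has a gap between phases in which a hang could go unmonitored.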
[0026] In another aspect, the first controller obtaining the startup status and identifying that the actual startup time exceeds the first startup time includes:
[0027] After determining that the first basic input / output system has entered the current startup phase, the first controller listens for output information from the first basic input / output system.
[0028] If no information indicating completion of the current startup phase is received from the first basic input / output system within the corresponding first startup time, the first controller determines that the actual startup time of the current startup phase exceeds the corresponding first startup time.
[0029] In another aspect, the first controller listening for output information from the first basic input / output system includes:
[0030] After the first controller receives information from the first basic input / output system indicating that a startup phase has begun, it configures and starts a first timer for that phase according to the first startup time corresponding to the phase.
[0031] After the first controller receives information from the first basic input / output system indicating that the startup phase has been completed, it stops the first timer.
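The listening variant can be sketched in Python as a replay over a timestamped event log: a first timer is armed when a phase start is reported and cancelled when the matching completion is reported. The event tuple layout, the function name `find_timed_out_phase`, and the phase names and times in the usage example are assumptions made for illustration, not details from the application.

```python
def find_timed_out_phase(events, first_startup_time, end_of_observation):
    """Return the phase whose first timer expired, or None.

    events: chronologically ordered (time_s, kind, phase) tuples,
            where kind is "start" or "done".
    first_startup_time: dict mapping phase -> allowed seconds.
    end_of_observation: time at which the replay stops.
    """
    running = None  # (phase, expiry_time) of the armed first timer
    for time_s, kind, phase in events:
        # Any event arriving after the expiry means the timer fired first.
        if running is not None and time_s > running[1]:
            return running[0]
        if kind == "start":
            running = (phase, time_s + first_startup_time[phase])
        elif kind == "done" and running is not None and running[0] == phase:
            running = None  # completion reported in time: stop the timer
    if running is not None and end_of_observation > running[1]:
        return running[0]
    return None


# Hypothetical values: memory initialization never reports completion.
times = {"cpu_startup": 5.0, "memory_init": 30.0}
log = [(0.0, "start", "cpu_startup"),
       (2.0, "done", "cpu_startup"),
       (2.0, "start", "memory_init")]
hung = find_timed_out_phase(log, times, end_of_observation=40.0)
```

In this example the memory initialization timer would expire at 32 s, so `hung` is `"memory_init"` without waiting for any later phase.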
[0032] In another aspect, the first controller obtaining the startup status and identifying that the actual startup time exceeds the first startup time includes:
[0033] After determining that the first basic input / output system has entered the current startup phase, the first controller accesses the first basic input / output system within the first startup time corresponding to that phase to obtain the startup status.
[0034] If, when the first startup time is reached, the startup status shows that the first basic input / output system has not completed the current startup phase, the first controller determines that the actual startup time of the current startup phase exceeds the corresponding first startup time.
[0035] In another aspect, the first controller determining that the first basic input / output system has entered the current startup phase includes:
[0036] The first controller accesses the first basic input / output system and, after obtaining information that the first basic input / output system has completed the previous startup phase, determines that the first basic input / output system has entered the current startup phase.
[0037] In another aspect, the first controller accessing the first basic input / output system within the first startup time corresponding to the current startup phase to obtain the startup status, after determining that the first basic input / output system has entered that phase, includes:
[0038] After determining that the first basic input / output system has entered the current startup phase, the first controller multiplies the first startup time corresponding to the current startup phase by a preset scaling factor to obtain a second startup time.
[0039] The first controller configures a second timer according to the second startup time, configures a third timer according to the first startup time, and starts the second and third timers.
[0040] When the second timer expires, the first controller accesses the first basic input / output system to obtain the startup status. If the first basic input / output system has not completed the current startup phase, the first controller continues to wait. If the first basic input / output system has completed the current startup phase, the first controller stops the third timer and determines that the first basic input / output system has entered the next startup phase.
[0041] When the third timer expires, the first controller accesses the first basic input / output system to obtain the startup status. If the first basic input / output system has not completed the current startup phase, the first controller determines that the actual startup time exceeds the first startup time. If the first basic input / output system has completed the current startup phase, the first controller determines that the first basic input / output system has entered the next startup phase.
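A minimal Python sketch of this two-timer polling scheme follows. It assumes the preset scaling factor lies strictly between 0 and 1, so that the second timer gives an early check before the third timer reaches the full first startup time; the application does not state the range. The function name and the `is_phase_done` callback (standing in for the controller's status query) are hypothetical.

```python
def monitor_phase_by_polling(first_startup_time, scaling_factor, is_phase_done):
    """Poll once at the second timer and once at the third timer.

    is_phase_done(t): hypothetical status query, returns True if the phase
    has completed t seconds after it began.
    Returns (outcome, decision_time_s), outcome in {"next_phase", "timeout"}.
    """
    if not 0 < scaling_factor < 1:
        # Assumption of this sketch, not a requirement stated in the source.
        raise ValueError("this sketch assumes a scaling factor between 0 and 1")
    second_startup_time = first_startup_time * scaling_factor

    # Second timer expires: early poll. On success the third timer is stopped.
    if is_phase_done(second_startup_time):
        return "next_phase", second_startup_time

    # Phase not finished yet: keep waiting until the third timer expires.
    if is_phase_done(first_startup_time):
        return "next_phase", first_startup_time
    return "timeout", first_startup_time


# Hypothetical example: 30 s allowed, early poll at 15 s, phase done after 12 s.
result = monitor_phase_by_polling(30.0, 0.5, lambda t: t >= 12.0)
```

The early poll lets a healthy phase hand over to the next one before its full allowance elapses, while a hung phase is still caught no later than the first startup time.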
[0042] In another aspect, the first controller is a baseboard management controller;
[0043] The first controller obtaining the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on includes:
[0044] The baseboard management controller obtains the startup status through intelligent platform management interface commands.
[0045] In another aspect, the first controller is a complex programmable logic device;
[0046] The first controller obtaining the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on includes:
[0047] The complex programmable logic device receives startup status information output by the first basic input / output system via an integrated circuit bus.
[0048] In another aspect, the first controller is a complex programmable logic device;
[0049] The first controller obtaining the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on includes:
[0050] The complex programmable logic device receives startup status information sent by the baseboard management controller.
[0051] To address the aforementioned technical problems, a second aspect of this application also provides a monitoring method for a basic input / output system, applied to a first controller, comprising:
[0052] After power-on, the startup status of the first basic input / output system is obtained by communicating with the first basic input / output system.
[0053] If it is detected that the actual startup time exceeds the first startup time corresponding to the startup phase in the first basic input / output system, then the startup of the first basic input / output system is determined to have failed.
[0054] After determining that the first basic input / output system has failed to start, control the first switch to switch the first basic input / output system to the second basic input / output system;
[0055] There are multiple startup phases, and the first startup time is determined based on the device configuration parameters corresponding to the startup phase.
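The steps of the method can be sketched end to end in Python. The sketch walks the startup phases in order, compares each phase's actual duration with its first startup time, and selects the second basic input / output system at the first overrun. The function name, the dictionary layout, and the phase names and durations in the usage example are hypothetical; real firmware would drive a switch rather than return a value.

```python
def monitor_boot(phase_budgets, actual_durations):
    """Decide which BIOS ends up active.

    phase_budgets: ordered list of (phase, first_startup_time_s).
    actual_durations: dict phase -> seconds taken, or None if the phase hangs.
    """
    elapsed = 0.0
    for phase, budget in phase_budgets:
        actual = actual_durations.get(phase)
        if actual is None or actual > budget:
            # Timeout in this phase: declare startup failure and switch over
            # as soon as this phase's first startup time has run out.
            return {"active_bios": "second",
                    "failed_phase": phase,
                    "decided_at_s": elapsed + budget}
        elapsed += actual
    return {"active_bios": "first", "failed_phase": None, "decided_at_s": elapsed}


# Hypothetical budgets and a hang during memory initialization.
budgets = [("cpu_startup", 5.0), ("memory_init", 30.0), ("peripheral_load", 20.0)]
outcome = monitor_boot(budgets, {"cpu_startup": 2.0, "memory_init": None})
```

Here the failure is declared 32 s after power-on (2 s of CPU startup plus the 30 s memory initialization allowance), and the later phases are never waited for.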
[0056] To address the aforementioned technical problems, a third aspect of this application also provides a monitoring apparatus for a basic input / output system, applied to a first controller, comprising:
[0057] The monitoring unit is configured to obtain the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on;
[0058] The identification unit is configured to determine that the first basic input / output system has failed to start if it detects that the actual startup time in the power-on startup phase of the first basic input / output system exceeds the first startup time corresponding to the power-on startup phase.
[0059] The control unit is configured to control the first switch to switch the first basic input / output system to the second basic input / output system after determining that the first basic input / output system has failed to start.
[0060] There are multiple startup phases, and the first startup time is determined based on the device configuration parameters corresponding to the startup phase.
[0061] To address the aforementioned technical problems, a fourth aspect of this application also provides a monitoring device for a basic input / output system, comprising:
[0062] The memory is configured to store computer programs;
[0063] The processor is configured to execute computer programs, which, when executed by the processor, implement the steps of the monitoring method for the basic input / output system described above.
[0064] To address the aforementioned technical problems, a fifth aspect of this application also provides a non-volatile storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of the monitoring method for the basic input / output system described above.
[0065] To address the aforementioned technical problems, a sixth aspect of this application also provides a computer program product, comprising a computer program that, when executed by a processor, implements the steps of the monitoring method for the basic input / output system described above.
[0066] The monitoring system for the basic input / output system provided in this application has the following beneficial effect. A first startup time is determined for each of the multiple power-on startup phases of the first basic input / output system, based on the device configuration parameters corresponding to that phase. After power-on, the first controller communicates with the first basic input / output system to obtain its startup status. If the first controller detects that the actual startup time of a power-on startup phase exceeds the first startup time corresponding to that phase, it determines that the first basic input / output system has failed to start and controls the first switch to switch from the first basic input / output system to the second basic input / output system. Thus, when a fault in any phase of the startup process prevents the first basic input / output system from starting, the system can quickly switch to the second basic input / output system. This shortens the time needed to determine a startup timeout of the first basic input / output system and thereby improves the server's power-on startup speed.
[0067] The monitoring method, apparatus, equipment, non-volatile storage medium, and computer program product for the basic input / output system provided in this application have the aforementioned beneficial effects, which will not be elaborated further here.
Attached Figure Description
[0068] To more clearly illustrate the technical solutions of the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0069] Figure 1 is an architecture diagram of a monitoring system for a basic input / output system provided in an embodiment of this application;
[0070] Figure 2 is a timing diagram of the startup process of a basic input / output system provided in an embodiment of this application;
[0071] Figure 3 is a flowchart of a monitoring method for a basic input / output system provided in an embodiment of this application;
[0072] Figure 4 is a schematic diagram of the structure of a monitoring device for a basic input / output system provided in an embodiment of this application.
Detailed Implementation
[0073] The core of this application is to provide a monitoring system, method, apparatus, device, and non-volatile storage medium for a basic input / output system, used to improve the startup speed of a server.
[0074] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0075] In server systems, the BIOS Flash memory is used to store the BIOS firmware. In one type of server, the BIOS Flash memory stores the Unified Extensible Firmware Interface (UEFI) firmware. After the server powers on, the BIOS initializes the hardware and boots the operating system (OS).
[0076] To improve boot reliability, two Basic Input / Output Systems (BIOS) are typically deployed on the server motherboard, one primary and one backup. The primary BIOS is the one that starts by default when the server boots. If the primary BIOS is damaged by a virus, by human error during an upgrade, or for other reasons, so that it hangs and cannot boot and load the operating system, the system can switch to the backup BIOS to boot. This prevents the server from failing to boot and improves the reliability of the server.
[0077] In traditional solutions, there is no automatic failover mechanism between primary and backup basic input / output systems. Maintenance personnel must manually switch to the backup basic input / output system storage after observing a startup failure of the primary system. This requires 24-hour manual monitoring, wasting manpower and causing the server to remain offline for extended periods.
[0078] Therefore, those skilled in the art propose setting a timer for the primary basic input / output system. If the primary basic input / output system fails to start successfully within the predetermined time, the system can be automatically switched to the backup basic input / output system via a switch to start it, thereby eliminating the need for manual intervention and shortening server downtime.
[0079] However, in the related art, this automatic switching scheme must wait for a timeout period to elapse before switching to the backup basic input / output system, whether the central processing unit fails to load the basic input / output system or the basic input / output system fails to start after being loaded. Moreover, to avoid misjudging a normal startup as a failure of the primary basic input / output system, the timer duration must be set to cover the maximum startup time of the primary basic input / output system. As a result, after the primary basic input / output system fails, there is a long wait before the automatic switch to the backup basic input / output system, which lengthens the server boot time.
[0080] Therefore, the monitoring system for the basic input / output system provided in this application embodiment uses a first controller to determine the first startup time based on the device configuration parameters corresponding to multiple power-on startup stages of the first basic input / output system. After power-on, the system communicates with the first basic input / output system to obtain its startup status. If it is detected that the actual startup time in the power-on startup stage of the first basic input / output system exceeds the first startup time corresponding to the power-on startup stage, it is determined that the first basic input / output system has failed to start. The system then controls a first switching switch to switch the first basic input / output system to a second basic input / output system. This allows for a quick switch to the second basic input / output system when a failure occurs at a certain stage of the startup process of the first basic input / output system, shortening the time for determining the startup timeout of the first basic input / output system and thus improving the server's power-on startup speed.
[0081] Figure 1 is an architecture diagram of a monitoring system for a basic input / output system provided in an embodiment of this application.
[0082] As shown in Figure 1, the monitoring system of the basic input / output system provided in this application embodiment may include a first controller and a first switching switch 101.
[0083] The first controller is configured to obtain the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on. If it is detected that the actual startup time in the power-on startup phase of the first basic input / output system exceeds the first startup time corresponding to the power-on startup phase, it is determined that the first basic input / output system has failed to start, and the first switching switch 101 is controlled to switch the first basic input / output system to the second basic input / output system.
[0084] There are multiple startup phases, and the first startup time is determined based on the device configuration parameters corresponding to the startup phase.
[0085] In the embodiments of this application, the first controller may be a Baseboard Management Controller (BMC) or a Complex Programmable Logic Device (CPLD).
[0086] The first switch 101 can be a physical switch on a server motherboard or baseboard management controller board. The first switch 101 is configured to select which basic input / output system memory is connected to the server's platform controller (such as the Platform Controller Hub, PCH, also known as the integrated southbridge). That is, for the first and second basic input / output systems, the first switch 101 includes at least two selection channels: a first channel connects the platform controller to the memory in which the first basic input / output system resides, and a second channel connects the platform controller to the memory in which the second basic input / output system resides. The first switch 101 can be connected to the platform controller, to the memory where the first basic input / output system resides, and to the memory where the second basic input / output system resides via a Serial Peripheral Interface (SPI) bus.
[0087] If the first controller is a complex programmable logic device (CPLD), the CPLD can be connected to the first switch 101 via an integrated circuit bus to control the chip select of the first switch 101. Optionally, the CPLD can be connected to the platform controller via a first integrated circuit bus and to the controlled terminal of the first switch 101 via a second integrated circuit bus. Both the first and second integrated circuit buses can be two-wire serial buses (Inter-Integrated Circuit, I2C) or improved inter-integrated circuit buses (I3C).
[0088] If the first controller is a baseboard management controller, the baseboard management controller can be connected to the platform controller via a low pin count bus (LPC bus). The baseboard management controller can control the chip select of the first switching switch 101 through Intelligent Platform Management Interface (IPMI) commands.
[0089] In this embodiment, the first controller may use a first bus to communicate with the first basic input / output system and the second basic input / output system to obtain their startup status.
[0090] If the first controller is a complex programmable logic device, the first bus can be an integrated circuit bus. The complex programmable logic device can be connected to the first basic input / output system through a third integrated circuit bus and to the second basic input / output system through a fourth integrated circuit bus. Both the third and fourth integrated circuit buses can be two-wire serial buses (Inter-Integrated Circuit, I2C) or improved inter-integrated circuit buses (I3C).
[0091] If the first controller is a baseboard management controller, the baseboard management controller can communicate with the first basic input / output system and the second basic input / output system through intelligent platform management interface commands to obtain their status. For example, for the first basic input / output system, the first basic input / output system can output its own startup status by sending intelligent platform management interface commands to the baseboard management controller.
[0092] After the server is powered on, one basic input / output system (BIOS) starts by default; in this embodiment it is defined as the first BIOS. It can be either the primary BIOS or the backup BIOS as defined in the related art. In an optional implementation, the primary BIOS serves as the first BIOS by default. However, if the primary BIOS failed before the server was last shut down, the backup BIOS may be set to start by default at the next power-on. In that case, the first BIOS refers to the backup BIOS.
[0093] After the first controller powers on, it can read the BIOS Flash Slot Number register of the BIOS Flash memory and determine the first BIOS based on the value of the register. Suppose the slot number register uses 0 and 1 to represent the two BIOSes. If the value read by the first controller is neither 0 nor 1, the BIOS with slot number 0 is selected as the first BIOS by default, and this selection is written to a register in the Non-Volatile Random Access Memory (NVRAM). If the value read by the first controller is 0 or 1, the first controller monitors whether the BIOS startup has timed out. If a timeout occurs, the value of the slot number register is inverted before chip select is performed; otherwise, BIOS chip select is performed according to the value read from the slot number register.
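The slot selection rule in this paragraph is small enough to sketch directly in Python. The function name and the returned flag are hypothetical; the sketch only mirrors the three cases described (invalid register value, timeout, normal startup) and does not model the actual register or NVRAM access.

```python
def select_bios_slot(slot_register_value, startup_timed_out):
    """Return (slot_to_chip_select, write_default_to_nvram)."""
    if slot_register_value not in (0, 1):
        # Invalid register content: fall back to slot 0 and persist the default.
        return 0, True
    if startup_timed_out:
        # Invert the slot number (0 <-> 1) before performing chip select.
        return 1 - slot_register_value, False
    # Normal startup: chip select follows the value that was read.
    return slot_register_value, False
```

For example, `select_bios_slot(0, True)` selects slot 1, which corresponds to switching from the first BIOS to the second after a timeout.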
[0094] In this embodiment, a second basic input / output system is defined as a secondary basic input / output system that is in a normal state and can be started normally. That is, the second basic input / output system can start normally after the first basic input / output system fails to start. In optional implementations, there can be one or more second basic input / output systems.
[0095] In this embodiment of the application, the process of starting the basic input / output system is divided into multiple stages, each of which is referred to as a power-on startup phase.
[0096] In some optional embodiments of this application, dividing the startup process of the Basic Input / Output System (BIOS) into multiple startup phases may include: grouping the startup process of at least one piece of hardware, or the loading process of at least one piece of software, into the same startup phase according to the server's hardware startup sequence and software loading sequence during BIOS startup. In optional embodiments, the startup order of the various types of hardware and software can be derived from the server's hardware startup timing and software loading timing during BIOS startup. All or part of the hardware and software can then be used as the basis for dividing the startup phases, and the order of the startup phases follows that startup order.
[0097] Depending on the server's device configuration parameters (e.g., software and hardware configuration parameters), the time each startup phase takes during a normal startup may differ. In this embodiment, the time allotted to a startup phase during normal startup is defined as the first startup time. It is determined from the device configuration parameters corresponding to that phase and is chosen to cover the maximum duration of normal execution of the phase.
[0098] The first controller can pre-store the first startup time corresponding to each boot-up stage and monitor whether the actual startup time of each boot-up stage has exceeded the time limit. In addition, the first startup time corresponding to each boot-up stage may be different in different boot-ups of the server. In this case, the first controller can pre-store the first startup time corresponding to each boot-up stage under different boot-ups of the server, or the first basic input / output system can send an updated first startup time to the first controller after booting.
[0099] According to the Unified Extensible Firmware Interface (UEFI) specification, the boot phase of a Basic Input / Output System (BIOS) can be divided into a CPU boot phase, a memory initialization phase, a peripheral loading phase, an operating system loader loading phase, and an operating system runtime phase. In some optional embodiments of this application, the boot phase may include at least two of the following: CPU boot phase, memory initialization phase, peripheral loading phase, operating system loader loading phase, and operating system runtime phase. In other optional embodiments of this application, multiple phases may be monitored as a single boot phase. In still other optional embodiments of this application, more or fewer boot phases may be configured.
[0100] After power-on, the first controller determines the first basic input / output system by reading the slot number register of the basic input / output system memory, and obtains the startup status of the first basic input / output system by communicating with it. Whenever the first basic input / output system enters a startup phase, the controller monitors the execution status of the startup phase according to the first startup time corresponding to that startup phase. If the startup phase has not been completed before the first startup time has expired, it means that the actual startup time of the startup phase has exceeded the first startup time, which also means that the startup phase has failed. At this time, there is no need to wait for the subsequent startup phase, that is, there is no need to wait for the entire startup phase of the first basic input / output system to time out. The controller can determine that the startup of the first basic input / output system has failed and control the first switching switch 101 to switch the first basic input / output system to the second basic input / output system for startup to load the operating system.
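To make the benefit concrete, the Python sketch below compares how long it takes to detect a hang with per-phase first startup times versus a single timer for the whole startup process. All numbers are hypothetical, and the sketch assumes the single timer would be set to the sum of the per-phase maxima; the application only says it must cover the maximum startup time.

```python
def detection_latency(first_startup_times, failing_phase, elapsed_before_failing_phase):
    """Return (per_phase_latency_s, single_timer_latency_s) for a hang in failing_phase."""
    # Per-phase monitoring: time already spent plus the failing phase's allowance.
    per_phase = elapsed_before_failing_phase + first_startup_times[failing_phase]
    # Single timer (assumption of this sketch): sum of all per-phase allowances.
    single_timer = sum(first_startup_times.values())
    return per_phase, single_timer


# Hypothetical first startup times in seconds.
budgets = {
    "cpu_startup": 5.0,
    "memory_init": 60.0,
    "peripheral_load": 40.0,
    "os_loader_load": 20.0,
    "os_running": 30.0,
}
# The CPU phase finished after 3 s, then memory initialization hangs.
per_phase, single = detection_latency(budgets, "memory_init", 3.0)
```

With these illustrative values the per-phase scheme declares failure after 63 s, while the single timer would wait 155 s before switching.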
[0101] In the monitoring system for the basic input / output system provided in this application embodiment, the first startup time of each of the multiple power-on startup stages of the first basic input / output system is determined according to the device configuration parameters, and after power-on the first controller obtains the startup status of the first basic input / output system by communicating with it. If it is detected that the actual startup time of a power-on startup stage of the first basic input / output system exceeds the first startup time corresponding to that stage, it is determined that the first basic input / output system has failed to start, and the first switching switch 101 is controlled to switch the first basic input / output system to the second basic input / output system. In this way, when the first basic input / output system fails, the system can switch to the second basic input / output system in a timely manner, which shortens the time needed to determine a basic input / output system startup timeout and improves the server startup speed.
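The per-stage check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the stage names, the `FIRST_STARTUP_TIME` table, the helper names, and all time values other than those quoted elsewhere in the description are hypothetical.

```python
from typing import Optional

# Hypothetical per-stage limits in seconds (the "first startup time" of each stage).
FIRST_STARTUP_TIME = {
    "cpu_boot": 240,
    "memory_init": 600,
    "peripheral_load": 60,
    "os_loader": 120,
    "os_runtime": 60,
}

def find_failed_stage(actual_times: dict) -> Optional[str]:
    """Return the first stage whose elapsed time exceeds its limit, else None.

    `actual_times` maps stage name -> seconds elapsed; stages that have not
    been reached yet are simply absent.
    """
    for stage, limit in FIRST_STARTUP_TIME.items():
        elapsed = actual_times.get(stage)
        if elapsed is None:
            break  # stage not reached yet, so no later stage has run either
        if elapsed > limit:
            return stage  # fail fast: no need to wait for later stages
    return None

def select_bios(active_slot: int, actual_times: dict) -> int:
    """Switch to the other BIOS slot (0 <-> 1) as soon as any stage times out."""
    return active_slot ^ 1 if find_failed_stage(actual_times) else active_slot
```

The point of the sketch is that a timeout in an early stage already decides the outcome, so the switch does not have to wait for a single whole-boot timeout.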
[0102] As described in the above embodiments, the first controller can be a complex programmable logic device or a baseboard management controller.
[0103] If the first controller is a baseboard management controller, the first controller obtains the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on. This may include: the baseboard management controller obtaining the startup status through intelligent platform management interface commands.
[0104] If the first controller is a complex programmable logic device, the first controller obtains the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on. This may include the complex programmable logic device receiving startup status information output by the first basic input / output system through an integrated circuit bus.
[0105] If the first controller is a complex programmable logic device, the first controller obtaining the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on may further include: the complex programmable logic device receiving startup status information sent by the baseboard management controller.
[0106] In some optional embodiments of this application, the first controller may include both a complex programmable logic device (CPLD) and a baseboard management controller (BMC). The CPLD and the BMC can each acquire the startup status of the first basic input / output system (BIOS) or the second BIOS by communicating with it, and monitor the startup process of the first or second BIOS in conjunction with the first startup time. In this case, the CPLD and the BMC can act as backups for each other, with only one having control over the first switching switch 101 at any given time. The two first controllers monitor each other's operating status, and if the first controller with control fails, the other first controller takes over the monitoring task for the BIOS. This improves the reliability of the BIOS startup process, thereby increasing server boot efficiency.
[0107] Based on the above embodiments, this application describes the communication process between the first controller and the first basic input / output system.
[0108] In some optional embodiments of this application, scripts can be deployed in the first and second basic input / output systems to output a startup status to the first controller during the startup process. The first controller then obtains the startup status and identifies that the actual startup time exceeds the first startup time. This can include: after determining that the first basic input / output system has entered the current power-on startup phase, the first controller listens to the output information of the first basic input / output system; if no information indicating the completion of the current power-on startup phase is received from the first basic input / output system within the corresponding first startup time, the first controller determines that the actual startup time corresponding to the current power-on startup phase exceeds the corresponding first startup time.
[0109] At this time, the first controller listens to the output information of the first basic input / output system, which may include: after the first controller listens to the information sent by the first basic input / output system indicating the start of the power-on phase, configuring and starting the first timer corresponding to the power-on phase according to the first startup time corresponding to the power-on phase; after the first controller listens to the information sent by the first basic input / output system indicating the completion of the power-on phase, turning off the first timer.
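As an illustration of this listening mode, the first timer can be modeled as an object that is started by the stage-start message and turned off by the stage-complete message. This is only a sketch: the `StageTimer` class and its method names are hypothetical, and the clock is passed in as a plain number of seconds instead of being read from hardware.

```python
class StageTimer:
    """First timer for one power-on startup phase, driven by an external clock (seconds)."""

    def __init__(self, first_startup_time: float):
        self.limit = first_startup_time
        self.started_at = None  # None means the timer is off

    def on_stage_start(self, now: float) -> None:
        """Called when the BIOS reports that the phase has started."""
        self.started_at = now

    def on_stage_complete(self, now: float) -> None:
        """Called when the BIOS reports that the phase has completed."""
        self.started_at = None

    def timed_out(self, now: float) -> bool:
        """True if the timer is running and the first startup time has been exceeded."""
        return self.started_at is not None and now - self.started_at > self.limit
```

If no completion message arrives, `timed_out` becomes true once the first startup time has elapsed, which corresponds to the timeout determination described above.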
[0110] In some optional embodiments of this application, the basic input / output system (BIOS) may be left unmodified, and the first controller actively acquires the startup status of the first or second BIOS. In this case, the first controller acquiring the startup status and identifying that the actual startup time exceeds the first startup time may further include: after determining that the first BIOS has entered the current power-on startup phase, the first controller accesses the first BIOS to obtain the startup status within the first startup time corresponding to the power-on startup phase; if, after the first startup time is reached, the startup status is that the first BIOS has not completed the current power-on startup phase, then the first controller determines that the actual startup time corresponding to the current power-on startup phase exceeds the corresponding first startup time.
[0111] At this point, the first controller determining that the first basic input / output system (BIOS) has entered the current power-on startup phase can include: the first controller accessing the first BIOS, obtaining information that the first BIOS has completed the previous power-on startup phase, and then determining that the first BIOS has entered the current power-on startup phase. That is, when actively acquiring the startup status of the first BIOS, the first controller can determine entry into the current power-on startup phase based on the signal indicating the end of the previous power-on startup phase. This method of actively acquiring the startup status may involve a delay, because the first BIOS may have completed the previous power-on startup phase some time before the first controller accesses it. The first controller can therefore acquire both the startup status and the completion time of the previous power-on startup phase when accessing the first BIOS, and dynamically set the first startup time corresponding to the current power-on startup phase.
[0112] After determining that the first basic input / output system has entered the current power-on startup phase, the first controller accesses the first basic input / output system within the first startup time corresponding to the power-on startup phase to obtain the startup status. This may include: after determining that the first basic input / output system has entered the current power-on startup phase, the first controller multiplies the first startup time corresponding to the current power-on startup phase by a preset scaling factor to obtain a second startup time; the first controller configures a second timer according to the second startup time and configures a third timer according to the first startup time, and starts the second and third timers; after the second timer expires, the first controller accesses the first basic input / output system to obtain the startup status. If the first basic input / output system has not completed the current power-on startup phase, it continues to wait; if the first basic input / output system has completed the current power-on startup phase, it closes the third timer and determines that the first basic input / output system has entered the next power-on startup phase; when the third timer expires, the first controller accesses the first basic input / output system to obtain the startup status. If the first basic input / output system has not completed the current power-on startup phase, it determines that the actual startup time exceeds the first startup time; if the first basic input / output system has completed the current power-on startup phase, it determines that the first basic input / output system has entered the next power-on startup phase. 
In other words, since the first basic input / output system may have already completed a power-on startup phase some time before the first controller accesses it, the first controller can also obtain the second startup time by multiplying the pre-stored first startup time by a preset scaling factor, and monitor the power-on startup phase according to the second startup time. After the second startup time is reached, the first controller accesses the first basic input / output system to obtain its startup status. If the first basic input / output system has already completed the monitored power-on startup phase at this time, monitoring of that phase can end early and monitoring of the next power-on startup phase can begin early.
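The two-timer polling scheme can be sketched as below. The function name `poll_stage`, the callable `stage_done_at`, and the assumption that the scaling factor lies between 0 and 1 are illustrative choices, not details taken from this application; the timers are simulated by querying the status at the two expiry instants.

```python
from typing import Callable, Tuple

def poll_stage(first_startup_time: float,
               scale: float,
               stage_done_at: Callable[[float], bool]) -> Tuple[str, float]:
    """Poll the BIOS at the second-timer and third-timer expiry instants.

    `stage_done_at(t)` is a hypothetical probe returning True if the BIOS
    reports the current phase as complete when accessed at time t (seconds
    since the phase started). Assumes 0 < scale < 1.
    """
    second_startup_time = first_startup_time * scale
    # Second timer expires: early check.
    if stage_done_at(second_startup_time):
        return ("next_stage", second_startup_time)  # third timer is closed
    # Third timer expires: final check against the first startup time.
    if stage_done_at(first_startup_time):
        return ("next_stage", first_startup_time)
    return ("timeout", first_startup_time)
```

With a scaling factor of 0.5 and a first startup time of 100 seconds, a phase that finishes after 30 seconds is recognized at the 50-second poll instead of at the 100-second limit.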
[0113] Based on the above embodiments, in this application embodiment, the boot-up phase includes at least two of the following: central processing unit boot phase, memory initialization phase, peripheral device loading phase, operating system loader loading phase, and operating system running phase.
[0114] Figure 2 is a timing diagram of the startup process of a basic input / output system provided in an embodiment of this application.
[0115] In this embodiment, the first controller can start a timer after determining that the power-on startup phase has begun. The timer's timing period is configured to the first startup time corresponding to the power-on startup phase. The first controller stops the timer after determining that the power-on startup phase has been completed. If the power-on startup phase has not been completed after the timer's timing period, the power-on startup phase is determined to have timed out. A watchdog timer can be used as the timer.
[0116] In this embodiment of the application, the first controller monitors the central processing unit startup phase of the first basic input / output system according to the startup state, which may include: detecting a signal that the central processing unit is powered on and determining that the central processing unit startup phase has started; if a central processing unit reset signal issued by the central processing unit is detected, determining that the central processing unit startup phase has been completed.
[0117] As shown in Figure 2, the CPU startup phase refers to the boot process of the CPU firmware on the motherboard (including the Unified Extensible Firmware Interface Platform Initialization (UEFI PI) firmware and the Management Engine (ME) firmware). After the server motherboard powers on, the first controller powers on and, upon detecting that the CPU has powered on, determines that the CPU startup phase has started. It then activates the watchdog timer corresponding to this phase, with a timeout set to 4 minutes (adjustable based on platform and hardware design). The CPU startup phase ends upon detecting a CPU reset signal. During the CPU startup phase, the bootloader loads the UEFI PI or Management Engine firmware. This firmware is typically stored in the Basic Input / Output System (BIOS) memory. When the UEFI PI or Management Engine firmware reaches the point where the CPU issues a CPU reset signal, it indicates that the firmware has successfully started.
[0118] In this embodiment of the application, the first controller monitors the memory initialization phase of the first basic input / output system according to the startup state, which may include: detecting a central processing unit reset signal and determining that the memory initialization phase has started; if a peripheral initialization signal is detected, determining that the memory initialization phase has been completed.
[0119] As shown in Figure 2, after detecting the central processing unit reset signal, the first controller activates the watchdog timer corresponding to the memory initialization phase. The BIOS uses different strategies for memory initialization during startup. For example, memory initialization training is performed during the initial power-on of the server, and the trained parameters are stored in the non-volatile memory (BIOS flash NVRAM) of the BIOS's memory. When the server restarts or powers on again, the memory initialization training phase is skipped, and the training parameters stored in the BIOS's NVRAM are used for rapid memory initialization. Furthermore, when the memory configuration (location, capacity, model, etc.) changes, the previously stored training parameters cannot be used, and memory initialization training needs to be performed again. The startup time of the memory initialization phase differs depending on whether the training parameters are obtained through memory initialization training or the stored training parameters are used for rapid initialization, and the larger the memory capacity, the greater the time difference. Higher DDR (Double Data Rate) memory generations, such as DDR3, DDR4, DDR5, and even a future DDR6, also require longer memory initialization training times. In this embodiment, the first controller can determine, based on the memory generation and memory capacity of the identified device, the first startup time when memory initialization training is performed and the first startup time when fast initialization is used in the memory initialization phase, and set different flags. The first controller sets different first startup times according to the different flags. When the first basic input / output system reaches the peripheral initialization signal (such as PCIe Option ROM loading), the first controller determines that the memory initialization phase has been completed and disables the watchdog timer corresponding to the memory initialization phase.
[0120] In this embodiment of the application, the first controller monitors the peripheral loading phase of the first basic input / output system according to the startup state, which may include: detecting a peripheral initialization signal and determining that the peripheral loading phase has started; if an operating system loader loading signal is detected, determining that the peripheral loading phase has been completed.
[0121] As shown in Figure 2, when the Basic Input / Output System (BIOS) starts up, it loads the required optional read-only memory (Option ROM) according to the needs of different peripherals. This supports the functional applications of different PCIe peripherals during the BIOS startup phase, such as the Preboot Execution Environment (PXE) function of the network card, the RAID function of the Redundant Array of Independent Disks (RAID) card, and the encryption function of the Non-Volatile Memory Express (NVMe) hard drive. However, because server hardware designs and requirements differ, Option ROMs for expandable PCIe devices are loaded selectively, which can be controlled through an Option ROM loading whitelist. The BIOS calculates how many PCIe Option ROMs need to be loaded based on the device types and quantities in the Option ROM loading whitelist, and then passes the number of Option ROMs to the first controller. The first controller can default to a timeout of 20 seconds for loading a single Option ROM, and then multiply this by the number of Option ROMs to obtain the first startup time corresponding to the peripheral loading phase. When the first basic input / output system reaches the peripheral initialization signal, the first controller enables the watchdog timer corresponding to the peripheral loading phase, and it disables the watchdog timer when the first basic input / output system boots to the operating system loader loading phase.
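The first startup time of the peripheral loading phase described above is a simple product, sketched here for concreteness. The function and constant names are hypothetical; only the 20-second default per Option ROM comes from the description.

```python
PER_OPTION_ROM_TIMEOUT_S = 20  # default timeout for loading a single Option ROM

def peripheral_stage_timeout_s(option_rom_count: int) -> int:
    """First startup time (seconds) of the peripheral loading phase."""
    if option_rom_count < 0:
        raise ValueError("Option ROM count cannot be negative")
    return PER_OPTION_ROM_TIMEOUT_S * option_rom_count
```

For example, a whitelist that resolves to three Option ROMs yields a 60-second limit for the phase.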
[0122] In this embodiment of the application, the first controller monitors the operating system loader loading stage of the first basic input / output system according to the startup state, which may include: detecting the operating system loader loading signal and determining that the operating system loader loading stage has started; if the operating system kernel initialization signal is detected, determining that the operating system loader loading stage has been completed.
[0123] As shown in Figure 2, during the operating system loader loading phase, the first basic input / output system (BIOS) sets different watchdog coefficients or switches based on the boot order and the boot option device actually inserted. For example, it identifies Universal Serial Bus (USB) boot disks, pre-boot execution environment system installations, and different operating system disks, and sets the operating system loader watchdog enable flag and watchdog time coefficient according to the product policy. It should be noted that some operating systems on certain devices do not support timed functions. In this case, the watchdog enable flag is used to skip monitoring of the operating system loader loading phase, and monitoring of the next boot phase begins only after the start signal of the next boot phase is detected.
[0124] The first controller (complex programmable logic device or baseboard management controller) calculates a new watchdog time by multiplying the default watchdog time (typical time) by the watchdog time coefficient, and uses this new watchdog time as the first startup time. The watchdog timer is then enabled or disabled according to the watchdog enable flag. When the operating system reaches the operating system kernel initialization stage, the watchdog timer corresponding to the operating system loader loading phase is disabled.
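A minimal sketch of this calculation follows. The function name and the convention of returning `None` when the watchdog is disabled are assumptions made for illustration.

```python
from typing import Optional

def os_loader_watchdog_time(default_time_s: float,
                            enable_flag: bool,
                            time_coefficient: float) -> Optional[float]:
    """Return the first startup time for the OS loader phase, or None if skipped.

    default_time_s   -- the default (typical) watchdog time
    enable_flag      -- watchdog enable flag set by the BIOS
    time_coefficient -- watchdog time coefficient set by the BIOS
    """
    if not enable_flag:
        return None  # monitoring of this phase is skipped
    return default_time_s * time_coefficient
```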
[0125] In this embodiment of the application, the first controller monitors the operating system running phase of the first basic input / output system according to the startup state, which may include: detecting the operating system kernel initialization signal and determining that the operating system running phase has started; if the operating system has completed a preset number of running cycles, then determining that the operating system running phase has been completed.
[0126] As shown in Figure 2, when the operating system kernel initialization signal is detected, the first controller activates the watchdog timer corresponding to the operating system's running phase, and deactivates the watchdog timer after one or more operating cycles, depending on the operating system's running cycle.
[0127] During the five boot-up phases mentioned above, the first controller conditionally enables, configures, and monitors the watchdog timer for each boot-up phase. When the watchdog timer of any phase times out, the first controller can perform a power reset operation. When any of the first four watchdog timers times out, in addition to performing a power reset operation, the first controller controls the first switching switch 101 to switch the first basic input / output system to the second basic input / output system (for example, it can read the current slot number register (BIOS Flash Slot Number) stored in the NVRAM, invert it, perform the switch, and save the chip-selected flash slot number back to the NVRAM). When the system restarts next time, the value of the current slot number register is read from the NVRAM first for chip selection.
[0128] The purpose of the power reset operation is to reset all registers to their initial state when the power is turned on, thereby improving the reliability and security of the system.
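The slot inversion mentioned in the example above can be illustrated with a small sketch in which the NVRAM is modeled as a dictionary. The key name `bios_flash_slot` and the function name are hypothetical; the fallback to slot 0 for an invalid value follows the behavior this application describes for the slot number register.

```python
def select_flash_slot(nvram: dict, timed_out: bool) -> int:
    """Pick the BIOS flash slot for chip selection and persist it.

    nvram     -- stand-in for the NVRAM holding the BIOS Flash Slot Number
    timed_out -- True if one of the first four watchdog timers expired
    """
    slot = nvram.get("bios_flash_slot")
    if slot not in (0, 1):
        slot = 0          # invalid or uninitialized value: default to slot 0
    elif timed_out:
        slot ^= 1         # invert: 0 -> 1, 1 -> 0
    nvram["bios_flash_slot"] = slot  # saved for the next restart
    return slot
```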
[0129] Taking the five power-on and startup stages described in the above embodiments as an example, this application embodiment explains the monitoring timing of the first controller.
[0130] In the monitoring system of the basic input / output system provided in the embodiments of this application, the monitoring process executed by the first controller may include the following steps S101 to S112.
[0131] S101: After power-on, the first controller reads the BIOS Flash Slot Number register in the NVRAM. If the value is neither 0 nor 1, BIOS Flash Slot Number 0 is used as the first BIOS Flash Slot by default and written to the register in the NVRAM. If the value read is 0 or 1, the first controller checks whether startup from that BIOS Flash Slot Number has timed out. If it has, the value of the slot number register is inverted and used for chip selection; if it has not timed out, the BIOS Flash Slot is selected according to the value read from the slot number register.
[0132] S102: The first controller continuously monitors whether the central processing unit is powered on. After monitoring that the central processing unit is powered on, it activates the watchdog timer corresponding to the central processing unit startup phase and sets the first startup time (which can be 4 minutes).
[0133] S103: The first controller continuously monitors whether the CPU reset signal has been issued. If the CPU reset signal is detected and set, the watchdog timer corresponding to the CPU startup phase is disabled, and the process proceeds to S104. Otherwise, if the watchdog timer corresponding to the CPU startup phase times out, the timeout flag is recorded in the NVRAM register, the CPU firmware startup timeout log is recorded, and a motherboard power reset operation is performed.
[0134] S104: The first controller enables the watchdog timer during the basic input / output system startup phase (memory initialization phase), and the default timeout can be set to 10 minutes.
[0135] S105: The first controller monitors whether the first basic input / output system has sent a flag for memory initialization training (what is actually transmitted is a time value coefficient, Flag). If the flag is detected within the timeout period, the watchdog time is reset to 10 minutes * Flag / 10 (Note: the time value coefficient transmitted by the first basic input / output system is in minutes), and the process jumps to S106; otherwise, the process jumps directly to S106.
[0136] S106: During the first startup time corresponding to the memory initialization phase, the first controller continuously monitors whether there is a peripheral initialization signal (PCIe Option ROM load flag). If there is a peripheral initialization signal, it proceeds to S107. Otherwise, after the memory initialization phase times out, it records the timeout flag in the NVRAM register, records the log of the timeout during the basic input / output system startup phase (memory initialization phase), and performs a motherboard power reset operation.
[0137] S107: The first controller enables the watchdog timer corresponding to the peripheral loading phase and sets the timeout time to the corresponding first startup time.
[0138] S108: The first controller continuously checks the OS loader flag during the first startup time corresponding to the peripheral loading phase. If the flag is detected during the first startup time, it jumps to S109; otherwise, the watchdog timer corresponding to the peripheral loading phase times out, the timeout flag is recorded in the NVRAM register, the timeout log of the peripheral loading phase is recorded, and a motherboard power reset operation is performed.
[0139] S109: The first controller enables the watchdog timer corresponding to the operating system loader loading phase and sets the timeout to the corresponding first startup time.
[0140] S110: The first controller continuously checks the kernel initialization flag during the first startup time corresponding to the operating system loader loading phase. If the kernel initialization flag is detected during the first startup time, it jumps to S111; otherwise, the watchdog timer corresponding to the operating system loader loading phase times out, the timeout flag is recorded in the NVRAM register, the timeout log of the operating system loader loading phase is recorded, and a motherboard power reset operation is performed.
[0141] S111: The first controller enables the watchdog timer corresponding to the operating system running phase and sets the timeout to the corresponding first startup time (which can be 60 seconds).
[0142] S112: The first controller cyclically checks for the watchdog signal during the first startup time corresponding to the operating system running phase. If the watchdog signal is detected within the first startup time, the first controller resets the watchdog timer. Otherwise, the watchdog timer corresponding to the operating system running phase times out, the timeout flag is recorded in the NVRAM register, the timeout log of the operating system running phase is recorded, and a motherboard power reset operation is performed.
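Steps S102 to S110 amount to a small state machine that advances on each completion signal and fails on the first expired watchdog. The sketch below replays a recorded list of signals. The signal names, the `monitor` function, and the 60-second and 120-second limits for the peripheral and loader phases are hypothetical; the 4-minute and 10-minute defaults and the 10 minutes * Flag / 10 rule come from the steps above.

```python
def memory_init_timeout_s(flag=None):
    """S104/S105: 10 minutes by default, or 10 minutes * Flag / 10 if a flag was sent."""
    return 600 if flag is None else 600 * flag / 10

def monitor(events, now, memory_flag=None):
    """Replay (timestamp_s, signal) events sorted by time and report the outcome.

    Returns ("os_running", None), ("timeout", stage) or ("in_progress", stage).
    `now` is the time at which the controller evaluates the still-running timer.
    """
    stages = [
        # (stage, completion signal, watchdog limit in seconds)
        ("cpu_boot", "cpu_reset", 240),
        ("memory_init", "peripheral_init", memory_init_timeout_s(memory_flag)),
        ("peripheral_load", "os_loader", 60),   # placeholder limit
        ("os_loader", "kernel_init", 120),      # placeholder limit
    ]
    index, stage_start = -1, None  # -1: still waiting for CPU power-on (S102)
    for t, signal in events:
        if index == -1:
            if signal == "cpu_power_on":
                index, stage_start = 0, t
            continue
        name, done_signal, limit = stages[index]
        if t - stage_start > limit:
            return ("timeout", name)  # record flag and log, then power reset
        if signal == done_signal:
            index, stage_start = index + 1, t  # disable this watchdog, arm the next
            if index == len(stages):
                return ("os_running", None)    # hand over to S111/S112
    if index >= 0 and now - stage_start > stages[index][2]:
        return ("timeout", stages[index][0])
    return ("in_progress", stages[index][0] if index >= 0 else None)
```

A run that stalls after the CPU reset signal is reported as a memory initialization timeout as soon as that phase's limit passes, without waiting for the remaining phases.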
[0143] Taking the five power-on startup stages described in the above embodiments as examples, this application embodiment illustrates the timing of the basic input / output system during the startup process.
[0144] In the monitoring system of the basic input / output system provided in the embodiments of this application, the process executed by the first basic input / output system (or the second basic input / output system) after startup may include the following steps S201 to S204.
[0145] S201: After the first basic input / output system starts, it checks whether memory initialization training is required. If memory initialization training is required, it proceeds to S202; if memory initialization training is not required, it proceeds to S203.
[0146] S202: The first basic input / output system identifies the memory generation (DDR3, DDR4, DDR5, DDR6, etc.), the capacity of a single memory module (usually 8 GB (gigabytes), 16 GB, 32 GB, 64 GB, etc.), and the number of memory modules. It can determine the memory initialization training time as memory generation * (memory capacity / 8 GB) * number of memory modules * training time of a single memory module (this time needs to be tested on different platforms of the same central processing unit manufacturer). The time required for memory initialization training can be calculated in minutes and sent to the first controller by writing it to a register of the first controller (the memory initialization training flag register).
[0147] S203: When the first basic input / output system (BIOS) boots to the peripheral loading phase, it reads the PCIe Option ROM whitelist from the BIOS memory, calculates how many PCIe devices need to load an Option ROM, and sends the quantity to the first controller by writing it to the Option ROM quantity register. If the whitelist is empty or contains abnormal data, the Option ROM quantity can default to 1 and be sent to the first controller in the same way.
[0148] S204: When the first basic input / output system (BIOS) boots to the operating system loader loading stage, it identifies the current boot device type, such as USB boot disk, HDD (Hard Disk Drive) system disk, PXE system, UEFI shell (UEFI command-line interface), setup, etc., and sets a different watchdog policy for each. These policies may include:
[0149] (1) USB boot disk: disable the watchdog timer during the operating system loader loading phase;
[0150] (2) HDD system disk: enable the watchdog timer during the operating system loader loading phase; different operating system types can also be identified and the corresponding watchdog time set;
[0151] (3) UEFI shell: disable the watchdog timer during the operating system loader loading phase;
[0152] (4) setup: disable the watchdog timer during the operating system loader loading phase;
[0153] (5) PXE: disable the watchdog timer during the operating system loader loading phase.
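The BIOS-side steps S202 to S204 can be sketched as three small helpers. Everything named here is hypothetical: the generation factors, the per-module time, the whitelist layout (device type mapped to quantity), and the device keys. The training-time function follows one reading of the S202 formula, in which the last factor is a per-module time measured on the platform.

```python
# Hypothetical calibration values; real values would be measured per platform.
GENERATION_FACTOR = {"DDR3": 1, "DDR4": 2, "DDR5": 3}
PER_MODULE_MINUTES = 0.5

def training_time_minutes(generation, module_capacity_gb, module_count):
    """S202 (assumed reading): factor * (capacity / 8 GB) * modules * per-module time."""
    return (GENERATION_FACTOR[generation] * (module_capacity_gb / 8)
            * module_count * PER_MODULE_MINUTES)

def option_rom_count(whitelist):
    """S203: total device quantity in the whitelist; 1 if it is empty or abnormal."""
    if not isinstance(whitelist, dict) or not whitelist:
        return 1
    quantities = list(whitelist.values())
    if any(isinstance(q, bool) or not isinstance(q, int) or q < 1 for q in quantities):
        return 1  # abnormal data: fall back to the default quantity
    return sum(quantities)

# S204: among the listed policies, only the HDD system disk keeps the watchdog enabled.
OS_LOADER_WATCHDOG_ENABLED = {
    "usb": False,
    "hdd": True,
    "uefi_shell": False,
    "setup": False,
    "pxe": False,
}

def os_loader_watchdog_enabled(boot_device):
    """Return the watchdog enable flag for the identified boot device type."""
    return OS_LOADER_WATCHDOG_ENABLED[boot_device]
```

The values returned by these helpers correspond to what the BIOS would write to the first controller's registers: a training time in minutes, an Option ROM quantity, and a watchdog enable flag.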
[0154] It should be noted that in most embodiments of this application, the description focuses on the first controller monitoring the startup state of the first basic input / output system. After the first basic input / output system fails to start, the first controller switches to the second basic input / output system. At this time, the process of the first controller monitoring the startup state of the second basic input / output system is the same as the process of the first controller monitoring the startup state of the first basic input / output system.
[0155] Referring to the monitoring system of the basic input / output system described in the above embodiments, the monitoring method of the basic input / output system provided in the embodiments of this application will be described below with reference to the accompanying drawings.
[0156] Figure 3 is a flowchart of a monitoring method for a basic input / output system provided in an embodiment of this application.
[0157] As shown in Figure 3, the monitoring method for the basic input / output system provided in this embodiment of the application, applied to the first controller, may include:
[0158] S301: After power-on, obtain the startup status of the first basic input / output system by communicating with the first basic input / output system;
[0159] S302: If it is detected that the actual startup time in the power-on startup phase of the first basic input / output system exceeds the first startup time corresponding to the power-on startup phase, then it is determined that the first basic input / output system has failed to start;
[0160] S303: After determining that the first basic input / output system has failed to start, control the first switch to switch the first basic input / output system to the second basic input / output system;
[0161] wherein there are multiple power-on startup phases, and the first startup time is determined based on the device configuration parameters corresponding to each power-on startup phase.
[0162] In this embodiment, the boot-up phase may include at least two of the following: central processing unit boot phase, memory initialization phase, peripheral device loading phase, operating system loader loading phase, and operating system running phase.
[0163] In this embodiment of the application, monitoring the CPU startup phase of the first basic input / output system according to the startup status may include: detecting a power-on signal of the CPU to determine that the CPU startup phase has started; and if a CPU reset signal issued by the CPU is detected, determining that the CPU startup phase has been completed.
[0164] In this embodiment of the application, monitoring the memory initialization phase of the first basic input / output system according to the startup state may include: detecting a central processing unit reset signal to determine that the memory initialization phase has started; and detecting a peripheral initialization signal to determine that the memory initialization phase has been completed.
[0165] In this embodiment of the application, monitoring the peripheral loading phase of the first basic input / output system according to the startup state may include: detecting a peripheral initialization signal and determining that the peripheral loading phase has started; if an operating system loader loading signal is detected, determining that the peripheral loading phase has been completed.
[0166] In this embodiment of the application, monitoring the operating system loader loading stage of the first basic input / output system according to the boot state may include: detecting an operating system loader loading signal and determining that the operating system loader loading stage has started; if an operating system kernel initialization signal is detected, determining that the operating system loader loading stage has been completed.
[0167] In this embodiment of the application, monitoring the operating system running phase of the first basic input / output system according to the startup status may include: detecting the operating system kernel initialization signal to determine that the operating system running phase has started; and if the operating system has completed a preset number of running cycles, determining that the operating system running phase has been completed.
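The phase boundaries described in the five preceding paragraphs can be read as a chain in which the completion signal of one phase is the start signal of the next. The following is a minimal sketch of that reading, not the applicant's implementation. The signal names (`cpu_power_on`, `cpu_reset`, and so on) and the `PhaseTracker` class are hypothetical; in particular, `os_cycles_done` is an assumed event standing in for "the operating system has completed a preset number of running cycles".

```python
from enum import Enum
from typing import List, Optional


class Phase(Enum):
    CPU_STARTUP = "central processing unit startup"
    MEMORY_INIT = "memory initialization"
    PERIPHERAL_LOAD = "peripheral loading"
    OS_LOADER_LOAD = "operating system loader loading"
    OS_RUNNING = "operating system running"


# Hypothetical signal names: the source describes the signals only in prose.
# Each phase maps to (start signal, completion signal).
PHASE_SIGNALS = {
    Phase.CPU_STARTUP: ("cpu_power_on", "cpu_reset"),
    Phase.MEMORY_INIT: ("cpu_reset", "peripheral_init"),
    Phase.PERIPHERAL_LOAD: ("peripheral_init", "os_loader_load"),
    Phase.OS_LOADER_LOAD: ("os_loader_load", "os_kernel_init"),
    Phase.OS_RUNNING: ("os_kernel_init", "os_cycles_done"),
}


class PhaseTracker:
    """Follows the first BIOS through its startup phases from observed signals."""

    def __init__(self) -> None:
        self.current: Optional[Phase] = None
        self.completed: List[Phase] = []

    def on_signal(self, signal: str) -> None:
        # A signal may complete the phase that is currently in progress...
        if self.current is not None and PHASE_SIGNALS[self.current][1] == signal:
            self.completed.append(self.current)
            self.current = None
        # ...and the same signal may also start the following phase.
        if self.current is None:
            for phase, (start, _done) in PHASE_SIGNALS.items():
                if start == signal and phase not in self.completed:
                    self.current = phase
                    break


tracker = PhaseTracker()
for sig in ["cpu_power_on", "cpu_reset", "peripheral_init"]:
    tracker.on_signal(sig)
print(tracker.current.value)
```

In this sketch a single signal both closes one phase and opens the next, which matches the way the paragraphs above reuse, for example, the central processing unit reset signal.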
[0168] In this embodiment of the application, obtaining the startup status and identifying that the actual startup time exceeds the first startup time may include: after determining that the first basic input / output system has entered the current power-on startup stage, listening to the output information of the first basic input / output system; if no information indicating that the current power-on startup stage has been completed is received from the first basic input / output system within the corresponding first startup time, the first controller determines that the actual startup time corresponding to the current power-on startup stage exceeds the corresponding first startup time.
[0169] The first controller listening to the output information of the first basic input / output system may include: after the first controller receives the information sent by the first basic input / output system indicating that the power-on startup phase has started, configuring and starting the first timer corresponding to that phase according to the first startup time corresponding to that phase; and after the first controller receives the information sent by the first basic input / output system indicating that the power-on startup phase has been completed, stopping the first timer.
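The listening mode in the two preceding paragraphs amounts to a per-phase watchdog: the timer is armed when the start message arrives and disarmed when the completion message arrives. The sketch below illustrates this under stated assumptions. The class `ListeningMonitor`, the phase name `memory_init`, and the 30-second limit are hypothetical, and time is passed in explicitly as `now` so that the example stays deterministic rather than using a real hardware timer.

```python
from typing import Dict, Optional


class ListeningMonitor:
    """Keeps one 'first timer' for the phase that is currently in progress."""

    def __init__(self, first_startup_time: Dict[str, float]) -> None:
        # Per-phase limits in seconds (illustrative values, not from the source).
        self.first_startup_time = first_startup_time
        self.phase: Optional[str] = None
        self.deadline: Optional[float] = None

    def on_phase_started(self, phase: str, now: float) -> None:
        # Configure and start the first timer for this phase.
        self.phase = phase
        self.deadline = now + self.first_startup_time[phase]

    def on_phase_completed(self, phase: str) -> None:
        # Stop the first timer once the completion message is received.
        if phase == self.phase:
            self.phase = None
            self.deadline = None

    def timed_out(self, now: float) -> bool:
        # True if the timer is still armed and the limit has been reached.
        return self.deadline is not None and now >= self.deadline


monitor = ListeningMonitor({"memory_init": 30.0})
monitor.on_phase_started("memory_init", now=100.0)
print(monitor.timed_out(now=120.0), monitor.timed_out(now=130.0))
```

A `timed_out` result of true corresponds to the first controller determining that the actual startup time of the current phase exceeds the first startup time.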
[0170] In this embodiment of the application, obtaining the startup status and identifying that the actual startup time exceeds the first startup time may further include: after determining that the first basic input / output system has entered the current power-on startup stage, accessing the first basic input / output system to obtain the startup status within the first startup time corresponding to the power-on startup stage; if the startup status is that the first basic input / output system has not completed the current power-on startup stage after the first startup time is reached, then determining that the actual startup time corresponding to the current power-on startup stage exceeds the corresponding first startup time.
[0171] Determining that the first basic input / output system has entered the current boot-up phase may include: accessing the first basic input / output system and obtaining information that the first basic input / output system has completed the previous boot-up phase, and then determining that the first basic input / output system has entered the current boot-up phase.
[0172] After determining that the first basic input / output system has entered the current power-on startup phase, accessing the first basic input / output system to obtain the startup status within the first startup time corresponding to the current power-on startup phase may include: multiplying the first startup time corresponding to the current power-on startup phase by a preset scaling factor to obtain a second startup time; configuring a second timer based on the second startup time and a third timer based on the first startup time, and starting both timers; after the second timer expires, accessing the first basic input / output system to obtain the startup status, and, if the first basic input / output system has not completed the current power-on startup phase, continuing to wait, or, if it has completed the current power-on startup phase, stopping the third timer and determining that the first basic input / output system has entered the next power-on startup phase; and after the third timer expires, accessing the first basic input / output system to obtain the startup status, and, if the first basic input / output system has not completed the current power-on startup phase, determining that the actual startup time exceeds the first startup time, or, if it has completed the current power-on startup phase, determining that the first basic input / output system has entered the next power-on startup phase.
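The two-timer polling procedure above can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: the function name `check_phase` and the callback `phase_completed_at` are hypothetical, and the scaling factor is assumed to lie strictly between 0 and 1 so that the second timer expires before the third, which the text implies but does not state.

```python
from typing import Callable, List, Tuple


def check_phase(
    first_startup_time: float,
    scaling_factor: float,
    phase_completed_at: Callable[[float], bool],
) -> Tuple[str, List[float]]:
    """Return the outcome for one phase and the times at which the BIOS was polled.

    phase_completed_at(t) is a hypothetical stand-in for accessing the first
    BIOS at elapsed time t and asking whether the current phase is complete.
    """
    if not 0.0 < scaling_factor < 1.0:
        # Assumption: the early check must happen before the hard limit.
        raise ValueError("scaling_factor is assumed to be between 0 and 1")

    second_startup_time = first_startup_time * scaling_factor
    polls: List[float] = []

    # Second timer expires: early check.
    polls.append(second_startup_time)
    if phase_completed_at(second_startup_time):
        # The third timer is stopped; the BIOS has entered the next phase.
        return "next_phase", polls

    # Otherwise keep waiting until the third timer expires: final check.
    polls.append(first_startup_time)
    if phase_completed_at(first_startup_time):
        return "next_phase", polls
    return "timeout", polls


# Phase finishes 45 s in, limit 60 s, early check at 30 s.
print(check_phase(60.0, 0.5, lambda t: t >= 45.0))
```

Under these assumptions the controller accesses the first basic input / output system at most twice per phase, and a phase that finishes early is recognized at the second timer rather than only at the full limit.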
[0173] In some optional embodiments of this application, the first controller may be a baseboard management controller; in S301, communicating with the first basic input / output system after power-on to obtain the startup status of the first basic input / output system may include: the baseboard management controller obtaining the startup status through Intelligent Platform Management Interface (IPMI) commands.
[0174] In some alternative embodiments of this application, the first controller may also be a complex programmable logic device; in S301, communicating with the first basic input / output system after power-on to obtain the startup status of the first basic input / output system may further include: the complex programmable logic device receiving startup status information output by the first basic input / output system through an inter-integrated circuit (I2C) bus.
[0175] In some alternative embodiments of this application, the first controller may also be a complex programmable logic device; in S301, communicating with the first basic input / output system after power-on to obtain the startup status of the first basic input / output system may further include: the complex programmable logic device receiving startup status information sent by the baseboard management controller.
[0176] It should be noted that in the embodiments of the monitoring methods for the basic input / output systems of this application, some steps or features may be omitted or not executed. The division of hardware or software functional modules is for ease of explanation and is not the only implementation of the monitoring methods for the basic input / output systems provided in the embodiments of this application.
[0177] The above details various embodiments of the monitoring method for a basic input / output system. Based on this, this application also discloses a monitoring apparatus, device, non-volatile storage medium, and computer program product for a basic input / output system corresponding to the above method.
[0178] The monitoring apparatus for the basic input / output system provided in this application embodiment is applied to the first controller and may include:
[0179] The monitoring unit is configured to obtain the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on;
[0180] The identification unit is configured to determine that the first basic input / output system has failed to start if it detects that the actual startup time in the power-on startup phase of the first basic input / output system exceeds the first startup time corresponding to the power-on startup phase.
[0181] The control unit is configured to control the first switch to switch the first basic input / output system to the second basic input / output system after determining that the first basic input / output system has failed to start.
[0182] There are multiple startup phases, and the first startup time is determined based on the device configuration parameters corresponding to the startup phase.
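The division into a monitoring unit, an identification unit, and a control unit can be illustrated with the following minimal sketch. All names here are hypothetical: `BiosMonitoringApparatus`, the `read_elapsed` callback (standing in for communication with the first basic input / output system), the phase names, and the time limits are assumptions chosen for illustration, and the first switch is modeled as a simple field.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BiosMonitoringApparatus:
    # Per-phase first startup time in seconds (illustrative values).
    first_startup_time: Dict[str, float]
    # Hypothetical stand-in for communicating with the first BIOS:
    # returns the actual startup time observed for a phase.
    read_elapsed: Callable[[str], float]
    # Models the position of the first switch.
    active_bios: str = "first"

    def monitoring_unit(self, phase: str) -> float:
        # Obtain the startup status of the first BIOS for this phase.
        return self.read_elapsed(phase)

    def identification_unit(self, phase: str, actual: float) -> bool:
        # Startup has failed if the actual time exceeds the phase limit.
        return actual > self.first_startup_time[phase]

    def control_unit(self) -> None:
        # Switch from the first BIOS to the second BIOS.
        self.active_bios = "second"

    def run(self, phases: List[str]) -> bool:
        """Return True if every phase finished in time, False after a switch."""
        for phase in phases:
            actual = self.monitoring_unit(phase)
            if self.identification_unit(phase, actual):
                self.control_unit()
                return False
        return True


observed = {"cpu_startup": 2.0, "memory_init": 95.0}
apparatus = BiosMonitoringApparatus(
    first_startup_time={"cpu_startup": 5.0, "memory_init": 60.0},
    read_elapsed=observed.__getitem__,
)
print(apparatus.run(["cpu_startup", "memory_init"]), apparatus.active_bios)
```

Because each phase has its own limit, the switch in this sketch is triggered as soon as one phase overruns, without waiting for a single whole-boot timeout.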
[0183] It should be noted that in the various embodiments of the monitoring apparatus for the basic input / output system provided in this application, the division of units is only a logical functional division, and other division methods can be used. The connection between different units can be electrical, mechanical, or of another type. Separate units can be located in the same physical location or distributed across multiple network nodes. Each unit can be implemented in hardware or as a software functional unit. That is, some or all of the units provided in this application can be selected according to actual needs, and corresponding connection or integration methods can be used to achieve the purpose of the solution in this application.
[0184] Since the embodiments of the apparatus correspond to the embodiments of the method, for the embodiments of the apparatus please refer to the description of the embodiments of the method, which will not be repeated here.
[0185] Figure 4 is a schematic diagram of the structure of a monitoring device for a basic input / output system provided in an embodiment of this application.
[0186] As shown in Figure 4, the monitoring device for a basic input / output system provided in this application embodiment includes: a memory 410 configured to store a computer program 411; and a processor 420 configured to execute the computer program 411. When the computer program 411 is executed by the processor 420, it implements the steps of the basic input / output system monitoring method provided in any of the above embodiments.
[0187] The processor 420 may include one or more processing cores, such as a 3-core processor or an 8-core processor. The processor 420 may be implemented using at least one hardware form selected from a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), and a Programmable Logic Array (PLA). The processor 420 may also include a main processor and a coprocessor. The main processor, also known as a Central Processing Unit (CPU), is configured to process data in the wake-up state; the coprocessor is a low-power processor configured to process data in the standby state. In some embodiments, the processor 420 may integrate a Graphics Processing Unit (GPU), which is configured to render and draw the content required for display on the screen. In some embodiments, the processor 420 may also include an Artificial Intelligence (AI) processor, configured to handle computational operations related to machine learning.
[0188] The memory 410 may include one or more non-volatile storage media, which may be non-transitory. The memory 410 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash memory devices. In this embodiment, the memory 410 is at least configured to store the following computer program 411, wherein, after being loaded and executed by the processor 420, the computer program 411 is able to implement the relevant steps in the monitoring method of the basic input / output system disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 410 may also include an operating system 412 and data 413, and the storage method may be temporary storage or permanent storage. The operating system 412 may be Windows (Microsoft Windows operating system) or other types of operating systems. The data 413 may include, but is not limited to, the data involved in the above methods.
[0189] In some embodiments, the monitoring device for the basic input / output system may further include a display screen 430, a power supply 440, a communication interface 450, an input / output interface 460, a sensor 470, and a communication bus 480.
[0190] Those skilled in the art will understand that the structure shown in Figure 4 does not constitute a limitation on the monitoring device for a basic input / output system, which may include more or fewer components than shown.
[0191] The monitoring device for a basic input / output system provided in this application includes a memory and a processor. When the processor executes the program stored in the memory, it can implement the steps of the monitoring method for a basic input / output system provided in the above embodiments, and the effect is the same as above.
[0192] This application provides a non-volatile storage medium storing a computer program thereon. When executed by a processor, the computer program can implement the steps of the monitoring method for a basic input / output system as provided in any of the above embodiments.
[0193] The non-volatile storage medium may include: USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks or optical disks, and other media that can store program code.
[0194] For a description of the non-volatile storage medium provided in the embodiments of this application, please refer to the above method embodiments. The effect it achieves is the same as the monitoring method of the basic input / output system provided in the embodiments of this application, and will not be repeated here.
[0195] This application provides a computer program product, including a computer program that, when executed by a processor, implements the steps of a monitoring method for a basic input / output system as provided in any of the above embodiments.
[0196] For a description of the computer program product provided in the embodiments of this application, please refer to the above method embodiments. The effects it achieves are the same as those of the basic input / output system monitoring method provided in the embodiments of this application, and will not be repeated here.
[0197] The foregoing provides a detailed description of a monitoring method, apparatus, device, and non-volatile storage medium for a basic input / output system provided in this application. The various embodiments are described in a progressive manner, with each embodiment focusing on its differences from other embodiments. For similar or identical parts, the embodiments may be referred to one another. Since the apparatus, device, non-volatile storage medium, and computer program product disclosed in the embodiments correspond to the methods disclosed in the embodiments, their descriptions are relatively brief, and for the relevant parts reference may be made to the method section. It should be noted that those skilled in the art can make several improvements and modifications to this application without departing from the principles of this application, and these improvements and modifications also fall within the protection scope of this application.
[0198] It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element.
Claims
1. A monitoring system for a basic input / output system, characterized in that it includes a first controller and a first switch; the first controller is configured to obtain the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on, and, if it is detected that the actual startup time in a power-on startup phase of the first basic input / output system exceeds the first startup time corresponding to that power-on startup phase, to determine that the first basic input / output system has failed to start and to control the first switch to switch the first basic input / output system to the second basic input / output system; wherein there are multiple power-on startup phases, and the first startup time is determined according to the device configuration parameters corresponding to the power-on startup phase.
2. The monitoring system for the basic input / output system according to claim 1, characterized in that, The boot-up phase includes at least two of the following: central processing unit boot phase, memory initialization phase, peripheral loading phase, operating system loader loading phase, and operating system running phase.
3. The monitoring system for the basic input / output system according to claim 2, characterized in that the first controller monitoring the central processing unit startup phase of the first basic input / output system according to the startup status includes: upon detecting a power-on signal of the central processing unit, determining that the central processing unit startup phase has started; and if a central processing unit reset signal issued by the central processing unit is detected, determining that the central processing unit startup phase has been completed.
4. The monitoring system for the basic input / output system according to claim 2, characterized in that the first controller monitoring the memory initialization phase of the first basic input / output system according to the startup status includes: upon detecting a central processing unit reset signal, determining that the memory initialization phase has started; and if a peripheral initialization signal is detected, determining that the memory initialization phase has been completed.
5. The monitoring system for the basic input / output system according to claim 2, characterized in that the first controller monitoring the peripheral loading phase of the first basic input / output system according to the startup status includes: upon detecting a peripheral initialization signal, determining that the peripheral loading phase has started; and if an operating system loader loading signal is detected, determining that the peripheral loading phase has been completed.
6. The monitoring system for the basic input / output system according to claim 2, characterized in that the first controller monitoring the operating system loader loading phase of the first basic input / output system according to the startup status includes: upon detecting an operating system loader loading signal, determining that the operating system loader loading phase has started; and if an operating system kernel initialization signal is detected, determining that the operating system loader loading phase has been completed.
7. The monitoring system for the basic input / output system according to claim 2, characterized in that the first controller monitoring the operating system running phase of the first basic input / output system according to the startup status includes: upon detecting the operating system kernel initialization signal, determining that the operating system running phase has started; and if the operating system is detected to have completed a preset number of running cycles, determining that the operating system running phase has been completed.
8. The monitoring system for the basic input / output system according to claim 1, characterized in that, The first controller acquires the startup status and identifies that the actual startup time exceeds the first startup time, including: After determining that the first basic input / output system has entered the current power-on startup phase, the first controller listens to the output information of the first basic input / output system. If no information indicating the completion of the current startup phase is received from the first basic input / output system within the corresponding first startup time, the first controller determines that the actual startup time corresponding to the current startup phase exceeds the corresponding first startup time.
9. The monitoring system for the basic input / output system according to claim 8, characterized in that the first controller listening to the output information of the first basic input / output system includes: after the first controller receives the information sent by the first basic input / output system indicating that the power-on startup phase has started, configuring and starting the first timer corresponding to the power-on startup phase according to the first startup time corresponding to the power-on startup phase; and after the first controller receives the information sent by the first basic input / output system indicating that the power-on startup phase has been completed, stopping the first timer.
10. The monitoring system for the basic input / output system according to claim 1, characterized in that, The first controller acquires the startup status and identifies that the actual startup time exceeds the first startup time, including: After determining that the first basic input / output system has entered the current power-on startup phase, the first controller accesses the first basic input / output system within the first startup time corresponding to the power-on startup phase to obtain the startup status. If, after the first startup time is reached, the startup state is that the first basic input / output system has not completed the current power-on startup phase, then the first controller determines that the actual startup time corresponding to the current power-on startup phase exceeds the corresponding first startup time.
11. The monitoring system for the basic input / output system according to claim 10, characterized in that, The first controller determines that the first basic input / output system has entered the current power-on startup phase, including: After the first controller accesses the first basic input / output system and obtains that the first basic input / output system has completed the previous power-on startup phase, it determines that the first basic input / output system has entered the current power-on startup phase.
12. The monitoring system for the basic input / output system according to claim 10, characterized in that, After determining that the first basic input / output system has entered the current power-on startup phase, the first controller accesses the first basic input / output system within the first startup time corresponding to the power-on startup phase to obtain the startup state, including: After determining that the first basic input / output system has entered the current power-on startup phase, the first controller multiplies the first startup time corresponding to the current power-on startup phase by a preset scaling factor to obtain the second startup time. The first controller configures a second timer based on the second startup time, configures a third timer based on the first startup time, and starts the second timer and the third timer; After the second timer expires, the first controller accesses the first basic input / output system to obtain the startup status. If the first basic input / output system has not completed the current power-on startup stage, it continues to wait. If the first basic input / output system has completed the current power-on startup stage, it closes the third timer and determines that the first basic input / output system enters the next power-on startup stage. When the third timer expires, the first controller accesses the first basic input / output system to obtain the startup status. If the first basic input / output system has not completed the current power-on startup phase, it is determined that the actual startup time exceeds the first startup time. If the first basic input / output system has completed the current power-on startup phase, it is determined that the first basic input / output system enters the next power-on startup phase.
13. The monitoring system for the basic input / output system according to claim 1, characterized in that, The first controller is a baseboard management controller; After power-on, the first controller communicates with the first basic input / output system to obtain the startup status of the first basic input / output system, including: The baseboard management controller obtains the startup status through intelligent platform management interface commands.
14. The monitoring system for the basic input / output system according to claim 1, characterized in that the first controller is a complex programmable logic device; and the first controller communicating with the first basic input / output system after power-on to obtain the startup status of the first basic input / output system includes: the complex programmable logic device receiving the startup status information output by the first basic input / output system via an inter-integrated circuit (I2C) bus.
15. The monitoring system for the basic input / output system according to claim 1, characterized in that the first controller is a complex programmable logic device; and the first controller communicating with the first basic input / output system after power-on to obtain the startup status of the first basic input / output system includes: the complex programmable logic device receiving the startup status information sent by the baseboard management controller.
16. A monitoring method for a basic input / output system, characterized in that it is applied to a first controller and includes: after power-on, obtaining the startup status of the first basic input / output system by communicating with the first basic input / output system; if it is detected that the actual startup time in a power-on startup phase of the first basic input / output system exceeds the first startup time corresponding to that power-on startup phase, determining that the first basic input / output system has failed to start; and after determining that the first basic input / output system has failed to start, controlling the first switch to switch the first basic input / output system to the second basic input / output system; wherein there are multiple power-on startup phases, and the first startup time is determined according to the device configuration parameters corresponding to the power-on startup phase.
17. A monitoring apparatus for a basic input / output system, characterized in that it is applied to a first controller and includes: a monitoring unit configured to obtain the startup status of the first basic input / output system by communicating with the first basic input / output system after power-on; an identification unit configured to determine that the first basic input / output system has failed to start if it detects that the actual startup time in a power-on startup phase of the first basic input / output system exceeds the first startup time corresponding to that power-on startup phase; and a control unit configured to control a first switch to switch the first basic input / output system to a second basic input / output system after determining that the first basic input / output system has failed to start; wherein there are multiple power-on startup phases, and the first startup time is determined according to the device configuration parameters corresponding to the power-on startup phase.
18. A monitoring device for a basic input / output system, characterized in that it includes: a memory configured to store a computer program; and a processor configured to execute the computer program, wherein the computer program, when executed by the processor, implements the steps of the monitoring method for the basic input / output system as described in claim 16.
19. A non-volatile storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by the processor, it implements the steps of the monitoring method for the basic input / output system as described in claim 16.
20. A computer program product, comprising a computer program, characterized in that, When the computer program is executed by the processor, it implements the steps of the monitoring method for the basic input / output system as described in claim 16.