Method and device for server self-healing
The invention relates to servers and to the handling of abnormality information, and is applied in the server field. It addresses problems such as latent system hazards, limited improvement in system reliability, and business interruption. Its effects are to reduce the need for manual on-site intervention and operation and to restore the server to its normal working state.
Examples
Embodiment 1
[0055] As shown in Figure 1, the server management system structure includes an SMM and several slave nodes, that is, the BMC on each server; each server board also has a BIOS. The SMM is connected to the BMC of each server through IPMB (Intelligent Platform Management Bus), LAN (Local Area Network), or other means. The BMC and the BIOS can communicate through various types of physical channels. This system structure provides a physical channel for the SMM to manage server memory exceptions. In the server system, servers mostly use memory that supports the ECC function, which provides the hardware prerequisite for timely detection of memory abnormalities. The main function of B / C is to configure how the BMC handles memory exceptions, for example by configuring a policy such as restarting the board and isolating the fault when the frequency of a recoverable memory fault on a given memory module is greater than a certain threshol...
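The threshold policy mentioned above can be illustrated with a minimal Python sketch. It is not taken from the patent: the class names, field names, and the use of a sliding time window to measure fault frequency are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class MemoryFaultPolicy:
    """Hypothetical policy record; field names and defaults are illustrative."""
    max_faults: int = 3            # threshold on the number of faults in the window
    window_seconds: float = 60.0   # assumed window over which frequency is measured


@dataclass
class ModuleFaultTracker:
    """Tracks recoverable faults for one memory module (assumed structure)."""
    policy: MemoryFaultPolicy
    timestamps: deque = field(default_factory=deque)

    def record(self, now: float) -> bool:
        """Record one recoverable fault at time `now` (seconds).

        Returns True when the fault count inside the window is greater than
        the threshold, i.e. when the configured action (restart the board and
        isolate the fault) should be taken.
        """
        self.timestamps.append(now)
        # Drop faults that have fallen out of the sliding window.
        while self.timestamps and now - self.timestamps[0] > self.policy.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps) > self.policy.max_faults
```

A caller would keep one tracker per memory module and act only when `record` returns True.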
Embodiment 2
[0100] Figure 4 is a flow chart of the server self-healing method in Embodiment 2 of the present invention, in which:
[0101] The BIOS is responsible for detecting memory abnormalities. It can distinguish between recoverable one-bit ECC errors and unrecoverable two-bit ECC errors, and can locate the fault to a specific physical memory module. When the system starts up again after self-healing, the BIOS can isolate the abnormal memory module so that it is no longer used.
[0102] The BMC is responsible for forwarding the memory exception reported by the BIOS to the SMM, and for reporting the faulty memory module information to the BIOS (basic input/output system) when the server is powered on again.
[0103] The SMM receives the memory fault information forwarded by the out-of-band management module, keeps separate abnormality counts for each memory module, and decides whether to perform self-healing processing on the specified abnormal board according to the seriou...
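As a rough illustration of the BIOS role described in [0101], the following Python sketch classifies an ECC error by the number of erroneous bits and filters out isolated modules at start-up. The type names, function names, and module identifiers are hypothetical; real BIOS firmware would not be written this way.

```python
from dataclasses import dataclass
from enum import Enum


class EccSeverity(Enum):
    RECOVERABLE = "recoverable"      # one-bit ECC error
    UNRECOVERABLE = "unrecoverable"  # two-bit ECC error


@dataclass(frozen=True)
class MemoryFaultEvent:
    """Hypothetical event record: which module failed and how badly."""
    module_id: str
    severity: EccSeverity


def classify(bit_errors: int, module_id: str) -> MemoryFaultEvent:
    """Map the erroneous bit count to the two severities named in the text."""
    if bit_errors == 1:
        return MemoryFaultEvent(module_id, EccSeverity.RECOVERABLE)
    if bit_errors == 2:
        return MemoryFaultEvent(module_id, EccSeverity.UNRECOVERABLE)
    # Other counts are not described in the source, so they are rejected here.
    raise ValueError(f"unsupported bit error count: {bit_errors}")


def usable_modules(installed: list[str], isolated: set[str]) -> list[str]:
    """At start-up, skip modules that have been marked as isolated."""
    return [m for m in installed if m not in isolated]
```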
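The two BMC duties in [0102] can be sketched as a small relay object. This is only an assumed model: the `send_to_smm` callable stands in for the IPMB or LAN transport, and the `mark_faulty` step is a hypothetical way for the BMC to learn which modules to report, since the excerpt does not say how that list is populated.

```python
from typing import Callable


class BmcRelay:
    """Illustrative model of the BMC's forwarding and reporting roles."""

    def __init__(self, send_to_smm: Callable[[dict], None]) -> None:
        self._send_to_smm = send_to_smm        # hypothetical transport hook
        self._faulty_modules: set[str] = set() # modules to report at power-on

    def on_bios_exception(self, module_id: str, recoverable: bool) -> None:
        """Forward a memory exception reported by the BIOS to the SMM."""
        self._send_to_smm({"module_id": module_id, "recoverable": recoverable})

    def mark_faulty(self, module_id: str) -> None:
        """Remember a module as faulty (assumed step, not in the source)."""
        self._faulty_modules.add(module_id)

    def on_power_on(self) -> list[str]:
        """Return the faulty-module list handed to the BIOS at power-on."""
        return sorted(self._faulty_modules)
```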
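The per-module statistics kept by the SMM might look like the Python sketch below. Because the paragraph above is truncated, the decision rule used here is an assumption rather than the patent's rule: any unrecoverable fault triggers self-healing, as does a recoverable-fault count on one module that exceeds a configurable threshold. All names are hypothetical.

```python
from collections import Counter


class SmmFaultStats:
    """Illustrative per-module abnormality counter for the SMM."""

    def __init__(self, recoverable_threshold: int = 3) -> None:
        self.recoverable_threshold = recoverable_threshold  # assumed parameter
        self.counts: Counter = Counter()  # keyed by (board_id, module_id)

    def report(self, board_id: str, module_id: str, recoverable: bool) -> bool:
        """Record one forwarded fault; return True if the board should self-heal.

        Assumed rule: unrecoverable faults always trigger self-healing, while
        recoverable faults trigger it once the count for that module is
        greater than the threshold.
        """
        if not recoverable:
            return True
        key = (board_id, module_id)
        self.counts[key] += 1
        return self.counts[key] > self.recoverable_threshold
```

Counting by (board, module) pair keeps faults on different memory modules from adding up to a single threshold.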



