Liquid cooling device fault detection method and device, computer device and storage medium

By introducing bit synchronization and interrupt signals between the BMC and CPLD, the problem of not being able to obtain node signals when communication between the BMC and CPLD is interrupted is solved, enabling rapid fault location and maintenance.

CN117687875B · Active · Publication Date: 2026-06-30 · INSPUR SUZHOU INTELLIGENT TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
INSPUR SUZHOU INTELLIGENT TECH CO LTD
Filing Date
2023-12-29
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

When the I2C polling link between the BMC and CPLD is interrupted, the BMC has difficulty obtaining the leakage and presence signals of each node of the server in a timely manner, and cannot determine the node where the faulty liquid cooling equipment is located, which is not conducive to the maintenance of the liquid cooling equipment.

Method used

By introducing bit synchronization and interrupt signals between the BMC and CPLD, the CPLD can transmit the leakage and presence information of all nodes to the BMC at once, so the BMC does not have to poll the registers of each node multiple times via I2C.

Benefits of technology

Even when the I2C polling link between the BMC and CPLD is interrupted, the BMC can still obtain the leakage and presence signals of all nodes in a timely and efficient manner, helping maintenance personnel quickly locate the faulty liquid cooling equipment.

✦ Generated by Eureka AI based on patent content.

Figure CN117687875B_ABST

Abstract

This invention relates to the field of liquid cooling technology, and discloses a method, apparatus, computer equipment, and storage medium for fault detection of liquid cooling equipment. The method includes: upon receiving a first target signal and a bit synchronization signal sent by a CPLD, obtaining a first sampling point based on the bit synchronization signal; obtaining presence information based on the first sampling point and the first target signal; upon receiving an interrupt signal, a second target signal, and a bit synchronization signal sent by the CPLD, obtaining a second sampling point based on the bit synchronization signal; obtaining leakage information based on the second sampling point and the second target signal; determining the target node where the faulty liquid cooling equipment is located based on the presence information or the leakage information, and outputting the target node. This solves the problem that when the I2C polling link between the BMC and the CPLD is interrupted, the BMC has difficulty obtaining leakage signals and presence signals from each node of the server in a timely manner, making it impossible to determine the node where the faulty liquid cooling equipment is located, which is detrimental to the maintenance of the liquid cooling equipment.

Description

Technical Field

[0001] This invention relates to the field of liquid cooling technology, specifically to a method, apparatus, computer equipment, and storage medium for detecting faults in liquid cooling equipment.

Background Technology

[0002] As servers become increasingly powerful, the power consumption of a single server also increases, leading to higher demands for heat dissipation. Traditional air cooling can no longer meet the requirements of high cost-effectiveness and low PUE (Power Usage Effectiveness) for server cooling. Liquid cooling technology has become the main method for server heat dissipation. It is essential to manage the server's liquid cooling equipment, detect and promptly address any malfunctions such as leaks, and prevent any disruption to the server's normal operation.

[0003] Currently, for servers with multiple nodes, the Baseboard Management Controller (BMC) needs to use an I2C (Inter-Integrated Circuit) polling link to read data from the registers on the mainboard's Complex Programmable Logic Device (CPLD) that store the leakage signals and the leakage detection line presence signals for each node. This allows the BMC to obtain the leakage and presence signals for each node. If the I2C polling link between the BMC and the CPLD malfunctions, the BMC may be unable to identify nodes experiencing leaks or other faults and output maintenance logs. Furthermore, this I2C polling link is also used to transmit other data, such as alarms and presence information from the Power Supply Unit (PSU). When this I2C polling link is occupied, the BMC cannot obtain leakage and presence signals in a timely manner. Moreover, when the number of server nodes increases or decreases, the BMC must be modified to change the number of node registers it polls.

[0004] Therefore, the related art has the problem that when the I2C polling link between the BMC and CPLD is interrupted, the BMC has difficulty obtaining the leakage and presence signals of each node of the server in a timely manner, and cannot determine the node where the faulty liquid cooling equipment is located, which is not conducive to the maintenance of the liquid cooling equipment.

Summary of the Invention

[0005] In view of this, the present invention provides a liquid cooling equipment fault detection method, device, computer equipment and storage medium to solve the problem that when the I2C polling link between the BMC and CPLD is interrupted, the BMC has difficulty obtaining the leakage signal and presence signal of each node of the server in a timely manner, and cannot determine the node where the faulty liquid cooling equipment is located, which is not conducive to the maintenance of the liquid cooling equipment.

[0006] In a first aspect, the present invention provides a method for detecting faults in liquid cooling equipment, which is applied to a BMC (Baseboard Management Controller), and the method includes:

[0007] Upon receiving the first target signal and bit synchronization signal sent by the CPLD, the first sampling point is obtained based on the bit synchronization signal. The first target signal includes the presence information of the leakage detection module; the leakage detection module is used to obtain leakage information of the liquid cooling equipment in each node of the server.

[0008] Presence information is obtained based on the first sampling point and the first target signal;

[0009] Upon receiving the interrupt signal, the second target signal, and the bit synchronization signal sent by the CPLD, the second sampling point is obtained based on the bit synchronization signal. The interrupt signal is used to instruct the BMC to receive the second target signal, which contains leakage information.

[0010] Leakage information is obtained based on the second sampling point and the second target signal;

[0011] The target node where the faulty liquid cooling device is located is determined based on the presence information or leakage information, and the target node is output.

[0012] The liquid cooling equipment fault detection method provided in this embodiment involves the BMC determining a first sampling point based on a bit synchronization signal, and then collecting presence information from a first target signal based on the first sampling point. Upon receiving an interrupt signal, the BMC receives a bit synchronization signal and a second target signal, determines a second sampling point based on the bit synchronization signal, and then collects leakage information from the second target signal based on the second sampling point. The CPLD transmits all leakage and presence information from all nodes to the BMC at once, avoiding the BMC having to poll the corresponding registers of each node multiple times via I2C. The BMC determines the target node where the faulty liquid cooling equipment is located based on the presence and leakage information, facilitating subsequent maintenance by maintenance personnel. This solves the problem in related technologies where, when the I2C polling link between the BMC and CPLD is interrupted, the BMC struggles to obtain leakage and presence signals from each server node in a timely manner, making it impossible to determine the node where the faulty liquid cooling equipment is located, which is detrimental to the maintenance of the liquid cooling equipment.

[0013] In one optional implementation, obtaining the presence information based on the first sampling point and the first target signal includes:

[0014] Based on the first sampling point and the first target signal, a first number of first level values are obtained;

[0015] A first start flag and a first stop flag are determined from a first number of first level values;

[0016] The first level value between the first start flag and the first stop flag is taken as the first target level value, and the sequence number of the first target level value is obtained;

[0017] The number of nodes in the server is obtained based on the first target level value whose sequence number is equal to the preset sequence number.

[0018] Obtain the number of communication links between the CPLD and BMC;

[0019] When the number of communication links exceeds a preset threshold, the presence information is obtained based on the first target level value and the number of nodes;

[0020] When the number of communication links equals the preset threshold, the presence information is obtained based on the first target level value, the number of nodes, and the preset information distribution rules.

[0021] In this embodiment, first level values are collected from the first target signal based on the first sampling point, and the number of nodes is obtained from the first level values. Each time the CPLD transmits a signal, it informs the BMC of the number of nodes the server includes. Even if the number of nodes increases or decreases, the BMC does not need further development or modification and can still obtain the presence and leakage information. Furthermore, based on the number of communication links between the CPLD and the BMC, the BMC determines whether to use a preset information distribution rule to obtain the presence information from the first target signal. This ensures that, regardless of the number of communication links, the presence information of all nodes can be transmitted in a timely and efficient manner through the first target signal.
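The decoding steps in [0014] to [0017] can be illustrated with a minimal Python sketch. The source does not specify the flag patterns, the width of the node-count field, or the bit order, so `START_FLAG`, `STOP_FLAG`, `COUNT_BITS`, and the function name `decode_presence` are all hypothetical. The sketch assumes the node count occupies the first payload positions (the "preset sequence numbers") and is followed by one presence bit per node.

```python
from typing import List, Sequence, Tuple

# All constants below are assumptions for illustration; the source does not define them.
START_FLAG = (1, 0, 1, 0)  # assumed start-flag pattern
STOP_FLAG = (0, 1, 0, 1)   # assumed stop-flag pattern
COUNT_BITS = 4             # assumed width of the node-count field, MSB first


def decode_presence(levels: Sequence[int]) -> Tuple[int, List[int]]:
    """Return (node_count, presence_bits) from a list of sampled level values."""
    flag_len = len(START_FLAG)
    # Locate the first occurrence of the start flag in the sampled levels.
    start = next(
        (i for i in range(len(levels) - flag_len + 1)
         if tuple(levels[i:i + flag_len]) == START_FLAG),
        None,
    )
    if start is None:
        raise ValueError("start flag not found")

    pos = start + flag_len
    count_field = levels[pos:pos + COUNT_BITS]
    if len(count_field) < COUNT_BITS:
        raise ValueError("frame truncated before node count")
    # The level values at the preset sequence numbers encode the node count.
    node_count = int("".join(str(b) for b in count_field), 2)

    pos += COUNT_BITS
    presence = list(levels[pos:pos + node_count])
    pos += node_count
    # The stop flag must follow the payload directly; otherwise the frame is invalid.
    if tuple(levels[pos:pos + len(STOP_FLAG)]) != STOP_FLAG:
        raise ValueError("stop flag not found where expected")
    return node_count, presence
```

Because the node count travels inside every frame, the decoder in this sketch needs no change when nodes are added or removed, which mirrors the benefit described in [0021].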

[0022] In one optional implementation, obtaining leakage information based on the second sampling point and the second target signal includes:

[0023] Based on the second sampling point and the second target signal, a second number of second level values are obtained;

[0024] The second start flag and the second stop flag are determined from the second number of second level values;

[0025] The second level value between the second start flag and the second stop flag is taken as the second target level value;

[0026] The number of nodes in the server is obtained from the second target level value, and leakage information is obtained based on the second target level value and the number of nodes.

[0027] In this embodiment, second level values are acquired from the second target signal based on the second sampling point. The number of nodes is obtained from the second level values, and leakage information is obtained based on the second target level value and the number of nodes. The number of nodes is communicated to the BMC in real time, ensuring that the BMC can obtain both presence and leakage information even if the number of nodes increases or decreases. Leakage information from all nodes is transmitted to the BMC at once, so the BMC does not need to repeatedly poll the corresponding registers of each node via the I2C polling link.

[0028] In one optional implementation, the target node where the faulty liquid cooling device is located is determined based on the presence information or leakage information, and the target node is output, including:

[0029] Determine whether there is any presence information with a value equal to the first preset value;

[0030] If it exists, the presence information with a value equal to the first preset value is taken as the target presence information, and the node corresponding to the target presence information is taken as the first target node. The first preset value is used to determine whether the leakage detection module corresponding to the presence information is absent, and the leakage detection module in the first target node is absent.

[0031] Write the first target node to the maintenance log. The maintenance log is used to output target nodes, which include the first target node and the second target node.

[0032] Determine if there is any leakage information with a value equal to the second preset value;

[0033] If it exists, the leakage information with a value equal to the second preset value is taken as the target leakage information, and the node corresponding to the target leakage information is taken as the second target node. The second preset value is used to determine whether the liquid cooling equipment corresponding to the leakage information is leaking, and the liquid cooling equipment in the second target node is leaking.

[0034] Write the second target node into the maintenance log.

[0035] In this embodiment, the BMC determines whether all leakage detection modules are present using the first preset value and the presence information, and determines whether each liquid cooling device is leaking using the second preset value and the leakage information. The nodes where leakage detection modules are absent and the nodes where liquid cooling devices are leaking are written into the maintenance log, which helps maintenance personnel quickly locate the problem and perform maintenance.
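As a rough illustration of [0029] to [0034], the following Python sketch compares each node's presence and leakage value against a preset value and collects the matching nodes as log entries. The concrete preset values (`absent_value=0`, `leak_value=1`), the log wording, and the function name `build_maintenance_log` are assumptions; the source only states that such preset values exist.

```python
from typing import List, Sequence


def build_maintenance_log(
    presence: Sequence[int],
    leakage: Sequence[int],
    absent_value: int = 0,  # assumed "first preset value": module absent
    leak_value: int = 1,    # assumed "second preset value": equipment leaking
) -> List[str]:
    """Return one log line per node whose value equals a preset value."""
    log: List[str] = []
    # First target nodes: leakage detection module is absent.
    for node, value in enumerate(presence):
        if value == absent_value:
            log.append(f"node {node}: leakage detection module not present")
    # Second target nodes: liquid cooling equipment is leaking.
    for node, value in enumerate(leakage):
        if value == leak_value:
            log.append(f"node {node}: liquid cooling equipment leaking")
    return log
```

The list index stands in for the node identifier here, relying on the one-to-one correspondence between information bits and server nodes.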

[0036] In a second aspect, the present invention provides a fault detection method for liquid cooling equipment, which is applied to a CPLD, and the method includes:

[0037] Obtain leakage information of the liquid cooling equipment and presence information of the leakage detection module in each node of the server. The leakage detection module is used to obtain leakage information.

[0038] Obtain the number of communication links between the CPLD and BMC;

[0039] The target communication link is determined based on the number of communication links. The first target signal and the bit synchronization signal are sent to the BMC through the target communication link. Alternatively, the interrupt signal, the second target signal, and the bit synchronization signal are sent to the BMC through the target communication link. The first target signal contains presence information, the second target signal contains leakage information, the interrupt signal is used to instruct the BMC to receive the second target signal, and the bit synchronization signal is used to instruct the BMC to obtain presence information from the first target signal or to obtain leakage information from the second target signal.

[0040] The liquid cooling equipment fault detection method provided in this embodiment transmits the presence information of all nodes to the BMC at once through the first target signal and the leakage information of all nodes through the second target signal, so the BMC does not need to poll the corresponding registers of each node multiple times via I2C. At the same time, the CPLD determines the target communication link based on the number of communication links and sends a bit synchronization signal, instructing the BMC to obtain presence information from the first target signal or leakage information from the second target signal. This ensures that, regardless of the number of communication links, the CPLD can transmit presence and leakage information to the BMC. This solves the problem in the related art where, when the I2C polling link between the BMC and CPLD is interrupted, the BMC cannot promptly obtain leakage and presence signals from each node of the server, making it impossible to determine the node where the faulty liquid cooling equipment is located, which is detrimental to the maintenance of the liquid cooling equipment.
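The idea of sending the information of all nodes in a single transmission can be sketched in Python as a simple frame builder on the CPLD side. The frame layout used here (start flag, node count, one bit per node, stop flag), the flag patterns, the 4-bit count width, and the name `build_frame` are assumptions made for illustration only; the source does not define the frame format at this point.

```python
from typing import List, Sequence


def build_frame(
    bits: Sequence[int],
    start_flag: Sequence[int] = (1, 0, 1, 0),  # assumed start-flag pattern
    stop_flag: Sequence[int] = (0, 1, 0, 1),   # assumed stop-flag pattern
    count_bits: int = 4,                       # assumed node-count width, MSB first
) -> List[int]:
    """Pack one bit per node (presence or leakage) into a single frame."""
    node_count = len(bits)
    if node_count >= 1 << count_bits:
        raise ValueError("too many nodes for the node-count field")
    if any(b not in (0, 1) for b in bits):
        raise ValueError("each node value must be 0 or 1")
    # Encode the node count so the receiver can adapt to added or removed nodes.
    count_field = [(node_count >> shift) & 1
                   for shift in range(count_bits - 1, -1, -1)]
    return list(start_flag) + count_field + list(bits) + list(stop_flag)
```

The same builder could serve for both the first target signal (presence bits) and the second target signal (leakage bits), since both carry one value per node.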

[0041] In an optional implementation, before sending the interrupt signal, the second target signal, and the bit synchronization signal to the BMC via the target communication link, the method further includes:

[0042] A bit synchronization signal is generated based on the initial representative data and the preset clock period. The information in the bit synchronization signal after the initial representative data is generated based on the preset clock period.

[0043] Determine if there is any leakage information with a value equal to the second preset value. If so, generate an interrupt signal according to the preset interrupt flag.

[0044] In this embodiment, the CPLD generates a bit synchronization signal and sends it to the BMC, notifying the BMC of the sampling points at which to sample, and generates an interrupt signal instructing the BMC to receive the second target signal, thereby realizing the transmission of the number of nodes, the presence information, and the leakage information.
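A minimal Python sketch of [0042] and [0043] follows, modelling signals as lists of 0/1 levels per time step. It assumes the bit synchronization signal is the initial representative data followed by a square-wave clock with the preset period, and that the second preset value is 1. The function names, the 50% duty cycle, and the default values are hypothetical.

```python
from typing import List, Sequence


def make_bit_sync(initial: Sequence[int], n_bits: int, period: int = 2) -> List[int]:
    """Initial representative data followed by n_bits clock cycles.

    Each cycle lasts `period` time steps: high for the first half, low for the rest
    (an assumed duty cycle).
    """
    if period < 2:
        raise ValueError("period must cover at least one high and one low step")
    high = period // 2
    one_cycle = [1] * high + [0] * (period - high)
    return list(initial) + one_cycle * n_bits


def needs_interrupt(leakage: Sequence[int], leak_value: int = 1) -> bool:
    """True if any node's leakage value equals the (assumed) second preset value."""
    return any(value == leak_value for value in leakage)
```

In this model the CPLD would raise the interrupt only when `needs_interrupt` returns true, and would then clock out the leakage frame alongside the signal returned by `make_bit_sync`.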

[0045] In one optional implementation, a target communication link is determined based on the number of communication links, and a first target signal and a bit synchronization signal are sent to the BMC through the target communication link; alternatively, an interrupt signal, a second target signal, and a bit synchronization signal are sent to the BMC through the target communication link, including:

[0046] When the number of communication links equals a preset threshold, a first target communication link is determined, wherein the first target communication link includes a first communication link, a second communication link, and a third communication link;

[0047] A first target signal is generated based on the presence information and the first timing scheme, and a bit synchronization signal is sent to the BMC through the first communication link, and the first target signal is sent to the BMC through the second communication link;

[0048] Determine whether there is leakage information with a value equal to the second preset value. If so, send an interrupt signal to the BMC through the third communication link, generate a second target signal based on the leakage information and the first timing scheme, send a bit synchronization signal to the BMC through the first communication link, and send the second target signal to the BMC through the second communication link.

[0049] If the number of communication links exceeds a preset threshold, a second target communication link is determined, wherein the second target communication link includes a fourth communication link, a fifth communication link, a sixth communication link, and a seventh communication link.

[0050] A first target signal is generated based on the presence information and the second timing scheme, and a bit synchronization signal is sent to the BMC through the fourth communication link, and the first target signal is sent to the BMC through the fifth communication link.

[0051] Determine whether there is leakage information with a value equal to the second preset value. If so, send an interrupt signal to the BMC through the sixth communication link, generate a second target signal based on the leakage information and the third timing scheme, send a bit synchronization signal to the BMC through the fourth communication link, and send the second target signal to the BMC through the seventh communication link.

[0052] In this embodiment, the target communication link for transmitting signals is determined based on the number of communication links and a preset threshold, so that the CPLD can transmit presence information and leakage information to the BMC regardless of the number of communication links.
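The link selection in [0046] to [0051] can be summarized as a small Python lookup. The threshold value of 3 is an inference from the three links listed in [0046] and is not stated in the source; the function name `plan_links` and the role names are likewise hypothetical.

```python
from typing import Dict


def plan_links(num_links: int, threshold: int = 3) -> Dict[str, str]:
    """Map each signal role to the communication link that carries it.

    threshold=3 is an assumption inferred from the three-link case in the text.
    """
    if num_links == threshold:
        # Presence and leakage frames share the second link; the interrupt
        # on the third link announces a leakage frame.
        return {
            "bit_sync": "first link",
            "presence": "second link",
            "leakage": "second link",
            "interrupt": "third link",
        }
    if num_links > threshold:
        # With a spare link, presence and leakage each get a dedicated data link.
        return {
            "bit_sync": "fourth link",
            "presence": "fifth link",
            "interrupt": "sixth link",
            "leakage": "seventh link",
        }
    raise ValueError("fewer links than the threshold is not covered by the text")
```

The shared data link in the three-link case is presumably why the BMC side needs a preset information distribution rule in that configuration, whereas the four-link case keeps the two frames physically separate.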

[0053] In a third aspect, the present invention provides a liquid cooling equipment fault detection device, which is deployed in a BMC and includes:

[0054] The first obtaining module is used to obtain the first sampling point based on the bit synchronization signal when it receives the first target signal and the bit synchronization signal sent by the CPLD. The first target signal includes the presence information of the leakage detection module. The leakage detection module is used to obtain the leakage information of the liquid cooling device in each node of the server.

[0055] The second obtaining module is used to obtain presence information based on the first sampling point and the first target signal;

[0056] The third module is used to obtain the second sampling point based on the bit synchronization signal when it receives the interrupt signal, the second target signal and the bit synchronization signal sent by the CPLD. The interrupt signal is used to instruct the BMC to receive the second target signal, which contains leakage information.

[0057] The fourth module is used to obtain leakage information based on the second sampling point and the second target signal;

[0058] The determination module is used to determine the target node where the faulty liquid cooling device is located based on the presence information or leakage information, and outputs the target node.

[0059] In a fourth aspect, the present invention provides a liquid cooling equipment fault detection device, which is deployed in a CPLD and includes:

[0060] The first acquisition module is used to acquire leakage information of the liquid cooling equipment in each node of the server and the presence information of the leakage detection module, wherein the leakage detection module is used to acquire leakage information.

[0061] The second acquisition module is used to acquire the number of communication links between the CPLD and the BMC;

[0062] The transmitting module is used to determine the target communication link based on the number of communication links, and to transmit the first target signal and the bit synchronization signal to the BMC through the target communication link, or to transmit the interrupt signal, the second target signal and the bit synchronization signal to the BMC through the target communication link. The first target signal contains presence information, the second target signal contains leakage information, the interrupt signal is used to instruct the BMC to receive the second target signal, and the bit synchronization signal is used to instruct the BMC to obtain the presence information from the first target signal or the leakage information from the second target signal.

[0063] In a fifth aspect, the present invention provides a computer device, comprising: a memory and a processor, wherein the memory and the processor are communicatively connected to each other, the memory stores computer instructions, and the processor executes the computer instructions to perform the liquid cooling device fault detection method of the first aspect or any corresponding embodiment described above.

[0064] In a sixth aspect, the present invention provides a computer-readable storage medium storing computer instructions for causing a computer to execute the liquid cooling equipment fault detection method of the first aspect or any corresponding embodiment described above.

Description of the Drawings

[0065] To more clearly illustrate the technical solutions in the specific embodiments or related technologies of the present invention, the drawings used in the description of the specific embodiments or related technologies will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0066] Figure 1 is a schematic diagram of a current single-node server leakage detection scheme according to an embodiment of the present invention;

[0067] Figure 2 is a schematic diagram of a current multi-node server leakage detection scheme according to an embodiment of the present invention;

[0068] Figure 3 is a schematic flowchart of a liquid cooling equipment fault detection method applied to a BMC according to an embodiment of the present invention;

[0069] Figure 4 is a schematic diagram of a fault detection scheme for liquid cooling equipment according to an embodiment of the present invention;

[0070] Figure 5 is a schematic diagram of a first timing scheme according to an embodiment of the present invention;

[0071] Figure 6 is a schematic diagram of the second timing scheme and the third timing scheme according to embodiments of the present invention;

[0072] Figure 7 is a schematic flowchart of a liquid cooling equipment fault detection method applied to a CPLD according to an embodiment of the present invention;

[0073] Figure 8 is a structural block diagram of a liquid cooling equipment fault detection device deployed in a BMC according to an embodiment of the present invention;

[0074] Figure 9 is a structural block diagram of a liquid cooling equipment fault detection device deployed in a CPLD according to an embodiment of the present invention;

[0075] Figure 10 is a schematic diagram of the hardware structure of a computer device according to an embodiment of the present invention.

Detailed Implementation

[0076] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0077] In a current solution for detecting leaks in liquid cooling equipment, for a server with only one node (the head node), as shown in Figure 1, the motherboard CPLD obtains leakage information and the presence information of the leakage detection module from the head node. It then sends a set of liquid cooling signals to the BMC to transmit the leakage and presence information. Additionally, the CPLD can send simulated leakage control signals to the head node to test whether the solution correctly detects leakage of the liquid cooling equipment. For servers with multiple nodes, the motherboard CPLD first collects leakage and presence information from each node via GPIO (General Purpose Input/Output) through the connectors (conn) of each board. After integrating the leakage and presence information from each node, it transmits the result to the BMC. When the CPLD determines that a leakage event has occurred on a certain node, it notifies the BMC via an interrupt signal. The BMC can then directly power off the entire system, allowing maintenance personnel to check which node is leaking before performing maintenance. As shown in Figure 2, the server has a head node, node 1, and node 2. The motherboard CPLD directly obtains the leakage and presence information of the head node and node 1. The CPLD obtains the leakage and presence information of node 2 through connector 2, and generates a leakage signal, an interrupt signal, and a presence signal to send to the BMC. In addition, the motherboard CPLD sends simulated leakage control signals to the head node, node 1, and node 2 through connector 1 to test whether the solution correctly detects leakage of the liquid cooling equipment of each node.

[0078] However, in the current solution, the BMC needs to read data from the registers of the CPLD on the motherboard, which store the presence status of each node, through an I2C polling link to obtain leakage and presence information for each node. If the I2C polling link between the BMC and the CPLD is abnormal, the I2C polling link is occupied, or there are additions or removals of nodes in the server, the BMC may not be able to obtain leakage and presence information and output maintenance logs in a timely manner, making it impossible for maintenance personnel to determine which node has leaked or whether the leakage detection module is not in place.

[0079] Based on the above, this invention provides a method for fault detection in liquid cooling equipment. The CPLD collects leakage information and the presence information of the leakage detection module from each node. A bit synchronization signal is added between the CPLD and the BMC. The leakage information and presence information are transmitted to the BMC in a timely and efficient manner through multiple communication links, using the bit synchronization signal and an interrupt signal. This transmits the information of all nodes to the BMC at once, eliminating the need for the BMC to repeatedly poll the information of each node via the I2C polling link.

[0080] According to an embodiment of the present invention, a method for detecting faults in liquid cooling equipment is provided. It should be noted that the steps shown in the flowchart in the accompanying drawings can be executed in a computer device with data processing capabilities, such as a computer or server. Furthermore, although a logical order is shown in the flowchart, in some cases, the steps shown or described may be executed in a different order than that shown here.

[0081] This embodiment provides a method for detecting faults in liquid cooling equipment, which can be used in the aforementioned computer equipment. Figure 3 is a flowchart of a liquid cooling equipment fault detection method according to an embodiment of the present invention. This method is applied to a BMC (Baseboard Management Controller). As shown in Figure 3, the process includes the following steps:

[0082] Step S301: Upon receiving the first target signal and bit synchronization signal sent by the CPLD, the first sampling point is obtained based on the bit synchronization signal. The first target signal includes the presence information of the leakage detection module; the leakage detection module is used to obtain leakage information of the liquid cooling device in each node of the server.

[0083] Specifically, as shown in Figure 4, the motherboard CPLD obtains the presence information of the leakage detection modules from the head node, node 1, and node 2, and generates a presence signal as the first target signal. The first target signal and a bit synchronization signal are then sent to the BMC. For example, as shown in Figure 6, the bit synchronization signal consists of a clock with a fixed period, which notifies the BMC of the sampling points at which to sample in order to obtain presence or leakage information. The BMC therefore obtains the first sampling point based on the bit synchronization signal. The leakage detection module, such as a leakage detection line, is used to obtain leakage information from the liquid cooling devices in each node of the server.

[0084] Step S302: Obtain the presence information based on the first sampling point and the first target signal.

[0085] Specifically, the BMC samples the first target signal at the first sampling point to obtain the presence information of the leakage detection module of each node of the server.
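Steps S301 and S302 can be sketched in Python by treating both lines as lists of 0/1 levels captured at the same time steps. The sketch assumes the sampling points are the rising edges of the bit synchronization clock; the source only says that the bit synchronization signal tells the BMC where to sample, so the edge choice and the function names are hypothetical.

```python
from typing import List, Sequence


def sampling_points(bit_sync: Sequence[int]) -> List[int]:
    """Indices where the bit synchronization signal goes from 0 to 1 (assumed rule)."""
    return [i for i in range(1, len(bit_sync))
            if bit_sync[i - 1] == 0 and bit_sync[i] == 1]


def sample_signal(target: Sequence[int], points: Sequence[int]) -> List[int]:
    """Read the target signal's level at each sampling point."""
    if points and max(points) >= len(target):
        raise ValueError("target signal is shorter than the bit synchronization signal")
    return [target[i] for i in points]
```

The same two helpers would apply unchanged to the second target signal in steps S303 and S304, since both target signals are sampled against the same bit synchronization signal.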

[0086] Step S303: Upon receiving the interrupt signal, the second target signal, and the bit synchronization signal sent by the CPLD, the second sampling point is obtained based on the bit synchronization signal. The interrupt signal is used to instruct the BMC to receive the second target signal, which contains leakage information.

[0087] Specifically, as shown in Figure 4, the motherboard CPLD obtains leakage information from the head node, node 1, and node 2. After obtaining the leakage information of the liquid cooling devices, the motherboard CPLD determines whether any liquid cooling device has leaked. If so, it generates an interrupt signal and sends it to the BMC. After receiving the interrupt signal, the BMC receives the second target signal and the bit synchronization signal sent by the motherboard CPLD. The second target signal contains the leakage information, and the second sampling point is obtained based on the bit synchronization signal.

[0088] Step S304: Obtain leakage information based on the second sampling point and the second target signal.

[0089] Specifically, the BMC samples the second target signal at the second sampling point to obtain the leakage information of the liquid cooling equipment of each node of the server.

[0090] Step S305: Determine the target node where the faulty liquid cooling device is located based on the presence information or leakage information, and output the target node.

[0091] Specifically, there is a one-to-one correspondence between the presence information and the server nodes, and likewise between the leakage information and the server nodes. Based on the received presence information, the BMC determines whether any leakage detection module is missing. If so, it identifies the presence information corresponding to the missing leakage detection module, identifies the node it belongs to, and writes that node into the maintenance log so that server maintenance personnel can service it. Similarly, based on the received leakage information, the BMC determines whether any liquid cooling device is leaking. If so, it identifies the leakage information corresponding to the leaking liquid cooling device, identifies the node it belongs to, and writes that node into the maintenance log so that server maintenance personnel can repair it.

[0092] The liquid cooling equipment fault detection method provided in this embodiment involves the BMC determining a first sampling point based on a bit synchronization signal, and then collecting presence information from a first target signal based on the first sampling point. Upon receiving an interrupt signal, the BMC receives a bit synchronization signal and a second target signal, determines a second sampling point based on the bit synchronization signal, and then collects leakage information from the second target signal based on the second sampling point. The CPLD transmits all leakage and presence information from all nodes to the BMC at once, avoiding the BMC having to poll the corresponding registers of each node multiple times via I2C. The BMC determines the target node where the faulty liquid cooling equipment is located based on the presence and leakage information, facilitating subsequent maintenance by maintenance personnel. This solves the problem in related technologies where, when the I2C polling link between the BMC and CPLD is interrupted, the BMC struggles to obtain leakage and presence signals from each server node in a timely manner, making it impossible to determine the node where the faulty liquid cooling equipment is located, which is detrimental to the maintenance of the liquid cooling equipment.

[0093] In some optional implementations, the presence information is obtained based on the first sampling point and the first target signal, including:

[0094] Based on the first sampling point and the first target signal, a first number of first level values are obtained;

[0095] A first start flag and a first stop flag are determined from a first number of first level values;

[0096] The first level value between the first start flag and the first stop flag is taken as the first target level value, and the sequence number of the first target level value is obtained;

[0097] The number of nodes in the server is obtained based on the first target level value whose sequence number is equal to the preset sequence number.

[0098] Obtain the number of communication links between the CPLD and BMC;

[0099] When the number of communication links exceeds a preset threshold, the presence information is obtained based on the first target level value and the number of nodes;

[0100] When the number of communication links equals a preset threshold, the presence information is obtained based on the first target level value, the number of nodes, and the preset information distribution rules.

[0101] Specifically, the bit synchronization signal can be high in the idle state. For example, when leakage occurs, the bit synchronization signal is first pulled low to send a start bit that indicates the start of data transmission, and a clock signal with a fixed period is then sent. As shown in Figure 5 or Figure 6, the bit synchronization signal consists of a clock with a fixed period, which tells the BMC at which sampling points to acquire the presence information or the leakage information. In this invention, the sampling points used when acquiring presence information are referred to as the first sampling points, and the sampling points used when acquiring leakage information are referred to as the second sampling points; the two can be the same.

[0102] After obtaining the presence information of the leak detection module, the CPLD will store the presence information in the CPLD register. At the initial moment of server power-on, it will generate a first target signal containing the presence information and send the first target signal and the bit synchronization signal to the BMC at the same time. The leak detection module is, for example, the leak detection line.

[0103] The BMC samples the first target signal based on the first sampling point determined from the bit synchronization signal to obtain a first number of first level values.
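The sampling described above can be illustrated with a minimal Python sketch. It assumes that both lines are available as equal-length lists of 0/1 levels and that the sampling points are the rising edges of the clock; the source only states that the clock marks the sampling points, so the edge choice and the name `sample_on_rising_edges` are illustrative assumptions.

```python
from typing import List


def sample_on_rising_edges(sync: List[int], data: List[int]) -> List[int]:
    """Return the data-line level at every rising edge (0 -> 1) of the sync line.

    Assumption: sampling points are the rising edges of the clock; the source
    only says that the clock marks the sampling points.
    """
    if len(sync) != len(data):
        raise ValueError("sync and data traces must have the same length")
    levels = []
    for i in range(1, len(sync)):
        if sync[i - 1] == 0 and sync[i] == 1:  # rising edge = one sampling point
            levels.append(data[i])
    return levels


# Idle high, start bit low, then three clock periods.
sync_trace = [1, 1, 0, 1, 0, 1, 0, 1, 0]
data_trace = [1, 1, 1, 0, 0, 1, 1, 0, 0]
first_level_values = sample_on_rising_edges(sync_trace, data_trace)
```

Each rising edge yields one first level value, so the length of the result corresponds to the "first number" of level values.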

[0104] When generating the first target signal containing the presence information, the CPLD uses a fixed four-bit pattern in the high-order position, for example 0101, as the first start flag to prevent misidentification caused by interference. Three bits then transmit the number of server nodes, and one bit represents the presence information of each node. After all presence information has been transmitted, a fixed four-bit pattern, for example 1010, marks the end of the transmission. In this way, the BMC can directly obtain the presence information of multiple nodes. The BMC determines the first start flag and the first stop flag from the first number of first level values. For example, if four consecutive first level values form 0101, these four values form the first start flag; if four consecutive first level values form 1010, these four values form the first stop flag, where 1 represents a high level and 0 represents a low level. As shown in Figure 6, the presence signal is the first target signal, in which the first start flag is 0101 and the first stop flag is 1010.
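As an illustration of the frame layout described above (start flag, three-bit node count, one bit per node, stop flag), the following Python sketch builds such a frame. The function name `encode_frame` and the most-significant-bit-first ordering of the node count are assumptions; the source does not specify the bit order.

```python
from typing import List

START_FLAG = [0, 1, 0, 1]   # example start flag from the description
STOP_FLAG = [1, 0, 1, 0]    # example stop flag from the description
NODE_COUNT_BITS = 3         # three bits carry the number of nodes


def encode_frame(node_bits: List[int]) -> List[int]:
    """Build start flag + node count + one bit per node + stop flag."""
    n = len(node_bits)
    if not 0 < n < 2 ** NODE_COUNT_BITS:
        raise ValueError("node count must fit in three bits (1..7)")
    if any(bit not in (0, 1) for bit in node_bits):
        raise ValueError("node bits must be 0 or 1")
    # Assumption: the node count is sent most significant bit first.
    count_field = [(n >> shift) & 1 for shift in range(NODE_COUNT_BITS - 1, -1, -1)]
    return START_FLAG + count_field + list(node_bits) + STOP_FLAG


frame = encode_frame([1, 0, 1])  # three nodes, the middle one not in place
```

With three nodes the count field is 011, so the frame is fourteen bits long.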

[0105] The first level values between the first start flag and the first stop flag are used as the first target level values. The information represented by the first target level values includes the number of nodes in the server and the presence information of each node. The sequence number of each first target level value is obtained. For example, the first target level value immediately after the first start flag is given sequence number 1, the next one sequence number 2, and so on, up to the first stop flag.

[0106] Because this invention uses three bits to transmit the number of server nodes, the preset sequence numbers are, for example, 1, 2, and 3. The number of nodes in the server is obtained from the first target level values whose sequence numbers equal the preset sequence numbers. As shown in Figure 6, in the presence signal, the three bits following the first start flag 0101 represent the number of nodes.
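The decoding steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the node count is taken as most significant bit first, and the stop flag is located from the node count instead of by scanning, because payload bits could by coincidence also form 1010. The name `decode_presence_frame` is hypothetical.

```python
from typing import List, Tuple

START_FLAG = [0, 1, 0, 1]
STOP_FLAG = [1, 0, 1, 0]


def decode_presence_frame(levels: List[int]) -> Tuple[int, List[int]]:
    """Return (node_count, presence_bits) from sampled first level values."""
    start = next(
        (i for i in range(len(levels) - 3) if levels[i:i + 4] == START_FLAG),
        None,
    )
    if start is None:
        raise ValueError("start flag 0101 not found")
    body = levels[start + 4:]  # body[0] carries sequence number 1
    if len(body) < 3:
        raise ValueError("frame too short for the node-count field")
    # Sequence numbers 1..3 hold the node count (MSB first is an assumption).
    node_count = body[0] * 4 + body[1] * 2 + body[2]
    presence = body[3:3 + node_count]
    stop = body[3 + node_count:3 + node_count + 4]
    if len(presence) != node_count or stop != STOP_FLAG:
        raise ValueError("stop flag 1010 not found where the node count predicts it")
    return node_count, presence


# Two idle-high samples, then a frame for three nodes, then idle again.
sampled = [1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1]
node_count, presence_bits = decode_presence_frame(sampled)
```

Because the node count travels inside every frame, the same decoder works unchanged when nodes are added or removed.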

[0107] Obtain the number of communication links between the CPLD and BMC. A communication link could be an I2C channel. A preset threshold of 3 is used as an example for explanation.

[0108] When the number of communication links is 3, the read/write (IO) resources between the CPLD and BMC are limited, so the presence signal can be merged into the leakage signal to form a combined leakage and presence signal. This combined signal can serve as either the first or the second target signal: as the first target signal it transmits presence information, and as the second target signal it transmits leakage information. As shown in Figure 5, the first target level values then include both presence information and leakage information, where n represents the number of nodes. The preset information distribution rule in Figure 5 is that the combined signal transmits the presence information of the nodes first, followed by the leakage information. Obtaining the presence information according to the first target level values, the number of nodes, and the preset information distribution rule therefore means: after the first target level values representing the number of nodes, take a number of first target level values equal to the number of nodes. These first target level values represent the presence information of the corresponding nodes. The preset information distribution rule can also be, for example, transmitting the presence information and leakage information of each node in turn, or transmitting the leakage information first and then the presence information.
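The three information distribution rules mentioned above can be compared with a short Python sketch. It assumes the payload is the list of level values that follow the node-count field, with one presence bit and one leakage bit per node; the rule names and the function `split_combined_payload` are illustrative, and for the per-node rule the presence bit is assumed to come before the leakage bit.

```python
from typing import List, Tuple


def split_combined_payload(
    payload: List[int], node_count: int, rule: str = "presence_first"
) -> Tuple[List[int], List[int]]:
    """Split a combined payload into (presence_bits, leakage_bits)."""
    if len(payload) != 2 * node_count:
        raise ValueError("combined payload must hold two bits per node")
    if rule == "presence_first":   # rule illustrated for Figure 5
        return payload[:node_count], payload[node_count:]
    if rule == "leakage_first":    # alternative rule: leakage block, then presence block
        return payload[node_count:], payload[:node_count]
    if rule == "per_node":         # alternative rule: presence, leakage for each node in turn
        return payload[0::2], payload[1::2]
    raise ValueError(f"unknown distribution rule: {rule}")


payload = [1, 0, 1, 1, 1, 0]
presence, leakage = split_combined_payload(payload, 3)
```

Both ends must agree on the rule in advance, since the frame itself carries no field that identifies it.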

[0109] When the number of communication links is greater than 3, the read/write (IO) resources between the CPLD and BMC are sufficient. In this case, the CPLD uses different communication links to transmit the presence signal (i.e., the first target signal) and the leakage signal (i.e., the second target signal), as shown in Figure 6. In the presence signal of Figure 6, only the presence information of each node is transmitted after the number of nodes. Therefore, after the first target level values representing the number of nodes, a number of first target level values equal to the number of nodes is obtained. These first target level values represent the presence information of the nodes.

[0110] In this embodiment, first level values are collected from the first target signal based on the first sampling point, and the number of nodes is obtained from the first level values. Each time the CPLD transmits a signal, it informs the BMC of the number of nodes in the server. Even if the number of nodes increases or decreases, the BMC needs no further development or modification and can still obtain the presence and leakage information. Furthermore, based on the number of communication links between the CPLD and the BMC, the BMC determines whether to use a preset information distribution rule to obtain the presence information from the first target signal. This ensures that, regardless of the number of communication links, the present invention can transmit the presence information of all nodes in a timely and efficient manner through the first target signal.

[0111] In some optional implementations, leakage information is obtained based on the second sampling point and the second target signal, including:

[0112] Based on the second sampling point and the second target signal, a second number of second level values are obtained;

[0113] The second start flag and the second stop flag are determined from the second number of second level values;

[0114] The second level value between the second start flag and the second stop flag is taken as the second target level value;

[0115] The number of nodes in the server is obtained from the second target level value, and leakage information is obtained based on the second target level value and the number of nodes.

[0116] Specifically, after acquiring leakage information from the liquid cooling devices, the CPLD stores the leakage information in its register. If the CPLD determines from the leakage information that a liquid cooling device is leaking, it first generates an interrupt signal and sends it to the BMC. Upon receiving the interrupt signal, the BMC knows that it needs to receive the second target signal and retrieve the leakage information from it. After sending the interrupt signal, the CPLD generates the second target signal containing the leakage information and sends the second target signal and a bit synchronization signal to the BMC simultaneously.

[0117] The BMC samples the second target signal based on the second sampling point determined from the bit synchronization signal to obtain a second number of second level values.

[0118] When the CPLD generates the second target signal containing leakage information, it can, to prevent misidentification caused by interference, use a fixed four-bit pattern in the high-order position, for example 0101, to indicate that leakage information transmission is about to start. Three bits then transmit the number of server nodes, and one bit represents the leakage information of each node. After all leakage information has been transmitted, a fixed four-bit pattern, for example 1010, marks the end of the transmission. In this way, the BMC can directly obtain the leakage information of multiple nodes. The BMC determines the second start flag and the second stop flag from the second number of second level values. For example, if four consecutive second level values form 0101, these four values form the second start flag; if four consecutive second level values form 1010, these four values form the second stop flag, where 1 represents a high level and 0 represents a low level. As shown in Figure 6, the leakage signal is the second target signal, in which the second start flag is 0101 and the second stop flag is 1010.

[0119] The second level values between the second start flag and the second stop flag are used as the second target level values. The information represented by the second target level values includes the number of nodes in the server and the leakage information of each node. The sequence number of each second target level value is obtained; for example, the second target level value immediately after the second start flag is given sequence number 1, and so on, up to the second stop flag. Because this invention uses three bits to transmit the number of server nodes, the preset sequence numbers are, for example, 1, 2, and 3. The number of nodes in the server is obtained from the second target level values whose sequence numbers equal the preset sequence numbers. As shown in Figure 6, in the leakage signal, the three bits following the second start flag 0101 represent the number of nodes.

[0120] Leakage information is obtained based on the second target level and the number of nodes, specifically including:

[0121] When the number of communication links is 3, the read/write (IO) resources between the CPLD and BMC are limited, so the presence signal can be merged into the leakage signal to form a combined leakage and presence signal. This combined signal can serve as either the first or the second target signal: as the first target signal it transmits presence information, and as the second target signal it transmits leakage information. As shown in Figure 5, the second target level values then contain both presence information and leakage information. The preset information distribution rule in Figure 5 is that the combined signal first transmits the presence information of the nodes and then the leakage information of the nodes. Obtaining the leakage information according to the second target level values, the number of nodes, and the preset information distribution rule therefore means: after the second target level values representing the number of nodes, skip a number of second target level values equal to the number of nodes (the presence information), and then take the next second target level values, again equal in number to the number of nodes. These second target level values represent the leakage information of the corresponding nodes.

[0122] When the number of communication links is greater than 3, the read/write (IO) resources between the CPLD and BMC are sufficient. In this case, the CPLD uses different communication links to transmit the first target signal and the second target signal. The second target signal is then a leakage signal, as shown in Figure 6. In the leakage signal of Figure 6, only the leakage information of each node is transmitted after the number of nodes. Therefore, after the second target level values representing the number of nodes, a number of second target level values equal to the number of nodes is obtained. These second target level values represent the leakage information of the corresponding nodes.
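The two cases for locating the leakage information can be sketched as a single Python function. It assumes a preset threshold of 3, the presence-first distribution rule for the combined signal, and an input list that starts immediately after the three node-count bits; `extract_leakage` is a hypothetical name.

```python
from typing import List


def extract_leakage(
    levels_after_count: List[int], node_count: int, link_count: int, threshold: int = 3
) -> List[int]:
    """Pick the per-node leakage bits from the second target level values."""
    if link_count == threshold:
        offset = node_count   # combined signal: skip the presence bits first
    elif link_count > threshold:
        offset = 0            # dedicated leakage signal: leakage bits come first
    else:
        raise ValueError("fewer links than the preset threshold is not covered")
    bits = levels_after_count[offset:offset + node_count]
    if len(bits) != node_count:
        raise ValueError("not enough level values for the announced node count")
    return bits


combined = extract_leakage([1, 1, 1, 0, 1, 1], node_count=3, link_count=3)
dedicated = extract_leakage([0, 1, 1], node_count=3, link_count=4)
```

In both calls the result is the same three leakage bits; only the offset into the frame differs.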

[0123] In this embodiment, second level values are acquired from the second target signal based on the second sampling point. The number of nodes is obtained from the second level values, and leakage information is obtained based on the second target level values and the number of nodes. The number of nodes in the server is communicated to the BMC in real time, ensuring that the BMC can obtain both presence and leakage information even if the number of nodes increases or decreases. Leakage information from all nodes is transmitted to the BMC at once, avoiding the need for the BMC to repeatedly poll the corresponding registers of each node via the I2C polling link.

[0124] In some optional implementations, the target node where the faulty liquid cooling device is located is determined based on the presence information or leakage information, and the target node is output, including:

[0125] Determine whether there is any presence information with a value equal to the first preset value;

[0126] If it exists, the presence information with a value equal to the first preset value is taken as the target presence information, and the node corresponding to the target presence information is taken as the first target node. The first preset value is used to determine whether the leakage detection module corresponding to the presence information is not in place, and the leakage detection module in the first target node is not in place.

[0127] Write the first target node to the maintenance log. The maintenance log is used to output target nodes, which include the first target node and the second target node.

[0128] Determine if there is any leakage information with a value equal to the second preset value;

[0129] If it exists, the leakage information with a value equal to the second preset value is taken as the target leakage information, and the node corresponding to the target leakage information is taken as the second target node. The second preset value is used to determine whether the liquid cooling equipment corresponding to the leakage information is leaking, and the liquid cooling equipment in the second target node is leaking.

[0130] Write the second target node into the maintenance log.

[0131] Specifically, when the leakage detection module in a certain node is not in place, the corresponding level value is set to 0; otherwise, it is set to 1. Therefore, if the value of the presence information is 0, the leakage detection module in the node corresponding to that presence information is not in place. The following uses a first preset value of 0 as an example:

[0132] Determine whether any presence information with a value equal to 0 exists. If so, use this presence information as the target presence information. The node corresponding to the target presence information, in which the leakage detection module is not in place, is designated as the first target node. Write the first target node into the maintenance log. Server maintenance personnel can then identify the first target node by reading the maintenance log and perform maintenance on it.

[0133] When a liquid cooling device in a node leaks, the corresponding level value is set to 0; otherwise, it is set to 1. Therefore, a leakage information value of 0 indicates that the liquid cooling device in the node corresponding to that information has leaked. Taking a second preset value of 0 as an example: determine whether any leakage information with a value equal to 0 exists. If so, this leakage information is taken as the target leakage information. The node corresponding to the target leakage information, in which the liquid cooling device is leaking, is designated as the second target node. The second target node is written to the maintenance log. Server maintenance personnel can then identify the second target node by reading the maintenance log and perform maintenance accordingly.

[0134] In this embodiment, the BMC determines whether all leakage detection modules are in place using the first preset value and the presence information, and determines whether each liquid cooling device is leaking using the second preset value and the leakage information. The nodes where leakage detection modules are missing and the nodes where liquid cooling devices are leaking are written into the maintenance log, which helps maintenance personnel quickly locate the problem and perform maintenance.
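The comparison against the two preset values can be sketched in Python as follows. The example uses 0 for both preset values, as in the description; the node names, the log message wording, and the assumption that bit i belongs to the i-th listed node are illustrative only.

```python
from typing import List

FIRST_PRESET = 0    # presence value meaning "leakage detection module not in place"
SECOND_PRESET = 0   # leakage value meaning "liquid cooling device is leaking"


def find_target_nodes(
    presence: List[int], leakage: List[int], node_names: List[str]
) -> List[str]:
    """Return maintenance-log entries for first and second target nodes."""
    log = []
    for name, value in zip(node_names, presence):
        if value == FIRST_PRESET:    # first target node
            log.append(f"{name}: leakage detection module not in place")
    for name, value in zip(node_names, leakage):
        if value == SECOND_PRESET:   # second target node
            log.append(f"{name}: liquid cooling device leaking")
    return log


maintenance_log = find_target_nodes(
    presence=[1, 0, 1],
    leakage=[1, 1, 0],
    node_names=["head node", "node 1", "node 2"],
)
```

Here node 1 is reported as a first target node and node 2 as a second target node.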

[0135] This embodiment provides a method for detecting faults in liquid cooling equipment, which can be used in the aforementioned computer equipment. Figure 7 is a flowchart of a liquid cooling equipment fault detection method according to an embodiment of the present invention. The method is applied to a CPLD. As shown in Figure 7, the process includes the following steps:

[0136] Step S701: Obtain the leakage information of the liquid cooling devices and the presence information of the leakage detection modules in each node of the server, wherein the leakage detection module is used to obtain the leakage information.

[0137] Specifically, as shown in Figure 4, the head node includes leak detection connector 1 (4 pins) and leak detection connector 2 (4 pins), which are connected to a leak detection module. The leak detection connector is a LEAKAGE HEADER 1×4P CONN. The leak detection module is, for example, a leak detection line, which can consist of two single wires that are intertwined and insulated from each other. When a leak occurs in the liquid cooling system, the two wires are short-circuited by the coolant, which triggers leak detection. The level signals of the PRSNT pins of leak detection connector 1 (4 pins) and leak detection connector 2 (4 pins) are logically ANDed to generate the presence information, which indicates when the leak detection module is not in place and is transmitted to the motherboard CPLD. The PRSNT pin is a pin commonly used in hot-swap mechanisms. When the leak detection module detects a leak in the liquid cooling equipment, it generates a weak level signal. This weak level signal is transmitted to amplifier 1 via leak detection connector 2 (4 pins). After amplification by amplifier 1, the leakage information is generated and transmitted to the motherboard CPLD.

[0138] When the leak detection module in node 1 is not in place, leak pin connectors 3 (4 pins) and 4 (4 pins) will generate presence information and transmit it to the motherboard CPLD. After the leak detection module detects a leak in the liquid cooling device, it generates a weak voltage level signal, which is sent to amplifier 2 via leak pin connector 4 (4 pins). After amplification by amplifier 2, leak information is generated and transmitted to the motherboard CPLD. Leak pin connectors 5 (4 pins) and 6 (4 pins) in node 2 will generate presence information and transmit it to connector 2. Leak pin connector 6 (4 pins) will transmit the weak voltage level signal generated by the leak detection module to amplifier 3. After amplification, leak information is generated and transmitted to connector 2. Connector 2 will then transmit both the presence and leak information to the motherboard CPLD.

[0139] Therefore, the motherboard CPLD can obtain information on liquid cooling equipment leakage and the presence of leakage detection modules in each node of the server.

[0140] Step S702: Obtain the number of communication links between CPLD and BMC.

[0141] Specifically, there are multiple communication links between the motherboard CPLD and the BMC, such as I2C channels. The motherboard CPLD obtains the number of available communication links between itself and the BMC, and determines the subsequent communication links used to transmit the first target signal, the second target signal, the bit synchronization signal, and the interrupt signal based on this number of communication links.

[0142] Step S703: Determine the target communication link based on the number of communication links, and send the first target signal and the bit synchronization signal to the BMC through the target communication link, or send the interrupt signal, the second target signal, and the bit synchronization signal to the BMC through the target communication link. The first target signal contains presence information, the second target signal contains leakage information, the interrupt signal is used to instruct the BMC to receive the second target signal, and the bit synchronization signal is used to instruct the BMC to obtain presence information from the first target signal or leakage information from the second target signal.

[0143] Specifically, the target communication links for transmitting the first target signal, the second target signal, the bit synchronization signal, and the interrupt signal are determined based on the number of communication links. For example, when the number of communication links is equal to 3, all 3 communication links are used as target communication links: one transmits the bit synchronization signal, one transmits the interrupt signal, and the last transmits both the first target signal and the second target signal. When the number of communication links is greater than 3, 4 of them can be randomly selected as target communication links, and a different target communication link is used for each of the first target signal, the second target signal, the bit synchronization signal, and the interrupt signal.
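The link allocation described above can be sketched in Python. The sketch assumes a preset threshold of 3 and represents each link by an arbitrary identifier; the function name `assign_links`, the role names, and the order in which roles are assigned to links are illustrative assumptions.

```python
import random
from typing import Dict, List


def assign_links(links: List[str], threshold: int = 3) -> Dict[str, str]:
    """Map each signal role to a communication link identifier."""
    if len(links) < threshold:
        raise ValueError("fewer links than the preset threshold is not covered")
    if len(links) == threshold:
        sync, interrupt, data = links
        # With three links, both target signals share the remaining link.
        return {
            "bit_sync": sync,
            "interrupt": interrupt,
            "first_target": data,
            "second_target": data,
        }
    # With more than three links, pick four at random, one per signal.
    sync, interrupt, first, second = random.sample(links, 4)
    return {
        "bit_sync": sync,
        "interrupt": interrupt,
        "first_target": first,
        "second_target": second,
    }


shared = assign_links(["link0", "link1", "link2"])
separate = assign_links(["link0", "link1", "link2", "link3", "link4"])
```

In the three-link case the presence and leakage frames share one link, which is why the combined signal format is needed there.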

[0144] The first target signal contains the presence information of the leakage detection module. The first target signal needs to be transmitted to the BMC simultaneously with the bit synchronization signal. The bit synchronization signal tells the BMC at which first sampling points to sample the first target signal in order to obtain the presence information.

[0145] The second target signal contains leakage information from the liquid cooling device. Before sending the second target signal, an interrupt signal needs to be sent to notify the BMC to receive the second target signal from the motherboard CPLD. Furthermore, a bit synchronization signal needs to be sent simultaneously with the second target signal. This bit synchronization signal tells the BMC at which second sampling points to sample the second target signal in order to obtain the leakage information. In this invention, the sampling points used when acquiring presence information are referred to as the first sampling points, and the sampling points used when acquiring leakage information are referred to as the second sampling points; the two can be the same.

[0146] The liquid cooling equipment fault detection method provided in this embodiment transmits the presence information of all nodes to the BMC at once through the first target signal and the leakage information of all nodes through the second target signal, avoiding the need for the BMC to poll the corresponding registers of each node multiple times via I2C. At the same time, the CPLD determines the target communication link based on the number of communication links and sends a bit synchronization signal, instructing the BMC to obtain presence information from the first target signal or leakage information from the second target signal. This ensures that, regardless of the number of communication links, the CPLD can transmit presence and leakage information to the BMC. This solves the problem in related technologies where, when the I2C polling link between the BMC and CPLD is interrupted, the BMC cannot promptly obtain leakage and presence signals from each node of the server, making it impossible to determine the node where the faulty liquid cooling equipment is located, which is detrimental to the maintenance of the liquid cooling equipment.

[0147] In some alternative implementations, the method further includes, before sending the interrupt signal, the second target signal, and the bit synchronization signal to the BMC via the target communication link:

[0148] A bit synchronization signal is generated based on the start representative data and the preset clock period. The part of the bit synchronization signal that follows the start representative data is generated according to the preset clock period.

[0149] Determine if there is any leakage information with a value equal to the second preset value. If so, generate an interrupt signal according to the preset interrupt flag.

[0150] Specifically, when the motherboard CPLD sends a bit synchronization signal, it first sends the start representative data. For example, the bit synchronization signal can be high in the idle state. When preparing to send the bit synchronization signal, the level is first pulled low for several bits, changing those bits from 1 to 0, to indicate the start of transmission. The specific number of bits is set according to actual needs; the start representative data is therefore, for example, 0. After the start representative data, a clock signal with a fixed period is sent according to the preset clock period. This clock signal tells the BMC at which sampling points to obtain the presence or leakage information. In Figure 5 and Figure 6, for ease of illustration, a series of dashed lines is drawn from the bit synchronization signal; these dashed lines represent the sampling point positions indicated by the bit synchronization signal.
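The shape of the bit synchronization signal described above can be illustrated with a small Python sketch that produces a list of levels. The numbers of idle and start bits and the representation of one clock period as a high level followed by a low level are assumptions chosen for illustration; `make_bit_sync` is a hypothetical name.

```python
from typing import List


def make_bit_sync(num_samples: int, start_bits: int = 1, idle_bits: int = 2) -> List[int]:
    """Return idle-high levels, a low start section, then one clock period per sample."""
    if num_samples < 0 or start_bits < 1:
        raise ValueError("need a non-negative sample count and at least one start bit")
    signal = [1] * idle_bits + [0] * start_bits   # idle high, then pulled low
    for _ in range(num_samples):
        signal += [1, 0]                          # one clock period: high then low
    return signal


bit_sync = make_bit_sync(num_samples=3)
# Each 0 -> 1 transition after the start section marks one sampling point.
rising_edges = sum(
    1 for i in range(1, len(bit_sync)) if bit_sync[i - 1] == 0 and bit_sync[i] == 1
)
```

For a frame of 14 bits (start flag, node count, three node bits, stop flag) the CPLD would generate 14 clock periods in this model.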

[0151] When a liquid cooling device in a node leaks, the corresponding level value is set to 0; otherwise, it is set to 1. Therefore, if the leakage information value is 0, it indicates that the liquid cooling device in the node corresponding to that information has leaked. Taking a second preset value of 0 as an example: it checks whether there is a leakage information value of 0. If so, it indicates that a liquid cooling device is leaking. An interrupt signal is generated based on a preset interrupt flag. For example, the interrupt signal can be high in the idle state. When preparing to send an interrupt signal, several data bits of the interrupt signal are pulled low, that is, several bits of the interrupt signal are changed from 1 to 0. The specific number of data bits is set according to actual needs. Therefore, the preset interrupt flag is, for example, 0.
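The interrupt decision above can be sketched in Python. The sketch assumes a second preset value of 0 and models the interrupt line as a short list of levels that idles high and is pulled low for a configurable number of bits; the function name and bit counts are illustrative.

```python
from typing import List


def make_interrupt(
    leakage: List[int], second_preset: int = 0, flag_bits: int = 1, idle_bits: int = 2
) -> List[int]:
    """Return interrupt-line levels: pulled low only if some node reports a leak."""
    if second_preset in leakage:
        return [1] * idle_bits + [0] * flag_bits   # interrupt flag: pull the line low
    return [1] * (idle_bits + flag_bits)           # no leak: line stays idle high


leak_case = make_interrupt([1, 0, 1])   # node at index 1 is leaking
idle_case = make_interrupt([1, 1, 1])   # no node is leaking
```

Only the leak case produces a high-to-low transition, which is what the BMC would treat as the interrupt.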

[0152] In this embodiment, the CPLD generates a bit synchronization signal and sends it to the BMC to tell the BMC at which sampling points to sample, and generates an interrupt signal to instruct the BMC to receive the second target signal, thereby realizing the transmission of the number of nodes, the presence information, and the leakage information.

[0153] In some optional implementations, a target communication link is determined based on the number of communication links, and a first target signal and a bit synchronization signal are sent to the BMC through the target communication link; alternatively, an interrupt signal, a second target signal, and a bit synchronization signal are sent to the BMC through the target communication link, including:

[0154] When the number of communication links equals a preset threshold, a first target communication link is determined, wherein the first target communication link includes a first communication link, a second communication link, and a third communication link;

[0155] A first target signal is generated based on the in-situ information and the first timing scheme, and a bit synchronization signal is sent to the BMC through the first communication link, and the first target signal is sent to the BMC through the second communication link;

[0156] Determine whether there is leakage information with a value equal to the second preset value. If so, send an interrupt signal to the BMC through the third communication link, generate a second target signal based on the leakage information and the first timing scheme, send a bit synchronization signal to the BMC through the first communication link, and send the second target signal to the BMC through the second communication link.

[0157] If the number of communication links exceeds a preset threshold, a second target communication link is determined, wherein the second target communication link includes a fourth communication link, a fifth communication link, a sixth communication link, and a seventh communication link.

[0158] A first target signal is generated based on the in-situ information and the second timing scheme, and a bit synchronization signal is sent to the BMC through the fourth communication link, and the first target signal is sent to the BMC through the fifth communication link.

[0159] Determine whether there is leakage information with a value equal to the second preset value. If so, send an interrupt signal to the BMC through the sixth communication link, generate a second target signal based on the leakage information and the third timing scheme, send a bit synchronization signal to the BMC through the fourth communication link, and send the second target signal to the BMC through the seventh communication link.

[0160] Specifically, there are multiple communication links between the motherboard CPLD and the BMC, such as I2C channels. The motherboard CPLD obtains the number of available communication links between itself and the BMC, and determines the target communication links for transmitting the first target signal, the second target signal, the bit synchronization signal, and the interrupt signal based on this number. This is illustrated using a preset threshold of 3.

[0161] With three communication links, all three are used as the first target communication link, namely the first communication link, the second communication link, and the third communication link. The first communication link transmits the bit synchronization signal, the second communication link transmits either the first target signal or the second target signal, and the third communication link transmits the interrupt signal.

[0162] The first timing scheme is, for example, as shown in Figure 5: the leakage information and the presence information are transmitted to the BMC via the leakage signal and the presence signal, respectively. First, 4 bits are used as a start flag, and then 3 bits are used to transmit the number of nodes. Next, a number of bits equal to the number of nodes is used to transmit the presence information of each node to the BMC, followed by the same number of bits to transmit the leakage information of each node. Finally, 4 bits are used as a stop flag. A first target signal is generated based on the presence information and the first timing scheme; the bit synchronization signal is sent to the BMC via the first communication link, and the first target signal is sent to the BMC via the second communication link. Alternatively, the first timing scheme can also transmit the presence and leakage information of one node first, then those of the next node, until the information of every node has been transmitted; or transmit the leakage information of each node first and then the presence information of each node, and so on.
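The frame layout of the first timing scheme can be sketched as below. The concrete start and stop flag patterns and the most-significant-bit-first ordering of the node count are assumptions chosen for illustration; the text only fixes the field widths and their order.

```python
START_FLAG = [0, 1, 1, 0]  # hypothetical 4-bit start flag
STOP_FLAG = [1, 0, 0, 1]   # hypothetical 4-bit stop flag


def encode_combined_frame(presence, leakage):
    """Pack presence and leakage bits into one frame (first timing scheme).

    Layout: 4-bit start flag, 3-bit node count (MSB first, assumed),
    one presence bit per node, one leakage bit per node, 4-bit stop flag.
    """
    if len(presence) != len(leakage):
        raise ValueError("presence and leakage must cover the same nodes")
    count = len(presence)
    if not 0 < count < 8:
        raise ValueError("a 3-bit field can only carry 1 to 7 nodes")
    count_bits = [(count >> shift) & 1 for shift in (2, 1, 0)]
    return START_FLAG + count_bits + list(presence) + list(leakage) + STOP_FLAG
```

One consequence of the 3-bit count field is that a single frame of this shape can describe at most seven nodes.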

[0163] Taking a second preset value of 0 as an example: It is determined whether there is a leakage information with a value of 0. If so, it indicates that a liquid cooling device is leaking. An interrupt signal is sent to the BMC via the third communication link. A second target signal is generated based on the leakage information and the first timing scheme. A bit synchronization signal is sent to the BMC via the first communication link, and the second target signal is sent to the BMC via the second communication link.

[0164] When the number of communication links is greater than 3, 4 of them can be randomly selected as the second target communication links, namely the fourth communication link, the fifth communication link, the sixth communication link and the seventh communication link. The fourth communication link transmits the bit synchronization signal, the fifth communication link transmits the first target signal, the sixth communication link transmits the interrupt signal and the seventh communication link transmits the second target signal.
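The link selection in the two cases above could look roughly like this. The role names, the dictionary return shape, and the behavior for fewer than three links are assumptions of this sketch; the text does not cover the last case.

```python
import random

PRESET_THRESHOLD = 3  # threshold used in the example above


def select_links(available_links, rng=random):
    """Map each signal role to a link, following the two cases described above."""
    n = len(available_links)
    if n == PRESET_THRESHOLD:
        sync, target, interrupt = available_links
        # With exactly three links, both target signals share one link.
        return {"bit_sync": sync, "first_target": target,
                "second_target": target, "interrupt": interrupt}
    if n > PRESET_THRESHOLD:
        # Four links are picked at random, one per signal.
        sync, first, interrupt, second = rng.sample(available_links, 4)
        return {"bit_sync": sync, "first_target": first,
                "interrupt": interrupt, "second_target": second}
    raise ValueError("fewer links than the preset threshold is not covered by the text")
```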

[0165] The second timing scheme is, for example, as shown in Figure 6: the presence information is transmitted to the BMC using the presence signal. First, 4 bits are used as a start flag, then 3 bits are used to transmit the number of nodes. After that, each bit is used to transmit the presence information of one node, and finally, 4 bits are used as a stop flag. A first target signal is generated based on the presence information and the second timing scheme; the bit synchronization signal is sent to the BMC via the fourth communication link, and the first target signal is sent to the BMC via the fifth communication link.

[0166] The third timing scheme is, for example, as shown in Figure 6: leakage information is transmitted to the BMC using the leakage signal. First, 4 bits are used as a start flag, then 3 bits are used to transmit the number of nodes. After that, each bit is used to transmit the leakage information of one node, and finally, 4 bits are used as a stop flag. Taking a second preset value of 0 as an example: it is determined whether there is leakage information with a value of 0. If so, a liquid cooling device is leaking, and an interrupt signal is sent to the BMC via the sixth communication link. Upon receiving the interrupt signal, the BMC knows to receive the subsequent bit synchronization signal and second target signal. The second target signal is generated based on the leakage information and the third timing scheme. The bit synchronization signal is sent to the BMC via the fourth communication link, and the second target signal is sent to the BMC via the seventh communication link.
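The second and third timing schemes share one frame shape that carries a single bit per node. A minimal sketch, assuming hypothetical start and stop flag patterns and a most-significant-bit-first node count:

```python
START_FLAG = [0, 1, 1, 0]  # hypothetical 4-bit start flag
STOP_FLAG = [1, 0, 0, 1]   # hypothetical 4-bit stop flag


def encode_single_frame(node_bits):
    """Pack one bit per node (presence OR leakage) into a frame.

    Layout: 4-bit start flag, 3-bit node count (MSB first, assumed),
    one bit per node, 4-bit stop flag.
    """
    count = len(node_bits)
    if not 0 < count < 8:
        raise ValueError("a 3-bit field can only carry 1 to 7 nodes")
    count_bits = [(count >> shift) & 1 for shift in (2, 1, 0)]
    return START_FLAG + count_bits + list(node_bits) + STOP_FLAG
```

The same function would serve for the presence frame on the fifth link and the leakage frame on the seventh link, since only the meaning of the per-node bits differs.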

[0167] In this embodiment, the target communication link for transmitting signals is determined based on the number of communication links and a preset threshold, so that the CPLD can transmit the presence information and the leakage information to the BMC regardless of the number of communication links.

[0168] In some alternative implementations, as shown in Figure 4, the motherboard CPLD sends simulated leakage control signals to switch 1, switch 2 of node 1, and switch 3 of node 2 via connector 1. Upon receiving the simulated leakage control signals, switch 1 sends a simulated leakage signal group to leakage pin connector 1 (4 pins), switch 2 sends a simulated leakage signal group to leakage pin connector 3 (4 pins), and switch 3 sends a simulated leakage signal group to leakage pin connector 5 (4 pins). The switches are called Switches. The simulated leakage control signals control leakage pin connectors 2 (4 pins), 4 (4 pins), and 6 (4 pins) to return simulated leakage information with a value equal to a second preset value.

[0169] Specifically, the leak detection module is connected between leak pin connector 1 (4 pins) and leak pin connector 2 (4 pins) in the head unit node. The leak detection module can be, for example, a leak detection line, which consists of two single wires wound together and insulated from each other. When a leak occurs in the liquid cooling equipment, the two single wires are short-circuited by the liquid cooling water, thus triggering the leak detection. The leak detection module is also connected between leak pin connector 3 (4 pins) and leak pin connector 4 (4 pins) in node 1, as well as between leak pin connector 5 (4 pins) and leak pin connector 6 (4 pins) in node 2.

[0170] A leak detection module, for example, uses a leak detection line. When a leak occurs in the liquid cooling system, the two individual wires in the leak detection line are short-circuited by the liquid coolant, triggering the leak detection. The simulated leak signal group short-circuits the ends of the two individual wires of the leak detection line, or pulls one of the main detection lines low, simulating liquid coolant dripping onto the cable. This causes the leak detection line to identify a leak in the liquid cooling system and send a corresponding weak voltage level signal. After amplification, this signal generates simulated leak information with a value equal to a second preset value. The motherboard CPLD can simultaneously send simulated leak control signals to one or more of switches 1, 2, and 3.

[0171] After the motherboard CPLD receives the simulated leak information, it generates a second target signal and transmits it to the BMC. The BMC then determines the node corresponding to the simulated leak information and writes it into the maintenance log. For example, if the motherboard CPLD only sends the simulated leak control signal to switch 2 at node 1, the node for which the BMC ultimately writes the liquid cooling equipment leak in the maintenance log will also be node 1, thus the accuracy of the liquid cooling equipment fault detection method meets the requirements.

[0172] In this embodiment, a CPLD is used to send a simulated leakage control signal to the switch in the node to verify the accuracy of the liquid cooling equipment fault detection method, ensuring that the present invention meets the actual accuracy requirements.

[0173] This embodiment also provides a liquid cooling equipment fault detection device, which is used to implement the above embodiments and preferred embodiments; details already described will not be repeated. As used below, the term "module" can be a combination of software and / or hardware that implements a predetermined function. Although the device described in the following embodiments is preferably implemented in software, hardware implementation, or a combination of software and hardware, is also possible and contemplated.

[0174] This embodiment provides a liquid cooling equipment fault detection device, which is deployed in a BMC. As shown in Figure 8, it includes:

[0175] The first obtaining module 801 is used to obtain a first sampling point based on the bit synchronization signal when it receives a first target signal and a bit synchronization signal sent by the CPLD. The first target signal includes the in-situ information of the leakage detection module. The leakage detection module is used to obtain leakage information of the liquid cooling device in each node of the server.

[0176] The second obtaining module 802 is used to obtain in-situ information based on the first sampling point and the first target signal;

[0177] The third obtaining module 803 is used to obtain a second sampling point based on the bit synchronization signal when it receives an interrupt signal, a second target signal and a bit synchronization signal sent by the CPLD. The interrupt signal is used to instruct the BMC to receive the second target signal, which contains leakage information.

[0178] The fourth obtaining module 804 is used to obtain leakage information based on the second sampling point and the second target signal;

[0179] The determination module 805 is used to determine the target node where the faulty liquid cooling device is located based on the presence information or leakage information, and output the target node.

[0180] In some alternative implementations, the second obtaining module 802 includes:

[0181] The first obtaining unit is used to obtain a first number of first level values based on the first sampling point and the first target signal;

[0182] The first determining unit is configured to determine a first start flag and a first stop flag from a first number of first level values;

[0183] The first acquisition unit is used to take the first level value between the first start flag and the first stop flag as the first target level value, and acquire the sequence number of the first target level value;

[0184] The second obtaining unit is used to obtain the number of nodes in the server based on the first target level value whose sequence number is equal to the preset sequence number;

[0185] The second acquisition unit is used to acquire the number of communication links between the CPLD and the BMC;

[0186] The third obtaining unit is used to obtain the in-situ information based on the first target level value and the number of nodes when the number of communication links is greater than a preset threshold.

[0187] The fourth obtaining unit is used to obtain the in-situ information based on the first target level value, the number of nodes, and the preset information distribution rules when the number of communication links is equal to a preset threshold.
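The units of the second obtaining module can be sketched on the BMC side as follows. The flag patterns, the node count occupying sequence numbers 0 to 2, and the distribution rule "presence bits first, then leakage bits" for the three-link case are assumptions of this sketch that mirror the frame layouts described earlier.

```python
START_FLAG = [0, 1, 1, 0]  # hypothetical 4-bit start flag
STOP_FLAG = [1, 0, 0, 1]   # hypothetical 4-bit stop flag
PRESET_THRESHOLD = 3


def sample_levels(target_signal, sampling_points):
    """Read the target signal level at each sampling point."""
    return [target_signal[i] for i in sampling_points]


def decode_presence(levels, link_count):
    """Extract per-node presence bits from the sampled first target signal."""
    if len(levels) < 11 or levels[:4] != START_FLAG or levels[-4:] != STOP_FLAG:
        raise ValueError("start or stop flag not found")
    payload = levels[4:-4]                                  # first target level values
    count = payload[0] * 4 + payload[1] * 2 + payload[2]    # preset sequence numbers 0..2
    data = payload[3:]
    if link_count > PRESET_THRESHOLD:
        expected = count        # presence-only frame
    elif link_count == PRESET_THRESHOLD:
        expected = 2 * count    # presence bits followed by leakage bits
    else:
        raise ValueError("link count below the preset threshold is not covered")
    if len(data) != expected:
        raise ValueError("payload length does not match the node count")
    return data[:count]
```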

[0188] In some alternative implementations, the fourth obtaining module 804 includes:

[0189] The fifth obtaining unit is used to obtain a second number of second level values based on the second sampling point and the second target signal;

[0190] The second determining unit is configured to determine a second start flag and a second stop flag from a second number of second level values;

[0191] The first unit is used to take the second level value between the second start flag and the second stop flag as the second target level value;

[0192] The sixth obtaining unit is used to obtain the number of nodes in the server from the second target level value, and obtain leakage information based on the second target level value and the number of nodes.
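A possible decoding of the second target signal is sketched below for a leakage-only frame, which is the case with more than three links. The flag patterns and the scan for the start flag after idle levels are assumptions; with exactly three links the leakage bits would follow the presence bits instead.

```python
START_FLAG = [0, 1, 1, 0]  # hypothetical 4-bit start flag
STOP_FLAG = [1, 0, 0, 1]   # hypothetical 4-bit stop flag


def decode_leakage(levels):
    """Find a leakage-only frame in the sampled levels and return its node bits."""
    # The shortest frame is 4 + 3 + 0 + 4 = 11 bits long.
    for start in range(len(levels) - 10):
        if levels[start:start + 4] != START_FLAG:
            continue
        body = start + 4
        # 3-bit node count, most significant bit first (assumed).
        count = levels[body] * 4 + levels[body + 1] * 2 + levels[body + 2]
        stop = body + 3 + count
        if levels[stop:stop + 4] == STOP_FLAG:
            return levels[body + 3:stop]
    raise ValueError("no complete frame found")
```

Checking the stop flag at the position implied by the node count guards against a start pattern that happens to appear inside idle or payload bits.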

[0193] In some alternative implementations, the determining module 805 includes:

[0194] The first judgment unit is used to determine whether there is in-situ information with a value equal to the first preset value;

[0195] The second unit is used to, if it exists, take the in-situ information with a value equal to the first preset value as the target in-situ information, and take the node corresponding to the target in-situ information as the first target node, wherein the first preset value is used to determine whether the leakage detection module corresponding to the in-situ information is not in-situ, and the leakage detection module in the first target node is not in-situ.

[0196] The first writing unit is used to write the first target node into the maintenance log, wherein the maintenance log is used to output the target node, and the target node includes the first target node and the second target node;

[0197] The second judgment unit is used to determine whether there is leakage information with a value equal to the second preset value;

[0198] The third unit is used to, if it exists, take the leakage information with a value equal to the second preset value as the target leakage information and take the node corresponding to the target leakage information as the second target node. The second preset value is used to determine whether the liquid cooling equipment corresponding to the leakage information is leaking, and the liquid cooling equipment in the second target node is leaking.

[0199] The second writing unit is used to write the second target node into the maintenance log.
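The determining module can be sketched as below. The 1-based node numbering, the log message wording, and the function names are assumptions; the first preset value is passed in explicitly because this section does not state it, while the second preset value defaults to 0 as in the example.

```python
def find_target_nodes(presence, leakage, first_preset, second_preset=0):
    """Return (first_target_nodes, second_target_nodes) as 1-based node numbers.

    first_target_nodes: nodes whose leak detection module is not present.
    second_target_nodes: nodes whose liquid cooling equipment is leaking.
    """
    absent = [n for n, value in enumerate(presence, start=1) if value == first_preset]
    leaking = [n for n, value in enumerate(leakage, start=1) if value == second_preset]
    return absent, leaking


def write_maintenance_log(log, absent, leaking):
    """Append one entry per target node to the maintenance log (a plain list here)."""
    for node in absent:
        log.append(f"node {node}: leak detection module not present")
    for node in leaking:
        log.append(f"node {node}: liquid cooling equipment leaking")
    return log
```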

[0200] This embodiment provides a fault detection device for liquid cooling equipment, which is deployed in a CPLD. As shown in Figure 9, it includes:

[0201] The first acquisition module 901 is used to acquire the leakage information of the liquid cooling equipment in each node of the server and the presence information of the leakage detection module, wherein the leakage detection module is used to acquire the leakage information.

[0202] The second acquisition module 902 is used to acquire the number of communication links between the CPLD and the BMC;

[0203] The sending module 903 is used to determine the target communication link based on the number of communication links, and send the first target signal and the bit synchronization signal to the BMC through the target communication link, or send the interrupt signal, the second target signal and the bit synchronization signal to the BMC through the target communication link. The first target signal contains the presence information, the second target signal contains the leakage information, the interrupt signal is used to instruct the BMC to receive the second target signal, and the bit synchronization signal is used to instruct the BMC to obtain the presence information from the first target signal or the leakage information from the second target signal.

[0204] In some alternative embodiments, the device further includes:

[0205] The generation module is used to generate a bit synchronization signal based on the initial representative data and a preset clock period, wherein the information in the bit synchronization signal after the initial representative data is generated based on the preset clock period.

[0206] The judgment module is used to determine whether there is leakage information with a value equal to the second preset value. If it exists, an interrupt signal is generated according to the preset interrupt flag.

[0207] In some alternative implementations, the sending module 903 includes:

[0208] The third determining unit is used to determine a first target communication link when the number of communication links is equal to a preset threshold, wherein the first target communication link includes a first communication link, a second communication link and a third communication link;

[0209] The first transmitting unit is used to generate a first target signal based on the in-situ information and the first timing scheme, and to transmit the bit synchronization signal to the BMC through the first communication link and the first target signal to the BMC through the second communication link.

[0210] The second sending unit is used to determine whether there is leakage information with a value equal to the second preset value. If there is, it sends an interrupt signal to the BMC through the third communication link, generates a second target signal according to the leakage information and the first timing scheme, sends a bit synchronization signal to the BMC through the first communication link, and sends the second target signal to the BMC through the second communication link.

[0211] The fourth determining unit is used to determine the second target communication link when the number of communication links is greater than a preset threshold, wherein the second target communication link includes the fourth communication link, the fifth communication link, the sixth communication link and the seventh communication link;

[0212] The third transmitting unit is used to generate a first target signal based on the in-situ information and the second timing scheme, and to transmit the bit synchronization signal to the BMC through the fourth communication link and the first target signal to the BMC through the fifth communication link.

[0213] The third judgment unit is used to determine whether there is leakage information with a value equal to the second preset value. If there is, it sends an interrupt signal to the BMC through the sixth communication link, generates a second target signal according to the leakage information and the third timing scheme, sends a bit synchronization signal to the BMC through the fourth communication link, and sends the second target signal to the BMC through the seventh communication link.

[0214] Further functional descriptions of the above modules and units are the same as those in the corresponding embodiments described above, and will not be repeated here.

[0215] In this embodiment, the liquid cooling equipment fault detection device is presented in the form of a functional unit. Here, a unit refers to an ASIC (Application Specific Integrated Circuit) circuit, a processor and memory that execute one or more software or fixed programs, and / or other devices that can provide the above functions.

[0216] This invention also provides a computer device having the liquid cooling equipment fault detection device shown in Figure 8 and Figure 9.

[0217] Please refer to Figure 10, which is a schematic diagram of the structure of a computer device provided in an optional embodiment of the present invention. As shown in Figure 10, the computer device includes one or more processors 10, memory 20, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components communicate with each other via different buses and can be mounted on a common motherboard or otherwise installed as needed. The processors can process instructions executed within the computer device, including instructions stored in or on memory to display graphical information of a GUI on external input / output devices (such as display devices coupled to the interfaces). In some alternative implementations, multiple processors and / or multiple buses can be used with multiple memories and multiple memory modules, if desired. Similarly, multiple computer devices can be connected, each providing some of the necessary operations (e.g., as a server array, a group of blade servers, or a multiprocessor system). Figure 10 takes one processor 10 as an example.

[0218] Processor 10 may be a central processing unit, a network processor, or a combination thereof. Processor 10 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The programmable logic device may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.

[0219] The memory 20 stores instructions executable by at least one processor 10 to cause at least one processor 10 to perform the method shown in the above embodiments.

[0220] The memory 20 may include a program storage area and a data storage area. The program storage area may store the operating system and applications required for at least one function; the data storage area may store data created based on the use of the computer device. Furthermore, the memory 20 may include high-speed random access memory and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid-state storage device. In some alternative embodiments, the memory 20 may optionally include memory remotely located relative to the processor 10, and these remote memories may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

[0221] The memory 20 may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as flash memory, hard disk or solid-state drive; the memory 20 may also include a combination of the above types of memory.

[0222] The computer device also includes a communication interface 30 for communicating with other devices or communication networks.

[0223] This invention also provides a computer-readable storage medium. The methods described above according to embodiments of the invention can be implemented in hardware or firmware, or implemented as computer code that can be recorded on a storage medium, or implemented as computer code downloaded via a network and originally stored on a remote storage medium or a non-transitory machine-readable storage medium and then stored on a local storage medium. Thus, the methods described herein can be processed by software stored on a storage medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware. The storage medium can be a magnetic disk, optical disk, read-only memory, random access memory, flash memory, hard disk, or solid-state drive, etc.; further, the storage medium can also include combinations of the above types of memory. It is understood that computers, processors, microprocessor controllers, or programmable hardware include storage components capable of storing or receiving software or computer code, which, when accessed and executed by the computer, processor, or hardware, implements the methods shown in the above embodiments.

[0224] Although embodiments of the invention have been described in conjunction with the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations all fall within the scope defined by the appended claims.

Claims

1. A method for fault detection in liquid cooling equipment, characterized in that, The method is applied to BMC, and the method includes: Upon receiving the first target signal and bit synchronization signal sent by the CPLD, the first sampling point is obtained based on the bit synchronization signal. The first target signal contains the presence information of the leakage detection module, which is used to obtain leakage information of the liquid cooling equipment in each node of the server. The in-situ information is obtained based on the first sampling point and the first target signal; The step of obtaining the presence information based on the first sampling point and the first target signal includes: obtaining a first number of first level values based on the first sampling point and the first target signal; determining a first start flag and a first stop flag from the first number of first level values; taking the first level value between the first start flag and the first stop flag as a first target level value and obtaining the sequence number of the first target level value; obtaining the number of nodes in the server based on the first target level value whose sequence number is equal to a preset sequence number; obtaining the number of communication links between the CPLD and the BMC; if the number of communication links is greater than a preset threshold, obtaining the presence information based on the first target level value and the number of nodes; if the number of communication links is equal to the preset threshold, obtaining the presence information based on the first target level value, the number of nodes, and a preset information distribution rule. 
Upon receiving the interrupt signal, the second target signal, and the bit synchronization signal sent by the CPLD, a second sampling point is obtained based on the bit synchronization signal, wherein the interrupt signal is used to instruct the BMC to receive the second target signal, and the second target signal contains the leakage information; The leakage information is obtained based on the second sampling point and the second target signal; The step of obtaining the leakage information based on the second sampling point and the second target signal includes: obtaining a second number of second level values based on the second sampling point and the second target signal; determining a second start flag and a second stop flag from the second number of second level values; taking the second level value between the second start flag and the second stop flag as a second target level value; obtaining the number of nodes in the server from the second target level value; and obtaining the leakage information based on the second target level value and the number of nodes. The target node where the faulty liquid cooling device is located is determined based on the in-situ information or the leakage information, and the target node is output.

2. The method according to claim 1, characterized in that, The step of determining the target node where the faulty liquid cooling device is located based on the in-situ information or the leakage information, and outputting the target node, includes: Determine whether there is any in-situ information with a value equal to the first preset value; If it exists, the in-situ information with a value equal to the first preset value is taken as the target in-situ information, and the node corresponding to the target in-situ information is taken as the first target node. The first preset value is used to determine whether the leakage detection module corresponding to the in-situ information is not in-situ, and the leakage detection module in the first target node is not in-situ. Write the first target node into the maintenance log, wherein the maintenance log is used to output the target node, and the target node includes the first target node and the second target node; Determine if there is any leakage information with a value equal to the second preset value; If it exists, the leakage information with a value equal to the second preset value is taken as the target leakage information, and the node corresponding to the target leakage information is taken as the second target node. The second preset value is used to determine whether the liquid cooling equipment corresponding to the leakage information is leaking, and the liquid cooling equipment in the second target node is leaking. Write the second target node into the maintenance log.

3. A method for fault detection in liquid cooling equipment, characterized in that, The method is applied to a CPLD, and the method includes: The system obtains leakage information of liquid cooling devices and the location information of leakage detection modules in each node of the server, wherein the leakage detection module is used to obtain the leakage information. Obtain the number of communication links between the CPLD and the BMC; The target communication link is determined based on the number of communication links. The first target signal and the bit synchronization signal are sent to the BMC through the target communication link. Alternatively, the interrupt signal, the second target signal, and the bit synchronization signal are sent to the BMC through the target communication link. The first target signal contains the presence information, the second target signal contains the leakage information, the interrupt signal is used to instruct the BMC to receive the second target signal, and the bit synchronization signal is used to instruct the BMC to obtain the presence information from the first target signal or the leakage information from the second target signal. 
The step of determining a target communication link based on the number of communication links, and sending a first target signal and a bit synchronization signal to the BMC through the target communication link, or sending an interrupt signal, a second target signal, and the bit synchronization signal to the BMC through the target communication link, includes: determining a first target communication link when the number of communication links equals a preset threshold, wherein the first target communication link includes a first communication link, a second communication link, and a third communication link; generating the first target signal based on the in-situ information and a first timing scheme, and sending the bit synchronization signal to the BMC through the first communication link, and sending the first target signal to the BMC through the second communication link; determining whether there is leakage information with a value equal to a second preset value, and if so, sending the interrupt signal to the BMC through the third communication link, generating the second target signal based on the leakage information and the first timing scheme, sending the bit synchronization signal to the BMC through the first communication link, and sending the second target signal to the BMC through the second communication link. If the number of communication links is greater than the preset threshold, a second target communication link is determined, wherein the second target communication link includes a fourth communication link, a fifth communication link, a sixth communication link, and a seventh communication link. The first target signal is generated based on the in-situ information and the second timing scheme, and the bit synchronization signal is sent to the BMC via the fourth communication link, and the first target signal is sent to the BMC via the fifth communication link. 
It is determined whether there is leakage information with a value equal to the second preset value. If so, the interrupt signal is sent to the BMC via the sixth communication link. The second target signal is generated based on the leakage information and the third timing scheme, and the bit synchronization signal is sent to the BMC via the fourth communication link, and the second target signal is sent to the BMC via the seventh communication link.
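
For illustration only, the link selection and routing recited in claim 3 could be sketched as follows. This is a minimal sketch and not the claimed implementation: the threshold value `3`, the leak value `1`, the link labels `link1` to `link7`, and the function name `plan_transmissions` are all assumptions introduced here, since the claim only refers to "a preset threshold", "a second preset value", and ordinally named links.

```python
from typing import List, Sequence, Tuple

PRESET_THRESHOLD = 3  # assumption: the claim only says "a preset threshold"
LEAK_VALUE = 1        # assumption: stands in for the "second preset value"


def plan_transmissions(num_links: int, leakage: Sequence[int]) -> List[Tuple[str, str]]:
    """Return (link, signal) pairs in the order the CPLD would send them."""
    if num_links == PRESET_THRESHOLD:
        # First target communication link: presence and leakage frames
        # share the second link.
        sync, presence, irq, leak = "link1", "link2", "link3", "link2"
    elif num_links > PRESET_THRESHOLD:
        # Second target communication link: leakage frames use a
        # dedicated seventh link.
        sync, presence, irq, leak = "link4", "link5", "link6", "link7"
    else:
        raise ValueError("link count below the threshold is not covered by the claim")

    plan = [(sync, "bit_sync"), (presence, "first_target")]
    if LEAK_VALUE in leakage:
        # A leak was reported by at least one node: raise the interrupt,
        # then send the leakage frame together with its bit sync.
        plan += [(irq, "interrupt"), (sync, "bit_sync"), (leak, "second_target")]
    return plan
```

With exactly three links, the leakage frame reuses the link that carries the presence frame, which is presumably why the BMC needs the interrupt signal to tell the two apart; with more links, each frame type has its own line.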

4. The method according to claim 3, characterized in that, before sending the interrupt signal, the second target signal, and the bit synchronization signal to the BMC through the target communication link, the method further includes:
generating the bit synchronization signal based on initial representative data and a preset clock period, wherein the portion of the bit synchronization signal that follows the initial representative data is generated based on the preset clock period;
determining whether there is leakage information with a value equal to the second preset value, and if so, generating the interrupt signal according to a preset interrupt flag.
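
As a rough illustration of claim 4, the sketch below builds a bit synchronization signal as a start pattern followed by a square wave, and produces an interrupt only when some node reports a leak. The start pattern `[1, 1, 0]`, the leak value `1`, the single-level interrupt flag `[0]`, and both function names are assumptions made for this example; the claim does not specify any of them.

```python
from typing import List, Optional, Sequence

START_MARKER = [1, 1, 0]  # assumption: stands in for the "initial representative data"
LEAK_VALUE = 1            # assumption: stands in for the "second preset value"
INTERRUPT_FLAG = [0]      # assumption: stands in for the "preset interrupt flag"


def make_bit_sync(n_bits: int, samples_per_half_period: int = 1) -> List[int]:
    """Start marker followed by n_bits clock periods (high half, then low half)."""
    one_period = [1] * samples_per_half_period + [0] * samples_per_half_period
    return START_MARKER + one_period * n_bits


def make_interrupt(leakage: Sequence[int]) -> Optional[List[int]]:
    """Return the interrupt levels if any node reports a leak, else None."""
    return list(INTERRUPT_FLAG) if LEAK_VALUE in leakage else None
```

For example, `make_bit_sync(2)` yields the start pattern followed by two clock periods, and `make_interrupt([0, 0])` returns `None` because no node reports a leak.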

5. A fault detection device for liquid cooling equipment, characterized in that the device is deployed in the BMC and includes:
a first obtaining module, used to obtain a first sampling point based on the bit synchronization signal upon receiving a first target signal and a bit synchronization signal sent by the CPLD, wherein the first target signal contains the presence information of the leakage detection module, and the leakage detection module is used to obtain leakage information of the liquid cooling equipment in each node of the server;
a second obtaining module, used to obtain the presence information based on the first sampling point and the first target signal;
wherein the second obtaining module includes: a first obtaining unit, configured to obtain a first number of first level values based on the first sampling point and the first target signal; a first determining unit, configured to determine a first start flag and a first stop flag from the first number of first level values; a first acquiring unit, configured to take the first level values between the first start flag and the first stop flag as first target level values and acquire the sequence number of each first target level value; a second obtaining unit, configured to obtain the number of nodes in the server based on the first target level value whose sequence number is equal to a preset sequence number; a second acquiring unit, configured to acquire the number of communication links between the CPLD and the BMC; a third obtaining unit, configured to obtain the presence information based on the first target level values and the number of nodes when the number of communication links is greater than a preset threshold; and a fourth obtaining unit, configured to obtain the presence information based on the first target level values, the number of nodes, and a preset information distribution rule when the number of communication links is equal to the preset threshold;
a third obtaining module, used to obtain a second sampling point based on the bit synchronization signal upon receiving the interrupt signal, the second target signal, and the bit synchronization signal sent by the CPLD, wherein the interrupt signal is used to instruct the BMC to receive the second target signal, and the second target signal contains the leakage information;
a fourth obtaining module, used to obtain the leakage information based on the second sampling point and the second target signal;
wherein the fourth obtaining module includes: a fifth obtaining unit, configured to obtain a second number of second level values based on the second sampling point and the second target signal; a second determining unit, configured to determine a second start flag and a second stop flag from the second number of second level values; a first using unit, configured to use the second level values between the second start flag and the second stop flag as second target level values; and a sixth obtaining unit, configured to obtain the number of nodes in the server from the second target level values, and obtain the leakage information based on the second target level values and the number of nodes;
a determination module, used to determine the target node where the faulty liquid cooling equipment is located based on the presence information or the leakage information, and to output the target node.
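
The BMC-side decoding performed by the obtaining modules of claim 5 can be pictured with the following sketch. It is only a sketch under stated assumptions: sampling points are taken at rising edges of the bit synchronization signal, the start and stop flags are the bit patterns `(1, 0, 1)` and `(0, 1, 1)`, the node count occupies the first four payload positions (standing in for the "preset sequence number"), and any start pattern in the bit synchronization signal has already been stripped. None of these details is fixed by the claim, and all function names are hypothetical. The sketch covers the case with dedicated links; the "preset information distribution rule" used when the link count equals the threshold is not modelled.

```python
from typing import List, Sequence, Tuple

START_FLAG = (1, 0, 1)  # assumption: illustrative start flag pattern
STOP_FLAG = (0, 1, 1)   # assumption: illustrative stop flag pattern
COUNT_BITS = 4          # assumption: payload positions 0..3 hold the node count


def sampling_points(bit_sync: Sequence[int]) -> List[int]:
    """Indices at which the bit sync signal rises from 0 to 1."""
    return [i for i in range(1, len(bit_sync))
            if bit_sync[i - 1] == 0 and bit_sync[i] == 1]


def find_flag(levels: Sequence[int], flag: Tuple[int, ...], from_end: bool = False) -> int:
    """Index of the first (or, with from_end, the last) occurrence of flag."""
    n = len(flag)
    last = len(levels) - n
    candidates = range(last, -1, -1) if from_end else range(last + 1)
    for i in candidates:
        if tuple(levels[i:i + n]) == flag:
            return i
    raise ValueError("flag not found")


def decode_frame(bit_sync: Sequence[int], target: Sequence[int]) -> Tuple[int, List[int]]:
    """Return (node_count, per_node_bits) recovered from a target signal."""
    # Sample the target signal at the points indicated by the bit sync.
    levels = [target[i] for i in sampling_points(bit_sync)]
    # Keep only the level values between the start flag and the stop flag.
    start = find_flag(levels, START_FLAG) + len(START_FLAG)
    stop = find_flag(levels, STOP_FLAG, from_end=True)
    payload = levels[start:stop]
    # The leading payload bits give the node count, MSB first.
    node_count = int("".join(str(b) for b in payload[:COUNT_BITS]), 2)
    per_node = payload[COUNT_BITS:COUNT_BITS + node_count]
    if len(per_node) != node_count:
        raise ValueError("frame shorter than the announced node count")
    return node_count, per_node
```

The same routine would serve for both the presence frame and the leakage frame, since the claim describes the two decoding paths in parallel terms; the index of any set bit in the returned list then identifies the target node to report.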

6. A fault detection device for liquid cooling equipment, characterized in that the device is deployed in a CPLD and includes:
a first acquisition module, used to acquire leakage information of the liquid cooling equipment and presence information of the leakage detection module in each node of the server, wherein the leakage detection module is used to acquire the leakage information;
a second acquisition module, used to acquire the number of communication links between the CPLD and the BMC;
a sending module, configured to determine a target communication link based on the number of communication links, and send a first target signal and a bit synchronization signal to the BMC through the target communication link, or send an interrupt signal, a second target signal, and the bit synchronization signal to the BMC through the target communication link, wherein the first target signal contains the presence information, the second target signal contains the leakage information, the interrupt signal instructs the BMC to receive the second target signal, and the bit synchronization signal instructs the BMC to obtain the presence information from the first target signal or the leakage information from the second target signal;
wherein the sending module includes:
a third determining unit, configured to determine a first target communication link when the number of communication links equals a preset threshold, wherein the first target communication link includes a first communication link, a second communication link, and a third communication link;
a first sending unit, configured to generate the first target signal according to the presence information and a first timing scheme, send the bit synchronization signal to the BMC through the first communication link, and send the first target signal to the BMC through the second communication link;
a second sending unit, configured to determine whether there is leakage information with a value equal to a second preset value, and if so, send the interrupt signal to the BMC through the third communication link, generate the second target signal according to the leakage information and the first timing scheme, send the bit synchronization signal to the BMC through the first communication link, and send the second target signal to the BMC through the second communication link;
a fourth determining unit, configured to determine a second target communication link when the number of communication links is greater than the preset threshold, wherein the second target communication link includes a fourth communication link, a fifth communication link, a sixth communication link, and a seventh communication link;
a third sending unit, configured to generate the first target signal according to the presence information and the second timing scheme, send the bit synchronization signal to the BMC through the fourth communication link, and send the first target signal to the BMC through the fifth communication link;
a third judging unit, configured to judge whether there is leakage information with a value equal to the second preset value, and if so, send the interrupt signal to the BMC through the sixth communication link, generate the second target signal according to the leakage information and the third timing scheme, send the bit synchronization signal to the BMC through the fourth communication link, and send the second target signal to the BMC through the seventh communication link.

7. A computer device, characterized in that it includes a memory and a processor that are interconnected, wherein the memory stores computer instructions, and the processor executes the computer instructions to perform the liquid cooling equipment fault detection method according to any one of claims 1 to 4.

8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions for causing a computer to execute the liquid cooling equipment fault detection method according to any one of claims 1 to 4.