Server component heat dissipation control methods, devices, storage media, and computer equipment
By reading backup temperature sensor data and communication link status information after the server starts, predicting power consumption from them, and deriving fan speed control parameters, the method addresses insufficient heat dissipation during the initialization phase of server components, achieves precise heat dissipation control, and improves the stability and reliability of the server.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- NINGCHANG INFORMATION TECH (HANGZHOU) CO LTD
- Filing Date
- 2026-03-12
- Publication Date
- 2026-06-30
AI Technical Summary
During the initialization phase, server components may experience insufficient heat dissipation due to the inability to obtain timely and effective temperature data, which can easily lead to overheating risks.
After the server starts up, it reads data from backup temperature sensors around the target component and communication link status information to predict and correct component power consumption, and obtains fan speed control parameters to control the cooling fan.
It achieves precise heat dissipation control during component initialization, avoids the risk of overheating, improves server stability and reliability, and optimizes energy utilization efficiency.
Figure CN122309276A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of server heat dissipation technology, and in particular to a method, apparatus, storage medium, and computer device for controlling heat dissipation of server components.
Background Technology
[0002] With the development of cloud computing, artificial intelligence, and big data technologies, the number of high-power components and their performance requirements in servers are constantly increasing. Taking GPUs (Graphics Processing Units) and network cards as examples, high-power components consume significantly more power to meet these demands, generating a large amount of heat during operation. If this heat cannot be dissipated in time, it can lead to hardware overheating, performance degradation, or even system crashes.
[0003] The current mainstream server cooling logic is as follows: after the system boots, the BMC (Baseboard Management Controller) waits for the GPU and network card to complete power-on initialization, and confirms that they are present and that their temperature data is normal before allowing them to participate in fan speed regulation. If a component's temperature sensor data is abnormal, the temperature from surrounding sensors is used; if both are abnormal, the fan speed is adjusted according to the power consumption data and the corresponding speed parameters.
[0004] During the power-up phase, the GPU/NIC's own initialization is time-consuming, and the BMC cannot immediately acquire temperature sensor data. From power-up to the point where the internal main temperature sensor provides reliable readings, the GPU/NIC goes through a complete process from hardware self-test to software readiness. This includes controller firmware self-initialization, PCIe link training (PCIe, Peripheral Component Interconnect Express, is a high-speed serial expansion bus standard that connects the processor to peripheral devices such as GPUs and NICs), loading of the host operating system, and successful loading of dedicated drivers. Readings become most reliable only after the operating system driver has loaded. Until then, when the BMC reads the relevant registers it may obtain invalid data, all-F values, or no response, so it cannot determine whether the component is present, and neither the main nor the backup sensors can participate in fan speed regulation.
[0005] The existing BMC cooling logic relies on the presence status, temperature data, and power consumption data of the GPU/network card, and has no transitional cooling scheme for the initialization phase. However, the GPU/network card still generates heat during initialization. Insufficient cooling can easily lead to overheating and may even trigger hardware protection mechanisms.
Summary of the Invention
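To illustrate the "invalid data, all-F values, or no response" cases described above, a BMC-side validity check might look like the following sketch. The function name `is_valid_reading`, the 16-bit register width, and the plausible temperature range are assumptions made for illustration; the application does not specify them.

```python
from typing import Optional

ALL_F_16 = 0xFFFF  # assumed 16-bit register; an absent device often reads back all ones


def is_valid_reading(raw: Optional[int], min_c: int = 0, max_c: int = 125) -> bool:
    """Return True only if a raw register value looks like a usable temperature."""
    if raw is None:       # no response from the device
        return False
    if raw == ALL_F_16:   # all-F pattern
        return False
    return min_c <= raw <= max_c  # assumed plausible range in degrees Celsius
```

A reading that fails this check would be treated as "not ready" rather than as a temperature.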
[0006] In view of this, embodiments of this application provide a method, apparatus, storage medium, and computer device for controlling the heat dissipation of server components. After the server is started, the method reads the temperature data from the backup temperature sensor of the target component and obtains the communication link status information to predict and correct the current power consumption of the target component, and then obtains the corresponding fan speed control parameters to control the server cooling fan. This aims to solve the problem of insufficient heat dissipation caused by the inability to obtain effective temperature data in a timely manner during the initialization phase of the server components, and to achieve more accurate and effective heat dissipation control of the server components.
[0007] According to one aspect of this application, a method for controlling heat dissipation of server components is provided, the method comprising: after the server starts up, reading first temperature data from a backup temperature sensor corresponding to a target component to be cooled, and obtaining communication link status information corresponding to the target component, where the backup temperature sensor is located within a preset range centered on the target component; predicting the current power consumption of the target component based on the communication link status information and the nominal power consumption of the target component, and correcting the current power consumption prediction value based on the first temperature data to obtain a target power consumption; and obtaining fan speed control parameters corresponding to the target power consumption, and controlling the cooling fan of the server based on the fan speed control parameters.
[0008] The beneficial effects are: it can solve the problem of insufficient heat dissipation caused by the inability to obtain effective temperature data in a timely manner during the component initialization stage, thus avoiding the risk of overheating; it can improve the accuracy and efficiency of heat dissipation by accurately predicting and adjusting power consumption by comprehensively considering multiple factors; and it can dynamically adapt to the diverse heat dissipation needs of different component models, thereby enhancing the flexibility and versatility of heat dissipation control.
[0009] Optionally, the communication link status information includes the highest link speed, highest link width, current negotiated rate, current link width, and signal quality parameters. The step of predicting the current power consumption of the target component based on the communication link status information and the nominal power consumption of the target component includes: determining the maximum link rate power consumption coefficient based on the highest link speed and the highest link width; determining the current link rate power consumption coefficient based on the current negotiated rate and the current link width; determining the signal compensation factor based on the signal quality parameters; calculating the ratio of the current link rate power consumption coefficient to the maximum link rate power consumption coefficient, and determining the signal compensation coefficient based on the signal compensation factor; and taking the product of the nominal power consumption of the target component, the ratio, and the signal compensation coefficient as the current power consumption prediction value of the target component.
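Written out, the prediction described in this paragraph takes the following form. The symbols are introduced here for illustration only: P_nom is the nominal power consumption, C_cur and C_max are the current and maximum link rate power consumption coefficients, and K_sig is the signal compensation coefficient.

```latex
P_{\text{pred}} = P_{\text{nom}} \cdot \frac{C_{\text{cur}}}{C_{\text{max}}} \cdot K_{\text{sig}}
```

As a purely illustrative example, a component with P_nom = 300 W whose link coefficient ratio is 0.5 and whose K_sig is 1.1 would have a predicted power consumption of 300 × 0.5 × 1.1 = 165 W.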
[0010] The beneficial effects are as follows: By analyzing multiple key parameters in the communication link status information and determining different power consumption coefficients and compensation factors, the current power consumption prediction value of the target component is calculated. This more accurately reflects the actual power consumption of the target component under its current operating state, providing more accurate data support for subsequent control of the server's cooling fans. This helps to achieve more precise heat dissipation control, avoiding insufficient or excessive heat dissipation caused by inaccurate power consumption prediction, improving server stability and reliability, and also optimizing energy utilization efficiency to some extent.
[0011] Optionally, the step of correcting the current power consumption prediction value based on the first temperature data to obtain the target power consumption includes: calculating the temperature change data corresponding to the target component from the first temperature data and the temperature data of the previous moment; determining the power consumption correction coefficient based on the product of the temperature change data and a preset gain coefficient; and calculating the target power consumption by multiplying the current power consumption prediction value by the power consumption correction coefficient.
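The correction can be written as follows. The text states only that the correction coefficient is determined from the product of the temperature change and the preset gain; the specific form 1 + G·ΔT used here is an assumption, as are the symbols T_t (current backup sensor reading), T_{t-1} (previous reading), and G (preset gain coefficient).

```latex
\Delta T = T_{t} - T_{t-1}, \qquad
K_{\text{corr}} = 1 + G \cdot \Delta T \quad (\text{assumed form}), \qquad
P_{\text{target}} = P_{\text{pred}} \cdot K_{\text{corr}}
```

Under this assumed form, a rising temperature (ΔT > 0) increases the target power consumption and a steady temperature leaves the prediction unchanged.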
[0012] The beneficial effects are as follows: By introducing temperature data and its changes from a backup temperature sensor, and combining this with a preset gain coefficient to correct the current power consumption prediction, the actual operating status of the target component can be considered more comprehensively and dynamically. Temperature change data can reflect changes in the surrounding environment and the component's own heat generation in real time, while the preset gain coefficient can be flexibly adjusted according to different components and scenarios, making power consumption correction more accurate. The target power consumption obtained in this way is closer to the actual situation, providing a more reliable basis for controlling server cooling fans, helping to achieve more precise heat dissipation control, avoiding insufficient or excessive heat dissipation caused by inaccurate power consumption prediction, improving server stability and reliability, and optimizing energy utilization efficiency.
[0013] Optionally, obtaining the fan speed control parameters corresponding to the target power consumption and controlling the server's cooling fan based on the fan speed control parameters includes: If there are multiple target components, the fan speed control parameters corresponding to the target power consumption of each target component are obtained respectively. The target fan speed control parameters are determined based on the largest fan speed control parameters, and the cooling fan of the server is controlled based on the target fan speed control parameters. If there is only one target component, the fan speed control parameter corresponding to the target power consumption is obtained as the target fan speed control parameter, and the cooling fan of the server is controlled based on the target fan speed control parameter.
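The selection rule above can be sketched as follows. The function name `target_fan_parameter` and the `lookup` callable, which stands in for whatever mapping from target power to fan speed control parameter the BMC uses, are hypothetical.

```python
from typing import Callable, Sequence


def target_fan_parameter(
    target_powers_w: Sequence[float],
    lookup: Callable[[float], float],
) -> float:
    """Map each component's target power to a fan parameter and keep the largest."""
    if not target_powers_w:
        raise ValueError("at least one target component is required")
    return max(lookup(p) for p in target_powers_w)
```

With a single component, `max` over one element simply returns that component's parameter, so both branches of the paragraph are covered by the same call.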
[0014] The beneficial effects are as follows: It takes into account the varying number of target components in the server and employs a flexible strategy to determine the target fan speed parameters. When there are multiple target components, selecting the maximum fan speed parameter as the target parameter ensures that all components receive sufficient heat dissipation, preventing performance degradation or damage due to insufficient cooling of some components, thus improving server stability and reliability. When there is only one target component, its corresponding fan speed parameter is directly controlled, avoiding unnecessary high-speed fan operation, reducing energy consumption, and achieving energy-saving effects. Furthermore, this method of dynamically adjusting fan speed based on actual conditions better adapts to the heat dissipation needs under different server workloads, optimizing the overall performance and heat dissipation efficiency of the server.
[0015] Optionally, controlling the server's cooling fan based on the target fan speed parameters includes: The target duty cycle is determined based on the target fan speed control parameters, and a target duty cycle signal is generated based on the target duty cycle. The target duty cycle signal is sent to the cooling fan of the server so that the cooling fan adjusts its speed based on the target duty cycle signal.
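One possible conversion from a fan speed control parameter to a duty cycle is sketched below. It assumes the parameter is a target speed in RPM and that the duty cycle scales linearly with speed; the minimum duty of 20% is likewise an illustrative assumption. Sending the resulting PWM signal to the fan is hardware-specific and is not shown.

```python
def rpm_to_duty_cycle(target_rpm: float, max_rpm: float, min_duty: float = 20.0) -> float:
    """Convert a target speed to a PWM duty cycle in percent, clamped to [min_duty, 100]."""
    if max_rpm <= 0:
        raise ValueError("max_rpm must be positive")
    duty = 100.0 * target_rpm / max_rpm  # assumed linear speed-to-duty relationship
    return max(min_duty, min(100.0, duty))
```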
[0016] The beneficial effects are as follows: by converting the target fan speed control parameters into a target duty cycle signal and transmitting it to the cooling fan for speed adjustment, precise control of the server's cooling fan is achieved. This control method can dynamically adjust the fan speed according to the server's actual cooling needs, avoiding energy waste caused by the fan always running at high speed and reducing the server's power consumption. At the same time, precise speed adjustment can ensure that the server maintains a suitable operating temperature under different workloads, preventing performance degradation and hardware damage caused by overheating, improving the server's stability and reliability, and extending the server's lifespan.
[0017] Optionally, after the server starts, the method further includes: reading the second temperature data from the main temperature sensor of the target component to determine whether the main temperature sensor is ready. The main temperature sensor is considered ready when the second temperature data is read successfully and, across a preset number of reads, the difference between each pair of adjacent readings is within a preset temperature fluctuation range. If the main temperature sensor is ready, the cooling fan is controlled based on the heat dissipation control logic corresponding to the main temperature sensor.
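The readiness condition can be sketched as a small check over the most recent reads. The function name, the use of `None` to represent a failed read, and the default values of 3 reads and 2.0 °C are assumptions for illustration.

```python
from typing import Optional, Sequence


def primary_sensor_ready(
    readings: Sequence[Optional[float]],
    required_reads: int = 3,
    max_fluctuation_c: float = 2.0,
) -> bool:
    """True if the last `required_reads` reads all succeeded and every pair of
    adjacent values differs by no more than `max_fluctuation_c`."""
    if required_reads < 1 or len(readings) < required_reads:
        return False
    recent = readings[-required_reads:]
    if any(r is None for r in recent):  # None models a failed read
        return False
    return all(abs(b - a) <= max_fluctuation_c for a, b in zip(recent, recent[1:]))
```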
[0018] The beneficial effects are as follows: During server startup or in the event of a primary temperature sensor failure, the backup temperature sensor can function promptly, accurately predicting and correcting the power consumption of target components by combining communication link status information. This enables effective control of the cooling fans, preventing insufficient or excessive heat dissipation due to inaccurate temperature measurements. Once the primary temperature sensor is ready, the system switches to its corresponding, more refined heat dissipation control logic, further optimizing the cooling effect.
[0019] Optionally, after controlling the cooling fan based on the heat dissipation control logic corresponding to the main temperature sensor, the method further includes: Stop reading the first temperature data of the backup temperature sensor, and determine whether the main temperature sensor has entered a fault state by reading the second temperature data of the main temperature sensor corresponding to the target component. The fault state conditions of the main temperature sensor include failure to read the second temperature data, or the difference between adjacent second temperature data read is outside the preset temperature fluctuation range. If the main temperature sensor malfunctions, the process returns to the step of reading the first temperature data from the backup temperature sensor corresponding to the target component to be cooled.
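The switch between the backup scheme and the main-sensor control logic behaves like a two-state machine. The sketch below is one way to express it; the `Mode` enum, the function `next_mode`, and the 2.0 °C default are hypothetical names and values chosen for illustration.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    BACKUP = "backup"    # backup sensor plus link-based power prediction
    PRIMARY = "primary"  # control logic driven by the main temperature sensor


def next_mode(
    mode: Mode,
    primary_ready: bool,
    last: Optional[float],
    current: Optional[float],
    max_fluctuation_c: float = 2.0,
) -> Mode:
    """Return the control mode to use for the next cycle."""
    if mode is Mode.BACKUP:
        return Mode.PRIMARY if primary_ready else Mode.BACKUP
    # In PRIMARY mode, a failed read or an out-of-range jump counts as a fault.
    if current is None:
        return Mode.BACKUP
    if last is not None and abs(current - last) > max_fluctuation_c:
        return Mode.BACKUP
    return Mode.PRIMARY
```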
[0020] The beneficial effects are: continuous monitoring and timely handling of the main temperature sensor's fault status ensures that the system can quickly switch back to the backup solution when the main temperature sensor malfunctions, ensuring that the server is always in a good heat dissipation state, extending the service life of the server hardware, and improving the overall performance and operational stability of the server.
[0021] According to another aspect of this application, a component heat dissipation control device for a server is provided, the device comprising: The communication module is used to read the first temperature data of the backup temperature sensor corresponding to the target component to be cooled after the server starts up, and to obtain the communication link status information corresponding to the target component. The backup temperature sensor is set within a preset range centered on the target component. The calculation module is used to predict the current power consumption of the target component based on the communication link status information and the nominal power consumption of the target component, and to correct the current power consumption prediction value based on the first temperature data to obtain the target power consumption. The control module is used to acquire fan speed control parameters corresponding to the target power consumption, and control the cooling fan of the server based on the fan speed control parameters.
[0022] According to another aspect of this application, a storage medium is provided that stores a computer program thereon, which, when executed by a processor, implements the above-described component heat dissipation control method for the server.
[0023] According to another aspect of this application, a computer device is provided, including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor executes the program to implement the above-described component heat dissipation control method for a server.
[0024] By employing the above technical solutions, this application provides a method, apparatus, storage medium, and computer device for controlling the heat dissipation of server components. Regarding server component heat dissipation, after the server starts up, it first reads the temperature data from a backup temperature sensor around the target component and obtains the communication link status information. Based on this, it predicts and corrects the current power consumption of the target component to obtain the target power consumption. Then, it obtains the fan speed control parameters corresponding to the target power consumption to control the cooling fan. Its advantages include: solving the problem of insufficient heat dissipation caused by the inability to obtain effective temperature data in a timely manner during component initialization, avoiding the risk of overheating; improving the accuracy and efficiency of heat dissipation by comprehensively and accurately predicting and adjusting power consumption based on multiple factors; and dynamically adapting to the diverse heat dissipation needs of different component models, enhancing the flexibility and versatility of heat dissipation control.
[0025] The above description is only an overview of the technical solution of this application. So that the technical means of this application can be better understood and implemented in accordance with the contents of the specification, and so that the above and other objects, features, and advantages of this application are more readily apparent, specific embodiments of this application are set out below.
Attached Figure Description
[0026] The accompanying drawings, which are included to provide a further understanding of this application and form part of this application, illustrate exemplary embodiments and are used to explain this application, but do not constitute an undue limitation of this application. In the drawings:
Figure 1 shows a schematic flowchart of a server component heat dissipation control method provided in an embodiment of this application;
Figure 2 shows a flowchart of another server component heat dissipation control method provided in an embodiment of this application;
Figure 3 shows a schematic diagram of device interaction for a server according to an embodiment of this application;
Figure 4 shows a flowchart of another server component heat dissipation control method provided in an embodiment of this application;
Figure 5 shows a schematic diagram of a server's BMC information processing procedure according to an embodiment of this application;
Figure 6 shows a structural schematic diagram of a server component heat dissipation control device according to an embodiment of this application;
Figure 7 shows a schematic diagram of the device structure of a computer device provided in an embodiment of this application.
Detailed Implementation
[0027] The present application will be described in detail below with reference to the accompanying drawings and embodiments. It should be noted that, unless otherwise specified, the embodiments and features described in the embodiments of the present application can be combined with each other.
[0028] This embodiment provides a method for controlling the heat dissipation of server components. As shown in Figure 1, the method includes:
Step 101: After the server starts, read the first temperature data from the backup temperature sensor corresponding to the target component to be cooled, and obtain the communication link status information corresponding to the target component. The backup temperature sensor is located within a preset range centered on the target component.
Step 102: Based on the communication link status information and the nominal power consumption of the target component, predict the current power consumption of the target component, and correct the current power consumption prediction value based on the first temperature data to obtain the target power consumption.
Step 103: Obtain the fan speed control parameters corresponding to the target power consumption, and control the cooling fan of the server based on the fan speed control parameters.
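Steps 101 to 103 can be read as one pass of a control loop. The skeleton below is a sketch only: every callable passed in is a hypothetical stand-in for BMC functionality (sensor access, link status query, fan driver) that the application does not define at the code level.

```python
from typing import Callable


def cooling_cycle(
    read_backup_temp: Callable[[], float],
    predict_power: Callable[[], float],
    correct_power: Callable[[float, float], float],
    power_to_fan_param: Callable[[float], float],
    set_fan: Callable[[float], None],
) -> float:
    """Run one pass of steps 101-103 and return the fan parameter applied."""
    temp_c = read_backup_temp()                    # step 101: backup sensor reading
    predicted_w = predict_power()                  # step 102: link status + nominal power
    target_w = correct_power(predicted_w, temp_c)  # step 102: temperature correction
    fan_param = power_to_fan_param(target_w)       # step 103: look up fan parameter
    set_fan(fan_param)                             # step 103: drive the cooling fan
    return fan_param
```

Passing the stages in as callables keeps the skeleton independent of any particular sensor bus or fan controller.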
[0029] This application provides a method for controlling the heat dissipation of server components. After the server starts up, it reads the temperature data from the backup temperature sensor of the target component and obtains the communication link status information to predict and correct the current power consumption of the target component. Then, it obtains the corresponding fan speed control parameters to control the server cooling fan. This method aims to solve the problem of insufficient heat dissipation caused by the inability to obtain effective temperature data in a timely manner during the initialization phase of the server components, and achieve more accurate and effective heat dissipation control of the server components.
[0030] First, during the initial server startup phase, target components (such as GPUs and network cards) may be in the initialization stage, and their primary temperature sensors cannot immediately provide reliable temperature data. At this time, the temperature around the target component is obtained by reading the first temperature data from a backup temperature sensor located within a preset range centered on the target component. Simultaneously, communication link status information is acquired. This status reflects the communication between the target component and other parts of the server (such as the BMC), enabling subsequent prediction of the target component's power consumption.
[0031] Secondly, by combining communication link status information with the nominal power consumption of the target component (i.e., the typical power consumption value of the component during normal operation), the current power consumption of the target component during the startup phase can be initially predicted. However, since the actual situation may deviate, the current power consumption prediction value can be corrected using the first temperature data from the backup temperature sensor read in step 101. Because there is a certain correlation between temperature and power consumption, the ambient temperature can help to more accurately determine the actual power consumption of the target component, ultimately obtaining a more accurate target power consumption.
[0032] Finally, different power consumption levels correspond to different amounts of heat. To dissipate heat effectively, appropriate fan speed control parameters can be determined from the target power consumption: a relationship between power consumption and fan speed control parameters is defined in advance, and the parameter matching the target power consumption is looked up. The server's cooling fan is then controlled according to this parameter so that it runs at an appropriate speed and cools the target component effectively. This technical solution addresses the problem that, during server component initialization, traditional methods cannot obtain timely and valid data from the component's main temperature sensor, which leads to insufficient heat dissipation and a risk of overheating. The solution reads data from a backup temperature sensor and combines it with communication link status information to predict and correct power consumption and thereby control the fan speed, which enables effective heat dissipation control during component initialization. Furthermore, because the target power consumption is predicted and corrected from the backup temperature sensor data, the communication link status information, and the nominal power consumption of the target component, rather than from uniform heat dissipation parameters, it reflects the actual heat generation of the target component more accurately. The fan speed can therefore be matched to the actual heat generation, which improves heat dissipation efficiency and avoids both insufficient and excessive cooling. In addition, since the heat dissipation requirements of different server components vary greatly, this technical solution does not rely on the fixed parameters of specific components.
Instead, it dynamically predicts and adjusts power consumption and fan speed according to the actual situation, which can better adapt to the diverse heat dissipation requirements of different component models and enhance the flexibility and versatility of server heat dissipation control.
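The predefined relationship between power consumption and fan speed parameters could, for example, be a banded lookup table. The sketch below assumes that form; the table values, the use of duty cycle as the fan parameter, and the names `POWER_TO_DUTY` and `fan_duty_for_power` are all hypothetical.

```python
from bisect import bisect_left

# Hypothetical table: (upper power bound in watts, fan duty in percent).
POWER_TO_DUTY = [(50.0, 30.0), (150.0, 50.0), (300.0, 75.0)]
MAX_DUTY = 100.0  # used when the target power exceeds every bound


def fan_duty_for_power(target_w: float) -> float:
    """Return the duty of the first band whose upper bound covers target_w."""
    bounds = [bound for bound, _ in POWER_TO_DUTY]
    i = bisect_left(bounds, target_w)
    return POWER_TO_DUTY[i][1] if i < len(POWER_TO_DUTY) else MAX_DUTY
```

A per-model table of this kind is one way the same logic could serve components with different heat dissipation requirements.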
[0033] By applying the technical solution of this embodiment for server component heat dissipation, after the server starts up, the temperature data from the backup temperature sensor around the target component to be cooled is first read, and the communication link status information is obtained. Based on this, the current power consumption of the target component is predicted and corrected to obtain the target power consumption. Then, the fan speed control parameters corresponding to the target power consumption are obtained to control the cooling fan. Its advantages are that it can solve the problem of insufficient heat dissipation caused by the inability to obtain effective temperature data in a timely manner during component initialization, avoiding the risk of overheating; by comprehensively and accurately predicting and adjusting power consumption based on multiple factors, the accuracy and efficiency of heat dissipation are improved; and it can dynamically adapt to the diverse heat dissipation needs of different component models, enhancing the flexibility and versatility of heat dissipation control.
[0034] In this embodiment of the application, optionally, the communication link status information in step 102 includes the highest link speed, highest link width, current negotiated rate, current link width, and signal quality parameters. The step of predicting the current power consumption of the target component based on the communication link status information and the nominal power consumption of the target component includes: determining the maximum link rate power consumption coefficient based on the highest link speed and the highest link width; determining the current link rate power consumption coefficient based on the current negotiated rate and the current link width; determining the signal compensation factor based on the signal quality parameters; calculating the ratio of the current link rate power consumption coefficient to the maximum link rate power consumption coefficient, and determining the signal compensation coefficient based on the signal compensation factor; and taking the product of the nominal power consumption of the target component, the ratio, and the signal compensation coefficient as the current power consumption prediction value of the target component.
[0035] In the above embodiments, multiple key parameters are acquired from the communication link status information, including the highest link speed, highest link width, current negotiated rate, current link width, and signal quality parameters, and the power consumption coefficients and compensation factor are determined from them. The current power consumption prediction value of the target component is then calculated, providing a key basis for subsequent accurate heat dissipation control.
[0036] Specifically, the maximum link rate power consumption coefficient is determined based on the highest link speed and highest link width in the communication link status information. The highest link speed and width represent the theoretically maximum transmission capacity achievable by the communication link. Different combinations of speed and width correspond to different power consumption levels, so a maximum link rate power consumption coefficient can be determined based on them to reflect the power consumption characteristics of the link under maximum transmission capacity. This can be determined by a weighted summation of the highest link speed and highest link width. The current link rate power consumption coefficient is determined based on the current negotiation rate and current link width. The current negotiation rate and link width are transmission parameters in actual operation of the link, determining its current actual transmission capacity. The current link rate power consumption coefficient determined by these parameters reflects the power consumption of the link under its current operating state. This can also be determined by a weighted summation of the current negotiation rate and current link width. A signal compensation factor is determined based on signal quality parameters. Signal quality affects the operating efficiency and power consumption of components. Better signal quality may lead to more stable component operation and relatively lower power consumption; conversely, poor signal quality may result in higher power consumption. Therefore, signal quality parameters can be converted into a signal compensation factor to correct for power consumption. Finally, the ratio of the current link rate power consumption coefficient to the maximum link rate power consumption coefficient is calculated. 
This ratio reflects the relationship between the link's current actual transmission capacity and its maximum transmission capacity and, to a certain extent, the proportion of the component's power consumption in its current operating state relative to its maximum power consumption. The signal compensation coefficient is determined from the signal compensation factor, which quantifies the impact of signal quality on power consumption. The factor is converted into the signal compensation coefficient by a set rule so that the predicted power consumption value better matches the actual situation; specifically, the sum of 1 and the signal compensation factor can be used as the signal compensation coefficient. The nominal power consumption of the target component, the above ratio, and the signal compensation coefficient are then multiplied together, and the product is used as the current power consumption prediction value of the target component. The nominal power consumption is the typical power consumption of the component under normal operation. Combining it with the ratio, which reflects the link's operating state, and the signal compensation coefficient, which reflects signal quality, takes multiple factors into account and predicts the target component's actual current power consumption more accurately.
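The calculation described above can be sketched in a few lines. The weighted sum and the "1 plus factor" rule come from the text; the equal weights, the GT/s unit, the field names, and the fact that the signal compensation factor is supplied directly rather than derived from raw signal quality parameters are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinkStatus:
    max_speed_gts: float   # highest supported link speed
    max_width: int         # highest supported lane count
    cur_speed_gts: float   # currently negotiated rate
    cur_width: int         # current lane count
    signal_factor: float   # signal compensation factor (derivation not shown)


def link_coefficient(speed: float, width: int, w_speed: float = 1.0, w_width: float = 1.0) -> float:
    """Weighted sum of speed and width; the weights are placeholders."""
    return w_speed * speed + w_width * width


def predict_power(nominal_w: float, link: LinkStatus) -> float:
    """Nominal power x (current / maximum link coefficient) x (1 + signal factor)."""
    c_max = link_coefficient(link.max_speed_gts, link.max_width)
    c_cur = link_coefficient(link.cur_speed_gts, link.cur_width)
    if c_max <= 0:
        raise ValueError("maximum link coefficient must be positive")
    signal_coefficient = 1.0 + link.signal_factor
    return nominal_w * (c_cur / c_max) * signal_coefficient
```

For instance, a hypothetical 300 W component whose link has trained at half its maximum speed and width, with a signal factor of 0, would be predicted at 150 W under these placeholder weights.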
[0037] This technical solution analyzes multiple key parameters in the communication link status information, determines different power consumption coefficients and compensation factors, and then calculates the current power consumption prediction value of the target component. This more accurately reflects the actual power consumption of the target component under its current operating state, providing more precise data support for subsequent control of the server's cooling fans. This helps achieve more precise heat dissipation control, avoiding insufficient or excessive heat dissipation caused by inaccurate power consumption prediction, improving server stability and reliability, and also optimizing energy utilization efficiency to some extent.
[0038] In this embodiment of the application, optionally, the step 102 of correcting the current power consumption prediction value based on the first temperature data to obtain the target power consumption includes: Based on the first temperature data and the temperature data of the previous moment corresponding to the first temperature data, calculate the temperature change data corresponding to the target component, and determine the power consumption correction coefficient based on the product of the temperature change data and the preset gain coefficient. The target power consumption is calculated by multiplying the current power consumption prediction value by the power consumption correction coefficient.
[0039] In the above embodiments, based on the current power consumption prediction value of the target component, the first temperature data collected by the backup temperature sensor and the temperature data at the previous moment are further used to calculate the temperature change data and combine it with the preset gain coefficient to obtain the power consumption correction coefficient. Finally, the target power consumption is obtained by multiplying the current power consumption prediction value by the power consumption correction coefficient, thereby improving the accuracy of power consumption prediction and providing a more reliable basis for heat dissipation control.
[0040] Specifically, the backup temperature sensor continuously collects temperature information around the target component, obtaining the first temperature data (current temperature) and its corresponding previous temperature data. By subtracting the previous temperature from the current temperature, the temperature change of the target component during that time period can be obtained. This temperature change data reflects the dynamic trend of the temperature around the target component, such as whether the temperature is rising, falling, or remaining stable. The preset gain coefficient is a pre-set parameter used to adjust the degree of influence of temperature changes on power consumption correction. Different server components and heat dissipation scenarios may require different gain coefficients, which are usually determined through extensive experimental testing and data analysis. The calculated temperature change data is multiplied by the preset gain coefficient, and the power consumption correction coefficient is determined based on the result. For example, the power consumption correction coefficient is the sum of the product and 1. The magnitude of this coefficient depends on the temperature change data and the gain coefficient, reflecting the degree and direction of the influence of temperature changes on the power consumption of the target component. The current power consumption prediction value is a preliminary estimate of the target component's power consumption based on factors such as communication link status information. However, due to the influence of various factors, this prediction value may have some error. By multiplying the current power consumption prediction value by the power consumption correction coefficient, the prediction value can be adjusted and corrected. 
If the power consumption correction coefficient is greater than 1, the temperature change suggests that the power consumption of the target component may be higher than the predicted value, and the corrected target power consumption increases; if the power consumption correction coefficient is less than 1, the power consumption of the target component may be lower than the predicted value, and the corrected target power consumption decreases. In this way a more accurate target power consumption is obtained.
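The correction step can be sketched as follows, assuming the example rule given above in which the power consumption correction coefficient is 1 plus the product of the temperature change and the preset gain coefficient. The function name, the parameter names, and the gain value used in the usage note are hypothetical.

```python
def correct_power(predicted_w: float,
                  temp_now_c: float,
                  temp_prev_c: float,
                  gain: float) -> float:
    """Target power consumption from the prediction and the backup sensor trend."""
    delta_t = temp_now_c - temp_prev_c   # temperature change data
    correction = 1.0 + gain * delta_t    # power consumption correction coefficient
    return predicted_w * correction
```

For instance, with an assumed gain of 0.05 per degree Celsius, a rise from 40 to 42 degrees gives a correction coefficient of 1.1 and raises a 200 W prediction to about 220 W, while a fall from 40 to 38 degrees gives 0.9 and lowers it to about 180 W.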
[0041] This technical solution incorporates temperature data and its changes from a backup temperature sensor, combined with a preset gain coefficient, to correct the current power consumption prediction. This allows for a more comprehensive and dynamic consideration of the actual operating status of the target component. Temperature change data reflects real-time changes in the surrounding environment and the component's own heat generation, while the preset gain coefficient can be flexibly adjusted according to different components and scenarios, resulting in more accurate power consumption correction. The target power consumption obtained in this way is closer to the actual situation, providing a more reliable basis for controlling server cooling fans. This helps achieve more precise heat dissipation control, avoiding insufficient or excessive heat dissipation caused by inaccurate power consumption prediction, improving server stability and reliability, and optimizing energy efficiency.
[0042] Optionally, in this embodiment of the application, step 103, which involves obtaining the fan speed control parameters corresponding to the target power consumption and controlling the server's cooling fan based on the fan speed control parameters, includes: If there are multiple target components, the fan speed control parameters corresponding to the target power consumption of each target component are obtained respectively. The target fan speed control parameters are determined based on the largest fan speed control parameters, and the cooling fan of the server is controlled based on the target fan speed control parameters. If there is only one target component, the fan speed control parameter corresponding to the target power consumption is obtained as the target fan speed control parameter, and the cooling fan of the server is controlled based on the target fan speed control parameter.
[0043] In the above embodiments, when there are multiple target components, the maximum value is selected as the target parameter by comprehensively considering the fan speed control parameters corresponding to each component; when there is only one target component, its corresponding fan speed control parameter is directly used as the target parameter, so as to ensure that the server can achieve reasonable heat dissipation under different component working states.
[0044] Specifically, the fan speed control parameters for each target component are obtained separately. When multiple target components in a server require cooling, each component has its own target power consumption, and the fan speed control parameters are closely related to this target power consumption. Therefore, for each target component, the corresponding fan speed control parameter is looked up from its target power consumption using a predefined power consumption-speed parameter mapping relationship (this mapping relationship is usually determined based on extensive experimental data and cooling models; different power consumption ranges correspond to different fan speed adjustment values). For example, the target power consumption of component A corresponds to a fan speed of 50%, and the target power consumption of component B corresponds to a fan speed of 70%. After the fan speed control parameters of all target components are obtained, they are compared and the maximum value is found. This maximum value represents the highest fan speed required to meet the cooling needs of all target components under their current operating conditions. It is determined as the target fan speed control parameter so that even the component requiring the highest fan speed receives sufficient cooling. Then, based on the target fan speed control parameter, the server's cooling fan control system adjusts the fan speed. For example, if the target fan speed control parameter is 70% of the rated speed, the control system adjusts the fan to 70% of its rated speed, allowing the fan to operate at an appropriate speed to dissipate heat from the server components. 
When only one target component in the server needs cooling, the system directly uses the target power consumption of that component and a pre-set power consumption-speed parameter mapping relationship to obtain the corresponding fan speed parameter, and sets this parameter as the target fan speed parameter. For example, if the target power consumption of this component corresponds to a fan speed of 60% of its rated speed, then 60% of its rated speed is the target fan speed parameter. The cooling fan control system then adjusts the fan speed to the appropriate value based on this target fan speed parameter to meet the cooling needs of that single target component.
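The lookup and maximum selection described above can be sketched as follows. The table POWER_TO_SPEED and its ranges are purely illustrative assumptions; the source only states that different power consumption ranges correspond to different fan speed adjustment values.

```python
from bisect import bisect_left
from typing import Sequence

# Hypothetical mapping: (upper bound of power range in W, fan speed in % of rated speed).
POWER_TO_SPEED = [(100.0, 30), (200.0, 50), (300.0, 70), (float("inf"), 100)]


def speed_for_power(target_w: float) -> int:
    """Fan speed control parameter for one component's target power consumption."""
    bounds = [upper for upper, _ in POWER_TO_SPEED]
    index = bisect_left(bounds, target_w)  # first range whose upper bound >= target_w
    return POWER_TO_SPEED[index][1]


def target_speed(target_powers_w: Sequence[float]) -> int:
    """Target fan speed control parameter: the largest per-component value."""
    if not target_powers_w:
        raise ValueError("at least one target component is required")
    return max(speed_for_power(p) for p in target_powers_w)
```

With this assumed table, a component A at 150 W maps to 50% and a component B at 250 W maps to 70%, so the target parameter for both together is 70%; with component A alone it is 50%. The single-component case needs no separate branch because the maximum of one value is that value.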
[0045] This technical solution considers different numbers of target components in the server and employs a flexible strategy to determine the target fan speed parameters. When there are multiple target components, the maximum fan speed parameter is selected as the target parameter, ensuring that all components receive sufficient heat dissipation and preventing performance degradation or damage due to insufficient heat dissipation in some components, thus improving server stability and reliability. When there is only one target component, its corresponding fan speed parameter is directly controlled, avoiding unnecessary high-speed fan operation, reducing energy consumption, and achieving energy-saving effects. Furthermore, this method of dynamically adjusting fan speed based on actual conditions better adapts to the heat dissipation needs under different server workloads, optimizing the overall performance and heat dissipation efficiency of the server.
[0046] In this embodiment of the application, optionally, controlling the cooling fan of the server based on the target fan speed control parameters includes: determining a target duty cycle based on the target fan speed control parameters, and generating a target duty cycle signal based on the target duty cycle; sending the target duty cycle signal to the cooling fan of the server so that the cooling fan adjusts its speed based on the target duty cycle signal.
[0047] In the above embodiments, after determining the target fan speed adjustment parameters, they are further converted into control signals that the cooling fan can recognize. The fan speed is precisely adjusted through signal transmission, thereby ensuring that the server can maintain a suitable operating temperature under different operating conditions and improving the stability and reliability of the server operation.
[0048] Specifically, pulse width modulation (PWM) is commonly used to adjust fan speed in server cooling fan control. The duty cycle of a PWM signal is the proportion of each cycle during which the signal is at a high level. Changing the duty cycle changes the average voltage of the fan drive circuit and thereby the fan speed. A mapping relationship between the target fan speed control parameter and the target duty cycle is established beforehand, based on extensive experimental data and fan characteristic curves. For example, when the target fan speed control parameter is 50% speed, the corresponding target duty cycle might be 30%; when the target fan speed control parameter is 100% speed, the corresponding target duty cycle might be 90%. Alternatively, the target fan speed control parameter can be directly defined as the target duty cycle. Then, the server's BMC (Baseboard Management Controller) is used to generate the target duty cycle signal. The BMC can have dedicated timer and counter modules; by programming the timer period and the counter count value, a PWM signal with a specific duty cycle can be generated. The BMC configures the timer and counter parameters according to the determined target duty cycle. For example, if the target duty cycle is 40%, then within each set period the BMC keeps the output signal high for 40% of the cycle time and low for 60% of the cycle time, thereby generating the required target duty cycle signal. Furthermore, the target duty cycle signal is typically transmitted from the BMC to the cooling fan's drive circuit via a specific signal line. This could be a trace on a printed circuit board (PCB) or a dedicated connection cable (such as a PCIe link). 
Upon receiving the target duty cycle signal, the cooling fan's drive circuit adjusts the average voltage applied to the fan motor according to the signal's duty cycle. For example, when receiving a high duty cycle signal, the drive circuit increases the average voltage across the motor, increasing the fan speed; when receiving a low duty cycle signal, the drive circuit decreases the average voltage across the motor, decreasing the fan speed.
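The conversion from a target duty cycle to timer settings can be sketched as follows. This assumes the alternative mentioned above in which the fan speed control parameter is used directly as the duty cycle, and it models the timer as a period measured in counter ticks; the function names and the tick-based model are hypothetical and do not describe any particular BMC's PWM hardware.

```python
from typing import Tuple


def duty_from_speed(speed_pct: float) -> float:
    """Target duty cycle in percent, clamped to the valid 0..100 range."""
    return min(100.0, max(0.0, speed_pct))


def pwm_counts(duty_pct: float, period_ticks: int) -> Tuple[int, int]:
    """Split one PWM period into (high ticks, low ticks) for the given duty cycle."""
    if period_ticks <= 0:
        raise ValueError("period_ticks must be positive")
    high = round(period_ticks * duty_from_speed(duty_pct) / 100.0)
    return high, period_ticks - high
```

For a 40% target duty cycle and an assumed period of 1000 ticks, the output is held high for 400 ticks and low for 600 ticks, which matches the 40% / 60% split in the example above.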
[0049] This technical solution achieves precise control of server cooling fans by converting target fan speed parameters into target duty cycle signals and transmitting them to the cooling fans for speed adjustment. This control method dynamically adjusts fan speeds based on the server's actual cooling needs, avoiding energy waste caused by fans constantly running at high speeds and reducing server power consumption. Simultaneously, precise speed adjustment ensures the server maintains suitable operating temperatures under different workloads, preventing performance degradation and hardware damage due to overheating, thus improving server stability and reliability and extending server lifespan.
[0050] Furthermore, as a refinement and extension of the specific implementation of the above embodiments, and to fully illustrate the specific implementation process of this embodiment, another method for controlling the heat dissipation of server components is provided. As shown in Figure 2, the method includes: Step 201: Start the server.
[0051] Step 202: Read the first temperature data from the backup temperature sensor corresponding to the target component to be cooled, and obtain the communication link status information corresponding to the target component, wherein the backup temperature sensor is set within a preset range centered on the target component; predict the current power consumption of the target component based on the communication link status information and the nominal power consumption of the target component, and correct the predicted current power consumption value based on the first temperature data to obtain the target power consumption; obtain the fan speed control parameters corresponding to the target power consumption, and control the cooling fan of the server based on the fan speed control parameters.
[0052] Step 203: Determine whether the main temperature sensor is ready by reading the second temperature data from the main temperature sensor of the target component. The main temperature sensor is ready when the second temperature data is read successfully and, across a preset number of consecutive reads, the difference between each pair of adjacent second temperature data values is within a preset temperature fluctuation range.
[0053] Step 204: If the main temperature sensor is ready, control the cooling fan based on the heat dissipation control logic corresponding to the main temperature sensor; stop reading the first temperature data of the backup temperature sensor, and determine whether the main temperature sensor has entered a fault state by reading the second temperature data corresponding to the main temperature sensor of the target component. The fault state conditions of the main temperature sensor include failure to read the second temperature data, or the difference between adjacent second temperature data read is outside the preset temperature fluctuation range.
[0054] Step 205: If the main temperature sensor enters a fault state, return to step 202; otherwise, maintain the heat dissipation control logic corresponding to the main temperature sensor to control the cooling fan.
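The ready condition of step 203 and the fault condition of step 204 can be sketched as two small checks. This is an illustration under assumptions: a failed read is represented as None, and the function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def is_ready(readings: Sequence[Optional[float]],
             required_reads: int,
             max_delta_c: float) -> bool:
    """Step 203: the last `required_reads` reads all succeeded and are stable.

    `readings` holds second temperature data, oldest first; None marks a failed read.
    """
    if required_reads < 1:
        raise ValueError("required_reads must be at least 1")
    if len(readings) < required_reads:
        return False
    recent = readings[-required_reads:]
    if any(r is None for r in recent):
        return False
    return all(abs(b - a) <= max_delta_c for a, b in zip(recent, recent[1:]))


def is_faulty(previous: Optional[float],
              current: Optional[float],
              max_delta_c: float) -> bool:
    """Step 204: a read failed, or adjacent readings differ by more than the range."""
    if current is None:
        return True
    if previous is None:
        return False  # nothing to compare against yet
    return abs(current - previous) > max_delta_c
```

The ready check is deliberately stricter than the fault check: readiness requires a run of stable reads, whereas a single failed or out-of-range read is enough to declare a fault and return to step 202.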
[0055] In the above embodiment, after the server starts up, the power consumption of the target component is predicted and corrected using the backup temperature sensor together with the communication link status information, and the cooling fan is controlled accordingly. The main temperature sensor is then checked for readiness. Once it is ready, heat dissipation control switches to the control logic corresponding to the main temperature sensor, and the main temperature sensor is continuously monitored for faults. If the main temperature sensor fails, control returns to the backup temperature sensor, thereby ensuring stable and reliable heat dissipation for the server components.
[0056] Specifically, after the server is powered on and the hardware system completes initialization, it enters the working state, and steps 202 and 203 are executed concurrently. In step 202, the first temperature data is read from a backup temperature sensor located within a preset range centered on the target component. This data reflects the temperature around the target component. At the same time, the communication link status information corresponding to the target component is acquired, because the communication link status affects the target component's power consumption. Based on the communication link status information and the target component's nominal power consumption, the current power consumption of the target component is predicted. Since the prediction may contain errors, the predicted value is corrected based on the first temperature data to obtain a more accurate target power consumption. The fan speed control parameters corresponding to the target power consumption are then acquired, and the server's cooling fan is controlled according to these parameters so that it runs at an appropriate speed for the target component's current cooling needs. In step 203, whether the main temperature sensor is ready is determined by reading the second temperature data from the main temperature sensor corresponding to the target component. There are two conditions for the main temperature sensor to be ready: first, the second temperature data is read successfully, which shows that the sensor can work normally and acquire temperature information; second, across a preset number of consecutive reads, the difference between adjacent second temperature data values is within a preset temperature fluctuation range, which indicates that the temperature measured by the main temperature sensor is relatively stable, the data is reliable, and the sensor can enter its normal working state. 
If the main temperature sensor is ready, the cooling fan is controlled based on the corresponding heat dissipation control logic of the main temperature sensor, and the reading of the first temperature data from the backup temperature sensor is stopped to reduce unnecessary data processing and resource consumption. The main temperature sensor can usually reflect the core temperature of the target component more accurately, and its corresponding heat dissipation control logic may be more refined and optimized, better meeting the heat dissipation requirements of the component. At the same time, the main temperature sensor is continuously checked for fault conditions by reading the second temperature data. Fault conditions include failure to read the second temperature data, indicating that the sensor may be damaged or have an abnormal connection; or the difference between adjacent second temperature data reads is outside the preset temperature fluctuation range, indicating that the temperature measurement is unstable and the data is unreliable. If the main temperature sensor malfunctions, it means that accurate heat dissipation control can no longer be achieved using the main temperature sensor. In this case, return to step 202, and re-utilize the backup temperature sensor in conjunction with communication link status information to predict and correct the power consumption of the target component, and control the cooling fan to ensure that the heat dissipation of the server components is not affected by the main temperature sensor malfunction. It should be noted that while returning to step 202, the process of determining whether the main temperature sensor is ready in step 203 is executed simultaneously. If it is determined that the main temperature sensor is ready, it means that the fault has been repaired, and the process of step 204 can continue.
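The switching between the two control paths in steps 202 to 205 can be sketched as a two-state machine. The Mode enumeration and the next_mode function are hypothetical names; the ready and fault flags are assumed to be produced by checks such as those defined in steps 203 and 204.

```python
from enum import Enum


class Mode(Enum):
    BACKUP = "backup"  # step 202: power prediction corrected by the backup sensor
    MAIN = "main"      # step 204: control logic of the main temperature sensor


def next_mode(mode: Mode, main_ready: bool, main_faulty: bool) -> Mode:
    """One control-loop iteration of the mode decision."""
    if mode is Mode.BACKUP:
        # Step 203 runs alongside step 202; switch only once the main sensor is ready.
        return Mode.MAIN if main_ready else Mode.BACKUP
    # Step 205: fall back to step 202 when the main sensor enters a fault state.
    return Mode.BACKUP if main_faulty else Mode.MAIN
```

Starting in Mode.BACKUP after power-on, the loop moves to Mode.MAIN when the main sensor becomes ready, drops back to Mode.BACKUP on a fault, and returns to Mode.MAIN once the ready condition is met again.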
[0057] This technical solution improves the reliability and stability of server component heat dissipation control through the coordinated operation of the backup and main temperature sensors. During server startup or in the event of a main temperature sensor failure, the backup sensor is used promptly, and the power consumption of the target components is predicted and corrected in combination with the communication link status information. This enables effective control of the cooling fans and prevents insufficient or excessive heat dissipation caused by missing or inaccurate temperature measurements. Once the main temperature sensor is ready, the system switches to its corresponding, more refined heat dissipation control logic to further optimize cooling performance. Continuous monitoring of the main temperature sensor and timely handling of its faults ensure a rapid switch back to the backup solution, keeping the server adequately cooled, extending the lifespan of server hardware, and improving overall server performance and operational stability.
[0058] In a specific application scenario, the server component heat dissipation control method can be implemented through the BMC. As shown in Figure 3, the method mainly involves the following components: High-power component module: The core components are the GPU and the high-speed network card. On the one hand, the module exchanges data with the BMC through communication buses such as PCIe / I2C; on the other hand, its PCIe link status can be read by the BMC in real time. EEPROM: Stores static characteristic information of the GPU / network card components, such as device ID, manufacturer's nominal power consumption (or typical power consumption), maximum speed, etc. Main temperature sensor: Usually located inside the GPU or network card component and used to directly monitor its core temperature. The BMC obtains its data through communication buses such as PCIe / I2C for GPU / network card heat dissipation control. Backup temperature sensor: Deployed around the GPU / network card components; it transmits temperature data to the BMC through the I2C bus and serves as auxiliary monitoring alongside the main sensor's heat dissipation control. Fan: Receives PWM control signals from the BMC and dissipates heat at a speed set by the PWM duty cycle. CPLD (Complex Programmable Logic Device): A programmable digital integrated circuit used to implement custom logic functions. It determines whether the GPU / network card components are present (i.e., physically connected and powered) by monitoring the level of the power enable pin and feeds this status back to the BMC via the I2C bus. 
BMC: As the system's central hub, its internal components include: Communication module: responsible for data exchange with the CPLD, EEPROM, backup temperature sensor, PCIe link, and fan; Storage module: embeds a thermal parameter table and a lightweight power prediction algorithm model; Computation module: comprehensively analyzes all input information, executes the prediction algorithm, and makes the final speed adjustment decision.
[0059] As shown in Figures 4 and 5, the specific process is as follows: 1. Pre-configuration stage: The administrator uses a programming tool to write into the EEPROM in advance the characteristic information of the GPU / network card components planned for installation in the server, including the identification ID and the manufacturer's nominal power consumption.
[0060] 2. System Power-On and PCIe Initialization: When the server is powered on, the motherboard, CPU, and PCIe slots begin receiving power. At this time, the GPU / network card module and the motherboard's PCIe slot begin link negotiation, automatically determining the highest link speed (e.g., Gen1, Gen2) and maximum link width (e.g., x8, x16) that both sides can support. After successful negotiation, the link status (current speed, width, signal quality parameters) is automatically written to the Link Status register in the PCIe device's configuration space. This process is handled entirely by hardware logic and requires no software or firmware intervention.
[0061] At this time, the BMC is also powering on and initializing. It initializes its own I2C and PCIe controllers and establishes communication with PCIe devices or CPLDs on the motherboard. After the PCIe link training is complete, the BMC is also ready. At this point, the host CPU is still in POST state, and the operating system has not yet been loaded.
[0062] 3. Presence Status Detection: The CPLD determines the presence status of each component by reading the power enable GPIO pin level of its slot and writes the result into a specific internal register. The BMC reads this register via the I2C bus to obtain the real-time presence status of all components in all slots.
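The presence check in step 3 can be sketched as a bitmask decode. The register layout is a hypothetical assumption (bit i set means the component in slot i is present); the source does not specify the format of the CPLD register, and the I2C read itself is outside this sketch.

```python
from typing import List


def present_slots(presence_reg: int, slot_count: int) -> List[int]:
    """Slot indices whose presence bit is set in the value read from the CPLD."""
    return [slot for slot in range(slot_count) if (presence_reg >> slot) & 1]
```

For an assumed four-slot system, a register value of 0b0101 reports components present in slots 0 and 2, so only those two slots go through the data acquisition of step 4.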
[0063] 4. Data Acquisition: For any GPU / network card component detected as present by the CPLD, the BMC concurrently performs the following operations: a. The BMC reads the backup temperature sensor. It reads data from the backup temperature sensor located near the component via I2C. If the reading is within the valid range, it is used as an auxiliary monitoring indicator; if the reading is abnormal, the fixed speed logic is triggered as a safety fallback.
[0064] b. BMC analyzes the PCIe link status of the component. The BMC reads key data from the link status register of the PCIe controller located inside the GPU or network card: the BMC reads in real time the current negotiated link speed, current link width, signal quality parameters (Lane Margin), maximum link speed, and maximum link width of the component's PCIe link. The power consumption of the device's PCIe interface is an important component of its total power consumption, and the interface power consumption is strongly positively correlated with the link speed and width.
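Decoding the speed and width fields read in step 4b can be sketched as follows. The field positions follow the PCI Express capability structure, in which the Link Status register carries the current link speed in bits 3:0 and the negotiated link width in bits 9:4, and the Link Capabilities register carries the maximum link speed and maximum link width in the same bit positions. The speed encoding table uses the conventional mapping of codes to GT/s. How the BMC obtains the raw register values, and how signal quality parameters are read, are not covered; the function name is hypothetical.

```python
from typing import Tuple

# Conventional link speed encoding: code -> GT/s.
SPEED_GTS = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0, 6: 64.0}

SPEED_MASK = 0xF    # bits 3:0
WIDTH_SHIFT = 4     # bits 9:4
WIDTH_MASK = 0x3F


def decode_speed_width(reg: int) -> Tuple[float, int]:
    """Return (link speed in GT/s, link width in lanes) from a raw register value.

    Works for the low bits of both Link Status and Link Capabilities.
    """
    speed_code = reg & SPEED_MASK
    if speed_code not in SPEED_GTS:
        raise ValueError(f"reserved link speed code: {speed_code}")
    width = (reg >> WIDTH_SHIFT) & WIDTH_MASK
    return SPEED_GTS[speed_code], width
```

For example, a Link Status value of 0x0083 decodes to 8.0 GT/s at x8, and Link Capabilities low bits of 0x104 decode to 16.0 GT/s at x16; these two pairs are the inputs needed for the current and maximum link rate power consumption coefficients.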
[0065] 5. Dynamic power consumption prediction and parameter matching: Using the identification ID, the BMC looks up the characteristic information of the present component in the EEPROM, in particular the manufacturer's nominal power consumption value.
[0066] The BMC's built-in lightweight power prediction model then begins operation. The model takes the previously read PCIe link state parameters and the manufacturer's nominal power consumption as inputs and calculates the final power consumption of the component in real time using the following formulas:
Predicted power consumption = Manufacturer's nominal power consumption × (Current link rate power consumption coefficient / Maximum link rate power consumption coefficient) × (1 + Signal compensation factor)
Final power consumption = Predicted power consumption × (1 + K × ΔT)
Here K is the preset gain coefficient and ΔT is the temperature change measured by the backup temperature sensor between the previous and current readings.
[0067] Table 1 Formula Parameter Table
[0068] Based on the calculated predicted final power consumption value, the BMC queries the built-in "final power consumption-speed" mapping table to dynamically obtain the corresponding fan speed control parameters.
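The two formulas of step 5 can be worked through with assumed numbers. All values below are illustrative and are not taken from this application: a nominal power consumption of 300 W, a current link rate power consumption coefficient of 8, a maximum coefficient of 16, a signal compensation factor of 0.1, a gain coefficient K of 0.05 per degree Celsius, and a temperature change ΔT of 2 degrees Celsius.

```latex
\begin{align*}
P_{\text{pred}}  &= P_{\text{nom}} \times \frac{C_{\text{cur}}}{C_{\text{max}}} \times (1 + s)
                  = 300 \times \frac{8}{16} \times (1 + 0.1) = 165\ \text{W} \\
P_{\text{final}} &= P_{\text{pred}} \times (1 + K \,\Delta T)
                  = 165 \times (1 + 0.05 \times 2) = 181.5\ \text{W}
\end{align*}
```

The symbols are shorthand introduced here for the quantities named in the formulas: P_nom is the manufacturer's nominal power consumption, C_cur and C_max are the current and maximum link rate power consumption coefficients, and s is the signal compensation factor. The resulting 181.5 W is the value that would be looked up in the "final power consumption-speed" mapping table.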
[0069] 6. Speed Control Decision and Execution: The BMC uses the fan speed parameters of each component, derived from predicted power consumption, as the required duty cycle for each component. If there are multiple high-power components in the server that require cooling, the BMC will take the maximum value among the required duty cycles of each component as the final target.
[0070] The final execution duty cycle = MAX(GPU1 duty cycle, GPU2 duty cycle, ..., NIC1 duty cycle, ...).
The BMC ultimately sends the calculated PWM duty cycle signal to the fan to control its speed.
[0071] 7. Switching to the main sensor: When the BMC successfully reads temperature data from the main temperature sensor of the GPU / network card component via the PCIe / I2C bus, and the readings are consistently stable and reasonable, the main sensor is deemed ready. At this point, the BMC hands heat dissipation control over from the predictive model to the main sensor's heat dissipation control logic and stops reading the corresponding backup sensor for this component.
[0072] 8. Initialization and speed adjustment complete: The GPU / network card component initialization and speed adjustment are complete, and the system enters the normal operation phase, with the main sensor handling all heat dissipation and speed adjustment. If a fault causes the system to enter predictive mode, it will remain in this mode until the main sensor is repaired or the component is replaced.
[0073] This application closes the heat dissipation gap that arises when the main temperature sensor is not yet ready during the GPU / network card initialization stage by adopting a dynamic heat dissipation framework of "pre-sensing speed regulation as the main method and temperature monitoring as the auxiliary method". It uses the PCIe link status to predict power consumption in real time and drive fan speed regulation. Combined with trend correction from the backup temperature sensor, it achieves proactive heat dissipation control during the initialization stage, avoids overheating and frequency throttling, and reduces the power consumed by heat dissipation during the initialization stage.
[0074] Beneficial effects include at least: 1. Predicting component power consumption by real-time analysis of PCIe link training status. The system can read key data such as the current negotiation rate, link width, and signal quality parameters from the link status register of the PCIe device, and combine this data with the manufacturer's nominal power consumption to perform dynamic power consumption prediction using a built-in lightweight power prediction model. This technology can detect component power consumption trends in advance, thus providing a basis for proactive thermal speed adjustment.
[0077] 2. A unique heat dissipation control framework was constructed, with pre-sensing speed regulation as the core, supplemented by temperature monitoring. During the component initialization phase, the main sensor may not yet be ready. At this time, the system mainly relies on power consumption prediction for heat dissipation speed regulation. This pre-sensing speed regulation method can quickly respond to changes in component power consumption and adjust the fan speed in advance to ensure heat dissipation effect. At the same time, a backup temperature sensor serves as an auxiliary monitoring method, monitoring the ambient temperature of the component in real time, providing additional reference for heat dissipation speed regulation, and triggering fixed speed logic as a safety net when necessary, further enhancing the reliability and stability of the system.
[0078] 3. Using the BMC as the system's central hub, a multi-module collaborative heat dissipation speed control system is constructed. The BMC is internally divided into a communication module, a storage module, and a computing module, which are respectively responsible for data interaction with the other components, storing the thermal parameter table and the power consumption prediction algorithm model, and comprehensively analyzing input information to make speed control decisions. The components collaborate closely through clearly defined connections and data interaction logic: the high-power component module transmits data to the BMC via communication buses such as PCIe / I2C; the EEPROM stores static characteristic information for the BMC to read; the main temperature sensor and the backup temperature sensor provide core and auxiliary temperature data, respectively; the CPLD detects the presence status of components and feeds it back to the BMC; and the fan receives PWM control signals from the BMC to dissipate heat. This architecture makes full use of the strengths of each component, achieving efficient and precise heat dissipation speed control and effectively addressing the complex heat dissipation requirements of server GPU / network card components during the initialization phase.
[0079] Furthermore, as a specific implementation of the method shown in Figure 1, this application provides a component heat dissipation control device for a server. As shown in Figure 6, the device includes: a communication module, used to read the first temperature data of the backup temperature sensor corresponding to the target component to be cooled after the server starts up, and to obtain the communication link status information corresponding to the target component, wherein the backup temperature sensor is set within a preset range centered on the target component; a calculation module, used to predict the current power consumption of the target component based on the communication link status information and the nominal power consumption of the target component, and to correct the current power consumption prediction value based on the first temperature data to obtain the target power consumption; and a control module, used to acquire fan speed control parameters corresponding to the target power consumption, and to control the cooling fan of the server based on the fan speed control parameters.
[0080] Optionally, the communication link status information includes the highest link speed, the highest link width, the current negotiated rate, the current link width, and signal quality parameters. The calculation module is specifically used for: determining the maximum link rate power consumption coefficient based on the highest link speed and the highest link width; determining the current link rate power consumption coefficient based on the current negotiated rate and the current link width; determining the signal compensation factor based on the signal quality parameters; calculating the ratio of the current link rate power consumption coefficient to the maximum link rate power consumption coefficient, and determining the signal compensation coefficient based on the signal compensation factor; and calculating the product of the nominal power consumption of the target component, the ratio, and the signal compensation coefficient as the current power consumption prediction value of the target component.
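The prediction in [0080] can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the paragraph does not state how the link rate power consumption coefficients are derived from speed and width, or how the signal compensation factor maps to the signal compensation coefficient. The `speed * width` product and the `1 + factor` mapping below are assumptions, and all names (`LinkStatus`, `link_coeff`, `predict_power`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LinkStatus:
    max_speed_gts: float   # highest link speed (GT/s)
    max_width: int         # highest link width (lanes)
    cur_speed_gts: float   # current negotiated rate (GT/s)
    cur_width: int         # current link width (lanes)
    signal_factor: float   # signal compensation factor from signal quality


def link_coeff(speed_gts: float, width: int) -> float:
    # Assumption: the coefficient scales with aggregate link bandwidth.
    return speed_gts * width


def predict_power(nominal_w: float, link: LinkStatus) -> float:
    """Nominal power x (current coeff / max coeff) x compensation coeff."""
    max_coeff = link_coeff(link.max_speed_gts, link.max_width)
    if max_coeff <= 0:
        raise ValueError("maximum link capability must be positive")
    ratio = link_coeff(link.cur_speed_gts, link.cur_width) / max_coeff
    comp_coeff = 1.0 + link.signal_factor  # assumed mapping
    return nominal_w * ratio * comp_coeff
```

Under these assumptions, a 300 W component whose link has trained to half the maximum speed and half the maximum width would be predicted at a quarter of its nominal power before signal compensation.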
[0081] Optionally, the calculation module is further used for: calculating the temperature change data corresponding to the target component based on the first temperature data and the temperature data of the previous moment corresponding to the first temperature data; determining the power consumption correction coefficient based on the product of the temperature change data and the preset gain coefficient; and calculating the target power consumption by multiplying the current power consumption prediction value by the power consumption correction coefficient.
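A minimal Python sketch of the correction in [0081] follows. The paragraph says only that the correction coefficient is determined from the product of the temperature change and a preset gain; the `1 + gain * delta` form, the clamp at zero, and the default gain value are assumptions made here for illustration, and `correct_power` is a hypothetical name.

```python
def correct_power(predicted_w: float,
                  temp_now_c: float,
                  temp_prev_c: float,
                  gain: float = 0.02) -> float:
    """Scale the predicted power by a temperature-trend correction."""
    delta_t = temp_now_c - temp_prev_c      # temperature change data
    correction = 1.0 + gain * delta_t       # assumed form of the coefficient
    correction = max(correction, 0.0)       # assumed guard: never negative
    return predicted_w * correction
```

With this form, a rising backup sensor reading increases the target power consumption and therefore the fan speed, while a stable reading leaves the prediction unchanged.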
[0082] Optionally, the control module is specifically used for: if there are multiple target components, obtaining the fan speed control parameters corresponding to the target power consumption of each target component, determining the target fan speed control parameters based on the largest of these fan speed control parameters, and controlling the cooling fan of the server based on the target fan speed control parameters; if there is only one target component, obtaining the fan speed control parameter corresponding to the target power consumption as the target fan speed control parameter, and controlling the cooling fan of the server based on the target fan speed control parameter.
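The selection rule in [0082] can be sketched as a table lookup followed by a maximum. The table contents, the step-wise lookup, and the names `SPEED_TABLE`, `fan_param_for`, and `target_fan_param` are hypothetical; the source mentions a stored thermal parameter table but does not give its values or structure.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical thermal parameter table:
# (power threshold in W, fan speed control parameter in %).
SPEED_TABLE = [(0.0, 30), (100.0, 50), (200.0, 70), (300.0, 90)]


def fan_param_for(power_w: float) -> int:
    """Return the parameter of the highest threshold not above power_w."""
    thresholds = [t for t, _ in SPEED_TABLE]
    idx = max(bisect_right(thresholds, power_w) - 1, 0)
    return SPEED_TABLE[idx][1]


def target_fan_param(target_powers_w: Sequence[float]) -> int:
    """One component: its own parameter. Several: the largest one."""
    if not target_powers_w:
        raise ValueError("at least one target component is required")
    return max(fan_param_for(p) for p in target_powers_w)
```

Taking the maximum means the shared cooling fan is always driven by the component with the highest cooling demand.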
[0083] Optionally, the control module is specifically used for: determining the target duty cycle based on the target fan speed control parameters, and generating a target duty cycle signal based on the target duty cycle; and sending the target duty cycle signal to the cooling fan of the server so that the cooling fan adjusts its speed based on the target duty cycle signal.
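As a rough sketch of [0083], the conversion from a fan speed control parameter to a PWM duty cycle might look like the following. The clamping limits, the 25 kHz PWM frequency (a value commonly used for 4-wire fans), and the function names are assumptions; the source does not specify them, and real firmware would write the duty cycle to a PWM controller register rather than compute a pulse width.

```python
def to_duty_cycle(fan_param_pct: float,
                  min_duty: float = 20.0,
                  max_duty: float = 100.0) -> float:
    """Clamp the fan speed parameter into an allowed duty cycle range (%)."""
    return min(max(fan_param_pct, min_duty), max_duty)


def pwm_high_time_us(duty_pct: float, freq_hz: float = 25_000.0) -> float:
    """High time per PWM period, in microseconds, for the given duty cycle."""
    period_us = 1_000_000.0 / freq_hz
    return period_us * duty_pct / 100.0
```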
[0084] Optionally, the communication module is further configured to, after the server starts, determine whether the main temperature sensor corresponding to the target component is ready by reading second temperature data from that sensor. The main temperature sensor is considered ready when the second temperature data is read successfully and, across a preset number of consecutive readings, the difference between each pair of adjacent second temperature data is within a preset temperature fluctuation range. The control module is further configured to control the cooling fan based on the heat dissipation control logic corresponding to the main temperature sensor if the main temperature sensor is ready.
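The readiness condition in [0084] can be sketched as below. The sample count, the fluctuation limit, and the convention that a failed read returns `None` are assumptions for illustration; `main_sensor_ready` and `read_temp` are hypothetical names.

```python
from typing import Callable, Optional


def main_sensor_ready(read_temp: Callable[[], Optional[float]],
                      samples: int = 5,
                      max_step_c: float = 2.0) -> bool:
    """Ready if every read succeeds and adjacent readings stay close."""
    prev: Optional[float] = None
    for _ in range(samples):
        value = read_temp()
        if value is None:                     # read failed
            return False
        if prev is not None and abs(value - prev) > max_step_c:
            return False                      # fluctuation out of range
        prev = value
    return True
```

For example, passing `lambda: next(readings)` over an iterator of five stable values returns `True`, while a single failed read or a large jump between adjacent values returns `False`.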
[0085] Optionally, the communication module is further configured to: after the cooling fan is controlled based on the heat dissipation control logic corresponding to the main temperature sensor, stop reading the first temperature data of the backup temperature sensor, and read the second temperature data of the main temperature sensor corresponding to the target component to determine whether the main temperature sensor has entered a fault state. The main temperature sensor is considered to be in a fault state when reading the second temperature data fails, or when the difference between adjacent second temperature data is outside the preset temperature fluctuation range. If the main temperature sensor enters the fault state, the process returns to the step of reading the first temperature data from the backup temperature sensor corresponding to the target component to be cooled.
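The switch between the two control paths described in [0084] and [0085] behaves like a two-state machine, sketched below. The `Mode` enum, the `next_mode` function, the `None`-on-failure convention, and the default fluctuation limit are all hypothetical and chosen only to illustrate the transitions.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    BACKUP = "backup"  # backup sensor plus power consumption prediction
    MAIN = "main"      # control logic of the main temperature sensor


def next_mode(mode: Mode,
              main_ready: bool,
              main_reading: Optional[float],
              prev_main_reading: Optional[float],
              max_step_c: float = 2.0) -> Mode:
    """Return the control mode for the next cycle."""
    if mode is Mode.BACKUP:
        # Leave the backup path only once the main sensor is ready.
        return Mode.MAIN if main_ready else Mode.BACKUP
    # In MAIN mode: a failed read or an out-of-range jump is a fault.
    if main_reading is None:
        return Mode.BACKUP
    if (prev_main_reading is not None
            and abs(main_reading - prev_main_reading) > max_step_c):
        return Mode.BACKUP
    return Mode.MAIN
```

In this sketch, returning to `Mode.BACKUP` corresponds to resuming the reading of the backup temperature sensor and the prediction-based speed regulation.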
[0086] It should be noted that other descriptions of the functional units involved in the server component heat dissipation control device provided in this application embodiment can be found in the corresponding descriptions of the method shown in Figures 1 to 5, and will not be repeated here.
[0087] This application also provides a computer device, which may specifically be a personal computer, a server, a network device, etc. As shown in Figure 7, the computer device includes a bus, a processor, memory, and a communication interface, and may also include an input / output interface and a display device. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium. The database stores location information. The network interface allows communication with external terminals via a network connection. When the computer program is executed by the processor, it implements the steps in the various method embodiments.
[0088] Those skilled in the art will understand that the structure shown in Figure 7 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.
[0089] In one embodiment, a computer-readable storage medium is provided, which may be non-volatile or volatile, having stored thereon a computer program that, when executed by a processor, implements the steps in the above method embodiments.
[0090] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps in the above method embodiments.
[0091] It should be noted that the user personal information involved in the embodiments of this application is all authorized (with the knowledge and consent) by the relevant parties or fully authorized by all parties, and the executing entity can obtain it through various legal and compliant means. The collection, storage, use, processing, transmission, provision, and disclosure of the information, data, and signals involved all comply with the relevant laws and regulations of the relevant countries and regions, and do not violate public order and good morals. It should be noted that if any software tools or components other than those of this company appear in the embodiments of this application, they are merely illustrative examples and do not represent actual use.
[0092] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments described above. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, graphics processors, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.
[0093] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0094] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.
Claims
1. A method for controlling heat dissipation of server components, characterized in that the method includes: after the server starts up, reading the first temperature data of the backup temperature sensor corresponding to the target component to be cooled, and obtaining the communication link status information corresponding to the target component, wherein the backup temperature sensor is set within a preset range centered on the target component; predicting the current power consumption of the target component based on the communication link status information and the nominal power consumption of the target component, and correcting the current power consumption prediction value based on the first temperature data to obtain the target power consumption; and obtaining the fan speed control parameters corresponding to the target power consumption, and controlling the cooling fan of the server based on the fan speed control parameters.
2. The method according to claim 1, characterized in that the communication link status information includes the highest link speed, the highest link width, the current negotiated rate, the current link width, and signal quality parameters; and the predicting of the current power consumption of the target component based on the communication link status information and the nominal power consumption of the target component includes: determining the maximum link rate power consumption coefficient based on the highest link speed and the highest link width; determining the current link rate power consumption coefficient based on the current negotiated rate and the current link width; determining the signal compensation factor based on the signal quality parameters; calculating the ratio of the current link rate power consumption coefficient to the maximum link rate power consumption coefficient, and determining the signal compensation coefficient based on the signal compensation factor; and calculating the product of the nominal power consumption of the target component, the ratio, and the signal compensation coefficient as the current power consumption prediction value of the target component.
3. The method according to claim 1, characterized in that the correcting of the current power consumption prediction value based on the first temperature data to obtain the target power consumption includes: calculating the temperature change data corresponding to the target component based on the first temperature data and the temperature data of the previous moment corresponding to the first temperature data; determining the power consumption correction coefficient based on the product of the temperature change data and the preset gain coefficient; and calculating the target power consumption by multiplying the current power consumption prediction value by the power consumption correction coefficient.
4. The method according to claim 1, characterized in that the obtaining of the fan speed control parameters corresponding to the target power consumption and the controlling of the cooling fan of the server based on the fan speed control parameters include: if there are multiple target components, obtaining the fan speed control parameters corresponding to the target power consumption of each target component, determining the target fan speed control parameters based on the largest of these fan speed control parameters, and controlling the cooling fan of the server based on the target fan speed control parameters; if there is only one target component, obtaining the fan speed control parameter corresponding to the target power consumption as the target fan speed control parameter, and controlling the cooling fan of the server based on the target fan speed control parameter.
5. The method according to claim 4, characterized in that the controlling of the cooling fan of the server based on the target fan speed control parameters includes: determining the target duty cycle based on the target fan speed control parameters, and generating a target duty cycle signal based on the target duty cycle; and sending the target duty cycle signal to the cooling fan of the server so that the cooling fan adjusts its speed based on the target duty cycle signal.
6. The method according to any one of claims 1 to 5, characterized in that, after the server starts, the method further includes: determining whether the main temperature sensor corresponding to the target component is ready by reading the second temperature data of the main temperature sensor, wherein the ready conditions of the main temperature sensor include that the second temperature data is read successfully and that, across a preset number of consecutive readings, the difference between adjacent second temperature data is within a preset temperature fluctuation range; and if the main temperature sensor is ready, controlling the cooling fan based on the heat dissipation control logic corresponding to the main temperature sensor.
7. The method according to claim 6, characterized in that, after controlling the cooling fan based on the heat dissipation control logic corresponding to the main temperature sensor, the method further includes: stopping the reading of the first temperature data of the backup temperature sensor, and determining whether the main temperature sensor has entered a fault state by reading the second temperature data of the main temperature sensor corresponding to the target component, wherein the fault state conditions of the main temperature sensor include that reading the second temperature data fails, or that the difference between adjacent second temperature data is outside the preset temperature fluctuation range; and if the main temperature sensor enters the fault state, returning to the step of reading the first temperature data of the backup temperature sensor corresponding to the target component to be cooled.
8. A component heat dissipation control device for a server, characterized in that the device includes: a communication module, used to read the first temperature data of the backup temperature sensor corresponding to the target component to be cooled after the server starts up, and to obtain the communication link status information corresponding to the target component, wherein the backup temperature sensor is set within a preset range centered on the target component; a calculation module, used to predict the current power consumption of the target component based on the communication link status information and the nominal power consumption of the target component, and to correct the current power consumption prediction value based on the first temperature data to obtain the target power consumption; and a control module, used to acquire fan speed control parameters corresponding to the target power consumption, and to control the cooling fan of the server based on the fan speed control parameters.
9. A storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the method of any one of claims 1 to 7.
10. A computer device, comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, characterized in that, when the processor executes the computer program, it implements the method of any one of claims 1 to 7.