A temperature control method, system and device for a chip heat spreader
By combining Kalman filtering and Fourier transform with wavelet transform, PID parameters are adjusted in real time, solving the problems of response lag and overshoot in traditional PID control algorithms in chip heat dissipation systems. This achieves faster and more accurate temperature control, improving the system's adaptability and robustness.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- DONGGUAN FUQIYANG ELECTRONIC TECH CO LTD
- Filing Date
- 2025-10-27
- Publication Date
- 2026-06-26
Abstract
Description
Technical Field
[0001] This application relates to the field of temperature control technology, specifically to a temperature control method, system, and device for a chip heat sink.
Background Technology
[0002] With advancements in integrated circuit technology, the transistor density of integrated chips continues to increase, leading to higher power consumption and greater heat generation. Temperature control has thus become a crucial factor influencing electronic product design. Exceeding a certain temperature limit can cause chip failure, resulting in performance degradation, reduced reliability, or even permanent damage. Traditional heat dissipation methods primarily employ passive heat sinks or air convection fans to accelerate cooling. However, with the development of new materials and technologies, heat pipe radiators based on phase change heat transfer principles have emerged, along with currently popular technologies such as liquid cooling, phase change material cooling, and dynamic adjustment using intelligent temperature control algorithms. Overall, temperature control plays a vital role in chip operation, serving as an indispensable factor for equipment lifespan, high-performance computing applications, and energy efficiency.
[0003] The heat generation of a chip is inherently a nonlinear process, influenced by various factors such as the efficiency of the heat sink fins and fluid dynamics characteristics. Changes in these factors lead to diminishing marginal returns in heat dissipation. Traditional PID control algorithms typically rely on fixed parameters for temperature regulation, meaning their performance under varying loads may be unsatisfactory. Specifically, under low loads, PID control may cause the system to overreact, generating noise or wasting energy; while under sudden load increases, the PID algorithm's response speed may not be fast enough to adjust heat dissipation in time, causing the chip to overheat and enter an overheating throttling mode, affecting system stability and performance. Therefore, traditional PID control may have certain limitations when handling such complex nonlinear systems.
Summary of the Invention
[0004] In view of the above, it is necessary to provide a temperature control method, system and device for chip heat sinks to solve the above problems.
[0005] According to one aspect of this application, a method for temperature control of a chip heat sink is provided, the method comprising:
[0006] Acquire temperature data from the chip at each acquisition time;
[0007] The temperature data at each acquisition time and all previous acquisition times are filtered to obtain the optimal temperature estimate at each acquisition time. The estimate is then compared with the temperature data at each acquisition time. The thermal pressure load index at each acquisition time is obtained by combining the frequency domain characteristics of the temperature data.
[0008] The temperature data at each acquisition time and all previous acquisition times are decomposed to obtain the approximate coefficients of each layer after decomposition. Combined with the discrete characteristics and trend characteristics of the thermal pressure load coefficients at all acquisition times within the preset window, the dynamic thermal disturbance factor at each acquisition time is determined.
[0009] The proportional gain in the PID control algorithm for chip heat dissipation is improved based on the aforementioned dynamic thermal disturbance factor.
[0010] The filtering of temperature data at each acquisition moment and all previous acquisition moments employs the Kalman filter algorithm.
[0011] The method for obtaining the frequency domain features is as follows:
[0012] A fast Fourier transform is performed on the temperature data at each acquisition moment and all previous acquisition moments, and the extracted dominant frequency is used as the frequency domain feature of each acquisition moment.
[0013] Specifically, obtaining the thermal pressure load index at each sampling moment involves:
[0014] Calculate the difference between the optimal temperature estimate and the temperature data at each acquisition time, and obtain the degree of dispersion of the difference at each acquisition time and all previous acquisition times; calculate the normalized value of the dominant frequency at each acquisition time, and perform positive fusion of it with the dispersion to obtain the thermal pressure load coefficient at each acquisition time.
[0015] The discrete features are determined by the coefficient of variation.
[0016] The trend feature is determined by the slope of a straight line after fitting the thermal pressure load coefficient at all times within a preset window.
[0017] Specifically, determining the dynamic thermal disturbance factor at each acquisition moment involves:
[0018] Calculate the positive fusion result of the coefficient of variation and the slope of the thermal pressure load coefficient within each window; calculate the mean of the positive fusion results obtained from all windows, and combine it with the approximation coefficients of the preset number of layers obtained from wavelet decomposition to determine the dynamic thermal disturbance factor at each acquisition time.
[0019] The improved formula for the proportional gain is as follows: [formula not reproduced in this text]; where the defined terms are, in order, the improved proportional gain for the next acquisition time; the initial proportional gain; a preset adjustment coefficient; the normalization function; and the dynamic thermal disturbance factor at each acquisition moment.
[0020] According to another aspect of this application, a temperature control device for a chip heat sink is provided, including a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor executes the computer program to implement the steps of any of the methods described above.
[0021] According to another aspect of this application, a temperature control system for a chip heat sink is provided, wherein the system stores a computer program that, when executed by a processor, implements any of the methods described above.
[0022] This application has at least the following beneficial effects:
[0023] This application addresses the control lag or overshoot problem caused by nonlinearity and time-varying characteristics in chip heat dissipation systems. It uses Kalman filtering and fast Fourier transform to reflect the degree of load fluctuation and model mismatch characteristics, thus solving the problem that traditional temperature control methods are insufficient in response to instantaneous change conditions and cannot quickly adapt to load changes.
[0024] To address the complex dynamic response characteristics of chip heat dissipation systems and the difficulty of accurately meeting requirements with traditional fixed-parameter control, this solution proposes an analysis of temperature change characteristics using wavelet transform and sliding window statistical methods. Wavelet transform effectively extracts key features such as long-term heat accumulation, short-term fluctuation intensity, and instantaneous rate of change within the system. Combined with sliding window technology, noise interference and measurement errors can be eliminated from the real-time data stream, thus more accurately reflecting the system's dynamic behavior and pressure changes. This method can more flexibly handle the complex changes in the heat dissipation system under different operating conditions, improving the accuracy and response speed of temperature control.
[0025] Finally, based on the dynamic thermal disturbance factor, the PID parameters are adaptively adjusted in real time, so that the proportional gain is dynamically adjusted with the system operating conditions. This enables rapid suppression of temperature spikes, avoids overcompensation energy consumption under low load, significantly improves the adaptability and robustness of the control system to nonlinear and time-varying environments, ensures that the chip can maintain temperature stability during high load surges and long-term operation, extends equipment life and improves energy efficiency.
Attached Figure Description
[0026] Figure 1 is a flowchart of the steps of a temperature control method for a chip heat sink provided in this application;
[0027] Figure 2 is a flowchart of the proportional gain improvement provided in this application.
Detailed Implementation
[0028] In the description of the embodiments in this application, the words "exemplary," "or," and "for example" are used to indicate examples, illustrations, or descriptions. Any embodiment or design scheme described as "exemplary" or "for example" in the embodiments of this application should not be construed as being more preferred or advantageous than other embodiments or design schemes. Specifically, the use of the words "exemplary," "or," and "for example" is intended to present the relevant concepts in a specific manner.
[0029] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in this application's specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.
[0030] It should also be noted that the terms "first" and "second" in this application and its accompanying drawings are used to distinguish similar objects, rather than to describe a specific order or sequence. The methods disclosed in the embodiments of this application or the methods shown in the flowcharts include one or more steps for implementing the method. Without departing from the scope of this application, the execution order of multiple steps can be interchanged, and some steps can also be deleted.
[0031] Referring to Figure 1, which illustrates a flowchart of a temperature control method for a chip heat sink according to an embodiment of this application, the method includes the following steps:
[0032] Step 1: Acquire temperature data from the chip at each acquisition time.
[0033] Temperature data from the chip is collected in real time by a temperature sensor, with temperature values acquired continuously at a fixed sampling frequency of 10 Hz. The temperature data collected at all previous acquisition times are sorted in ascending order of time to form a temperature time series for each acquisition time.
[0034] Step 2: Filter the temperature data at each acquisition time and all previous acquisition times to obtain the optimal temperature estimate for each acquisition time, and compare it with the temperature data at each acquisition time. Combine the frequency domain characteristics of the temperature data to obtain the thermal pressure load index for each acquisition time.
[0035] Because chip heat dissipation systems involve highly nonlinear and time-varying processes, traditional temperature control methods exhibit response lag or overshoot under sudden changes in external load conditions. Simultaneously, the heat generated by the chip changes instantaneously with the computational load, and due to the thermal resistance of the heat sink and the inertia and kinetic energy of the fluid, the heat dissipation characteristics of the system also exhibit dynamic characteristics, inevitably resulting in nonlinearity. This nonlinearity means that a fast response cannot be achieved while maintaining stability using pre-defined fixed parameters.
[0036] To address noise interference and measurement uncertainty in temperature signals, this application employs a Kalman filter algorithm. Using the temperature sequence at each acquisition time as input, and setting the process noise covariance Q = 0.01 and the observation noise covariance R = 0.1, the algorithm outputs the optimal temperature estimate for each acquisition time. By fusing system dynamics with real-time observations through a state-space model, the filter suppresses random noise and yields an estimate that more closely approximates the true temperature. The Kalman filter continuously updates the state estimate through a prediction-correction mechanism, adapting to dynamic changes in the system. Its output represents the most reliable estimate of the core temperature of the system and provides a sound basis for subsequent control decisions.
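By way of illustration only, the filtering step can be sketched as follows. The text gives only Q = 0.01 and R = 0.1 and does not state the state equation, so the sketch assumes a scalar random-walk temperature model; the function name `kalman_filter_1d` and the initial estimate and covariance (`x0`, `p0`) are hypothetical choices, not values from this application.

```python
def kalman_filter_1d(measurements, q=0.01, r=0.1, x0=None, p0=1.0):
    """Scalar Kalman filter under an assumed random-walk temperature model.

    q and r are the process and observation noise covariances from the text;
    x0 and p0 (initial estimate and covariance) are illustrative assumptions.
    """
    if not measurements:
        return []
    x = measurements[0] if x0 is None else x0
    p = p0
    estimates = []
    for z in measurements:
        # Prediction: the random-walk model keeps the state; uncertainty grows by q.
        p = p + q
        # Correction: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


readings = [45.0, 45.3, 44.8, 45.1, 52.0, 45.2, 45.0]  # one noisy spike at index 4
filtered = kalman_filter_1d(readings)
# Difference between estimate and measurement, used later for the load coefficient.
residuals = [x - z for x, z in zip(filtered, readings)]
```

In this sketch the isolated spike in the readings is only partly followed by the estimate, which is why the estimate-minus-measurement difference becomes large when the measured behavior departs from the assumed model.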
[0037] However, relying solely on instantaneous temperature estimation cannot reveal the periodic characteristics of temperature load changes or predict trend changes. Therefore, a Fast Fourier Transform (FFT) is introduced. Using the temperature sequence at each acquisition moment as input, the FFT converts the time-domain temperature signal to the frequency domain, and the frequency with the largest amplitude is selected as the dominant frequency; the dominant frequency and its corresponding amplitude are output for each acquisition moment. Since changes in chip load (such as periodic task scheduling) leave characteristic frequency signatures in the temperature signal, this method is effective for trend analysis: the periodic strength of load changes and the energy distribution can be determined from the dominant frequencies and amplitudes output by the FFT.
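As a minimal sketch of the dominant-frequency extraction, the example below uses a direct discrete Fourier transform written with the standard library (an FFT produces the same spectrum faster). Removing the mean and skipping the Nyquist bin are assumptions made here so that the constant temperature offset does not dominate; the function name `dominant_frequency` is hypothetical.

```python
import cmath
import math


def dominant_frequency(samples, fs=10.0):
    """Return (frequency_hz, amplitude) of the largest non-DC spectral peak.

    Uses a direct O(N^2) DFT for clarity. fs is the 10 Hz sampling rate
    stated in the text.
    """
    n = len(samples)
    if n < 2:
        return 0.0, 0.0
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove DC (assumption)
    best_k, best_amp = 0, 0.0
    # Bins 1 .. just below Nyquist; the Nyquist bin is skipped (assumption).
    for k in range(1, (n + 1) // 2):
        coeff = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                    for i, c in enumerate(centered))
        amp = 2.0 * abs(coeff) / n  # single-sided amplitude
        if amp > best_amp:
            best_k, best_amp = k, amp
    return best_k * fs / n, best_amp


# Synthetic example: 50 degree baseline with a 1 Hz ripple of amplitude 2,
# sampled at 10 Hz for 4 seconds.
signal = [50.0 + 2.0 * math.sin(2 * math.pi * 1.0 * i / 10.0) for i in range(40)]
freq, amp = dominant_frequency(signal)
```

With 40 samples at 10 Hz the frequency resolution is 0.25 Hz, so the 1 Hz ripple falls exactly on a bin in this synthetic case.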
[0038] Based on the above analysis, a thermal pressure load coefficient is constructed for each data acquisition moment. Specifically, the difference between the optimal temperature estimate and the actual temperature at each acquisition moment is calculated, and the dispersion of the difference at each acquisition moment and all previous acquisition moments is obtained. The dispersion is then positively fused with the normalized value of the dominant frequency at each acquisition moment to obtain the thermal pressure load coefficient for each acquisition moment. In this embodiment, the dispersion of the difference is calculated using the coefficient of variation; the positive fusion of multiple variables is performed using a multiplication method; and the normalization method uses the Z-score normalization function.
[0039] It should be understood that, in the calculation of the thermal pressure load coefficient, the difference between the optimal temperature estimate and the actual temperature describes how far the optimal estimate output by the Kalman filter deviates from the actual measured value in the chip heat dissipation system. A large deviation indicates that the actual thermal dynamic behavior of the system differs significantly from the state equation set by the Kalman filter, suggesting that the chip's actual load has undergone a larger sudden change than expected (e.g., a momentary change from standby to high-load computing). The larger the coefficient of variation, the more unstable the chip's temperature change. The dominant frequency is the frequency of the strongest fluctuation component in the temperature signal and measures the chip's load change rate and fluctuation period. The normalized result represents the standardized deviation of the current dominant frequency from the historical frequency distribution, reflecting how abnormal the load fluctuations are in the frequency domain. When this value is high, the current chip temperature state is changing rapidly and frequently. Under such conditions, the heat dissipation rate also increases rapidly, and the time to reach saturation is correspondingly shorter.
[0040] The thermal load coefficient can be obtained from these two aspects, which comprehensively describes the load fluctuation of the heat dissipation system. When the model is severely mismatched and the load fluctuation is large, the value will be large, indicating that the chip heat dissipation system is operating under relatively harsh conditions.
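The construction of the thermal pressure load coefficient described above can be sketched as follows, under stated assumptions: the dispersion is the coefficient of variation of the estimate-minus-measurement differences, the normalization is a Z-score of the current dominant frequency against its own history, and the positive fusion is a multiplication, as in this embodiment. The small `eps` guard against a near-zero mean and all function names are hypothetical additions.

```python
import statistics


def coefficient_of_variation(values, eps=1e-9):
    """Standard deviation divided by |mean|; eps guards a near-zero mean (assumption)."""
    if len(values) < 2:
        return 0.0
    return statistics.pstdev(values) / (abs(statistics.fmean(values)) + eps)


def z_score(value, history):
    """Standardized deviation of value from the history it belongs to."""
    if len(history) < 2:
        return 0.0
    sd = statistics.pstdev(history)
    return 0.0 if sd == 0 else (value - statistics.fmean(history)) / sd


def thermal_load_coefficient(residuals, dominant_freqs):
    """Multiply residual dispersion by the Z-scored current dominant frequency."""
    return (coefficient_of_variation(residuals)
            * z_score(dominant_freqs[-1], dominant_freqs))


# Difference between Kalman estimate and measurement at each acquisition time so far.
residuals = [0.10, 0.12, 0.08, 0.11, 0.60]
# Dominant frequency (Hz) found at each acquisition time so far.
dominant_freqs = [0.5, 0.5, 0.5, 0.5, 1.5]
load = thermal_load_coefficient(residuals, dominant_freqs)
```

Note that a Z-score can be negative when the current dominant frequency is below its historical mean, so the product can be negative in this sketch; how the application handles that case is not stated.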
[0041] Step 3: Decompose the temperature data at each acquisition time and all previous acquisition times to obtain the approximate coefficients of each layer after decomposition. Combine the discrete characteristics and trend characteristics of the thermal pressure load coefficients at all acquisition times within the preset window to determine the dynamic thermal disturbance factor at each acquisition time.
[0042] Because chip heat dissipation systems are strongly nonlinear and time-varying, the chip's heat generation power jumps instantaneously with the computing load, and the heat dissipation process has complex dynamic response characteristics due to the combined effects of factors such as fin efficiency and fluid dynamics. Traditional fixed-parameter control methods therefore struggle to regulate temperature accurately: at low loads, overcompensation is likely and wastes energy, while at high loads, an insufficient compensation response risks overheating.
[0043] To quantify this dynamic characteristic, the thermal pressure load coefficients calculated at all previous acquisition times are sorted in ascending order of time to obtain the thermal load sequence for each acquisition time. The thermal load sequence for each acquisition time is used as input to a wavelet transform, with the wavelet basis function set to Daubechies4 and the decomposition level set to 3. The Daubechies4 wavelet has good localization and smoothness, and can effectively describe the trend component of the signal. The 3-level decomposition extracts the details and overall trend of the signal at different time scales, outputting the approximation coefficients and detail coefficients of each level. The third-level approximation coefficients represent the trend of the thermal pressure load coefficient on medium- to long-term time scales; they describe the continuous accumulation of heat load in the heat dissipation system over a long period and reflect the temperature rise inertia caused by prolonged high-load operation of the chip. The wavelet transform provides multi-resolution analysis and can therefore characterize how the thermal pressure load coefficient changes in both the time and frequency domains.
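For illustration, a 3-level Daubechies4 decomposition can be sketched in plain Python as below. The filter taps are the commonly tabulated 8-tap db4 low-pass coefficients; the periodic boundary extension, the sign convention of the high-pass filter, and the names `dwt_step` and `wavedec` are assumptions of this sketch. In practice a wavelet library would normally be used, and its boundary handling may differ.

```python
# Daubechies-4 low-pass decomposition taps (8 coefficients, sum = sqrt(2)).
DB4_LO = [
    -0.010597401784997278, 0.032883011666982945, 0.030841381835986965,
    -0.18703481171888114, -0.02798376941698385, 0.6308807679295904,
    0.7148465705525415, 0.23037781330885523,
]
# Matching high-pass taps via a quadrature-mirror relation.
DB4_HI = [(-1) ** n * DB4_LO[len(DB4_LO) - 1 - n] for n in range(len(DB4_LO))]


def dwt_step(signal):
    """One decomposition level with periodic boundary extension (an assumption)."""
    n = len(signal)
    approx, detail = [], []
    for k in range(n // 2):
        window = [signal[(2 * k + i) % n] for i in range(len(DB4_LO))]
        approx.append(sum(h * x for h, x in zip(DB4_LO, window)))
        detail.append(sum(g * x for g, x in zip(DB4_HI, window)))
    return approx, detail


def wavedec(signal, levels=3):
    """Return (approximation at the final level, [detail_1, ..., detail_levels])."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        details.append(detail)
    return approx, details


# Synthetic thermal-load sequence: slow ramp plus a small alternating ripple.
load_sequence = [0.01 * i + 0.05 * (-1) ** i for i in range(64)]
a3, details = wavedec(load_sequence, levels=3)
```

Each level halves the sequence length, so a 64-sample input yields 8 third-level approximation coefficients that follow the slow ramp while the alternating ripple is carried by the detail coefficients.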
[0044] Next, sliding-window statistics are computed with the thermal load sequence at each acquisition moment as input. In this embodiment, the window length is set to 1 second, and the sliding step is the sampling interval. Within each window, the coefficient of variation of the thermal pressure load coefficients at all acquisition moments is calculated, and a straight line is fitted to those coefficients to obtain its slope. This yields the central tendency, fluctuation intensity, and direction of change of the thermal pressure load coefficient over a local time span. The coefficient of variation represents the relative dispersion of thermal load fluctuations: the larger the value, the more unstable the system's heat dissipation state. The slope reflects the rate of change of heat dissipation demand (a positive value indicates increasing demand); rising demand means that the fan speed needs to be adjusted frequently to follow the load transition.
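The sliding-window statistics can be sketched as follows. A 1-second window at the 10 Hz sampling rate corresponds to 10 samples, and the window advances by one sample; the slope here is expressed per sample (multiply by 10 for a per-second value). The function name `window_features` and the zero-mean guard are hypothetical.

```python
import statistics


def window_features(loads, window=10):
    """Coefficient of variation and least-squares slope for each sliding window.

    window=10 samples corresponds to 1 s at the 10 Hz rate in the text; the
    window advances by one sample (the sampling interval).
    """
    features = []
    for start in range(len(loads) - window + 1):
        chunk = loads[start:start + window]
        mean = statistics.fmean(chunk)
        cv = statistics.pstdev(chunk) / abs(mean) if mean != 0 else 0.0
        # Least-squares slope of load against sample index within the window.
        t_mean = (window - 1) / 2.0
        num = sum((t - t_mean) * (y - mean) for t, y in enumerate(chunk))
        den = sum((t - t_mean) ** 2 for t in range(window))
        features.append((cv, num / den))
    return features


# Synthetic load rising linearly by 0.1 per sample from 1.0.
loads = [1.0 + 0.1 * i for i in range(15)]
features = window_features(loads)
```

For the 15-sample ramp this produces 6 windows, each with a slope of about 0.1 per sample and a coefficient of variation that shrinks as the window mean grows.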
[0045] Based on the above analysis, a dynamic thermal disturbance factor B is constructed for each acquisition moment. Specifically, the positive fusion result of the coefficient of variation and the slope of the thermal pressure load coefficient within each window is calculated; the mean of the positive fusion results obtained from all windows is then calculated and combined with the approximation coefficients of the preset number of layers obtained by wavelet decomposition to determine the dynamic thermal disturbance factor for each acquisition moment.
[0046] In this embodiment, the specific formula for the dynamic thermal disturbance factor B is: [formula not reproduced in this text]; where the defined terms are the approximation coefficients of the third level of the wavelet transform decomposition; the coefficient of variation and the slope within the j-th window, respectively; and J, the number of windows.
[0047] It should be understood that the third-level approximation coefficients indicate the cumulative intensity of heat load in the medium to long term; the coefficient of variation indicates the magnitude of short-term fluctuations relative to average demand; and the slope reflects the instantaneous rate of change of heat dissipation demand. B therefore comprehensively reflects how rapidly and urgently the heat dissipation system must respond: the larger the value of B, the more the heat dissipation system must withstand three situations at the same time (continuous heat accumulation, frequent load fluctuations, and sudden temperature rise), and the more the control unit needs to actively use high-power heat dissipation strategies to prevent the chip from throttling or being damaged by overheating.
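Because the formula for B is not reproduced in this text, the following is only a sketch under explicit assumptions. Taken from the description: the coefficient of variation and slope are fused by multiplication within each window, and the fused values are averaged over the J windows. Assumed here and not confirmed by the source: the third-level approximation coefficients are reduced to a scalar by their mean absolute value, and that scalar is added to the window term (the description elsewhere speaks of B reflecting the "sum" of these effects). The function name is hypothetical.

```python
def dynamic_thermal_disturbance(a3, window_feats):
    """Assumed form: mean(|a3|) + mean over windows of (cv_j * slope_j).

    The additive combination and the mean-absolute reduction of a3 are
    assumptions of this sketch, not the formula of the application.
    """
    if not a3 or not window_feats:
        return 0.0
    accumulation = sum(abs(c) for c in a3) / len(a3)    # medium/long-term term
    fused = [cv * slope for cv, slope in window_feats]  # multiplicative fusion per window
    return accumulation + sum(fused) / len(fused)


a3 = [2.0, 2.5, 3.0, 3.5]                # third-level approximation coefficients
window_feats = [(0.2, 0.1), (0.3, 0.2)]  # (coefficient of variation, slope) per window
b = dynamic_thermal_disturbance(a3, window_feats)
```

Under these assumptions B grows with sustained load (larger approximation coefficients) and with windows that are both volatile and rising, which matches the qualitative behavior the description attributes to B.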
[0048] Step 4: Improve the proportional gain in the PID control algorithm for chip heat dissipation based on the dynamic thermal disturbance factor.
[0049] Because chip heat dissipation systems are strongly nonlinear and time-varying, when the chip load causes an instantaneous increase in heat generation power, the heat dissipation effect also changes instantaneously under the influence of the heat sink fin parameters and the fluid dynamics. Traditional fixed-parameter PID control therefore struggles to meet the requirements of precise temperature control: if the control amplitude is too large under low load, it causes unnecessary noise and additional energy consumption; under sudden high load, the algorithm responds slowly and cannot effectively suppress temperature spikes, which can lead to overheating, chip frequency reduction, or even burnout. The fundamental reason is that a fixed-parameter control strategy cannot adjust itself to follow the real-time changes in the system's thermal load pressure.
[0050] Therefore, it is necessary to design an adaptive PID control parameter optimization strategy based on the dynamic thermal disturbance factor. Specifically, taking the temperature error as the input, in this embodiment, the initial proportional gain KP of the PID is set to 5.0, the initial integral gain is 0.1, and the initial derivative gain is 1.0. The output control signal acts on the actuator of the heat sink to directly control the fan speed.
[0051] Based on the initial PID control algorithm, the KP value is adjusted in real time according to the instantaneous dynamic thermal disturbance factor. The dynamic thermal disturbance factor reflects the sum of the medium- and long-term heat accumulation intensity, the severity of short-term load changes, and the rate of temperature rise of the heat dissipation system. The larger the value, the greater the dynamic pressure on the heat dissipation system and the more urgent the heat dissipation demand. Based on this, the improved formula for the proportional gain is: [formula not reproduced in this text]; where the defined terms are, in order, the improved proportional gain for the next acquisition time; the initial proportional gain; the preset adjustment coefficient; and the normalization function. The preset adjustment coefficient ranges from 0.5 to 0.7 and is used to avoid over-adjustment or insufficient response; in this embodiment, its value is 0.5. Implementers can adjust the value according to the actual situation, and this application does not impose any restrictions on it. The maximum-minimum normalization function is used in this embodiment.
[0052] The flowchart for improving the proportional gain is shown in Figure 2.
[0053] When the dynamic thermal disturbance factor increases, the proportional gain automatically increases, improving the controller's response speed to sudden loads and effectively suppressing temperature spikes. When the dynamic thermal disturbance factor is small, the proportional gain returns to the reference value, avoiding over-adjustment under low loads. This strategy enables the PID parameters to adaptively adjust according to the real-time operating conditions of the cooling system, significantly improving the control adaptability and robustness for nonlinear and time-varying systems.
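The gain adaptation can be illustrated with the sketch below. The gain formula itself is not reproduced in this text, so the form used here, initial gain multiplied by (1 + adjustment coefficient × normalized B), is an assumption chosen only because it reproduces the stated behavior: the gain rises with B and returns to the initial value when normalized B is zero. The initial gains (5.0, 0.1, 1.0), the adjustment coefficient 0.5, max-min normalization, and the 0.1 s step (10 Hz) come from the text; the class and function names and the error sign convention are hypothetical.

```python
def min_max(value, history):
    """Max-min normalization of value against the history of B values."""
    lo, hi = min(history), max(history)
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


def adaptive_kp(b_history, kp0=5.0, alpha=0.5):
    """Assumed form: kp0 * (1 + alpha * Norm(B)); not the application's formula."""
    return kp0 * (1.0 + alpha * min_max(b_history[-1], b_history))


class PID:
    """Textbook discrete PID with the initial gains given in the embodiment."""

    def __init__(self, kp=5.0, ki=0.1, kd=1.0, dt=0.1):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        """Return the control output (e.g. a fan-speed command) for one sample."""
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


pid = PID()
b_history = [0.02, 0.05, 0.11]   # B at successive acquisition times
pid.kp = adaptive_kp(b_history)  # latest B is the maximum, so Norm(B) = 1
# Assumed sign convention: error = measured temperature - setpoint.
command = pid.step(error=2.0)
```

With these illustrative numbers the proportional gain rises from 5.0 to 7.5 when B is at its historical maximum, and stays at 5.0 when B is at its minimum or has not varied.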
[0054] Based on the same concept as the method embodiments of this application, a temperature control device for a chip heat sink is provided, including a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor executes the computer program to implement the steps of any of the methods described above.
[0055] Based on the same concept as the method embodiments of this application, a temperature control system for a chip heat sink is provided. The system stores a computer program, which, when executed by a processor, implements any of the methods described above.
[0056] It should be noted that the flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of the systems, methods, and computer program products according to embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions marked in the blocks may occur in a different order than that shown in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. In the descriptions corresponding to the flowcharts and block diagrams in the accompanying drawings, the operations or steps corresponding to different blocks may also occur in a different order than disclosed in the description; sometimes there is no specific order between different operations or steps. For example, two consecutive operations or steps may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. Each block in a block diagram and / or flowchart, and combinations of blocks in a block diagram and / or flowchart, can be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.
[0057] The above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of this application, and should all be included within the protection scope of this application.
Claims
1. A method for temperature control of a chip heat sink, characterized in that the method includes the following steps:
Acquire temperature data from the chip at each acquisition time;
The temperature data at each acquisition time and all previous acquisition times are filtered to obtain the optimal temperature estimate at each acquisition time; the estimate is then compared with the temperature data at each acquisition time, and the thermal pressure load index at each acquisition time is obtained by combining the frequency domain characteristics of the temperature data;
The temperature data at each acquisition time and all previous acquisition times are decomposed to obtain the approximate coefficients of each layer after decomposition; combined with the discrete characteristics and trend characteristics of the thermal pressure load coefficients at all acquisition times within the preset window, the dynamic thermal disturbance factor at each acquisition time is determined;
The proportional gain in the PID control algorithm for chip heat dissipation is improved based on the aforementioned dynamic thermal disturbance factor;
The improved formula for the proportional gain is as follows: [formula not reproduced in this text]; where the defined terms are, in order, the improved proportional gain for the next acquisition time; the initial proportional gain; a preset adjustment coefficient; the normalization function; and the dynamic thermal disturbance factor at each acquisition moment;
The specific formula for the dynamic thermal disturbance factor is as follows: [formula not reproduced in this text]; where the defined terms are the approximate coefficients for the third level of wavelet transform decomposition; the coefficient of variation and the slope within the j-th window, respectively; and J, the number of windows; the third-level approximate coefficients indicate the cumulative intensity of heat load in the medium to long term; the coefficient of variation indicates the magnitude of short-term fluctuations relative to average demand; and the slope reflects how quickly the heat dissipation demand changes instantaneously.
2. The temperature control method for a chip heat sink as described in claim 1, characterized in that, The Kalman filter algorithm is used to filter the temperature data at each acquisition time and all previous acquisition times.
3. The temperature control method for a chip heat sink as described in claim 1, characterized in that, The method for obtaining the frequency domain features is as follows: A fast Fourier transform is performed on the temperature data at each acquisition moment and all previous acquisition moments, and the extracted dominant frequency is used as the frequency domain feature of each acquisition moment.
4. The temperature control method for a chip heat sink as described in claim 3, characterized in that, The process of obtaining the thermal load index at each data collection moment is as follows: Calculate the difference between the optimal temperature estimate and the temperature data at each acquisition time, and obtain the degree of dispersion of the difference at each acquisition time and all previous acquisition times; calculate the normalized value of the dominant frequency at each acquisition time, and perform positive fusion of it with the dispersion to obtain the thermal pressure load coefficient at each acquisition time.
5. The temperature control method for a chip heat sink as described in claim 1, characterized in that, The discrete features are determined using the coefficient of variation.
6. The temperature control method for a chip heat sink as described in claim 5, characterized in that, The trend characteristic is determined by the slope of a straight line obtained by fitting the thermal pressure load coefficient at all times within a preset window.
7. The temperature control method for a chip heat sink as described in claim 6, characterized in that, The determination of the dynamic thermal disturbance factor at each acquisition moment is specifically as follows: Calculate the positive fusion result of the coefficient of variation and the slope of the thermal pressure load coefficient within each window; calculate the mean of the positive fusion results obtained from all windows, and combine it with the approximation coefficients of the preset number of layers obtained from wavelet decomposition to determine the dynamic thermal disturbance factor at each acquisition time.
8. A temperature control device for a chip heat sink, comprising a memory, a processor, and a computer program stored in the memory and running on the processor, characterized in that, When the processor executes the computer program, it implements the steps of the method as described in any one of claims 1-7.
9. A temperature control system for a chip heat sink, wherein the system stores a computer program, characterized in that, When the computer program is executed by a processor, it implements the method as described in any one of claims 1-7.