Constraint neural network based data center cooling system control method and apparatus

By optimizing the control parameters of the data center cooling system using a control method based on constrained neural networks, the problem of chip temperature exceeding limits caused by model uncertainty was solved, achieving high efficiency and energy saving of the system and improved chip thermal reliability.

CN117908653B · Active · Publication Date: 2026-06-23 · XI AN JIAOTONG UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
XI AN JIAOTONG UNIV
Filing Date
2024-01-25
Publication Date
2026-06-23


Abstract

A data center cooling system control method and device based on a constraint neural network, the method comprising: acquiring thermodynamic parameters of a data center cooling system in real time, and constructing an original database; preprocessing data in the original database to obtain a power consumption dataset and a chip temperature dataset; training a power consumption prediction neural network using the power consumption dataset and a penalty function; training a chip temperature prediction neural network using the chip temperature dataset and the penalty function; the penalty function for temperature prediction adopts an adaptive penalty function, and a penalty factor is introduced when the model predicted temperature is lower than the actual temperature, so that the model predicted temperature is always higher than the actual temperature; and taking the minimum power consumption of the system as an optimization objective, and taking the actual temperature of the chip being less than the upper limit of the chip temperature as a constraint condition, the optimal control parameters under different environmental temperature and humidity and heat load are optimized. The optimization control result can be maintained within the critical temperature of the chip, and the energy saving of the data center is maximized.

Description

Technical Field

[0001] This invention belongs to the field of data center thermal management technology, specifically relating to a data center cooling system control method and device based on a constrained neural network.

Background Technology

[0002] With the advent of the information age, data centers have developed rapidly and have become an indispensable part of modern society. According to relevant statistics, data centers consume approximately 1% of the total electricity in society, and this figure is projected to reach 20% by 2025. Cooling systems are a crucial component of data center systems, dissipating the significant heat generated by IT equipment to ensure its high performance and safe operation. Cooling systems consume approximately 40% of the total system power, second only to the power consumption of IT equipment. Therefore, optimizing data center cooling systems and improving system operating efficiency is of paramount importance.

[0003] When a chip's operating temperature exceeds its critical allowable temperature, its thermal reliability decreases; conversely, when its operating temperature is significantly below the critical allowable temperature, system cooling is wasted and power consumption increases. Therefore, data center thermal management must maintain chip temperatures within a safe range and as close to the critical temperature as possible. Currently, most data centers use data models such as neural networks for modeling and optimizing control. However, these models contain uncertainties. When a model predicts a chip temperature equal to the critical temperature, the actual chip temperature under the same operating conditions may exceed the critical temperature, shortening the chip's lifespan. Therefore, most manufacturers lower the chip's critical temperature to compensate for model uncertainties; however, this approach results in significant cooling waste.

Summary of the Invention

[0004] The purpose of this invention is to address the problems in the prior art by providing a data center cooling system control method and apparatus based on constrained neural networks, which compensates for the uncertainty of the data model, maintains the optimized control results within the chip critical temperature, enhances the robustness of neural network optimized control, and maximizes data center energy saving.

[0005] To achieve the above objectives, the present invention provides the following technical solution:

[0006] Firstly, a data center cooling system control method based on a constrained neural network is provided, comprising:

[0007] Real-time acquisition of thermodynamic parameters of data center cooling systems to construct a raw database;

[0008] The data in the original database is preprocessed to obtain power consumption dataset and chip temperature dataset;

[0009] The pre-built power prediction neural network is trained using a power consumption dataset and a power consumption prediction penalty function;

[0010] The pre-established chip temperature prediction neural network is trained using a chip temperature dataset and a temperature prediction penalty function. The temperature prediction penalty function is an adaptive penalty function. When the model predicts a temperature lower than the actual temperature, a penalty factor is introduced to ensure that the model predicts a temperature higher than the actual temperature. The closer the actual temperature is to the upper limit of the chip temperature, the larger the penalty factor becomes.

[0011] By using a trained power consumption prediction neural network and a chip temperature prediction neural network, with the goal of minimizing the power consumption of the data center cooling system, and with the constraint that the actual chip temperature is less than the upper limit of the chip temperature, the optimal control parameters are optimized under different ambient temperatures, humidity and heat loads, and the data center cooling system is controlled according to the optimal control parameters.

[0012] As a preferred embodiment, the data center cooling system includes a cooling tower and a heat exchanger. The primary side of the heat exchanger is connected to the cooling tower via a pipeline, and the secondary side of the heat exchanger is connected to the data center server via a pipeline. Heat exchange between cold and hot fluids occurs within the heat exchanger. The data center server contains multiple chips, and sensors are arranged to collect the actual temperature of each chip within the data center server.

[0013] As a preferred embodiment, a primary water pump is installed on the pipeline between the cooling tower and the heat exchanger, and a secondary water pump is installed on the pipeline between the heat exchanger and the data center server; in the step of acquiring the thermodynamic parameters of the data center cooling system in real time and constructing the original database, the thermodynamic parameters of the data center cooling system include:

[0014] The parameters corresponding to the cooling tower include: external ambient temperature $T_{amb}$, external relative humidity $RH$, fan frequency $f_{fan}$, cooling tower outlet temperature $T_{tower,out}$, cooling tower inlet temperature $T_{tower,in}$, and cooling tower power consumption $P_{tower}$;

[0015] The parameters corresponding to the primary side water pump include: primary side flow rate $q_{v,1}$ and primary side water pump power consumption $P_{pump,1}$;

[0016] The parameters corresponding to the heat exchanger include: primary side liquid supply temperature $T_{sup,1}$, primary side return liquid temperature $T_{back,1}$, secondary side liquid supply temperature $T_{sup,2}$, and secondary side return liquid temperature $T_{back,2}$;

[0017] The parameters corresponding to the secondary side water pump include: secondary side flow rate $q_{v,2}$ and secondary side water pump power consumption $P_{pump,2}$;

[0018] The parameters corresponding to the data center server include: server heat load $Q$, server outlet temperature $T_{server,out}$, server inlet temperature $T_{server,in}$, and the temperature $T_{chi,i}$ of each chip.

[0019] As a preferred embodiment, the step of preprocessing the data in the original database includes:

[0020] Data denoising, data filtering, data repair, and normalization processing;

[0021] The expression for normalization is:

[0022] $x_0 = \dfrac{x - x_{min}}{x_{max} - x_{min}}$

[0023] In the formula, $x_0$ represents the normalized data; $x$ represents the original data; $x_{min}$ is the minimum value of the original data; $x_{max}$ is the maximum value of the original data;

[0024] The normalized power consumption-related thermodynamic parameters are used as the power consumption dataset, and the normalized chip temperature-related thermodynamic parameters are used as the chip temperature dataset.

[0025] The power consumption dataset includes: ambient temperature $T_{amb}$, external relative humidity $RH$, fan frequency $f_{fan}$, cooling tower power consumption $P_{tower}$, primary side flow rate $q_{v,1}$, primary side water pump power consumption $P_{pump,1}$, secondary side flow rate $q_{v,2}$, secondary side water pump power consumption $P_{pump,2}$, and server heat load $Q$;

[0026] The chip temperature dataset includes: ambient temperature $T_{amb}$, external relative humidity $RH$, fan frequency $f_{fan}$, primary side flow rate $q_{v,1}$, secondary side flow rate $q_{v,2}$, server heat load $Q$, primary side liquid supply temperature $T_{sup,1}$, primary side return liquid temperature $T_{back,1}$, secondary side liquid supply temperature $T_{sup,2}$, secondary side return liquid temperature $T_{back,2}$, and the temperature $T_{chi,i}$ of each chip.

[0027] As a preferred embodiment, the power consumption prediction penalty function is:

[0028] $Loss_P = MSE_P = \dfrac{1}{N}\sum_{k=1}^{N}\left(P_k - P_{net,k}\right)^2$

[0029] Among them, $Loss_P$ is the penalty function of the power consumption prediction neural network; $P$ is the actual power consumption; $P_{net}$ is the power consumption predicted by the neural network; $MSE_P$ denotes the mean square error; $N$ is the total number of data samples, indexed by $k$.

[0030] As a preferred embodiment, the temperature prediction penalty function is:

[0031] $Loss_T = loss_d + loss_p$

[0032]

[0033] Among them, $Loss_T$ is the penalty function of the chip temperature prediction neural network; $loss_d$ is the data-driven loss term; $loss_p$ is the constraint loss term; $T_i$ is the actual temperature of the i-th chip; $T_{net,i}$ is the temperature of the i-th chip predicted by the neural network; $\lambda_{T,i}$ is the adaptive penalty factor of the penalty function for the i-th chip, calculated as follows:

[0034]

[0035] Where $\lambda_T$ is the adaptive penalty factor of the penalty function of the chip temperature prediction neural network; $\mu_T$ is a scaling factor, adjusted according to the actual situation to balance the constraint effect against convergence; $T$ is the actual chip temperature; $T_{net}$ is the chip temperature predicted by the neural network; $T_0$ is the upper limit of the chip temperature.

[0036] As a preferred embodiment, both the power consumption prediction neural network and the chip temperature prediction neural network adopt a fully connected form, including an input layer, a hidden layer, and an output layer. When training the pre-established power consumption prediction neural network and the pre-established chip temperature prediction neural network, the number of neural network unit layers and the number of iterations are adjusted until the accuracy of the test set and the validation set is within an acceptable range, and finally the power consumption prediction neural network with the best fitting effect is obtained, thus completing the training.

[0037] As a preferred embodiment, the step of optimizing the optimal control parameters under different environmental temperatures, humidity levels, and heat loads by utilizing a trained power consumption prediction neural network and a chip temperature prediction neural network, with the goal of minimizing the power consumption of the data center cooling system and the constraint that the actual chip temperature is less than the upper limit of the chip temperature, is as follows:

[0038] $\min\; P = f(f_{fan}, q_{v,1}, q_{v,2})$

[0039] s.t. $lb_{v,i} \le q_{v,i} \le ub_{v,i}, \quad i = 1, 2$

[0040] $lb_{fan} \le f_{fan} \le ub_{fan}$

[0041] $T_{chi,i} = f_i(f_{fan}, q_{v,1}, q_{v,2}) \le T_{chi,i,max}, \quad i = 1, 2, \ldots, n$

[0042] Where $P$ is the power consumption of the cooling system; $f(f_{fan}, q_{v,1}, q_{v,2})$ represents the system power consumption as a function of fan frequency and primary and secondary side flow rates under a given ambient temperature, humidity, and heat load; $lb$ and $ub$ are the lower and upper limits of the cooling tower fan airflow, primary side pump flow rate, and secondary side pump flow rate; $f_i(f_{fan}, q_{v,1}, q_{v,2})$ represents the temperature of each chip as a function of the fan frequency and the primary and secondary side flow rates under a given ambient temperature, humidity, and heat load; $T_{chi,i,max}$ is the upper limit of chip temperature; $n$ is the number of chips in the server.

[0043] Secondly, a data center cooling system control system based on a constrained neural network is provided, comprising:

[0044] The data acquisition module is used to acquire the thermodynamic parameters of the data center cooling system in real time and build the raw database;

[0045] The data preprocessing module is used to preprocess the data in the original database to obtain power consumption dataset and chip temperature dataset;

[0046] The power prediction neural network training module is used to train a pre-built power prediction neural network using a power dataset and a power prediction penalty function.

[0047] The chip temperature prediction neural network training module is used to train a pre-established chip temperature prediction neural network using a chip temperature dataset and a temperature prediction penalty function. The temperature prediction penalty function is an adaptive penalty function. When the model predicts a temperature lower than the actual temperature, a penalty factor is introduced to ensure that the model predicts a temperature higher than the actual temperature. The closer the actual temperature is to the upper limit of the chip temperature, the larger the penalty factor becomes.

[0048] The control parameter optimization module utilizes a trained power consumption prediction neural network and a chip temperature prediction neural network to optimize the data center cooling system with the goal of minimizing power consumption and the constraint that the actual chip temperature is less than the upper limit of chip temperature. It optimizes the optimal control parameters under different ambient temperatures, humidity levels and heat loads, and controls the data center cooling system according to the optimal control parameters.

[0049] Thirdly, a computer-readable storage medium is provided, the computer-readable storage medium storing a computer program, which, when executed by a processor, implements the data center cooling system control method based on a constrained neural network.

[0050] Compared with the prior art, the present invention has at least the following beneficial effects:

[0051] This invention utilizes various high-precision sensors to acquire real-time thermodynamic parameters of the data center cooling system, constructing a raw database. By preprocessing the data in this database to reduce uncertainty, a constraint-based neural network is employed to fit the system power consumption and chip temperature. Finally, with minimizing system power consumption as the optimization objective and chip temperature as the constraint, the optimal control parameters are determined under different environmental temperatures, humidity levels, and heat loads. This invention can significantly improve data center energy efficiency and compensate for data model uncertainties, maintaining the optimized control results within the chip's critical temperature and improving the robustness of the optimized control. By adding constraints to the loss function of the temperature prediction data model, the predicted chip temperature is always higher than the actual chip temperature, further enhancing the model's control robustness. When the optimization results are applied to a real system, chip temperatures can be prevented from exceeding the critical temperature, improving chip thermal reliability.

Attached Figure Description

[0052] To more clearly illustrate the technical solutions of the embodiments of the present invention, the accompanying drawings used in the embodiments will be briefly introduced below. It should be understood that the following drawings only show some embodiments of the present invention. For those skilled in the art, other related drawings can be obtained from these drawings without creative effort.

[0053] Figure 1 is a schematic diagram of the data center cooling system structure according to an embodiment of the present invention.

[0054] Figure 2 is a flowchart of a data center cooling system control method based on a constrained neural network according to an embodiment of the present invention.

[0055] Figure 3 is a control effect diagram of the data center cooling system control method based on constrained neural networks according to an embodiment of the present invention.

[0056] In the attached drawings: 1-Cooling tower; 2-Primary side filter; 3-Heat exchanger; 4-Primary side water pump; 5-Secondary side filter; 6-Data center server; 7-Secondary side water pump.

Detailed Implementation

[0057] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, those skilled in the art can obtain other embodiments without creative effort.

[0058] Referring to Figure 1, this invention describes a data center cooling system using a typical plate heat exchanger as an example. The system includes a cooling tower 1 and a heat exchanger 3. The primary side of the heat exchanger 3 is connected to the cooling tower 1 via a pipe, and the secondary side of the heat exchanger 3 is connected to a data center server 6 via a pipe. Heat exchange occurs between cold and hot fluids within the heat exchanger 3. A primary-side water pump 4 and a primary-side filter 2 are installed on the pipe between the cooling tower 1 and the heat exchanger 3, and a secondary-side water pump 7 and a secondary-side filter 5 are installed on the pipe between the heat exchanger 3 and the data center server 6. The data center server 6 contains multiple chips, and sensors are used to collect the actual temperature of each chip within the data center server 6.

[0059] Cooling tower 1 is located externally. Utilizing the unsaturated nature of air, driven by the pressure difference of water vapor, the sprayed water in cooling tower 1 exchanges heat and moisture with the air, thereby cooling the refrigerant inside the pipeline. Primary side filter 2 and secondary side filter 5 are used to filter impurities within the pipeline. Cold and hot fluids flow and exchange heat within their respective pipes on both sides of heat exchanger 3, achieving heat transfer and impurity isolation. Primary side water pump 4 and secondary side water pump 7 are used to pressurize and transport the liquid for heat exchange. Data center server 6 performs cloud data storage and intelligent computing, generating a large amount of heat. Therefore, to ensure equipment safety, the cooling system must dissipate heat from data center server 6, ensuring the chip temperature remains within a safe range.

[0060] Referring to Figure 2, the present invention provides a data center cooling system control method based on a constrained neural network, comprising:

[0061] Real-time acquisition of thermodynamic parameters of data center cooling systems to construct a raw database;

[0062] The data in the original database is preprocessed to obtain power consumption dataset and chip temperature dataset;

[0063] The pre-built power prediction neural network is trained using a power consumption dataset and a power consumption prediction penalty function;

[0064] The pre-established chip temperature prediction neural network is trained using a chip temperature dataset and a temperature prediction penalty function. The temperature prediction penalty function is an adaptive penalty function. When the model predicts a temperature lower than the actual temperature, a penalty factor is introduced to ensure that the model predicts a temperature higher than the actual temperature. The closer the actual temperature is to the upper limit of the chip temperature, the larger the penalty factor becomes.

[0065] By using a trained power consumption prediction neural network and a chip temperature prediction neural network, with the goal of minimizing the power consumption of the data center cooling system, and with the constraint that the actual chip temperature is less than the upper limit of the chip temperature, the optimal control parameters are optimized under different ambient temperatures, humidity and heat loads, and the data center cooling system is controlled according to the optimal control parameters.

[0066] In one possible implementation, the typical cold plate liquid-cooled data center shown in Figure 1 is equipped with a variety of sensors that acquire the thermodynamic parameters of each device in the data center cooling system in real time, as shown in Table 1.

[0067] Table 1

[0068]

[0069] Under conditions of minimal heat loss from the pipeline, the cooling tower outlet temperature $T_{tower,out}$ equals the primary side return liquid temperature $T_{back,1}$, the cooling tower inlet temperature $T_{tower,in}$ equals the primary side liquid supply temperature $T_{sup,1}$, the server outlet temperature $T_{server,out}$ equals the secondary side return liquid temperature $T_{back,2}$, and the server inlet temperature $T_{server,in}$ equals the secondary side liquid supply temperature $T_{sup,2}$.

[0070] In one possible implementation, multiple data preprocessing methods are applied to the data in the original database, including empirical mode decomposition and wavelet threshold denoising, Mahalanobis distance and k-nearest neighbor methods for data filtering, and linear interpolation and expectation-maximization algorithms for data repair, in order to reduce the uncertainty of the original data and improve the robustness of the model.
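The filtering and repair steps can be illustrated with a minimal sketch for a single sensor series. It covers only two of the named techniques: a Mahalanobis-distance filter (for one variable this reduces to the absolute deviation from the mean divided by the standard deviation) and linear interpolation for repair. Empirical mode decomposition, wavelet denoising, k-nearest neighbor filtering, and expectation-maximization are not shown. The function names, the `None` marker for missing samples, and the threshold value are assumptions made for illustration.

```python
from statistics import mean, pstdev


def flag_outliers(values, threshold=2.0):
    """Return indices whose 1-D Mahalanobis distance |x - mean| / std exceeds threshold."""
    present = [v for v in values if v is not None]
    mu, sigma = mean(present), pstdev(present)
    if sigma == 0:
        return []  # constant series: nothing can be flagged
    return [i for i, v in enumerate(values)
            if v is not None and abs(v - mu) / sigma > threshold]


def repair_linear(values):
    """Fill None gaps by linear interpolation between the nearest valid neighbours.

    Gaps at either end copy the nearest valid value (an assumption).
    """
    out = list(values)
    valid = [i for i, v in enumerate(out) if v is not None]
    if not valid:
        raise ValueError("no valid samples to interpolate from")
    for i in range(len(out)):
        if out[i] is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            w = (i - left) / (right - left)
            out[i] = values[left] + w * (values[right] - values[left])
    return out


def filter_and_repair(values, threshold=2.0):
    """Mark flagged outliers as missing, then repair all gaps by interpolation."""
    cleaned = list(values)
    for i in flag_outliers(cleaned, threshold):
        cleaned[i] = None
    return repair_linear(cleaned)
```

For example, in the series `[20.0, 20.5, 21.0, 95.0, 22.0, 22.5, 23.0]` the reading `95.0` is flagged at a threshold of 2.0 and replaced by the midpoint of its neighbours.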

[0071] The data of various types are normalized according to the following formula to form a high-quality database of the model.

[0072] $x_0 = \dfrac{x - x_{min}}{x_{max} - x_{min}}$

[0073] In the formula, $x_0$ represents the normalized data; $x$ represents the original data; $x_{min}$ is the minimum value of the original data; $x_{max}$ is the maximum value of the original data.
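A minimal sketch of the normalization step, assuming it is standard min-max scaling, which maps each value to (x - x_min) / (x_max - x_min). The function names and the handling of a constant column are assumptions for illustration.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] using its own minimum and maximum."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant column carries no information and maps to 0.
        return [0.0 for _ in values]
    span = x_max - x_min
    return [(x - x_min) / span for x in values]


def denormalize(x0, x_min, x_max):
    """Invert the scaling, e.g. to read a normalized prediction in physical units."""
    return x0 * (x_max - x_min) + x_min
```

For instance, `min_max_normalize([18.0, 24.0, 30.0])` gives `[0.0, 0.5, 1.0]`. In practice the minimum and maximum found on the training data would also need to be stored, so that new measurements and network outputs can be converted consistently.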

[0074] The normalized power consumption-related thermodynamic parameters are used as the power consumption dataset, and the normalized chip temperature-related thermodynamic parameters are used as the chip temperature dataset.

[0075] The power consumption dataset includes: ambient temperature $T_{amb}$, external relative humidity $RH$, fan frequency $f_{fan}$, cooling tower power consumption $P_{tower}$, primary side flow rate $q_{v,1}$, primary side water pump power consumption $P_{pump,1}$, secondary side flow rate $q_{v,2}$, secondary side water pump power consumption $P_{pump,2}$, and server heat load $Q$;

[0076] The chip temperature dataset includes: ambient temperature $T_{amb}$, external relative humidity $RH$, fan frequency $f_{fan}$, primary side flow rate $q_{v,1}$, secondary side flow rate $q_{v,2}$, server heat load $Q$, primary side liquid supply temperature $T_{sup,1}$, primary side return liquid temperature $T_{back,1}$, secondary side liquid supply temperature $T_{sup,2}$, secondary side return liquid temperature $T_{back,2}$, and the temperature $T_{chi,i}$ of each chip.
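The split of one normalized record into the two datasets might look like the sketch below. It follows the description that the power network outputs the power consumption of each device and the temperature network outputs the temperature of each chip, with the remaining parameters as inputs. The dictionary keys (`T_amb`, `q_v1`, `T_chi_1`, and so on) and the function name are hypothetical.

```python
# Hypothetical column names for one normalized record.
POWER_INPUTS = ("T_amb", "RH", "f_fan", "q_v1", "q_v2", "Q")
POWER_TARGETS = ("P_tower", "P_pump1", "P_pump2")
TEMP_INPUTS = ("T_amb", "RH", "f_fan", "q_v1", "q_v2", "Q",
               "T_sup1", "T_back1", "T_sup2", "T_back2")


def split_record(record, n_chips):
    """Return ((power_x, power_y), (temp_x, temp_y)) for one record.

    power_y holds the device power consumptions; temp_y holds the
    temperatures of chips 1..n_chips.
    """
    power_x = [record[k] for k in POWER_INPUTS]
    power_y = [record[k] for k in POWER_TARGETS]
    temp_x = [record[k] for k in TEMP_INPUTS]
    temp_y = [record[f"T_chi_{i}"] for i in range(1, n_chips + 1)]
    return (power_x, power_y), (temp_x, temp_y)
```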

[0077] In one possible implementation, when training a pre-established power prediction neural network using a power consumption dataset and a power consumption prediction penalty function, the number of neural network layers and the number of iterations are adjusted to obtain the power prediction neural network with the best fitting effect. The output of the power prediction neural network is the power consumption of each device in the cooling system, and the remaining parameters of the power consumption dataset are the input parameters of the power prediction neural network. The power consumption prediction penalty function is:

[0078] $Loss_P = MSE_P = \dfrac{1}{N}\sum_{k=1}^{N}\left(P_k - P_{net,k}\right)^2$

[0079] Among them, $Loss_P$ is the penalty function of the power consumption prediction neural network; $P$ is the actual power consumption; $P_{net}$ is the power consumption predicted by the neural network; $MSE_P$ denotes the mean square error; $N$ is the total number of data samples, indexed by $k$.
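As a minimal sketch, assuming the power consumption penalty function is the plain mean square error over the N samples, it can be computed as follows. The function name is an assumption.

```python
def power_loss(p_actual, p_net):
    """Mean square error between actual and predicted power consumption."""
    if not p_actual or len(p_actual) != len(p_net):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(p_actual)
    return sum((p - q) ** 2 for p, q in zip(p_actual, p_net)) / n
```

For actual values `[10.0, 12.0]` and predictions `[11.0, 10.0]` the squared errors are 1 and 4, so the loss is 2.5. Over- and under-prediction are penalized equally here, unlike the temperature penalty function.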

[0080] In one possible implementation, an adaptive chip temperature prediction neural network penalty function is established. When the model predicts a temperature lower than the actual temperature, a penalty factor is introduced to ensure that the model's predicted temperature is always higher than the actual temperature. This ensures that the optimized actual chip temperature is below the chip temperature safety upper limit during optimization control. Furthermore, the closer the chip temperature is to the chip temperature upper limit, the larger the penalty factor; conversely, the further the chip temperature is from the chip temperature upper limit, the smaller the penalty factor. Since the chip temperature prediction neural network serves as a constraint on the optimization algorithm, it significantly impacts the robustness of the cooling system when the chip temperature approaches the chip temperature upper limit. Therefore, an adaptive penalty function is employed to balance system robustness with the neural network's convergence performance.

[0081] The temperature prediction penalty function is:

[0082] $Loss_T = loss_d + loss_p$

[0083]

[0084] Among them, $Loss_T$ is the penalty function of the chip temperature prediction neural network; $loss_d$ is the data-driven loss term; $loss_p$ is the constraint loss term; $T_i$ is the actual temperature of the i-th chip; $T_{net,i}$ is the temperature of the i-th chip predicted by the neural network; $\lambda_{T,i}$ is the adaptive penalty factor of the penalty function for the i-th chip, calculated as follows:

[0085]

[0086] Where $\lambda_T$ is the adaptive penalty factor of the penalty function of the chip temperature prediction neural network; $\mu_T$ is a scaling factor, adjusted according to the actual situation to balance the constraint effect against convergence; $T$ is the actual chip temperature; $T_{net}$ is the chip temperature predicted by the neural network; $T_0$ is the upper limit of the chip temperature. Since data center servers contain multiple chips, $loss_p$ must be calculated for each chip and the results summed, to ensure that the neural network predicts a chip temperature higher than the actual temperature. When the results of the optimization algorithm are applied to the actual system, the actual chip temperature therefore remains below the chip's critical safety temperature, improving the robustness of the model-based control.
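The behaviour of the adaptive penalty can be sketched as below. The exact expressions for the loss terms and the penalty factor are not shown here, so the functional forms are assumptions chosen only to reproduce the stated properties: the data-driven term is taken as a mean square error, the constraint term is non-zero only when the prediction falls below the actual temperature, and the penalty factor grows as the actual temperature approaches the upper limit. The form `mu * t_actual / t_upper` and all names are hypothetical, not the patented formula.

```python
def adaptive_factor(t_actual, t_upper, mu=10.0):
    """Hypothetical penalty factor: larger when t_actual is closer to t_upper."""
    return mu * t_actual / t_upper


def temperature_loss(t_actual, t_net, t_upper, mu=10.0):
    """Assumed Loss_T = loss_d + loss_p for the samples of one chip."""
    n = len(t_actual)
    # Data-driven term: ordinary mean square error.
    loss_d = sum((t - p) ** 2 for t, p in zip(t_actual, t_net)) / n
    # Constraint term: active only when the prediction is below the actual value.
    loss_p = sum(adaptive_factor(t, t_upper, mu) * max(0.0, t - p) ** 2
                 for t, p in zip(t_actual, t_net)) / n
    return loss_d + loss_p


def total_temperature_loss(per_chip_actual, per_chip_net, t_upper, mu=10.0):
    """Sum the per-chip losses, since each server holds several chips."""
    return sum(temperature_loss(a, p, t_upper, mu)
               for a, p in zip(per_chip_actual, per_chip_net))
```

With an upper limit of 85, predicting 68 for an actual 70 costs more than predicting 72, and the same 2-degree shortfall costs more at an actual 80 than at an actual 60. A network trained against such a loss is pushed toward conservative (higher) temperature predictions, which is the property the optimization step relies on.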

[0087] In one possible implementation, when training a pre-built chip temperature prediction neural network using a chip temperature dataset and a temperature prediction penalty function, the number of neural network unit layers and the number of iterations are adjusted to obtain the chip temperature prediction neural network with the best fitting effect. The output of the chip temperature prediction neural network is the temperature of each chip, and the remaining parameters of the chip temperature dataset are the input parameters of the chip temperature prediction neural network.
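The fully connected structure (input layer, hidden layer, output layer) can be sketched as a plain forward pass. This shows only the shape of the network: ten inputs matching the chip temperature dataset and one output per chip. The hidden width, the tanh activation, the random initialization, and all function names are assumptions, and the weights here are untrained.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply one fully connected layer, optionally followed by an activation."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def build_mlp(sizes, seed=0):
    """Build layers for sizes such as [10, 16, 4] (inputs, hidden units, chips)."""
    rng = random.Random(seed)
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(mlp, x):
    """Hidden layers use tanh; the output layer is linear."""
    for layer in mlp[:-1]:
        x = dense(x, layer, math.tanh)
    return dense(x, mlp[-1])
```

Adjusting the number of layers, as the training step describes, corresponds to changing the `sizes` list, for example `[10, 16, 16, 4]` for two hidden layers.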

[0088] In one possible implementation, minimizing the energy consumption of the data center cooling system is the optimization objective, and keeping the chip temperature below the upper limit of the chip temperature is one of the constraints. A data center thermal management optimization algorithm is then established. The goal of the control method in this embodiment of the invention is to determine the optimal cooling tower fan airflow, primary side pump flow rate, and secondary side pump flow rate under different ambient temperatures, humidity levels, and heat loads, thereby minimizing system power consumption. This can be expressed mathematically as follows:

[0089] $\min\; P = f(f_{fan}, q_{v,1}, q_{v,2})$

[0090] s.t. $lb_{v,i} \le q_{v,i} \le ub_{v,i}, \quad i = 1, 2$

[0091] $lb_{fan} \le f_{fan} \le ub_{fan}$

[0092] $T_{chi,i} = f_i(f_{fan}, q_{v,1}, q_{v,2}) \le T_{chi,i,max}, \quad i = 1, 2, \ldots, n$

[0093] Where $P$ is the power consumption of the cooling system; $f(f_{fan}, q_{v,1}, q_{v,2})$ represents the system power consumption as a function of fan frequency and primary and secondary side flow rates under a given ambient temperature, humidity, and heat load; $lb$ and $ub$ are the lower and upper limits of the cooling tower fan airflow, primary side pump flow rate, and secondary side pump flow rate; $f_i(f_{fan}, q_{v,1}, q_{v,2})$ represents the temperature of each chip as a function of the fan frequency and the primary and secondary side flow rates under a given ambient temperature, humidity, and heat load; $T_{chi,i,max}$ is the upper limit of chip temperature; $n$ is the number of chips in the server.
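The constrained optimization can be illustrated with a minimal sketch. No particular optimization algorithm is named, so an exhaustive grid search is used here purely as a stand-in. The two toy functions are hypothetical placeholders for the trained power and chip temperature networks evaluated at a fixed ambient temperature, humidity, and heat load; their coefficients carry no physical meaning.

```python
from itertools import product


def frange(lo, hi, steps):
    """Return `steps` evenly spaced values from lo to hi inclusive."""
    return [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]


def optimize_controls(power_model, temp_model, bounds, t_max, steps=11):
    """Minimise predicted power subject to predicted chip temperatures <= t_max.

    bounds maps 'f_fan', 'q_v1', 'q_v2' to (lb, ub) pairs.
    Returns (controls, power), or (None, None) if no grid point is feasible.
    """
    grids = [frange(*bounds[k], steps) for k in ("f_fan", "q_v1", "q_v2")]
    best, best_power = None, float("inf")
    for f_fan, q_v1, q_v2 in product(*grids):
        temps = temp_model(f_fan, q_v1, q_v2)
        if any(t > limit for t, limit in zip(temps, t_max)):
            continue  # violates a chip temperature constraint
        power = power_model(f_fan, q_v1, q_v2)
        if power < best_power:
            best, best_power = (f_fan, q_v1, q_v2), power
    return best, (best_power if best is not None else None)


def toy_power(f_fan, q_v1, q_v2):
    """Placeholder surrogate: power rises steeply with fan speed and flow."""
    return 0.002 * f_fan ** 3 + 1.5 * q_v1 ** 3 + 1.5 * q_v2 ** 3


def toy_temps(f_fan, q_v1, q_v2):
    """Placeholder surrogate: two chips that cool down as fan speed and flow rise."""
    base = 40.0 + 600.0 / f_fan + 30.0 / q_v1 + 30.0 / q_v2
    return [base, base + 3.0]
```

With bounds such as `{"f_fan": (20.0, 50.0), "q_v1": (1.0, 4.0), "q_v2": (1.0, 4.0)}` and a limit of 85 for both chips, the lowest-power corner of the grid is infeasible, so the search settles on an interior point where the hotter chip sits just under its limit. Because the temperature model is trained to over-predict, a solution that is feasible for the model leaves margin for the actual chip temperature.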

[0094] As shown in Figure 3, after ignoring the uncertainties of sensors and controllers, the control effects of the conventional model optimization control method and the data center cooling system control method based on constrained neural networks in this embodiment of the invention are compared. The uncertainties of the conventional model may cause the actual temperature of the chip to exceed the upper limit, while the actual temperature of the chip after the highly robust optimization control of the method of this invention is always lower than the upper limit of the chip temperature, which improves the thermal reliability of the chip and extends the chip life.

[0095] Another embodiment of the present invention also proposes a data center cooling system control system based on a constrained neural network, comprising:

[0096] The data acquisition module is used to acquire the thermodynamic parameters of the data center cooling system in real time and build the raw database;

[0097] The data preprocessing module is used to preprocess the data in the original database to obtain power consumption dataset and chip temperature dataset;

[0098] The power prediction neural network training module is used to train a pre-built power prediction neural network using a power dataset and a power prediction penalty function.

[0099] The chip temperature prediction neural network training module is used to train a pre-established chip temperature prediction neural network using a chip temperature dataset and a temperature prediction penalty function. The temperature prediction penalty function is an adaptive penalty function. When the model predicts a temperature lower than the actual temperature, a penalty factor is introduced to ensure that the model predicts a temperature higher than the actual temperature. The closer the actual temperature is to the upper limit of the chip temperature, the larger the penalty factor becomes.

[0100] The control parameter optimization module utilizes a trained power consumption prediction neural network and a chip temperature prediction neural network to optimize the data center cooling system with the goal of minimizing power consumption and the constraint that the actual chip temperature is less than the upper limit of chip temperature. It optimizes the optimal control parameters under different ambient temperatures, humidity levels and heat loads, and controls the data center cooling system according to the optimal control parameters.

[0101] Another embodiment of the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the data center cooling system control method based on a constrained neural network.

[0102] For example, the instructions stored in the memory can be divided into one or more modules / units. These modules / units are stored in a computer-readable storage medium and executed by the processor to complete the data center cooling system control method based on constrained neural networks of the present invention. The one or more modules / units can be a series of computer-readable instruction segments capable of performing specific functions, which describe the execution process of the computer program in the server.

[0103] The electronic device may be a smartphone, laptop, PDA, or cloud server, among other computing devices. It may include, but is not limited to, a processor and memory. Those skilled in the art will understand that the electronic device may also include more or fewer components, or combinations of certain components, or different components; for example, it may also include input / output devices, network access devices, buses, etc.

[0104] The processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor can be a microprocessor or any conventional processor.

[0105] The memory can be an internal storage unit of the server, such as a hard drive or RAM. Alternatively, it can be an external storage device, such as a plug-in hard drive, Smart Media Card (SMC), Secure Digital (SD) card, or Flash Card. Furthermore, the memory can include both internal and external storage units. The memory is used to store computer-readable instructions and other programs and data required by the server. It can also be used to temporarily store data that has been output or will be output.

[0106] It should be noted that the information interaction and execution process between the above-mentioned module units are based on the same concept as the method embodiment. For details on their specific functions and technical effects, please refer to the method embodiment section. They will not be repeated here.

[0107] Those skilled in the art will clearly understand that, for convenience and brevity, the above-described division of functional units and modules is merely an example. In practical applications, the above functions can be assigned to different functional units and modules as needed; that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit. Furthermore, the specific names of the functional units and modules are only for ease of distinction and are not intended to limit the scope of protection of this application. For the specific working process of the units and modules in the above system, refer to the corresponding process in the foregoing method embodiments; it will not be repeated here.

[0108] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of this application can be implemented by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, it can implement the steps of the various method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, executable files, or certain intermediate forms. The computer-readable medium can include at least: any entity or device capable of carrying the computer program code to a photographing device / terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium. Examples include USB flash drives, portable hard drives, magnetic disks, or optical disks.

[0109] In the above embodiments, the descriptions of each embodiment have different focuses. For parts that are not described in detail or recorded in a certain embodiment, please refer to the relevant descriptions of other embodiments.

[0110] The above-described embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application, and should all be included within the protection scope of this application.

Claims

1. A data center cooling system control method based on a constrained neural network, characterized in that it comprises: acquiring thermodynamic parameters of the data center cooling system in real time and constructing an original database; preprocessing the data in the original database to obtain a power consumption dataset and a chip temperature dataset; training a pre-established power consumption prediction neural network using the power consumption dataset and a power consumption prediction penalty function; training a pre-established chip temperature prediction neural network using the chip temperature dataset and a temperature prediction penalty function, wherein the temperature prediction penalty function is an adaptive penalty function in which, when the model-predicted temperature is lower than the actual temperature, a penalty factor is introduced so that the model-predicted temperature is higher than the actual temperature, and the closer the actual temperature is to the upper limit of the chip temperature, the larger the penalty factor becomes; and, using the trained power consumption prediction neural network and the trained chip temperature prediction neural network, with the minimum power consumption of the data center cooling system as the optimization objective and the actual chip temperature being less than the upper limit of the chip temperature as the constraint, determining the optimal control parameters under different ambient temperatures, humidity levels and heat loads, and controlling the data center cooling system according to the optimal control parameters.

2. The data center cooling system control method based on a constrained neural network according to claim 1, characterized in that the data center cooling system includes a cooling tower (1) and a heat exchanger (3); the primary side of the heat exchanger (3) is connected to the cooling tower (1) through a pipeline, and the secondary side of the heat exchanger (3) is connected to the data center server (6) through a pipeline; the heat exchanger (3) performs heat exchange between cold and hot fluids; the data center server (6) contains multiple chips, and sensors are arranged to collect the actual temperature of each chip in the data center server (6).

3. The data center cooling system control method based on a constrained neural network according to claim 2, characterized in that a primary-side water pump (4) is installed on the pipeline between the cooling tower (1) and the heat exchanger (3), and a secondary-side water pump (7) is installed on the pipeline between the heat exchanger (3) and the data center server (6); in the step of acquiring the thermodynamic parameters of the data center cooling system in real time and constructing the original database, the thermodynamic parameters of the data center cooling system include: the parameters corresponding to the cooling tower (1), including the external ambient temperature $T_{amb}$, the external ambient relative humidity $RH$, the fan frequency $f_{fan}$, the cooling tower outlet temperature $T_{tower,out}$, the cooling tower inlet temperature $T_{tower,in}$, and the cooling tower power consumption $P_{tower}$; the parameters corresponding to the primary-side water pump (4), including the primary-side flow rate $q_{v,1}$ and the primary-side water pump power consumption $P_{pump,1}$; the parameters corresponding to the heat exchanger (3), including the primary-side liquid supply temperature $T_{sup,1}$, the primary-side return liquid temperature $T_{back,1}$, the secondary-side liquid supply temperature $T_{sup,2}$, and the secondary-side return liquid temperature $T_{back,2}$; the parameters corresponding to the secondary-side water pump (7), including the secondary-side flow rate $q_{v,2}$ and the secondary-side water pump power consumption $P_{pump,2}$; and the parameters corresponding to the data center server (6), including the server heat load $Q$, the server outlet temperature $T_{sever,out}$, the server inlet temperature $T_{sever,in}$, and the temperature of each chip $T_{chi,i}$.

4. The data center cooling system control method based on a constrained neural network according to claim 3, characterized in that the step of preprocessing the data in the original database includes data denoising, data filtering, data repair, and normalization; the expression for normalization is as follows:

$$x_0 = \frac{x - x_{min}}{x_{max} - x_{min}}$$

where $x_0$ is the normalized data; $x$ is the original data; $x_{min}$ is the minimum value of the original data; and $x_{max}$ is the maximum value of the original data; the normalized power-consumption-related thermodynamic parameters are used as the power consumption dataset, and the normalized chip-temperature-related thermodynamic parameters are used as the chip temperature dataset; the power consumption dataset includes: the external ambient temperature $T_{amb}$, the external ambient relative humidity $RH$, the fan frequency $f_{fan}$, the cooling tower power consumption $P_{tower}$, the primary-side flow rate $q_{v,1}$, the primary-side water pump power consumption $P_{pump,1}$, the secondary-side flow rate $q_{v,2}$, the secondary-side water pump power consumption $P_{pump,2}$, and the server heat load $Q$; the chip temperature dataset includes: the external ambient temperature $T_{amb}$, the external ambient relative humidity $RH$, the fan frequency $f_{fan}$, the primary-side flow rate $q_{v,1}$, the secondary-side flow rate $q_{v,2}$, the server heat load $Q$, the primary-side liquid supply temperature $T_{sup,1}$, the primary-side return liquid temperature $T_{back,1}$, the secondary-side liquid supply temperature $T_{sup,2}$, the secondary-side return liquid temperature $T_{back,2}$, and the temperature of each chip $T_{chi,i}$.

5. The data center cooling system control method based on a constrained neural network according to claim 1, characterized in that the power consumption prediction penalty function is:

$$Loss_P = MSE_P = \frac{1}{N}\sum_{k=1}^{N}\left(P_k - P_{net,k}\right)^2$$

where $Loss_P$ is the penalty function of the power consumption prediction neural network; $P$ is the actual power consumption; $P_{net}$ is the power consumption predicted by the neural network; $MSE_P$ is the mean square error; $N$ is the total number of data points; and the subscript $k$ denotes the $k$-th data point.

6. The data center cooling system control method based on a constrained neural network according to claim 1, characterized in that the temperature prediction penalty function is: where $Loss_T$ is the penalty function of the chip temperature prediction neural network; $loss_d$ is the data-driven loss term; $loss_p$ is the constraint loss term; and the remaining symbols denote, respectively, the $i$-th actual chip temperature, the $i$-th chip temperature predicted by the neural network, and the $i$-th adaptive penalty factor; the adaptive penalty factor of the penalty function of the chip temperature prediction neural network is calculated as follows: where $\lambda_T$ is the adaptive penalty factor of the penalty function of the chip temperature prediction neural network; $\mu_T$ is the scaling factor, which can be adjusted according to the actual situation to balance the constraint effect against convergence; $T$ is the actual chip temperature; $T_{net}$ is the chip temperature predicted by the neural network; and $T_0$ is the upper limit of the chip temperature.

7. The data center cooling system control method based on a constrained neural network according to claim 1, characterized in that both the power consumption prediction neural network and the chip temperature prediction neural network are fully connected networks comprising an input layer, a hidden layer, and an output layer; when training the pre-established power consumption prediction neural network and the pre-established chip temperature prediction neural network, the number of neural network unit layers and the number of iterations are adjusted until the accuracy on the test set and the validation set is within an acceptable range; finally, the power consumption prediction neural network with the best fitting effect is obtained, and the training is completed.

8. The data center cooling system control method based on a constrained neural network according to claim 4, characterized in that, in the step of using the trained power consumption prediction neural network and the trained chip temperature prediction neural network to determine the optimal control parameters under different ambient temperatures, humidity levels and heat loads, with the minimum power consumption of the data center cooling system as the optimization objective and the actual chip temperature being less than the upper limit of the chip temperature as the constraint, the optimization expression is as follows:

$$\min P = f(f_{fan}, q_{v,1}, q_{v,2}) \quad \text{s.t.} \quad lb \le (f_{fan}, q_{v,1}, q_{v,2}) \le ub, \quad f_i(f_{fan}, q_{v,1}, q_{v,2}) < T_{chi,i,max}, \quad i = 1, \dots, n$$

where $P$ is the cooling system power consumption; $f(f_{fan}, q_{v,1}, q_{v,2})$ is the system power consumption as a function of the fan frequency and the primary-side and secondary-side flow rates under a given ambient temperature, humidity and heat load; $lb$ and $ub$ are the lower and upper limits of the cooling tower fan air volume and of the primary-side and secondary-side pump flow rates; $f_i(f_{fan}, q_{v,1}, q_{v,2})$ is the temperature of each chip as a function of the fan frequency and the primary-side and secondary-side flow rates under a given ambient temperature, humidity and heat load; $T_{chi,i,max}$ is the upper limit of the chip temperature; and $n$ is the number of chips inside the server.

9. A data center cooling system control system based on a constrained neural network, characterized in that it comprises: a data acquisition module, used to acquire the thermodynamic parameters of the data center cooling system in real time and construct the original database; a data preprocessing module, used to preprocess the data in the original database to obtain a power consumption dataset and a chip temperature dataset; a power consumption prediction neural network training module, used to train a pre-established power consumption prediction neural network using the power consumption dataset and a power consumption prediction penalty function; a chip temperature prediction neural network training module, used to train a pre-established chip temperature prediction neural network using the chip temperature dataset and a temperature prediction penalty function, wherein the temperature prediction penalty function is an adaptive penalty function in which, when the model-predicted temperature is lower than the actual temperature, a penalty factor is introduced so that the model-predicted temperature is higher than the actual temperature, and the closer the actual temperature is to the upper limit of the chip temperature, the larger the penalty factor becomes; and a control parameter optimization module, which uses the trained power consumption prediction neural network and the trained chip temperature prediction neural network, takes the minimum power consumption of the data center cooling system as the optimization objective and the actual chip temperature being less than the upper limit of the chip temperature as the constraint, determines the optimal control parameters under different ambient temperatures, humidity levels and heat loads, and controls the data center cooling system according to the optimal control parameters.

10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the data center cooling system control method based on a constrained neural network according to any one of claims 1 to 8.