An energy-saving optimization scheduling method for a refrigeration system

By establishing an IoT sensing network and a DRL training environment, the method constructs thermal response, energy consumption response, and hot spot temperature prediction models. This addresses the limited model accuracy and weak closed-loop evaluation of control effect in refrigeration systems, and achieves high-precision end-to-end optimization and continuous improvement of the energy-saving effect.

CN122308282A · Pending · Publication Date: 2026-06-30 · GUANGDONG FUNENG BIG DATA IND PARK CONSTR CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: GUANGDONG FUNENG BIG DATA IND PARK CONSTR CO LTD
Filing Date: 2026-03-25
Publication Date: 2026-06-30

IPC: G05B19/418; G06F18/214; G06F18/21; G06N5/04

AI Tagging

Technology Topics

State prediction; Decision model

Technical Efficacy Phrases

High precision; Reliable hotspot control

AI Technical Summary

Technical Problem

Existing refrigeration systems suffer from limitations in model accuracy, separation of prediction and decision-making, and weak closed-loop control effect evaluation. These issues result in large errors in high-density cabinet scenarios, reliance on post-analysis for optimization effects, and a lack of closed-loop mechanisms for online evaluation and self-triggered model updates.

Method used

Establish an IoT sensing network to collect multi-source heterogeneous time-series data, construct a standardized time-series feature set through data cleaning and feature engineering, combine it with DRL training environment and intelligent inference decision model to form a collaborative system for thermal response, energy consumption response and hot spot temperature prediction, achieve end-to-end optimization, and form closed-loop optimization through control effect evaluation report.

Benefits of technology

It improved model accuracy, increased energy efficiency, and achieved continuous optimization of hotspot control reliability and energy-saving effect, avoiding the defects of module fragmentation and reliance on ex-post analysis.

Abstract

This invention relates to the field of refrigeration energy-saving scheduling technology, specifically disclosing an energy-saving optimization scheduling method for refrigeration systems. Based on environmental data, load data, refrigeration system state data, and energy consumption data, this invention constructs a physical modeling standardized time-series feature set, a state prediction standardized time-series feature set, and a decision state standardized time-series feature set. Historical data from these feature sets are then used to train thermal response models, energy consumption response models, hotspot temperature prediction models, and intelligent reasoning decision models, improving model accuracy and achieving predictive control. After control commands are executed, this invention performs temperature control effect evaluation, energy consumption control effect evaluation, and command execution deviation evaluation, while simultaneously setting retraining trigger conditions, forming an effective closed loop of execution, evaluation, and retraining. The combined application of each step in this invention effectively improves energy efficiency and makes hotspot control more stable.

Description

Technical Field

[0001] This invention relates to the field of refrigeration energy-saving scheduling technology, and in particular to an energy-saving optimization scheduling method for refrigeration systems.

Background Technology

[0002] With the continuous growth in computing demand, the number and scale of data centers are expanding rapidly, leading to a sharp increase in energy consumption. This poses a significant challenge to the green operation of data centers and places a heavy burden on the power system's energy supply. As a crucial system for ensuring the safe and stable operation of equipment within data centers, the cooling system's energy consumption has also increased dramatically, becoming a major source of energy consumption and carbon emissions for data centers. Therefore, how to better achieve energy conservation and consumption reduction has become a key focus of development in the cooling field.

[0003] An energy-saving scheduling method for data center cooling systems is proposed, comprising the following steps: collecting data on rack air intake zone, cold aisle / hot aisle temperatures, air conditioning terminal operating parameters, IT equipment load, and outdoor temperature and humidity; constructing a thermal resistance-heat capacity model of the data center based on thermodynamic principles; predicting future IT equipment load changes using time series methods; calculating the optimal air conditioning settings using a genetic algorithm with the goal of minimizing cooling system energy consumption and the constraint that hot spot temperatures do not exceed a threshold; and manually verifying the optimization results or directly sending them to the precision air conditioning controller and chiller group control system for execution.

[0004] Existing methods, based on the thermal resistance-thermal capacity model, have clear physical meanings and relatively controllable problem scales, making them suitable for deployment on local industrial control computers or edge servers. They exhibit high long-term stability in scenarios with relatively stable loads and fixed data center layouts. However, existing methods still have some drawbacks: First, model accuracy is limited; simplified physical models struggle to accurately capture complex airflow organization and dynamic changes in local hotspots, especially in high-density rack scenarios where errors are significant. Second, prediction and decision-making are separated; load prediction and optimization decision-making are independent modules, failing to form end-to-end joint optimization, which easily leads to suboptimal solutions. Third, the control effect evaluation closed-loop is weak; optimization effects largely rely on post-event report analysis, lacking a closed-loop mechanism for online evaluation and self-triggered model updates.

Summary of the Invention

[0005] This invention addresses the problems in the prior art, such as limited model accuracy, separation of prediction and decision-making, and weak closed-loop evaluation of control effect, by proposing an energy-saving optimization scheduling method for refrigeration systems.

[0006] To achieve the above objectives, the present invention provides the following technical solution: an energy-saving optimization scheduling method for a refrigeration system, comprising the following steps:

[0007] Step 1: Establish an IoT sensing network to collect environmental data, load data, cooling system status data, and energy consumption data, and output a multi-source heterogeneous time-series dataset with timestamps.

[0008] Step 2: Perform data cleaning and feature engineering on the multi-source heterogeneous time series datasets, and construct and output the physical modeling standardized time series feature set, the state prediction standardized time series feature set, and the decision state standardized time series feature set;

[0009] Step 3: Train the constructed thermal response model and energy consumption response model using the historical physical modeling standardized time-series feature set, and train the constructed hot spot temperature prediction model using the historical state prediction standardized time-series feature set. Construct a DRL training environment based on the thermal response model and energy consumption response model. Train the intelligent inference decision model based on the historical decision state standardized time-series feature set, the DRL training environment, and the hot spot temperature prediction results. If all tests pass, save the model files; otherwise, calibrate and update the models.

[0010] Step 4: Input the standardized time-series feature set of real-time physical modeling into the thermal response model and energy consumption response model for inference, output the current hot spot temperature estimate and the current cooling energy consumption estimate, trigger model drift detection, and feed the detection results back to Step 3;

[0011] Step 5: Input the standardized time series feature set of real-time state prediction into the hotspot temperature prediction model for inference and output the hotspot temperature at future time points; input the standardized time series feature set of real-time decision state and the hotspot temperature at future time points into the intelligent inference decision model for inference and output the control instruction set.

[0012] Step 6: Send the control command set to the controller, wait for the system response, collect the temperature data, energy consumption data and command execution data after the control command is executed, and conduct temperature control effect evaluation, energy consumption control effect evaluation and command execution deviation evaluation respectively. Then, calculate the comprehensive score based on the evaluation results and generate a control effect evaluation report.

[0013] Step 7: Write the control effect evaluation report into the time series database. When a preset model retraining condition is triggered, retrain the models from Step 3.

[0014] Compared with the prior art, this application includes at least one of the following beneficial technical effects:

[0015] 1. To address the limitation of model accuracy, step three applies a standardized time-series feature set based on historical physical modeling to train the constructed thermal response model and energy consumption response model, and applies a standardized time-series feature set based on historical state prediction to train the constructed hotspot temperature prediction model. A DRL training environment is constructed based on the thermal response model and energy consumption response model. The intelligent inference decision-making model is trained based on the standardized time-series feature set based on historical decision states, the DRL training environment, and the hotspot temperature prediction results. If all tests pass, the model file is saved; otherwise, the model is calibrated and updated. High-dimensional nonlinear mappings are learned directly from historical operating data, resulting in higher accuracy than simplified physical models. The model is upgraded from a single physical model or prediction model to a four-model collaborative system of thermal response + energy consumption response + hotspot temperature prediction + intelligent decision-making, making hotspot control more reliable.

[0016] 2. To address the drawbacks of separating prediction and decision-making, step five inputs the standardized time-series feature set of real-time state prediction into the hotspot temperature prediction model for inference and output of the hotspot temperature at future time points. Simultaneously, the standardized time-series feature set of real-time decision-making state and the hotspot temperature at future time points are input into the intelligent inference decision-making model for inference and output of the control command set. By linking the hotspot temperature prediction model and the intelligent inference decision-making model, the prediction results become the input to the decision-making model, forming end-to-end optimization and avoiding module fragmentation; thus further improving energy efficiency.

[0017] 3. To address the weakness of the closed-loop control effect evaluation, step six sends the control command set to the controller. After waiting for the system response, it collects temperature data, energy consumption data, and command execution data after the control commands are executed. Temperature control effect evaluation, energy consumption control effect evaluation, and command execution deviation evaluation are performed respectively. Then, a comprehensive score is calculated based on the evaluation results, and a control effect evaluation report is generated. Step seven writes the effect evaluation report into the time series database. When the preset model retraining conditions are triggered, the model in step three is retrained, thus forming a closed-loop continuous optimization.

Attached Figure Description

[0018] Figure 1 is a flowchart illustrating the steps of an energy-saving optimization scheduling method for a refrigeration system;

[0019] Figure 2 is a connection diagram of the modules of an energy-saving optimization scheduling system for a refrigeration system.

Detailed Implementation

[0020] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0021] As shown in Figure 1, this embodiment provides an energy-saving optimization scheduling method for a refrigeration system, including the following steps:

[0022] Step 1: Establish an IoT sensing network to collect environmental data, load data, cooling system status data, and energy consumption data, and output a multi-source heterogeneous time-series dataset with timestamps.

[0023] Step 2: Perform data cleaning and feature engineering on the multi-source heterogeneous time series datasets, and construct and output the physical modeling standardized time series feature set, the state prediction standardized time series feature set, and the decision state standardized time series feature set;

[0024] Step 3: Train the constructed thermal response model and energy consumption response model using the historical physical modeling standardized time-series feature set, and train the constructed hot spot temperature prediction model using the historical state prediction standardized time-series feature set. Construct a DRL training environment based on the thermal response model and energy consumption response model. Train the intelligent inference decision model based on the historical decision state standardized time-series feature set, the DRL training environment, and the hot spot temperature prediction results. If all tests pass, save the model files; otherwise, calibrate and update the models.

[0025] Specifically, in this embodiment, the constructed thermal response model and energy consumption response model are trained, validated, and tested using the historical physical modeling standardized time-series feature set. With the samples in chronological order, the training set can be the first 70%, the validation set the next 15%, and the test set the final 15%. The models are trained on the training set and tuned on the validation set, and the mean absolute error, root mean square error, and coefficient of determination of the thermal response model and energy consumption response model are evaluated on the test set. After passing the test, the thermal response model and energy consumption response model are saved. The constructed hotspot temperature prediction model is trained, validated, and tested in the same way using the historical state prediction standardized time-series feature set, with the same 70% / 15% / 15% split, and its mean absolute error, root mean square error, and coefficient of determination are evaluated on the test set. For this model, the focus is on verifying that the predicted direction of temperature change (rise or fall) is accurate, while also verifying the prediction delay to ensure the timeliness of the prediction.
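
The split and the three test-set metrics described in paragraph [0025] can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names `chronological_split` and `regression_metrics` are hypothetical, and the data are assumed to be already sorted by timestamp.

```python
import math

def chronological_split(rows, train_frac=0.70, val_frac=0.15):
    """Split time-ordered rows into train / validation / test without shuffling."""
    n = len(rows)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and the coefficient of determination (R^2)."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when the target is constant; report NaN in that case.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}
```

Keeping the split chronological means the test set always lies in the future relative to the training set, which is what a deployed predictor would face.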

[0026] In this embodiment, it is specifically noted that when constructing the DRL training environment based on the thermal response model and energy consumption response model, the environment input is the current state and action, and the environment output is the next state and immediate reward. The hotspot temperature prediction model is also integrated so that future temperature predictions form part of the state. After the DRL training environment is constructed, the action space is defined, including continuous actions (chilled water supply temperature setpoint, pump frequency, cooling tower fan frequency, valve opening, and terminal fan speed), action boundaries (upper and lower limits defined by equipment physical constraints), and action smoothing constraints. After the reward function is defined, a continuous-action-space algorithm (PPO) is used to train the intelligent inference decision model. The trained policy is evaluated on the test set, and its energy saving rate, over-temperature events, and action stability are analyzed in comparison with benchmark policies (such as PID and rule-based control). If necessary, the reward function weights are adjusted and the model is retrained. Finally, the policy network weights are saved, the inference model is exported, and the encapsulated inference interface is verified to accept the current state as input and output control commands.
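
The environment contract in paragraph [0026] (state and action in, next state and immediate reward out, with action boundaries and smoothing) can be sketched as below. This is only an illustration under stated assumptions: the class name `SurrogateCoolingEnv`, the reward form, the weights, and the 27 °C limit are all hypothetical, since the source does not give the reward function. The PPO training loop itself is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class SurrogateCoolingEnv:
    # Trained surrogate models are passed in as plain callables (hypothetical interface).
    thermal_model: Callable[[Sequence[float], Sequence[float]], List[float]]  # -> next state
    energy_model: Callable[[Sequence[float], Sequence[float]], float]         # -> cooling power, kW
    hotspot_of: Callable[[Sequence[float]], float]                            # state -> hot spot temp, deg C
    action_low: Sequence[float]    # lower equipment limits per action
    action_high: Sequence[float]   # upper equipment limits per action
    max_step: Sequence[float]      # largest allowed change per control step (smoothing)
    temp_limit_c: float = 27.0     # assumed hot spot limit, not from the source
    energy_weight: float = 0.01    # assumed reward weight
    overtemp_weight: float = 10.0  # assumed reward weight

    def constrain(self, prev_action, action):
        """Apply the smoothing constraint first, then the physical action boundaries."""
        applied = []
        for prev, a, lo, hi, d in zip(prev_action, action,
                                      self.action_low, self.action_high, self.max_step):
            a = min(max(a, prev - d), prev + d)
            applied.append(min(max(a, lo), hi))
        return applied

    def step(self, state, prev_action, action):
        """Return (next_state, reward, applied_action) for one control step."""
        applied = self.constrain(prev_action, action)
        next_state = self.thermal_model(state, applied)
        power_kw = self.energy_model(state, applied)
        excess = max(0.0, self.hotspot_of(next_state) - self.temp_limit_c)
        # Assumed reward: penalize energy use and any hot spot temperature above the limit.
        reward = -self.energy_weight * power_kw - self.overtemp_weight * excess
        return next_state, reward, applied
```

In a fuller version the state vector would also carry the hotspot temperature forecast, as the paragraph describes; here the state is left opaque so that any feature layout can be plugged in.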

[0027] Step 4: Input the standardized time-series feature set of real-time physical modeling into the thermal response model and energy consumption response model for inference, output the current hot spot temperature estimate and the current cooling energy consumption estimate, trigger model drift detection, and feed the detection results back to Step 3;

[0028] In this embodiment, it should be specifically noted that the current estimated cooling energy consumption includes the current estimated total power consumption of the cooling system, the current estimated power consumption of the chiller unit, the current estimated power consumption of the cooling tower, the current estimated power consumption of the water pump, and the current estimated power consumption of the terminal air conditioner. Outputting the current hotspot temperature estimate and the current estimated cooling energy consumption, and triggering model drift detection, includes the following steps:

[0029] Obtain the current hot spot temperature, the current total power consumption of the refrigeration system, the current power consumption of the chiller unit, the current power consumption of the cooling tower, the current power consumption of the water pump, and the current power consumption of the terminal air conditioner;

[0030] Calculate the difference between the current hot spot temperature and the estimated current hot spot temperature; calculate the difference between the current total power consumption of the refrigeration system and the estimated current total power consumption of the refrigeration system; calculate the difference between the current power consumption of the chiller unit and the estimated current power consumption of the chiller unit; calculate the difference between the current power consumption of the cooling tower and the estimated current power consumption of the cooling tower; calculate the difference between the current power consumption of the water pump and the estimated current power consumption of the water pump; calculate the difference between the current power consumption of the terminal air conditioner and the estimated current power consumption of the terminal air conditioner.

[0031] Calculate the actual rolling average absolute error within each detection cycle (e.g., every 5 minutes);

[0032] The actual rolling average absolute errors of the thermal response model and the energy consumption response model are retrieved and compared with the reference rolling average absolute error. If the actual value is greater than 1.2 times and less than or equal to 1.5 times the reference value, or the deviation lasts for less than 2 hours, it is considered a slight model drift. If the actual value is greater than 1.5 times and less than 2 times the reference value, or the deviation lasts for more than 2 hours, it is considered a moderate model drift. If the actual value is greater than 2 times the reference value, it is considered a severe model drift.

[0033] After the detection, the drift detection results are packaged into feedback data and sent to step three.
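
The drift check in paragraphs [0030] to [0033] can be sketched as follows. The source leaves the interaction between the ratio bands and the 2-hour duration ambiguous, so this sketch adopts one possible reading, stated here as an assumption: the ratio decides the band, and an exceedance in the slight band that persists for more than 2 hours is upgraded to moderate. A ratio of exactly 2 is not classified by the source and is treated as moderate here. The function names are hypothetical.

```python
from collections import deque

def rolling_mae(actual, estimated, window):
    """Rolling mean absolute error over the last `window` samples."""
    buf = deque(maxlen=window)
    out = []
    for a, e in zip(actual, estimated):
        buf.append(abs(a - e))
        out.append(sum(buf) / len(buf))
    return out

def classify_drift(actual_mae, reference_mae, exceed_hours):
    """Map the actual/reference error ratio and exceedance duration to a drift level."""
    ratio = actual_mae / reference_mae
    if ratio > 2.0:
        return "severe"
    if ratio > 1.5:
        return "moderate"   # includes ratio == 2.0 (assumption)
    if ratio > 1.2:
        # Assumed reading of the duration rule: long-lasting slight drift becomes moderate.
        return "moderate" if exceed_hours > 2.0 else "slight"
    return "none"
```

With a 5-minute detection cycle and 1-minute samples, `window` would be 5; the result of `classify_drift` is what gets packaged as feedback to step three.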

[0034] Step 5: Input the standardized time series feature set of real-time state prediction into the hotspot temperature prediction model for inference and output the hotspot temperature at future time points; input the standardized time series feature set of real-time decision state and the hotspot temperature at future time points into the intelligent inference decision model for inference and output the control instruction set.

[0035] Step 6: Send the control command set to the controller, wait for the system response, collect the temperature data, energy consumption data and command execution data after the control command is executed, and conduct temperature control effect evaluation, energy consumption control effect evaluation and command execution deviation evaluation respectively. Then, calculate the comprehensive score based on the evaluation results and generate a control effect evaluation report.

[0036] In this embodiment, it should be specifically noted that the control effect evaluation report includes the control command issuance time, evaluation period, environmental conditions, control command set, execution status, temperature control effect evaluation result, energy consumption control effect evaluation result, command execution deviation evaluation result, comprehensive evaluation result, and optimization suggestions.

[0037] Step 7: Write the control effect evaluation report into the time series database. When a preset model retraining condition is triggered, retrain the models from Step 3.

[0038] In this embodiment, it should be specifically noted that the model retraining condition trigger types include performance degradation trigger (triggered when moderate model drift occurs), time period trigger (triggered at a fixed time interval of every Monday at midnight), data accumulation trigger (triggered when the amount of new data reaches a threshold), abnormal event trigger (triggered when equipment failure occurs or the deviation of three consecutive control command executions is greater than 20%), and manual trigger.
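
The five trigger types in paragraph [0038] can be checked with a small function such as the one below. It is a sketch only: the function name, argument names, and returned labels are hypothetical, deviations are assumed to be expressed as fractions (0.20 for 20%), and treating severe drift as a trigger alongside moderate drift is an assumption that goes beyond the source text.

```python
from datetime import datetime

def retraining_triggers(now, drift_level, new_rows, row_threshold,
                        recent_deviations, equipment_fault=False, manual=False):
    """Return the list of retraining trigger types that fire at time `now`."""
    reasons = []
    # Performance degradation: the source names moderate drift; severe is assumed to count too.
    if drift_level in ("moderate", "severe"):
        reasons.append("performance_degradation")
    # Time period: every Monday at midnight (weekday() == 0 is Monday).
    if now.weekday() == 0 and now.hour == 0 and now.minute == 0:
        reasons.append("time_period")
    # Data accumulation: enough new samples since the last training run.
    if new_rows >= row_threshold:
        reasons.append("data_accumulation")
    # Abnormal event: equipment failure, or three consecutive deviations above 20%.
    last_three = recent_deviations[-3:]
    if equipment_fault or (len(last_three) == 3 and all(d > 0.20 for d in last_three)):
        reasons.append("abnormal_event")
    if manual:
        reasons.append("manual")
    return reasons
```

Returning every reason that fired, rather than a single boolean, keeps the cause of each retraining run available for the evaluation report.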

[0039] Furthermore, the environmental data includes outdoor dry-bulb temperature, relative humidity, and atmospheric pressure, as well as indoor air conditioning return air temperature and humidity, cold aisle temperature, and hot aisle temperature. The load data includes IT equipment power consumption, server intake air temperature, CPU utilization, and GPU utilization. The refrigeration system status data includes chilled water supply temperature setpoint, chilled water flow rate, chilled water supply and return temperatures, cooling water supply and return temperatures, compressor suction and discharge pressure, natural cooling coil inlet and outlet temperatures, pump frequency, pump flow rate, cooling tower fan frequency, and valve opening on the delivery side, as well as terminal fan speed, supply air temperature setpoint, and actual supply air temperature. The energy consumption data includes total refrigeration system power consumption, chiller unit power consumption, cooling tower power consumption, pump power consumption, terminal air conditioning power consumption, and total input power consumption.

[0040] Furthermore, data cleaning of multi-source heterogeneous time-series datasets includes outlier detection and correction, missing value handling, and multi-source data alignment; the feature engineering includes the following steps:

[0041] After data cleaning, the environmental data, load data, refrigeration system status data, and energy consumption data are sequentially labeled as basic environmental characteristics, load characteristics, basic refrigeration system status characteristics, and basic energy consumption characteristics.

[0042] Environmental derivative characteristics are constructed from the basic environmental characteristics; refrigeration system state derivative characteristics are constructed from the basic refrigeration system state characteristics; and energy consumption derivative characteristics are constructed from the basic energy consumption characteristics. The environmental derivative characteristics include outdoor wet-bulb temperature and indoor return air enthalpy; the refrigeration system state derivative characteristics include chilled water supply and return water temperature difference, cooling water supply and return water temperature difference, compressor compression ratio, natural cooling contribution temperature difference, water pump operating efficiency, and refrigeration unit cooling capacity; and the energy consumption derivative characteristics include real-time PUE and refrigeration unit COP.

[0043] Construct a standardized temporal feature set for physical modeling, a standardized temporal feature set for state prediction, and a standardized temporal feature set for decision state;

[0044] The three feature sets constructed are stored in the cloud database.

[0045] Furthermore, constructing environmental derivative features based on basic environmental characteristics, constructing refrigeration system state derivative features based on basic refrigeration system state characteristics, and constructing energy consumption derivative features based on basic energy consumption characteristics includes the following steps:

[0046] Based on the outdoor dry-bulb temperature, outdoor relative humidity, and atmospheric pressure, the outdoor wet-bulb temperature is calculated using the iterative method of the Stull formula, and the indoor return air enthalpy is calculated based on the air conditioning return air temperature, air conditioning return air humidity, and atmospheric pressure.

[0047] The chilled water supply and return water temperature difference is calculated based on the chilled water supply temperature and return water temperature; the cooling water supply and return water temperature difference is calculated based on the cooling water supply temperature and return water temperature; the compressor compression ratio is calculated based on the compressor suction pressure and discharge pressure; the natural cooling contribution temperature difference is calculated based on the natural cooling coil inlet temperature and outlet temperature; the water pump operating efficiency is calculated based on the water pump frequency, water pump flow rate, and water pump power consumption; and the refrigeration capacity is calculated based on the chilled water flow rate, water specific heat capacity, water density, chilled water supply temperature, and return water temperature.

[0048] Real-time PUE is calculated based on total input power consumption and IT equipment power consumption, and chiller COP is calculated based on chiller cooling capacity and chiller unit power consumption.
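
Several of the derived features in paragraphs [0046] to [0048] can be sketched as below. Some assumptions should be noted. The wet-bulb function uses Stull's closed-form empirical approximation, which takes only temperature and relative humidity and is intended for pressures near sea level; the source instead mentions an iterative method that also uses atmospheric pressure, which is not reproduced here. The enthalpy function uses the Magnus approximation for saturation vapor pressure, which is a choice made for this sketch. All function names, units, and the water property constants are likewise assumptions.

```python
import math

WATER_CP_KJ_PER_KG_K = 4.186       # assumed specific heat of water
WATER_DENSITY_KG_PER_M3 = 1000.0   # assumed density of water

def wet_bulb_stull(temp_c, rh_percent):
    """Wet-bulb temperature (deg C) from dry-bulb (deg C) and RH (%), Stull approximation."""
    return (temp_c * math.atan(0.151977 * math.sqrt(rh_percent + 8.313659))
            + math.atan(temp_c + rh_percent)
            - math.atan(rh_percent - 1.676331)
            + 0.00391838 * rh_percent ** 1.5 * math.atan(0.023101 * rh_percent)
            - 4.686035)

def moist_air_enthalpy(temp_c, rh_percent, pressure_hpa=1013.25):
    """Specific enthalpy of moist air in kJ per kg of dry air."""
    p_sat = 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))  # Magnus, hPa
    p_vap = rh_percent / 100.0 * p_sat
    humidity_ratio = 0.622 * p_vap / (pressure_hpa - p_vap)
    return 1.006 * temp_c + humidity_ratio * (2501.0 + 1.86 * temp_c)

def cooling_capacity_kw(flow_m3_per_h, supply_c, return_c):
    """Chiller cooling capacity from chilled water flow and supply/return temperatures."""
    mass_flow_kg_s = flow_m3_per_h / 3600.0 * WATER_DENSITY_KG_PER_M3
    return mass_flow_kg_s * WATER_CP_KJ_PER_KG_K * (return_c - supply_c)

def compression_ratio(suction_pressure, discharge_pressure):
    """Compressor compression ratio; both pressures must be absolute and in the same unit."""
    return discharge_pressure / suction_pressure

def chiller_cop(cooling_kw, chiller_power_kw):
    return cooling_kw / chiller_power_kw

def realtime_pue(total_input_kw, it_power_kw):
    return total_input_kw / it_power_kw
```

For example, a chilled water flow of 360 m³/h with 7 °C supply and 12 °C return gives about 2093 kW of cooling, and with 400 kW of chiller power the COP is roughly 5.2.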

[0049] Furthermore, constructing a standardized temporal feature set for physical modeling includes the following steps:

[0050] After determining the target variable and input variables, calculate the standard deviation of the 30-minute window variable, the mean and cumulative value of the 60-minute window variable, and the lagged value of the lagged window variable.

[0051] Steady-state screening is performed. The time-series statistical features of samples that pass the steady-state screening are labeled as target time-series statistical features and retained for subsequent processing. Samples that fail the steady-state screening are discarded directly.

[0052] For each timestamp that passes the steady-state screening, basic features, derived features, target time-series statistical features, and state features are concatenated.

[0053] The dataset is divided into training, validation, and test sets in chronological order. The mean and standard deviation of each continuous feature are calculated on the training set, and the Z-score is used to standardize the training, validation, and test sets using the calculated mean and standard deviation.

[0054] The standardized temporal feature set for physical modeling has been completed.
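
The standardization step in paragraph [0053] can be sketched as follows: the mean and standard deviation are computed on the training set only and then reused for the validation and test sets. The function names are hypothetical, rows are assumed to be lists of continuous feature values, and replacing a zero standard deviation with 1.0 is an assumption made here to avoid division by zero.

```python
import statistics

def fit_zscore(train_rows):
    """Compute per-feature mean and standard deviation on the training set only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    # A constant feature has zero spread; fall back to 1.0 so it maps to 0 (assumption).
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds

def apply_zscore(rows, means, stds):
    """Standardize any split with statistics fitted on the training set."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
```

Fitting on the training set alone keeps information from the later validation and test periods out of the scaling parameters.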

[0055] Specifically, in this embodiment, when the target of the constructed feature set is a physical modeling standardized time-series feature set, the target variables include hot spot temperature, total power consumption of the refrigeration system, power consumption of the chiller unit, power consumption of the cooling tower, power consumption of the water pump, and power consumption of the terminal air conditioner; the input variables include outdoor dry-bulb temperature, IT equipment power consumption, CPU utilization, GPU utilization, chilled water supply temperature setpoint, cooling tower fan frequency, water pump frequency, valve opening, terminal fan speed, chilled water supply and return temperatures, cooling water supply and return temperatures, compressor suction pressure and exhaust pressure, natural cooling coil inlet and outlet temperatures, atmospheric pressure, and indoor air conditioner return air temperature and humidity.

[0056] Specifically, in this embodiment, the standard deviation of the 30-minute window variables includes the standard deviations of hotspot temperature, IT equipment power consumption, chilled water supply temperature, and chilled water return temperature over the past 30 minutes; the mean of the 60-minute window variables includes the means of all input variables over the past 60 minutes; the cumulative value of the 60-minute window variables includes the cumulative values of IT equipment power consumption, chilled water flow rate, and refrigeration unit cooling capacity over the past 60 minutes; and the lagged values of the lag window variables include the following: the 10-minute lag of the chilled water supply temperature (i.e., the lagged value at the current time t is the original chilled water supply temperature at time t-10 min); the 5-minute lag of the chilled water return temperature (i.e., the lagged value at the current time t is the original chilled water return temperature at time t-5 min); the 5-minute lag of the cooling water supply temperature (i.e., the lagged value at the current time t is the original cooling water supply temperature at time t-5 min); the 2-minute lag of the valve opening (i.e., the lagged value at the current time t is the original valve opening at time t-2 min); and the 2-minute lag of the fan speed (i.e., the lagged value at the current time t is the original fan speed at time t-2 min).

[0057] In this embodiment, it should be specifically explained that steady-state screening means removing samples whose hot spot temperature standard deviation is greater than a set threshold within a 30-minute window. If the hot spot temperature standard deviation is greater than the set threshold within a 30-minute window, the sample fails steady-state screening. If the hot spot temperature standard deviation is less than or equal to the set threshold within a 30-minute window, the sample passes steady-state screening. The sample that passes steady-state screening essentially refers to the timestamp of the sample and its corresponding complete data row. The target time series statistical features do not include the standard deviation of hot spot temperature. The standard deviation of hot spot temperature is only used for steady-state screening.
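A minimal sketch of the steady-state screening described above, assuming 1-minute sampling and a hypothetical threshold of 0.5°C (the embodiment leaves the threshold to actual needs). Rows whose 30-minute window is not yet full are treated as failing, which is also an assumption.

```python
from statistics import pstdev

def steady_state_mask(hotspot_temps, window=30, threshold=0.5):
    """True where the trailing-window standard deviation is <= threshold."""
    mask = []
    for i in range(len(hotspot_temps)):
        if i + 1 < window:
            mask.append(False)  # incomplete window: treated as not steady (assumption)
            continue
        recent = hotspot_temps[i - window + 1:i + 1]
        mask.append(pstdev(recent) <= threshold)
    return mask

# Hypothetical series: flat at 25 deg C with one 10 deg C spike at minute 35.
temps = [25.0] * 40
temps[35] = 35.0
mask = steady_state_mask(temps)
kept_timestamps = [i for i, ok in enumerate(mask) if ok]
```

The standard deviation is used only to build the mask; it is not appended to the retained feature rows, consistent with the paragraph above.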

[0058] In this embodiment, it should be specifically noted that the status characteristics refer to status labels, including operating mode labels, load factor range labels, and time period labels. When the outdoor wet-bulb temperature is greater than the upper limit threshold for natural cooling or the natural cooling valve is closed, the refrigeration system is determined to be operating in mechanical cooling mode, i.e., a mechanical cooling status label is constructed. When the outdoor wet-bulb temperature is greater than the lower limit threshold for complete natural cooling and less than or equal to the upper limit threshold for natural cooling, and the natural cooling valve is open, the refrigeration system is determined to be operating in mixed cooling mode, i.e., a mixed cooling status label is constructed. When the outdoor wet-bulb temperature is less than or equal to the lower limit threshold for complete natural cooling and the chiller unit is shut down, the refrigeration system is determined to be operating in complete natural cooling mode, i.e., a complete natural cooling status label is constructed. The load factor is the ratio of IT equipment power consumption to rated load. When the load factor is in the range [0, 30%), the load factor range label value is 0, which physically means light load; when it is in [30%, 50%), the label value is 1, which physically means low to medium load; when it is in [50%, 70%), the label value is 2, which physically means medium load; when it is in [70%, 90%), the label value is 3, which physically means medium to high load; when it is in [90%, 100%], the label value is 4, which physically means full load. The time period labels include hour and Is-Workday. The hour takes values 0-23 and directly uses the hour number. Is-Workday takes the value 0 or 1, where 1 represents a workday and 0 represents a weekend or holiday.
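The labelling rules above can be sketched as follows. This is one reading, not a definitive implementation: the mixed-mode condition is taken to mean a wet-bulb temperature between the two thresholds with the natural cooling valve open, and the unspecified case (wet-bulb at or below the lower limit while the chiller is still running) falls back to mixed mode. All function names, threshold values, and the `holidays` parameter are hypothetical.

```python
from datetime import datetime

def operating_mode(wet_bulb, valve_open, chiller_on, upper, lower):
    """Return 'mechanical', 'mixed', or 'free' (complete natural cooling)."""
    if wet_bulb > upper or not valve_open:
        return "mechanical"
    if wet_bulb <= lower and not chiller_on:
        return "free"
    return "mixed"  # between thresholds, or unspecified case (assumption)

def load_bucket(it_power_kw, rated_kw):
    """Map load factor to label 0..4 using the [0,30),[30,50),[50,70),[70,90),[90,100] ranges."""
    ratio = it_power_kw / rated_kw
    for label, upper_bound in enumerate((0.30, 0.50, 0.70, 0.90)):
        if ratio < upper_bound:
            return label
    return 4

def time_labels(ts, holidays=frozenset()):
    """Hour number (0-23) and Is-Workday flag; holidays is a set of dates."""
    is_workday = int(ts.weekday() < 5 and ts.date() not in holidays)
    return {"hour": ts.hour, "is_workday": is_workday}

# Hypothetical thresholds: upper 18 deg C, lower 10 deg C wet-bulb.
mode = operating_mode(14.0, valve_open=True, chiller_on=True, upper=18.0, lower=10.0)
bucket = load_bucket(450.0, 1000.0)
labels = time_labels(datetime(2026, 3, 25, 14, 0))
```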

[0059] Furthermore, constructing a standardized temporal feature set for state prediction includes the following steps:

[0060] After determining the target variable and key variables, calculate the mean, maximum value and rate of change of the variable in the 5-minute window, calculate the mean, maximum value and standard deviation of the variable in the 15-minute window, and calculate the mean and maximum value in the 30-minute window.

[0061] After adding the current instantaneous value, add a time code. Concatenate the current instantaneous value, window statistics, and time code to obtain the target time series statistical features.

[0062] The target time-series statistical features are divided into training set, validation set and test set according to time order. The mean and standard deviation of each continuous feature are calculated on the training set and Z-score standardization is performed on the training set, validation set and test set using the calculated mean and standard deviation.

[0063] The standardized features are paired with the target variable to form supervised learning samples;

[0064] The standardized time-series feature set for state prediction has been constructed.

[0065] In this embodiment, it should be specifically noted that when the target of the constructed feature set is a standardized time series feature set for state prediction, the target variable is the hotspot temperature at a future time point, which refers to the hotspot temperature 15 minutes in the future (it is also possible to predict the temperature 5 minutes or 30 minutes in the future, but 15 minutes is a typical control cycle, and predicting 15 minutes in advance can leave adjustment time for control); key variables include outdoor wet-bulb temperature, IT equipment power consumption, CPU utilization, GPU utilization, current hotspot temperature, chilled water supply temperature, valve opening, terminal fan speed, and cooling tower fan frequency.
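To make the 15-minute-ahead target concrete, the following sketch pairs the feature row at time t with the hot spot temperature at t+15min to form supervised learning samples. It assumes 1-minute sampling so that the horizon equals 15 rows; the function name `make_samples` and the data are hypothetical.

```python
def make_samples(features, hotspot, horizon=15):
    """Pair features at row t with the hot spot temperature `horizon` rows later."""
    usable = len(hotspot) - horizon  # the last `horizon` rows have no future target
    return [(features[t], hotspot[t + horizon]) for t in range(usable)]

# Hypothetical data: 20 one-minute rows with a single feature each.
features = [[float(i)] for i in range(20)]
hotspot = [24.0 + 0.1 * i for i in range(20)]
samples = make_samples(features, hotspot)
```

Changing `horizon` to 5 or 30 gives the alternative prediction horizons mentioned in the paragraph above.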

[0066] Specifically, in this embodiment, the mean of the 5-minute window variables includes the mean of all key variables over the past 5 minutes; the maximum value of the 5-minute window variables includes the maximum power consumption of IT equipment over the past 5 minutes and the maximum temperature of the current hotspot over the past 5 minutes; the rate of change of the 5-minute window variables includes the rate of change of outdoor wet-bulb temperature, the rate of change of IT equipment power consumption, the rate of change of CPU utilization, the rate of change of GPU utilization, the rate of change of current hotspot temperature, the rate of change of chilled water supply temperature, the rate of change of valve opening, and the rate of change of terminal fan speed over the past 5 minutes.

[0067] Specifically, in this embodiment, the mean values of the 15-minute window variables include the mean values of the outdoor wet-bulb temperature, IT equipment power consumption, CPU utilization, GPU utilization, current hotspot temperature, chilled water supply temperature, valve opening, and terminal fan speed over the past 15 minutes. The maximum value of the 15-minute window variables refers to the maximum value of the current hotspot temperature over the past 15 minutes. The standard deviation of the 15-minute window variables includes the standard deviation of IT equipment power consumption and the standard deviation of the current hotspot temperature over the past 15 minutes.

[0068] Specifically, in this embodiment, the mean values of the 30-minute window variables include the mean values of the outdoor wet-bulb temperature, the IT equipment power consumption, the current hotspot temperature, and the chilled water supply temperature over the past 30 minutes; the maximum value of the 30-minute window variables refers to the maximum value of the current hotspot temperature over the past 30 minutes.

[0069] In this embodiment, it should be specifically noted that the current instantaneous values are the outdoor wet-bulb temperature, IT equipment power consumption, CPU utilization, GPU utilization, current hotspot temperature, chilled water supply temperature, valve opening, terminal fan speed, and cooling tower fan frequency.

[0070] In this embodiment, it should be specifically noted that the time encoding includes the sine and cosine encodings of the hour and a workday identifier.
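A minimal sketch of such a time encoding, using both the sine and the cosine of the hour as listed in paragraph [0073]. The 24-hour period mapping and the `holidays` parameter are assumptions made for illustration.

```python
import math
from datetime import datetime

def time_encoding(ts, holidays=frozenset()):
    """Return (hour_sin, hour_cos, is_workday) for a timestamp."""
    angle = 2.0 * math.pi * ts.hour / 24.0  # hour mapped onto a 24-hour circle
    is_workday = int(ts.weekday() < 5 and ts.date() not in holidays)
    return math.sin(angle), math.cos(angle), is_workday

hour_sin, hour_cos, is_workday = time_encoding(datetime(2026, 3, 25, 6, 0))
```

Encoding the hour on a circle keeps 23:00 and 00:00 close together, which a raw hour number does not.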

[0071] In this embodiment, it is specifically explained that calculating the mean and standard deviation of each continuous feature on the training set and using the calculated mean and standard deviation to perform Z-score standardization on the training set, validation set, and test set specifically means first calculating the mean and standard deviation of each continuous feature in the training set, and then using the calculated mean and standard deviation of each continuous feature to perform Z-score standardization on the corresponding continuous features in the training set, validation set, and test set.
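The chronological split and training-set-only standardization described above can be sketched as follows. The 70/15/15 split ratio and the guard for constant columns are assumptions; the embodiment only requires that the mean and standard deviation come from the training set and are reused for the validation and test sets.

```python
from statistics import mean, pstdev

def chrono_split(rows, train=0.7, val=0.15):
    """Split time-ordered rows into train/validation/test without shuffling."""
    n = len(rows)
    a = round(n * train)
    b = round(n * (train + val))
    return rows[:a], rows[a:b], rows[b:]

def fit_zscore(train_rows):
    """Per-column mean and standard deviation computed on the training set only."""
    cols = list(zip(*train_rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # avoid division by zero (assumption)
    return mus, sigmas

def apply_zscore(rows, mus, sigmas):
    """Standardize rows with previously fitted statistics."""
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]

# Hypothetical time-ordered feature rows: one trending column, one constant column.
rows = [[float(i), 2.0] for i in range(20)]
train_rows, val_rows, test_rows = chrono_split(rows)
mus, sigmas = fit_zscore(train_rows)
train_z = apply_zscore(train_rows, mus, sigmas)
val_z = apply_zscore(val_rows, mus, sigmas)
test_z = apply_zscore(test_rows, mus, sigmas)
```

Fitting the statistics on the training set alone keeps information from later time periods out of the model inputs.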

[0072] Furthermore, constructing a standardized temporal feature set for decision states includes the following steps:

[0073] Define the components of the state space, including environmental perception, load perception, system perception, and trend perception. Environmental perception is the outdoor wet-bulb temperature, load perception includes IT equipment power consumption and hot spot temperature, system perception includes chilled water supply and return water temperature difference, cooling water supply and return water temperature difference, compressor compression ratio, water pump operating efficiency, and chiller COP, and trend perception includes hot spot temperature change rate, IT load change rate, hourly sine code, hourly cosine code, and workday identifier.

[0074] Calculate the rate of change and maximum value of hotspot temperature and the rate of change of IT load in the 5-minute window; calculate the mean and standard deviation of hotspot temperature and the mean of IT load in the 30-minute window; calculate the maximum value of hotspot temperature in the 60-minute window.

[0075] Construct incremental states, including temperature margin, temperature difference change, and COP change, and then add time encoding;

[0076] State space features are constructed by splicing together environmental perception, load perception, system perception, trend perception, incremental state, and time coding.

[0077] Calculate reward-related features, including total power consumption of the cooling system, hotspot temperature, and actions taken in the previous moment;

[0078] The state space features and reward-related features with the same timestamp are synchronously divided into training set, validation set and test set in chronological order. The mean and standard deviation of the training set are used to standardize the continuous features of the state space using Z-score, while the reward-related features are not standardized.

[0079] The standardized time-series feature set of decision states has been constructed, consisting of standardized state space features and reward-related features.

[0080] In this embodiment, it should be specifically noted that the highest value of the server's intake air temperature is the hotspot temperature; and the power consumption of the IT equipment is the IT load.

[0081] Furthermore, among the temperature data, energy consumption data, and command execution data collected after waiting for the system response, the temperature data refers to the hot spot temperature after execution. Energy consumption data includes the total power consumption of the refrigeration system and the power consumption of the chiller unit after execution. Command execution data includes the actual chilled water supply temperature, actual chilled water return temperature, actual chilled water flow rate, actual pump frequency, actual cooling tower fan frequency, actual valve opening, actual fan speed, control command issuance time, and system response stabilization time. The temperature control effect evaluation includes calculating the over-temperature integral and temperature fluctuation. The energy consumption control effect evaluation includes calculating the energy saving rate and COP change rate. The instruction execution deviation evaluation includes calculating the instruction execution accuracy and execution response time. The calculation of the comprehensive score based on the evaluation results includes the following steps: mapping the over-temperature integral, temperature fluctuation, energy saving rate, COP change rate, instruction execution accuracy, and execution response time to a score of 0-100; then weighting and summing the scores of the over-temperature integral and temperature fluctuation to obtain the comprehensive score for temperature control; weighting and summing the scores of the energy saving rate and COP change rate to obtain the comprehensive score for energy consumption control; weighting and summing the scores of the instruction execution accuracy and execution response time to obtain the comprehensive score for execution deviation; and weighting and summing the comprehensive scores for temperature control, energy consumption control, and execution deviation to obtain the overall comprehensive score.

[0082] Specifically, in this embodiment, the generation of the over-temperature integral includes: determining the control command issuance time and the system response stabilization time to obtain the evaluation period; retrieving the hotspot temperature value at each sampling moment within the evaluation period; for each sampling moment, calculating the difference between the hotspot temperature and the safety threshold; if the temperature exceeds the threshold, the over-temperature amount is the difference; if the temperature does not exceed the threshold, the over-temperature amount is 0; multiplying the over-temperature amount at each sampling moment by the sampling interval duration to obtain the over-temperature area within that period; and accumulating the over-temperature areas of all sampling intervals within the evaluation period to obtain the over-temperature integral.

[0083] In this embodiment, a specific example calculation is provided: Assuming the safety threshold is set to 27°C and the sampling interval is 1 minute, during the 15-minute evaluation period, the temperature reaches 27.5°C, 27.8°C, 28.0°C, 27.6°C, and 27.2°C at 5 time points, and is below 27°C at other times, then the over-temperature integral is calculated as: (0.5 + 0.8 + 1.0 + 0.6 + 0.2) × 1 minute = 3.1°C·minute.

[0084] In this embodiment, it should be specifically explained that the generation of the temperature fluctuation includes: determining the evaluation period (the same as the evaluation period of the over-temperature integral); finding the highest and lowest hot spot temperature values within the evaluation period; and subtracting the lowest hot spot temperature value from the highest hot spot temperature value to obtain the temperature fluctuation.
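The over-temperature integral of paragraphs [0082]-[0083] and the temperature fluctuation of paragraph [0084] can be sketched as below. The five above-threshold readings are those of the worked example; the remaining ten readings (26.5°C) are hypothetical fill-in values below the threshold, and the function names are illustrative.

```python
def over_temperature_integral(temps, threshold, interval_min=1.0):
    """Sum of (temperature - threshold) over samples above the threshold, times the interval."""
    return sum(max(t - threshold, 0.0) for t in temps) * interval_min

def temperature_fluctuation(temps):
    """Highest minus lowest hot spot temperature within the evaluation period."""
    return max(temps) - min(temps)

# 15-minute evaluation period at 1-minute sampling, safety threshold 27 deg C.
temps = [26.5] * 10 + [27.5, 27.8, 28.0, 27.6, 27.2]
integral = over_temperature_integral(temps, threshold=27.0)  # about 3.1 degC*min
fluctuation = temperature_fluctuation(temps)                 # 1.5 degC
```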

[0085] In this embodiment, it should be specifically noted that the generation of the energy saving rate includes: selecting, from historical data, the average total power consumption of the refrigeration system under the baseline control strategy and the same operating conditions as the current ones (same outdoor wet-bulb temperature, same IT load) as the baseline energy consumption; after the control command is executed, reading the average total power consumption of the refrigeration system during the evaluation period from the electricity meter as the actual energy consumption; subtracting the actual energy consumption from the baseline energy consumption, dividing the difference by the baseline energy consumption, and then multiplying by 100% to obtain the energy saving rate as a percentage. The generation of the COP change rate includes: obtaining the actual chilled water supply temperature, actual chilled water return temperature, actual chilled water flow rate, and chiller unit power consumption after execution; calculating the chiller cooling capacity based on the chilled water flow rate, specific heat capacity of water, water density, and chilled water supply and return temperatures; taking the ratio of the chiller cooling capacity to the chiller unit power consumption as the actual chiller COP; selecting, from historical data, the average chiller COP under the baseline control strategy and the same operating conditions as the baseline chiller COP; subtracting the baseline chiller COP from the actual chiller COP to obtain the COP change, dividing the COP change by the baseline chiller COP, and then multiplying by 100% to obtain the COP change rate as a percentage.
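A minimal sketch of the energy saving rate and COP change rate, assuming flow in m³/h, temperatures in °C, power in kW, and typical water properties (density 1000 kg/m³, specific heat 4.186 kJ/(kg·K)). The units, constants, function names, and numeric inputs are assumptions for illustration, not values from this embodiment.

```python
WATER_DENSITY = 1000.0       # kg/m^3 (assumed)
WATER_SPECIFIC_HEAT = 4.186  # kJ/(kg*K) (assumed)

def energy_saving_rate(baseline_kw, actual_kw):
    """(baseline - actual) / baseline, as a percentage."""
    return (baseline_kw - actual_kw) / baseline_kw * 100.0

def cooling_capacity_kw(flow_m3_per_h, supply_c, return_c):
    """Cooling capacity from flow, density, specific heat and supply/return temperature difference."""
    mass_flow = WATER_DENSITY * flow_m3_per_h / 3600.0  # kg/s
    return mass_flow * WATER_SPECIFIC_HEAT * (return_c - supply_c)

def cop_change_rate(actual_cop, baseline_cop):
    """(actual COP - baseline COP) / baseline COP, as a percentage."""
    return (actual_cop - baseline_cop) / baseline_cop * 100.0

# Hypothetical readings after a control command has been executed.
saving = energy_saving_rate(baseline_kw=500.0, actual_kw=460.0)              # 8 %
capacity = cooling_capacity_kw(flow_m3_per_h=360.0, supply_c=9.0, return_c=14.0)
actual_cop = capacity / 400.0                                                # chiller draws 400 kW
cop_change = cop_change_rate(actual_cop, baseline_cop=5.0)
```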

[0086] In this embodiment, it should be specifically explained that the generation of the instruction execution accuracy includes: acquiring the pump frequency instruction value, cooling tower fan frequency instruction value, valve opening instruction value, fan speed instruction value, chilled water supply temperature instruction value, chilled water return temperature instruction value, and chilled water flow rate instruction value, together with the actual pump frequency, actual cooling tower fan frequency, actual valve opening, actual fan speed, actual chilled water supply temperature, actual chilled water return temperature, and actual chilled water flow rate after instruction execution; calculating the absolute value of the difference between the pump frequency instruction value and the actual pump frequency, the ratio of this absolute difference to the variation range of the pump frequency being the relative deviation of the pump frequency. Similarly, the relative deviations of the cooling tower fan frequency, valve opening, fan speed, chilled water supply temperature, chilled water return temperature, and chilled water flow rate are calculated. The average relative deviation is obtained by averaging the relative deviations of the pump frequency, cooling tower fan frequency, valve opening, fan speed, chilled water supply temperature, chilled water return temperature, and chilled water flow rate. The execution accuracy is obtained by subtracting the average relative deviation from 1.

[0087] In this embodiment, it is necessary to explain the following example calculation: the chilled water temperature command is 9°C, the actual temperature is 9.2°C, the variation range is 5°C, and the relative deviation is 0.04; the water pump frequency command is 46Hz, the actual frequency is 45.5Hz, the variation range is 20Hz, and the relative deviation is 0.025; the valve opening command is 75%, the actual opening is 73%, the variation range is 70%, and the relative deviation is 0.029; the average relative deviation is (0.04+0.025+0.029)÷3=0.031, and the execution accuracy is 1-0.031=0.969.
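The example calculation above can be reproduced with a short sketch. The three channels and their variation ranges are those of the example; the function name and tuple layout are hypothetical.

```python
def execution_accuracy(channels):
    """channels: list of (instruction value, actual value, variation range)."""
    deviations = [abs(cmd - act) / rng for cmd, act, rng in channels]
    return 1.0 - sum(deviations) / len(deviations)

channels = [
    (9.0, 9.2, 5.0),     # chilled water temperature, deg C
    (46.0, 45.5, 20.0),  # water pump frequency, Hz
    (75.0, 73.0, 70.0),  # valve opening, %
]
accuracy = execution_accuracy(channels)  # about 0.969
```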

[0088] In this embodiment, it should be specifically noted that the generation of the execution response duration includes: determining the control command issuance time point and the system response stabilization time point. The system response stabilization time point minus the control command issuance time point is the execution response duration.

[0089] In this embodiment, it should be specifically noted that mapping the over-temperature integral, temperature fluctuation, energy saving rate, COP change rate, instruction execution accuracy, and execution response time to a score of 0-100 includes: determining the level ranges for each of these six indicators. For the over-temperature integral, a value in the range [0, 2.5) is rated excellent with a score of 100; a value in the range [2.5, 12.5) is rated good with a score of 80; a value in the range [12.5, 20] is rated average with a score of 60; and a value greater than 20 is rated poor with a score of 40. For the temperature fluctuation, a value less than 1.0°C is rated excellent with a score of 100; a value in the range [1.0°C, 2.5°C] is rated good with a score of 80; a value in the range (2.5°C, 3.0°C] is rated average with a score of 60; and a value greater than 3.0°C is rated poor with a score of 40. For the energy saving rate, a value greater than 10% is rated excellent with a score of 100; a value in the range [2.5%, 10%] is rated good with a score of 80; a value in the range [0%, 2.5%) is rated average with a score of 60; and a value less than 0% is rated poor with a score of 40. For the COP change rate, a value greater than 5% is rated excellent with a score of 100; a value in the range (-2.5%, 5%] is rated good with a score of 80; a value in the range [-5%, -2.5%] is rated average with a score of 60; and a value less than -5% is rated poor with a score of 40. For the instruction execution accuracy, a value greater than 0.95 is rated excellent with a score of 100; a value in the range (0.875, 0.95] is rated good with a score of 80; a value in the range [0.85, 0.875] is rated average with a score of 60; and a value less than 0.85 is rated poor with a score of 40. For the execution response time, a value less than 5 minutes is rated excellent with a score of 100; a value in the range [5 minutes, 12.5 minutes] is rated good with a score of 80; a value in the range (12.5 minutes, 15 minutes] is rated average with a score of 60; and a value greater than 15 minutes is rated poor with a score of 40. The calculated over-temperature integral, temperature fluctuation, energy saving rate, COP change rate, instruction execution accuracy, and execution response time are matched against the above level ranges to obtain the 0-100 score for each indicator.
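As a rough sketch of the tiered scoring and the weighted comprehensive score: the two helpers below map an indicator to 100/80/60/40 using three cut-off values, and a weighted sum combines the scores. The handling of values that fall exactly on a boundary is simplified, and all weights are hypothetical, since this embodiment does not give them.

```python
def score_lower_is_better(value, excellent_below, good_below, average_upto):
    """Tier score for indicators where smaller is better (e.g. over-temperature integral)."""
    if value < excellent_below:
        return 100
    if value < good_below:
        return 80
    if value <= average_upto:
        return 60
    return 40

def score_higher_is_better(value, excellent_above, good_from, average_from):
    """Tier score for indicators where larger is better (e.g. energy saving rate)."""
    if value > excellent_above:
        return 100
    if value >= good_from:
        return 80
    if value >= average_from:
        return 60
    return 40

def weighted(scores, weights):
    """Weighted sum; weights are expected to add up to 1."""
    return sum(s * w for s, w in zip(scores, weights))

# Hypothetical indicator values and weights.
temp_score = weighted(
    [score_lower_is_better(3.1, 2.5, 12.5, 20.0),      # over-temperature integral
     score_lower_is_better(1.5, 1.0, 2.5, 3.0)],       # temperature fluctuation, deg C
    [0.5, 0.5])
energy_score = weighted(
    [score_higher_is_better(8.0, 10.0, 2.5, 0.0),      # energy saving rate, %
     score_higher_is_better(4.65, 5.0, -2.5, -5.0)],   # COP change rate, %
    [0.5, 0.5])
exec_score = weighted(
    [score_higher_is_better(0.969, 0.95, 0.875, 0.85), # execution accuracy
     score_lower_is_better(4.0, 5.0, 12.5, 15.0)],     # response time, minutes
    [0.5, 0.5])
composite = weighted([temp_score, energy_score, exec_score], [0.4, 0.4, 0.2])
```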

[0090] In this embodiment, it should be noted that the threshold, preset value, and set value used are all selected based on actual needs, and no specific value limit is imposed here.

[0091] As shown in Figure 2, this embodiment provides an energy-saving optimization scheduling system for a refrigeration system, including a multi-source data acquisition module, a multi-source data preprocessing module, a feature engineering module, a model training module, a physical response module, a hotspot temperature prediction module, an intelligent inference decision-making module, an execution control module, a control effect evaluation module, and a cloud database. The multi-source data acquisition module, multi-source data preprocessing module, feature engineering module, and model training module are sequentially connected. The model training module is connected to the physical response module, hotspot temperature prediction module, and intelligent inference decision-making module. The feature engineering module is connected to the physical response module, hotspot temperature prediction module, and intelligent inference decision-making module. The hotspot temperature prediction module is connected to the intelligent inference decision-making module. The intelligent inference decision-making module, execution control module, and control effect evaluation module are sequentially connected. All modules in the system are connected to the cloud database.

[0092] The multi-source data acquisition module establishes an IoT sensing network to collect environmental data, load data, refrigeration system status data, and energy consumption data, and outputs a multi-source heterogeneous time-series dataset with timestamps.

[0093] The multi-source data preprocessing module performs data cleaning on multi-source heterogeneous time-series datasets, including outlier detection and correction, missing value handling, and multi-source data alignment.

[0094] The feature engineering module constructs basic environmental features, load features, basic refrigeration system state features, and basic energy consumption features based on preprocessed multi-source data. It then constructs derived environmental features based on the basic environmental features, derived refrigeration system state features based on the basic refrigeration system state features, and derived energy consumption features based on the basic energy consumption features. Finally, it constructs a standardized time-series feature set for physical modeling, a standardized time-series feature set for state prediction, and a standardized time-series feature set for decision-making state.

[0095] The model training module uses a historical physics modeling standardized time series feature set to train the constructed thermal response model and energy consumption response model, and uses a historical state prediction standardized time series feature set to train the constructed hot spot temperature prediction model. A DRL training environment is constructed based on the thermal response model and energy consumption response model. The intelligent reasoning decision model is trained based on the historical decision state standardized time series feature set, the DRL training environment, and the hot spot temperature prediction results. If all tests pass, the model file is saved; otherwise, the model is calibrated and updated.

[0096] The physical response module inputs the standardized time-series feature set of real-time physical modeling into the thermal response model and energy consumption response model for inference, outputs the current hot spot temperature estimate and the current cooling energy consumption estimate, and triggers model drift detection;

[0097] The hotspot temperature prediction module inputs the standardized time-series feature set of real-time state prediction into the hotspot temperature prediction model to infer and output the hotspot temperature at future time points.

[0098] The intelligent reasoning and decision-making module inputs the standardized time-series feature set of real-time decision status and the hotspot temperature at future time points into the intelligent reasoning and decision-making model to infer and output a set of control instructions.

[0099] The execution control module sends the control instruction set to the controller for executing the control instructions;

[0100] After waiting for the system response, the control effect evaluation module collects temperature data, energy consumption data, and command execution data after the control command is executed, and performs temperature control effect evaluation, energy consumption control effect evaluation, and command execution deviation evaluation respectively. Then, it calculates a comprehensive score based on the evaluation results and generates a control effect evaluation report.

[0101] The cloud database is used to store data information for all modules in the system.

[0102] In conclusion, the above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. An energy-saving optimal scheduling method for a refrigeration system, characterized in that it includes the following steps:
Step 1: Establish an IoT sensing network to collect environmental data, load data, cooling system status data, and energy consumption data, and output a multi-source heterogeneous time-series dataset with timestamps;
Step 2: Perform data cleaning and feature engineering on the multi-source heterogeneous time series datasets, and construct and output the physical modeling standardized time series feature set, the state prediction standardized time series feature set, and the decision state standardized time series feature set;
Step 3: Train the constructed thermal response model and energy consumption response model using the standardized time series feature set of historical physics modeling, and train the constructed hot spot temperature prediction model using the standardized time series feature set of historical state prediction. Construct a DRL training environment based on the thermal response model and energy consumption response model. Train the intelligent reasoning decision model based on the standardized time series feature set of historical decision state, the DRL training environment, and the hot spot temperature prediction results. If all tests pass, save the model file; otherwise, calibrate and update the model;
Step 4: Input the standardized time-series feature set of real-time physical modeling into the thermal response model and energy consumption response model for inference, output the current hot spot temperature estimate and the current cooling energy consumption estimate, trigger model drift detection, and feed the detection results back to Step 3;
Step 5: Input the standardized time series feature set of real-time state prediction into the hotspot temperature prediction model for inference and output the hotspot temperature at future time points; input the standardized time series feature set of real-time decision state and the hotspot temperature at future time points into the intelligent inference decision model for inference and output the control instruction set;
Step 6: Send the control command set to the controller, wait for the system response, collect the temperature data, energy consumption data and command execution data after the control command is executed, and conduct temperature control effect evaluation, energy consumption control effect evaluation and command execution deviation evaluation respectively. Then, calculate the comprehensive score based on the evaluation results and generate a control effect evaluation report;
Step 7: Write the performance evaluation report into the time series database. When the preset model retraining conditions are triggered, retrain the model from Step 3.

2. The energy saving optimization scheduling method of a refrigeration system according to claim 1, wherein, The environmental data includes outdoor dry-bulb temperature, relative humidity, and atmospheric pressure; indoor air conditioning return air temperature and humidity, cold aisle temperature, and hot aisle temperature; the load data includes IT equipment power consumption, server intake air temperature, CPU utilization, and GPU utilization; the refrigeration system status data includes chilled water supply temperature setpoint, chilled water flow rate, chilled water supply and return temperatures, cooling water supply and return temperatures, compressor suction and discharge pressure, natural cooling coil inlet and outlet temperatures, pump frequency, pump flow rate, cooling tower fan frequency, and valve opening on the delivery side; and terminal fan speed, supply air temperature setpoint, and actual supply air temperature. The energy consumption data includes total refrigeration system power consumption, chiller unit power consumption, cooling tower power consumption, pump power consumption, terminal air conditioning power consumption, and total input power consumption.

3. The energy-saving optimization scheduling method of a refrigeration system according to claim 2, characterized in that, The data cleaning of the multi-source heterogeneous time-series dataset includes outlier detection and correction, missing value handling, and multi-source data alignment; the feature engineering includes the following steps: After data cleaning, the environmental data, load data, refrigeration system status data, and energy consumption data are sequentially labeled as basic environmental characteristics, load characteristics, basic refrigeration system status characteristics, and basic energy consumption characteristics. Based on the basic environmental characteristics, we construct environmental derivative characteristics; based on the basic refrigeration system state characteristics, we construct refrigeration system state derivative characteristics; based on the basic energy consumption characteristics, we construct energy consumption derivative characteristics. Among them, the environmental derivative characteristics include outdoor wet-bulb temperature and indoor return air enthalpy; the refrigeration system state derivative characteristics include chilled water supply and return water temperature difference, cooling water supply and return water temperature difference, compressor compression ratio, natural cooling contribution temperature difference, water pump operating efficiency, and refrigeration unit cooling capacity; and the energy consumption derivative characteristics include real-time PUE and refrigeration unit COP. Construct a standardized temporal feature set for physical modeling, a standardized temporal feature set for state prediction, and a standardized temporal feature set for decision state; The three feature sets constructed are stored in the cloud database.

4. The energy-saving optimization scheduling method of a refrigeration system according to claim 3, characterized in that, The process of constructing environmental derived features based on basic environmental features, constructing refrigeration system state derived features based on basic refrigeration system state features, and constructing energy consumption derived features based on basic energy consumption features includes the following steps: Based on the outdoor dry-bulb temperature, outdoor relative humidity, and atmospheric pressure, the outdoor wet-bulb temperature is calculated using the iterative method of the Stull formula, and the indoor return air enthalpy is calculated based on the air conditioning return air temperature, air conditioning return air humidity, and atmospheric pressure. The chilled water supply and return water temperature difference is calculated based on the chilled water supply temperature and return water temperature; the cooling water supply and return water temperature difference is calculated based on the cooling water supply temperature and return water temperature; the compressor compression ratio is calculated based on the compressor suction pressure and discharge pressure; the natural cooling contribution temperature difference is calculated based on the natural cooling coil inlet temperature and outlet temperature; the water pump operating efficiency is calculated based on the water pump frequency, water pump flow rate, and water pump power consumption; and the refrigeration capacity is calculated based on the chilled water flow rate, water specific heat capacity, water density, chilled water supply temperature, and return water temperature. Real-time PUE is calculated based on total input power consumption and IT equipment power consumption, and chiller COP is calculated based on chiller cooling capacity and chiller unit power consumption.

5. The energy-saving optimization scheduling method of a refrigeration system according to claim 3, characterized in that the construction of the standardized temporal feature set for physical modeling includes the following steps: after the target variable and input variables are determined, the standard deviation of the variables over a 30-minute window, the mean and cumulative value of the variables over a 60-minute window, and the lagged values of the variables over a lag window are calculated; steady-state screening is performed, wherein the time-series statistical features of samples that pass the steady-state screening are labeled as target time-series statistical features and retained for subsequent processing, and samples that fail the steady-state screening are discarded; for each timestamp that passes the steady-state screening, the basic features, derived features, target time-series statistical features, and state features are concatenated; the dataset is divided into a training set, a validation set, and a test set in chronological order; the mean and standard deviation of each continuous feature are calculated on the training set, and the training, validation, and test sets are Z-score standardized using that mean and standard deviation; the construction of the standardized temporal feature set for physical modeling is thereby completed.
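
The chronological split and training-set-only standardization in claim 5 can be sketched as follows. This is an illustrative sketch, not the patented implementation: the 70/15/15 split fractions, the use of the population standard deviation, and the fallback to 1.0 for constant features are assumptions, and all function names are hypothetical. Each row is a list of continuous feature values already sorted by timestamp.

```python
from statistics import mean, pstdev


def chronological_split(rows, train_frac=0.7, val_frac=0.15):
    """Split time-ordered rows into train/validation/test without shuffling."""
    n = len(rows)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    return (rows[:n_train],
            rows[n_train:n_train + n_val],
            rows[n_train + n_val:])


def fit_zscore(train_rows):
    """Compute per-feature mean and standard deviation on the training set only."""
    columns = list(zip(*train_rows))
    means = [mean(col) for col in columns]
    # Assumption: a constant feature gets std 1.0 to avoid division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_zscore(rows, means, stds):
    """Standardize any split using the training-set statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


if __name__ == "__main__":
    data = [[float(i), 10.0] for i in range(10)]
    train, val, test = chronological_split(data, 0.6, 0.2)
    means, stds = fit_zscore(train)
    print(apply_zscore(test, means, stds))
```

Fitting the statistics on the training set alone keeps information from later time periods out of the validation and test features.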

6. The energy-saving optimization scheduling method of a refrigeration system according to claim 3, characterized in that the construction of the standardized temporal feature set for state prediction includes the following steps: after the target variable and key variables are determined, the mean, maximum value, and rate of change of each variable over a 5-minute window, the mean, maximum value, and standard deviation over a 15-minute window, and the mean and maximum value over a 30-minute window are calculated; the current instantaneous value and a time code are added; the current instantaneous value, the window statistics, and the time code are concatenated to obtain the target time-series statistical features; the target time-series statistical features are divided into a training set, a validation set, and a test set in chronological order; the mean and standard deviation of each continuous feature are calculated on the training set, and the training, validation, and test sets are Z-score standardized using that mean and standard deviation; the standardized features are paired with the target variable to form supervised learning samples; the construction of the standardized temporal feature set for state prediction is thereby completed.
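
The multi-window statistics in claim 6 can be sketched for a single variable as below. The sketch assumes one sample per minute, so a window of N minutes is the N most recent samples, and it takes the rate of change as the difference between the last and first sample in the window divided by the elapsed time; neither choice is specified in the claim. The time code is omitted here, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def window_stats(values, end, width, sample_minutes=1.0):
    """Statistics over the `width` most recent samples ending at index `end`."""
    start = max(0, end - width + 1)
    window = values[start:end + 1]
    elapsed = (len(window) - 1) * sample_minutes
    # Assumption: rate of change = (last - first) / elapsed minutes.
    rate = (window[-1] - window[0]) / elapsed if elapsed > 0 else 0.0
    return {"mean": mean(window), "max": max(window),
            "std": pstdev(window), "rate": rate}


def state_prediction_features(values, end):
    """Concatenate the current value with 5-, 15-, and 30-minute statistics."""
    w5 = window_stats(values, end, 5)
    w15 = window_stats(values, end, 15)
    w30 = window_stats(values, end, 30)
    return [values[end],
            w5["mean"], w5["max"], w5["rate"],
            w15["mean"], w15["max"], w15["std"],
            w30["mean"], w30["max"]]


if __name__ == "__main__":
    hotspot = [float(i) for i in range(40)]  # synthetic ramp, 1 deg C per minute
    print(state_prediction_features(hotspot, 39))
```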

7. The energy-saving optimization scheduling method of a refrigeration system according to claim 3, characterized in that the construction of the standardized temporal feature set for decision states includes the following steps: the components of the state space are defined, including environmental perception, load perception, system perception, and trend perception, wherein environmental perception is the outdoor wet-bulb temperature, load perception includes IT equipment power consumption and hotspot temperature, system perception includes chilled water supply and return water temperature difference, cooling water supply and return water temperature difference, compressor compression ratio, water pump operating efficiency, and chiller COP, and trend perception includes hotspot temperature change rate, IT load change rate, hourly sine code, hourly cosine code, and workday identifier; the rate of change and maximum value of the hotspot temperature and the rate of change of the IT load are calculated over a 5-minute window; the mean and standard deviation of the hotspot temperature and the mean of the IT load are calculated over a 30-minute window; the maximum value of the hotspot temperature is calculated over a 60-minute window; incremental states are constructed, including temperature margin, temperature difference change, and COP change, and a time code is then added; state space features are constructed by concatenating environmental perception, load perception, system perception, trend perception, the incremental states, and the time code; reward-related features are calculated, including the total power consumption of the cooling system, the hotspot temperature, and the action taken at the previous time step; the state space features and reward-related features with the same timestamp are divided together into a training set, a validation set, and a test set in chronological order; the continuous state space features are Z-score standardized using the mean and standard deviation of the training set, while the reward-related features are not standardized; the construction of the standardized temporal feature set for decision states, consisting of the standardized state space features and the reward-related features, is thereby completed.
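
The time encoding, incremental states, and concatenation in claim 7 can be sketched as follows. The claim does not define these quantities precisely, so the sketch assumes that the hourly sine and cosine codes use a 24-hour period, that the workday identifier is simply Monday to Friday (public holidays are ignored), that the temperature margin is a hypothetical hotspot limit minus the current hotspot temperature, and that the two change terms are differences from the previous time step. All names are hypothetical.

```python
import math
from datetime import datetime


def time_encoding(ts):
    """Hourly sine/cosine codes plus a workday flag for a datetime."""
    angle = 2.0 * math.pi * ts.hour / 24.0
    # Assumption: workday means Monday-Friday; holidays are not considered.
    workday = 1.0 if ts.weekday() < 5 else 0.0
    return [math.sin(angle), math.cos(angle), workday]


def incremental_state(hotspot_c, limit_c, delta_t_now, delta_t_prev,
                      cop_now, cop_prev):
    """Temperature margin, temperature difference change, and COP change."""
    return [limit_c - hotspot_c,
            delta_t_now - delta_t_prev,
            cop_now - cop_prev]


def build_state(environment, load, system, trend, incremental, time_code):
    """Concatenate all components into one flat state vector."""
    return [*environment, *load, *system, *trend, *incremental, *time_code]


if __name__ == "__main__":
    ts = datetime(2026, 3, 25, 6, 0)
    state = build_state(
        environment=[13.7],                 # outdoor wet-bulb temperature
        load=[1000.0, 25.0],                # IT power, hotspot temperature
        system=[5.0, 4.0, 3.0, 0.7, 5.8],   # system perception features
        trend=[0.1, 2.0],                   # hotspot and IT load change rates
        incremental=incremental_state(25.0, 27.0, 5.0, 4.5, 5.8, 5.5),
        time_code=time_encoding(ts),
    )
    print(len(state), state)
```

Only the continuous entries of such a vector would be Z-score standardized; the binary workday flag and the reward-related features stay in their raw form.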

8. The method of claim 1, wherein, in collecting the temperature data, energy consumption data, and command execution data after the control commands are executed and the system response has been awaited, the temperature data includes the hotspot temperature after execution; the energy consumption data includes the total power consumption of the refrigeration system and the power consumption of the chiller unit after execution; and the command execution data includes the actual chilled water supply temperature, actual chilled water return temperature, actual chilled water flow rate, actual water pump frequency, actual cooling tower fan frequency, actual valve opening, actual fan speed, control command issuance time, and system response stabilization time.

9. The method of claim 1, wherein the evaluation of temperature control effectiveness includes calculating an over-temperature integral and a temperature fluctuation; the evaluation of energy consumption control effectiveness includes calculating an energy saving rate and a COP change rate; and the evaluation of command execution deviation includes calculating a command execution accuracy and an execution response time.
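
The evaluation metrics named in claim 9 are not given formulas in the claim, so the following Python sketch shows one plausible set of definitions and should be read as assumptions only: the over-temperature integral is a rectangle-rule sum of the excess over a hypothetical temperature limit, the fluctuation is the population standard deviation, the energy saving and COP change rates are relative differences against a baseline, and execution accuracy is one minus the relative setpoint error. All names are hypothetical.

```python
from statistics import pstdev


def over_temperature_integral(temps_c, limit_c, dt_minutes=1.0):
    """Rectangle-rule integral of max(0, T - limit), in deg C * minutes."""
    return sum(max(0.0, t - limit_c) for t in temps_c) * dt_minutes


def temperature_fluctuation(temps_c):
    """Assumption: fluctuation is the population standard deviation."""
    return pstdev(temps_c)


def energy_saving_rate(baseline_kwh, actual_kwh):
    """Relative energy saved against a baseline period."""
    return (baseline_kwh - actual_kwh) / baseline_kwh


def cop_change_rate(cop_before, cop_after):
    """Relative COP change after the control action."""
    return (cop_after - cop_before) / cop_before


def execution_accuracy(setpoint, actual):
    """Assumption: accuracy is one minus the relative setpoint error."""
    return 1.0 - abs(actual - setpoint) / abs(setpoint)


if __name__ == "__main__":
    temps = [26.0, 28.0, 29.0, 27.0]
    print(over_temperature_integral(temps, 27.0),
          round(temperature_fluctuation(temps), 3),
          energy_saving_rate(100.0, 90.0),
          round(cop_change_rate(5.0, 5.5), 3),
          execution_accuracy(10.0, 9.5))
```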

10. The method of claim 1, wherein the calculation of the comprehensive score based on the evaluation results includes the following steps: mapping each of the over-temperature integral, temperature fluctuation, energy saving rate, COP change rate, command execution accuracy, and execution response time to a score from 0 to 100; weighting and summing the scores of the over-temperature integral and temperature fluctuation to obtain a comprehensive score for temperature control; weighting and summing the scores of the energy saving rate and COP change rate to obtain a comprehensive score for energy consumption control; weighting and summing the scores of the command execution accuracy and execution response time to obtain a comprehensive score for execution deviation; and weighting and summing the comprehensive scores for temperature control, energy consumption control, and execution deviation to obtain the comprehensive score.
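
The two-level weighted scoring in claim 10 can be sketched as below. The claim specifies neither the mapping to the 0 to 100 range nor the weights, so the linear clamped mapping between a "worst" and a "best" reference value and every weight in the example are assumptions made for illustration, and the function names are hypothetical.

```python
def map_to_score(value, worst, best):
    """Linearly map a metric to 0-100, clamped at both ends.

    Works for metrics where lower is better (best < worst) and for
    metrics where higher is better (best > worst).
    """
    fraction = (value - worst) / (best - worst)
    return 100.0 * min(1.0, max(0.0, fraction))


def weighted_sum(scores, weights):
    """Weighted sum of scores; weights are expected to sum to 1."""
    return sum(s * w for s, w in zip(scores, weights))


def comprehensive_score(metric_scores, group_weights, top_weights):
    """Combine per-metric scores into group scores, then into one score."""
    groups = ["temperature", "energy", "execution"]
    group_scores = [weighted_sum(metric_scores[g], group_weights[g])
                    for g in groups]
    return weighted_sum(group_scores, [top_weights[g] for g in groups])


if __name__ == "__main__":
    scores = {
        "temperature": [map_to_score(2.0, 10.0, 0.0), 60.0],  # integral, fluctuation
        "energy": [70.0, 50.0],                               # saving rate, COP change
        "execution": [90.0, 100.0],                           # accuracy, response time
    }
    weights = {g: [0.5, 0.5] for g in scores}                 # hypothetical weights
    top = {"temperature": 0.4, "energy": 0.4, "execution": 0.2}
    print(comprehensive_score(scores, weights, top))
```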