Adaptive operation and maintenance control method for a data center liquid cooling system

By combining a hybrid spiking neural network with a dynamic heterogeneous graph, adaptive operation and maintenance control of liquid cooling systems is achieved. This addresses the limited adaptability and slow response of traditional control methods and improves the stability and adaptability of the system.

CN122239902A · Pending · Publication Date: 2026-06-19 · CHANGTAI CLOUD TECH SERVICE (SHENZHEN) CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
CHANGTAI CLOUD TECH SERVICE (SHENZHEN) CO LTD
Filing Date
2026-01-27
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Traditional liquid cooling systems' coordinated control methods are ill-suited to adapting to a wide range of dynamically changing loads and ambient temperatures, leading to overshoot, oscillation, or slow response, and lacking global adaptability and optimization capabilities.

Method used

A hybrid spiking neural network is used for inner-loop control, combined with an outer-loop collaborative adjustment mechanism based on reward diffusion over a dynamic heterogeneous graph, to achieve real-time rapid regulation and long-term optimization of the liquid cooling system. Adaptive operation and maintenance regulation is achieved by constructing encoding-decoding rules, a fully connected weight matrix, and a dynamic heterogeneous graph.

Benefits of technology

It improves the overall reliability and adaptability of the liquid cooling system, enabling it to respond quickly to sudden load changes and ensure the system's instantaneous stability and long-term performance.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention provides an adaptive operation and maintenance control method for a data center liquid cooling system. The method includes: acquiring liquid cooling neurons and effective training data; constructing an encoding-decoding rule using the liquid cooling neurons; acquiring a hybrid spiking neural network (HSWN) loaded with a fully connected weight matrix based on the encoding-decoding rule and effective training data; acquiring raw sensor data; processing the raw sensor data based on the encoding-decoding rule; and acquiring control command parameters through the HSWN; constructing a dynamic heterogeneous graph based on the liquid cooling neurons and acquiring global reward data; acquiring local reward data through the dynamic heterogeneous graph and global reward data; acquiring local policy networks and observation feature data respectively; and performing outer-loop adjustment on the HSWN based on the local policy networks and observation feature data. This provides a method for controlling the liquid cooling system through the coordinated operation of the inner and outer loops, thereby improving the overall reliability and adaptability of the liquid cooling system during operation.

Description

Technical Field

[0001] This invention relates to the field of artificial intelligence technology, and more specifically, to an adaptive operation and maintenance control method for a data center liquid cooling system.

Background Technology

[0002] With the development of cloud computing and artificial intelligence, the power density of single racks in data centers is gradually increasing. Traditional air cooling is approaching its physical limit. Liquid cooling technology, including cold plate type and immersion liquid cooling, has become the mainstream solution for high-density heat dissipation. However, the liquid cooling system is a strongly coupled and nonlinear system. Its efficient and stable operation depends on the coordinated control of many actuators (pumps, valves, etc.).

[0003] Currently, PID control, feedforward-feedback composite control, or threshold rule-based control are commonly used for the coordinated regulation of liquid cooling systems. Among them, PID control is simple to design, but for the multivariable and large time delay characteristics of liquid cooling systems, parameter tuning is difficult, and fixed parameters are difficult to adapt to a wide range of dynamically changing loads and ambient temperatures, which can easily lead to problems such as overshoot, oscillation, or slow response. The essence of the above methods is reactive control, which can only correct deviations after they occur, and lacks global adaptability and optimization capabilities.

Summary of the Invention

[0004] In view of the aforementioned problems, and according to a first aspect of the present invention, embodiments of the present invention provide an adaptive operation and maintenance control method for a data center liquid cooling system, the method comprising:
acquiring liquid-cooled neurons and effective training data, constructing encoding-decoding rules using the liquid-cooled neurons, and obtaining a hybrid spiking neural network loaded with a fully connected weight matrix based on the encoding-decoding rules and the effective training data;
acquiring raw sensor data, processing the raw sensor data based on the encoding-decoding rules, and obtaining control command parameters through the hybrid spiking neural network;
constructing a dynamic heterogeneous graph based on the liquid-cooled neurons, obtaining global reward data, and obtaining local reward data through the dynamic heterogeneous graph and the global reward data; and
acquiring a local policy network and observation feature data respectively, and performing outer-loop adjustment on the hybrid spiking neural network based on the local policy network and the observation feature data.

[0005] As a further aspect of the present invention, acquiring liquid-cooled neurons and effective training data, constructing encoding-decoding rules using the liquid-cooled neurons, and obtaining a hybrid spiking neural network loaded with a fully connected weight matrix based on the encoding-decoding rules and the effective training data includes:
acquiring liquid cooling structure data and liquid cooling equipment data, and defining the liquid-cooled neurons based on the liquid cooling structure data and the liquid cooling equipment data;
obtaining a preset parameter range, and constructing the encoding-decoding rules based on the preset parameter range and the liquid-cooled neurons, the encoding-decoding rules including encoding rules and decoding rules;
obtaining effective training data, and transforming the effective training data based on the encoding-decoding rules to obtain multiple sets of transformed training data;
obtaining a hybrid spiking neural network, and obtaining a fully connected weight matrix from the hybrid spiking neural network and the multiple sets of transformed training data using a weight update rule; and
loading the fully connected weight matrix into the hybrid spiking neural network and validating the hybrid spiking neural network.

[0006] As a further aspect of the present invention, acquiring raw sensor data, processing the raw sensor data based on the encoding-decoding rules, and obtaining control command parameters through the hybrid spiking neural network includes:
acquiring the raw sensor data, converting it into a pulse firing time series based on the encoding-decoding rules, and loading the pulse firing time series into the hybrid spiking neural network;
evolving the hybrid spiking neural network to obtain a steady-state membrane potential vector, and converting the steady-state membrane potential vector into control command parameters based on the encoding-decoding rules; and
cyclically controlling the liquid cooling system based on the control command parameters.

[0007] As a further aspect of the present invention, constructing a dynamic heterogeneous graph based on the liquid-cooled neurons, obtaining global reward data, and obtaining local reward data through the dynamic heterogeneous graph and the global reward data includes:
acquiring historical state data, constructing an initial graph based on the liquid-cooled neurons, and obtaining the dynamic heterogeneous graph from the initial graph using a fusion strategy based on the historical state data;
acquiring performance evaluation data, aggregating the performance evaluation data into scalar data, and obtaining the global reward data based on the scalar data and a reward strategy; and
performing cyclic reward diffusion over the dynamic heterogeneous graph based on the global reward data, and obtaining the local reward data after the iteration is completed.

[0008] As a further aspect of the present invention, acquiring historical state data, constructing an initial graph based on the liquid-cooled neurons, and obtaining the dynamic heterogeneous graph from the initial graph using a fusion strategy based on the historical state data includes:
setting a fixed historical analysis window and obtaining the historical state data based on the fixed historical analysis window;
treating all liquid-cooled neurons as a node set, obtaining an edge set, assigning static weights to the edge set, and constructing the initial graph from the edge set, the node set, and the static weights;
obtaining the transfer entropy between nodes of the node set based on the historical state data, and fusing the transfer entropy with the static weights to obtain fused weights, where the transfer entropy is used to measure causal influence; and
obtaining the dynamic heterogeneous graph based on the fused weights and the initial graph.

[0009] As a further aspect of the present invention, performing cyclic reward diffusion over the dynamic heterogeneous graph based on the global reward data, and obtaining the local reward data after the iteration is completed includes:
adding virtual performance nodes to the dynamic heterogeneous graph;
constructing an initial reward vector based on the virtual performance nodes, the dynamic heterogeneous graph, and the global reward data; and
diffusing the reward over the dynamic heterogeneous graph, starting from the initial reward vector and using a diffusion update rule, to obtain the local reward data.

[0010] As a further aspect of the present invention, acquiring a local policy network and observation feature data respectively, and performing outer-loop adjustment on the hybrid spiking neural network based on the local policy network and the observation feature data includes:
initializing the local policy network and acquiring the observation feature data;
inputting the observation feature data into the local policy network to obtain advantage data;
updating the local policy network based on the advantage data using the PPO algorithm to obtain updated network parameters; and
preprocessing the updated network parameters, and fine-tuning the hybrid spiking neural network based on the updated network parameters.

[0011] In view of the aforementioned problems, and according to a second aspect of the present invention, embodiments of the present invention provide an adaptive operation and maintenance control device for a data center liquid cooling system. The device includes a processor and a memory, which are connected to each other. The memory stores programs, instructions, or code, and the processor executes the programs, instructions, or code in the memory to implement the adaptive operation and maintenance control method for the data center liquid cooling system described in the embodiment.

In this embodiment, the adaptive operation and maintenance control method combines an inner loop, based on reflex control by a hybrid spiking neural network, with an outer loop, based on a dynamic graph reward diffusion mechanism, to control the liquid cooling system. The inner loop is responsible for real-time, rapid control of the liquid cooling system to ensure its instantaneous stability. Using a hybrid spiking neural network avoids the complex iterative calculations required by traditional controllers based on numerical optimization, so the liquid cooling system can quickly suppress and regulate disturbances such as sudden load changes. The outer loop is responsible for long-term, slow optimization of the inner loop to improve the long-term performance and adaptability of the liquid cooling system. Thus, this method can improve the overall reliability and adaptability of the liquid cooling system during operation.

Attached Figure Description

[0012] Figure 1 is a flowchart of the steps of an adaptive operation and maintenance control method for a data center liquid cooling system according to the present invention.

[0013] Figure 2 is a schematic diagram of an adaptive operation and maintenance control device for a data center liquid cooling system according to the present invention.

Detailed Implementation

[0014] It should be understood that, in the embodiments of the present invention, the order of the above-mentioned process numbers does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.

[0015] The present invention is described in detail below with reference to the accompanying drawings, Figure 1 and Figure 2.

[0016] This invention provides an adaptive operation and maintenance control method for a data center liquid cooling system, comprising: Step X1: Obtain liquid-cooled neurons and effective training data, construct encoding-decoding rules using the liquid-cooled neurons, and obtain a hybrid spiking neural network loaded with a fully connected weight matrix based on the encoding-decoding rules and the effective training data.

[0017] Step X1 is performed based on steps X11 to X15: Step X11: Obtain liquid cooling structure data and liquid cooling equipment data, and define liquid cooling neurons based on the liquid cooling structure data and liquid cooling equipment data.

[0018] Specifically, the liquid cooling structure data and liquid cooling equipment data are obtained separately. The liquid cooling structure data is the physical topology diagram of the overall liquid cooling system of the data center. This diagram needs to clearly define the connection relationship of each component in the cooling loop, such as the physical location and connecting pipes of the main circulation pump, distribution unit, valves, branch circuits of each cabinet, sensors, etc. The liquid cooling equipment data includes all sensors and actuators and their corresponding parameters, such as the corresponding parameters of temperature sensors, pressure sensors, circulation pumps, regulating valves, etc. For sensors, the corresponding parameters should include their sensor type, installation location, physical range, etc. For actuators, the corresponding parameters should include control type, control physical range, etc.

[0019] Then, all liquid cooling structure data and liquid cooling equipment data are traversed to identify each independent physical entity in the liquid cooling system that has monitoring or control functions. Each of these physical entities is defined as a liquid cooling neuron. The liquid cooling neuron should include, but is not limited to, neuron identifier, neuron type, physical range / control physical range and association relationship. The neuron identifier is a unique identifier. The neuron type is divided into sensor neuron or actuator neuron. The physical range / control physical range refers to the physical measurement range of the sensor and the control range of the actuator, respectively. For example, the valve opening can be from min to max. The association relationship is obtained from the liquid cooling structure data, and the IDs of the neighboring neurons directly connected to the liquid cooling neuron in the physical topology are recorded.
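The neuron definition above can be sketched as a small data structure. This is only an illustration: the class name, field names, device dictionaries, and example identifiers such as `T_supply` and `valve_A` are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field
from enum import Enum


class NeuronType(Enum):
    SENSOR = "sensor"
    ACTUATOR = "actuator"


@dataclass
class LiquidCoolingNeuron:
    neuron_id: str            # unique identifier
    neuron_type: NeuronType   # sensor neuron or actuator neuron
    range_min: float          # lower bound of the measurement / control range
    range_max: float          # upper bound of the measurement / control range
    neighbors: list[str] = field(default_factory=list)  # IDs of directly connected neurons


def build_neurons(devices, topology):
    """devices: list of dicts (equipment data); topology: list of (id_a, id_b) physical links."""
    neurons = {
        d["id"]: LiquidCoolingNeuron(d["id"], NeuronType(d["kind"]), d["min"], d["max"])
        for d in devices
    }
    # Record the association relationship from the physical topology (undirected links).
    for a, b in topology:
        neurons[a].neighbors.append(b)
        neurons[b].neighbors.append(a)
    return neurons


# Hypothetical example data: one temperature sensor, one valve, one pump.
devices = [
    {"id": "T_supply", "kind": "sensor", "min": 10.0, "max": 60.0},
    {"id": "valve_A", "kind": "actuator", "min": 0.0, "max": 100.0},
    {"id": "pump_main", "kind": "actuator", "min": 0.0, "max": 3000.0},
]
topology = [("pump_main", "T_supply"), ("T_supply", "valve_A")]
neurons = build_neurons(devices, topology)
```

Keeping the neighbor IDs on each neuron mirrors the "association relationship" field, so later graph construction can read adjacency directly from the neuron records.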

[0020] Step X12: Obtain the preset parameter range, and construct the encoding-decoding rules based on the preset parameter range and the liquid-cooled neurons. The encoding-decoding rules include decoding rules and encoding rules.

[0021] Specifically, firstly, a preset parameter range is obtained. The preset parameter range refers to a preset range of parameters used to simulate neuronal behavior. The preset parameter range should include a preset frequency range and a preset potential range. The preset frequency range represents the upper and lower limits of the number of pulses fired per second by the neuron, for example, 5Hz to 50Hz. The preset potential range represents the upper and lower limits of the neuron's membrane potential, which usually simulates the resting potential and saturation potential of the neuron, for example, -70mV to -50mV. At the same time, an encoding-decoding rule is established based on the liquid-cooled neuron and the preset parameter range. The encoding-decoding rule includes encoding rules and decoding rules.

[0022] The encoding rule applies to the sensor neurons among the liquid-cooled neurons and maps each collected physical quantity to the pulse frequency of the corresponding neuron. A linear mapping can be used:

$$f_i(t) = f_{\min} + \frac{x_i(t) - x_{\min}}{x_{\max} - x_{\min}} \left(f_{\max} - f_{\min}\right)$$

In the above formula, $f_{\min}$ and $f_{\max}$ are the lower and upper limits of the preset frequency range; $x_{\min}$ and $x_{\max}$ are the lower and upper limits of the physical range; $f_i(t)$ is the pulse frequency, in Hz, of the neuron corresponding to sensor neuron $i$ at time $t$; and $x_i(t)$ is the real-time reading of that sensor neuron at time $t$.
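As a minimal sketch of the linear encoding rule, the function below maps a sensor reading onto the preset frequency range. The 5 Hz to 50 Hz defaults come from the example in the text; the 10 to 60 sensor range and the clamping of out-of-range readings are assumptions made for illustration.

```python
def encode_frequency(x, x_min, x_max, f_min=5.0, f_max=50.0):
    """Linearly map a sensor reading x in [x_min, x_max] to a pulse frequency in Hz."""
    # Assumption: readings outside the physical range are clamped to its limits.
    x = min(max(x, x_min), x_max)
    return f_min + (x - x_min) / (x_max - x_min) * (f_max - f_min)


# Hypothetical temperature sensor with a 10..60 degree range.
mid_frequency = encode_frequency(35.0, 10.0, 60.0)   # midpoint of the range
low_frequency = encode_frequency(10.0, 10.0, 60.0)   # lower limit
high_frequency = encode_frequency(100.0, 10.0, 60.0) # clamped to the upper limit
```

A reading at the midpoint of the physical range lands at the midpoint of the frequency range, which is the defining property of the linear map.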

[0023] The decoding rule applies to the actuator neurons among the liquid-cooled neurons and maps each neuron's membrane potential to a specific physical control command. Like the encoding rule, it uses a linear mapping:

$$u_k = u_{\min} + \frac{V_k - V_{\min}}{V_{\max} - V_{\min}} \left(u_{\max} - u_{\min}\right)$$

In the above formula, $V_{\min}$ and $V_{\max}$ are the lower and upper limits of the preset potential range; $u_{\min}$ and $u_{\max}$ are the lower and upper limits of the controlled physical range; $u_k$ is the control command parameter to be issued to actuator $k$, such as a valve opening percentage or a pump speed setpoint; and $V_k$ is the membrane potential, in mV, of the neuron corresponding to the actuator after the spiking neural network has converged. This establishes a mathematical mapping between the physical state and the internal state of the hybrid spiking neural network used in the subsequent steps.
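The decoding rule can be sketched in the same way as the encoding rule. The -70 mV to -50 mV defaults follow the example potential range given earlier; the 0 to 100 valve range and the clamping behavior are illustrative assumptions.

```python
def decode_command(v_mv, u_min, u_max, v_min=-70.0, v_max=-50.0):
    """Linearly map a membrane potential (mV) to a control command in [u_min, u_max]."""
    # Assumption: potentials outside the preset range are clamped to its limits.
    v = min(max(v_mv, v_min), v_max)
    return u_min + (v - v_min) / (v_max - v_min) * (u_max - u_min)


# Hypothetical valve whose opening is commanded in percent (0..100).
valve_opening = decode_command(-60.0, 0.0, 100.0)    # midpoint potential
valve_saturated = decode_command(-40.0, 0.0, 100.0)  # above the range, clamped
```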

[0024] Step X13: Obtain valid training data, transform the valid training data based on the encoding-decoding rule, and obtain multiple sets of transformed training data.

[0025] Specifically, when the liquid cooling system operates based on a preset goal, such as minimizing the total power consumption of the system while meeting the heat dissipation requirements of all servers, after confirming that the liquid cooling system has achieved the goal and maintained a steady state, data collection is performed on the liquid cooling system's operating data for a fixed period of time. This includes raw sensor readings and control command parameters of the actuators. The collected data is used as effective training data. Then, the encoding rules in the encoding-decoding rule are used to encode the effective training data to obtain the pulse frequency corresponding to the sensor neurons in the liquid cooling neuron. Based on the pulse frequency, a pulse firing time sequence within a fixed period of time is generated. At the same time, the ideal membrane potential of the actuator neurons in the liquid cooling neuron is obtained by the inverse operation of the decoding rules in the encoding-decoding rule. The pulse firing time sequence and membrane potential are integrated into time-series conversion data. Simultaneously, operating condition labels are added to the conversion data and grouped. The operating condition labels need to be added manually, such as uniform load, high load of branch A, etc., to obtain multiple sets of training conversion data, representing training data under different operating conditions.
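The conversion of one steady-state record into a labeled training sample might look like the sketch below. It assumes evenly spaced pulses for a constant frequency (the text does not say how firing times are drawn from the frequency), and the helper names, dictionary layout, and example values are hypothetical.

```python
import math


def spike_times(freq_hz, duration_s):
    """Evenly spaced firing times in (0, duration_s] for a constant pulse frequency."""
    n = math.floor(freq_hz * duration_s + 1e-9)  # number of pulses in the window
    return [k / freq_hz for k in range(1, n + 1)]


def ideal_potential(u, u_min, u_max, v_min=-70.0, v_max=-50.0):
    """Inverse of the linear decoding rule: control command -> ideal membrane potential (mV)."""
    return v_min + (u - u_min) / (u_max - u_min) * (v_max - v_min)


def make_sample(sensor_freqs, actuator_cmds, ranges, duration_s, label):
    """Bundle pulse firing times and ideal potentials under a manual operating-condition label."""
    return {
        "label": label,
        "spikes": {sid: spike_times(f, duration_s) for sid, f in sensor_freqs.items()},
        "potentials": {aid: ideal_potential(u, *ranges[aid]) for aid, u in actuator_cmds.items()},
    }


sample = make_sample(
    sensor_freqs={"T_supply": 10.0},        # already encoded pulse frequency, Hz
    actuator_cmds={"valve_A": 50.0},        # recorded control command, percent
    ranges={"valve_A": (0.0, 100.0)},       # control range of the actuator
    duration_s=0.5,
    label="uniform load",
)

# Group samples by operating-condition label.
groups = {}
groups.setdefault(sample["label"], []).append(sample)
```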

[0026] Step X14: Obtain the hybrid spiking neural network. Based on the hybrid spiking neural network and the multiple sets of transformed training data, obtain the fully connected weight matrix using the weight update rule.

[0027] Specifically, the hybrid spiking neural network is a hybrid network combining the Hopfield neural network and the spiking neural network, employing a fully connected, symmetric recurrent topology. Each neuron uses an integrate-and-fire model and corresponds to an actuator neuron or a sensor neuron. Neurons are connected by a symmetric weight matrix, which is initialized to zero or to very small random values and learned through the weight update rule. The network has a global energy function, whose dynamic evolution drives the network state to converge automatically to a local energy minimum, i.e., an attractor. Each attractor stores a specific spiking pattern of all neurons under a given operating condition. During operation, the network is driven by the input pulse firing time series and, through an associative memory process, quickly converges to the attractor that best matches the current operating condition. The membrane potentials of the actuator neurons in this attractor state are decoded to generate optimized control commands, thereby adaptively regulating the liquid cooling system.

[0028] The global energy function can be approximated as:

$$E = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}\, s_i\, s_j + \sum_{i=1}^{N} \theta_i\, s_i$$

In the above formula, $N$ is the total number of liquid-cooled neurons; $w_{ij}$ is the connection weight between liquid-cooled neuron $j$ and liquid-cooled neuron $i$; $s_i$ is the state of liquid-cooled neuron $i$ over a period of time, which can be approximated as a function of its membrane potential, and $s_j$ is defined similarly; and $\theta_i$ is the firing threshold of liquid-cooled neuron $i$. Driven by pulses, the network state evolves along the direction of decreasing energy and eventually stops at a minimum point, i.e., the attractor.
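To make the role of the energy function concrete, the sketch below evaluates a Hopfield-type energy for a given state. It assumes the standard form with a pairwise weight term and a threshold term; the exact expression in the patent is not recoverable from this text, so the function and the two-neuron example are illustrative only.

```python
def network_energy(weights, states, thresholds):
    """Hopfield-type energy: -1/2 * sum_ij w_ij s_i s_j + sum_i theta_i s_i (assumed form)."""
    n = len(states)
    pairwise = sum(
        weights[i][j] * states[i] * states[j] for i in range(n) for j in range(n)
    )
    bias = sum(thresholds[i] * states[i] for i in range(n))
    return -0.5 * pairwise + bias


# Two neurons with a symmetric positive coupling and zero thresholds.
w = [[0.0, 1.0], [1.0, 0.0]]
theta = [0.0, 0.0]
aligned_energy = network_energy(w, [1.0, 1.0], theta)    # states agree
opposed_energy = network_energy(w, [1.0, -1.0], theta)   # states disagree
```

With a positive coupling, the aligned state has lower energy than the opposed one, which is why evolution along decreasing energy settles into the stored pattern.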

[0029] The hybrid spiking neural network is initialized, and the intrinsic parameters of the neurons, such as the membrane time constant and firing threshold, are obtained at the same time. The multiple sets of transformed training data are loaded into the hybrid spiking neural network in sequence to simulate the spiking activity of each neuron under different operating conditions. Based on this spiking activity, the connection weight matrix is updated using the weight update rule. All transformed training data are traversed repeatedly until the change in the connection weight matrix falls below a preset threshold, at which point training has converged and the trained fully connected weight matrix is output. The value of element ij in this matrix encodes the strength and direction of the influence of neuron i's spiking activity on neuron j's spiking activity in historical operating experience.

[0030] Understandably, by using the output fully connected weight matrix, each global state becomes an attractor in the network. When the network receives a similar input during operation, it will drive the entire network state to automatically converge to the corresponding attractor, thereby finding the optimal control command that matches the mode.

[0031] Furthermore, the weight update rule is:

$$\Delta w_{ij} = \eta \sum_{a} \sum_{b} W\left(\Delta t_{ab}\right)$$

where $\Delta w_{ij}$ is the change in the connection weight from neuron $i$ to neuron $j$ during a single training round; $\eta$ is a small positive constant used to control the overall step size of weight updates; and the double summation runs over all pulse times of neuron $i$ (index $a$) and all pulse times of neuron $j$ (index $b$).

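[0032] Furthermore, the specific form of the term inside the double summation, written here as $W(\Delta t_{ab})$, is:

$$W(\Delta t_{ab}) = \begin{cases} A_{+} \exp\left(-\Delta t_{ab} / \tau_{+}\right), & \Delta t_{ab} > 0 \\ A_{-} \exp\left(\Delta t_{ab} / \tau_{-}\right), & \Delta t_{ab} < 0 \end{cases}$$

where $\Delta t_{ab} = t_j^{b} - t_i^{a}$ is the time difference between the $b$-th pulse of neuron $j$ and the $a$-th pulse of neuron $i$; $A_{+}$ is the magnitude of long-term potentiation and is greater than 0; $A_{-}$ is the magnitude of long-term depression and is less than 0, and usually the absolute value of [...] is less than the absolute value of [...]; $\tau_{+}$ is the time constant for long-term potentiation; and $\tau_{-}$ is the time constant for long-term depression.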
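The pair-based update described above resembles spike-timing-dependent plasticity, and can be sketched as follows. The exponential window is an assumed form consistent with the amplitudes and time constants named in the text; the numeric values, the function names, and the choice to ignore exactly simultaneous pulses are all illustrative assumptions.

```python
import math


def stdp_window(dt, a_plus=1.0, a_minus=-0.5, tau_plus=0.02, tau_minus=0.02):
    """Weight change for one pulse pair; dt = t_post - t_pre in seconds."""
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)    # post fires after pre: potentiation
    if dt < 0:
        return a_minus * math.exp(dt / tau_minus)   # post fires before pre: depression
    return 0.0  # assumption: simultaneous pulses leave the weight unchanged


def delta_weight(spikes_i, spikes_j, eta=0.01, **window_params):
    """Change of the weight from neuron i to neuron j: double sum over all pulse pairs."""
    total = sum(stdp_window(tj - ti, **window_params) for ti in spikes_i for tj in spikes_j)
    return eta * total


# Neuron j fires 10 ms after neuron i: the i -> j weight grows.
dw_causal = delta_weight([0.0], [0.01])
# Neuron j fires 10 ms before neuron i: the i -> j weight shrinks.
dw_anticausal = delta_weight([0.01], [0.0])
```

The sign of the time difference decides the direction of the update, so repeated causal ordering between two neurons strengthens the connection that encodes it.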

[0033] Step X15: Load the fully connected weight matrix into the hybrid spiking neural network and validate the hybrid spiking neural network.

[0034] Specifically, the existence and convergence of the attractor are verified first. One set of transformed training data is selected from the multiple sets and injected as test input into the hybrid spiking neural network loaded with the fully connected weight matrix. The network state at time t is defined as the vector composed of the membrane potentials of all neurons. The network simulation is then started, and the network state evolves freely according to its dynamic equation, driven by the input pulses and the fully connected weight matrix. During the evolution, the rate of change of the network state is calculated. When the rate of change falls below a very small threshold within a reasonable time, the network is deemed to have met the convergence condition, which shows that an attractor corresponding to the test input exists, and the vector at convergence is recorded. Next, the attractor's conformity to the ideal state is verified: the recorded vector is compared with the ideal membrane potentials in the test input, and the error norm between them is calculated. If the error norm is less than a preset error threshold, the control command parameters output by the attractor are considered to be within an acceptable range. Finally, the stability of the attractor is verified. After the network converges, a small random perturbation is applied to the membrane potential of one or more neurons, and once the perturbation is removed the network state is observed to see whether it converges back to the original state or to a new steady state whose error is less than the preset error threshold. If it recovers, the attractor is locally asymptotically stable. After these checks, the verified hybrid spiking neural network loaded with the fully connected weight matrix is output.
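The two numeric checks in this verification, the rate-of-change test and the error-norm test, can be sketched as below. The Euclidean norm and all threshold values are assumptions; the text does not specify which norm or thresholds are used.

```python
import math


def l2_norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def has_converged(prev_state, state, dt, rate_tol):
    """True when the state's rate of change over one step is below rate_tol."""
    rate = l2_norm([(b - a) / dt for a, b in zip(prev_state, state)])
    return rate < rate_tol


def matches_ideal(state, ideal, err_tol):
    """True when the error norm between the converged state and the ideal potentials is small."""
    return l2_norm([a - b for a, b in zip(state, ideal)]) < err_tol


# Hypothetical membrane potentials (mV) of two actuator neurons.
converged = has_converged([-60.0, -55.0], [-60.0, -55.0], dt=0.001, rate_tol=1e-3)
still_moving = has_converged([-60.0, -55.0], [-59.0, -55.0], dt=0.001, rate_tol=1e-3)
acceptable = matches_ideal([-60.0, -55.0], [-60.3, -55.4], err_tol=1.0)
too_far = matches_ideal([-60.0, -55.0], [-60.3, -55.4], err_tol=0.4)
```

The same `matches_ideal` check could be reused for the stability test by comparing the state reached after a perturbation with the originally recorded attractor.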

[0035] Step X2: Obtain raw sensor data, process the raw sensor data based on the encoding-decoding rules, and obtain control command parameters through the hybrid spiking neural network.

[0036] Step X2 is performed based on steps X21 to X23: Step X21: Obtain raw sensor data, convert the raw sensor data into a pulse firing time series based on the encoding-decoding rules, and load the pulse firing time series into the hybrid spiking neural network.

[0037] Specifically, the raw sensor data is acquired first; it is the real-time continuous measurement data collected by the sensors in the liquid cooling system at each sampling moment. The raw sensor data is then encoded using the encoding rules described in step X12 to obtain the pulse frequency of the liquid-cooled neuron at each sampling moment. The pulse firing time series is derived from these frequencies and input into the hybrid spiking neural network loaded with the fully connected weight matrix.

[0038] Step X22: Evolve the hybrid spiking neural network to obtain the steady-state membrane potential vector, and convert the steady-state membrane potential vector into control command parameters based on the encoding-decoding rules.

[0039] Specifically, after loading, the network state of the hybrid spiking neural network is obtained. It is a vector composed of the membrane potentials of all liquid-cooled neurons, and it evolves over time. When a pulse from liquid-cooled neuron j arrives at liquid-cooled neuron i at a sampling time, the pulse has an instantaneous effect on the membrane potential of liquid-cooled neuron i through the connection weight between the two. This effect can usually be modeled as a postsynaptic potential. A current injection model can be used, in which the arrival of the pulse injects a current pulse of fixed shape into the liquid-cooled neuron. The current pulse is expressed as:

$$I_i(t) = w_{ij}\, C \left[ \exp\left(-\frac{t - t_0}{\tau_d}\right) - \exp\left(-\frac{t - t_0}{\tau_r}\right) \right], \quad t > t_0$$

where $I_i(t)$ is the current pulse received by neuron $i$ at time $t$; $w_{ij}$ is the connection weight between liquid-cooled neuron $j$ and liquid-cooled neuron $i$; $C$ is a normalization constant used to adjust the amplitude scale of the postsynaptic potential; $t$ is any time and $t_0$ is the current time, with $t$ greater than $t_0$; and $\tau_d$ and $\tau_r$ are the time constants for synaptic conductance decay and rise, respectively. This step handles all arriving pulses in the network in parallel, transforming discrete pulses into continuous temporal effects on the membrane potentials of the liquid-cooled neurons.
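A fixed-shape current pulse with separate rise and decay time constants is commonly written as a difference of two exponentials, and the sketch below assumes that form. The kernel shape, parameter values, and function names are illustrative assumptions rather than the patent's exact expression.

```python
import math


def synaptic_current(t, t0, w_ij, c_norm=1.0, tau_d=0.010, tau_r=0.002):
    """Current injected into neuron i at time t by a pulse that arrived at t0 (seconds)."""
    if t <= t0:
        return 0.0  # no effect before the pulse arrives
    s = t - t0
    return w_ij * c_norm * (math.exp(-s / tau_d) - math.exp(-s / tau_r))


def peak_delay(tau_d=0.010, tau_r=0.002):
    """Delay after arrival at which the difference-of-exponentials kernel is largest."""
    return tau_d * tau_r / (tau_d - tau_r) * math.log(tau_d / tau_r)


delay = peak_delay()
at_peak = synaptic_current(delay, 0.0, w_ij=0.8)
before_peak = synaptic_current(delay / 2, 0.0, w_ij=0.8)
after_peak = synaptic_current(delay * 2, 0.0, w_ij=0.8)
```

Because the decay constant is larger than the rise constant, the current first rises quickly and then fades slowly, so a positive weight always yields a positive, transient injection.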

[0040] Between pulses and while under the influence of pulses, the membrane potential of each liquid-cooled neuron i follows its intrinsic dynamics, namely the integrate-and-fire model, whose differential equation is:

$$\tau_m \frac{dV_i(t)}{dt} = -V_i(t) + R \left[ I_i^{\Sigma}(t) + I_b \right]$$

where $\tau_m$ is the membrane time constant of the liquid-cooled neuron; $V_i(t)$ is the membrane potential of liquid-cooled neuron $i$ at time $t$; $R$ is the membrane resistance; $I_i^{\Sigma}(t)$ is the sum of all current pulses applied to liquid-cooled neuron $i$ at time $t$; and $I_b$ is a constant background current used to simulate the intrinsic activity of neurons. When $V_i(t)$ reaches or exceeds a preset membrane potential threshold, liquid-cooled neuron $i$ is considered to fire a pulse, which is transmitted to all liquid-cooled neurons connected to it. Its membrane potential is then immediately reset, and the neuron enters a brief refractory period.
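The membrane dynamics, threshold crossing, reset, and refractory period can be sketched with a forward-Euler simulation of a single neuron. This assumes a leaky integrate-and-fire form with the leak written as a plain decay of the potential, and it uses normalized units (threshold 1.0, reset 0.0) rather than millivolts; all parameter values are hypothetical.

```python
def simulate_lif(i_syn, i_b, tau_m=0.02, r_m=1.0, v_th=1.0, v_reset=0.0,
                 t_ref=0.002, dt=1e-4):
    """Forward-Euler integration of one neuron; i_syn holds the summed input current per step."""
    v = v_reset
    ref_len = round(t_ref / dt)   # refractory period expressed in time steps
    refractory_steps = 0
    spikes = []
    for step, current in enumerate(i_syn):
        if refractory_steps > 0:  # potential is held at reset while refractory
            refractory_steps -= 1
            continue
        v += dt / tau_m * (-v + r_m * (current + i_b))
        if v >= v_th:             # threshold reached: fire, reset, go refractory
            spikes.append(step * dt)
            v = v_reset
            refractory_steps = ref_len
    return spikes


no_input = [0.0] * 1000                      # 0.1 s of simulation, no synaptic input
spikes = simulate_lif(no_input, i_b=1.5)     # background current above threshold
silent = simulate_lif(no_input, i_b=0.5)     # background current below threshold
```

With a background current whose steady-state potential exceeds the threshold, the neuron fires regularly; below that level it stays silent until synaptic input pushes it over.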

[0041] All liquid-cooled neurons numerically integrate the above equations. Pulse sequences from the sensors are continuously input into the network and propagated through the fully connected weight matrix, so that the state vector of the entire network is updated cyclically and eventually settles into a local energy minimum, which corresponds to the attractor of a certain operating condition. Convergence can then be checked in the same way as in the verification of step X15. If the network has converged, the steady-state membrane potential vector is output and then processed with the decoding rules of the encoding-decoding rules to produce the control command parameters.

[0042] Step X23: Perform cyclic control of the liquid cooling system based on control command parameters.

[0043] Understandably, during operation, the liquid cooling system generates new control command parameters based on step X2, and uses these parameters to perform cyclical adaptive control of the actuators, thereby ensuring the continuity and adaptability of the control over the liquid cooling system.

[0044] Step X3: Construct a dynamic heterogeneous graph based on liquid-cooled neurons and obtain global reward data. Then, obtain local reward data through the dynamic heterogeneous graph and global reward data.

[0045] Step X3 is based on steps X31 to X33: Step X31: Obtain historical state data, construct an initial graph based on liquid-cooled neurons, and simultaneously obtain a dynamic heterogeneous graph based on the initial graph and a fusion strategy based on historical state data.

[0046] Step X31 is performed based on steps X311 to X314: Step X311: Set a fixed historical analysis window and obtain historical state data based on the fixed historical analysis window.

[0047] Step X312: Treat all liquid-cooled neurons as a set of nodes and obtain the set of edges. At the same time, assign static weights to the set of edges and construct an initial graph based on the set of edges, the set of nodes, and the static weights.

[0048] Step X313: Obtain the transfer entropy between the node set based on historical state data, and fuse the transfer entropy with the static weight to obtain the fused weight, where the transfer entropy is used to measure causal influence.

[0049] Step X314: Obtain the dynamic heterogeneous graph based on the fusion weights and the initial graph.

[0050] Specifically, a fixed historical analysis window is first set, for example, 1 hour. Within this window, the complete state records generated during the continuous operation of step X2 are collected; these records are the historical state data. For each liquid-cooled neuron, the state record is a time series. For sensor neurons, the pulse sequence produced by the encoding-decoding rule is recorded. For actuator neurons, the membrane potential sequence before decoding is recorded. The historical state data is essentially a matrix whose number of rows equals the total number of liquid-cooled neurons and whose number of columns equals the number of sampling points in the window. The liquid-cooled structure data and liquid-cooled device data associated with the liquid-cooled neurons are also retrieved at this stage.

[0051] Using all liquid-cooled neurons as nodes, an initial graph is constructed. The set of nodes in the initial graph is identical to the set of liquid-cooled neurons. The set of edges is predefined based on the liquid-cooled structure data. For example, if there is a direct relationship between node i and node j, an undirected edge is established between them. Each edge is also assigned a static weight, which ranges from 0 to 1 and reflects the tightness of the physical connection. For example, the edge weight between directly adjacent nodes can be set to 0.9, the weight between non-adjacent nodes in the same branch can be set to 0.5, and the weight between nodes with no obvious relationship can be set to 0.1.

[0052] For each pair of nodes i and j that may have mutual influence in the initial graph, the transfer entropy from node i to node j is calculated. The transfer entropy measures the information gain about the future state of node j that is obtained by introducing the historical state data of node i, given the historical state data of node j itself, thereby quantifying the causal influence from node i to node j.
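
The construction of the initial graph can be illustrated with a minimal Python sketch. The node names, the (branch, position) description of the topology, and the helper functions are hypothetical and introduced only for illustration; only the example weights 0.9, 0.5, and 0.1 come from the description above.

```python
from itertools import combinations

# Hypothetical topology: each node maps to (branch name, position within the branch).
NODES = {
    "pump_1": ("branch_a", 0),
    "cold_plate_1": ("branch_a", 1),
    "valve_1": ("branch_a", 2),
    "fan_1": ("branch_b", 0),
}


def static_weight(a, b):
    """Return the static weight of the undirected edge between nodes a and b."""
    branch_a, pos_a = NODES[a]
    branch_b, pos_b = NODES[b]
    if branch_a == branch_b and abs(pos_a - pos_b) == 1:
        return 0.9  # directly adjacent nodes
    if branch_a == branch_b:
        return 0.5  # same branch, not adjacent
    return 0.1  # no obvious relationship


def build_initial_graph(nodes):
    """Build the initial graph as a dict keyed by unordered node pairs."""
    return {
        frozenset((a, b)): static_weight(a, b)
        for a, b in combinations(sorted(nodes), 2)
    }


initial_graph = build_initial_graph(NODES)
```

Keying the edges by `frozenset` pairs keeps the graph undirected, which matches the initial graph; directed, time-varying weights only appear once the causal term is fused in.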

[0053] Understandably, given the state time series X (representing node i) and Y (representing node j), the transfer entropy from X to Y is defined as:
$$T_{X \to Y} = \sum p\left(y_{t+1},\, y_t^{(k)},\, x_t^{(l)}\right) \log \frac{p\left(y_{t+1} \mid y_t^{(k)},\, x_t^{(l)}\right)}{p\left(y_{t+1} \mid y_t^{(k)}\right)}$$
where $y_{t+1}$ is the state of Y at the future time t+1, $y_t^{(k)}$ is the historical state data of Y over the past k time steps, and $x_t^{(l)}$ is the historical state data of X over the past l time steps. p(*) and p(*|*) represent the joint probability and conditional probability distributions, respectively, obtained through statistical estimation from the historical state data, for example, using binning. The calculated $T_{X \to Y}$ is a non-negative scalar. A value greater than 0 indicates that the state of node i contributes to the prediction of the future state of node j, i.e., there is a causal relationship from i to j. The larger the value, the greater the causal influence. Next, $T_{X \to Y}$ is transformed into a dynamic part suitable for graph weights. After normalization, it is combined with the static weight to obtain the fused weight of each edge at time t:
$$w_{ij}(t) = \alpha\, w_{ij}^{\mathrm{static}} + (1 - \alpha)\, \tilde{T}_{i \to j}(t)$$
where $w_{ij}(t)$ is the fused weight of edge (i, j) at time t; $\alpha$ is the blending coefficient, with a value between 0 and 1, where a larger value indicates a greater reliance on static weights and a smaller value indicates a greater reliance on causal discovery; $w_{ij}^{\mathrm{static}}$ is the static weight that was set; and $\tilde{T}_{i \to j}(t)$ is the normalized transfer entropy.
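
As an illustration of the binning-based estimate and the weight fusion, the following Python sketch computes a plug-in transfer entropy with a history length of one step for both series and then blends it with a static weight. The history length of 1, the equal-width binning, the function names, and the blending coefficient name `alpha` are assumptions made for this sketch; the description does not fix them.

```python
import math
from collections import Counter


def discretize(series, bins=4):
    """Map a real-valued series to integer bin indices (equal-width binning)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in series]


def transfer_entropy(x, y, bins=4):
    """Plug-in estimate of the transfer entropy from x to y, in bits (history length 1)."""
    xs, ys = discretize(x, bins), discretize(y, bins)
    triples = Counter(zip(ys[1:], ys[:-1], xs[:-1]))  # (y_{t+1}, y_t, x_t)
    pairs_yx = Counter(zip(ys[:-1], xs[:-1]))         # (y_t, x_t)
    pairs_yy = Counter(zip(ys[1:], ys[:-1]))          # (y_{t+1}, y_t)
    singles = Counter(ys[:-1])                        # y_t
    n = len(ys) - 1
    te = 0.0
    for (y1, y0, x0), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_yx[(y0, x0)]
        p_given_self = pairs_yy[(y1, y0)] / singles[y0]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return max(te, 0.0)  # guard against tiny negative rounding errors


def fuse_weight(static_w, te_normalized, alpha=0.5):
    """Blend the static weight with the normalized transfer entropy."""
    return alpha * static_w + (1.0 - alpha) * te_normalized


# Example: y simply repeats x with a delay of one step, so x helps predict y.
x = [0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0,
     0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0]
y = [0.0] + x[:-1]
te_xy = transfer_entropy(x, y)
```

In practice the raw transfer entropy values would first be normalized to the range 0 to 1 (for example, by dividing by the largest value over all edges, which is one possible choice) before being passed to `fuse_weight`.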

[0054] This leads to the construction of a dynamic heterogeneous graph, where nodes are liquid-cooled neurons, edges represent the interaction relationships between nodes, and edge weights are time variables that integrate static and causal discovery.

[0055] Step X32: Obtain performance evaluation data, perform aggregation operations based on the performance evaluation data to obtain scalar data, and obtain global reward data based on the scalar data and the reward strategy.

[0056] Specifically, performance evaluation data is obtained, which refers to the performance-related indicators of the liquid cooling system during operation within a predefined and relatively long evaluation period, such as 24 hours. These indicators include energy efficiency indicators, such as the average power efficiency and total power consumption of the liquid cooling system; thermal performance indicators, such as the cumulative time when the detected temperature exceeds the upper limit and the maximum temperature; stability indicators, such as the frequency or amplitude of actuator movement; and reliability indicators, such as the number of fault alarms and health score.

[0057] The performance evaluation data within the evaluation period is aggregated to obtain a set of representative scalar values, i.e., scalar data, such as the average power consumption over the entire period. These scalar values are then normalized to eliminate the influence of different dimensions. The normalized scalar values are then fed into the global reward function for calculation, resulting in a global reward value. This global reward value is a scalar value representing the overall performance evaluation of the liquid cooling system over the past evaluation period. The design of the specific global reward function aims to transform the multi-objective optimization problem into a scalar optimization problem. Its general form can be expressed as a weighted sum or product of various performance indicators, and it usually includes a penalty term. The weights can be set according to the importance placed on different performance indicators. For example, if energy efficiency is prioritized while ensuring temperature control, a larger weight will be assigned to energy efficiency-related performance indicators.
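
One possible form of such a global reward function is sketched below in Python as a weighted sum of normalized indicators minus a penalty term. The indicator names, the bounds, the weights, and the choice of the fault alarm count as the penalty term are all hypothetical; the description only requires a weighted combination with a penalty.

```python
def normalize(value, lo, hi):
    """Min-max normalize a value to the range [0, 1], clamping out-of-range inputs."""
    if hi == lo:
        return 0.0
    return min(max((value - lo) / (hi - lo), 0.0), 1.0)


def global_reward(metrics, bounds, weights, penalty_weight=0.1):
    """Weighted sum of 'lower is better' indicators, minus a fault-alarm penalty."""
    score = 0.0
    for name, weight in weights.items():
        lo, hi = bounds[name]
        # A lower raw value gives a normalized score closer to 1.
        score += weight * (1.0 - normalize(metrics[name], lo, hi))
    penalty = penalty_weight * metrics.get("fault_alarms", 0)
    return score - penalty


# Hypothetical 24-hour aggregates; energy efficiency carries the largest weight.
bounds = {
    "avg_power_kw": (100.0, 200.0),
    "overtemp_minutes": (0.0, 60.0),
    "actuator_moves": (0.0, 500.0),
}
weights = {"avg_power_kw": 0.5, "overtemp_minutes": 0.3, "actuator_moves": 0.2}
best_case = {"avg_power_kw": 100.0, "overtemp_minutes": 0.0,
             "actuator_moves": 0.0, "fault_alarms": 0}
worse_case = dict(best_case, avg_power_kw=150.0, fault_alarms=2)
```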

[0058] Step X33: Perform cyclical reward diffusion in the dynamic heterogeneous graph based on global reward data, and obtain local reward data after the iteration is completed.

[0059] Step X33 is based on steps X331 to X333: Step X331: Add virtual performance nodes to the dynamic heterogeneous graph.

[0060] Step X332: Construct an initial reward vector based on virtual performance nodes, dynamic heterogeneous graphs, and global reward data.

[0061] Step X333: Based on the initial reward vector, the reward is diffused in the dynamic heterogeneous graph using a diffusion update rule to obtain local reward data.

[0062] Specifically, an initial reward vector is first constructed and initialized using a single-source injection method; for the M actual nodes of the dynamic heterogeneous graph, its dimension is M×1. A virtual performance node is then added to the dynamic heterogeneous graph. This node does not belong to the actual node set, but is connected to all actual nodes in the dynamic heterogeneous graph through directed edges. The connection weight can be set to a uniform value, that is, the edge weight from the virtual performance node to each actual node i is 1/M. The global reward data is assigned to the virtual performance node, while the initial reward of each actual node is set to the global reward value divided by M. With the virtual performance node included, the initial reward vector is an (M+1)×1 vector. Then, an iterative method is used to let the reward propagate through the dynamic heterogeneous graph over multiple rounds. In each round of iteration, each node distributes its current reward value to its neighboring nodes according to the strength of its influence on them, and also receives the rewards distributed by its neighboring nodes.

[0063] Understandably, let $r^{(k)}$ be the reward vector after the k-th iteration, whose element $r_i^{(k)}$ represents the cumulative reward value of node i after the k-th iteration. The diffusion update rule adopts a random walk model, with the specific formula as follows:
$$r^{(k+1)} = \beta\, \hat{W}^{\top} r^{(k)} + (1 - \beta)\, r^{(0)}$$
where $\hat{W}$ is the normalized fusion weight matrix of the dynamic heterogeneous graph and $\hat{W}^{\top}$ is its transpose. Since $\hat{W}_{ij}$ represents the causal influence from node i to node j, and during reward diffusion the reward should flow in the opposite direction of the causal influence, that is, from the result node to the cause node, the transposed matrix is required. $\beta$ is the diffusion attenuation coefficient, with a value between 0 and 1, used to control the proportion of reward that spreads along the dynamic heterogeneous graph in each iteration; the remaining portion $(1 - \beta)$ retains the initial reward. The closer $\beta$ is to 1, the farther the reward propagates across the dynamic heterogeneous graph, and the smoother the reward distribution among nodes. Conversely, the closer $\beta$ is to 0, the more the reward stays near its initial distribution. A typical value is 0.85. $r^{(0)}$ is the initial reward vector.

[0064] During the iterations, the change between successive reward vectors can be monitored, and the process terminates early when the change is less than a threshold; alternatively, the process can be terminated directly after a fixed number of k rounds. After the iteration ends, the final reward vector, i.e., the local reward data, is obtained. Its i-th element represents the reward share that node i is finally allocated after diffusion over the dynamic heterogeneous graph. Nodes with strong causal influence and close correlation with global performance will receive higher local reward data.
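
The iterative diffusion can be sketched in Python as a random walk with restart over the transposed, normalized weight matrix. This sketch makes several simplifying assumptions that are not stated in the description: the matrix is row-normalized, the virtual performance node is folded into the initial vector (each of the M actual nodes starts with the global reward divided by M), and the coefficient is named `beta`.

```python
def row_normalize(w):
    """Scale each row of a square weight matrix so that it sums to 1."""
    normalized = []
    for row in w:
        total = sum(row)
        normalized.append([v / total for v in row] if total > 0 else [0.0] * len(row))
    return normalized


def diffuse_rewards(w, global_reward, beta=0.85, tol=1e-9, max_iters=1000):
    """Iterate r <- beta * W_hat^T r + (1 - beta) * r0 until the change is below tol."""
    m = len(w)
    w_hat = row_normalize(w)
    r0 = [global_reward / m] * m  # reward injected uniformly via the virtual node
    r = r0[:]
    for _ in range(max_iters):
        r_next = [
            beta * sum(w_hat[i][j] * r[i] for i in range(m)) + (1.0 - beta) * r0[j]
            for j in range(m)
        ]
        change = max(abs(a - b) for a, b in zip(r_next, r))
        r = r_next
        if change < tol:  # early termination on convergence
            break
    return r


# Hypothetical fused weights for three nodes (entry [i][j]: influence of i on j).
fused = [
    [0.0, 0.8, 0.2],
    [0.1, 0.0, 0.9],
    [0.5, 0.5, 0.0],
]
local_rewards = diffuse_rewards(fused, global_reward=3.0)
```

With row normalization and no all-zero rows, each iteration preserves the total reward, so the local rewards always sum to the global reward; whether this conservation is intended is not stated in the description.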

[0065] Step X4: Obtain the local policy network and observation feature data respectively, and adjust the outer loop of the hybrid spiking neural network based on the local policy network and observation feature data.

[0066] Step X4 is based on steps X41 to X43: Step X41: Initialize the local policy network and acquire observation feature data. Input the observation feature data into the local policy network to acquire advantage data.

[0067] Specifically, after step X11, a local policy network is created independently for each liquid-cooled neuron (a node in the dynamic heterogeneous graph). A relatively simple feedforward neural network can be selected, such as an MLP with one or two hidden layers, where the number of neurons in each layer is small, such as 16 or 32. After creation, the local policy network can be initialized. The parameters of the local policy network use standard neural network initialization methods, such as small random number initialization. Each local policy network is bound to the corresponding liquid-cooled neuron. The input is the observed feature data and local reward data, and the output is the inner loop fine-tuning amount.
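
A per-node policy network of this size can be sketched in plain Python as follows. The class name, the tanh activation, the uniform initialization range, and the single scalar output are assumptions for illustration; the description only specifies a small MLP with one or two hidden layers and small random initial parameters.

```python
import math
import random


class LocalPolicy:
    """Tiny one-hidden-layer MLP mapping node observations to a fine-tuning amount."""

    def __init__(self, n_inputs, n_hidden=16, scale=0.1, seed=0):
        rng = random.Random(seed)
        # Small random initialization of the weights; biases start at zero.
        self.w1 = [[rng.uniform(-scale, scale) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-scale, scale) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, features):
        """Return the scalar output for one feature vector."""
        hidden = [
            math.tanh(sum(w * x for w, x in zip(row, features)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2


# One independent policy per liquid-cooled neuron (node names are hypothetical).
# Each input vector holds three observation features plus the local reward.
policies = {name: LocalPolicy(n_inputs=4, seed=idx)
            for idx, name in enumerate(["pump_1", "valve_1", "fan_1"])}
delta = policies["pump_1"].forward([0.2, -0.1, 0.05, 1.0])
```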

[0068] Acquire observation feature data, which represents the observation data generated during the historical steps. This data may include, for example, the pulse sequences of sensor neurons, the steady-state membrane potentials of actuator neurons, and the recent state characteristics of neighboring nodes in the dynamic heterogeneous graph, such as trends, statistical characteristics, and moving averages. Input the observation feature data and local reward data into the local policy network. Since the local reward data is a single reward obtained at the end of a cycle, evaluating the merits of each action in the historical steps requires an advantage function estimate for each time step, i.e., the advantage data. The advantage function measures how much better an action is than the average action under the given observation feature data. The single-step temporal difference error can be used as the advantage estimate, expressed as follows:
$$\hat{A}_t^{i} = r_i + \gamma\, V_i(o_{t+1}) - V_i(o_t)$$
where $\hat{A}_t^{i}$ is the advantage function estimate of node i at time step t; $r_i$ is the local reward data of node i; $\gamma$ is the discount factor, ranging from 0 to 1, which weighs the importance of current rewards against future rewards in periodic tasks and can be set to a value close to 1; and $V_i(o_t)$ is the state value function estimate of node i, i.e., the cumulative reward that node i is expected to obtain under the observation feature data $o_t$. Node i can maintain an independent value network to estimate this value, or the value can be obtained through methods such as Monte Carlo simulation.

[0069] Step X42: Update the local policy network based on the advantage data using the PPO algorithm to obtain the updated network parameters.

[0070] Next, the policy gradient theorem is applied, and the local policy network parameters are updated by minimizing a loss function. The loss function can be expressed in the form of Proximal Policy Optimization (PPO), specifically as follows:
$$L^{\mathrm{CLIP}}(\theta) = -\hat{\mathbb{E}}_t\left[\min\left(\rho_t(\theta)\,\hat{A}_t,\ \mathrm{clip}\left(\rho_t(\theta),\, 1 - \epsilon,\, 1 + \epsilon\right)\hat{A}_t\right)\right]$$
where $\hat{\mathbb{E}}_t$ is the expectation over time steps t, which can be obtained by averaging over samples; $\rho_t(\theta) = \pi_\theta(a_t \mid o_t) / \pi_{\theta_{\mathrm{old}}}(a_t \mid o_t)$ is the importance sampling ratio, defined as the ratio of the probability that the new policy outputs an action given the observations to the probability that the old policy outputs it; $\mathrm{clip}(\rho_t(\theta), 1 - \epsilon, 1 + \epsilon)$ clips the ratio to the range $[1 - \epsilon, 1 + \epsilon]$; and $\epsilon$ is a hyperparameter, for example, 0.2, which limits the step size of policy updates to ensure training stability. The gradient of the loss function $L^{\mathrm{CLIP}}(\theta)$ with respect to the network parameters $\theta$ gives the policy gradient, which is calculated using the backpropagation algorithm and then applied using an optimizer such as stochastic gradient descent or Adam. After each node completes a policy update locally, the updated network parameters are output.
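
The single-step temporal difference estimate can be written as a short Python sketch. It assumes, as a reading of the description, that the same local reward of the node is used at every time step of the cycle, and that `values` holds one state value estimate per step plus one bootstrap value for the final next state; the function name and the argument layout are hypothetical.

```python
def td_advantages(local_reward, values, gamma=0.99):
    """Single-step TD error per time step: reward + gamma * V(next) - V(current).

    values must contain T + 1 entries for T time steps; the last entry is the
    value estimate of the state reached after the final step.
    """
    if len(values) < 2:
        raise ValueError("need at least two value estimates")
    return [
        local_reward + gamma * values[t + 1] - values[t]
        for t in range(len(values) - 1)
    ]


# Two time steps; the value estimates would come from the node's value network.
advantages = td_advantages(2.0, [1.0, 1.0, 0.0], gamma=0.5)
```

A positive entry means the action at that step did better than the value estimate predicted, which is the signal the policy update then reinforces.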
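
The clipped PPO loss can be evaluated for a batch of samples as in the following Python sketch. It assumes the policy exposes log-probabilities of the sampled actions under the new and old parameters; the function name and argument layout are hypothetical, and gradient computation (backpropagation with SGD or Adam) is left to whatever framework hosts the local policy network.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, epsilon=0.2):
    """Negative clipped surrogate objective, averaged over the sampled time steps."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # importance sampling ratio
        clipped = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon)
        # Taking the minimum removes any incentive to move the ratio beyond the clip range.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


# When the new and old policies agree, the ratio is 1 and the loss is -mean(advantage).
loss_unchanged = ppo_clip_loss([0.0, 0.0], [0.0, 0.0], [1.0, 3.0])
# A ratio of 2 with a positive advantage is clipped to 1 + epsilon = 1.2.
loss_clipped = ppo_clip_loss([math.log(2.0)], [0.0], [1.0])
```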

[0071] Step X43 involves preprocessing the updated network parameters and then fine-tuning the hybrid spiking neural network based on these updated parameters.

[0072] For each node, the updated network parameters output by its local policy network are a scalar or vector, representing the adjustment amount of one or more adjustable parameters of the liquid-cooled neuron corresponding to that node in the hybrid spiking neural network. In this invention, the firing threshold of the liquid-cooled neuron in the hybrid spiking neural network is preferentially adjusted, as the firing threshold is the parameter that has the most direct and significant impact on the overall dynamics of the network.

[0073] Next, the updated network parameters are preprocessed. First, a first-order low-pass filter is applied to smooth the updated network parameters to eliminate high-frequency jitter. Then, the filtered updated network parameters are limited to a safe boundary to prevent abnormal functioning of the hybrid spiking neural network during subsequent parameter adjustments. An incremental update method is used to apply the updated network parameters to the firing threshold of the liquid-cooled neurons in the hybrid spiking neural network to fine-tune the hybrid spiking neural network. Since the liquid-cooled neurons in the hybrid spiking neural network run in parallel, all parameter fine-tuning updates should be completed synchronously within the same control cycle to avoid unexpected behavior in different cycles.
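
The preprocessing and synchronous threshold update can be illustrated with the Python sketch below. The filter coefficient, the per-cycle step limit, the threshold bounds, and all names are hypothetical values chosen for illustration; the description only requires first-order low-pass filtering, limiting to a safe boundary, and an incremental update applied to all neurons within the same control cycle.

```python
def low_pass(previous, raw, coeff=0.2):
    """First-order low-pass filter (exponential smoothing) of one adjustment signal."""
    return previous + coeff * (raw - previous)


def clamp(value, lo, hi):
    """Limit a value to the closed interval [lo, hi]."""
    return min(max(value, lo), hi)


def update_thresholds(thresholds, raw_deltas, filter_state,
                      coeff=0.2, max_step=0.05, bounds=(0.1, 2.0)):
    """Compute all new firing thresholds first, then return them together."""
    new_thresholds, new_state = {}, {}
    for node, theta in thresholds.items():
        filtered = low_pass(filter_state.get(node, 0.0), raw_deltas[node], coeff)
        step = clamp(filtered, -max_step, max_step)  # safe per-cycle increment
        new_state[node] = filtered
        new_thresholds[node] = clamp(theta + step, *bounds)  # incremental update
    return new_thresholds, new_state


thresholds = {"pump_1": 1.0, "valve_1": 1.0}
raw_deltas = {"pump_1": 1.0, "valve_1": -0.1}
thresholds_next, filter_state = update_thresholds(thresholds, raw_deltas, {})
```

Returning a complete new dictionary, rather than mutating thresholds one node at a time, is one simple way to make the swap atomic so that all neurons see the new values in the same control cycle.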

[0074] The function of this embodiment is explained below: In this embodiment, the adaptive operation and maintenance control method for a data center liquid cooling system combines an inner loop, based on reflex control by a hybrid spiking neural network, with an outer loop, based on a dynamic-graph reward diffusion mechanism, to control the liquid cooling system. The inner loop is responsible for real-time and rapid control of the liquid cooling system to ensure its instantaneous stability. The use of a hybrid spiking neural network avoids the complex iterative calculations required by traditional numerical optimization-based controllers, enabling the liquid cooling system to quickly suppress and regulate disturbances such as sudden load changes. The outer loop is responsible for long-term, slow optimization of the inner loop to improve the long-term performance and adaptability of the liquid cooling system. Thus, this method can improve the overall reliability and adaptability of the liquid cooling system during operation.

[0075] The present invention also provides an adaptive operation and maintenance control device for a data center liquid cooling system, comprising a processor and a memory. The memory and the processor are connected. The memory is used to store programs, instructions, or code, and the processor is used to execute the programs, instructions, or code in the memory to implement the adaptive operation and maintenance control method for the data center liquid cooling system described in the above embodiment.

[0076] The adaptive operation and maintenance control device for a data center liquid cooling system can be a general-purpose server or a special-purpose server; either can be used to implement the adaptive operation and maintenance control method for the data center liquid cooling system of this application. Although only one server is shown in this application for convenience, the functions described in this application can also be implemented in a distributed manner on multiple similar platforms to balance the load.

[0077] For example, an adaptive operation and maintenance control device for a data center liquid cooling system may include a network port connected to a network, one or more processors for executing program instructions, a communication bus, and different forms of storage media, such as disks, ROM, or RAM, or any combination thereof. Exemplarily, the adaptive operation and maintenance control device for a data center liquid cooling system may also include program instructions stored in ROM, RAM, or other types of non-transitory storage media, or any combination thereof. The methods of this application can be implemented according to these program instructions. The adaptive operation and maintenance control device for a data center liquid cooling system also includes an input / output (I / O) interface between the computer and other input / output devices.

[0078] For ease of explanation, only one processor is described in the adaptive operation and maintenance control device for data center liquid cooling systems. However, it should be noted that the adaptive operation and maintenance control device for data center liquid cooling systems in this application may also include multiple processors. Therefore, the steps performed by one processor as described in this application may also be performed jointly by multiple processors or individually. For example, if the processor of the adaptive operation and maintenance control device for data center liquid cooling systems performs steps A and B, it should be understood that steps A and B may also be performed jointly by two different processors or individually by one processor. For example, the first processor performs step A, the second processor performs step B, or the first processor and the second processor jointly perform steps A and B.

[0079] The above-described embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be included within the protection scope of the present invention.

Claims

1. An adaptive operation and maintenance control method for a data center liquid cooling system, characterized in that the method includes: acquiring liquid-cooled neurons and effective training data, constructing encoding-decoding rules using the liquid-cooled neurons, and acquiring a hybrid spiking neural network loaded with a fully connected weight matrix based on the encoding-decoding rules and the effective training data; acquiring raw sensor data, processing the raw sensor data based on the encoding-decoding rules, and obtaining control command parameters through the hybrid spiking neural network; constructing a dynamic heterogeneous graph based on the liquid-cooled neurons and obtaining global reward data, and obtaining local reward data through the dynamic heterogeneous graph and the global reward data; and acquiring local policy networks and observation feature data respectively, and performing outer-loop adjustment on the hybrid spiking neural network based on the local policy networks and the observation feature data.

2. The adaptive operation and maintenance control method for a data center liquid cooling system according to claim 1, characterized in that acquiring liquid-cooled neurons and effective training data, constructing encoding-decoding rules using the liquid-cooled neurons, and acquiring a hybrid spiking neural network loaded with a fully connected weight matrix based on the encoding-decoding rules and the effective training data includes: acquiring liquid cooling structure data and liquid cooling equipment data, and defining the liquid-cooled neurons based on the liquid cooling structure data and the liquid cooling equipment data; obtaining a preset parameter range, and constructing the encoding-decoding rules based on the preset parameter range and the liquid-cooled neurons, the encoding-decoding rules including decoding rules and encoding rules; obtaining the effective training data, transforming the effective training data based on the encoding-decoding rules, and obtaining multiple sets of transformed training data; obtaining a hybrid spiking neural network, and obtaining a fully connected weight matrix based on the hybrid spiking neural network and the multiple sets of transformed training data, using a weight update rule; and loading the fully connected weight matrix into the hybrid spiking neural network and validating the hybrid spiking neural network.

3. The adaptive operation and maintenance control method for a data center liquid cooling system according to claim 1, characterized in that acquiring raw sensor data, processing the raw sensor data based on the encoding-decoding rules, and obtaining control command parameters through the hybrid spiking neural network includes: acquiring the raw sensor data, converting the raw sensor data into a pulse emission time series based on the encoding-decoding rules, and loading the pulse emission time series into the hybrid spiking neural network; obtaining the steady-state membrane potential vector through evolution of the hybrid spiking neural network, and converting the steady-state membrane potential vector into control command parameters based on the encoding-decoding rules; and cyclically controlling the liquid cooling system based on the control command parameters.

4. The adaptive operation and maintenance control method for a data center liquid cooling system according to claim 1, characterized in that, A dynamic heterogeneous graph is constructed based on liquid-cooled neurons to obtain global reward data. Local reward data is then obtained through the dynamic heterogeneous graph and global reward data, including: Historical state data is acquired, an initial graph is constructed based on liquid-cooled neurons, and a dynamic heterogeneous graph is obtained based on the initial graph and a fusion strategy based on historical state data. Acquire performance evaluation data, perform aggregation operations based on the performance evaluation data to obtain scalar data, and obtain global reward data based on the scalar data and the reward strategy; Based on global reward data, a cyclical reward diffusion is performed in a dynamic heterogeneous graph, and local reward data is obtained after the iteration is completed.

5. The adaptive operation and maintenance control method for a data center liquid cooling system according to claim 4, characterized in that, Historical state data is acquired, an initial graph is constructed based on liquid-cooled neurons, and a dynamic heterogeneous graph is obtained based on the initial graph and a fusion strategy based on historical state data, including: Set a fixed historical analysis window and obtain historical status data based on the fixed historical analysis window; All liquid-cooled neurons are treated as a set of nodes and an edge set is obtained. Static weights are assigned to the edge set, and an initial graph is constructed based on the edge set, node set, and static weights. Based on historical state data, the transfer entropy between node sets is obtained, and the transfer entropy is fused with static weights to obtain fused weights, where the transfer entropy is used to measure causal influence. Dynamic heterogeneous graphs are obtained based on fusion weights and the initial graph.

6. The adaptive operation and maintenance control method for a data center liquid cooling system according to claim 4, characterized in that, Based on global reward data, a cyclical reward diffusion is performed in a dynamic heterogeneous graph. After the iteration is completed, local reward data is obtained, including: Add virtual performance nodes to the dynamic heterogeneous graph; An initial reward vector is constructed based on virtual performance nodes, dynamic heterogeneous graphs, and global reward data; Based on the initial reward vector, a diffusion update rule is used to diffuse the reward in the dynamic heterogeneous graph to obtain local reward data.

7. The adaptive operation and maintenance control method for a data center liquid cooling system according to claim 1, characterized in that acquiring local policy networks and observation feature data respectively, and performing outer-loop adjustment on the hybrid spiking neural network based on the local policy networks and the observation feature data includes: initializing the local policy network and acquiring the observation feature data, and inputting the observation feature data into the local policy network to acquire advantage data; updating the local policy network based on the advantage data using the PPO algorithm to obtain updated network parameters; and preprocessing the updated network parameters and fine-tuning the hybrid spiking neural network based on the updated network parameters.

8. An adaptive operation and maintenance control device for a data center liquid cooling system, characterized in that, The adaptive operation and maintenance control device for the data center liquid cooling system includes a processor and a memory. The memory and the processor are connected. The memory is used to store programs, instructions, or code. The processor is used to execute the programs, instructions, or code in the memory to implement the adaptive operation and maintenance control method for the data center liquid cooling system according to any one of claims 1-7.