Intelligent control method and system for electric dust removal based on deep learning
By constructing a high-fidelity virtual model using deep learning and exploring optimized control strategies in a virtual environment using deep reinforcement learning, the problem that intelligent control methods for electrostatic precipitators cannot adapt to dynamic changes in operating conditions has been solved. This has enabled adaptive optimization control, reduced costs and risks, and improved dust removal efficiency and energy efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- GUONENG JILIN LONGHUA THERMAL POWER CO LTD YANJI THERMAL POWER PLANT
- Filing Date
- 2026-02-06
- Publication Date
- 2026-06-30
AI Technical Summary
Existing intelligent control methods for electrostatic precipitators rely on historical best data and cannot adapt to dynamic changes in operating conditions. Furthermore, on-site trial and error carries high risks and costs, making it difficult to achieve adaptive and optimized control.
The intelligent control system for electrostatic precipitators, based on deep learning, includes a condition sensing module, a digital twin module, a strategy exploration module, and a control execution module. It utilizes deep neural networks to construct a high-fidelity virtual model, explores and optimizes control strategies in the virtual environment through deep reinforcement learning, and adjusts them in real time within the closed-loop control architecture, combined with a safety verification mechanism and online adaptive updates.
The system achieves adaptive optimization control of the electrostatic precipitator under different operating conditions, reduces testing costs and operational risks, improves the overall performance of dust removal efficiency and energy consumption ratio, and ensures the reliability and safety of the system.
Figure CN122298577A_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of electrostatic precipitator control technology, specifically relating to an intelligent control method and system for electrostatic precipitators based on deep learning.
Background Technology
[0002] In the field of industrial flue gas purification, electrostatic precipitators, as a highly efficient gas-solid separation device, are widely used in industries such as power, metallurgy, and building materials to control particulate matter emissions. They use a high-voltage electric field to charge dust particles and collect them at the collecting electrode, thereby achieving the purpose of purifying the flue gas.
[0003] Intelligent control of electrostatic precipitators is a key technological direction for improving their operational economy and reliability. This technology aims to dynamically adjust high-voltage power supply parameters according to real-time operating conditions to achieve an optimal balance between dust removal efficiency and energy consumption.
[0004] Existing technologies commonly employ supervised learning-based intelligent control methods. These methods essentially rely on training models using a large amount of historical optimal operating data, essentially imitating and reproducing past experience. However, electrostatic precipitators are complex systems with strong nonlinearity and large time lag. Their true optimal control setpoints dynamically drift with changes in flue gas conditions, meaning that historical data may not contain the global optimal solution for all possible operating conditions.
[0005] Furthermore, directly exploring and testing control strategies on actual electrostatic precipitator equipment not only incurs high testing costs but also carries operational risks, such as a sharp increase in energy consumption or damage to critical components due to improper operation. Therefore, how to achieve adaptive and optimized control of electrostatic precipitators under different operating conditions while ensuring equipment safety has become a pressing technical challenge in this field.
Summary of the Invention
[0006] The technical problem to be solved by the present invention is to overcome the shortcomings of existing intelligent control methods for electrostatic precipitators that rely on historical best data and cannot adapt to dynamic changes in working conditions, as well as the high risk and high cost of direct on-site trial and error, and to provide an intelligent control method and system for electrostatic precipitators based on deep learning.
[0007] The present invention provides an intelligent control system for electrostatic precipitators based on deep learning, comprising a condition sensing module, a digital twin module, a strategy exploration module, and a control execution module.
[0008] The operating condition sensing module is used to collect and process multi-source status data during the operation of the electrostatic precipitator in real time. The collected data includes primary voltage, primary current, secondary voltage, secondary current, flue gas temperature, flue gas humidity, flue gas flow rate, and inlet and outlet dust concentration.
[0009] The digital twin module receives real-time data from the operating condition sensing module and runs a high-fidelity virtual model of the electrostatic precipitator. This virtual model is built through a deep neural network and can accurately simulate the dynamic response of the actual electrostatic precipitator under given high-voltage power supply parameters, including the dust charging, migration, and collection processes, as well as the corresponding energy consumption and efficiency indicators.
[0010] The strategy exploration module is connected to the digital twin module. In the virtual environment provided by the digital twin module, it autonomously explores and optimizes the control strategy based on deep reinforcement learning algorithms. This process does not rely on historical best operation data, but rather seeks the high-voltage power supply parameter set point that achieves the optimal balance between dust removal efficiency and energy consumption under different operating conditions by maximizing the cumulative reward through the interaction between the agent and the virtual environment.
[0011] The control execution module receives the optimized control instructions output by the strategy exploration module and converts them into specific control signals to drive the high-voltage power supply equipment of the electrostatic precipitator to perform corresponding parameter adjustments.
[0012] Furthermore, the construction process of the high-fidelity virtual model in the digital twin module is as follows.
[0013] First, historical operating data covering a wide range of operating conditions of the electrostatic precipitator are collected, including all parameters collected by the aforementioned condition sensing module.
[0014] Secondly, a deep neural network model is constructed. Its input layer contains state variables representing the current operating conditions and high-voltage power supply parameters to be evaluated. The output layer predicts the key performance indicators of the electrostatic precipitator at multiple time steps in the future under the combination of state and parameters. These indicators include dust removal efficiency, unit energy consumption, and electric field stability parameters.
[0015] Then, the deep neural network is trained under supervision using the collected historical data. By minimizing the error between the model's predicted values and the actual measured values, the model is able to accurately capture the dynamic characteristics of the electrostatic precipitator.
[0016] Ultimately, the trained deep neural network constitutes the high-fidelity virtual model, which can replace the actual device in the policy exploration module to perform secure policy evaluation.
[0017] Furthermore, the operating mechanism of the deep reinforcement learning algorithm in the policy exploration module is as follows: The control problem of the electrostatic precipitator is modeled as a Markov decision process.
[0018] The state space is defined by multiple state variables collected in real time by the operating condition sensing module.
[0019] The action space is the adjustable range of high-voltage power supply parameters, such as the set value of the secondary voltage. The reward function is designed as a multi-objective trade-off function, the core of which is to encourage high dust removal efficiency while penalizing high energy consumption, and to impose additional penalties on unstable states of the electric field.
[0020] The agent is trained using a proximal policy optimization algorithm, which outputs the probability distribution of each action to be taken in the current state through a policy network and evaluates the value of the state through a value network.
[0021] The agent performs extensive trial and error iterations in the virtual environment provided by the digital twin module. It samples actions based on the policy network, receives rewards and the next state from the virtual environment, and uses this experience data to update the parameters of the policy network and the value network. Finally, it learns an optimized control strategy that can adapt to different operating conditions.
[0022] Furthermore, the policy exploration module also integrates an experience replay buffer. This buffer stores experience tuples of states, actions, rewards, and the next state generated by the agent during the exploration process. During training, the policy exploration module randomly samples a small batch of experience data from the experience replay buffer for updating network parameters.
[0023] This experience replay mechanism breaks the temporal correlation between data, improves the utilization rate of data samples, and helps stabilize the training process of deep reinforcement learning algorithms.
[0024] Furthermore, the control execution module includes a safety verification step before executing control commands. This step compares the optimized control parameters output by the strategy exploration module with preset safe operating boundaries.
[0025] If the optimized control parameters exceed the safety boundary, the control execution module will not execute the instruction and will adopt a preset conservative safety strategy for control. At the same time, it will provide a negative reward to the strategy exploration module to guide the agent to avoid unsafe control actions in future explorations.
[0026] Furthermore, the system operates within a closed-loop control architecture. The operating condition sensing module continuously monitors the actual operating status of the electrostatic precipitator.
[0027] Either periodically or when triggered by an event in which the operating conditions change significantly, the strategy exploration module recalculates the strategy in the digital twin environment based on the latest state and generates updated control instructions.
[0028] The control execution module is responsible for applying the new optimization instructions to the actual device.
[0029] This closed-loop process ensures that the control system can continuously track changes in operating conditions and maintain the optimal operation of the electrostatic precipitator in dynamic environments.
[0030] Furthermore, the digital twin module has an online adaptive update function. It calculates the model prediction error by comparing the key performance indicators predicted by the virtual model with the corresponding indicators measured by the actual electrostatic precipitator.
[0031] When the prediction error continues to exceed the preset threshold, the digital twin module will initiate the model update process, using recently accumulated actual operating data to incrementally train the deep neural network model, so that the virtual model can track the characteristic drift of the actual equipment caused by factors such as aging and dust accumulation, and maintain its simulation accuracy.
[0032] Compared with the prior art, the beneficial effects of the present invention are as follows: 1. This invention introduces a digital twin module to construct a high-fidelity virtual model, providing a safe, low-cost, and unlimited policy exploration environment for deep reinforcement learning agents. This completely avoids the high costs and operational risks associated with trial and error on real electrostatic precipitator equipment, enabling the agent to explore globally optimal control strategies without restraint, thus overcoming the bottleneck of traditional supervised learning methods that are limited by local optima in historical data.
[0033] 2. The deep reinforcement learning paradigm employed in this invention is based on the fact that the agent autonomously learns and optimizes strategies through interaction with the environment, rather than simply imitating historical data. This enables the system to proactively adapt to the strong nonlinearity, large hysteresis characteristics, and dynamic drift of operating conditions of electrostatic precipitators, discovering better control setpoints not recorded in historical data, thereby achieving true adaptive and optimized control and improving the overall performance of dust removal efficiency and energy consumption ratio.
[0034] 3. The closed-loop control architecture and safety verification mechanism designed in this invention jointly ensure the long-term reliability and safety of the system. The system can continuously respond to changes in operating conditions and dynamically adjust the control strategy. Meanwhile, the safety verification step acts as a robust defense, preventing the execution of any unsafe control commands that might lead to equipment failure or performance degradation, ensuring the robust application of the entire intelligent control system in complex industrial environments. The online adaptive update function of the digital twin module further ensures the consistency between the virtual model and the physical entity, providing a solid foundation for long-term accurate strategy optimization.
Attached Figure Description
[0035] Figure 1 is a schematic diagram of the overall technical architecture of the deep learning-based intelligent control system for electrostatic precipitators proposed in this invention.
Figure 2 is a schematic diagram illustrating the core principle framework of the digital twin module in this invention for constructing a high-fidelity virtual model.
Figure 3 is a flowchart illustrating the logical flow of the strategy exploration module in this invention, which optimizes control strategies based on deep reinforcement learning.
Figure 4 is a schematic diagram of the multi-level interaction relationship and data flow of the working condition perception, digital twin, strategy exploration and control execution modules in this invention.
Detailed Implementation
Referring to Figure 1, this embodiment details a deep learning-based intelligent control system for electrostatic precipitators. The system consists of a condition sensing module, a digital twin module, a strategy exploration module, and a control execution module. These modules are connected via a high-speed industrial network, forming a closed-loop control system encompassing real-time data acquisition, virtual simulation, intelligent decision-making, and precise execution.
[0036] The core of the system lies in using digital twin technology to construct a high-fidelity virtual model of an electrostatic precipitator, and in this safe environment, autonomously exploring the optimal control strategy based on deep reinforcement learning algorithms, thereby achieving efficient, energy-saving and stable control of the actual electrostatic precipitator.
[0037] The operating condition sensing module, as the system's data source, is responsible for comprehensively collecting multi-source status data during the operation of the electrostatic precipitator. This module integrates various high-precision sensors and data acquisition units, specifically including a high-voltage power supply parameter acquisition unit, a flue gas parameter acquisition unit, and a dust concentration detection unit.
[0038] The high-voltage power supply parameter acquisition unit is directly connected to the high-voltage power supply equipment of the electrostatic precipitator, and monitors the instantaneous values of primary voltage, primary current, secondary voltage and secondary current in real time. The sampling frequency is greater than 1000Hz to ensure that it can capture rapid fluctuations in electric field intensity.
[0039] The flue gas parameter acquisition unit is deployed in the inlet flue of the electrostatic precipitator and is equipped with a thermocouple temperature sensor, a capacitive humidity sensor and a differential pressure flow meter, which are used to continuously measure the flue gas temperature, flue gas humidity and flue gas flow rate, respectively.
[0040] The dust concentration detection unit is equipped with laser scattering dust concentration meters installed at the inlet and outlet pipes of the electrostatic precipitator to obtain real-time data on inlet and outlet dust concentrations.
[0041] All raw analog signals acquired by the sensors are amplified, filtered, and isolated by signal conditioning circuitry, and then converted into digital signals by a 16-bit analog-to-digital converter.
[0042] The working condition sensing module has an embedded data preprocessing submodule, which performs engineering unit conversion, range normalization, and bad value removal on the converted digital quantity.
[0043] For the instantaneously sampled data, the preprocessing submodule applies a moving average filter with a window width of 50 sampling points to suppress random noise interference.
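The preprocessing steps described above (range normalization, bad value removal, and moving average filtering over 50 sampling points) might be sketched as follows. This is only an illustrative sketch: the function and class names are hypothetical, the sensor range limits passed to `normalize` and `is_bad_value` are assumptions, and the filter averages over fewer samples until its window has filled.

```python
from collections import deque


def normalize(value, low, high):
    """Scale a reading to [0, 1] using an assumed sensor range [low, high]."""
    return (value - low) / (high - low)


def is_bad_value(value, low, high):
    """Treat NaN or out-of-range readings as bad values (assumed criterion)."""
    return value != value or not (low <= value <= high)


class MovingAverage:
    """Moving average over the most recent `window` samples."""

    def __init__(self, window=50):
        # A bounded deque discards the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def update(self, value):
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)
```

In use, each accepted sample would be passed to `update`, and the returned value would be the filtered reading placed into the outgoing data frame.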
[0044] The preprocessed multi-source state data is encapsulated into a unified data frame format. Each frame contains a timestamp, device identifier, and checksum of all state variables. It is periodically sent to the digital twin module through the industrial Ethernet interface, with a default sending period of 500ms.
[0045] The digital twin module receives real-time multi-source status data from the operating condition sensing module and runs a high-fidelity virtual model of the electrostatic precipitator. Referring to Figure 2, the core of this virtual model is a fully trained deep neural network model.
[0046] The process of building this high-fidelity virtual model involves four stages: data preparation, model building, supervised training, and validation and deployment.
[0047] During the data preparation phase, the system collects historical operating data covering a wide range of electrostatic precipitator operations. The data dimensions must include all parameters collected by the operating condition sensing module, namely primary voltage, primary current, secondary voltage, secondary current, flue gas temperature, flue gas humidity, flue gas flow rate, inlet dust concentration, and outlet dust concentration. The historical data should span at least 12 months to ensure coverage of operating conditions under different seasons, loads, and coal types.
[0048] The collected raw historical data must undergo a rigorous data cleaning process, including missing value imputation, outlier detection and correction, and data consistency checks.
[0049] During the model building phase, the deep neural network model adopts a recurrent neural network structure with long short-term memory units to capture the dynamic temporal characteristics of electrostatic precipitators.
[0050] The input layer of this network is designed as a multi-dimensional vector, with vector elements including state variables representing the current operating conditions and high-voltage power supply parameters to be evaluated.
[0051] The state variables are specifically the normalized primary voltage, primary current, flue gas temperature, flue gas humidity, flue gas flow rate, and inlet dust concentration. The high-voltage power supply parameters are the set values of the secondary voltage or the secondary current.
[0052] The output layer predicts the key performance indicators of the electrostatic precipitator at multiple time steps in the future under the input state and parameter combination. These indicators include dust removal efficiency, unit energy consumption, and electric field stability parameters.
[0053] The electric field stability parameter is quantified by the variance of the secondary voltage fluctuation.
[0054] The network hidden layer contains four long short-term memory layers, each with 128 neurons. Residual connections are used between layers to alleviate the gradient vanishing problem.
[0055] A fully connected layer follows the last long short-term memory layer and serves as the output layer, mapping the features output by the long short-term memory layers to specific performance metric predictions.
[0056] During the supervised training phase, the deep neural network is trained in a supervised manner using preprocessed historical data.
[0057] The training objective is to minimize the error between the model's predicted values and the actual measured values, and the loss function is the mean squared error function.
[0058] The optimizer uses the adaptive moment estimation (Adam) algorithm, with an initial learning rate of 0.001 and an exponential decay strategy in which the learning rate is reduced to 0.95 times its previous value every 10 training epochs.
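The learning rate schedule can be illustrated with a short sketch. It assumes a staircase reading of the decay, in which the learning rate is multiplied by 0.95 once every 10 epochs; the function name and parameter names are hypothetical.

```python
def learning_rate(epoch, initial_lr=0.001, decay=0.95, step=10):
    """Staircase exponential decay: multiply by `decay` once every `step` epochs.

    The staircase interpretation is an assumption; a continuous decay
    would use `epoch / step` as the exponent instead.
    """
    return initial_lr * decay ** (epoch // step)
```

Under this reading, epochs 0 to 9 use 0.001, epochs 10 to 19 use 0.00095, and so on.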
[0059] Early stopping is introduced during training to prevent overfitting. If the validation set loss does not decrease for five consecutive training epochs, training is terminated.
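The early stopping rule could be tracked with a small helper like the one below. This is a minimal sketch under the assumption that "no longer decreases" means the validation loss fails to fall below the best value seen so far; the class name is hypothetical.

```python
class EarlyStopping:
    """Signal a stop after `patience` consecutive checks without improvement."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best_loss = float("inf")
        self.stale_checks = 0

    def should_stop(self, val_loss):
        if val_loss < self.best_loss:
            # New best validation loss: reset the counter.
            self.best_loss = val_loss
            self.stale_checks = 0
        else:
            self.stale_checks += 1
        return self.stale_checks >= self.patience
```

A training loop would call `should_stop` once per validation pass and break out when it returns `True`.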
[0060] Ultimately, the trained deep neural network constitutes the high-fidelity virtual model.
[0061] The model is deployed on a high-performance graphics processor of a digital twin module for real-time inference, with an inference latency requirement of less than 10ms to ensure the real-time nature of policy exploration.
[0062] The strategy exploration module connects to the digital twin module. Within the virtual environment provided by the digital twin module, it autonomously explores and optimizes control strategies based on deep reinforcement learning algorithms. Referring to Figure 3, the core of the strategy exploration module is to model the control problem of the electrostatic precipitator as a Markov decision process. The state space of this process is defined by multiple state variables collected in real time by the operating condition sensing module, specifically including primary voltage, primary current, flue gas temperature, flue gas humidity, flue gas flow rate, and inlet dust concentration.
[0063] These state variables undergo the same normalization process as during the training of the digital twin model before being input into the policy exploration module.
[0064] The action space is the adjustable range of high-voltage power supply parameters, such as the set value of the secondary voltage. Its adjustment range is set according to the design parameters of the electrostatic precipitator itself, typically from 40kV to 80kV, with an action resolution set to 0.1kV. The reward function is designed as a multi-objective trade-off function, mathematically expressed as follows:
$$R_t = w_1 \eta_t - w_2 E_t - w_3 \sigma_t^2$$
where $R_t$ represents the reward received at time step $t$; $\eta_t$ represents the dust removal efficiency predicted at time step $t$, ranging from 0 to 1; $E_t$ represents the unit energy consumption predicted at time step $t$; $\sigma_t^2$ represents the variance of the secondary voltage fluctuation predicted at time step $t$; and $w_1$, $w_2$, and $w_3$ are weighting coefficients used to balance the importance of dust removal efficiency, energy consumption, and stability, with typical values of 100, 10, and 50, respectively.
[0065] The core of this reward function is to encourage high dust removal efficiency while penalizing high energy consumption, and to impose additional penalties on unstable states of the electric field.
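As an illustration of this trade-off, the sketch below assumes a linear weighted form: efficiency is rewarded, while unit energy consumption and secondary voltage variance are subtracted as penalties. The linear form, the function name, and the argument names are assumptions; only the typical weights of 100, 10, and 50 come from the description.

```python
def reward(efficiency, unit_energy, voltage_variance,
           w_eff=100.0, w_energy=10.0, w_stab=50.0):
    """Multi-objective reward, assuming a linear weighted trade-off.

    efficiency       -- predicted dust removal efficiency in [0, 1]
    unit_energy      -- predicted unit energy consumption
    voltage_variance -- predicted variance of the secondary voltage fluctuation
    """
    return w_eff * efficiency - w_energy * unit_energy - w_stab * voltage_variance
```

With these weights, a one-percentage-point gain in efficiency is worth the same as a 0.1 reduction in unit energy consumption, which shows how strongly the typical values favor collection efficiency.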
[0066] The agents in the policy exploration module are trained using a proximal policy optimization algorithm.
[0067] The algorithm consists of two core neural networks: a policy network and a value network.
[0068] The policy network takes the current state vector as input and outputs an action probability distribution, which defines the probability of choosing each available action in the current state.
[0069] The policy network employs a fully connected neural network with two hidden layers, each containing 256 neurons, and uses the rectified linear unit (ReLU) as the activation function. The output layer uses the Softmax function to transform the raw scores into a probability distribution.
[0070] The value network also takes the current state vector as input and outputs a scalar value to evaluate the long-term expected cumulative reward of the current state.
[0071] The value network structure is similar to that of the policy network, but its output layer uses a linear activation function. The agent performs extensive trial and error iterations in a virtual environment provided by the digital twin module.
[0072] In each iteration step, the agent samples an action, such as a specific secondary voltage setpoint, based on the probability distribution output by the policy network.
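The sampling step can be sketched as follows, assuming the action space is discretized into secondary voltage setpoints from 40 kV to 80 kV at 0.1 kV resolution (401 actions) and that the policy network emits one raw score per action. The raw scores here stand in for the policy network output; all names are hypothetical.

```python
import math
import random

V_MIN_KV, V_MAX_KV, STEP_KV = 40.0, 80.0, 0.1
# 40.0, 40.1, ..., 80.0 gives 401 discrete setpoints.
NUM_ACTIONS = round((V_MAX_KV - V_MIN_KV) / STEP_KV) + 1


def action_to_voltage(index):
    """Map a discrete action index to a secondary voltage setpoint in kV."""
    return round(V_MIN_KV + index * STEP_KV, 1)


def softmax(scores):
    """Numerically stable softmax over raw policy scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(scores, rng=random):
    """Sample an action index from the softmax distribution over `scores`."""
    probs = softmax(scores)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, action_to_voltage(index)
```

Passing a seeded `random.Random` instance as `rng` makes the sampling reproducible, which can be convenient when debugging exploration in the virtual environment.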
[0073] The action is input into the virtual model of the digital twin module. Based on the current state and the action, the virtual model simulates the dynamic response of the electrostatic precipitator and calculates the state and corresponding reward value for the next time step.
[0074] The agent receives this reward and the next state, forming a complete experience tuple.
[0075] The strategy exploration module integrates an experience replay buffer, which is typically set to a capacity of 100,000 experience tuples to store the experience data of the agent's state, actions, rewards, and next state generated during the exploration process.
[0076] During training, the policy exploration module randomly samples a small batch of experience data from the experience replay buffer, with the batch size set to 128.
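A buffer with the stated capacity and batch size might look like the sketch below. The class and field names are hypothetical, and uniform random sampling without replacement is an assumption about how the small batch is drawn.

```python
import random
from collections import deque, namedtuple

Experience = namedtuple("Experience", ["state", "action", "reward", "next_state"])


class ReplayBuffer:
    """Fixed-capacity store of experience tuples; oldest entries are evicted first."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.buffer.append(Experience(state, action, reward, next_state))

    def sample(self, batch_size=128):
        # Uniform sampling without replacement breaks temporal correlation.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

Training would normally wait until the buffer holds at least `batch_size` tuples before the first call to `sample`, since sampling more items than are stored raises a `ValueError`.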
[0077] Using the data obtained from these samples, the parameters of the policy network and the value network are updated simultaneously through the update rules of the proximal policy optimization algorithm.
[0078] The objective function of the proximal policy optimization algorithm aims to maximize the expected cumulative reward while ensuring that the policy update step size is not too large, thereby maintaining the stability of training.
[0079] Its objective function involves the probability ratio between the new and old policies, the estimate of the advantage function, and a clipping term.
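For a single sample, the standard clipped surrogate term of proximal policy optimization combines these three ingredients as sketched below. The clipping range of 0.2 is an assumption (a commonly used default); the description does not state a value, and the function name is hypothetical.

```python
def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """Per-sample clipped surrogate objective of proximal policy optimization.

    ratio     -- probability of the action under the new policy divided by
                 its probability under the old policy
    advantage -- estimated advantage of the action
    clip_eps  -- clipping range (assumed value, not given in the description)
    """
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum removes any incentive to move the ratio far from 1.
    return min(ratio * advantage, clipped_ratio * advantage)
```

The policy update maximizes the average of this term over the sampled batch, which is what keeps the update step from becoming too large.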
[0080] The advantage function is estimated using the generalized advantage estimation (GAE) algorithm, which combines the temporal-difference error with a decay factor, typically 0.95.
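The estimator described above can be written as a backward recursion over a trajectory. In this sketch the decay factor of 0.95 comes from the description, while the discount factor `gamma` of 0.99 is an assumption, as are the function and argument names.

```python
def generalized_advantage_estimates(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one trajectory.

    rewards -- rewards r_0 .. r_{T-1}
    values  -- value estimates V(s_0) .. V(s_T); the last entry bootstraps
               the value of the state after the final step
    gamma   -- discount factor (assumed; not given in the description)
    lam     -- decay factor, typically 0.95
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error at time t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

A decay factor near 1 weights long-horizon returns more heavily (lower bias, higher variance), while a value near 0 reduces the estimate to the one-step temporal-difference error.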
[0081] The update learning rate for the policy network is set to 0.0003, and the update learning rate for the value network is set to 0.001.
[0082] This exploration process in the virtual environment continues until the policy network converges, meaning the agent has learned an optimized control policy that can adapt to different operating conditions and achieve the best balance between dust collection efficiency, energy consumption, and stability under various states.
[0083] The control execution module receives the optimized control instructions output by the strategy exploration module and converts them into specific control signals to drive the high-voltage power supply equipment of the electrostatic precipitator to perform corresponding parameter adjustments.
[0084] The control execution module contains an instruction parsing submodule, a security verification submodule, and a signal driving submodule.
[0085] The instruction parsing submodule receives optimization control instructions from the strategy exploration module, which typically include target secondary voltage setpoints or target secondary current setpoints.
[0086] The parsing submodule verifies the integrity of the instruction format and extracts key control parameters.
[0087] The security verification submodule is a key component in ensuring the secure operation of the system.
[0088] This submodule maintains a preset safe operating boundary database, which stores the safe operating parameter ranges of the electrostatic precipitator under various operating conditions, such as the upper and lower limits of the secondary voltage and the maximum allowable current density.
[0089] The security verification submodule compares the optimized control parameters output by the strategy exploration module with the corresponding security operation boundaries in the database in real time.
[0090] The comparison process employs multiple verification logics, including range checks, gradient change rate checks, and compatibility checks with the current operating conditions. Verification passes if the optimized control parameters are entirely within the safety boundaries and the change rate is less than a preset threshold.
[0091] If any of the optimized control parameters exceeds the safety boundary, or if the parameter changes too drastically, the safety verification submodule determines that the instruction is an unsafe instruction.
[0092] For unsafe instructions, the control execution module will refuse to execute them and immediately activate the preset conservative security policy.
[0093] Conservative safety strategies typically employ rule-based control logic, such as maintaining previously safe control parameters or switching to a fixed, proven safe parameter setpoint.
[0094] At the same time, the control execution module feeds back a significant negative reward to the policy exploration module. This negative reward is far lower than the rewards obtained in normal exploration (for example, it is set to -100), so as to strongly guide the agent to avoid making similar unsafe control actions in future explorations.
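The verification and fallback logic of the preceding paragraphs might be sketched as follows. Only the range check and the change rate check are shown; the compatibility check with current operating conditions is omitted. The 40 kV to 80 kV bounds reuse the typical action range, while the maximum step of 2 kV per control cycle and all names are hypothetical.

```python
UNSAFE_PENALTY = -100.0  # example negative reward from the description


def verify_setpoint(target_kv, previous_kv,
                    lower_kv=40.0, upper_kv=80.0, max_step_kv=2.0):
    """Return True if the setpoint passes the range and change rate checks."""
    in_range = lower_kv <= target_kv <= upper_kv
    rate_ok = abs(target_kv - previous_kv) < max_step_kv
    return in_range and rate_ok


def execute(target_kv, previous_kv, fallback_kv=None):
    """Return (setpoint to apply, reward fed back to the agent).

    Unsafe instructions are refused; the conservative strategy either holds
    the previous safe setpoint or switches to a fixed fallback setpoint.
    """
    if verify_setpoint(target_kv, previous_kv):
        return target_kv, 0.0
    safe_kv = previous_kv if fallback_kv is None else fallback_kv
    return safe_kv, UNSAFE_PENALTY
```

For example, a request to jump from 60 kV to 70 kV in one cycle would be rejected by the change rate check, the previous 60 kV setpoint would be held, and the penalty would be returned to the agent.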
[0095] Once the command passes the security check, the signal drive submodule converts the digital control command into an analog control signal that the high-voltage power supply equipment can recognize.
[0096] This submodule typically includes a digital-to-analog converter and a power amplifier circuit, outputting a standard 4-20mA current signal or a 0-10V voltage signal, which is directly connected to the controller setpoint input port of the high-voltage power supply equipment.
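As a worked example of the 4-20mA output, the sketch below maps a secondary voltage setpoint linearly onto the current loop. The assumption that 40 kV corresponds to 4 mA and 80 kV to 20 mA is purely illustrative; the actual scaling depends on the controller of the high-voltage power supply, and the function name is hypothetical.

```python
def setpoint_to_milliamps(target_kv, lower_kv=40.0, upper_kv=80.0):
    """Linearly scale a setpoint in kV onto a 4-20 mA current loop signal.

    The endpoint mapping (lower_kv -> 4 mA, upper_kv -> 20 mA) is assumed.
    """
    if not lower_kv <= target_kv <= upper_kv:
        raise ValueError("setpoint outside the scaled range")
    fraction = (target_kv - lower_kv) / (upper_kv - lower_kv)
    return 4.0 + 16.0 * fraction
```

Under this assumed scaling, the 0.1 kV action resolution corresponds to a 0.04 mA change in the loop current, which indicates the resolution the digital-to-analog converter would need to provide.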
[0097] The execution cycle of the control execution module is synchronized with the data transmission cycle of the operating condition sensing module, with a default value of 500ms, to ensure the real-time performance of the control.
[0098] Referring to Figure 4, this system operates within a strictly closed-loop control architecture. The operating condition sensing module, acting as the system's sensory nerve endings, continuously and periodically monitors the actual operating status of the electrostatic precipitator and uninterruptedly transmits pre-processed multi-source status data streams to the digital twin module. Upon receiving the real-time data, the digital twin module updates the internal state of its virtual model to ensure consistency with the current state of the physical electrostatic precipitator. The strategy exploration module's activation triggering mechanism has two modes: periodic triggering and event triggering.
[0099] In the periodic trigger mode, the strategy exploration module performs a new round of strategy optimization calculation in the digital twin environment at fixed time intervals, such as every 5 minutes, based on the latest current state, and generates updated control instructions.
[0100] In event-triggered mode, when the operating condition perception module detects a significant change in key state variables, such as a sudden change in inlet dust concentration of more than 30% or a change in flue gas flow rate of more than 10% per minute, the strategy exploration module is immediately triggered to re-optimize.
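The two trigger modes could be checked as in the sketch below. It assumes that changes are measured relative to the previous reading and that the flue gas flow change is scaled to a per-minute rate from the elapsed time between readings; the function and parameter names are hypothetical.

```python
def periodic_due(seconds_since_last, period_s=300.0):
    """Periodic mode: re-optimize at fixed intervals (for example every 5 minutes)."""
    return seconds_since_last >= period_s


def event_triggered(prev_dust, curr_dust, prev_flow, curr_flow, elapsed_s,
                    dust_threshold=0.30, flow_threshold_per_min=0.10):
    """Event mode: detect a significant change in key state variables.

    Triggers when inlet dust concentration changes by more than 30%, or
    when flue gas flow rate changes faster than 10% per minute.
    """
    dust_change = abs(curr_dust - prev_dust) / prev_dust
    flow_change_per_min = (abs(curr_flow - prev_flow) / prev_flow) * (60.0 / elapsed_s)
    return dust_change > dust_threshold or flow_change_per_min > flow_threshold_per_min
```

A supervisory loop would call both checks on every incoming data frame and start a new round of strategy optimization when either returns `True`.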
[0101] This design ensures that the control system can respond quickly to drastic changes in operating conditions.
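The two trigger modes can be sketched as a single decision function. This is a minimal sketch under stated assumptions: the `Sample` record, the function name, and the way relative changes are computed between two samples are hypothetical, while the 5-minute period and the 30% and 10%-per-minute thresholds are the example values given above.

```python
from dataclasses import dataclass
from typing import Optional

PERIOD_S = 300.0         # periodic trigger: every 5 minutes
DUST_STEP_LIMIT = 0.30   # relative change in inlet dust concentration
FLOW_RATE_LIMIT = 0.10   # relative change in flue gas flow per minute


@dataclass
class Sample:
    t: float           # timestamp in seconds
    inlet_dust: float  # inlet dust concentration (assumed positive)
    flue_flow: float   # flue gas flow rate (assumed positive)


def should_reoptimize(prev: Sample, cur: Sample, last_run_t: float) -> Optional[str]:
    """Return "event", "periodic", or None for the current sample."""
    dt_min = (cur.t - prev.t) / 60.0
    if dt_min <= 0:
        raise ValueError("samples must be in increasing time order")
    dust_change = abs(cur.inlet_dust - prev.inlet_dust) / prev.inlet_dust
    flow_change_per_min = abs(cur.flue_flow - prev.flue_flow) / prev.flue_flow / dt_min
    # Event triggering takes precedence so that abrupt changes are handled at once.
    if dust_change > DUST_STEP_LIMIT or flow_change_per_min > FLOW_RATE_LIMIT:
        return "event"
    if cur.t - last_run_t >= PERIOD_S:
        return "periodic"
    return None
```

Comparing samples that are only one 500 ms cycle apart would make the per-minute flow criterion sensitive to sensor noise, so a practical version would probably compare against a smoothed or older reference sample.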
[0102] The control execution module is responsible for applying the new optimization instructions generated by the strategy exploration module to the high-voltage power supply equipment of the actual electrostatic precipitator after safety verification.
[0103] The entire closed-loop process repeats itself, ensuring that the control system can continuously track changes in operating conditions and maintain the optimal operation of the electrostatic precipitator in dynamic environments.
[0104] The digital twin module also features online adaptive updates to ensure that the virtual model can maintain high fidelity over a long period of time.
[0105] This functionality is implemented through a model prediction error monitoring submodule.
[0106] This submodule continuously compares the key performance indicators predicted by the virtual model with the corresponding indicators measured by the actual electrostatic precipitator through the operating condition sensing module. Specific indicators compared include dust removal efficiency and unit energy consumption.
[0107] The mean absolute percentage error is used as the metric for calculating the prediction error of the model.
[0108] When the prediction error exceeds the preset threshold for 10 consecutive control cycles (for example, with the threshold set to 3% for dust removal efficiency and 5% for unit energy consumption), the digital twin module starts the model update process.
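The monitoring rule can be sketched as follows. The class and function names are hypothetical, and two points are assumptions rather than statements of this embodiment: the update is triggered when either indicator exceeds its threshold, and the error is computed over the values available in each control cycle.

```python
from typing import Dict, Sequence


def mape(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Mean absolute percentage error in percent (measured values assumed non-zero)."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((m - p) / m) for p, m in zip(predicted, measured))
    return 100.0 * total / len(measured)


class DriftMonitor:
    """Counts consecutive control cycles in which the prediction error is too large."""

    def __init__(self, thresholds: Dict[str, float], patience: int = 10):
        self.thresholds = thresholds  # e.g. {"efficiency": 3.0, "energy": 5.0}
        self.patience = patience      # 10 consecutive cycles in the example above
        self.streak = 0

    def update(self, errors: Dict[str, float]) -> bool:
        """Record one cycle; return True when a model update should start."""
        exceeded = any(errors[name] > limit for name, limit in self.thresholds.items())
        # Any cycle within the thresholds resets the consecutive count.
        self.streak = self.streak + 1 if exceeded else 0
        return self.streak >= self.patience
```

A caller would typically reset `streak` to zero once the update process has been started, so that a single drift episode does not trigger repeated retraining.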
[0109] The update process first extracts recently accumulated actual operating data from the historical operating database, which usually refers to data within the past 24 hours.
[0110] Then, these new real-world operating data are used to incrementally train the existing deep neural network model.
[0111] Incremental training uses a small learning rate, such as 0.0001, to prevent catastrophic forgetting of existing knowledge. Training is typically short, lasting only a few epochs until the loss is verified to have converged on the new data.
[0112] Through this online adaptive update, the virtual model can track the characteristic drift of the actual equipment caused by factors such as aging, dust accumulation on the electrode plates, and changes in rapping efficiency, thereby maintaining its simulation accuracy over a long period of time and providing a continuous and reliable virtual environment for the strategy exploration module.
[0113] This embodiment provides another specific implementation of a deep learning-based intelligent control system for electrostatic precipitators. The core architecture is the same, but the reward function design of the strategy exploration module and the model structure of the digital twin module have been optimized to further improve the system's control performance and model generalization ability under extreme conditions.
[0114] In the strategy exploration module, the reward function incorporates considerations for the smoothness of control actions. The new reward function takes the following form: $R_t = w_1 \eta_t - w_2 E_t - w_3 S_t - w_4 \lVert a_t - a_{t-1} \rVert$; where $R_t$, $\eta_t$, $E_t$, $S_t$, $w_1$, $w_2$, and $w_3$ are defined as above; $a_t$ represents the action at the current time step, i.e., the high-voltage power supply parameter setting value; $a_{t-1}$ represents the action at the previous time step; and $w_4$ is the weighting coefficient for the action smoothness penalty, typically set to 2.
[0115] Adding a smoothness penalty term aims to avoid frequent and drastic fluctuations in control commands, reduce mechanical and electrical stress on high-voltage power supply equipment, and improve system stability and equipment lifespan. During the exploration process, the agent not only needs to learn efficient control strategies but also needs to learn to generate smooth sequences of control commands.
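A reward of this kind might be computed as in the sketch below. The function and parameter names are hypothetical, the weights in the usage lines are arbitrary example values, and the penalty is taken as the Euclidean distance between consecutive actions, which is an assumption about the exact form of the smoothness term.

```python
import math
from typing import Sequence


def smoothed_reward(efficiency: float, unit_energy: float, field_variance: float,
                    action: Sequence[float], prev_action: Sequence[float],
                    w1: float, w2: float, w3: float, w4: float = 2.0) -> float:
    """Efficiency reward minus energy, instability, and action-change penalties."""
    base = w1 * efficiency - w2 * unit_energy - w3 * field_variance
    # Assumed smoothness measure: Euclidean distance between consecutive actions.
    change = math.dist(action, prev_action)
    return base - w4 * change


# An unchanged action incurs no smoothness penalty; a large jump is penalized.
steady = smoothed_reward(0.99, 0.5, 0.1, [0.6], [0.6], w1=1.0, w2=1.0, w3=1.0)
jumpy = smoothed_reward(0.99, 0.5, 0.1, [0.9], [0.6], w1=1.0, w2=1.0, w3=1.0)
print(steady > jumpy)  # True
```

With the actions normalized to a common scale, the weight of 2 makes a full-range jump cost about as much as the whole efficiency term, which is what discourages abrupt setpoint changes.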
[0116] In constructing a high-fidelity virtual model for the digital twin module, this embodiment adopts a hybrid model structure that combines deep neural networks with a simplified mechanism model based on physical principles.
[0117] The deep neural network portion is primarily responsible for learning the complex nonlinear processes in electrostatic precipitators that are difficult to describe with precise mathematical models, such as the complex relationship among dust charging efficiency, electric field distribution, and flue gas conditions. This part of the network adopts an architecture combining convolutional neural networks and long short-term memory networks.
[0118] Convolutional layers are used to extract spatial correlation features from multi-channel sensor data, while long short-term memory layers are used to capture temporal dynamics.
[0119] Simplified mechanism models based on physical principles embed known and highly reliable physical laws, such as the formula for the migration velocity of dust particles in an electric field and the energy conservation equation.
[0120] The hybrid model operates as follows: first, the simplified mechanism model calculates a preliminary performance index prediction based on the input state and actions; then, the deep neural network part receives the same input together with the preliminary prediction of the simplified mechanism model, and outputs a residual correction term; the final system performance prediction is the sum of the prediction of the simplified mechanism model and the residual correction term output by the neural network.
[0121] This hybrid modeling approach can, to some extent, alleviate the over-reliance of purely data-driven models on the quantity and quality of training data. Especially under extreme working conditions that the training data do not cover, the introduction of physical mechanisms provides a degree of extrapolation and prediction capability, thereby enhancing the robustness of the model.
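The residual structure can be sketched as below. The Deutsch-Anderson equation, efficiency = 1 - exp(-w·A/Q), is used here as a stand-in for the simplified mechanism model; the relation w = k·V² for the migration velocity, the parameter k, the dictionary keys, and the callable standing in for the trained network are all illustrative assumptions.

```python
import math
from typing import Callable, Dict

State = Dict[str, float]


def mechanism_efficiency(state: State, voltage_kv: float, k: float = 2e-5) -> float:
    """Preliminary efficiency from the Deutsch-Anderson equation.

    Assumption: migration velocity w = k * V^2 (m/s), with k an adjustable
    parameter of the kind that would be fitted to historical data.
    """
    w = k * voltage_kv ** 2
    return 1.0 - math.exp(-w * state["plate_area_m2"] / state["gas_flow_m3s"])


def hybrid_efficiency(state: State, voltage_kv: float,
                      residual_net: Callable[[State, float, float], float]) -> float:
    """Mechanism prediction plus a learned residual correction."""
    prelim = mechanism_efficiency(state, voltage_kv)
    # The network sees the same input plus the preliminary prediction.
    correction = residual_net(state, voltage_kv, prelim)
    return prelim + correction


# A constant stub stands in for the trained CNN-LSTM residual network.
state = {"plate_area_m2": 6000.0, "gas_flow_m3s": 100.0}
print(hybrid_efficiency(state, 50.0, lambda s, v, p: -0.01))
```

Because the network only has to learn the correction, its output stays small wherever the physical formula is already accurate, which is one reason this structure tends to extrapolate more gracefully than a purely data-driven model.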
[0122] The training process of this hybrid model adopts a phased strategy: first, the adjustable parameters in the simplified mechanism model are pre-trained using historical data; then, the simplified mechanism model is fixed, and the deep neural network part is trained using the same historical data, with the goal of minimizing the error between the predicted value and the actual measured value of the final hybrid model.
[0123] In this embodiment, the experience replay buffer of the strategy exploration module introduces a prioritized experience replay mechanism.
[0124] Traditional uniform random sampling may overlook experience samples that have high learning value but occur infrequently.
[0125] The prioritized experience replay mechanism assigns a different sampling priority to each experience tuple based on the magnitude of its temporal-difference (TD) error.
[0126] A large TD error usually means that the agent has greater uncertainty in estimating the value of the state-action pair, so the tuple has higher learning value and a correspondingly higher probability of being sampled.
[0127] In the specific implementation, the sampling probability of each experience tuple is proportional to the absolute value of its TD error plus a small positive constant, ensuring that every experience tuple has a chance of being sampled. Simultaneously, to correct the bias introduced by non-uniform sampling, importance sampling weights are used to adjust the gradient during parameter updates.
[0128] This mechanism can accelerate the learning process of an agent, especially in the early stages of exploration, helping the agent learn from key experiences more quickly.
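A minimal list-based sketch of this sampling scheme is given below. The class name, the first-in, first-out eviction, the exponent `beta`, and the normalization of the weights by their maximum are assumptions borrowed from common practice rather than details stated in this embodiment.

```python
import random
from typing import Any, List, Tuple


class PrioritizedReplay:
    """Replay buffer sampling in proportion to |TD error| + eps."""

    def __init__(self, capacity: int, eps: float = 1e-3):
        self.capacity = capacity
        self.eps = eps  # small positive constant so every tuple can be sampled
        self.items: List[Any] = []
        self.priorities: List[float] = []

    def add(self, transition: Any, td_error: float) -> None:
        if len(self.items) >= self.capacity:
            # Evict the oldest tuple when full (assumed policy).
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(abs(td_error) + self.eps)

    def sample(self, batch_size: int, beta: float = 0.4,
               rng: Any = random) -> List[Tuple[Any, float]]:
        """Return (transition, importance_weight) pairs."""
        if not self.items:
            raise ValueError("buffer is empty")
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        n = len(self.items)
        idx = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance sampling weight (N * P(i))^-beta, scaled so the maximum is 1.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        scale = max(weights)
        return [(self.items[i], w / scale) for i, w in zip(idx, weights)]
```

Each returned weight would multiply the loss of its sample during the network update. A production buffer would normally use a sum tree so that sampling does not scan the whole list.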
[0129] The safety verification process of the control execution module is further strengthened in this embodiment by adding a forward-looking safety assessment based on model prediction. In addition to the static safety boundary check, the safety verification submodule also sends the control command output by the strategy exploration module to the digital twin module before execution, requesting the digital twin module to quickly simulate the operating trajectory of the electrostatic precipitator over a short future horizon, such as the next 30 seconds, based on the current state and the command.
[0130] The safety verification submodule analyzes key parameters in the simulated trajectory, such as secondary current and electric field strength, to determine whether there is a risk of exceeding limits or of severe oscillations.
[0131] If the forward-looking simulation predicts any unsafe trend, the safety verification submodule marks the instruction as a potential risk instruction even if it passes the static boundary check, and takes the same measures as when handling out-of-bounds instructions, namely, refusing to execute it and enabling the conservative safety strategy, while feeding back a negative reward to the strategy exploration module.
[0132] This safety verification mechanism, which combines static and dynamic evaluation, can prevent potential risks caused by the dynamic characteristics of the system.
[0133] In this embodiment, the online adaptive update function of the digital twin module is more tightly coupled with the training process of the strategy exploration module.
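The combined static and forward-looking check might look like the sketch below. The `simulate` callable is a hypothetical interface to the digital twin, the peak-to-peak swing test is an assumed way of detecting severe oscillation, and the penalty value of -1.0 is an arbitrary example.

```python
from typing import Callable, Dict, List, Tuple

Point = Dict[str, float]
Range = Tuple[float, float]


def static_check(command: Dict[str, float], bounds: Dict[str, Range]) -> bool:
    """Every commanded parameter must lie inside its preset safe boundary."""
    return all(lo <= command[name] <= hi for name, (lo, hi) in bounds.items())


def lookahead_check(trajectory: List[Point], limits: Dict[str, Range],
                    max_swing: Dict[str, float]) -> bool:
    """Reject trajectories that leave their limits or swing too widely."""
    for name, (lo, hi) in limits.items():
        values = [point[name] for point in trajectory]
        if min(values) < lo or max(values) > hi:
            return False
        # Assumed oscillation test: peak-to-peak swing over the horizon.
        if max(values) - min(values) > max_swing[name]:
            return False
    return True


def verify(command: Dict[str, float], state: Point,
           simulate: Callable[[Point, Dict[str, float], int], List[Point]],
           bounds: Dict[str, Range], limits: Dict[str, Range],
           max_swing: Dict[str, float], fallback: Dict[str, float],
           penalty: float = -1.0, horizon_s: float = 30.0,
           step_s: float = 0.5) -> Tuple[Dict[str, float], float]:
    """Return (command to execute, reward feedback for the agent)."""
    steps = int(horizon_s / step_s)
    if static_check(command, bounds):
        if lookahead_check(simulate(state, command, steps), limits, max_swing):
            return command, 0.0
    # Either check failed: use the conservative strategy and penalize the agent.
    return fallback, penalty
```

Running the static check first means the digital twin is only asked to simulate commands that are at least nominally within bounds.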
[0134] When the digital twin module initiates the model update process, it not only uses recent actual operating data to incrementally train the virtual model, but also notifies the strategy exploration module of the model update event.
[0135] After receiving a model update notification, the strategy exploration module can selectively clear some of the old experiences in its experience replay buffer, because these old experiences were generated with the virtual model before the update and are therefore less accurate.
[0136] The clearing strategy can adopt the first-in, first-out principle, or selectively clear experiences based on their timestamps and TD errors.
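Both clearing strategies can be sketched on a plain list of experiences. The `Experience` record and function names are hypothetical, and the selective rule shown (drop tuples generated before a cutoff time unless their TD error is still large) is only one possible reading of a timestamp-and-TD-error criterion.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Experience:
    timestamp: float       # time at which the tuple was generated
    td_error: float        # most recent TD error of the tuple
    transition: Tuple = () # (state, action, reward, next_state)


def purge_fifo(buffer: List[Experience], fraction: float) -> List[Experience]:
    """Drop the oldest `fraction` of the buffer (assumed to be in insertion order)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    drop = int(len(buffer) * fraction)
    return buffer[drop:]


def purge_selective(buffer: List[Experience], cutoff_time: float,
                    keep_td_above: float) -> List[Experience]:
    """Keep tuples newer than the cutoff, or older ones with a large TD error."""
    return [e for e in buffer
            if e.timestamp >= cutoff_time or abs(e.td_error) >= keep_td_above]
```

The opposite selective rule, discarding old tuples with large TD errors on the grounds that they disagree most with the updated model, is equally defensible; the choice would need to be validated on the actual system.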
[0137] Subsequently, the strategy exploration module can conduct additional exploration training for a period of time based on the updated, more accurate virtual model to quickly adapt to the new model dynamics and ensure that the control strategy is consistent with the latest device characteristics.
[0138] This collaborative update mechanism helps maintain the optimal overall performance of the entire intelligent control system during long-term operation.
[0139] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus.
[0140] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.
Claims
1. A deep learning-based intelligent control system for electrostatic precipitators, characterized in that it comprises: an operating condition sensing module, used to collect and process multi-source status data during the operation of the electrostatic precipitator in real time, the collected data including primary voltage, primary current, secondary voltage, secondary current, flue gas temperature, flue gas humidity, flue gas flow rate, and inlet and outlet dust concentrations; a digital twin module, which receives real-time data from the operating condition sensing module and runs a high-fidelity virtual model of the electrostatic precipitator, the virtual model being built through a deep neural network and able to accurately simulate the dynamic response of the actual electrostatic precipitator under given high-voltage power supply parameters, including the dust charging, migration, and collection processes, as well as the corresponding energy consumption and efficiency indicators; a strategy exploration module, connected to the digital twin module, which autonomously explores and optimizes control strategies based on deep reinforcement learning algorithms in the virtual environment provided by the digital twin module, wherein this process does not rely on historical best operation data, but instead seeks the high-voltage power supply parameter setpoint that achieves the optimal balance between dust removal efficiency and energy consumption under different operating conditions by maximizing cumulative rewards through the interaction between the agent and the virtual environment; and a control execution module, which receives the optimized control instructions output by the strategy exploration module and converts them into specific control signals to drive the high-voltage power supply equipment of the electrostatic precipitator to perform the corresponding parameter adjustments.
2. The intelligent control system for electrostatic precipitators based on deep learning according to claim 1, characterized in that the construction process of the high-fidelity virtual model in the digital twin module is as follows: historical operating data covering a wide range of operating conditions of the electrostatic precipitator are collected, including all parameters collected by the operating condition sensing module; a deep neural network model is constructed, whose input layer contains state variables representing the current operating conditions and the high-voltage power supply parameters to be evaluated, and whose output layer predicts the key performance indicators of the electrostatic precipitator at multiple future time steps under that combination of state and parameters, these indicators including dust removal efficiency, unit energy consumption, and electric field stability parameters; the deep neural network is trained in a supervised manner using the collected historical data, and by minimizing the error between the model's predicted values and the actual measured values, the model accurately captures the dynamic characteristics of the electrostatic precipitator; the trained deep neural network constitutes the high-fidelity virtual model.
3. The intelligent control system for electrostatic precipitators based on deep learning according to claim 1, characterized in that the operating mechanism of the deep reinforcement learning algorithm in the strategy exploration module is as follows: the control problem of the electrostatic precipitator is modeled as a Markov decision process; the state space is defined by multiple state variables collected in real time by the operating condition sensing module; the action space is the adjustable range of the high-voltage power supply parameters; the reward function is designed as a multi-objective trade-off function, the core of which is to encourage high dust removal efficiency while penalizing high energy consumption, and to impose additional penalties on unstable states of the electric field; the agent is trained using a proximal policy optimization algorithm, which outputs the probability distribution over the actions to be taken in the current state through a policy network and evaluates the value of the state through a value network; the agent performs numerous trial-and-error iterations in the virtual environment provided by the digital twin module, sampling actions based on the policy network, receiving rewards and the next state from the virtual environment, and using this experience data to update the parameters of the policy network and the value network.
4. The intelligent control system for electrostatic precipitators based on deep learning according to claim 3, characterized in that the strategy exploration module also integrates an experience replay buffer; this buffer is used to store the experience tuples of state, action, reward, and next state generated by the agent during the exploration process; during training, the strategy exploration module randomly samples a mini-batch of experience data from the experience replay buffer for updating the network parameters.
5. The intelligent control system for electrostatic precipitators based on deep learning according to claim 1, characterized in that the control execution module includes a safety verification step before executing control commands; this step compares the optimized control parameters output by the strategy exploration module with the preset safe operating boundary; if the optimized control parameters exceed the safe boundary, the control execution module does not execute the instruction and instead adopts the preset conservative safety strategy for control, while feeding back a negative reward to the strategy exploration module.
6. The intelligent control system for electrostatic precipitators based on deep learning according to claim 1, characterized in that the system operates in a closed-loop control architecture; the operating condition sensing module continuously monitors the operating status of the actual electrostatic precipitator; the strategy exploration module, either periodically or when triggered by significant changes in operating conditions, recalculates the strategy in the digital twin environment based on the latest status and generates updated control instructions; the control execution module is responsible for applying the new optimized instructions to the actual equipment.
7. The intelligent control system for electrostatic precipitators based on deep learning according to claim 1, characterized in that the digital twin module has an online adaptive update function; it calculates the model prediction error by comparing the key performance indicators predicted by the virtual model with the corresponding indicators measured on the actual electrostatic precipitator; when the prediction error continues to exceed the preset threshold, the digital twin module starts the model update process and uses the recently accumulated actual operating data to incrementally train the deep neural network model.
8. The intelligent control system for electrostatic precipitators based on deep learning according to claim 2, characterized in that the deep neural network model adopts a recurrent neural network structure with long short-term memory units; the input layer of the network is designed as a multi-dimensional vector, whose elements include state variables representing the current operating conditions and the high-voltage power supply parameters to be evaluated; the output layer predicts the key performance indicators of the electrostatic precipitator at multiple future time steps under the input state and parameter combination.
9. The intelligent control system for electrostatic precipitators based on deep learning according to claim 3, characterized in that the reward function is designed as follows: the reward value equals the dust removal efficiency multiplied by the first weighting coefficient, minus the unit energy consumption multiplied by the second weighting coefficient, minus the electric field stability parameter multiplied by the third weighting coefficient; wherein the electric field stability parameter is quantified by the fluctuation variance of the secondary voltage.
10. A deep learning-based intelligent control method for electrostatic precipitators, characterized in that intelligent control of the electrostatic precipitator is achieved using the deep learning-based intelligent control system for electrostatic precipitators according to any one of claims 1-9.