Wastewater treatment dosing control method and device based on latent state sequence deduction, equipment and medium
By using a latent state sequence deduction method, and leveraging state coding networks and state transition models, candidate dosing actions are generated and evaluated. This solves the lag and mismatch problems of existing dosing control systems in emergency situations, achieving proactive control and reagent optimization, and reducing operating costs and risks.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- XINTONG EMPOWERMENT (CHANGSHA) ARTIFICIAL INTELLIGENCE IND APPLICATION SYSTEM CO LTD
- Filing Date
- 2026-05-15
- Publication Date
- 2026-06-12
AI Technical Summary
Existing dosing control systems are slow to react to emergencies such as combined sewer overflows caused by rainfall or illegal nighttime industrial wastewater discharges, leading to process lag and mechanistic model mismatch. This makes it difficult to achieve effective proactive control and poses risks of reagent waste and sludge increase.
A wastewater treatment chemical dosing control method is constructed using a latent state sequence deduction approach, through state coding networks and state transition models. Candidate dosing actions are generated using hidden state vectors and policy networks, and dosing control actions are evaluated and optimized based on value networks, achieving proactive control by deduction before execution.
It enables multi-step sequence deduction and strategy evaluation in the latent state space, avoiding the response delay and mechanistic model mismatch of traditional feedback control, reducing the risk of reagent waste and sludge increase, optimizing reagent dosage, and lowering operating costs.
Figure CN122194935A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of automatic control technology, and in particular to a wastewater treatment chemical dosing control method, device, equipment and medium based on latent state sequence deduction.
Background Technology
[0002] Existing chemical dosing control systems often react slowly to sudden events such as combined sewer overflows caused by rainfall or illegal nighttime industrial wastewater discharges. Furthermore, after the dosage is adjusted, it takes a long time for the effect to appear on online instruments, by which time a large volume of effluent exceeding standards may already have been produced. Overcompensation may also waste chemicals and increase sludge production. This exposes two deep-seated problems. First, decision-making blind spots caused by process lag: time-series predictive models such as Long Short-Term Memory (LSTM) networks typically predict only the next-step water quality change, cannot simulate multi-step cumulative effects, and therefore struggle to overcome the decision-making disconnect caused by lag. Second, model mismatch caused by water quality fluctuations: mechanism-based predictive control relies on fixed coagulation kinetic parameters, which are significantly affected by water temperature, pH, and organic matter composition, so the model quickly becomes inaccurate when operating conditions change. Fuzzy control can handle nonlinearity, but its rule base has limited coverage, so it generalizes poorly to unseen shock loads. More importantly, these methods all model directly in the original observation space, which leads to frequent oscillations in the control strategy.
[0003] As can be seen from the above, how to achieve proactive control that deduces before executing, solve the response delay and mechanistic model mismatch of traditional feedback control, and realize intelligent wastewater treatment dosing control are problems that need to be solved in this field.
Summary of the Invention
[0004] In view of this, the purpose of this invention is to provide a wastewater treatment chemical dosing control method, device, equipment, and medium based on latent state sequence deduction, which achieves proactive control by deducing before executing, solves the response delay and mechanistic model mismatch of traditional feedback control, and realizes intelligent wastewater treatment dosing control. The specific solution is as follows:
In a first aspect, this application discloses a wastewater treatment dosing control method based on latent state sequence deduction, applied to a wastewater treatment process model; the wastewater treatment process model is constructed based on a state coding network and a state transition model; the method includes:
Obtaining the multidimensional operating parameters and historical dosage of the wastewater treatment process, and preprocessing the multidimensional operating parameters and historical dosage to obtain preprocessed data and preprocessed historical dosage;
Inputting the preprocessed data into a pre-trained state coding network, so that the state coding network maps the preprocessed data to a latent state space to obtain a hidden state vector that characterizes the reaction kinetics and water quality evolution state that cannot be directly observed during the wastewater treatment process;
Inputting the hidden state vector and the preprocessed historical dosage into the policy network to search and select candidate actions, so as to output the probability distribution of multiple dosage candidate actions, and sampling multiple times from the action probability distribution to generate multiple dosage candidate actions;
For each candidate dosing action, using the current hidden state vector as the initial state and the candidate dosing action as the control input, determining the dosing control action based on the candidate dosing action; based on the state transition model, modeling the environmental dynamics in the latent state space and predicting the state evolution under different dosing control actions to obtain the latent state evolution sequence corresponding to each dosing control action; wherein the state transition model is a sequence prediction model trained on historical trajectory data;
Inputting each of the latent state evolution sequences into a value network for evaluation, so as to output a cumulative reward score corresponding to each of the dosing control actions;
Ranking the dosing control actions based on the cumulative return scores, selecting the dosing control action with the best score as the target dosing control action, and using the target dosing control action to achieve wastewater treatment dosing control.
[0005] Optionally, the step of obtaining multidimensional operating parameters and historical dosage in the wastewater treatment process, and preprocessing the multidimensional operating parameters and historical dosage to obtain preprocessed data and preprocessed historical dosage includes: Obtaining multidimensional operating parameters and historical dosage data for the wastewater treatment process; the multidimensional operating parameters include total phosphorus concentration in the influent and key process sections, suspended solids concentration, pH, redox potential, influent and effluent flow rates, and current polyaluminum chloride dosage; Time-aligning the multidimensional operating parameters and historical dosage, removing outliers, and normalizing them to obtain preprocessed data and preprocessed historical dosage.
[0006] Optionally, the step of inputting the preprocessed data into a pre-trained state encoding network, so that the state encoding network maps the preprocessed data to a latent state space to obtain a hidden state vector for characterizing the reaction kinetics and water quality evolution state that cannot be directly observed during wastewater treatment, includes: The preprocessed data is input into a pre-trained state encoding network so that the preprocessed data can be mapped to a latent state space to obtain a hidden state vector in a preset dimension; the latent state space is used to support the time-by-time state evolution prediction of the state transition model.
[0007] Optionally, the step of inputting the hidden state vector and the preprocessed historical dosage into the policy network, searching and selecting candidate actions to output a probability distribution of multiple dosage candidate actions, and generating multiple dosage candidate actions based on the action probability distribution through multiple sampling, includes: The hidden state vector and the preprocessed historical dosage are input into the policy network so that the policy network can be used to construct candidate actions in the action space and output the probability distribution of multiple dosage candidate actions. Multiple sampling is performed from the probability distribution to generate multiple dosage candidate actions.
[0008] Optionally, the step of inputting each of the latent state evolution sequences into a value network for evaluation, to output a cumulative reward score corresponding to each of the dosing control actions, includes: Applying a preset reward function to the state at each moment in the latent state evolution sequence to obtain an immediate reward; the preset reward function takes into account the effluent total phosphorus level, the amplitude and stability of water quality fluctuations, the cumulative reagent consumption, and the potential risk of sludge increase at future moments. Using the value network to sum the immediate rewards over all moments, with or without discounting, to obtain the cumulative return score.
[0009] Optionally, the step of controlling wastewater treatment dosing using the target dosage control action includes: Generate dosing control instructions corresponding to the target dosing dosage control action; Using a control interface, the dosing control command is sent to the metering pump and the dosing actuator so that the metering pump and the dosing actuator can control the dosing of chemicals for wastewater treatment.
[0010] Optionally, after controlling the wastewater treatment dosing using the target dosage control action, the method further includes: According to the preset time interval, process feedback data after the completion of wastewater treatment chemical dosing control is obtained; The state coding network and dynamic evolution model are incrementally updated using the process feedback data.
[0011] In a second aspect, this application discloses a wastewater treatment dosing control device based on latent state sequence deduction, applied to a wastewater treatment process model; the wastewater treatment process model is constructed based on a state coding network and a state transition model; the device includes:
A preprocessing module, used to acquire multidimensional operating parameters and historical dosage in the wastewater treatment process, and to preprocess the multidimensional operating parameters and historical dosage to obtain preprocessed data and preprocessed historical dosage;
A hidden state vector determination module, used to input the preprocessed data into a pre-trained state coding network so that the state coding network maps the preprocessed data to a latent state space to obtain a hidden state vector that characterizes the reaction kinetics and water quality evolution state that cannot be directly observed during the wastewater treatment process;
A candidate action generation module, used to input the hidden state vector and the preprocessed historical dosage into the policy network, search and select candidate actions, output the probability distribution of multiple dosage candidate actions, and sample multiple times from the action probability distribution to generate multiple dosage candidate actions;
A state inference module, used to determine, for each candidate dosing action, the dosing control action based on the candidate dosing action, with the current hidden state vector as the initial state and the candidate dosing action as the control input; and, based on the state transition model, to model the environmental dynamics in the latent state space and predict the state evolution under different dosing control actions to obtain the latent state evolution sequence corresponding to each dosing control action; wherein the state transition model is a sequence prediction model trained on historical trajectory data;
A scoring calculation module, used to input each of the latent state evolution sequences into the value network for evaluation, so as to output the cumulative reward score corresponding to each of the dosing control actions;
A dosing control module, used to rank the dosing control actions based on the cumulative return scores, select the dosing control action with the best score as the target dosing control action, and use the target dosing control action to realize wastewater treatment dosing control.
[0012] In a third aspect, this application discloses an electronic device, including: a memory, used to store a computer program; and a processor, used to execute the computer program to implement the aforementioned wastewater treatment dosing control method based on latent state sequence deduction.
[0013] In a fourth aspect, this application discloses a computer storage medium for storing a computer program; wherein, when the computer program is executed by a processor, it implements the steps of the aforementioned wastewater treatment dosing control method based on latent state sequence deduction.
[0014] As can be seen, this application is applied to a wastewater treatment process model constructed based on a state coding network and a state transition model. The multidimensional operating parameters and historical dosage of the wastewater treatment process are preprocessed to eliminate differences in units and scale between sensors and to remove outliers and noise interference. The preprocessed data is input into a pre-trained state coding network, which maps it to a latent state space to obtain a hidden state vector that characterizes the reaction kinetics and water quality evolution state that cannot be directly observed in the wastewater treatment process. This achieves a dimensionality reduction representation of the process state that does not rely on inaccurate mechanistic equations, which solves the mismatch of traditional mechanistic models when operating conditions change and carries richer dynamic information. The hidden state vector and the preprocessed historical dosage are input into a policy network to search and select candidate actions and output the probability distribution of multiple dosage candidate actions; multiple dosage candidate actions are then sampled from this distribution, so that the deduction takes the continuity of current execution into account and avoids decision-making oscillations. For each candidate dosing action, the current hidden state vector is used as the initial state and the candidate dosing action as the control input, and the dosing control action is determined based on the candidate dosing action.
Based on the state transition model, environmental dynamics are modeled in the latent state space, and the state evolution under different dosing control actions is predicted to obtain the latent state evolution sequence corresponding to each dosing control action. This achieves proactive control by deducing first and executing afterwards, avoiding the risk of exceeding standards caused by trial and error in the real environment. Each latent state evolution sequence is input into the value network for evaluation, which outputs a cumulative reward score for each dosing control action. This achieves interpretable decision evaluation and avoids the subjectivity and inconsistency of human experience. Based on the cumulative reward scores, the dosing control actions are ranked, and the dosing control action with the best score is selected as the target dosing control action. The target dosing control action is used to achieve wastewater treatment dosing control, dynamically optimizing the dosage while ensuring stable effluent compliance and reducing operating costs.
[0015] This invention combines the reinforcement learning decision-making process with process control by introducing multi-step sequence deduction and policy evaluation mechanisms into the latent state space, thus realizing a control paradigm that differs from traditional feedback control and mechanistic modeling methods.
Attached Figure Description
[0016] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.
[0017] Figure 1 is a flowchart of a wastewater treatment chemical dosing control method based on latent state sequence deduction disclosed in this application;
Figure 2 is a specific flowchart for implementing chemical dosing control in wastewater treatment disclosed in this application;
Figure 3 is a schematic diagram of a wastewater treatment dosing control device based on latent state sequence deduction disclosed in this application;
Figure 4 is a structural diagram of an electronic device provided by this application.
Detailed Implementation
[0018] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0019] As shown in Figure 1, this invention discloses a wastewater treatment dosing control method based on latent state sequence deduction, applied to a wastewater treatment process model; the wastewater treatment process model is constructed based on a state coding network and a state transition model; the method specifically includes:
Step S11: Obtain multidimensional operating parameters and historical dosage in the wastewater treatment process, and preprocess the multidimensional operating parameters and historical dosage to obtain preprocessed data and preprocessed historical dosage.
[0020] In this embodiment, multidimensional operating parameters and historical dosages in the wastewater treatment process are obtained. The multidimensional operating parameters include total phosphorus concentration in the influent and key process sections, suspended solids concentration, pH, redox potential, influent and effluent flow rates, and current polyaluminum chloride dosage. The multidimensional operating parameters and historical dosages are time-aligned, cleaned of outliers, and normalized to obtain preprocessed data and preprocessed historical dosages.
[0021] Specifically, multidimensional operating parameters of the wastewater treatment process are collected periodically through existing online instruments and control systems, including but not limited to: TP (total phosphorus concentration in influent and key process sections), MLSS (suspended solids concentration), pH (acidity and alkalinity), ORP (oxidation-reduction potential), influent and effluent flow rates, and current PAC (polyaluminum chloride) dosage. The collected parameters are time-aligned, cleaned of outliers, and normalized to eliminate differences in units and scale between sensors and to suppress significant measurement noise, providing stable input for subsequent latent state encoding.
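The three preprocessing operations named above (time alignment, outlier removal, normalization) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names, the sample-and-hold alignment, the median-based outlier rule, and the fixed engineering limits used for scaling are not specified in the application and are chosen here for illustration.

```python
from statistics import median

def align(series, grid, max_gap):
    """For each grid timestamp, take the latest sample at or before it.

    series: list of (timestamp, value) pairs sorted by timestamp.
    Returns None where no sample exists or the latest one is older than max_gap.
    """
    out, i, last = [], 0, None
    for t in grid:
        while i < len(series) and series[i][0] <= t:
            last = series[i]
            i += 1
        out.append(last[1] if last and t - last[0] <= max_gap else None)
    return out

def remove_outliers(values, k=3.0):
    """Replace missing points and points more than k robust deviations
    from the median with the median (assumed rule, not from the source)."""
    valid = [v for v in values if v is not None]
    med = median(valid)
    mad = median(abs(v - med) for v in valid) or 1e-9  # avoid division by zero
    return [med if v is None or abs(v - med) / (1.4826 * mad) > k else v
            for v in values]

def normalize(values, lo, hi):
    """Min-max scale to [0, 1] using fixed limits, clipping at the bounds."""
    return [min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in values]

# Hypothetical pH readings sampled every 60 s, with one spurious spike.
ph = [(0, 7.0), (60, 7.1), (120, 7.2), (180, 14.0), (240, 7.1)]
cleaned = remove_outliers(align(ph, [0, 60, 120, 180, 240], max_gap=90))
scaled = normalize(cleaned, lo=0.0, hi=14.0)
```

Each sensor channel and the historical dosage would pass through the same three stages before being stacked into the input of the state coding network.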
[0022] During online system operation, the intelligent dosing control strategy and the traditional control method are configured in parallel. When abnormal sensor data, communication interruption, model output outside the safe range, or other abnormal situations are detected, the system automatically stops the control output of the strategy model and switches to the preset manual control or traditional control mode to ensure the continuous and stable operation of the wastewater treatment system. After the abnormality is resolved, the intelligent control strategy can be reactivated with manual confirmation. By providing this fallback mechanism, this invention ensures operational safety and reliability in practical engineering applications.
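The fallback behaviour described above can be illustrated with a small supervisor that latches into the traditional mode on any fault and returns to the model only after manual confirmation. The class and field names below are hypothetical, and the fault checks are reduced to two flags and a range test for brevity.

```python
from dataclasses import dataclass

@dataclass
class Health:
    sensors_ok: bool   # False when abnormal sensor data is detected
    comms_ok: bool     # False on communication interruption

class DosingSupervisor:
    """Chooses between the model output and a traditional controller output."""

    def __init__(self, dose_min, dose_max):
        self.dose_min, self.dose_max = dose_min, dose_max
        self.latched = False  # True once a fault has forced the fallback

    def select(self, model_dose, fallback_dose, health):
        fault = (not health.sensors_ok or not health.comms_ok
                 or model_dose is None
                 or not self.dose_min <= model_dose <= self.dose_max)
        if fault:
            self.latched = True
        # Stay on the fallback until an operator confirms recovery.
        return ("fallback", fallback_dose) if self.latched else ("model", model_dose)

    def confirm_recovery(self):
        """Manual confirmation after the abnormality has been resolved."""
        self.latched = False
```

Latching rather than switching back automatically mirrors the requirement that reactivation needs manual confirmation.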
[0023] Step S12: Input the preprocessed data into a pre-trained state coding network so that the state coding network maps the preprocessed data to a latent state space to obtain a hidden state vector that characterizes the reaction kinetics and water quality evolution state that cannot be directly observed during wastewater treatment.
[0024] In this embodiment, the preprocessed data is input into a pre-trained state encoding network so that the preprocessed data can be mapped to a latent state space using the state encoding network to obtain a hidden state vector in a preset dimension; the latent state space is used to support the time-by-time state evolution prediction of the state transition model.
[0025] Specifically, the preprocessed data is input into a pre-trained state encoding network, mapping the high-dimensional, noisy original observations into a high-dimensional hidden state vector. This hidden state vector possesses the following characteristics: it can automatically mitigate the impact of short-term random fluctuations and sensor noise; it focuses on characterizing the intrinsic dynamic relationship between water quality change trends and reagent responses; and its dimensionality is significantly higher than the original observation space, facilitating multi-step deduction of potential connections from a higher-dimensional space. Through this step, complex process systems can be transformed into a computable and predictable latent state space.
[0026] The latent state space proposed in this application is essentially a highly robust implicit representation. By compressing multidimensional process parameters into a hidden state vector through an encoding network, this representation has the following advantages:
Strong anti-interference capability: it automatically filters out sensor noise and short-term fluctuations, focusing on the long-term trend of water quality evolution;
Compact representation: compared with traditional discretized state partitioning (such as the membership functions of fuzzy control), continuous implicit vectors carry richer dynamic information with fewer dimensions, reducing the computational complexity of inference;
Process adaptability: it implicitly learns the nonlinear coupling relationship between "water quality-chemicals-flocculation effect", without relying on inaccurate mechanistic equations.
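To make the encoding step concrete, the sketch below folds a window of normalized observation vectors into one hidden state vector. It is a toy stand-in, not the trained state coding network: the leaky recurrent update, the random weights, and the dimensions (7 observed channels, 16 latent dimensions) are all assumptions chosen for illustration.

```python
import math
import random

def make_encoder(obs_dim, latent_dim, seed=0):
    """Build a toy recurrent encoder; random weights stand in for trained ones."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.5) for _ in range(obs_dim)]
               for _ in range(latent_dim)]

    def encode(window, leak=0.3):
        """Fold a window of observation vectors into one hidden state vector."""
        h = [0.0] * latent_dim
        for x in window:
            target = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                      for row in weights]
            # Leaky update: each new sample moves h only part of the way,
            # so a single noisy reading has limited influence.
            h = [(1.0 - leak) * hi + leak * ti for hi, ti in zip(h, target)]
        return h

    return encode

encode = make_encoder(obs_dim=7, latent_dim=16)
z = encode([[0.5] * 7 for _ in range(12)])  # 12 time steps of 7 channels
```

In a real deployment the weights would come from training on historical plant data, and the network structure could differ entirely from this sketch.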
[0027] Step S13: Input the hidden state vector and the preprocessed historical dosage into the policy network, search and select candidate actions to output the probability distribution of multiple dosage candidate actions, and perform multiple sampling based on the action probability distribution to generate multiple dosage candidate actions.
[0028] In this embodiment, the multiple dosage candidate actions are determined as follows: the hidden state vector and the preprocessed historical dosage are input into the policy network, which constructs candidate actions in the action space and outputs the probability distribution of multiple dosage candidate actions. Multiple dosage candidate actions are then generated by sampling repeatedly from this probability distribution.
[0029] The purpose of this step is to conduct parallel virtual evaluations of different dosing decisions under the same operating conditions, thereby obtaining candidate dosing actions.
[0030] The policy network adopts a parameterized policy model based on deep neural networks. Its inputs are hidden state vectors and historical action information, and its outputs are action distribution parameters.
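As an illustration of sampling candidates from the action distribution, the sketch below assumes the policy network outputs the mean and standard deviation of a Gaussian over the PAC dose. The Gaussian form, the pump limits, and the per-step rate limit around the previous dose are assumptions made here; the application only states that distribution parameters are output and sampled from.

```python
import random

def sample_candidates(mean, std, n, dose_min, dose_max,
                      prev_dose, max_step, seed=None):
    """Draw n doses from N(mean, std), clip them to the pump range and to a
    band around the previous dose, and return the distinct values sorted."""
    rng = random.Random(seed)
    lo = max(dose_min, prev_dose - max_step)  # assumed rate limit
    hi = min(dose_max, prev_dose + max_step)
    draws = {round(min(hi, max(lo, rng.gauss(mean, std))), 2) for _ in range(n)}
    return sorted(draws)

# Hypothetical numbers: the policy suggests about 22 mg/L, last dose was 20 mg/L.
candidates = sample_candidates(mean=22.0, std=3.0, n=8, dose_min=0.0,
                               dose_max=50.0, prev_dose=20.0, max_step=5.0,
                               seed=1)
```

Feeding the previous dose into the sampling band is one simple way to keep candidates close to the current operating point, which is in the spirit of using historical dosage to avoid decision oscillations.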
[0031] Step S14: For each candidate dosing action, with the current hidden state vector as the initial state and the candidate dosing action as the control input, determine the dosing control action based on the candidate dosing action; based on the state transition model, model the environmental dynamics in the latent state space and predict the state evolution under different dosing control actions to obtain the latent state evolution sequence corresponding to each dosing control action; wherein the state transition model is a sequence prediction model trained on historical trajectory data.
[0032] Specifically, for each candidate dosing action, the pre-trained state transition model is invoked to model the environmental dynamics in the latent state space and predict the state evolution under different dosing control actions.
[0033] This process can predict key indicators such as TP and MLSS when the flocculation reaction is completed after chemical dosing without interacting with the actual water treatment system, thus achieving advanced decision-making through "simulation first, execution later".
[0034] This process has the following characteristics:
Each simulation step corresponds to a time span in the actual process, and the multiple simulation steps together cover the overall lag of coagulation reaction, sedimentation separation, and detection feedback;
The simulation is performed only within the latent state space and does not require interaction with the actual wastewater treatment system;
The dynamics model can gradually correct short-term errors during the simulation, maintaining the stability of long-term predictions.
[0035] This step yields the latent state evolution sequences corresponding to different dosing control actions over a future period of time.
[0036] The state transition model may be constructed using a recurrent neural network, a temporal convolutional network, or an attention-based sequence modeling structure. Different implementations may use different models to approximate the nonlinear dynamic relationship between dosing behavior and water quality evolution.
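The multi-step deduction of Step S14 amounts to applying the transition model repeatedly from the current hidden state. The sketch below shows that loop with a hand-written two-dimensional transition function standing in for the learned model; `toy_transition`, its coefficients, and the choice to hold the candidate dose constant over the horizon are assumptions for illustration only.

```python
def rollout(z0, dose, transition, horizon):
    """Apply the transition model `horizon` times, holding the dose constant,
    and return the resulting sequence of latent states."""
    z, sequence = list(z0), []
    for _ in range(horizon):
        z = transition(z, dose)
        sequence.append(z)
    return sequence

def toy_transition(z, dose):
    """Stand-in dynamics on z = [tp_proxy, sludge_proxy] (not a trained model).

    The TP proxy relaxes toward a level that falls as the dose rises;
    the sludge proxy grows in proportion to the dose.
    """
    tp, sludge = z
    tp_next = tp + 0.2 * (0.8 / (1.0 + 0.05 * dose) - tp)
    return [tp_next, sludge + 0.01 * dose]

# Six simulated steps for a hypothetical candidate dose of 20 mg/L.
sequence = rollout([0.8, 0.0], 20.0, toy_transition, horizon=6)
```

Because the loop touches only the latent state and the model, every candidate can be evaluated without sending anything to the plant.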
[0037] Step S15: Input each of the latent state evolution sequences into the value network for evaluation, so as to output the cumulative reward score corresponding to each of the dosing control actions.
[0038] In this embodiment, a preset reward function is applied to the state at each moment in the latent state evolution sequence to obtain an immediate reward. The preset reward function takes into account the effluent total phosphorus level, the amplitude and stability of water quality fluctuations, the cumulative reagent consumption, and the potential risk of sludge increase at future moments. The value network then sums the immediate rewards over all moments, with or without discounting, to obtain the cumulative reward score.
[0039] The value network is used to estimate the expected cumulative return of the corresponding control strategy over a future time horizon.
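A minimal sketch of the scoring in Step S15 is given below, with an explicit discounted sum standing in for the learned value network. The four penalty terms follow the factors listed for the preset reward function, but the weights, the TP limit of 0.5, and the discount factor are illustrative assumptions, not values from the application.

```python
def immediate_reward(tp, tp_prev, dose, sludge_gain, tp_limit=0.5,
                     w_exceed=10.0, w_fluct=1.0, w_dose=0.02, w_sludge=0.5):
    """Negative weighted cost of one simulated step (all weights assumed)."""
    exceed = max(0.0, tp - tp_limit)   # effluent TP above the limit
    fluct = abs(tp - tp_prev)          # step-to-step water quality fluctuation
    return -(w_exceed * exceed + w_fluct * fluct
             + w_dose * dose + w_sludge * sludge_gain)

def discounted_return(rewards, gamma=0.95):
    """Sum r_0 + gamma * r_1 + gamma^2 * r_2 + ... computed back to front."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total

# With gamma = 1.0 the same function gives the plain (undiscounted) sum.
score = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
```

A trained value network would replace or extend this explicit sum, for example by estimating the return beyond the simulated horizon.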
[0040] Step S16: Based on the cumulative return scores, sort the dosing control actions and select the dosing control action with the best score as the target dosing control action, and use the target dosing control action to realize wastewater treatment dosing control.
[0041] In this embodiment, a dosing control command corresponding to the target dosing amount control action is generated; the dosing control command is sent to the metering pump and the dosing execution device through the control interface, so that the metering pump and the dosing execution device can control the dosing of wastewater treatment chemicals.
[0042] Specifically, based on the cumulative return score, the dosing control actions are sorted, and the target dosing control actions that meet the effluent standard constraints and have the best overall cost are automatically selected. The dosing control command corresponding to the target dosing control action is generated and sent to the metering pump and dosing actuator through the control interface to realize the automatic adjustment of PAC dosage.
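The selection rule described above (rank by score, keep only candidates that meet the effluent constraint) could look like the following sketch. The dictionary keys and the behaviour when no candidate is feasible, here choosing the lowest predicted TP, are assumptions; the application does not say what happens in that case.

```python
def select_target(candidates, tp_limit):
    """Pick the best-scoring candidate whose predicted effluent TP meets the limit.

    candidates: list of dicts with keys 'dose', 'score', 'predicted_tp'.
    If none is feasible, fall back to the lowest predicted TP (assumed rule).
    """
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    feasible = [c for c in ranked if c["predicted_tp"] <= tp_limit]
    if feasible:
        return feasible[0]
    return min(ranked, key=lambda c: c["predicted_tp"])

# Hypothetical evaluation results for three candidate doses.
evaluated = [
    {"dose": 10.0, "score": -1.0, "predicted_tp": 0.60},
    {"dose": 15.0, "score": -1.5, "predicted_tp": 0.45},
    {"dose": 20.0, "score": -2.0, "predicted_tp": 0.40},
]
target = select_target(evaluated, tp_limit=0.5)
```

Here the highest-scoring dose is rejected because its predicted TP exceeds the limit, so the next-ranked feasible dose is chosen.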
[0043] The target dosing control action is obtained through multi-step prediction and deduction in the latent state space, rather than from an instantaneous feedback control strategy based on the current observed value.
[0044] In addition, it also includes: acquiring process feedback data after the completion of wastewater treatment chemical dosing control at preset time intervals; and using the process feedback data to incrementally update the state coding network and dynamic evolution model.
[0045] In other words, during system operation, the actual process feedback data after execution is periodically used to incrementally update the state coding network and dynamic evolution model. The number of actual interactions required for this update process is significantly lower than that of traditional trial-and-error control methods, thereby ensuring that the model adapts to changes in water quality while reducing the risk of exceeding standards and operating costs.
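The idea of an incremental update from periodic process feedback can be illustrated with a deliberately small example: a scalar linear model refined by a few gradient steps each time a new feedback sample arrives. The linear form, the learning rate, and the buffer size are assumptions for illustration; the actual networks and their training procedure are not described at this level in the application.

```python
from collections import deque

class OnlineLinearModel:
    """Scalar stand-in for a transition model: next = a * state + b * dose."""

    def __init__(self, a=1.0, b=0.0, lr=0.1, buffer_size=200):
        self.a, self.b, self.lr = a, b, lr
        self.buffer = deque(maxlen=buffer_size)  # recent feedback samples

    def predict(self, state, dose):
        return self.a * state + self.b * dose

    def update(self, state, dose, observed_next, epochs=1):
        """Store one feedback sample, then take gradient steps on squared error
        over the buffered samples."""
        self.buffer.append((state, dose, observed_next))
        for _ in range(epochs):
            for s, u, y in self.buffer:
                err = self.predict(s, u) - y
                self.a -= self.lr * err * s
                self.b -= self.lr * err * u

model = OnlineLinearModel()
before = abs(model.predict(0.5, 0.5) - 0.35)
model.update(0.5, 0.5, 0.35, epochs=200)  # one hypothetical feedback sample
after = abs(model.predict(0.5, 0.5) - 0.35)
```

A bounded buffer keeps the update cheap and biased toward recent operating conditions, which suits a scheme that refreshes the model only a few times per day.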
[0046] This application achieves intelligent control of the polyaluminum chloride dosing process by constructing a closed loop of "virtual simulation-decision optimization". Specifically, firstly, multi-dimensional process parameters such as total phosphorus, suspended solids, pH, and flow rate of the influent are collected in real time and input into a state encoding network, which is encoded into a high-dimensional hidden state vector. This vector can automatically filter out sensor noise and short-term fluctuations, and centrally represent the intrinsic dynamic law between water quality evolution and chemical reaction.
[0047] Based on this, the environmental dynamics are modeled in the latent state space using the learned state transition model, and the state evolution under different dosing control actions is predicted to obtain the latent state evolution sequence corresponding to each dosing control action. This allows the control system to predict the impact of the current strategy on the final effluent total phosphorus before actual dosing, thereby achieving proactive decision-making rather than delayed response.
[0048] During the simulation, long-term indicators corresponding to the various candidate dosages are evaluated simultaneously, including effluent compliance stability, total reagent consumption, and sludge increase risk. The optimal dosing scheme is then selected and sent to the metering pump for execution. To adapt to water quality fluctuations, the system requires only minimal real-world interaction (e.g., several times daily) to update the latent representation and the dynamics model. Most strategy optimization is completed in the virtual simulation, significantly reducing trial-and-error costs and the risk of exceeding standards.
[0049] The entire solution requires no modification to existing dosing equipment and sensors. It enables traditional dosing systems to "predict future effects" simply through software deployment, fundamentally solving the decision-making blind spots caused by process lag, model mismatch caused by water quality fluctuations, and the trial-and-error costs of implementing intelligent algorithms.
[0050] The specific process for implementing wastewater treatment dosing control in this application is shown in Figure 2 and proceeds as follows. First, the multidimensional operating parameters and historical dosage of the wastewater treatment process are preprocessed to obtain the preprocessed data and the preprocessed historical dosage. Next, the preprocessed data is input into a pre-trained state encoding network to obtain a hidden state vector that characterizes the reaction kinetics and water quality evolution state that cannot be directly observed during the wastewater treatment process. The hidden state vector and the preprocessed historical dosage are then input into the policy network, which searches and selects candidate actions and outputs a probability distribution over dosing candidate actions; sampling this distribution multiple times generates multiple dosing candidate actions. For each dosing candidate action, the current hidden state vector is used as the initial state and the dosing candidate action is used as the control input, and the dosing control action is determined based on the dosing candidate action. Based on the state transition model, the environmental dynamics are modeled in the latent state space, and the state evolution under different dosing control actions is predicted to obtain the latent state evolution sequence corresponding to each dosing control action. Each latent state evolution sequence is then input into the value network for evaluation, which outputs the cumulative reward score corresponding to each dosing control action. The dosing control actions are then ranked by cumulative reward score, and the dosing control action with the best score is selected as the target dosing control action.
Finally, the dosing control command corresponding to the target dosing control action is sent through the control interface to the metering pump and the dosing actuator to carry out the dosing control for wastewater treatment.
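The decision loop described above can be sketched in Python as follows. This is only an illustrative outline under stated assumptions, not the applicant's implementation: the function name `plan_dose` and the four callables (`encode`, `propose`, `transition`, `value`) are hypothetical stand-ins for the state encoding network, the policy network, the state transition model, and the value network, and the toy lambdas and numeric constants are invented for the example.

```python
from typing import Callable, List, Sequence, Tuple

Latent = Tuple[float, ...]  # hypothetical type for a hidden state vector


def plan_dose(
    observation: Sequence[float],
    last_dose: float,
    encode: Callable[[Sequence[float]], Latent],
    propose: Callable[[Latent, float], List[float]],
    transition: Callable[[Latent, float], Latent],
    value: Callable[[List[Latent], float], float],
    horizon: int = 6,
) -> float:
    """Return the candidate dose whose simulated latent trajectory scores best."""
    z0 = encode(observation)                  # hidden state vector
    best_dose, best_score = None, float("-inf")
    for dose in propose(z0, last_dose):       # dosing candidate actions
        z, trajectory = z0, []
        for _ in range(horizon):              # deduce first, then execute
            z = transition(z, dose)
            trajectory.append(z)
        score = value(trajectory, dose)       # cumulative reward score
        if score > best_score:
            best_dose, best_score = dose, score
    if best_dose is None:
        raise ValueError("policy proposed no candidate doses")
    return best_dose


# Toy stand-ins (assumptions): the first latent component acts as a
# total-phosphorus proxy that shrinks as the dose increases.
encode = lambda obs: tuple(obs)
propose = lambda z, last: [last - 2.0, last, last + 2.0]
transition = lambda z, dose: (z[0] * (1.0 - dose / 100.0),) + z[1:]
value = lambda traj, dose: -max(0.0, traj[-1][0] - 0.5) - 0.01 * dose

chosen = plan_dose([1.0, 7.2], 10.0, encode, propose, transition, value)
```

Passing the four components as callables keeps the orchestration separate from the models, so each learned network could be swapped in without changing the selection logic.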
[0051] This application encodes multidimensional, noisy observation data from the wastewater treatment process into continuous state vectors and uniformly describes water quality evolution and reagent reaction processes within this latent state space. This effectively isolates sensor noise and short-term fluctuations, improving the robustness and generalization ability of the control model under complex operating conditions. The invention models environmental dynamics within the latent state space and predicts state evolution under different dosing control actions, covering the overall process lag of coagulation reaction, sedimentation separation, and detection feedback, and thereby achieves proactive control through "deduce first, then execute." This application adopts a virtual decision evaluation approach that does not require trial and error in the real system: the merits of a dosing scheme are evaluated primarily from the results of latent state deduction, which significantly reduces reliance on trial and error in the real wastewater treatment system. This ensures stable effluent compliance while significantly reducing the risk of exceeding standards and operating costs. The invention supports a state transition model that predicts over different time spans: the wastewater treatment process model can directly predict the change in process state over a given time span, avoiding the accumulation of errors caused by step-by-step rolling prediction, which makes it better suited to wastewater treatment dosing scenarios with long time lags. This application can simultaneously take into account the combined optimization goals of effluent compliance and reagent cost. The dosing decision considers the stability of total phosphorus in the effluent, the cumulative consumption of reagents, and the potential risk of sludge increment, realizing automatic dosing optimization under multi-objective constraints rather than single-index control.
Through the latent state representation and continuous updating of the dynamics model, this application can adapt to non-steady-state conditions such as rainfall impact and industrial wastewater fluctuations, reducing the need for frequent manual parameter adjustments. The invention is deployed as a software upgrade and requires no modification to existing hardware. It can be connected directly to existing online monitoring instruments, dosing metering pumps, and control systems, so intelligent dosing control can be achieved through software deployment alone, which gives it good engineering feasibility and promotion value.
[0052] In this embodiment, the method is applied to a wastewater treatment process model constructed based on a state encoding network and a state transition model. The method includes: preprocessing the multidimensional operating parameters and historical dosage of the wastewater treatment process to eliminate dimensional differences between different sensors and to remove outliers and noise interference, and inputting the preprocessed data into a pre-trained state encoding network. The state encoding network maps the preprocessed data to a latent state space, obtaining a hidden state vector that characterizes the reaction kinetics and water quality evolution state that cannot be directly observed during wastewater treatment. This achieves a reduced-dimension representation of the process state that does not rely on inaccurate mechanistic equations, which solves the mismatch problem of traditional mechanistic models when operating conditions change, and the representation can carry richer dynamic information. The hidden state vector and the preprocessed historical dosage are then input into the policy network to search and select candidate actions, outputting a probability distribution over multiple dosing candidate actions. Multiple sampling is performed based on the action probability distribution to generate multiple dosing candidate actions, so that the deduction process takes the continuity of the current execution into account and avoids decision-making oscillations. For each dosing candidate action, the current hidden state vector is used as the initial state and the dosing candidate action is used as the control input, and the dosing control action is determined based on the dosing candidate action.
Based on the state transition model, environmental dynamics are modeled in the latent state space, and the state evolution under different dosing control actions is predicted to obtain the latent state evolution sequence corresponding to each dosing control action. This achieves proactive control by first deducing and then executing, avoiding the risk of exceeding standards caused by trial and error in the real environment. Each latent state evolution sequence is input into the value network for evaluation, outputting a cumulative reward score corresponding to each dosing control action. This achieves interpretable decision evaluation and avoids the subjectivity and inconsistency of human experience. Based on the cumulative reward scores, the dosing control actions are ranked, and the dosing control action with the best score is selected as the target dosing control action. The target dosing control action is used to carry out wastewater treatment dosing control, dynamically optimizing the dosage while ensuring stable effluent compliance and reducing operating costs.
[0053] As shown in Figure 3, this invention discloses a wastewater treatment dosing control device based on latent state sequence deduction, applied to a wastewater treatment process model; the wastewater treatment process model is constructed based on a state encoding network and a state transition model; the device may specifically include: a preprocessing module 11, used to acquire multidimensional operating parameters and historical dosage in the wastewater treatment process, and to preprocess the multidimensional operating parameters and historical dosage to obtain preprocessed data and preprocessed historical dosage; a hidden state vector determination module 12, used to input the preprocessed data into a pre-trained state encoding network so that the state encoding network maps the preprocessed data to a latent state space to obtain a hidden state vector that characterizes the reaction kinetics and water quality evolution state that cannot be directly observed during the wastewater treatment process; a candidate action generation module 13, used to input the hidden state vector and the preprocessed historical dosage into the policy network, search and select candidate actions, output the probability distribution of multiple dosing candidate actions, and perform multiple sampling based on the action probability distribution to generate multiple dosing candidate actions; a state deduction module 14, used to, for each dosing candidate action, take the current hidden state vector as the initial state and the dosing candidate action as the control input, determine the dosing control action based on the dosing candidate action, model the environmental dynamics in the latent state space based on the state transition model, and predict the state evolution under different dosing control actions to obtain the latent state evolution sequence corresponding to the dosing control action;
the state transition model being a sequence prediction model trained based on historical trajectory data; a scoring calculation module 15, used to input each of the latent state evolution sequences into the value network for evaluation, so as to output the cumulative reward score corresponding to each of the dosing control actions; and a dosing control module 16, used to rank the dosing control actions based on the cumulative reward scores, select the dosing control action with the best score as the target dosing control action, and use the target dosing control action to carry out the dosing control for wastewater treatment.
[0054] In some specific embodiments, the preprocessing module 11 may specifically include: The multi-dimensional operating parameter acquisition module is used to acquire multi-dimensional operating parameters and historical dosage in the wastewater treatment process; the multi-dimensional operating parameters include total phosphorus concentration, suspended solids concentration, pH, redox potential, influent and effluent flow rate, and current dosage of polyaluminum chloride in the influent and key process sections. The parameter and dosage preprocessing module is used to perform time alignment, outlier removal, and normalization on multidimensional operating parameters and historical dosages to obtain preprocessed data and preprocessed historical dosages.
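The three preprocessing operations named above (time alignment, outlier removal, normalization) could look roughly like the sketch below. The specific techniques are assumptions chosen for illustration, since the text does not specify them: zero-order hold for alignment, a median absolute deviation (MAD) rule for outliers, and min-max scaling over a fixed instrument range. All function names are hypothetical.

```python
from bisect import bisect_right
from statistics import median
from typing import List, Sequence, Tuple


def align_to_grid(samples: Sequence[Tuple[float, float]],
                  grid: Sequence[float]) -> List[float]:
    """Zero-order hold: for each grid time, use the latest sample at or before it.

    `samples` is a time-sorted list of (timestamp, value) pairs. Grid times
    earlier than the first sample fall back to the first sample.
    """
    times = [t for t, _ in samples]
    aligned = []
    for g in grid:
        i = bisect_right(times, g) - 1
        aligned.append(samples[max(i, 0)][1])
    return aligned


def replace_outliers(values: Sequence[float], k: float = 3.0) -> List[float]:
    """Replace points farther than k scaled MADs from the median with the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(values)          # no spread to judge outliers against
    limit = k * 1.4826 * mad         # 1.4826 scales MAD to a std-dev estimate
    return [v if abs(v - med) <= limit else med for v in values]


def min_max(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Scale to [0, 1] over an assumed instrument range, clamping overshoot."""
    return [min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in values]
```

Scaling each sensor by its own range is one way to remove the dimensional differences between sensors before the data reaches the state encoding network.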
[0055] In some specific embodiments, the hidden state vector determination module 12 may specifically include: The mapping module is used to input the preprocessed data into a pre-trained state encoding network so that the state encoding network maps the preprocessed data to a latent state space to obtain a hidden state vector of a preset dimension; the latent state space supports the state transition model's prediction of the state evolution at each moment.
[0056] In some specific embodiments, the candidate action generation module 13 may specifically include: The dosing candidate action set generation module is used to input the hidden state vector and the preprocessed historical dosage into the policy network, so as to construct candidate actions in the action space using the policy network and output the probability distribution of multiple dosing candidate actions; multiple samples are then drawn from the probability distribution to generate multiple dosing candidate actions.
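As an illustration of sampling candidates from a policy output, the sketch below assumes the action space is discretized into a fixed list of dose levels and that the policy network emits one logit per level. Neither assumption comes from the text, and `softmax` and `sample_candidate_doses` are hypothetical helper names. Duplicate draws are merged so that each distinct dose is simulated only once.

```python
import math
import random
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_candidate_doses(
    logits: Sequence[float],
    dose_levels: Sequence[float],
    n_samples: int,
    rng: random.Random,
) -> List[float]:
    """Draw n_samples doses from the policy distribution; return distinct ones."""
    if len(logits) != len(dose_levels):
        raise ValueError("one logit is required per dose level")
    probs = softmax(logits)
    draws = rng.choices(dose_levels, weights=probs, k=n_samples)
    return sorted(set(draws))


# Hypothetical policy output over four assumed dose levels (mg/L).
levels = [5.0, 10.0, 15.0, 20.0]
logits = [0.1, 2.0, 1.0, -1.0]
candidates = sample_candidate_doses(logits, levels, 8, random.Random(0))
```

A continuous action space would instead call for sampling from a parameterized density (for example a Gaussian around the previous dose); the discrete form is used here only because it is the simplest to show.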
[0057] In some specific embodiments, the scoring calculation module 15 may specifically include: The immediate reward calculation module is used to evaluate the state at each moment in the latent state evolution sequence using a preset reward function to obtain the immediate reward; the preset reward function takes into account the effluent total phosphorus level, the amplitude and stability of water quality fluctuations, the cumulative consumption of reagents, and the potential risk of sludge increment at future moments. The cumulative reward score calculation module is used to obtain the cumulative reward score by summing or discounting the immediate rewards at each moment using the value network.
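A reward of this kind, and its summed or discounted accumulation, can be sketched as below. Everything numeric here is an assumption made for illustration: the weights, the 0.5 mg/L limit, and the premise that each predicted state has already been decoded into physical quantities (the text evaluates latent states and does not give the reward formula). `PredictedState`, `immediate_reward`, and `cumulative_reward` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PredictedState:
    effluent_tp: float   # predicted effluent total phosphorus, mg/L
    tp_change: float     # change in effluent TP versus the previous step, mg/L
    dose: float          # reagent dosed during the step, mg/L
    sludge_risk: float   # assumed sludge-increment risk indicator in [0, 1]


def immediate_reward(
    s: PredictedState,
    tp_limit: float = 0.5,
    w_tp: float = 10.0,
    w_fluct: float = 1.0,
    w_dose: float = 0.01,
    w_sludge: float = 0.5,
) -> float:
    """Negative weighted cost: exceedance, fluctuation, reagent use, sludge risk."""
    exceedance = max(0.0, s.effluent_tp - tp_limit)
    cost = (w_tp * exceedance + w_fluct * abs(s.tp_change)
            + w_dose * s.dose + w_sludge * s.sludge_risk)
    return -cost


def cumulative_reward(states: Sequence[PredictedState], gamma: float = 0.95) -> float:
    """Discounted sum of per-step rewards; gamma = 1.0 gives the plain sum."""
    return sum(gamma ** t * immediate_reward(s) for t, s in enumerate(states))
```

With `gamma` below 1, exceedances predicted far in the future count less than near-term ones; setting it to 1.0 recovers the undiscounted sum mentioned in the text.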
[0058] In some specific embodiments, the dosing control module 16 may specifically include: The dosing control command generation module is used to generate the dosing control command corresponding to the target dosing control action; the wastewater treatment dosing control module is used to send the dosing control command to the metering pump and the dosing actuator via a control interface, so that the metering pump and the dosing actuator can control the dosing of wastewater treatment chemicals.
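One plausible form of command generation is converting a target dose into a metering pump flow setpoint by mass balance, then clamping it to the pump's range. The text does not state the units or the command format, so the units (mg/L, m³/h, L/h), the pump limits, and the function name `pump_flow_l_per_h` below are all assumptions.

```python
def pump_flow_l_per_h(
    dose_mg_per_l: float,
    influent_m3_per_h: float,
    stock_mg_per_l: float,
    pump_min: float = 0.0,
    pump_max: float = 200.0,
) -> float:
    """Convert a target dose into a pump flow setpoint, clamped to pump limits.

    Mass balance: reagent needed per hour (mg/h) equals
    dose (mg/L) * influent flow (m3/h) * 1000 (L/m3); dividing by the stock
    solution concentration (mg/L) gives the stock volume per hour (L/h).
    """
    if stock_mg_per_l <= 0:
        raise ValueError("stock concentration must be positive")
    if dose_mg_per_l < 0 or influent_m3_per_h < 0:
        raise ValueError("dose and influent flow must be non-negative")
    reagent_mg_per_h = dose_mg_per_l * influent_m3_per_h * 1000.0
    flow = reagent_mg_per_h / stock_mg_per_l
    return min(pump_max, max(pump_min, flow))
```

For example, under these assumptions a 10 mg/L dose at 500 m³/h with a 100 g/L stock solution corresponds to 50 L/h of stock solution.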
[0059] In some specific embodiments, the dosing control module 16 may specifically include: The process feedback data acquisition module is used to acquire process feedback data after the completion of wastewater treatment dosing control at preset time intervals. The incremental update module is used to incrementally update the state coding network and the dynamic evolution model using the process feedback data.
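The interval-based incremental update can be illustrated with a deliberately tiny stand-in. Instead of the actual networks, the sketch updates a single scalar `gain` (assumed removal per unit of normalized dose) by one gradient step per buffered feedback sample, and only when the preset interval has elapsed. The class name, the scalar model, and the learning rate are all hypothetical.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class IncrementalUpdater:
    """Buffers process feedback and applies small updates at a preset interval."""

    def __init__(self, interval_s: float, capacity: int = 1000, lr: float = 0.05):
        self.interval_s = interval_s
        self.lr = lr
        self.gain = 0.0  # toy model parameter: removal per unit normalized dose
        self.buffer: Deque[Tuple[float, float]] = deque(maxlen=capacity)
        self._last_update: Optional[float] = None

    def record(self, normalized_dose: float, observed_removal: float) -> None:
        """Store one feedback sample (dose in [0, 1], measured removal)."""
        self.buffer.append((normalized_dose, observed_removal))

    def maybe_update(self, now_s: float) -> bool:
        """Run one pass of gradient steps if the interval has elapsed."""
        too_soon = (self._last_update is not None
                    and now_s - self._last_update < self.interval_s)
        if too_soon or not self.buffer:
            return False
        for dose, removal in self.buffer:
            error = self.gain * dose - removal    # prediction error
            self.gain -= self.lr * error * dose   # squared-error gradient step
        self.buffer.clear()
        self._last_update = now_s
        return True
```

Keeping the update small and infrequent mirrors the idea that only limited real-world interaction is needed, with most strategy evaluation done in simulation.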
[0060] Figure 4 is a schematic diagram of an electronic device provided in an embodiment of this application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26. The memory 22 stores a computer program, which is loaded and executed by the processor 21 to implement the relevant steps in the wastewater treatment dosing control method based on latent state sequence deduction, as disclosed in any of the foregoing embodiments.
[0061] In this embodiment, the power supply 23 is used to provide operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and external devices, and the communication protocol it follows can be any communication protocol applicable to the technical solution of this application, and is not specifically limited here; the input / output interface 25 is used to acquire external input data or output data to the outside world, and its specific interface type can be selected according to specific application needs, and is not specifically limited here.
[0062] In addition, the memory 22, as a carrier for resource storage, can be a read-only memory, random access memory, disk or optical disk, etc. The resources stored on it include operating system 221, computer program 222 and data 223, etc., and the storage method can be temporary storage or permanent storage.
[0063] The operating system 221 manages and controls the various hardware devices on the electronic device 20 and the computer program 222 to enable the processor 21 to perform operations and processing on the data 223 in the memory 22. The operating system can be Windows, Unix, Linux, etc. The computer program 222, in addition to including a computer program capable of performing the wastewater treatment dosing control method based on latent state sequence deduction executed by the electronic device 20 as disclosed in any of the foregoing embodiments, may further include computer programs capable of performing other specific tasks. The data 223 may include data received by the wastewater treatment dosing control device based on latent state sequence deduction from external devices, and may also include data collected by its own input / output interface 25.
[0064] The steps of the methods or algorithms described in conjunction with the embodiments disclosed herein can be implemented directly by hardware, a software module executed by a processor, or a combination of both. The software module can be located in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art.
[0065] Furthermore, this application also discloses a computer-readable storage medium storing a computer program. When the computer program is loaded and executed by a processor, it implements the steps of the wastewater treatment dosing control method based on latent state sequence deduction disclosed in any of the foregoing embodiments.
[0066] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0067] The above provides a detailed description of a wastewater treatment dosing control method, apparatus, equipment, and medium based on latent state sequence deduction provided by the present invention. Specific examples have been used to illustrate the principles and implementation methods of the present invention. The descriptions of the above embodiments are only for the purpose of helping to understand the method and core ideas of the present invention. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of the present invention. Therefore, the content of this specification should not be construed as a limitation of the present invention.
Claims
1. A wastewater treatment chemical dosing control method based on latent state sequence deduction, characterized in that the method is applied to a wastewater treatment process model; the wastewater treatment process model is constructed based on a state encoding network and a state transition model; the method includes: acquiring multidimensional operating parameters and historical dosage of the wastewater treatment process, and preprocessing the multidimensional operating parameters and historical dosage to obtain preprocessed data and preprocessed historical dosage; inputting the preprocessed data into a pre-trained state encoding network so that the state encoding network maps the preprocessed data to a latent state space to obtain a hidden state vector that characterizes the reaction kinetics and water quality evolution state that cannot be directly observed during the wastewater treatment process; inputting the hidden state vector and the preprocessed historical dosage into the policy network to search and select candidate actions, so as to output the probability distribution of multiple dosing candidate actions, and performing multiple sampling based on the action probability distribution to generate multiple dosing candidate actions; for each dosing candidate action, using the current hidden state vector as the initial state and the dosing candidate action as the control input, determining the dosing control action based on the dosing candidate action, modeling the environmental dynamics in the latent state space based on the state transition model, and predicting the state evolution under different dosing control actions to obtain the latent state evolution sequence corresponding to the dosing control action; the state transition model is a sequence prediction model trained based on historical trajectory data;
inputting each of the latent state evolution sequences into a value network for evaluation, so as to output a cumulative reward score corresponding to each of the dosing control actions; and ranking the dosing control actions based on the cumulative reward scores, selecting the dosing control action with the best score as the target dosing control action, and using the target dosing control action to achieve wastewater treatment dosing control.
2. The wastewater treatment chemical dosing control method based on latent state sequence deduction according to claim 1, characterized in that the step of acquiring multidimensional operating parameters and historical dosage of the wastewater treatment process, and preprocessing the multidimensional operating parameters and historical dosage to obtain preprocessed data and preprocessed historical dosage, includes: acquiring multidimensional operating parameters and historical dosage of the wastewater treatment process; the multidimensional operating parameters include total phosphorus concentration, suspended solids concentration, pH, redox potential, influent and effluent flow rates, and current dosage of polyaluminum chloride in the influent and key process sections; and performing time alignment, outlier removal, and normalization on the multidimensional operating parameters and historical dosage to obtain preprocessed data and preprocessed historical dosage.
3. The wastewater treatment chemical dosing control method based on latent state sequence deduction according to claim 1, characterized in that, The step of inputting the preprocessed data into a pre-trained state encoding network, so that the state encoding network maps the preprocessed data to a latent state space, to obtain a hidden state vector for characterizing the reaction kinetics and water quality evolution state that cannot be directly observed during wastewater treatment, includes: The preprocessed data is input into a pre-trained state encoding network so that the preprocessed data can be mapped to a latent state space to obtain a hidden state vector in a preset dimension; the latent state space is used to support the time-by-time state evolution prediction of the state transition model.
4. The wastewater treatment chemical dosing control method based on latent state sequence deduction according to claim 1, characterized in that, The process involves inputting the hidden state vector and the preprocessed historical dosage into a policy network to search and select candidate actions, outputting a probability distribution of multiple dosage candidate actions, and generating multiple dosage candidate actions through multiple sampling based on the action probability distribution. This includes: The hidden state vector and the preprocessed historical dosage are input into the policy network so that the policy network can be used to construct candidate actions in the action space and output the probability distribution of multiple dosage candidate actions. Multiple sampling is performed from the probability distribution to generate multiple dosage candidate actions.
5. The wastewater treatment chemical dosing control method based on latent state sequence deduction according to claim 1, characterized in that the step of inputting each of the latent state evolution sequences into a value network for evaluation, so as to output a cumulative reward score corresponding to each of the dosing control actions, includes: evaluating the state at each moment in the latent state evolution sequence using a preset reward function to obtain an immediate reward; the preset reward function takes into account the effluent total phosphorus level, the amplitude and stability of water quality fluctuations, the cumulative consumption of reagents, and the potential risk of sludge increment at future moments; and obtaining the cumulative reward score by summing or discounting the immediate rewards at each moment using the value network.
6. The wastewater treatment chemical dosing control method based on latent state sequence deduction according to claim 1, characterized in that, The method of controlling wastewater treatment dosing using the target dosage control action includes: Generate dosing control instructions corresponding to the target dosing dosage control action; Using a control interface, the dosing control command is sent to the metering pump and the dosing actuator so that the metering pump and the dosing actuator can control the dosing of chemicals for wastewater treatment.
7. The wastewater treatment chemical dosing control method based on latent state sequence deduction according to any one of claims 1 to 6, characterized in that, After implementing wastewater treatment dosing control using the target dosage control action, the method further includes: According to the preset time interval, process feedback data after the completion of wastewater treatment chemical dosing control is obtained; The state coding network and dynamic evolution model are incrementally updated using the process feedback data.
8. A wastewater treatment dosing control device based on latent state sequence deduction, characterized in that the device is applied to a wastewater treatment process model; the wastewater treatment process model is constructed based on a state encoding network and a state transition model; the device includes: a preprocessing module, used to acquire multidimensional operating parameters and historical dosage in the wastewater treatment process, and to preprocess the multidimensional operating parameters and historical dosage to obtain preprocessed data and preprocessed historical dosage; a hidden state vector determination module, used to input the preprocessed data into a pre-trained state encoding network so that the state encoding network maps the preprocessed data to a latent state space to obtain a hidden state vector that characterizes the reaction kinetics and water quality evolution state that cannot be directly observed during the wastewater treatment process; a candidate action generation module, used to input the hidden state vector and the preprocessed historical dosage into the policy network, search and select candidate actions, output the probability distribution of multiple dosing candidate actions, and perform multiple sampling based on the action probability distribution to generate multiple dosing candidate actions; a state deduction module, used to, for each dosing candidate action, take the current hidden state vector as the initial state and the dosing candidate action as the control input, determine the dosing control action based on the dosing candidate action, model the environmental dynamics in the latent state space based on the state transition model, and predict the state evolution under different dosing control actions to obtain the latent state evolution sequence corresponding to the dosing control action; the state transition model is a sequence prediction model trained based on historical trajectory data;
a scoring calculation module, used to input each of the latent state evolution sequences into the value network for evaluation, so as to output the cumulative reward score corresponding to each of the dosing control actions; and a dosing control module, used to rank the dosing control actions based on the cumulative reward scores, select the dosing control action with the best score as the target dosing control action, and use the target dosing control action to carry out the dosing control for wastewater treatment.
9. An electronic device, characterized in that it includes: a memory, used to store a computer program; and a processor, used to execute the computer program to implement the wastewater treatment dosing control method based on latent state sequence deduction as described in any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that it is used to store a computer program; wherein, when the computer program is executed by a processor, it implements the wastewater treatment dosing control method based on latent state sequence deduction as described in any one of claims 1 to 7.