A Smart Water Quality Prediction Method for Watershed Floods Based on Dynamic Weight Optimization and Deep Reinforcement Learning
By combining dynamic weight optimization and deep reinforcement learning with grey relational analysis and LSTM networks, the problems of poor adaptability and insufficient real-time performance in flood water quality prediction are solved. This achieves high-precision, real-time water quality prediction and adaptive adjustment, thus improving the technical means for early warning of watershed water environment risks.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HOHAI UNIV
- Filing Date
- 2025-11-11
- Publication Date
- 2026-06-30
AI Technical Summary
Existing technologies in watershed hydrological management suffer from poor adaptability, limited real-time forecast accuracy, lack of dynamic optimization and real-time learning capabilities in predicting water quality changes during floods, and failure to effectively couple the dynamic relationship between flood processes and water quality changes, resulting in delayed or distorted forecast results.
The method combines grey relational analysis, a long short-term memory (LSTM) network, and the TD3 algorithm to adjust historical flood weights dynamically, giving a joint prediction of the flood process and water quality. Multi-source information is used for real-time optimization and adaptive adjustment.
The method improves the accuracy and real-time performance of flood water quality prediction, enhances the model's adaptability and stability, enables rapid recovery under abnormal conditions, and supports collaborative simulation and prediction of flood processes and water quality changes.
Figure CN121579970B_ABST (abstract drawing)
Description
Technical Field
[0001] This invention belongs to the field of hydrological prediction and water quality monitoring technology, and specifically relates to an intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning.
Background Technology
[0002] In watershed hydrological management and water environment protection, predicting water quality changes during floods has always been a complex and challenging task. Traditional water quality prediction methods mostly rely on hydrological and hydrodynamic models or statistical methods based on historical data, which have the following significant shortcomings: First, traditional hydrological models heavily rely on physical mechanism modeling and extensive parameter calibration, resulting in poor adaptability to sudden rainstorm and flood scenarios and limited real-time forecast accuracy. Second, existing methods often separate flood process prediction from water quality prediction, failing to fully consider the dynamic coupling relationship between flood evolution and water quality changes. In addition, most prediction models lack the ability to learn and adaptively adjust to real-time monitoring information, making it difficult to cope with missing or abnormal data, leading to delayed or distorted forecast results.
[0003] In recent years, some studies have attempted to introduce machine learning methods (such as LSTM) for hydrological and water quality prediction. Although these methods have improved forecasting capabilities to some extent, they still have limitations: First, these methods usually rely on fixed model structures and weights, making it difficult to dynamically optimize them based on the real-time evolution of floods; second, existing methods do not fully explore the similarities between multiple historical floods and lack the ability to dynamically assign weights to key influencing factors; third, the use of real-time feedback information during the prediction process is insufficient, and online learning and calibration of model parameters have not been achieved, limiting their generalization ability and stability under changing environments.
[0004] Therefore, there is an urgent need for an intelligent prediction method that can integrate multi-source information, possess dynamic optimization and real-time learning capabilities, and effectively couple flood processes and water quality indicators to improve the accuracy, real-time performance, and reliability of joint flood and water quality prediction.
Summary of the Invention
[0005] The purpose of this invention is to overcome the shortcomings of the prior art and provide an intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning, so as to achieve accurate, real-time, and adaptive intelligent prediction of water quality during floods.
[0006] To achieve the above objectives, the present invention is implemented using the following technical solution:
[0007] In one aspect, this invention provides an intelligent prediction method for watershed flood water quality based on dynamic weight optimization and deep reinforcement learning, characterized by the following steps:
[0008] Collect historical flood water quality observation data and historical flood hydrological data;
[0009] The historical flood hydrological data are standardized to obtain comparison sequences and reference sequences, and the comparison sequences and reference sequences are then made dimensionless.
[0010] The dynamic grey relational coefficient is calculated using the dimensionless comparison sequence and the dimensionless reference sequence.
[0011] The weighted grey relational degree of the comparison sequence with respect to the reference sequence is calculated using the dynamic grey relational coefficient, and the weighted grey relational degrees are sorted to form a weighted grey relational degree sequence.
[0012] The effective historical flood events are selected using the weighted grey relational degree sequence, and the historical flood weights of the effective historical flood events are dynamically optimized using a long short-term memory network and a reinforcement learning agent based on the TD3 algorithm.
[0013] The final predicted flood water quality data is obtained by using the historical flood water quality observation data and the historical flood weights.
[0014] Furthermore, using the starting point of rainfall for each historical flood as the time origin, a flood similarity index set is extracted, including the initial flow rate, hourly cumulative rainfall sequence, and hourly cumulative water volume sequence, forming a comparison sequence, expressed as:
[0015] ;
[0016] where the terms denote, in order: the comparison sequence; the initial rise flow of the i-th historical flood; the hourly cumulative rainfall sequence of the comparison sequence; and the hourly cumulative water volume sequence of the comparison sequence; x = 1, 2, ..., 72; i denotes the i-th historical flood; N is a positive integer;
[0017] Depending on the flood rise time t0, the flood rise flow, hourly cumulative rainfall sequence, and hourly cumulative water volume sequence are obtained through automatic data collection or manual observation from hydrological and rainfall monitoring stations. Using flood forecast data, the stage-specific cumulative water volume within the flood forecast period is obtained to form a reference sequence, expressed as:
[0018] ;
[0019] where the terms denote, in order: the reference sequence; the currently measured flood rise flow; the hourly cumulative rainfall sequence of the reference sequence; and the hourly cumulative water volume sequence of the reference sequence; y = 1, 2, …, up to t0 plus the flood forecast period, where t0 is the moment the flood begins to rise and this upper limit is ≤ 72.
[0020] Furthermore, the comparison sequence and the reference sequence are made dimensionless, expressed as follows:
[0021] ;
[0022] ;
[0023] ;
[0024] where the terms denote, in order: the dimensionless comparison sequence, with j ∈ {flow, rainfall, water volume} indexing the three similarity indices, j = 1, 2, ...; the data of the j-th similarity index of the i-th flood in the comparison sequence; the data of the j-th similarity index in the reference sequence; and the dimensionless reference sequence.
[0025] Furthermore, the absolute difference between the dimensionless comparison sequence and the dimensionless reference sequence is calculated, expressed as:
[0026] ;
[0027] The maximum and minimum values of the absolute difference are extracted, expressed as:
[0028] ;
[0029] ;
[0030] where the two operators denote taking the minimum value and the maximum value of a series, respectively;
[0031] The dynamic grey relational coefficient is calculated from the absolute difference and its maximum and minimum values, expressed as:
[0032] ;
[0033] where the remaining term is the time-decay resolution coefficient, expressed as:
[0034] ;
[0035] where the time variable is the prediction time;
[0036] Furthermore, the weighted grey relational degree is calculated from the dynamic grey relational coefficients of each indicator, expressed as:
[0037] ;
[0038] where the terms denote the weighted grey relational degree of the i-th flood and the weight of each indicator: 0.4 for flow, 0.3 for rainfall, and 0.3 for water volume.
[0039] The weighted grey relational degrees of n comparison sequences relative to the same reference sequence are arranged in order of magnitude to form a weighted grey relational degree sequence.
[0040] Furthermore, using the weighted grey relational degree sequence, the number of effective historical flood events A is selected, as shown in the following expression:
[0041] ;
[0042] Take the historical floods corresponding to the first A weighted grey relational degrees in the weighted grey relational degree sequence as the similar flood set.
[0043] Furthermore, the initial weights for historical floods are generated using a Long Short-Term Memory (LSTM) network, including the following steps:
[0044] Extract the historical flood hydrological data of the effective historical flood events, construct a standardized 216-dimensional feature vector for each historical flood, and construct a real-time flood state vector;
[0045] The standardized 216-dimensional feature vector is stacked with the real-time flood state vector to form the input matrix of the long short-term memory network.
[0046] The standardized input matrix is fed into a bidirectional long short-term memory network for high-dimensional spatiotemporal feature extraction.
[0047] The attention weights at each time step during the flood process are calculated based on the spatiotemporal attention mechanism, and the weighted context feature vectors are generated using the obtained attention weights at each time step.
[0048] The context vector is input into a fully connected neural network for transformation, and after normalization by the softmax function, the initial probability distribution weights of historical floods are obtained.
[0049] Furthermore, the expression for the standardized 216-dimensional feature vector is:
[0050] ;
[0051] where the terms denote, in order: the standardized 216-dimensional feature vector; the flood rise flow sequence, formed by 72 copies of the flood rise flow value; the hourly cumulative rainfall sequence of the a-th selected historical flood; the hourly cumulative water volume sequence of the a-th selected historical flood; x = 1, 2, …, 72; a denotes the a-th selected historical flood; and the 216-dimensional real space;
[0052] The expression for the real-time flood state vector is:
[0053] ;
[0054] where the term denotes the real-time flood state vector. For the period t > t0 plus the flood forecast period, the cumulative rainfall and cumulative water volume entries of the real-time flood state vector are filled with the historical average or zero, which meets the neural network's requirement for a fixed input size and keeps the dimension at 216.
[0055] The expression for the input matrix X of a Long Short-Term Memory (LSTM) network is:
[0056] ;
[0057] The standardized input matrix X is fed into a bidirectional long short-term memory network for high-dimensional spatiotemporal feature extraction, as shown in the following expression:
[0058] ;
[0059] ;
[0060] ;
[0061] where the terms denote, in order: the input at the current time step; the hidden states of the forward LSTM at the previous time step t-1 and of the backward LSTM at the next time step t+1; the cell states of the forward LSTM at t-1 and of the backward LSTM at t+1; the hidden states of the forward and backward LSTMs at the current time step t; and the cell states of the forward and backward LSTMs at the current time step t. The final feature representation at time step t is formed by concatenating the forward and backward hidden states at time step t; the tanh function is used as the cell state activation function.
[0062] A spatiotemporal attention mechanism is introduced to calculate the attention weight at each time step, focusing on the critical periods of the flood process. The expression for the attention weight at each time step is as follows:
[0063] ;
[0064] where the terms denote, in order: the normalized attention weight at time step t; the exponential function; the sum of the exponentiated scores over all time steps; and the energy score at time step t, expressed as:
[0065] ;
[0066] where the terms denote, in order: the trainable feature score vector; the hyperbolic tangent activation function; the trainable weight matrix; and the trainable bias vector;
[0067] The context feature vector is expressed as:
[0068] ;
[0069] The context vector is input into a fully connected neural network for transformation, expressed as:
[0070] ;
[0071] where the terms denote, in order: the raw weight score vector; the first-layer weight matrix; the first-layer bias vector; the second-layer weight matrix; the second-layer bias vector; the rectified linear unit (ReLU); and the sigmoid function, which outputs the raw weight score vector;
[0072] The initial probability distribution weights of historical floods, obtained after normalization using the softmax function, are expressed as follows:
[0073] ;
[0074] where the terms denote the normalized initial weight of the a-th historical flood and the raw weight score of the a-th historical flood.
[0075] Furthermore, a reinforcement learning agent based on the TD3 algorithm is employed to optimize the normalized initial weights in real time, including the following steps:
[0076] First, the state space is constructed. At each optimization time, the state vector consists of the following four-dimensional features, expressed as follows:
[0077] ;
[0078] ;
[0079] ;
[0080] ;
[0081] where the terms denote, in order: the absolute prediction error at time k; the flow predicted by the hydrological model at time k; the measured flow at time k; the measured rate of change of flow; the attenuation coefficient, with value ...; the weighted grey relational degree of the a-th selected historical flood; and the weighted grey relational degree after time decay;
[0082] Second, the reinforcement learning agent generates actions through the Actor policy network to update the historical flood weights:
[0083] The action output by the reinforcement learning agent at time k is the adjustment amount for the historical flood weights, constrained to the range [-0.1, 0.1], expressed as:
[0084] ;
[0085] The action is generated by the Actor policy network, expressed as:
[0086] ;
[0087] where the term denotes the weight adjustment amount suggested by the Actor policy network for the a-th historical flood;
[0088] Historical flood weights are updated using the following formula:
[0089] ;
[0090] where the terms denote, in order: the old weight of the a-th historical flood at time k; the truncation function, which ensures that the new weight stays within the reasonable range [0, 1]; and the new weight of the a-th historical flood at time k+1;
[0091] The updated weights are then normalized, expressed as:
[0092] ;
[0093] where the normalizing term is the sum of the weights of all historical floods after the update;
[0094] Furthermore, a reward function R is designed to optimize prediction accuracy and process line smoothness, expressed as:
[0095] ;
[0096] ;
[0097] where the terms denote, in order: the weighting coefficient of the error penalty term; the second derivative of the measured flow process line; and the weighting coefficient of the smoothness penalty term;
[0098] Finally, for the network update, the Critic network uses a dual-Q (twin Q-network) structure and is updated synchronously with the Actor network, with a target network soft update coefficient of 0.005; the exploration strategy uses Ornstein-Uhlenbeck noise with noise parameter σ = 0.1; the agent updates the policy network parameters synchronously every 6 hours.
[0099] Furthermore, the final predicted flood water quality data at each moment is calculated by multiplying the historical flood water quality observation data by the corresponding historical flood weights, as expressed in the following expression:
[0100] ;
[0101] where the terms denote the final predicted flood water quality data and the measured flood water quality data of the a-th historical flood at time t;
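The weighted combination described above can be illustrated with a short Python sketch. The expression itself is not reproduced in this text, so the sketch assumes a plain weighted sum over the similar floods at each time step; the function name `predict_water_quality` and the sample values are hypothetical.

```python
def predict_water_quality(weights, historical_series):
    """Weighted sum of historical water quality series, one value per time step.

    weights: one weight per similar historical flood (assumed to sum to 1).
    historical_series: one list per flood, all of the same length, holding a
        single water quality indicator (for example dissolved oxygen) over time.
    """
    if len(weights) != len(historical_series):
        raise ValueError("one weight is needed per historical flood")
    n_steps = len(historical_series[0])
    return [
        sum(w * series[t] for w, series in zip(weights, historical_series))
        for t in range(n_steps)
    ]


# Hypothetical example: three similar floods, two time steps of dissolved oxygen.
example_prediction = predict_water_quality(
    [0.5, 0.3, 0.2],
    [[8.0, 7.0], [6.0, 5.0], [7.0, 6.0]],
)
```

Each indicator (pH, total phosphorus, and so on) would be passed through the same function with its own historical series.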
[0102] Error assessment is performed on the final predicted flood water quality data obtained:
[0103] The absolute prediction error is monitored in real time. When the error exceeds the limit, which is defined in terms of the largest absolute error in history, a model state mismatch is determined and the recalibration mechanism is triggered immediately.
[0104] Compared with the prior art, the beneficial effects achieved by the present invention are as follows:
[0105] The intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning provided by this invention significantly improves the overall performance of watershed flood water quality prediction through the integration of multiple technologies and mechanism innovation, specifically in the following aspects:
[0106] 1) High prediction accuracy;
[0107] By quantifying the similarity between historical floods and current events through dynamic grey relational analysis, and by using a bidirectional long short-term memory network (LSTM) with integrated attention mechanism to deeply extract high-dimensional spatiotemporal features, a precise basis for weight allocation is provided, fundamentally improving the accuracy of water quality parameter prediction.
[0108] 2) Strong real-time adaptive capability;
[0109] By introducing a reinforcement learning agent based on the TD3 algorithm, the historical flood weights can be dynamically adjusted according to real-time monitoring data (such as prediction error and flow change rate). This online optimization mechanism enables the model to adapt to the dynamic changes of the flood process and effectively reduces the lag of the prediction results.
[0110] 3) The prediction process is smooth and stable;
[0111] The designed reward function considers both prediction accuracy and process line smoothness during optimization, avoiding the problem of drastic fluctuations in predicted values that may occur in traditional methods, making the prediction results more reasonable, reliable, and practical.
[0112] 4) The system has good robustness;
[0113] The built-in recalibration mechanism can be automatically triggered when the prediction error exceeds the limit (model mismatch). By resetting the network state and re-initializing the optimization process, the system can quickly recover to a reliable state, enhancing the model's self-repair capability under abnormal conditions and its long-term stability.
[0114] 5) Comprehensive utilization of multi-source information: The prediction method provided by this invention organically couples flood hydrological characteristics with water environment quality indicators, realizing the collaborative simulation and prediction of flood process and water pollution changes, and providing a comprehensive technical means for solving watershed water environment risk early warning under complex environments.
[0115] This invention not only significantly improves the accuracy and real-time performance of flood water quality prediction, but also enhances the adaptability and reliability of the model, demonstrating promising application prospects and promotional value.
Attached Figure Description
[0116] Figure 1 is a flowchart of an intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning, provided as an embodiment of the present invention.
Detailed Implementation
[0117] The present invention will be further described below with reference to the accompanying drawings. The following embodiments are only used to more clearly illustrate the technical solution of the present invention, and should not be used to limit the scope of protection of the present invention.
[0118] As shown in Figure 1, the present invention provides an intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning, comprising the following steps:
[0119] Step 1) Collect historical flood water quality observation data and historical flood hydrological data, and standardize the flood data to obtain comparison sequences and reference sequences; specific steps include:
[0120] Collect historical flood water quality observation data, including dissolved oxygen, pH, total phosphorus, total nitrogen, chemical oxygen demand, and permanganate index.
[0121] Using the starting point of rainfall for each historical flood as the time origin, a flood similarity index set is extracted, which includes: initial flow rate, hourly cumulative rainfall sequence, and hourly cumulative water volume sequence, forming a comparison sequence. The flood duration is 72 hours, with a time interval of 1 hour, as shown in the following expression:
[0122] ;
[0123] where the terms denote, in order: the comparison sequence; the initial rise flow of the i-th historical flood; the hourly cumulative rainfall sequence of the comparison sequence; and the hourly cumulative water volume sequence of the comparison sequence; x = 1, 2, ..., 72; i denotes the i-th historical flood; N is a positive integer.
[0124] Depending on the different flood rise time t0 (e.g., 6 hours or 12 hours after rainfall), the flood rise flow, hourly cumulative rainfall sequence, and hourly cumulative water volume sequence are obtained through automatic data collection at hydrological and rainfall monitoring stations or manual observation. Using flood forecast data, the stage-specific cumulative water volume within the flood forecast period is obtained to form a reference sequence, expressed as follows:
[0125] ;
[0126] where the terms denote, in order: the reference sequence; the currently measured flood rise flow; the hourly cumulative rainfall sequence of the reference sequence; and the hourly cumulative water volume sequence of the reference sequence; y = 1, 2, …, up to t0 plus the flood forecast period, where t0 is the moment the flood begins to rise and this upper limit is ≤ 72.
[0127] Step 2) Perform dimensionless processing on the comparison sequence and the reference sequence; specific steps include:
[0128] The comparison sequence and the reference sequence are made dimensionless, as shown in the following expression:
[0129] ;
[0130] ;
[0131] ;
[0132] where the terms denote, in order: the dimensionless comparison sequence, with j ∈ {flow, rainfall, water volume} indexing the three similarity indices, j = 1, 2, ...; the data of the j-th similarity index of the i-th flood in the comparison sequence; the data of the j-th similarity index in the reference sequence; and the dimensionless reference sequence.
[0133] Step 3) Calculate the dynamic grey relational coefficient using the dimensionless comparison sequence and the reference sequence; specifically including:
[0134] Calculate the absolute difference between the dimensionless comparison sequence and the dimensionless reference sequence. The expression is as follows:
[0135] ;
[0136] Extract the maximum and minimum values of the absolute difference. The expressions are as follows:
[0137] ;
[0138] ;
[0139] where the two operators denote taking the minimum value and the maximum value of a series, respectively.
[0140] Calculate the dynamic grey relational coefficient from the absolute difference and its maximum and minimum values. The expression is as follows:
[0141] ;
[0142] where the remaining term is the time-decay resolution coefficient. Traditional grey relational analysis uses a fixed resolution coefficient; this invention replaces it with a coefficient that decays over time, which emphasizes recent effects, weakens long-term effects, and dynamically adjusts the resolving power. It is expressed as follows:
[0143] ;
[0144] where the time variable is the prediction time.
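The equation images for Step 3 are not reproduced in this text, so the following Python sketch only illustrates the general shape of the calculation under stated assumptions. It uses the classical grey relational coefficient form (Δmin + ρ·Δmax) / (Δ + ρ·Δmax), takes Δmin and Δmax over all supplied comparison sequences, and assumes an exponential decay ρ(t) = ρ0·exp(-λt) for the resolution coefficient. The decay form, the defaults `rho0=0.5` and `decay=0.01`, and all function names are hypothetical and not taken from the patent.

```python
import math


def decayed_resolution(t, rho0=0.5, decay=0.01):
    """Assumed time-decay resolution coefficient: rho0 * exp(-decay * t)."""
    return rho0 * math.exp(-decay * t)


def dynamic_grey_coefficients(comparisons, reference, rho0=0.5, decay=0.01):
    """Grey relational coefficients of each comparison sequence at each time step.

    comparisons: list of dimensionless sequences, one per historical flood.
    reference: dimensionless reference sequence (may be shorter; extra
        comparison values are ignored by zip).
    """
    diffs = [[abs(c - r) for c, r in zip(seq, reference)] for seq in comparisons]
    flat = [d for row in diffs for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0.0:
        # All sequences coincide with the reference: perfect relation everywhere.
        return [[1.0] * len(row) for row in diffs]
    result = []
    for row in diffs:
        coeffs = []
        for t, d in enumerate(row, start=1):
            rho = decayed_resolution(t, rho0, decay)
            coeffs.append((d_min + rho * d_max) / (d + rho * d_max))
        result.append(coeffs)
    return result
```

With a decaying ρ, the same absolute difference yields a smaller coefficient at later time steps whenever Δmin is below that difference, so late-period mismatches are penalized more sharply under this particular assumption.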
[0145] Step 4) Calculate the weighted grey relational degree using the dynamic grey relational coefficient and rank them; specifically including:
[0146] Using the dynamic grey relational coefficient, calculate the weighted grey relational degree over the indicators. The expression for the weighted grey relational degree is as follows:
[0147] ;
[0148] where the terms denote the weighted grey relational degree of the i-th flood and the weight of each indicator: 0.4 for flow, 0.3 for rainfall, and 0.3 for water volume.
[0149] The weighted grey relational degrees of n comparison sequences relative to the same reference sequence are arranged in order of magnitude to form a weighted grey relational degree sequence.
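As a rough illustration of Step 4, the sketch below combines per-indicator grey relational coefficients with the indicator weights stated in the text (0.4, 0.3, 0.3) and ranks the floods. The exact aggregation formula is not reproduced here, so averaging each indicator's coefficients over time before weighting is an assumption, as is sorting in descending order; the names `weighted_grey_degree` and `rank_floods` are hypothetical.

```python
# Indicator weights as stated in the text.
INDEX_WEIGHTS = {"flow": 0.4, "rainfall": 0.3, "water_volume": 0.3}


def weighted_grey_degree(coeffs_by_index):
    """Weighted grey relational degree of one historical flood.

    coeffs_by_index: maps an indicator name to its list of grey relational
        coefficients (assumed here to be averaged over time).
    """
    return sum(
        INDEX_WEIGHTS[name] * (sum(values) / len(values))
        for name, values in coeffs_by_index.items()
    )


def rank_floods(degrees):
    """Flood identifiers sorted from most to least similar (descending degree)."""
    return sorted(degrees, key=degrees.get, reverse=True)


# Hypothetical coefficients for two historical floods.
example_degrees = {
    "F1": weighted_grey_degree(
        {"flow": [0.9], "rainfall": [0.8, 0.6], "water_volume": [0.5, 0.7]}
    ),
    "F2": weighted_grey_degree(
        {"flow": [1.0], "rainfall": [1.0, 1.0], "water_volume": [1.0, 1.0]}
    ),
}
example_ranking = rank_floods(example_degrees)
```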
[0150] Step 5) Dynamically optimize flood water quality prediction based on long short-term memory networks and reinforcement learning agents; specific steps include:
[0151] Step 51) Using the weighted grey relational degree sequence, select the number of effective historical flood events A, as shown in the following expression:
[0152] ;
[0153] The historical floods corresponding to the first A weighted grey relational degrees in the weighted grey relational degree sequence are taken as the similar flood set; the initial weights of the historical floods are generated through a Long Short-Term Memory (LSTM) network, as follows:
[0154] Step 511) Input feature construction and standardization;
[0155] Extract the hydrological data of the A selected historical flood events, and construct a standardized 216-dimensional feature vector for each historical flood, as shown in the following expression:
[0156] ;
[0157] where the terms denote, in order: the standardized 216-dimensional feature vector; the flood rise flow sequence, formed by 72 copies of the flood rise flow value; the hourly cumulative rainfall sequence of the a-th selected historical flood; the hourly cumulative water volume sequence of the a-th selected historical flood; x = 1, 2, …, 72; a denotes the a-th selected historical flood; and the 216-dimensional real space;
[0158] The real-time flood state vector is constructed as follows:
[0159] ;
[0160] where the term denotes the real-time flood state vector. For the period t > t0 plus the flood forecast period (i.e., future periods beyond the currently known range), the cumulative rainfall and cumulative water volume entries of the real-time flood state vector are filled with the historical average or zero, which meets the neural network's requirement for a fixed input size and keeps the dimension at 216.
[0161] The feature vectors of the historical floods and the real-time flood state vector are stacked to form the input matrix X of the long short-term memory network, expressed as follows:
[0162] ;
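The construction in Step 511 can be sketched in Python as follows. The layout (72 copies of the rise flow, then 72 rainfall values, then 72 water volume values) follows the description; the standardization step is omitted, zero is used as the default fill value, and the function names are hypothetical.

```python
HORIZON = 72  # flood duration in hours, as stated in the text


def flood_feature_vector(rise_flow, rainfall, water_volume):
    """216-dimensional vector for one historical flood (before standardization)."""
    if len(rainfall) != HORIZON or len(water_volume) != HORIZON:
        raise ValueError("rainfall and water volume must each have 72 values")
    return [rise_flow] * HORIZON + list(rainfall) + list(water_volume)


def realtime_state_vector(rise_flow, rainfall_known, volume_known, fill=0.0):
    """216-dimensional real-time vector; unknown future hours are padded with `fill`."""
    def pad(seq):
        return list(seq) + [fill] * (HORIZON - len(seq))

    return [rise_flow] * HORIZON + pad(rainfall_known) + pad(volume_known)


def build_input_matrix(history_vectors, state_vector):
    """Stack A historical vectors and the real-time vector into an (A + 1) x 216 matrix."""
    return [list(v) for v in history_vectors] + [list(state_vector)]
```

Passing the historical average as `fill` instead of zero gives the other padding option mentioned in the text.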
[0163] Step 512) Feature extraction from bidirectional long short-term memory network;
[0164] The standardized input matrix X is fed into a bidirectional long short-term memory network for high-dimensional spatiotemporal feature extraction, as shown in the following expression:
[0165] ;
[0166] ;
[0167] ;
[0168] where the terms denote, in order: the input at the current time step; the hidden states of the forward LSTM at the previous time step t-1 and of the backward LSTM at the next time step t+1; the cell states of the forward LSTM at t-1 and of the backward LSTM at t+1; the hidden states of the forward and backward LSTMs at the current time step t; and the cell states of the forward and backward LSTMs at the current time step t. The final feature representation at time step t is formed by concatenating the forward and backward hidden states at time step t; the tanh function is used as the cell state activation function.
[0169] Step 513) Attention mechanism weighting;
[0170] A spatiotemporal attention mechanism is introduced to calculate the attention weight at each time step, in order to focus on the key periods of the flood process, as shown in the following expression:
[0171] ;
[0172] where the terms denote, in order: the normalized attention weight at time step t; the exponential function; the sum of the exponentiated scores over all time steps (t = 1 to 72); and the energy score at time step t, expressed as follows:
[0173] ;
[0174] where the terms denote, in order: the trainable feature score vector; the hyperbolic tangent activation function; the trainable weight matrix; and the trainable bias vector;
[0175] Using the attention weights obtained above for each time step, generate the weighted context feature vector. The expression is as follows:
[0176] ;
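The definitions in Step 513 match the common additive attention form, so the sketch below assumes a score of v·tanh(W·h_t + b) per time step, softmax normalization over time, and a context vector equal to the attention-weighted sum of the hidden states. The original expressions are not reproduced in this text; the names `W`, `b`, `v`, and `attention_context` are hypothetical.

```python
import math


def attention_context(hidden_states, W, b, v):
    """Attention weights over time steps and the resulting context vector.

    hidden_states: list of T feature vectors (for example Bi-LSTM outputs).
    W: weight matrix as a list of rows; b: bias vector; v: score vector.
    """
    scores = []
    for h in hidden_states:
        u = [
            math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
            for row, b_i in zip(W, b)
        ]
        scores.append(sum(v_i * u_i for v_i, u_i in zip(v, u)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [
        sum(a * h[j] for a, h in zip(alphas, hidden_states)) for j in range(dim)
    ]
    return alphas, context
```

In a trained model `W`, `b`, and `v` would be learned jointly with the recurrent network; here they are simply passed in.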
[0177] Step 514) Fully connected layer and weight normalization;
[0178] The context vector is input into a fully connected neural network for transformation, as shown in the following expression:
[0179] ;
[0180] where the terms denote, in order: the raw weight score vector; the first-layer weight matrix; the first-layer bias vector; the second-layer weight matrix; the second-layer bias vector; the rectified linear unit (ReLU); and the sigmoid function, which outputs the raw weight score vector. After normalization using the softmax function, the initial probability distribution weights of historical floods are obtained, expressed as follows:
[0181] ;
[0182] where the terms denote the normalized initial weight of the a-th historical flood and the raw weight score of the a-th historical flood;
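A minimal Python sketch of Step 514 follows, assuming a two-layer network with ReLU on the first layer and a sigmoid on the second, followed by softmax over the A historical floods. The layer sizes, parameter names, and the function `initial_flood_weights` are hypothetical, and the exact composition in the missing expression may differ.

```python
import math


def matvec(matrix, x, bias):
    """Affine map: matrix (list of rows) times vector x, plus bias."""
    return [sum(m * xi for m, xi in zip(row, x)) + b for row, b in zip(matrix, bias)]


def initial_flood_weights(context, W1, b1, W2, b2):
    """Map a context vector to normalized initial weights, one per historical flood."""
    hidden = [max(0.0, z) for z in matvec(W1, context, b1)]  # ReLU
    scores = [1.0 / (1.0 + math.exp(-z)) for z in matvec(W2, hidden, b2)]  # sigmoid
    exps = [math.exp(s) for s in scores]  # scores lie in (0, 1), so exp is safe
    total = sum(exps)
    return [e / total for e in exps]  # softmax: weights sum to 1
```

Because the sigmoid bounds every raw score to (0, 1), the softmax output under this assumption can never be very peaked; that is a property of this sketch and may or may not reflect the intended design.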
[0183] Step 52) Dynamically optimize weights based on reinforcement learning agents;
[0184] A reinforcement learning agent based on the TD3 algorithm is used to optimize the normalized initial weights in real time, as follows:
[0185] Step 521) Construct the state space;
[0186] At each optimization moment, the state vector consists of the following four-dimensional features, expressed as follows:
[0187] ;
[0188] ;
[0189] ;
[0190] ;
[0191] where the terms denote, in order: the absolute prediction error at time k; the flow predicted by the hydrological model at time k; the measured flow at time k; the measured rate of change of flow; the attenuation coefficient, with value ...; the weighted grey relational degree of the a-th selected historical flood; and the weighted grey relational degree after time decay;
[0192] Step 522) Output actions and execute network strategies;
[0193] The reinforcement learning agent generates actions and updates historical flood weights through its Actor policy network, as follows:
[0194] The action $\mathbf{a}_k$ output by the reinforcement learning agent at time k is the vector of adjustment amounts $\Delta w_{a,k}$ for the historical flood weights, each constrained to the range [-0.1, 0.1], expressed as follows:
[0195] $\mathbf{a}_k = (\Delta w_{1,k}, \Delta w_{2,k}, \dots, \Delta w_{A,k}),\quad \Delta w_{a,k} \in [-0.1, 0.1]$;
[0196] The action is generated by the Actor policy network $\pi_\theta$ from the state vector $s_k$:
[0197] $\mathbf{a}_k = \pi_\theta(s_k)$;
[0198] where $\Delta w_{a,k}$ is the weight adjustment amount suggested by the Actor policy network for the a-th historical flood.
[0199] The historical flood weights are updated using the following formula:
[0200] $w'_{a,k+1} = \mathrm{clip}\left(w_{a,k} + \Delta w_{a,k},\ 0,\ 1\right)$;
[0201] where $w_{a,k}$ is the old weight of the a-th historical flood at time k; $\Delta w_{a,k}$ is the weight adjustment amount output by the agent; clip is a truncation function that keeps the new weight within the reasonable range [0, 1]; and $w'_{a,k+1}$ is the new weight of the a-th historical flood at time k+1;
[0202] The updated weights $w'_{a,k+1}$ are then normalized, as follows:
[0203] $w_{a,k+1} = \dfrac{w'_{a,k+1}}{\sum_{j=1}^{A} w'_{j,k+1}}$;
[0204] where $\sum_{j=1}^{A} w'_{j,k+1}$ is the sum of the weights of all historical floods after the update;
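The update, truncation, and renormalization of step 522) can be sketched in a few lines. The function names are hypothetical, and the fallback to uniform weights when every weight is truncated to zero is an assumption added for robustness; the patent text does not address that case.

```python
def clip(value, low, high):
    """Truncate value to the closed interval [low, high]."""
    return max(low, min(high, value))


def update_weights(old_weights, actions, max_step=0.1):
    """Apply bounded adjustments, truncate to [0, 1], then renormalize."""
    adjusted = [
        clip(w + clip(a, -max_step, max_step), 0.0, 1.0)
        for w, a in zip(old_weights, actions)
    ]
    total = sum(adjusted)
    if total == 0.0:
        # Assumed fallback (not in the source): revert to uniform weights.
        return [1.0 / len(adjusted)] * len(adjusted)
    return [w / total for w in adjusted]


# Actions outside [-0.1, 0.1] are truncated before being applied.
new_weights = update_weights([0.5, 0.3, 0.2], [0.3, -0.05, -0.5])
```

Bounding each adjustment to 0.1 limits how far a single step can move the weights, which keeps the prediction from jumping abruptly between historical floods.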
[0205] Step 523) Design the reward function;
[0206] A reward function R is designed to optimize both prediction accuracy and hydrograph smoothness, as shown in the following expression:
[0207] ;
[0208] ;
[0209] where the symbols denote, in order: the weighting coefficient of the error penalty term; the approximate second derivative of the measured flow hydrograph; and the weighting coefficient of the smoothness penalty term;
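A reward of this kind can be sketched as a negative weighted sum of an error penalty and a smoothness penalty. The exact expression and coefficient values are not recoverable from this text, so the form below is an assumption: the smoothness term uses a second difference of the three most recent measured flows as the approximate second derivative. The name `step_reward` and the coefficient values are hypothetical.

```python
def step_reward(abs_error, recent_flows, error_coef, smooth_coef):
    """Assumed reward: penalize prediction error and hydrograph curvature."""
    q_prev2, q_prev1, q_now = recent_flows[-3:]
    # Second difference as an approximation of the second derivative.
    second_diff = q_now - 2.0 * q_prev1 + q_prev2
    return -(error_coef * abs_error + smooth_coef * abs(second_diff))


# Illustrative values: error of 10 and flows of 100, 110, 130.
reward = step_reward(10.0, [100.0, 110.0, 130.0], error_coef=1.0, smooth_coef=0.5)
```

With these placeholder coefficients the second difference is 10, so the reward is -(1.0 * 10 + 0.5 * 10) = -15.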
[0210] Step 524) Perform network updates and exploration;
[0211] The network update uses a Critic network with a twin-Q structure, updated synchronously with the Actor network. The soft-update coefficient of the target networks is 0.005.
[0212] The exploration strategy uses Ornstein-Uhlenbeck noise with noise parameter σ=0.1; the agent updates the policy network parameters every 6 hours.
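The soft target update and the exploration noise can be sketched as follows. The soft-update coefficient 0.005 and σ=0.1 come from the text; the mean-reversion rate `theta`, the mean `mu`, the time step `dt`, and the class and function names are assumptions introduced for illustration.

```python
import math
import random


def soft_update(target_params, online_params, tau=0.005):
    """Move each target parameter a fraction tau toward the online parameter."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process for temporally correlated noise."""

    def __init__(self, size, sigma=0.1, theta=0.15, mu=0.0, dt=1.0, seed=None):
        self.sigma, self.theta, self.mu, self.dt = sigma, theta, mu, dt
        self.rng = random.Random(seed)
        self.state = [mu] * size

    def sample(self):
        self.state = [
            x + self.theta * (self.mu - x) * self.dt
            + self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


target = soft_update([0.0, 1.0], [1.0, 1.0])
noise = OUNoise(size=3, seed=0)
perturbation = noise.sample()  # added to the Actor output during exploration
```

A small soft-update coefficient makes the target networks track the online networks slowly, which stabilizes the value targets used by the Critic.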
[0213] Step 53) Dynamic prediction and recalibration;
[0214] Step 531) Flood water quality prediction;
[0215] The historical water quality observation data obtained in step 1) include dissolved oxygen (DO), pH, total phosphorus, total nitrogen, chemical oxygen demand (COD), and permanganate index. The final predicted flood water quality data at each moment are calculated by multiplying the water quality observations of each historical flood by the corresponding historical flood weight and summing over the historical floods, as shown in the following expression:
[0216] $\hat{C}_t = \sum_{a=1}^{A} w_a\, C_{a,t}$;
[0217] where $\hat{C}_t$ is the final predicted flood water quality data at time t; $w_a$ is the weight of the a-th historical flood; and $C_{a,t}$ is the measured flood water quality data of the a-th historical flood at time t;
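The weighted prediction of step 531) can be sketched as a per-indicator weighted sum over the historical floods. The function name `predict_quality`, the dictionary layout, and the numeric values are hypothetical and for illustration only.

```python
def predict_quality(flood_weights, historical_quality):
    """Weighted sum of historical water quality observations per indicator.

    historical_quality[a] maps indicator name -> measured value at time t
    for the a-th historical flood.
    """
    indicators = historical_quality[0].keys()
    return {
        name: sum(w * obs[name]
                  for w, obs in zip(flood_weights, historical_quality))
        for name in indicators
    }


# Two historical floods and two indicators (illustrative values only).
weights = [0.6, 0.4]
observations = [{"DO": 8.0, "pH": 7.0}, {"DO": 6.0, "pH": 7.5}]
prediction = predict_quality(weights, observations)
```

Because the weights sum to 1, each predicted indicator stays within the range spanned by the historical observations at that time step.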
[0218] Step 532) Trigger recalibration;
[0219] The absolute prediction error is monitored in real time. When the error at a given hour is the largest prediction error on record, a model state mismatch is diagnosed and the recalibration mechanism is triggered immediately: the hidden state of the long short-term memory network is reset, and the weights are re-initialized and optimized again.
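One possible reading of the trigger condition is sketched below: recalibration fires when the current absolute error exceeds every error seen so far. The class name `RecalibrationMonitor` is hypothetical, and the choice not to trigger on the very first observation (when no history exists) is an assumption.

```python
class RecalibrationMonitor:
    """Track the largest absolute error and flag new historical maxima."""

    def __init__(self):
        self.max_error = None  # no history yet

    def check(self, abs_error):
        """Return True when abs_error exceeds all previously seen errors."""
        if self.max_error is None:
            self.max_error = abs_error
            return False  # assumed: the first observation never triggers
        if abs_error > self.max_error:
            self.max_error = abs_error
            return True
        return False


monitor = RecalibrationMonitor()
triggers = [monitor.check(e) for e in [5.0, 3.0, 7.0, 7.0, 2.0]]
```

When `check` returns True, the caller would reset the LSTM hidden state and regenerate the initial weights before resuming the reinforcement learning optimization.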
[0220] The above description is only a preferred embodiment of the present invention. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the technical principles of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.
Claims
1. A method for intelligent prediction of water quality in watershed floods based on dynamic weight optimization and deep reinforcement learning, characterized in that it comprises the following steps: collecting historical flood water quality observation data and historical flood hydrological data; standardizing the historical flood hydrological data to obtain comparison sequences and a reference sequence, and nondimensionalizing the comparison sequences and the reference sequence; calculating dynamic grey relational coefficients from the nondimensionalized comparison sequences and reference sequence; calculating the weighted grey relational degree of each comparison sequence with respect to the reference sequence from the dynamic grey relational coefficients, and sorting the weighted grey relational degrees to form a weighted grey relational degree sequence; selecting the effective historical flood events using the weighted grey relational degree sequence, and dynamically optimizing the historical flood weights of the effective historical flood events based on a long short-term memory network and a reinforcement learning agent based on the TD3 algorithm; calculating the final predicted flood water quality data from the historical flood water quality observation data and the historical flood weights; wherein generating the initial historical flood weights with the long short-term memory (LSTM) network comprises the following steps: extracting the historical flood hydrological data of the effective historical flood events, constructing a standardized 216-dimensional feature vector for each historical flood, and constructing a real-time flood state vector; stacking the standardized 216-dimensional feature vectors with the real-time flood state vector to form the input matrix of the long short-term memory network; and feeding the standardized input matrix into a bidirectional long short-term memory network for high-dimensional spatiotemporal feature extraction.
The attention weights at each time step during the flood process are calculated based on the spatiotemporal attention mechanism, and the weighted context feature vectors are generated using the obtained attention weights at each time step. The context vector is input into a fully connected neural network for transformation, and after normalization by the softmax function, the initial probability distribution weights of historical floods are obtained. A reinforcement learning agent based on the TD3 algorithm is used to normalize the initial weights. Real-time optimization includes the following steps: First, construct the state space, and at each optimization time... State vector It consists of the following four-dimensional features, expressed as follows: ; ; ; ; in, A standardized 216-dimensional feature vector; Let be the absolute error of the prediction at time k; Predict the flow rate value for the hydrological model at time k; Let be the measured flow rate at time k; This represents the measured rate of change in flow rate; The attenuation coefficient is... ; For the filtered first Weighted grey relational degree of a historical flood; The weighted grey relational degree is decayed over time; t0 is the flood surge time. The expression for the standardized 216-dimensional feature vector is: ; in, A standardized 216-dimensional feature vector; The flood surge flow is the flood surge flow sequence, which consists of the flood surge flow value. It consists of 72 copies. 
After filtering Hourly cumulative rainfall sequence of a historical flood; After filtering The hourly cumulative water volume sequence of a historical flood; x=1,2,…,72; For the first A historic flood, ; Represented as a 216-dimensional real number space; Secondly, reinforcement learning agents are used to generate actions through the Actor policy network to update historical flood weights: The action output by the reinforcement learning agent at time k Adjustment amount for historical flood weights The constraint is within the range of [-0.1, 0.1], and the expression is: ; Actions are generated by the Actor policy network. The generation is represented as: ; in, Indicates that the Actor policy network is the first... The recommended weighting adjustment for historical flood events; Historical flood weights are updated using the following formula: ; in, No. The old weight of a historical flood at time k; This is a truncation function to ensure that the calculated new weights do not exceed a reasonable range of [0, 1]. For the first The new weight of the historical flood at time k+1; right The weight normalization process is performed, and the expression is: ; in, This is the sum of the weights of all historical floods after the update; Furthermore, a reward function R is designed to optimize prediction accuracy and process line smoothness, expressed as: ; ; in, The weighting coefficients for the error penalty term. ; This is the second derivative of the measured flow rate process line; The weighting coefficients for the smoothness penalty term. ; Finally, the network update uses a Critic network with a dual-Q network structure, updated synchronously with the Actor network, and the target network soft update coefficient is 0.005; the exploration strategy uses Ornstein-Uhlenbeck noise with noise parameter σ=0.1; the agent updates the strategy network parameters synchronously every 6 hours.
2. The intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning according to claim 1, characterized in that, Using the starting point of each historical flood rainfall event as the time origin, a flood similarity index set is extracted, including the initial flow rate, hourly cumulative rainfall sequence, and hourly cumulative water volume sequence, forming a comparison sequence, expressed as: ; in, For comparison sequences; The initial rise flow rate of the i-th historical flood; For comparison, hourly cumulative rainfall sequences; This is a comparison sequence of hourly cumulative water volume; x = 1, 2, ..., 72; i represents the i-th historical flood. N is a positive integer; Depending on the flood rise time t0, the flood rise flow, hourly cumulative rainfall sequence, and hourly cumulative water volume sequence are obtained through automatic data collection or manual observation from hydrological and rainfall monitoring stations. Using flood forecast data, the stage-specific cumulative water volume within the flood forecast period is obtained to form a reference sequence, expressed as: ; in, For reference sequence; This represents the current measured flood surge flow rate; The hourly cumulative rainfall series serves as a reference series. The hourly cumulative water volume sequence is the reference sequence; y=1,2,…, ; Subscript For flood forecast period, ≤72.
3. The intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning according to claim 1, characterized in that, The comparison sequence and the reference sequence are dimensionless, expressed as follows: ; ; ; in, The comparison sequence is dimensionless, with j ∈ {flow, rainfall, water volume} as the three index dimensions, j = 1, 2, ... ; To compare the data of the j-th similarity index of the i-th flood in the sequence; The data is for the j-th similarity index in the reference sequence; This is the reference sequence after dimensionless processing.
4. The intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning according to claim 3, characterized in that, Calculate the absolute difference between the dimensionless comparison sequence and the reference sequence. The expression is: ; Extract the maximum value. Minimum value The expression is: ; ; in, This indicates taking the minimum value in a series; This indicates taking the maximum value in the series; Using absolute difference maximum value Minimum value Calculate the dynamic grey relational coefficient The expression is: ; in, The time decay resolution coefficient, The expression is as follows: ; in, For predicting time.
5. The intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning according to claim 4, characterized in that the weighted grey relational degree is calculated from the dynamic grey relational coefficients of the indicators, expressed as: ; where the result is the weighted grey relational degree of the i-th flood, and the indicator weights are 0.4 for flow rate, 0.3 for rainfall, and 0.3 for water volume. The weighted grey relational degrees of the n comparison sequences relative to the same reference sequence are arranged in order of magnitude to form a weighted grey relational degree sequence.
6. The intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning according to claim 5, characterized in that the number A of effective historical flood events is selected using the weighted grey relational degree sequence, expressed as follows: ; and the historical floods corresponding to the first A weighted grey relational degrees in the weighted grey relational degree sequence are taken as the similar flood set.
7. The intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning according to claim 6, characterized in that, The expression for the real-time flood state vector is: ; in, Let the real-time flood state vector be the vector; for the real-time flood state vector... t>t0+ During this period, its cumulative rainfall and cumulative water volume The input is filled with the historical average or zero value to meet the neural network’s requirement for a fixed input size and ensure that the dimension is always 216. The expression for the input matrix X of a Long Short-Term Memory (LSTM) network is: ; The standardized input matrix X is fed into a bidirectional long short-term memory network for high-dimensional spatiotemporal feature extraction, as shown in the following expression: ; ; ; in, Input for the current time step; and These are the hidden states of the forward long short-term memory network and the backward long short-term memory network at the previous time step t-1 and the next time step t+1, respectively. and The cells represent the forward long short-term memory network and the backward long short-term memory network at the previous time step t-1 and the next time step t+1, respectively. and These are the hidden states of the forward long short-term memory network and the backward long short-term memory network at the current time step t, respectively. and These represent the cell states of the forward long short-term memory network and the backward long short-term memory network at the current time step t, respectively. The final feature representation at time step t is formed by concatenating the hidden states of the forward long short-term memory network and the backward long short-term memory network at the current time step t; the Tanh function is used as the cell state activation function. 
A spatiotemporal attention mechanism is introduced to calculate the attention weight at each time step, focusing on the critical periods of the flood process. The expression for the attention weight at each time step is as follows: ; in, Let be the normalized attention weights at time step t; It is an exponential function; It is the sum of the exponential fractions of all time steps; The energy fraction at time step t is expressed as: ; in, Trainable feature score vectors; It is the hyperbolic tangent activation function; is a trainable weight matrix; It is a trainable bias vector; Context feature vector The expression is: ; context vector The input is transformed into a fully connected neural network, and the expression is: ; in, This is the original weight score vector; This is the first layer weight matrix; This is the first layer bias vector; This is the weight matrix for the second layer; This is the second layer bias vector; It is a linear rectifier unit. It is an S-shaped function, outputting the original weight score vector. ; The initial probability distribution weights of historical floods, obtained after normalization using the softmax function, are expressed as follows: ; in, For the first Normalized initial weights for historical floods; For the first The original weighted score vector of a historical flood.
8. The intelligent water quality prediction method for watershed floods based on dynamic weight optimization and deep reinforcement learning according to claim 1, characterized in that the final predicted flood water quality data at each moment are calculated by multiplying the water quality observations of each historical flood by the corresponding historical flood weight and summing over the historical floods, expressed as: $\hat{C}_t = \sum_{a=1}^{A} w_a\, C_{a,t}$; where $\hat{C}_t$ is the final predicted flood water quality data at time t, $w_a$ is the weight of the a-th historical flood, and $C_{a,t}$ is the measured flood water quality data of the a-th historical flood at time t; and an error assessment is performed on the final predicted flood water quality data: the absolute prediction error is monitored in real time, and when the error at a given hour is the largest absolute error on record, a model state mismatch is diagnosed and the recalibration mechanism is triggered immediately.