Method for monitoring abnormal conditions in sewage treatment processes based on a broad slow feature neural network with incremental learning capability

The ILBSFNN model extracts slow features of the wastewater treatment process and dynamically expands its enhancement nodes. This addresses the insufficient extraction of nonlinear and dynamic characteristics when monitoring abnormal operating conditions during wastewater treatment, and so enables efficient monitoring of those conditions.

CN116842420B · Active · Publication Date: 2026-06-19 · BEIJING UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING UNIV OF TECH
Filing Date
2023-06-05
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies struggle to efficiently monitor abnormal conditions during wastewater treatment, especially as the nonlinear and dynamic characteristics resulting from complex reactions and sedimentation processes are not fully extracted, and network training and updates are time-consuming.

Method used

An Incremental Learning Broad Slow Feature Neural Network (ILBSFNN) model extracts dynamic features through slow feature analysis and adds enhancement nodes incrementally, thereby avoiding network retraining and reducing time overhead.

Benefits of technology

It achieves high-precision, low-cost monitoring of abnormal operating conditions in the wastewater treatment process, meets the requirements for real-time online updates, and improves the accuracy and efficiency of monitoring.



Abstract

This method for monitoring abnormal conditions in wastewater treatment processes, based on a broad slow feature neural network with incremental learning capability (ILBSFNN), addresses the need for high accuracy and low time overhead in wastewater treatment anomaly monitoring models. It targets two problems: the significant nonlinearity of the collected data and the insufficient extraction of their dynamic characteristics, while also reducing network time consumption. When higher monitoring accuracy is required but network tuning is time-consuming, the ILBSFNN model expands the width of the network, rather than its depth, by dynamically adding incremental enhancement nodes and determining the parameters of the new nodes. The network therefore always has a single hidden layer, which keeps its structure simple and removes the need to retrain the entire system. This reduces the time overhead of network tuning and updates, gives high learning efficiency, and meets the real-time online monitoring requirements of actual wastewater treatment plants.

Description

Technical Field

[0001] This study aims to monitor abnormal operating conditions in wastewater treatment processes and proposes a monitoring method based on a broad slow feature neural network with incremental learning capability.

Background Technology

[0002] Monitoring abnormal operating conditions at wastewater treatment plants is crucial for ensuring the safe operation of the wastewater treatment process and achieving effluent quality standards. Wastewater from residential and industrial plants must undergo rigorous treatment before discharge. However, when large-scale wastewater treatment units malfunction, the losses are often catastrophic. For example, the deterioration of sludge settling, the increase in suspended particulate matter, and significant sludge loss directly threaten the system's normal wastewater treatment capacity. Furthermore, the characteristics of raw data collected from the wastewater treatment process are typically non-linear and dynamic, which can significantly affect the performance of early monitoring networks, and consequently, the effectiveness of abnormal operating condition monitoring. Therefore, extracting valuable characteristic information from the wastewater treatment process can improve the accuracy and efficiency of abnormal operating condition monitoring.

[0003] Over the past few decades, the rapid development of industrial automation and sensor technology has driven research on process anomaly monitoring models. The two main approaches to process monitoring are multivariate statistical methods and neural network methods. Multivariate statistical process monitoring (MSPM) techniques, such as Principal Component Analysis (PCA), Partial Least Squares (PLS), Independent Component Analysis (ICA), Kernel Principal Component Analysis (KPCA), Kernel Partial Least Squares (KPLS), and Kernel Independent Component Analysis (KICA), have been widely applied to industrial process monitoring. However, kernel methods have drawbacks, including the lack of prior knowledge for selecting ideal kernel parameters and functions, and the enormous computational cost generated by kernel projection and delay functions. Artificial Neural Networks (ANNs), owing to their excellent nonlinear representation capabilities, are used for anomaly monitoring in various industrial processes; examples include Deep Belief Networks (DBNs), Stacked Autoencoders (SAEs), and Recurrent Neural Networks (RNNs). Although these deep networks extract nonlinear features well, their complex structures require frequent parameter adjustments, leading to high time costs in training and retraining. To address the challenge of rapidly training and updating networks, broad learning systems (BLS) use pseudo-inverse computation to obtain the network output weights. Since its inception, BLS has attracted widespread research attention, with variants including recurrent BLS (RBLS), broad convolutional neural networks (BCNN), overcomplete BLS (OBLS), and multi-stage BLS (MBLS).

[0004] This study employs the Incremental Learning Broad Slow Feature Neural Network (ILBSFNN) model. The ILBSFNN model expands the width of the network, rather than its depth, by dynamically adding incremental enhancement nodes and determining the parameters of the new nodes. The network therefore always has a single hidden layer, which keeps its structure simple and removes the need to retrain the entire system. This reduces the time overhead of network tuning and updates, resulting in high learning efficiency.

Summary of the Invention

[0005] This study addresses the need for high accuracy and low time overhead in wastewater treatment process anomaly monitoring models by proposing an Incremental Learning Broad Slow Feature Neural Network (ILBSFNN) model to monitor wastewater treatment processes accurately and efficiently. Specifically, ILBSFNN uses slow feature analysis to extract slowly changing information, thereby learning the dynamic characteristics of the process variables during wastewater treatment. A further advantage of the ILBSFNN model is its ability to reduce network time overhead. When higher monitoring accuracy is required but network tuning is time-consuming, incremental enhancement nodes can be added dynamically, with only the parameters of the new nodes being determined, to widen the network. The ILBSFNN monitoring network thus avoids retraining and updating from scratch, which reduces the time cost of network tuning and updates and meets the real-time online monitoring requirements of actual wastewater treatment plants.

[0006] This study focuses on the wastewater treatment process and is organized around two main objectives: monitoring accuracy and monitoring efficiency. In practical wastewater treatment applications, setting appropriate parameters for the ILBSFNN network in advance can be difficult, especially when high accuracy is required for anomaly monitoring. In such cases, adding nodes to the anomaly monitoring network can improve its accuracy and efficiency. ILBSFNN adds enhancement nodes dynamically, so they can be integrated into the existing network without retraining it. The anomaly monitoring method is divided into three parts: preprocessing, model training, and online application.

[0007] Preprocessing stage:

[0008] Step 1: Obtain the dataset. Divide the dataset into training and testing datasets in a 3:2 ratio. Next, determine the input and output process variables of the monitoring model. Input variables include: influent suspended solids concentration, readily biodegradable organic matter concentration, granular inert organic carbon concentration, active heterotrophic bacteria concentration, active autotrophic bacteria concentration, concentration of particulate matter produced by biodegradation, dissolved oxygen concentration, nitrate nitrogen concentration, ammonia nitrogen concentration, soluble biodegradable organic nitrogen concentration, alkalinity (molar concentration), particulate biodegradable organic nitrogen concentration, and slowly biodegradable organic matter concentration; the input variables are represented by X. Output variables are the monitoring classification results for the sludge bulking fault type, the toxic shock fault type, and the inhibition fault type, represented by Y. The test set data $X_{new}$ are used to test the model's monitoring performance for abnormal operating conditions in wastewater treatment on 18 fault scenarios belonging to three fault types: sludge bulking fault, toxic shock fault, and inhibition fault. The input variable names for the 18 fault scenarios are the same as those in the training set.

[0009] Step 2: Denote the training set used for model training by X, i.e., $X \in R^{N \times M}$. Normalization is performed to eliminate the adverse effects on experimental results caused by the different scales of the variables. R represents the set of real numbers, N is the number of samples, and M is the number of process variables. The M-dimensional input signal is $X = [x_1(t), x_2(t), \ldots, x_M(t)]$, where t is the sample index.
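The preprocessing in Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text says only that the data are split 3:2 and normalized, so the chronological split, the z-score scaling, and the name `split_and_normalize` are assumptions, and the data are synthetic.

```python
import numpy as np

def split_and_normalize(X, y, train_fraction=0.6):
    """Split samples 3:2 (assumed chronological) and z-score both parts
    with statistics taken from the training part only (assumed scheme)."""
    n_train = int(round(train_fraction * X.shape[0]))
    X_train, X_test = X[:n_train], X[n_train:]
    y_train, y_test = y[:n_train], y[n_train:]
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return (X_train - mean) / std, (X_test - mean) / std, y_train, y_test

# Synthetic stand-in for N = 100 samples of the M = 13 process variables.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=3.0, size=(100, 13))
y = rng.integers(0, 3, size=100)
X_train, X_test, y_train, y_test = split_and_normalize(X, y)
```

Scaling the test part with the training statistics keeps the online data on the same scale as the data the model was fitted to.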

[0010] Network training phase

[0011] Step 3: The goal of the ILBSFNN model is to find a set of mapping functions $g(\cdot)$, where m represents the number of slow features. The slow features of ILBSFNN are obtained by applying the mapping functions to the input signal X. Following this mapping, the slow feature information is obtained as follows:

[0012] $sf = g(X) = [g(x_1(t)), g(x_2(t)), \ldots, g(x_m(t))]^T$

[0013] $= [sf_1, sf_2, \ldots, sf_m]^T$

[0014] Where X represents the 13 input variables of the training dataset: the mass concentrations of suspended solids, readily biodegradable organic matter, particulate inert organic carbon, active heterotrophic bacteria, active autotrophic bacteria, particles generated by biodegradation, dissolved oxygen, nitrate nitrogen, ammonia nitrogen, soluble biodegradable organic nitrogen, alkalinity (molar concentration), particulate biodegradable organic nitrogen, and slowly biodegradable organic matter. sf is the set of slow features extracted by the ILBSFNN model using slow feature analysis (SFA); $sf_1$, $sf_2$, and $sf_m$ are the 1st, 2nd, and m-th slow features, sorted from the slowest to the fastest rate of change; $g(X)$ is a set of mapping functions; $[\cdot]^T$ denotes the transpose. $g(x_1(t))$, $g(x_2(t))$, and $g(x_m(t))$ are the 1st, 2nd, and m-th slow feature information obtained from $X = [x_1(t), x_2(t), \ldots, x_M(t)]$ through the mapping function.

[0015] Step 4: The ILBSFNN model first needs to find the features that change the slowest. Requiring every $\Delta(sf_i)$ to be minimized gives the following optimization problem:

[0016] $\min \Delta(sf_i) = \min \langle \dot{sf}_i^2 \rangle_t$

[0017] Where $\langle \cdot \rangle_t$ is the average computed over the slow feature samples at N time points; $\dot{sf}_i$ is the first-order difference of the i-th slow feature; $\Delta(\cdot)$ is a natural measure of the rate of change of $sf_i$; and $\dot{sf}_i^2$ is the square of $\dot{sf}_i$.

[0018] Step 5: The constraints in Step 4 are:

[0019] $\langle sf_i \rangle_t = 0$

[0020] $\langle sf_i^2 \rangle_t = 1$

[0021] $\langle sf_i\, sf_j \rangle_t = 0, \quad \forall j \neq i$

[0022] The three constraints are, in order, the zero-mean constraint $\langle sf_i \rangle_t = 0$, the unit variance constraint $\langle sf_i^2 \rangle_t = 1$, and the decorrelation constraint $\langle sf_i\, sf_j \rangle_t = 0$. The zero-mean and unit variance constraints prevent the extracted slow features from being constant; they also bring the extracted slow features to a common scale, so that their rates of change can be compared fairly. The decorrelation constraint requires the slow features extracted by SFA to be pairwise uncorrelated, which avoids simple repetition between different slow features. $\Delta(sf_i)$ measures how slowly $sf_i$ changes; $\langle \cdot \rangle_t$ is the average computed over the slow feature samples at N time points.

[0023] Step 6: For discrete data, use first-order differences to represent the rate of change:

[0024] $\dot{sf}_i(t) = sf_i(t) - sf_i(t-1)$

[0025] Each slow feature, $sf_i$ and $sf_{i-1}$ alike, is represented as a linear combination of all input variables through the mapping function, that is:

[0026] $sf_i = g(X) = w_i^T X$

[0027] $sf = WX$

[0028] Where $W = [w_1, w_2, \ldots, w_m]^T$ is the parameter matrix to be optimized by SFA; $w_i$ is a coefficient vector of the parameter matrix W; $g(X)$ is a set of mapping functions; $w_i^T$ is the transpose of $w_i$; and sf is the set of slow features extracted by the ILBSFNN model through SFA. It is easy to show that, for the slow features to satisfy the zero-mean constraint, the mean must first be removed from the input variables.

[0029] Step 7: Substitute the formula from Step 6 into Steps 4 and 5 to obtain the optimization objective of SFA in the linear case:

[0030] $\min \langle \dot{sf}_i^2 \rangle_t = \min w_i^T \langle \dot{X}\dot{X}^T \rangle_t w_i = \min w_i^T A w_i$

[0031] Where $\langle \dot{X}\dot{X}^T \rangle_t$ is denoted by the symbol A, i.e., $A = \langle \dot{X}\dot{X}^T \rangle_t$ is the covariance matrix of the first-order difference of the input X; $\langle XX^T \rangle_t$ is denoted by the symbol B, i.e., $B = \langle XX^T \rangle_t$ is the covariance matrix of the input X. $\dot{X}$ is the first-order difference of the input X; $w_i$ is a coefficient vector of the parameter matrix W; $w_i^T$ is the transpose of $w_i$; $\langle \cdot \rangle_t$ is the average computed over the slow feature samples at N time points; $\dot{sf}_i$ is the first-order difference of the i-th slow feature; $\Delta(\cdot)$ is a natural measure of the rate of change of sf; and $\dot{sf}_i^2$ is the square of $\dot{sf}_i$. With these definitions, the unit variance constraint of Step 5 becomes:

[0032] $\langle sf_i^2 \rangle_t = w_i^T B w_i = 1$.

[0033] Step 8: Use singular value decomposition to find the parameter matrix W. First, perform singular value decomposition on matrix B:

[0034] $B = U \Lambda U^T$

[0035] Where U is an orthogonal matrix composed of singular vectors; $\Lambda$ is a diagonal singular value matrix whose diagonal elements are the singular values of B; T represents the transpose.

[0036] Step 9: Based on the above equation, the original input X is sphered (whitened) to remove correlation, i.e.:

[0037] $z = \Lambda^{-1/2} U^T X$

[0038] Where U is an orthogonal matrix composed of singular vectors; $\Lambda$ is a diagonal singular value matrix whose diagonal elements are the singular values of B; T denotes the transpose; and z is the sphered result of the input X, with $\langle zz^T \rangle_t = I_m$, where $I_m$ is the m-order identity matrix. The original optimization problem is therefore transformed into finding a matrix P such that $sf = Pz$ and $\langle sf\,sf^T \rangle_t = I_m$, which satisfies the unit variance and decorrelation constraints. Substituting $sf = Pz$ into $\langle sf\,sf^T \rangle_t = I_m$, we get:

[0039] $PP^T = I_m$

[0040] Where T denotes the transpose; the above formula shows that P is an orthogonal matrix. The P that minimizes $\langle \dot{sf}_i^2 \rangle_t$ is obtained by performing singular value decomposition on $\langle \dot{z}\dot{z}^T \rangle_t$, the covariance matrix of the first-order difference of z, where $\dot{z}$ is the first-order difference of z:

[0041] $\langle \dot{z}\dot{z}^T \rangle_t = P^T \Omega P$

[0042] Where P is an orthogonal matrix composed of eigenvectors; $\Omega$ is a diagonal singular value matrix whose diagonal elements are the rate-of-change values of the individual slow features; and T represents the transpose.

[0043] Step 10: Through the above steps, the final parameter coefficient matrix W is obtained by the following formula:

[0044] $W = P \Lambda^{-1/2} U^T$

[0045] Where U is an orthogonal matrix composed of singular vectors; $\Lambda$ is a diagonal singular value matrix whose diagonal elements are the singular values of B; T represents the transpose; and P is an orthogonal matrix composed of eigenvectors.

[0046] Step 11: Obtain the slow feature sf based on steps 1-10:

[0047] $sf = WX = P \Lambda^{-1/2} U^T X = Pz$
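Steps 4 to 11 describe linear slow feature analysis: whiten the centered input, then rotate it so that the first-order differences have the smallest variance. The sketch below is one possible NumPy rendering under stated assumptions: samples are stored as rows (so the code computes `Xc @ W.T` where the text writes $sf = WX$), the function name `linear_sfa` and the synthetic signals are illustrative only, and B is assumed to be full rank.

```python
import numpy as np

def linear_sfa(X, m):
    """Linear SFA sketch. X has shape (N, M): samples by variables.
    Returns W (m, M), the training mean, and the m smallest slowness values."""
    mean = X.mean(axis=0)
    Xc = X - mean                              # enforce the zero-mean constraint
    B = Xc.T @ Xc / Xc.shape[0]                # B = <X X^T>_t
    U, lam, _ = np.linalg.svd(B)               # B = U diag(lam) U^T (B is symmetric PSD)
    whiten = np.diag(lam ** -0.5) @ U.T        # Lambda^{-1/2} U^T
    Z = Xc @ whiten.T                          # rows are z(t)^T, with <z z^T>_t = I
    dZ = np.diff(Z, axis=0)                    # first-order difference of z
    A_z = dZ.T @ dZ / dZ.shape[0]              # <z_dot z_dot^T>_t
    omega, V = np.linalg.eigh(A_z)             # eigenvalues ascending: slowest first
    P = V.T                                    # rows of P are eigenvectors
    W = P[:m] @ whiten                         # W = P Lambda^{-1/2} U^T, first m rows
    return W, mean, omega[:m]

# Synthetic example: one slow sine hidden in three noisy mixtures.
t = np.linspace(0.0, 4.0 * np.pi, 500)
rng = np.random.default_rng(0)
slow = np.sin(t)
fast = rng.normal(size=t.size)
X = np.column_stack([slow + 0.5 * fast,
                     slow - 0.5 * fast,
                     fast + 0.1 * rng.normal(size=t.size)])
W, mean, omega = linear_sfa(X, m=2)
sf = (X - mean) @ W.T                          # slow features, one row per sample
```

The retained features have zero mean, unit variance, and are mutually uncorrelated, and the first one tracks the sine, which is the slowest component of the mixtures.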

[0048] Step 12: Input the sf obtained from slow feature analysis into the feature layer windows of the ILBSFNN network for further feature extraction, and form the i-th group of feature nodes $SF_i$ through the feature mapping function:

[0049]

[0050] Here the feature mapping function of the ILBSFNN network extracts variable information from the sewage dataset and feeds it into the enhancement window. Tansig is chosen as the activation function of the feature layer, and p is the number of groups of feature nodes in the feature layer. The weight matrix and the bias vector mapped to the feature nodes of the i-th window are randomly generated. To limit the influence of this randomness on the weights, a sparse autoencoder is used for fine-tuning, where the sparse matrix $W_s$ is obtained through the following formula:

[0051]

[0052] Where $\lambda_1$ is the L1 regularization parameter, $\lambda_1 = 2^{-8}$. After sparsification, the feature nodes are $SF_i = sf\,W_s$.

[0053] Step 13: Repeat the previous step p times to obtain the feature layer output $SF^p$:

[0054] $SF^p = (SF_1, SF_2, \ldots, SF_p)$

[0055] Step 14: The feature layer output $SF^p$ from step 13, which the feature layer of the ILBSFNN network derives from the 13 input variables of the wastewater treatment process, serves as the input of the enhancement layer. The activation function of the enhancement layer applies nonlinear processing to $SF^p$ in order to address the nonlinearity and dynamics of the wastewater treatment process data. The j-th group of enhancement nodes $E_j$ is constructed from $SF^p$ as shown below:

[0056]

[0057] Where q represents the total number of enhancement nodes in the enhancement layer and $\xi_j$ is the activation function of the enhancement layer; sigmoid is chosen because it is a nonlinear activation function. As with the weights and biases in the feature layer, the connection weights and biases of the enhancement layer are randomly generated and uniformly distributed within the interval (0,1).

[0058] Step 15: The output of the enhancement layer is denoted by $E^q$:

[0059] $E^q = (E_1, E_2, \ldots, E_q)$

[0060] Step 16: After obtaining all feature layers and enhancement layers based on steps 13-15, the first output Y of the ILBSFNN network is:

[0061] $Y = [SF_1, SF_2, \ldots, SF_p \mid E_1, E_2, \ldots, E_q]\, W_1 = [SF^p \mid E^q]\, W_1 = A_1 W_1$

[0062] The actual output Y of the ILBSFNN network output layer represents the classification results for each of the three fault types: sludge bulking fault, toxic shock fault, and inhibition fault. $[SF_1, SF_2, \ldots, SF_p \mid E_1, E_2, \ldots, E_q]$ is denoted by $A_1$, i.e., $A_1 = [SF^p \mid E^q]$, and the connection weights of the output layer of the first network are denoted by $W_1$. The ILBSFNN network output is $Y = (y_1, y_2, \ldots, y_N)^T \in R^{N \times C}$, where the $y$ are the labels, N is the number of samples, C is the output dimension, and R is the set of real numbers. $W_1 = [SF^p \mid E^q]^+ Y$ is obtained using ridge regression. $\xi_1$ through $\xi_q$ are sigmoid activation functions of the enhancement layer, and the activation functions of the feature layer are all tansig. $W_1$ comprises the connection weights from the enhancement layer to the output layer and the connection weights from the feature layer to the output layer of the first network.

[0063] Step 17: For convenience, based on steps 15 and 16, a generalized mapping function is used to represent the generation of the p groups of feature nodes; its weights are randomly generated by the feature layer in the interval [0,1], its biases are randomly generated by the feature layer, and its activation function is tansig. A second generalized mapping function represents the generation of the q groups of enhancement nodes; its weights are randomly generated by the enhancement layer in the interval [0,1], its biases are randomly generated by the enhancement layer, and its activation functions $\xi_1, \xi_2, \ldots, \xi_q$ are sigmoid, as in Steps 14 and 16. With these, the formula in step 16 for the output Y of the ILBSFNN network is further expressed as:

[0064]

[0065] Here the two sets of weight parameters are the random weights generated by the feature layer and the enhancement layer, respectively: the former is the weight matrix mapped to the feature nodes of the i-th window, and the latter are the weights randomly generated in the enhancement layer. The remaining symbol denotes the composite of the two generalized mapping functions, i.e.

[0066]

[0067] Step 18: Theoretically, the connection weights $W_1$ of the ILBSFNN network can be solved using ridge regression, which does not require excessive time:

[0068]

[0069]

[0070]

[0071] Where $A_1^+$ is the pseudo-inverse of $A_1$, with $A_1 = [SF^p \mid E^q]$; $\lambda$ is the L2 regularization coefficient, $\lambda = 2^{-30}$; I is the identity matrix; and $A_1^T$ is the transpose of $A_1$. $W_1$ denotes the network connection weights of the output layer of the ILBSFNN, $W_1 = [SF^p \mid E^q]^+ Y$;
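Steps 12 to 18 can be sketched as below. The patent's own equations are not legible in this text, so the sketch follows the usual broad-learning-system layout as an assumption: random weights and biases drawn uniformly from (0,1), tansig feature nodes, sigmoid enhancement nodes, and a ridge-regularized pseudo-inverse $(\lambda I + A_1^T A_1)^{-1} A_1^T$ with $\lambda = 2^{-30}$. The sparse autoencoder fine-tuning of Step 12 is omitted, and all function names and the small layer sizes are illustrative.

```python
import numpy as np

def tansig(x):
    return np.tanh(x)  # tansig(x) = 2 / (1 + exp(-2x)) - 1, which equals tanh(x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ridge_pinv(A, lam=2.0 ** -30):
    """Regularized pseudo-inverse (lam * I + A^T A)^{-1} A^T (assumed form)."""
    k = A.shape[1]
    return np.linalg.solve(lam * np.eye(k) + A.T @ A, A.T)

def build_layers(sf, p, nodes_per_window, q, rng):
    """Feature nodes SF^p (p windows) and enhancement nodes E^q from slow features."""
    m = sf.shape[1]
    windows = []
    for _ in range(p):
        Wf = rng.uniform(0.0, 1.0, size=(m, nodes_per_window))  # random feature weights
        bf = rng.uniform(0.0, 1.0, size=nodes_per_window)       # random feature biases
        windows.append(tansig(sf @ Wf + bf))
    SF = np.hstack(windows)
    We = rng.uniform(0.0, 1.0, size=(SF.shape[1], q))           # random enhancement weights
    be = rng.uniform(0.0, 1.0, size=q)                          # random enhancement biases
    E = sigmoid(SF @ We + be)
    return SF, E

# Synthetic stand-in: 200 samples of 5 slow features, 3 fault classes.
rng = np.random.default_rng(0)
sf = rng.normal(size=(200, 5))
labels = rng.integers(0, 3, size=200)
Y = np.eye(3)[labels]                       # one-hot targets
SF, E = build_layers(sf, p=3, nodes_per_window=4, q=5, rng=rng)
A1 = np.hstack([SF, E])                     # A1 = [SF^p | E^q]
W1 = ridge_pinv(A1) @ Y                     # output weights
Y_hat = A1 @ W1                             # network output
```

Because only the output weights are solved for, training reduces to one linear solve; the random hidden weights are never updated.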

[0072] Step 19: Generate the output y1 of ILBSFNN.

[0073]

[0074] Here the two sets of weight parameters are the random weights generated by the feature layer and the enhancement layer: the former is the weight matrix mapped to the feature nodes of the i-th window, and the latter are the weights randomly generated in the enhancement layer. The output weights comprise the connection weights from the enhancement layer to the output layer and the connection weights from the feature layer to the output layer of the ILBSFNN network; they contain the important feature information learned from training on the 13 input variables of the wastewater treatment training set.

[0075] Step 20: The ILBSFNN network can dynamically add enhancement nodes through horizontal expansion. It uses the features mapped from the input as feature nodes, then randomly generates weighted enhancement nodes, and connects the mapped features and the enhancement nodes directly to the output. The corresponding output coefficients are obtained through pseudo-inversion. Consequently, after new enhancement nodes are added, ILBSFNN does not need to learn from scratch; it only needs to adjust the weights related to the new nodes on top of the feature layers of the original network. Assume that the ILBSFNN model adds b groups of enhancement nodes. The output $y_2$ of the ILBSFNN network is then updated, where $W_{be}$ denotes the connection weights of the newly added enhancement node layer, randomly generated in the interval (0,1); $W_{bE}$ denotes the connection weights from the newly added enhancement node layer to the updated output layer; and the remaining weights are those randomly generated in (0,1) within the feature layer and the enhancement layer, respectively.

[0076] Step 21: $W_{bE}$ is dynamically updated using the following formula:

[0077]

[0078] A distinctive feature of ILBSFNN is that it retains the parameters computed before the additional enhancement nodes are added, thus avoiding the time overhead of retraining and updating the network from scratch. Part of the expression is abbreviated by the symbol D.

[0079] Step 22: The specific expression of the symbol $B^T$ from Step 21 is:

[0080]

[0081] Where part of the expression is abbreviated by the symbol C; $(C)^+$ denotes the pseudo-inverse of C; I is the identity matrix; and D is the symbol introduced in Step 21.

[0082]

[0083] Step 23: Therefore, the connection weights of the updated ILBSFNN network are further expressed based on steps 20-22 as follows:

[0084]

[0085] Here $W_{bE}$ denotes the connection weights from the newly added enhancement node layer to the updated output layer; the other two terms are, respectively, the connection weights from the feature layer to the output layer of the ILBSFNN network and the connection weights from the original enhancement layer to the output layer. $W_{bE}$ contains the important feature information learned during training from the error between the actual output and the expected output Y for each of the three fault types: sludge bulking fault, toxic shock fault, and inhibition fault.

[0086] Step 24: Execute steps 20-23 until the training accuracy of the ILBSFNN network reaches the expected requirements of actual wastewater treatment, save the model parameters, and the dynamic update process of the ILBSFNN network is complete.
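The incremental update in Steps 20 to 24 can be illustrated with the column-append pseudo-inverse update used in standard broad learning systems. Since the patent's formulas for D, C, and $B^T$ are not legible here, the expressions below are an assumption based on that standard form, and the function name `add_enhancement_nodes` is hypothetical. The example uses an exact pseudo-inverse rather than the ridge-regularized one, and it covers only the cases where C is zero or has full column rank.

```python
import numpy as np

def add_enhancement_nodes(A, A_pinv, W, E_new, Y):
    """Append new enhancement-node outputs E_new (N, b) to A (N, k) and update
    the pseudo-inverse and output weights without refitting from scratch."""
    D = A_pinv @ E_new                         # (k, b): E_new expressed in the old nodes
    C = E_new - A @ D                          # (N, b): part of E_new outside span(A)
    if np.allclose(C, 0.0):
        # New nodes add no new direction: use (I + D^T D)^{-1} D^T A^+.
        Bt = np.linalg.solve(np.eye(D.shape[1]) + D.T @ D, D.T @ A_pinv)
    else:
        Bt = np.linalg.pinv(C)                 # (b, N)
    A_pinv_new = np.vstack([A_pinv - D @ Bt, Bt])
    W_new = np.vstack([W - D @ (Bt @ Y), Bt @ Y])   # old weights corrected, new weights appended
    return np.hstack([A, E_new]), A_pinv_new, W_new

# Synthetic check against a full recomputation.
rng = np.random.default_rng(1)
A = rng.normal(size=(40, 6))                   # existing [SF^p | E^q]
Y = np.eye(3)[rng.integers(0, 3, size=40)]     # one-hot targets
A_pinv = np.linalg.pinv(A)
W = A_pinv @ Y
E_new = rng.normal(size=(40, 3))               # outputs of b = 3 new enhancement nodes
A2, A2_pinv, W2 = add_enhancement_nodes(A, A_pinv, W, E_new, Y)
```

Only products involving the b new columns are computed, so each expansion costs far less than recomputing the pseudo-inverse of the widened matrix.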

[0087] Online application stage:

[0088] The online test dataset $X_{new}$ is acquired in real time. Based on the network parameters obtained during the training phase of the ILBSFNN network, the output of the ILBSFNN network is calculated to test the accuracy of the model in monitoring abnormal operating conditions of wastewater treatment on 18 fault scenarios belonging to three fault types: sludge bulking fault, toxic shock fault, and inhibition fault. The input variable names for the 18 fault scenarios are the same as those in the training set.

[0089] Step 25: Use the ILBSFNN framework obtained during the network training phase to design and build a monitoring model for abnormal operating conditions in the wastewater treatment process. The specific details are as follows.

[0090] Step 26: Obtain new online sampling data $X_{new}$ and preprocess them; the test set data use the same input variables as the training set as the input to the model;

[0091] Step 27: Use the ILBSFNN model obtained in steps 3-20 of the network training phase to perform online monitoring and obtain new monitoring results. If the monitoring accuracy obtained at this time does not meet the requirements of the sewage treatment plant, incremental enhancement nodes are added repeatedly by dynamically expanding the ILBSFNN monitoring model as in steps 21-24, until the monitoring accuracy is greater than or equal to the actual sewage treatment plant's standard for monitoring accuracy.

[0092] Step 28: Then return to step 23 to continue monitoring the new batch of test data;

[0093] Step 29: Repeat until the fault type monitoring results for all batches are obtained; the online application monitoring phase uses one-hot encoding to output the classification results;

[0094] Step 30: The online application phase of the ILBSFNN model is now complete.

[0095] Because of the complex reactions and sedimentation processes in wastewater treatment, sensor-collected data show significant nonlinearity and dynamic characteristics that some networks fail to extract fully. The ILBSFNN model addresses this problem by taking into account both the nonlinearity and the dynamic characteristics of the input variables in the wastewater treatment process. Unlike deep neural networks that train and update weights through backpropagation, the ILBSFNN model updates its weights through matrix (pseudo-inverse) computation, which greatly reduces the time cost of retraining and updating during online network adjustment. The ILBSFNN network therefore meets the high-accuracy and low-time-cost requirements that actual wastewater treatment plants place on abnormal operating condition monitoring models.

Attached Figure Description

[0096] Figure 1 is a structural diagram of the ILBSFNN network used in this study;

[0097] Figure 2 is a flowchart for modeling and monitoring abnormal operating conditions of wastewater treatment processes based on the ILBSFNN network;

[0098] Figure 3 is a schematic diagram of the wastewater treatment process at a wastewater treatment plant;

[0099] Figure 4 shows the experimental results of the ILBSFNN algorithm in this study.

Detailed Implementation

[0100] Wastewater treatment plants involve multiple treatment units employing physical, biological, and chemical methods. The International Water Association has developed a benchmark simulation platform, BSM1, based on the activated sludge process to simulate actual wastewater treatment processes. This benchmark simulation model is based on the $A^2O$ process flow and is built on Activated Sludge Model No. 1; it describes the sedimentation and biochemical reactions in the wastewater treatment process. It consists of five activated sludge reactors and one secondary sedimentation tank. Two of the five activated sludge reactors are anoxic, while the rest are aerated. Furthermore, BSM1 simulates the actual operation of the wastewater treatment plant under dry, rainy, and storm weather conditions, collecting data every 15 minutes. This experiment collected data for 14 days, with 1344 samples collected for each fault type. The faults occurred between samples 865 and 1344. The specific flow chart of the wastewater treatment process is shown in Figure 1; the model has 10 sedimentation layers. The BSM1 benchmark simulation model can reflect the nitrogen and phosphorus removal process of wastewater treatment from multiple perspectives, including kinetics, chemistry, and physics. Table 1 provides some wastewater treatment process variables and kinetic parameters, such as the maximum specific hydrolysis rate, readily biodegradable organic matter, NH4+ and NH3 concentrations, and soluble biodegradable organic nitrogen. Table 2 describes the 18 types of failures involved in this study, including sludge bulking failure, toxic shock failure, and inhibition failure. Table 3 describes the 18 types of failures involved in the experiments of this invention, including sludge bulking failure, toxic shock failure, and inhibition failure.
Sludge bulking is a common and serious failure type. When sludge bulking occurs, the settling and compression characteristics of the sludge deteriorate, suspended solids increase, sludge loss is severe, and the biological system may malfunction, directly threatening the normal operation of the wastewater treatment system. Toxic shock failure is caused by the discharge of large amounts of toxic wastewater, which reduces the activity of the activated sludge microorganisms and can even cause the biological reaction system to fail, affecting wastewater treatment. Inhibition failure occurs when the normal growth of heterotrophic organisms declines. In this experiment, different degrees of inhibition failure are generated by adjusting the heterotrophic bacterial growth rate $\mu_H$ and the heterotrophic decay coefficient $b_H$.
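As a quick check of the sample counts quoted above, the arithmetic can be written out as follows. It assumes that the 15-minute sampling interval holds throughout the 14 days and that the fault window 865 to 1344 is inclusive at both ends.

```latex
\[
14~\text{days} \times 24~\tfrac{\text{h}}{\text{day}} \times 4~\tfrac{\text{samples}}{\text{h}}
  = 1344~\text{samples},
\qquad
1344 - 865 + 1 = 480~\text{fault samples} = 5~\text{days}.
\]
```

Under these assumptions the first 864 samples (9 days) are fault-free and the final 5 days contain the fault.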

[0101] The specific application of the ILBSFNN model proposed in this study to actual wastewater treatment processes is described below:

[0102] A. Preprocessing and training phases of the ILBSFNN network during wastewater treatment:

[0103] Step 1: First, preprocess the sampling data collected from the wastewater treatment plant.

[0104] Step 2: Then, the slow feature analysis algorithm is used to extract slow features from the 13 input process variables. In the actual network training phase, there are a total of 26 sets of slow features, including slowly changing components and rapidly changing components. The slowly changing components contain key feature information of the wastewater treatment data, while the rapidly changing components represent noise and external / internal disturbances during wastewater treatment. These disturbances and noises can cause false alarms and false negatives in the modeling stage of the actual wastewater treatment anomaly detection network. Therefore, based on expert experience in wastewater treatment processes, the number of slow features is selected as 10.

[0105] Step 3: The slow features obtained by slow feature analysis are fed into the 10 feature windows of the ILBSFNN network. Further feature extraction is performed in each feature window to obtain $SF_p$, which is then fed into the enhancement layer for further processing. The number of stacked layers of the ILBSFNN is set to 1, and the number of feature node windows, the number of feature nodes in each window, and the number of enhancement nodes are (10, 10, 50), respectively.
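The (10, 10, 50) structure of Step 3 can be illustrated with a broad-learning style forward pass followed by a ridge-regression solve for the output weights. This is a hedged sketch under several assumptions: the name `train_wide_network`, the toy data, and the three-class labels are hypothetical; the activations (tansig for feature nodes, sigmoid for enhancement nodes), the uniform random weights, and the regularization value 2^-30 are taken from the claims; and the sparse autoencoder fine-tuning of the feature weights is omitted.

```python
import numpy as np

def tansig(x):
    return np.tanh(x)                        # tansig is the hyperbolic tangent

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_wide_network(sf, Y, n_windows=10, nodes_per_window=10,
                       n_enh=50, lam=2.0 ** -30, seed=0):
    """Single-hidden-layer wide network: feature windows plus enhancement nodes."""
    rng = np.random.default_rng(seed)
    m = sf.shape[1]
    windows = []
    for _ in range(n_windows):               # one random mapping per feature window
        Wf = rng.uniform(0.0, 1.0, size=(m, nodes_per_window))
        bf = rng.uniform(0.0, 1.0, size=nodes_per_window)
        windows.append(tansig(sf @ Wf + bf))
    SF = np.hstack(windows)                  # all feature nodes, shape (N, 100)
    We = rng.uniform(0.0, 1.0, size=(SF.shape[1], n_enh))
    be = rng.uniform(0.0, 1.0, size=n_enh)
    E = sigmoid(SF @ We + be)                # enhancement nodes, shape (N, 50)
    A = np.hstack([SF, E])                   # A = [SF | E]
    # Ridge regression: W = (lam * I + A^T A)^(-1) A^T Y
    W_out = np.linalg.solve(lam * np.eye(A.shape[1]) + A.T @ A, A.T @ Y)
    return A, W_out

# Toy stand-in for 10 slow features and one-hot labels (hypothetical data).
rng = np.random.default_rng(1)
sf = rng.standard_normal((200, 10))
Y = np.eye(3)[rng.integers(0, 3, size=200)]
A, W_out = train_wide_network(sf, Y)
scores = A @ W_out                           # network output before decoding
```

Only the output weights are solved for; the hidden-layer weights stay at their random values, which is what keeps training cheap compared with gradient-based tuning.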

[0106] Step 4: If the monitoring accuracy of the trained ILBSFNN network is lower than the standard required by the wastewater treatment plant, then, according to steps 20-24 of the network training phase, the ILBSFNN network is expanded laterally by using incremental learning to dynamically add enhancement nodes until the monitoring accuracy meets the actual needs of the plant. Specifically, this experiment starts from 50 enhancement nodes and expands laterally 5 times with a step size of 10. After each expansion, the number of feature node windows, the number of feature nodes in each window, and the number of enhancement nodes are (10,10,60), (10,10,70), (10,10,80), (10,10,90), and (10,10,100), respectively.
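The lateral expansion in Step 4 avoids retraining by updating the pseudo-inverse block-wise when new enhancement-node columns are appended. The sketch below uses the standard partitioned pseudo-inverse update from the broad learning literature, which is assumed here to match steps 20-24; the helper names, the toy data, the reduced feature-layer size (20 feature nodes instead of 100), and the use of an exact `numpy.linalg.pinv` for the initial network are all illustrative choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def add_enhancement_nodes(A, A_pinv, W, H, Y):
    """Append new enhancement-node outputs H to A without recomputing pinv(A)."""
    D = A_pinv @ H
    C = H - A @ D                            # part of H outside the span of A
    if np.allclose(C, 0.0):
        Bt = np.linalg.solve(np.eye(D.shape[1]) + D.T @ D, D.T @ A_pinv)
    else:
        Bt = np.linalg.pinv(C)
    A_new = np.hstack([A, H])
    A_new_pinv = np.vstack([A_pinv - D @ Bt, Bt])
    W_new = np.vstack([W - D @ (Bt @ Y), Bt @ Y])   # only new-node terms are solved
    return A_new, A_new_pinv, W_new

# Toy stand-in for the feature-layer output and one-hot labels (hypothetical).
rng = np.random.default_rng(0)
N, n_feat, n_classes = 600, 20, 3
SF = np.tanh(rng.standard_normal((N, n_feat)))
Y = np.eye(n_classes)[rng.integers(0, n_classes, size=N)]

def new_nodes(k):
    """Random enhancement nodes with weights and biases drawn from (0, 1)."""
    We = rng.uniform(0.0, 1.0, size=(n_feat, k))
    be = rng.uniform(0.0, 1.0, size=k)
    return sigmoid(SF @ We + be)

A = np.hstack([SF, new_nodes(50)])           # initial network: 50 enhancement nodes
A_pinv = np.linalg.pinv(A)
W = A_pinv @ Y
for _ in range(5):                           # 50 -> 60 -> 70 -> 80 -> 90 -> 100
    A, A_pinv, W = add_enhancement_nodes(A, A_pinv, W, new_nodes(10), Y)
```

Each call touches only the columns being added, so its cost scales with the 10 new nodes rather than with the whole network.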

[0107] Step 5: After repeating the experiment 50 times, determine the necessary network parameters for establishing the ILBSFNN model and apply them to the ILBSFNN model for online wastewater treatment process application.

[0108] B. Online application phase of the ILBSFNN network for wastewater treatment processes:

[0109] Step 6: The online data obtained during the wastewater treatment process is first preprocessed.

[0110] Step 7: In the online monitoring phase, the monitoring outputs for abnormal operating conditions of the wastewater treatment process are obtained by one-hot encoding, and the monitoring performance of the model is evaluated using the indicators FAR, MAR, and ACC selected in this study.

[0111] The training phase and the online application phase of the ILBSFNN model represent its practical application on a wastewater treatment platform. To evaluate the performance of the wastewater treatment process monitoring model, the false alarm rate (FAR), missed alarm rate (MAR), and accuracy (ACC) are used as evaluation metrics.

[0112]

[0113]

[0114]
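The three indices can be computed as in the sketch below. The definitions used are the common ones and are an assumption here, since the formulas themselves are not reproduced in this text: FAR is the share of normal samples flagged as faulty, MAR is the share of faulty samples reported as normal, and ACC is the share of samples labeled correctly. The function name and the example vectors are illustrative.

```python
import numpy as np

def monitoring_metrics(y_true, y_pred):
    """y_true, y_pred: 1 for a faulty sample, 0 for a normal sample."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    normal, faulty = ~y_true, y_true
    far = np.sum(y_pred & normal) / np.sum(normal)    # false alarms / normal samples
    mar = np.sum(~y_pred & faulty) / np.sum(faulty)   # missed alarms / faulty samples
    acc = np.mean(y_pred == y_true)                   # correct labels / all samples
    return far, mar, acc

y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1, 1, 1]
far, mar, acc = monitoring_metrics(y_true, y_pred)    # 1/4, 1/6, 8/10
```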

[0115] Table 1

[0116]

[0117] Table 2

[0118]

[0119] Table 3 presents the experimental results of the KPCA, SAE, BLS, and ILBSFNN methods for monitoring 18 abnormal operating conditions in wastewater treatment processes. As shown in the table, the average false alarm rate of all algorithms is below 0.02. In most cases, the ILBSFNN method achieves the highest test accuracy, outperforming the other methods, and its false alarm rate and missed alarm rate are equal to or better than those of BLS. Further details on the superiority of the proposed ILBSFNN model can be found in Figure 4, which summarizes the monitoring results obtained by the ILBSFNN method and the comparison methods in terms of FAR, MAR, and ACC. The experimental results show that the ILBSFNN method has the lowest MAR (0.0033) and the highest ACC (0.9495). Another advantage of the ILBSFNN model is its reduced network time overhead, which is particularly beneficial when the network must be adjusted to reach a higher monitoring accuracy. The ILBSFNN model can use incremental learning to add enhancement nodes, expanding the network and improving its monitoring accuracy for abnormal operating conditions. This incremental learning approach makes it simple to determine the number of nodes while avoiding retraining of the entire system, reducing the time overhead of network adjustment and updates. Additional experiments were conducted to test this idea: the original network consisted of 10×10 feature nodes and 50 enhancement nodes, and the number of enhancement nodes was increased by 10 per step from 50 to 100. Table 4 details the results of each update. The incremental version of ILBSFNN exhibits performance similar to that of the original ILBSFNN model.
The dynamic addition of enhancement nodes provides an opportunity to improve performance by adjusting the structure and accuracy of the original BSFNN model until the required fault detection accuracy is achieved. Notably, although the average false alarm rate (FAR) of ILBSFNN is slightly lower than that of BLS in Table 3, the incremental version of ILBSFNN achieves better FAR, MAR, and ACC than BLS. When the incremental version of the ILBSFNN network uses 100 feature nodes and 100 enhancement nodes, as shown in Table 4, the experimental results demonstrate that the incremental ILBSFNN is more competitive than the original ILBSFNN network. The experiments show that the incremental learning algorithm is effective and can adjust the structure and accuracy of the original model to reach the performance level required for wastewater treatment.

[0120] Table 3

[0121]

[0122]

[0123] Table 4

[0124] Table 4

[0125]
Feature nodes | Enhancement nodes | ACC    | Time expenditure | FAR    | MAR
100           | 50                | 0.9494 | 59.758           | 0.0033 | 0.1354
100           | 50→60             | 0.9506 | 0.1947           | 0.0029 | 0.1330
100           | 60→70             | 0.9595 | 0.2024           | 0.0015 | 0.1106
100           | 70→80             | 0.9631 | 0.2055           | 0.0012 | 0.1008
100           | 80→90             | 0.9678 | 0.2134           | 0.0008 | 0.0888
100           | 90→100            | 0.9715 | 0.2094           | 0.0002 | 0.0793
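The time figures in Table 4 can be compared with a few lines of arithmetic. The sketch below only restates the tabulated values; the table gives no unit for the time expenditure, so none is assumed, and the variable names are illustrative.

```python
# Values copied from Table 4 (the time unit is not stated in the source).
initial_training_time = 59.758
increment_times = [0.1947, 0.2024, 0.2055, 0.2134, 0.2094]

total_incremental = sum(increment_times)            # cost of all five expansions
ratio = initial_training_time / total_incremental   # initial training vs. all updates
acc_gain = 0.9715 - 0.9494                          # ACC at 100 nodes minus ACC at 50
print(round(total_incremental, 4), round(ratio, 1), round(acc_gain, 4))
```

Taken together, the five expansions cost far less than the initial training run while ACC rises from 0.9494 to 0.9715.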

Claims

1. A method for monitoring abnormal conditions of a wastewater treatment process based on a wide and slow feature neural network with incremental learning capability, characterized in that it includes the following steps:

Preprocessing stage:

Step 1: Obtain the dataset and divide it into a training dataset and a test dataset; then determine the input and output process variables of the monitoring model. The input variables, represented by X, include: influent suspended solids concentration, readily biodegradable organic matter concentration, particulate inert organic carbon concentration, active heterotrophic bacteria concentration, active autotrophic bacteria concentration, concentration of particulate matter generated by biodegradation, dissolved oxygen concentration, nitrate nitrogen concentration, ammonia nitrogen concentration, soluble biodegradable organic nitrogen concentration, alkalinity molar concentration, particulate biodegradable organic nitrogen concentration, and slowly biodegradable organic matter concentration. The output variables, represented by Y, include the monitoring and classification results for the sludge bulking fault type, the toxic shock fault type, and the inhibition fault type. The test set data $X_{new}$ is used to test the model's monitoring performance for abnormal operating conditions in wastewater treatment on 18 fault types related to three fault categories: sludge bulking fault, toxic shock fault, and inhibition fault. The input variable names for the 18 fault types are the same as those in the training set; the 18 fault types are those specified in the wastewater treatment industry.

Step 2: Represent the training set used for model training by the symbol X, i.e., $X \in \mathbb{R}^{N \times M}$. Normalization is performed to eliminate the adverse effects on experimental results caused by differing variable scales. $\mathbb{R}$ represents the set of real numbers, N is the number of samples, and M is the number of process variables.
For an M-dimensional input signal $X = [x_1(t), x_2(t), \ldots, x_M(t)]$, the input signal is normalized, where t represents the sample index;

Network training phase:

Step 3: The goal of the ILBSFNN model is to find a set of mapping functions $g(X)$, where m represents the number of slow features; the slow features of ILBSFNN are obtained by applying the mapping functions to the input signal X. Following this mapping, the slow feature information after slow feature processing is obtained as follows:

$$sf = g(X) = [g(x_1(t)), g(x_2(t)), \ldots, g(x_m(t))]^T = [sf_1, sf_2, \ldots, sf_m]^T$$

where x represents the 13 input variables of the training dataset, including the mass concentration of suspended solids in the influent, the mass concentration of readily biodegradable organic matter, the mass concentration of particulate inert organic carbon, the mass concentration of active heterotrophic bacteria, the mass concentration of active autotrophic bacteria, the mass concentration of particles generated by biodegradation, the mass concentration of dissolved oxygen, the mass concentration of nitrate nitrogen, the mass concentration of ammonia nitrogen, the mass concentration of soluble biodegradable organic nitrogen, the alkalinity molar concentration, the mass concentration of particulate biodegradable organic nitrogen, and the mass concentration of slowly biodegradable organic matter; sf is the slow feature set extracted by the ILBSFNN model through SFA; $sf_1$, $sf_2$, and $sf_m$ are the 1st, 2nd, and m-th slow features, sorted from slowest to fastest rate of change; $g(X)$ is the set of mapping functions; $[\cdot]^T$ denotes the transpose; $g(x_1(t))$, $g(x_2(t))$, and $g(x_m(t))$ respectively represent the 1st, 2nd, and m-th slow feature information obtained from $X = [x_1(t), x_2(t), \ldots, x_M(t)]$ after the mapping function is applied;

Step 4: The ILBSFNN model first needs to find the features that change the slowest; for each slow feature to be as stable as possible, every $\Delta(sf_i)$ must be minimized, which gives the following optimization problem:

$$\min \Delta(sf_i) = \langle \dot{sf}_i^2 \rangle_t$$

where $\langle \cdot \rangle_t$ denotes the average calculated over the slow feature samples at the N time points; $\dot{sf}_i$ is the first-order difference of the i-th slow feature; $\Delta(\cdot)$ is a natural measure of the rate of change of $sf_i$; and $\dot{sf}_i^2$ is the square of $\dot{sf}_i$;

Step 5: The constraints on the problem in Step 4 are:

$$\langle sf_i \rangle_t = 0, \qquad \langle sf_i^2 \rangle_t = 1, \qquad \langle sf_i\, sf_j \rangle_t = 0 \quad \forall\, i \neq j$$

The three constraints are, in order, the zero-mean constraint, the unit variance constraint, and the decorrelation constraint. The zero-mean and unit variance constraints prevent the extracted slow features from being constant; these two constraints also scale the extracted slow features to a uniform scale, ensuring that rates of change are compared fairly. The decorrelation constraint requires the slow features extracted by SFA to be pairwise uncorrelated, avoiding simple repetition between different slow features. $\Delta(sf_i)$ is the degree of slowness of $sf_i$; $\langle \cdot \rangle_t$ denotes the average calculated over the slow feature samples at the N time points;

Step 6: For discrete data, the first-order difference is used to represent the rate of change:

$$\dot{sf}_i(t) = sf_i(t) - sf_i(t-1)$$

Each slow feature $sf_i$ is represented as a linear combination of all input variables through the linear mapping $w_i^T X$, that is:

$$sf = WX$$

where $W = [w_1, w_2, \ldots, w_m]^T$ is the parameter matrix to be optimized by SFA; $w_i$ is a coefficient vector of the parameter matrix W; $g(X)$ is the set of mapping functions; $w_i^T$ is the transpose of $w_i$; and sf is the set of slow features extracted by the ILBSFNN model through SFA. It is easy to prove that, for the zero-mean constraint to be satisfied, the mean must be removed from the input variables.
Step 7: Substitute the formula from Step 6 into Steps 4 and 5 to obtain the optimization objective of SFA in the linear case:

$$\min_{w_i} \langle \dot{sf}_i^2 \rangle_t = w_i^T \langle \dot{X}\dot{X}^T \rangle_t\, w_i = w_i^T A\, w_i$$

where $\langle \dot{X}\dot{X}^T \rangle_t$ is represented by the symbol A, i.e., $A = \langle \dot{X}\dot{X}^T \rangle_t$, the covariance matrix of the first-order difference of the input X; $\langle XX^T \rangle_t$ is represented by the symbol B, i.e., $B = \langle XX^T \rangle_t$, the covariance matrix of the input X; $\dot{X}$ is the first-order difference of the input X; $w_i$ is a coefficient vector of the parameter matrix W and $w_i^T$ is its transpose; $\langle \cdot \rangle_t$ denotes the average calculated over the slow feature samples at the N time points; $\dot{sf}_i$ is the first-order difference of the i-th slow feature; $\Delta(\cdot)$ is a natural measure of the rate of change of sf; $\dot{sf}_i^2$ is the square of $\dot{sf}_i$. From the unit variance constraint, $\langle sf_i^2 \rangle_t = 1$, it follows that $\langle sf_i^2 \rangle_t = w_i^T B\, w_i = 1$;

Step 8: Use singular value decomposition to find the parameter matrix W; first, perform singular value decomposition on matrix B:

$$B = U \Lambda U^T$$

where U is an orthogonal matrix composed of singular vectors; $\Lambda$ is a diagonal matrix whose diagonal elements are the singular values of B; T denotes the transpose;

Step 9: Based on the above equation, the original input X is sphered to remove correlation, i.e.:

$$z = \Lambda^{-1/2} U^T X$$

where z is the sphered result of the input X, with $\langle zz^T \rangle_t = I_m$, and $I_m$ is an m-order identity matrix. The original optimization problem is therefore transformed into finding a matrix P such that $sf = Pz$ and $\langle sf\, sf^T \rangle_t = I_m$, which satisfies the unit variance and decorrelation constraints. Substituting $sf = Pz$ into $\langle sf\, sf^T \rangle_t = I_m$ gives:

$$PP^T = I_m$$

which shows that P is an orthogonal matrix. The solution that minimizes $\langle \dot{sf}_i^2 \rangle_t$ is obtained by performing singular value decomposition on the covariance matrix of the first-order difference of z:

$$\langle \dot{z}\dot{z}^T \rangle_t = P^T \Omega P$$

where $\dot{z}$ is the first-order difference of z; P is an orthogonal matrix composed of eigenvectors; $\Omega$ is a diagonal matrix whose diagonal elements are the change values of each slow feature; T denotes the transpose;

Step 10: Through the above steps, the final parameter coefficient matrix W is obtained by the following formula:

$$W = P \Lambda^{-1/2} U^T$$

where U, $\Lambda$, and P are as defined in Steps 8 and 9;

Step 11: Obtain the slow feature sf based on steps 1-10:

$$sf = WX = P \Lambda^{-1/2} U^T X = Pz$$

Step 12: Input the sf obtained from slow feature analysis into the feature layer window of the ILBSFNN network for further feature extraction, and form the i-th group of feature nodes $SF_i$ through the feature mapping function:

$$SF_i = \phi_i(sf\, W_{f_i} + \beta_{f_i}), \quad i = 1, 2, \ldots, p$$

where $\phi_i$ is the feature mapping function of the ILBSFNN network, used to extract variable information from the wastewater dataset and feed it into the enhancement window; tansig is chosen as the activation function of the feature layer; p is the number of feature nodes in the feature layer; $W_{f_i}$ is the weight matrix mapped to the feature nodes of the i-th window; $\beta_{f_i}$ is the bias vector mapped to the feature nodes of the i-th window. To prevent the weights $W_{f_i}$ from being affected by randomness, a sparse autoencoder is used for fine-tuning, in which the sparse matrix $W_s$ is obtained by solving an L1-regularized optimization problem, where $\lambda_1$ is the L1 regularization parameter, $\lambda_1 = 2^{-8}$; after sparsification, the feature nodes are $SF_i = sf\, W_s$;

Step 13: The feature layer output $SF_p$ is obtained by repeating the previous step p times:

$$SF_p = (SF_1, SF_2, \ldots, SF_p)$$

Step 14: The feature-layer output $SF_p$ from Step 13, obtained by the activation function from the 13 wastewater treatment input variables processed by the feature layer of the first layer of the ILBSFNN network, undergoes nonlinear processing and serves as the input to the enhancement layer, addressing the nonlinearity and dynamics of the wastewater treatment process data. $SF_p$ constructs the j-th group of enhancement nodes $E_j$ as follows:

$$E_j = \xi_j(SF_p\, W_{e_j} + \beta_{e_j}), \quad j = 1, 2, \ldots, q$$

where q is the total number of enhancement nodes in the enhancement layer; $\xi_j$ is the activation function of the enhancement layer, for which the nonlinear sigmoid function is chosen; as with the weights and biases of the feature layer, $W_{e_j}$ and $\beta_{e_j}$ are randomly generated connection weights and biases uniformly distributed within the interval (0,1);

Step 15: The output of the enhancement layer is expressed as $E_q$:

$$E_q = (E_1, E_2, \ldots, E_q)$$

Step 16: After obtaining all feature layers and enhancement layers based on steps 13-15, the first output Y of the ILBSFNN network is:

$$Y = [SF_1, SF_2, \ldots, SF_p \mid E_1, E_2, \ldots, E_q]\, W^1 = [SF_p \mid E_q]\, W^1 = A_1 W^1$$

The actual output Y of the ILBSFNN output layer represents the classification results for the three fault types: sludge bulking fault, toxic shock fault, and inhibition fault. $[SF_1, SF_2, \ldots, SF_p \mid E_1, E_2, \ldots, E_q]$ is represented by $A_1$, i.e., $A_1 = [SF_p \mid E_q]$; the connection weights of the output layer in the first layer are represented by $W^1$, where $W^1 = [SF_p \mid E_q]^+ Y$. In $Y = (y_1, y_2, \ldots, y_N)^T \in \mathbb{R}^{N \times C}$, Y denotes the labels, N the number of samples, C the output dimension, and $\mathbb{R}$ the set of real numbers. The connection weights from the enhancement layer to the output layer and from the feature layer to the output layer in the first layer, which together make up $W^1$, are both obtained using ridge regression; $\xi_1$ and $\xi_q$ are both sigmoid activation functions, i.e., $\xi(x) = 1/(1 + e^{-x})$; $\phi_1$ and $\phi_p$ are both tansig feature layer activation functions;

Step 17: Based on steps 15 and 16, the generalized mapping function $\phi$ is used to generate the p groups of feature nodes, i.e., $SF_p = \phi(sf\, W_f + \beta_f)$, where $W_f$ is the weight randomly generated by the feature layer in the interval [0,1], $\beta_f$ is the bias randomly generated by the feature layer, and $\phi$ uses the tansig activation function; the generalized mapping function $\xi$ is used to generate the q groups of enhancement nodes, i.e., $E_q = \xi(SF_p\, W_e + \beta_e)$, where $W_e$ is the weight randomly generated by the enhancement layer in the interval [0,1], $\beta_e$ is the bias randomly generated by the enhancement layer, and $\xi_1, \xi_2, \ldots, \xi_q$ are the sigmoid activation functions. The formula in step 16 for the output Y of the ILBSFNN network is then further expressed as:

$$Y = [\phi(sf\, W_f + \beta_f) \mid \xi(\phi(sf\, W_f + \beta_f)\, W_e + \beta_e)]\, W^1$$

where $W_f$ and $W_e$ are the series of random weights generated by the feature layer and the enhancement layer, respectively; $W_f$ collects the weight matrices mapped to the feature nodes of each window; $W_e$ collects the weights randomly generated in the enhancement layer; and $\xi(\phi(\cdot))$ is the composite of the generalized mapping functions $\phi$ and $\xi$;

Step 18: The connection weight $W^1$ of the ILBSFNN network is solved using the ridge regression technique, which does not require excessive time overhead:

$$W^1 = A_1^+ Y, \qquad A_1^+ = (\lambda I + A_1^T A_1)^{-1} A_1^T$$

where $A_1^+$ represents the pseudo-inverse of $A_1$, with $A_1 = [SF_p \mid E_q]$; $\lambda$ is the L2 regularization coefficient, $\lambda = 2^{-30}$; I is the identity matrix; $A_1^T$ is the transpose of $A_1$; and $W^1$ represents the connection weights of the output layer of the ILBSFNN, $W^1 = [SF_p \mid E_q]^+ Y$;

Step 19: Generate the output $y_1$ of the ILBSFNN:

$$y_1 = [SF_p \mid E_q]\, W^1$$

where $W_f$ and $W_e$ are the series of random weights generated by the feature layer and the enhancement layer, as in Step 17; the connection weights from the enhancement layer to the output layer and from the feature layer to the output layer of the ILBSFNN network together form $W^1$, which contains the important feature information learned from training on the 13 input variables of the wastewater treatment training set;

Step 20: The ILBSFNN network can dynamically add enhancement nodes through lateral expansion. It uses the features mapped from the input as the feature nodes of the network, then generates enhancement nodes with random weights, and connects the mapped features and enhancement nodes directly to the output. The corresponding output coefficients are obtained through the pseudo-inverse. On this basis, after new enhancement nodes are added, the ILBSFNN does not need to learn from scratch; it only needs to adjust the weights related to the new nodes on top of the feature layers of the original network. Assuming the ILBSFNN model adds b groups of enhancement nodes, the output $y_2$ of the ILBSFNN network is updated as follows:

$$y_2 = [A_1 \mid \xi(SF_p\, W_{be} + \beta_{be})]\, W_{bE}$$

where $W_{be}$ is the newly added connection weight randomly generated in the (0,1) interval for the new enhancement nodes, and $\beta_{be}$ is the corresponding randomly generated bias; $W_{bE}$ is the connection weight from the expanded node layer to the updated output layer; $W_f$ and $W_e$ are the weights randomly generated between (0,1) within the feature layer and the enhancement layer, respectively.
Step 21: $W_{bE}$ is dynamically updated using the following equation:

$$[A_1 \mid \xi(SF_p\, W_{be} + \beta_{be})]^+ = \begin{bmatrix} A_1^+ - D B^T \\ B^T \end{bmatrix}$$

where $A_1^+$ is the pseudo-inverse already computed before the additional enhancement nodes were added; retaining it is a distinctive feature of ILBSFNN that avoids the time overhead of retraining and updating the network from scratch. The expression $A_1^+\, \xi(SF_p\, W_{be} + \beta_{be})$ is represented by the symbol D, i.e., $D = A_1^+\, \xi(SF_p\, W_{be} + \beta_{be})$, where $W_{be}$ and $\beta_{be}$ are the randomly generated weights and biases of the newly added enhancement nodes;

Step 22: The specific expression of the symbol $B^T$ in step 21 is:

$$B^T = \begin{cases} C^+, & C \neq 0 \\ (I + D^T D)^{-1} D^T A_1^+, & C = 0 \end{cases}$$

where the expression $\xi(SF_p\, W_{be} + \beta_{be}) - A_1 D$ is represented by the symbol C, i.e., $C = \xi(SF_p\, W_{be} + \beta_{be}) - A_1 D$; $C^+$ denotes the pseudo-inverse of C; I is the identity matrix; and D is as defined in step 21;

Step 23: Based on steps 20-22, the connection weights of the updated ILBSFNN network are further expressed as follows:

$$W_{bE} = \begin{bmatrix} W^1 - D B^T Y \\ B^T Y \end{bmatrix}$$

where $W_{bE}$ is the connection weight from the expanded node layer to the updated output layer; $W^1$ contains the connection weights between the output layer and the feature layer of the ILBSFNN network and those of the original enhancement nodes; $W_{bE}$ includes the important feature information learned during training, which reflects the error between the actual output and the expected output Y of each category for the three fault types: sludge bulking fault, toxic shock fault, and inhibition fault;

Step 24: Execute steps 20-23 until the training accuracy of the ILBSFNN network reaches the requirements of actual wastewater treatment, then save the model parameters; the dynamic update process of the ILBSFNN network is complete.

Online application stage: The online test dataset $X_{new}$ is acquired in real time. Based on the network parameters obtained during the training phase of the ILBSFNN network, the output of the ILBSFNN network is calculated to test the accuracy of the model in monitoring abnormal operating conditions of wastewater treatment for 18 fault types related to three fault categories: sludge bulking fault, toxic shock fault, and inhibition fault. The input variable names for the 18 fault types are the same as those in the training set.
Step 25: Use the ILBSFNN framework obtained during the network training phase to design and build a monitoring model for abnormal operating conditions in the wastewater treatment process, as detailed in the following steps;

Step 26: Obtain the new online sampling data $X_{new}$ and preprocess the test data; the test set uses the same input variables as the training set as the input to the model;

Step 27: Use the ILBSFNN model obtained in steps 3-20 of the network training phase to perform online monitoring and obtain the monitoring results for $X_{new}$; if the monitoring accuracy obtained at this time does not meet the requirements of the sewage treatment plant, the ILBSFNN monitoring model is dynamically expanded by repeatedly adding incremental enhancement nodes according to steps 21-24 until the monitoring accuracy is greater than or equal to the plant's standard for monitoring accuracy;

Step 28: Then return to step 23 to continue monitoring the new batch of test data;

Step 29: Continue until the fault type monitoring results for all batches are obtained; the online application monitoring phase uses one-hot encoding to output the classification results;

Step 30: The online application phase of the ILBSFNN model is now complete.
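The one-hot output of Step 29 can be illustrated by decoding each row of the network output to the class with the largest score. This is a minimal sketch: the function name `to_one_hot`, the three-column score matrix, and its values are hypothetical, and the claim does not state how ties or the number of output classes are handled.

```python
import numpy as np

def to_one_hot(scores):
    """Map each row of real-valued network outputs to a one-hot class vector."""
    scores = np.asarray(scores, dtype=float)
    winners = scores.argmax(axis=1)                  # index of the largest score per row
    one_hot = np.zeros(scores.shape, dtype=int)
    one_hot[np.arange(len(winners)), winners] = 1
    return winners, one_hot

# Hypothetical outputs for three samples and three classes.
scores = np.array([[0.9, 0.2, -0.1],
                   [0.1, 0.3, 0.7],
                   [0.2, 0.6, 0.1]])
winners, one_hot = to_one_hot(scores)
```

Comparing the decoded classes with the true labels then gives the accuracy that Step 27 checks against the plant's requirement.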
