An intelligent data collection and verification method and system for wastewater treatment site surveys

By dividing the wastewater treatment plant area into core process units, extracting multi-dimensional clustering features, calibrating differentiated parameters, constructing a dynamic acquisition strategy, and verifying the data with trained models, the method addresses the limitations of data acquisition in wastewater treatment site surveys and achieves efficient and accurate data acquisition and verification.

CN122196386A · Pending · Publication Date: 2026-06-12 · POWERCHINA JIANGXI ELECTRIC POWER ENGINEERING CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: POWERCHINA JIANGXI ELECTRIC POWER ENGINEERING CO LTD
Filing Date: 2026-01-28
Publication Date: 2026-06-12

IPC: G06F18/20; G06F18/15; G06F18/213; G06F18/25; G06F18/2321; G06F18/22; G06N3/045; G06N3/0455; G06N3/0464; G06N3/0442; G06N3/084; G06N3/088; G06Q10/04; G06Q50/26; G06N3/048

AI Tagging

Application Domain

Forecasting; Biological models

AI Technical Summary

⚠Technical Problem

Existing data collection methods for wastewater treatment site surveys cannot identify parameter fluctuations within process units and are prone to missing highly sensitive areas. Manual sampling involves a large workload, and difficult-to-measure indicators rely on laboratory testing, resulting in poor timeliness and a lack of predictive verification methods.

⚗Method used

The plant area is divided into core process units, multi-dimensional clustering features are extracted, and dynamic acquisition strategies are generated using differentiated parameter calibration and preset algorithms. A hard rule base and process adaptation model are constructed, and prediction and verification are performed in combination with real-time measurable parameters. A density clustering algorithm is used to optimize sampling points, and a fusion model is constructed for parameter prediction and verification.

🎯Benefits of technology

It enables intelligent and precise data collection and verification for wastewater treatment plants, improving collection efficiency and accuracy, overcoming the limitations of fixed sampling points, and is suitable for site surveys, assessments, and process optimization of municipal and industrial wastewater treatment plants.

✦ Generated by Eureka AI based on patent content.

Abstract

The application discloses an intelligent data collection and verification method and system for wastewater treatment site surveys. The method includes: dividing the plant area into several core process units and extracting the multi-dimensional clustering features of the core process units; performing differentiated parameter calibration for different types of core process units, obtaining clustering results using a preset algorithm, generating a dynamic collection strategy, and preprocessing the collected data; constructing a hard rule base according to wastewater standards, constructing a process adaptation model, and training the process adaptation model using historical wastewater operating data and the hard rule base to learn the parameter characteristic patterns; and predicting the operating parameters of the wastewater based on its real-time measurable parameters, comparing the predicted values with the actual values of the operating parameters, and performing verification according to the comparison results. The application overcomes the limitation of fixed sampling points, improves data collection efficiency and verification accuracy, and is suitable for site survey and assessment, process optimization, and upgrading of municipal wastewater treatment plants and industrial wastewater treatment stations.

Description

Technical Field

[0001] This invention relates to the field of wastewater treatment technology, specifically to an intelligent data collection and verification method and system for wastewater treatment site surveys.

Background Technology

[0002] Site surveys and data collection for municipal wastewater treatment plants are a core prerequisite for process assessment, upgrading and renovation, and project application. The accuracy and representativeness of the data directly determine the feasibility of subsequent solutions.

[0003] Currently, when conducting site surveys and data collection for wastewater treatment, parameters within the process unit are typically collected at fixed sampling points and at a fixed frequency. After the parameters are collected, they are compared with urban wastewater treatment standards to verify the current wastewater treatment technology and determine whether the treatment technology is compliant.

[0004] However, collection at fixed sampling points and a fixed frequency cannot identify differences in parameter fluctuation within a process unit and is prone to missing key data in highly sensitive areas. Manual sampling is labor-intensive and time-consuming, making it difficult to meet the needs of real-time field surveys. During verification, relying solely on limit value comparison cannot identify hidden abnormal data. For difficult-to-measure indicators such as BOD5 and sludge heavy metals, laboratory testing is the only option, which has poor timeliness and offers no predictive verification method.

Summary of the Invention

[0005] Based on this, the purpose of this invention is to provide an intelligent data acquisition and verification method and system for wastewater treatment site surveys, aiming to solve the problems of current wastewater site survey data acquisition methods, which cannot identify parameter fluctuation differences within process units, easily miss highly sensitive areas, and rely solely on laboratory testing for difficult-to-measure indicators.

[0006] To achieve the above objectives, this invention proposes an intelligent data collection and verification method for wastewater treatment site surveys, which includes the following steps. The plant area is divided into several core process units, and the multi-dimensional clustering features of the core process units are extracted. Differentiated parameter calibration is performed for different types of core process units; a preset algorithm is used to obtain clustering results based on the differentiated parameters, a dynamic acquisition strategy is generated, and the acquired data is preprocessed. A hard rule base is constructed based on wastewater standards, a process adaptation model is built, and the process adaptation model is trained using historical wastewater operating data and the hard rule base to learn the parameter characteristic patterns. Based on real-time measurable parameters of the wastewater, the operating parameters of the wastewater are predicted; the predicted values are compared with the actual values of the operating parameters, and verification is performed based on the comparison results.

[0007] According to one aspect of the above technical solution, the step of dividing the plant area into several core process units and extracting the multi-dimensional clustering features of the core process units specifically includes: Based on the wastewater treatment process, the plant area is divided into several core process units, which include at least a screen tank, a grit chamber, a biological treatment tank, a secondary sedimentation tank, an advanced treatment unit, a disinfection tank, and a sludge treatment unit. Each core process unit is spatially subdivided into several array grid units so that the spatial granularity of the cluster analysis matches the actual gradient of the process reaction. The multi-dimensional clustering features of each array grid unit are extracted and standardized. The multi-dimensional clustering features include basic parameter features, process reaction features, hydraulic characteristic features, and operating condition stability features.

[0008] According to one aspect of the above technical solution, in the steps of calibrating differentiated parameters based on different types of core process units, obtaining clustering results based on differentiated parameters using a preset algorithm, generating a dynamic acquisition strategy, and preprocessing the acquired data: Based on the parameter distribution characteristics of the core process unit, differentiated DBSCAN parameters are set for different types of core process units. The DBSCAN algorithm is used to output the clustering results of the current core process unit using multi-dimensional clustering features. The clustering results include core clusters, noise points, and edge points.

[0009] According to one aspect of the above technical solution, after obtaining the clustering results, the cluster density of the clustering results and the variance of the parameter fluctuation of the array grid unit in the core process unit are combined to determine the regional sensitivity level of the current core process unit. Based on the sensitivity level, different sampling frequencies and sampling equipment types are set to perform dynamic data acquisition. After acquiring the collected multi-source data, the high-frequency fluctuation parameters are denoised, the K-nearest neighbor algorithm is used to fill the short-term missing data, and the LSTM time series prediction model is used to fill the long-term missing data.

[0010] According to one aspect of the above technical solution, in the steps of constructing a hard rule base based on wastewater standards, constructing a process adaptation model, and training the process adaptation model using historical wastewater operating data and the hard rule base to learn the parameter characteristic patterns: Based on wastewater treatment standards, a hard rule base is constructed to define numerical ranges, logical relationships, and temporal consistency. Based on the wastewater treatment process mechanism, feature parameters strongly correlated with the core process unit are selected to form a process feature set. A process adaptation model consisting of an input layer, an encoding layer, a bottleneck layer, a decoding layer, and an output layer is constructed. The process adaptation model is trained iteratively on the historical process feature set to learn the parameter feature patterns of the wastewater treatment process. The real-time time-series feature data that has passed the hard rule base is input into the trained process adaptation model, and the reconstructed data is output. The reconstruction error between the reconstructed data and the real-time time-series feature data is calculated; this reconstruction error is the core indicator for abnormal data identification. Based on the reconstruction error, a dynamic update threshold is calculated. If the real-time reconstruction error is greater than the dynamic update threshold, it is determined that there is a potential anomaly in the operating data of the wastewater treatment process. In combination with the wastewater treatment process mechanism, it is then verified whether the real-time time-series feature data flagged as a potential anomaly violates the process constraint logic.

[0011] According to one aspect of the above technical solution, the step of predicting the operating parameters of wastewater based on real-time measurable parameters of wastewater, obtaining the predicted values, comparing them with the actual values of the wastewater operating parameters, and verifying the results based on the comparison includes: The pre-built fusion model is trained using historical data of measurable parameters and parameters to be tested. Real-time measurable parameters are then input into the trained fusion model to obtain the theoretical predicted values of the parameters to be tested. The deviation between the theoretical predicted values and the actual values of the parameters to be tested is calculated and compared with a preset deviation threshold. If the deviation value is within the deviation threshold range, the prediction is valid. If the deviation value exceeds the deviation threshold range, a working condition correction coefficient is introduced, and the deviation threshold is adjusted based on the working condition correction coefficient.

[0012] This invention also proposes an intelligent data acquisition and verification system for wastewater treatment site surveys, which is used to implement the aforementioned method. The system includes: an extraction module, used to divide the plant area into several core process units and extract the multi-dimensional clustering features of the core process units; a clustering module, used to perform differentiated parameter calibration for different types of core process units, obtain clustering results from the differentiated parameters using a preset algorithm, generate a dynamic acquisition strategy, and preprocess the acquired data; a learning module, used to construct a hard rule base based on wastewater standards, build a process adaptation model, and train the process adaptation model using historical wastewater operating data and the hard rule base to learn the parameter characteristic patterns; and a verification module, used to predict the operating parameters of the wastewater based on its real-time measurable parameters, compare the predicted values with the actual values of the operating parameters, and perform verification based on the comparison results.

[0013] The present invention also proposes a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the intelligent data collection and verification method for wastewater treatment site survey as described above.

[0014] The present invention also proposes an electronic device, including a memory, a processor, and a computer program stored in the memory and running on the processor. When the processor executes the computer program, it implements the intelligent data collection and verification method for wastewater treatment site survey as described above.

[0015] In summary, this invention achieves multi-parameter adaptive data acquisition through dynamic optimization of sampling points using a density clustering algorithm, improving data acquisition efficiency and accuracy. It constructs a three-layer intelligent verification mechanism—rule base verification, unsupervised autoencoder anomaly identification, and prediction verification of parameters to be verified in the fusion model—to verify data in the wastewater treatment process. This invention overcomes the limitations of fixed sampling points, improving data acquisition efficiency and verification accuracy, and provides an intelligent, precise, and standardized solution for wastewater treatment plant site surveys and data collection. It is applicable to site surveys, assessments, process optimization, and upgrading of municipal wastewater treatment plants and industrial wastewater treatment stations.

[0016] Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Attached Figure Description

[0017] Figure 1 is a flowchart of the intelligent data collection and verification method for wastewater treatment site surveys in Embodiment 1 of the present invention; Figure 2 is a schematic diagram of the intelligent data acquisition and verification system for wastewater treatment site surveys in Embodiment 2 of the present invention; Figure 3 is a structural block diagram of the electronic device in Embodiment 4 of the present invention.

Detailed Implementation

[0018] To make the objectives, features, and advantages of the present invention more apparent and understandable, specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings. Several embodiments of the present invention are shown in the drawings. However, the present invention can be implemented in many different forms and is not limited to the embodiments described herein. Rather, these embodiments are provided so that the disclosure of the present invention will be more thorough and complete.

[0019] It should be noted that when an element is referred to as being "fixed to" another element, it can be directly on the other element or there may be an intervening element. When an element is considered to be "connected" to another element, it can be directly connected to the other element or there may be an intervening element. The terms "vertical," "horizontal," "left," "right," "upper," "lower," and similar expressions used herein are for illustrative purposes only and are not intended to indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as limiting the invention.

[0020] In this invention, unless otherwise explicitly specified and limited, the terms "installation," "connection," "linking," and "fixing," etc., should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral connection; they can refer to a mechanical connection or an electrical connection; they can refer to a direct connection or an indirect connection through an intermediate medium; and they can refer to the internal communication between two components. Those skilled in the art can understand the specific meaning of the above terms in this invention according to the specific circumstances. The term "and / or" as used herein includes any and all combinations of one or more of the related listed items.

Embodiment 1

[0021] Figure 1 shows a flowchart of an intelligent data collection and verification method for wastewater treatment site surveys according to Embodiment 1 of the present invention. The method includes the following steps S01-S04:
S01. Divide the plant area into several core process units and extract the multi-dimensional clustering features of the core process units.
S02. Perform differentiated parameter calibration for different types of core process units, use a preset algorithm to obtain clustering results according to the differentiated parameters, generate a dynamic acquisition strategy, and preprocess the acquired data.
S03. Construct a hard rule base based on wastewater standards, construct a process adaptation model, and train the process adaptation model using historical wastewater operating data and the hard rule base to learn parameter characteristic patterns.
S04. Based on real-time measurable parameters of the wastewater, predict the operating parameters of the wastewater, compare the predicted values with the actual values of the operating parameters, and perform verification based on the comparison results.

[0022] Based on the actual process of the wastewater treatment plant, the entire plant area is divided into core process units such as the screen tank, grit chamber, biological treatment tank, secondary sedimentation tank, advanced treatment unit, disinfection tank, and sludge treatment unit. Each process unit is further subdivided into spatial grids. Taking the biological treatment tank as an example, a certain number of grids can be divided along the length direction, with every grid having the same length and width, so that the spatial granularity of the cluster analysis matches the actual gradient of the process reaction.

[0023] In order to abandon the traditional feature selection method of single parameter fluctuation, multi-dimensional clustering features of any array grid cell are extracted and the multi-dimensional clustering features are standardized. The multi-dimensional clustering features include basic parameter features, process reaction features, hydraulic characteristic features, and operating condition stability features.

[0024] Specifically, for basic parameter features, the indicators can be the mean and variance of core water quality indicators. These can be calculated from the daily average and daily fluctuation variance of COD, ammonia nitrogen, DO, and MLSS for the grid over the past 1-3 years, with the values then mapped to the [0,1] interval using Min-Max standardization. For process reaction features, the indicator can be the pollutant removal rate gradient. This is obtained by calculating the difference in COD removal rate and ammonia nitrogen removal rate between the grid and the preceding grid, with range standardization then applied to highlight the spatial differences in reaction intensity. For hydraulic characteristic features, the indicators can be the hydraulic retention time (HRT) and flow velocity. The HRT within the grid can be calculated from the tank dimensions and flow rate, and the water flow velocity within the grid can be measured using a flow meter. Z-score standardization can be used to eliminate the magnitude differences in HRT between different units. For operating condition stability features, the indicator can be the parameter mutation frequency. This is obtained by counting the number of times the parameters in the grid exceeded the normal range over the past year and normalizing the count to a stability coefficient in the [0,1] interval.
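The two standardizations named above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names and the sample COD and HRT values are hypothetical and do not come from the patent text.

```python
from statistics import mean, pstdev

def min_max(values):
    """Map values linearly onto [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Centre on the mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical per-grid values: daily-average COD (mg/L) and HRT (hours).
cod_mean = [220.0, 180.0, 140.0, 100.0]
hrt_hours = [2.0, 4.0, 6.0, 8.0]

cod_scaled = min_max(cod_mean)    # basic parameter feature in [0, 1]
hrt_scaled = z_score(hrt_hours)   # hydraulic feature with zero mean
print(cod_scaled)
print(hrt_scaled)
```

Min-Max scaling keeps features bounded, which suits a distance-based clustering step, while the Z-score removes the difference in magnitude between units whose HRT ranges are very different.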

[0025] In addition, for abnormal historical data, outliers caused by equipment failure or extreme operating conditions can be removed to avoid such data interfering with the clustering algorithm's identification of parameter fluctuation patterns under normal operating conditions.

[0026] Traditional DBSCAN clustering uses globally uniform parameters (neighborhood radius ε, minimum number of points MinPts), which cannot adapt to the parameter distribution differences among different process units in a wastewater treatment plant. This embodiment uses differentiated DBSCAN parameter configuration and cluster density fluctuations to perform dual-dimensional judgment on core process units, achieving accurate identification of high / low sensitivity areas.

[0027] Based on the parameter distribution characteristics of the core process units, differentiated DBSCAN parameters (ε and MinPts) are set for different types of core process units to solve the over-clustering or under-clustering problems caused by traditional global parameters. For example, for a biological treatment tank, the neighborhood radius ε can be set to 0.3 and the minimum number of points MinPts can be set to 8. The calibration basis is that the parameters in this unit fluctuate frequently and the point density within clusters is high, so a smaller ε avoids merging clusters with different fluctuation patterns. Differentiated parameter calibration can be performed for the other core process units on the same basis.

[0028] Furthermore, multi-dimensional clustering features are input into the DBSCAN algorithm to output the clustering results of the current core process unit. The clustering results include core clusters, noise points, and edge points. Core clusters are regions with high grid point density and high feature similarity, representing process regions with consistent parameter fluctuation patterns. Noise points are isolated grid points that cannot be classified into any core cluster, representing special regions with abrupt parameter changes. Edge points are grid points that are adjacent to core clusters but have slightly lower density, representing transitional regions with parameter fluctuations.
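The clustering step can be illustrated with a small pure-Python DBSCAN. This is a sketch under stated assumptions, not the patent's implementation: the `dbscan` function, the `UNIT_PARAMS` table, and the two-dimensional sample features are hypothetical. Only the (ε = 0.3, MinPts = 8) pair for the biological treatment tank is taken from the text.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns (labels, kinds); label -1 marks noise."""
    n = len(points)
    # Each neighbourhood includes the point itself (the usual MinPts convention).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbours]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None or not is_core[i]:
            continue
        # Grow a new cluster outward from an unlabelled core point.
        labels[i] = cluster
        frontier = list(neighbours[i])
        while frontier:
            j = frontier.pop()
            if labels[j] is None:
                labels[j] = cluster
                if is_core[j]:
                    frontier.extend(neighbours[j])
        cluster += 1
    labels = [-1 if lab is None else lab for lab in labels]
    kinds = [
        "core" if core else ("noise" if lab == -1 else "edge")
        for core, lab in zip(is_core, labels)
    ]
    return labels, kinds

# (eps, MinPts) per unit type; only this pair appears in the source text.
UNIT_PARAMS = {"biological_treatment_tank": (0.3, 8)}

# Hypothetical standardized features: a dense 3x3 block, one edge cell, one outlier.
grid_features = [(x / 10, y / 10) for x in (4, 5, 6) for y in (4, 5, 6)]
grid_features += [(0.85, 0.5), (0.05, 0.95)]

eps, min_pts = UNIT_PARAMS["biological_treatment_tank"]
labels, kinds = dbscan(grid_features, eps, min_pts)
print(kinds)
```

In this toy data the nine block cells are core points, the cell at (0.85, 0.5) is reachable from the block but has too few neighbours of its own and so becomes an edge point, and the isolated cell is noise.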

[0029] After obtaining the clustering results, in order to break through the traditional single logic of judging sensitive regions solely based on fluctuation values, this invention constructs a two-dimensional judgment standard of cluster density + parameter fluctuation variance to accurately divide two types of regions: The regional sensitivity level of the current core process unit is determined by combining the cluster density of the clustering results with the variance of the parameter fluctuation of the array grid cells in the core process unit.

[0030] Specifically, an area is determined to be a highly sensitive area in either of two cases. In the first case, the parameter fluctuation variance of the grids within a core cluster is greater than a preset threshold (e.g., COD variance > 50 (mg/L)²) and the cluster density (number of points within the cluster / total number of grid cells in the unit) is greater than 0.8; examples include the boundary between the aerobic and anoxic zones of the biological treatment tank and the mixing zone after the influent screen. In the second case, the grid area corresponds to a noise point; examples include the connection point of an industrial wastewater branch pipe and areas where aerator failures are frequent. Conversely, if the parameter fluctuation variance of the grids within a core cluster is less than the preset threshold and the cluster density is less than 0.3, the area is determined to be a low-sensitivity area; examples include the effluent area of the secondary sedimentation tank and the stable reaction zone of the disinfection tank, where the parameters are stable and the fluctuation amplitude is small.
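The two-dimensional judgment can be written as a small decision function. The sketch below uses the example thresholds from the text (variance 50, densities 0.8 and 0.3). The function names are hypothetical, and the "unclassified" result for cells that meet neither rule is an assumption, since the text does not say how such cells are treated.

```python
def cluster_density(points_in_cluster, total_grid_cells):
    """Cluster density as defined in the text: cluster size over unit grid count."""
    return points_in_cluster / total_grid_cells

def sensitivity_level(kind, variance, density,
                      var_threshold=50.0, high_density=0.8, low_density=0.3):
    """Classify a grid area as 'high', 'low', or 'unclassified' (assumed fallback)."""
    if kind == "noise":
        return "high"          # noise-point areas are always highly sensitive
    if variance > var_threshold and density > high_density:
        return "high"
    if variance < var_threshold and density < low_density:
        return "low"
    return "unclassified"      # not covered by the rules in the text

# Hypothetical cells: an aerobic/anoxic boundary, a disinfection zone, a branch inlet.
print(sensitivity_level("core", 80.0, cluster_density(9, 10)))
print(sensitivity_level("core", 10.0, cluster_density(2, 10)))
print(sensitivity_level("noise", 5.0, 0.0))
```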

[0031] Different sampling frequencies and sampling device types are set according to the sensitivity level for dynamic data acquisition. In highly sensitive areas, mobile sensor deployment points are added within each highly sensitive grid, and the monitoring range of existing fixed monitoring points is narrowed to this area to avoid wasting resources on low-sensitivity areas. For highly sensitive areas corresponding to noise points, temporary densified sampling points are set up to track parameter mutation patterns. In low-sensitivity areas, only the existing fixed monitoring points are retained, which reduces redundant sampling points. For large, contiguous low-sensitivity areas, grid-based mobile sampling is used instead of fixed-point sampling. Furthermore, the data acquisition frequency in highly sensitive areas is higher than that in low-sensitivity areas.

[0032] After acquiring the collected multi-source data, the high-frequency fluctuation parameters are denoised, the K-nearest neighbor algorithm is used to fill the short-term missing data, and the LSTM time series prediction model is used to fill the long-term missing data.
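The split between short and long gaps can be sketched as below. This is one possible reading, not the patent's procedure: it treats "K-nearest neighbor" as the mean of the k observations nearest in time, and the cut-off `max_short=3`, the value `k=2`, and the sample series are all hypothetical. Long gaps are only reported here; in the text they are filled by an LSTM time series prediction model, which is not implemented in this sketch.

```python
def find_gaps(series):
    """Return (start, end) index pairs, end exclusive, for each run of None."""
    gaps, start = [], None
    for i, value in enumerate(series):
        if value is None and start is None:
            start = i
        elif value is not None and start is not None:
            gaps.append((start, i))
            start = None
    if start is not None:
        gaps.append((start, len(series)))
    return gaps

def knn_fill_short_gaps(series, k=2, max_short=3):
    """Fill short gaps with the mean of the k time-nearest observations."""
    observed = [(i, v) for i, v in enumerate(series) if v is not None]
    filled = list(series)
    long_gaps = []
    for start, end in find_gaps(series):
        if end - start > max_short or len(observed) < k:
            long_gaps.append((start, end))   # left for a sequence model
            continue
        for i in range(start, end):
            nearest = sorted(observed, key=lambda p: abs(p[0] - i))[:k]
            filled[i] = sum(v for _, v in nearest) / k
    return filled, long_gaps

# Hypothetical DO readings (mg/L) with one short gap and one long gap.
do_series = [2.0, None, 2.4, 2.5, None, None, None, None, None, 3.0]
filled, long_gaps = knn_fill_short_gaps(do_series)
print(filled)
print(long_gaps)
```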

[0033] Based on wastewater treatment standards, a hard rule base is constructed to define numerical ranges, logical relationships, and temporal consistency. Instead of the traditional coarse approach of inputting all parameters, this embodiment selects feature parameters that are strongly correlated with the core reactions of the process unit, based on the wastewater treatment process mechanism, to form a process feature set that improves the model's learning accuracy for normal operating conditions. The process feature set can include feature parameters such as DO, MLSS, aeration intensity, COD, ammonia nitrogen, sludge settling ratio (SV30), effluent SS, and hydraulic retention time (HRT).
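A hard rule base with the three rule types named above (numerical range, logical relationship, temporal consistency) might look like the following sketch. Every limit and rule here is a hypothetical placeholder; real values would come from the applicable wastewater standard.

```python
# Hypothetical numeric ranges (mg/L); not taken from any standard.
RANGES = {"do": (0.0, 10.0), "cod_out": (0.0, 500.0)}
# Hypothetical maximum change allowed between consecutive samples.
MAX_STEP = {"do": 2.0}

def check_record(record, previous=None):
    """Return the list of rule violations for one sample (empty list = passes)."""
    violations = []
    # Numerical range rules.
    for name, (lo, hi) in RANGES.items():
        if not lo <= record[name] <= hi:
            violations.append(f"range:{name}")
    # Logical relationship (assumed): effluent COD should not exceed influent COD.
    if record["cod_out"] > record["cod_in"]:
        violations.append("logic:cod_out>cod_in")
    # Temporal consistency: reject jumps larger than the allowed step.
    if previous is not None:
        for name, step in MAX_STEP.items():
            if abs(record[name] - previous[name]) > step:
                violations.append(f"temporal:{name}")
    return violations

prev = {"do": 2.0, "cod_in": 300.0, "cod_out": 40.0}
good = {"do": 2.5, "cod_in": 310.0, "cod_out": 42.0}
bad = {"do": 12.0, "cod_in": 30.0, "cod_out": 45.0}
print(check_record(good, prev))
print(check_record(bad, prev))
```

Only records that return an empty list would be passed on to the process adaptation model.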

[0034] Construct a process adaptation model consisting of an input layer, an encoding layer, a bottleneck layer, a decoding layer, and an output layer, where:
Input layer: the number of neurons equals the number of feature parameters of each process unit × the length of the time window (e.g., if 5 parameters are selected in the aerobic zone of the biological treatment tank and the window length is 6, then the number of input layer neurons = 5 × 6 = 30).
Encoding layer: a 3-layer fully connected network is used (the number of neurons is 30→20→10 in turn), and features are extracted through the ReLU activation function (adapting to the nonlinear fluctuation of the parameters).
Bottleneck layer: 10 neurons (compression ratio 3:1, which preserves core features and avoids overfitting), with a Dropout layer (dropout = 0.2) introduced to enhance the model's generalization ability.
Decoding layer: symmetrical to the encoding layer (10→20→30), with Sigmoid as the activation function (adapted to the normalized parameter range of [0,1]).
Output layer: the number of neurons is the same as that of the input layer, and the output is the reconstructed temporal feature data.
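The input-size arithmetic (5 parameters × window of 6 = 30) can be made concrete by flattening sliding windows of the time series. The sketch below is illustrative only: `flatten_windows` and the sample rows are hypothetical, and the weight count assumes one reading of the architecture, namely plain fully connected layers of widths 30→20→10→20→30 with biases.

```python
def flatten_windows(rows, window):
    """rows: one feature list per time step. Returns one flat vector per window."""
    return [
        [value for row in rows[t:t + window] for value in row]
        for t in range(len(rows) - window + 1)
    ]

# Layer widths from the text: input 30, encoder to 20 then 10, decoder back to 30.
LAYER_WIDTHS = [30, 20, 10, 20, 30]

# Trainable parameters if every transition is a dense layer with a bias vector.
n_weights = sum(a * b + b for a, b in zip(LAYER_WIDTHS, LAYER_WIDTHS[1:]))

# Hypothetical data: 8 time steps of 5 normalized feature values.
rows = [[0.1 * t, 0.2, 0.3, 0.4, 0.5] for t in range(8)]
vectors = flatten_windows(rows, window=6)

print(len(vectors), len(vectors[0]))  # number of windows, input width
print(n_weights)
```

Each flattened vector has exactly as many entries as the input layer has neurons, so it can be fed to the autoencoder directly.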

[0035] Collect historical wastewater data for the target core process unit within 1-3 years. Use the 3σ criterion to remove abnormal data caused by equipment failure or extreme conditions (rainstorm overflow, industrial wastewater impact), and retain only the time-series characteristic data under normal operating conditions. Perform Min-Max standardization on the data (mapped to the [0,1] interval) and divide it into training set and validation set in a 7:3 ratio.
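The data preparation described above (3σ filtering, Min-Max standardization, 7:3 split) can be sketched with the standard library. The helper names and the sample COD series are hypothetical, and splitting in chronological order without shuffling is an assumption, since the text gives only the ratio.

```python
from statistics import mean, pstdev

def three_sigma_filter(values):
    """Keep only values within three standard deviations of the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) <= 3 * sigma]

def min_max(values):
    """Map values linearly onto [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def split_7_3(samples):
    """Chronological 7:3 split (assumed); integer arithmetic avoids rounding drift."""
    cut = len(samples) * 7 // 10
    return samples[:cut], samples[cut:]

# Hypothetical COD series (mg/L): normal operation plus one storm-overflow spike.
cod = [98.0, 102.0] * 10 + [400.0]

cleaned = three_sigma_filter(cod)
scaled = min_max(cleaned)
train, val = split_7_3(scaled)
print(len(cod), len(cleaned), len(train), len(val))
```

Note that the 3σ criterion needs enough samples to work: with very few points a single outlier inflates the standard deviation so much that it can no longer exceed three deviations.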

[0036] The training objective is to minimize the mean squared error (MSE) between the input data and the reconstructed data. The model is trained with the Adam optimizer (learning rate = 0.001, number of iterations = 500). Model performance is monitored on the validation set to avoid overfitting: training is stopped when the validation MSE has not decreased for 10 consecutive rounds.
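The stopping rule can be isolated from the training loop and checked on its own. In this sketch the function `early_stop_epoch` and the validation MSE history are hypothetical; the patience of 10 rounds and the cap of 500 iterations are the values given in the text.

```python
def mse(original, reconstructed):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)

def early_stop_epoch(val_mse_history, patience=10, max_epochs=500):
    """Return (epoch at which training stops, best validation MSE seen)."""
    best = float("inf")
    stale = 0
    for epoch, value in enumerate(val_mse_history[:max_epochs], start=1):
        if value < best:
            best, stale = value, 0      # improvement resets the counter
        else:
            stale += 1
            if stale >= patience:
                return epoch, best      # no decrease for `patience` rounds
    return min(len(val_mse_history), max_epochs), best

# Hypothetical validation curve: improves for 3 rounds, then plateaus.
history = [0.5, 0.3, 0.2] + [0.25] * 15
print(early_stop_epoch(history))
```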

[0037] After training, the model has mastered the temporal correlation and numerical distribution of parameters under normal operating conditions, and can achieve high-precision reconstruction of normal data (small MSE). However, due to deviation from this pattern, the reconstruction accuracy of abnormal data will decrease significantly (large MSE).

[0038] The real-time time-series feature data verified by the hard rule base is input into the trained process adaptation model, and the reconstructed data is output. The reconstruction error between the reconstructed data and the real-time time-series feature data is calculated. The reconstruction error is the core indicator for abnormal data identification.

[0039] Based on the reconstruction error, a dynamic update threshold is calculated: Dynamic update threshold = Mean of validation set MSE + 2 × Standard deviation of validation set MSE. The threshold is then recalibrated using the latest normal operating data to adapt to the slow changes in process conditions. If the real-time reconstruction error is greater than the dynamic threshold, it is initially determined to be a potential anomaly. Based on the wastewater treatment process mechanism, implicit logical relationship constraints between characteristic parameters are constructed. For samples initially judged as potential anomalies, further verification is conducted to determine whether they violate the process constraint logic: if they violate the constraint, they are ultimately judged as implicit anomalies; if they do not violate the constraint, they are judged as normal deviations caused by operating condition fluctuations, so as to avoid misjudgment.
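The two-stage judgment described above (threshold check, then process-constraint check) can be sketched as follows. The function names are hypothetical, the population standard deviation is an assumption, and the constraint passed in by the caller is a placeholder rather than an actual process rule from this embodiment.

```python
import statistics


def mse(x, x_hat):
    """Mean squared reconstruction error between a sample and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def dynamic_threshold(validation_mses):
    """Threshold = mean of validation MSE + 2 x standard deviation of validation MSE."""
    return statistics.fmean(validation_mses) + 2 * statistics.pstdev(validation_mses)


def classify(sample, reconstruction, threshold, violates_constraints):
    """Return 'normal', 'implicit anomaly', or 'normal deviation'."""
    if mse(sample, reconstruction) <= threshold:
        return "normal"
    # Potential anomaly: confirm only if the process constraint logic is violated.
    if violates_constraints(sample):
        return "implicit anomaly"
    return "normal deviation"
```

Recalibration can reuse `dynamic_threshold` by recomputing it over the reconstruction errors of the most recent data judged normal, so the threshold follows slow drifts in process conditions.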

[0040] The pre-built fusion model is trained using historical data of measurable parameters and parameters to be tested. In this embodiment, the parameters to be tested can be difficult-to-measure indicators such as BOD5, fecal coliforms, and heavy metals in sludge, which rely on laboratory testing. The fusion model can be a CNN-LSTM fusion model, which consists of an input layer, a CNN feature extraction layer, an LSTM temporal learning layer, and a fusion output layer.

[0041] The CNN-LSTM fusion model uses 1-3 years of measurable parameter time series data and corresponding data of parameters to be tested as datasets. The CNN layer and LSTM layer are trained separately. The convolution kernel parameters of the CNN and the weight parameters of the LSTM are adjusted through backpropagation so that the fused features can accurately map to the true values of the difficult-to-measure indicators. Finally, the RMSE of the training set converges to below 0.05 (after standardization).

[0042] After obtaining the trained CNN-LSTM fusion model, real-time measurable parameters are input into the trained fusion model to obtain the theoretical predicted values of the parameters to be tested. The deviation between the theoretical predicted values and the actual values of the parameters to be tested is calculated and compared with a preset deviation threshold. If the deviation value is within the deviation threshold range, the prediction is valid. If the deviation value exceeds the deviation threshold range, a working condition correction coefficient is introduced, and the deviation threshold is adjusted based on the working condition correction coefficient. The correction coefficient is calculated based on the differences between real-time influent load, aeration intensity, and other process parameters and those under normal operating conditions.
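The deviation check can be sketched as below. The text does not give a formula for the working condition correction coefficient or state what happens after the threshold is adjusted, so both are assumptions here: the coefficient is taken as 1 plus the mean relative difference of the process parameters from their normal values, and the deviation is compared once more against the adjusted threshold. All names and result labels are hypothetical.

```python
def correction_coefficient(realtime, normal):
    """Assumed formula: 1 + mean relative difference from normal operating values."""
    rel = [abs(realtime[k] - normal[k]) / abs(normal[k]) for k in normal]
    return 1.0 + sum(rel) / len(rel)


def verify(predicted, actual, base_threshold, realtime, normal):
    """Compare the prediction deviation with the (possibly adjusted) threshold."""
    deviation = abs(predicted - actual)
    if deviation <= base_threshold:
        return "valid"
    # Deviation exceeds the preset threshold: widen it by the working condition
    # correction coefficient and compare again (assumed behavior).
    adjusted = base_threshold * correction_coefficient(realtime, normal)
    if deviation <= adjusted:
        return "valid under adjusted threshold"
    return "suspect"
```

For example, with an influent load 50% above normal and unchanged aeration intensity, the assumed coefficient is 1.25, so a preset threshold of 0.10 is relaxed to 0.125 for that operating condition.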

[0043] In summary, this invention achieves multi-parameter adaptive data acquisition by dynamically optimizing sampling points with a density clustering algorithm, improving data acquisition efficiency and accuracy. It constructs a three-layer intelligent verification mechanism (rule base verification, unsupervised autoencoder anomaly identification, and fusion-model prediction verification of the parameters to be tested) to verify data in the wastewater treatment process. This invention overcomes the limitations of fixed sampling points and provides an intelligent, precise, and standardized solution for wastewater treatment plant site surveys and data collection. It is applicable to site surveys, assessments, process optimization, and upgrading of municipal wastewater treatment plants and industrial wastewater treatment stations.

[0044] Example 2 In another aspect, this invention provides an intelligent data acquisition and verification system for wastewater treatment site surveys. Figure 2 shows a schematic of the intelligent data acquisition and verification system for wastewater treatment site surveys in Embodiment 2 of the present invention. The system includes: Extraction module 11, used to divide the plant area into several core process units and extract the multi-dimensional clustering features of the core process units; Clustering module 12, used to perform differentiated parameter calibration based on the different types of core process units, obtain clustering results from the differentiated parameters using a preset algorithm, generate a dynamic acquisition strategy, and preprocess the acquired data; Learning module 13, used to construct a hard rule base based on wastewater standards, construct a process adaptation model, and train the process adaptation model using historical wastewater operating data and the hard rule base to learn the parameter characteristic patterns; Verification module 14, used to predict the operating parameters of the wastewater based on real-time measurable parameters of the wastewater, obtain the predicted values, compare them with the actual values of the wastewater operating parameters, and perform verification based on the comparison results.

[0045] Example 3 In another aspect, the present invention also proposes a computer-readable storage medium having stored thereon one or more computer programs that, when executed by a processor, implement the above-described intelligent data collection and verification method for wastewater treatment site surveys.

[0046] Those skilled in the art will understand that the logic or steps represented in the flowchart or otherwise described herein can, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and can be embodied in any computer-readable storage medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable storage medium" can be any means that can contain, store, communicate, propagate, or transmit programs for use by, or in conjunction with, an instruction execution system, apparatus, or device.

[0047] More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CD-ROM). Furthermore, the computer-readable storage medium can even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it as necessary, and then stored in computer memory.

[0048] Example 4 Figure 3 is a structural block diagram of an electronic device provided in Embodiment 4. The electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, it implements the intelligent data collection and verification method for wastewater treatment site surveys described in the above embodiments. The electronic device 30 shown in Figure 3 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments of the present invention.

[0049] As shown in Figure 3, the electronic device 30 can take the form of a general-purpose computing device, such as a server device. The components of the electronic device 30 may include, but are not limited to: at least one processor 31, at least one memory 32, and a bus 33 connecting different system components (including memory 32 and processor 31).

[0050] Bus 33 includes a data bus, an address bus, and a control bus.

[0051] The memory 32 may include volatile memory, such as RAM 321 (random access memory) and/or cache memory 322, and may further include ROM 323 (read-only memory).

[0052] The memory 32 may also include a program tool 325 having a set of (at least one) program modules 324. Such program modules 324 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.

[0053] The processor 31 executes various functional applications and data processing by running computer programs stored in the memory 32, such as the intelligent acquisition and verification method for wastewater treatment site survey data as described above.

[0054] Electronic device 30 can also communicate with one or more external devices 34 (e.g., keyboard, pointing device, etc.). This communication can be performed via I/O interface 35 (input/output interface). Furthermore, electronic device 30 can also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via network adapter 36. As shown in Figure 3, network adapter 36 communicates with the other modules of the electronic device 30 via bus 33. It should be understood that, although not shown in the figure, other hardware and/or software modules can be used in conjunction with the electronic device 30, including but not limited to: microcode, device drivers, redundant processors, disk drive arrays, RAID (disk array) systems, tape drives, and data backup storage systems.

[0055] It should be noted that although several units/modules or sub-units/modules of the electronic device have been mentioned in the detailed description above, this division is merely exemplary and not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more units/modules described above can be embodied in one unit/module. Conversely, the features and functions of one unit/module described above can be further divided and embodied by multiple units/modules.

[0056] In the description of this specification, references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.

[0057] The embodiments described above are merely illustrative of several implementations of the present invention, and while the descriptions are relatively specific and detailed, they should not be construed as limiting the scope of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent should be determined by the appended claims.

Claims

1. A method for intelligent data collection and verification of wastewater treatment site survey data, characterized in that the method includes: dividing the plant area into several core process units and extracting the multi-dimensional clustering features of the core process units; performing differentiated parameter calibration based on the different types of core process units, obtaining clustering results from the differentiated parameters using a preset algorithm, generating a dynamic acquisition strategy, and preprocessing the acquired data; constructing a hard rule base based on wastewater standards, constructing a process adaptation model, and training the process adaptation model using historical wastewater operating data and the hard rule base to learn the parameter characteristic patterns; and predicting the operating parameters of the wastewater based on real-time measurable parameters of the wastewater to obtain predicted values, comparing the predicted values with the actual values of the wastewater operating parameters, and performing verification based on the comparison results.

2. The intelligent data collection and verification method for wastewater treatment site surveys according to claim 1, characterized in that the step of dividing the plant area into several core process units and extracting the multi-dimensional clustering features of the core process units specifically includes: dividing the plant area, based on the wastewater treatment process, into several core process units, which include at least a screen tank, a grit chamber, a biological treatment tank, a secondary sedimentation tank, a deep treatment unit, a disinfection tank, and a sludge treatment unit; spatially subdividing each core process unit into several array grid units, so that the spatial granularity of the cluster analysis matches the actual gradient of the process reaction; and extracting the multi-dimensional clustering features of each array grid unit and standardizing them, the multi-dimensional clustering features including basic parameter features, process reaction features, hydraulic characteristic features, and operating condition stability features.

3. The intelligent data collection and verification method for wastewater treatment site surveys according to claim 1, characterized in that the step of performing differentiated parameter calibration based on the different types of core process units, obtaining clustering results from the differentiated parameters using a preset algorithm, generating a dynamic acquisition strategy, and preprocessing the acquired data includes: setting differentiated DBSCAN parameters for the different types of core process units based on the parameter distribution characteristics of each core process unit; and applying the DBSCAN algorithm to the multi-dimensional clustering features to output the clustering results of the current core process unit, the clustering results including core clusters, noise points, and edge points.

4. The intelligent data collection and verification method for wastewater treatment site surveys according to claim 3, characterized in that, after the clustering results are obtained, the cluster density of the clustering results and the variance of the parameter fluctuation of the array grid units in the core process unit are combined to determine the regional sensitivity level of the current core process unit; different sampling frequencies and sampling equipment types are set based on the sensitivity level to perform dynamic data acquisition; and after the multi-source data is collected, the high-frequency fluctuation parameters are denoised, short-term missing data is filled using the K-nearest neighbor algorithm, and long-term missing data is filled using an LSTM time-series prediction model.

5. The intelligent data collection and verification method for wastewater treatment site surveys according to claim 1, characterized in that the step of constructing a hard rule base based on wastewater standards, constructing a process adaptation model, and training the process adaptation model using historical wastewater operating data and the hard rule base to learn the parameter characteristic patterns includes: constructing, based on wastewater treatment standards, a hard rule base that defines numerical ranges, logical relationships, and temporal consistency; selecting, based on the wastewater treatment process mechanism, feature parameters strongly correlated with the core process unit to form a process feature set; constructing a process adaptation model consisting of an input layer, an encoding layer, a bottleneck layer, a decoding layer, and an output layer; training and iterating the process adaptation model using the parameters of the historical wastewater process feature set to learn the parameter feature patterns of the wastewater treatment process; inputting the real-time time-series feature data verified by the hard rule base into the trained process adaptation model and outputting the reconstructed data; calculating the reconstruction error between the reconstructed data and the real-time time-series feature data, the reconstruction error being the core indicator for abnormal data identification; calculating a dynamic update threshold based on the reconstruction error; determining, if the real-time reconstruction error is greater than the dynamic update threshold, that there is a potential anomaly in the operating data of the wastewater treatment process; and verifying, in combination with the wastewater treatment process mechanism, whether the real-time time-series feature data of the potential anomaly violates the process constraint logic.

6. The intelligent data collection and verification method for wastewater treatment site surveys according to claim 1, characterized in that the step of predicting the operating parameters of the wastewater based on real-time measurable parameters, obtaining predicted values, comparing them with the actual values of the wastewater operating parameters, and performing verification based on the comparison results includes: training the pre-built fusion model using historical data of the measurable parameters and of the parameters to be tested; inputting real-time measurable parameters into the trained fusion model to obtain the theoretical predicted values of the parameters to be tested; calculating the deviation between the theoretical predicted values and the actual values of the parameters to be tested and comparing it with a preset deviation threshold; if the deviation is within the deviation threshold range, determining that the prediction is valid; and if the deviation exceeds the deviation threshold range, introducing a working condition correction coefficient and adjusting the deviation threshold based on the working condition correction coefficient.

7. An intelligent data acquisition and verification system for wastewater treatment site surveys, characterized in that the system is used to implement the intelligent data collection and verification method for wastewater treatment site surveys as described in any one of claims 1-6, and the system includes: an extraction module, used to divide the plant area into several core process units and extract the multi-dimensional clustering features of the core process units; a clustering module, used to perform differentiated parameter calibration based on the different types of core process units, obtain clustering results from the differentiated parameters using a preset algorithm, generate a dynamic acquisition strategy, and preprocess the acquired data; a learning module, used to construct a hard rule base based on wastewater standards, construct a process adaptation model, and train the process adaptation model using historical wastewater operating data and the hard rule base to learn the parameter characteristic patterns; and a verification module, used to predict the operating parameters of the wastewater based on real-time measurable parameters of the wastewater, obtain the predicted values, compare them with the actual values of the wastewater operating parameters, and perform verification based on the comparison results.

8. A computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the computer program implements the intelligent data collection and verification method for wastewater treatment site surveys as described in any one of claims 1-6.

9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the intelligent data collection and verification method for wastewater treatment site surveys as described in any one of claims 1-6.