Machine learning based multi-point collaborative anomaly detection method and system for pollution sources

A spatiotemporal data matrix of pollutants from multiple monitoring points is combined with an improved spatiotemporal Transformer network (STTN) model to characterize how pollutants propagate in multi-source environments, which supports pollution source localization and the identification of abnormal events.

CN122241537A · Pending · Publication Date: 2026-06-19 · ZHUHAI DINGZHENG GUOXIN TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: ZHUHAI DINGZHENG GUOXIN TECH CO LTD
Filing Date: 2026-05-21
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

In complex environments with multiple pollution sources and multiple monitoring points, existing technologies struggle to accurately characterize the propagation and attenuation relationships of pollutants between monitoring points. They also lack constrained modeling of feasible pollution propagation paths and cannot effectively distinguish the interference relationships between different pollution sources, leading to difficulties in pollution event identification and source location inversion.

Method used

A spatiotemporal data matrix of pollutants from multiple monitoring points is constructed. Combined with an improved spatiotemporal Transformer network (STTN) model, the modeling of pollutant propagation process and the location of pollution sources are realized through propagation consistency inversion calculation and cross-source interference propagation structure. This enables the identification of collaborative anomalies at multiple monitoring points.

Benefits of technology

It improves the accuracy of pollution transmission analysis and the reliability of pollution source location, enabling accurate identification of abnormal events and reconstruction of transmission paths in cases of multiple pollution sources overlapping, thereby reducing the risk of misjudgment.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention discloses a machine learning-based method and system for multi-point collaborative anomaly detection of pollution sources, comprising: collecting and preprocessing monitoring data, constructing a pollutant spatiotemporal matrix, and generating pollutant time-series input; calculating propagation attenuation and constructing a self-dilution sequence to determine the feasible propagation domain and form a propagation graph structure; inputting the propagation graph into an improved STTN model for spatiotemporal feature learning to obtain a forward pollution field; performing consistency inversion based on the propagation graph to construct a theoretical propagation field and comparing it with the forward result to identify anomalies; constructing a pollution propagation contribution matrix, calculating the pollution source interference intensity, and forming a cross-source interference propagation structure; and solving for emission intensity under a sparse constraint based on the interference structure, determining the pollution sources, reconstructing the propagation path, and outputting the detection results. By constructing a pollutant propagation graph structure and combining it with an improved spatiotemporal Transformer model, the invention achieves collaborative anomaly identification across multiple monitoring points and pollution source localization.

Description

Technical Field

[0001] This invention relates to the field of machine learning technology, and in particular to a method and system for multi-point collaborative anomaly detection of pollution sources based on machine learning.

Background Technology

[0002] With the acceleration of industrialization and the continuous expansion of urbanization, factors such as industrial emissions, transportation emissions, and regional pollution transport have led to a multi-source and dynamically changing nature of pollutants in the atmospheric environment. To achieve effective monitoring of environmental pollution, existing technologies typically deploy multiple pollution monitoring stations within cities or industrial parks to continuously collect data on particulate matter and gaseous pollutant concentrations, and then analyze pollutant trends in conjunction with meteorological data. Based on this, some technical solutions use time-series models or machine learning models to predict pollutant concentration changes, thereby enabling the identification and early warning of abnormal pollution events.

[0003] In existing technologies, common pollution anomaly detection methods mainly rely on single-monitoring-point concentration threshold judgment, time-series prediction bias analysis, or statistical feature-based anomaly detection methods. Some technologies also use graph neural networks or spatiotemporal deep learning models to model data from multiple monitoring points to identify the spatiotemporal relationships of pollutant changes between different monitoring points. However, these methods mostly focus on statistical modeling or predictive analysis of pollutant concentration changes, failing to adequately consider the actual propagation relationships of pollutants between monitoring points, and thus struggling to accurately describe the spatial diffusion process of pollutants. When dealing with pollution propagation problems caused by multiple pollution sources, existing technologies often cannot effectively distinguish the mutual influence between different pollution sources, easily misjudging concentration changes caused by multiple sources as a single pollution event.

[0004] In complex environments with multiple pollution sources and multiple monitoring points, existing technologies still have significant shortcomings. For example, they struggle to accurately characterize the propagation and attenuation relationship of pollutants between monitoring points, lack constrained modeling of feasible pollution propagation paths, cannot verify whether the pollution concentration changes observed at monitoring points are consistent with propagation, and have difficulty identifying the interference relationship between multiple pollution sources and inverting the location of pollution sources.

[0005] Therefore, how to provide a multi-point collaborative anomaly detection method and system for pollution sources based on machine learning is a problem that urgently needs to be solved by those skilled in the art.

Summary of the Invention

[0006] One objective of this invention is to propose a machine learning-based method and system for detecting multi-point collaborative anomalies in pollution sources. This invention combines spatiotemporal data analysis of pollutants from multiple monitoring points, dynamic propagation graph structure modeling, and an improved spatiotemporal Transformer network (STTN) model to model the propagation process of pollutants among multiple monitoring points. Through propagation consistency inversion calculations and the construction of cross-source interference propagation structures, it achieves the identification of collaborative anomalies at multiple monitoring points and the inversion analysis of pollution source emission intensity and propagation path, thereby completing the pollution source localization. This invention can effectively characterize the spatial propagation relationship of pollutants, solve the problem of difficulty in distinguishing pollution propagation under conditions of multiple superimposed pollution sources, and has the advantages of high anomaly identification accuracy, strong pollution propagation analysis capability, and high reliability of pollution source localization.

[0007] The machine learning-based multi-point collaborative anomaly detection method for pollution sources according to embodiments of the present invention includes:
Pollutant concentration data and environmental auxiliary data are collected from multiple pollution monitoring points and preprocessed to construct a multi-monitoring-point pollutant spatiotemporal data matrix and generate pollutant time-series input data.
Based on the pollutant time-series input data, the pollutant propagation attenuation relationship is calculated and the pollutant self-dilution sequence is constructed to determine the feasible domain of pollution propagation and form a dynamic propagation graph structure.
The dynamic propagation graph structure is used as a spatial constraint input to the improved spatiotemporal Transformer network (STTN) model. The pollutant time-series input data is input into the STTN model for spatiotemporal feature learning and forward propagation reconstruction calculation to obtain the forward pollution field reconstruction result.
Based on the dynamic propagation graph structure and the pollutant time-series input data, a propagation consistency inversion calculation is performed on the changes in pollution concentration at the monitoring points to construct a theoretical pollution propagation field. The difference between the theoretical pollution propagation field and the forward pollution field reconstruction result is calculated to obtain the pollution propagation consistency index and identify collaborative abnormal events at multiple monitoring points.
After a multi-monitoring-point collaborative abnormal event is identified, a pollution propagation contribution matrix is constructed based on the dynamic propagation graph structure and the pollutant concentration change sequences of the monitoring points. The mutual interference intensity between candidate pollution sources is calculated to construct a pollution source interference matrix, forming a cross-source interference propagation structure.
Based on the cross-source interference propagation structure, the emission intensity sequence of the candidate pollution sources is solved under a sparse constraint to determine the number of pollution sources, their locations, and their emission intensity change sequences. The pollution propagation path is reconstructed based on the dynamic propagation graph structure to obtain the multi-point collaborative anomaly detection results and pollution source location results.

[0008] Optionally, the pollutant concentration data includes particulate matter concentration data and gaseous pollutant concentration data, and the environmental auxiliary data includes wind speed data, wind direction data, temperature data, humidity data, and air pressure data.

[0009] Optionally, the step of constructing a multi-monitoring-point pollutant spatiotemporal data matrix and generating pollutant time-series input data includes: The pollutant concentration data and environmental auxiliary data of each monitoring point are time-aligned to a unified time scale and arranged by monitoring point number and time order to construct a pollutant spatiotemporal data matrix covering multiple monitoring points and multiple time steps. The pollutant spatiotemporal data matrix is then segmented with a sliding window of preset length, dividing the continuous time-series data into multiple time segments that are combined in time order to form the pollutant time-series input data.
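
The matrix construction and sliding segmentation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `build_matrix` and `sliding_windows`, the `stride` parameter, and the nested-dictionary input layout are assumptions, and the readings are assumed to be already preprocessed and aligned to a common time grid.

```python
def build_matrix(records, station_ids, timestamps):
    """Arrange readings as a [time step][monitoring point] matrix.

    records maps station id -> {timestamp: concentration}; every station is
    assumed to have a reading at every timestamp (alignment already done).
    """
    return [[records[s][t] for s in station_ids] for t in timestamps]


def sliding_windows(matrix, window, stride=1):
    """Cut the matrix into overlapping segments of `window` consecutive time steps."""
    last_start = len(matrix) - window
    return [matrix[i:i + window] for i in range(0, last_start + 1, stride)]


# Two hypothetical monitoring points observed over four time steps.
records = {
    "A": {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0},
    "B": {0: 10.0, 1: 20.0, 2: 30.0, 3: 40.0},
}
matrix = build_matrix(records, ["A", "B"], [0, 1, 2, 3])
segments = sliding_windows(matrix, window=3)
```

With a window of three steps, the four-step matrix yields two segments, kept in time order.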

[0010] Optionally, determining the feasible domain of pollution propagation and forming a dynamic propagation graph structure includes:
Based on the pollutant time-series input data, the pollutant concentration change sequence of each monitoring point over consecutive time steps is extracted, the geographical location information and environmental auxiliary data of each monitoring point are obtained, the spatial distance between monitoring points is calculated from their spatial relationship, and the spatiotemporal correlation between monitoring points is established in time order.
Based on the pollutant concentration change sequence at each monitoring point, the amplitude of pollutant concentration change between any two monitoring points at adjacent time steps is calculated. Combined with the spatial distance between monitoring points and the wind speed and wind direction information in the environmental auxiliary data, the propagation delay time for pollutants to spread from one monitoring point to another is determined, and the pollutant propagation attenuation relationship between monitoring points is obtained.
Based on the pollutant propagation attenuation relationship, the degree of propagation attenuation between monitoring points over multiple consecutive time windows is statistically analyzed, the average attenuation intensity of each monitoring point with respect to its adjacent monitoring points is calculated, and a pollutant self-dilution sequence is constructed from the average attenuation intensity.
Based on the pollutant self-dilution sequence, the propagation attenuation intensity between monitoring points is screened by threshold, and the propagation relationship is checked for temporal consistency using the propagation delay time. Monitoring point pairs that meet the propagation attenuation conditions and remain temporally consistent are identified, and the feasible domain of pollution propagation is determined.
Based on the feasible domain of pollution propagation, a set of candidate upstream monitoring points is identified for each monitoring point, and the propagation connection relationships between monitoring points are established according to the propagation attenuation intensity and propagation delay time. A dynamic propagation graph structure containing the set of monitoring points, the propagation connection relationships, and the propagation weights is constructed.
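
A rough sketch of how such a propagation graph might be assembled is given below. The text does not give formulas, so everything here is an assumption made for illustration: planar coordinates in metres, a single constant wind vector, delay computed as distance divided by the wind component toward the downstream point, attenuation measured as the average fraction of an upstream concentration change that is lost on arrival, and the names `delay_steps`, `attenuation`, `build_graph`, and `max_attenuation`. The multi-window temporal consistency check is omitted.

```python
import math


def delay_steps(src, dst, wind, step_seconds):
    """Travel time from src to dst in time steps, or None if dst is not downwind."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return None
    along = (wind[0] * dx + wind[1] * dy) / dist  # wind speed component toward dst
    if along <= 0:
        return None
    return max(1, round(dist / along / step_seconds))


def attenuation(up, down, delay):
    """Average fraction of each upstream concentration change lost after `delay` steps."""
    losses = []
    for t in range(1, len(up) - delay):
        d_up = abs(up[t] - up[t - 1])
        d_down = abs(down[t + delay] - down[t + delay - 1])
        if d_up > 0:
            losses.append(1.0 - min(d_down / d_up, 1.0))
    return sum(losses) / len(losses) if losses else None


def build_graph(coords, series, wind, step_seconds, max_attenuation=0.8):
    """Keep directed edges whose attenuation passes the threshold (the feasible domain)."""
    edges = {}
    for i in coords:
        for j in coords:
            if i == j:
                continue
            d = delay_steps(coords[i], coords[j], wind, step_seconds)
            if d is None:
                continue
            a = attenuation(series[i], series[j], d)
            if a is not None and a <= max_attenuation:
                edges[(i, j)] = {"delay": d, "weight": 1.0 - a}
    return edges


coords = {"A": (0.0, 0.0), "B": (600.0, 0.0)}
series = {"A": [0.0, 10.0, 10.0, 10.0, 10.0], "B": [0.0, 0.0, 5.0, 5.0, 5.0]}
graph = build_graph(coords, series, wind=(1.0, 0.0), step_seconds=600)
```

In this toy case B lies downwind of A, so only the edge from A to B survives; the reverse direction is rejected because the wind component toward A is negative.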

[0011] Optionally, obtaining the forward pollution field reconstruction result includes:
The dynamic propagation graph structure is aligned and encoded with the pollutant time-series input data. Each monitoring point and propagation connection in the dynamic propagation graph structure is mapped to a spatial topology input, and the pollutant concentration values of each monitoring point at consecutive time steps in the pollutant time-series input data are mapped to a time-series input. These serve as the spatial and temporal inputs of the improved spatiotemporal Transformer network (STTN) model.
An improved STTN model is constructed, which includes a propagation-feasible-domain gated spatial attention layer, a self-dilution modulated temporal attention layer, and a physical constraint fusion reconstruction layer.
The propagation-feasible-domain gated spatial attention layer is constrained by the dynamic propagation graph structure. It performs attention calculation on the spatial associations between monitoring points and gates the spatial attention weights according to the feasible domain of pollution propagation, so that only the spatial propagation weights between monitoring point pairs within the feasible domain are retained, yielding a spatial feature representation constrained by the feasible domain.
The self-dilution modulated temporal attention layer takes the pollutant time-series input data and the pollutant self-dilution sequence as input, performs attention calculation on the correlations between time steps at multiple time scales, and modulates the temporal attention weights using the dilution intensity given by the pollutant self-dilution sequence, yielding a temporal feature representation that characterizes both the temporal variation of the pollutant and its self-dilution properties.
The physical constraint fusion reconstruction layer fuses the gated spatial feature representation with the self-dilution modulated temporal feature representation and introduces the propagation weight information associated with the monitoring points in the dynamic propagation graph structure. It performs forward propagation reconstruction calculation under physical constraints on the fused features to generate the pollution concentration reconstruction result of each monitoring point at the target time step. The pollution concentration reconstruction results of all monitoring points output by the physical constraint fusion reconstruction layer are combined to form the forward pollution field reconstruction result.
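
The gating step in the spatial attention layer can be illustrated with a masked softmax. This is only a sketch under assumptions: the raw attention scores are taken as given (the learned query/key projections are not shown), the feasible domain is supplied as a boolean matrix, and the name `gated_attention` is hypothetical. Masking before normalization is one common way to keep only permitted pairs; the text does not say whether the patent gates before or after normalization.

```python
import math


def gated_attention(scores, feasible):
    """Row-wise softmax over raw scores, restricted to feasible (i, j) pairs.

    scores[i][j]   raw attention score of monitoring point i toward point j
    feasible[i][j] True if the pair (i, j) lies in the propagation feasible domain
    Pairs outside the feasible domain receive weight exactly 0.
    """
    weights = []
    for i, row in enumerate(scores):
        allowed = [j for j in range(len(row)) if feasible[i][j]]
        out = [0.0] * len(row)
        if allowed:
            top = max(row[j] for j in allowed)  # subtract max for numerical stability
            exps = {j: math.exp(row[j] - top) for j in allowed}
            total = sum(exps.values())
            for j in allowed:
                out[j] = exps[j] / total
        weights.append(out)
    return weights


scores = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]]
feasible = [[True, True, False], [False, False, True]]
attn = gated_attention(scores, feasible)
```

Each row still sums to one over its feasible neighbours, so downstream layers see a properly normalized weighting that ignores infeasible pairs.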

[0012] Optionally, obtaining the pollution propagation consistency index and identifying collaborative abnormal events at multiple monitoring points includes:
Based on the dynamic propagation graph structure and the pollutant time-series input data, the pollutant concentration change sequence of each monitoring point over consecutive time steps is extracted, and the set of candidate upstream monitoring points for each monitoring point is determined from the propagation connection relationships in the dynamic propagation graph structure.
For each monitoring point, the pollutant concentration changes of the upstream monitoring points in its candidate upstream set are time-aligned according to their propagation delays. Combined with the attenuation intensity given by the pollutant self-dilution sequence, the pollution propagation contribution of each upstream monitoring point to the target monitoring point within the historical time window is back-calculated, yielding the theoretical propagation contribution sequence of each upstream monitoring point to the target monitoring point.
A propagation consistency inversion calculation is performed on the theoretical propagation contribution sequences. The propagation contributions of the upstream monitoring points are accumulated and superimposed over multiple consecutive time windows and combined with the propagation weights in the dynamic propagation graph structure to estimate, by inversion, the theoretical pollution concentration of the target monitoring point at the current time step, yielding the theoretical pollution concentration sequence of each monitoring point.
The theoretical pollution concentrations of all monitoring points at each time step are combined spatially by monitoring point number to construct a theoretical pollution propagation field, which represents the theoretical diffusion state of pollutants under the joint constraints of the dynamic propagation graph structure, the propagation delays, and the pollutant self-dilution sequence.
The difference between the theoretical pollution propagation field and the forward pollution field reconstruction result is calculated. The concentration differences of the monitoring points at each time step are statistically analyzed and aggregated to obtain a pollution propagation consistency index. When the pollution propagation consistency index exceeds a preset threshold, a multi-monitoring-point collaborative abnormal event is determined to exist.
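
One way to picture the consistency check is the sketch below. It assumes, for illustration only, that a monitoring point's theoretical concentration is the weighted sum of delayed upstream concentrations and that the index is the mean absolute difference between the two fields; the text specifies neither formula. Points with no upstream neighbour are skipped. The names `theoretical_field` and `consistency_index`, the edge layout, and the threshold value are hypothetical.

```python
def theoretical_field(series, edges, stations, steps):
    """Theoretical concentration per (time step, station) from delayed upstream values.

    edges maps (upstream, downstream) -> {"delay": int, "weight": float}.
    Entries are None where a station has no usable upstream contribution.
    """
    field = []
    for t in steps:
        row = []
        for j in stations:
            terms = [e["weight"] * series[i][t - e["delay"]]
                     for (i, k), e in edges.items()
                     if k == j and t - e["delay"] >= 0]
            row.append(sum(terms) if terms else None)
        field.append(row)
    return field


def consistency_index(theory, forward):
    """Mean absolute difference over all entries where a theoretical value exists."""
    diffs = [abs(a - b)
             for t_row, f_row in zip(theory, forward)
             for a, b in zip(t_row, f_row)
             if a is not None]
    return sum(diffs) / len(diffs) if diffs else 0.0


series = {"A": [0.0, 10.0, 10.0], "B": [0.0, 0.0, 5.0]}
edges = {("A", "B"): {"delay": 1, "weight": 0.5}}
theory = theoretical_field(series, edges, ["A", "B"], steps=[1, 2])
forward = [[10.0, 0.0], [10.0, 9.0]]  # stand-in for the STTN reconstruction
index = consistency_index(theory, forward)
is_anomaly = index > 1.0  # hypothetical preset threshold
```

Here the forward field places 9.0 at B where propagation from A explains only 5.0, so the index rises above the threshold and the event is flagged.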

[0013] Optionally, calculating the mutual interference intensity between candidate pollution sources to construct a pollution source interference matrix and form a cross-source interference propagation structure includes:
After a multi-monitoring-point collaborative abnormal event is identified, monitoring points or emission points that lie upstream of the anomaly area and have a propagation impact are selected, based on the dynamic propagation graph structure and the pollutant concentration change sequences at the monitoring points, as the set of candidate pollution sources, and the pollutant concentration change sequence of each candidate pollution source over consecutive time steps is extracted.
Based on the relationship between the candidate pollution source set and each monitoring point in the dynamic propagation graph structure, and combined with the pollutant self-dilution sequence and propagation delay information, the pollution contribution of each candidate pollution source to each monitoring point at each time step is calculated point by point and step by step. The contribution results are arranged by candidate pollution source number and monitoring point number to form a pollution propagation contribution matrix.
Based on the pollution propagation contribution matrix, a correlation analysis is performed on the contribution patterns of any two candidate pollution sources across multiple monitoring points and multiple time steps. The cases in which the contributions of two candidate pollution sources increase simultaneously, cancel each other out, or alternately dominate at the same monitoring point and at nearby time steps are counted. The mutual interference intensity between candidate pollution sources is calculated and arranged by candidate pollution source number to form a pollution source interference matrix.
Based on the pollution propagation contribution matrix and the pollution source interference matrix, the candidate pollution sources are grouped by clustering. Candidate pollution sources with mutual interference relationships are combined into cross-source interference subsets, and each subset is associated with the monitoring points it mainly affects and the corresponding propagation paths. This constructs a cross-source interference propagation structure that includes the set of candidate pollution sources, the pollution propagation contribution matrix, and the pollution source interference matrix.
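
The interference matrix can be illustrated with a deliberately simple stand-in measure. The text does not define the interference intensity numerically, so this sketch assumes it is the share of (monitoring point, time step) pairs at which both sources' contributions change together, whether in the same direction (simultaneous increase) or in opposite directions (cancellation). The names `interference` and `interference_matrix` and the input layout are hypothetical.

```python
def interference(contrib_p, contrib_q):
    """Share of active steps at which both sources' contributions change together.

    contrib_x[m][t] is the contribution of one source to monitoring point m at step t.
    A step is "active" if at least one of the two contributions changes.
    """
    both = active = 0
    for series_p, series_q in zip(contrib_p, contrib_q):
        for t in range(1, len(series_p)):
            dp = series_p[t] - series_p[t - 1]
            dq = series_q[t] - series_q[t - 1]
            if dp != 0 or dq != 0:
                active += 1
                if dp != 0 and dq != 0:
                    both += 1
    return both / active if active else 0.0


def interference_matrix(contrib):
    """Symmetric matrix of pairwise interference, indexed by candidate source number."""
    n = len(contrib)
    return [[0.0 if p == q else interference(contrib[p], contrib[q])
             for q in range(n)] for p in range(n)]


# Three hypothetical candidate sources, one monitoring point, four time steps.
contrib = [
    [[0.0, 1.0, 2.0, 2.0]],  # source 0
    [[0.0, 1.0, 0.0, 0.0]],  # source 1: rises with source 0, then falls as it rises
    [[0.0, 0.0, 0.0, 0.0]],  # source 2: inactive
]
matrix = interference_matrix(contrib)
```

Sources 0 and 1 change at the same steps and so interfere strongly, while the inactive source 2 shows no interference with either; a clustering step could then group sources 0 and 1 into one cross-source interference subset.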

[0014] Optionally, obtaining the multi-point collaborative anomaly detection results and pollution source location results includes:
Based on the cross-source interference propagation structure, the set of candidate pollution sources and the pollution propagation contribution information of each candidate pollution source over consecutive time steps are extracted. Based on the pollution propagation contribution matrix, an emission intensity sequence representation of the impact of the candidate pollution sources on the pollutant concentration at each monitoring point is established.
Based on the correspondence between the actual pollutant concentration change sequences observed at the monitoring points and the propagation contributions of the candidate pollution sources, pollution propagation reconstruction data is constructed. The emission intensity of each candidate pollution source at each time step is combined with the corresponding propagation contribution to compute the reconstructed pollution concentration sequence at the monitoring points.
The difference between the actual pollutant concentration sequence and the reconstructed pollution concentration sequence at the monitoring points is calculated, and an emission intensity solving problem is formulated with the objective of minimizing this difference.
Under a sparse constraint, the emission intensity sequence of the candidate pollution sources is solved iteratively, and candidate pollution sources whose emission intensity or propagation contribution falls below a preset threshold are progressively eliminated. The optimal emission intensity sequence that minimizes the difference is obtained, and the number and locations of the pollution sources actually involved in pollution propagation are determined.
Combined with the dynamic propagation graph structure, the pollution propagation path is backtracked: the propagation path of pollutants from the pollution sources to each monitoring point is traced along the propagation connection relationships. The multi-point collaborative anomaly detection results and the pollution source location results are thereby obtained.
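
The sparse-constrained solve can be sketched with a non-negative L1-penalized least-squares problem, solved here by projected proximal gradient descent (ISTA). This is a swapped-in, generic technique chosen for illustration; the text does not name the solver or the exact sparsity term. The function name `sparse_emissions`, the parameters `lam`, `iters`, and `prune`, and the single-time-step formulation are assumptions.

```python
def sparse_emissions(A, y, lam=0.1, iters=500, prune=1e-3):
    """Minimize 0.5 * ||A x - y||^2 + lam * sum(x) subject to x >= 0.

    A[m][s] contribution at monitoring point m per unit emission of candidate source s
    y[m]    observed concentration at monitoring point m
    Sources whose solved intensity falls below `prune` are eliminated (set to 0).
    """
    n_points, n_sources = len(A), len(A[0])
    # 1 / squared Frobenius norm is a safe step: it never exceeds 1 / largest eigenvalue of A^T A.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n_sources
    for _ in range(iters):
        resid = [sum(a * xs for a, xs in zip(row, x)) - ym for row, ym in zip(A, y)]
        grad = [sum(A[m][s] * resid[m] for m in range(n_points)) for s in range(n_sources)]
        # Gradient step, L1 shrinkage, and projection onto x >= 0 in one update.
        x = [max(0.0, xs - step * (g + lam)) for xs, g in zip(x, grad)]
    return [xs if xs >= prune else 0.0 for xs in x]


# Three hypothetical candidate sources, two monitoring points; only source 0 is truly emitting.
A = [[1.0, 0.0, 0.5],
     [0.0, 1.0, 0.5]]
y = [2.0, 0.0]
intensity = sparse_emissions(A, y)
active_sources = [s for s, v in enumerate(intensity) if v > 0.0]
```

The L1 penalty drives the two spurious candidates to exactly zero and slightly shrinks the surviving intensity, so the remaining non-zero entries give the number and identity of the sources; their positions in the propagation graph would then be used to backtrack the path.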

[0015] A machine learning-based multi-point collaborative anomaly detection system for pollution sources according to an embodiment of the present invention includes:
A data processing module, used to collect pollutant concentration data and environmental auxiliary data, perform preprocessing, and generate pollutant time-series input data.
A propagation modeling module, used to calculate the pollutant propagation attenuation relationship and construct the pollutant self-dilution sequence based on the pollutant time-series input data, forming a dynamic propagation graph structure.
A forward reconstruction module, used to take the dynamic propagation graph structure as a spatial constraint input to the improved STTN model and obtain the forward pollution field reconstruction result.
A propagation inversion module, used to perform propagation consistency inversion calculations based on the dynamic propagation graph structure and the pollutant time-series input data, and to identify collaborative abnormal events at multiple monitoring points.
An interference modeling module, used, after a collaborative abnormal event is identified, to construct a pollution propagation contribution matrix and calculate the interference intensity of candidate pollution sources based on the dynamic propagation graph structure and the pollutant concentration change sequences at the monitoring points, forming a cross-source interference propagation structure.
A pollution source localization module, used to solve the emission intensity sequence of the candidate pollution sources under a sparse constraint based on the cross-source interference propagation structure, so as to obtain the multi-point collaborative anomaly detection results and pollution source location results.

[0016] The beneficial effects of this invention are: This invention constructs a spatiotemporal data matrix of pollutants from multiple monitoring points and establishes pollutant propagation attenuation relationships and self-dilution sequences based on time-series input data. This further determines the feasible domain for pollution propagation and forms a dynamic propagation graph structure, enabling the effective characterization of pollutant propagation relationships between different monitoring points. Compared to traditional methods that rely solely on single-point concentration thresholds or time-series predictions, this invention introduces spatial propagation constraints at the data level, giving the pollutant propagation process a clear spatial correlation structure during analysis and improving the ability to describe pollution changes across multiple monitoring points.

[0017] This invention utilizes an improved spatiotemporal Transformer network (STTN) model to learn the spatiotemporal characteristics of pollutant time-series data under the spatial constraints of a dynamic propagation graph structure. A theoretical pollution propagation field is constructed through propagation consistency inversion calculation. The theoretical pollution propagation field is compared with the forward pollution field obtained from the model to obtain a pollution propagation consistency index, enabling the identification of collaborative anomalies at multiple monitoring points. By introducing a propagation consistency inversion mechanism, the invention can verify whether abnormal changes in monitoring data are plausible as propagation, which helps avoid misjudgments caused by single-point measurement errors or short-term fluctuations and improves the reliability of anomaly detection results.

[0018] This invention constructs a cross-source interference propagation structure by building a pollution propagation contribution matrix and a pollution source interference matrix. It then performs sparse constraint solving on the emission intensity sequence of candidate pollution sources and combines it with a dynamic propagation graph structure to perform backtracking analysis on the pollution propagation path. This enables the inversion calculation of the number, location, and emission intensity changes of pollution sources. It can identify the interference relationship between different pollution sources in the case of multiple pollution sources superimposed, and accurately reconstruct the pollution propagation path, improving the accuracy and stability of pollution source location. This provides a reliable technical means for pollution source tracing analysis in complex environments.

Attached Figure Description

[0019] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings: Figure 1 is a flowchart of the multi-point collaborative anomaly detection method for pollution sources based on machine learning proposed in this invention; Figure 2 is a schematic diagram of the structure of the multi-point collaborative anomaly detection system for pollution sources based on machine learning proposed in this invention.

Detailed Implementation

[0020] The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic diagrams, illustrating only the basic structure of the invention, and therefore only show the components relevant to the invention.

[0021] Referring to Figure 1, a machine learning-based multi-point collaborative anomaly detection method for pollution sources includes:
Pollutant concentration data and environmental auxiliary data are collected from multiple pollution monitoring points and preprocessed to construct a multi-monitoring-point pollutant spatiotemporal data matrix and generate pollutant time-series input data.
Based on the pollutant time-series input data, the pollutant propagation attenuation relationship is calculated and the pollutant self-dilution sequence is constructed to determine the feasible domain of pollution propagation and form a dynamic propagation graph structure.
The dynamic propagation graph structure is used as a spatial constraint input to the improved spatiotemporal Transformer network (STTN) model. The pollutant time-series input data is input into the STTN model for spatiotemporal feature learning and forward propagation reconstruction calculation to obtain the forward pollution field reconstruction result.
Based on the dynamic propagation graph structure and the pollutant time-series input data, a propagation consistency inversion calculation is performed on the changes in pollution concentration at the monitoring points to construct a theoretical pollution propagation field. The difference between the theoretical pollution propagation field and the forward pollution field reconstruction result is calculated to obtain the pollution propagation consistency index and identify collaborative abnormal events at multiple monitoring points.
After a multi-monitoring-point collaborative abnormal event is identified, a pollution propagation contribution matrix is constructed based on the dynamic propagation graph structure and the pollutant concentration change sequences of the monitoring points. The mutual interference intensity between candidate pollution sources is calculated to construct a pollution source interference matrix, forming a cross-source interference propagation structure.
Based on the cross-source interference propagation structure, the emission intensity sequence of the candidate pollution sources is solved under a sparse constraint to determine the number of pollution sources, their locations, and their emission intensity change sequences. The pollution propagation path is reconstructed based on the dynamic propagation graph structure to obtain the multi-point collaborative anomaly detection results and pollution source location results.

[0022] In this embodiment, the pollutant concentration data includes particulate matter concentration data and gaseous pollutant concentration data, and the environmental auxiliary data includes wind speed data, wind direction data, temperature data, humidity data, and air pressure data.

[0023] In this embodiment, the step of constructing a multi-monitoring point pollutant spatiotemporal data matrix and generating pollutant time-series input data includes: The pollutant concentration data and environmental auxiliary data of each monitoring point are time-aligned to a unified time scale and arranged by monitoring point number and time order to construct a pollutant spatiotemporal data matrix covering multiple monitoring points and multiple time steps. The pollutant spatiotemporal data matrix is then segmented with a sliding preset time window, which divides the continuous time series data into multiple time segments; the segments are combined in time order to form the pollutant time-series input data.
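
The alignment and sliding segmentation described in [0023] can be sketched as follows. This is only an illustration under stated assumptions: the function names `align_to_grid` and `sliding_windows`, the dictionary input layout, the `stride` parameter, and the use of NaN for missing readings are all hypothetical and are not specified in the original text.

```python
import numpy as np

def align_to_grid(readings, grid):
    """Arrange per-point readings on a unified time grid.

    readings: list with one dict per monitoring point, {timestamp: feature list}
              (assumed layout, not taken from the source).
    grid:     ordered list of timestamps forming the unified time scale.
    Returns an array of shape (time steps, monitoring points, features),
    with NaN where a point has no reading at a grid timestamp.
    """
    n_features = len(next(iter(readings[0].values())))
    matrix = np.full((len(grid), len(readings), n_features), np.nan)
    for point_idx, station in enumerate(readings):
        for step_idx, timestamp in enumerate(grid):
            if timestamp in station:
                matrix[step_idx, point_idx] = station[timestamp]
    return matrix

def sliding_windows(matrix, window, stride=1):
    """Cut the spatiotemporal matrix into overlapping time segments.

    Returns an array of shape (segments, window, monitoring points, features),
    with segments kept in time order.
    """
    n_steps = matrix.shape[0]
    if window > n_steps:
        raise ValueError("window is longer than the time series")
    starts = range(0, n_steps - window + 1, stride)
    return np.stack([matrix[s:s + window] for s in starts])
```

Missing values left as NaN here would still need the missing-value completion mentioned in the embodiment before being fed to a model.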

[0024] In this embodiment, determining the feasible region for pollution propagation and forming a dynamic propagation graph structure includes: Based on the pollutant time series input data, extract the pollutant concentration change sequence of each monitoring point in continuous time steps, obtain the geographical location information and environmental auxiliary data of each monitoring point, calculate the spatial distance between monitoring points according to the spatial relationship between monitoring points, and establish the spatiotemporal correlation between each monitoring point in time order. Based on the pollutant concentration change sequence at each monitoring point, the pollutant concentration change amplitude between any two monitoring points in adjacent time steps is calculated. Combined with the spatial distance between monitoring points and the wind speed and direction information in the environmental auxiliary data, the propagation delay time corresponding to the spread of pollutants from one monitoring point to another is determined, and the pollutant propagation attenuation relationship between monitoring points is obtained. Based on the pollutant propagation attenuation relationship, the degree of propagation attenuation among monitoring points in multiple consecutive time windows is statistically analyzed. The average attenuation intensity of each monitoring point on its adjacent monitoring points is calculated, and a pollutant self-dilution sequence is constructed based on the average attenuation intensity. Specifically, the calculation of the average attenuation intensity of each monitoring point on its adjacent monitoring points is as follows: Extract the pollutant concentration change sequence of the target monitoring point and adjacent monitoring points at each time step according to the preset time window, and align the pollutant concentration change sequence according to the propagation delay time. 
After the time alignment is completed, the pollutant concentration attenuation value between the target monitoring point and the adjacent monitoring point within each continuous time window is calculated. The pollutant concentration attenuation values obtained within multiple consecutive time windows are accumulated and averaged to obtain the average attenuation intensity of the target monitoring point on its adjacent monitoring points. Based on the pollutant self-dilution sequence, the propagation attenuation intensity between monitoring points is screened by threshold, and the propagation relationship is checked for temporal consistency in combination with the propagation delay time. Monitoring point pairs that meet the propagation attenuation conditions and remain temporally consistent are identified, and the feasible region for pollution propagation is determined. Based on the feasible region of pollution propagation, the set of candidate upstream monitoring points corresponding to each monitoring point is identified, and the propagation connection relationships between monitoring points are established according to the propagation attenuation intensity and propagation delay time. A dynamic propagation graph structure containing the set of monitoring points, the propagation connection relationships, and the propagation weights is constructed.
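
A minimal sketch of the delay-aligned average attenuation and the threshold screening in [0024] is given below. The text does not define the attenuation value numerically, so this sketch assumes it is the mean concentration drop from the upstream point to the delayed downstream point; the names `average_attenuation`, `feasible_edges`, and `max_attenuation`, and the convention that a negative delay marks a pair with no candidate link, are hypothetical. The temporal consistency check is omitted.

```python
import numpy as np

def average_attenuation(upstream, downstream, delay, window):
    """Average attenuation of `upstream` on `downstream` (assumed definition).

    The downstream series is shifted back by `delay` steps so that
    upstream[t] is paired with downstream[t + delay]; the mean drop is
    computed per sliding window and then averaged over all windows.
    """
    upstream = np.asarray(upstream, dtype=float)
    downstream = np.asarray(downstream, dtype=float)
    n = len(upstream) - delay
    if n < window:
        raise ValueError("series too short for this delay and window")
    a, b = upstream[:n], downstream[delay:delay + n]
    drops = [np.mean(a[s:s + window] - b[s:s + window])
             for s in range(n - window + 1)]
    return float(np.mean(drops))

def feasible_edges(conc, delays, window, max_attenuation):
    """Keep point pairs whose average attenuation passes the threshold screen.

    conc:   array (time steps, monitoring points).
    delays: delays[i][j] is the delay in steps from point i to point j,
            or a negative value if the pair is not a candidate.
    Returns {(i, j): average attenuation} for pairs where the downstream
    point is lower than the upstream point by at most `max_attenuation`.
    """
    conc = np.asarray(conc, dtype=float)
    n_points = conc.shape[1]
    edges = {}
    for i in range(n_points):
        for j in range(n_points):
            d = int(delays[i][j])
            if i == j or d < 0:
                continue
            att = average_attenuation(conc[:, i], conc[:, j], d, window)
            if 0.0 <= att <= max_attenuation:
                edges[(i, j)] = att
    return edges
```

The returned dictionary plays the role of the feasible region; propagation weights for the graph could be derived from the stored attenuation values, which the text leaves unspecified.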

[0025] In this embodiment, obtaining the forward pollution field reconstruction result includes: The dynamic propagation graph structure is aligned and encoded with the pollutant time-series input data. Each monitoring point and propagation connection in the dynamic propagation graph structure is mapped to a spatial topology input, and the pollutant concentration values of each monitoring point at continuous time steps in the pollutant time-series input data are mapped to a time series input; these serve as the spatial and temporal inputs of the improved spatiotemporal Transformer network (STTN) model. The improved STTN model is constructed to include a propagation feasible domain gated spatial attention layer, a self-dilution modulation temporal attention layer, and a physical constraint fusion reconstruction layer. The propagation feasible domain gated spatial attention layer is constrained by the dynamic propagation graph structure. It performs attention calculation on the spatial associations between monitoring points and gates the spatial attention weights according to the feasible region of pollution propagation, so that only the spatial propagation weights between monitoring point pairs within the feasible region are retained, yielding a spatial feature representation constrained by the propagation feasible domain. The self-dilution modulation temporal attention layer takes the pollutant time-series input data and the pollutant self-dilution sequence as inputs. It performs attention calculations on the correlations between time steps at multiple time scales and modulates the temporal attention weights using the dilution intensity corresponding to the pollutant self-dilution sequence, yielding a temporal feature representation that characterizes both the pollutant's temporal variation and its self-dilution properties. 
Specifically, the modulation of the temporal attention weights using the dilution intensity corresponding to the pollutant self-dilution sequence is as follows: The pollutant time-series input data of each monitoring point at continuous time steps is extracted and divided into multiple time segments according to a preset time scale; based on the pollutant self-dilution sequence corresponding to each time segment, the dilution intensity change value between time steps is obtained; each dilution intensity change value is matched with the correlation calculation result of the corresponding time step. When the dilution intensity change value is higher than the preset threshold, the temporal attention weight of the corresponding time step is reduced. When the dilution intensity change value is within the preset threshold range and the propagation continuity meets the preset conditions, the temporal attention weight of the corresponding time step is increased. The modulated attention weights of all time steps are then normalized to obtain a temporal feature representation that includes the temporal variation characteristics and self-dilution properties of pollutants. The physical constraint fusion reconstruction layer fuses the spatial feature representation gated by the propagation feasible domain with the temporal feature representation modulated by self-dilution, and introduces the propagation weight information associated with the monitoring points from the dynamic propagation graph structure. It then performs forward propagation reconstruction calculations under physical constraints on the fused features, generating the pollution concentration reconstruction result for each monitoring point at the target time step. 
Specifically, the forward propagation reconstruction calculation under physical constraints on the fused features is as follows: The propagation connection relationships and corresponding propagation weight information between monitoring points in the dynamic propagation graph structure are extracted, and the set of monitoring points participating in the propagation calculation is determined based on the propagation feasible region; within the target time step, the pollutant concentration features of each monitoring point in the fused features are combined with the propagation weights of its adjacent monitoring points to calculate the propagation contribution value of each adjacent monitoring point to the target monitoring point. The propagation contribution values generated by all adjacent monitoring points at this time step are accumulated to obtain the predicted pollution concentration of the target monitoring point at the target time step. The calculation is repeated for all monitoring points to generate the pollution concentration reconstruction result of each monitoring point at the target time step. The pollution concentration reconstruction results of all monitoring points output by the physical constraint fusion reconstruction layer are combined to form the forward pollution field reconstruction result.
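
The three operations in [0025] can be illustrated with plain NumPy arithmetic. This is a sketch under assumptions, not the trained STTN model: the dot-product attention score, the `gain` and `damp` factors, and the function names are hypothetical, and the learned projections of a real Transformer layer are left out.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gated_spatial_attention(features, feasible):
    """Spatial attention weights gated by the propagation feasible domain.

    features: (points, dims) feature vectors, one per monitoring point.
    feasible: (points, points) boolean mask; feasible[i, j] is True when
              point j may propagate to point i.
    Rows with no feasible partner get all-zero weights.
    """
    feasible = np.asarray(feasible, dtype=bool)
    scores = features @ features.T / np.sqrt(features.shape[1])
    scores = np.where(feasible, scores, -1e9)       # suppress infeasible pairs
    weights = softmax(scores, axis=1) * feasible    # drop any residual mass
    row_sum = weights.sum(axis=1, keepdims=True)
    return np.divide(weights, row_sum,
                     out=np.zeros_like(weights), where=row_sum > 0)

def modulate_temporal_weights(weights, dilution_change, threshold,
                              continuous, gain=1.5, damp=0.5):
    """Lower weights where dilution changes sharply, raise them where the
    change is within the threshold and propagation is continuous, then
    renormalize. `gain` and `damp` are illustrative values."""
    w = np.asarray(weights, dtype=float).copy()
    high = np.asarray(dilution_change, dtype=float) > threshold
    w[high] *= damp
    w[~high & np.asarray(continuous, dtype=bool)] *= gain
    return w / w.sum()

def reconstruct(propagation_weights, concentration):
    """Predicted concentration of each point as the accumulated propagation
    contributions of its adjacent points: sum_j W[i, j] * c[j]."""
    return np.asarray(propagation_weights, float) @ np.asarray(concentration, float)
```

Multiplying by the mask after the softmax is what guarantees that a monitoring point with no feasible upstream neighbor receives zero spatial weight instead of a uniform distribution.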

[0026] In this embodiment, obtaining the pollution propagation consistency index and identifying collaborative abnormal events at multiple monitoring points includes: Based on the dynamic propagation graph structure and the pollutant time-series input data, the pollutant concentration change sequence of each monitoring point at continuous time steps is extracted, and the set of candidate upstream monitoring points corresponding to each monitoring point is determined according to the propagation connection relationships in the dynamic propagation graph structure. For each monitoring point, the pollutant concentration changes at each upstream monitoring point in the candidate upstream monitoring point set are time-aligned in order of propagation delay. Combined with the attenuation intensity corresponding to the pollutant self-dilution sequence, the pollution propagation contribution of each upstream monitoring point to the target monitoring point within the historical time window is back-calculated, yielding the theoretical propagation contribution sequence of each upstream monitoring point to the target monitoring point. Specifically, the back-calculation of the pollution propagation contribution of each upstream monitoring point to the target monitoring point within the historical time window is as follows: The propagation path relationship between the target monitoring point and each candidate upstream monitoring point is determined based on the dynamic propagation graph structure, and the pollutant concentration change sequence of each upstream monitoring point is traced back in time according to the propagation delay time. 
After completing the time backtracking, the pollutant concentration change values ​​at the corresponding time steps are extracted, and combined with the decay intensity at the corresponding time steps in the pollutant self-dilution sequence, the pollution propagation impact of the upstream monitoring point on the target monitoring point within the time window is calculated step by step. The propagation impact of each time step is continuously accumulated within the historical time window to obtain the theoretical propagation contribution value of each upstream monitoring point to the target monitoring point within the historical time window. The theoretical propagation contribution value of each time step is recorded in chronological order to form the theoretical propagation contribution sequence of the corresponding upstream monitoring point to the target monitoring point. Propagation consistency inversion calculation is performed on the theoretical propagation contribution sequence. This is done by accumulating and superimposing the propagation contributions of each upstream monitoring point over multiple consecutive time windows, and combining this with the propagation weights in the dynamic propagation graph structure. This process then inverts and estimates the theoretical pollution concentration at the target monitoring point at the current time step, resulting in the theoretical pollution concentration sequence for each monitoring point. 
Specifically, the inversion estimation of the theoretical pollution concentration at the target monitoring point at the current time step is as follows: The theoretical propagation contribution value of each upstream monitoring point within the historical time window corresponding to the current time step is extracted, and the values are arranged in order of propagation delay; the theoretical propagation contribution values are weighted by the propagation weights between each upstream monitoring point and the target monitoring point in the dynamic propagation graph structure. The weighted propagation contribution values of all upstream monitoring points are accumulated and superimposed at the current time step to obtain the theoretical pollution concentration estimate of the target monitoring point at the current time step. This processing is repeated over continuous time steps to form the theoretical pollution concentration sequence of the target monitoring point. The theoretical pollution concentrations of all monitoring points at each time step are spatially combined by monitoring point number to construct the theoretical pollution propagation field, which represents the theoretical diffusion state of pollutants under the joint constraints of the dynamic propagation graph structure, the propagation delay, and the pollutant self-dilution sequence. The difference between the theoretical pollution propagation field and the forward pollution field reconstruction result is calculated. By statistically analyzing the concentration differences of each monitoring point at the same time step and aggregating them as a whole, the pollution propagation consistency index is obtained. When the pollution propagation consistency index exceeds a preset threshold, it is determined that a collaborative abnormal event at multiple monitoring points exists.
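
The inversion and comparison in [0026] can be sketched as below. The sketch makes several simplifying assumptions that the text does not fix: each upstream contribution is a single delayed value scaled by a `retention` factor derived from the self-dilution sequence (the accumulation over a historical window is collapsed to one step), the aggregate is the mean absolute difference over monitoring points, and all function and parameter names are hypothetical.

```python
import numpy as np

def theoretical_field(conc, weights, delays, retention):
    """Theoretical pollution propagation field from delayed upstream values.

    conc:      (time steps, points) observed concentration changes.
    weights:   weights[i, j] propagation weight from upstream j to target i
               (0 means no connection in the propagation graph).
    delays:    delays[i, j] propagation delay in time steps.
    retention: retention[i, j] fraction remaining after self-dilution
               (assumed form of the attenuation intensity).
    """
    conc = np.asarray(conc, dtype=float)
    n_steps, n_points = conc.shape
    field = np.zeros((n_steps, n_points))
    for i in range(n_points):
        for j in range(n_points):
            if weights[i][j] == 0:
                continue
            d = int(delays[i][j])
            # value seen at upstream j, d steps earlier, attenuated and weighted
            field[d:, i] += weights[i][j] * retention[i][j] * conc[:n_steps - d, j]
    return field

def consistency_index(theory, forward):
    """Per-time-step aggregate of the point-wise concentration differences."""
    return np.mean(np.abs(np.asarray(theory) - np.asarray(forward)), axis=1)

def flag_events(index, threshold):
    """Time steps at which the index exceeds the preset threshold."""
    return np.asarray(index) > threshold
```

Points without any upstream connection receive a theoretical value of zero in this sketch; a full implementation would need a rule for such boundary points, which the text does not state.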

[0027] In this embodiment, calculating the mutual interference intensity between candidate pollution sources to construct a pollution source interference matrix and form a cross-source interference propagation structure includes: After a collaborative abnormal event at multiple monitoring points is identified, based on the dynamic propagation graph structure and the pollutant concentration change sequences at the monitoring points, monitoring points or emission points that are located upstream of the anomaly area and have a propagation impact are selected as the set of candidate pollution sources, and the pollutant concentration change sequence of each candidate pollution source at continuous time steps is extracted. Based on the propagation path relationships between the candidate pollution source set and each monitoring point in the dynamic propagation graph structure, and combined with the pollutant self-dilution sequence and propagation delay information, the pollution contribution of each candidate pollution source to each monitoring point at each time step is calculated point by point and time step by time step. The contribution results are arranged by candidate pollution source number and monitoring point number to form the pollution propagation contribution matrix. Specifically, the point-by-point and step-by-step calculation of the pollution contribution of each candidate pollution source to each monitoring point is as follows: Based on the dynamic propagation graph structure, the propagation paths between the candidate pollution sources and each monitoring point are determined. Based on the propagation delay information, the pollutant concentration change data of the candidate pollution sources at historical time steps are time-aligned, and the source-end pollution concentration change value corresponding to the current time step of the target monitoring point is extracted. 
By combining the pollutant self-dilution sequence along the propagation path from the candidate pollution source to the target monitoring point, the pollution concentration change value at the source end is attenuated and corrected segment by segment along the propagation path to obtain the single-path pollution contribution value of the candidate pollution source to the target monitoring point at the current time step. The single-path pollution contribution values ​​of each feasible propagation path from the candidate pollution source to the target monitoring point are summarized to obtain the total pollution contribution value of the candidate pollution source to the target monitoring point at the current time step; Based on the pollution propagation contribution matrix, a correlation analysis is performed on the pollution propagation contribution change patterns of any two candidate pollution sources at multiple monitoring points and multiple time steps. The cases where the propagation contributions of two candidate pollution sources simultaneously increase, cancel each other out, or alternately dominate at the same monitoring point and similar time steps are statistically analyzed. The mutual interference intensity between candidate pollution sources is calculated, and the interference intensity is arranged according to the candidate pollution source number to form a pollution source interference matrix. Specifically, the calculation of the mutual interference intensity between candidate pollution sources involves: Extract the pollution propagation contribution values ​​of any two candidate pollution sources at multiple monitoring points and multiple time steps, and align them according to the monitoring point number and time order; For the same monitoring point, the direction and magnitude of the change in the pollution propagation contribution of two candidate pollution sources within similar time steps are compared. 
When the propagation contribution of the two candidate pollution sources increases synchronously, it is called enhanced interference. When the propagation contribution of one candidate pollution source increases while the propagation contribution of the other candidate pollution source decreases, it is called canceling interference. When the two candidate pollution sources alternately become the main source of propagation contribution of the monitoring point within consecutive time steps, it is called alternating interference. The enhanced interference, canceling interference, and alternating interference of the two candidate pollution sources at all monitoring points and all time steps are cumulatively statistically analyzed, and the mutual interference intensity between the two candidate pollution sources is determined based on the cumulative statistical results. Based on the pollution propagation contribution matrix and the pollution source interference matrix, candidate pollution sources are grouped and clustered. Candidate pollution sources with mutual interference relationships are combined into cross-source interference subsets. Each cross-source interference subset is associated with the set of monitoring points that have a major impact and the corresponding propagation path, thus constructing a cross-source interference propagation structure that includes a set of candidate pollution sources, a pollution propagation contribution matrix, and a pollution source interference matrix.
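
The three interference types in [0027] can be counted as in the sketch below. The text only says that the mutual interference intensity is determined from the cumulative statistics, so expressing it as the number of interference events per monitoring point per time step is an assumption, as are the function names.

```python
import numpy as np

def interference_counts(a, b):
    """Count interference events between two sources at one monitoring point.

    a, b: propagation contribution series of the two candidate sources.
    Returns (enhanced, canceling, alternating):
      enhanced    - both contributions increase in the same step,
      canceling   - one increases while the other decreases,
      alternating - the dominant source switches between consecutive steps.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    da, db = np.diff(a), np.diff(b)
    enhanced = int(np.sum((da > 0) & (db > 0)))
    canceling = int(np.sum(da * db < 0))
    dominant = a > b
    alternating = int(np.sum(dominant[1:] != dominant[:-1]))
    return enhanced, canceling, alternating

def interference_matrix(contrib):
    """Symmetric pollution source interference matrix.

    contrib: (sources, points, time steps) pollution propagation contributions.
    Entry [p, q] is the total event count for sources p and q divided by
    points * (time steps - 1); this normalization is an assumption. A step can
    be counted under more than one type, so values may exceed 1.
    """
    contrib = np.asarray(contrib, dtype=float)
    n_src, n_pts, n_steps = contrib.shape
    matrix = np.zeros((n_src, n_src))
    for p in range(n_src):
        for q in range(p + 1, n_src):
            total = sum(sum(interference_counts(contrib[p, n], contrib[q, n]))
                        for n in range(n_pts))
            matrix[p, q] = matrix[q, p] = total / (n_pts * (n_steps - 1))
    return matrix
```

Sources whose matrix entries exceed a chosen level could then be grouped into the cross-source interference subsets described above; the grouping rule itself is left open in the text.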

[0028] In this embodiment, obtaining the multi-point coordinated anomaly detection results and pollution source location results of the pollution source includes: Based on the cross-source interference propagation structure, the set of candidate pollution sources and the pollution propagation contribution information of each candidate pollution source at continuous time steps are extracted. Based on the pollution propagation contribution matrix, an emission intensity sequence representation of the impact of candidate pollution sources on pollutant concentrations at each monitoring point is established. Based on the correspondence between the actual pollutant concentration change sequences observed at monitoring points and the propagation contributions of candidate pollution sources, pollution propagation reconstruction data is constructed. The emission intensity of each candidate pollution source at different time steps is combined with its corresponding propagation contribution to obtain the pollution concentration sequence at the monitoring points. Specifically, the calculation of the combined emission intensity of each candidate pollution source at different time steps with its corresponding propagation contribution is as follows: Extract emission intensity data of each candidate pollution source at each time step, and match them with the pollution propagation contribution values ​​of the corresponding monitoring points at the same time step in chronological order; The emission intensity of each candidate pollution source at the current time step and its contribution to the pollution propagation at the target monitoring point are jointly calculated to obtain the pollution concentration contribution value of the candidate pollution source to the target monitoring point at that time step. 
The pollution concentration contribution values ​​of all candidate pollution sources to the same monitoring point are accumulated at the same time step to form the reconstructed pollution concentration of the corresponding monitoring point at that time step. The calculation is repeated for all time steps and all monitoring points to obtain the pollution concentration sequence of the monitoring point. The difference between the actual pollutant concentration sequence and the pollution concentration sequence at monitoring points is calculated. A problem for determining the emission intensity of pollution sources is established with the objective of minimizing this difference. A sparse constraint condition is introduced to progressively eliminate candidate pollution sources with emission intensities below a preset threshold. The sparse constraint condition refers to: In the process of solving the emission intensity sequence of candidate pollution sources, the number of candidate pollution sources participating in the pollution propagation calculation at the same time is limited so that only candidate pollution sources that meet the preset emission intensity threshold are retained in the emission intensity sequence. Candidate pollution sources with emission intensity below the preset threshold are gradually eliminated. In each iteration, the reconstruction results of the pollution concentration of each monitoring point by the remaining candidate pollution sources are recalculated. The difference between the reconstructed pollution concentration sequence and the actual pollutant concentration sequence is continuously compared. When the difference meets the preset convergence condition, the iteration stops, and the emission intensity sequence of candidate pollution sources that meets the sparsity constraint condition is obtained. 
Under sparse constraints, the emission intensity sequence of candidate pollution sources is iteratively solved, and candidate pollution sources with emission intensity or propagation contribution below the preset threshold are gradually eliminated. The optimal emission intensity sequence that satisfies the condition of minimizing difference is obtained, and the number and corresponding locations of pollution sources actually involved in pollution propagation are determined. The path of pollution propagation is backtracked by combining the dynamic propagation graph structure, and the propagation path of pollutants from pollution sources to each monitoring point is traced along the propagation connection relationship. The results of multi-point collaborative anomaly detection of pollution sources and the location of pollution sources are obtained.
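
The iterative elimination in [0028] can be sketched with ordinary least squares and pruning. This is one possible reading, not the solver of the embodiment: it assumes a fixed contribution matrix `A` over time, uses the mean emission intensity of a source as the pruning criterion, and clips negative intensities to zero as a crude stand-in for a non-negative solver. The name `solve_sparse_emissions` is hypothetical.

```python
import numpy as np

def solve_sparse_emissions(A, Y, threshold):
    """Fit emission intensity sequences and prune weak candidate sources.

    A: (points, sources) propagation contribution of each candidate source
       to each monitoring point (assumed constant over time here).
    Y: (time steps, points) observed concentration sequences.
    Returns (Q, active, residual): Q is (time steps, sources) with zeros for
    eliminated sources, `active` holds the indices of the retained sources,
    and `residual` is the norm of the remaining reconstruction difference.
    """
    A = np.asarray(A, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n_steps, n_sources = Y.shape[0], A.shape[1]
    active = np.arange(n_sources)
    sol = np.zeros((0, n_steps))
    while active.size > 0:
        # least-squares fit using only the currently retained sources
        sol, *_ = np.linalg.lstsq(A[:, active], Y.T, rcond=None)
        sol = np.clip(sol, 0.0, None)            # emissions cannot be negative
        keep = sol.mean(axis=1) >= threshold
        if keep.all():
            break                                 # no source left to eliminate
        active = active[keep]
    Q = np.zeros((n_steps, n_sources))
    if active.size > 0:
        Q[:, active] = sol.T
    residual = float(np.linalg.norm(A @ Q.T - Y.T))
    return Q, active, residual
```

The number of retained indices in `active` gives the number of pollution sources, and their positions in the candidate set give the locations. A convergence test on `residual` could replace the "nothing left to prune" stopping rule used here.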

[0029] Referring to Figure 2, a machine learning-based multi-point collaborative anomaly detection system for pollution sources includes: The data processing module is used to collect pollutant concentration data and environmental auxiliary data, perform preprocessing, and generate pollutant time-series input data. The propagation modeling module is used to calculate the pollutant propagation attenuation relationship and construct the pollutant self-dilution sequence based on the pollutant time-series input data, forming a dynamic propagation graph structure. The forward reconstruction module is used to feed the dynamic propagation graph structure into the improved STTN model as a spatial constraint input and obtain the forward pollution field reconstruction result. The propagation inversion module is used to perform propagation consistency inversion calculations based on the dynamic propagation graph structure and the pollutant time-series input data, and to identify collaborative abnormal events at multiple monitoring points. The interference modeling module is used, after a collaborative abnormal event is identified, to construct a pollution propagation contribution matrix and calculate the interference intensity between candidate pollution sources based on the dynamic propagation graph structure and the pollutant concentration change sequences at the monitoring points, thus forming a cross-source interference propagation structure. The pollution source localization module is used to solve the emission intensity sequences of candidate pollution sources under sparse constraints based on the cross-source interference propagation structure, so as to obtain the multi-point collaborative anomaly detection results and pollution source localization results.

[0030] Example 1: To verify the feasibility of this invention in practice, it was applied to an industrial park in the northern part of a city. This area has long suffered from a problem of periodic increases in the concentrations of volatile organic compounds (VOCs) and nitrogen oxides. To strengthen regional air quality monitoring, the local ecological and environmental department has deployed multiple air quality monitoring points in and around the industrial park. In this example, eight fixed monitoring points were deployed within the park, with a spacing of approximately 0.8 to 1.5 kilometers between them. Each monitoring point is equipped with air quality monitoring equipment, capable of collecting real-time data on the concentrations of pollutants such as PM2.5, VOCs, and NO2, and simultaneously collecting auxiliary environmental data such as wind speed, wind direction, temperature, and humidity. The monitoring system records data every 10 minutes and uploads it to the environmental monitoring platform via a wireless network. During long-term operation, it was found that when some enterprises in the park are carrying out production operations or loading and unloading materials, multiple monitoring points will show an increase in pollutant concentrations within a similar time period. However, due to the large number of pollution sources within the park, the spread of pollutants in space exhibits a cumulative phenomenon. Traditional methods based on threshold judgments at single monitoring points are insufficient to determine whether the pollution is a real transmission event, and it is also difficult to pinpoint the location of the pollution source.

[0031] In this scenario, the proposed machine learning-based multi-point collaborative anomaly detection method for pollution sources is deployed on the park's environmental monitoring platform server to automatically analyze data collected from monitoring points. The system first collects pollutant concentration data and environmental auxiliary data from each monitoring point, and performs time alignment, outlier removal, and missing value completion on the data. Then, it constructs a spatiotemporal data matrix of pollutants from multiple monitoring points and generates pollutant time-series input data. Based on the geographical relationships between monitoring points and the trend of pollutant concentration changes, the system calculates the pollutant propagation attenuation relationship and constructs a pollutant self-dilution sequence. By analyzing the degree of propagation attenuation and the time delay, it determines the feasible region for pollution propagation and forms a dynamic propagation graph structure. This dynamic propagation graph structure is then used as the spatial constraint input of the improved spatiotemporal Transformer network (STTN) model. Spatiotemporal feature learning and forward propagation reconstruction calculations are performed on the pollutant time-series input data, thereby obtaining the forward pollution field reconstruction result.

[0032] After obtaining the forward pollution field reconstruction result, the system performs a propagation consistency inversion calculation on the changes in pollution concentration at monitoring points based on the dynamic propagation graph structure and the pollutant time-series input data, constructing a theoretical pollution propagation field and calculating the difference between the theoretical pollution propagation field and the forward pollution field reconstruction result. When the difference exceeds a preset threshold, the system determines that a collaborative abnormal event at multiple monitoring points occurred during that time period. After identifying the collaborative abnormal event, the system constructs a pollution propagation contribution matrix based on the dynamic propagation graph structure and the pollutant concentration change sequences at monitoring points, and calculates the interference intensity between candidate pollution sources, thus forming a cross-source interference propagation structure. Subsequently, the system solves the emission intensity sequences of candidate pollution sources under sparse constraints based on the cross-source interference propagation structure, gradually eliminating candidate pollution sources with emission intensities below the threshold, and backtracks the pollution propagation path in conjunction with the dynamic propagation graph structure, thereby determining the number of pollution sources, the location of pollution sources, and the changes in emission intensity.

[0033] Taking a monitoring event from 7 PM to 8 PM on October 12th as an example, the industrial park experienced a regional increase in pollution concentration that evening. Monitoring data showed that at monitoring point M2, located on the west side of the park, the VOCs concentration rose from 42 micrograms per cubic meter to 71 micrograms per cubic meter at 7:20 PM. Monitoring points M3 and M4, located downwind, showed an increase in concentration approximately 10 minutes later. The system constructed a dynamic propagation graph structure based on the spatial relationship of the monitoring points and the prevailing northwest wind conditions, and obtained the forward pollution field reconstruction result using the STTN model. Subsequent propagation consistency inversion calculations confirmed that the anomaly conformed to the spatial propagation law of pollutants. After constructing a pollution propagation contribution matrix and a pollution source interference matrix, the system identified the tank loading and unloading area in the northwest corner of the park as the main source of pollution, inferring that its emission contribution accounted for more than 60% of the overall propagation contribution. The environmental protection department subsequently conducted an on-site inspection of the area and found that the volatile gas recovery device at the company was operating unstably during loading and unloading, leading to a short-term increase in VOCs emissions, which was largely consistent with the system's identification results.

[0034] Table 1. Pollutant concentration monitoring data at selected monitoring points in the industrial park

As shown in Table 1, from 19:00 to 20:00 the VOCs concentration at the three monitoring points first increased and then decreased. From 19:00 to 19:10, the concentration changes at each monitoring point were relatively small: M2 increased from 41 μg/m³ to 42 μg/m³, M3 from 38 μg/m³ to 39 μg/m³, and M4 from 36 μg/m³ to 37 μg/m³. The air quality in the area was relatively stable, the wind speed remained between 2.3 and 2.5 m/s, and the wind direction was northwest.

[0035] At 19:20, the VOCs concentration at monitoring point M2 rose rapidly from 42 μg/m³ to 71 μg/m³, a significant anomaly, while the concentrations at M3 and M4 remained low. Subsequently, between 19:30 and 19:40, the concentrations at M3 and M4 gradually increased, reaching 62 μg/m³ and 55 μg/m³, respectively, and then further increased to 68 μg/m³ and 63 μg/m³. This change indicates that the pollutants first appeared near M2 and then spread downwind to M3 and M4 within about 10 to 20 minutes.

[0036] After 19:40, the concentrations at each monitoring point gradually decreased, reaching 55 μg/m³, 52 μg/m³, and 50 μg/m³ at 20:00. Throughout the process, the wind speed remained between 2.2 and 2.7 m/s and the wind direction was consistently northwest, indicating that the concentration changes were mainly affected by local emissions. This suggests that the pollutants spread from the area around M2 to the areas around M3 and M4, demonstrating a clear spatial propagation characteristic across multiple monitoring points.

[0037] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.

Claims

1. A multi-point collaborative anomaly detection method for pollution sources based on machine learning, characterized in that it comprises:
Pollutant concentration data and environmental auxiliary data from multiple pollution monitoring points are collected and preprocessed to construct a multi-monitoring point pollutant spatiotemporal data matrix and generate pollutant time-series input data.
Based on the pollutant time-series input data, the pollutant propagation attenuation relationship is calculated and the pollutant self-dilution sequence is constructed to determine the feasible region of pollution propagation and form a dynamic propagation graph structure.
An improved spatiotemporal Transformer network (STTN) model is constructed, which includes a propagation feasible domain gated spatial attention layer, a self-dilution modulation temporal attention layer, and a physical constraint fusion reconstruction layer. The dynamic propagation graph structure is used as the spatial constraint input, and the pollutant time-series input data is input into the STTN model for spatiotemporal feature learning and forward propagation reconstruction calculation to obtain the forward pollution field reconstruction result.
Based on the dynamic propagation graph structure and the pollutant time-series input data, the propagation consistency inversion calculation is performed on the changes in pollution concentration at the monitoring points to construct a theoretical pollution propagation field. The difference between the theoretical pollution propagation field and the forward pollution field reconstruction result is calculated to obtain the pollution propagation consistency index and identify collaborative abnormal events at multiple monitoring points.
After a multi-monitoring point collaborative abnormal event is identified, a pollution propagation contribution matrix is constructed based on the dynamic propagation graph structure and the pollutant concentration change sequence of the monitoring points. The mutual interference intensity between candidate pollution sources is calculated to construct a pollution source interference matrix, forming a cross-source interference propagation structure.
Based on the cross-source interference propagation structure, the emission intensity sequence of the candidate pollution sources is solved under sparse constraints to determine the number of pollution sources, the locations of the pollution sources, and the emission intensity change sequence. The pollution propagation path is reconstructed based on the dynamic propagation graph structure to obtain the multi-point collaborative anomaly detection results and the pollution source location results.

2. The multi-point collaborative anomaly detection method for pollution sources based on machine learning according to claim 1, characterized in that the pollutant concentration data includes particulate matter concentration data and gaseous pollutant concentration data, and the environmental auxiliary data includes wind speed data, wind direction data, temperature data, humidity data, and air pressure data.

3. The multi-point collaborative anomaly detection method for pollution sources based on machine learning according to claim 1, characterized in that constructing the multi-monitoring point pollutant spatiotemporal data matrix and generating the pollutant time-series input data includes:
The pollutant concentration data and environmental auxiliary data of each monitoring point are time-aligned according to a unified time scale and arranged by monitoring point number and time order to construct a pollutant spatiotemporal data matrix containing multiple monitoring points and multiple time steps.
The pollutant spatiotemporal data matrix is then segmented with a sliding window of preset length, dividing the continuous time-series data into multiple time segments, which are combined in time order to form the pollutant time-series input data.
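For illustration only, and not as part of the claim, the sliding segmentation of claim 3 could be sketched as follows. The function name `sliding_windows`, the `stride` parameter, and the row-per-time-step layout are assumptions made for this sketch; the source does not specify a window length, a stride, or a storage layout.

```python
from typing import List, Sequence


def sliding_windows(
    matrix: Sequence[Sequence[float]], window: int, stride: int = 1
) -> List[List[List[float]]]:
    """Split a [time step][monitoring point] matrix into overlapping segments.

    Assumption: rows are time steps in chronological order and columns are
    monitoring points in numbering order.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    n_steps = len(matrix)
    # Each segment holds `window` consecutive rows; segments stay in time order.
    return [
        [list(row) for row in matrix[start:start + window]]
        for start in range(0, n_steps - window + 1, stride)
    ]


# Illustrative values only: 4 time steps, 2 monitoring points.
matrix = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
    [7.0, 8.0],
]
segments = sliding_windows(matrix, window=3, stride=1)
```

With four time steps and a window of three, the sketch yields two overlapping segments; a series shorter than the window yields none.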

4. The multi-point collaborative anomaly detection method for pollution sources based on machine learning according to claim 1, characterized in that determining the feasible region of pollution propagation and forming the dynamic propagation graph structure includes:
Based on the pollutant time-series input data, the pollutant concentration change sequence of each monitoring point over continuous time steps is extracted, the geographical location information and environmental auxiliary data of each monitoring point are obtained, the spatial distance between monitoring points is calculated from their spatial relationship, and the spatiotemporal correlation between the monitoring points is established in time order.
Based on the pollutant concentration change sequence at each monitoring point, the pollutant concentration change amplitude between any two monitoring points in adjacent time steps is calculated. Combined with the spatial distance between monitoring points and the wind speed and wind direction information in the environmental auxiliary data, the propagation delay time for pollutants to spread from one monitoring point to another is determined, and the pollutant propagation attenuation relationship between monitoring points is obtained.
Based on the pollutant propagation attenuation relationship, the degree of propagation attenuation between monitoring points over multiple consecutive time windows is statistically analyzed, the average attenuation intensity from each monitoring point to its adjacent monitoring points is calculated, and a pollutant self-dilution sequence is constructed from the average attenuation intensity.
Based on the pollutant self-dilution sequence, the propagation attenuation intensity between monitoring points is screened by threshold, and the propagation relationship is checked for temporal consistency using the propagation delay time. Monitoring point pairs that meet the propagation attenuation conditions and show continuous temporal consistency are identified, and the feasible region of pollution propagation is determined.
Based on the feasible region of pollution propagation, the set of candidate upstream monitoring points corresponding to each monitoring point is identified, and the propagation connection relationship between monitoring points is established according to the propagation attenuation intensity and the propagation delay time. A dynamic propagation graph structure containing the set of monitoring points, the propagation connection relationships, and the propagation weights is constructed.
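For illustration only, and not as part of the claim, a much simplified version of the graph construction in claim 4 might look like the sketch below. Everything specific in it is an assumption: planar coordinates in metres, a single wind vector given as the direction the air moves toward, a propagation delay taken as downwind distance divided by wind speed, and an attenuation ratio taken as downstream rise divided by upstream rise. The point names `A`, `B`, `C` and all numeric values are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]            # (east, north) in metres, assumed layout
Edge = Tuple[str, str, float, float]   # (upstream, downstream, delay_s, weight)


def build_propagation_graph(
    coords: Dict[str, Point],
    rise: Dict[str, float],      # concentration rise observed at each point
    wind_speed: float,           # m/s
    wind_to_deg: float,          # direction the air moves TOWARD, clockwise from north
    min_ratio: float = 0.2,      # assumed attenuation threshold
    max_delay_s: float = 3600.0, # assumed upper bound on plausible delay
) -> List[Edge]:
    """Keep only point pairs that are downwind, timely, and plausibly attenuated."""
    if wind_speed <= 0:
        raise ValueError("wind_speed must be positive")
    theta = math.radians(wind_to_deg)
    wx, wy = math.sin(theta), math.cos(theta)  # unit vector of air motion
    edges: List[Edge] = []
    for up, (ux, uy) in coords.items():
        if rise[up] <= 0:
            continue  # no rise upstream, nothing to propagate
        for down, (dx, dy) in coords.items():
            if up == down:
                continue
            along = (dx - ux) * wx + (dy - uy) * wy  # downwind distance
            if along <= 0:
                continue  # `down` is not downwind of `up`
            delay = along / wind_speed
            ratio = rise[down] / rise[up]
            if delay <= max_delay_s and min_ratio <= ratio <= 1.0:
                edges.append((up, down, delay, ratio))
    return edges


# A northwest wind moves air toward the southeast, i.e. toward 135 degrees.
coords = {"A": (0.0, 0.0), "B": (600.0, -600.0), "C": (1200.0, -1200.0)}
rise = {"A": 30.0, "B": 24.0, "C": 20.0}
edges = build_propagation_graph(coords, rise, wind_speed=2.4, wind_to_deg=135.0)
```

The claim additionally averages attenuation over several time windows and checks temporal consistency before accepting a pair; the sketch omits both and evaluates a single snapshot.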

5. The multi-point collaborative anomaly detection method for pollution sources based on machine learning according to claim 1, characterized in that obtaining the forward pollution field reconstruction result includes:
The dynamic propagation graph structure is aligned and encoded with the pollutant time-series input data. Each monitoring point and propagation connection in the dynamic propagation graph structure is mapped to a spatial topology input, and the pollutant concentration values of each monitoring point at continuous time steps in the pollutant time-series input data are mapped to a time-series input. These serve as the spatial and temporal inputs of the improved spatiotemporal Transformer network (STTN) model.
An improved spatiotemporal Transformer network (STTN) model is constructed, which includes a propagation feasible domain gated spatial attention layer, a self-dilution modulation temporal attention layer, and a physical constraint fusion reconstruction layer.
The propagation feasible domain gated spatial attention layer is constrained by the dynamic propagation graph structure. It performs attention calculation on the spatial association between monitoring points and gates the spatial attention weights according to the pollution propagation feasible domain. Only the spatial propagation weights between monitoring point pairs within the propagation feasible domain are retained, resulting in a spatial feature representation constrained by the propagation feasible domain.
The self-dilution modulation temporal attention layer takes the pollutant time-series input data and the pollutant self-dilution sequence as input, performs attention calculation on the correlation between time steps at multiple time scales, and modulates the temporal attention weights using the dilution intensity corresponding to the pollutant self-dilution sequence, to obtain a temporal feature representation that characterizes both the temporal variation of the pollutant and its self-dilution properties.
The physical constraint fusion reconstruction layer fuses the propagation feasible domain gated spatial feature representation with the self-dilution modulated temporal feature representation, and introduces the propagation weight information associated with the monitoring points in the dynamic propagation graph structure. It performs a forward propagation reconstruction calculation under physical constraints on the fused features to generate the pollution concentration reconstruction result of each monitoring point at the target time step.
The pollution concentration reconstruction results of each monitoring point output by the physical constraint fusion reconstruction layer (the third layer) are combined to form the forward pollution field reconstruction result.
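For illustration only, and not as part of the claim, the gating step of the spatial attention layer can be sketched as a masked softmax: attention scores for monitoring point pairs outside the propagation feasible domain are zeroed, and the remaining weights in each row are normalized. The function name `gated_attention` and the convention that rows are target points and columns are source points are assumptions for this sketch; the source does not give the layer's formula.

```python
import math
from typing import List


def gated_attention(
    scores: List[List[float]], feasible: List[List[bool]]
) -> List[List[float]]:
    """Row-wise softmax restricted to pairs inside the feasible domain.

    scores[i][j]   raw attention score of target point i for source point j
    feasible[i][j] True if propagation from j to i lies in the feasible domain
    """
    weights: List[List[float]] = []
    for row_scores, row_mask in zip(scores, feasible):
        allowed = [s for s, ok in zip(row_scores, row_mask) if ok]
        if not allowed:
            # No feasible source for this target: all weights are zero.
            weights.append([0.0] * len(row_scores))
            continue
        peak = max(allowed)  # subtract the maximum for numerical stability
        exps = [
            math.exp(s - peak) if ok else 0.0
            for s, ok in zip(row_scores, row_mask)
        ]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Illustrative scores and mask for three monitoring points.
scores = [[0.0, 1.0, 2.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
feasible = [
    [True, False, False],
    [True, True, False],
    [False, False, False],
]
weights = gated_attention(scores, feasible)
```

In a full model the scores would come from learned query and key projections; here they are fixed numbers so that the effect of the gate is visible in isolation.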

6. The multi-point collaborative anomaly detection method for pollution sources based on machine learning according to claim 1, characterized in that obtaining the pollution propagation consistency index and identifying collaborative abnormal events at multiple monitoring points includes:
Based on the dynamic propagation graph structure and the pollutant time-series input data, the pollutant concentration change sequence of each monitoring point over continuous time steps is extracted, and the set of candidate upstream monitoring points corresponding to each monitoring point is determined according to the propagation connection relationships in the dynamic propagation graph structure.
For each monitoring point, the pollutant concentration changes of each upstream monitoring point in the candidate upstream monitoring point set are time-aligned in order of propagation delay. Combined with the attenuation intensity corresponding to the pollutant self-dilution sequence, the pollution propagation contribution of each upstream monitoring point to the target monitoring point within the historical time window is back-calculated, yielding the theoretical propagation contribution sequence of each upstream monitoring point to the target monitoring point.
The propagation consistency inversion calculation is performed on the theoretical propagation contribution sequence. By accumulating and superimposing the propagation contributions of each upstream monitoring point over multiple consecutive time windows, and combining the propagation weights in the dynamic propagation graph structure, the theoretical pollution concentration of the target monitoring point at the current time step is estimated by inversion, and the theoretical pollution concentration sequence of each monitoring point is obtained.
The theoretical pollution concentrations of all monitoring points at each time step are spatially combined by monitoring point number to construct a theoretical pollution propagation field, which represents the theoretical diffusion state of pollutants under the joint constraints of the dynamic propagation graph structure, the propagation delay, and the pollutant self-dilution sequence.
The difference between the theoretical pollution propagation field and the forward pollution field reconstruction result is calculated. By statistically analyzing the concentration differences of each monitoring point at the same time step and aggregating them overall, a pollution propagation consistency index is obtained. When the pollution propagation consistency index exceeds a preset threshold, it is determined that a multi-monitoring point collaborative abnormal event exists.
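For illustration only, and not as part of the claim, the inversion and the consistency index of claim 6 could be approximated as follows. The sketch assumes that each point has a known baseline concentration, that upstream contributions superpose linearly after a delay measured in whole time steps, that a point without upstream neighbours keeps its observed series, and that the index is the mean absolute difference across points. None of these choices is stated in the source, and all names and values are hypothetical.

```python
from typing import Dict, List, Tuple

Upstream = Tuple[str, float, int]  # (upstream point, propagation weight, delay in steps)


def theoretical_series(
    obs: Dict[str, List[float]],
    baseline: Dict[str, float],
    upstream: Dict[str, List[Upstream]],
    point: str,
) -> List[float]:
    """Baseline plus delayed, attenuated excess concentration from upstream points."""
    links = upstream.get(point, [])
    if not links:
        return list(obs[point])  # assumption: no upstream, keep the observation
    series = []
    for t in range(len(obs[point])):
        value = baseline[point]
        for src, weight, delay in links:
            if t - delay >= 0:
                value += weight * (obs[src][t - delay] - baseline[src])
        series.append(value)
    return series


def consistency_index(
    theory: Dict[str, List[float]], forward: Dict[str, List[float]]
) -> List[float]:
    """Mean absolute difference across points, one value per time step."""
    points = sorted(theory)
    n_steps = len(theory[points[0]])
    return [
        sum(abs(theory[p][t] - forward[p][t]) for p in points) / len(points)
        for t in range(n_steps)
    ]


# Illustrative data: B lies downstream of A with weight 0.5 and a one-step delay.
obs = {"A": [10.0, 10.0, 30.0, 30.0], "B": [8.0, 8.0, 8.0, 24.0]}
baseline = {"A": 10.0, "B": 8.0}
upstream = {"B": [("A", 0.5, 1)]}
theory = {p: theoretical_series(obs, baseline, upstream, p) for p in obs}
forward = obs  # stand-in for the STTN forward reconstruction
index = consistency_index(theory, forward)
flagged = [value > 2.0 for value in index]  # 2.0 is an assumed threshold
```

In this toy case the downstream rise at the last step is larger than the delayed upstream excess can explain, so only that step exceeds the assumed threshold.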

7. The multi-point collaborative anomaly detection method for pollution sources based on machine learning according to claim 1, characterized in that calculating the mutual interference intensity between candidate pollution sources to construct the pollution source interference matrix and form the cross-source interference propagation structure includes:
After a multi-monitoring point collaborative abnormal event is identified, based on the dynamic propagation graph structure and the pollutant concentration change sequence at the monitoring points, monitoring points or emission points located upstream of the abnormal area and having a propagation impact are selected as the set of candidate pollution sources, and the pollutant concentration change sequence of each candidate pollution source over continuous time steps is extracted.
Based on the relationship between the candidate pollution source set and each monitoring point in the dynamic propagation graph structure, and combined with the pollutant self-dilution sequence and the propagation delay information, the pollution contribution of each candidate pollution source to each monitoring point at each time step is calculated point by point and time step by time step. The contribution results are arranged by candidate pollution source number and monitoring point number to form a pollution propagation contribution matrix.
Based on the pollution propagation contribution matrix, correlation analysis is performed on the pollution propagation contribution change patterns of any two candidate pollution sources across multiple monitoring points and multiple time steps. The cases in which the propagation contributions of two candidate pollution sources increase simultaneously, cancel each other out, or alternately dominate at the same monitoring point and at similar time steps are statistically analyzed. The mutual interference intensity between candidate pollution sources is calculated, and the interference intensities are arranged by candidate pollution source number to form a pollution source interference matrix.
Based on the pollution propagation contribution matrix and the pollution source interference matrix, the candidate pollution sources are grouped and clustered. Candidate pollution sources with mutual interference relationships are combined into cross-source interference subsets. Each cross-source interference subset is associated with the set of monitoring points it mainly affects and the corresponding propagation paths, thus constructing a cross-source interference propagation structure that includes the set of candidate pollution sources, the pollution propagation contribution matrix, and the pollution source interference matrix.
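For illustration only, and not as part of the claim, one simple way to quantify the pairwise relationship described in claim 7 is the Pearson correlation of two candidate sources' contribution series. This is an assumption made for the sketch: the source does not define the interference intensity formula. Under this choice, a value near +1 suggests contributions that rise together, and a value near -1 suggests contributions that offset each other or alternate. The source names `S1` to `S3` and their values are hypothetical.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def interference_matrix(
    contrib: Dict[str, List[float]]
) -> Dict[str, Dict[str, float]]:
    """Signed pairwise correlation of contribution series, arranged by source name.

    contrib[s] is the contribution series of candidate source s, flattened over
    (monitoring point, time step) in the same order for every source.
    """
    names = sorted(contrib)
    return {
        a: {b: 0.0 if a == b else pearson(contrib[a], contrib[b]) for b in names}
        for a in names
    }


# Illustrative contribution series for three hypothetical candidate sources.
contrib = {
    "S1": [1.0, 2.0, 3.0, 4.0],
    "S2": [2.0, 4.0, 6.0, 8.0],
    "S3": [4.0, 3.0, 2.0, 1.0],
}
interference = interference_matrix(contrib)
```

Grouping sources into cross-source interference subsets could then start from pairs whose absolute correlation exceeds a chosen cutoff, which is likewise left unspecified in the source.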

8. The multi-point collaborative anomaly detection method for pollution sources based on machine learning according to claim 1, characterized in that obtaining the multi-point collaborative anomaly detection results and the pollution source location results includes:
Based on the cross-source interference propagation structure, the set of candidate pollution sources and the pollution propagation contribution information of each candidate pollution source over continuous time steps are extracted. Based on the pollution propagation contribution matrix, an emission intensity sequence representation of the impact of the candidate pollution sources on pollutant concentrations at each monitoring point is established.
Based on the correspondence between the actual pollutant concentration change sequence observed at the monitoring points and the propagation contributions of the candidate pollution sources, pollution propagation reconstruction data is constructed. The emission intensity of each candidate pollution source at different time steps is combined with the corresponding propagation contribution to calculate the reconstructed pollution concentration sequence at the monitoring points.
The difference between the actual pollutant concentration sequence and the reconstructed pollution concentration sequence at the monitoring points is calculated, and a problem for solving the emission intensity of the pollution sources is established with the goal of minimizing this difference.
Sparse constraints are introduced, and the emission intensity sequence of the candidate pollution sources is solved iteratively under these constraints, gradually eliminating candidate pollution sources whose emission intensity or propagation contribution is below the preset threshold.
The optimal emission intensity sequence that minimizes the difference is obtained, and the number and corresponding locations of the pollution sources actually involved in pollution propagation are determined.
The pollution propagation path is backtracked using the dynamic propagation graph structure, tracing the propagation path of pollutants from the pollution sources to each monitoring point along the propagation connection relationships. The multi-point collaborative anomaly detection results and the pollution source location results are obtained.
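For illustration only, and not as part of the claim, the iterative elimination in claim 8 could be sketched as non-negative least squares with pruning. The source does not name a solver, so projected gradient descent is swapped in here as a stand-in for the sparse-constraint solution. The contribution matrix `A`, the observation vector `y`, the source names, and the threshold are all hypothetical, and the sketch treats a single time step rather than a full emission intensity sequence.

```python
from typing import Dict, List


def solve_nonneg(
    A: List[List[float]], y: List[float], cols: List[int], iters: int = 2000
) -> List[float]:
    """Projected gradient descent for min ||A[:, cols] q - y||^2 with q >= 0."""
    rows = range(len(A))
    # Squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so 1 / step_bound is a safe step size.
    step_bound = sum(A[i][j] ** 2 for i in rows for j in cols)
    q = [0.0] * len(cols)
    if step_bound == 0:
        return q
    for _ in range(iters):
        resid = [
            sum(A[i][j] * q[k] for k, j in enumerate(cols)) - y[i] for i in rows
        ]
        grad = [sum(A[i][j] * resid[i] for i in rows) for j in cols]
        # Gradient step followed by projection onto the non-negative orthant.
        q = [max(0.0, qk - g / step_bound) for qk, g in zip(q, grad)]
    return q


def locate_sources(
    A: List[List[float]], y: List[float], names: List[str], threshold: float = 0.05
) -> Dict[str, float]:
    """Re-solve after dropping sources whose intensity falls below `threshold`."""
    active = list(range(len(names)))
    while active:
        q = solve_nonneg(A, y, active)
        kept = [j for j, qj in zip(active, q) if qj >= threshold]
        if len(kept) == len(active):
            return {names[j]: qj for j, qj in zip(active, q)}
        active = kept
    return {}


# A[i][j]: contribution at monitoring point i per unit emission of source j.
A = [
    [1.0, 0.2, 0.0],
    [0.5, 1.0, 0.3],
    [0.1, 0.4, 1.0],
]
y = [2.0, 1.0, 0.2]  # generated by source S1 alone emitting at intensity 2
result = locate_sources(A, y, ["S1", "S2", "S3"])
```

Here the observations were built from a single emitting source, so the pruning loop removes the other two candidates and recovers an intensity close to 2 for `S1`. An L1-penalized formulation would be another common way to express the sparse constraint.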

9. A machine learning-based multi-point collaborative anomaly detection system for pollution sources, for executing the machine learning-based multi-point collaborative anomaly detection method for pollution sources according to any one of claims 1 to 8, characterized in that it comprises:
The data processing module is used to collect pollutant concentration data and environmental auxiliary data, perform preprocessing, and generate pollutant time-series input data.
The propagation modeling module is used to calculate the pollutant propagation attenuation relationship and construct the pollutant self-dilution sequence based on the pollutant time-series input data, forming a dynamic propagation graph structure.
The forward reconstruction module is used to input the dynamic propagation graph structure into the improved STTN model as a spatial constraint and obtain the forward pollution field reconstruction result.
The propagation inversion module is used to perform the propagation consistency inversion calculation based on the dynamic propagation graph structure and the pollutant time-series input data, and to identify collaborative abnormal events at multiple monitoring points.
The interference modeling module is used, after a collaborative abnormal event is identified, to construct a pollution propagation contribution matrix and calculate the interference intensity of candidate pollution sources based on the dynamic propagation graph structure and the pollutant concentration change sequence at the monitoring points, thus forming a cross-source interference propagation structure.
The pollution source localization module is used to solve the emission intensity sequence of candidate pollution sources under sparse constraints based on the cross-source interference propagation structure, so as to obtain the multi-point collaborative anomaly detection results and the pollution source localization results.