A method and system for precise tracing of nitrogen compound pollution sources based on multi-modal data fusion and graph neural network

By combining multimodal data fusion with a graph neural network, the method addresses pollution source identification in scenarios where multiple sources coexist and pollutant migration is complex. The approach enables fine-grained differentiation and reliable identification of pollution sources, improves the resolution and reliability of source tracing, and provides a stable source tracing mapping and quantitative contribution rates.

CN122241065A · Pending · Publication Date: 2026-06-19 · CHINESE RES ACAD OF ENVIRONMENTAL SCI

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
CHINESE RES ACAD OF ENVIRONMENTAL SCI
Filing Date
2026-03-09
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies, when faced with multiple sources and complex pollution migration processes, struggle to simultaneously achieve the synergistic utilization of multimodal high-dimensional fingerprint information, the effective characterization of spatial topological associations, and the quantitative decomposition of contribution rates and physical consistency constraints, resulting in insufficient resolution and reliability in pollution source identification.

Method used

A method based on multimodal data fusion and graph neural networks is adopted. By acquiring and quantifying the multimodal fingerprint features of each typical pollution source, a source feature matrix is constructed. Combined with the spatial topology map of the monitoring points, a graph neural network is used to generate contribution weight vectors. A composite loss function is constructed for iterative optimization, and a quantitative contribution rate result that conforms to physical meaning is output.

Benefits of technology

It enables fine-grained differentiation and reliable identification of pollution sources in complex nitrogen pollution scenarios, improves the resolution and reliability of pollution source identification, overcomes the limitations of traditional methods in not considering hydrodynamic propagation and directional action mechanisms, and provides stable source tracing mapping relationships and interpretable quantitative contribution rates.



Abstract

This invention provides a method and system for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks, belonging to the field of pollution source tracing technology. This invention acquires and quantifies the multimodal fingerprint features of each typical pollution source to form a source feature matrix, obtains multimodal spectral data from downstream monitoring points to obtain actual monitored downstream fingerprint vectors, and constructs a spatial topology map based on the spatial topology between monitoring points. The source feature matrix and the spatial topology map are input into a graph neural network to obtain contribution weight vectors, and the predicted downstream fingerprint vectors are reconstructed accordingly. A composite loss function including spectral reconstruction loss, physical constraint loss, and sparsity constraint is constructed, and convergent contribution weights satisfying non-negativity and sum-to-one constraints are obtained through iterative optimization. This invention can output a physically meaningful and interpretable quantitative contribution rate of pollution sources for precise source tracing analysis in nitrogen compound pollution scenarios.

Description

Technical Field

[0001] This invention relates to the field of pollution source tracing technology, and in particular to a method and system for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks.

Background Technology

[0002] In the field of nitrogen compound pollution source tracing, common existing approaches include isotope tracing and statistical analysis. For example, source apportionment studies that combine nitrate dual isotopes with mixing models are relatively abundant; they can provide quantitative proportions of pollution sources and, under certain conditions, support uncertainty assessment. However, when the end-member characteristics of different sources overlap, or when isotopic fractionation occurs during pollutant migration and transformation, the stability and reliability of these methods are easily compromised. Meanwhile, multivariate statistical and chemometric methods (such as principal component analysis) are also widely used in water quality assessment and potential pollution source identification, where they help reveal the correlation structure between variables. However, for complex pollution migration chains affected by hydrodynamic processes, environmental transformation, and multi-source superposition, they often struggle to provide physically consistent quantitative source tracing conclusions.

[0003] With the widespread adoption of characterization techniques such as non-targeted high-resolution mass spectrometry and spectral analysis, source tracing research is gradually shifting from relying on a few conventional water quality indicators to using high-dimensional, heterogeneous "fingerprint" information for source identification and contribution analysis. Related research emphasizes constructing comparable dissolved organic matter (DOM) fingerprints through non-targeted high-resolution mass spectrometry and combining them with spectral information to enhance the characterization of complex organic components; three-dimensional fluorescence excitation-emission matrix (EEM) spectroscopy and similar methods are also being used to extract organic fingerprint features and to characterize and track their migration and transformation. At the same time, for scenarios with significant spatial correlation structures, such as river networks or monitoring networks, a growing body of research uses graph structures to express spatial topological relationships and conduct source tracing inference, reflecting a trend toward incorporating topological information such as spatial connectivity and directionality into pollution source tracing analysis.

[0004] While isotope and chemometric methods can support source tracing under specific conditions, they often struggle to simultaneously address the challenges of utilizing multimodal high-dimensional fingerprint information, effectively characterizing spatial topological relationships, and quantitatively decomposing contribution rates while maintaining physical consistency constraints, especially in situations involving multiple sources and complex pollution migration processes. Furthermore, some graph-based source tracing studies, while leveraging topological relationships to enhance localization or inference capabilities, tend to focus on single-type monitoring information or rely primarily on discrete conclusions such as source location and category. A more unified and robust technical approach is still lacking for hybrid reconstruction driven by multimodal endmember fingerprint databases and for generating interpretable quantitative contribution results under constraints such as "non-negative contribution rates with a sum of one." Therefore, establishing a consistent source tracing mechanism among multimodal fingerprint features, spatial topological relationships, and quantitative contribution rate constraints remains a pressing technical challenge in this field.

Summary of the Invention

[0005] To overcome the shortcomings of existing technologies, the purpose of this invention is to provide a precise source tracing method and system for nitrogen compound pollution sources based on multimodal data fusion and graph neural networks. Under the joint constraints of multimodal fingerprint features and spatial topological propagation relationships, the method and system can output quantitative contribution rate results of pollution sources that are physically meaningful and interpretable.

[0006] To achieve the above objectives, the present invention provides the following solution:

[0007] A precise source tracing method for nitrogen compound pollution sources based on multimodal data fusion and graph neural networks includes:

[0008] The multimodal fingerprint features of each typical pollution source are acquired and quantified to form a source feature matrix;

[0009] The multimodal spectral data of downstream monitoring points are acquired and fused to obtain the actual downstream fingerprint vector;

[0010] Construct a spatial topology map based on the spatial topology between monitoring points;

[0011] The source feature matrix and spatial topology graph are input into a graph neural network to obtain the contribution weight vectors corresponding to the downstream monitoring points.

[0012] Based on the contribution weight vector, a weighted linear superposition is performed on the fingerprint vectors of each pollution source in the source feature matrix to obtain the predicted downstream fingerprint vector of the downstream monitoring point.

[0013] Construct a composite loss function; the composite loss function includes at least a spectral reconstruction loss, a physical constraint loss, and a sparsity constraint. The spectral reconstruction loss quantifies the difference between the predicted downstream fingerprint vector and the actual monitored downstream fingerprint vector; the physical constraint loss constrains the contribution weight vector to be non-negative and its contribution weights to sum to one; and the sparsity constraint drives the contribution weight vector toward sparsity.

[0014] The contribution weight vector and the model parameters of the graph neural network are iteratively optimized based on the composite loss function until convergence, and the converged contribution weight vector is output as the quantitative contribution rate of each pollution source at the downstream monitoring point.
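
The steps above can be written compactly. The following is only an illustrative formulation: the text names the three loss terms but does not give their functional forms, so the squared-error reconstruction term, the quadratic penalties, the concave sparsity term, and the coefficients \lambda_{phys} and \lambda_{sparse} are all assumptions. Here S \in \mathbb{R}^{K \times D} denotes the source feature matrix (K typical pollution sources, D fingerprint features, rows \mathbf{s}_k), \mathbf{w} \in \mathbb{R}^{K} the contribution weight vector, and \mathbf{y} \in \mathbb{R}^{D} the actual monitored downstream fingerprint vector.

```latex
\begin{aligned}
\hat{\mathbf{y}} &= S^{\top}\mathbf{w} = \sum_{k=1}^{K} w_k\,\mathbf{s}_k
  && \text{(weighted linear superposition)} \\
\mathcal{L}_{rec} &= \lVert \hat{\mathbf{y}} - \mathbf{y} \rVert_2^2
  && \text{(spectral reconstruction loss)} \\
\mathcal{L}_{phys} &= \sum_{k=1}^{K} \min(w_k, 0)^2 + \Bigl(\sum_{k=1}^{K} w_k - 1\Bigr)^2
  && \text{(non-negativity and sum-to-one)} \\
\mathcal{L}_{sparse} &= \sum_{k=1}^{K} \lvert w_k \rvert^{1/2}
  && \text{(sparsity constraint, assumed form)} \\
\mathcal{L} &= \mathcal{L}_{rec} + \lambda_{phys}\,\mathcal{L}_{phys} + \lambda_{sparse}\,\mathcal{L}_{sparse}
\end{aligned}
```

A plain \ell_1 penalty would not work as the sparsity term in this setting, because for non-negative weights that sum to one the \ell_1 norm is always exactly 1; this is why a concave penalty is assumed here.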

[0015] Preferably, constructing a spatial topology map based on the spatial topology between monitoring points includes:

[0016] The spatial topological relationships between monitoring points are determined based on the connectivity of river networks or water systems.

[0017] Each monitoring point is mapped to a graph node in the spatial topology graph;

[0018] Monitoring point pairs with spatial topological associations are mapped to graph edges in a spatial topological graph to form a spatial topological graph that characterizes pollution migration relationships.

[0019] Preferably, constructing a spatial topology map based on the spatial topology between monitoring points further includes:

[0020] In the spatial topology graph, at least some graph edges are associated with directional information to characterize the direction of pollution migration from upstream to downstream;

[0021] In the spatial topology graph, at least some graph edges are associated with migration impact information; the migration impact information characterizes the differences in the impact of different spatial topological associations on downstream monitoring points;

[0022] Directional information and migration impact information are incorporated into the spatial topology graph for use by graph neural networks.
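
A directed, weighted graph of this kind can be sketched as an adjacency matrix. This is a minimal illustration, not the patented construction: the function name `build_topology`, the four-point river network, and the impact values are hypothetical, and storing migration impact as a single scalar per edge is an assumption.

```python
import numpy as np

def build_topology(n_nodes, edges):
    """Build a directed, weighted adjacency matrix.

    edges: iterable of (upstream, downstream, impact) tuples, where `impact`
    is an assumed scalar encoding of the migration impact information.
    """
    adj = np.zeros((n_nodes, n_nodes))
    for up, down, impact in edges:
        adj[up, down] = impact  # row = upstream node, column = downstream node
    return adj

# Hypothetical network: tributary points 0 and 1 join at point 2,
# which drains to the downstream monitoring point 3.
edges = [(0, 2, 0.7), (1, 2, 0.3), (2, 3, 1.0)]
adj = build_topology(4, edges)

# Direction is preserved: only upstream -> downstream entries are non-zero.
upstream_of_2 = np.nonzero(adj[:, 2])[0]
print(upstream_of_2)
```

Keeping the matrix asymmetric is what carries the directional information; a symmetric matrix would express connectivity only.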

[0023] Preferably, the source feature matrix and spatial topology graph are input into a graph neural network to obtain contribution weight vectors corresponding to downstream monitoring points, including:

[0024] The source feature matrix is used as the input feature to characterize the fingerprint of a typical pollution source;

[0025] The spatial topology map is used as the input structure to represent the spatial association of monitoring points;

[0026] In graph neural networks, information between monitoring points is propagated and aggregated based on spatial topology graphs, and contribution weight vectors corresponding to downstream monitoring points are generated by combining the source feature matrix.

[0027] Preferably, the process of inputting the source feature matrix and spatial topology graph into a graph neural network to obtain contribution weight vectors corresponding to downstream monitoring points further includes:

[0028] In graph neural networks, different propagation influence weights are assigned to adjacent monitoring points that have different spatial topological associations with downstream monitoring points;

[0029] Based on the propagation impact weight, the propagation information from different adjacent monitoring points is weighted and aggregated to enhance the ability to characterize the dominant pollution migration relationship;

[0030] Output the contribution weight vector based on the weighted aggregation result.
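
The weighted aggregation described in these steps resembles an attention-style message-passing layer. The sketch below shows one such aggregation for a single downstream point. It is an assumption-laden illustration: the text does not specify how propagation influence weights are computed, so the scoring rule (edge impact times feature similarity, normalized by softmax), the function name `aggregate`, and the sample data are all hypothetical.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x))  # shift for numerical stability
    return z / z.sum()

def aggregate(node_feats, adj, target):
    """Weighted aggregation of upstream neighbour features for one target node.

    Assumes `target` has at least one upstream neighbour in `adj`
    (adj[i, j] != 0 means an edge from i to j).
    """
    neighbours = np.nonzero(adj[:, target])[0]
    # Assumed scoring rule: edge impact scaled by similarity to the target.
    scores = np.array([
        adj[i, target] * (node_feats[i] @ node_feats[target])
        for i in neighbours
    ])
    alpha = softmax(scores)               # propagation influence weights
    agg = alpha @ node_feats[neighbours]  # weighted sum of neighbour features
    return neighbours, alpha, agg

# Hypothetical features for three monitoring points; points 0 and 1 flow into 2.
feats = np.array([[1.0, 0.0], [0.0, 1.0], [0.8, 0.2]])
adj = np.zeros((3, 3))
adj[0, 2] = 1.0
adj[1, 2] = 1.0
nb, alpha, agg = aggregate(feats, adj, target=2)
print(nb, alpha, agg)
```

In a trained graph neural network the scores would come from learned parameters rather than a fixed similarity; the fixed rule is used here only to keep the example self-contained.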

[0031] Preferably, the process of inputting the source feature matrix and spatial topology graph into a graph neural network to obtain contribution weight vectors corresponding to downstream monitoring points further includes:

[0032] In a graph neural network, spatial topological associations that play a major role in the formation of contribution weight vectors are recorded.

[0033] The spatial topological associations that play a major role are organized into a set of source-tracing association paths to characterize the dominant propagation paths that affect the pollution contribution of downstream monitoring points.

[0034] Output the set of source-tracing related paths to provide a basis for the source-tracing interpretation of the contribution weight vector.
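
One simple way to turn recorded edge influences into a source-tracing path is to walk upstream from the downstream monitoring point, always following the strongest incoming edge. This greedy rule is an assumption made for illustration; the text does not state how dominant associations are selected. The function name `dominant_path` and the influence values are hypothetical.

```python
import numpy as np

def dominant_path(influence, target):
    """Trace the strongest incoming edges upstream from `target`.

    influence[i, j] is the (assumed) recorded influence of edge i -> j.
    Returns node indices ordered from upstream to downstream.
    """
    path = [target]
    visited = {target}
    node = target
    while True:
        incoming = influence[:, node]
        if not incoming.any():       # headwater node: nothing flows in
            break
        nxt = int(np.argmax(incoming))
        if nxt in visited:           # guard against cycles in the graph
            break
        path.append(nxt)
        visited.add(nxt)
        node = nxt
    return path[::-1]

# Hypothetical influence matrix for a four-point network.
influence = np.zeros((4, 4))
influence[0, 2] = 0.7
influence[1, 2] = 0.3
influence[2, 3] = 1.0
print(dominant_path(influence, target=3))  # -> [0, 2, 3]
```

Collecting such paths for several thresholds or for the top few incoming edges would give a set of paths rather than a single one.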

[0035] Preferably, the source feature matrix and the spatial topology graph are input into the graph neural network, including:

[0036] Using the downstream monitoring point as the center, extract a local spatial topology map containing the downstream monitoring point from the spatial topology map;

[0037] The source feature matrix and the local spatial topology graph are input into the graph neural network to obtain the contribution weight vector corresponding to the downstream monitoring points, thereby reducing the interference of irrelevant spatial topology associations on the source tracing results.
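
Local graph extraction can be sketched as a breadth-first search that walks upstream from the downstream monitoring point for a limited number of hops. The hop limit `max_hops`, the function name `local_subgraph`, and the restriction to upstream neighbours are assumptions; the text only states that a local graph centred on the downstream point is extracted.

```python
import numpy as np
from collections import deque

def local_subgraph(adj, target, max_hops):
    """Return the nodes within `max_hops` upstream of `target` and the
    adjacency matrix restricted to those nodes."""
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for up in np.nonzero(adj[:, node])[0]:
            up = int(up)
            if up not in dist:
                dist[up] = dist[node] + 1
                queue.append(up)
    nodes = sorted(dist)
    return nodes, adj[np.ix_(nodes, nodes)]

# Hypothetical network: 0 -> 1 -> 3, 2 -> 3, 3 -> 4.
adj = np.zeros((5, 5))
for up, down in [(0, 1), (1, 3), (2, 3), (3, 4)]:
    adj[up, down] = 1.0

nodes, sub = local_subgraph(adj, target=3, max_hops=1)
print(nodes)  # -> [1, 2, 3]
```

Point 4 lies downstream of the target and point 0 is two hops away, so both are left out of the local graph in this example.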

[0038] Preferably, the multimodal spectral data of downstream monitoring points are acquired and fused to obtain the actual downstream fingerprint vector, including:

[0039] A consistent fusion process is performed on the multimodal spectral data obtained from the same downstream monitoring point at different sampling periods to obtain the actual downstream fingerprint vector of the corresponding sampling period;

[0040] The actual downstream fingerprint vectors from different sampling periods are used as the basis for training and optimizing the graph neural network, so that the contribution weight vector can reflect the time-varying characteristics of pollution contribution.

[0041] Preferably, the contribution weight vector and the model parameters of the graph neural network are iteratively optimized based on the composite loss function until convergence, including:

[0042] The spectral reconstruction loss, physical constraint loss, and sparsity constraint are applied together to the contribution weight vector, so that the contribution weight vector tends toward sparsity while remaining non-negative and having contribution weights that sum to one.

[0043] At the same time, the spectral reconstruction loss is applied to the model parameters of the graph neural network so that the graph neural network can form a stable source-tracing mapping relationship to the downstream monitoring points under the constraints of the spatial topology graph;

[0044] When the convergence condition is met, the converged contribution weight vector is output as the quantitative contribution rate of each pollution source at the downstream monitoring point.
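
The iterative optimization can be illustrated with plain gradient descent on the contribution weights alone. This sketch is deliberately simplified and rests on several assumptions: the graph neural network is omitted, so only the weight vector is optimized; the loss terms take assumed forms (squared reconstruction error, quadratic penalties for the physical constraints, and a smoothed |w|^(1/2) sparsity term); and the coefficients, learning rate, stopping tolerance, and sample data are hypothetical.

```python
import numpy as np

def composite_loss_and_grad(w, S, y, lam_phys=1.0, lam_sparse=1e-3, eps=1e-4):
    """Composite loss and its gradient with respect to the weights w.

    S: (K, D) source feature matrix, y: (D,) observed downstream fingerprint.
    """
    resid = S.T @ w - y            # predicted minus observed fingerprint
    neg = np.minimum(w, 0.0)       # negative parts violate non-negativity
    sum_err = w.sum() - 1.0        # deviation from the sum-to-one constraint
    smooth = w**2 + eps            # smoothing keeps the sparsity term differentiable
    loss = (resid @ resid
            + lam_phys * (neg @ neg + sum_err**2)
            + lam_sparse * np.sum(smooth**0.25))
    grad = (2.0 * S @ resid
            + lam_phys * (2.0 * neg + 2.0 * sum_err)
            + lam_sparse * 0.5 * w * smooth**(-0.75))
    return loss, grad

def fit_weights(S, y, lr=0.05, n_iter=2000, tol=1e-10):
    """Gradient descent from uniform weights until the loss stops changing."""
    w = np.full(S.shape[0], 1.0 / S.shape[0])
    prev = np.inf
    for _ in range(n_iter):
        loss, grad = composite_loss_and_grad(w, S, y)
        if abs(prev - loss) < tol:  # assumed convergence condition
            break
        w = w - lr * grad
        prev = loss
    return w

# Hypothetical data: three sources, four fingerprint features.
S = np.array([[0.9, 0.1, 0.0, 0.2],
              [0.1, 0.8, 0.3, 0.0],
              [0.0, 0.2, 0.9, 0.7]])
y = 0.7 * S[0] + 0.3 * S[1]   # synthetic mixture of sources 0 and 1
w = fit_weights(S, y)
print(np.round(w, 3))
```

Because the constraints are enforced by penalties rather than exactly, the recovered weights are only approximately non-negative and approximately sum to one. In the full method the gradient would also flow into the network parameters, which this sketch leaves out.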

[0045] A precise source tracing system for nitrogen compound pollution sources based on multimodal data fusion and graph neural networks includes:

[0046] The pollution source fingerprint construction unit is used to acquire and quantify the multimodal fingerprint features of each typical pollution source to form a source feature matrix;

[0047] The downstream fingerprint acquisition unit is used to acquire and fuse multimodal spectral data of downstream monitoring points to obtain the actual monitored downstream fingerprint vector;

[0048] Spatial topology building unit, used to construct a spatial topology map based on the spatial topology structure between monitoring points;

[0049] The contribution weight inference unit is used to input the source feature matrix and spatial topology graph into the graph neural network to obtain the contribution weight vector corresponding to the downstream monitoring point.

[0050] The downstream fingerprint reconstruction unit is used to perform weighted linear superposition of the fingerprint vectors of each pollution source in the source feature matrix based on the contribution weight vector to obtain the predicted downstream fingerprint vector of the downstream monitoring point.

[0051] The composite loss building unit is used to construct a composite loss function. The composite loss function includes at least a spectral reconstruction loss, a physical constraint loss, and a sparsity constraint. The spectral reconstruction loss quantifies the difference between the predicted downstream fingerprint vector and the actual monitored downstream fingerprint vector; the physical constraint loss constrains the contribution weight vector to be non-negative and its contribution weights to sum to one; and the sparsity constraint drives the contribution weight vector toward sparsity.

[0052] The contribution weight optimization unit is used to iteratively optimize the contribution weight vector and the model parameters of the graph neural network based on the composite loss function until convergence, and outputs the converged contribution weight vector as the quantitative contribution rate of each pollution source at the downstream monitoring point.

[0053] The present invention discloses the following technical effects:

[0054] (1) This invention constructs a source feature matrix by acquiring and quantifying the multimodal fingerprint features of each typical pollution source, and acquires multimodal spectral data of downstream monitoring points and fuses them to form the actual monitoring downstream fingerprint vector. This makes pollution source tracing no longer dependent on a single water quality indicator or isotope single-channel information, but can use high-dimensional heterogeneous fingerprint features to achieve fine-grained differentiation of different pollution sources, thereby improving the resolution and reliability of pollution source identification in a multi-source coexistence environment.

[0055] (2) The present invention constructs a spatial topology map based on the spatial topology between monitoring points and uses the spatial topology map as the input structure of a graph neural network, so that the upstream-downstream correlation of pollution migration can be expressed in terms of topological relationships, thereby overcoming the limitations of traditional statistical methods in not considering the mechanisms of hydrodynamic propagation, connectivity and directionality, and improving the ability to characterize pollution migration paths and impact intensity.

[0056] (3) This invention outputs the contribution weight vector corresponding to the downstream monitoring point through a graph neural network, and then performs weighted linear superposition with the source feature matrix to form a predicted fingerprint vector. The contribution weight vector is iteratively optimized with a composite loss function. Therefore, it can directly give the quantitative contribution rate of each pollution source to the downstream monitoring point, which solves the problem that most existing graph models only output discrete judgments of "source location", "category" or "risk zone" and are difficult to obtain quantitative contribution ratio.

[0057] (4) In this invention, the spectrum reconstruction loss, physical constraint loss and sparsity constraint are set in the composite loss function at the same time, so that the contribution weight vector not only conforms to the physical meaning of non-negativity constraint and weight sum to one, but also retains the sparsity feature to highlight the dominant pollution source, thus having both physical consistency and interpretability, avoiding the problems of "contribution rate has no physical meaning" or "contribution result is difficult to interpret" caused by the lack of physical constraints in traditional endmember decomposition and multivariate statistical methods.

[0058] (5) The present invention adopts an iterative optimization strategy based on a composite loss function, which enables the graph neural network to form a stable source mapping relationship in real scenarios with multiple sources superimposed, complex pollution migration, and pollution contribution varying with space. This improves the adaptability to complex nitrogen pollution situations and makes up for the shortcomings of existing methods in terms of source tracing result stability, adaptability to complex situations, and model generalizability.

Attached Figure Description

[0059] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0060] Figure 1 is a flowchart of the method provided in an embodiment of the present invention;

[0061] Figure 2 is a schematic diagram of the spatial topology provided in an embodiment of the present invention;

[0062] Figure 3 is a schematic diagram of the system structure provided in an embodiment of the present invention.

Detailed Implementation

[0063] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0064] The purpose of this invention is to provide a method and system for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks. By leveraging the topological expression capabilities and physical constraint mechanisms of graph neural networks, stable, reliable, and quantifiable pollution source tracing can be achieved for complex nitrogen compound pollution scenarios.

[0065] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0066] Figure 1 is a flowchart of the method provided in an embodiment of the present invention. As shown in Figure 1, this invention provides a precise source tracing method for nitrogen compound pollution sources based on multimodal data fusion and graph neural networks, including:

[0067] Step 100: Obtain and quantify the multimodal fingerprint features of each typical pollution source to form a source feature matrix;

[0068] Step 200: Obtain and fuse multimodal spectral data from downstream monitoring points to obtain the actual downstream fingerprint vector;

[0069] Step 300: Construct a spatial topology map based on the spatial topology between monitoring points;

[0070] Step 400: Input the source feature matrix and spatial topology graph into the graph neural network to obtain the contribution weight vector corresponding to the downstream monitoring points;

[0071] Step 500: Perform a weighted linear superposition of the fingerprint vectors of each pollution source in the source feature matrix based on the contribution weight vector to obtain the predicted downstream fingerprint vector of the downstream monitoring point;

[0072] Step 600: Construct a composite loss function; the composite loss function includes at least a spectral reconstruction loss, a physical constraint loss, and a sparsity constraint. The spectral reconstruction loss quantifies the difference between the predicted downstream fingerprint vector and the actual monitored downstream fingerprint vector; the physical constraint loss constrains the contribution weight vector to be non-negative and its contribution weights to sum to one; and the sparsity constraint drives the contribution weight vector toward sparsity.

[0073] Step 700: Iteratively optimize the contribution weight vector and the model parameters of the graph neural network based on the composite loss function until convergence, and output the converged contribution weight vector as the quantitative contribution rate of each pollution source at the downstream monitoring point.

[0074] Specifically, in this embodiment, step 100 involves systematically sampling and characterizing typical pollution sources of different types to obtain multimodal fingerprint features for source tracing analysis, and then vectorizing them under a unified feature system. This embodiment selects representative nitrogen pollution sources such as industrial wastewater, domestic sewage, livestock and poultry breeding wastewater, and agricultural runoff, and obtains several kinds of characterization information for each pollution source, including non-targeted mass spectrometry data, three-dimensional fluorescence spectroscopy data, and microbial community composition data. Non-targeted mass spectrometry is used to characterize the composition of dissolved organic matter and the overall fingerprint features of nitrogen-containing organic matter; three-dimensional fluorescence spectroscopy is used to characterize fluorescent components such as humic substances and proteinaceous substances; and microbial community information is used to characterize the microbial lineages that are unique to, or occur in high abundance in, different pollution sources. To process this multimodal characterization information, this embodiment employs pre-defined feature screening and construction rules. It selects representative combinations of response intensities from the non-targeted mass spectrometry data, comprehensive indicators of typical fluorescence regions from the three-dimensional fluorescence spectra, and several discriminative community composition indicators from the microbial community data. All indicators are then normalized and standardized to uniform dimensions and ranges, forming multimodal fingerprint features that stably reflect the characteristics of the various typical pollution sources. The "multimodal fingerprint feature" referred to here is an ordered set of organic composition characteristics, spectral characteristics, and microbial characteristics obtained through different analytical methods under the same pollution source conditions and after standardization. This set is independent of the specific spatial location of the pollution source but statistically distinguishes different pollution source types.
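
The normalization and fusion of indicators from different modalities can be sketched as follows. Min-max scaling per indicator is an assumed choice, since the text does not name a specific normalization rule; the function names and all numerical values are hypothetical.

```python
import numpy as np

def min_max(x):
    """Scale each column (indicator) to the range [0, 1]."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero for constant columns
    return (x - lo) / span

def fuse_modalities(*blocks):
    """Normalize each modality block and concatenate along the feature axis."""
    return np.hstack([min_max(np.asarray(b, dtype=float)) for b in blocks])

# Hypothetical indicators for four source types (rows):
# industrial, domestic, livestock, agricultural.
ms = [[120, 30], [80, 60], [40, 90], [20, 10]]                # mass spectrometry intensities
eem = [[0.2, 1.5], [0.8, 1.1], [1.4, 0.6], [0.5, 0.3]]        # fluorescence region indices
microbes = [[0.05], [0.30], [0.55], [0.10]]                   # community composition indicator

fused = fuse_modalities(ms, eem, microbes)
print(fused.shape)  # -> (4, 5)
```

If downstream fingerprints are to be compared against these source fingerprints, they would need to be scaled with the same per-indicator minima and maxima rather than their own.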

[0075] Based on this, this embodiment arranges the multimodal fingerprint features corresponding to each typical pollution source in a preset feature order to form a source feature matrix for subsequent modeling. Specifically, this embodiment takes a single typical pollution source as a unit, and arranges each indicator in its multimodal fingerprint features into an ordered set of feature values in a fixed order. This ordered set of feature values is regarded as the vectorized representation of the pollution source. Then, the vectorized representations of different typical pollution sources are combined under the same feature system to form a two-dimensional data structure with several rows and columns, which is used to simultaneously carry the multimodal feature information of multiple typical pollution sources. The "source feature matrix" referred to here is a feature set with typical pollution sources as the basic objects, the uniformly constructed multimodal fingerprint features as the basic dimensions, and arranged in a row and column manner. On the one hand, it is used as the input feature to represent the pollution source fingerprint in the graph neural network; on the other hand, it is used to combine with the contribution weight vector in the weighted linear superposition process to realize the reconstruction of the mixed fingerprint of downstream monitoring points.
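
As a small illustration of this structure, the source feature matrix can be held as a two-dimensional array with one row per typical pollution source, and the mixed downstream fingerprint reconstructed as a weighted sum of its rows. The feature values and contribution weights here are hypothetical, chosen only to show the shapes involved.

```python
import numpy as np

source_names = ["industrial", "domestic", "livestock", "agricultural"]

# Hypothetical source feature matrix: 4 sources (rows) x 3 features (columns),
# with every row using the same fixed feature order.
S = np.array([[0.9, 0.1, 0.2],
              [0.3, 0.8, 0.1],
              [0.1, 0.4, 0.9],
              [0.2, 0.2, 0.5]])

# Hypothetical contribution weights: non-negative and summing to one.
w = np.array([0.5, 0.3, 0.2, 0.0])

# Weighted linear superposition: predicted downstream fingerprint vector.
y_pred = w @ S
for name, weight in zip(source_names, w):
    print(f"{name}: {weight:.0%}")
print(y_pred)
```

Because each weight multiplies a whole row, a weight of zero removes that source's fingerprint from the mixture entirely, which is what makes a sparse weight vector easy to interpret.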

[0076] Optionally, in this embodiment, step 200 involves multi-channel environmental characterization of downstream monitoring points to obtain multimodal spectral data for source tracing modeling, and forming a downstream fingerprint vector under a unified feature system. Specifically, this embodiment selects monitoring points located downstream of river networks or in areas affected by catchment, and collects multimodal spectral data such as non-targeted mass spectrometry data, three-dimensional fluorescence spectral data, and microbial community spectral maps obtained at these monitoring points at different sampling periods. Non-targeted mass spectrometry reflects the overall compositional characteristics of dissolved organic matter in the water, three-dimensional fluorescence spectroscopy reflects the ratio and structure of humic substances and proteinaceous substances, and microbial community spectral maps reflect the differences in microbial lineages that may be caused by specific pollution sources. This embodiment uses pre-determined spectral feature construction rules to unify the dimensions, normalize the scales, and select indicators for the above-mentioned multimodal spectral data, enabling it to express the chemical, biological, and optical fingerprint characteristics of downstream monitoring points at different sampling periods within the same feature system. In this embodiment, "multimodal spectral data" refers to a collection of spectral characterization data obtained through different analytical methods under the same downstream monitoring point conditions. This collection can reflect the multidimensional fingerprint information of pollutant components in the water body.

[0077] Based on this, this embodiment performs a consistent fusion process on the multimodal spectral data obtained from the same downstream monitoring point at different sampling periods to form the actual downstream fingerprint vector for the corresponding sampling period. Specifically, for each sampling period, this embodiment combines the corresponding multimodal spectral feature data under a unified feature system to highlight the overall characteristics of the pollution mixing effect in the water body, forming a set of time-series fingerprint vectors that can be used for training and optimization. Each fingerprint vector corresponds to one sampling period and reflects the temporal variation of the different pollution contributions. The "actual downstream fingerprint vector" referred to in this embodiment is a vector that characterizes the multimodal pollution features of the downstream monitoring point at a specific sampling period under a unified feature system. It provides time-varying information about pollution migration, accumulation, and mixing to the graph neural network, so that the subsequent contribution weight vector can reflect how the contributions of different pollution sources to the fingerprint features of the downstream monitoring point vary over time.

[0078] Furthermore, in this embodiment, step 300 involves structurally representing the spatial topological relationships between downstream monitoring points and other monitoring points to form a spatial topological graph that can be used for source tracing analysis. Specifically, this embodiment first identifies spatial topological associations between monitoring points based on the natural connectivity of the water system or river network, where spatial topological associations characterize the pollution migration relationships between different monitoring points caused by water flow connectivity. Subsequently, each monitoring point is mapped as a graph node in the spatial topological graph, and pairs of monitoring points with spatial topological associations are mapped as graph edges in the spatial topological graph, thereby forming a topological structure with a set of nodes and a set of edges. The "spatial topological graph" referred to in this embodiment is a graph structure with monitoring points as nodes and spatial topological associations that can lead to the downstream propagation of pollutants as connections, used to carry information on the potential migration paths of pollutants in the water body. Since the propagation of pollution in river networks or water systems has connectivity characteristics, and the spatial topological graph can structurally represent this characteristic, it can be used for spatial relationship modeling in subsequent graph neural networks.

[0079] Building upon this foundation, this embodiment further incorporates at least some of the graph edge association directionality information and migration impact information into the spatial topology graph to enhance the graph structure's ability to represent pollution migration processes. Directional information characterizes the upstream-to-downstream directional characteristics of pollution propagation, distinguishing the upstream and downstream positional relationships between different monitoring points in the hydrodynamic process. Migration impact information characterizes the potential differences in the impact of different spatial topological associations on downstream monitoring points, such as differentiating the contribution differences between tributaries or the transmission efficiency differences between the main channel and tributaries. This embodiment incorporates directional and migration impact information as components of the spatial topology graph, ensuring that the graph not only includes monitoring points and connectivity relationships but also crucial auxiliary information for pollution migration modeling. This allows the subsequent graph neural network to infer pollution propagation paths and their impacts based on this structural information. The "migration impact information" referred to in this embodiment describes the potential differences in the impact of pollutant components during transmission in different spatial topological associations. It reflects the strength of different topological associations in pollution source tracing, enabling the graph neural network to identify the dominant and secondary pollution migration paths during the learning process.

[0080] Figure 2 is a schematic diagram of the spatial topology provided in an embodiment of the present invention. In the diagram, labels 1 and 2 represent monitoring points located on the upstream reach and on a tributary, label 3 represents a confluence monitoring point where the upstream reach and the tributary converge, and label 4 represents a midstream monitoring point located downstream of the confluence monitoring point. The arrowed lines connecting labels 1 and 2 to label 3 represent the spatial topological associations and pollution migration directions between the upstream and tributary monitoring points and the confluence monitoring point, and the arrowed line connecting label 3 to label 4 represents the spatial topological association and pollution migration direction between the confluence monitoring point and the midstream monitoring point. The figure illustrates how pollution migration relationships are expressed in structured form in the spatial topology graph of the present invention.
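For illustration only, the topology of Figure 2 can be written as a directed edge list, with each edge pointing in the direction of pollution migration. The helper names `upstream_neighbors` and `all_upstream` are hypothetical and are not part of the described method.

```python
# Nodes use the labels of Figure 2: 1 and 2 are the upstream and tributary
# points, 3 is the confluence point, 4 is the midstream point.
nodes = {1, 2, 3, 4}

# Directed edges (source, target) follow the arrows of the figure.
edges = [(1, 3), (2, 3), (3, 4)]

def upstream_neighbors(node, edge_list):
    """Monitoring points with a direct edge into `node`."""
    return sorted(src for src, dst in edge_list if dst == node)

def all_upstream(node, edge_list):
    """Every monitoring point from which pollution can reach `node`."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for src in upstream_neighbors(current, edge_list):
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return sorted(seen)
```

With this representation, the midstream point 4 has the confluence point as its only direct neighbor, while points 1 and 2 reach it indirectly through the confluence.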

[0081] In this exemplary embodiment, directional information is used to characterize the upstream-to-downstream migration of pollution in water bodies. This information originates from the natural flow direction information of river networks or water systems, as well as the unidirectional transmission trend determined by topography and hydrodynamic conditions. In practical applications, this embodiment uses water system data, topographic information, and flow-related hydrological data to determine the upstream and downstream relationships between monitoring points, thereby identifying the directional structure of pollution migration. This directional structure is then attached to the corresponding spatial topological associations, forming directional information to characterize the direction of pollution propagation. In this embodiment, the directional information is used to indicate the direction of information transmission in a graph neural network, enabling the network to distinguish the dominant role of upstream information on downstream nodes when aggregating input features from different monitoring points.

[0082] In this embodiment, migration impact information is used to characterize the differences in the potential impact of different spatial topological associations on downstream monitoring points. This information originates from the mechanisms of attenuation, dilution, mixing, or accumulation of pollutants during their transport through water bodies. In practical applications, this embodiment uses water connectivity characteristics, the relative water volume relationship between tributaries and the main channel, flow velocity, catchment area, or other hydrological and hydraulic information that reflects the impact of transport as reference factors to distinguish the degree of influence of different spatial topological associations. This degree of influence is then attached to the graph edge attributes as migration impact information to reflect the relative contribution of different topological associations in the pollution source tracing process. The migration impact information in this embodiment is used to distinguish the propagation intensity of different graph edges during the information propagation stage of the graph neural network, enabling the network to identify dominant and secondary paths when learning pollution migration paths.
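As one possible reading of the paragraph above, migration impact information could be derived from the relative water volume of the reaches entering each monitoring point. The sketch below assumes that discharge is the only reference factor and that the shares of all inflows to a point are normalized to one; both choices, and the discharge figures, are assumptions made for illustration.

```python
def migration_impact(edge_discharge):
    """Turn per-edge discharge into a relative impact per incoming edge.

    `edge_discharge` maps (source, target) to a positive discharge value.
    The impacts of all edges entering the same target sum to one.
    """
    totals = {}
    for (src, dst), discharge in edge_discharge.items():
        if discharge <= 0.0:
            raise ValueError("discharge must be positive")
        totals[dst] = totals.get(dst, 0.0) + discharge
    return {
        (src, dst): discharge / totals[dst]
        for (src, dst), discharge in edge_discharge.items()
    }

# Hypothetical discharges for the edges of Figure 2.
edge_discharge = {(1, 3): 30.0, (2, 3): 10.0, (3, 4): 40.0}
impact = migration_impact(edge_discharge)
```

The resulting values can be stored as graph edge attributes, so that the edge carrying three quarters of the inflow to the confluence is treated as the dominant path and the other as the secondary path.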

[0083] Furthermore, in this embodiment, the purpose of step 400 is to establish a consistent source-tracing mapping relationship among the pollution source end-member fingerprint, the monitoring point spatial topology, and the downstream hybrid fingerprint, thereby outputting a contribution weight vector corresponding to the downstream monitoring point. To this end, step 400 does not merely perform a static fit on the observation results of the downstream monitoring point, but rather uses the typical pollution source fingerprint carried by the source feature matrix as a source-tracing reference and the monitoring point association relationship carried by the spatial topology map as a propagation constraint, so that the contribution weight vector is formed under spatial structural constraints, thus providing a foundation for source comparison, path interpretation, and contribution quantification.

[0084] On the input side of step 400, this embodiment uses the source feature matrix to characterize the multimodal fingerprint differences of each typical pollution source, ensuring that different pollution sources have distinguishable references at the fingerprint level. Simultaneously, a spatial topology graph is used to characterize the spatial topological relationships between monitoring points, ensuring that the pollution impact sources of downstream monitoring points can be expressed along the river network or water system connectivity. The directional information contained in the spatial topology graph is used to define the dominant direction of pollution migration, and the migration impact information is used to distinguish the differences in the strength of the impact of different spatial topological relationships on downstream monitoring points. Through the above input configuration, the graph neural network can simultaneously utilize pollution source fingerprint references and spatial relationship constraints when processing downstream monitoring points, avoiding source ambiguity caused by relying solely on single-point observations.

[0085] In the core inference process of step 400, this embodiment uses a graph neural network to propagate and aggregate information between monitoring points based on the spatial topology graph-defined associations. This allows the influence of adjacent monitoring points with spatial topological associations to be passed down level by level and aggregated to the downstream monitoring point. Here, "propagation" refers to the transmission of relevant information from adjacent monitoring points to the downstream monitoring point along the spatial topological associations; "aggregation" refers to the unified aggregation of propagated information from multiple adjacent monitoring points into a comprehensive representation of the downstream monitoring point. This comprehensive representation reflects the combined influence of upstream and adjacent paths on the downstream monitoring point under spatial structural constraints, thus providing a traceable structural basis for the generation of the contribution weight vector.

[0086] Unlike simply generating an "impact representation," step 400 further introduces a source feature matrix as a typical pollution source fingerprint reference. This enables the graph neural network to establish a correspondence between the comprehensive representation and the fingerprints of each typical pollution source, and generate a contribution weight vector corresponding to the downstream monitoring point. Each element in the contribution weight vector corresponds one-to-one with a typical pollution source in the source feature matrix, representing the relative contribution of each typical pollution source to the pollution fingerprint of the downstream monitoring point. This output method allows subsequent step 500 to directly perform a weighted linear superposition of the pollution source fingerprint vectors in the source feature matrix based on the contribution weight vector to obtain the predicted downstream fingerprint vector. This forms a closed-loop support between the "comparable endmember fingerprint reference" and the "verifiable hybrid fingerprint reconstruction," avoiding the lack of verifiable evidence for the contribution conclusion.
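The embodiment does not state how the comprehensive representation is matched against the source fingerprints. Purely as an assumed example, a similarity readout followed by a softmax yields one weight per typical pollution source; the function name `readout_weights`, the dot-product similarity, and the `temperature` parameter are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax: outputs are positive and sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def readout_weights(representation, source_matrix, temperature=1.0):
    """Score each source fingerprint against the representation (dot product),
    then convert the scores into a contribution weight vector."""
    scores = [
        sum(r * s for r, s in zip(representation, row)) / temperature
        for row in source_matrix
    ]
    return softmax(scores)

# Hypothetical two-source example in a two-dimensional feature system.
source_matrix = [[1.0, 0.0], [0.0, 1.0]]
representation = [0.9, 0.1]
weights = readout_weights(representation, source_matrix)
```

Each element of `weights` corresponds to one row of `source_matrix`, which mirrors the one-to-one correspondence between the contribution weight vector and the typical pollution sources.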

[0087] To enhance the ability to characterize the dominant pollution migration relationships, in the preferred embodiment of step 400, different propagation influence weights are assigned to adjacent monitoring points with different spatial topological associations with downstream monitoring points, and the propagation information from different adjacent monitoring points is weighted and aggregated accordingly. The setting of propagation influence weights is constrained by directional information and migration influence information: directional information makes the propagation influence weights reflect the dominant transmission characteristics from upstream to downstream, while migration influence information makes the propagation influence weights reflect the differences in the strength of influence of different topological association paths. By differentially characterizing the propagation process, this embodiment can highlight the influence of the main propagation path and suppress the interference of weakly associated paths in the aggregation stage, making the contribution weight vector more stable and more in line with the pollution migration law, and also making it easier to achieve interpretable quantitative optimization under subsequent composite loss constraints.
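A minimal sketch of the weighted aggregation described above follows. It uses fixed propagation influence weights and a fixed mixing factor `self_weight`, whereas a graph neural network would learn such quantities; the update rule, the feature values, and the edge weights are assumptions for illustration.

```python
def propagate(features, impact, self_weight=0.5):
    """One round of directed propagation and weighted aggregation.

    Information flows only from source to target (upstream to downstream),
    and each incoming edge is scaled by its propagation influence weight.
    """
    updated = {}
    for node, own in features.items():
        incoming = [(src, w) for (src, dst), w in impact.items() if dst == node]
        if not incoming:
            # Headwater points receive nothing and keep their own features.
            updated[node] = list(own)
            continue
        aggregated = [0.0] * len(own)
        for src, w in incoming:
            for i, value in enumerate(features[src]):
                aggregated[i] += w * value
        updated[node] = [
            self_weight * o + (1.0 - self_weight) * a
            for o, a in zip(own, aggregated)
        ]
    return updated

# Hypothetical node features and edge weights on the topology of Figure 2.
features = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.0, 0.0], 4: [0.0, 0.0]}
impact = {(1, 3): 0.75, (2, 3): 0.25, (3, 4): 1.0}

h1 = propagate(features, impact)   # reaches the confluence point
h2 = propagate(h1, impact)         # reaches the midstream point
```

After two rounds the midstream point carries a representation in which the contribution arriving over the strongly weighted edge dominates, which is the level-by-level effect the paragraph describes.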

[0088] In terms of interpretability, step 400 can also record the spatial topological associations that play a major role in the formation of the contribution weight vector and organize them into a source-tracing association path set output. The source-tracing association path set consists of several continuous spatial topological association paths, used to characterize the dominant propagation paths affecting the pollution contribution of downstream monitoring points; through this set, the contribution weight vector of downstream monitoring points can not only explain the "contribution ratio of each typical pollution source", but also explain the "spatial propagation basis for contribution formation". In addition, to reduce the interference caused by topological structures with weak relationships with downstream monitoring points, this embodiment can also extract a local spatial topological map from the spatial topological map with the downstream monitoring points as the center, and input the source feature matrix and the local spatial topological map into a graph neural network, so that the graph information propagation and aggregation focus on the main influence range of the downstream monitoring points, thereby improving the focus of the source-tracing association path set and the robustness of the contribution weight vector.
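The extraction of a local spatial topology graph centered on the downstream monitoring point could, for example, keep only the edges within a limited number of upstream hops. The hop limit `max_hops` and the extra headwater point 5 below are hypothetical; the embodiment does not specify the extraction criterion.

```python
def local_subgraph(edges, center, max_hops):
    """Keep nodes and edges within `max_hops` upstream steps of `center`."""
    frontier = {center}
    kept_nodes = {center}
    kept_edges = []
    for _ in range(max_hops):
        next_frontier = set()
        for src, dst in edges:
            if dst in frontier:
                if (src, dst) not in kept_edges:
                    kept_edges.append((src, dst))
                if src not in kept_nodes:
                    kept_nodes.add(src)
                    next_frontier.add(src)
        frontier = next_frontier
    return sorted(kept_nodes), sorted(kept_edges)

# Figure 2 topology plus a hypothetical distant headwater point 5.
edges = [(1, 3), (2, 3), (3, 4), (5, 1)]
local_nodes, local_edges = local_subgraph(edges, center=4, max_hops=2)
```

With a limit of two hops, the distant point 5 and its edge are left out, so propagation and aggregation concentrate on the main influence range of the downstream monitoring point.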

[0089] Furthermore, in this embodiment, step 500 is used to combine and reconstruct the fingerprint information of each typical pollution source in the source feature matrix under the constraint of the contribution weight vector output in step 400, thereby forming the predicted downstream fingerprint vector of the downstream monitoring point. Specifically, the source feature matrix contains pollution source fingerprint vectors corresponding to each typical pollution source, and each pollution source fingerprint vector is used to characterize the feature composition of the typical pollution source under the multimodal fingerprint feature system; the contribution weight vector is used to characterize the relative contribution of each typical pollution source to the pollution fingerprint of the downstream monitoring point. Therefore, this embodiment uses the contribution weight vector as the weight benchmark to perform weighted combination of each pollution source fingerprint vector in the source feature matrix, so that the combined result can correspond to the actual monitored downstream fingerprint vector of the downstream monitoring point under the same feature system, so as to realize the comparable reconstruction of the mixed pollution fingerprint of the downstream monitoring point.

[0090] As an example, this embodiment defines "weighted linear superposition" as follows: Weighted linear superposition refers to combining the fingerprint vectors of each pollution source according to the relative contributions indicated by the contribution weight vector within the same multimodal fingerprint feature system, resulting in a predicted downstream fingerprint vector characterizing the mixed fingerprint features of downstream monitoring points. The "predicted downstream fingerprint vector" refers to a set of fingerprint features formed by weighted combination of typical pollution source fingerprint vectors, used to quantify the differences with the actual monitored downstream fingerprint vectors. This provides a comparison object for constructing the spectral reconstruction loss in step 600 and an evaluation basis for iterative optimization of the contribution weight vector in step 700. Through this step, this embodiment establishes a direct correspondence between the contribution weight vector and the fingerprint reconstruction results of downstream monitoring points, enabling the quantitative inference of source tracing contributions to have verifiable fingerprint consistency support, thereby improving the interpretability and reliability of the source tracing results.
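The weighted linear superposition defined above amounts to a weighted sum of the rows of the source feature matrix. The sketch below uses hypothetical fingerprint values and weights.

```python
def predict_fingerprint(weights, source_matrix):
    """Weighted linear superposition of pollution source fingerprint vectors.

    `source_matrix` holds one fingerprint vector (row) per typical source;
    `weights` holds one contribution weight per row.
    """
    if len(weights) != len(source_matrix):
        raise ValueError("one weight is required per pollution source")
    dims = len(source_matrix[0])
    return [
        sum(w * row[j] for w, row in zip(weights, source_matrix))
        for j in range(dims)
    ]

# Hypothetical matrix: three typical sources, three fingerprint dimensions.
source_matrix = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [0.2, 0.2, 1.0],
]
weights = [0.6, 0.3, 0.1]
y_hat = predict_fingerprint(weights, source_matrix)
```

Because the predicted vector lives in the same feature system as the monitored one, the two can be compared dimension by dimension when the reconstruction loss is evaluated.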

[0091] As an optional implementation, in this embodiment, step 600 constructs a composite loss function to ensure that the predicted downstream fingerprint vector and the actual monitored downstream fingerprint vector are consistent overall. This further constrains the contribution weight vector to satisfy physical meaning and present a prominent contribution distribution of the dominant pollution source. To balance the two types of difference sources, "amplitude error" and "structural similarity," this embodiment designs the spectral reconstruction loss as a combination of normalized amplitude deviation and structural similarity deviation. Simultaneously, a physical constraint loss is constructed by penalizing negative weights and deviations in the weight sum, ensuring that the contribution weight vector conforms to the physical meaning of being non-negative and summing to one. Furthermore, an entropy-type regularization term, which promotes sparsity under simplex constraints, is introduced as a sparsity constraint to suppress non-dominant sources and highlight dominant sources, thereby improving the stability and interpretability of the source tracing results.

[0092] The composite loss function in this embodiment is defined as follows:

[0093]

[0094] The spectral reconstruction loss is defined as:

[0095]

[0096] The physical constraint loss is defined as:

[0097]

[0098] The sparsity constraint is defined as:

[0099]

[0100] where $\mathcal{L}$ is the composite loss function, used as the objective of the iterative optimization in step 700; $\mathcal{L}_{\mathrm{rec}}$ is the spectral reconstruction loss, used to quantify the difference between the predicted downstream fingerprint vector and the actual monitored downstream fingerprint vector; $\mathcal{L}_{\mathrm{phy}}$ is the physical constraint loss, used to constrain the contribution weight vector to be non-negative and its components to sum to one; $\mathcal{L}_{\mathrm{sp}}$ is the sparsity constraint, used to make the contribution weight vector tend to be sparse; $\hat{y}_j$ is the component of the predicted downstream fingerprint vector in the $j$-th fingerprint dimension, taken from the predicted downstream fingerprint vector obtained in step 500; $y_j$ is the component of the actual monitored downstream fingerprint vector in the $j$-th fingerprint dimension, taken from the actual downstream fingerprint vector obtained in step 200; $\hat{\mathbf{y}}$ is the predicted downstream fingerprint vector as a whole, and $\mathbf{y}$ is the actual monitored downstream fingerprint vector as a whole; $\cdot$ denotes the vector dot product, used to characterize the structural similarity of two fingerprint vectors; $\lVert\cdot\rVert_2$ denotes the L2 norm, used to scale the structural similarity; $w_k$ is the contribution weight component corresponding to the $k$-th typical pollution source, taken from the contribution weight vector obtained in step 400; $K$ is the number of typical pollution sources; $D$ is the number of fingerprint feature dimensions; and $\varepsilon$ is a positive constant that prevents zero denominators and singular logarithms and ensures computational stability, set to 0.001.
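The formulas of paragraphs [0093] to [0099] are not reproduced in this text. As a hedged reconstruction only, one set of forms consistent with the verbal definitions (normalized amplitude deviation plus structural similarity deviation, penalties on negative weights and on the weight sum, and an entropy-type sparsity term) is given below. The symbols are assumed: $\hat{y}_j$ and $y_j$ are the predicted and monitored fingerprint components, $w_k$ the contribution weights, $K$ the number of sources, $D$ the number of fingerprint dimensions, and $\varepsilon = 0.001$. The exact normalization and any balance coefficients in the filing may differ.

```latex
\begin{aligned}
\mathcal{L} &= \mathcal{L}_{\mathrm{rec}} + \mathcal{L}_{\mathrm{phy}} + \mathcal{L}_{\mathrm{sp}} \\
\mathcal{L}_{\mathrm{rec}} &= \frac{1}{D}\sum_{j=1}^{D}\frac{(\hat{y}_j - y_j)^2}{y_j^2 + \varepsilon}
  + \left(1 - \frac{\hat{\mathbf{y}}\cdot\mathbf{y}}
  {\lVert\hat{\mathbf{y}}\rVert_2\,\lVert\mathbf{y}\rVert_2 + \varepsilon}\right) \\
\mathcal{L}_{\mathrm{phy}} &= \sum_{k=1}^{K}\max(0,\,-w_k)^2 + \left(\sum_{k=1}^{K} w_k - 1\right)^2 \\
\mathcal{L}_{\mathrm{sp}} &= -\sum_{k=1}^{K} w_k \log(w_k + \varepsilon), \qquad w_k \ge 0
\end{aligned}
```

On the simplex, the entropy-type term is smallest when a single weight equals one and largest when all weights are equal, which is why minimizing it favors a few dominant sources.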

[0101] Furthermore, in this embodiment, step 700 is used to iteratively optimize the contribution weight vector and the model parameters of the graph neural network under the unified constraint of the composite loss function, so that the output contribution weight vector not only effectively reconstructs the actual downstream fingerprint vector of the downstream monitoring point, but also meets the physical meaning and interpretability requirements of the source tracing contribution. Specifically, this embodiment applies the spectral reconstruction loss, the physical constraint loss, and the sparsity constraint to the contribution weight vector, so that the contribution weight vector simultaneously satisfies three objectives during optimization: first, the spectral reconstruction loss reduces the difference between the predicted downstream fingerprint vector and the actual monitored downstream fingerprint vector, so that the contribution weight vector is consistent with the fingerprint reconstruction result of the downstream monitoring point; second, the physical constraint loss suppresses weight values that have no physical meaning, so that the contribution weight vector is non-negative and its components sum to one; third, the sparsity constraint guides the contribution weight vector to highlight a few dominant pollution sources and weaken non-dominant pollution sources, thereby making the quantitative contribution results more interpretable and stable.
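The three objectives can be sketched as three loss terms that are summed. The specific forms below (a normalized squared deviation plus a cosine-based deviation, quadratic penalties, and an entropy-type term) are assumptions consistent with the verbal description, not the formulas of the filing; only the constant 0.001 is taken from the text.

```python
import math

EPS = 1e-3  # stabilizing constant; the text states a value of 0.001

def reconstruction_loss(y_hat, y):
    """Assumed form: normalized amplitude deviation + structural deviation."""
    dims = len(y)
    amplitude = sum((p - a) ** 2 / (a * a + EPS) for p, a in zip(y_hat, y)) / dims
    dot = sum(p * a for p, a in zip(y_hat, y))
    norms = math.sqrt(sum(p * p for p in y_hat)) * math.sqrt(sum(a * a for a in y))
    structure = 1.0 - dot / (norms + EPS)
    return amplitude + structure

def physical_loss(w):
    """Penalize negative weights and any deviation of the weight sum from one."""
    negativity = sum(min(0.0, wk) ** 2 for wk in w)
    sum_deviation = (sum(w) - 1.0) ** 2
    return negativity + sum_deviation

def sparsity_loss(w):
    """Entropy-type term; negative weights are clamped to zero before the log."""
    return -sum(max(wk, 0.0) * math.log(max(wk, 0.0) + EPS) for wk in w)

def composite_loss(w, source_matrix, y):
    """Unweighted sum of the three terms (balance coefficients omitted)."""
    y_hat = [
        sum(wk * row[j] for wk, row in zip(w, source_matrix))
        for j in range(len(y))
    ]
    return reconstruction_loss(y_hat, y) + physical_loss(w) + sparsity_loss(w)
```

A weight vector concentrated on one source gives a lower sparsity term than an even split, and a vector that is non-negative and sums to one gives a physical constraint loss of zero.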

[0102] In optimizing the model parameters of the graph neural network, this embodiment uses the spectral reconstruction loss as the main driving constraint on the model parameters, enabling the graph neural network to form a stable source-tracing mapping relationship for downstream monitoring points under the constraints of the spatial topology graph. Specifically, the graph neural network performs graph information propagation and aggregation between monitoring points based on the spatial topology graph, and generates a contribution weight vector by combining the source feature matrix. A lower spectral reconstruction loss means that the predicted downstream fingerprint vector reconstructed from the contribution weight vector and the source feature matrix is more consistent with the actual monitored downstream fingerprint vector. Therefore, by minimizing the spectral reconstruction loss, this embodiment guides the graph neural network parameters to learn a mapping that expresses the pollution migration effect and achieves endmember mixing decomposition under the constraints of the spatial topology structure. This reduces the source-tracing deviation caused by differences in the distribution of monitoring points or by local noise, and improves the consistency of source-tracing results across different samples and different time periods.

[0103] This embodiment sets a convergence condition for the iterative optimization process and, when the condition is met, outputs the converged contribution weight vector as the quantitative contribution rate of each pollution source at the downstream monitoring point. The "convergence condition" refers to a state in which the changes in the composite loss function and in the contribution weight vector have stabilized, indicating that the current contribution weight vector and the graph neural network model parameters have reached a relatively stable optimization result. Once the convergence condition is reached, the output contribution weight vector effectively reconstructs the actual monitored downstream fingerprint vector of the downstream monitoring point, satisfies the non-negativity constraint, and has components that sum to one. Furthermore, owing to the sparsity constraint, it highlights the dominant pollution sources, and it therefore serves as the quantitative contribution rate result of each typical pollution source at the downstream monitoring point for subsequent pollution source tracing analysis and interpretation.
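A simplified sketch of the iterative optimization with a convergence check is given below. It optimizes only the contribution weight vector (no graph neural network parameters), uses a plain squared error as the reconstruction term, and estimates gradients by finite differences. The learning rate, the tolerance, and the coefficient `SPARSITY_COEF` are hypothetical choices, not values from the source.

```python
import math

EPS = 1e-3            # stabilizing constant (0.001, as stated in the text)
SPARSITY_COEF = 0.01  # hypothetical balance coefficient, not from the source

def predict(w, sources):
    dims = len(sources[0])
    return [sum(wk * row[j] for wk, row in zip(w, sources)) for j in range(dims)]

def loss(w, sources, y):
    y_hat = predict(w, sources)
    rec = sum((p - a) ** 2 for p, a in zip(y_hat, y))
    phy = sum(min(0.0, wk) ** 2 for wk in w) + (sum(w) - 1.0) ** 2
    sp = -sum(max(wk, 0.0) * math.log(max(wk, 0.0) + EPS) for wk in w)
    return rec + phy + SPARSITY_COEF * sp

def numeric_grad(w, sources, y, h=1e-6):
    """Central finite-difference gradient of the loss with respect to w."""
    grad = []
    for k in range(len(w)):
        up, down = list(w), list(w)
        up[k] += h
        down[k] -= h
        grad.append((loss(up, sources, y) - loss(down, sources, y)) / (2 * h))
    return grad

def optimize(sources, y, lr=0.1, tol=1e-8, max_iter=5000):
    """Gradient descent until both the loss and the weights stop changing."""
    w = [1.0 / len(sources)] * len(sources)  # uniform initial estimate
    prev = loss(w, sources, y)
    for step in range(1, max_iter + 1):
        g = numeric_grad(w, sources, y)
        new_w = [wk - lr * gk for wk, gk in zip(w, g)]
        cur = loss(new_w, sources, y)
        max_change = max(abs(a - b) for a, b in zip(new_w, w))
        w = new_w
        if abs(prev - cur) < tol and max_change < tol:
            return w, step, True
        prev = cur
    return w, max_iter, False

# Hypothetical case: the monitored fingerprint mixes sources 0 and 1.
sources = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [0.7, 0.3, 0.0]
w, steps, converged = optimize(sources, y)
```

Because the constraints enter as penalties rather than hard projections, the returned weights satisfy non-negativity and the sum-to-one condition only approximately; a final clipping and renormalization step could be added if exact values are required.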

[0104] In another embodiment, the pollution source contribution rate quantification of the present invention can be implemented using an optimizable weight learning framework, so that the pollution source analysis process can be expressed as a "trainable solution of contribution weight vectors" problem. In this embodiment, the source feature library construction module first stores and quantifies the multimodal fingerprint features of each typical pollution source to form a source feature matrix, and uses the source feature matrix as a reference basis for the fingerprints of typical pollution sources; simultaneously, the contribution weight vectors corresponding one-to-one with the pollution sources in the source feature matrix are initialized. The contribution weight vectors are used to characterize the initial contribution weight estimate of each pollution source to the downstream monitoring points, and are updated in the subsequent optimization process to obtain the quantitative contribution rate result. To ensure clarity of terminology, in this embodiment, the "source feature library construction module" refers to the functional module used to collect, organize, and output the set of fingerprint vectors of each typical pollution source, and its output is used to form the source feature matrix; the "trainable contribution weight parameters" refer to the weights in the contribution weight vectors participating in the optimization solution as variables to be determined, so that they can characterize the quantitative contribution of the pollution source.

[0105] In this embodiment, the contribution weight vector and the source feature matrix are used together to simulate the formation mechanism of pollution fingerprints at downstream monitoring points. Specifically, a weighted linear superposition is performed on the pollution source fingerprint vectors in the source feature matrix based on the contribution weight vector to obtain a predicted downstream fingerprint vector. This predicted downstream fingerprint vector is then made comparable to the actual downstream monitoring fingerprint vector obtained in step 200 within the same fingerprint feature system. As a non-limiting example, the weighted linear superposition can be represented as follows:

[0106] $$\hat{\mathbf{y}} = \sum_{k=1}^{K} w_k\,\mathbf{s}_k$$

[0107] where $\hat{\mathbf{y}}$ is the predicted downstream fingerprint vector; $\mathbf{s}_k$ is the pollution source fingerprint vector corresponding to the $k$-th typical pollution source in the source feature matrix; $w_k$ is the contribution weight corresponding to the $k$-th typical pollution source in the contribution weight vector; and $K$ is the number of typical pollution sources. In this way, a direct correspondence is established between the contribution weight vector and the predicted downstream fingerprint vector, enabling the contribution weight vector obtained through subsequent optimization to be interpreted as a quantitative decomposition result of the downstream mixed fingerprint.

[0108] Furthermore, to ensure that the contribution weight vector satisfies both fingerprint reconstruction consistency and physical meaning and interpretability requirements, this embodiment constructs a composite loss function and uses it as the optimization objective. The composite loss function includes at least spectral reconstruction loss, physical constraint loss, and sparsity constraint. As a non-limiting example, the composite loss function can be expressed as:

[0109]

[0110] where the spectral reconstruction loss quantifies the difference between the predicted downstream fingerprint vector and the actual monitored downstream fingerprint vector, for which the mean square error, the cosine distance, or a combination of the two can be selected to balance amplitude consistency and structural similarity; the physical constraint loss constrains the contribution weight vector to be non-negative and its components to sum to one, so that the contribution rate result conforms to the physical meaning; and the sparsity constraint makes the contribution weight vector sparser, so as to highlight the few dominant pollution sources and suppress the interference of non-dominant pollution sources on the results. To avoid excessive manual parameter tuning, this embodiment preferably does not introduce additional weight coefficients; when it is necessary to balance the strength of the individual terms, no more than two balance coefficients may be introduced, and they can be determined from a consistency criterion on validation samples rather than being set arbitrarily.

[0111] Using the other embodiment described above, the present invention can transform the pollution source analysis process into an optimizable weight learning process based on a composite loss function, thereby achieving a quantitative solution for the contribution rate of pollution sources in complex nitrogen composite pollution scenarios. By applying a non-negative and sum-to-one constraint to the contribution weight vector through physical constraint loss, the output results have a clear physical meaning and avoid negative contributions or total drift. By highlighting the dominant pollution source and suppressing interference from non-dominant sources through sparsity constraints, the interpretability and stability of the contribution rate conclusion are improved. Furthermore, this weight learning framework can be jointly optimized with the graph neural network in step 400 under the constraint of spatial topology graph, thereby further improving the ability of the source tracing results to characterize the spatial propagation structure and the spatiotemporal consistency.

[0112] In another embodiment, the present invention discloses an adaptive learning method, applied as follows: the solution of the contribution weight vector does not rely on a pre-specified single tracer or on fixed empirical weights, but automatically adjusts the contribution weights of each typical pollution source based on feedback from the difference between the actual monitored downstream fingerprint vector and the predicted downstream fingerprint vector at the downstream monitoring point, so that the contribution weight vector gradually approaches the solution that minimizes the composite loss function. Specifically, the difference between prediction and measurement is quantified by the spectral reconstruction loss, enabling the model to automatically identify the feature structures in the multimodal fingerprint dimensions that contribute most to the difference; the value range and total of the contribution weight vector are limited by the physical constraint loss, so that the adaptive adjustment always stays within a physically meaningful feasible region; and the sparsity constraint guides the contribution weight vector toward a state in which the dominant sources are prominent and the non-dominant sources are suppressed, so that the model can still form a clear main-source interpretation when multiple sources intersect and endmembers partially overlap. Through this adaptive adjustment mechanism, this embodiment can automatically form a contribution weight vector that matches the scene based on the observation data for different monitoring points, different sampling periods, and different pollution mixing intensities, rather than relying on a fixed, manually preset contribution ratio.

[0113] Furthermore, when this embodiment is combined with the graph neural network described in step 400, adaptive learning is not only reflected in the updating of the contribution weight vector, but also in the adaptive characterization of the "spatial topology propagation law" by the graph neural network: under the constraints of the spatial topology graph, the graph neural network performs graph information propagation and aggregation between monitoring points, and automatically adjusts the model parameters through the feedback of the spectral reconstruction loss, thereby forming a more stable source tracing mapping relationship that conforms to the actual pollution migration impact under different river network connectivity structures or different tributary confluence conditions. To avoid ambiguity in terminology, in this embodiment, "adaptive learning" refers to: under the condition of not presetting fixed weights or fixed propagation intensity, based on the reconstruction consistency and constraint satisfaction degree reflected by the composite loss function, iteratively updating the contribution weight vector and graph neural network model parameters, so that the model output can be automatically adjusted with changes in data distribution and spatial structure; this mechanism enables the present invention to automatically extract the most effective differential features for source tracing from high-dimensional, heterogeneous multimodal fingerprint information, and suppress irrelevant disturbances under the constraints of spatial topology structure, thereby improving the anti-interference ability and generalization applicability in complex scenarios.

[0114] Corresponding to the above method, as shown in Figure 3, this embodiment also provides a precise source tracing system for nitrogen compound pollution sources based on multimodal data fusion and graph neural networks, including:

[0115] The pollution source fingerprint construction unit is used to acquire and quantify the multimodal fingerprint features of each typical pollution source to form a source feature matrix;

[0116] The downstream fingerprint acquisition unit is used to acquire and fuse multimodal spectral data of downstream monitoring points to obtain the actual monitored downstream fingerprint vector;

[0117] The spatial topology construction unit is used to construct a spatial topology graph based on the spatial topology structure between monitoring points;

[0118] The contribution weight inference unit is used to input the source feature matrix and spatial topology graph into the graph neural network to obtain the contribution weight vector corresponding to the downstream monitoring point.

[0119] The downstream fingerprint reconstruction unit is used to perform weighted linear superposition of the fingerprint vectors of each pollution source in the source feature matrix based on the contribution weight vector to obtain the predicted downstream fingerprint vector of the downstream monitoring point.

[0120] The composite loss construction unit is used to construct a composite loss function that includes at least a spectral reconstruction loss, a physical constraint loss, and a sparsity constraint, wherein the spectral reconstruction loss quantifies the difference between the predicted downstream fingerprint vector and the actual monitored downstream fingerprint vector, the physical constraint loss constrains the contribution weight vector to be non-negative with the contribution weights summing to one, and the sparsity constraint drives the contribution weight vector toward sparsity;

[0121] The contribution weight optimization unit is used to iteratively optimize the contribution weight vector and the model parameters of the graph neural network based on the composite loss function until convergence, and outputs the converged contribution weight vector as the quantitative contribution rate of each pollution source at the downstream monitoring point.
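For illustration only, the data preparation performed by the pollution source fingerprint construction unit, the downstream fingerprint acquisition unit, and the spatial topology construction unit can be sketched in Python as follows. The fusion rule (scale each modality to unit L2 norm, then concatenate in a fixed order), the modality names, and the list-of-reaches input format are assumptions for this example; the embodiment does not prescribe them here.

```python
import math


def fuse_modalities(modalities):
    """Fuse several modalities into one fingerprint vector.

    modalities: dict mapping modality name -> list of measured values.
    Each modality is scaled to unit L2 norm so that no single modality
    dominates, then all are concatenated in sorted-name order.
    """
    fused = []
    for name in sorted(modalities):
        vec = modalities[name]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # guard all-zero input
        fused.extend(v / norm for v in vec)
    return fused


def build_source_matrix(source_modalities):
    """Stack one fused fingerprint per typical pollution source (one row each)."""
    names = sorted(source_modalities)
    return names, [fuse_modalities(source_modalities[n]) for n in names]


def build_topology(reaches):
    """Build a directed graph from (upstream, downstream) monitoring point pairs.

    Returns the sorted node list and, for each node, its upstream neighbours.
    """
    nodes = sorted({point for pair in reaches for point in pair})
    upstream_of = {n: [] for n in nodes}
    for up, down in reaches:
        upstream_of[down].append(up)
    return nodes, upstream_of


# Hypothetical inputs: two modalities per sample, four monitoring points.
DOWNSTREAM = fuse_modalities({"isotope": [3.0, 4.0], "fluorescence": [0.0, 2.0]})
NODES, UPSTREAM_OF = build_topology([("A", "C"), ("B", "C"), ("C", "D")])
```

The same `fuse_modalities` routine is applied to source samples and to downstream samples so that the rows of the source feature matrix and the actual monitored downstream fingerprint vector share one feature layout, which the weighted linear superposition requires.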

[0122] The embodiments in this specification are described in a progressive manner, with each embodiment focusing on its differences from the others; for identical or similar parts, the embodiments may be cross-referenced. Because the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.

[0123] This document uses specific examples to illustrate the principles and implementation of the present invention. The descriptions of the above embodiments are intended only to help in understanding the method and core ideas of the present invention. Moreover, those skilled in the art may, based on the ideas of the present invention, make changes to the specific implementation and the scope of application. Therefore, the content of this specification should not be construed as limiting the present invention.

Claims

1. A method for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks, characterized in that it comprises: acquiring and quantifying the multimodal fingerprint features of each typical pollution source to form a source feature matrix; acquiring and fusing the multimodal spectral data of downstream monitoring points to obtain the actual monitored downstream fingerprint vector; constructing a spatial topology graph based on the spatial topology structure between monitoring points; inputting the source feature matrix and the spatial topology graph into a graph neural network to obtain the contribution weight vector corresponding to the downstream monitoring point; performing, based on the contribution weight vector, a weighted linear superposition of the fingerprint vectors of the pollution sources in the source feature matrix to obtain the predicted downstream fingerprint vector of the downstream monitoring point; constructing a composite loss function that includes at least a spectral reconstruction loss, a physical constraint loss, and a sparsity constraint, wherein the spectral reconstruction loss quantifies the difference between the predicted downstream fingerprint vector and the actual monitored downstream fingerprint vector, the physical constraint loss constrains the contribution weight vector to be non-negative with the contribution weights summing to one, and the sparsity constraint drives the contribution weight vector toward sparsity; and iteratively optimizing the contribution weight vector and the model parameters of the graph neural network based on the composite loss function until convergence, and outputting the converged contribution weight vector as the quantitative contribution rate of each pollution source at the downstream monitoring point.

2. The method for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks according to claim 1, characterized in that constructing the spatial topology graph based on the spatial topology structure between monitoring points comprises: determining the spatial topological relationships between monitoring points based on the connectivity of the river network or water system; mapping each monitoring point to a graph node in the spatial topology graph; and mapping monitoring point pairs that have a spatial topological association to graph edges in the spatial topology graph, so as to form a spatial topology graph that characterizes pollution migration relationships.

3. The method for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks according to claim 1, characterized in that constructing the spatial topology graph based on the spatial topology structure between monitoring points further comprises: associating at least some graph edges in the spatial topology graph with directional information that characterizes the direction of pollution migration from upstream to downstream; associating at least some graph edges in the spatial topology graph with migration impact information, the migration impact information characterizing the differences in the impact of different spatial topological associations on the downstream monitoring point; and incorporating the directional information and the migration impact information into the spatial topology graph for use by the graph neural network.

4. The method for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks according to claim 1, characterized in that inputting the source feature matrix and the spatial topology graph into the graph neural network to obtain the contribution weight vector corresponding to the downstream monitoring point comprises: using the source feature matrix as the input feature that characterizes the fingerprints of the typical pollution sources; using the spatial topology graph as the input structure that characterizes the spatial associations between monitoring points; and, in the graph neural network, propagating and aggregating information between monitoring points based on the spatial topology graph, and generating the contribution weight vector corresponding to the downstream monitoring point in combination with the source feature matrix.

5. The method for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks according to claim 1, characterized in that inputting the source feature matrix and the spatial topology graph into the graph neural network to obtain the contribution weight vector corresponding to the downstream monitoring point further comprises: in the graph neural network, assigning different propagation influence weights to adjacent monitoring points that have different spatial topological associations with the downstream monitoring point; performing weighted aggregation of the propagated information from the different adjacent monitoring points based on the propagation influence weights, so as to enhance the ability to characterize the dominant pollution migration relationship; and outputting the contribution weight vector based on the weighted aggregation result.

6. The method for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks according to claim 1, characterized in that inputting the source feature matrix and the spatial topology graph into the graph neural network to obtain the contribution weight vector corresponding to the downstream monitoring point further comprises: in the graph neural network, recording the spatial topological associations that play a major role in forming the contribution weight vector; organizing those spatial topological associations into a set of source tracing association paths that characterizes the dominant propagation paths affecting the pollution contribution at the downstream monitoring point; and outputting the set of source tracing association paths to provide a basis for the source tracing interpretation of the contribution weight vector.

7. The method for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks according to claim 1, characterized in that inputting the source feature matrix and the spatial topology graph into the graph neural network comprises: extracting, from the spatial topology graph, a local spatial topology graph that is centered on and contains the downstream monitoring point; and inputting the source feature matrix and the local spatial topology graph into the graph neural network to obtain the contribution weight vector corresponding to the downstream monitoring point, thereby reducing the interference of irrelevant spatial topological associations with the source tracing result.

8. The method for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks according to claim 1, characterized in that acquiring and fusing the multimodal spectral data of downstream monitoring points to obtain the actual monitored downstream fingerprint vector comprises: performing consistent fusion processing on the multimodal spectral data obtained at the same downstream monitoring point in different sampling periods to obtain the actual monitored downstream fingerprint vector for each corresponding sampling period; and using the actual monitored downstream fingerprint vectors of the different sampling periods as the basis for training and optimizing the graph neural network, so that the contribution weight vector reflects the time-varying characteristics of the pollution contribution.

9. The method for precise source tracing of nitrogen compound pollution sources based on multimodal data fusion and graph neural networks according to claim 1, characterized in that iteratively optimizing the contribution weight vector and the model parameters of the graph neural network based on the composite loss function until convergence comprises: applying the spectral reconstruction loss, the physical constraint loss, and the sparsity constraint jointly to the contribution weight vector, so that the contribution weight vector tends toward sparsity while remaining non-negative with the contribution weights summing to one; at the same time, applying the spectral reconstruction loss to the model parameters of the graph neural network, so that the graph neural network forms a stable source tracing mapping relationship for the downstream monitoring point under the constraints of the spatial topology graph; and, when the convergence condition is met, outputting the converged contribution weight vector as the quantitative contribution rate of each pollution source at the downstream monitoring point.

10. A precise source tracing system for nitrogen compound pollution sources based on multimodal data fusion and graph neural networks, characterized in that it comprises: a pollution source fingerprint construction unit, used to acquire and quantify the multimodal fingerprint features of each typical pollution source to form a source feature matrix; a downstream fingerprint acquisition unit, used to acquire and fuse the multimodal spectral data of downstream monitoring points to obtain the actual monitored downstream fingerprint vector; a spatial topology construction unit, used to construct a spatial topology graph based on the spatial topology structure between monitoring points; a contribution weight inference unit, used to input the source feature matrix and the spatial topology graph into a graph neural network to obtain the contribution weight vector corresponding to the downstream monitoring point; a downstream fingerprint reconstruction unit, used to perform, based on the contribution weight vector, a weighted linear superposition of the fingerprint vectors of the pollution sources in the source feature matrix to obtain the predicted downstream fingerprint vector of the downstream monitoring point; a composite loss construction unit, used to construct a composite loss function that includes at least a spectral reconstruction loss, a physical constraint loss, and a sparsity constraint, wherein the spectral reconstruction loss quantifies the difference between the predicted downstream fingerprint vector and the actual monitored downstream fingerprint vector, the physical constraint loss constrains the contribution weight vector to be non-negative with the contribution weights summing to one, and the sparsity constraint drives the contribution weight vector toward sparsity; and a contribution weight optimization unit, used to iteratively optimize the contribution weight vector and the model parameters of the graph neural network based on the composite loss function until convergence, and to output the converged contribution weight vector as the quantitative contribution rate of each pollution source at the downstream monitoring point.