Neural network-based environmental health causal inference and risk prediction system and method
The system for causal inference and risk prediction of environmental health based on neural networks solves the problems of heterogeneous network modeling and spatiotemporal coupling in causal inference and risk prediction of environmental health. It achieves accurate identification of causal relationships and quantification of multi-path effects, and provides decision support for real-time multi-scale risk warning.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HUBEI PROVINCIAL ACADEMY OF ECO-ENVIRONMENTAL SCIENCES(PROVINCIAL ECOLOGICAL ENVIRONMENT ENGINEERING ASSESSMENT CENTER)
- Filing Date
- 2026-03-17
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies cannot effectively model heterogeneous network structures and spatiotemporal coupling mechanisms in environmental health causal inference and risk prediction. They lack causal identification and conditional independence testing methods applicable to spatiotemporally related data, making it difficult to automatically identify and quantify multi-path causal effects. Risk prediction models lack causal mechanism support, resulting in poor interpretability of prediction results. They are unable to handle the fusion and causal modeling problems of multi-scale, multi-source, and heterogeneous environmental health data.
An environmental health causal inference and risk prediction system based on neural networks is adopted, including a spatiotemporal heterogeneous data fusion module, a graph neural network modeling engine, a deep learning causal inference module, and a dynamic risk prediction module. Through multi-head spatiotemporal attention mechanism, ST-CIT conditional independence test, and multi-path causal effect calculator, the system can accurately model the causal relationship of complex environmental health systems and dynamically track their evolution.
It breaks through the bottleneck of spatiotemporal causal inference, realizes the accurate identification of causal relationships and the quantitative assessment of multi-path causal effects in complex environments, supports real-time multi-scale risk early warning, and provides a scientific basis for environmental health management.
Figure CN122245835A_ABST
Description
Technical Field
[0001] This invention relates to the fields of environmental health and artificial intelligence technology, specifically to a system and method for causal inference and risk prediction of environmental health based on neural networks.
Background Technology
[0002] Identifying the causal relationship between environmental pollution and human health is a core issue in the field of environmental health science. It has significant theoretical and practical value for scientifically assessing environmental health risks, formulating precise prevention and control measures, and supporting environmental policy-making. However, the environmental health system is essentially a multi-source, heterogeneous, dynamically evolving, and complexly interwoven mega-system with multiple causal chains. Traditional causal inference methods have revealed numerous technical bottlenecks when dealing with such systems.
[0003] First, environmental health data generally exhibits significant spatiotemporal heterogeneity and dynamic correlations. Multi-source data, including pollutant concentrations, meteorological conditions, population movement, and health events, suffer from non-uniform sampling, inconsistent resolution, and high missing rates across time and space. Traditional causal inference methods based on regression or structural equation models typically assume independent and identically distributed data, making it difficult to effectively model spatiotemporal dependency structures. This leads to serious biases in causal identification results, and may even produce spurious correlations or spurious independence.
[0004] Second, environmental health systems have a typical heterogeneous network structure. Pollutants pass through a multi-stage, multi-media, multi-pathway process from emission source to health endpoint (emission → transmission → transformation → exposure → effect). This process involves several types of entities, such as pollution sources, environmental media, exposure pathways, and health endpoints, together with diverse and heterogeneous relationships among them. Traditional causal inference methods are mostly based on variable lists or homogeneous graph structures and cannot model heterogeneous nodes and multiple relation types, which makes it difficult to accurately characterize the propagation mechanisms and causal paths of pollutants in complex networks.
[0005] Third, existing methods generally lack the ability to identify and quantify multi-path causal effects. In real-world environmental systems, a single pollution source may simultaneously affect health endpoints through multiple exposure pathways, such as atmospheric inhalation, food intake, and drinking water intake, and different pathways may exhibit synergistic, antagonistic, or independent mechanisms of action. Traditional mediation analysis methods often focus on single pathways, making it difficult to automatically identify, decompose, and quantify multi-path causal effects, which severely restricts the accurate assessment of environmental health risks and the scientific formulation of intervention strategies.
[0006] Furthermore, existing environmental health risk prediction models are mostly based on statistical correlation modeling, lacking explicit modeling of causal mechanisms. This results in poor extrapolation ability and interpretability when environmental policies change or pollution structures evolve. For example, when a pollution source is forcibly shut down or meteorological conditions change drastically, purely statistical models often fail to accurately predict the dynamic trends of health risks.
[0007] In recent years, Graph Neural Networks (GNNs) have demonstrated powerful capabilities in processing non-Euclidean structured data, automatically learning low-dimensional representations of nodes and edges from graph structures and capturing high-order interactions in complex systems. However, existing GNN methods mostly focus on prediction tasks and lack the ability to identify causal directions; at the same time, their modeling capabilities for heterogeneous spatiotemporal graphs, dynamic graphs, and spatiotemporally coupled graphs remain insufficient, making it difficult to meet the combined requirements of environmental health systems for causality, interpretability, and multi-scale modeling.
[0008] In the field of causal inference theory, the causal identification framework based on structural causal models (SCM), do-calculus, and counterfactual reasoning provides a theoretical foundation for identifying causal effects from observational data. However, traditional causal discovery algorithms such as PC and FCI rely on conditional independence tests and generally assume that the data are independent and identically distributed, making them difficult to apply to environmental health data with spatiotemporal autocorrelation, resulting in unreliable causal structure learning results.
[0009] In summary, existing technologies face the following key technical challenges in inferring causal relationships and predicting risks related to environmental health: 1. It is impossible to effectively model the heterogeneous network structure and spatiotemporal coupling mechanism of environmental health systems; 2. There is a lack of methods for causal identification and conditional independence testing applicable to spatiotemporally related data; 3. It is difficult to automatically identify and quantify multi-path causal effects; 4. The risk prediction model lacks causal mechanism support, resulting in poor interpretability and weak adaptability of the prediction results; 5. It cannot handle the fusion and causal modeling of multi-scale, multi-source, and heterogeneous environmental health data.
[0010] Therefore, there is an urgent need for a new methodological framework that integrates spatiotemporally sensitive graph neural networks with deep learning causal inference to overcome the technical bottlenecks of traditional methods in causal identification, path quantification, and risk prediction. Such a framework would enable accurate modeling of causal relationships in complex environmental health systems, tracking of their dynamic evolution, and interpretable risk warning, providing technical support for environmental health research and public health decision-making.
Summary of the Invention
[0011] The purpose of this invention is to provide a system and method for causal inference and risk prediction of environmental health based on neural networks, so as to solve the problems mentioned in the background art.
[0012] To achieve the above objectives, the present invention provides the following technical solution: a neural network-based environmental health causal inference and risk prediction system, comprising:
a spatiotemporal heterogeneous data fusion module, used to perform multi-source heterogeneous fusion of pollution source data, environmental medium data, human exposure data, and health effect data to construct a unified spatiotemporal data representation;
a graph neural network modeling engine, used to model the environmental health system as a heterogeneous spatiotemporal causal graph network, learn the representations of nodes and edges, and capture the complex relationships between pollution sources, environmental media, exposure pathways, and health endpoints;
a deep learning causal inference module, used to identify causal structures based on the graph network representation, control confounding factors, estimate causal effects, and achieve spatiotemporally sensitive causal identification through the spatiotemporal conditional independence test mechanism ST-CIT;
a multipath causal effect calculator, used to identify and quantify multiple causal paths from pollution sources to health endpoints, and to calculate the causal effect of each path and its contribution to the total effect;
a dynamic risk prediction module, used to perform multi-step health risk prediction based on the causal inference results and graph recurrent neural networks, and to generate early warning information.
[0013] Furthermore, the spatiotemporal heterogeneous data fusion module includes:
a multi-source data preprocessor, which uses a deep learning-based spatiotemporal data registration algorithm to achieve spatiotemporal alignment of multi-source data;
a spatiotemporal graph construction unit, used to construct heterogeneous spatiotemporal graph networks that include pollution sources, environmental media, exposure pathways, and health endpoints;
a data quality control mechanism, used to ensure the reliability of the fused data through spatiotemporal consistency checks and outlier detection.
[0014] Furthermore, the graph neural network modeling engine includes:
a heterogeneous spatiotemporal graph construction unit, which uses a hierarchical graph representation method to model the various node and edge types in the environmental health system;
a spatiotemporal graph representation learning unit, used to learn node embeddings by fusing graph convolutional neural networks with long short-term memory networks.
The graph neural network modeling engine employs a multi-layer heterogeneous spatiotemporal graph causal embedding algorithm, including:
a heterogeneous node encoding layer, used for feature encoding of different types of nodes;
a causal relationship perception layer, used to maintain causal direction information;
a spatiotemporal dynamic update layer, used to fuse dynamic information in time and space;
a causal effect prediction layer, used to predict the strength of causal effects based on node embeddings.
[0015] The engine further includes a multi-head spatiotemporal attention mechanism, used to calculate attention weights under spatiotemporal dependencies, and graph convolution computation units, used to process large-scale heterogeneous spatiotemporal graph data and extract causality-aware features.
[0016] Furthermore, the deep learning causal inference module includes:
a causal structure learning network, which combines a variational autoencoder with a graph neural network to learn the causal graph structure;
a counterfactual prediction unit, which uses a neural network to implement the do-operator and generate counterfactual predictions;
a confounding factor control algorithm, which automatically identifies backdoor paths and determines the minimum adjustment set based on the graph structure;
a causal effect estimator, used to estimate the average causal effect, the conditional average causal effect, and the individual causal effect;
the spatiotemporal conditional independence test mechanism ST-CIT, which identifies causal relationships under spatiotemporal dependence by introducing a spatiotemporal weighting function to adjust the conditional independence test statistic.
[0017] Furthermore, the multipath causal effect calculator includes:
a causal path identification unit, which uses a breadth-first search algorithm to identify all causal paths from the source node to the target node;
a path weight calculation module, used to calculate path weights based on path importance, reliability, and strength of evidence;
an effect propagation algorithm, used to calculate single-path causal effects;
a total effect aggregator, used to weight and aggregate the causal effects of all paths to obtain the total causal effect.
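The causal path identification unit can be illustrated with a breadth-first enumeration of simple directed paths. This is a minimal sketch, not the patented implementation: the function name `find_causal_paths`, the adjacency-dictionary format, and the toy node names (`S`, `M_air`, and so on) are assumptions made for illustration.

```python
from collections import deque

def find_causal_paths(graph, source, target):
    """Enumerate all simple directed paths from source to target, shortest first."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # never revisit a node, so cycles cannot loop forever
                queue.append(path + [nxt])
    return paths

# Hypothetical toy graph: one source, three media, three exposure routes, one endpoint.
graph = {
    "S": ["M_air", "M_water"],
    "M_air": ["E_inhale"],
    "M_water": ["M_soil", "E_drink"],
    "M_soil": ["E_ingest"],
    "E_inhale": ["H"],
    "E_drink": ["H"],
    "E_ingest": ["H"],
}
paths = find_causal_paths(graph, "S", "H")
for p in paths:
    print(" -> ".join(p))
```

Because the queue is processed in breadth-first order, shorter paths are reported before longer ones. The number of simple paths can grow exponentially with graph size, so a practical system would likely bound the path length.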
[0018] Furthermore, the dynamic risk prediction module includes:
a temporal causal modeling network, used to construct temporal dynamic models that take causal relationships into account;
a risk prediction network, based on a graph recurrent neural network, used to achieve multi-step health risk prediction;
an uncertainty quantifier, used to estimate the uncertainty of prediction results using Bayesian deep learning methods;
an early warning and decision-making system, used to generate multi-level risk warnings and decision-making suggestions based on the prediction results.
[0019] Furthermore, the multi-head spatiotemporal attention mechanism calculates attention weights using the following formula:
[0020] where h is the node representation vector, Φ is the temporal difference encoding function, Ψ is the spatial distance encoding function, W is the weight matrix, and a is the attention vector.
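One way to read these symbol definitions is as a graph-attention-style score: the transformed node vectors are concatenated with the temporal and spatial encodings, scored by the attention vector a, and normalized with a softmax. The sketch below follows that reading for a single head. The concatenation order, the LeakyReLU nonlinearity, and the toy encodings `phi` and `psi` are assumptions, since the text does not specify them.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matvec(W, x):
    return [dot(row, x) for row in W]

def attention_weights(h_i, neighbors, W, a, phi, psi, slope=0.2):
    """Softmax attention of node i over its neighbors.

    neighbors: list of (h_j, dt, ds), with time difference dt and spatial distance ds.
    """
    z_i = matvec(W, h_i)
    scores = []
    for h_j, dt, ds in neighbors:
        feat = z_i + matvec(W, h_j) + phi(dt) + psi(ds)  # list concatenation
        s = dot(a, feat)
        scores.append(s if s > 0 else slope * s)  # LeakyReLU (assumed)
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy encodings (hypothetical): one-dimensional decays in hours and meters.
phi = lambda dt: [math.exp(-abs(dt) / 24.0)]
psi = lambda ds: [math.exp(-0.5 * (ds / 1000.0) ** 2)]

W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0]  # here the score depends only on the encodings
h = [0.5, -0.2]
weights = attention_weights(h, [(h, 1.0, 100.0), (h, 48.0, 3000.0)], W, a, phi, psi)
print(weights)
```

With identical node features, the neighbor that is closer in time and space receives the larger weight. A multi-head version would repeat the computation with separate W and a per head and concatenate the results.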
[0021] Furthermore, the multipath causal effect calculator uses the following formula to calculate the total causal effect:
$$TE(S \to T) = \sum_{p \in P(S \to T)} w_p \, E_p$$
[0022] where TE(S→T) is the total effect, P(S→T) is the set of all causal paths from source S to target T, w_p is the weight of path p, and E_p is the single-path effect of path p.
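The weighted aggregation of single-path effects can be sketched as follows. The text does not say how a single-path effect is propagated along a path. The product of edge effects used in `single_path_effect` is an assumption that holds for linear models, and all function names are hypothetical.

```python
import math

def single_path_effect(edge_effects):
    """Effect transmitted along one path (assumed: product of edge effects)."""
    return math.prod(edge_effects)

def total_causal_effect(path_effects, path_weights):
    """Weighted sum of single-path effects over all causal paths."""
    if len(path_effects) != len(path_weights):
        raise ValueError("one weight per path is required")
    return sum(w * e for w, e in zip(path_weights, path_effects))

def path_contributions(path_effects, path_weights):
    """Share of the total effect attributable to each path."""
    total = total_causal_effect(path_effects, path_weights)
    if total == 0:
        return [0.0 for _ in path_effects]
    return [w * e / total for w, e in zip(path_weights, path_effects)]

# Two hypothetical paths, each with three edges.
effects = [single_path_effect([0.5, 0.8, 0.5]), single_path_effect([0.4, 0.5, 0.5])]
weights = [1.0, 1.0]
total = total_causal_effect(effects, weights)
shares = path_contributions(effects, weights)
print(total, shares)
```

Reporting the per-path shares alongside the total corresponds to the stated goal of quantifying each path's contribution to the total effect.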
[0023] Furthermore, the spatiotemporal conditional independence test mechanism ST-CIT is implemented through the following steps: construct the spatiotemporal distance matrix; design a spatiotemporal weighting function; calculate the weighted conditional independence test statistic; and adjust the significance level to control for biases caused by spatiotemporal correlation.
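The four steps can be illustrated with a deliberately simplified stand-in; it is not the ST-CIT statistic itself, which the text does not spell out. The sketch assumes a Gaussian weighting of scaled spatiotemporal distances, a partial-correlation test with one conditioning variable, and an effective sample size n_eff = n² / Σ w_ij that shrinks when observations are close together. All function names and default scales are hypothetical.

```python
import math

def st_weights(coords, times, sigma_s=1000.0, sigma_t=24.0):
    """Steps 1-2: scaled spatiotemporal distances mapped to Gaussian weights."""
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            ds = math.dist(coords[i], coords[j]) / sigma_s
            dt = abs(times[i] - times[j]) / sigma_t
            W[i][j] = math.exp(-0.5 * (ds * ds + dt * dt))
    return W

def effective_sample_size(W):
    """n for independent samples, approaching 1 for fully redundant ones."""
    n = len(W)
    return n * n / sum(sum(row) for row in W)

def corr(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def st_cit(x, y, z, coords, times, alpha=0.05):
    """Steps 3-4: Fisher z-test of X independent of Y given Z, using n_eff."""
    n_eff = effective_sample_size(st_weights(coords, times))
    r = max(-0.999999, min(0.999999, partial_corr(x, y, z)))
    dof = n_eff - 1 - 3  # one conditioning variable
    if dof <= 0:
        return r, 1.0, True  # too little independent information to reject
    z_stat = math.atanh(r) * math.sqrt(dof)
    p = math.erfc(abs(z_stat) / math.sqrt(2))
    return r, p, p >= alpha

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
z = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
times = [0.0] * 6
far = [(i * 1e6, 0.0) for i in range(6)]   # widely separated sites
near = [(0.0, 0.0)] * 6                    # co-located, simultaneous observations
r_far, p_far, indep_far = st_cit(x, y, z, far, times)
r_near, p_near, indep_near = st_cit(x, y, z, near, times)
print(indep_far, indep_near)
```

The same measurements lead to a rejection of independence when the sites are far apart, but not when all observations are co-located, which is the kind of correction for spatiotemporal correlation that the final step describes.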
[0024] A method for causal inference and risk prediction of environmental health based on spatiotemporally sensitive graph neural networks, using the aforementioned neural network-based system for causal inference and risk prediction of environmental health, is characterized by the following steps:
Step S1: Construct a heterogeneous spatiotemporal graph representation of the environmental health system, modeling pollution sources, environmental media, exposure pathways, and health endpoints as graph nodes, and modeling the relationships between them as graph edges;
Step S2: Use a graph neural network to learn the representation vectors of nodes and edges, and capture spatiotemporal dependencies through a multi-head spatiotemporal attention mechanism;
Step S3: Based on deep learning causal inference algorithms, identify causal structures from observational data and estimate causal effects;
Step S4: Calculate the multipath causal effects, and quantify the indirect causal effects and the total causal effect;
Step S5: Based on the causal inference results, predict future environmental health risks and generate dynamic early warning information.
[0025] Compared with the prior art, the beneficial effects of the present invention are:
1. Breakthrough in the spatiotemporal causal inference bottleneck: the pioneering ST-CIT mechanism addresses the problem of spurious independence in spatiotemporally correlated data and enables accurate identification of causal relationships in complex environments.
[0026] 2. Pioneering multi-path causal quantification: the system automatically decomposes the multiple transmission paths from pollution source to health endpoint, quantitatively assesses the contribution of each path, and supports precise intervention.
[0027] 3. Heterogeneous causal graph embedding: the proposed MHGCE algorithm uniformly represents multiple types of entities and their causal relationships and realizes end-to-end causal structure learning.
[0028] 4. Causally explainable predictions: because predictions are based on causal mechanisms, they have clear causal chains and remain robust in extrapolation when environmental conditions change.
[0029] 5. Significantly improved decision-making efficiency: the system supports real-time multi-scale risk warning and sensitivity analysis, providing a scientific and actionable decision-making basis for environmental health management.
Attached Figure Description
[0030] Figure 1 is a diagram showing the overall architecture of the system of the present invention;
Figure 2 is a schematic diagram of a heterogeneous spatiotemporal graph network structure;
Figure 3 is a flowchart of the deep learning causal inference module algorithm;
Figure 4 is a calculation framework diagram for the multipath causal effect calculator;
Figure 5 is a flowchart of the computation of the spatiotemporally sensitive graph attention mechanism;
[0031] Figure 6 is a flowchart of the spatiotemporal conditional independence test mechanism (ST-CIT);
Figure 7 is a network architecture diagram of the Multilayer Heterogeneous Spatiotemporal Graph Causal Embedding Algorithm (MHGCE).
Detailed Implementation
[0032] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0033] Referring to Figures 1 to 7, this invention provides an environmental health causal inference and risk prediction system based on a spatiotemporally sensitive graph neural network, comprising:
a spatiotemporal heterogeneous data fusion module, used to integrate environmental health data from different sources and at different spatiotemporal scales to construct a unified spatiotemporal data representation;
a graph neural network modeling engine, used to model the environmental health system as a heterogeneous spatiotemporal graph network and to learn the representations of nodes and edges through graph neural networks;
a deep learning causal inference module, used to realize causal structure learning, confounding factor control, and causal effect estimation based on a deep learning framework;
a multipath causal effect calculator, used to identify and quantify multiple causal paths from pollution sources to health endpoints;
a dynamic risk prediction module, used to achieve dynamic prediction and early warning of environmental health risks based on causal inference results and time series modeling.
[0034] The system comprises five core modules (S1-S5), each of which can operate independently and which together form a complete causal inference and prediction process. As shown in Figure 1, the overall workflow of the system is as follows: first, multi-source environmental health data is integrated through the spatiotemporal heterogeneous data fusion module; then, the graph neural network modeling engine constructs a heterogeneous spatiotemporal graph network representation; next, the deep learning causal inference module identifies causal relationships; then, the multipath causal effect calculator quantifies multi-path effects; and finally, the dynamic risk prediction module outputs the risk prediction results. Figures 2 to 7 show the internal structure and workflow of each module.
[0035] The following section further explains the system setup process in conjunction with the above methods and steps:
(I) Spatiotemporal Heterogeneous Data Fusion Module
This module addresses the spatiotemporal fusion of multi-source heterogeneous data in the field of environmental health. It achieves a unified representation of data at different spatiotemporal scales through spatiotemporal registration algorithms. The module consists of four core sub-modules: a multi-source data preprocessor, a spatiotemporal graph construction unit, a data standardization processor, and a spatiotemporal registration algorithm.
[0036] The actual operation process of this module is as follows: First, the system acquires environmental monitoring data, health monitoring data, spatial geographic data, and time series data through multi-source data interfaces; second, the system automatically performs spatiotemporal registration and standardization processing to construct a unified spatiotemporal data representation; finally, a heterogeneous spatiotemporal graph network structure is constructed based on the processed data. If data quality issues are encountered, the system will automatically flag them and provide a data quality assessment report.
[0037] 1.1 Multi-source data preprocessor
[0038] This preprocessor employs a deep learning-based spatiotemporal data registration algorithm. Its core architecture includes a spatiotemporal data representation learning model, a multi-head spatiotemporal attention mechanism, and a spatiotemporal distance modulation function.
[0039] The specific implementation of the spatiotemporal data representation learning model is as follows:
[0040] The model parameters are configured as follows: input feature dimension d_x = 256, spatial coordinate dimension d_l = 2, time dimension d_t = 1, hidden layer representation dimension d_h = 512, number of network layers n = 4, ReLU activation function, and dropout rate 0.1. The network is built from fully connected layers: the first layer maps the original features X to a 512-dimensional latent space, the second layer fuses the spatial information L, the third layer fuses the temporal information T, and the fourth layer outputs the final spatiotemporal representation H.
[0041] The calculation formula for the multi-head spatiotemporal attention mechanism is as follows:
[0042] where the number of attention heads is h = 8, the query vector q_i and the key vector k_j each have dimension d_k = 64, and the spatiotemporal distance modulation function g takes the form of a Gaussian kernel with spatial standard deviation σ_s = 1000 meters and temporal standard deviation σ_t = 24 hours.
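A Gaussian-kernel modulation with these two standard deviations could look like the sketch below. The exact kernel and the way it enters the attention computation are not given in the text. The separable form exp(-Δs²/(2σ_s²) - Δt²/(2σ_t²)) and the choice to multiply it into the unnormalized softmax weights are assumptions, and the function names are hypothetical.

```python
import math

SIGMA_S = 1000.0  # spatial standard deviation in meters (from the text)
SIGMA_T = 24.0    # temporal standard deviation in hours (from the text)

def gaussian_modulation(ds, dt, sigma_s=SIGMA_S, sigma_t=SIGMA_T):
    """Assumed form of g: separable Gaussian kernel in space and time."""
    return math.exp(-(ds ** 2) / (2 * sigma_s ** 2) - (dt ** 2) / (2 * sigma_t ** 2))

def modulated_attention(q_i, keys, offsets):
    """Scaled dot-product attention whose weights are damped by g(ds, dt).

    offsets: list of (ds, dt) pairs, one per key.
    """
    d_k = len(q_i)
    scores = [sum(a * b for a, b in zip(q_i, k)) / math.sqrt(d_k) for k in keys]
    m = max(scores)  # subtract the maximum for numerical stability
    raw = [math.exp(s - m) * gaussian_modulation(ds, dt)
           for s, (ds, dt) in zip(scores, offsets)]
    total = sum(raw)
    if total == 0.0:  # every neighbor is too far away: fall back to uniform weights
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]

q = [0.1] * 64                       # 64-dimensional query, as configured
keys = [[0.1] * 64, [0.1] * 64]      # two identical keys
weights = modulated_attention(q, keys, [(0.0, 0.0), (1000.0, 24.0)])
print(weights)
```

With identical keys, the second neighbor sits one standard deviation away in both space and time, so its weight is reduced by a factor of e⁻¹ before normalization.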
[0043] 1.2 Spatiotemporal Graph Construction Unit
[0044] This unit is responsible for constructing a heterogeneous spatiotemporal graph network structure containing four node types and six edge types. Its core implementation includes node type identification algorithms, edge relationship inference algorithms, and graph topology optimization algorithms.
[0045] The node types are defined and encoded as follows:
Pollution source node (S): encoding dimension 128, including pollution source type (industrial source, transportation source, agricultural source, domestic source), emission intensity, emission composition, geographical location, and time characteristics;
Environmental media node (M): encoding dimension 96, including media type (atmosphere, surface water, groundwater, soil, organism), pollutant concentration, physicochemical properties, transmission characteristics, and distribution features;
Exposure route node (E): encoding dimension 64, including the exposure method (respiratory exposure, oral exposure, skin contact), exposure intensity, exposure time, exposed population, and exposure scenario;
Health endpoint node (H): encoding dimension 80, including the type of health effect (acute effect, chronic effect, carcinogenic effect), severity, affected population, biomarkers, and clinical manifestations.
[0046] The edge relationship types and their weight calculations are defined as follows:
Emission relationship edge (S→M): the weight is calculated from the emission intensity and the transmission efficiency, scaled by a normalization coefficient α;
Medium transfer edge (M→M): the weight is based on the inter-medium transport model, taking into account physicochemical processes and environmental conditions;
Exposure relationship edge (M→E): the weight is based on the exposure model, taking into account contact frequency, contact intensity, and contact time;
Pathogenicity relationship edge (E→H): the weight is based on dose-response relationships, combined with epidemiological evidence and toxicological data;
Time-related edge: the weight is based on time-lag correlations and employs a time decay function;
Spatial relation edge: the weight is based on spatial distance decay and adopts a power-law function form.
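Three of these edge weights can be sketched directly from their descriptions. The functional forms below are assumptions: a product for the emission edge, an exponential for the time decay, and a shifted power law for the spatial decay. The parameter names and defaults (`tau`, `d0`, `beta`) are hypothetical and not taken from the patent.

```python
import math

def emission_edge_weight(emission_intensity, transmission_efficiency, alpha=1.0):
    """S -> M edge: assumed product form, scaled by the normalization coefficient alpha."""
    return alpha * emission_intensity * transmission_efficiency

def temporal_edge_weight(lag_correlation, lag_hours, tau=24.0):
    """Time-related edge: time-lag correlation damped by an assumed exponential decay."""
    return lag_correlation * math.exp(-lag_hours / tau)

def spatial_edge_weight(distance_m, d0=1000.0, beta=2.0):
    """Spatial edge: assumed shifted power law, equal to 1 at zero distance."""
    return (1.0 + distance_m / d0) ** (-beta)

print(emission_edge_weight(10.0, 0.5, alpha=0.1))
print(temporal_edge_weight(0.8, 12.0))
print(spatial_edge_weight(1000.0))
```

The shift by 1 in the power law avoids a singularity at zero distance; an unshifted form would need a minimum-distance cutoff instead.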
[0047] 1.3 Data Standardization Processor
[0048] This processor enables standardized processing of multi-source heterogeneous data, ensuring data quality and consistency. Core functions include data quality assessment, missing value handling, outlier detection, and data normalization.
[0049] Data quality assessment employs a multi-dimensional evaluation system:
$$Q = w_1 C + w_2 I + w_3 A + w_4 T$$
[0050] where Q is the overall data quality score, C is the completeness score, I is the consistency score, A is the accuracy score, and T is the timeliness score. The weighting coefficients are adjusted dynamically according to the application scenario and data characteristics; typical values are w_1 = 0.3, w_2 = 0.25, w_3 = 0.3, and w_4 = 0.15.
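As a worked example of the weighted score, the sketch below uses the typical weights quoted in the text. The function name and the assumption that each component score lies in [0, 1] are illustrative.

```python
def data_quality_score(completeness, consistency, accuracy, timeliness,
                       weights=(0.3, 0.25, 0.3, 0.15)):
    """Weighted sum of the four component scores, each assumed to lie in [0, 1]."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights should sum to 1")
    w1, w2, w3, w4 = weights
    return w1 * completeness + w2 * consistency + w3 * accuracy + w4 * timeliness

# 0.3*0.9 + 0.25*0.8 + 0.3*0.95 + 0.15*0.6 = 0.27 + 0.20 + 0.285 + 0.09
q = data_quality_score(0.9, 0.8, 0.95, 0.6)
print(round(q, 3))
```

Because the four weights sum to 1, a dataset scoring 1 on every component receives an overall score of 1.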
[0051] Missing values are handled using an imputation algorithm based on a graph neural network.
[0052] The imputed value of node i is computed from the adjacency matrix A, the feature matrix excluding node i, and the known features of node i.
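The idea of filling a node's missing value from its graph neighbors can be illustrated without a trained network. The sketch below replaces the graph neural network with a single parameter-free message-passing step (a weighted neighbor mean). That substitution, the function name, and the `None` convention for missing values are assumptions.

```python
def impute_missing(features, adjacency, n_rounds=1):
    """Fill None entries with the weighted mean of neighbors that have a value.

    features: list of float or None, one scalar per node.
    adjacency: square list of lists of non-negative edge weights.
    """
    values = list(features)
    for _ in range(n_rounds):
        updated = list(values)
        for i, original in enumerate(features):
            if original is not None:
                continue  # observed values are never overwritten
            num = den = 0.0
            for j, w in enumerate(adjacency[i]):
                if w and values[j] is not None:
                    num += w * values[j]
                    den += w
            if den > 0:
                updated[i] = num / den
        values = updated
    return values

# Chain graph 0 - 1 - 2 with the middle value missing.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
filled = impute_missing([1.0, None, 3.0], adjacency)
print(filled)
```

Running more than one round lets values propagate across several hops when neighboring nodes are also missing.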
[0053] Outlier detection employs an isolation forest-based algorithm, incorporating domain knowledge to set detection thresholds. Data normalization utilizes a classification-based standardization method, applying appropriate normalization strategies to different data types.
[0054] 1.4 Spatiotemporal Registration Algorithm
[0055] This algorithm addresses the mismatch between different data sources in the spatiotemporal dimension, achieving precise spatiotemporal alignment. The core implementation includes spatiotemporal interpolation, scale transformation, and error correction.
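[0056] Spatiotemporal interpolation employs a tensor decomposition-based method:
[0057] $$\hat{\mathcal{X}} = \sum_{r=1}^{R} \lambda_r \, u_r \circ v_r \circ w_r$$
[0058] where $\hat{\mathcal{X}}$ is the interpolated spatiotemporal tensor, $\lambda_r$ is the eigenvalue (component weight) of the r-th component, and $u_r$, $v_r$, $w_r$ are the feature vectors for the spatial, temporal, and feature dimensions, respectively.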
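A sum of weighted outer products of spatial, temporal, and feature vectors is the form used by CP (CANDECOMP/PARAFAC) decomposition. Assuming that form, the sketch below shows only the reconstruction step, that is, how a full tensor (including entries that were never observed) is rebuilt from given factors. Fitting the factors to the observed entries is omitted, and the function name is hypothetical.

```python
def cp_reconstruct(lambdas, U, V, W):
    """Rebuild a 3-way tensor from rank-R factors.

    U[r], V[r], W[r] are the spatial, temporal and feature vectors of component r.
    Returns nested lists X[i][j][k].
    """
    R = len(lambdas)
    I, J, K = len(U[0]), len(V[0]), len(W[0])
    return [[[sum(lambdas[r] * U[r][i] * V[r][j] * W[r][k] for r in range(R))
              for k in range(K)]
             for j in range(J)]
            for i in range(I)]

# Rank-1 toy example: 2 sites x 3 time steps x 1 feature.
X = cp_reconstruct([2.0], U=[[1.0, 2.0]], V=[[1.0, 0.5, 0.0]], W=[[1.0]])
print(X[1][1][0])
```

Every entry of the rebuilt tensor is defined, which is what makes the low-rank form usable for interpolating gaps in space and time.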
[0059] The scale conversion adopts a multi-resolution pyramid structure, supporting multi-time scale conversion from hour to year and multi-spatial scale conversion from meter to kilometer.
[0060] Error correction is based on a Bayesian framework, which quantifies the uncertainty in the registration process and provides confidence intervals.
[0061] (II) Graph Neural Network Modeling Engine
[0062] This module is the core component of this invention, employing a heterogeneous spatiotemporal graph neural network architecture to achieve representation learning and feature extraction for the environmental health network. This module comprises four core sub-modules: a heterogeneous spatiotemporal graph representation learning network, a spatiotemporal attention mechanism, a multi-type node encoder, and a relation-aware convolutional layer.
[0063] 2.1 Heterogeneous Spatiotemporal Graph Representation Learning Network
[0064] This network employs a hierarchical graph convolutional architecture, with specialized convolutional operations designed for different types of nodes and edges. The core innovation lies in the multi-layer heterogeneous spatiotemporal graph causal embedding algorithm (MHGCE), which enables causal-aware graph representation learning.
[0065] 2.1.1 Basic Network Architecture
[0066] Meta-path-aware message passing mechanism:
[0067] The network parameters employ an adaptive configuration strategy: the number of graph convolutional layers is determined based on the graph size, the hidden layer dimension matches the task complexity, and the number of relation types and node types is set based on the actual heterogeneous spatiotemporal graph structure. Each relation type is configured with an independent learnable weight matrix.
[0068] Metapath definitions include:
SMEH: Pollution source → Environmental medium → Exposure pathway → Health endpoint;
SMMEH: Pollution source → Environmental medium → Medium transfer → Exposure pathway → Health endpoint;
SMEEH: Pollution source → Environmental medium → Exposure pathway → Enhanced exposure → Health endpoint.
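A concrete path can be matched against these metapaths by reading off the type letter of each node along it. This is a minimal sketch; the node names and the `node_type` mapping are hypothetical.

```python
METAPATHS = {"SMEH", "SMMEH", "SMEEH"}

def metapath_of(path, node_type):
    """Return the metapath label of a node sequence, or None if it matches none."""
    signature = "".join(node_type[n] for n in path)
    return signature if signature in METAPATHS else None

# Hypothetical nodes with their type letters (S, M, E, H).
node_type = {
    "plant": "S", "air": "M", "soil": "M",
    "inhalation": "E", "ingestion": "E", "disease": "H",
}
print(metapath_of(["plant", "air", "inhalation", "disease"], node_type))
print(metapath_of(["plant", "air", "soil", "ingestion", "disease"], node_type))
```

Grouping paths by metapath label is one way a message-passing layer could decide which relation-specific parameters to apply.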
[0069] The attention weight calculation combines node feature similarity and relation importance, where a_r is the attention vector of relation r; the optimal weight allocation is obtained through learning.
[0070] 2.1.2 Multilayer Heterogeneous Spatiotemporal Graph Causal Embedding Algorithm (MHGCE)
[0071] This algorithm is the core technological innovation of this invention, realizing the joint representation learning of causal relationships between different types of entities in a unified vector space.
[0072] Core design principle: Traditional graph embedding methods only consider structural similarity and cannot capture the directionality and strength of causal relationships. MHGCE achieves causal-aware graph representation learning through multi-layered causal constraints and heterogeneous information fusion.
[0073] Four-layer architecture design:
Layer 1: Heterogeneous Node Encoding Layer. A specialized encoder is selected based on the node type to ensure that the features of different types of entities are fully expressed.
[0074] Layer 2: Causal Relationship Perception Layer.
[0075] where W_r is the causal transformation matrix of relation r, which is specifically designed to preserve causal direction information.
[0076] Layer 3: Spatiotemporal Dynamic Update Layer. It integrates dynamic information from both temporal evolution and spatial propagation.
[0077] Layer 4: Causal Effect Prediction Layer. The strength of causal effects is predicted directly from the node embeddings.
[0078] Multi-objective optimization training: The loss functions respectively constrain graph structure reconstruction, causal relationship preservation, temporal consistency, and spatial proximity.
[0079] MHGCE's core technological advantages over traditional graph embedding include: achieving unified modeling of structural similarity and causal relationships for the first time; supporting joint representation learning of heterogeneous nodes and edges; possessing spatiotemporal dynamic perception and adaptive update capabilities; and directly outputting causal effect prediction results.
[0080] 2.2 Spatiotemporal Attention Mechanism
[0081] This mechanism employs a spatiotemporally sensitive attention weight calculation, taking into account both temporal and spatial influences. The core implementation includes spatiotemporal distance encoding, multi-head attention calculation, and spatiotemporally aware message passing.
[0082] The formula for calculating the spatiotemporal sensitivity attention weight is:
[0083] Temporal distance encoding uses sinusoidal positional encoding:
[0084] Here, the encoded quantity is the time difference, the time encoding dimension is 64, and k is the position index.
[0085] Spatial distance encoding uses Gaussian radial basis functions:
[0086] Here, the encoded quantity is the Euclidean distance, σ=1000 is the spatial standard deviation, and θ is the relative azimuth angle.
[0087] The multi-head attention mechanism is configured as follows: number of attention heads h=8, per-head dimension d_h=32, total dimension 256. Each head captures a different type of spatiotemporal relationship: heads 1-2 focus on short-term temporal dependencies, heads 3-4 on long-term temporal trends, heads 5-6 on short-distance spatial relationships, and heads 7-8 on long-distance spatial transmission.
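As an illustrative, non-limiting sketch, the two distance encodings described above could be computed as follows. The base constant 10000, the interleaved sine/cosine layout, and the function names are assumptions that this description does not specify; the relative azimuth angle θ is omitted for brevity.

```python
import numpy as np

def time_encoding(dt, dim=64, base=10000.0):
    """Sinusoidal encoding of a time difference dt into `dim` values.

    The base of 10000 and the sin/cos interleaving are assumptions.
    """
    k = np.arange(dim // 2)                  # position index k
    freq = 1.0 / base ** (2 * k / dim)       # one frequency per sin/cos pair
    enc = np.empty(dim)
    enc[0::2] = np.sin(dt * freq)            # even slots: sine
    enc[1::2] = np.cos(dt * freq)            # odd slots: cosine
    return enc

def space_encoding(dist, sigma=1000.0):
    """Gaussian radial basis function of a Euclidean distance."""
    return np.exp(-dist ** 2 / (2 * sigma ** 2))
```

With 8 heads of 32 dimensions each, such encodings would be projected into the 256-dimensional attention space before the weights are computed.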
[0088] 2.3 Multi-type node encoder
[0089] This encoder applies specialized encoding strategies to four different types of nodes, making full use of the type-specific information of each node.
[0090] Pollution source node encoder structure:
[0091] Here, the categorical feature uses one-hot encoding (20 dimensions); the scalar feature is log-transformed and standardized (1 dimension); composition is encoded using chemical fingerprinting (128 dimensions), location using geocoding (64 dimensions), and time using periodic encoding (32 dimensions), giving a 245-dimensional input (20+1+128+64+32). MLP_s is a three-layer fully connected network: 245→128→256.
[0092] Environmental media node encoders take into account the physicochemical properties of the medium: h_m = MLP_m([medium_type, concentration, properties, dynamics]). The exposure path node encoder focuses on exposure patterns and intensity: h_e = MLP_e([exposure_route, intensity, duration, population]). The health endpoint encoder integrates clinical and epidemiological information: h_h = MLP_h([effect_type, severity, biomarkers, epidemiology]). Each encoder employs residual connections and batch normalization to improve training stability and convergence speed.
[0093] 2.4 Relationship-Aware Convolutional Layer
[0094] This layer defines specialized convolution operations for six different relation types to capture relation-specific information propagation patterns.
[0095] Relationship-specific message calculation:
[0096] Here, the edge features of relation r include information such as relation strength, confidence, and time delay.
[0097] Relational aggregation uses weighted summation:
[0098] Here, the global weight of relation r is obtained through learning, and the self-loop weight preserves the original information of each node.
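The relation-specific message calculation and weighted aggregation can be sketched as follows. This is a minimal illustration under stated assumptions: the names `relation_aware_update`, `W`, `beta`, and `w_self` are hypothetical, edge features are omitted, and a ReLU activation is assumed.

```python
import numpy as np

def relation_aware_update(h, edges, W, beta, w_self):
    """One relation-aware aggregation step (illustrative only).

    h      : (N, d) float array of node features
    edges  : {relation: [(src, dst), ...]} directed edges per relation type
    W      : {relation: (d, d) array} relation-specific transform (assumed form)
    beta   : {relation: float} learned global weight of each relation
    w_self : float self-loop weight that keeps the node's own information
    """
    out = w_self * h
    for rel, pairs in edges.items():
        msg = np.zeros_like(h)
        for src, dst in pairs:
            msg[dst] += h[src] @ W[rel]      # message from src to dst under relation rel
        out = out + beta[rel] * msg          # weighted sum over relation types
    return np.maximum(out, 0.0)              # ReLU (assumed activation)
```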
[0099] (III) Deep Learning Causal Inference Module
[0100] This module, based on a deep learning framework, enables causal structure learning and causal effect estimation, which is the core innovation of this invention. The module comprises four core sub-modules: a causal structure learning network, a counterfactual prediction network, a confounding factor control algorithm, and a causal effect estimator.
[0101] 3.1 Causal Structure Learning Network
[0102] This network uses a combination of variational autoencoders and graph neural networks to learn causal graph structures. The core idea is to infer the most likely causal structure by maximizing the data likelihood.
[0103] The posterior distribution model of the causal structure is as follows:
[0104] The likelihood function p(X|G) is parameterized through the structural equation model:
[0105] Here, the parent set denotes the set of parent nodes of node i in the causal graph G, and each node i has its own parameters.
[0106] The prior distribution p(G) is subject to sparsity constraints:
[0107] where |G| is the number of edges in graph G, and λ is the sparsity parameter, set to 0.01.
[0108] Variational inference employs the Gumbel-Softmax technique to achieve continuous optimization of discrete structures:
[0109] Here, the connection probability of each edge is learned, and τ is the temperature parameter, which decreases linearly from 1.0 to 0.1 during training.
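A minimal sketch of the relaxed edge sampling is given below, assuming the binary (two-class) form of the Gumbel-Softmax relaxation and a 200-epoch linear schedule. The function names and the use of per-edge logits are assumptions for illustration.

```python
import numpy as np

def temperature(epoch, total_epochs=200, start=1.0, end=0.1):
    """Linear decay of the temperature from `start` to `end`."""
    frac = min(epoch / max(total_epochs - 1, 1), 1.0)
    return start + (end - start) * frac

def sample_edges(logits, tau, rng):
    """Relaxed Bernoulli (binary Gumbel-Softmax) sample for each candidate edge.

    logits : array of edge log-odds (the learned connection probabilities)
    tau    : temperature; smaller values push samples toward 0 or 1
    """
    u = rng.uniform(1e-6, 1 - 1e-6, size=logits.shape)
    noise = np.log(u) - np.log1p(-u)         # logistic noise = difference of two Gumbel draws
    return 1.0 / (1.0 + np.exp(-(logits + noise) / tau))
```

At high temperature the samples stay close to 0.5 and gradients flow to every edge; as τ approaches 0.1 the samples approach a discrete adjacency matrix.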
[0110] The network architecture adopts an encoder-decoder structure: Encoder:
[0111] Decoder:
[0112] 3.2 Counterfactual Prediction Network
[0113] This network implements the do-operator through a neural network to generate counterfactual predictions, and is a core component for estimating causal effects.
[0114] The mathematical expression for counterfactual prediction is:
[0115] The network architecture employs a multi-task learning framework:
[0116] Shared encoder: Causal predictor:
[0117] Representation predictor:
[0118] The shared encoder learns a hybrid representation of each individual, with the dimension set to 128. The causal predictor and the representation predictor share the underlying representation but have different output heads.
[0119] The training loss function consists of three parts:
[0120] Factual loss:
[0121] Representation loss:
[0122] Regularization loss:
[0123] The weighting coefficients are set to λ1=0.1 and λ2=0.01.
[0124] To improve the accuracy of counterfactual predictions, the network employs the following techniques: Domain adversarial training: an adversarial loss ensures that the representation distributions of different treatment groups are similar; Balanced representation learning: inverse probability weighting (IPW) is used to reweight samples and balance the group distributions; Local linearity constraint: linear causal relationships are enforced within a local neighborhood.
[0125] 3.3 Confounding Factor Control Algorithm
[0126] This algorithm automatically identifies backdoor paths and minimum adjustment sets based on graph structures, enabling intelligent control of confounding factors.
[0127] The mathematical definition of the minimum adjustment set is:
[0128] The algorithm is implemented using a search-based method. Step 1: Identify all backdoor paths from X to Y: BackdoorPaths(X,Y,G) = {π | π is a path from X to Y that includes an edge pointing into X}. Step 2: Solve the minimum set cover problem: find the smallest Z such that every backdoor path is blocked by at least one variable in Z. Step 3: Verify the sufficiency of the adjustment set: check whether the adjustment set satisfies the requirements.
[0129] Algorithm optimization strategies: Greedy search: prioritize variables that can block the most paths; Pruning strategy: exclude descendants of the treatment variable X; Heuristic scoring: comprehensively consider blocking ability, measurement cost, and reliability.
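The greedy search strategy can be illustrated with the following sketch. It assumes the backdoor paths have already been enumerated and treats every intermediate node as a potential blocker; collider handling, measurement cost, and reliability scoring are deliberately left out, and the function name is hypothetical.

```python
def greedy_adjustment_set(backdoor_paths, forbidden=frozenset()):
    """Greedy set cover over backdoor paths (illustrative simplification).

    backdoor_paths : list of lists, the intermediate nodes on each backdoor path
    forbidden      : nodes that must not be adjusted for (e.g. descendants of X)
    """
    remaining = [set(p) - set(forbidden) for p in backdoor_paths]
    chosen = set()
    while remaining:
        counts = {}
        for path in remaining:
            for v in path:
                counts[v] = counts.get(v, 0) + 1
        if not counts:
            raise ValueError("some backdoor path cannot be blocked")
        # Pick the variable that blocks the most still-open paths
        # (ties broken by sorted node order).
        best = max(sorted(counts), key=counts.get)
        chosen.add(best)
        remaining = [p for p in remaining if best not in p]
    return chosen
```

Greedy selection does not guarantee a minimum-size set, which is why the sufficiency check in Step 3 remains necessary.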
[0130] 3.4 Causal Effect Estimator
[0131] This estimator, based on counterfactual prediction and confounding control, achieves accurate causal effect quantification. Its core implementation includes average treatment effect estimation, conditional average treatment effect estimation, individual treatment effect estimation, and uncertainty quantification.
[0132] The formula for estimating the average treatment effect (ATE) is:
[0133] Conditional average treatment effect (CATE) takes into account the heterogeneity of individual characteristics:
[0134] Individual treatment effect (ITE) provides an estimate of the effect at the individual level:
[0135] Uncertainty quantification employs a Bayesian deep learning framework: the approximate posterior distribution is inferred through variational inference, which yields confidence intervals for the effect estimates.
[0136] The network structure adopts a multi-head output architecture: shared feature extractor; effect estimation head; outcome prediction head. Training strategies include: double machine learning: training the outcome model and the treatment model separately to reduce regularization bias; cross-fitting: using sample splitting to avoid overfitting; and targeted regularization: directly optimizing the causal effect estimation error.
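Assuming the counterfactual prediction network supplies a predicted outcome under treatment and under control for every individual, the three effect levels can be sketched with their standard definitions (ITE as the per-individual difference, ATE as its mean, CATE as its mean within a subgroup). The function and argument names are illustrative.

```python
import numpy as np

def treatment_effects(y1_hat, y0_hat, group=None):
    """Return (ITE array, ATE, {group: CATE}) from counterfactual predictions.

    y1_hat, y0_hat : predicted outcomes under treatment / control per individual
    group          : optional labels defining the strata used for CATE
    """
    ite = np.asarray(y1_hat, float) - np.asarray(y0_hat, float)
    ate = ite.mean()
    cate = {}
    if group is not None:
        group = np.asarray(group)
        for g in np.unique(group):
            cate[g.item()] = ite[group == g].mean()   # average effect within stratum g
    return ite, ate, cate
```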
[0137] (IV) Multipath Causal Effect Calculator
[0138] This calculator is a key innovative module of this invention, responsible for identifying and quantifying multiple causal paths from pollution sources to health endpoints, and comprehensively assessing the indirect causal impacts in complex networks. This module comprises four core sub-modules: a causal path identification algorithm, a path effect calculation unit, an effect aggregator, and a path importance assessment module.
[0139] 4.1 Causal Path Identification Algorithm
[0140] This algorithm employs an improved breadth-first search method to identify all valid causal paths in the causal graph. The core implementation includes path search strategy, validity verification, and path optimization.
[0141] The mathematical definition of an effective causal path is: P(S→H) = {π|π is a directed path from S to H and satisfies the causal validity condition} The causal validity conditions include: path direction consistency: all edges are along the causal direction; no collider constraint: the path does not contain unadjusted collider structures; length limit: the path length does not exceed the preset maximum value (default is 6).
[0142] Directed graphs and sets of paths: Let G=(V,E) be a directed graph with a given adjacency matrix, source node s∈V, target node t∈V, and maximum path length L∈N.
[0143] The set of simple directed paths of length k is defined as:
[0144] Recursive construction:
[0145]
[0146] Candidate path set (length not exceeding L):
[0147] Valid causal path screening: Define a path validity indicator function:
[0148] Set of valid causal paths:
[0149] Where: s, t are the source and target nodes; L is the maximum path length (e.g., L=6); Z is the adjustment set; the collider structure refers to u→c←v; the temporal consistency requirement is that the timestamps are monotonically non-decreasing.
[0150] Path validity verification includes: Structural verification: Checks whether the path constitutes a valid causal chain; Conditional verification: Verifies the conditional independence assumption; Temporal verification: Ensures the correctness of the causal time sequence.
[0151] Path optimization strategies: Pruning algorithm: terminate the search for invalid paths in advance; Caching mechanism: store calculated sub-paths to avoid repeated calculations; Parallel search: perform path search in parallel for different starting nodes.
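The candidate path search can be sketched as a breadth-first enumeration of simple directed paths with a length bound. This sketch covers only the direction and length conditions; the collider and temporal-consistency checks described above would be applied as a separate filter. The adjacency-dictionary input format is an assumption.

```python
from collections import deque

def simple_paths(adj, s, t, max_len=6):
    """All simple directed paths from s to t with at most max_len edges.

    adj : {node: iterable of successor nodes}
    """
    paths = []
    queue = deque([[s]])
    while queue:
        path = queue.popleft()
        if path[-1] == t and len(path) > 1:
            paths.append(path)               # reached the target: record and stop extending
            continue
        if len(path) - 1 >= max_len:
            continue                         # pruning: length bound reached
        for nxt in adj.get(path[-1], ()):
            if nxt not in path:              # keep paths simple (no repeated nodes)
                queue.append(path + [nxt])
    return paths
```

Because paths are expanded level by level, shorter paths are returned first, and the worst-case cost grows exponentially with the length bound.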
[0152] 4.2 Path Effect Calculation Unit
[0153] This unit calculates the effect strength of a single causal path, taking into account the causal coefficient and transmission mechanism of each edge in the path.
[0154] The formula for calculating the single-path effect is:
[0155] Here, each edge e has a causal coefficient, and Transmission(π) is the path transfer function.
[0156] Methods for estimating the edge causal coefficient: Direct edge: obtained directly from the causal effect estimator; Mediating edge: indirect effect based on mediation analysis; Moderating edge: conditional effect that accounts for the moderating variable.
[0157] Path transfer functions consider various attenuation factors:
[0158] where L(π) is the path length; α is the length decay parameter (set to 0.1); ρ(v_i) is the reliability coefficient of node i; and the last factor is the time decay factor.
[0159] Handling of special path types: Parallel path: Multiple paths act simultaneously, resulting in superimposed effects; Sequential path: Paths are connected in series, leading to multiplicative effects; Feedback path: Feedback loops exist, requiring dynamic system methods.
[0160] Path representation and effect calculation: let π be a path containing K edges, with node features and edge features defined as follows.
[0161] Local fused representation (edge by edge); path embedding vector; base effect (approximately the product of the edge effects along the path).
[0162] Transfer factor (considering length, reliability, and time-delay decay):
[0163] Single-path causal effect:
[0164] Where σ(·) denotes a nonlinear activation (such as ReLU / ELU); the weights and the bias b are learnable parameters; L(π)=K is the path length; α>0 is the length decay coefficient; ρ(v_i)∈(0,1] is the node reliability coefficient; and the remaining factor is a time decay function.
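As a simplified numerical illustration of the single-path effect, the sketch below multiplies the edge coefficients and then applies a transfer factor. The multiplicative combination of the three decay terms and the exponential form of the time decay are assumptions, as are the parameter names; the learned path embedding is not modelled here.

```python
import math

def path_effect(edge_coeffs, node_reliability, delay, alpha=0.1, decay_rate=0.0):
    """Single-path effect = product of edge coefficients x transfer factor.

    edge_coeffs      : causal coefficient of each edge on the path
    node_reliability : reliability coefficient of each node, each in (0, 1]
    delay            : total time delay along the path
    alpha            : length decay parameter (0.1 as stated in the text)
    decay_rate       : rate of the assumed exponential time decay
    """
    base = math.prod(edge_coeffs)
    length = len(edge_coeffs)                # L(pi) = K, the number of edges
    transmission = (math.exp(-alpha * length)
                    * math.prod(node_reliability)
                    * math.exp(-decay_rate * delay))
    return base * transmission
```

For a two-edge path with coefficients 0.5 and 0.4 and fully reliable nodes, the base effect 0.2 is attenuated by exp(-0.2).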
[0165] 4.3 Effect Aggregator
[0166] This aggregator rationally combines the effects of multiple paths to obtain the overall causal effect. Its core implementation includes a weighting strategy, an effect integration algorithm, and a conflict resolution mechanism.
[0167] The aggregate formula for the total effect is:
[0168] Weighting strategies consider multiple factors:
[0169] Where: Importance(π): based on the structural importance of the path in the network; Reliability(π): Based on the data quality of each stage in the path; Evidence(π): Path-based evidence strength.
[0170] Path importance calculation:
[0171] The centrality score uses a combination of betweenness centrality and eigenvector centrality:
[0172] Effect integration considers the interactions between pathways: Independent effect: No interaction between paths, simple linear superposition; Synergistic effect: Positive interaction exists between paths, amplifying the effect; Antagonistic effect: Negative interaction exists between paths, weakening the effect.
[0173] Interaction detection employs a graph-based method: Total effect aggregation:
[0174] Weight allocation (softmax normalization):
[0175] where:
[0176] Interaction determination: Let the interaction index of the two paths π_1 and π_2 be:
[0177] Judgment rules: S>0 indicates synergy, S<0 indicates antagonism, and S≈0 indicates independence.
[0178] Where: Importance(π) can be obtained by combining centrality and structure weights; Reliability(π) aggregates the data quality at each stage of the path; Evidence(π) is the strength of evidence (such as statistical significance and information content).
[0179] Conflict resolution mechanism: Conflict in effect direction: choose the dominant direction based on the strength of evidence; Effect size conflict: calculate the weighted average; Uncertainty quantification: provide a wider confidence interval for conflict scenarios.
4.4 Path Importance Assessment
This evaluator quantifies the relative importance of each causal path, providing guidance for prioritizing policy interventions.
[0180] Path importance scoring system:
[0181] Weight settings for each dimension (effect size, path length penalty, controllability, and evidence strength, in that order): 0.4, -0.2, 0.3, 0.1
[0182] Effect size standardization:
[0183] Path length penalty:
[0184] Controllability score:
[0185] ControlScore is based on the policy operability of nodes.
[0186] Evidence strength score:
[0187] Path ranking algorithm: Calculate the importance score of all paths; sort them in descending order of score; identify the Top-K important paths (K defaults to 10); generate a path importance report.
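The ranking steps can be sketched as below, assuming the importance score is a weighted sum of the four dimension scores with weights 0.4, -0.2, 0.3, and 0.1. The dictionary keys and the function name are hypothetical, and the inputs are assumed to be already standardized.

```python
def rank_paths(paths, weights=(0.4, -0.2, 0.3, 0.1), top_k=10):
    """Score each path and return the top_k (score, name) pairs, best first.

    paths : list of dicts with keys "name", "effect", "length_penalty",
            "control", "evidence" (hypothetical field names)
    """
    w_effect, w_length, w_control, w_evidence = weights
    scored = [
        (w_effect * p["effect"] + w_length * p["length_penalty"]
         + w_control * p["control"] + w_evidence * p["evidence"], p["name"])
        for p in paths
    ]
    scored.sort(key=lambda item: item[0], reverse=True)   # descending by score
    return scored[:top_k]
```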
[0188] Visualization features: Path diagram: shows the network structure of important paths; Effect decomposition diagram: shows the effect contribution of each path; Heat map: shows the spatial distribution of path importance; Time series diagram: shows the temporal variation of path effects.
[0189] (V) Dynamic Risk Prediction Module
[0190] This module, based on causal inference results and time-series modeling, enables dynamic prediction and early warning of environmental health risks, and is the final output module of the system of this invention. This module comprises four core sub-modules: a time-series causal modeling network, a risk prediction network, an uncertainty quantifier, and an early warning decision system.
[0191] 5.1 Temporal Causal Modeling Network
[0192] This network constructs a time-series dynamic model that considers causal relationships, integrating causal inference results into the time series prediction framework. The core implementation includes causal time-series equations, a dynamic graph update mechanism, and time-varying parameter estimation.
[0193] The mathematical expression of the time-series causal model is:
[0194] where: Y_{t+1} represents the health risk variable at time t+1; X_t represents the environmental exposure variable at time t; G_t is the causal graph structure at time t; θ_t is a time-varying parameter; ε_{t+1} is the random error term. The specific form of the causal time series equation is:
[0195] Parameter configuration: The lag order is set to 5 (to account for the influence of the previous 5 time points); the number of causal effect terms is determined by the main causal paths identified (usually 3-8); the trend term adopts a local linear trend, and the seasonal term is represented by a Fourier series.
[0196] Dynamic graph update mechanism: structural change measurement and update criteria. Given the current graph G_t at time t and the candidate graph for time t+1 learned from the most recent time window W, define the structural difference:
[0197] Update rules:
[0198] Difference measurement (example):
[0199] Time-varying effect estimation (sliding window):
[0200] Where: Adj(·) is the adjacency matrix; τ_{update} is the structure update threshold; the loss function may be, e.g., mean squared error or negative log-likelihood.
[0201] Time-varying parameter estimation uses a state-space model:
[0202] Where Φ is the state transition matrix and η_t is the state noise. Parameter estimation uses an extended Kalman filter. Prediction step:
[0203] Update steps:
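For intuition, the prediction and update steps can be sketched with a linear Kalman filter, to which the extended filter reduces when the observation is linear in the parameters. The scalar observation model y = x·θ + noise and all argument names are assumptions made for this illustration.

```python
import numpy as np

def kalman_step(theta, P, x, y, Phi, Q, r):
    """One predict/update cycle for time-varying regression parameters.

    State model       : theta_t = Phi @ theta_{t-1} + eta_t,  Cov(eta_t) = Q
    Observation model : y_t = x_t . theta_t + noise, noise variance r (assumed linear)
    """
    # Prediction step
    theta_pred = Phi @ theta
    P_pred = Phi @ P @ Phi.T + Q
    # Update step
    innovation = y - x @ theta_pred
    s = x @ P_pred @ x + r                   # innovation variance (scalar)
    gain = P_pred @ x / s                    # Kalman gain
    theta_new = theta_pred + gain * innovation
    P_new = P_pred - np.outer(gain, x) @ P_pred
    return theta_new, P_new
```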
[0205] 5.2 Risk Prediction Network
[0206] This network employs a graph recurrent neural network for multi-step risk prediction, fusing temporal and graph structure information. The core architecture includes a graph temporal encoder, a multi-scale predictor, and an attention fusion mechanism.
[0207] Network architecture design: Graph Convolutional LSTM Temporal Coding: Normalized adjacency:
[0208] Graph Convolution and Gating:
[0209]
[0210]
[0211]
[0212]
[0213]
[0214]
[0215] Temporal attention aggregation: For the encoded sequence {h_1,…,h_T}, the attention weights are:
[0216] Context:
[0217] Multi-scale prediction fusion, with fusion weights:
[0218] Fused representation:
[0219] K-step prediction:
[0220] where || denotes vector concatenation.
[0221] LSTM convolutional unit:
[0222] The update equation for the convolutional LSTM is:
[0223]
[0224]
[0225]
[0226]
[0227]
[0228] Multi-scale prediction strategy: Short-term forecast (1-7 days): direct recursive forecast with a step size of 1 day. Medium-term forecast (1-4 weeks): hierarchical forecast, first forecasting the weekly average and then disaggregating it to daily values. Long-term forecast (1-12 months): trend-seasonal decomposition, forecasting each component separately. The attention fusion mechanism takes into account the importance of the different time scales:
[0229]
[0230] 5.3 Uncertainty Quantifier
[0231] This quantifier assesses the uncertainty of forecast results, providing confidence intervals and risk level assessments. Its core implementation includes epistemic uncertainty estimation, aleatoric uncertainty estimation, and overall uncertainty aggregation.
[0232] Epistemic uncertainty (MC Dropout): The model performs M forward passes with dropout activated to obtain the predictions {ŷ^{(m)}}_{m=1}^M.
[0233] Mean:
[0234] Variance:
[0235] Aleatoric uncertainty (heteroscedastic regression): The network outputs a mean and a variance:
[0236]
[0237] Negative log-likelihood loss (Gaussian):
[0238] Total uncertainty:
[0239] Where: M is the number of forward passes (e.g., M=50~200); the sampling variance characterizes epistemic (knowledge) uncertainty; softplus(z)=log(1+e^z) ensures σ^2>0.
[0240] Confidence interval calculation:
[0241]
[0242] .
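The aggregation of the two uncertainty components can be sketched as follows. The sketch assumes that the total variance is the sum of the MC-Dropout variance and the mean predicted variance, and that the 95% interval uses a Gaussian quantile of 1.96; the function names are illustrative.

```python
import numpy as np

def softplus(z):
    """log(1 + e^z), computed stably; keeps predicted variances positive."""
    return np.logaddexp(0.0, z)

def aggregate_uncertainty(means, raw_vars, z=1.96):
    """Combine M stochastic forward passes into mean, total variance, and interval.

    means    : predicted means from M passes with dropout active
    raw_vars : unconstrained variance outputs from the same passes
    """
    means = np.asarray(means, float)
    mean = means.mean()
    epistemic = means.var()                                  # spread across passes
    aleatoric = softplus(np.asarray(raw_vars, float)).mean() # average predicted noise
    total = epistemic + aleatoric                            # assumed additive
    half = z * np.sqrt(total)
    return mean, total, (mean - half, mean + half)
```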
[0243] 5.4 Early Warning Decision System
[0244] This system generates tiered early warnings and decision-making recommendations based on risk prediction results. Its core implementation includes risk level classification, dynamic adjustment of early warning thresholds, multi-objective decision optimization, and early warning information generation.
[0245] Risk level classification criteria: Low risk (green): predicted risk value < threshold_1 and upper bound of confidence interval < threshold_2; Medium risk (yellow): threshold_1 ≤ predicted risk value < threshold_3, or uncertainty is present; High risk (orange): threshold_3 ≤ predicted risk value < threshold_4 and the trend is upward; Extremely high risk (red): predicted risk value ≥ threshold_4 or lower bound of confidence interval > threshold_3. Dynamic threshold adjustment: the following quantities are calculated from the predicted and observed binary events (early warning / occurrence) within a time window:
[0246] Threshold update: Let the threshold vector be θ_k=[θ_{k,1},…,θ_{k,m}]^T, the adaptation rate be η∈(0,1), and the tolerance be τ.
[0247] Segmented updates:
[0248]
[0249]
[0250] Warning level determination:
[0252] Where: TP, FP, and FN are obtained from the confusion matrix; example values are η=0.1 and τ=0.05.
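The tiered classification and the confusion-matrix quantities can be sketched as follows. Because the stated criteria overlap, this sketch checks the most severe level first and falls back to yellow when no other rule applies; that ordering, the use of precision, recall, and F1 as the window statistics, and the function names are assumptions.

```python
def risk_level(risk, ci_low, ci_high, thresholds, trend_up=False):
    """Map a prediction to a warning colour, most severe rule first.

    thresholds : (threshold_1, threshold_2, threshold_3, threshold_4)
    """
    t1, t2, t3, t4 = thresholds
    if risk >= t4 or ci_low > t3:
        return "red"
    if t3 <= risk < t4 and trend_up:
        return "orange"
    if risk < t1 and ci_high < t2:
        return "green"
    return "yellow"                          # remaining cases, including uncertain ones

def warning_scores(tp, fp, fn):
    """Precision, recall, and F1 of issued warnings within a time window."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```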
[0253] Multi-objective decision optimization considers multiple objectives: Accuracy: Maximize early warning accuracy; Timeliness: Minimize early warning delay; Completeness: Maximize risk coverage; Economy: Minimize false alarm costs.
[0254] Optimize the objective function:
[0255] Weight settings (accuracy, timeliness, completeness, and economy, in that order): 0.4, 0.3, 0.2, 0.1
[0256] Key risk factor identification algorithm: Top-K factors are selected based on contribution ranking:
[0257] Affected area identification algorithm:
[0258] Susceptible population identification algorithm:
[0259] Suggested measures matching function:
[0260] Warning message generation function:
[0261] Parameter and symbol explanation: L∈{1,2,3,4}: warning level (green, yellow, orange, red); the predicted mean risk (in R+); the 95% confidence interval; U∈R+: overall uncertainty indicator; T∈N: effective warning time range (hours); the key risk factor index set;
[0262] the set of geospatial units, where R(s) is the risk index of unit s; the stratified population groups (grouped by age, gender, underlying diseases, etc.); the subset of high-risk areas; the subset of susceptible populations; the collection of structured recommendations and measures; the spatial and population risk thresholds (dynamically adjustable); and the importance contribution of factor j (calculated based on the Shapley value or a gradient method).
[0263] Message output format:
[0264] (VI) System integration and optimization
[0265] This section describes the integration methods of each module in the system and the overall performance optimization strategy to ensure the stability, scalability, and efficiency of the system.
[0266] 6.1 Data Flow Design Between Modules
[0267] The system adopts an event-driven asynchronous architecture, and the modules exchange data through message queues.
[0268] End-to-end mapping: Given the original data X, define the module operator: F1: Spatiotemporal heterogeneous data fusion; F2: Graph modeling engine; F3: Causal inference; F4: Multipath effect calculation; F5: Dynamic risk prediction.
[0269] System output: Output = (F5 ∘ F4 ∘ F3 ∘ F2 ∘ F1)(X)
[0270] where ∘ denotes function composition; the internal formulas of each F_i are given in the corresponding sections of the main text (ST-CIT, MHGCE, attention, counterfactual, etc.).
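The end-to-end mapping amounts to applying the five module operators in sequence. A minimal sketch of such a composition is shown below; `compose` is a hypothetical helper, and in the asynchronous architecture described above each stage would consume from and publish to a message queue rather than being called directly.

```python
from functools import reduce

def compose(*stages):
    """Return a function that applies the given stages from left to right.

    compose(F1, F2, F3)(x) evaluates F3(F2(F1(x))).
    """
    return lambda x: reduce(lambda acc, stage: stage(acc), stages, x)
```

With stage functions for F1 to F5 in place, `compose(F1, F2, F3, F4, F5)(X)` would yield the system output.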
[0271] 6.2 Performance Optimization Strategies
[0272] Multiple optimization techniques are employed to improve system performance: Graph sampling techniques: Sampling large-scale graph networks to reduce computational complexity; Parallel computing: Multi-GPU parallel training and distributed inference; Model compression: Knowledge distillation, model pruning, and quantization; Caching mechanism: Intelligent caching of intermediate results to avoid redundant computation.
[0273] 6.3 Scalable Design
[0274] The system adopts a microservice architecture and supports horizontal scaling: service registration and discovery; load balancing; containerized deployment; and elastic scaling.
[0275] 6.4 Memory and Storage Optimization
[0276] Storage requirements analysis: Basic graph structure: approximately 2.3GB / 10,000 nodes; Time series data: approximately 4.7GB / year / 10,000 nodes; Model parameters: approximately 156MB-892MB (depending on network size); Intermediate result cache: approximately 1.2GB / 10,000 nodes.
[0277] Optimization strategies: Data compression: the LZ4 algorithm is used, with a compression rate of 67.3%; Hot and cold data separation: hot data (<30 days) is stored on SSD, and cold data is stored on HDD; Edge computing support: the core algorithm supports deployment on edge devices with a minimum memory requirement of 8GB.
[0278] Typical parameter configurations and computational resource requirements: Learning rate: 0.0001-0.01 (adaptively adjusted, initial value 0.001); Batch size: 32-256 (automatically selected based on GPU memory); Hidden layer dimension: 128-512 (128 for graphs with <1000 nodes, 512 for >5000 nodes); Number of attention heads: 4-16 (8 heads recommended for balancing performance and computation); Maximum path length: 4-8 (6 recommended for environmental health applications); ST-CIT bandwidth parameter: σ∈[100,5000] (1000 for city scale, 5000 for region scale); Temperature parameter: initial value 1.0, linearly decaying to 0.1 (decaying within 200 training epochs); Regularization weight: λ1=0.1, λ2=0.01 (weights for causal loss and temporal loss); Data specifications: supports 100-50000 nodes, 500-500000 edges, 30-365 days of time series length, and 10-200 feature dimensions.
[0279] Computing resources: Minimum configuration: 8 CPU cores, 16GB RAM, 100GB storage; Recommended configuration: ≥8GB GPU memory, 16 CPU cores, 32GB RAM, ≥500GB SSD storage; For large-scale applications, distributed deployment is recommended, with ≥24GB GPU memory per node.
[0280] Algorithm complexity: ST-CIT time complexity O(N²T), MHGCE training complexity O(|E|dL), multipath recognition complexity O(|V|^L), where N is the number of nodes, T is the time step, d is the feature dimension, and L is the maximum path length.
[0281] Application Examples
[0282] To verify the effectiveness and practicality of the system of this invention, we conducted application tests in several typical environmental health scenarios. The following details three representative application examples, demonstrating the system's performance and application value in different scenarios.
[0283] Application Example 1: Causal Inference Between Air Pollution and Respiratory Diseases in an Industrial Park
[0284] 1.1 Application Background and Data Description
[0285] This application example addresses the high incidence of respiratory diseases among residents surrounding a large chemical industrial park. The system of this invention is used to accurately identify the causal relationship between pollution sources and health effects. The study area covers the chemical industrial park and a 15-kilometer radius around it, involving 3 administrative districts, 12 communities, and a total population of approximately 185,000.
[0286] Data source and scale: Environmental monitoring data: Hourly monitoring data from 32 air quality monitoring stations inside and outside the park for four consecutive years from 2019 to 2022, including 15 pollutants such as PM2.5, PM10, SO2, NO2, O3, CO, and VOCs, totaling 4.2 million records.
[0287] Health data: Respiratory disease visit records from 6 hospitals and 15 community health service centers in the region, including 8 diseases such as asthma, COPD, and pneumonia, involving 32,847 patients and a total of 158,329 visit records.
[0288] Pollution source data: exhaust gas emission monitoring data of 23 enterprises in the park, including real-time monitoring data and emission inventories of 147 emission outlets.
[0289] Supporting data: meteorological data, demographic data, socioeconomic data, etc.
[0290] 1.2 System Configuration and Processing Flow
[0291] Graph network construction configuration: Number of nodes: 147 pollution source nodes, 32 environmental media nodes, 36 exposure pathway nodes, and 8 health endpoint nodes, for a total of 223 nodes.
[0292] Number of edges: 147 emission relationship edges, 96 media transfer edges, 288 exposure relationship edges, 72 pathogenic relationship edges, 1,256 spatiotemporal relationship edges, totaling 1,859 edges.
[0293] Graph network update frequency: Updated weekly, with network structure adjusted based on new data.
[0294] Deep learning model configuration: Graph Neural Network: 4-layer GraphSAGE, hidden layer dimension 256, attention heads 8. Causal Inference Network: Variational Autoencoder structure, latent space dimension 128, temperature parameter decays from 1.0 to 0.1. Training parameters: learning rate 0.001, batch size 64, training epochs 200, early stopping patience 20.
[0295] 1.3 Processing Results and Analysis
[0296] Causal structure learning results: The system identified 73 significant causal relationships (edge probability > 0.8), including: 23 direct emission relationships from pollution sources to environmental media; 18 media transport relationships between environmental media; and 32 pathogenic causal relationships from exposure to health endpoints.
[0297] Key findings include: A significant causal relationship was found between VOC emissions from a fine chemical enterprise and the incidence of childhood asthma (causal effect coefficient β=0.847, p<0.001).
[0298] The overall SO2 emissions from the park affect the incidence of COPD within a 5-kilometer radius downwind through atmospheric diffusion (with a lag effect of 3-7 days).
[0299] The PM2.5 emissions from multiple enterprises have a synergistic pathogenic effect, with the combined effect being 42.3% stronger than the individual effects.
[0300] Multipath causal effect analysis: The system identified 13 major causal pathways from primary pollution sources to respiratory diseases. Pathway 1 (contribution rate 28.7%): Fine chemical enterprises → VOCs emissions → Atmospheric diffusion → Respiratory exposure → Airway inflammation → Asthma; Pathway 2 (contribution rate 23.4%): Coal-fired power plants → SO2 emissions → Atmospheric transport → Respiratory exposure → Lung function impairment → COPD; Pathway 3 (contribution rate 18.9%): Multi-source PM2.5 → Particulate matter deposition → Alveolar damage → Oxidative stress → Pneumonia. The path effect quantification results show that the top three pathways contributed 71.0% of the total causal effect (28.7% + 23.4% + 18.9%), providing a clear target for precise intervention.
[0301] 1.4 Validation of Prediction Results
[0302] Performance evaluation of dynamic risk prediction (validated using data from the second half of 2022): Short-term forecast (1-7 days): Mean absolute percentage error (MAPE) = 8.3%, correlation coefficient R = 0.917; Medium-term forecast (1-4 weeks): MAPE = 14.7%, R = 0.863; Long-term forecast (1-3 months): MAPE = 22.5%, R = 0.781; Compared with traditional methods: Compared with linear regression models: prediction accuracy is improved by 34.6%; Compared with random forest models: prediction accuracy is improved by 28.1%; Compared with LSTM time series models: prediction accuracy is improved by 19.3%.
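For reference, the two evaluation metrics reported above follow their standard definitions, which can be computed as in the sketch below. The function names are illustrative, and MAPE is undefined when an observed value is zero.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observations and predictions."""
    return np.corrcoef(y_true, y_pred)[0, 1]
```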
[0303] 1.5 Practical Application Value
[0304] Based on the results of the system analysis, the local environmental protection department implemented precise pollution control measures: production restrictions or shutdowns were implemented at three identified key polluting enterprises, resulting in a 67.3% reduction in VOC emissions; the layout of the industrial park was optimized, and two chemical companies were relocated to downwind areas to reduce the impact on residential areas; an early warning mechanism based on system predictions was established to issue health protection recommendations 3-5 days in advance. Implementation effectiveness evaluation (6 months after intervention): new asthma cases in the region decreased by 41.2%; the hospitalization rate for acute exacerbations of COPD decreased by 38.7%; total medical expenses for respiratory diseases were reduced by approximately 18.47 million yuan; the comprehensive environmental health risk index decreased by 52.8%.
[0305] 1.6 Technological Value Demonstration
This application validates the system's core technological advantages:
- Breakthrough in causal identification technology: Successfully identifies complex causal chains that traditional methods cannot discover, providing a scientific basis for accurate source tracing.
- Multi-path decomposition capability: Accurately quantifies the relative contributions of different transmission paths, realizing the transformation from "extensive control" to "precise intervention".
- Predictive stability advantage: Maintains good predictive performance when environmental conditions change, avoiding the overfitting problem of traditional models.
- Decision support value: Provides interpretable causal mechanism analysis, making control measures more targeted and scientific.
[0306] Application Example 2: Dynamic Risk Prediction of Air Pollution and Cardiovascular Disease in a Major City
[0307] 2.1 Application Background and Data Description
[0308] This application case study focuses on the dynamic risk prediction of the impact of air pollution on cardiovascular diseases in a megacity, with a particular emphasis on the short-term health effects and long-term cumulative impacts of pollutants such as PM2.5 and O3. The study covers all 16 administrative districts of the city, with a total area of 16,400 square kilometers and a resident population of 21.54 million.
[0309] Data source and scale: Environmental monitoring data: 5 years of continuous monitoring data from 2018 to 2023 from 85 national-level air quality monitoring stations and 312 local monitoring points in the city, including 6 conventional pollutants and meteorological parameters, totaling 16.8 million records.
[0310] Health data: Outpatient, emergency and inpatient data from the cardiovascular departments of 127 secondary and above hospitals in the city, covering 12 diseases including acute myocardial infarction, arrhythmia and hypertension, with 1.426 million patients.
[0311] Population data: Population distribution, age structure, socioeconomic status, etc., by street (township), covering 333 streets (townships).
[0312] Supporting data: traffic flow, industrial activity intensity, building density, green coverage, etc.
[0313] 2.2 System Configuration and Modeling
[0314] Heterogeneous Spatiotemporal Graph Network Design
[0315] Spatial unit: Divided into 3km×3km grids, with a total of 1,823 grid units.
[0316] Node types: 2,156 pollution source nodes (stationary sources + mobile sources), 397 environmental monitoring points, 1,823 exposure assessment units, and 12 health endpoint nodes.
[0317] Spatiotemporal relationships: Consider the interaction within a 24-hour time window and a 50km spatial range.
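One way the 24-hour window and 50 km range could translate into graph edges is sketched below. The `Observation` record, the haversine distance, and the earlier-to-later edge direction are illustrative assumptions, not details given in the specification.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0

@dataclass(frozen=True)
class Observation:
    node_id: str
    lat: float    # degrees
    lon: float    # degrees
    hour: float   # hours since an arbitrary reference time

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def spatiotemporal_edges(observations, max_km=50.0, max_hours=24.0):
    """Directed edges (earlier -> later) between observations inside both limits."""
    edges = []
    for src in observations:
        for dst in observations:
            if src is dst:
                continue
            lag = dst.hour - src.hour
            if 0 <= lag <= max_hours and haversine_km(src.lat, src.lon, dst.lat, dst.lon) <= max_km:
                edges.append((src.node_id, dst.node_id))
    return edges

# Hypothetical observations: "c" is too far away, and "d" is too late to be linked from "a".
sample = [
    Observation("a", 39.90, 116.40, 0.0),
    Observation("b", 39.95, 116.45, 6.0),
    Observation("c", 41.00, 116.40, 6.0),
    Observation("d", 39.90, 116.40, 29.0),
]
print(spatiotemporal_edges(sample))
```

The pairwise loop is quadratic in the number of observations; a graph at the scale described here would more plausibly use a spatial index, which this sketch leaves out for brevity.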
[0318] Spatiotemporally Sensitive Graph Neural Network Configuration: Spatiotemporal attention mechanism: 8 attention heads, each focusing on a different spatiotemporal scale; Time coding: periodic coding (hour, day, week, month, season) + trend coding; Spatial coding: geographic distance coding + wind direction transmission coding + terrain influence coding.
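The periodic time coding listed above is often realized with sine and cosine pairs, so that the end of a cycle sits next to its start. The sketch below assumes that encoding and covers only three of the listed cycles (hour of day, day of week, month of year); the function name is hypothetical and the specification does not give the actual coding.

```python
import math
from datetime import datetime

def periodic_time_encoding(ts):
    """Return [sin, cos] pairs for hour of day, day of week, and month of year."""
    cycles = [
        (ts.hour + ts.minute / 60.0, 24.0),  # position within the day
        (float(ts.weekday()), 7.0),          # Monday = 0
        (float(ts.month - 1), 12.0),         # January = 0
    ]
    encoding = []
    for value, period in cycles:
        angle = 2.0 * math.pi * value / period
        encoding.extend([math.sin(angle), math.cos(angle)])
    return encoding

# 23:00 and 00:00 end up close together in the first (sin, cos) pair, unlike the raw hour values.
print(periodic_time_encoding(datetime(2018, 1, 1, 23, 0))[:2])
print(periodic_time_encoding(datetime(2018, 1, 2, 0, 0))[:2])
```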
[0319] 2.3 Causal Inference and Prediction Results
[0320] Causal Relationship Identification: The system identified the main causal pathways between air pollutants and cardiovascular diseases. The short-term effects of PM2.5 (lag of 0-3 days): PM2.5 → vascular endothelial damage → thrombosis → acute myocardial infarction. Causal effect: For every 10 μg / m³ increase in PM2.5, the risk of acute myocardial infarction increases by 6.8% (95% CI: 4.2%-9.5%).
[0321] The long-term effects of O3 (lag of 1-6 months) are: O3 → oxidative stress → arteriosclerosis → hypertension → heart disease. Causal effect: For every 10 μg / m³ increase in annual average O3 concentration, the prevalence of hypertension increases by 3.2% (95% CI: 2.1%-4.4%).
[0322] NO2 vascular effects pathway (lag of 1-7 days): NO2 → vasoconstriction → increased blood pressure → arrhythmia. Causal effect: For every 10 μg / m³ increase in NO2, the risk of arrhythmia increases by 4.1% (95% CI: 2.8%-5.6%).
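The effect estimates above are stated per 10 μg/m³ increment. A minimal sketch of scaling them to other increments follows, assuming a log-linear exposure-response relationship. That is a common epidemiological convention, but the specification does not state which functional form the system uses, and the function name is hypothetical.

```python
def relative_risk(pct_increase_per_10, delta_ug_m3):
    """Relative risk for a concentration change, assuming a log-linear response.

    pct_increase_per_10: percent risk increase per 10 ug/m3 (e.g. 6.8 for PM2.5).
    delta_ug_m3: concentration change in ug/m3.
    """
    return (1.0 + pct_increase_per_10 / 100.0) ** (delta_ug_m3 / 10.0)

# PM2.5 and acute myocardial infarction: point estimate and 95% CI bounds from the text.
for label, pct in [("lower", 4.2), ("point", 6.8), ("upper", 9.5)]:
    print(label, round(relative_risk(pct, 25.0), 3))
```

Under this assumption the confidence bounds scale the same way as the point estimate, which is why the loop applies one function to all three values.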
[0323] Dynamic Risk Prediction: Based on the causal inference results, the system establishes a multi-scale dynamic prediction model. Short-term forecast (1-7 days) performance: Acute myocardial infarction: prediction accuracy 89.3%, with an early warning 2-3 days in advance; Cardiac arrhythmia: prediction accuracy 85.7%, with an early warning 1-2 days in advance; Hypertensive emergencies: significantly improved accuracy, with an early warning 1 day in advance. Mid-term forecast (1-4 weeks): Cardiovascular disease hospitalization rate: MAPE = 12.4%, R = 0.876; Chronic heart disease outpatient visits: MAPE = 15.8%, R = 0.841.
[0324] 2.4 Risk Stratification and Spatial Distribution
[0325] The system generated a detailed spatial distribution map of cardiovascular disease risk across the city: High-risk areas (red, accounting for 12.3% of the city's area): mainly distributed in heavy industrial areas, transportation hubs and densely populated areas; the average annual incidence of cardiovascular diseases is 47.6% higher than the city's average; involving a population of approximately 2.58 million; Medium-risk areas (orange, accounting for 28.7% of the city's area): mainly distributed in the main urban area and suburban industrial areas; the average annual incidence rate is 18.2% higher than the city's average; involving a population of approximately 9.67 million.
[0326] Low-risk areas (green, accounting for 59.0% of the city's area): mainly distributed in the suburbs and areas with high green coverage; the average annual incidence rate is 23.4% lower than the city's average; involving a population of approximately 9.29 million.
[0327] 2.5 Application Effect of Early Warning System
[0328] After system deployment, a four-level risk early warning mechanism was established: Level I (Red) alert trigger conditions: daily average PM2.5 > 150 μg / m³ or daily maximum 8-hour O3 > 200 μg / m³; predicted increase in acute cardiovascular events > 30%; affected population > 1 million.
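The Level I trigger can be read as a simple rule, sketched below. The text separates the three conditions with semicolons without stating how they combine; this sketch assumes that all three must hold, with the two pollutant thresholds joined by "or" as written. The `DailyForecast` record and its field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DailyForecast:
    pm25_daily_mean: float           # ug/m3
    o3_max_8h: float                 # ug/m3, daily maximum 8-hour average
    predicted_event_increase: float  # percent increase in acute cardiovascular events
    affected_population: int

def is_level_one_alert(forecast):
    """Assumed reading: pollutant threshold AND event increase AND population size."""
    pollutant_exceeded = forecast.pm25_daily_mean > 150.0 or forecast.o3_max_8h > 200.0
    return (
        pollutant_exceeded
        and forecast.predicted_event_increase > 30.0
        and forecast.affected_population > 1_000_000
    )

print(is_level_one_alert(DailyForecast(160.0, 120.0, 35.0, 1_200_000)))
```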
[0329] Statistics on actual application effects (January-December 2023): A total of 187 warnings at all levels were issued, including 12 Level I warnings and 34 Level II warnings; the accuracy rate of warnings reached 88.7%, and the false alarm rate was controlled within 11.3%; through warning protection, it is estimated that about 2,847 cases of acute cardiovascular events were avoided; and medical expenses were saved by about 42.36 million yuan.
[0330] Application Example 3: Comprehensive health risk assessment of the synergistic effects of multiple pollution sources.
[0331] 3.1 Application Background and Challenges
[0332] This application case study focuses on a watershed area with combined industrial and agricultural pollution, presenting complex environmental health challenges due to the synergistic effects of multiple pollution sources, including industrial waste gas, agricultural non-point source pollution, and traffic exhaust. The study area encompasses two prefecture-level cities and seven counties, covering a total area of 8,247 square kilometers and a population of 4.63 million. The main challenges faced by this region include: diverse pollution sources (heavy industry, pesticide use, vehicle emissions, etc.); complex exposure pathways (multiple routes including inhalation, ingestion of agricultural products, and drinking water); and diverse health effects (involving respiratory, cardiovascular, nervous, and reproductive systems).
[0333] 3.2 Data Integration and Network Construction
[0334] Multi-source data integration: Industrial pollution sources: 78 key enterprises, involving emission data of 25 types of pollutants; Agricultural pollution sources: Data on the usage of 13 major pesticides in 1.42 million mu of farmland; Traffic pollution sources: 3,247 km road network, traffic flow and emission factor data; Environmental monitoring: monitoring data for media such as air, water, soil, and agricultural products, at 68 monitoring points; Health monitoring: 32 medical institutions, covering data on 15 types of environment-related diseases.
[0335] Complex Network Modeling: Network Scale: Total Number of Nodes: 2,847; Pollution Source Nodes: 1,456 (78 industrial sources, 1,142 agricultural sources, 236 transportation sources); Environmental Media Nodes: 468 (128 atmosphere, 89 water bodies, 187 soil, 64 biological sources); Exposure Pathway Nodes: 156 (52 respiratory, 78 oral, 26 skin); Health Endpoint Nodes: 767 (15 disease categories × age and gender grouping). Total Number of Edges: 18,364; Time Range: 2018-2023, 6 years of dynamic data.
[0336] 3.3 Multipath Causal Effect Analysis
[0337] Synergistic Effect Identification: The system identified 14 significant synergistic effect patterns among pollution sources. Synergistic combination 1: Industrial VOCs + pesticide residues → neurodevelopmental effects; Individual effects: Industrial VOCs risk index 0.34, pesticide residue risk index 0.28; Synergistic effect: Joint risk index 0.89 (synergistic factor 2.43); Main effects: attention deficit and delayed cognitive development in children.
[0338] Synergistic combination 2: Industrial heavy metals + traffic PM2.5 → cardiovascular damage; Individual effects: Heavy metal risk index 0.41, PM2.5 risk index 0.36; Synergistic effect: Joint risk index 1.12 (synergistic factor 2.73); Main impact: Increased incidence of hypertension and coronary heart disease in adults.
[0338] Synergistic combination 3: Agricultural nitrogen and phosphorus + industrial organic matter → eutrophication of water bodies → drinking water safety; Environmental effects: Eutrophication levels in water bodies increased by 156%; Health effects: The incidence of digestive system diseases increased by 23.7%.
Multipath effect decomposition: Taking the impact of industrial lead emissions on children's intellectual development as an example, the system identified five main transmission pathways:
Pathway A (direct atmospheric exposure, contribution rate 42.3%): Industrial emissions → Atmospheric lead → Respiratory exposure → Elevated blood lead → Neurodevelopmental effects;
Pathway B (soil accumulation exposure, contribution rate 28.7%): Industrial emissions → Atmospheric deposition → Soil accumulation → Exposure → Elevated blood lead → Neurodevelopmental effects;
Pathway C (food chain transmission, contribution rate 18.9%): Industrial emissions → Farmland pollution → Crop absorption → Food intake → Elevated blood lead → Neurodevelopmental effects;
Pathway D (water transport, contribution rate 7.8%): Industrial emissions → Surface runoff → Water pollution → Drinking water intake → Elevated blood lead → Neurodevelopmental effects;
Pathway E (mother-to-child transmission, contribution rate 2.3%): Pregnancy exposure → Placental transmission → Fetal exposure → Postnatal developmental impact.
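The five contribution rates above sum to 100%, so they can be ranked to find the smallest set of pathways that covers a chosen share of the total effect. The sketch below does this with the figures from the text; the function name and the 80% target are illustrative assumptions.

```python
# Contribution rates (percent) of pathways A-E as stated in the text.
PATHWAY_CONTRIBUTION = {"A": 42.3, "B": 28.7, "C": 18.9, "D": 7.8, "E": 2.3}

def pathways_covering(contributions, target_pct):
    """Take pathways in descending order of contribution until target_pct is covered."""
    chosen, covered = [], 0.0
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    for name, pct in ranked:
        if covered >= target_pct:
            break
        chosen.append(name)
        covered += pct
    return chosen, covered

chosen, covered = pathways_covering(PATHWAY_CONTRIBUTION, 80.0)
print(chosen, round(covered, 1))
```

With these figures, pathways A and B alone cover 71.0%, so a third pathway is needed to pass an 80% target.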
[0340] 3.4 Comprehensive Risk Prediction and Control
[0341] Scenario Analysis: The system simulated the health benefits of four pollution control scenarios: Scenario 1 (status quo): By 2030, the environmental health risk index will increase by 27.4%; the incidence of multisystem diseases will continue to rise.
[0342] Scenario 2 (50% reduction in industrial emissions): Industrial-related health risks decrease by 48.6%; however, the impacts from agriculture and transportation remain, and the overall risk decreases by only 23.1%.
[0343] Scenario 3 (30% reduction in agricultural emissions + 20% reduction in transportation emissions): Agriculture-related health risks are reduced by 35.2%; transportation-related health risks are reduced by 28.7%; however, industrial sources still have the dominant impact, with overall risk reduced by 18.9%.
[0343] Scenario 4 (Comprehensive emission reduction): 50% reduction in industrial emissions + 30% reduction in agricultural emissions + 20% reduction in transportation emissions; overall health risk reduced by 67.3%, the best result among the four scenarios, achieved through multi-source coordinated control.
Precise management and control strategies: Based on multi-path causal effect analysis, the system generates differentiated control strategies.
Precise control of industrial sources: Key enterprises: 12 key enterprises with a contribution rate of >5% have been identified; Key pollutants: VOCs, heavy metals, and particulate matter are the priority control targets; Key periods: strengthen control during the unfavorable diffusion conditions in spring and summer.
[0345] Precise management of agricultural sources: Key areas: farmland within 1 kilometer of drinking water sources; Key crops: planting areas of high-accumulation crops such as leafy vegetables and root vegetables; Key pesticides: persistent pesticides such as organophosphates and organochlorines.
[0346] Precise traffic source control: key road sections: roads within 500 meters of schools and hospitals; key time periods: morning and evening traffic peaks; key vehicle types: heavy-duty diesel vehicles and old gasoline vehicles.
[0347] 3.5 Implementation Results and Verification
[0348] Effectiveness of control measures (January-December 2023): Environmental quality improved: the annual average concentration of PM2.5 in the atmosphere decreased by 31.2% (from 42.6 μg / m³ to 29.3 μg / m³); the rate of heavy metal contamination in farmland soil decreased by 45.8%; and the compliance rate of drinking water source quality increased to 98.7%.
[0349] Health benefits achieved: the incidence of respiratory diseases decreased by 28.4%; the incidence of cardiovascular diseases decreased by 23.7%; the rate of excessive blood lead levels in children decreased by 52.3%; and the detection rate of neurodevelopmental abnormalities decreased by 34.6%.
[0350] Economic benefit assessment: Total investment: 1.73 billion yuan for environmental governance; Health benefits: 1.28 billion yuan in medical expenses avoided and 840 million yuan in reduced lost work time; Total benefit: 2.12 billion yuan, with a return on investment ratio of 1:2.23.
[0351] 3.6 The Innovative Value of the System
[0352] This application example demonstrates the core innovative value of the system: 1. Complex network modeling capability: Successfully handled a large-scale heterogeneous network containing 2,847 nodes and 18,364 edges, accurately depicting the complex interaction relationships of multi-source pollution; 2. Synergistic effect quantification capability: Accurately identifies and quantifies 14 synergistic effect patterns among pollution sources, with the synergistic factor calculation accuracy reaching 91.7%, providing a scientific basis for synergistic control; 3. Multi-path causal decomposition capability: Achieves accurate decomposition of 5-layer transmission paths, with a path contribution rate calculation error of <8%, which is significantly better than traditional single-path analysis methods; 4. Dynamic prediction capability: The accuracy rate of multi-scenario prediction reaches 86.3%, providing reliable support for long-term environmental health planning;
[0353] 5. Decision support capability: The generated precise control strategies achieved a 67.3% reduction in overall health risks, demonstrating the practical application value of the system.
[0354] The three application examples above demonstrate the effectiveness and superiority of the system of this invention in different environmental health scenarios. The system achieves accurate modeling of complex environmental health systems through spatiotemporally sensitive graph neural networks, accurate identification of causal relationships through deep learning causal inference, comprehensive assessment of health impacts through multi-path effect calculation, and proactive health management through dynamic risk prediction.
[0355] Compared with traditional methods, the system of this invention has significantly improved key indicators such as causal identification accuracy, prediction precision, and computational efficiency, providing important technical support for environmental health scientific research and management practices, and has broad application prospects and significant social value.
[0356] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.
Claims
1. A neural network-based environmental health causal inference and risk prediction system, characterized in that it comprises: a spatiotemporal heterogeneous data fusion module, used to perform multi-source heterogeneous fusion of pollution source data, environmental medium data, human exposure data, and health effect data to construct a unified spatiotemporal data representation; a graph neural network modeling engine, used to model the environmental health system as a heterogeneous spatiotemporal causal graph network, learn the representations of nodes and edges, and capture the complex relationships between pollution sources, environmental media, exposure pathways, and health endpoints; a deep learning causal inference module, used to identify causal structures based on the graph network representation, control confounding factors, estimate causal effects, and achieve spatiotemporally sensitive causal identification through the spatiotemporal conditional independence test mechanism ST-CIT; a multipath causal effect calculator, used to identify and quantify multiple causal paths from pollution sources to health endpoints, and to calculate the causal effect of each path and its contribution to the total effect; and a dynamic risk prediction module, used to perform multi-step health risk prediction based on the causal inference results and graph recurrent neural networks, and to generate early warning information.
2. The environmental health causal inference and risk prediction system based on neural networks according to claim 1, characterized in that the spatiotemporal heterogeneous data fusion module includes: a multi-source data preprocessor, which uses a deep learning-based spatiotemporal data registration algorithm to achieve spatiotemporal alignment of multi-source data; a spatiotemporal graph construction unit, used to construct heterogeneous spatiotemporal graph networks that include pollution sources, environmental media, exposure pathways, and health endpoints; and a data quality control mechanism, used to ensure the reliability of the fused data through spatiotemporal consistency checks and outlier detection.
3. The environmental health causal inference and risk prediction system based on neural networks according to claim 1, characterized in that the graph neural network modeling engine includes: a heterogeneous spatiotemporal graph construction unit, which uses a hierarchical graph representation method to model the various node and edge types in the environmental health system; and a spatiotemporal graph representation learning unit, used to learn node embeddings through the fusion of graph convolutional neural networks and long short-term memory networks; wherein the graph neural network modeling engine employs a multi-layer heterogeneous spatiotemporal graph causal embedding algorithm, including: a heterogeneous node encoding layer, used for feature encoding of different types of nodes; a causal relationship perception layer, used to maintain causal direction information; a spatiotemporal dynamic update layer, used to fuse dynamic information in time and space; a causal effect prediction layer, used to predict the strength of causal effects based on node embeddings; a multi-head spatiotemporal attention mechanism, used to calculate attention weights under spatiotemporal dependencies; and graph convolution computation units, used to process large-scale heterogeneous spatiotemporal graph data and extract causality-aware features.
4. The environmental health causal inference and risk prediction system based on neural networks according to claim 1, characterized in that the deep learning causal inference module includes: a causal structure learning network, which employs a combination of a variational autoencoder and a graph neural network to learn the causal graph structure; a counterfactual prediction unit, which uses a neural network to implement the do-operator and generate counterfactual prediction results; a confounding factor control algorithm, which automatically identifies backdoor paths and determines the minimum adjustment set based on the graph structure; a causal effect estimator, used to estimate the average causal effect, the conditional average causal effect, and the individual causal effect; and the spatiotemporal conditional independence test mechanism ST-CIT, which identifies causal relationships under spatiotemporal dependence by introducing a spatiotemporal weighting function to adjust the conditional independence test statistic.
5. The environmental health causal inference and risk prediction system based on neural networks according to claim 1, characterized in that the multipath causal effect calculator includes: a causal path identification unit, which uses a breadth-first search algorithm to identify all causal paths from the source node to the target node; a path weight calculation module, used to calculate path weights based on path importance, reliability, and strength of evidence; an effect propagation algorithm, used to calculate single-path causal effects; and a total effect aggregator, used to weight and aggregate the causal effects of all paths to obtain the total causal effect.
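The breadth-first path identification named in claim 5 can be sketched as follows. This is a minimal illustration that assumes the causal graph is given as an adjacency dictionary; the function name and the toy lead-exposure graph are hypothetical.

```python
from collections import deque

def enumerate_causal_paths(graph, source, target):
    """Return all simple directed paths from source to target, shortest first.

    graph maps each node to the list of its direct effects (outgoing edges).
    """
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        for successor in graph.get(node, []):
            if successor not in path:  # never revisit a node, so cycles cannot loop forever
                queue.append(path + [successor])
    return paths

# Hypothetical toy graph: emissions reach blood lead by inhalation or via soil.
toy_graph = {
    "emissions": ["air_lead", "soil_lead"],
    "air_lead": ["inhalation", "soil_lead"],
    "soil_lead": ["ingestion"],
    "inhalation": ["blood_lead"],
    "ingestion": ["blood_lead"],
}
for path in enumerate_causal_paths(toy_graph, "emissions", "blood_lead"):
    print(" -> ".join(path))
```

The number of simple paths can grow exponentially with graph size, so a graph with thousands of nodes would likely need a path-length limit or pruning, which this sketch omits.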
6. The environmental health causal inference and risk prediction system based on neural networks according to claim 4, characterized in that the dynamic risk prediction module includes: a temporal causal modeling network, used to construct temporal dynamic models that take causal relationships into account; a risk prediction network based on a graph recurrent neural network, used to achieve multi-step health risk prediction; an uncertainty quantifier, used to estimate the uncertainty of prediction results using Bayesian deep learning methods; and an early warning and decision-making system, used to generate multi-level risk warnings and decision-making suggestions based on the prediction results.
7. The environmental health causal inference and risk prediction system based on neural networks according to claim 3, characterized in that the multi-head spatiotemporal attention mechanism calculates attention weights using a formula defined in terms of the node representation vector, the temporal difference coding function Φ, the spatial distance coding function Ψ, the weight matrix W, and the attention vector a.
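The claim names the ingredients of the attention weight (node representation vectors, Φ, Ψ, W, and a), but its exact formula is not available in this text. The sketch below therefore assumes a graph-attention-style form: the score for a neighbour is LeakyReLU(a · [W h_i ‖ W h_j ‖ Φ(Δt) ‖ Ψ(d)]), normalized with a softmax over neighbours. That form, the exponential-decay choices for `phi` and `psi`, and all names are assumptions for illustration only, and a single head is shown.

```python
import math

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def phi(dt_hours, scale=24.0):
    """Assumed temporal difference coding: exponential decay with the time lag."""
    return math.exp(-abs(dt_hours) / scale)

def psi(dist_km, scale=50.0):
    """Assumed spatial distance coding: exponential decay with distance."""
    return math.exp(-dist_km / scale)

def attention_weights(h_i, neighbors, W, a):
    """neighbors: list of (h_j, dt_hours, dist_km). Returns weights that sum to 1."""
    wh_i = matvec(W, h_i)
    scores = []
    for h_j, dt_hours, dist_km in neighbors:
        features = wh_i + matvec(W, h_j) + [phi(dt_hours), psi(dist_km)]
        if len(features) != len(a):
            raise ValueError("attention vector length must match the feature length")
        scores.append(leaky_relu(sum(ak * f for ak, f in zip(a, features))))
    top = max(scores)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical inputs: two neighbours with identical features, one nearer in space and time.
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
weights = attention_weights([1.0, 0.0], [([1.0, 0.0], 1.0, 5.0), ([1.0, 0.0], 20.0, 45.0)], W, a)
print(weights)
```

A multi-head version would repeat this computation with a separate `W` and `a` per head and then concatenate or average the head outputs.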
8. The environmental health causal inference and risk prediction system based on neural networks according to claim 1, characterized in that the multipath causal effect calculator uses the following formula to calculate the total causal effect:
$$TE(S \to T) = \sum_{p \in P(S \to T)} w_p \cdot E_p$$
where TE(S→T) is the total effect, P(S→T) is the set of all causal paths from S to T, w_p is the path weight, and E_p is the single-path effect.
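Claim 5 describes the total effect aggregator as weighting and aggregating the causal effects of all paths. A minimal sketch of that aggregation follows, assuming a plain weighted sum over paths; the function names and the example weights and effects are hypothetical.

```python
def total_causal_effect(paths):
    """paths: list of (path_weight, single_path_effect). Returns the weighted sum."""
    return sum(weight * effect for weight, effect in paths)

def contribution_rates(paths):
    """Percentage share of each weighted path effect in the total effect."""
    total = total_causal_effect(paths)
    if total == 0:
        raise ValueError("total effect is zero; contribution rates are undefined")
    return [100.0 * weight * effect / total for weight, effect in paths]

# Hypothetical weights and single-path effects for three causal paths.
example_paths = [(1.0, 0.30), (0.8, 0.25), (0.5, 0.20)]
print(round(total_causal_effect(example_paths), 3))
print([round(rate, 1) for rate in contribution_rates(example_paths)])
```

Contribution rates defined this way always sum to 100%, which matches how the application examples report pathway shares.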
9. The environmental health causal inference and risk prediction system based on neural networks according to claim 1, characterized in that, The spatiotemporal conditional independence test mechanism (ST-CIT) is implemented through the following steps: Construct the spatiotemporal distance matrix; Design a spatiotemporal weighting function; Calculate the weighted conditional independence test statistic; Adjust the significance level to control for biases caused by spatiotemporal correlation.
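The four steps in claim 9 can be illustrated with the sketch below, which is one possible reading rather than the claimed ST-CIT procedure. It assumes planar coordinates in kilometres, a Gaussian kernel on scaled spatiotemporal distance, a weighted partial correlation with a single conditioning variable, and a Fisher z test. The fourth step is approximated by replacing the sample size with an effective sample size instead of changing the significance level directly. All names and scale parameters are hypothetical.

```python
import math

def st_distance_matrix(coords, space_scale_km=50.0, time_scale_h=24.0):
    """Step 1. coords: list of (x_km, y_km, t_hours). Returns scaled pairwise distances."""
    return [[math.hypot(math.hypot(xi - xj, yi - yj) / space_scale_km,
                        abs(ti - tj) / time_scale_h)
             for (xj, yj, tj) in coords]
            for (xi, yi, ti) in coords]

def st_weights(dist):
    """Step 2. Gaussian kernel: observations with many close neighbours get less weight."""
    return [1.0 / sum(math.exp(-0.5 * d * d) for d in row) for row in dist]

def weighted_corr(x, y, w):
    """Weighted Pearson correlation."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)

def st_cit(x, y, z, coords, alpha=0.05):
    """Test X independent of Y given Z. Returns (partial_corr, p_value, dependent)."""
    w = st_weights(st_distance_matrix(coords))
    n_eff = sum(w)  # effective sample size: between 1 (all clustered) and n (all isolated)
    rxy, rxz, ryz = weighted_corr(x, y, w), weighted_corr(x, z, w), weighted_corr(y, z, w)
    # Step 3: weighted partial correlation of X and Y given Z.
    r = (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
    r = max(-1 + 1e-12, min(1 - 1e-12, r))
    # Step 4 (approximation): Fisher z statistic with n_eff in place of n.
    dof = n_eff - 1 - 3  # one conditioning variable
    if dof <= 0:
        return r, 1.0, False  # too little independent information to reject
    stat = math.atanh(r) * math.sqrt(dof)
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return r, p_value, p_value < alpha
```

When observations are tightly clustered in space and time, the effective sample size shrinks and the test becomes more conservative, which is the intended guard against spurious dependence caused by spatiotemporal correlation.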
10. A method for causal inference and risk prediction of environmental health based on spatiotemporally sensitive graph neural networks, using the neural network-based environmental health causal inference and risk prediction system as described in any one of claims 1-9, characterized in that it includes the following steps: Step S1: Construct a heterogeneous spatiotemporal graph representation of the environmental health system, modeling pollution sources, environmental media, exposure pathways, and health endpoints as graph nodes, and modeling the relationships between them as graph edges; Step S2: Use a graph neural network to learn the representation vectors of nodes and edges, and capture spatiotemporal dependencies through a multi-head spatiotemporal attention mechanism; Step S3: Based on deep learning causal inference algorithms, identify causal structures from observational data and estimate causal effects; Step S4: Calculate the multipath causal effects, and quantify the indirect causal effects and the total causal effect; Step S5: Based on the causal inference results, predict future environmental health risks and generate dynamic early warning information.