Marine multi-scale environment situation prediction method based on graph neural network

By combining the physical prior bias terms of arrival probability, arrival time, and downstream consistency coefficient with the Lagrange drift integral and graph Transformer network, the problem of insufficient spatial relationship modeling in marine environmental situation prediction in existing technologies is solved, and the stability and accuracy of multi-scale situation prediction are improved.

CN122242871A · Pending · Publication Date: 2026-06-19 · LUDONG UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
LUDONG UNIVERSITY
Filing Date
2026-04-22
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing marine environmental situation prediction methods struggle to reflect the time-varying correlation between flow and wind fields when constructing spatial relationships. Attention message passing lacks physical constraints, resulting in insufficient prediction stability and accuracy. Furthermore, inconsistencies in cross-scale relationships during multi-scale modeling affect prediction performance.

Method used

The reachability domain is generated by using Lagrange drift integral, and the arrival probability and arrival time between observation nodes are calculated. A fine-scale directed graph is constructed and the arrival probability, arrival time and downstream consistency coefficient are introduced as physical prior bias terms. Message passing is carried out through a graph Transformer network, and a cross-scale homogeneous dynamic graph is constructed through hierarchical graph pooling to improve the prediction stability and accuracy.

Benefits of technology

Dynamically updating node associations and constraining the direction of information propagation improves the stability and accuracy of multi-scale situation prediction and enhances the generalization capability under complex sea conditions.

Generated by Eureka AI based on patent content.

Figure CN122242871A_ABST

Abstract

This invention discloses a method for predicting marine multi-scale environmental situations based on graph neural networks. To address the problem of marine environmental relationships changing over time and the fragility of static relationship models under dynamic flow and wind fields, leading to the propagation of erroneous information, this invention acquires marine environmental observation data and geographically constrained data from observation nodes and constructs node feature sequences. Based on Lagrange drift integrals and combined with initial perturbation conditions, it generates reachable domains, calculates arrival probabilities and arrival times between nodes to construct a fine-scale time-varying directed graph and edge features, and generates a directed causal mask. The fine-scale graph is input into a graph Transformer network, and a physical prior bias term consisting of arrival probabilities and arrival times is introduced into the attention score, with message passing performed under the constraints of the causal mask. Hierarchical graph pooling is further performed to obtain a coarse-scale graph, and coarse-scale edge weights are obtained by pooling the fine-scale drift edges to form a cross-scale homogeneous dynamic graph. This completes the fusion of coarse and fine-scale representations and outputs the prediction results, achieving physically interpretable modeling of time-varying marine environmental relationships and improving the stability and accuracy of multi-scale situation prediction.

Description

Technical Field

[0001] This invention relates to the field of marine environmental situation prediction, and in particular to a method for marine multi-scale environmental situation prediction based on graph neural networks.

Background Technology

[0002] Marine environmental situation prediction focuses on current velocity, current direction, wind speed, wind direction, and other environmental factors, providing predictive information for scenarios such as shipping safety, marine engineering operation and management, maritime search and rescue, and oil spill response. Existing technologies employ two main approaches: one primarily uses marine numerical models, establishing coupled calculations of ocean dynamics and atmospheric forcing to predict the spatiotemporal distribution of target sea areas. The other approach is data-driven, utilizing multi-source data such as buoy data, shore-based data, remote sensing data, and reanalysis data to construct predictive models. With the development of deep learning technology, temporal neural networks and spatiotemporal convolutional networks are used to characterize temporal evolution and local spatial correlations. Furthermore, graph neural networks abstract observation nodes as graph nodes and relationships between nodes as graph edges to model the spatial dependencies of irregular observation networks. Graph attention networks and graph Transformers can adaptively weight and aggregate neighborhood information. Multi-scale structures such as hierarchical graph pooling are also used to extract situation features at different spatial scales and improve predictive performance.

[0003] Current technologies still have the following shortcomings in marine environmental prediction:

[0004] 1. Many graph-based prediction methods construct spatial relationships from geographical proximity, correlation coefficients, or fixed adjacency matrices, which cannot readily reflect connectivity that changes over time under dynamic flow and wind fields. Static relationship models are therefore prone to failure under conditions such as strong flow shear and typhoons.

[0005] 2. Existing attention or message passing mechanisms are mostly based on data similarity and lack mechanisms to introduce physically interpretable quantities such as arrival probability, arrival time, and downstream consistency into attention scoring and propagation constraints. This can easily lead to the spread of erroneous information that does not conform to the physical propagation direction, affecting the stability and reliability of predictions.

[0006] 3. A common practice in multi-scale modeling is to use different relationship construction strategies for fine-scale and coarse-scale. Cross-scale edge relationships lack a consistent source, making it difficult to simultaneously represent time-varying correlation structures at different scales, thus limiting further improvement in the accuracy of multi-scale situation prediction.

[0007] Therefore, a method for predicting the marine environmental situation that can overcome the shortcomings of the existing technology is a problem that needs to be solved by those skilled in the art.

Summary of the Invention

[0008] One objective of this invention is to propose a multi-scale marine environmental situation prediction method based on graph neural networks. Existing technologies rely on geographic nearest neighbors or fixed adjacency matrices to construct spatial relationships, which makes it difficult to characterize time-varying relationships driven by flow and wind fields. Their attention message passing lacks physical constraints, which leads to erroneous propagation paths, and the relationship sources of their multi-scale graph structures are inconsistent, which affects prediction stability and accuracy. To address these shortcomings, this invention generates reachable domains from Lagrange drift integrals combined with perturbed initial conditions. It calculates arrival probabilities and arrival times between observation nodes to construct a fine-scale time-varying directed graph and edge features, and generates a directed causal mask. The fine-scale graph is input into a graph Transformer network, and a physical prior bias term consisting of arrival probability, arrival time, downstream consistency coefficient, and path cost is introduced into the attention score. Message passing is performed under the constraint of the directed causal mask. Furthermore, hierarchical graph pooling is used to obtain coarse-scale nodes and allocation matrices, and coarse-scale edge weights are formed by aggregating fine-scale drift edges to construct a cross-scale homogeneous dynamic graph. Finally, the coarse-scale and fine-scale representations are fused to output the prediction result. The technical effects of the invention are that node associations are updated dynamically and the direction of information propagation is constrained in a physically interpretable manner, which improves the stability and accuracy of multi-scale situation prediction and enhances generalization under complex sea conditions.

[0009] This invention provides a method for predicting marine multi-scale environmental situations based on graph neural networks, comprising:

[0010] S1. Acquire marine environmental observation data and geographic constraint data from multiple observation nodes in the target sea area, and perform time alignment and coordinate unification;

S2. Construct a node set based on the geographic location of each observation node, and map the observation data to the node set according to time to form a node feature sequence;

S3. For each source node in the node set, perform Lagrange drift integration based on the flow field data and wind field data in the node feature sequence within a preset prediction time window; generate multiple sets of initial disturbance conditions for the source node according to a preset disturbance distribution and perform Lagrange drift integration for each to form a reachable domain; calculate the arrival probability and arrival time of each target node relative to the source node; generate a fine-scale directed edge set and its edge weights; perform geographic constraint processing on the fine-scale directed edge set; generate a fine-scale edge feature set based on the arrival probability and arrival time; and generate a fine-scale directed causal mask based on the fine-scale directed edge set;

S4. Input the fine-scale graph into a graph Transformer network, calculate the physical prior bias term based on the fine-scale edge features and add it to the attention scores, and constrain the attention calculation using the fine-scale directed causal mask to obtain a fine-scale node representation sequence;

S5. Perform hierarchical graph pooling on the fine-scale node representation sequence to obtain a coarse-scale node set and a fine-scale to coarse-scale node allocation matrix; aggregate the coarse-scale node feature sequence based on the node allocation matrix; and, based on the node allocation matrix, aggregate the fine-scale directed edges and their weights into coarse-scale directed edges and their weights to generate a coarse-scale directed causal mask;

S6. Input the coarse-scale graph into a graph Transformer network, calculate the physical prior bias attention based on the coarse-scale edge weights in the same way as in S4, and perform message passing under the constraints of the coarse-scale directed causal mask to obtain a coarse-scale node representation sequence;

S7. Map the coarse-scale node representation sequence to the fine-scale nodes according to the node allocation matrix, fuse them with the fine-scale node representation sequence, and output the marine environmental situation prediction results corresponding to each observation node under the preset prediction lead.

[0011] Optionally, S1 includes:

[0012] The marine environmental observation data of multiple observation nodes within the target sea area within a preset time period are acquired. The marine environmental observation data includes surface or layer current velocity data, current direction data, wind speed data, and wind direction data. The marine environmental observation data is then time-aligned and coordinate-unified to form the observation dataset.

[0013] Simultaneously, the geographic constraint data corresponding to the target sea area is obtained. The geographic constraint data includes shoreline data used to characterize the land-sea boundary and water depth data used to characterize the water depth distribution. The geographic constraint data is then coordinate-unified to form the geographic constraint dataset.

[0014] Optionally, S2 includes:

[0015] Each observation node is assigned a unique identifier based on its geographical location and its longitude, latitude, and water depth information are recorded to construct the node set.

[0016] The time series is determined according to the time alignment result of the observation dataset. At each moment of the time series, the flow field data and wind field data corresponding to each observation node in the observation dataset are mapped to the node feature vector of that observation node. The node feature vectors at each moment are arranged in chronological order to form the node feature sequence. The flow velocity, flow direction, wind speed, and wind direction in the flow field data and wind field data are converted into component representations in the same coordinate system and numerically normalized.

[0017] In the presence of missing data, interpolation is used to complete the data or missing data markers are set in the node feature vectors to maintain the consistency of the time length of the node feature sequence.

[0018] Optionally, S3 includes:

[0019] For each source node in the node set, the drift velocity is determined using the flow field data and wind field data in the node feature sequence within a preset prediction time window, and the drift velocity is numerically integrated with a preset time step to obtain the drift trajectory.

[0020] Multiple sets of initial disturbance conditions are generated at the initial position and initial velocity of the source node according to the preset disturbance distribution, and the numerical integration is performed on each set of initial disturbance conditions to form the reachable domain with the set of drift trajectories.

[0021] For any target node, the arrival probability is determined based on the proportion of drift trajectories that enter the preset neighborhood radius centered on the target node in the reachable domain, and the moment when the drift trajectory first enters the preset neighborhood radius is determined as the arrival time;

[0022] The downstream consistency coefficient is calculated based on the angle between the velocity direction of the drift trajectory at the arrival time and the displacement direction from the source node to the target node.

[0023] When the arrival probability is not less than a preset probability threshold and the arrival time is not greater than a preset time threshold, a fine-scale directed edge is generated from the source node to the target node.

[0024] Based on the shoreline data, fine-scale directed edges whose drift trajectories intersect the shoreline are removed, and the path cost of the remaining fine-scale directed edges is calculated based on the water depth data along the drift trajectory.

[0025] Calculate fine-scale edge weights based on the arrival probability, arrival time, downstream consistency coefficient, and path cost, and construct the fine-scale edge feature set using the fine-scale edge weights, arrival time, and downstream consistency coefficient;

[0026] The fine-scale directed causal mask is generated based on the fine-scale directed edges.
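The edge construction described for S3 can be illustrated with a minimal sketch. It assumes trajectories are already available as lists of planar (x, y) positions in meters sampled at a fixed step. The function names `arrival_stats` and `has_edge`, the averaging of first-entry times over arriving trajectories, and the use of the mean cosine as the downstream consistency coefficient are illustrative assumptions; the source only states that the coefficient is based on the angle between the arrival velocity and the source-to-target displacement.

```python
import math

def arrival_stats(trajectories, dt, source, target, radius):
    """Return (arrival probability, mean first-arrival time, mean cosine).

    trajectories: list of tracks, each a list of (x, y) positions in meters
    sampled every dt seconds. Averaging over arriving tracks is an assumption.
    """
    dx, dy = target[0] - source[0], target[1] - source[1]
    disp_norm = math.hypot(dx, dy)
    times, cosines = [], []
    for track in trajectories:
        for k in range(1, len(track)):
            x, y = track[k]
            if math.hypot(x - target[0], y - target[1]) <= radius:
                # Velocity at arrival, estimated from the last displacement.
                vx = (x - track[k - 1][0]) / dt
                vy = (y - track[k - 1][1]) / dt
                speed = math.hypot(vx, vy)
                if speed > 0 and disp_norm > 0:
                    cosines.append((vx * dx + vy * dy) / (speed * disp_norm))
                times.append(k * dt)
                break  # only the first entry into the neighborhood counts
    prob = len(times) / len(trajectories)
    mean_time = sum(times) / len(times) if times else None
    consistency = sum(cosines) / len(cosines) if cosines else None
    return prob, mean_time, consistency

def has_edge(prob, arrival_time, p_min, t_max):
    """Edge rule from the text: probability >= threshold and time <= threshold."""
    return arrival_time is not None and prob >= p_min and arrival_time <= t_max

tracks = [
    [(0.0, 0.0), (50.0, 0.0), (100.0, 0.0)],   # reaches the target
    [(0.0, 0.0), (0.0, 50.0), (0.0, 100.0)],   # drifts away from it
]
p, t, c = arrival_stats(tracks, 10.0, (0.0, 0.0), (100.0, 0.0), 10.0)
print(p, t, c, has_edge(p, t, 0.3, 60.0))
```

Shoreline filtering and path cost are omitted here because they depend on the geographic constraint dataset.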

[0027] Optionally, S4 includes:

[0028] In the graph Transformer network, for each time step of the node feature sequence, a corresponding query vector, key vector, and value vector are generated based on the node feature vector at that time step.

[0029] For each fine-scale directed edge in the set of fine-scale directed edges, a physical prior bias term is calculated based on the fine-scale edge feature vector corresponding to that fine-scale directed edge.

[0030] For any source node and its target node connected by a fine-scale directed edge, an attention score is calculated based on the query vector and the key vector, and the physical prior bias term is added to the attention score to obtain physical prior bias attention.

[0031] The physical prior bias attention is constrained by the fine-scale directed causal mask, so that node pairs not defined in the fine-scale directed edge set do not participate in attention calculation.

[0032] The value vector is weighted and aggregated based on the constrained physical prior bias attention to obtain the fine-scale node representation at that moment, and the fine-scale node representations at each moment are arranged in chronological order to form a fine-scale node representation sequence.
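The attention step in S4 can be sketched for a single time step as follows. This is a plain-Python illustration under stated assumptions, not the patented network: the query, key, and value vectors are passed in directly instead of being produced by learned projections, scaled dot-product scoring is assumed, `bias[i][j]` and `mask[i][j]` are indexed as "target node i receives from source node j", and a node with no incoming edge simply keeps its own value vector.

```python
import math

def masked_biased_attention(q, k, v, bias, mask):
    """Aggregate values with scores q.k/sqrt(d) + bias, restricted by mask.

    q, k, v: lists of per-node vectors. bias[i][j] and mask[i][j] refer to
    target node i attending to source node j (directed edge j -> i).
    """
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        allowed = [j for j in range(n) if mask[i][j]]
        if not allowed:
            out.append(list(v[i]))  # assumption: isolated node keeps its value
            continue
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d) + bias[i][j]
            for j in allowed
        ]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        z = sum(exps)
        out.append([
            sum(e / z * v[j][c] for e, j in zip(exps, allowed))
            for c in range(len(v[0]))
        ])
    return out

q = k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
mask = [[False, True, False], [False, False, False], [True, True, False]]
bias = [[0.0] * 3 for _ in range(3)]
print(masked_biased_attention(q, k, v, bias, mask))
```

Because masked pairs are excluded before the softmax, they receive exactly zero weight, which matches the requirement that node pairs outside the directed edge set do not participate in the attention calculation.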

[0033] Optionally, S5 includes:

[0034] Calculate the pooling score for each fine-scale node in the fine-scale node representation sequence, and determine the number of coarse-scale nodes according to the preset pooling ratio.

[0035] A node allocation matrix from fine-scale nodes to coarse-scale nodes is generated based on the pooling score, and a set of coarse-scale nodes is determined based on the node allocation matrix.

[0036] At each time step of the fine-scale node representation sequence, the fine-scale node representation at that time step is weighted and aggregated according to the node allocation matrix to obtain the coarse-scale node features at that time step, and the coarse-scale node features at each time step are arranged in chronological order to form a coarse-scale node feature sequence.

[0037] For each fine-scale directed edge in the set of fine-scale directed edges, the source node corresponding to the coarse-scale node and the target node corresponding to the coarse-scale node are determined according to the node allocation matrix, thereby generating a set of coarse-scale directed edges.

[0038] The fine-scale edge weights with the same source coarse-scale node and the same target coarse-scale node are aggregated to obtain the corresponding coarse-scale edge weights and form a coarse-scale edge weight set.

[0039] A coarse-scale directed causal mask is generated based on the coarse-scale directed edge set.
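The aggregation in S5 can be sketched with an explicit node allocation matrix. The example assumes the allocation matrix is already given (the pooling-score computation is not shown), uses the common form "coarse features = allocation-weighted sum of fine features" and "coarse weights = allocation-weighted sum of fine weights", and drops coarse self-loops from the mask. The names `pool_graph` and `causal_mask` and the self-loop choice are assumptions for illustration.

```python
def pool_graph(x, adj, assign):
    """Aggregate fine-scale features and edge weights to the coarse scale.

    x: N x F fine node features; adj[s][t]: weight of fine edge s -> t;
    assign: N x M allocation weights from fine nodes to coarse nodes.
    """
    n, m, f = len(x), len(assign[0]), len(x[0])
    coarse_x = [
        [sum(assign[i][a] * x[i][c] for i in range(n)) for c in range(f)]
        for a in range(m)
    ]
    coarse_adj = [
        [
            sum(assign[s][a] * adj[s][t] * assign[t][b]
                for s in range(n) for t in range(n))
            for b in range(m)
        ]
        for a in range(m)
    ]
    return coarse_x, coarse_adj

def causal_mask(weights, drop_self_loops=True):
    """True where a directed coarse edge a -> b exists (positive weight)."""
    m = len(weights)
    return [
        [weights[a][b] > 0 and not (drop_self_loops and a == b) for b in range(m)]
        for a in range(m)
    ]

x = [[1.0], [3.0], [5.0]]
adj = [[0.0, 0.4, 0.2], [0.0, 0.0, 0.6], [0.0, 0.0, 0.0]]
assign = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # nodes 0 and 1 merge
coarse_x, coarse_adj = pool_graph(x, adj, assign)
print(coarse_x, coarse_adj, causal_mask(coarse_adj))
```

Since the coarse weights are derived only from the fine-scale drift edges, the two scales share the same physical source for their relationships.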

[0040] Optionally, S6 includes:

[0041] In the graph Transformer network, for each time step of the coarse-scale node feature sequence, a corresponding query vector, key vector, and value vector are generated based on the coarse-scale node features at that time step.

[0042] For each coarse-scale directed edge in the set of coarse-scale directed edges, calculate the physical prior bias term based on the coarse-scale edge weight corresponding to the coarse-scale directed edge.

[0043] For any source coarse-scale node and its target coarse-scale node connected by a coarse-scale directed edge, an attention score is calculated based on the query vector and the key vector, and the physical prior bias term is added to the attention score to obtain physical prior bias attention.

[0044] The physical prior bias attention is constrained by the coarse-scale directed causal mask, so that coarse-scale node pairs not defined in the coarse-scale directed edge set do not participate in attention calculation.

[0045] The value vector is weighted and aggregated based on the constrained physical prior bias attention to obtain the coarse-scale node representation at that moment, and the coarse-scale node representations at each moment are arranged in chronological order to form a coarse-scale node representation sequence.

[0046] Optionally, S7 includes:

[0047] In the node allocation matrix, the coarse-scale node corresponding to each fine-scale node is determined according to the allocation weight of each fine-scale node, and the coarse-scale node representation sequence is upsampled according to the node allocation matrix to form a mapped coarse-scale representation sequence at the granularity of the node set.

[0048] The mapped coarse-scale representation sequence and the fine-scale node representation sequence are fused according to a preset fusion rule to obtain a fused representation sequence, wherein the preset fusion rule is one of weighted summation, linear transformation after concatenation, or gated fusion.

[0049] The fused representation sequence is regressed or classified using a prediction head network to obtain the marine environmental situation prediction results for each observation node under the preset prediction lead.

[0050] The coarse-scale node representation sequence is then regressed or classified using a coarse-scale prediction head network to obtain coarse-scale marine environmental situation prediction results corresponding to the coarse-scale node set.
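The upsampling and fusion in S7 can be sketched for one time step as follows. Gated fusion is shown as one of the three listed fusion rules. The scalar sigmoid gate, the parameter names `gate_weights` and `gate_bias`, and the fixed example values are assumptions; in a trained model the gate parameters would be learned.

```python
import math

def upsample(coarse_repr, assign):
    """Map M x D coarse representations back to N fine nodes via N x M weights."""
    d = len(coarse_repr[0])
    return [
        [sum(row[a] * coarse_repr[a][c] for a in range(len(row))) for c in range(d)]
        for row in assign
    ]

def gated_fusion(fine, mapped, gate_weights, gate_bias=0.0):
    """Per-node gate g = sigmoid(w . [fine; mapped] + b); output g*fine + (1-g)*mapped."""
    fused = []
    for f_vec, m_vec in zip(fine, mapped):
        z = sum(w * x for w, x in zip(gate_weights, f_vec + m_vec)) + gate_bias
        g = 1.0 / (1.0 + math.exp(-z))
        fused.append([g * a + (1.0 - g) * b for a, b in zip(f_vec, m_vec)])
    return fused

assign = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
coarse = [[1.0, 2.0], [3.0, 4.0]]
fine = [[3.0, 0.0], [1.0, 2.0], [5.0, 6.0]]
mapped = upsample(coarse, assign)
print(gated_fusion(fine, mapped, gate_weights=[0.0, 0.0, 0.0, 0.0]))
```

With all gate weights set to zero the gate equals 0.5, so the fusion reduces to an equal-weight sum, which is a special case of the weighted summation rule.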

[0051] Optionally, the physical prior bias term is a function of arrival probability, arrival time, downstream consistency coefficient, and path cost, and satisfies the following: the physical prior bias term increases when the arrival probability increases, the physical prior bias term decreases when the arrival time increases, and the physical prior bias term decreases when the path cost increases.

[0052] The physical prior bias term includes a term that takes the logarithm of the arrival probability and / or a term that exponentially decays the arrival time, and the physical prior bias term is truncated or normalized to limit its numerical range.
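One function that satisfies the stated monotonicity conditions is sketched below. The additive form, the decay constant `tau`, the weights `w_cons` and `w_cost`, the stabilizer `eps`, and the clipping bound `clip` are all assumptions chosen for illustration; the source specifies only the logarithm term, the exponential decay term, the monotonic behavior, and that the result is truncated or normalized.

```python
import math

def physical_prior_bias(prob, arrival_time, consistency, cost,
                        tau=21600.0, w_cons=1.0, w_cost=1.0,
                        eps=1e-6, clip=5.0):
    """Illustrative bias: log(p) + exp(-t/tau) + w_cons*c - w_cost*cost, clipped.

    Increases with arrival probability, decreases with arrival time (seconds)
    and with path cost. All constants are hypothetical defaults.
    """
    raw = (math.log(prob + eps)
           + math.exp(-arrival_time / tau)
           + w_cons * consistency
           - w_cost * cost)
    return max(-clip, min(clip, raw))

print(physical_prior_bias(0.9, 3600.0, 1.0, 0.0))
print(physical_prior_bias(0.5, 7200.0, 0.5, 0.2))
```

Clipping keeps the bias from dominating the learned attention score when the arrival probability is very small.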

[0053] Optionally, when performing aggregation calculations on fine-scale edge weights with the same source coarse-scale nodes and the same target coarse-scale nodes, a weighted summation or weighted mean with arrival probabilities as weights is used.

[0054] Furthermore, the minimum or weighted average of the fine-scale arrival time is used as part of the coarse-scale edge features to generate the coarse-scale directed causal mask.
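The probability-weighted aggregation can be sketched with a hard assignment of fine nodes to coarse nodes. The dictionary layout of an edge, the name `aggregate_edges`, and the choice to skip edges that fall inside a single coarse node are assumptions for illustration. The weighted mean and the minimum arrival time follow the two options named in the text.

```python
from collections import defaultdict

def aggregate_edges(fine_edges, cluster_of):
    """Merge fine edges that share the same (source, target) coarse pair.

    fine_edges: dicts with keys src, dst, weight, prob, time.
    cluster_of: fine node id -> coarse node id (hard assignment, assumed).
    Edge probabilities are assumed positive because edges pass a threshold.
    """
    groups = defaultdict(list)
    for edge in fine_edges:
        a, b = cluster_of[edge["src"]], cluster_of[edge["dst"]]
        if a != b:  # assumption: intra-cluster edges do not form coarse edges
            groups[(a, b)].append(edge)
    coarse = {}
    for pair, edges in groups.items():
        total_prob = sum(e["prob"] for e in edges)
        coarse[pair] = {
            "weight": sum(e["prob"] * e["weight"] for e in edges) / total_prob,
            "time": min(e["time"] for e in edges),
        }
    return coarse

edges = [
    {"src": 0, "dst": 2, "weight": 0.8, "prob": 0.75, "time": 3600.0},
    {"src": 1, "dst": 2, "weight": 0.4, "prob": 0.25, "time": 1800.0},
    {"src": 0, "dst": 1, "weight": 0.9, "prob": 0.90, "time": 600.0},
]
print(aggregate_edges(edges, {0: 0, 1: 0, 2: 1}))
```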

[0055] The beneficial effects of this invention are:

[0056] 1. Based on flow field data and wind field data, perform Lagrange drift integral and combine it with the initial disturbance conditions to form the reachable domain. Further calculate the arrival probability and arrival time to generate time-varying directed edges and edge weights, so that the correlation between observation nodes can be dynamically updated over time and has physical interpretability, thereby reducing the risk of static mapping failing in strong flow change scenarios.

[0057] 2. In the graph Transformer attention score, a physical prior bias term consisting of arrival probability, arrival time, downstream consistency coefficient and path cost is introduced. A directed causal mask is used to constrain the message transmission direction, suppress the propagation path of erroneous information that does not conform to the drift direction, improve the stability and reliability of the prediction process, and enhance the prediction accuracy under complex sea conditions.

[0058] 3. A coarse-scale representation is constructed by hierarchical graph pooling, and the coarse-scale edge weights and coarse-scale causal masks are obtained by pooling the fine-scale drift edges and their weights. This enables dynamic graph relationship updates with common origins across scales, maintains a consistent physical driving association between coarse and fine-scale structures, enhances the ability to express multi-scale features, and thus improves the overall situation prediction performance and generalization ability.

Attached Figure Description

[0059] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings:

[0060] Figure 1 is a flowchart of the multi-scale marine environmental situation prediction method based on graph neural networks proposed in this invention.

Detailed Implementation

[0061] The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic diagrams, illustrating only the basic structure of the invention, and therefore only show the components relevant to the invention.

[0062] Referring to Figure 1, a method for predicting marine multi-scale environmental situations based on graph neural networks includes:

[0063] S1. Acquire marine environmental observation data and geographic constraint data from multiple observation nodes in the target sea area, and perform time alignment and coordinate unification;

S2. Construct a node set based on the geographic location of each observation node, and map the observation data to the node set according to time to form a node feature sequence;

S3. For each source node in the node set, perform Lagrange drift integration based on the flow field data and wind field data in the node feature sequence within a preset prediction time window; generate multiple sets of initial disturbance conditions for the source node according to a preset disturbance distribution and perform Lagrange drift integration for each to form a reachable domain; calculate the arrival probability and arrival time of each target node relative to the source node; generate a fine-scale directed edge set and its edge weights; perform geographic constraint processing on the fine-scale directed edge set; generate a fine-scale edge feature set based on the arrival probability and arrival time; and generate a fine-scale directed causal mask based on the fine-scale directed edge set;

S4. Input the fine-scale graph into a graph Transformer network, calculate the physical prior bias term based on the fine-scale edge features and add it to the attention scores, and constrain the attention calculation using the fine-scale directed causal mask to obtain a fine-scale node representation sequence;

S5. Perform hierarchical graph pooling on the fine-scale node representation sequence to obtain a coarse-scale node set and a fine-scale to coarse-scale node allocation matrix; aggregate the coarse-scale node feature sequence based on the node allocation matrix; and, based on the node allocation matrix, aggregate the fine-scale directed edges and their weights into coarse-scale directed edges and their weights to generate a coarse-scale directed causal mask;

S6. Input the coarse-scale graph into a graph Transformer network, calculate the physical prior bias attention based on the coarse-scale edge weights in the same way as in S4, and perform message passing under the constraints of the coarse-scale directed causal mask to obtain a coarse-scale node representation sequence;

S7. Map the coarse-scale node representation sequence to the fine-scale nodes according to the node allocation matrix, fuse them with the fine-scale node representation sequence, and output the marine environmental situation prediction results corresponding to each observation node under the preset prediction lead.

[0064] In this specific embodiment, S1 includes:

[0065] A unified data clipping boundary is determined based on the spatial extent of the target sea area. With this clipping boundary as a constraint, marine environmental observation data within a preset time period are synchronously acquired from multiple observation nodes. The observation nodes are buoy stations, shore-based stations, or offshore observation platforms with fixed geographical locations. The marine environmental observation data include surface current velocity, surface current direction, surface wind speed, surface wind direction, and stratified current velocity and direction. Two stratification depth layers are used, at [value missing] and [value missing], and the depth is recorded in the data structure as a "depth layer identifier" field. The values for flow velocity and wind speed are standardized to [unit not specified]. Flow direction and wind direction are expressed in degrees, with the angle range unified as [value missing]. Flow direction is defined as the direction toward which the seawater flows, measured from true north and increasing clockwise. When the original data define wind direction as the direction from which the wind arrives, the angle is increased by [value missing] and wrapped cyclically according to [value missing], which converts it into a going-to definition consistent with the flow direction and eliminates the discrepancy between the two direction definitions.
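The wind direction conversion can be sketched as follows. The offset and wrap values are lost in the text above, so this example assumes the standard relationship between a coming-from bearing and a going-to bearing: an offset of 180 degrees, wrapped into the range [0, 360). The function name `from_to_going_direction` is hypothetical.

```python
def from_to_going_direction(from_deg: float) -> float:
    """Convert a coming-from bearing to a going-to bearing in [0, 360).

    Assumes a 180-degree offset and a 360-degree wrap; these values are not
    stated in the surviving text.
    """
    return (from_deg + 180.0) % 360.0

# A westerly wind (coming from 270 degrees) blows toward the east.
print(from_to_going_direction(270.0))
```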

[0066] Then, time alignment processing is performed on the observation data of all observation nodes to convert the original timestamps to Coordinated Universal Time and construct a unified time series:

[0067] $\mathcal{T} = \{\, t_k \mid t_k = t_0 + k\,\Delta t \,\}$;

[0068] where $\mathcal{T}$ represents the unified time series after time alignment, $t_0$ indicates the start time of the preset time period, $\Delta t$ indicates the uniform sampling time interval and takes [value missing], and $k$ indicates the time step index. The total number of time steps is determined by the end and start times of the preset time period and the unified sampling time interval. During the alignment process, piecewise linear interpolation is used to generate values for flow velocity and wind speed on the unified time series, while the original missing-observation markers are retained for subsequent processing. For flow direction and wind direction, an interpolation strategy that treats the angle as a circular variable is used to avoid jumps when crossing [value missing]. This forms an observation dataset that covers all observation nodes and has a consistent time axis;
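The time alignment can be sketched with a uniform time grid and a shortest-arc interpolation for angles. The function names are hypothetical, times are plain numbers in seconds, and the wrap-around point is assumed to be 0/360 degrees.

```python
def unified_times(t0, dt, n_steps):
    """Uniform grid t_k = t0 + k*dt for k = 0 .. n_steps-1."""
    return [t0 + k * dt for k in range(n_steps)]

def circular_interp(a0_deg, a1_deg, w):
    """Interpolate between two bearings along the shortest arc, w in [0, 1]."""
    # Signed shortest angular difference in [-180, 180).
    diff = (a1_deg - a0_deg + 180.0) % 360.0 - 180.0
    return (a0_deg + w * diff) % 360.0

print(unified_times(0.0, 600.0, 4))
# Halfway between 350 and 10 degrees is north, not 180 degrees.
print(circular_interp(350.0, 10.0, 0.5))
```

A naive linear interpolation of 350 and 10 degrees would give 180 degrees, which points in the opposite direction; taking the shortest arc avoids that jump.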

[0069] Simultaneously, geographic constraint data corresponding to the target sea area is acquired. The geographic constraint data includes shoreline data used to characterize the land-sea boundary and water depth data used to characterize the water depth distribution. The shoreline data is represented as vector polylines in which each vertex contains longitude and latitude coordinates. The water depth data is represented as a regular grid in which each grid cell contains a water depth value, expressed in metric units relative to sea level.

[0070] Finally, coordinate unification processing is performed: the longitude and latitude of the observation nodes, the vertex coordinates of the shoreline data, and the raster spatial reference system of the water depth data are converted into the WGS-84 geographic coordinate system, and clipping and storage are completed under the same coordinate datum. The location of each observation node is represented by longitude and latitude and stored in association with its unique identifier. The shoreline data is clipped according to the clipping boundary and stored as a set of shoreline elements for subsequent geographic constraint determination. The water depth data is clipped according to the clipping boundary, resampled to a spatial resolution of [value missing], and stored as a water depth raster dataset for subsequent path cost calculation. This yields the observation dataset and the geographic constraint dataset used in subsequent steps.

[0071] In this specific embodiment, S2 includes:

[0072] Based on the locations of the observation nodes in the observation dataset, a unique identifier is assigned to each observation node and a node set $V = \{v_i\}$ is constructed, where $V$ represents the node set, $v_i$ indicates the $i$-th observation node, and $i$ represents the node index, which ranges from 1 to $N$, the total number of observation nodes. For each observation node $v_i$, its longitude $\lambda_i$, latitude $\varphi_i$, and water depth $h_i$ are recorded, where $\lambda_i$ and $\varphi_i$ are angle values in the WGS-84 geographic coordinate system. The water depth at the location of the observation node is obtained by interpolation from the water depth raster dataset of step S1 and is stored in meters;

[0073] Then, time mapping is performed on the unified time series $\mathcal{T}$ constructed in step S1, where $t_k$ indicates the $k$-th unified moment and $k$ represents the time step index, which starts at 0 and is bounded by the total number of time steps [value missing]. For each moment $t_k$ and each observation node $v_i$, the flow velocity and direction and the wind speed and direction at that moment are read from the observation dataset. The flow velocity and direction are converted into eastward and northward flow velocity components, and the wind speed and direction are converted into eastward and northward wind speed components, all in the same coordinate system. The coordinate system uses an eastward axis with geographic east as positive and a northward axis with geographic north as positive. Flow direction and wind direction both use the going-to definition, measured from geographic true north with the angle increasing clockwise. The eastward component is obtained by multiplying the corresponding speed by the sine of the corresponding direction angle, and the northward component is obtained by multiplying the corresponding speed by the cosine of the corresponding direction angle. This constructs a node feature vector for each observation node at each moment, and these vectors are arranged chronologically to form a node feature sequence. The node feature vector satisfies the following:

[0074] $$\mathbf{x}_i(t_k) = \left[\, u^{c}_i(t_k),\; v^{c}_i(t_k),\; u^{w}_i(t_k),\; v^{w}_i(t_k),\; h_i,\; m_i(t_k) \,\right]^{\top}$$;

[0075] where $\mathbf{x}_i(t_k)$ denotes the node feature vector of observation node $v_i$ at moment $t_k$; $u^{c}_i(t_k)$ and $v^{c}_i(t_k)$ denote the eastward and northward flow velocity components at that moment, in m/s; $u^{w}_i(t_k)$ and $v^{w}_i(t_k)$ denote the eastward and northward wind speed components at that moment, in m/s; $h_i$ denotes the node water depth, in m; $m_i(t_k)$ denotes the missing-measurement marker, which is set when any flow field or wind field observation is missing at that moment and is cleared otherwise; and $(\cdot)^{\top}$ denotes vector transpose;
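The direction-to-component conversion and the six-element feature vector described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the use of `None` for a missing reading, the NaN placeholders, and the 1.0/0.0 encoding of the missing marker are all assumptions made for the example.

```python
import math

def to_components(speed, direction_deg):
    """Split a speed and a clockwise-from-true-north direction into (east, north)."""
    theta = math.radians(direction_deg)
    return speed * math.sin(theta), speed * math.cos(theta)

def node_feature(cur_speed, cur_dir, wind_speed, wind_dir, depth_m):
    """Build [u_c, v_c, u_w, v_w, depth, missing_marker] for one node and moment.

    None marks a missing reading (assumed convention). When anything is missing,
    all four components are left as NaN placeholders and the marker is 1.0.
    """
    readings = (cur_speed, cur_dir, wind_speed, wind_dir)
    if any(r is None for r in readings):
        nan = float("nan")
        return [nan, nan, nan, nan, depth_m, 1.0]
    u_c, v_c = to_components(cur_speed, cur_dir)
    u_w, v_w = to_components(wind_speed, wind_dir)
    return [u_c, v_c, u_w, v_w, depth_m, 0.0]
```

With this convention a direction of 90 degrees yields a purely eastward component and a direction of 0 degrees a purely northward one.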

[0076] Subsequently, the continuous numerical dimensions of the node feature vector, namely the four velocity components and the water depth, are numerically normalized: the mean and standard deviation of each dimension are computed over all nodes and all moments, and a zero-mean, unit-variance transformation is applied. Where data are missing, the corresponding continuous dimension is filled with the normalized zero value of that dimension while the missing-measurement marker is retained. This ensures that the node feature sequence has the same time length for all nodes and can be processed by the subsequent graph Transformer network with fixed-dimensional input.
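A minimal sketch of the per-dimension normalization follows. Two points are assumptions rather than statements of the source: "normalized zero value" is read here as 0 in the normalized space (equivalently, the dimension mean), and a constant dimension is guarded with a standard deviation of 1.

```python
import statistics

def normalize_dimension(values):
    """Z-score one feature dimension over all nodes and moments.

    values: flat list of readings, with None for a missing reading (assumed).
    """
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    std = statistics.pstdev(observed)
    if std == 0.0:
        std = 1.0  # assumed guard for a constant dimension
    # Missing entries receive 0.0 in normalized space; the marker is kept elsewhere.
    return [0.0 if v is None else (v - mean) / std for v in values]
```

For example, `normalize_dimension([1.0, None, 3.0])` gives `[-1.0, 0.0, 1.0]`, since the observed mean is 2 and the population standard deviation is 1.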

[0077] In this specific embodiment, S3 includes:

[0078] For each source node $v_i$ in the node set $V$, the current moment $t_k$ of the unified time series is taken as the prediction starting point, and a preset prediction time window length $T_w$ is used. Within the interval $[t_k, t_k + T_w]$, the Lagrange drift integral is performed to generate a set of drift trajectories originating from the source node, which together form the reachable domain. The drift integral employs the fourth-order Runge-Kutta numerical integration method, with an integration time step of [value missing]. In each integration substep, the particle position is represented as a latitude-longitude pair in the WGS-84 coordinate system; the displacement is updated in the local east-north plane and then mapped back to latitude and longitude, which avoids the scale error caused by accumulating latitude and longitude increments directly.
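One Runge-Kutta substep of this kind can be sketched as below. The sketch assumes a spherical Earth with a mean radius instead of the WGS-84 ellipsoid, and the callback `velocity(lat, lon, t)` is a hypothetical interface returning the east and north drift velocity in m/s.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean radius; spherical simplification (assumption)

def enu_to_latlon(lat, lon, d_east, d_north):
    """Map a local east/north displacement in meters back to degrees."""
    new_lat = lat + math.degrees(d_north / EARTH_RADIUS_M)
    new_lon = lon + math.degrees(
        d_east / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return new_lat, new_lon

def rk4_step(lat, lon, t, dt, velocity):
    """Advance one particle by dt seconds with classical fourth-order Runge-Kutta."""
    def probe(slope, frac):
        # Sample the field at the intermediate point reached along `slope`.
        p_lat, p_lon = enu_to_latlon(lat, lon,
                                     slope[0] * dt * frac, slope[1] * dt * frac)
        return velocity(p_lat, p_lon, t + dt * frac)

    k1 = velocity(lat, lon, t)
    k2 = probe(k1, 0.5)
    k3 = probe(k2, 0.5)
    k4 = probe(k3, 1.0)
    d_east = dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    d_north = dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    # The displacement is accumulated in meters and converted only once.
    return enu_to_latlon(lat, lon, d_east, d_north)
```

Accumulating the displacement in meters and converting once per substep is what keeps the longitude scale factor tied to the particle's current latitude.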

[0079] The drift velocity is determined by the unnormalized flow field components and wind field components from step S2. Specifically, at any integration time and any particle position, the eight nearest observation nodes are selected according to the spherical distance from the particle position to each observation node. The eastward and northward flow velocity components and the eastward and northward wind speed components are spatially interpolated over these nodes using inverse-square-distance weighting. Linear interpolation is then performed on the time axis between the adjacent unified moments $t_k$ and $t_{k+1}$ to obtain the continuous-time flow field and wind field components at the particle's location. Finally, the wind field component, scaled by the wind-induced drift coefficient, is superimposed on the flow field component to obtain the drift velocity components of the particle.
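The spatial part of this interpolation can be sketched as follows. The haversine formula on a sphere stands in for the spherical distance, the node tuple layout is hypothetical, and `windage` is a placeholder for the wind-induced drift coefficient, whose value is not given in the text.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, radius=6371000.0):
    """Great-circle distance in meters on a sphere (assumed simplification)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

def idw_velocity(lat, lon, nodes, k=8):
    """Inverse-square-distance interpolation over the k nearest nodes.

    nodes: list of (lat, lon, (east, north)) tuples (hypothetical layout).
    """
    ranked = sorted(
        ((haversine_m(lat, lon, n_lat, n_lon), vel) for n_lat, n_lon, vel in nodes),
        key=lambda item: item[0],
    )[:k]
    if ranked[0][0] == 0.0:
        return ranked[0][1]  # particle sits exactly on a node
    weights = [1.0 / dist ** 2 for dist, _ in ranked]
    total = sum(weights)
    east = sum(w * vel[0] for w, (_, vel) in zip(weights, ranked)) / total
    north = sum(w * vel[1] for w, (_, vel) in zip(weights, ranked)) / total
    return east, north

def drift_velocity(current, wind, windage):
    """Superimpose the scaled wind component on the flow component."""
    return current[0] + windage * wind[0], current[1] + windage * wind[1]
```

The same `idw_velocity` call would be made once for the flow components and once for the wind components at each of the two bracketing moments before the linear interpolation in time.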

[0080] Multiple sets of initial perturbation conditions are generated for the source node according to the preset perturbation distribution, and the Lagrange drift integral is performed for each set. The number of sets of initial perturbation conditions is $Q$ [value missing]. For the $q$-th set, the initial position perturbation is generated uniformly within a circular region of radius $r_0$ centered on the geographic location of the source node $v_i$ and is converted to latitude and longitude coordinates. The initial velocity perturbation of the $q$-th set is a constant vector superimposed on the drift velocity, whose eastward and northward components are both drawn from a normal distribution with the specified mean and standard deviation. The source node identifier and the current time index are used together as the seed of the random number generator so that the perturbations can be reproduced deterministically. The result is a reachable domain containing $Q$ drift trajectories;
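Deterministic perturbation generation of this kind can be sketched as follows. The seed format, the zero mean of the velocity perturbation, and the parameter names are assumptions; the text does not give the radius, the standard deviation, or the number of sets.

```python
import math
import random

def initial_perturbations(node_id, time_index, n_sets, radius_m, sigma):
    """Return n_sets of ((east_m, north_m) offset, (d_east, d_north) velocity offset)."""
    # Seeding with the node identifier and time index makes every call reproducible.
    rng = random.Random(f"{node_id}:{time_index}")
    sets = []
    for _ in range(n_sets):
        # The square root on the radial draw gives a uniform density over the disc.
        r = radius_m * math.sqrt(rng.random())
        angle = rng.uniform(0.0, 2.0 * math.pi)
        offset = (r * math.cos(angle), r * math.sin(angle))
        # Zero-mean normal velocity perturbation (the mean is an assumption).
        dv = (rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        sets.append((offset, dv))
    return sets
```

The position offsets are in meters in the local east-north plane and would still need to be mapped to latitude and longitude around the source node.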

[0081] For any target node $v_j$, a spherical neighborhood of radius $r_n$ centered on the geographic location of the target node is constructed, and the great-circle distance under WGS-84 is used to compute the distance from a particle to the target node. For each drift trajectory in the reachable domain, it is determined whether the trajectory enters the spherical neighborhood within the time window, and the moment of first entry is recorded. The arrival probability $p_{ij}$ from the source node $v_i$ to the target node $v_j$ is the ratio of the number of valid trajectories that enter the spherical neighborhood to the total number of trajectories. The arrival time $\tau_{ij}$, in hours, is the arithmetic mean, over all valid trajectories that enter the neighborhood, of the difference between the first entry time and $t_k$;

[0082] The downstream consistency coefficient $\kappa_{ij}$ is obtained as follows: for each valid trajectory, the cosine of the angle between the drift velocity direction at its first entry moment and the displacement direction from the source node to the target node is calculated and truncated to the interval $[0, 1]$, and the arithmetic mean of the truncated values is taken. A cosine less than 0 is truncated so that counter-current propagation makes no contribution to consistency;
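The three edge statistics described above can be computed from the per-trajectory first-entry records as in the following sketch. The record layout and the return values for the case in which no trajectory arrives are assumptions.

```python
def arrival_statistics(first_entries, n_total):
    """Summarize the trajectories that entered the target neighborhood.

    first_entries: list of (hours_since_start, cos_angle), one per valid
        trajectory that entered (hypothetical layout).
    n_total: total number of trajectories launched from the source node.
    Returns (arrival_probability, arrival_time_hours, consistency).
    """
    if not first_entries:
        return 0.0, None, 0.0  # assumed convention when nothing arrives
    n_hit = len(first_entries)
    probability = n_hit / n_total
    mean_time = sum(t for t, _ in first_entries) / n_hit
    # Cosines are clamped to [0, 1] so counter-current arrivals contribute zero.
    consistency = sum(min(1.0, max(0.0, c)) for _, c in first_entries) / n_hit
    return probability, mean_time, consistency
```

For example, two arrivals out of four trajectories with entry times of 2 h and 4 h and cosines of 1.0 and -0.5 give a probability of 0.5, an arrival time of 3 h, and a consistency of 0.5.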

[0083] When geographic constraint processing is performed on the fine-scale directed edge set, each line segment connecting adjacent integration points of a drift trajectory is tested for intersection with the shoreline polyline from the shoreline data, and trajectories with intersections are eliminated to remove propagation paths that cross land. The water depth raster dataset is then queried point by point along each remaining valid trajectory to obtain a water depth sequence, and the path cost is calculated by accumulating an inverse-water-depth penalty. Integration points with a water depth of less than 2 m are judged impassable and the corresponding trajectories are eliminated, so that the path cost reflects the obstruction of drift propagation in shallow water areas.
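The two geographic checks can be sketched as below. The sketch treats longitude and latitude as planar coordinates for the intersection test and uses a plain sum of reciprocal depths as the penalty; both are simplifying assumptions, as is the function interface.

```python
def _orient(a, b, c):
    """Signed area test: positive if c lies to the left of the ray a -> b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p1, p2, q1, q2):
    """True when segment p1-p2 properly crosses q1-q2 (touching is ignored)."""
    d1, d2 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    d3, d4 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0

def path_cost(track, depths_m, shoreline, min_depth_m=2.0):
    """Return the accumulated 1/depth cost, or None if the track is eliminated.

    track: list of (lon, lat) integration points; depths_m: depth at each point;
    shoreline: list of (lon, lat) polyline vertices.
    """
    for a, b in zip(track, track[1:]):
        for s1, s2 in zip(shoreline, shoreline[1:]):
            if segments_cross(a, b, s1, s2):
                return None  # the path crosses land
    if any(d < min_depth_m for d in depths_m):
        return None  # shallower than the 2 m passability limit
    return sum(1.0 / d for d in depths_m)
```

A production version would likely use a spatial index over the shoreline segments, since the nested loop above is quadratic in the number of segments.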

[0084] A fine-scale directed edge pointing from the source node $v_i$ to the target node $v_j$ is generated and added to the fine-scale directed edge set if and only if [condition missing] and [condition missing] both hold. The fine-scale edge weight $w_{ij}$ is then calculated from the arrival probability $p_{ij}$, the arrival time $\tau_{ij}$, the downstream consistency coefficient $\kappa_{ij}$, and the path cost $C_{ij}$, and satisfies:

[0085] $$w_{ij} = \mathrm{clip}\!\left( \alpha \ln\!\left(p_{ij} + \varepsilon\right) + \beta \exp\!\left(-\frac{\tau_{ij}}{\tau_0}\right) + \gamma\, \kappa_{ij} - \delta\, C_{ij} \right)$$;

[0086] where $w_{ij}$ denotes the fine-scale edge weight, $p_{ij}$ the arrival probability, $\tau_{ij}$ the arrival time, $\kappa_{ij}$ the downstream consistency coefficient, and $C_{ij}$ the path cost; $\alpha$ denotes the weight of the contribution of the arrival probability to the edge weight, $\beta$ the weight of the contribution of the arrival time decay term, $\gamma$ the weight of the contribution of the downstream consistency term, and $\delta$ the weight of the path cost penalty term; $\varepsilon$ denotes a constant that avoids the logarithmic singularity; and $\tau_0$ denotes the time decay constant. Each of these parameters takes a preset value, and $\mathrm{clip}(\cdot)$ truncates the calculated $w_{ij}$ to a preset interval to limit its range of values;

[0087] On this basis, the fine-scale edge feature set is defined as the set of five-dimensional vectors containing, in fixed order, $w_{ij}$, $p_{ij}$, $\tau_{ij}$, $\kappa_{ij}$, and $C_{ij}$. A fine-scale directed causal mask $M$ is generated from the fine-scale directed edge set, where $M$ denotes the fine-scale directed causal mask matrix, with $M_{ij} = 1$ if and only if a fine-scale directed edge pointing from $v_i$ to $v_j$ exists, and $M_{ij} = 0$ otherwise. This ensures that subsequent attention calculations occur only on the directed connectivity relationships defined by the fine-scale directed edge set.

[0088] In this specific embodiment, S4 includes:

[0089] The node feature sequence, together with the fine-scale directed edge set, the fine-scale edge feature set, and the fine-scale directed causal mask $M$, is input to the fine-scale graph Transformer network. At each time step $t_k$, a single spatial message-passing operation is performed independently to obtain the sequence of fine-scale node representations, where $t_k$ denotes the $k$-th moment of the unified time series, $N$ denotes the total number of observation nodes and is consistent with the node set $V$, and $M_{ij}$ denotes the connectivity marker from source node $v_i$ to target node $v_j$ in the directed causal mask, which is set to 1 if and only if the fine-scale directed edge $(v_i, v_j)$ exists and to 0 otherwise.

[0090] The fine-scale graph Transformer network takes the node feature vectors $\mathbf{x}_i(t_k)$ as input. These are first projected onto the model dimension $d$ by a linear embedding layer to obtain the initial node embedding $\mathbf{z}_i^{(0)}(t_k)$, where $\mathbf{x}_i(t_k)$ denotes the node feature vector of node $v_i$ at moment $t_k$, and $\mathbf{z}_i^{(0)}(t_k)$ denotes the layer-0 node embedding vector, with dimension $d$;

[0091] The fine-scale graph Transformer network consists of $L$ stacked graph Transformer layers, each containing a multi-head attention sublayer and a feedforward network sublayer. The multi-head attention sublayer has $H$ attention heads, each with head dimension $d_h$, satisfying $H \cdot d_h = d$. In any layer $\ell$ and at any moment $t_k$, for each attention head $r$, the node representations of the previous layer are mapped by three linear transformation matrices to the query vector $\mathbf{q}_i^{(\ell,r)}(t_k)$, the key vector $\mathbf{k}_i^{(\ell,r)}(t_k)$, and the value vector $\mathbf{v}_i^{(\ell,r)}(t_k)$, where $\mathbf{z}_i^{(\ell-1)}(t_k)$ denotes the layer-$(\ell-1)$ representation vector of node $v_i$ at moment $t_k$, with dimension $d$, and the query, key, and value vectors all have dimension $d_h$;

[0092] For each fine-scale directed edge $(v_i, v_j)$, the fine-scale edge feature vector $\mathbf{f}_{ij}$ is input to the physical prior bias calculation module to obtain the scalar physical prior bias term $b_{ij}^{(r)}$ corresponding to each attention head $r$, where $\mathbf{f}_{ij} = [w_{ij}, p_{ij}, \tau_{ij}, \kappa_{ij}, C_{ij}]^{\top}$ denotes the edge feature vector of edge $(v_i, v_j)$, with dimension 5; $w_{ij}$ denotes the fine-scale edge weight, $p_{ij}$ the arrival probability, $\tau_{ij}$ the arrival time, $\kappa_{ij}$ the downstream consistency coefficient, and $C_{ij}$ the path cost. The physical prior bias calculation module adopts a two-layer perceptron structure whose first-layer weight matrix has dimension [value missing] and whose second-layer weight matrix has dimension [value missing]. The activation function is ReLU, and the output is truncated to the interval [value missing] to limit the bias amplitude;

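[0093] In the attention calculation, for any source node $v_i$ and any target node $v_j$, the attention score of the $r$-th head in layer $\ell$ is computed as:

[0094] $$e_{ij}^{(\ell,r)}(t_k) = \frac{\left(\mathbf{q}_i^{(\ell,r)}(t_k)\right)^{\top} \mathbf{k}_j^{(\ell,r)}(t_k)}{\sqrt{d_h}} + b_{ij}^{(r)} + \mu_{ij}, \qquad \mu_{ij} = \begin{cases} 0, & M_{ij} = 1 \\ -\infty, & M_{ij} = 0 \end{cases}$$;

[0095] where $e_{ij}^{(\ell,r)}(t_k)$ denotes the scalar attention score at moment $t_k$, $(\cdot)^{\top}$ denotes vector transpose, division by $\sqrt{d_h}$ denotes scaling by the square root of the head dimension to stabilize the numerical range, and $\mu_{ij}$ denotes mask suppression implemented with negative infinity, which ensures that node pairs with $M_{ij} = 0$ do not participate in attention normalization;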

[0096] Subsequently, for the same source node $v_i$, the scores over all target nodes are Softmax-normalized to obtain the attention weights, and the corresponding value vectors are summed with these weights to yield the aggregated result of the attention head. The aggregated results of all attention heads are concatenated along the head dimension and passed through an output linear transformation to obtain the output of the multi-head attention sublayer. A residual connection with the input is then applied, followed by LayerNorm normalization.
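For a single head and a single source node, the biased and masked attention described above can be sketched in plain Python as follows. The function name and argument layout are illustrative, and returning a zero vector when no target is allowed is an assumption about a case the text does not address.

```python
import math

def masked_attention(query, keys, values, bias, mask):
    """Aggregate value vectors for one source node and one attention head.

    query: list of d_h floats; keys, values: one list per target node;
    bias: one scalar physical prior bias per target; mask: 1 allowed, 0 blocked.
    """
    scale = math.sqrt(len(query))
    scores = []
    for key, b, m in zip(keys, bias, mask):
        if m:
            dot = sum(q * k for q, k in zip(query, key))
            scores.append(dot / scale + b)
        else:
            scores.append(-math.inf)  # blocked pairs drop out of the Softmax
    top = max(scores)
    width = len(values[0])
    if top == -math.inf:
        return [0.0] * width  # no allowed target (assumed fallback)
    exps = [math.exp(s - top) for s in scores]  # shifted for numerical stability
    total = sum(exps)
    return [sum(e / total * v[i] for e, v in zip(exps, values)) for i in range(width)]
```

Because the bias is added before the Softmax, a bias difference of $\ln 3$ between two otherwise equal targets makes the favored target three times as heavily weighted.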

[0097] The feedforward network sublayer employs a two-layer fully connected structure with GELU activation between the two layers and a hidden layer dimension of 128. A "fully connected-GELU-fully connected-Dropout" operation is performed on the output of the attention sublayer, with a Dropout rate of 0.1. A residual connection and LayerNorm are then applied to obtain the layer-$\ell$ output node representation $\mathbf{z}_i^{(\ell)}(t_k)$. The node representation output by the final layer $L$ at moment $t_k$ is denoted as the fine-scale node representation $\mathbf{z}_i(t_k)$, and the representations for all moments are stacked in chronological order to form the sequence of fine-scale node representations.

[0098] In this specific embodiment, S5 includes:

[0099] For the sequence of fine-scale node representations, the fine-scale node representations $\mathbf{z}_i(t_k)$ of each fine-scale node $v_i$ are first averaged over all moments $t_k$ to obtain the time summary representation $\bar{\mathbf{z}}_i$, where $v_i$ denotes the $i$-th observation node in the node set $V$, with $i = 1, \dots, N$; $N$ denotes the number of fine-scale nodes; $t_k$ denotes the $k$-th moment of the unified time series; $\mathbf{z}_i(t_k)$ denotes the fine-scale node representation vector of node $v_i$ at moment $t_k$; and $\bar{\mathbf{z}}_i$ denotes the time summary representation vector of node $v_i$, with a dimension of 64;

[0100] Then, for each $\bar{\mathbf{z}}_i$, a pooling score $s_i$ is calculated by a pooling scoring network. The pooling scoring network is a single-layer fully connected structure whose parameters are a weight vector and a bias. The scores $s_i$ are sorted from largest to smallest, the number of coarse-scale nodes $N_c$ is determined from the preset pooling ratio $\rho$, and the $N_c$ fine-scale nodes with the highest pooling scores are selected as the coarse-scale node set $V_c = \{c_a \mid a = 1, \dots, N_c\}$, where $c_a$ denotes the $a$-th coarse-scale node;

[0101] Then, a node assignment matrix $S$ from the fine scale to the coarse scale is constructed, where the element $S_{ia}$ denotes the weight with which the fine-scale node $v_i$ is assigned to the coarse-scale node $c_a$, and for any $i$ the weights satisfy $\sum_{a} S_{ia} = 1$. The assignment weights are calculated as exponential kernel weights of the great-circle distance between the fine-scale node $v_i$ and the coarse-scale node $c_a$ and are then normalized by rows. The distance calculation uses the spherical distance in the WGS-84 coordinate system, and the exponential kernel bandwidth is [value missing]. This ensures that each fine-scale node participates in the construction of the coarse-scale representation with a non-zero weight and that the assignment matrix remains the same at all moments.
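The node selection and the row-normalized assignment matrix can be sketched as follows. The ceiling rule for the number of coarse nodes and the kernel form `exp(-distance / bandwidth)` are assumptions, since the text gives neither the rounding rule nor the exact kernel expression.

```python
import math

def select_coarse(scores, ratio):
    """Indices of the fine-scale nodes kept as coarse-scale nodes."""
    n_coarse = max(1, math.ceil(ratio * len(scores)))  # rounding rule assumed
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:n_coarse]

def assignment_matrix(dist, coarse_idx, bandwidth):
    """Row-normalized exponential-kernel weights from fine to coarse nodes.

    dist[i][j]: great-circle distance between fine-scale nodes i and j.
    """
    rows = []
    for i in range(len(dist)):
        kernel = [math.exp(-dist[i][c] / bandwidth) for c in coarse_idx]
        total = sum(kernel)
        rows.append([k / total for k in kernel])
    return rows
```

Every kernel value is strictly positive, so each fine-scale node receives a non-zero weight for every coarse-scale node, and each row sums to one.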

[0102] At every moment $t_k$, based on the node assignment matrix $S$, the fine-scale node representation matrix $Z(t_k)$ is aggregated into the coarse-scale node feature matrix $X_c(t_k)$, and the fine-scale directed edges and their weights are aggregated into coarse-scale directed edges and their weights to obtain the coarse-scale edge weight matrix $W_c(t_k)$. The aggregation is performed according to:

[0103] $$X_c(t_k) = S^{\top} Z(t_k), \qquad W_c(t_k) = S^{\top} W(t_k)\, S$$;

[0104] where $X_c(t_k)$ denotes the coarse-scale node feature matrix at moment $t_k$; $S^{\top}$ denotes the transpose of the node assignment matrix; $Z(t_k)$ denotes the fine-scale node representation matrix obtained by stacking the vectors $\mathbf{z}_i(t_k)$ by node index; $W(t_k)$ denotes the fine-scale edge weight matrix, whose element in row $i$ and column $j$ equals the fine-scale edge weight $w_{ij}$ if and only if the fine-scale directed edge $(v_i, v_j)$ exists at moment $t_k$ and equals 0 otherwise; and $W_c(t_k)$ denotes the coarse-scale edge weight matrix, whose element in row $a$ and column $b$ is the weight of the coarse-scale edge pointing from $c_a$ to $c_b$;
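Assuming the aggregation takes the standard assignment-matrix form (coarse features as the transposed assignment matrix times the fine representations, and coarse edge weights as the transposed assignment matrix times the fine weight matrix times the assignment matrix), it can be sketched with plain nested lists. The helper names are illustrative.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def coarsen(S, Z, W):
    """Return (coarse node features, coarse edge weights) for one moment.

    S: N x Nc assignment matrix; Z: N x d fine representations;
    W: N x N fine edge weights, with 0 where no directed edge exists.
    """
    St = transpose(S)
    coarse_features = matmul(St, Z)
    coarse_weights = matmul(matmul(St, W), S)
    return coarse_features, coarse_weights
```

With a hard 0/1 assignment this reduces to summing the features of the member nodes and summing the weights of all fine edges that run between two groups.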

[0105] Based on the coarse-scale edge weight matrix $W_c(t_k)$ at each time step, the coarse-scale directed edge set and the coarse-scale directed causal mask $M_c(t_k)$ are generated: if and only if [condition missing], the coarse-scale directed edge $(c_a, c_b)$ is added to the coarse-scale directed edge set and the corresponding mask element is set to 1; otherwise it is set to 0. The resulting coarse-scale node feature sequence, coarse-scale edge weight sequence, and coarse-scale directed causal mask sequence are passed to step S6 for the message-passing computation of the coarse-scale graph Transformer network.

[0106] In this specific embodiment, S6 includes:

[0107] The coarse-scale node feature sequence $\{X_c(t_k)\}$, the coarse-scale edge weight sequence $\{W_c(t_k)\}$, and the coarse-scale directed causal mask sequence $\{M_c(t_k)\}$ are input to the coarse-scale graph Transformer network, and at each time step $t_k$ a message-passing operation based on the coarse-scale directed graph is executed independently to obtain the sequence of coarse-scale node representations, where $t_k$ denotes the $k$-th moment of the unified time series; $X_c(t_k)$ denotes the coarse-scale node feature matrix at moment $t_k$, with $N_c$ rows and $d$ columns; $N_c$ denotes the number of coarse-scale nodes; $d$ denotes the feature dimension of the model; $W_c(t_k)$ denotes the coarse-scale edge weight matrix at moment $t_k$; and $M_c(t_k)$ denotes the coarse-scale directed causal mask matrix at moment $t_k$, whose element is 1 if and only if the corresponding coarse-scale directed edge exists and is 0 otherwise;

[0108] The coarse-scale graph Transformer network adopts the same structural configuration as the fine-scale graph Transformer network but uses an independent parameter set. The number of network layers is [missing information], the number of attention heads is [value missing], and the head dimension of each attention head is [value missing], such that the number of heads multiplied by the head dimension equals the model dimension $d$. For any coarse-scale node $c_a$ at any moment $t_k$, the input feature vector is the $a$-th row vector of $X_c(t_k)$, where $a$ denotes the coarse-scale node index;

[0109] In the $\ell$-th coarse-scale Transformer layer and at moment $t_k$, for each attention head $r$, the node representations of the previous layer are mapped by three linear transformation matrices to the query vector, the key vector, and the value vector, each having the head dimension;

[0110] For each coarse-scale directed edge $(c_a, c_b)$ at any moment $t_k$, the physical prior bias term $b^{c}_{ab}(t_k)$ is calculated from the coarse-scale edge weight $w^{c}_{ab}(t_k)$, where $b$ denotes the index of the coarse-scale target node and $w^{c}_{ab}(t_k)$ denotes the corresponding element of the coarse-scale edge weight matrix, which is obtained by aggregating the fine-scale edge weights.

[0111] The physical prior bias term is implemented by a physical prior bias calculation module, which is a two-layer perceptron with an input dimension of 1. The first-layer weight matrix has dimension $16 \times 1$ and the first-layer bias has dimension 16; the second-layer weight matrix has dimension $1 \times 16$ and the second-layer bias has dimension 1. The module uses the ReLU activation function and truncates its output to the interval [value missing] to limit the magnitude of the bias and avoid attention saturation;

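[0112] During the attention scoring phase, the physical prior bias term is added to the scaled dot-product attention, and disallowed node pairs are structurally masked using the coarse-scale directed causal mask. For the $r$-th attention head in layer $\ell$ at moment $t_k$, the attention score satisfies:

[0113] $$e^{c,(\ell,r)}_{ab}(t_k) = \frac{\left(\mathbf{q}^{(\ell,r)}_a(t_k)\right)^{\top} \mathbf{k}^{(\ell,r)}_b(t_k)}{\sqrt{d_h}} + b^{c}_{ab}(t_k) + \left(1 - M^{c}_{ab}(t_k)\right) \Lambda$$;

[0114] where $e^{c,(\ell,r)}_{ab}(t_k)$ denotes the scalar attention score, $(\cdot)^{\top}$ denotes vector transpose, division by $\sqrt{d_h}$ denotes scaling by the square root of the head dimension, $M^{c}_{ab}(t_k)$ denotes the element of the coarse-scale directed causal mask for the pair $(c_a, c_b)$, and $\Lambda$ denotes the mask constant used in the numerical implementation, a large negative value chosen so that the weights of the masked terms approach 0 after Softmax normalization;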

[0115] Then, for a fixed source node $c_a$, the scores over all target nodes $c_b$ are Softmax-normalized to obtain the attention weights, and the corresponding value vectors are summed with these weights to yield the aggregated result of the attention head. The aggregated results of all attention heads are concatenated along the feature dimension and passed through an output linear transformation to obtain the attention sublayer output. A residual connection and LayerNorm are then executed in sequence.

[0116] The feedforward network sublayer adopts a two-layer fully connected structure with a hidden layer dimension of 128, uses GELU as the activation function, and has a Dropout rate of 0.1. A residual connection and LayerNorm are applied after the feedforward output to obtain the layer-$\ell$ output;

[0117] The output of the final layer at moment $t_k$ is denoted as the coarse-scale node representation, and the representations for all moments are stacked in chronological order to form the coarse-scale node representation sequence.

[0118] In this specific embodiment, S7 includes:

[0119] Based on the node assignment matrix $S$, the coarse-scale node representation sequence is upsampled and mapped to the granularity of the fine-scale node set $V$ to form the mapped coarse-scale representation sequence, where $N$ denotes the number of fine-scale nodes, $N_c$ denotes the number of coarse-scale nodes, and $S_{ia}$ denotes the weight with which the fine-scale node $v_i$ is assigned to the coarse-scale node $c_a$. For each fine-scale node $v_i$, the corresponding coarse-scale node index $\pi(i)$ is determined from its assignment weights as $\pi(i) = \arg\max_{a} S_{ia}$, that is, the index $a$ at which $S_{ia}$ attains its maximum value. This maps each fine-scale node to exactly one coarse-scale node;

[0120] At every moment $t_k$, the representation of the coarse-scale node $c_{\pi(i)}$ is copied, according to this mapping, to the corresponding fine-scale node $v_i$ to obtain the mapped coarse-scale representation $\tilde{\mathbf{y}}_i(t_k)$, where $t_k$ denotes the $k$-th moment of the unified time series and $\tilde{\mathbf{y}}_i(t_k)$ denotes the upsampled coarse-scale representation vector of the fine-scale node $v_i$ at moment $t_k$, with a dimension of 64;

[0121] The mapped coarse-scale representation $\tilde{\mathbf{y}}_i(t_k)$ and the fine-scale node representation $\mathbf{z}_i(t_k)$ are fused by concatenation followed by a linear transformation to obtain the fused representation. The fusion operation satisfies:

[0122] $$\mathbf{g}_i(t_k) = W_f \left[\, \mathbf{z}_i(t_k) \,\Vert\, \tilde{\mathbf{y}}_i(t_k) \,\right] + \mathbf{b}_f$$;

[0123] where $\mathbf{g}_i(t_k)$ denotes the fused representation vector, $W_f$ denotes the weight matrix of the fusion linear layer, $\Vert$ denotes the vector concatenation operation, and $\mathbf{b}_f$ denotes the bias vector of the fusion linear layer;
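The upsampling by maximum assignment weight and the concatenate-then-linear fusion can be sketched as below. The operand order in the concatenation (fine representation first) and the function names are assumptions.

```python
def upsample(S, coarse_repr):
    """Copy to each fine node the representation of its highest-weight coarse node.

    S: N x Nc assignment matrix; coarse_repr: Nc vectors for one moment.
    """
    mapped = []
    for row in S:
        best = max(range(len(row)), key=row.__getitem__)  # argmax over coarse nodes
        mapped.append(list(coarse_repr[best]))
    return mapped

def fuse(fine_vec, coarse_vec, weight, bias):
    """Linear layer applied to the concatenation [fine || coarse]."""
    x = list(fine_vec) + list(coarse_vec)
    return [sum(w * v for w, v in zip(w_row, x)) + b
            for w_row, b in zip(weight, bias)]
```

Here `weight` has one row per output dimension and as many columns as the concatenated vector, matching a standard fully connected layer.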

[0124] The fused representation is input to the fine-scale prediction head network, which performs regression to obtain the fine-scale marine environmental situation prediction result at the preset prediction lead time. The fine-scale prediction head network has a two-layer fully connected structure: the first-layer weight matrix has dimension [value missing] and the first-layer bias has dimension 128; the first-layer activation function is GELU, followed by Dropout at a rate of 0.1; the second-layer weight matrix has dimension $4 \times 128$ and the second-layer bias has dimension 4. The output vector is defined as follows:

[0125] $$\hat{\mathbf{o}}_i = \left[\, \hat{u}^{c}_i,\; \hat{v}^{c}_i,\; \hat{u}^{w}_i,\; \hat{v}^{w}_i \,\right]^{\top}$$;

[0126] where $\hat{u}^{c}_i$ and $\hat{v}^{c}_i$ denote the predicted eastward and northward flow velocity components, respectively, and $\hat{u}^{w}_i$ and $\hat{v}^{w}_i$ denote the predicted eastward and northward wind speed components, respectively.

[0127] At the same time, the coarse-scale node representations are input to the coarse-scale prediction head network to obtain the coarse-scale marine environmental situation prediction results for the coarse-scale node set. The coarse-scale prediction head network adopts the same two-layer fully connected structure and the same parameter dimension configuration as the fine-scale prediction head network, but uses an independent parameter set, and it outputs four-component prediction vectors of the same dimension.

[0128] In this specific embodiment, the physical prior bias term is constructed in the fine-scale graph Transformer network in step S4 in the manner of "each fine-scale directed edge corresponds to a scalar bias" and reused in the attention scoring calculation of the fine-scale directed edge in all network layers and all attention heads, thereby ensuring that the bias is determined only by physical quantities and has interpretability.

[0129] For any fine-scale directed edge $(v_i, v_j)$, the physical prior bias term of the edge is calculated from the arrival probability $p_{ij}$, the arrival time $\tau_{ij}$, the downstream consistency coefficient $\kappa_{ij}$, and the path cost $C_{ij}$. Deterministic range constraints are applied to the input physical quantities before the calculation to avoid numerical instability: the arrival probability $p_{ij}$ is restricted to [range missing]; the arrival time $\tau_{ij}$ is expressed in hours; the downstream consistency coefficient $\kappa_{ij}$ takes values within the interval to which it was truncated in step S3; and the path cost $C_{ij}$ is taken from the shallow-water penalty accumulation result in step S3, expressed as a dimensionless quantity, and proportionally normalized by the path cost normalization constant $C_0$ to obtain $\tilde{C}_{ij} = C_{ij} / C_0$;

[0130] The physical prior bias term is calculated using a monotonic analytic function and is truncated at the output to limit its numerical range, satisfying:

[0131] $$b_{ij} = \mathrm{clip}\!\left( \alpha \ln p_{ij} + \beta \exp\!\left(-\frac{\tau_{ij}}{\tau_0}\right) + \gamma\, \kappa_{ij} - \delta\, \frac{C_{ij}}{C_0},\; b_{\min},\; b_{\max} \right)$$;

[0132] where $b_{ij}$ denotes the physical prior bias term of the fine-scale directed edge $(v_i, v_j)$ and is a scalar; $\ln(\cdot)$ denotes the natural logarithm, which slows the growth of the contribution of the arrival probability to the bias in the high-probability range; $\exp(\cdot)$ denotes the exponential function, which introduces exponential decay with arrival time; $\tau_0$ denotes the time decay constant; $\alpha$ denotes the weight of the arrival probability term, $\beta$ the weight of the arrival time decay term, $\gamma$ the weight of the downstream consistency term, and $\delta$ the weight of the path cost penalty term; $C_0$ denotes the path cost normalization constant; each of $\tau_0$, $\alpha$, $\beta$, $\gamma$, $\delta$, and $C_0$ takes a preset value; and $\mathrm{clip}(\cdot,\, b_{\min},\, b_{\max})$ denotes the truncation function, which limits its input to the range $[b_{\min}, b_{\max}]$, with $b_{\min}$ and $b_{\max}$ taken as -5 and 5 respectively;

[0133] With the above construction, when the arrival probability $p_{ij}$ increases, $\ln p_{ij}$ increases and thus $b_{ij}$ increases; when the arrival time $\tau_{ij}$ increases, $\exp(-\tau_{ij}/\tau_0)$ decreases and thus $b_{ij}$ decreases; and when the path cost $C_{ij}$ increases, the term $-\delta\, C_{ij}/C_0$ decreases and thus $b_{ij}$ decreases. The term $b_{ij}$ serves as the additive bias of the attention score in step S4 and, together with the directed causal mask, constrains the direction and strength of message passing.
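The monotonic behavior described above can be checked with a small sketch of the bias function. The functional form follows the terms named in the text; the lower clamp `eps` on the probability and every weight value passed in the usage are assumptions, because the source values are not available.

```python
import math

def physical_prior_bias(p, tau_h, kappa, cost, *, alpha, beta, gamma, delta,
                        tau0, cost0, eps=1e-6, lo=-5.0, hi=5.0):
    """Clipped additive attention bias for one directed edge.

    p: arrival probability; tau_h: arrival time in hours; kappa: downstream
    consistency coefficient; cost: accumulated path cost. The weights and
    constants must be supplied by the caller.
    """
    p = min(1.0, max(eps, p))  # assumed range constraint that keeps log finite
    raw = (alpha * math.log(p)
           + beta * math.exp(-tau_h / tau0)
           + gamma * kappa
           - delta * cost / cost0)
    return min(hi, max(lo, raw))
```

Calling it with illustrative unit weights, for example `physical_prior_bias(0.8, 3.0, 0.5, 1.0, alpha=1, beta=1, gamma=1, delta=1, tau0=6, cost0=1)`, returns a larger bias than the same call with a probability of 0.2, a longer arrival time, or a higher path cost.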

[0134] In this specific embodiment, for the process in step S5 of "aggregating the fine-scale edge weights of nodes belonging to the same source coarse-scale node and the same target coarse-scale node", each fine-scale node $v_i$ is first deterministically assigned, based on the node assignment matrix $S$, to a unique coarse-scale node $c_{g(i)}$, where $S$ denotes the node assignment matrix from the fine scale to the coarse scale, $N$ denotes the number of fine-scale nodes, $N_c$ denotes the number of coarse-scale nodes, $v_i$ denotes the $i$-th fine-scale node with $i = 1, \dots, N$, $c_a$ denotes the $a$-th coarse-scale node with $a = 1, \dots, N_c$, and $g(i) = \arg\max_{a} S_{ia}$ is the index at which $S_{ia}$ attains its maximum value, so that each fine-scale node belongs to only one coarse-scale node;

[0135] Then, for any ordered pair of coarse-scale nodes $(c_a, c_b)$, the corresponding fine-scale edge index set $E_{ab}$ is constructed, where $E_{ab}$ consists of all ordered node pairs $(i, j)$ for which the fine-scale directed edge $(v_i, v_j)$ exists with $g(i) = a$ and $g(j) = b$. For each $(i, j) \in E_{ab}$, the fine-scale edge weight $w_{ij}$, the fine-scale arrival probability $p_{ij}$, and the fine-scale arrival time $\tau_{ij}$ are read from the fine-scale edge features of step S3, where $w_{ij}$ denotes the weight of the fine-scale directed edge $(v_i, v_j)$, $p_{ij}$ denotes the arrival probability of that edge, taking values between 0 and 1, and $\tau_{ij}$ denotes the arrival time of that edge, in hours.

[0136] In the aggregation computation, a weighted mean with the arrival probabilities as weights is used to generate the coarse-scale edge weight, and the minimum fine-scale arrival time is used as part of the coarse-scale edge features for generating the coarse-scale directed causal mask. The coarse-scale edge weight and the coarse-scale arrival time satisfy the following:

[0137] $$w^{c}_{ab} = \frac{\sum_{(i,j) \in E_{ab}} p_{ij}\, w_{ij}}{\sum_{(i,j) \in E_{ab}} p_{ij} + \varepsilon}, \qquad \tau^{c}_{ab} = \min_{(i,j) \in E_{ab}} \tau_{ij}$$;

[0138] where $w^{c}_{ab}$ denotes the coarse-scale edge weight of the edge pointing from coarse-scale node $c_a$ to coarse-scale node $c_b$; $\tau^{c}_{ab}$ denotes the corresponding coarse-scale arrival time; $\sum_{(i,j) \in E_{ab}}$ denotes summation over all fine-scale edges in the set $E_{ab}$; $\min_{(i,j) \in E_{ab}}$ denotes taking the minimum arrival time over all fine-scale edges in the set $E_{ab}$; and $\varepsilon$ denotes a constant that prevents the denominator from being zero and takes a preset value;
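The probability-weighted aggregation of fine-scale edges into coarse-scale edges can be sketched as follows. The dictionary layout, the default `eps`, and the decision to keep edges whose endpoints fall in the same coarse group are assumptions.

```python
def coarse_edges(edges, group, eps=1e-8):
    """Aggregate fine-scale edges into coarse-scale edge weights and arrival times.

    edges: {(i, j): (weight, arrival_probability, arrival_time_hours)}
    group: group[i] is the coarse-scale index assigned to fine node i.
    Returns {(a, b): (coarse_weight, coarse_arrival_time_hours)}.
    """
    buckets = {}
    for (i, j), feat in edges.items():
        buckets.setdefault((group[i], group[j]), []).append(feat)
    result = {}
    for key, feats in buckets.items():
        numerator = sum(p * w for w, p, _ in feats)
        denominator = sum(p for _, p, _ in feats) + eps  # eps avoids division by zero
        earliest = min(t for _, _, t in feats)
        result[key] = (numerator / denominator, earliest)
    return result
```

Only coarse pairs with at least one fine-scale edge appear in the result, which corresponds to the non-empty condition on the fine-scale edge index set.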

[0139] When the coarse-scale directed causal mask is generated, the set $E_{ab}$ being non-empty together with [condition missing] is taken as the necessary condition for the existence of the coarse-scale directed edge $(c_a, c_b)$; when the condition is met, the corresponding mask element is set to 1, and otherwise it is set to 0. The coarse-scale edge weight $w^{c}_{ab}$ is used as the input of the coarse-scale graph Transformer network for computing the physical prior bias, and the coarse-scale arrival time $\tau^{c}_{ab}$ is used as the time constraint quantity in the coarse-scale edge features for causal mask generation and verification.

[0140] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.

[0141] This invention addresses the problem of static relationship models failing due to the temporal changes in marine environmental relationships. It elevates "relationship modeling" from fixed nearest neighbors or static correlations to computable, time-varying directed adjacency driven by flow and wind field data. Specifically, by performing Lagrange drift integrals on the source node and combining this with initial perturbation conditions to form a reachable domain, the arrival probability and arrival time of the target node relative to the source node are calculated. This generates directed edges, edge weights, and edge features that update with the prediction time window, enabling the graph structure to reflect the actual transport direction and time delay of matter and energy in the ocean. Furthermore, a directed causal mask is used to constrain message passing only along the reachable direction. A physical prior bias term, consisting of arrival probability, arrival time, downstream consistency coefficient, and path cost, is introduced into the graph Transformer attention score. This strengthens adjacency contributions that conform to physical propagation laws during the information aggregation stage, suppresses unreasonable reverse or long-distance error propagation, and improves the stability, accuracy, and interpretability of predictions under complex sea conditions.

[0142] This invention addresses the aforementioned technical problems by making improvements for time-varying correlation: First, it uses the reachable domain, arrival probability, and arrival time as the basis for edge construction and the source of edge features, giving adjacency relationships a clear physical meaning and allowing them to be dynamically updated with time windows, thus avoiding structural mismatch in static graphs under abrupt changes in the flow field. Second, it directly incorporates physical quantities into the attention calculation rules to form physical prior biased attention, and uses a directed causal mask generated from the drift direction to achieve structural constraints on the propagation direction, thereby reducing the spread of erroneous paths. Third, in multi-scale modeling, after obtaining coarse-scale nodes through hierarchical graph pooling, it aggregates fine-scale drift edges and their weights to form coarse-scale directed edges and a coarse-scale causal mask, achieving dynamic graph updates with coarse and fine scales originating from the same source. This ensures consistent and coordinated multi-scale feature extraction and time-varying correlation modeling, thereby more effectively achieving the technical effect of predicting the marine multi-scale environmental situation.

Claims

1. A method for predicting marine multi-scale environmental situation based on graph neural networks, comprising:
S1. Acquire marine environmental observation data and geographic constraint data from multiple observation nodes in the target sea area, and perform time alignment and coordinate unification;
S2. Construct a node set based on the geographical location of each observation node, and map the observation data to the node set according to time to form a node feature sequence;
S3. For each source node in the node set, perform Lagrange drift integral based on the flow field data and wind field data in the node feature sequence within a preset prediction time window, generate multiple sets of initial disturbance conditions for the source node according to the preset disturbance distribution, and perform Lagrange drift integral for each to form a reachable domain, calculate the arrival probability and arrival time of each target node relative to the source node, generate a fine-scale directed edge set and its edge weights, perform geographical constraint processing on the fine-scale directed edge set, generate a fine-scale edge feature set based on the arrival probability and arrival time, and generate a fine-scale directed causal mask based on the fine-scale directed edge set;
S4. Input the fine-scale graph into the graph Transformer network, calculate the physical prior bias term based on the fine-scale edge features and add it to the attention score, use the fine-scale directed causal mask to constrain the attention calculation, and obtain the fine-scale node representation sequence;
S5. Perform hierarchical graph pooling on the fine-scale node representation sequence to obtain the coarse-scale node set and the node allocation matrix from fine-scale to coarse-scale, aggregate the coarse-scale node feature sequence based on the node allocation matrix, and, based on the node allocation matrix, aggregate the fine-scale directed edges and their weights into coarse-scale directed edges and their weights to generate a coarse-scale directed causal mask;
S6. Input the coarse-scale graph into a graph Transformer network, calculate the physical prior bias attention based on the coarse-scale edge weights in the same way as in S4, and perform message passing under the constraints of the coarse-scale directed causal mask to obtain the coarse-scale node representation sequence;
S7. Map the coarse-scale node representation sequence to the fine-scale nodes according to the node allocation matrix, fuse it with the fine-scale node representation sequence, and output the marine environmental situation prediction results corresponding to each observation node under the preset prediction lead.

2. The method for predicting marine multi-scale environmental situation based on graph neural networks according to claim 1, wherein S1 includes: The marine environmental observation data of multiple observation nodes within the target sea area within a preset time period are acquired. The marine environmental observation data include surface or layer current velocity data, current direction data, wind speed data, and wind direction data. The marine environmental observation data are then time-aligned and coordinate-unified to form the observation dataset. Simultaneously, the geographic constraint data corresponding to the target sea area are obtained. The geographic constraint data include shoreline data used to characterize the land-sea boundary and water depth data used to characterize the water depth distribution. The geographic constraint data are then coordinate-unified to form the geographic constraint dataset.

3. The method for predicting marine multi-scale environmental situation based on graph neural networks according to claim 1, wherein S2 includes: Each observation node is assigned a unique identifier based on its geographical location, and its longitude, latitude, and water depth information are recorded to construct the node set. The time series is determined according to the time alignment result of the observation dataset. At each moment of the time series, the flow field data and wind field data corresponding to each observation node in the observation dataset are mapped to the node feature vector of that observation node. The node feature vectors at each moment are arranged in chronological order to form the node feature sequence. The flow velocity, flow direction, wind speed, and wind direction in the flow field data and wind field data are converted into component representations in the same coordinate system and numerically normalized. Where data are missing, either interpolation is used to complete the data or missing-data markers are set in the node feature vectors, so that the time length of the node feature sequence remains consistent.
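
As a non-limiting illustration of the component conversion, normalization, and missing-data marking described in claim 3, the following minimal sketch may help. The function names, the feature layout, the z-score normalization, and the direction convention (degrees clockwise from north, pointing where the flow is heading) are assumptions made for the example and are not specified in the claim; wind directions reported as the direction the wind blows from would need 180 degrees added first.

```python
import math

def to_components(speed, direction_deg):
    """Return (east, north) components for a speed and a direction of motion
    given in degrees clockwise from north (assumed convention)."""
    theta = math.radians(direction_deg)
    return speed * math.sin(theta), speed * math.cos(theta)

def normalize(values):
    """Z-score normalization (one possible choice); a constant series maps to zeros."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std if std > 0 else 0.0 for v in values]

def node_feature(cur_speed, cur_dir, wind_speed, wind_dir):
    """Build [u_cur, v_cur, u_wind, v_wind, missing_flag] for one node at one moment.
    None marks a missing observation: components are zero-filled and the flag is set."""
    if any(x is None for x in (cur_speed, cur_dir, wind_speed, wind_dir)):
        return [0.0, 0.0, 0.0, 0.0, 1.0]
    u_c, v_c = to_components(cur_speed, cur_dir)
    u_w, v_w = to_components(wind_speed, wind_dir)
    return [u_c, v_c, u_w, v_w, 0.0]
```

Setting a marker instead of dropping the time step keeps every node feature sequence the same length, which is the purpose stated in the claim.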

4. The method for predicting marine multi-scale environmental situation based on graph neural networks according to claim 1, wherein S3 includes: For each source node in the node set, the drift velocity is determined using the flow field data and wind field data in the node feature sequence within a preset prediction time window, and the drift velocity is numerically integrated with a preset time step to obtain the drift trajectory. Multiple sets of initial disturbance conditions are generated for the initial position and initial velocity of the source node according to the preset disturbance distribution, and the numerical integration is performed for each set of initial disturbance conditions, the resulting set of drift trajectories forming the reachable domain. For any target node, the arrival probability is determined as the proportion of drift trajectories in the reachable domain that enter a preset neighborhood radius centered on the target node, and the moment when a drift trajectory first enters the preset neighborhood radius is determined as the arrival time. The downstream consistency coefficient is calculated based on the angle between the velocity direction of the drift trajectory at the arrival time and the displacement direction from the source node to the target node. When the arrival probability is not less than a preset probability threshold and the arrival time is not greater than a preset time threshold, a fine-scale directed edge is generated from the source node to the target node. Based on the shoreline data, fine-scale directed edges whose drift trajectories intersect the shoreline are removed, and the path cost of the remaining fine-scale directed edges is calculated based on the water depth data along the drift trajectory. Fine-scale edge weights are calculated based on the arrival probability, arrival time, downstream consistency coefficient, and path cost, and the fine-scale edge feature set is constructed from the fine-scale edge weights, arrival time, and downstream consistency coefficient. The fine-scale directed causal mask is generated based on the fine-scale directed edges.
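
As a non-limiting illustration of the ensemble drift integration in claim 4, the sketch below estimates the arrival probability, arrival time, and downstream consistency coefficient for one source-target pair. Several choices are assumptions made for the example and are not taken from the claim: the forward-Euler integrator, the leeway factor applied to the wind, Gaussian perturbation of the initial position only, averaging the first-entry times over the ensemble, and using the cosine of the angle as the consistency coefficient. The shoreline and water depth processing is omitted.

```python
import math
import random

def drift_velocity(current, wind, leeway=0.03):
    """Drift velocity as current plus a fraction of the wind (leeway value is assumed)."""
    return (current[0] + leeway * wind[0], current[1] + leeway * wind[1])

def integrate(start, velocity_fn, dt, n_steps):
    """Forward-Euler drift trajectory; returns a list of (t, x, y, vx, vy)."""
    x, y = start
    trajectory = []
    for step in range(n_steps):
        vx, vy = velocity_fn(x, y, step * dt)
        x, y = x + vx * dt, y + vy * dt
        trajectory.append(((step + 1) * dt, x, y, vx, vy))
    return trajectory

def edge_statistics(source, target, velocity_fn, dt, n_steps, radius,
                    n_members=200, sigma=0.5, seed=0):
    """Return (arrival probability, mean first-arrival time, mean direction cosine)."""
    rng = random.Random(seed)
    dx, dy = target[0] - source[0], target[1] - source[1]
    dist = math.hypot(dx, dy)
    times, cosines = [], []
    for _ in range(n_members):
        start = (source[0] + rng.gauss(0.0, sigma), source[1] + rng.gauss(0.0, sigma))
        for t, x, y, vx, vy in integrate(start, velocity_fn, dt, n_steps):
            if math.hypot(x - target[0], y - target[1]) <= radius:
                times.append(t)  # first entry into the neighborhood of the target
                speed = math.hypot(vx, vy)
                if speed > 0 and dist > 0:
                    cosines.append((vx * dx + vy * dy) / (speed * dist))
                break
    prob = len(times) / n_members
    arrival_time = sum(times) / len(times) if times else math.inf
    consistency = sum(cosines) / len(cosines) if cosines else 0.0
    return prob, arrival_time, consistency
```

Under these assumptions, a fine-scale directed edge from the source to the target would be kept when `prob` is not less than the probability threshold and `arrival_time` is not greater than the time threshold.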

5. The method for predicting marine multi-scale environmental situation based on graph neural networks according to claim 1, wherein S4 includes: In the graph Transformer network, for each time step of the node feature sequence, a corresponding query vector, key vector, and value vector are generated based on the node feature vector at that time step. For each fine-scale directed edge in the fine-scale directed edge set, a physical prior bias term is calculated based on the fine-scale edge feature vector corresponding to that fine-scale directed edge. For any source node and target node connected by a fine-scale directed edge, an attention score is calculated based on the query vector and the key vector, and the physical prior bias term is added to the attention score to obtain the physical prior bias attention. The physical prior bias attention is constrained by the fine-scale directed causal mask, so that node pairs not defined in the fine-scale directed edge set do not participate in the attention calculation. The value vectors are weighted and aggregated based on the constrained physical prior bias attention to obtain the fine-scale node representation at that time step, and the fine-scale node representations at each time step are arranged in chronological order to form the fine-scale node representation sequence.
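
As a non-limiting illustration of the physical prior bias attention in claim 5, the following single-head sketch adds an edge-wise bias to the query-key score and applies the directed causal mask before the softmax. The function name, the scaling by the square root of the vector dimension, the orientation `mask[i][j]` (receiver `i` may attend to sender `j`), and the fallback for a node with no admissible sender are assumptions made for the example.

```python
import math

def biased_masked_attention(q, k, v, bias, mask):
    """q, k, v: per-node vectors (lists of lists).
    bias[i][j]: physical prior bias for the directed edge j -> i.
    mask[i][j]: True if node i may receive a message from node j."""
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        scores = {}
        for j in range(n):
            if mask[i][j]:  # node pairs outside the edge set never enter the softmax
                dot = sum(a * b for a, b in zip(q[i], k[j]))
                scores[j] = dot / math.sqrt(d) + bias[i][j]
        if not scores:
            out.append(list(v[i]))  # assumption: an isolated node keeps its own value
            continue
        top = max(scores.values())
        weights = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(weights.values())
        out.append([sum(weights[j] / total * v[j][c] for j in weights)
                    for c in range(len(v[0]))])
    return out
```

Because masked pairs are excluded before normalization rather than multiplied by zero afterwards, the attention weights over the admissible senders still sum to one.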

6. The method for predicting marine multi-scale environmental situation based on graph neural networks according to claim 1, wherein S5 includes: A pooling score is calculated for each fine-scale node in the fine-scale node representation sequence, and the number of coarse-scale nodes is determined according to a preset pooling ratio. A node allocation matrix from fine-scale nodes to coarse-scale nodes is generated based on the pooling scores, and the coarse-scale node set is determined based on the node allocation matrix. At each time step of the fine-scale node representation sequence, the fine-scale node representations at that time step are weighted and aggregated according to the node allocation matrix to obtain the coarse-scale node features at that time step, and the coarse-scale node features at each time step are arranged in chronological order to form the coarse-scale node feature sequence. For each fine-scale directed edge in the fine-scale directed edge set, the coarse-scale node corresponding to its source node and the coarse-scale node corresponding to its target node are determined according to the node allocation matrix, thereby generating the coarse-scale directed edge set. The fine-scale edge weights that share the same source coarse-scale node and the same target coarse-scale node are aggregated to obtain the corresponding coarse-scale edge weight, and these weights form the coarse-scale edge weight set. The coarse-scale directed causal mask is generated based on the coarse-scale directed edge set.
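
As a non-limiting illustration of the aggregation steps in claim 6, the sketch below pools node features with a given node allocation matrix and maps fine-scale directed edges onto coarse-scale directed edges. The allocation matrix is taken as an input, so the pooling score and pooling ratio are not modeled. Assigning each fine-scale node to its highest-weight coarse-scale node, summing the edge weights, dropping edges that fall inside a single coarse-scale node, and allowing self-attention in the mask are assumptions made for the example.

```python
def pool_features(x, s):
    """x: n_fine x d features; s: n_fine x n_coarse allocation weights.
    Returns the n_coarse x d matrix S^T X."""
    n_coarse, d = len(s[0]), len(x[0])
    return [[sum(s[i][c] * x[i][f] for i in range(len(x))) for f in range(d)]
            for c in range(n_coarse)]

def coarsen_edges(edges, s):
    """edges: {(src, dst): weight} on fine-scale nodes.
    Returns {(coarse_src, coarse_dst): summed weight}."""
    owner = [max(range(len(row)), key=row.__getitem__) for row in s]
    coarse = {}
    for (src, dst), weight in edges.items():
        a, b = owner[src], owner[dst]
        if a != b:  # assumption: edges inside one coarse-scale node are dropped
            coarse[(a, b)] = coarse.get((a, b), 0.0) + weight
    return coarse

def mask_from_edges(edges, n):
    """mask[dst][src] is True when a directed edge src -> dst exists.
    The diagonal is set to True (self-attention allowed, an assumption)."""
    mask = [[i == j for j in range(n)] for i in range(n)]
    for src, dst in edges:
        mask[dst][src] = True
    return mask
```

Deriving the coarse-scale mask from the coarsened fine-scale edges, rather than rebuilding it independently, is what keeps the two scales tied to the same drift-based source.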

7. The method for predicting marine multi-scale environmental situation based on graph neural networks according to claim 1, wherein S6 includes: In the graph Transformer network, for each time step of the coarse-scale node feature sequence, a corresponding query vector, key vector, and value vector are generated based on the coarse-scale node features at that time step. For each coarse-scale directed edge in the coarse-scale directed edge set, a physical prior bias term is calculated based on the coarse-scale edge weight corresponding to that coarse-scale directed edge. For any source coarse-scale node and target coarse-scale node connected by a coarse-scale directed edge, an attention score is calculated based on the query vector and the key vector, and the physical prior bias term is added to the attention score to obtain the physical prior bias attention. The physical prior bias attention is constrained by the coarse-scale directed causal mask, so that coarse-scale node pairs not defined in the coarse-scale directed edge set do not participate in the attention calculation. The value vectors are weighted and aggregated based on the constrained physical prior bias attention to obtain the coarse-scale node representation at that time step, and the coarse-scale node representations at each time step are arranged in chronological order to form the coarse-scale node representation sequence.

8. The method for predicting marine multi-scale environmental situation based on graph neural networks according to claim 1, wherein S7 includes: In the node allocation matrix, the coarse-scale node corresponding to each fine-scale node is determined according to the allocation weight of each fine-scale node, and the coarse-scale node representation sequence is upsampled according to the node allocation matrix to form a mapped coarse-scale representation sequence at the granularity of the node set. The mapped coarse-scale representation sequence and the fine-scale node representation sequence are fused according to a preset fusion rule to obtain a fused representation sequence, wherein the preset fusion rule is one of weighted summation, linear transformation after concatenation, or gated fusion. The fused representation sequence is regressed or classified using a prediction head network to obtain the marine environmental situation prediction results for each observation node under the preset prediction lead. The coarse-scale node representation sequence is regressed or classified using a coarse-scale prediction head network to obtain the coarse-scale marine environmental situation prediction results corresponding to the coarse-scale node set.
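
As a non-limiting illustration of the upsampling and fusion in claim 8, the sketch below maps coarse-scale representations back to the fine-scale nodes through the node allocation matrix and shows two of the named fusion rules. The function names, the default mixing coefficient, and the scalar gate parameters (which would be learned in practice) are assumptions made for the example; the prediction head networks are omitted.

```python
import math

def upsample(h_coarse, s):
    """s: n_fine x n_coarse allocation weights; returns the n_fine x d matrix S H_coarse."""
    return [[sum(row[c] * h_coarse[c][f] for c in range(len(h_coarse)))
             for f in range(len(h_coarse[0]))] for row in s]

def weighted_sum_fusion(h_fine, h_up, alpha=0.5):
    """Fuse as alpha * fine + (1 - alpha) * mapped coarse."""
    return [[alpha * a + (1.0 - alpha) * b for a, b in zip(rf, ru)]
            for rf, ru in zip(h_fine, h_up)]

def gated_fusion(h_fine, h_up, w_fine=0.0, w_up=0.0, bias=0.0):
    """Element-wise gate g = sigmoid(w_fine * fine + w_up * coarse + bias);
    the fused value is g * fine + (1 - g) * coarse."""
    fused = []
    for rf, ru in zip(h_fine, h_up):
        row = []
        for a, b in zip(rf, ru):
            g = 1.0 / (1.0 + math.exp(-(w_fine * a + w_up * b + bias)))
            row.append(g * a + (1.0 - g) * b)
        fused.append(row)
    return fused
```

With all gate parameters at zero the gate equals 0.5 everywhere, so gated fusion reduces to the equal-weight case of weighted summation.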

9. The method for predicting marine multi-scale environmental situation based on graph neural networks according to claim 5, characterized in that the physical prior bias term is a function of the arrival probability, arrival time, downstream consistency coefficient, and path cost, and satisfies the following: the physical prior bias term increases when the arrival probability increases, decreases when the arrival time increases, and decreases when the path cost increases; the physical prior bias term includes a term that takes the logarithm of the arrival probability and/or a term that decays exponentially with the arrival time; and the physical prior bias term is truncated or normalized to limit its numerical range.
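
As a non-limiting illustration, one function that satisfies the monotonicity and range conditions of claim 9 is sketched below. The additive form, the weights, the decay constant `tau`, the small constant `eps` that keeps the logarithm finite, and the clipping bounds are all assumptions made for the example; the claim constrains only the qualitative behavior.

```python
import math

def physical_prior_bias(p, t, consistency, cost, tau=6.0, w_time=1.0,
                        w_dir=1.0, w_cost=0.1, lo=-10.0, hi=10.0, eps=1e-6):
    """Bias = log(p) + w_time * exp(-t / tau) + w_dir * consistency - w_cost * cost,
    truncated to the interval [lo, hi]."""
    raw = (math.log(p + eps)                # rises with the arrival probability
           + w_time * math.exp(-t / tau)    # decays with the arrival time
           + w_dir * consistency            # rewards downstream-aligned edges
           - w_cost * cost)                 # penalizes costly paths
    return max(lo, min(hi, raw))
```

Truncation keeps a near-zero arrival probability from driving the attention score toward negative infinity, which would otherwise duplicate the role of the directed causal mask.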

10. The method for predicting marine multi-scale environmental situation based on graph neural networks according to claim 6, characterized in that, when the fine-scale edge weights that share the same source coarse-scale node and the same target coarse-scale node are aggregated, a weighted summation or weighted mean with the arrival probabilities as weights is used; furthermore, the minimum or the weighted average of the fine-scale arrival times is used as part of the coarse-scale edge features to generate the coarse-scale directed causal mask.
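
As a non-limiting illustration of the aggregation in claim 10, the sketch below combines the fine-scale edges that map to one coarse-scale node pair, using the arrival probabilities as weights. The function name, the tuple layout, and the handling of an all-zero probability group are assumptions made for the example.

```python
import math

def aggregate_coarse_edge(fine_edges):
    """fine_edges: list of (edge_weight, arrival_prob, arrival_time) for fine-scale
    edges sharing one source coarse-scale node and one target coarse-scale node.
    Returns (weighted-mean weight, minimum arrival time, weighted-mean arrival time)."""
    total_p = sum(p for _, p, _ in fine_edges)
    if total_p == 0:
        return 0.0, math.inf, math.inf  # assumption: no reachable fine-scale edge
    weight = sum(w * p for w, p, _ in fine_edges) / total_p
    t_min = min(t for _, _, t in fine_edges)
    t_avg = sum(t * p for _, p, t in fine_edges) / total_p
    return weight, t_min, t_avg
```

The minimum arrival time reflects the fastest connection between two coarse-scale regions, while the probability-weighted mean reflects the typical one; either may serve as the coarse-scale edge feature under the claim.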