A traffic prediction method and system based on a space-time diffusion attention network

By constructing a maritime road network and a spatiotemporal diffusion attention network (ST-DAN), the invention addresses the difficulty of processing the spatiotemporal characteristics of maritime vessel traffic data, improving the accuracy and applicability of maritime traffic forecasting and optimizing vessel operation and management.

CN122242837A · Pending · Publication Date: 2026-06-19 · INST OF COMPUTING TECH CHINESE ACAD OF SCI

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
INST OF COMPUTING TECH CHINESE ACAD OF SCI
Filing Date
2026-02-25
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies struggle to effectively handle the spatiotemporal characteristics of maritime vessel traffic data, resulting in low accuracy in maritime traffic prediction and insufficient applicability of existing models in maritime traffic scenarios.

Method used

The concept of a maritime road network is constructed by dividing the sea surface into grids. By integrating the diffusion mechanism and the probabilistic sparse attention mechanism through the Spatiotemporal Diffusion Attention Network (ST-DAN), the spatiotemporal characteristics of ship trajectories are captured to generate future traffic flow predictions.

Benefits of technology

It significantly reduced MAE, MAPE, and RMSE, optimized ship operations and maritime traffic management, and improved the accuracy and applicability of maritime traffic forecasting.

✦ Generated by Eureka AI based on patent content.


Abstract

This application discloses a traffic flow prediction method and system based on a spatiotemporal diffusion attention network. The system includes: a maritime road network construction module for dividing the sea surface into multiple grids, mapping ship nodes in the same area to the same grid, and outputting a set of ship trajectories represented by grid node trajectories; a spatiotemporal graph construction module for constructing a weighted adjacency matrix of a directed graph based on each set of ship trajectories, extracting atomic files from the adjacency matrix, and constructing a spatiotemporal graph based on the atomic files; and a spatiotemporal diffusion attention network construction module for constructing a spatiotemporal diffusion attention network, which includes: an input module, a spatiotemporal convolution module, and an output module. The input module is used to process the atomic files into dataset features in a standard data format; the spatiotemporal convolution module is used to capture the spatiotemporal features of the dataset features; and the output module is used to generate traffic flow prediction values for future time steps based on the features extracted by the spatiotemporal convolution module.

Description

Technical Field

[0001] This invention relates to the field of water transport technology, specifically to a flow prediction method and system based on a spatiotemporal diffusion attention network.

Background Technology

[0002] In recent years, with the development of the global shipping industry, the number of ships has increased significantly, and maritime transport has also developed rapidly. Inland waterway transport has played a crucial role in the economic development of various countries. Unlike sea routes, ship navigation on inland waterways is affected by many factors, such as channel width, water depth, vertical clearance, and complex encounter situations. These factors greatly increase the difficulty of ship maneuvering and raise the potential risk of maritime accidents. Early intervention based on traffic forecasting is considered key to improving the efficiency of transportation systems and alleviating traffic problems. Therefore, optimizing ship operations and improving navigation efficiency by analyzing and forecasting ship traffic flow is an important task for maritime regulatory agencies.

[0003] First, maritime traffic data differs from urban traffic flow data. Ship flow and throughput data have large statistical time and spatial spans and exhibit significant fluctuations, making traffic flow prediction extremely difficult. Second, waterways lack the strict constraints of urban road networks, and the complex spatial relationships between waterways also affect prediction accuracy. Furthermore, the spatiotemporal characteristics of waterways are affected by spatiotemporal heterogeneity, with adjacent waterway characteristics influencing each other. In addition, AIS-based ship traffic data is typical spatiotemporal data, containing both static time-based information and dynamic ship information, including MMSI, timestamps, latitude and longitude, heading, and speed. Efficiently processing this multi-layered information containing both temporal and spatial characteristics directly impacts traffic prediction performance.

[0004] The aforementioned challenges have attracted the attention of many scholars. Maritime traffic forecasting falls under the category of medium- to long-term forecasting, and previous methods for this purpose fall broadly into two types: dynamic modeling and data-driven methods. Dynamic modeling utilizes mathematical tools and physical knowledge to solve traffic problems through computational simulation. However, the simulation process requires complex system programming and consumes significant computational resources, and unrealistic assumptions and simplifications can reduce forecast accuracy. In contrast, data-driven methods use classical statistical models and machine learning models for prediction. The Autoregressive Integrated Moving Average (ARIMA) model and its variants are representative of classical statistics.

[0005] In the field of deep learning, networks such as recurrent neural networks (RNNs) and long short-term memory (LSTMs) have been applied to traffic prediction, with LSTMs used to predict inland waterway vessel flow. However, these methods struggle to jointly extract spatiotemporal features from the input. To fully utilize spatial features, researchers use convolutional neural networks (CNNs) to capture spatial relationships within traffic networks, while simultaneously employing recurrent neural networks (RNNs) to model the temporal dimension.

[0006] To date, numerous groundbreaking spatiotemporal graph convolutional models have been proposed. Representative works include the graph neural network for multivariate time series forecasting (MTGNN), the Diffusion Convolutional Recurrent Neural Network (DCRNN), and the Spatio-Temporal Graph Convolutional Network (STGCN), all of which have demonstrated exceptional capabilities in urban traffic prediction. This application decouples the traffic prediction task from urban traffic scenarios and examines whether existing successful traffic prediction models are well suited to marine traffic scenarios.

[0007] Ship traffic and throughput data have a large statistical time span, a large spatial span, and significant data fluctuations. At the same time, ships are not subject to the strict constraints of road networks compared to vehicles. Therefore, some spatiotemporal graph convolutional traffic prediction models are not suitable for shipping scenarios.

[0008] Existing research can be broadly divided into two categories:

[0009] 1) A research method focuses on the dynamic prediction method and system of traffic flow on municipal roads. The prediction method includes the following steps: using on-board sensors of connected vehicles to form a dynamic self-organizing perception network, collecting data through the vehicle dynamic self-organizing perception network, and outputting traffic flow prediction results through model training, thereby improving the accuracy of traffic flow prediction.

[0010] However, this research method mainly involves dynamic prediction of traffic flow on municipal roads. It collects data through a vehicle dynamic self-organizing perception network, trains a model, and outputs traffic flow prediction results, thus improving the accuracy of traffic flow prediction. It does not propose integrating diffusion mechanisms and probabilistic sparse attention mechanisms into spatiotemporal graph neural networks, nor does it propose a spatiotemporal diffusion attention network, nor does it propose the concept and construction method of maritime road networks for application in waterway traffic to predict ship traffic flow.

[0011] 2) Another research focuses on providing residual spatiotemporal diffusion models, which can achieve accurate prediction of the orbital satellite status through modular design and effective image fusion methods.

[0012] However, this study employs a residual spatiotemporal diffusion model, utilizing modular design and effective image fusion to achieve accurate prediction of orbital satellite states. It does not propose constructing a spatiotemporal diffusion attention network based on the spatiotemporal attributes of a ship traffic dataset using AIS and geographic data. Furthermore, it has not been applied to waterway traffic or to predict ship traffic flow.

[0013] Therefore, in response to the challenges mentioned above and the limitations of previous work, it is necessary to study how to construct a maritime vessel traffic dataset, introduce the concept of trajectory pixelation, and propose the concept and construction method of maritime road networks. In addition, it is necessary to propose a new spatiotemporal graph neural network—the spatiotemporal diffusion attention network—for vessel traffic prediction.

[0014] There is an urgent need to research a new method and system to separate traffic flow prediction from urban traffic scenarios and apply it to marine traffic scenarios, and to explore in depth whether existing successful traffic prediction models have good adaptability to marine traffic scenarios.

Summary of the Invention

[0015] To address the problems existing in current technologies, this application constructs a ship traffic dataset based on AIS and marine watershed geographic data, and proposes the concept and construction method of a maritime road network. The performance and applicability of two typical classes of spatiotemporal graph convolutional traffic prediction models in ship traffic prediction are discussed. By integrating a diffusion mechanism and a probabilistic sparse attention mechanism into a spatiotemporal graph neural network, a spatiotemporal diffusion attention network (ST-DAN) is proposed. Based on the open-source traffic prediction framework LibCity, full, high, and low traffic prediction experiments are conducted under fair conditions. Comprehensive analysis shows that MAE, MAPE, and RMSE are reduced by 1.6%–14.9%. By implementing the proposed model, ship operations and maritime traffic can be optimized, thereby benefiting stakeholders such as maritime regulatory agencies, ship operators, freight forwarders, and port management departments.

[0016] In a first aspect, embodiments of this application provide a traffic prediction system based on a spatiotemporal diffusion attention network, applied to marine transportation. The system includes:

[0017] Maritime road network construction module: used to divide the sea surface into multiple grids, map ship nodes in the same area to the same grid, and output a set of ship trajectories represented by grid node trajectories;

[0018] The spatiotemporal graph construction module is used to construct a weighted adjacency matrix of a directed graph based on each group of ship trajectories, extract atomic files from the adjacency matrix, and construct the spatiotemporal graph based on the atomic files.

[0019] The Spatiotemporal Diffusion Attention Network Construction Module is used to construct a spatiotemporal diffusion attention network. The spatiotemporal diffusion attention network includes an input module, a spatiotemporal convolution module, and an output module. The input module is used to process atomic files into dataset features in a standard data format; the spatiotemporal convolution module is used to capture the spatiotemporal features of the dataset features; and the output module is used to generate traffic flow prediction values for future time steps based on the features extracted by the spatiotemporal convolution module.

[0020] In this embodiment of the invention, the above-mentioned maritime road network construction module includes:

[0021] Sea surface grid division and trajectory mapping module: Divides the selected sea surface area into multiple equal sea surface grids, maps ship trajectory data to multiple sea surface grids, ships in the same area are mapped to the same grid, and obtains the ship navigation trajectory represented by grid node trajectory, and converts the grid node trajectory data into a preset coordinate format;

[0022] The trajectory data processing module processes the obtained grid node trajectory data into a trajectory sequence with the grid entity number as the primary key in chronological order. It calculates the range of all grids and uses the latitude and longitude of the center point of all grids as the latitude and longitude of the grid entity to obtain a set of ship trajectories under the grid, with each time point corresponding to one ship trajectory.
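The grid division and trajectory mapping described in paragraphs [0021] and [0022] can be illustrated with a minimal sketch. The bounding box, the 4 x 4 resolution, the row-major grid numbering, and the function names `grid_id`, `grid_center`, and `to_grid_trajectories` are all assumptions made for illustration; the application does not state these values or names.

```python
from typing import Dict, List, Tuple

# Hypothetical bounding box and resolution (not taken from the application).
LAT_MIN, LAT_MAX = 31.0, 31.4
LON_MIN, LON_MAX = 121.4, 121.8
N_ROWS, N_COLS = 4, 4


def grid_id(lat: float, lon: float) -> int:
    """Return the number of the equal-sized grid cell containing a position."""
    row = int((lat - LAT_MIN) / (LAT_MAX - LAT_MIN) * N_ROWS)
    col = int((lon - LON_MIN) / (LON_MAX - LON_MIN) * N_COLS)
    # Clamp so that points on or beyond the boundary fall into an edge cell.
    row = min(max(row, 0), N_ROWS - 1)
    col = min(max(col, 0), N_COLS - 1)
    return row * N_COLS + col


def grid_center(gid: int) -> Tuple[float, float]:
    """Return the (lat, lon) of a cell's center, used as the grid entity's position."""
    row, col = divmod(gid, N_COLS)
    lat = LAT_MIN + (row + 0.5) * (LAT_MAX - LAT_MIN) / N_ROWS
    lon = LON_MIN + (col + 0.5) * (LON_MAX - LON_MIN) / N_COLS
    return lat, lon


def to_grid_trajectories(
    points: List[Tuple[int, int, float, float]]
) -> Dict[int, List[int]]:
    """Convert (mmsi, timestamp, lat, lon) records into per-ship grid-node sequences.

    Records are ordered chronologically per ship, and consecutive records that
    fall in the same cell are collapsed into a single grid node.
    """
    trajectories: Dict[int, List[int]] = {}
    for mmsi, _ts, lat, lon in sorted(points, key=lambda p: (p[0], p[1])):
        gid = grid_id(lat, lon)
        track = trajectories.setdefault(mmsi, [])
        if not track or track[-1] != gid:
            track.append(gid)
    return trajectories
```

Collapsing consecutive duplicates is one possible design choice: it keeps only cell-to-cell movements, which is the information a grid-level road network needs.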

[0023] In this embodiment of the invention, the spatiotemporal graph construction module includes:

[0024] The adjacency matrix construction module collects navigation data from the Automatic Identification System (AIS) and constructs a weighted adjacency matrix of a directed graph based on the ship's trajectory. The weights of the adjacency matrix represent the number of connections between entity nodes, and the adjacency matrix reflects the geographical relationships between the nodes.

[0025] Atomic file extraction module: Extracts atomic files from network entities in the adjacency matrix. Atomic files include: geographic file, relation file, and dynamic file. The geographic file records the entity node number, type, and corresponding spatial coordinates; the relation file records the associations between all nodes in a many-to-many format based on the entity nodes; the dynamic file records the time-series data of each entity node.

[0026] Spatiotemporal graph construction module: Based on the trajectory points of ships at sea, a spatiotemporal graph is constructed from the relative positions of the ships at each time step; the graph includes the set of ship vertices and the set of edges.
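As an illustration of the weighted adjacency matrix of paragraph [0024], the following sketch counts transitions between consecutive grid nodes in each ship trajectory. Interpreting "the number of connections between entity nodes" as the number of observed cell-to-cell transitions is an assumption, as is the function name `build_adjacency`.

```python
from typing import Dict, List


def build_adjacency(trajectories: Dict[int, List[int]], n_nodes: int) -> List[List[int]]:
    """Build a directed, weighted adjacency matrix from grid-node trajectories.

    adj[src][dst] is the number of times any ship moved directly from grid
    node src to grid node dst (assumed meaning of "number of connections").
    """
    adj = [[0] * n_nodes for _ in range(n_nodes)]
    for track in trajectories.values():
        # Pair each grid node with its successor in the same trajectory.
        for src, dst in zip(track, track[1:]):
            if src != dst:
                adj[src][dst] += 1
    return adj
```

Because the counts are directional, `adj[src][dst]` and `adj[dst][src]` generally differ, which is what makes the graph directed.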

[0027] In this embodiment of the invention, the above-mentioned spatiotemporal diffusion attention network construction module includes:

[0028] Input module: By loading dynamic files, it extracts traffic data and time series, generates time series samples through a sliding window based on the inflow and outflow of entities, performs data format conversion preprocessing, and outputs the directed weighted adjacency matrix and traffic data of the maritime road network.

[0029] Spatiotemporal convolution module: Receives a directed weighted adjacency matrix and traffic data, and uses convolution to capture spatiotemporal features. The spatiotemporal convolution module includes two spatiotemporal convolution blocks, each containing two temporal convolutional layers and one spatial convolutional layer; the temporal convolutional layers are used to capture temporal features, and the spatial convolutional layer is used to capture spatial features.

[0030] Output module: After receiving the features output by the spatiotemporal convolution module, normalizes them, integrates and summarizes the features, and generates traffic flow prediction values for future time steps. The output module includes two temporal convolutional layers and a fully convolutional layer. The temporal convolutional layer is used to further extract temporal correlations, and the fully convolutional layer is used to integrate multi-level features.
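The sliding-window sample generation mentioned for the input module in paragraph [0028] can be sketched as follows. The window lengths and the function name `sliding_windows` are hypothetical; the application does not specify them.

```python
from typing import List, Sequence, Tuple


def sliding_windows(
    series: Sequence, input_len: int, output_len: int
) -> List[Tuple[Sequence, Sequence]]:
    """Cut a time series into (history, target) pairs with a stride of one step.

    Each sample uses input_len consecutive steps as model input and the
    following output_len steps as the prediction target.
    """
    samples = []
    last_start = len(series) - input_len - output_len
    for start in range(last_start + 1):
        split = start + input_len
        samples.append((series[start:split], series[split:split + output_len]))
    return samples
```

Each element of `series` may itself be a pair such as `(inflow, outflow)` for one grid node, so the same function serves multi-feature series.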

[0031] In this embodiment of the invention, the temporal convolutional layer in the above-mentioned spatiotemporal convolutional module includes a temporal attention layer, a two-dimensional causal convolution, and a gated linear unit;

[0032] The input data passes through a temporal attention layer to learn the correlations between different time points in the data. After processing by the temporal attention layer, it is further processed by alignment operations and two-dimensional causal convolution. In the two-dimensional causal convolution, local dependencies in the time series are captured. The fused result and the remaining part of the two-dimensional causal convolution are passed through a gated linear unit. The feature fusion result is weighted and added to the aligned original input to obtain the gated output features. Then, an activation function is combined to introduce nonlinearity into the gated output features.
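A simplified sketch of the causal convolution and gated linear unit in paragraph [0032] is given here for a single node. It omits the temporal attention layer, uses a one-dimensional convolution over time rather than the two-dimensional form, and assumes an STGCN-style gate with a zero-padded residual and a ReLU activation; none of these details is confirmed by the application, and the name `causal_glu_conv` is illustrative.

```python
import numpy as np


def causal_glu_conv(x: np.ndarray, weight: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Causal temporal convolution followed by a gated linear unit.

    x:      (T, C_in) features of one node over T time steps
    weight: (K, C_in, 2 * C_out) convolution kernel
    bias:   (2 * C_out,)
    Returns an array of shape (T, C_out).
    """
    k, c_in, two_c_out = weight.shape
    c_out = two_c_out // 2
    t_len = x.shape[0]
    # Left-pad with K-1 zero steps so the output at time t sees only inputs <= t.
    padded = np.concatenate([np.zeros((k - 1, c_in)), x], axis=0)
    out = np.empty((t_len, two_c_out))
    for t in range(t_len):
        window = padded[t:t + k]  # inputs at times t-K+1 .. t
        out[t] = np.tensordot(window, weight, axes=([0, 1], [0, 1])) + bias
    p, q = out[:, :c_out], out[:, c_out:]
    # Align the input to C_out channels for the residual (zero-pad or truncate).
    if c_in < c_out:
        residual = np.pad(x, ((0, 0), (0, c_out - c_in)))
    else:
        residual = x[:, :c_out]
    gate = 1.0 / (1.0 + np.exp(-q))   # sigmoid gate
    gated = (p + residual) * gate     # gated output features
    return np.maximum(gated, 0.0)     # ReLU introduces the final nonlinearity
```

The left padding is what makes the convolution causal: changing a later input cannot alter any earlier output.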

[0033] In this embodiment of the invention, the spatial convolutional layer in the spatiotemporal convolution module captures spatial features, including the following steps:

[0034] A diffusion mechanism is introduced to establish connections between multi-hop neighbor nodes to capture the graph structure. The diffusion process is represented as a weighted combination of infinite random walks of graph signals, which converges to a stationary distribution space after multiple time steps.

[0035] A probabilistic sparse self-attention mechanism is introduced. By using KL divergence comparison to determine the difference between the current vector and the target distribution, vectors that contribute little to the final result are identified and simplified or discarded.

[0036] Graph convolution is introduced to graph-structured data to extract highly meaningful patterns and features in the spatial domain.
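The diffusion mechanism of paragraph [0034] and the sparsity test of paragraph [0035] can be sketched as follows. The diffusion step follows the truncated bidirectional random-walk form known from DCRNN, with scalar weights per step instead of learned per-channel filters. The attention step follows the query sparsity measure known from the Informer model (log-sum-exp of scores minus their mean, which equals the KL divergence from the uniform distribution up to a constant), computed on all keys rather than a sample. Whether ST-DAN uses exactly these forms is an assumption, and the function names are illustrative.

```python
import numpy as np


def transition_matrix(adj: np.ndarray) -> np.ndarray:
    """Row-normalize a weighted adjacency matrix into random-walk probabilities."""
    out_deg = adj.sum(axis=1, keepdims=True)
    # Rows without outgoing edges stay all-zero instead of dividing by zero.
    return np.divide(adj, out_deg, out=np.zeros_like(adj, dtype=float), where=out_deg > 0)


def diffusion_conv(x, adj, theta_fwd, theta_bwd):
    """K-step diffusion: sum_k theta_fwd[k] * P_f^k x + theta_bwd[k] * P_b^k x.

    x: (N, C) node signals; adj: (N, N); theta_*: length-K scalar weights.
    """
    p_fwd = transition_matrix(adj)      # walks along edge direction
    p_bwd = transition_matrix(adj.T)    # walks against edge direction
    h_fwd = x.astype(float)
    h_bwd = x.astype(float)
    out = np.zeros_like(h_fwd)
    for k in range(len(theta_fwd)):
        out += theta_fwd[k] * h_fwd + theta_bwd[k] * h_bwd
        h_fwd = p_fwd @ h_fwd           # advance one more hop
        h_bwd = p_bwd @ h_bwd
    return out


def prob_sparse_attention(q, k, v, n_active):
    """Give full attention to the n_active most informative queries only.

    Remaining queries receive the mean of v as output.
    q: (Lq, d); k: (Lk, d); v: (Lk, dv). Returns (output, active query indices).
    """
    scores = q @ k.T / np.sqrt(q.shape[1])
    row_max = scores.max(axis=1)
    log_sum_exp = row_max + np.log(np.exp(scores - row_max[:, None]).sum(axis=1))
    sparsity = log_sum_exp - scores.mean(axis=1)   # large = far from uniform
    active = np.argsort(-sparsity)[:n_active]
    out = np.tile(v.mean(axis=0), (q.shape[0], 1)).astype(float)
    w = np.exp(scores[active] - scores[active].max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)              # softmax over keys
    out[active] = w @ v
    return out, active
```

Queries whose attention distribution is close to uniform contribute little beyond an average of the values, which is why replacing their output with the mean of `v` is a reasonable simplification.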

[0037] In a second aspect, embodiments of this application provide a traffic prediction method based on a spatiotemporal diffusion attention network, employing the aforementioned traffic prediction system based on a spatiotemporal diffusion attention network, applied to marine transportation, the method comprising:

[0038] The steps for constructing a maritime road network are as follows: Divide the sea surface into multiple grids, map ship nodes in the same area to the same grid, and output a set of ship trajectories represented by grid node trajectories;

[0039] The steps for constructing a spatiotemporal graph are as follows: construct a weighted adjacency matrix of a directed graph based on each group of ship trajectories, extract atomic files from the adjacency matrix, and construct the spatiotemporal graph based on the atomic files;

[0040] The construction steps of the spatiotemporal diffusion attention network are as follows: Construct a spatiotemporal diffusion attention network, which includes an input module, a spatiotemporal convolution module, and an output module. The input module is used to process atomic files into dataset features in a standard data format; the spatiotemporal convolution module is used to capture the spatiotemporal features of the dataset features; and the output module is used to generate traffic flow prediction values for future time steps based on the features extracted by the spatiotemporal convolution module.

[0041] In this embodiment of the invention, the above-mentioned spatiotemporal diffusion attention network construction steps include:

[0042] Input steps: By loading dynamic files, extract traffic data and time series, generate time series samples through a sliding window based on the inflow and outflow of entities, perform data format conversion preprocessing, and output the directed weighted adjacency matrix and traffic data of the maritime road network.

[0043] Spatiotemporal convolution steps: Receive a directed weighted adjacency matrix and traffic data, and use convolution to capture spatiotemporal features. The spatiotemporal convolution module includes two spatiotemporal convolution blocks, each containing two temporal convolutional layers and one spatial convolutional layer; the temporal convolutional layers are used to capture temporal features, and the spatial convolutional layer is used to capture spatial features.

[0044] Output steps: After normalizing the features output by the spatiotemporal convolution module, the features are integrated and summarized to generate traffic flow prediction values for future time steps. The output module includes two temporal convolutional layers and a fully convolutional layer. The temporal convolutional layer is used to further extract temporal correlations, and the fully convolutional layer is used to integrate multi-level features.

[0045] In a third aspect, embodiments of this application provide a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the traffic prediction method based on a spatiotemporal diffusion attention network described above.

[0046] In a fourth aspect, embodiments of this application provide an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, it implements the steps of the traffic prediction method based on a spatiotemporal diffusion attention network described above.

[0047] Compared with existing technologies, the present invention has the following outstanding advantages:

[0048] 1) The method and system of this invention propose the concept of a maritime road network. The unrestricted sea surface is divided into regular grids, and ship trajectories are mapped onto the maritime road network with grids as geographical units. This effectively simplifies ship traffic prediction. Historical inbound and outbound traffic data is statistically analyzed using grids as observation points to predict future traffic. Furthermore, the direction of ship trajectories can be inferred from the trends in inbound and outbound traffic values across all grids, defining the routes taken by most ships as maritime roads.

[0049] 2) This invention explores the predictive performance and applicability of classic spatiotemporal graph convolutional traffic prediction models on marine vessel datasets. The first type of methods, exemplified by DCRNN, GWNET, and MTGNN, improves temporal information extraction through diffusion mechanisms, adaptively improved graph convolution, and the addition of gating and dilation mechanisms. The second type of methods, exemplified by STGCN, STSGCN, and ASTGCN, captures spatiotemporal features by stacking spatiotemporal units. A novel spatiotemporal graph neural network, ST-DAN, is proposed, integrating diffusion and self-attention mechanisms.

[0050] 3) The method and system of this invention employ an objective, fair, and unified approach to compare the merits of different traffic prediction models. The experiments of this invention are based on the unified, comprehensive, and scalable codebase LibCity, using unified evaluation criteria, testing environment, and datasets to examine the differences in prediction results among various models.

Attached Figure Description

[0051] The accompanying drawings, which are included to provide a further understanding of this application and form part of this application, illustrate exemplary embodiments and are used to explain this application, but do not constitute an undue limitation of this application. In the drawings:

[0052] Figure 1 is a schematic diagram of the traffic prediction system based on a spatiotemporal diffusion attention network of the present invention;

[0053] Figure 2 is a schematic diagram of the construction of a maritime road network according to an embodiment of the present invention;

[0054] Figure 3 is a schematic diagram of ship trajectories under a grid according to an embodiment of the present invention;

[0055] Figure 4 is a schematic diagram illustrating the construction of a spatiotemporal graph according to an embodiment of the present invention;

[0056] Figure 5 is a schematic diagram of the topology graph corresponding to the adjacency matrix in an embodiment of the present invention;

[0057] Figure 6 is a schematic diagram of the heatmap corresponding to the adjacency matrix in an embodiment of the present invention;

[0058] Figure 7 is a schematic diagram of the spatiotemporal diffusion attention network architecture according to an embodiment of the present invention;

[0059] Figure 8 is a bar chart showing the inflow and outflow of the grid nodes in an embodiment of the present invention;

[0060] Figure 9 is a pie chart showing the ratio of inflow and outflow of the grid nodes in an embodiment of the present invention;

[0061] Figure 10 is a comparison curve of the inflow prediction performance of the model in an embodiment of the present invention;

[0062] Figure 11 is a comparison curve of the outflow prediction performance of the model in an embodiment of the present invention;

[0063] Figures 12a-12d are comparison charts of the prediction errors of ST-DAN and three baseline models in this invention;

[0064] Figures 13a-13d are comparison charts of the indices of three variants and the original model in the ablation experiment of this invention;

[0065] Figure 14 is a schematic diagram of the traffic prediction method based on a spatiotemporal diffusion attention network of the present invention;

[0066] Figure 15 is a schematic diagram of the data modeling process in an embodiment of the present invention;

[0067] Figure 16 is a schematic diagram of the computer hardware of the present invention.

Detailed Implementation

[0068] In this invention, "at least one" means one or more, and "more than one" means two or more. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of a single item or a plurality of items. For example, at least one of a, b, or c can represent: a, b, c, ab, ac, bc, or abc, where a, b, and c can be a single item or multiple items.

[0069] It should also be understood that the term "and / or" in this article is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. A and B can be singular or plural. Additionally, the character " / " in this article generally indicates an "or" relationship between the preceding and following related objects, but it can also represent an "and / or" relationship. Please refer to the context for a more accurate understanding.

[0070] It should also be understood that, in various embodiments of the present invention, the order of the above-mentioned process numbers does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.

[0071] In the several embodiments provided by this invention, it should be understood that the disclosed devices, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or other forms.

[0072] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0073] In addition, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.

[0074] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0075] To make the above-mentioned features and effects of the present invention clearer and easier to understand, specific embodiments are described below in conjunction with the accompanying drawings. This specification discloses one or more embodiments incorporating the features of the present invention. The disclosed embodiments are merely illustrative. The scope of protection of the present invention is not limited to the disclosed embodiments, but is defined by the appended claims.

[0076] The following are system embodiments corresponding to the above method embodiments. This embodiment can be implemented in conjunction with the above embodiments. The relevant technical details mentioned in the above embodiments are still valid in this embodiment, and will not be repeated here to reduce repetition. Accordingly, the relevant technical details mentioned in this embodiment can also be applied to the above embodiments.

[0077] This invention constructs a ship traffic dataset based on AIS and Huangpu River basin geographic data, and proposes the concept and construction method of a maritime road network. It discusses the performance and applicability of two typical classes of spatiotemporal graph convolutional traffic prediction models in ship traffic prediction, and proposes a spatiotemporal diffusion attention network (ST-DAN) by integrating diffusion and probabilistic sparse attention mechanisms into a spatiotemporal graph neural network. Based on the open-source traffic prediction framework LibCity, experiments were conducted under fair conditions to predict all, high, and low traffic volumes. Comprehensive analysis shows that MAE, MAPE, and RMSE are reduced by 1.6%–14.9%. By implementing the proposed model, ship operations and maritime traffic can be optimized, thereby benefiting stakeholders such as maritime regulatory agencies, ship operators, freight forwarders, and port management departments.
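For reference, the three error metrics named above can be computed as in the following sketch. Skipping time steps whose true flow is zero when computing MAPE is an assumption made here to avoid division by zero; the application does not state how such steps are handled.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error over steps with non-zero true flow (assumed masking)."""
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred) if t != 0]
    return sum(ratios) / len(ratios) if ratios else float("nan")
```

For example, with true flows `[10, 20, 0]` and predictions `[12, 18, 1]`, MAE is 5/3, RMSE is the square root of 3, and the masked MAPE is 0.15.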

[0078] The system of this application embodiment will be described in detail below with reference to specific embodiments:

[0079] As shown in Figure 1, the method and system of this invention propose to construct a maritime vessel traffic dataset, introduce the concept of trajectory pixelation, and propose the concept and construction method of a maritime road network. In addition, a new spatiotemporal graph neural network, the spatiotemporal diffusion attention network, is proposed to predict vessel traffic.

[0080] This invention discloses a traffic prediction system based on a spatiotemporal diffusion attention network, applicable to marine transportation. The system includes:

[0081] Maritime Road Network Construction Module 10: This module is used to divide the sea surface into multiple grids, map ship nodes in the same area to the same grid, and output a set of ship trajectories represented by grid node trajectories.

[0082] Because ships are not constrained by road networks as vehicles on land are, they navigate more freely, which makes their trajectories sparser and reduces the similarity between ship trajectories. To address this deficiency, this invention constructs a maritime road network. The main idea is to divide the sea surface into grids of equal size and map ship nodes to these grid areas. Ship nodes within the same area are mapped to the same grid node. Ship trajectories are therefore mapped to trajectories whose nodes are grids, which roughly reflect the ship's route. Analogous to a road network on land, ship traffic information is captured by collecting traffic signals through this constructed traffic network.

[0083] Spatiotemporal graph construction module 20: used to construct a weighted adjacency matrix of a directed graph based on each group of ship trajectories, extract atomic files from the adjacency matrix, and construct a spatiotemporal graph based on the atomic files;

[0084] After the maritime road network is constructed, a set of ship trajectories will be obtained under the grid. Each time point corresponds to a ship trajectory map. First, a weighted adjacency matrix of a directed graph is constructed based on this set of ship trajectories. Then, geographical, relational, and dynamic atomic files are extracted from the adjacency matrix. The dynamic atomic files contain key time series data. Finally, a spatiotemporal graph is constructed.

[0085] Spatiotemporal diffusion attention network construction module 30: used to construct a spatiotemporal diffusion attention network, which includes an input module, a spatiotemporal convolution module, and an output module. The input module is used to process atomic files into dataset features in a standard data format; the spatiotemporal convolution module is used to capture the spatiotemporal features of the dataset features; and the output module is used to generate traffic flow prediction values for future time steps based on the features extracted by the spatiotemporal convolution module.

[0086] The architecture of the Spatiotemporal Diffusion Attention Network (ST-DAN) consists of three modules: an input module, a spatiotemporal convolutional module, and an output module. The input module processes atomic files into a dataset with features in a standard data format and returns the key information, including the directed weighted adjacency matrix of the maritime road network and time-series data, i.e., traffic flow data. The spatiotemporal convolutional module uses convolution to capture spatiotemporal features. Finally, the output module generates traffic flow predictions for future time steps based on the features extracted by the spatiotemporal convolutional module.

[0087] In this embodiment of the invention, the above-mentioned maritime road network construction module 10 includes:

[0088] Sea surface grid division and trajectory mapping module 101: divides the selected area of the sea surface into multiple equal grids and maps the ship trajectory data onto them, with ships in the same area mapped to the same grid; obtains the ship navigation trajectory represented as a grid node trajectory; and converts the grid node trajectory data into a preset coordinate format;

[0089] Trajectory data processing module 102: processes the obtained grid node trajectory data in chronological order into a trajectory sequence with the grid entity number as the primary key, calculates the extent of each grid, and uses the latitude and longitude of each grid's center point as the latitude and longitude of the grid entity, to obtain a set of ship trajectories under the grid, with each time point corresponding to one ship trajectory.

[0090] In this embodiment of the invention, the spatiotemporal graph construction module 20 includes:

[0091] Module 201 for constructing the adjacency matrix: Collects navigation data collected by the Automatic Identification System for Ships and constructs a weighted adjacency matrix of a directed graph through ship trajectories. The weights of the adjacency matrix represent the number of connections between entity nodes, and the adjacency matrix reflects the geographical relationships between nodes.

[0092] Atomic file extraction module 202: Extracts atomic files from network entities in the adjacency matrix. Atomic files include: geographic file, relation file, and dynamic file. The geographic file records the entity node number, type, and corresponding spatial coordinates; the relation file records the associations between all nodes in a many-to-many format based on the entity nodes; and the dynamic file records the time-series data of each entity node.

[0093] Spatiotemporal graph construction module 203: Based on the trajectory points of ships at sea, constructs a spatiotemporal graph that represents the relative positions of the ships at each time step. The spatiotemporal graph includes the vertex set of the ships and the edge set of the graph.

[0094] In this embodiment of the invention, the spatiotemporal diffusion attention network construction module 30 includes:

[0095] Input module 301: By loading dynamic files, it extracts flow data and time series, generates time series samples through a sliding window based on the inflow and outflow of entities, performs data format conversion preprocessing, and outputs the directed weighted adjacency matrix and flow data of the maritime road network.

[0096] Spatiotemporal convolution module 302: Receives a directed weighted adjacency matrix and flow data, and uses convolution to capture spatiotemporal features. The spatiotemporal convolution module includes two spatiotemporal convolution blocks, each containing two temporal convolutional layers and one spatial convolutional layer; the temporal convolutional layers are used to capture temporal features, and the spatial convolutional layer is used to capture spatial features.

[0097] Output module 303: After receiving the features output by the spatiotemporal convolution module and normalizing them, the output module generates traffic flow prediction values for future time steps through feature integration and summarization. The output module includes two temporal convolutional layers and a fully convolutional layer. The temporal convolutional layers are used to further extract temporal correlations, and the fully convolutional layer is used to integrate multi-level features.

[0098] In this embodiment of the invention, the temporal convolutional layer in the above-mentioned spatiotemporal convolutional module includes a temporal attention layer, a two-dimensional causal convolution, and a gated linear unit;

[0099] The input data passes through a temporal attention layer to learn the correlations between different time points in the data. After processing by the temporal attention layer, it is further processed by alignment operations and two-dimensional causal convolution. In the two-dimensional causal convolution, local dependencies in the time series are captured. The fused result and the remaining part of the two-dimensional causal convolution are passed through a gated linear unit. The feature fusion result is weighted and added to the aligned original input to obtain the gated output features. Then, an activation function is combined to introduce nonlinearity into the gated output features.

[0100] In this embodiment of the invention, the spatial convolutional layer in the spatiotemporal convolution module captures spatial features, including the following steps:

[0101] A diffusion mechanism is introduced to establish connections between multi-hop neighbor nodes to capture the graph structure. The diffusion process is represented as a weighted combination of infinite random walks on the graph, which converges to a stationary distribution after multiple time steps.

[0102] A probabilistic sparse self-attention mechanism is introduced. By using KL divergence comparison to determine the difference between the current vector and the target distribution, vectors that contribute little to the final result are identified and simplified or discarded.

[0103] Graph convolution is introduced to graph-structured data to extract highly meaningful patterns and features in the spatial domain.

[0104] Specifically, in a specific embodiment of the present invention, the construction process of the traffic prediction system based on spatiotemporal diffusion attention network includes:

[0105] 1. As shown in Figure 2, the construction process of the maritime road network includes:

[0106] (1) Sea surface grid division and trajectory mapping

[0107] First, the Huangpu River bend is divided into 100m × 100m grids, so that the selected area is divided into 100 × 100 grids. Ship trajectory data, with MMSI (Maritime Mobile Service Identity) as the primary key, is mapped to individual sea surface grids; ships in the same area are mapped to the same grid. As shown in Figure 4, the ship's navigation trajectory is represented by a grid node trajectory. The vector data is then exported as grid node data in Mercator coordinate format using PostGIS and Postgres, and converted to WGS84 coordinate format for subsequent graph structure establishment.

[0108] (2) Trajectory data processing

[0109] Figure 3 is a schematic diagram of ship trajectories under a grid. The obtained grid node trajectory data is processed in chronological order into a trajectory sequence with the grid entity number as the primary key. At the same time, the extent of each grid is calculated and the latitude and longitude of its center point are used as the latitude and longitude of the grid entity. Finally, a set of ship trajectories under the grid is obtained, with one ship trajectory corresponding to each time point.
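The grid mapping and trajectory processing steps above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: it assumes the positions are already in a projected, metre-based coordinate system with origin `(x0, y0)`, it omits the PostGIS export and the Mercator-to-WGS84 conversion, and the function names and the row-major numbering of grid entities are hypothetical choices.

```python
from collections import defaultdict

CELL_SIZE_M = 100.0  # 100 m x 100 m cells, as stated in the text
GRID_COLS = 100      # the selected area is 100 x 100 cells
GRID_ROWS = 100


def cell_id(x, y, x0=0.0, y0=0.0):
    """Map a projected point (metres) to a grid entity number, or None if outside the area."""
    col = int((x - x0) // CELL_SIZE_M)
    row = int((y - y0) // CELL_SIZE_M)
    if 0 <= col < GRID_COLS and 0 <= row < GRID_ROWS:
        return row * GRID_COLS + col  # row-major numbering (an assumption)
    return None


def cell_center(cid, x0=0.0, y0=0.0):
    """Center point of a grid cell, used as the position of the grid entity."""
    row, col = divmod(cid, GRID_COLS)
    return (x0 + (col + 0.5) * CELL_SIZE_M, y0 + (row + 0.5) * CELL_SIZE_M)


def grid_trajectories(points):
    """points: iterable of (mmsi, timestamp, x, y).

    Returns {mmsi: [grid ids in chronological order]}; consecutive
    fixes that fall in the same cell are collapsed into one node.
    """
    by_ship = defaultdict(list)
    for mmsi, _ts, x, y in sorted(points, key=lambda p: (p[0], p[1])):
        cid = cell_id(x, y)
        if cid is None:
            continue  # position outside the selected area
        if not by_ship[mmsi] or by_ship[mmsi][-1] != cid:
            by_ship[mmsi].append(cid)
    return dict(by_ship)
```

Two ships reporting from the same 100 m cell receive the same grid id, which is what lets sparse, free-form ship tracks be compared like routes on a road network.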

[0110] 2. As shown in Figure 4, the construction process of the spatiotemporal graph is described in detail below:

[0111] The dataset used in this invention is AIS (Automatic Identification System) data. AIS data is collected by an automatic tracking system installed on ships—the Automatic Identification System. It obtains information such as unique identification codes, ship names, positions, headings, and speeds by exchanging electronic data with nearby ships, AIS shore stations, and satellites.

[0112] (1) Construction of the adjacency matrix

[0113] As shown in Figure 5 and Figure 6, a topology graph and a heatmap of the adjacency matrix are drawn using 15 grid nodes as an example, briefly illustrating the connectivity and weight information between grid nodes. A weighted adjacency matrix of a 100×100 directed graph is constructed from the ship trajectories, with weights representing the number of connections between entity nodes. The adjacency matrix reflects the geographical relationships between nodes and is an important component of the input data for traffic flow prediction.
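As a rough sketch of this step, the weighted directed adjacency matrix can be built by counting transitions between consecutive grid nodes in each trajectory. The reading of "number of connections" as "number of observed transitions" is an assumption, and `build_adjacency` is a hypothetical helper name.

```python
def build_adjacency(trajectories, num_nodes):
    """Count directed transitions between consecutive grid nodes.

    trajectories: iterable of grid-id sequences, one per ship track.
    Returns a num_nodes x num_nodes list of lists where adj[i][j] is
    the number of times a ship moved from node i to node j.
    """
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for seq in trajectories:
        for src, dst in zip(seq, seq[1:]):
            if src != dst:  # staying in the same cell is not a connection
                adj[src][dst] += 1
    return adj
```

Because the counts are directional, `adj[i][j]` and `adj[j][i]` generally differ, which is why the graph is directed.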

[0114] (2) Extraction of atomic files

[0115] The AIS geographic vector data of the Huangpu River bend is processed to obtain atomic files. The geographic file (geo) records the entity node number, type, and corresponding spatial coordinates. The relation file (rel) records the relationships between all nodes in a many-to-many format. Together, the geographic and relation files constitute a complete graph structure, describing the existence and connectivity of nodes and edges in the graph. The dynamic file (dyna) records the time-series data of each node in the graph, where in_flow and out_flow represent the inflow and outflow of that entity node, respectively. The configuration file (config) records the necessary parameter settings for the dataset, including the adjacency matrix settings when constructing the graph structure and the time interval parameters for prediction tasks.

[0116] This invention extracts the network entities corresponding to the non-zero rows of the adjacency matrix as geographic entities in the atomic file geo. For the non-zero items in the adjacency matrix, the corresponding starting geographic entity, ending geographic entity, and weight are extracted as data content in the atomic file rel. For the atomic file dyna, time information is also combined to count the inflow and outflow values for each geographic entity node. This file is one of the important components of the input data for traffic prediction.
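The extraction rules in the preceding paragraph can be illustrated with the sketch below. The record field names (`geo_id`, `rel_id`, `origin_id`, `destination_id`, `dyna_id`, `entity_id`) are assumptions modelled on LibCity's atomic-file conventions, and the `(time_slot, src, dst)` input format for counting flows is hypothetical. The sketch follows the text literally in taking only non-zero rows as geographic entities; whether nodes that only receive traffic are also kept is not stated.

```python
def extract_geo_rel(adj, centers):
    """Build geo and rel records from an adjacency matrix.

    adj: square list of lists of transition counts.
    centers: {node_id: (lon, lat)} center coordinates of each grid entity.
    """
    geo, rel = [], []
    for i, row in enumerate(adj):
        if any(row):  # non-zero row -> geographic entity
            geo.append({"geo_id": i, "type": "Point",
                        "coordinates": list(centers[i])})
        for j, weight in enumerate(row):
            if weight:  # non-zero item -> one relation record
                rel.append({"rel_id": len(rel), "type": "geo",
                            "origin_id": i, "destination_id": j,
                            "weight": weight})
    return geo, rel


def extract_dyna(timed_moves):
    """Count in_flow / out_flow per (time slot, entity).

    timed_moves: iterable of (time_slot, src, dst) transitions.
    """
    flows = {}
    for t, src, dst in timed_moves:
        flows.setdefault((t, src), [0, 0])[1] += 1  # leaving src
        flows.setdefault((t, dst), [0, 0])[0] += 1  # entering dst
    return [
        {"dyna_id": k, "type": "state", "time": t, "entity_id": e,
         "in_flow": f[0], "out_flow": f[1]}
        for k, ((t, e), f) in enumerate(sorted(flows.items()))
    ]
```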

[0117] (3) Construction of spatiotemporal diagram

[0118] A spatiotemporal graph is constructed from the trajectory points of ships at sea. To represent the relative positions of the ships in the scene at each time step $t$, a set of spatiotemporal graphs $G_t = (V_t, E_t)$ is constructed, where $V_t$ denotes the vertex set of the ships, as defined in formula (1).

[0119] $$V_t = \{\, v_t^i \mid i \in \{1, \dots, N\} \,\} \tag{1}$$

[0120] where $v_t^i = (x_t^i, y_t^i)$ are the coordinates of ship $i$. $E_t$ is the edge set of graph $G_t$, expressed as:

[0121] $$E_t = \{\, e_t^{ij} \mid i, j \in \{1, \dots, N\} \,\} \tag{2}$$

[0122] where $e_t^{ij} = n$ if the connection between $v_t^i$ and $v_t^j$ is repeated $n$ times, and $e_t^{ij} = 0$ otherwise.

[0123] 3. As shown in Figure 7, the construction process of the spatiotemporal diffusion attention network includes:

[0124] This invention details the architecture of the proposed Spatiotemporal Diffusion Attention Network (ST-DAN). ST-DAN comprises three modules: an input module, a spatiotemporal convolution module, and an output module. The input module processes atomic files into a dataset with features returned in a standard data format. Key information includes the directed weighted adjacency matrix of the maritime road network and temporal data, i.e., traffic flow data. The spatiotemporal convolution module uses convolution to capture spatiotemporal features. Finally, the output module generates traffic flow predictions for future time steps based on the features extracted by the spatiotemporal convolution module.

[0125] 1) Data Input Module

[0126] The input module loads and normalizes the atomic files generated during the spatiotemporal data processing stage, then passes them to the DataLoader class in PyTorch to convert the data into batches, and finally feeds the batches to the spatiotemporal convolution module.

[0127] (1) Load the atomic file

[0128] First, load the dyna atomic file to extract flow data [inflow,outflow] and time series, and generate time series samples using a sliding window based on input_window and output_window.
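A minimal sketch of the sliding-window step follows. The parameter names `input_window` and `output_window` come from the text; the function name and the list-based representation of the series are illustrative assumptions.

```python
def sliding_windows(series, input_window, output_window):
    """Split a time-ordered series into (history, target) samples.

    series: list of per-time-step records, e.g. [in_flow, out_flow] pairs.
    Each sample pairs input_window past steps with the output_window
    steps that immediately follow them.
    """
    samples = []
    last_start = len(series) - input_window - output_window
    for start in range(last_start + 1):
        split = start + input_window
        samples.append((series[start:split],
                        series[split:split + output_window]))
    return samples
```

A series of length `L` yields `L - input_window - output_window + 1` samples, or none if the series is shorter than one full window.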

[0129] (2) Normalization

[0130] The data is normalized using Z-score normalization, min-max normalization, and logarithmic normalization methods to scale the data to the same range, eliminate the dimensional differences between different features, and thus improve the performance and stability of the model.
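The three normalization methods named above can be written as follows. This sketch assumes they are alternative, selectable scalers rather than steps applied in sequence, and that the logarithmic variant is `log(1 + x)` so that zero flows stay defined; neither detail is stated in the text.

```python
import math


def zscore(values):
    """Z-score normalization: zero mean, unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    std = std or 1.0  # avoid division by zero for constant series
    return [(v - mean) / std for v in values]


def minmax(values):
    """Min-max normalization to the [0, 1] range."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in values]


def log_norm(values):
    """Logarithmic normalization, log(1 + x), for non-negative flows."""
    return [math.log1p(v) for v in values]
```

Whichever scaler is used, its statistics (mean and standard deviation, or minimum and maximum) have to be kept so that predictions can be mapped back to flow counts.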

[0131] 2) Spatiotemporal Convolution Module

[0132] The spatiotemporal convolution module consists of two spatiotemporal convolution blocks (ST-Conv), each of which contains two temporal convolutional layers and one spatial convolutional layer.

[0133] (1) Temporal convolutional layers capture temporal features

[0134] Along the time axis, a fully convolutional operation is employed to capture the temporal dynamics of traffic flow. As shown in Figure 7, the temporal convolutional layer includes a temporal attention layer and a two-dimensional causal convolution with a kernel width of [missing information], followed by a gated linear unit (GLU) as the nonlinearity.

[0135] First, the input dataset features pass through a temporal attention layer. This layer learns the correlations between different time points in the data to highlight important time steps and ignore irrelevant or noisy information, improving the model's representation ability in the temporal dimension. After processing by the temporal attention layer, the results are further processed through Align alignment and 2D causal convolution. The main function of Align alignment is to dynamically adjust the dimension and number of channels of the input tensor, allowing data of different lengths or time scales to be processed within the same model framework, ensuring consistency in feature dimensions. Furthermore, Align alignment improves the model's adaptability to different time-series data. In 2D causal convolution, the convolution operation ensures that the current time step depends only on data from past time steps through a causal structure, avoiding information leakage. This convolution effectively captures local dependencies in the time series, enhancing the model's ability to characterize dynamic changes over time. Next, the alignment results are fused with a portion of the results from the 2D causal convolution. Feature fusion improves the representation of temporal information by integrating the features after alignment with the local temporal features extracted by the 2D causal convolution. To further control the effect of feature fusion, the fused result, along with the remaining part of the 2D causal convolution, is passed through a sigmoid function, restricting the output value to the (0,1) interval, serving as the weights for the gating mechanism. This gating mechanism controls the flow of information, dynamically adjusting the influence of the fusion result by learning the information dependencies between different time steps. 
Under the gating mechanism, the feature fusion result is weighted and added to the aligned original input to obtain the final gated output. To avoid problems such as gradient vanishing or gradient exploding, the model connects the gating result to the aligned original result through residual connections, ensuring that gradients can be smoothly transmitted and retaining the information of the original features. Finally, the ReLU activation function is combined to introduce non-linearity into the features, further enhancing the model's expressive power. Through this series of operations, the model can effectively capture the dynamic changes of traffic flow on the time axis and extract fine temporal features.
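The gated causal convolution at the core of this layer can be sketched with NumPy as below. This is a simplified illustration under several assumptions: it handles a single node's series of shape `(T, C_in)`, omits the temporal attention layer, uses zero-padding of channels as the Align step (so it requires `C_in <= C_out`), and combines the two halves of the convolution output in the STGCN style `(P + aligned input) * sigmoid(Q)` before the ReLU. The exact fusion used in ST-DAN may differ.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gated_causal_conv(x, weight, bias):
    """Causal temporal convolution followed by a GLU gate and ReLU.

    x:      (T, C_in) input series for one node.
    weight: (Kt, C_in, 2 * C_out) convolution kernel.
    bias:   (2 * C_out,) bias.
    Returns an array of shape (T - Kt + 1, C_out).
    """
    kt, c_in, two_c_out = weight.shape
    c_out = two_c_out // 2
    assert c_in <= c_out, "this sketch only zero-pads channels in Align"
    t_out = x.shape[0] - kt + 1
    # Output step i is aligned with input step i + Kt - 1 and only sees
    # input steps i .. i + Kt - 1, so no future information leaks in.
    conv = np.stack([
        np.einsum("kc,kco->o", x[i:i + kt], weight) + bias
        for i in range(t_out)
    ])
    p, q = conv[:, :c_out], conv[:, c_out:]
    # Align: drop the first Kt - 1 steps and zero-pad the channel dimension.
    x_aligned = np.zeros((t_out, c_out))
    x_aligned[:, :c_in] = x[kt - 1:]
    gated = (p + x_aligned) * sigmoid(q)  # residual input + sigmoid gate
    return np.maximum(gated, 0.0)         # ReLU
```

The sigmoid half acts as weights in (0, 1) that decide how much of the fused features passes through, while adding the aligned input keeps a residual path for the gradients.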

[0136] (2) Spatial convolutional layers capture spatial features

[0137] a) Diffusion mechanism

[0138] The diffusion mechanism from DCRNN is introduced to establish connections between multi-hop neighbor nodes and capture the graph structure. The diffusion process is represented as a weighted combination of infinite random walks on the graph. After multiple time steps, the diffusion process converges to a stationary distribution.

[0139] b) Probabilistic sparse attention mechanism

[0140] Unlike temporal convolutional layers, spatial convolutional layers introduce a probabilistic sparse self-attention mechanism. This approach addresses the long-tailed distribution of scores in traditional self-attention mechanisms, where a small subset of vectors contributes the majority of attention scores, while the majority contribute only a small amount. To resolve this, the probabilistic sparse self-attention mechanism uses KL divergence to determine whether a vector can be ignored. KL divergence is a metric that measures the difference between two probability distributions. By comparing the current vector with the target distribution, it identifies vectors that contribute little to the final result, allowing them to be simplified or discarded in the computation. This mechanism reduces computational cost while still capturing key spatial features, making the model more efficient in handling spatial attention in large-scale traffic networks.
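One way to realize this KL-based selection is the query sparsity measurement used by ProbSparse attention in the Informer model, sketched below with NumPy. Treating ST-DAN's criterion as identical to Informer's is an assumption, since the text does not give the exact formula. The measure `logsumexp(scores) - mean(scores)` equals, up to a constant, the KL divergence between a uniform distribution and the query's attention distribution; queries with a low value attend almost uniformly and are replaced by the mean of the values. The key-sampling approximation used in Informer is omitted for brevity.

```python
import numpy as np


def prob_sparse_attention(q, k, v, u):
    """Self-attention in which only the u most 'active' queries attend.

    q: (Lq, d) queries, k: (Lk, d) keys, v: (Lk, dv) values.
    Returns (output of shape (Lq, dv), indices of the selected queries).
    """
    d = q.shape[1]
    scores = q @ k.T / np.sqrt(d)                          # (Lq, Lk)
    # Sparsity measure: log-sum-exp minus arithmetic mean of the scores.
    m = scores.max(axis=1, keepdims=True)
    lse = (m + np.log(np.exp(scores - m).sum(axis=1, keepdims=True)))[:, 0]
    sparsity = lse - scores.mean(axis=1)
    top = np.argsort(-sparsity)[:u]                        # dominant queries
    # "Lazy" queries fall back to the mean of the values.
    out = np.tile(v.mean(axis=0), (q.shape[0], 1))
    s = scores[top]
    weights = np.exp(s - s.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    out[top] = weights @ v                                 # full softmax attention
    return out, top
```

Only `u` rows of the softmax are computed, which is where the reduction in computational cost comes from.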

[0141] c) Graph Convolution

[0142] In the model of this invention, graph convolution is directly applied to graph-structured data to extract highly meaningful patterns and features in the spatial domain. Two approximation strategies are employed to reduce the complexity of graph convolution: Chebyshev polynomial approximation and the first-order approximation of the graph Laplacian.
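For reference, the two approximations are commonly written as follows in the graph convolution literature (for example in STGCN). The notation is an assumption rather than the patent's own: $x$ is a graph signal, $L$ the normalized graph Laplacian with largest eigenvalue $\lambda_{\max}$, $T_k$ the Chebyshev polynomial of order $k$, $W$ the adjacency matrix with degree matrix $D$, and $\theta$ the learnable parameters. These forms assume a symmetric normalization, so how they are adapted to the directed maritime road network is not specified here.

```latex
% Chebyshev polynomial approximation with kernel size K
\Theta *_{\mathcal{G}} x \approx \sum_{k=0}^{K-1} \theta_k \, T_k(\tilde{L}) \, x,
\qquad \tilde{L} = \frac{2L}{\lambda_{\max}} - I_n

% First-order approximation with a single shared parameter \theta
\Theta *_{\mathcal{G}} x \approx \theta \left( I_n + D^{-1/2} W D^{-1/2} \right) x
\;\longrightarrow\;
\theta \, \tilde{D}^{-1/2} \tilde{W} \tilde{D}^{-1/2} x,
\qquad \tilde{W} = W + I_n,\quad \tilde{D}_{ii} = \sum_j \tilde{W}_{ij}
```

The Chebyshev form aggregates information from up to $K-1$ hops in one layer, whereas the first-order form uses only direct neighbors and relies on stacking layers to widen the receptive field.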

[0143] 3) Output module

[0144] The first temporal convolutional layer employs a temporal convolutional network specifically to capture features along the temporal dimension. Through convolutional operations, useful time-series patterns are extracted from the input spatiotemporal features. These patterns are key inputs for subsequent predictions. After the first temporal convolutional layer, layer normalization is used to standardize the data. LayerNorm helps stabilize the training process and prevent gradient vanishing or exploding problems. By normalizing the inputs of each layer, the model's training is ensured to be more stable and converge faster. The third fully convolutional layer further processes the features output from the previous temporal convolutional layers, ensuring that each time step of the output corresponds to a complete feature map. Under this fully convolutional operation, the model can better integrate multi-level features and enhance its expressiveness. The fourth temporal convolutional layer again captures patterns along the temporal dimension, further enriching and strengthening spatiotemporal features. In this process, the model can utilize multi-level spatiotemporal convolutions to extract more complex temporal correlations and provide a more accurate basis for prediction. Finally, the features processed by two temporal convolutions and a fully convolutional layer are summarized to generate traffic flow predictions for future time steps. This predicted value represents the model's estimate of future traffic conditions and is the most representative information extracted after performing multi-layer convolution on the input spatiotemporal data.

[0145] The following describes specific implementation examples of the present invention in conjunction with a waterway on a certain river:

[0146] 1. Basic Calculation Model

[0147] A transportation network is a directed or undirected graph $G = (V, E, W)$, where $V$ is a set of $N$ nodes, each node corresponding to a deployed sensor, and $E$ is a set of edges. Reachability between nodes is expressed by a weighted adjacency matrix $W \in \mathbb{R}^{N \times N}$, from which the distance between pairs of nodes in the road network can be obtained. Since there is no clearly defined road network at sea, this invention innovatively proposes the concept of a maritime road network, described in detail below.

[0148] The traffic signal $X_t \in \mathbb{R}^{N \times C}$ denotes the observations of all sensors on the transportation network $G$ at time step $t$, where $C$ is the number of features acquired by each sensor.

[0149] For traffic flow prediction, the traffic signal is further specified as flow, namely the inflow and outflow relative to each grid. Given the historical traffic flow over the past $T$ time steps, $X = (X_{t-T+1}, \dots, X_t)$, the task is to predict the traffic flow over the next $T'$ time steps, $Y = (X_{t+1}, \dots, X_{t+T'})$. Essentially, this involves learning a mapping function $f$, i.e. $Y = f(X; G)$.

[0150] 2. Construction of the maritime road network

[0151] Because ships are not constrained by road networks as vehicles on land are, they navigate more freely, which makes their trajectories sparser and reduces the similarity between ship trajectories. To address this deficiency, this invention constructs a maritime road network. The main idea is to divide the sea surface into grids of equal size and map ship nodes to these grid areas. Ship nodes within the same area are mapped to the same grid node. Ship trajectories are therefore mapped to trajectories whose nodes are grids, which roughly reflect the ship's route. Analogous to a road network on land, ship traffic information is captured by collecting traffic signals through this constructed traffic network. The specific process of constructing the maritime road network is described in detail below.

[0152] As shown in Figure 2, the sea surface grid division and trajectory mapping are as follows: First, a bend in the river is divided into 100m × 100m grids, so that the selected area is divided into 100 × 100 grids. Ship trajectory data, with MMSI (Maritime Mobile Service Identity) as the primary key, is mapped to individual sea surface grids. Ships in the same area are mapped to the same grid. As shown in Figure 2, the ship's navigation trajectory is represented by a grid node trajectory. The vector data is then exported as grid node data in Mercator coordinate format using PostGIS and Postgres, and converted to WGS84 coordinate format for subsequent graph structure establishment.

[0153] Trajectory data processing: The obtained grid node trajectory data is processed in chronological order into a trajectory sequence with the grid entity number as the primary key. Simultaneously, the extent of each grid is calculated, and the latitude and longitude of its center point are used as the latitude and longitude of the grid entity. Finally, a set of ship trajectories under the grid is obtained, with one ship trajectory corresponding to each time point. Figure 3 shows a schematic diagram of ship trajectories within a grid.

[0154] 3. Construction of the spatiotemporal graph, as shown in Figure 4

[0155] After the maritime road network is constructed, a set of ship trajectories is obtained under the grid, as shown in Figure 3, where each time point corresponds to a ship trajectory map. First, a weighted adjacency matrix of a directed graph is constructed based on this set of ship trajectories. Then, geographical, relational, and dynamic atomic files are extracted from the adjacency matrix. The dynamic atomic files contain key time series data. Finally, a spatiotemporal graph is constructed. The construction process of the spatiotemporal graph is described in detail below.

[0156] The dataset used in this invention is AIS (Automatic Identification System) data. AIS data is collected by an automatic tracking system installed on ships—the Automatic Identification System. It obtains information such as unique identification codes, ship names, positions, headings, and speeds by exchanging electronic data with nearby ships, AIS shore stations, and satellites.

[0157] Construction of the adjacency matrix: as shown in Figure 5 and Figure 6, a weighted adjacency matrix of a 100×100 directed graph is constructed from the ship trajectories. The weights represent the number of connections between entity nodes. A topology graph and a heatmap of the adjacency matrix are drawn using 15 grid nodes as an example, briefly illustrating the connectivity and weight information between grid nodes. The adjacency matrix reflects the geographical relationships between nodes and is an important component of the input data for traffic flow prediction.

[0158] Atomic file extraction: AIS geographic vector data of the Huangpu River bend is processed to obtain atomic files. The standard format of the atomic files is shown in Table 1. The geographic file (geo) records the entity node number, type, and corresponding spatial coordinates. The relation file (rel) records the relationships between all nodes in a many-to-many format. The geographic file and relation file together constitute a complete graph structure, describing the existence and connection relationships of nodes and edges in the graph. The dynamic file (dyna) records the time-series data of each node in the graph, where in_flow and out_flow represent the inflow and outflow of that entity node, respectively. The configuration file (config) records the necessary parameter settings for the dataset, including the adjacency matrix settings when constructing the graph structure and the time interval parameters for prediction tasks.

[0159] Table 1 Data Structure of Atomic Files

[0160]

[0161] This invention extracts the network entities corresponding to the non-zero rows of the adjacency matrix as geographic entities in the atomic file `geo`. For each non-zero item in the adjacency matrix, the corresponding starting geographic entity, ending geographic entity, and weight are extracted as data content in the atomic file `rel`. For the atomic file `dyna`, time information is also incorporated, and inflow and outflow values are counted for each geographic entity node. This file is a crucial component of the input data for flow prediction. Relevant information regarding the Huangpu River bend data is shown in Table 2.

[0162] Table 2. Data Scale of Atomic Files Generated from Huangpu River Bend Data

[0163]

[0164] Construction of the spatiotemporal graph: A spatiotemporal graph is constructed from the trajectory points of ships at sea. To represent the relative positions of the ships in the scene at each time step $t$, a set of spatiotemporal graphs $G_t = (V_t, E_t)$ is constructed, where $V_t$ denotes the vertex set of the ships, as defined in formula (4).

[0165] $$V_t = \{\, v_t^i \mid i \in \{1, \dots, N\} \,\} \tag{4}$$

[0166] where $v_t^i = (x_t^i, y_t^i)$ are the coordinates of ship $i$. $E_t$ is the edge set of graph $G_t$, expressed as:

[0167] $$E_t = \{\, e_t^{ij} \mid i, j \in \{1, \dots, N\} \,\} \tag{5}$$

[0168] where $e_t^{ij} = n$ if the connection between $v_t^i$ and $v_t^j$ is repeated $n$ times, and $e_t^{ij} = 0$ otherwise. Because this invention proposes the concept of a maritime road network, the spatiotemporal graph is constructed on the defined grid. The construction of the spatiotemporal graph is shown in Figure 4.

[0169] This invention elaborates on the architecture of the proposed Spatiotemporal Diffusion Attention Network (ST-DAN). As shown in Figure 7, ST-DAN comprises three modules: an input module, a spatiotemporal convolution module, and an output module. The input module processes atomic files into a dataset with features in a standard data format and returns the key information, including the directed weighted adjacency matrix of the maritime road network and time-series data, i.e., traffic flow data. The spatiotemporal convolution module uses convolution to capture spatiotemporal features. Finally, the output module generates traffic flow predictions for future time steps based on the features extracted by the spatiotemporal convolution module.

[0170] 1. Input module

[0171] The input module loads and normalizes the atomic files generated during the spatiotemporal data processing stage, then passes them to the DataLoader class in PyTorch to convert the data into batches, and finally feeds the batches to the spatiotemporal convolution module.

[0172] Loading atomic files: First, load the dyna atomic files to extract flow data [inflow,outflow] and time series. Then, generate time series samples using a sliding window based on input_window and output_window.

[0173] Normalization: Data is normalized using Z-score normalization, min-max normalization, and logarithmic normalization methods to scale the data to the same range, eliminate the dimensional differences between different features, and thus improve the performance and stability of the model.

[0174] 2. Spatiotemporal Convolution Module

[0175] The spatiotemporal convolution module consists of two spatiotemporal convolution blocks (ST-Conv), each of which contains two temporal convolutional layers and one spatial convolutional layer.

[0176] 1) Temporal convolutional layers capture temporal features

[0177] Along the time axis, a fully convolutional operation is employed to capture the temporal dynamics of traffic flow. As shown in Figure 7, the temporal convolutional layer includes a temporal attention layer and a two-dimensional causal convolution with a kernel width of [missing information], followed by a gated linear unit (GLU) as the nonlinearity.

[0178] First, the input dataset features pass through a temporal attention layer. This layer learns the correlations between different time points in the data, highlighting important time steps and suppressing irrelevant or noisy information, which improves the model's representation ability in the temporal dimension. The output of the temporal attention layer is then processed by an Align operation and a 2D causal convolution. The Align operation dynamically adjusts the dimension and number of channels of the input tensor, so that data of different lengths or time scales can be processed within the same model framework with consistent feature dimensions; this also improves the model's adaptability to different time-series data. The 2D causal convolution is structured so that the output at the current time step depends only on data from past time steps, which avoids information leakage from the future. It captures local dependencies in the time series and strengthens the model's ability to characterize dynamic changes over time. Next, the aligned result is fused with one part of the 2D causal convolution output, integrating the aligned features with the local temporal features extracted by the convolution. To control the effect of this fusion, the fused result, together with the remaining part of the 2D causal convolution output, is passed through a sigmoid gate whose output is restricted to the interval (0, 1) and serves as the weights of the gating mechanism. The gate controls the flow of information, dynamically adjusting the influence of the fusion result by learning the information dependencies between different time steps.
Under the gating mechanism, the feature fusion result is weighted and added to the aligned original input to obtain the final gated output. To mitigate vanishing or exploding gradients, the gated result is connected to the aligned original input through a residual connection, so that gradients propagate smoothly and the information of the original features is retained. Finally, a ReLU activation function introduces non-linearity into the features, further enhancing the model's expressive power. Through this series of operations, the model can capture the dynamic changes of traffic flow along the time axis and extract fine-grained temporal features.
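The gated causal convolution described above can be illustrated with a minimal NumPy sketch. It is one possible reading of the paragraph, not the implementation of this invention: the temporal attention layer is omitted, the Align operation is assumed to be channel truncation or zero-padding, and the function names (`align`, `gated_causal_conv`) and tensor layout `(T, N, C)` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def align(x, c_out):
    """Match the channel count of x (T, N, C_in) to c_out by truncating or zero-padding.
    Assumption: this is only one simple way to realise the Align operation."""
    t, n, c_in = x.shape
    if c_in >= c_out:
        return x[..., :c_out]
    return np.concatenate([x, np.zeros((t, n, c_out - c_in))], axis=-1)

def gated_causal_conv(x, weight, bias):
    """x: (T, N, C_in); weight: (Kt, C_in, 2*C_out); bias: (2*C_out,).
    Returns an array of shape (T - Kt + 1, N, C_out)."""
    kt, c_in, c2 = weight.shape
    c_out = c2 // 2
    t_out = x.shape[0] - kt + 1
    conv = np.empty((t_out, x.shape[1], c2))
    for t in range(t_out):
        # Output step t only sees input steps t .. t+Kt-1, never a later step (causal).
        window = x[t:t + kt]                                   # (Kt, N, C_in)
        conv[t] = np.einsum("knc,kcd->nd", window, weight) + bias
    p, q = conv[..., :c_out], conv[..., c_out:]                # two halves of the conv output
    res = align(x, c_out)[kt - 1:]                             # aligned input at matching steps
    gated = (p + res) * sigmoid(q)                             # fusion weighted by the gate
    return np.maximum(gated + res, 0.0)                        # residual connection, then ReLU
```

Because each output step is computed from a window that ends at that step, changing the last input step can only change the last output step, which is the no-leakage property the paragraph describes.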

[0179] 1) Spatial convolutional layers capture spatial features

[0180] (1) Diffusion mechanism

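[0181] The diffusion mechanism from DCRNN is introduced to establish connections between multi-hop neighbor nodes and thereby capture the graph structure. The diffusion process is represented as a weighted combination of infinite random walks of the graph signal, which converges to a stationary distribution $\mathcal{P}$ after multiple time steps. As shown in Formula 6, the $i$-th row $\mathcal{P}_{i,:}$ indicates the likelihood of diffusion from node $v_i$:

[0182] $$\mathcal{P} = \sum_{k=0}^{\infty} \alpha (1-\alpha)^{k} \left(D_O^{-1} W\right)^{k} \tag{6}$$

[0183] The diffusion convolution of a graph signal $X \in \mathbb{R}^{N \times P}$ and a filter $f_\theta$ is defined as:

[0184] $$X_{:,p} \star_{\mathcal{G}} f_\theta = \sum_{k=0}^{K-1} \left( \theta_{k,1} \left(D_O^{-1} W\right)^{k} + \theta_{k,2} \left(D_I^{-1} W^{\top}\right)^{k} \right) X_{:,p}, \quad p \in \{1, \dots, P\} \tag{7}$$

[0185] The diffusion convolutional layer (Formula 8) uses diffusion convolution to map $P$-dimensional features to a $Q$-dimensional output:

[0186] $$H_{:,q} = a\left( \sum_{p=1}^{P} X_{:,p} \star_{\mathcal{G}} f_{\Theta_{q,p,:,:}} \right), \quad q \in \{1, \dots, Q\} \tag{8}$$

[0187] where $\alpha \in [0,1]$ is the restart probability, $W$ is the weighted adjacency matrix, $D_O^{-1} W$ is the state transition matrix, and $D_O$ is the out-degree diagonal matrix. In the diffusion convolution, $\theta \in \mathbb{R}^{K \times 2}$ are the parameters of the filter, $K$ is the number of diffusion steps, and $D_O^{-1} W$ and $D_I^{-1} W^{\top}$ are the transition matrices of the diffusion process and the reverse diffusion process, respectively, with $D_I$ the in-degree diagonal matrix. In the diffusion convolutional layer, $\Theta \in \mathbb{R}^{Q \times P \times K \times 2}$ is the parameter tensor, $\Theta_{q,p,:,:}$ parameterizes the convolution filter for the $p$-th input and the $q$-th output, $X \in \mathbb{R}^{N \times P}$ is the input, $H \in \mathbb{R}^{N \times Q}$ is the output, $f_{\Theta_{q,p,:,:}}$ is the filter, and $a$ is the activation function.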
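As an illustration of the diffusion convolutional layer, the following NumPy sketch follows the DCRNN-style formulation with a forward random walk on the out-degree-normalized adjacency matrix and a reverse walk on the in-degree-normalized transpose. The function names, the parameter layout `theta[q, p, k, direction]`, the choice of ReLU as the activation, and the handling of isolated nodes are assumptions made for this sketch.

```python
import numpy as np

def transition_matrices(W):
    """Return the forward (D_O^-1 W) and reverse (D_I^-1 W^T) random-walk matrices."""
    W = np.asarray(W, dtype=float)
    out_deg = W.sum(axis=1)
    in_deg = W.sum(axis=0)
    # Assumption: nodes with zero degree keep an all-zero row instead of dividing by zero.
    inv_out = np.divide(1.0, out_deg, out=np.zeros_like(out_deg), where=out_deg > 0)
    inv_in = np.divide(1.0, in_deg, out=np.zeros_like(in_deg), where=in_deg > 0)
    return inv_out[:, None] * W, inv_in[:, None] * W.T

def diffusion_conv_layer(X, W, theta):
    """X: (N, P) node features; W: (N, N) weighted adjacency; theta: (Q, P, K, 2).
    Returns H: (N, Q) after a ReLU activation."""
    fwd, bwd = transition_matrices(W)
    Q, P, K, _ = theta.shape
    X = np.asarray(X, dtype=float)
    H = np.zeros((X.shape[0], Q))
    Xf, Xb = X.copy(), X.copy()            # k-step forward and reverse diffusion of X (k = 0)
    for k in range(K):
        H += Xf @ theta[:, :, k, 0].T + Xb @ theta[:, :, k, 1].T
        Xf, Xb = fwd @ Xf, bwd @ Xb        # advance both walks by one step
    return np.maximum(H, 0.0)
```

The powers of the transition matrices are never formed explicitly; each step multiplies the already diffused signal once more, which keeps the cost proportional to the number of diffusion steps.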

[0188] (2) Probabilistic sparse attention mechanism

[0189] Unlike the temporal convolutional layer, the spatial convolutional layer introduces attention through a probabilistic sparse self-attention mechanism, a technique known from the prior art. It addresses the long-tailed distribution of scores in conventional self-attention, where a small subset of vectors contributes the majority of the attention scores while most vectors contribute very little. The probabilistic sparse self-attention mechanism uses KL divergence, a measure of the difference between two probability distributions, to determine whether a vector can be ignored. By comparing the attention distribution of the current vector with a target distribution, it identifies vectors that contribute little to the final result, so that they can be simplified or discarded in the computation. This reduces computation while preserving the key spatial features, making spatial attention more efficient on large-scale traffic networks. First, the attention probability distribution of each query vector is calculated. The probability distribution of the $i$-th query vector is $p(k_j \mid q_i)$. When the value vectors have a significant impact on the attention weights, $p(k_j \mid q_i)$ deviates from the uniform distribution; when they have little impact, $p(k_j \mid q_i)$ tends toward the uniform distribution $q(k_j \mid q_i)$. The attention score of the $i$-th query vector is calculated in the probabilistic form shown in Formula 9:

[0190] $$\mathcal{A}(q_i, K, V) = \sum_{j} \frac{k(q_i, k_j)}{\sum_{l} k(q_i, k_l)}\, v_j = \mathbb{E}_{p(k_j \mid q_i)}\left[v_j\right] \tag{9}$$

[0191] The probability distribution of the query vector and the uniform distribution are given by Equations 10 and 11:

[0192] $$p(k_j \mid q_i) = \frac{k(q_i, k_j)}{\sum_{l} k(q_i, k_l)} \tag{10}$$

[0193] $$q(k_j \mid q_i) = \frac{1}{L_K} \tag{11}$$

[0194] where $k(q_i, k_j) = \exp\left(q_i k_j^{\top} / \sqrt{d}\right)$ is the asymmetric exponential kernel function and $d$ is the length of the query vector. The KL divergence between the two distributions is then used to measure sparsity; dropping the constant term gives the sparsity score of the $i$-th query:

[0195] $$M(q_i, K) = \ln \sum_{j=1}^{L_K} e^{q_i k_j^{\top} / \sqrt{d}} - \frac{1}{L_K} \sum_{j=1}^{L_K} \frac{q_i k_j^{\top}}{\sqrt{d}}$$

[0196] where $K$ is the key matrix and $L_K$ is the number of key vectors. The first term is the log-sum-exp of $q_i$ over all keys, and the second term is the arithmetic mean. If the $i$-th query vector has a higher sparsity score $M(q_i, K)$, its attention probabilities are more diverse, and it is more likely to contain the dominant dot-product pairs of the long-tailed self-attention distribution. Therefore, $M(q_i, K)$ is calculated for every query vector, and a certain number of query vectors with the largest values are used as the primary vectors for the attention calculation.
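A minimal NumPy sketch of the query selection step is given below. It assumes the sparsity score is the log-sum-exp of the scaled dot products minus their arithmetic mean, computes it exactly over all keys (practical implementations usually sample keys to save time), and lets unselected queries fall back to the mean of the value vectors. The function names and the fallback rule are assumptions for illustration.

```python
import numpy as np

def sparsity_scores(Q, K):
    """Score each query: log-sum-exp of its scaled dot products minus their mean."""
    d = Q.shape[1]
    s = Q @ K.T / np.sqrt(d)                          # (L_Q, L_K) scaled dot products
    m = s.max(axis=1, keepdims=True)                  # subtract the max for numerical stability
    lse = m[:, 0] + np.log(np.exp(s - m).sum(axis=1))
    return lse - s.mean(axis=1)

def prob_sparse_attention(Q, K, V, u):
    """Full softmax attention for the u highest-scoring queries only.
    Assumption: the remaining queries output the mean of V."""
    scores = sparsity_scores(Q, K)
    top = np.argsort(-scores)[:u]                     # indices of the u most informative queries
    out = np.repeat(V.mean(axis=0, keepdims=True), Q.shape[0], axis=0)
    s = Q[top] @ K.T / np.sqrt(Q.shape[1])
    w = np.exp(s - s.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                 # softmax over the keys
    out[top] = w @ V
    return out, top
```

A query whose dot products are all equal produces a uniform attention distribution and receives the minimum possible score, the logarithm of the number of keys, so it is the first to be dropped.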

[0197] (3) Graph convolution

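[0198] In the model of this invention, graph convolution is applied directly to graph-structured data to extract highly meaningful patterns and features in the spatial domain. Two approximation strategies are employed to reduce the complexity of graph convolution: Chebyshev polynomial approximation and first-order approximation of the graph Laplacian. With the normalized graph Laplacian $L = I_n - D^{-1/2} W D^{-1/2} = U \Lambda U^{\top}$, where $W$ is the adjacency matrix, $D$ is the diagonal degree matrix, $U$ is the matrix of eigenvectors, and $\Lambda$ is the diagonal matrix of eigenvalues, the spectral graph convolution of a signal $x$ with a kernel $\Theta$ is:

[0199] $$\Theta *_{\mathcal{G}} x = \Theta(L)\,x = \Theta\left(U \Lambda U^{\top}\right) x = U\,\Theta(\Lambda)\,U^{\top} x$$

[0200] To localize the filter and reduce the number of parameters, the Chebyshev polynomial approximation restricts the kernel $\Theta$ to a polynomial of $\Lambda$, i.e. $\Theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k \Lambda^{k}$, where $\theta \in \mathbb{R}^{K}$ is a vector of polynomial coefficients. $K$ is the size of the graph convolution kernel, which determines the maximum radius of the convolution starting from the center node. The Chebyshev polynomial $T_k(x)$ is used to approximate the kernel as a truncated expansion of order $K-1$, i.e. $\Theta(\Lambda) \approx \sum_{k=0}^{K-1} \theta_k T_k(\tilde{\Lambda})$, with the rescaled $\tilde{\Lambda} = 2\Lambda / \lambda_{\max} - I_n$ ($\lambda_{\max}$ denotes the largest eigenvalue of $L$). The graph convolution is then rewritten as:

[0201] $$\Theta *_{\mathcal{G}} x = \Theta(L)\,x \approx \sum_{k=0}^{K-1} \theta_k T_k(\tilde{L})\,x \tag{13}$$

[0202] where $T_k(\tilde{L}) \in \mathbb{R}^{n \times n}$ is the Chebyshev polynomial of order $k$ evaluated at the scaled Laplacian $\tilde{L} = 2L / \lambda_{\max} - I_n$. As shown in formula (13), by recursively computing the $K$-localized convolution through the polynomial approximation, the cost of the spectral graph convolution can be reduced to $O(K|\mathcal{E}|)$, where $|\mathcal{E}|$ is the number of edges.

[0203] The first-order approximation defines a layer-wise linear formulation by stacking multiple localized graph convolutional layers that each use a first-order approximation of the graph Laplacian. A deeper architecture can therefore be constructed to recover spatial information without being limited to the explicit parameterization given by the polynomial. Owing to scaling and normalization in neural networks, this invention can further assume $\lambda_{\max} \approx 2$. Therefore, formula (13) can be simplified to:

[0204] $$\Theta *_{\mathcal{G}} x \approx \theta_0 x + \theta_1 \left( \frac{2}{\lambda_{\max}} L - I_n \right) x \approx \theta_0 x - \theta_1 \left( D^{-1/2} W D^{-1/2} \right) x \tag{14}$$

[0205] where $\theta_0$ and $\theta_1$ are two shared parameters of the kernel. To constrain these parameters and stabilize numerical performance, $\theta_0$ and $\theta_1$ are replaced by a single parameter $\theta$, i.e. $\theta = \theta_0 = -\theta_1$; $W$ and $D$ are renormalized by $\tilde{W} = W + I_n$ and $\tilde{D}_{ii} = \sum_j \tilde{W}_{ij}$, respectively. The graph convolution can then be expressed as:

[0206] $$\Theta *_{\mathcal{G}} x = \theta \left( I_n + D^{-1/2} W D^{-1/2} \right) x = \theta \left( \tilde{D}^{-1/2} \tilde{W} \tilde{D}^{-1/2} \right) x \tag{15}$$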
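The two approximation strategies can be sketched in NumPy as follows. The sketch assumes a symmetric, non-negative adjacency matrix so that the normalized Laplacian has real eigenvalues (the directed maritime graph would first need to be symmetrized or handled differently), and the function names are hypothetical.

```python
import numpy as np

def scaled_laplacian(W):
    """Rescale the normalized Laplacian so its eigenvalues lie in [-1, 1]."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    d = W.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    d_inv_sqrt[d > 0] = d[d > 0] ** -0.5
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(L).max()          # assumes W is symmetric
    return 2.0 * L / lam_max - np.eye(n)

def cheb_conv(x, L_tilde, theta):
    """Weighted sum of Chebyshev terms T_k(L_tilde) x, built with the recurrence
    T_0 x = x, T_1 x = L_tilde x, T_k x = 2 L_tilde T_{k-1} x - T_{k-2} x."""
    t_prev, t_curr = x, L_tilde @ x
    out = theta[0] * t_prev
    if len(theta) > 1:
        out = out + theta[1] * t_curr
    for k in range(2, len(theta)):
        t_prev, t_curr = t_curr, 2.0 * (L_tilde @ t_curr) - t_prev
        out = out + theta[k] * t_curr
    return out

def first_order_conv(x, W, theta):
    """Single-parameter first-order layer using the adjacency matrix with self-loops."""
    W_t = np.asarray(W, dtype=float) + np.eye(len(W))
    d_inv_sqrt = W_t.sum(axis=1) ** -0.5           # self-loops keep every degree positive
    return theta * ((d_inv_sqrt[:, None] * W_t * d_inv_sqrt[None, :]) @ x)
```

The recurrence only needs matrix-vector products with the scaled Laplacian, which is why the cost grows linearly with the kernel size and the number of edges when the matrix is stored sparsely.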

[0207] 3. Output module

[0208] As shown in Figure 7, the output module begins with a first temporal convolutional layer, a temporal convolutional network designed to capture features along the temporal dimension. Through convolution, useful time-series patterns are extracted from the input spatiotemporal features; these patterns are crucial inputs for the subsequent prediction. Second, layer normalization (LayerNorm) standardizes the data after the first temporal convolutional layer. Normalizing the inputs of each layer helps prevent vanishing or exploding gradients, giving more stable training and faster convergence. Third, a fully convolutional layer further processes the features output by the preceding temporal convolutional layer, ensuring that each time step corresponds to a complete feature map. With this fully convolutional operation, the model can better integrate multi-level features and enhance its expressiveness. Fourth, a second temporal convolutional layer again captures patterns along the temporal dimension, further enriching and strengthening the spatiotemporal features. In this process, the model uses multi-level spatiotemporal convolutions to extract more complex temporal correlations and provide a more accurate basis for prediction. Finally, the features processed by the two temporal convolutions and the fully convolutional layer are aggregated to generate traffic flow predictions for future time steps. This predicted value is the model's estimate of future traffic conditions and is the most representative information extracted after multi-layer convolution of the input spatiotemporal data.

[0209] As described above, the system and method of the present invention can be implemented well.

[0210] Compared with the prior art, the present invention has the following outstanding advantages and beneficial effects:

[0211] Detailed experiments were conducted to demonstrate that the proposed ST-DAN model outperforms comparable models in predicting maritime traffic. The experiments were performed on the Huangpu River dataset mentioned in this invention. Quantitative analysis compares the evaluation metrics of ST-DAN with those of the baselines, and qualitative analysis of ST-DAN's predictive performance uses curves of predicted versus actual values.

[0212] 1. Evaluation Indicators

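[0213] For the traffic flow prediction task, five indicators were used to evaluate the experimental results, including MAE, MAPE, RMSE, and the coefficient of determination $R^2$. Their calculation methods are briefly introduced here. MAE, as shown in Formula 16, is the mean absolute error between the actual and predicted values, and is a widely used performance indicator in regression tasks. MAPE, as shown in Formula 17, is the mean absolute percentage error between the actual and predicted values. RMSE, as shown in Formula 18, is the root mean square error between the actual and predicted values. Compared with MAE, it is more sensitive to large errors and measures the stability and robustness of the prediction model. The coefficient of determination $R^2$, as shown in Formula 19, measures the degree to which the independent variable explains the variation of the dependent variable. The larger the value, the better the model fit.

[0214] $$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{16}$$

[0215] $$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{17}$$

[0216] $$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}} \tag{18}$$

[0217] $$R^{2} = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^{2}} \tag{19}$$

[0218] where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, $n$ is the number of samples, $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$ is the mean of the actual values, and $\frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^{2}$ is their variance.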
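The four indicators described above (MAE, MAPE, RMSE, and the coefficient of determination) can be computed with a few lines of plain Python. This is a sketch using the standard textbook definitions; MAPE is reported in percent and assumes that no actual value is zero, and the function names are chosen for illustration.

```python
import math

def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)

def mape(y, y_hat):
    """Mean absolute percentage error in percent (assumes no actual value is zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y))

def r2(y, y_hat):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

actual = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(mae(actual, predicted), mape(actual, predicted), rmse(actual, predicted), r2(actual, predicted))
```

For this toy example the absolute errors are 2, 2, and 3, so RMSE is larger than MAE, which reflects the higher weight RMSE gives to the largest error.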

[0219] 2. Benchmark

[0220] To demonstrate the superiority of the proposed model, this invention selects several models for side-by-side comparison. Based on whether spatial features are processed and whether spatiotemporal features are extracted by stacking spatiotemporal blocks, the baseline models are divided into three main categories: temporal models, spatiotemporal separation models, and spatiotemporal block stacking models. RNN, Seq2Seq, and Transformer are classic temporal prediction methods, which have proven effective in multiple fields such as NLP and traffic prediction; DCRNN, GWNET, and MTGNN are spatiotemporal separation models; STGCN, ASTGCN, and STSGCN are spatiotemporal block stacking models. In addition, two recent traffic prediction models, STTBAN and STGNCDE, are included.

[0221] DCRNN: Diffusion Convolutional Recurrent Neural Network, which captures spatial dependencies using graph convolution formalized as a diffusion process and captures temporal dependencies using an encoder-decoder framework.

[0222] GWNET: Graph WaveNet, a spatiotemporal graph convolutional network that integrates diffusion convolution and one-dimensional dilated causal convolution to capture spatiotemporal dependencies.

[0223] MTGNN: A graph neural network framework designed for multivariate time series data, combining graph convolution with mix-hop propagation layers and dilated inception layers to capture spatiotemporal dependencies.

[0224] STGCN: Spatiotemporal Graph Convolutional Network, which combines graph convolution and gated temporal convolution to capture spatiotemporal relationships.

[0225] STSGCN: Spatial-Temporal Synchronous Graph Convolutional Network, dedicated to forecasting spatiotemporal network data. Through a carefully designed spatiotemporal synchronous modeling mechanism, this model can effectively extract complex, localized spatiotemporal relationships.

[0226] ASTGCN: A spatiotemporal graph convolutional network based on attention, which integrates spatiotemporal attention mechanism and spatiotemporal convolution to capture dynamic spatiotemporal features.

[0227] STTBAN: Self-supervised spatiotemporal bottleneck attention network, designed to improve the efficiency and accuracy of long-term traffic forecasting through a multi-task framework and bottleneck attention mechanism.

[0228] STGNCDE: Spatiotemporal Graph Neural Controlled Differential Equation model, which addresses the spatiotemporal dependency problem in traffic prediction by combining graph convolutional networks with neural controlled differential equations.

[0229] 3. Parameter settings

[0230] The dataset is divided into training, test, and validation sets in a 7:2:1 ratio. The time sampling interval is set to 6 hours, with input_window=12 and output_window=6; that is, the previous 12 time steps (72 hours at the 6-hour interval) are used to predict maritime traffic characteristics for the following 6 time steps (36 hours). Regarding model parameters, all models are trained for at most 100 epochs, and each baseline uses the optimal parameter settings provided in its paper. In experiments on the Huangpu River dataset, the output dimension is 2, i.e., inflow and outflow. The proposed model, ST-DAN, has an initial learning rate of 0.001 and uses the RMSProp optimizer with a StepLR learning rate schedule. All experiments were conducted on an NVIDIA Tesla V100 GPU.
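The window settings and the 7:2:1 split can be illustrated with a short plain-Python sketch. The chronological ordering of the three subsets (training first, then validation, then test) and the function names are assumptions; the text only fixes the ratio and the window lengths.

```python
def sliding_windows(series, input_window=12, output_window=6):
    """Cut a time-ordered series into (input, target) pairs with a stride of one step."""
    samples = []
    last_start = len(series) - input_window - output_window
    for start in range(last_start + 1):
        x = series[start:start + input_window]
        y = series[start + input_window:start + input_window + output_window]
        samples.append((x, y))
    return samples

def split_7_2_1(samples):
    """Split samples into training (70%), validation (10%), and test (20%) subsets.
    Assumption: the split is chronological, with the most recent samples used for testing."""
    n = len(samples)
    n_train = n * 7 // 10
    n_val = n // 10
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test

flows = list(range(117))            # 117 six-hour steps of a toy inflow series
samples = sliding_windows(flows)    # 117 - 12 - 6 + 1 = 100 samples
train, val, test = split_7_2_1(samples)
print(len(train), len(val), len(test))
```

Integer arithmetic is used for the subset sizes so that rounding of floating-point products cannot shift a sample from one subset to another.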

[0231] 4. Results and Analysis

[0232] The experimental results on the test set of the Huangpu River dataset are shown in Table 3, which lists the evaluation metrics of all baseline models and of the model proposed in this invention for prediction steps 1, 3, and 6. These results show that ST-DAN outperforms most baseline models.

[0233] Table 3 shows the prediction and evaluation metrics for the flow of all grid nodes based on experimental data. Bold text indicates the best results, and underlined text indicates the second-best results.

[0234]

[0235] Next, the performance of each model is discussed to demonstrate the superiority of ST-DAN. The three prediction steps mentioned above (steps 1, 3, and 6) are defined as short-term, medium-term, and long-term predictions, and the evaluation metrics are considered separately for each. In short-term prediction, ST-DAN does not outperform STGCN, but it outperforms most baseline models. In medium-term prediction, the model of this invention reduces MAE, MAPE, and RMSE by 1.6%, 7.5%, and 5.8%, respectively, and increases $R^2$ by 4.3%. In long-term prediction, MAE, MAPE, and RMSE decrease by 3.9%, 7.5%, and 14.9%, respectively, and $R^2$ improves by 4.5%. It can be seen that ST-DAN performs better in long-term predictions.

[0236] Furthermore, considering that ship trajectories at sea are sparser than vehicle trajectories in road networks, this invention examines whether the flow level (high or low) of grid nodes affects model training and prediction. Flow prediction was therefore considered separately for high-flow and low-flow grid nodes. Grid nodes with actual inflow / outflow values less than or equal to 30 were designated as low-flow nodes, and the others as high-flow nodes. (The threshold for distinguishing between high and low flow was set based on experimental results: given an average flow of 25, thresholds of 5, 10, 20, and 30 were tested, and a threshold of 30 yielded the best results.)
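Evaluating low-flow and high-flow nodes separately amounts to masking the predictions by the actual flow before computing a metric. The plain-Python sketch below uses MAE as the example metric and applies the threshold of 30 to each individual observation; whether the threshold is applied per observation or per node average is an assumption of this sketch, as are the function names.

```python
def split_by_flow(actual, predicted, threshold=30):
    """Group (actual, predicted) pairs into low-flow (<= threshold) and high-flow (> threshold)."""
    low = [(a, p) for a, p in zip(actual, predicted) if a <= threshold]
    high = [(a, p) for a, p in zip(actual, predicted) if a > threshold]
    return low, high

def mae_of_pairs(pairs):
    """Mean absolute error over a list of (actual, predicted) pairs."""
    return sum(abs(a - p) for a, p in pairs) / len(pairs)

actual = [5, 30, 31, 80]
predicted = [7, 27, 35, 70]
low, high = split_by_flow(actual, predicted)
print(mae_of_pairs(low), mae_of_pairs(high))
```

Repeating the call with thresholds of 5, 10, 20, and 30 reproduces the kind of threshold comparison described in the paragraph, although the metrics reported in this invention come from the trained models rather than from toy values like these.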

[0237] Table 4. Experimental evaluation metrics for flow prediction of low-flow (≤30) grid nodes.

[0238]

[0239] Table 5. Experimental evaluation index results for flow prediction of high-flow (>30) grid nodes.

[0240]

[0241] As shown in Tables 4 and 5, compared with predicting flow for all grid nodes together, predicting flow separately for high-flow or low-flow nodes improved performance to varying degrees. For low-flow grid prediction, ST-DAN outperformed the other baseline models on all indicators except MAE in short-term, medium-term, and long-term predictions. For high-flow grid prediction, the model of this invention outperformed the other baseline models only in long-term prediction. The reason is that ship trajectories at sea are sparse, so relatively few ships pass through each grid, and the flow of most grids tends to be below 30. This invention sampled the inflow and outflow of all grid nodes within 6 hours four times and calculated the average values. The statistics of the average flow are shown in Figure 8 and Figure 9. As the bar chart shows, grid nodes with inflow and outflow of 0-10 are the most numerous, followed by those with 10-20, 20-30, 30-40, and 40-50; grid nodes with flow exceeding 60 account for only a small portion. Furthermore, the pie chart (outflow on the inner ring, inflow on the outer ring) shows that low-flow (≤30) grid nodes account for nearly half. Therefore, prediction for low-flow grid nodes is more effective than for high-flow grid nodes. This may be because low-flow nodes are more numerous and their flow changes are relatively stable, making them easier for the model to capture and predict.

[0242] This invention also visualizes the traffic prediction performance of the model. As shown in Figure 10 and Figure 11, the prediction curves of multiple models, including ST-DAN and the baselines, are compared with the actual curve to evaluate their prediction accuracy. The prediction curve of the ST-DAN model is observed to be relatively close to the actual values.

[0243] Furthermore, taking inflow as an example, the prediction errors of ST-DAN and the three types of baseline models are visualized in Figures 12a-12d, where the shaded area represents the error. Figure 12a compares the errors of ST-DAN with those of RNN, Seq2Seq, and Transformer, and Figure 12b compares the errors of ST-DAN with those of DCRNN, GWNET, and MTGNN. It can be clearly seen that ST-DAN has the smallest error area, indicating high prediction accuracy. Figures 12c and 12d compare the errors of ST-DAN with those of STGCN, STSGCN, and ASTGCN. The error area of STSGCN is so large relative to the other models that only its own error can be seen in the third error graph. Similarly, the fourth graph shows that the error of ST-DAN is smaller than that of the other models.

[0244] 5. Ablation test

[0245] In addition to comparative experiments, this invention also conducted ablation experiments on ST-DAN, designing the following three variants to verify the effectiveness of the spatiotemporal diffusion attention network:

[0246] ST-AN: Removes the diffusion mechanism of the original model and does not consider random walks of graph signals in the extraction of spatial features.

[0247] ST-DSAN: Removes the temporal attention mechanism from the original model.

[0248] ST-DTAN: Removes the spatial probabilistic sparse attention mechanism from the original model.

[0249] The experimental results of the three variants are compared with those of the original model. As shown in Figures 13a-13d, ST-DAN, which has both the diffusion mechanism and the spatiotemporal attention mechanisms, significantly outperforms the ablated versions. On MAE, RMSE, and MAPE, ST-DAN performs more than twice as well as the three variant models, indicating a significant improvement in model accuracy. On $R^2$ and EVAR, two independent correlation metrics, ST-DAN is nearly three times higher than the three variants, indicating improved model interpretability. These results verify that the diffusion mechanism, the temporal attention mechanism, and the spatial probabilistic sparse attention mechanism are indispensable components of ST-DAN. In conclusion, the ablation experiments validate the effectiveness, interpretability, and predictive accuracy of the ST-DAN design.

[0250] Secondly, as shown in Figure 14 and Figure 15, this application provides a traffic prediction method based on a spatiotemporal diffusion attention network, which employs the aforementioned traffic prediction system based on a spatiotemporal diffusion attention network and is applied to marine transportation. The method includes:

[0251] Maritime road network construction step 401: Divide the sea surface into multiple grids, map the ship nodes in the same area to the same grid, and output a set of ship trajectories represented by grid node trajectories;

[0252] Spatiotemporal graph construction step 402: Construct a weighted adjacency matrix of a directed graph based on each group of ship trajectories, extract atomic files from the adjacency matrix, and construct the spatiotemporal graph based on the atomic files;

[0253] Spatiotemporal diffusion attention network construction step 403: Construct the spatiotemporal diffusion attention network, which includes an input module, a spatiotemporal convolution module, and an output module. The input module is used to process the atomic files into dataset features in a standard data format; the spatiotemporal convolution module is used to capture the spatiotemporal features of the dataset features; and the output module is used to generate traffic flow prediction values for future time steps based on the features extracted by the spatiotemporal convolution module.
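The first two steps (mapping ship positions to grid nodes and counting transitions between grids as directed edge weights) can be sketched in plain Python. The grid origin, the cell size, the row-major numbering of grid nodes, and all function names are hypothetical; the sketch only assumes equal square cells and that each edge weight counts the observed moves from one grid to another.

```python
def to_grid(lat, lon, lat0, lon0, cell, n_cols):
    """Map a position to a grid node id (row-major numbering from the south-west corner)."""
    row = int((lat - lat0) // cell)
    col = int((lon - lon0) // cell)
    return row * n_cols + col

def grid_trajectory(points, lat0, lon0, cell, n_cols):
    """Convert time-ordered (lat, lon) points of one ship into a sequence of grid ids,
    collapsing consecutive points that fall into the same grid."""
    cells = []
    for lat, lon in points:
        g = to_grid(lat, lon, lat0, lon0, cell, n_cols)
        if not cells or cells[-1] != g:
            cells.append(g)
    return cells

def adjacency_from_trajectories(trajectories, n_nodes):
    """Directed weighted adjacency matrix: A[i][j] counts observed moves from grid i to grid j."""
    A = [[0] * n_nodes for _ in range(n_nodes)]
    for traj in trajectories:
        for src, dst in zip(traj, traj[1:]):
            A[src][dst] += 1
    return A

# Toy area: 2 rows x 4 columns of 0.5-degree cells with a hypothetical origin.
ship_a = [(31.25, 121.25), (31.25, 121.75), (31.75, 121.75)]
ship_b = [(31.25, 121.25), (31.30, 121.30), (31.25, 121.75)]
trajs = [grid_trajectory(p, 31.0, 121.0, 0.5, 4) for p in (ship_a, ship_b)]
A = adjacency_from_trajectories(trajs, n_nodes=8)
print(trajs, A[0][1], A[1][5])
```

The matrix is not symmetric in general, since a move from grid 0 to grid 1 does not imply a move in the opposite direction, which matches the directed graph described in step 402.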

[0254] In this embodiment of the invention, the above-mentioned spatiotemporal diffusion attention network construction step 403 includes:

[0255] Input steps: By loading dynamic files, extract traffic data and time series, generate time series samples through a sliding window based on the inflow and outflow of entities, perform data format conversion preprocessing, and output the directed weighted adjacency matrix and traffic data of the maritime road network.

[0256] Spatiotemporal convolution steps: Receive a directed weighted adjacency matrix and traffic data, and use convolution to capture spatiotemporal features. The spatiotemporal convolution module includes two spatiotemporal convolution blocks, each containing two temporal convolutional layers and one spatial convolutional layer; the temporal convolutional layers are used to capture temporal features, and the spatial convolutional layer is used to capture spatial features.

[0257] Output steps: After normalizing the features output by the spatiotemporal convolution module, the features are integrated and summarized to generate traffic flow prediction values for future time steps. The output module includes two temporal convolutional layers and a fully convolutional layer. The temporal convolutional layers are used to further extract temporal correlations, and the fully convolutional layer is used to integrate multi-level features.

[0258] Thirdly, embodiments of this application provide a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of the traffic prediction method based on a spatiotemporal diffusion attention network.

[0259] Fourthly, embodiments of this application provide an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of the traffic prediction method based on a spatiotemporal diffusion attention network as described above.

[0260] In addition, with reference to Figure 1, the traffic prediction method based on spatiotemporal diffusion attention networks described in this application can be implemented by an electronic device, such as a computer device. Figure 16 is a schematic diagram of the hardware structure of a computer device according to an embodiment of this application.

[0261] In some embodiments, the computer device may further include a communication interface 83 and a bus 80. As shown in Figure 16, the processor 81, the memory 82, and the communication interface 83 are connected through the bus 80 and communicate with each other over it.

[0262] Specifically, the processor 81 may include a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits that can be configured to implement the embodiments of this application.

[0263] The memory 82 can be used to store or cache various data files that need to be processed and / or communicated, as well as possible computer program instructions executed by the processor 81.

[0264] The processor 81 reads and executes computer program instructions stored in the memory 82 to implement any of the traffic prediction methods based on spatiotemporal diffusion attention networks in the above embodiments.

[0265] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0266] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are relatively specific and detailed, they should not be construed as limiting the scope of the invention patent. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application should be determined by the appended claims.

Claims

1. A traffic prediction system based on a spatiotemporal diffusion attention network, characterized in that the system is applied to marine transportation and includes: a maritime road network construction module, used to divide the sea surface into multiple grids, map ship nodes in the same area to the same grid, and output a set of ship trajectories represented by grid node trajectories; a spatiotemporal graph construction module, used to construct a weighted adjacency matrix of a directed graph based on each group of ship trajectories, extract atomic files from the adjacency matrix, and construct a spatiotemporal graph based on the atomic files; and a spatiotemporal diffusion attention network construction module, used to construct a spatiotemporal diffusion attention network, which includes an input module, a spatiotemporal convolution module, and an output module, wherein the input module is used to process the atomic files into dataset features in a standard data format; the spatiotemporal convolution module is used to capture the spatiotemporal features of the dataset features; and the output module is used to generate traffic flow prediction values for future time steps based on the features extracted by the spatiotemporal convolution module.

2. The traffic prediction system based on spatiotemporal diffusion attention network according to claim 1, characterized in that the maritime road network construction module includes: a sea surface grid division and trajectory mapping module, which divides the selected sea surface area into multiple equal sea surface grids, maps ship trajectory data to the sea surface grids so that ships in the same area are mapped to the same grid, obtains the ship navigation trajectories represented by grid node trajectories, and converts the grid node trajectory data into a preset coordinate format; and a trajectory data processing module, which processes the obtained grid node trajectory data into trajectory sequences in chronological order with the grid entity number as the primary key, calculates the range of all grids, and uses the latitude and longitude of the center point of all grids as the latitude and longitude of the grid entity to obtain a set of ship trajectories under the grid, with each time point corresponding to one ship trajectory.

3. The traffic prediction system based on spatiotemporal diffusion attention network according to claim 1, characterized in that the spatiotemporal graph construction module includes: an adjacency matrix construction module, which collects navigation data from the Automatic Identification System (AIS) and constructs a weighted adjacency matrix of a directed graph based on the ship trajectories, wherein the weights of the adjacency matrix represent the number of connections between entity nodes, and the adjacency matrix reflects the geographical relationships between nodes; an atomic file extraction module, which extracts atomic files from the network entities in the adjacency matrix, wherein the atomic files include a geographic file, a relational file, and a dynamic file, the geographic file records the entity node number, type, and corresponding spatial coordinates, the relational file records the associations between all nodes in a many-to-many format based on the entity nodes, and the dynamic file records the time-series data of each entity node; and a spatiotemporal graph construction module, which, based on the trajectory points of ships at sea, uses the relative positions of the ships at each time step to construct a spatiotemporal graph, which includes the set of vertices of the ships and the set of edges in the spatiotemporal graph.

4. The traffic prediction system based on spatiotemporal diffusion attention network according to claim 1, characterized in that the spatiotemporal diffusion attention network construction module includes: an input module, which loads the dynamic file, extracts traffic data and time series, generates time series samples through a sliding window based on the inflow and outflow of the entity, performs data format conversion preprocessing, and outputs the directed weighted adjacency matrix and traffic data of the maritime road network; a spatiotemporal convolution module, which receives the directed weighted adjacency matrix and traffic data and uses convolution to capture spatiotemporal features, wherein the spatiotemporal convolution module includes two spatiotemporal convolution blocks, each of which contains two temporal convolutional layers and one spatial convolutional layer, the temporal convolutional layers being used to capture temporal features and the spatial convolutional layer being used to capture spatial features; and an output module, which receives and normalizes the features output by the spatiotemporal convolution module, then integrates and summarizes the features to generate traffic flow prediction values for future time steps, wherein the output module includes two temporal convolutional layers and a fully convolutional layer, the temporal convolutional layers being used to further extract temporal correlations and the fully convolutional layer being used to integrate multi-level features.

5. The traffic prediction system based on spatiotemporal diffusion attention network according to claim 4, characterized in that the temporal convolutional layer in the spatiotemporal convolution module includes a temporal attention layer, a two-dimensional causal convolution, and a gated linear unit. The input data passes through the temporal attention layer to learn the correlation between different time points in the data. After processing by the temporal attention layer, it is processed by an alignment operation and the two-dimensional causal convolution. In the two-dimensional causal convolution, local dependencies in the time series are captured. The fused result and the remaining part of the two-dimensional causal convolution pass through the gated linear unit. The feature fusion result is weighted and added to the aligned original input to obtain the gated output features. Then, an activation function introduces nonlinearity into the gated output features.

6. The traffic prediction system based on a spatiotemporal diffusion attention network according to claim 4, characterized in that the spatial convolutional layer in the spatiotemporal convolution module captures spatial features through the following steps: a diffusion mechanism is introduced to establish connections between multi-hop neighbor nodes and thereby capture the graph structure, the diffusion process being represented as a weighted combination of infinite random walks of graph signals that converges to a stationary distribution after multiple time steps; a probabilistic sparse self-attention mechanism is introduced, in which KL divergence is used to measure the difference between the current vector and the target distribution, and vectors that contribute little to the final result are identified and simplified or discarded; and graph convolution is applied to the graph-structured data to extract meaningful patterns and features in the spatial domain.
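
A minimal Python sketch of the two mechanisms named in this claim is given below for illustration only. It assumes that the diffusion is truncated after a finite number of steps with coefficients alpha * (1 - alpha)^k over the row-normalized adjacency matrix, and that the sparsity score of a query is the log-sum-exp of its scaled dot products minus their mean, which equals the KL divergence between a uniform distribution and the query's attention distribution up to an additive constant. Whether the application uses exactly these forms is not stated in the claim; all names are hypothetical.

```python
import math


def transition_matrix(weights):
    """Row-normalize a directed weighted adjacency matrix: P = D_out^-1 W."""
    rows = []
    for row in weights:
        total = sum(row)
        rows.append([w / total if total > 0 else 0.0 for w in row])
    return rows


def diffuse(signal, weights, alpha=0.5, steps=3):
    """Truncated random-walk diffusion: sum_{k=0..steps} alpha (1 - alpha)^k P^k x."""
    p = transition_matrix(weights)
    n = len(signal)
    walk = list(signal)  # P^0 x
    out = [0.0] * n
    for k in range(steps + 1):
        coeff = alpha * (1 - alpha) ** k
        out = [o + coeff * w for o, w in zip(out, walk)]
        walk = [sum(p[i][j] * walk[j] for j in range(n)) for i in range(n)]
    return out


def sparsity_scores(queries, keys):
    """Score each query by how far its attention is from uniform (KL-based)."""
    d = len(keys[0])
    scores = []
    for q in queries:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(logits)
        lse = m + math.log(sum(math.exp(v - m) for v in logits))
        scores.append(lse - sum(logits) / len(logits))
    return scores


def top_queries(queries, keys, u):
    """Indices of the u queries with the highest sparsity score."""
    s = sparsity_scores(queries, keys)
    ranked = sorted(range(len(queries)), key=lambda i: -s[i])
    return sorted(ranked[:u])


print(diffuse([0.0, 1.0], [[0, 1], [0, 0]], alpha=0.5, steps=1))
print(top_queries([[0.0, 0.0], [4.0, 0.0]], [[1.0, 0.0], [-1.0, 0.0]], u=1))
```

A query whose attention is nearly uniform receives the lowest score and is the first candidate to be simplified or discarded; only the `u` highest-scoring queries would go through full attention.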

7. A traffic prediction method based on a spatiotemporal diffusion attention network, employing the traffic prediction system based on a spatiotemporal diffusion attention network as described in any one of claims 1-6, characterized in that the method is applied to marine transportation and includes: a maritime road network construction step: dividing the sea surface into multiple grids, mapping ship nodes in the same area to the same grid, and outputting a set of ship trajectories represented by grid node trajectories; a spatiotemporal graph construction step: constructing a weighted adjacency matrix of a directed graph based on each set of ship trajectories, extracting atomic files from the adjacency matrix, and constructing the spatiotemporal graph based on the atomic files; and a spatiotemporal diffusion attention network construction step: constructing a spatiotemporal diffusion attention network comprising an input module, a spatiotemporal convolution module, and an output module, wherein the input module processes the atomic files into dataset features in a standard data format, the spatiotemporal convolution module captures the spatiotemporal features of the dataset features, and the output module generates traffic flow prediction values for future time steps based on the features extracted by the spatiotemporal convolution module.
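
The first two steps can be illustrated with the following Python sketch, which is not part of the claimed method. It assumes square grid cells of a fixed size in degrees, numbers the cells row by row, and takes the weight of a directed edge to be the number of observed transitions between two grid nodes; the cell size, the numbering scheme, the weight definition, and all names are assumptions. Points outside the gridded area are not handled.

```python
def to_grid_node(lon, lat, lon_min, lat_min, cell_deg, n_cols):
    """Map a position to the index of the grid cell that contains it."""
    col = int((lon - lon_min) // cell_deg)
    row = int((lat - lat_min) // cell_deg)
    return row * n_cols + col


def grid_trajectory(points, **grid):
    """Convert (lon, lat) points to grid nodes, merging consecutive repeats."""
    nodes = []
    for lon, lat in points:
        node = to_grid_node(lon, lat, **grid)
        if not nodes or nodes[-1] != node:
            nodes.append(node)
    return nodes


def adjacency_matrix(trajectories, n_nodes):
    """Directed weighted adjacency: w[i][j] counts transitions from node i to j."""
    w = [[0] * n_nodes for _ in range(n_nodes)]
    for traj in trajectories:
        for src, dst in zip(traj, traj[1:]):
            w[src][dst] += 1
    return w


# Toy area: 3 columns by 2 rows of 1-degree cells starting at (120 E, 30 N).
grid = dict(lon_min=120.0, lat_min=30.0, cell_deg=1.0, n_cols=3)
track = [(120.2, 30.1), (120.7, 30.4), (121.5, 30.2), (121.6, 31.3)]
nodes = grid_trajectory(track, **grid)
w = adjacency_matrix([nodes, [0, 1]], n_nodes=6)
print(nodes)
```

The first two points of `track` fall in the same cell and are merged into one grid node, which is how ship nodes in the same area end up mapped to the same grid.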

8. The traffic prediction method based on a spatiotemporal diffusion attention network according to claim 7, characterized in that the spatiotemporal diffusion attention network construction step includes: an input step: loading the dynamic file, extracting traffic data and time series, generating time series samples through a sliding window based on the inflow and outflow of each entity, performing data format conversion preprocessing, and outputting the directed weighted adjacency matrix and the traffic data of the maritime road network; a spatiotemporal convolution step: receiving the directed weighted adjacency matrix and the traffic data and using convolution to capture spatiotemporal features, wherein the spatiotemporal convolution module includes two spatiotemporal convolution blocks, each of which contains two temporal convolutional layers for capturing temporal features and one spatial convolutional layer for capturing spatial features; and an output step: normalizing the features output by the spatiotemporal convolution module and then integrating and summarizing them to generate traffic flow prediction values for future time steps, wherein the output module includes two temporal convolutional layers for further extracting temporal correlations and a fully convolutional layer for integrating multi-level features.

9. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the steps of the traffic prediction method based on a spatiotemporal diffusion attention network as described in any one of claims 7-8.

10. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements the steps of the traffic prediction method based on a spatiotemporal diffusion attention network as described in any one of claims 7-8.