A ship trajectory prediction method
By using the adjacency matrix learned by the stemGNN model as the prior matrix of the ASTGCN model, the method addresses the low accuracy of ship trajectory prediction in existing technologies and achieves more effective trajectory prediction.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- Chinese People's Liberation Army Cyberspace Force Information Engineering University
- Filing Date
- 2023-10-11
- Publication Date
- 2026-06-26
AI Technical Summary
Existing ship trajectory prediction methods are insufficient in terms of accuracy and prediction performance, especially for ship trajectories in non-Euclidean space, where it is difficult to effectively capture the dependencies and temporal patterns between trajectory features.
The adjacency matrix between trajectory features obtained by the stemGNN model is used as the prior matrix of the ASTGCN model. Combining the spatiotemporal attention mechanism and graph convolution of the ASTGCN model, graph convolution operation is performed through Chebyshev polynomials to extract the spatiotemporal correlation of large-scale graph data.
It significantly improves the accuracy and stability of ship trajectory prediction, can better adapt to different graph structures, and enhances the model's generalization ability and prediction performance.
Figure CN117195969B_ABST
Description
Technical Field
[0001] This invention relates to a method for predicting ship trajectories and belongs to the field of trajectory prediction technology.
Background Technology
[0002] Trajectory data refers to a series of data describing the changes of an intelligent agent in space over time. This data can be used to analyze the agent's behavioral patterns, motion characteristics, and interactions with the environment. The essence of trajectory prediction lies in predicting the state of an intelligent agent at a future moment by analyzing known agent characteristics and environmental information. This prediction has significant practical application value in fields such as autonomous driving, robot navigation, and shipping management. Trajectory prediction methods are mainly divided into prediction methods based on shallow learning and deep learning. Shallow learning-based prediction methods originated earlier and are effective in handling simple problems, but their application scope is limited due to the lack of unified evaluation standards and adaptability to complex scenarios. In the past decade or so, with the rapid development of machine learning, especially deep learning technology, trajectory prediction methods based on deep learning have gradually emerged. These methods mainly include recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and graph neural networks (GNNs). They have significant advantages in handling time-series prediction problems, and can capture long-term dependencies and complex patterns in the data.
[0003] When using pure CNN models or models that combine CNN and LSTM for trajectory prediction, the models can effectively extract the spatiotemporal features of the trajectory. However, these models are mainly suitable for processing sequential data in Euclidean space. The trajectory of a moving target has nonlinear and uncertain characteristics, especially for ship trajectories without land road constraints. When using the above models to perform time series prediction of non-Euclidean space data such as ship trajectories, the models cannot fully capture the dependencies between trajectory features or mine the temporal patterns of the trajectory.
[0004] To address the characteristics of trajectory data, graph neural networks (GNNs) are used for learning and training on graph structures, enabling the discovery of latent feature relationships and temporal patterns in non-Euclidean data such as trajectories. A GNN assumes that the state of a node is influenced by its neighboring nodes. It updates each node's state through graph signal propagation and fusion, learns deep features of each node, and uses them to classify or predict nodes or the whole graph. Graph convolutional networks (GCNs) are the most widely used GNN learning method. A GCN updates a node's state to the average of its own state and the states of its neighboring nodes, which realizes graph signal propagation and fusion. However, GCNs are limited in capturing long-term dependencies in time-series data and do not jointly model spatial and temporal dependencies.
[0005] To address this, an attention-based spatio-temporal graph convolutional network (ASTGCN) model has been proposed. This model applies attention mechanisms in the temporal and spatial dimensions and combines the input data with a spatio-temporal attention matrix as the input to the subsequent convolutions in the temporal and spatial dimensions. These operations are performed on three channels with hourly, daily, and weekly time granularities. Finally, the three components are weighted and merged to obtain the final prediction result, achieving excellent prediction performance. Graph convolution is used to extract the relationships between feature nodes from graph-based trajectory network structures, while temporal convolution is used to describe the dependencies between adjacent time segments. Prior matrix determination is a method for handling irregular graph shapes; it constrains the graph structure by introducing prior knowledge. Since spatio-temporal data often has irregular shapes, prior matrices are needed to handle this irregularity, enabling the model to better adapt to different graph structures and thus improve its generalization ability. For data such as GPS trajectories of land vehicles, which are subject to the geometric constraints of fixed roads, road network information can be used to construct trajectory network structures and derive the relationships between nodes. For ship trajectories, there are fewer geometric constraints at sea, making it difficult to determine prior knowledge from geometric constraints, which in turn affects the accuracy of ship trajectory prediction.
Summary of the Invention
[0006] The purpose of this invention is to provide a ship trajectory prediction method to solve the problems of low accuracy and poor prediction effect in current ship trajectory prediction.
[0007] To solve the above-mentioned technical problems, this invention provides a method for predicting ship trajectories, which includes the following steps:
[0008] 1) Acquire historical trajectory data of ships and preprocess the historical trajectory data;
[0009] 2) The preprocessed data is processed using the trained stemGNN model to obtain the adjacency matrix between trajectory features;
[0010] 3) Use the adjacency matrix obtained in step 2) as the prior matrix of the ASTGCN model, and input the divided historical trajectory and the prior matrix into the ASTGCN model for trajectory prediction.
[0011] This invention utilizes a trained stemGNN model to process preprocessed data to obtain the correlations between trajectory features. The resulting adjacency matrix is used as the prior matrix of the ASTGCN model, which is then used to predict ship trajectories. This invention leverages stemGNN's ability to accurately obtain the correlation coefficient matrix between trajectory features in historical ship trajectory data. Using this matrix as the prior matrix of the ASTGCN model significantly improves its prediction accuracy.
[0012] Furthermore, the adjacency matrix in step 2) is obtained as follows:
[0013] The historical trajectory data of ships is input into the latent correlation layer of the stemGNN model to learn the implicit correlation between variables, and this is used as an adjacency matrix. This adjacency matrix is used to characterize the correlation between trajectory data features and the historical state information of features.
[0014] This invention uses the latent correlation layer of stemGNN to determine the hidden correlation between historical trajectory variables, thereby establishing an adjacency matrix. The obtained adjacency matrix is processed in the frequency domain by two stemGNN blocks and finally transformed back into the spatial domain. This method can extract the correlations of any multidimensional time series without a predefined topological structure.
[0015] Furthermore, the historical trajectory data obtained in step 1) is provided by the AIS system on the ship and includes static information and dynamic information. The static information includes the ship name, MMSI number, ship type, and ship size, while the ship dynamic information includes the ship position, ship speed, and heading.
[0016] Furthermore, the preprocessing in step 1) includes removing outliers and null values from the AIS trajectory, extracting a single trajectory based on MMSI and trajectory time span, and compressing each trajectory.
[0017] This invention cleans the acquired AIS data of outliers and null values, avoiding interference from outliers and null values and further improving the accuracy of subsequent predictions; it extracts a single trajectory based on MMSI and trajectory time span, eliminating the influence of duplicate and interfering trajectories; and it reduces the data size and improves the speed of subsequent data processing by compressing each trajectory to eliminate redundant points without changing the original trajectory characteristics.
[0018] Furthermore, the spatiotemporal attention mechanism of the ASTGCN model used in step 3) applies an additive attention mechanism along the temporal and spatial dimensions of the input of each STblock module of the ASTGCN model. Attention weights are obtained through training to capture the relationships between input data at different times and locations.
[0019] Furthermore, the spatiotemporal graph convolution used in the ASTGCN model in step 3) is performed by Chebyshev polynomial graph convolution operation and multiple STblocks are stacked to extract a wider range of dynamic spatiotemporal correlations.
[0020] This invention uses Chebyshev polynomials for the graph convolution computation, which significantly improves the efficiency of processing large-scale graph data by avoiding an expensive eigenvalue decomposition while maintaining computational accuracy.
Attached Figure Description
[0021] Figure 1 is a flowchart of the ship trajectory prediction method of the present invention;
[0022] Figure 2 is a flowchart of the preprocessing used in this invention;
[0023] Figure 3 is a schematic diagram of the ASTGCN model used in this invention;
[0024] Figure 4a is a comparison chart of the L1 Loss metric of the present invention and existing methods on the training dataset during the experimental simulation;
[0025] Figure 4b is a detailed comparison chart of the L1 Loss metric of the present invention and existing methods on the training dataset during the experimental simulation;
[0026] Figure 4c is a comparison chart of the MSE metric of the present invention and existing methods on the training dataset during the experimental simulation;
[0027] Figure 4d is a detailed comparison chart of the MSE metric of the present invention and existing methods on the training dataset during the experimental simulation;
[0028] Figure 5a is a comparison chart of the L1 Loss metric of the present invention and existing methods on the validation dataset during the experimental simulation;
[0029] Figure 5b is a detailed comparison chart of the L1 Loss metric of the present invention and existing methods on the validation dataset during the experimental simulation;
[0030] Figure 5c is a comparison chart of the MSE metric of the present invention and existing methods on the validation dataset during the experimental simulation;
[0031] Figure 5d is a detailed comparison chart of the MSE metric of the present invention and existing methods on the validation dataset during the experimental simulation;
[0032] Figure 6a is a comparison chart of the L1 Loss metric of the present invention and existing methods on the test dataset during the experimental simulation;
[0033] Figure 6b is a detailed comparison chart of the L1 Loss metric of the present invention and existing methods on the test dataset during the experimental simulation;
[0034] Figure 6c is a comparison chart of the MSE metric of the present invention and existing methods on the test dataset during the experimental simulation;
[0035] Figure 6d is a detailed comparison chart of the MSE metric of the present invention and existing methods on the test dataset during the experimental simulation;
[0036] Figure 7 is a schematic diagram of the trajectory predicted using the prediction method of this invention during the experimental simulation;
[0037] Figure 8 is a heatmap of the stemGNN attention matrix obtained during the experimental simulation;
[0038] Figure 9a is a heatmap of the temporal-dimension attention matrix obtained when using ASTGCN for trajectory prediction during the experimental simulation;
[0039] Figure 9b is a heatmap of the spatial-dimension attention matrix obtained when using ASTGCN for trajectory prediction during the experimental simulation.
Detailed Implementation
[0040] The specific embodiments of the present invention will be further described below with reference to the accompanying drawings.
[0041] This invention uses the stemGNN model to process preprocessed data and obtain the adjacency matrix between trajectory features. This adjacency matrix is then used as the prior matrix of the ASTGCN model, and the ASTGCN model is subsequently used to predict ship trajectories. This invention leverages stemGNN to accurately obtain the correlation coefficient matrix between trajectory features in historical ship trajectory data. Using this matrix as the prior matrix of the ASTGCN model significantly improves the prediction accuracy of the ASTGCN model. The implementation process of this method is shown in Figure 1 and is explained in detail below.
[0042] 1. Collect historical trajectory data of ships.
[0043] Trajectory data possesses spatiotemporal characteristics and is formed by sampling the motion process of one or more moving objects. It typically includes sampling point location information, sampling time information, and velocity. The ship trajectory data used in this invention is AIS (Automatic Identification System) trajectory data, which refers to the location information, timestamps, and other ship attribute data collected from the vessel through the AIS system. This data can be used in fields such as ship monitoring, safety management, and navigation planning.
[0044] This embodiment uses AIS data from January to February 2017 with global coverage. The data is 10.9 GB in size and consists of daily AIS trajectory files in CSV format. Each CSV file contains millions of records, each composed of static and dynamic information: static information includes MMSI number, IMO number, vessel name, type, length and width, and position; dynamic information includes position, time, course, speed, heading, rate of turn, and navigation status. The specific content of the static and dynamic information is shown in Tables 1 and 2.
[0045] Table 1
[0046]
[0047] Table 2
[0048]
[0049]
[0050] Assume that this embodiment acquires trajectory data from T historical position points, $X = (x_{t-T+1}, \dots, x_t)$, where $x_t \in \mathbb{R}^{n}$ is the feature vector of the position point at time t, $x_t = (\mathrm{vesseltype}, \mathrm{draught}_t, \mathrm{lon}_t, \mathrm{lat}_t, \mathrm{sog}_t, \mathrm{cog}_t, \mathrm{rot}_t, \mathrm{navstatus}_t, \mathrm{time})$. The prediction result is denoted by $Y = (y_{t+1}, \dots, y_{t+p})$, where $y_{t+1} \in \mathbb{R}^{n'}$ is the predicted position information at time t+1, $y_{t+1} = (\mathrm{lon}_{t+1}, \mathrm{lat}_{t+1}, \mathrm{time})$.
[0051] 2. Preprocess the historical trajectory data of ships.
[0052] AIS trajectory data contains various kinds of noise due to the limitations of ship equipment and of the AIS system itself, coupled with the complexity of ship activity patterns. Therefore, to better extract the spatiotemporal characteristics and implicit semantic information from AIS trajectory data, this invention establishes an AIS data preprocessing framework to improve the quality of the AIS data used in subsequent trajectory clustering and prediction. As shown in Figure 2, the process first cleans up outliers and null values in the trajectory. Next, single trajectories are extracted from the cleaned data based on the MMSI and the trajectory time span. Finally, the TR-DP algorithm is used to compress each trajectory and eliminate redundant points, thereby reducing the data size without changing the original trajectory characteristics.
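The compression step can be illustrated with a small sketch. The description does not give the details of TR-DP, so the code below uses the classic Douglas-Peucker algorithm as a stand-in and is not the patented procedure. The function name `douglas_peucker`, the tolerance `epsilon`, and the use of planar distance on raw longitude/latitude pairs are all assumptions made for illustration.

```python
import numpy as np

def douglas_peucker(points, epsilon):
    """Classic Douglas-Peucker simplification of a 2-D polyline.

    points  : array-like of shape (n, 2), e.g. (lon, lat) pairs (assumed planar)
    epsilon : hypothetical tolerance, in the same units as the coordinates
    """
    points = np.asarray(points, dtype=float)
    if len(points) < 3:
        return points
    start, end = points[0], points[-1]
    chord = end - start
    chord_len = np.hypot(chord[0], chord[1])
    rel = points[1:-1] - start
    if chord_len == 0.0:
        # Degenerate chord: fall back to distance from the start point.
        dists = np.hypot(rel[:, 0], rel[:, 1])
    else:
        # Perpendicular distance of each interior point to the chord.
        dists = np.abs(chord[0] * rel[:, 1] - chord[1] * rel[:, 0]) / chord_len
    i = int(np.argmax(dists)) + 1  # index of the farthest interior point
    if dists[i - 1] <= epsilon:
        # Every interior point is within tolerance: keep only the endpoints.
        return np.vstack([start, end])
    left = douglas_peucker(points[: i + 1], epsilon)
    right = douglas_peucker(points[i:], epsilon)
    return np.vstack([left[:-1], right])  # drop the duplicated split point

track = [(0.0, 0.0), (1.0, 0.01), (2.0, 0.0), (3.0, 1.0), (4.0, 2.0)]
compressed = douglas_peucker(track, epsilon=0.1)
```

In this toy track the two nearly collinear interior points are dropped, while the turning point at (2, 0) is retained, which mirrors the stated goal of removing redundant points without changing the shape of the trajectory.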
[0053] 1) Clean up null values
[0054] The constantly changing latitude and longitude information is key data for understanding the movement trajectory of ships; data with missing latitude and longitude coordinates are directly removed.
[0055] 2) Outlier removal
[0056] The 3-σ method is a commonly used outlier detection and removal method based on statistical principles. It assumes that the data follows a normal distribution and uses the mean and standard deviation of the sample data as estimates of the population mean and standard deviation. A threshold, typically three standard deviations, is used to decide whether a value is an outlier: if a data value differs from the mean by more than three times the standard deviation, it is considered an outlier. In AIS trajectories, the SOG, COG, and ROT data can be used to detect outliers. This embodiment first calculates the mean and standard deviation of these data and then, according to the 3-σ principle, identifies all data points that deviate from the mean by more than three times the standard deviation as outliers. These outliers are replaced by linear interpolation. This effectively removes outliers while retaining most of the normal data.
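A minimal sketch of the 3-σ rule with linear-interpolation replacement is given below, assuming a single one-dimensional series such as SOG. The function name `replace_outliers_3sigma` and the sample values are hypothetical and are not taken from the embodiment.

```python
import numpy as np

def replace_outliers_3sigma(values):
    """Flag points more than 3 standard deviations from the mean and
    replace them by linear interpolation over the remaining points."""
    values = np.asarray(values, dtype=float)
    mean, std = values.mean(), values.std()
    is_outlier = np.abs(values - mean) > 3.0 * std
    cleaned = values.copy()
    idx = np.arange(values.size)
    if is_outlier.any() and not is_outlier.all():
        # Interpolate outlier positions from the neighbouring normal samples.
        cleaned[is_outlier] = np.interp(
            idx[is_outlier], idx[~is_outlier], values[~is_outlier]
        )
    return cleaned, is_outlier

# Hypothetical speed-over-ground series with one spike at index 10.
sog = np.array([10.0] * 10 + [100.0] + [10.0] * 10)
cleaned_sog, outlier_mask = replace_outliers_3sigma(sog)
```

Because the mean and standard deviation are computed from the series that contains the spike, very short series may never trigger the rule; the sketch makes no attempt to address that.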
[0057] 3) Trajectory Extraction
[0058] Since the same vessel or MMSI may correspond to multiple tracks for different time periods, tracks within the same time period can be extracted based on the time span differences between different activity tracks. The specific process is as follows:
[0059] The data is first sorted by timestamp so that the start and end times of different trajectory units can be determined. For each MMSI number, the time gap between each row of data and the preceding row is examined in timestamp order to find the boundaries between trajectory units. If the time gap of a row exceeds a preset time threshold (usually several hours or days), the current trajectory unit is closed and this row is marked as the starting point of a new trajectory unit. This keeps trajectory units from overlapping and avoids duplicate trajectories. For example, an initial trajectory marked with MMSI 413803476 may generate new markings such as 413xxxx76-1, 413xxxx76-2, etc., corresponding to trajectory units of the same MMSI in different time periods. These new trajectory units can be used to further analyze the ship's activity trajectory and navigation path.
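The splitting rule can be sketched as follows, under the assumption that each record is a tuple `(mmsi, timestamp_seconds, lon, lat)` and that a new unit begins at the first row whose gap to the previous row exceeds the threshold. The function name, the record layout, and the one-hour threshold in the example are illustrative choices, not values from the embodiment.

```python
from collections import defaultdict

def split_trajectories(records, max_gap_seconds):
    """Group records by MMSI, sort by time, and cut a new trajectory unit
    wherever the gap to the previous record exceeds max_gap_seconds."""
    by_mmsi = defaultdict(list)
    for rec in records:
        by_mmsi[rec[0]].append(rec)

    units = {}
    for mmsi, rows in by_mmsi.items():
        rows.sort(key=lambda r: r[1])  # sort by timestamp
        segment_no = 1
        current = [rows[0]]
        for prev, row in zip(rows, rows[1:]):
            if row[1] - prev[1] > max_gap_seconds:
                # Gap too large: close the current unit and start a new one.
                units[f"{mmsi}-{segment_no}"] = current
                segment_no += 1
                current = []
            current.append(row)
        units[f"{mmsi}-{segment_no}"] = current
    return units

# Hypothetical records: three points, a long silence, then two more points.
records = [
    (413803476, 0, 121.0, 31.0),
    (413803476, 60, 121.1, 31.0),
    (413803476, 120, 121.2, 31.0),
    (413803476, 20000, 122.0, 30.5),
    (413803476, 20060, 122.1, 30.5),
]
units = split_trajectories(records, max_gap_seconds=3600)
```

The resulting keys follow the `<MMSI>-<n>` pattern described above, so each unit can be processed as an independent trajectory.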
[0060] 3. Use preprocessed historical trajectory data to determine the correlation coefficient matrix between trajectory features, and use it as the prior matrix of the ASTGCN model.
[0061] Since this invention uses the ASTGCN model for prediction, this model requires the dependencies between multiple variables as priors; in the case of AIS trajectory prediction, this is a correlation coefficient matrix between trajectory features. Therefore, this invention needs to determine the prior matrix of the ASTGCN model before using it for prediction.
[0062] This embodiment uses the stemGNN model to determine the correlation coefficient matrix between trajectory features, and uses the self-attention matrix output by the stemGNN model as the correlation coefficient matrix. The specific process is as follows:
[0063] When dealing with multivariate time series prediction problems such as trajectory prediction, stemGNN first learns the hidden correlations between variables as an adjacency matrix through a latent correlation layer, and then feeds it into a two-layer stemGNN block. It first uses GFT to transform the spatial dimension from the spatial domain to the frequency domain, and then uses DFT to transform the temporal dimension from the time domain to the frequency domain. Next, it uses 1D-CNN and GLU to extract the temporal patterns in the frequency domain, and then uses IDFT to transform the time dimension back from the frequency domain to the time domain. Finally, it uses GCN to extract the spatial-frequency domain dependencies and uses IGFT to transform the spatial dimension back from the frequency domain to the spatial domain. This model is universally applicable to all multivariate time series without predefined topological structures, and the output self-attention matrix can also be used as prior input for other graph-based time series prediction models.
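To make the role of the latent correlation layer more concrete, the sketch below computes a self-attention matrix of the form softmax(QKᵀ/√d) over per-feature representations and treats it as an adjacency matrix. This is a simplified illustration, not the trained stemGNN layer: the recurrent encoder that would produce the representations is omitted, the projection weights are random stand-ins for learned parameters, and all names (`latent_correlation`, `series_repr`, `w_query`, `w_key`) are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def latent_correlation(series_repr, w_query, w_key):
    """Self-attention over feature representations.

    series_repr : (N, d) one representation vector per trajectory feature
    returns     : (N, N) row-normalized attention matrix used as adjacency
    """
    q = series_repr @ w_query
    k = series_repr @ w_key
    scores = q @ k.T / np.sqrt(q.shape[1])
    return softmax(scores, axis=1)

rng = np.random.default_rng(0)
n_features, hidden = 9, 16  # 9 matches the feature vector x_t; 16 is arbitrary
series_repr = rng.normal(size=(n_features, hidden))
w_query = rng.normal(size=(hidden, hidden))
w_key = rng.normal(size=(hidden, hidden))
adjacency = latent_correlation(series_repr, w_query, w_key)
```

Each row of `adjacency` sums to one, so entry (i, j) can be read as the relative weight that feature i assigns to feature j; a matrix of this kind is what would be passed on as the prior matrix.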
[0064] 4. Use the ASTGCN model for trajectory prediction.
[0065] The Attention-Based Spatio-Temporal Graph Convolutional Network (ASTGCN) is a deep learning model designed for spatio-temporal data. It can directly process time-series data such as trajectories and traffic flows on graphs. ASTGCN consists of three key components: a spatio-temporal attention mechanism, spatio-temporal graph convolution, and prior matrix determination. The spatio-temporal attention mechanism learns the dynamic spatio-temporal dependencies in trajectory data. Specifically, spatial attention is used to model the complex dynamic associations between different features, while temporal attention is used to capture the dynamic temporal associations between different time points. Spatio-temporal graph convolution is a graph-based convolution operation that includes graph convolution and temporal convolution. Graph convolution is used to extract the associations between feature nodes from the graph-based trajectory network structure, while temporal convolution is used to describe the dependencies between adjacent time segments. Prior matrix determination is a method for handling irregular graph shapes; it constrains the graph structure by introducing prior knowledge. Because spatiotemporal data often exhibits irregular shapes, a prior matrix is needed to handle this irregularity, enabling the model to better adapt to different graph structures and thus improve its generalization ability. The overall ASTGCN network framework is shown in Figure 3. It consists of multiple stacked STblocks. Building on the spatiotemporal attention mechanism, it captures the dependencies between feature nodes by modeling the temporal and spatial features of trajectory data through graph convolutional layers, and it adds residual modules to enhance the modeling capability of the model.
[0066] 1) Spatiotemporal attention mechanism
[0067] This invention employs an additive attention mechanism, which first calculates the similarity between the query vector and the key vector. These similarity values are then input into a softmax function to obtain attention weights. Next, these weights are multiplied by the value vectors to obtain a weighted sum, which serves as the attention output. For the input spatiotemporal trajectory data, the mutual influence between different locations (i.e., different features) in the spatial dimension changes dynamically. Similarly, the same location contributes differently to the final result at different time points.
[0068] The spatiotemporal attention mechanism applies additive attention to the spatial and temporal dimensions of the input spatiotemporal data. It acquires attention weights through training, captures the relationships between input data in different times and spaces, and provides more spatiotemporal information for the model's prediction performance.
[0069] The spatial attention mechanism is as follows:
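[0070] $S = V_s \cdot \sigma\left(\left(\mathcal{X}^{(r-1)} W_1\right) W_2 \left(W_3 \mathcal{X}^{(r-1)}\right)^{T} + b_s\right)$
[0071] $S'_{i,j} = \dfrac{\exp(S_{i,j})}{\sum_{j=1}^{N} \exp(S_{i,j})}$
[0072] where $\mathcal{X}^{(r-1)} \in \mathbb{R}^{N \times C_{r-1} \times T_{r-1}}$ is the input of the r-th STblock, $C_{r-1}$ is the number of channels of the input data of the r-th layer, and $T_{r-1}$ is the length of the time dimension of the input data of the r-th layer. $V_s, b_s \in \mathbb{R}^{N \times N}$, $W_1 \in \mathbb{R}^{T_{r-1}}$, $W_2 \in \mathbb{R}^{C_{r-1} \times T_{r-1}}$, and $W_3 \in \mathbb{R}^{C_{r-1}}$ are the parameters learned in training, and σ is the sigmoid activation function. The attention matrix S is computed dynamically from the input of this layer, and its element $S_{i,j}$ semantically represents the relevance between nodes i and j. The softmax function is used to normalize the weights. When performing graph convolution, the spatial attention matrix $S'$ is combined with the adjacency matrix W to adjust the influence between nodes dynamically.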
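The spatial attention computation can be sketched in NumPy as below. The sketch assumes the form published for ASTGCN, S = V_s · σ((X W1) W2 (W3 X)ᵀ + b_s) followed by a row-wise softmax, with X of shape (N, C, T). The parameters are random stand-ins for trained weights, and the sizes N = 9, C = 1, T = 15 are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(x, w1, w2, w3, v_s, b_s):
    """x: (N, C, T); w1: (T,); w2: (C, T); w3: (C,); v_s, b_s: (N, N)."""
    lhs = np.einsum("nct,t->nc", x, w1) @ w2   # (N, T): contract time, then channels
    rhs = np.einsum("c,nct->nt", w3, x)        # (N, T): contract channels
    s = v_s @ sigmoid(lhs @ rhs.T + b_s)       # (N, N) raw attention scores
    s = s - s.max(axis=1, keepdims=True)       # stabilize the softmax
    e = np.exp(s)
    return e / e.sum(axis=1, keepdims=True)    # normalize over j for each i

rng = np.random.default_rng(1)
n_nodes, n_channels, n_steps = 9, 1, 15
x = rng.normal(size=(n_nodes, n_channels, n_steps))
attention = spatial_attention(
    x,
    w1=rng.normal(size=n_steps),
    w2=rng.normal(size=(n_channels, n_steps)),
    w3=rng.normal(size=n_channels),
    v_s=rng.normal(size=(n_nodes, n_nodes)),
    b_s=rng.normal(size=(n_nodes, n_nodes)),
)
```

The temporal attention matrix follows the same pattern with the roles of the node and time axes exchanged.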
[0073] The mechanism of time attention is as follows:
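[0074] $E = V_e \cdot \sigma\left(\left(\left(\mathcal{X}^{(r-1)}\right)^{T} U_1\right) U_2 \left(U_3 \mathcal{X}^{(r-1)}\right) + b_e\right)$
[0075] $E'_{i,j} = \dfrac{\exp(E_{i,j})}{\sum_{j=1}^{T_{r-1}} \exp(E_{i,j})}$
[0076] where $V_e, b_e \in \mathbb{R}^{T_{r-1} \times T_{r-1}}$, $U_1 \in \mathbb{R}^{N}$, $U_2 \in \mathbb{R}^{C_{r-1} \times N}$, and $U_3 \in \mathbb{R}^{C_{r-1}}$ are the parameters learned in training. The temporal correlation matrix E is determined by the changing input, and its element $E_{i,j}$ semantically represents the relevance between time points i and j. Finally, E is normalized using the softmax function. This invention directly applies the normalized temporal attention matrix $E'$ to the input to obtain the attention-adjusted input $\hat{\mathcal{X}}^{(r-1)}$. This allows relevant information to be integrated and the input to be adjusted dynamically.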
[0077] This invention adds an additive attention mechanism to the temporal and spatial dimensions of the input to each STblock module to calculate the attention weights between different temporal and spatial nodes. Combining the spatiotemporal attention mechanism with graph convolutional networks enhances the modeling capability of the trajectory prediction model, expands the range of spatiotemporal feature representation, and improves the accuracy and robustness of trajectory prediction.
[0078] 2) Spatiotemporal graph convolution
[0079] Spatiotemporal graph convolution includes convolution in both the spatial and temporal dimensions. The former captures spatial dependencies from the neighborhood, while the latter mines temporal dependencies from nearby time points. Spectral graph theory extends convolution operations from grid-based data to graph-structured data. In this invention, the dependencies between trajectory features are treated as a graph structure, and the value of each node is considered a signal on the graph. To fully exploit the topological properties of the network, i.e., the dependencies between features, this invention employs graph convolution based on spectral graph theory to process the signals directly, exploiting the signal correlations within the network in the spatial dimension.
[0080] In spectral graph analysis, the structural properties of a graph can be obtained by analyzing its Laplacian matrix and the corresponding eigenvalues. The Laplacian matrix of a graph is defined as $L = D - A$, and its normalized form is $L = I_N - D^{-1/2} A D^{-1/2} \in \mathbb{R}^{N \times N}$, where A is the graph adjacency matrix, $I_N$ is the identity matrix, and the degree matrix D is a diagonal matrix composed of the node degrees, $D_{ii} = \sum_j A_{ij}$. The eigenvalue decomposition of the Laplacian matrix is:
[0081] $L = U \Lambda U^{T}$
[0082] where $\Lambda = \mathrm{diag}([\lambda_0, \dots, \lambda_{N-1}]) \in \mathbb{R}^{N \times N}$ is the diagonal matrix of eigenvalues and U is the orthogonal matrix formed by the eigenvectors. Compared to traditional CNNs, which are only applicable to regular data in Euclidean space, graph convolution can better capture the interactions and information transmission between nodes in the graph structure corresponding to non-Euclidean data. This is achieved by replacing the traditional convolution with a linear operator that is diagonal in the Fourier domain. The specific convolution formula is as follows:
[0083] $g_\theta *_G x = g_\theta(L)\,x = g_\theta(U \Lambda U^{T})\,x = U g_\theta(\Lambda) U^{T} x$
[0084] where ⊙ denotes the Hadamard product, $c_{ij} = a_{ij} \cdot b_{ij}$, and $*_G$ denotes the graph convolution operation. In traditional signal processing, convolution is typically implemented as a pointwise product in the Fourier domain: $g_\theta$ and the signal x are each transformed to the frequency domain, multiplied there, and the result is converted back to the original domain by an inverse Fourier transform. However, when the graph is very large, the eigenvalue decomposition of the Laplacian matrix becomes extremely time-consuming and computationally intensive. Chebyshev polynomials are a mathematical tool that can be used to approximate complex functions, thereby simplifying the computation. By using Chebyshev polynomials, this invention significantly improves the efficiency of processing large-scale graph data while maintaining computational accuracy, without performing an expensive eigenvalue decomposition.
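[0085] $g_\theta *_G x = g_\theta(L)\,x = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{L})\,x$
[0086] where the parameter $\theta \in \mathbb{R}^{K}$ is the vector of polynomial coefficients, $\tilde{L} = \frac{2}{\lambda_{max}} L - I_N$, and $\lambda_{max}$ is the largest eigenvalue of the Laplacian matrix. The Chebyshev polynomials are defined recursively as $T_k(x) = 2x\,T_{k-1}(x) - T_{k-2}(x)$, with $T_0(x) = 1$ and $T_1(x) = x$. To dynamically adjust the correlation between nodes, each term of the Chebyshev polynomial is combined with the spatial attention matrix $S'$ through the Hadamard product. Therefore, the above graph convolution can be transformed into:
[0087] $g_\theta *_G x = g_\theta(L)\,x = \sum_{k=0}^{K-1} \theta_k \left(T_k(\tilde{L}) \odot S'\right) x$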
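The Chebyshev approximation can be sketched as follows, assuming the unnormalized Laplacian L = D − A, the rescaling L̃ = (2/λ_max)L − I, and an optional attention matrix applied element-wise to each polynomial term. The four-node path graph, the coefficient values in `theta`, and all function names are illustrative assumptions. Here λ_max is computed exactly for simplicity, whereas implementations for large graphs would normally estimate it.

```python
import numpy as np

def scaled_laplacian(adjacency):
    """L~ = 2 L / lambda_max - I for a symmetric adjacency matrix."""
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    lambda_max = np.linalg.eigvalsh(laplacian).max()
    return 2.0 * laplacian / lambda_max - np.eye(len(adjacency))

def cheb_polynomials(l_tilde, order):
    """T_0..T_{order-1} via T_k = 2 L~ T_{k-1} - T_{k-2}."""
    polys = [np.eye(len(l_tilde)), l_tilde.copy()]
    for _ in range(2, order):
        polys.append(2.0 * l_tilde @ polys[-1] - polys[-2])
    return polys[:order]

def cheb_graph_conv(x, polys, theta, attention=None):
    """sum_k theta_k (T_k(L~) ⊙ S') x; without attention, plain T_k(L~)."""
    out = np.zeros_like(x, dtype=float)
    for t_k, theta_k in zip(polys, theta):
        op = t_k if attention is None else t_k * attention  # Hadamard product
        out += theta_k * (op @ x)
    return out

# Path graph with 4 nodes and one scalar signal value per node.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
x = np.array([1.0, 2.0, 0.5, -1.0])
theta = np.array([0.5, 0.3, 0.2])  # K = 3 hypothetical coefficients

l_tilde = scaled_laplacian(adjacency)
polys = cheb_polynomials(l_tilde, order=len(theta))
y = cheb_graph_conv(x, polys, theta)
```

The filter is applied using only matrix products with L̃, which is the source of the efficiency gain described above; for a polynomial filter the result coincides with the spectral form U g(Λ) Uᵀ x.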
[0088] In the spatial dimension, graph convolution operations capture neighbor information for each node, and then a standard convolutional layer is superimposed in the temporal dimension for computation, updating node information by merging information from neighboring time segments.
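[0089] $\mathcal{X}^{(r)} = \mathrm{ReLU}\left(\Phi * \left(\mathrm{ReLU}\left(g_\theta *_G \hat{\mathcal{X}}^{(r-1)}\right)\right)\right)$
[0090] where * denotes the standard convolution operation, Φ is the parameter of the convolution kernel in the time dimension, and ReLU is the activation function.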
[0091] Therefore, the spatiotemporal convolution module can effectively capture the spatiotemporal dependencies of trajectory data. The STblock is the core component of the ASTGCN model and consists of a spatiotemporal attention module, a spatiotemporal convolution module, and a residual module. Stacking multiple STblocks allows a wider range of dynamic spatiotemporal correlations to be extracted. Finally, a fully connected layer maps the spatiotemporal features extracted by the STblock modules onto the dimensional space of the prediction target, thereby achieving effective prediction of the trajectory data.
[0092] Experimental verification
[0093] To better illustrate the effects of the present invention, a simulation experiment is now conducted on the prediction method of the present invention.
[0094] 1) Determine the evaluation indicators
[0095] Mean Square Error (MSE) is a widely used metric for measuring the predictive performance of a regression model. It is the average of the squared differences between predicted and true values, and it is sensitive to outliers. The formula is as follows:
[0096] $\mathrm{MSE} = \dfrac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$
[0097] L1 Loss is another metric commonly used to evaluate the accuracy of regression models. It is calculated by summing the absolute differences between the predicted and actual values over all samples and then taking the average. Compared to MSE, L1 Loss is more robust to outliers. The formula is as follows:
[0098] $\mathrm{L1\ Loss} = \dfrac{1}{n} \sum_{i=1}^{n} \left|y_i - \hat{y}_i\right|$
[0099] where n is the number of trajectory points, $y_i$ is the true value of the i-th sample, and $\hat{y}_i$ is the predicted value of the i-th sample.
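The two metrics can be computed with a few lines of NumPy. The function names and the sample values below are illustrative and are not results from the experiment.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of squared differences between true and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def l1_loss(y_true, y_pred):
    """Mean of absolute differences between true and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

# One prediction is off by 4 while the others are exact.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 8.0])
mse_value = mse(y_true, y_pred)        # 16 / 4 = 4.0
l1_value = l1_loss(y_true, y_pred)     # 4 / 4 = 1.0
```

The single large error contributes 16 to the squared sum but only 4 to the absolute sum, which illustrates why MSE reacts more strongly to outliers than L1 Loss.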
[0100] In trajectory prediction models, MSE and L1 Loss are used to measure the predictive power of the model. MSE penalizes large errors more heavily and therefore favors smoother prediction results, while L1 Loss is the more robust measure when the data contains outliers.
[0101] 2) Experimental Design
[0102] The preprocessed trajectory data is first grouped by MMSI, and the trajectory segment corresponding to each MMSI is then converted into the input required by the model using a sliding window method. The features in the training data are vessel type, draught_t, lon_t, lat_t, sog_t, cog_t, rot_t, navstatus_t, and time, and the prediction targets are the future trajectory points lon_t, lat_t. The dataset used in this experiment spans from January 1, 2017 to January 5, 2017. The train/val/test split is 7:2:1, and the training sequence step size (window) and prediction sequence step size (horizon) are 15 and 5, respectively. The optimizer is Adam, and gradient clipping and learning rate decay are used to gradually bring the model closer to the optimal solution during training.
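The grouping, sliding window, and 7:2:1 split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the experiment's actual pipeline: the row layout (dicts with `mmsi`, `time`, `lon`, `lat` keys), the function names `make_windows` and `split_7_2_1`, the stride of one step, and the sequential (unshuffled) split are all assumptions.

```python
from collections import defaultdict


def make_windows(rows, window=15, horizon=5):
    """Group rows by MMSI, then slide a window over each ship's track.

    Each sample pairs `window` consecutive input rows with the
    (lon, lat) of the following `horizon` rows as prediction targets.
    """
    by_ship = defaultdict(list)
    for row in rows:
        by_ship[row["mmsi"]].append(row)

    samples = []
    for track in by_ship.values():
        track.sort(key=lambda r: r["time"])
        # Tracks shorter than window + horizon yield no samples.
        for start in range(len(track) - window - horizon + 1):
            inputs = track[start:start + window]
            future = track[start + window:start + window + horizon]
            targets = [(r["lon"], r["lat"]) for r in future]
            samples.append((inputs, targets))
    return samples


def split_7_2_1(samples):
    """Split samples sequentially into train/val/test at a 7:2:1 ratio."""
    n = len(samples)
    n_train = n * 7 // 10
    n_val = n * 2 // 10
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test


# Hypothetical data: one ship with 29 points, one with only 10 points.
rows = [{"mmsi": 111, "time": t, "lon": 120.0 + t, "lat": 30.0 + t}
        for t in range(29)]
rows += [{"mmsi": 222, "time": t, "lon": 121.0, "lat": 31.0}
         for t in range(10)]

samples = make_windows(rows)
train, val, test = split_7_2_1(samples)
print(len(samples), len(train), len(val), len(test))  # 10 7 2 1
```

Windows never cross MMSI boundaries, so a sample always describes a single ship. The short track of ship 222 contributes nothing because it cannot fill a 15-step window plus a 5-step horizon.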
[0103] In addition to testing the ASTGCN model of this invention, this experiment also compares it with the STGCN, CNN_LSTM_CBAM, TCN, and stemGNN models. The STGCN structure is similar to the ASTGCN structure, using Gated CNNs to extract spatial and temporal features and performing graph convolution operations in both dimensions. The CNN_LSTM_CBAM model integrates convolutional neural networks and long short-term memory networks, and introduces an attention mechanism in the convolutional module for feature fusion and selection, giving the model strong feature extraction and pattern recognition capabilities. TCN uses pure convolution, employing causal convolution, dilated convolution, and a residual network design to capture temporal patterns and enhance the memory of long-term dependencies; it can compute features at multiple locations in parallel, resulting in fast training speed. stemGNN extracts temporal patterns by transforming the data from the spatiotemporal domain to the spectral domain and performing the corresponding convolution operations, and combines a self-attention mechanism to capture the dependency information between features, demonstrating adaptive capabilities for temporal prediction tasks.
[0104] 3) Analysis of experimental results
[0105] The prediction accuracy of the five models on the train/val/test trajectory data is shown in Figures 4a, 4c, 5a, 5c, 6a, and 6c. Figures 4b, 4d, 5b, 5d, 6b, and 6d show the results of the four remaining models after the poorly performing model is removed, allowing a more detailed comparison. Looking at the training and testing results of each model, CNN_LSTM_CBAM performed poorly, with a high error after stabilization and the slowest convergence. The other four models performed similarly, but ASTGCN had the smallest error and fastest convergence on the training dataset, followed by stemGNN and STGCN, while TCN performed worse than STGCN. On the val and test datasets, ASTGCN performed best, with errors very close to those on the training dataset and stable results, followed by stemGNN and STGCN, while TCN showed significant fluctuations and increased error. CNN_LSTM_CBAM still performed poorly. The ship trajectories predicted by the prediction model of this invention in this experiment are shown in Figure 7.
[0106] The trajectories of moving targets typically exhibit nonlinear characteristics, and the movement paths of ships at sea, unlike land traffic, do not have fixed road constraints, making the establishment of ship trajectory models more challenging. From the prediction results of CNN_LSTM_CBAM and TCN, it can be seen that the introduction of standard convolutions helps with trajectory prediction to some extent; however, these models only consider local information at each location of the input data, ignoring temporal dimension information, and therefore cannot effectively utilize the temporal features in ship trajectories. In contrast, graph-based trajectory models such as stemGNN, STGCN, and ASTGCN are more suitable for ship trajectories with complex nonlinear relationships. They can capture the dependencies between features and temporal pattern information, making the trajectory prediction results more reliable and stable. The trajectory prediction accuracy of each model is shown in Table 3.
[0107] Table 3
[0108]
[0109]
[0110] Furthermore, ASTGCN's trajectory prediction also possesses a degree of interpretability. After the stemGNN model is trained, the attention matrix in its latent correlation layer is displayed as a heatmap, as shown in Figure 8, from which the dependencies between feature nodes can be observed. During the training of the ASTGCN network, the attention matrix of stemGNN is used as prior input. After training, the attention matrices in the temporal and spatial dimensions are visualized, as shown in Figures 9a and 9b: the temporal matrix reveals the interactions between different time steps, while the spatial matrix shows that the motion characteristics sog and rot have a significant impact on other variables (such as lon, lat, and time), while the interactions between the other variables are weak. Here, sog represents the ship's velocity, and rot represents the ship's turning angular velocity. The dependencies between these two motion characteristics and the other variables, shown in Figure 3-7, also conform to the principles of motion.
[0111] Therefore, the experimental results further demonstrate that the prediction method using the ASTGCN model in this invention performs best in both prediction accuracy and stability. This indicates that the trajectory prediction model proposed in this invention has good performance in capturing the spatiotemporal features of ship trajectories and can effectively predict the future motion trajectory of ships. Furthermore, the spatiotemporal attention mechanism and spatiotemporal graph convolution of the network structure also possess a certain degree of interpretability.
Claims
1. A method for predicting ship trajectories, characterized in that the method includes the following steps: 1) Acquire historical trajectory data of ships and preprocess the historical trajectory data; 2) Process the preprocessed data using the trained stemGNN model: the ship historical trajectory data is input into the latent correlation layer of the stemGNN model to learn the implicit correlation between variables, which is used as an adjacency matrix; this adjacency matrix characterizes the correlation between trajectory data features and the historical state information of the features; 3) Use the adjacency matrix from step 2) as the prior matrix of the ASTGCN model, and input the divided historical trajectories and the prior matrix into the ASTGCN model for trajectory prediction; the spatiotemporal attention mechanism adopted by the ASTGCN model adds an additive attention mechanism to the time and space dimensions of the input of each STblock module of the ASTGCN model, and the attention weights are obtained through training to capture the relationships between the input data across different times and spaces.
2. The ship trajectory prediction method according to claim 1, characterized in that the historical trajectory data obtained in step 1) is provided by the ship's AIS system and includes static and dynamic information. The static information includes the ship's name, MMSI number, ship type, and ship size, while the dynamic information includes the ship's position, speed, and heading.
3. The ship trajectory prediction method according to claim 2, characterized in that the preprocessing in step 1) includes removing outliers and null values from the AIS trajectory, extracting a single trajectory based on MMSI and trajectory time span, and compressing each trajectory.
4. The ship trajectory prediction method according to claim 3, characterized in that, Extracting a single trajectory based on MMSI and trajectory time span includes: The data is sorted by timestamp to determine the start and end times of different trajectory units. For each MMSI number, the time span of each row of data is measured sequentially from front to back according to the timestamp to determine the time span difference between different trajectory units and separate them. If the time span of a row of data exceeds a preset time threshold, this row of data is marked as the starting point of a new trajectory unit, and a new trajectory unit begins from the next row of data.
5. The ship trajectory prediction method according to claim 1, characterized in that the ASTGCN model in step 3) uses spatiotemporal graph convolution based on spectral graph theory to directly process the signal.
6. The ship trajectory prediction method according to claim 1, characterized in that the ASTGCN model in step 3) uses spatiotemporal graph convolution, in which the graph convolution operation is performed with Chebyshev polynomials, and multiple STblocks are stacked to extract a wider range of dynamic spatiotemporal correlations.