A method for predicting a three-dimensional sound velocity field based on an attention mechanism recurrent neural network
By employing a three-dimensional sound velocity field prediction method based on an attention-based recurrent neural network, and training a DA-RNN with spatial and temporal attention layers, the problem of insufficient real-time performance and accuracy in ocean sound velocity field prediction in existing technologies is solved, achieving real-time, high-accuracy forecasts while reducing computational requirements.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HAINAN RES INST OF ZHEJIANG UNIV
- Filing Date
- 2023-01-09
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies struggle to achieve real-time performance and accuracy in predicting the three-dimensional sound velocity field of the ocean, and their computational requirements are high, failing to meet the demands of the complexity and rapid changes in the marine environment.
A three-dimensional sound velocity field prediction method based on attention mechanism recurrent neural network (DA-RNN) is adopted. By constructing spatial and temporal attention layers, DA-RNN is trained using training set to predict the three-dimensional sound velocity field.
It achieves real-time, high-accuracy, large-scale three-dimensional sound velocity field prediction, reduces computational requirements, is suitable for personal computers and mobile devices, and overcomes the shortcomings of existing technologies in terms of real-time performance and accuracy.
Figure CN116010817B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of sound field prediction, and in particular relates to a three-dimensional sound velocity field prediction method based on an attention mechanism recurrent neural network.
Background Technology
[0002] The ocean contains abundant resources, and its development has brought enormous benefits to humankind. Therefore, long-term, timely, and reliable observation of the marine environment is receiving increasing attention. However, unlike air, electromagnetic waves face various limitations in their propagation within the ocean. Consequently, sound waves have become the most important means of ocean observation, widely used in communication, positioning, and observation. Many factors influence sound propagation, with sound speed being a crucial one. Sound speed in the marine environment is affected by sea temperature, salinity, and depth. Due to the complex and variable marine environment, temperature and salinity exhibit nonlinear variations both spatially and temporally, making the prediction of a three-dimensional sound speed field extremely challenging.
[0003] Predicting the sound velocity field requires both real-time performance and high accuracy. Common numerical simulation methods are divided into forecast-driven and post-report-driven (hindcast-driven) methods. Post-report-driven methods must assimilate environmental field data collected by underwater equipment in real time; they offer high accuracy but fail to meet real-time requirements. Forecast-driven methods do not require real-time environmental field data, but their accuracy decreases as the prediction time increases, so they fail to meet accuracy requirements. Furthermore, numerical simulation methods subdivide the ocean area into small grids and calculate values for all grids using variational methods and the Navier-Stokes equations, which requires supercomputers or workstations and consumes significant computation time.
Summary of the Invention
[0004] The purpose of this invention is to address the shortcomings of existing technologies by providing a three-dimensional sound velocity field prediction method based on an attention-based recurrent neural network. This invention enables real-time, high-accuracy, and wide-range three-dimensional sound velocity field prediction.
[0005] The objective of this invention is achieved through the following technical solution: a three-dimensional sound velocity field prediction method based on an attention mechanism recurrent neural network, comprising the following steps:
[0006] (1) Take k temporally consecutive predicted three-dimensional sound velocity fields and the k corresponding post-report three-dimensional sound velocity fields as inputs, pair them with the h post-report three-dimensional sound velocity fields that immediately follow as outputs, and collect multiple such data pairs to construct a training set;
[0007] (2) Construct a DA-RNN and train the constructed DA-RNN based on the training set to obtain the final DA-RNN;
[0008] (3) Input the predicted three-dimensional sound velocity field into the final DA-RNN to obtain the predicted post-report three-dimensional sound velocity field and evaluate the accuracy of the DA-RNN prediction.
[0009] Optionally, step (2) includes the following sub-steps:
[0010] (2.1) Construct a DA-RNN, which includes a spatial attention layer and a temporal attention layer, and the spatial attention layer and the temporal attention layer are connected;
[0011] (2.2) Train the DA-RNN constructed in step (2.1) using the training set obtained in step (1), and update the parameters of the DA-RNN to obtain the final DA-RNN.
[0012] Optionally, step (2.1) includes the following sub-steps:
[0013] (2.1.1) The predicted three-dimensional sound velocity field is input into the three-dimensional convolutional layer to obtain spatial attention features, so as to complete the construction of the spatial attention layer;
[0014] (2.1.2) Input the spatial attention features into the temporal attention layer to obtain the predicted post-report three-dimensional sound velocity field, so as to complete the construction of the temporal attention layer.
[0015] Optionally, step (2.1.1) includes the following sub-steps:
[0016] (2.1.1.1) Define the three-dimensional sound velocity field as a three-dimensional array;
[0017] (2.1.1.2) Input the predicted three-dimensional sound velocity field into the spatial attention layer, and calculate the spatial attention matrix at time t from the state $p_{t-1}$ and output $h_{t-1}$ of the LSTM unit at time t-1 and the predicted three-dimensional sound velocity field $c_t$ at time t;
[0018] (2.1.1.3) Normalize the spatial attention matrix using the softmax function and calculate the attention weight matrix at time t;
[0019] (2.1.1.4) Obtain the weighted predicted three-dimensional sound velocity field based on the attention weight matrix and the predicted three-dimensional sound velocity field;
[0020] (2.1.1.5) Obtain k temporally continuous spatial attention features based on the weighted predicted three-dimensional sound velocity field to complete the construction of the spatial attention layer.
[0021] Optionally, step (2.1.2) includes the following sub-steps:
[0022] (2.1.2.1) Input the spatial attention features and the corresponding post-report three-dimensional sound velocity field into the temporal attention layer, and calculate the temporal attention matrix from the state $s_{t-1}$ and output $d_{t-1}$ of the LSTM unit at time t-1 and the attention features of the predicted three-dimensional sound velocity field at all times;
[0023] (2.1.2.2) Normalize the temporal attention matrix using the softmax function and calculate the temporal attention weighting matrix;
[0024] (2.1.2.3) Obtain spatiotemporal attention features by weighting the spatial attention features according to the temporal attention weighting matrix;
[0025] (2.1.2.4) The post-reported three-dimensional sound velocity field is spliced with the spatiotemporal attention features to obtain the weighted predicted three-dimensional sound velocity field;
[0026] (2.1.2.5) Based on the weighted predicted three-dimensional sound velocity field and the spatiotemporal attention features, the predicted post-report three-dimensional sound velocity field is obtained to complete the construction of the temporal attention layer.
[0027] Optionally, step (2.2) includes the following sub-steps:
[0028] (2.2.1) Input the predicted three-dimensional sound velocity field from the training set into the DA-RNN, and output the predicted value of the three-dimensional sound velocity field.
[0029] (2.2.2) The error between the predicted value of the three-dimensional sound velocity field and the post-report three-dimensional sound velocity field in the training set is calculated using the mean square error.
[0030] (2.2.3) Determine whether the error in step (2.2.2) is less than or equal to the set error threshold. If the error is less than or equal to the set error threshold, stop training to obtain the final DA-RNN; otherwise, adjust the values of the DA-RNN parameters by gradient descent and update the DA-RNN parameters by backpropagation algorithm, and return to step (2.2.1).
[0031] Optionally, the accuracy of the DA-RNN prediction is determined by the root mean square error, which is expressed as:
[0032] $$\mathrm{RMSE} = \sqrt{\frac{1}{X \cdot Y \cdot Z} \sum_{i=1}^{X} \sum_{j=1}^{Y} \sum_{m=1}^{Z} \left( c^{*}(i,j,m) - c'(i,j,m) \right)^{2}}$$
[0033] Wherein, RMSE represents the error between the predicted post-report three-dimensional sound velocity field and the actually measured post-report three-dimensional sound velocity field; $c^{*}(i,j,m)$ represents the sound velocity value at position (i,j,m) of the post-report three-dimensional sound velocity field predicted by the DA-RNN; $c'(i,j,m)$ represents the sound velocity value at position (i,j,m) of the actually measured post-report three-dimensional sound velocity field; X, Y, and Z are the numbers of samples of the post-report three-dimensional sound velocity field on the x-axis, y-axis, and z-axis, respectively; and X*Y*Z is the total number of samples of the post-report three-dimensional sound velocity field.
[0034] The beneficial effects of this invention are as follows. Firstly, it improves the accuracy of three-dimensional sound velocity field prediction, overcoming the limitation of forecast-driven numerical simulation methods, whose accuracy decreases as the prediction time increases. Secondly, it improves the real-time performance of three-dimensional sound velocity field prediction: no underwater equipment needs to be deployed to sample environmental field data in real time, which avoids the time overhead of transmitting environmental field data back to shore via satellite. Thirdly, it makes three-dimensional sound velocity field prediction easier to use because it requires less computational power. Whereas numerical simulation methods require workstations or supercomputers, the trained DA-RNN can be deployed on personal computers and mobile devices.
Attached Figure Description
[0035] Figure 1 is a flowchart of the three-dimensional sound velocity field prediction of this invention;
[0036] Figure 2 is a schematic diagram of the training set used in this invention to train the neural network;
[0037] Figure 3 is a schematic diagram of the neural network training process of the present invention;
[0038] Figure 4 is a schematic diagram of the overall neural network model of the present invention;
[0039] Figure 5 is a schematic diagram of the LSTM unit of the present invention;
[0040] Figure 6 is a schematic diagram of the spatial attention layer of the present invention;
[0041] Figure 7 is a schematic diagram of the temporal attention layer of the present invention;
[0042] Figure 8 is an example of the present invention predicting the three-dimensional sound velocity field 120 hours later.
Detailed Implementation
[0043] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0044] This invention uses a dual-stage attention-based recurrent neural network (DA-RNN) to learn the relationship between the existing post-report sound velocity field and the predicted sound velocity field. After training, a converged DA-RNN is obtained. The existing predicted sound velocity field is input into the trained DA-RNN to predict the post-report sound velocity field, thereby achieving real-time and high-accuracy three-dimensional sound velocity field prediction.
[0045] Referring to Figure 1, the three-dimensional sound velocity field prediction method based on an attention mechanism recurrent neural network of the present invention includes the following steps:
[0046] (1) As shown in Figure 2, k temporally consecutive predicted three-dimensional sound velocity fields and the k corresponding post-report three-dimensional sound velocity fields are taken as inputs and paired with the h post-report three-dimensional sound velocity fields that follow as outputs to form one data pair. A total of n data pairs are formed and used as the training set to train the DA-RNN.
[0047] In this embodiment, the k predicted three-dimensional sound velocity fields and the k post-report three-dimensional sound velocity fields used as input are represented as follows:
[0048] $$C_{input} = \{C, C'\} = \{c_{t+1}, \dots, c_{t+k}, c'_{t+1}, \dots, c'_{t+k}\}$$
[0049] Among them, $C_{input}$ is the input of the attention-based recurrent neural network (DA-RNN); $C$ is the set of predicted three-dimensional sound velocity fields from time t+1 to time t+k, and $C'$ is the set of post-report three-dimensional sound velocity fields from time t+1 to time t+k; k is the input time sequence length; $c_{t+1}$ represents the predicted three-dimensional sound velocity field at time t+1; $c_{t+k}$ represents the predicted three-dimensional sound velocity field at time t+k; $c'_{t+1}$ represents the post-report three-dimensional sound velocity field at time t+1; and $c'_{t+k}$ represents the post-report three-dimensional sound velocity field at time t+k.
[0050] The h output post-report three-dimensional sound velocity fields that follow are represented as:
[0051] $$C_{output} = \{c'_{t+k+1}, \dots, c'_{t+k+h}\}$$
[0052] Among them, $C_{output}$ is the output of the attention-based recurrent neural network (DA-RNN), namely the set of post-report three-dimensional sound velocity fields from time t+k+1 to time t+k+h; h is the output time sequence length; $c'_{t+k+1}$ is the post-report three-dimensional sound velocity field at time t+k+1; and $c'_{t+k+h}$ is the post-report three-dimensional sound velocity field at time t+k+h.
[0053] The n data pairs in the training set are represented as follows:
[0054] $$\mathrm{Trainset} = \{(C_{input}, C_{output})\} = \{(\{c_{t+1}, \dots, c_{t+k}, c'_{t+1}, \dots, c'_{t+k}\}, \{c'_{t+k+1}, \dots, c'_{t+k+h}\}) \mid t = 1, \dots, n\}$$
[0055] Where Trainset is the training set, $C_{input}$ is the input of the training set, $C_{output}$ is the output of the training set, t represents time, $c_{t+1}$ represents the predicted three-dimensional sound velocity field at time t+1, $c'_{t+1}$ represents the post-report three-dimensional sound velocity field at time t+1, $c'_{t+k+1}$ represents the post-report three-dimensional sound velocity field at time t+k+1, and n represents the number of data pairs in the training set.
[0056] In this embodiment, the completed training set is used to train DA-RNN.
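The windowing in step (1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `build_training_set`, the array layout `(T, X, Y, Z)` with one field per time step, and the 0-based window index are all hypothetical choices made for the example.

```python
import numpy as np

def build_training_set(predicted, post_report, k, h, n):
    """Build n (input, label) pairs by sliding a window over time.

    predicted, post_report: arrays of shape (T, X, Y, Z), one field per time step
    (assumed layout). Each input stacks k predicted fields followed by the k
    post-report fields of the same times; each label holds the h post-report
    fields that immediately follow the input window.
    """
    if predicted.shape != post_report.shape:
        raise ValueError("predicted and post_report must have the same shape")
    if n + k + h - 1 > predicted.shape[0]:
        raise ValueError("not enough time steps for n windows")
    inputs, labels = [], []
    for t in range(n):  # 0-based window start (assumption)
        c = predicted[t:t + k]            # k predicted fields
        c_prime = post_report[t:t + k]    # k post-report fields, same times
        inputs.append(np.concatenate([c, c_prime], axis=0))
        labels.append(post_report[t + k:t + k + h])  # next h post-report fields
    return np.stack(inputs), np.stack(labels)
```

With k = 6, h = 2, and n = 168, this sketch would need at least 175 time steps in each input array.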
[0057] (2) Construct a DA-RNN and train the constructed DA-RNN based on the training set to obtain the optimal DA-RNN.
[0058] (2.1) Construct a DA-RNN, which includes a spatial attention layer (SAL) and a temporal attention layer (TAL), and the spatial attention layer and the temporal attention layer are connected.
[0059] In this embodiment, an attention-based recurrent neural network model, namely the DA-RNN, is constructed for predicting the post-report three-dimensional sound velocity field. As shown in Figure 4, the DA-RNN mainly consists of a Spatial Attention Layer (SAL) and a Temporal Attention Layer (TAL). The input predicted three-dimensional sound velocity field passes through the Spatial Attention Layer, and the input post-report three-dimensional sound velocity field passes through the Temporal Attention Layer, ultimately predicting the future post-report three-dimensional sound velocity field.
[0060] The spatial attention layer is based on Long Short-Term Memory (LSTM) units and is used to calculate the spatial attention weighting matrix for each part of the three-dimensional sound velocity field.
[0061] (2.1.1) The predicted three-dimensional sound velocity field is input into a three-dimensional convolutional layer to obtain spatial attention features. The spatial attention layer is constructed as shown in Figure 6.
[0062] (2.1.1.1) Define the three-dimensional sound velocity field as a three-dimensional array, denoted as C:
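[0063] $$C = [\, c(i,j,m) \,] \in \mathbb{R}^{X \times Y \times Z}, \quad 1 \le i \le X,\ 1 \le j \le Y,\ 1 \le m \le Z$$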
[0064] Where X, Y, and Z are the number of sampling points of the three-dimensional sound velocity field on the x-axis, y-axis, and z-axis, respectively, and c(i,j,m) represents the sound velocity value of the three-dimensional sound velocity field at position (i,j,m), where i, j, and m represent the position coordinates of the three-dimensional sound velocity field on the x-axis, y-axis, and z-axis, respectively.
[0065] (2.1.1.2) The predicted three-dimensional sound velocity field is input into the spatial attention layer and processed by the LSTM unit shown in Figure 5:
[0066] $$\begin{aligned} f_t &= \sigma_g\left(w_f [h_{t-1}; c_t] + b_f\right) \\ i_t &= \sigma_g\left(w_i [h_{t-1}; c_t] + b_i\right) \\ o_t &= \sigma_g\left(w_o [h_{t-1}; c_t] + b_o\right) \\ \tilde{p}_t &= \sigma_c\left(w_c [h_{t-1}; c_t] + b_c\right) \\ p_t &= f_t \odot p_{t-1} + i_t \odot \tilde{p}_t \\ h_t &= o_t \odot \sigma_c(p_t) \end{aligned}$$
[0067] Where $\sigma_g$ is the sigmoid function, $\sigma_c$ is the hyperbolic tangent function, ⊙ is the element-wise product, and [·;·] denotes the concatenation of two matrices. $f_t$ is the forget gate function, $i_t$ is the memory gate function, $o_t$ is the output gate function, and $c_t$ is the predicted three-dimensional sound velocity field at time t. $p_t$ is the state of the LSTM unit at time t, which records the information passed from all LSTM units before time t to subsequent LSTM units. $\tilde{p}_t$ is the activation vector of the LSTM unit at time t. $h_t$ is the three-dimensional sound velocity field feature output by the LSTM unit at time t. $w_f$, $w_i$, $w_o$, and $w_c$ are the parameters applied to $h_{t-1}$ and $c_t$ in the forget gate, memory gate, output gate, and activation vector, respectively, and $b_f$, $b_i$, $b_o$, and $b_c$ are the biases of the forget gate, memory gate, output gate, and activation vector, respectively.
[0068] It should be understood that the sigmoid function and the hyperbolic tangent function are both differentiable activation functions that allow the LSTM unit to remember or forget input data; they are not elaborated upon further here. The element-wise product multiplies the corresponding elements of two arrays.
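One LSTM update of the kind described above can be sketched in NumPy as follows. This is a minimal illustration on flattened vectors, assuming the standard LSTM gate equations. The function names, the dictionary layout of the weights, and the name `p_tilde` for the activation vector are hypothetical and are not taken from the original text.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid, used for the three gates."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(c_t, h_prev, p_prev, w, b):
    """One LSTM update on flattened vectors (illustrative sketch).

    c_t:    input at time t, shape (n_in,)
    h_prev: previous output h_{t-1}, shape (n_hid,)
    p_prev: previous state p_{t-1}, shape (n_hid,)
    w, b:   dicts keyed by 'f', 'i', 'o', 'c' (assumed layout) holding
            weights of shape (n_hid, n_hid + n_in) and biases of shape (n_hid,)
    """
    z = np.concatenate([h_prev, c_t])        # [h_{t-1}; c_t]
    f_t = sigmoid(w["f"] @ z + b["f"])       # forget gate
    i_t = sigmoid(w["i"] @ z + b["i"])       # memory (input) gate
    o_t = sigmoid(w["o"] @ z + b["o"])       # output gate
    p_tilde = np.tanh(w["c"] @ z + b["c"])   # activation vector
    p_t = f_t * p_prev + i_t * p_tilde       # new state
    h_t = o_t * np.tanh(p_t)                 # new output
    return h_t, p_t
```

Flattening the field to a vector is only a convenience for the sketch; a three-dimensional convolution could take the place of the matrix products.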
[0069] The attention mechanism uses the state $p_{t-1}$ and output $h_{t-1}$ of the LSTM unit at time t-1 and the predicted three-dimensional sound velocity field $c_t$ at time t to calculate the spatial attention matrix at time t, expressed as:
[0070] $$e_t = v_e\, \sigma_c\left(w_e [h_{t-1}; p_{t-1}] + u_e c_t\right)$$
[0071] Where $v_e$ represents the linear parameters of the spatial attention matrix, $w_e$ represents the linear parameters of the state $p_{t-1}$ and output $h_{t-1}$ of the LSTM unit at time t-1, and $u_e$ represents the linear parameters of the predicted three-dimensional sound velocity field $c_t$ at time t.
[0072] Furthermore, the attention matrix $e_t$ at time t has the same dimensions as the predicted three-dimensional sound velocity field $c_t$, that is:
[0073] $$e_t \in \mathbb{R}^{X \times Y \times Z}$$
[0074] (2.1.1.3) The spatial attention matrix is normalized using the softmax function, and the attention weight matrix at time t is calculated as follows:
[0075] $$\alpha_t(i,j,m) = \frac{\exp\left(e_t(i,j,m)\right)}{\sum_{i'=1}^{X} \sum_{j'=1}^{Y} \sum_{m'=1}^{Z} \exp\left(e_t(i',j',m')\right)}$$
[0076] Here, exp(·) represents the exponential function.
[0077] (2.1.1.4) Take the element-wise product of the attention weight matrix at time t and the predicted three-dimensional sound velocity field at time t to obtain the weighted predicted three-dimensional sound velocity field at time t:
[0078] $$\tilde{c}_t = \alpha_t \odot c_t$$
[0079] (2.1.1.5) After the weighted predicted three-dimensional sound velocity field $\tilde{c}_t$ at time t is obtained, it replaces the predicted three-dimensional sound velocity field $c_t$ at time t in the formulas of the spatial attention layer calculation process, and the attention feature $h_t$ of the predicted sound velocity field at time t is obtained.
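Steps (2.1.1.3) and (2.1.1.4) can be sketched as a softmax over the whole grid followed by an element-wise product. This is a minimal illustration. Normalizing over all X*Y*Z grid points at once is an assumption, as are the function names; subtracting the maximum before exponentiating is a common numerical-stability measure that does not change the result.

```python
import numpy as np

def spatial_attention_weights(e_t):
    """Softmax over every entry of a 3D attention matrix (assumed normalization axis)."""
    shifted = e_t - e_t.max()      # stabilizes exp() without changing the softmax
    weights = np.exp(shifted)
    return weights / weights.sum()

def weight_field(e_t, c_t):
    """Element-wise product of the attention weights and the predicted field."""
    alpha_t = spatial_attention_weights(e_t)
    return alpha_t * c_t
```

Because the weights sum to 1 over the grid, a uniform attention matrix scales every grid point by 1/(X*Y*Z).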
[0080] In this embodiment, k temporally consecutive predicted three-dimensional sound velocity fields are input into the spatial attention layer, and k temporally consecutive spatial attention features are finally obtained, represented as:
[0081] $$H = \{h_{t+1}, \dots, h_{t+k}\}$$
[0082] Where $H$ represents the k predicted three-dimensional sound velocity field features, t is time, k is the input time series length, and $h_{t+1}$ is the spatial attention feature at time t+1.
[0083] It should be understood that $h_{t+k}$ is the spatial attention feature at time t+k, that is, the k-th of the temporally consecutive spatial attention features starting from time t.
[0084] (2.1.2) Input the spatial attention features into the temporal attention layer to obtain the predicted post-report three-dimensional sound velocity field, so as to complete the construction of the temporal attention layer.
[0085] (2.1.2.1) As shown in Figure 7, the spatial attention features at the k time points:
[0086] $$H = \{h_{t+1}, \dots, h_{t+k}\}$$
[0087] and the corresponding post-report three-dimensional sound velocity fields:
[0088] $$C' = \{c'_{t+1}, \dots, c'_{t+k}\}$$
[0089] are input into the temporal attention layer, which also uses an attention mechanism for weighting and assigns higher weights to moments with significant changes. The temporal attention layer is also based on LSTM units, and the specific operations are as follows:
[0090] $$\begin{aligned} f_t &= \sigma_g\left(w'_f [d_{t-1}; c'_t] + b'_f\right) \\ i_t &= \sigma_g\left(w'_i [d_{t-1}; c'_t] + b'_i\right) \\ o_t &= \sigma_g\left(w'_o [d_{t-1}; c'_t] + b'_o\right) \\ \tilde{s}_t &= \sigma_c\left(w'_c [d_{t-1}; c'_t] + b'_c\right) \\ s_t &= f_t \odot s_{t-1} + i_t \odot \tilde{s}_t \\ d_t &= o_t \odot \sigma_c(s_t) \end{aligned}$$
[0091] Where $\sigma_g$ is the sigmoid function, $\sigma_c$ is the hyperbolic tangent function, ⊙ is the element-wise product, and [·;·] denotes the concatenation of two matrices. $f_t$ is the forget gate function, $i_t$ is the memory gate function, $o_t$ is the output gate function, and $c'_t$ is the post-report three-dimensional sound velocity field at time t. $s_t$ is the state of the LSTM unit at time t, which records the information passed from all LSTM units before time t to subsequent LSTM units. $\tilde{s}_t$ is the activation vector of the LSTM unit at time t. $d_t$ is the post-report three-dimensional sound velocity field feature output by the LSTM unit at time t. $w'_f$, $w'_i$, $w'_o$, and $w'_c$ are the parameters applied to $d_{t-1}$ and $c'_t$ in the forget gate, memory gate, output gate, and activation vector, respectively, and $b'_f$, $b'_i$, $b'_o$, and $b'_c$ are the biases of the forget gate, memory gate, output gate, and activation vector, respectively.
[0092] Then, the state $s_{t-1}$ and output $d_{t-1}$ of the LSTM unit at time t-1 and the attention features of the predicted three-dimensional sound velocity field at all times are used to calculate the temporal attention matrix $l_t$:
[0093] $$l_t(u) = v_d\, \sigma_c\left(w_d [d_{t-1}; s_{t-1}] + u_d h_u\right), \quad 1 \le u \le k$$
[0094] Where u ranges over all input times; $l_t(u)$ represents the influence of the attention feature $h_u$ of the predicted three-dimensional sound velocity field at time u (1≤u≤k) on the prediction of the temporal attention layer at time t; $v_d$ represents the linear parameters of the temporal attention matrix; $w_d$ represents the linear parameters of the state $s_{t-1}$ and output $d_{t-1}$ of the LSTM unit at time t-1; and $u_d$ represents the linear parameters of the attention feature $h_u$ of the predicted three-dimensional sound velocity field at time u.
[0095] (2.1.2.2) Similarly, the temporal attention matrix is normalized using the softmax function, and the temporal attention weighting matrix $\beta_t$ is calculated:
[0096] $$\beta_t(u) = \frac{\exp\left(l_t(u)\right)}{\sum_{u'=1}^{k} \exp\left(l_t(u')\right)}$$
[0097] Where exp(·) represents the exponential function and $\beta_t(u)$ represents the normalized $l_t(u)$.
[0098] (2.1.2.3) The spatial attention features $h_u$ of the predicted three-dimensional sound velocity field at each time are weighted and summed according to the temporal attention weighting matrix to obtain the spatiotemporal attention feature $q_t$:
[0099] $$q_t = \sum_{u=1}^{k} \beta_t(u)\, h_u$$
[0100] (2.1.2.4) The LSTM unit at time t concatenates the post-report three-dimensional sound velocity field $c'_t$ at time t with the spatiotemporal attention feature $q_t$ at time t:
[0101] $$\tilde{c}'_t = \tilde{w}\, [c'_t; q_t] + \tilde{b}$$
[0102] Where $\tilde{w}$ represents the linear parameters of the post-report three-dimensional sound velocity field $c'_t$ at time t and the spatiotemporal attention feature $q_t$, $\tilde{b}$ represents the bias parameters of the two, and [·;·] represents the concatenation operation, resulting in the weighted predicted three-dimensional sound velocity field $\tilde{c}'_t$ at time t. Afterwards, $\tilde{c}'_t$ replaces the post-report three-dimensional sound velocity field $c'_t$ at time t in the relevant LSTM formulas above, and the attention feature $d_t$ of the sound velocity field at time t is obtained.
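The temporal weighting in steps (2.1.2.2) and (2.1.2.3) reduces to a softmax over the k time steps followed by a weighted sum of the spatial attention features. The sketch below illustrates only that part. The function name and array layout are assumptions, and the attention scores are taken as given rather than computed from the LSTM state.

```python
import numpy as np

def temporal_attention(scores, features):
    """Normalize k attention scores and form the weighted sum of k features.

    scores:   shape (k,), one unnormalized score per input time step
    features: shape (k, ...), one spatial attention feature per time step
    Returns (beta, q): the softmax weights and the spatiotemporal feature.
    """
    shifted = scores - scores.max()               # numerical stability only
    beta = np.exp(shifted) / np.exp(shifted).sum()
    # Weighted sum over the time axis: q = sum_u beta[u] * features[u]
    q = np.tensordot(beta, features, axes=([0], [0]))
    return beta, q
```

Time steps with larger scores contribute more to the combined feature, which matches the stated aim of giving higher weight to moments with significant changes.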
[0103] (2.1.2.5) Based on the weighted predicted three-dimensional sound velocity field and the spatiotemporal attention features, the predicted post-report three-dimensional sound velocity field is obtained to complete the construction of the temporal attention layer.
[0104] Specifically, the post-report three-dimensional sound velocity field feature $d_k$ output by the LSTM unit at time k (the last LSTM unit) and the spatiotemporal attention feature $q_k$ at time k are passed through linear equations to obtain the predicted post-report three-dimensional sound velocity field:
[0105] $$c^{*}_{k+s} = v_y \left(w_y [d_k; q_k] + b_w\right) + b_v$$
[0106] Where $c^{*}_{k+s}$ represents the predicted value of the post-report three-dimensional sound velocity field at time k+s, k represents the number of predicted three-dimensional sound velocity fields input to the DA-RNN, and s represents the target prediction time. $w_y$ represents the linear parameters of the post-report three-dimensional sound velocity field feature $d_k$ and the spatiotemporal attention feature $q_k$ at time k, and $b_w$ represents the bias parameters of both. $v_y$ denotes the linear parameter and $b_v$ the bias parameter.
[0107] (2.2) Train the DA-RNN constructed in step (2.1) using the training set obtained in step (1), and update the parameters of the DA-RNN to obtain the final DA-RNN.
[0108] Specifically, the training set consists of multiple data pairs, and for each data pair:
[0109] (2.2.1) As shown in Figure 3, the predicted three-dimensional sound velocity field from the training set is input into the DA-RNN, which outputs the predicted value of the three-dimensional sound velocity field, represented as:
[0110] $C_{predict} = f(C_{input})$
[0111] Where f(·) represents the DA-RNN, $C_{predict}$ represents the predicted value of the three-dimensional sound velocity field, and $C_{input}$ represents the predicted three-dimensional sound velocity field and the post-report three-dimensional sound velocity field.
[0112] (2.2.2) The mean square error (MSE) is used to calculate the error between the predicted value $C_{predict}$ of the three-dimensional sound velocity field and the post-report three-dimensional sound velocity field $C_{output}$ in the training set, expressed as follows:
[0113] $loss = (C_{output} - C_{predict})^{2}$
[0114] (2.2.3) Determine whether the error in step (2.2.2) is less than or equal to the set error threshold. If the error is less than or equal to the set error threshold, stop training to obtain the final DA-RNN; otherwise, adjust the values of the DA-RNN parameters by gradient descent and update the DA-RNN parameters by back propagation (BP) and return to step (2.2.1).
[0115] In this embodiment, the parameters of the DA-RNN include: $w_f$, $w_i$, $w_o$, $w_c$ and $b_f$, $b_i$, $b_o$, $b_c$ in the LSTM unit of the spatial attention layer, together with its attention mechanism parameters $w_e$, $u_e$; $w'_f$, $w'_i$, $w'_o$, $w'_c$ and $b'_f$, $b'_i$, $b'_o$, $b'_c$ in the LSTM unit of the temporal attention layer, together with its attention mechanism parameters $w_d$, $u_d$; as well as $w_y$, $b_w$, $b_v$.
[0116] The values of all parameters are adjusted by gradient descent until the error loss is reduced to less than or equal to the set error threshold. Then, the training of DA-RNN is terminated and the values of all parameters are fixed.
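The stopping rule in steps (2.2.1) to (2.2.3) can be illustrated with a deliberately simplified stand-in model. The sketch below trains a plain linear model rather than a DA-RNN, so the gradient is written out by hand instead of being obtained by backpropagation. The function name, learning rate, threshold, and iteration cap are all hypothetical values chosen for the example.

```python
import numpy as np

def train_until_threshold(x, y, lr=0.1, threshold=1e-6, max_iter=10_000):
    """Gradient descent on the MSE of a linear model, stopped by an error threshold.

    x: inputs of shape (n, d); y: targets of shape (n,).
    The linear model y_hat = x @ w stands in for the DA-RNN (assumption).
    Returns the fitted weights and the last computed loss.
    """
    w = np.zeros(x.shape[1])
    loss = float("inf")
    for _ in range(max_iter):
        err = x @ w - y                    # forward pass and residual
        loss = float(np.mean(err ** 2))    # mean square error
        if loss <= threshold:              # error small enough: stop training
            break
        grad = 2.0 * x.T @ err / len(y)    # gradient of the MSE with respect to w
        w -= lr * grad                     # gradient descent update
    return w, loss
```

The iteration cap is an addition for safety in the sketch; the text itself describes stopping only on the error threshold.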
[0117] (3) After the DA-RNN is trained with fixed parameters, it is not necessary to input the post-reported three-dimensional sound velocity field. Instead, the predicted three-dimensional sound velocity field is input into the final DA-RNN to obtain the predicted post-reported three-dimensional sound velocity field and evaluate the accuracy of the DA-RNN prediction.
[0118] Specifically, k consecutive predicted three-dimensional sound velocity fields that do not belong to the training set are input into the final DA-RNN obtained in step (2) to obtain the predicted post-report three-dimensional sound velocity fields.
[0119] The k temporally consecutive predicted three-dimensional sound velocity fields that are not part of the training set are represented as follows:
[0120] $C_{test} = \{c_{n'+1}, \dots, c_{n'+k} \mid n' > n + k\}$
[0121] The predicted post-report three-dimensional sound velocity field is represented as follows:
[0122] $C_{predict} = f'(C_{test})$
[0123] Where f′(·) represents the final DA-RNN after training.
[0124] The prediction accuracy of the DA-RNN is evaluated by comparing the actually measured and the predicted post-report three-dimensional sound velocity fields. Specifically, the root mean square error (RMSE) is used to judge the accuracy of the DA-RNN prediction, and it is expressed as:
[0125] $$\mathrm{RMSE} = \sqrt{\frac{1}{X \cdot Y \cdot Z} \sum_{i=1}^{X} \sum_{j=1}^{Y} \sum_{m=1}^{Z} \left( c^{*}(i,j,m) - c'(i,j,m) \right)^{2}}$$
[0126] Wherein, RMSE represents the error between the predicted post-report three-dimensional sound velocity field and the actually measured post-report three-dimensional sound velocity field; $c^{*}(i,j,m)$ represents the sound velocity value (in m/s) at position (i,j,m) of the post-report three-dimensional sound velocity field predicted by the DA-RNN; $c'(i,j,m)$ represents the sound velocity value (in m/s) at position (i,j,m) of the actually measured post-report three-dimensional sound velocity field; X, Y, and Z are the numbers of samples of the post-report three-dimensional sound velocity field on the x-axis, y-axis, and z-axis, respectively; and X*Y*Z is the total number of samples of the post-report three-dimensional sound velocity field.
[0127] It is easy to understand that the smaller the root mean square error, the greater the accuracy of the DA-RNN prediction; the larger the root mean square error, the smaller the accuracy of the DA-RNN prediction.
[0128] It should be understood that the root mean square error (RMSE) and the mean square error (MSE) have the same characteristics in judging prediction error. MSE is convenient for calculating the derivative in neural network iteration, while RMSE is numerically the square root of MSE, which can more intuitively represent the magnitude of the error.
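The RMSE evaluation described above can be sketched in a few lines of NumPy. The function name `rmse` and the argument names are illustrative. Both arguments are assumed to be arrays of shape (X, Y, Z) holding sound velocity values in m/s.

```python
import numpy as np

def rmse(predicted, measured):
    """Root mean square error between two fields of identical shape."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    if predicted.shape != measured.shape:
        raise ValueError("fields must have the same shape")
    # The mean over all X*Y*Z grid points is the MSE; its square root is the RMSE.
    return float(np.sqrt(np.mean((predicted - measured) ** 2)))
```

Because the mean runs over every grid point, a uniform offset of 2 m/s between the two fields gives an RMSE of exactly 2 m/s and an MSE of 4.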
[0129] The following describes in detail the three-dimensional sound velocity field prediction method using a three-dimensional convolutional recurrent neural network according to embodiments of the present invention, and the purpose and effects of the present invention will become more apparent.
[0130] Example
[0131] To verify the effectiveness of the DA-RNN method for predicting the three-dimensional sound velocity field, simulation analysis was conducted. The sound velocity field of a sea area in the South China Sea with a length of 80 km, a width of 80 km, and a depth of 375 m was selected. The number of sampling points on the x and y axes was 6, with a sampling resolution of 13 km; the number of sampling points on the z axis was 15, with a sampling resolution of 25 m. Then:
[0132] $C \in \mathbb{R}^{6 \times 6 \times 15}$, that is, $X = 6$, $Y = 6$, and $Z = 15$.
[0133] If the 3D convolution kernel used has a length of 2 on the x-axis and y-axis, and a length of 5 on the z-axis, then:
[0134]
[0135] Given an input sequence length k of 6 hours, a prediction time h of 2 hours, and a training set size n of 168, then:
[0136]
[0137] The DA-RNN was trained using the training set.
[0138] As shown in Figure 8, the DA-RNN takes as input 120 predicted three-dimensional sound velocity fields that are not part of the training set:
[0139] $C_{input} = \{c_{t+1}, \dots, c_{t+6} \mid t = 169, \dots, 288\}$
[0140] Predicting the three-dimensional sound velocity field 120 hours later:
[0141]
[0142] The accuracy of the prediction is evaluated using RMSE. The gray dashed line represents the change over time of the RMSE between the predicted and the post-report three-dimensional sound velocity fields, while the black solid line represents the change over time of the RMSE between the three-dimensional sound velocity fields predicted using the DA-RNN and the post-report three-dimensional sound velocity fields. It can be seen that the three-dimensional sound velocity field predicted using the DA-RNN is significantly more accurate, which demonstrates the effectiveness of this invention in predicting the three-dimensional sound velocity field.
[0143] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims
1. A three-dimensional sound velocity field prediction method based on an attention-mechanism recurrent neural network, characterized in that it includes the following steps:
(1) Take k predicted three-dimensional sound velocity fields and k hindcast three-dimensional sound velocity fields that are consecutive in time as an input data group, and take the h hindcast three-dimensional sound velocity fields that immediately follow the input data group in time as the corresponding output label group; the input data group and the output label group constitute a data pair; collect multiple data pairs to construct a training set;
(2) Construct a DA-RNN and train it on the training set to obtain the final DA-RNN;
(3) Input the predicted three-dimensional sound velocity field into the final DA-RNN to obtain the predicted hindcast three-dimensional sound velocity field, and evaluate the accuracy of the DA-RNN prediction.
2. The three-dimensional sound velocity field prediction method based on an attention mechanism recurrent neural network according to claim 1, characterized in that step (2) includes the following sub-steps:
(2.1) Construct a DA-RNN that includes a spatial attention layer and a temporal attention layer, the two layers being connected;
(2.2) Train the DA-RNN constructed in step (2.1) using the training set obtained in step (1), and update the parameters of the DA-RNN to obtain the final DA-RNN.
3. The three-dimensional sound velocity field prediction method based on an attention mechanism recurrent neural network according to claim 2, characterized in that step (2.1) includes the following sub-steps:
(2.1.1) Input the predicted three-dimensional sound velocity field into the three-dimensional convolutional layer to obtain spatial attention features, completing the construction of the spatial attention layer;
(2.1.2) Input the spatial attention features into the temporal attention layer to obtain the predicted hindcast three-dimensional sound velocity field, completing the construction of the temporal attention layer.
4. The three-dimensional sound velocity field prediction method based on an attention mechanism recurrent neural network according to claim 3, characterized in that step (2.1.1) includes the following sub-steps:
(2.1.1.1) Define the three-dimensional sound velocity field as a three-dimensional array;
(2.1.1.2) Input the predicted three-dimensional sound velocity field into the spatial attention layer and, based on the state and output of the LSTM unit at time t-1 and the predicted three-dimensional sound velocity field at time t, calculate the spatial attention matrix at time t;
(2.1.1.3) Normalize the spatial attention matrix using the softmax function to obtain the attention weight matrix at each time step;
(2.1.1.4) Obtain the weighted predicted three-dimensional sound velocity field from the attention weight matrix and the predicted three-dimensional sound velocity field;
(2.1.1.5) Obtain k temporally consecutive spatial attention features from the weighted predicted three-dimensional sound velocity field, completing the construction of the spatial attention layer.
5. The three-dimensional sound velocity field prediction method based on an attention mechanism recurrent neural network according to claim 3, characterized in that step (2.1.2) includes the following sub-steps:
(2.1.2.1) Input the spatial attention features and the corresponding hindcast three-dimensional sound velocity fields into the temporal attention layer and, based on the state and output of the LSTM unit at time t-1, calculate the temporal attention matrix over the attention features of the predicted three-dimensional sound velocity fields at all times;
(2.1.2.2) Normalize the temporal attention matrix using the softmax function to obtain the temporal attention weight matrix;
(2.1.2.3) Weight the spatial attention features by the temporal attention weight matrix to obtain the spatiotemporal attention features;
(2.1.2.4) Concatenate the hindcast three-dimensional sound velocity field with the spatiotemporal attention features to obtain the weighted predicted three-dimensional sound velocity field;
(2.1.2.5) Obtain the predicted hindcast three-dimensional sound velocity field from the weighted predicted three-dimensional sound velocity field and the spatiotemporal attention features, completing the construction of the temporal attention layer.
6. The three-dimensional sound velocity field prediction method based on an attention mechanism recurrent neural network according to claim 2, characterized in that step (2.2) includes the following sub-steps:
(2.2.1) Input the predicted three-dimensional sound velocity field from the training set into the DA-RNN, and output the predicted value of the three-dimensional sound velocity field;
(2.2.2) Calculate the mean square error between the predicted value of the three-dimensional sound velocity field and the corresponding hindcast three-dimensional sound velocity field in the training set;
(2.2.3) Determine whether the error from step (2.2.2) is less than or equal to the set error threshold. If it is, stop training to obtain the final DA-RNN. Otherwise, adjust the parameters of the DA-RNN by gradient descent, update them by backpropagation, and return to step (2.2.1).
7. The three-dimensional sound velocity field prediction method based on an attention mechanism recurrent neural network according to claim 1, characterized in that the accuracy of the DA-RNN prediction is determined by the root mean square error, expressed as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{XYZ}\sum_{x=1}^{X}\sum_{y=1}^{Y}\sum_{z=1}^{Z}\left(\hat{c}_{x,y,z} - c_{x,y,z}\right)^{2}}$$
where RMSE represents the error between the predicted hindcast three-dimensional sound velocity field and the actually measured hindcast three-dimensional sound velocity field; $\hat{c}_{x,y,z}$ represents the sound velocity at location $(x, y, z)$ in the hindcast three-dimensional sound velocity field predicted by the DA-RNN; $c_{x,y,z}$ represents the sound velocity at location $(x, y, z)$ in the actually measured hindcast three-dimensional sound velocity field; X, Y, and Z represent the number of samples of the three-dimensional sound velocity field along the x, y, and z axes, respectively; and $XYZ$ represents the total number of samples of the hindcast three-dimensional sound velocity field.