A Traffic Flow Prediction Method Based on Spatiotemporal Graph Convolutional Networks
By combining a spatiotemporal graph convolutional network with temporal multi-head self-attention and an adaptive diffusion graph convolutional network, the problem of insufficient mining of spatiotemporal correlation features in traffic flow prediction is solved, achieving more accurate prediction and real-time traffic monitoring, and improving the efficiency of urban road management.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SOUTHWEST JIAOTONG UNIV
- Filing Date
- 2023-11-17
- Publication Date
- 2026-06-30
AI Technical Summary
Existing traffic flow prediction methods fail to fully exploit the deep correlation features of spatiotemporal data, resulting in low prediction accuracy.
A spatiotemporal graph convolutional network is used, which combines temporal multi-head self-attention network and adaptive diffusion graph convolutional network to predict traffic flow.
It improves the accuracy of traffic flow forecasting, enables real-time monitoring and prediction of future traffic conditions, assists in road scheduling, improves traffic efficiency, and reduces traffic accidents.
[Figure: abstract drawing, CN117576907B_ABST]
Description
Technical Field
[0001] This invention relates to the field of data processing technology, and specifically to a traffic flow prediction method based on spatiotemporal graph convolutional networks.
Background Technology
[0002] With the continuous development of data acquisition technology, spatiotemporal data from various fields is constantly accumulating. Utilizing this data to mine its inherent spatiotemporal characteristics is crucial for applications such as intelligent transportation and urban planning. Traffic flow is spatiotemporal data with complex nonlinear relationships, containing features from both temporal and spatial dimensions. Modeling traffic flow requires full consideration of its temporal, spatial, and spatiotemporal correlations.
[0003] Traditional traffic flow prediction methods treat the data as independent time series and rely on classical time series forecasting techniques. These methods do not account for the nonlinear spatiotemporal characteristics of traffic flow, cannot accurately model its spatiotemporal correlations, and struggle to uncover deeper relationships. Deep learning methods based on neural networks are now widely used and have achieved significant results in image recognition, sequence prediction, and text translation. For traffic flow data with complex spatiotemporal relationships, a growing number of researchers use deep learning to model and analyze the data and uncover its inherent spatiotemporal correlations, which overcomes the shortcomings of traditional prediction methods and yields better predictive performance. However, current research on traffic flow prediction still relies largely on prior knowledge and ignores the dynamic spatiotemporal characteristics of the data. This makes it difficult to learn global spatiotemporal correlation information and leads to low accuracy and large errors between predicted and actual values.
Summary of the Invention
[0004] To address the aforementioned shortcomings in existing technologies, this invention provides a traffic flow prediction method based on spatiotemporal graph convolutional networks. The method addresses the inaccurate extraction of feature information from spatiotemporal data and the difficulty of uncovering deeper dynamic spatiotemporal correlations, both of which lead to low prediction accuracy.
[0005] To achieve the above-mentioned objectives, the technical solution adopted by this invention is as follows:
[0006] A traffic flow prediction method using spatiotemporal graph convolutional networks is provided, which includes the following steps:
[0007] S1. Preprocess the spatiotemporal traffic data to obtain initial traffic data and its labels;
[0008] S2. Perform spatiotemporal location encoding on the initial traffic data to obtain inherent spatiotemporal information;
[0009] S3. Input the inherent spatiotemporal information into the encoder and train it to obtain intermediate sequence data;
[0010] S4. Perform spatiotemporal location encoding on the labels to obtain multi-granularity feature representation sequences;
[0011] S5. Input the multi-granularity feature representation sequence and intermediate sequence data into the decoder and train it to obtain the trained encoder and the trained decoder.
[0012] S6. Input the traffic spatiotemporal data to be predicted into the cascaded trained encoder and decoder, and perform a linear transformation to obtain the prediction result, thus completing the prediction of traffic flow.
[0013] Furthermore, the specific method of preprocessing in step S1 is as follows: the traffic spatiotemporal data are extracted according to three different time granularities to obtain the corresponding data samples and their feature tensors; the feature tensors of the three data samples are concatenated to obtain the initial traffic data.
[0014] Furthermore, the specific method for spatiotemporal location encoding in step S2 is as follows: construct a time embedding matrix and a spatial location embedding matrix; superimpose the time embedding matrix, the spatial location embedding matrix and the initial traffic data to obtain the inherent spatiotemporal information of the initial traffic data.
[0015] Furthermore, the encoder includes three encoding modules connected in series. Each encoding module includes a temporal multi-head self-attention network unit and an adaptive diffusion graph convolutional network unit, and a residual connection and normalization layer is inserted after each unit.
[0016] The adaptive diffusion graph convolutional network unit consists of parallel diffusion graph convolutional networks and adaptive graph convolutional networks;
[0017] The decoder consists of three decoding modules connected in series. Each decoding module includes a masked temporal multi-head self-attention mechanism unit, an Encoder-Decoder interaction unit, and an adaptive diffusion graph convolution unit. A residual connection and normalization layer is inserted after each unit.
[0018] The Encoder-Decoder interaction unit is a temporal multi-head self-attention network unit;
[0019] The temporal multi-head self-attention network unit and the masked temporal multi-head self-attention mechanism unit both use a temporal convolutional network (TCN).
[0020] Furthermore, step S3 further includes:
[0021] S3-1. Input the inherent spatiotemporal information into the temporal multi-head self-attention network module, and according to the formula:
[0022] $Q^{l} = W_q s^{l}$
[0023] $K^{l} = W_K s^{l}$
[0024] $V^{l} = W_v s^{l}$
[0025]
[0026]
[0027]
[0028] $S^{l+1} = \varphi * Z^{l}$
[0029] Obtain the output $S^{l+1}$ of the temporal multi-head self-attention network module; where $z^{l}$ represents the intermediate variable of the l-th hidden layer in the temporal multi-head self-attention network module; $Q^{l}$, $K^{l}$, and $V^{l}$ represent the query vector matrix, key vector matrix, and value vector matrix of the l-th hidden layer, respectively; $W_q$, $W_K$, and $W_v$ represent the weights of the query vector matrix Q, key vector matrix K, and value vector matrix V, respectively; $s^{l}$ represents the input of the l-th hidden layer; T represents the number of time slices; $d_{model}$ represents the vector dimension of the temporal multi-head self-attention network module; $W_{i,j}$ represents the weight coefficient matrix of the module; Softmax(·) represents the activation function; $\varphi$ represents the convolution kernel parameters of the module; $Z^{l}$ represents the dynamic temporal correlation implied at different time steps; $W_i^{Q}$, $W_i^{K}$, and $W_i^{V}$ represent the weights of the query vector matrix Q, key vector matrix K, and value vector matrix V at the i-th time step, respectively; H represents the total number of time steps; $W^{o}$ represents the projection matrix; and when l is 0, $s^{l}$ is the inherent spatiotemporal information;
[0030] S3-2. According to the formula:
[0031]
[0032] Obtain the output of the first residual connection and normalization layer in the current encoding module; where $X^{(L')}$ represents the input of the current encoding module, i.e., the inherent spatiotemporal information; $\mathrm{TCMA}(X^{(L')})$ represents the output of the temporal multi-head self-attention network module, i.e., $S^{l+1}$; and LayerNorm(·) represents the normalization function;
[0033] S3-3. Input the output of the first residual connection and normalization layer in the current encoding module to the adaptive diffusion graph convolutional network module to obtain the output of the adaptive diffusion graph convolutional network module.
[0034] S3-4. According to the formula:
[0035]
[0036] Obtain the output of the second residual connection and normalization layer in the current encoding module, computed from the output of the adaptive diffusion graph convolutional network module;
[0037] S3-5. Input the output of the second residual connection and normalization layer in the current encoding module to the next encoding module, and use the output of the second residual connection and normalization layer in the third encoding module as the output of the encoder, i.e., the intermediate sequence data.
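As a non-limiting reading of steps S3-1 and S3-2: the formulas of paragraphs [0025] to [0027] and [0031] are not reproduced in this text. Assuming they follow the standard scaled dot-product multi-head attention and the usual residual-plus-normalization form suggested by the symbol definitions, they could read roughly as below. The scaling factor, the names $\mathrm{head}_i$ and $\hat{X}^{(L')}$, and the exact grouping of terms are assumptions, not the patent's confirmed formulas.

```latex
\begin{aligned}
W_{i,j} &= \mathrm{Softmax}\!\left(\frac{Q^{l}\,(K^{l})^{\top}}{\sqrt{d_{model}}}\right) \\
\mathrm{head}_i &= \mathrm{Attention}\!\left(Q^{l}W_i^{Q},\; K^{l}W_i^{K},\; V^{l}W_i^{V}\right) \\
Z^{l} &= \mathrm{Concat}\!\left(\mathrm{head}_1,\dots,\mathrm{head}_H\right) W^{o} \\
\hat{X}^{(L')} &= \mathrm{LayerNorm}\!\left(X^{(L')} + \mathrm{TCMA}\!\left(X^{(L')}\right)\right)
\end{aligned}
```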
[0038] Furthermore, the specific method for step S3-4 is as follows:
[0039] S3-4-1. Input the output of the first residual connection and normalization layer in the current encoding module into the diffusion graph convolutional network, and according to the formula:
[0040]
[0041] Obtain the output $X_D$ of the diffusion graph convolutional network; where A and $A^{T}$ represent the predefined adjacency matrix of the nodes in the road network graph and its transpose, respectively; K represents the number of feature diffusion steps; $D_O$ and $D_I$ represent the out-degree and in-degree matrices of the nodes, respectively; X represents the output of the first residual connection and normalization layer in the current encoding module; and $\varphi_{k,O}$ and $\varphi_{k,I}$ represent the convolution parameters corresponding to the forward and backward feature diffusion at the k-th step, respectively.
[0042] S3-4-2. Input the output of the first residual connection and normalization layer in the current encoding module into the adaptive graph convolutional network, and according to the formula:
[0043]
[0044]
[0045] Obtain the output $X_A$ of the adaptive graph convolutional network; where the formulas involve the adaptive adjacency matrix and its transpose; E1 and E2 represent the embedding matrices of the source node and the target node, respectively; $E2^{T}$ represents the transpose of the embedding matrix E2 of the target node; and ReLU(·) represents the activation function.
[0046] S3-4-3. Concatenate the outputs of the diffusion graph convolutional network and the adaptive graph convolutional network to obtain the output of the adaptive diffusion graph convolutional network module.
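As a non-limiting reading of steps S3-4-1 and S3-4-2: the formulas of paragraphs [0040], [0043], and [0044] are not reproduced in this text. Assuming the diffusion graph convolution takes the common bidirectional random-walk form and the adaptive adjacency matrix is built from the two node embeddings in the usual way, they could read roughly as below. The symbol $\tilde{A}_{adp}$, the Softmax normalization, and the summation starting at k = 0 are assumptions; the form of $X_A$ itself cannot be inferred from the definitions and is left out.

```latex
\begin{aligned}
X_D &= \sum_{k=0}^{K}\Big[\big(D_O^{-1}A\big)^{k}\,X\,\varphi_{k,O}
      + \big(D_I^{-1}A^{\top}\big)^{k}\,X\,\varphi_{k,I}\Big] \\
\tilde{A}_{adp} &= \mathrm{Softmax}\!\big(\mathrm{ReLU}\big(E1\,E2^{\top}\big)\big)
\end{aligned}
```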
[0047] Furthermore, step S5 further includes:
[0048] S5-1. Input the multi-granularity feature representation sequence into the masked temporal multi-head self-attention mechanism module, and use the same method as steps S3-1 to S3-2 to obtain the output of the masked temporal multi-head self-attention mechanism module.
[0049] S5-2. According to the formula:
[0050]
[0051] Obtain the output of the first residual connection and normalization layer in the current decoding module, computed from the multi-granularity feature representation sequence and the output of the masked temporal multi-head self-attention mechanism module;
[0052] S5-3. Input the output of the first residual connection and normalization layer in the current decoding module and the intermediate sequence data into the Encoder-Decoder interaction module, and use the same method as steps S3-1 to S3-2 to obtain the output of the Encoder-Decoder interaction module.
[0053] S5-4. According to the formula:
[0054]
[0055] Obtain the output of the second residual connection and normalization layer in the current decoding module, computed using the output of the Encoder-Decoder interaction module;
[0056] S5-5. Apply the same method as steps S3-4 and S3-5 to the output of the second residual connection and normalization layer in the current decoding module to obtain the output of the third residual connection and normalization layer in the current decoding module.
[0057] S5-6. Input the output of the third residual connection and normalization layer in the current decoding module to the next decoding module, and use the output of the third residual connection and normalization layer in the third decoding module as the output of the decoder to obtain the initial prediction result;
[0058] S5-7. Based on the initial prediction results, the parameters of the encoder and decoder are adjusted using the backpropagation algorithm to obtain the trained encoder and decoder.
[0059] The beneficial effects of this invention are as follows:
[0060] 1. This prediction method uses temporal convolutional networks to simulate the temporal correlation of traffic flow and combines an attention mechanism to enhance the model's ability to characterize the dynamic temporal dependence of traffic flow at different time granularities.
[0061] 2. This prediction method constructs a diffusion graph convolutional network based on an adaptive adjacency matrix in the spatial domain, and uses a data-driven approach to capture the local spatial association features of nodes and the global spatial information of the road network;
[0062] 3. This prediction method can monitor complex traffic conditions in real time and predict future traffic conditions, assisting the relevant management departments in road scheduling and traffic management, thereby improving the efficiency of urban roads and reducing the occurrence of traffic accidents.
Attached Figure Description
[0063] Figure 1 is a detailed flowchart of the present invention;
[0064] Figure 2 is a structural diagram of the encoder and decoder in the present invention;
[0065] Figure 3 is a structural diagram of the temporal multi-head self-attention network unit in the present invention.
Detailed Implementation
[0066] The specific embodiments of the present invention are described below to enable those skilled in the art to understand the present invention. However, it should be understood that the present invention is not limited to the scope of the specific embodiments. For those skilled in the art, various changes are obvious as long as they are within the spirit and scope of the present invention as defined and determined by the appended claims. All inventions utilizing the concept of the present invention are protected.
[0067] As shown in Figure 1, a traffic flow prediction method using spatiotemporal graph convolutional networks includes the following steps:
[0068] S1. Preprocess the spatiotemporal traffic data to obtain initial traffic data and its labels;
[0069] S2. Perform spatiotemporal location encoding on the initial traffic data to obtain inherent spatiotemporal information;
[0070] S3. Input the inherent spatiotemporal information into the encoder and train it to obtain intermediate sequence data;
[0071] S4. Perform spatiotemporal location encoding on the labels to obtain multi-granularity feature representation sequences;
[0072] S5. Input the multi-granularity feature representation sequence and intermediate sequence data into the decoder and train it to obtain the trained encoder and the trained decoder.
[0073] S6. Input the traffic spatiotemporal data to be predicted into the cascaded trained encoder and decoder, and perform a linear transformation to obtain the prediction result, thus completing the prediction of traffic flow.
[0074] The specific method of preprocessing in step S1 is as follows: the traffic spatiotemporal data are extracted according to three different time granularities to obtain the corresponding data samples and their feature tensors; the feature tensors of the three data samples are concatenated to obtain the initial traffic data.
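As a non-limiting illustration of the preprocessing in step S1, the following sketch assumes that the three time granularities are weekly, daily, and recent windows (as the window sizes T_w, T_d, and T_h suggest), that the data is sampled every 5 minutes (288 slices per day), and that the feature tensors are concatenated along the time axis. The function name, its parameters, and the sampling interval are hypothetical.

```python
import numpy as np

def multi_granularity_sample(data, t0, T_w, T_d, T_h, T_f, steps_per_day=288):
    """Build one training sample for forecast start index t0.

    data: array of shape (total_steps, N, S) = (time, sensors, feature channels).
    Returns (x, y): x has shape (T_w + T_d + T_h, N, S), y has shape (T_f, N, S).
    """
    week = 7 * steps_per_day
    if t0 - week < 0 or t0 + T_f > data.shape[0]:
        raise ValueError("not enough history or future data around t0")
    # Same time of day one week earlier, one day earlier, and the most recent slices.
    weekly = data[t0 - week : t0 - week + T_w]
    daily = data[t0 - steps_per_day : t0 - steps_per_day + T_d]
    recent = data[t0 - T_h : t0]
    # Concatenate the three feature tensors to form the initial traffic data.
    x = np.concatenate([weekly, daily, recent], axis=0)
    # The label is the future window that the model should predict.
    y = data[t0 : t0 + T_f]
    return x, y
```

With 12-slice windows, each sample would stack 36 historical slices per sensor and pair them with a 12-slice label.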
[0075] The specific method for spatiotemporal location encoding in step S2 is as follows: construct a time embedding matrix and a spatial location embedding matrix; superimpose the time embedding matrix, the spatial location embedding matrix and the initial traffic data to obtain the inherent spatiotemporal information of the initial traffic data.
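As a non-limiting illustration of the encoding in step S2, the following sketch assumes that "superimpose" means element-wise addition, that the initial traffic data has already been projected to d_model channels, and that the time and spatial location embedding matrices are indexed by time slice and by node, respectively. The function name and tensor layout are hypothetical.

```python
import numpy as np

def add_spatiotemporal_encoding(x, time_emb, space_emb):
    """Superimpose time and spatial location embeddings on the traffic data.

    x:         (T, N, d_model) initial traffic data
    time_emb:  (T, d_model)    one embedding vector per time slice
    space_emb: (N, d_model)    one embedding vector per road network node
    """
    T, N, d = x.shape
    if time_emb.shape != (T, d) or space_emb.shape != (N, d):
        raise ValueError("embedding shapes do not match the input")
    # Broadcast the time embedding over nodes and the spatial embedding over time.
    return x + time_emb[:, None, :] + space_emb[None, :, :]
```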
[0076] The encoder consists of three encoding modules connected in series. Each encoding module includes a temporal multi-head self-attention network unit and an adaptive diffusion graph convolutional network unit, and a residual connection and normalization layer is inserted after each unit.
[0077] The adaptive diffusion graph convolutional network unit consists of parallel diffusion graph convolutional networks and adaptive graph convolutional networks;
[0078] The decoder consists of three decoding modules connected in series. Each decoding module includes a masked temporal multi-head self-attention mechanism unit, an Encoder-Decoder interaction unit, and an adaptive diffusion graph convolution unit. A residual connection and normalization layer is inserted after each unit.
[0079] The Encoder-Decoder interaction unit is a temporal multi-head self-attention network unit;
[0080] The temporal multi-head self-attention network unit and the masked temporal multi-head self-attention mechanism unit both use a temporal convolutional network (TCN).
[0081] Step S3 further includes:
[0082] S3-1. Input the inherent spatiotemporal information into the temporal multi-head self-attention network module, and according to the formula:
[0083] $Q^{l} = W_q s^{l}$
[0084] $K^{l} = W_K s^{l}$
[0085] $V^{l} = W_v s^{l}$
[0086]
[0087]
[0088]
[0089] $S^{l+1} = \varphi * Z^{l}$
[0090] Obtain the output $S^{l+1}$ of the temporal multi-head self-attention network module; where $z^{l}$ represents the intermediate variable of the l-th hidden layer in the temporal multi-head self-attention network module; $Q^{l}$, $K^{l}$, and $V^{l}$ represent the query vector matrix, key vector matrix, and value vector matrix of the l-th hidden layer, respectively; $W_q$, $W_K$, and $W_v$ represent the weights of the query vector matrix Q, key vector matrix K, and value vector matrix V, respectively; $s^{l}$ represents the input of the l-th hidden layer; T represents the number of time slices; $d_{model}$ represents the vector dimension of the temporal multi-head self-attention network module; $W_{i,j}$ represents the weight coefficient matrix of the module; Softmax(·) represents the activation function; $\varphi$ represents the convolution kernel parameters of the module; $Z^{l}$ represents the dynamic temporal correlation implied at different time steps; $W_i^{Q}$, $W_i^{K}$, and $W_i^{V}$ represent the weights of the query vector matrix Q, key vector matrix K, and value vector matrix V at the i-th time step, respectively; H represents the total number of time steps; $W^{o}$ represents the projection matrix; and when l is 0, $s^{l}$ is the inherent spatiotemporal information;
[0091] S3-2. According to the formula:
[0092]
[0093] Obtain the output of the first residual connection and normalization layer in the current encoding module; where $X^{(L')}$ represents the input of the current encoding module, i.e., the inherent spatiotemporal information; $\mathrm{TCMA}(X^{(L')})$ represents the output of the temporal multi-head self-attention network module, i.e., $S^{l+1}$; and LayerNorm(·) represents the normalization function;
[0094] S3-3. Input the output of the first residual connection and normalization layer in the current encoding module to the adaptive diffusion graph convolutional network module to obtain the output of the adaptive diffusion graph convolutional network module.
[0095] S3-4. According to the formula:
[0096]
[0097] Obtain the output of the second residual connection and normalization layer in the current encoding module, computed from the output of the adaptive diffusion graph convolutional network module;
[0098] S3-5. Input the output of the second residual connection and normalization layer in the current encoding module to the next encoding module, and use the output of the second residual connection and normalization layer in the third encoding module as the output of the encoder, i.e., the intermediate sequence data.
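As a non-limiting illustration of steps S3-1 and S3-2, the following NumPy sketch assumes that the formulas not reproduced in this text follow standard scaled dot-product multi-head attention, that the per-head weights are column slices of shared weight matrices, that the convolution S = φ * Z is a zero-padded 1-D convolution along the time axis, and that the hidden dimension C equals d_model so that the residual sum is defined. The function names, the `params` layout, and the head count are hypothetical.

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def temporal_attention(s, Wq, Wk, Wv, Wo, num_heads):
    """s: (T, d_model). Returns Z: (T, d_model)."""
    T, d_model = s.shape
    d_head = d_model // num_heads
    Q, K, V = s @ Wq, s @ Wk, s @ Wv          # row-vector form of Q = W_q s, etc.
    heads = []
    for h in range(num_heads):
        cols = slice(h * d_head, (h + 1) * d_head)
        # (T, T) weight coefficients between time slices for this head.
        w = softmax(Q[:, cols] @ K[:, cols].T / np.sqrt(d_head))
        heads.append(w @ V[:, cols])
    return np.concatenate(heads, axis=-1) @ Wo  # concatenate heads, then project

def temporal_conv(z, phi):
    """Zero-padded 1-D convolution along time. phi: (k, d_model, C) with k odd."""
    k = phi.shape[0]
    pad = k // 2
    zp = np.pad(z, ((pad, pad), (0, 0)))
    return np.stack([
        sum(zp[t + i] @ phi[i] for i in range(k)) for t in range(z.shape[0])
    ])

def encoder_sublayer(x, params, num_heads=4):
    z = temporal_attention(x, *params["attn"], num_heads)
    s_next = temporal_conv(z, params["phi"])   # kernel size 3 in the embodiment
    return layer_norm(x + s_next)              # residual connection + normalization
```

Stacking this sublayer with the adaptive diffusion graph convolution sublayer, three times in series, would give the encoder layout described above.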
[0099] The specific method for step S3-4 is as follows:
[0100] S3-4-1. Input the output of the first residual connection and normalization layer in the current encoding module into the diffusion graph convolutional network, and according to the formula:
[0101]
[0102] Obtain the output $X_D$ of the diffusion graph convolutional network; where A and $A^{T}$ represent the predefined adjacency matrix of the nodes in the road network graph and its transpose, respectively; K represents the number of feature diffusion steps; $D_O$ and $D_I$ represent the out-degree and in-degree matrices of the nodes, respectively; X represents the output of the first residual connection and normalization layer in the current encoding module; and $\varphi_{k,O}$ and $\varphi_{k,I}$ represent the convolution parameters corresponding to the forward and backward feature diffusion at the k-th step, respectively.
[0103] S3-4-2. Input the output of the first residual connection and normalization layer in the current encoding module into the adaptive graph convolutional network, and according to the formula:
[0104]
[0105]
[0106] Obtain the output $X_A$ of the adaptive graph convolutional network; where the formulas involve the adaptive adjacency matrix and its transpose; E1 and E2 represent the embedding matrices of the source node and the target node, respectively; $E2^{T}$ represents the transpose of the embedding matrix E2 of the target node; and ReLU(·) represents the activation function.
[0107] S3-4-3. Concatenate the outputs of the diffusion graph convolutional network and the adaptive graph convolutional network to obtain the output of the adaptive diffusion graph convolutional network module.
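As a non-limiting illustration of steps S3-4-1 to S3-4-3, the following NumPy sketch assumes that the diffusion graph convolution sums forward and backward random-walk diffusion terms over steps k = 0..K, that the adaptive adjacency matrix is Softmax(ReLU(E1 E2^T)), and that the adaptive branch applies a single graph convolution with a weight matrix `W`. The function names and `W` are hypothetical; the source names only ReLU, E1, and the transpose of E2 for the adaptive branch.

```python
import numpy as np

def diffusion_gcn(X, A, phi_out, phi_in):
    """X: (N, P); A: (N, N); phi_out, phi_in: (K + 1, P, Q). Returns X_D: (N, Q)."""
    deg_out = A.sum(axis=1, keepdims=True)        # out-degree of each node
    deg_in = A.T.sum(axis=1, keepdims=True)       # in-degree of each node
    P_fwd = A / np.where(deg_out == 0, 1.0, deg_out)   # D_O^{-1} A
    P_bwd = A.T / np.where(deg_in == 0, 1.0, deg_in)   # D_I^{-1} A^T
    out = np.zeros((X.shape[0], phi_out.shape[2]))
    Xf, Xb = X, X                                 # k = 0 term: no diffusion yet
    for k in range(phi_out.shape[0]):
        out += Xf @ phi_out[k] + Xb @ phi_in[k]
        Xf, Xb = P_fwd @ Xf, P_bwd @ Xb           # one more diffusion step
    return out

def adaptive_adjacency(E1, E2):
    """E1, E2: (N, C) source and target node embeddings. Returns (N, N)."""
    scores = np.maximum(E1 @ E2.T, 0.0)           # ReLU(E1 E2^T)
    scores = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)       # row-wise Softmax

def adgcn(X, A, phi_out, phi_in, E1, E2, W):
    X_D = diffusion_gcn(X, A, phi_out, phi_in)    # local spatial correlations
    X_A = adaptive_adjacency(E1, E2) @ X @ W      # learned, data-driven correlations
    return np.concatenate([X_D, X_A], axis=-1)    # concatenate the parallel branches
```

Running the two branches in parallel and concatenating them matches the unit layout described above; how the concatenated width is mapped back for the residual connection is not stated in the text and is left out here.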
[0108] Step S5 further includes:
[0109] S5-1. Input the multi-granularity feature representation sequence into the masked temporal multi-head self-attention mechanism module, and use the same method as steps S3-1 to S3-2 to obtain the output of the masked temporal multi-head self-attention mechanism module.
[0110] S5-2. According to the formula:
[0111]
[0112] Obtain the output of the first residual connection and normalization layer in the current decoding module, computed from the multi-granularity feature representation sequence and the output of the masked temporal multi-head self-attention mechanism module;
[0113] S5-3. Input the output of the first residual connection and normalization layer in the current decoding module and the intermediate sequence data into the Encoder-Decoder interaction module, and use the same method as steps S3-1 to S3-2 to obtain the output of the Encoder-Decoder interaction module.
[0114] S5-4. According to the formula:
[0115]
[0116] Obtain the output of the second residual connection and normalization layer in the current decoding module, computed using the output of the Encoder-Decoder interaction module;
[0117] S5-5. Apply the same method as steps S3-4 and S3-5 to the output of the second residual connection and normalization layer in the current decoding module to obtain the output of the third residual connection and normalization layer in the current decoding module.
[0118] S5-6. Input the output of the third residual connection and normalization layer in the current decoding module to the next decoding module, and use the output of the third residual connection and normalization layer in the third decoding module as the output of the decoder to obtain the initial prediction result;
[0119] S5-7. Based on the initial prediction results, the parameters of the encoder and decoder are adjusted using the backpropagation algorithm to obtain the trained encoder and decoder.
[0120] In one embodiment of the present invention, a temporal multi-head self-attention network unit utilizes a TCN temporal convolutional network to extract temporal dependencies in traffic flow data, and combines an attention mechanism to optimize feature allocation, thereby capturing the implicit dynamic temporal correlations at different time steps. An adaptive diffusion graph convolutional network unit utilizes a diffusion graph convolutional network to simulate local spatial correlations in traffic flow data, and establishes implicit spatial dependencies for road network nodes through an adaptive adjacency matrix, thereby mining global spatial correlations between road network nodes.
[0121] During training or prediction, the quantities above have the following dimensions: the output of the temporal multi-head self-attention network module $S^{l+1} \in R^{T \times C}$; the convolution kernel parameter $\varphi$ of the temporal multi-head self-attention network module is 3; the output of the diffusion graph convolutional network $X_D \in R^{N \times Q}$; the input of the diffusion graph convolutional network $X \in R^{N \times P}$; the out-degree matrix of the nodes $D_O \in R^{N \times N}$; the in-degree matrix of the nodes $D_I \in R^{N \times N}$; the predefined adjacency matrix of the nodes of the road network graph $A \in R^{N \times N}$; the convolution parameter corresponding to the k-th step of forward feature diffusion $\varphi_{k,O} \in R^{P \times Q}$; the convolution parameter corresponding to the k-th step of backward feature diffusion $\varphi_{k,I} \in R^{P \times Q}$; the embedding matrix of the source nodes $E1 \in R^{N \times C}$; and the embedding matrix of the target nodes $E2 \in R^{N \times C}$. Dimensions are likewise specified for the initial traffic data, the time embedding matrix, the spatial location embedding matrix, the inherent spatiotemporal information, the intermediate variable $z^{l}$, the input $s^{l}$, the query, key, and value vector matrices, the output of the first residual connection and normalization layer in the current encoding module, the adaptive adjacency matrix, the multi-granularity feature representation sequence, and the outputs of the first and second residual connection and normalization layers in the current decoding module; their expressions are not reproduced in this text.
Where N represents the number of sensors in the road network; S represents the number of feature channels of each node in each time slice; w and d represent the number of weeks and days of the historical time segment, respectively; $T_w$, $T_d$, and $T_h$ represent the time window sizes; R denotes the set of real numbers; T represents the number of time slices; $d_{model}$ represents the vector dimension, with a value of 64; C represents the hidden layer dimension, with a value of 128; P and Q represent vector dimensions; and $T_f$ represents the length of the predicted future time window.
[0122] The model used in this invention is executed in parallel during training. To prevent data leakage, future data needs to be masked when predicting the target sequence. Therefore, the weight coefficient matrix of the temporal multi-head self-attention network unit is processed as follows:
[0123]
[0124] Here, the formula maps the original weight coefficient matrix of the temporal multi-head self-attention network unit to the processed weight coefficient matrix, and the target sequence is the label information of the traffic data.
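As a non-limiting illustration of this masking, the following sketch assumes the common approach of setting every weight that links a time slice to a later time slice to negative infinity before the Softmax, so that those positions receive zero attention. The masking formula itself is not reproduced in this text, and the function name is hypothetical.

```python
import numpy as np

def masked_attention_weights(scores):
    """scores: (T, T) raw attention scores, row i attending to column j.

    Returns the row-wise Softmax with all future positions (j > i) masked out.
    """
    T = scores.shape[0]
    future = np.triu(np.ones((T, T), dtype=bool), k=1)   # True where j > i
    masked = np.where(future, -np.inf, scores)
    # The diagonal is never masked, so each row maximum is finite.
    masked = masked - masked.max(axis=-1, keepdims=True)
    e = np.exp(masked)                                   # exp(-inf) evaluates to 0
    return e / e.sum(axis=-1, keepdims=True)
```

With uniform scores, the first time slice attends only to itself, the second splits its weight evenly over the first two slices, and so on, so no label from a future slice can leak into the prediction during parallel training.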
[0125] In summary, this invention utilizes temporal convolutional networks to simulate the temporal correlation of traffic flow and combines an attention mechanism to enhance the model's ability to characterize the dynamic temporal dependence of traffic flow at different temporal granularities. A diffusion graph convolutional network based on an adaptive adjacency matrix is constructed in the spatial domain, using a data-driven approach to capture the local spatial correlation features of nodes and the global spatial information of the road network. This allows for real-time monitoring of complex traffic conditions and prediction of future traffic situations, assisting relevant management departments in road scheduling and traffic management, thereby improving urban road efficiency and reducing traffic accidents.
Claims
1. A traffic flow prediction method of a spatio-temporal graph convolution network, characterized in that: Includes the following steps: S1. Preprocess the spatiotemporal traffic data to obtain initial traffic data and its labels; S2. Encode the initial traffic data in a spatiotemporal manner to obtain inherent spatiotemporal information; S3. Input the inherent spatiotemporal information into the encoder and train it to obtain intermediate sequence data; S4. Perform spatiotemporal location encoding on the labels to obtain multi-granularity feature representation sequences; S5. Input the multi-granularity feature representation sequence and intermediate sequence data into the decoder and train it to obtain the trained encoder and the trained decoder. S6. Input the traffic spatiotemporal data to be predicted into the cascaded trained encoder and decoder, and perform a linear transformation to obtain the prediction result, thus completing the prediction of traffic flow. The encoder includes three encoding modules connected in series. Each encoding module includes a temporal multi-head self-attention network unit and an adaptive diffusion graph convolutional network unit, and a residual connection and normalization layer is inserted after each unit. The adaptive diffusion graph convolutional network unit includes a parallel diffusion graph convolutional network and an adaptive graph convolutional network; The decoder includes three decoding modules connected in series. Each decoding module includes a mask temporal multi-head self-attention mechanism unit, an Encoder-Decoder interaction unit, and an adaptive diffusion graph convolution unit. A residual connection and normalization layer is inserted after each unit. 
The Encoder-Decoder interaction unit is a temporal multi-head self-attention network unit; the temporal multi-head self-attention network unit and the masked temporal multi-head self-attention mechanism unit adopt the TCN temporal convolutional network;
Step S3 further includes:
S3-1. Input the inherent spatiotemporal information into the temporal multi-head self-attention network module and obtain the output of the module according to the formula [formula not reproduced]; where the symbols of the formula denote, in order: the intermediate variable of the indexed hidden layer of the temporal multi-head self-attention network module; the query vector matrix of that hidden layer; the key vector matrix of that hidden layer; the value vector matrix of that hidden layer; the weights of the query vector matrix, the key vector matrix and the value vector matrix in the module; the input of that hidden layer; the number of time slices; the vector dimension of the module; the weight coefficient matrix of the module; the activation function; the convolutional kernel parameters of the module; the dynamic temporal correlation implied at different time steps; the weight of the query vector matrix at the indexed time step; the weight of the key vector matrix at the indexed time step; the weight of the value vector matrix at the indexed time step; the total number of time steps; and the projection matrix. When the hidden-layer index is 0, the input of the hidden layer is the inherent spatiotemporal information;
S3-2. Obtain the output of the first residual connection and normalization layer in the current encoding module according to the formula [formula not reproduced]; where the symbols denote, in order: the input to the current encoding module, i.e., the inherent spatiotemporal information; the output of the temporal multi-head self-attention network module; and the normalization function;
S3-3. Input the output of the first residual connection and normalization layer in the current encoding module into the adaptive diffusion graph convolutional network module to obtain the output of the adaptive diffusion graph convolutional network module;
S3-4. Obtain the output of the second residual connection and normalization layer in the current encoding module according to the formula [formula not reproduced]; where the remaining symbol denotes the output of the adaptive diffusion graph convolutional network module;
S3-5. Input the output of the second residual connection and normalization layer in the current encoding module into the next encoding module, and use the output of the second residual connection and normalization layer in the third encoding module as the output of the encoder, i.e., the intermediate sequence data.
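Steps S3-1 and S3-2 can be illustrated with a minimal sketch. Because the claimed formulas are not reproduced in this text, the code below assumes the standard single-head scaled dot-product attention and a plain layer normalization; it omits the multi-head projection and the TCN convolution named in the claim. The helper names (`temporal_self_attention`, `add_and_norm`) and the identity weight matrices are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax of one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention across time steps.

    x has shape (T, d): one feature vector per time slice of one node.
    Assumed standard form; not the patent's exact formula.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]  # (T, T) temporal correlation
    return matmul(weights, v), weights

def add_and_norm(x, sublayer_out, eps=1e-5):
    """Residual connection followed by per-time-step layer normalization."""
    out = []
    for x_row, s_row in zip(x, sublayer_out):
        summed = [a + b for a, b in zip(x_row, s_row)]
        mean = sum(summed) / len(summed)
        var = sum((v - mean) ** 2 for v in summed) / len(summed)
        out.append([(v - mean) / math.sqrt(var + eps) for v in summed])
    return out

# Toy input: T = 3 time slices, d = 2 features; identity projections (hypothetical).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
attended, weights = temporal_self_attention(x, identity, identity, identity)
encoded = add_and_norm(x, attended)
```

Each row of `weights` is a probability distribution over time steps, and `encoded` keeps the (T, d) shape of the input, which is what allows encoding modules to be chained in series.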
2. The traffic flow prediction method using spatiotemporal graph convolutional networks according to claim 1, characterized in that the specific method of preprocessing in step S1 is as follows: the traffic spatiotemporal data are extracted at three different time granularities to obtain the corresponding data samples and their feature tensors; the feature tensors of the three data samples are concatenated to obtain the initial traffic data.
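The preprocessing in claim 2 can be sketched as follows. The claim only states that three time granularities are used; the choice of recent, daily and weekly windows, the 5-minute sampling interval (`steps_per_hour=12`), and the function names are assumptions made for illustration, and a single sensor series stands in for the full feature tensor.

```python
def extract_segment(series, target_start, horizon, period, num_periods):
    """Collect `num_periods` windows of length `horizon`, the i-th one
    starting `i * period` steps before the prediction window."""
    segment = []
    for i in range(num_periods, 0, -1):
        start = target_start - i * period
        if start < 0:
            raise ValueError("not enough history for this granularity")
        segment.extend(series[start:start + horizon])
    return segment

def build_sample(series, target_start, horizon, steps_per_hour=12):
    """Concatenate weekly, daily and recent segments and attach the label."""
    day = 24 * steps_per_hour
    week = 7 * day
    recent = extract_segment(series, target_start, horizon, horizon, 1)
    daily = extract_segment(series, target_start, horizon, day, 1)
    weekly = extract_segment(series, target_start, horizon, week, 1)
    features = weekly + daily + recent  # concatenation along the time axis
    label = series[target_start:target_start + horizon]
    return features, label

# Toy series in which each value equals its own time index.
series = list(range(3000))
features, label = build_sample(series, target_start=2500, horizon=12)
```

With this toy series the three segments begin at indices 484 (one week earlier), 2212 (one day earlier) and 2488 (the preceding hour), and the label covers indices 2500 to 2511.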
3. The traffic flow prediction method using spatiotemporal graph convolutional networks according to claim 1, characterized in that the specific method for spatiotemporal location encoding in step S2 is as follows: construct a time embedding matrix and a spatial location embedding matrix; superimpose the time embedding matrix and the spatial location embedding matrix on the initial traffic data to obtain the inherent spatiotemporal information of the initial traffic data.
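A minimal sketch of the encoding in claim 3, reading "superimpose" as element-wise addition. The claim does not say how the two embedding matrices are constructed; the sinusoidal time embedding and the fixed spatial table below are assumptions (in practice both could be learned), and all names are hypothetical.

```python
import math

def time_embedding(num_steps, dim):
    """Sinusoidal embedding, one row per time step (an assumed choice)."""
    table = []
    for t in range(num_steps):
        row = []
        for j in range(dim):
            angle = t / (10000 ** (2 * (j // 2) / dim))
            row.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def add_spatiotemporal_encoding(x, time_table, space_table):
    """Add the time row for step t and the spatial row for node n to x[n][t].

    x[n][t] is a d-dimensional feature vector for node n at time step t.
    """
    return [
        [
            [v + te + se for v, te, se in zip(x[n][t], time_table[t], space_table[n])]
            for t in range(len(x[n]))
        ]
        for n in range(len(x))
    ]

# Toy data: N = 2 nodes, T = 3 time steps, d = 4 features, all zeros.
num_nodes, num_steps, dim = 2, 3, 4
x = [[[0.0] * dim for _ in range(num_steps)] for _ in range(num_nodes)]
space_table = [[0.1] * dim, [0.2] * dim]  # hypothetical per-node embeddings
encoded = add_spatiotemporal_encoding(x, time_embedding(num_steps, dim), space_table)
```

Because the embeddings are added rather than concatenated, the encoded tensor keeps the (N, T, d) shape of the initial traffic data.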
4. The traffic flow prediction method using spatiotemporal graph convolutional networks according to claim 1, characterized in that the specific method for step S3-4 is as follows:
S3-4-1. Input the output of the first residual connection and normalization layer in the current encoding module into the diffusion graph convolutional network and obtain the output of the diffusion graph convolutional network according to the formula [formula not reproduced]; where the symbols denote, in order: the predefined adjacency matrix of the nodes in the road network graph and its transpose; the number of feature diffusion steps; the out-degree matrix and the in-degree matrix of the nodes; the output of the first residual connection and normalization layer in the current encoding module; and the convolution parameters corresponding to forward and backward feature diffusion at the indexed diffusion step;
S3-4-2. Input the output of the first residual connection and normalization layer in the current encoding module into the adaptive graph convolutional network and obtain the output of the adaptive graph convolutional network according to the formula [formula not reproduced]; where the symbols denote, in order: the adaptive adjacency matrix and its transpose; the embedding matrices of the source node and the target node; the transpose of the embedding matrix of the target node; and the activation function;
S3-4-3. Concatenate the outputs of the diffusion graph convolutional network and the adaptive graph convolutional network to obtain the output of the adaptive diffusion graph convolutional network module.
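The two branches in claim 4 can be sketched under stated assumptions. Since the claimed formulas are not reproduced here, the code uses the forms commonly found in the literature: bidirectional diffusion convolution with transition matrices D_O^{-1} A and D_I^{-1} A^T, and an adaptive adjacency matrix softmax(ReLU(E_src E_dst^T)). The single-step adaptive branch, the toy graph, and every parameter value are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def row_normalize(a):
    """Divide each row by its sum, i.e. D^{-1} A; all-zero rows stay zero."""
    return [[v / sum(row) if sum(row) else 0.0 for v in row] for row in a]

def diffusion_conv(x, adj, thetas_fwd, thetas_bwd):
    """Sum over k of P_f^k X Theta_{k,f} + P_b^k X Theta_{k,b} (assumed form)."""
    p_fwd = row_normalize(adj)             # D_O^{-1} A
    p_bwd = row_normalize(transpose(adj))  # D_I^{-1} A^T
    out = [[0.0] * len(thetas_fwd[0][0]) for _ in x]
    x_fwd, x_bwd = x, x                    # the k = 0 term uses P^0 = I
    for th_f, th_b in zip(thetas_fwd, thetas_bwd):
        for term in (matmul(x_fwd, th_f), matmul(x_bwd, th_b)):
            out = [[o + t for o, t in zip(o_row, t_row)] for o_row, t_row in zip(out, term)]
        x_fwd, x_bwd = matmul(p_fwd, x_fwd), matmul(p_bwd, x_bwd)
    return out

def adaptive_adjacency(e_src, e_dst):
    """softmax(ReLU(E_src E_dst^T)), applied row by row."""
    result = []
    for row in matmul(e_src, transpose(e_dst)):
        exps = [math.exp(max(s, 0.0)) for s in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result

# Toy directed 3-node cycle with one feature per node (hypothetical values).
adj = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
x = [[1.0], [2.0], [3.0]]
z_dif = diffusion_conv(x, adj, [[[1.0]], [[0.5]]], [[[0.0]], [[0.5]]])

emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-in node embeddings
a_adp = adaptive_adjacency(emb, emb)
z_adp = matmul(matmul(a_adp, x), [[1.0]])   # one adaptive propagation step

# S3-4-3: concatenate both branches along the feature axis.
z = [d + a for d, a in zip(z_dif, z_adp)]
```

The diffusion branch depends on the fixed road-network topology, while the adaptive branch can express dependencies between nodes that are not linked in the predefined adjacency matrix; concatenation keeps both views available to the following layer.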
5. The traffic flow prediction method using spatiotemporal graph convolutional networks according to claim 1, characterized in that step S5 further includes:
S5-1. Input the multi-granularity feature representation sequence into the masked temporal multi-head self-attention mechanism module, and use the same method as steps S3-1 to S3-2 to obtain the output of the masked temporal multi-head self-attention mechanism module;
S5-2. Obtain the output of the first residual connection and normalization layer in the current decoding module according to the formula [formula not reproduced]; where the symbols denote, in order: the multi-granularity feature representation sequence; and the output of the masked temporal multi-head self-attention mechanism module;
S5-3. Input the output of the first residual connection and normalization layer in the current decoding module, together with the intermediate sequence data, into the Encoder-Decoder interaction module, and use the same method as steps S3-1 to S3-2 to obtain the output of the Encoder-Decoder interaction module;
S5-4. Obtain the output of the second residual connection and normalization layer in the current decoding module according to the formula [formula not reproduced]; where the remaining symbol denotes the output of the Encoder-Decoder interaction module;
S5-5. Apply the same method as steps S3-4 and S3-5 to the output of the second residual connection and normalization layer in the current decoding module to obtain the output of the third residual connection and normalization layer in the current decoding module;
S5-6. Input the output of the third residual connection and normalization layer in the current decoding module into the next decoding module, and use the output of the third residual connection and normalization layer in the third decoding module as the output of the decoder to obtain the initial prediction result;
S5-7. Based on the initial prediction result, adjust the parameters of the encoder and the decoder using the backpropagation algorithm to obtain the trained encoder and the trained decoder.
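The masking in step S5-1 is not spelled out in the claim. A minimal sketch, assuming the causal mask customary in Transformer-style decoders (time step i may attend only to steps 0 through i), is shown below; the function name and the score values are hypothetical.

```python
import math

def masked_attention_weights(scores):
    """Row-wise softmax in which positions j > i receive zero weight."""
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]             # steps at or before step i
        m = max(visible)
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights

# Raw attention scores for T = 3 decoder time steps (hypothetical values).
scores = [[0.0, 5.0, 5.0], [1.0, 1.0, 5.0], [0.0, 0.0, 0.0]]
weights = masked_attention_weights(scores)
```

Under this assumption the large scores placed on future steps in the first two rows have no effect: the first step attends only to itself and the second splits its weight evenly over the two visible steps, so the decoder cannot read label values it is supposed to predict.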