A Traffic Flow Prediction Method Based on Multi-Scale Dual Hypergraph Fusion
By constructing a multi-scale dual hypergraph fusion method, multi-scale spatial features and temporal dependencies of traffic flow are extracted and fused, which solves the shortcomings of existing models in capturing high-order spatial features and achieves traffic flow prediction with higher accuracy and generalization ability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- YANGTZE DELTA REGION INST (QUZHOU) UNIV OF ELECTRONIC SCI & TECH OF CHINA
- Filing Date
- 2025-06-09
- Publication Date
- 2026-06-30
AI Technical Summary
Existing traffic flow prediction models struggle to effectively capture multi-scale, high-order spatial features, particularly in handling micro-level individual travel intentions, meso-level community commuting patterns, and macro-level regional traffic transmission, resulting in limited prediction accuracy and generalization ability.
A multi-scale dual hypergraph fusion method is constructed, which includes dual hypergraphs representing micro-level individual travel intentions, meso-level community commuting, and macro-level regional traffic transmission. Multi-scale spatial features are extracted by combining hypergraph convolution and dynamic graph convolution. Global spatial dependencies are modeled through gating and spatial multi-head attention mechanisms. Long-term and short-term temporal dependencies are captured by combining temporal multi-head attention. Finally, multi-step traffic flow prediction is achieved through residual connections.
It significantly improves the accuracy and generalization ability of traffic flow prediction, and can more accurately reflect the multi-scale high-order spatial characteristics and complex spatiotemporal dynamic interactions of urban roads, providing a more powerful prediction tool for intelligent transportation systems.
Description
Technical Field
[0001] This invention belongs to the field of intelligent transportation, specifically relating to a traffic flow prediction method based on multi-scale dual hypergraph fusion.
Background Technology
[0002] With the accelerating pace of global urbanization, problems such as road congestion and frequent traffic accidents have become increasingly prominent, posing significant challenges to modern urban transportation systems. Against this backdrop, traffic flow prediction, as a core issue of Intelligent Transportation Systems (ITS), offers crucial decision-making support for signal timing optimization and traffic allocation strategy formulation.
[0003] Traffic flow prediction methods have evolved from traditional statistical methods to modern deep learning techniques. Early research relied primarily on traditional statistical models, such as regression analysis and time series models, to predict the dynamic changes in traffic flow. However, these methods had significant limitations in capturing the spatiotemporal dependencies and nonlinear relationships of traffic flow. With the rise of machine learning, researchers introduced algorithms such as support vector machines and random forests, which could handle the nonlinear characteristics of traffic flow to some extent, but were still insufficient in modeling spatial dependencies. Subsequently, deep learning techniques propelled the development of traffic flow prediction. Deep learning models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) performed well in capturing temporal dependencies, but still had limitations in handling the spatial dependencies of traffic flow. The introduction of graph convolutional networks (GCNs) provided a breakthrough solution to this problem. By learning from graph structures, researchers were able to effectively capture the spatial dependencies between regions in traffic networks. Building on this foundation, several models more suitable for high-precision prediction in complex traffic scenarios have been proposed in recent years. For example, DHSTNet employs a dynamic spatiotemporal graph model, combining spatiotemporal dependencies with a dynamic graph structure to more comprehensively model the dynamic changes in traffic flow. STGHTN proposes a spatiotemporally gated hybrid transformer network. This network first uses temporally gated convolution and spatially gated graph convolution to extract temporal and spatial local features of traffic flow, respectively. 
Then, it captures global spatiotemporal dependencies through transformer modules, thereby significantly improving the accuracy of traffic flow prediction.
[0004] In summary, while significant progress has been made in traffic flow prediction research, existing spatiotemporal prediction frameworks primarily rely on feature learning based on nodes and edges of predefined low-order graph structures or on hyperedge feature learning from single-scale hypergraphs. Although these frameworks can capture some spatiotemporal dependencies, they struggle to effectively characterize the multi-scale, high-order spatial features prevalent in road networks, limiting their performance in real-world traffic flow prediction. In reality, the dynamic changes in traffic flow are influenced by a combination of multi-scale factors, including micro-level travel intentions, meso-level community commuting patterns, and macro-level regional flow transmission. At the micro level of travel, existing studies generally employ homogeneous scenario assumptions, neglecting the coexistence of various travel demands such as school drop-off, medical treatment, and freight transport, as well as the significant differences in route selection and timing among different groups. At the meso level of the community, traditional modeling methods fail to capture the dynamic functional units constructed by spatiotemporal behavioral patterns such as high-frequency commuting corridors and periodic pick-up / drop-off flows. These high-order spatial interaction clusters formed based on commuting demands essentially define the functional boundaries of the community. Furthermore, at the macro transmission level, a static adjacency matrix cannot readily reflect the dynamic transmission effect of cross-regional traffic flow, and the interaction that arises through detour behavior between areas without a direct connection remains unaddressed.
Summary of the Invention
[0005] To address the shortcomings of existing technologies, this invention provides a traffic flow prediction method with multi-scale high-order spatial perception, which adaptively integrates high-order spatial features at three scales: micro, meso, and macro, significantly improving the accuracy and generalization ability of traffic flow prediction.
[0006] The objective of this invention is achieved through the following technical solution:
[0007] A traffic flow prediction method based on multi-scale dual hypergraph fusion, the prediction method comprising:
[0008] S1. Based on the actual traffic network topology, a dual hypergraph of travel intention, a dual hypergraph of functional community, and a dual hypergraph of regional transmission are constructed from three scales: micro-level individual travel intention, meso-level community commuting, and macro-level regional traffic transmission, so as to realize the construction of a dual hypergraph of urban roads at multiple scales.
[0009] S2. Based on the three-scale dual hypergraphs constructed in step S1 and the historical traffic flow feature matrix, firstly, multi-scale spatial features are extracted step by step using hypergraph convolution and dynamic graph convolution, and multi-scale fusion of high-order local spatial information is achieved through a gating mechanism. Then, global spatial dependencies are modeled by combining spatial multi-head attention mechanism, and an adaptive spatial fusion gate is introduced to dynamically adjust the weights of high-order local spatial embedding and global spatial features, finally generating a multi-scale spatial encoded feature sequence.
[0010] S3. Input the multi-scale spatial coding feature sequence obtained in step S2 into a dynamic temporally dilated causal convolutional network to extract short-term fluctuation patterns of traffic flow. At the same time, a temporal multi-head attention mechanism is used to capture the long-term trend features of traffic flow data. Then, the short-term dynamics and long-term temporal dependencies of historical traffic flow data are adaptively integrated through an adaptive temporal fusion gate to form a spatiotemporal feature representation that combines multi-scale spatial information and long-term and short-term temporal dependencies.
[0011] S4. Based on the spatiotemporal feature representation obtained in step S3, the spatiotemporal feature fusion is enhanced by residual connection and then input into the convolutional prediction layer to finally output the multi-step traffic flow prediction result.
[0012] Furthermore, the actual traffic network topology in step S1 is specifically as follows:
[0013] The actual traffic network topology is modeled as a weighted directed graph $G = (V, \varepsilon, A)$, where the node set $V = \{v_1, v_2, \ldots, v_N\}$ represents $N$ road segments, i.e., $|V| = N$; the edge set $\varepsilon = \{e_1, e_2, \ldots, e_E\}$ describes the connection relationships of $E$ roads, i.e., $|\varepsilon| = E$; and the adjacency matrix $A \in \mathbb{R}^{N \times N}$ encodes the spatial distance weights between road segments. Furthermore, the historical traffic flow feature matrix over $T$ time steps is $X \in \mathbb{R}^{T \times N \times d}$, where $d$ represents the traffic feature dimension of each road node.
[0014] Specifically, constructing the travel intention dual hypergraph in step S1 includes:
[0015] The hyperedges are formed by all nodes starting from the same road node and all nodes ending at the same road node; the dual hypergraph of travel intention is represented as $G_t = (V_t, \varepsilon_t, H_t)$, and its adjacency matrix $H_t$ is defined as follows:
[0016]
[0017] where each entry is the element in the i-th row and j-th column of the adjacency matrix $H_t$, relating a road node to a hyperedge. The adjacency matrix $H_t$ of the travel intention dual hypergraph can be divided into two parts: one represents the hyperedge connection relationship built based on the starting point, and the other represents the hyperedge connection relationship built based on the destination; together, they constitute the complete topological structure of the dual hypergraph of travel intentions.
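The origin-based and destination-based grouping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented construction: it assumes that the supernodes of the dual hypergraph are the directed road links, that each road node contributes one "starts here" hyperedge and one "ends here" hyperedge, and that the two parts are concatenated column-wise. The function name `travel_intention_incidence` and the toy link list are hypothetical.

```python
import numpy as np

def travel_intention_incidence(edges, num_nodes):
    """Build origin- and destination-based incidence matrices.

    edges: list of (src, dst) road-node index pairs (assumed supernodes).
    Returns H of shape (E, 2 * num_nodes): columns [0, N) are origin
    hyperedges, columns [N, 2N) are destination hyperedges.
    """
    num_edges = len(edges)
    h_src = np.zeros((num_edges, num_nodes), dtype=np.int8)
    h_dst = np.zeros((num_edges, num_nodes), dtype=np.int8)
    for e, (src, dst) in enumerate(edges):
        h_src[e, src] = 1  # link e belongs to the hyperedge "starts at src"
        h_dst[e, dst] = 1  # link e belongs to the hyperedge "ends at dst"
    # Assumption: the two parts are joined column-wise.
    return np.concatenate([h_src, h_dst], axis=1)

# Toy network: 3 road nodes, 4 directed links.
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
H_t = travel_intention_incidence(edges, num_nodes=3)
```

In this toy case every link belongs to exactly two hyperedges, one for its origin and one for its destination, so each row of `H_t` sums to 2.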
[0018] Specifically, the construction of the functional community dual hypergraph in step S1 includes:
[0019] First, the K road nodes with the highest destination overlap are selected to form the destination hyperedge. Then, the K road nodes with the highest origin overlap are selected to form the origin hyperedge. During the node selection process, the total spatial distance between nodes within the hyperedge is minimized. This constructs the dual hyperedge of the functional community dual hypergraph. To quantify the functional similarity between nodes, a new functional association matrix $A_c$ is reconstructed based on the original weighted adjacency matrix $A$. The calculation is as follows:
[0020]
[0021] where each entry is the element in the i-th row and j-th column of the functional association matrix, and $N$ is the number of nodes. The value of this entry is negatively correlated with the functional similarity between nodes $v_i$ and $v_j$. Based on this, the K-nearest neighbor algorithm is applied to $A_c$, and the $C$ most similar neighbors of each node $i$ are selected to form a functional community. Finally, the dual hypergraph $G_c = (V_c, \varepsilon_c, H_c)$ is constructed, and its adjacency matrix $H_c$ is defined as follows:
[0022]
[0023] where each entry is the element in the i-th row and j-th column of the adjacency matrix $H_c$, relating a road node to a hyperedge. The adjacency matrix $H_c$ of the functional community dual hypergraph can be divided into two parts: one represents all hyperedge connections constructed based on the overlap of the starting nodes, while the other represents the hyperedge connections constructed based on the overlap of arriving nodes; together they constitute the complete topological structure of the functional community dual hypergraph.
[0024] Specifically, the construction of the regional transmission dual hypergraph in step S1 includes:
[0025] First, the K most influential neighbor nodes are selected using the Top-K sampling method to construct the node influence adjacency matrix $A_r$. The specific calculations are as follows:
[0026] $idx = \text{Top-K}(-A_{i,:}, K)$,
[0027] $idy = \text{Top-K}(-A_{:,j}, K)$,
[0028] $A_r = A[idx, idy]$,
[0029] where $idx, idy \in \mathbb{R}^{N \times K}$ denote selecting the K most influential neighbor nodes based on the starting point and the K most influential neighbor nodes based on the ending point, respectively. Then, the set of Top-K influential neighbors of each node is used as a hyperedge, and the node influence matrix $A_r$ is used to construct the regional transmission dual hypergraph $G_r = (V_r, \varepsilon_r, H_r)$, whose adjacency matrix $H_r$ is defined as follows:
[0030]
[0031] where each entry is the element in the i-th row and j-th column of the adjacency matrix $H_r$, relating a road node to a hyperedge. The adjacency matrix $H_r$ of the regional transmission dual hypergraph can be divided into two parts, which correspond to the hyperedge relationships of the starting points and of the ending points, respectively; together they constitute the complete topological structure of the regional transmission dual hypergraph.
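The Top-K neighbor selection used for the regional transmission scale can be illustrated with a short sketch. It assumes that $A$ holds pairwise distance weights, so that taking the Top-K of $-A$ means picking the K smallest distances, and that a node is not its own neighbor (the diagonal is masked with infinity). The helper names, the toy matrix, and the way the two index sets are turned into a column-wise incidence matrix are illustrative assumptions; the source's own definition of $H_r$ is not reproduced here.

```python
import numpy as np

def top_k_neighbors(A, k):
    """Return idx, idy of shape (N, k), mirroring Top-K(-A, K).

    idx[i] lists the k columns with the smallest A[i, :] (origin view);
    idy[j] lists the k rows with the smallest A[:, j] (destination view).
    """
    idx = np.argsort(A, axis=1, kind="stable")[:, :k]
    idy = np.argsort(A, axis=0, kind="stable")[:k, :].T
    return idx, idy

def neighbor_incidence(index_sets, num_nodes):
    """Column i marks the nodes that belong to the hyperedge of node i."""
    H = np.zeros((num_nodes, num_nodes), dtype=np.int8)
    for i, members in enumerate(index_sets):
        H[members, i] = 1
    return H

# Toy directed distance matrix (hypothetical values).
A = np.array([[0, 1, 4, 2],
              [3, 0, 3, 5],
              [4, 2, 0, 1],
              [2, 5, 6, 0]], dtype=float)
np.fill_diagonal(A, np.inf)  # assumption: exclude self-neighbors

idx, idy = top_k_neighbors(A, k=2)
H_r = np.concatenate([neighbor_incidence(idx, 4),
                      neighbor_incidence(idy, 4)], axis=1)
```

Because the toy matrix is asymmetric, the origin-based and destination-based neighbor sets differ, which is the reason for keeping two hyperedge families per node.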
[0032] Furthermore, in step S2, hypergraph convolution and dynamic graph convolution are used to progressively extract multi-scale spatial features, and a gating mechanism is used to achieve multi-scale fusion of high-order local spatial information, as detailed below:
[0033] First, the multi-scale hyperedge feature matrix $X_h$ is constructed. The process of integrating the original traffic flow features $X \in \mathbb{R}^{T \times N \times d}$ into the constructed multi-scale dual hypergraph can be formally expressed as follows:
[0034] $X_h = [(W_1 X)[ind_{src}, :];\ (W_2 X)[ind_{dst}, :]] \in \mathbb{R}^{E \times 2d'}$,
[0035] Here, $[\cdot\,;\cdot]$ represents the concatenation operation, and $W_1, W_2 \in \mathbb{R}^{T \times d \times d'}$ are learnable parameters used to perform a linear transformation on the original traffic flow features $X$, where $d'$ is the hyperedge feature dimension. Applying these parameters to $X$ gives $W_1 X, W_2 X \in \mathbb{R}^{N \times d'}$. Accordingly, $(W_1 X)[ind_{src}, :]$ is the hyperedge feature slice tensor constructed based on the hyperedge relationship at the starting point, and $(W_2 X)[ind_{dst}, :]$ is the hyperedge feature slice tensor constructed based on the endpoint hyperedge relationship. Next, the interaction relationships between supernodes are further aggregated through a convolution operation:
[0036] $X_h' = \mathrm{Conv}_{1 \times 1}(X_h) \in \mathbb{R}^{E \times d''}$,
[0037] where $\mathrm{Conv}_{1 \times 1}(\cdot)$ denotes a convolution with a 1×1 kernel that updates the features of the supernodes, and $d''$ is the feature dimension of the supernodes. To enhance the adaptability of the model, the diagonal adaptive learning parameters of the dual hypergraph are defined as follows:
[0038] $W_{adp} = \mathrm{diag}(L_{adp}) \in \mathbb{R}^{N \times N}$,
[0039] where $L_{adp} \in \mathbb{R}^{N}$ is the weight vector used for adaptive learning of hyperedges. Based on this, the adaptive adjacency matrix of the dual hypergraph is defined as follows:
[0040]
[0041] where $D_{hv}$ is a diagonal matrix, $D_{he}$ is the diagonal matrix of supernode degrees, and $H$ is the adjacency matrix of the dual hypergraph.
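The hyperedge feature construction in paragraphs [0033] to [0037] can be sketched as a gather-and-concatenate step followed by a shared linear map. This is a minimal sketch under stated assumptions: the product $W X$ is taken to contract over both the time and feature axes (which yields the stated shape $N \times d'$), the 1×1 convolution is written as a matrix product over the feature axis, and all sizes, the link list, and the names `W_conv` and `b_conv` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, d, d_prime, d_dprime = 12, 4, 2, 8, 6  # hypothetical sizes

# Directed road links (src, dst); E = 5 supernodes of the dual hypergraph.
edges = np.array([(0, 1), (0, 2), (1, 2), (2, 3), (3, 0)])
ind_src, ind_dst = edges[:, 0], edges[:, 1]

X = rng.normal(size=(T, N, d))           # historical traffic features
W1 = rng.normal(size=(T, d, d_prime))    # learnable in a real model
W2 = rng.normal(size=(T, d, d_prime))

# Assumption: contract over time and feature axes so W X has shape (N, d').
W1X = np.einsum("tnd,tde->ne", X, W1)
W2X = np.einsum("tnd,tde->ne", X, W2)

# Gather origin and destination features per link, then concatenate.
X_h = np.concatenate([W1X[ind_src], W2X[ind_dst]], axis=1)  # (E, 2d')

# A 1x1 convolution over the feature axis acts as a shared linear map.
W_conv = rng.normal(size=(2 * d_prime, d_dprime))
b_conv = np.zeros(d_dprime)
X_h_prime = X_h @ W_conv + b_conv                           # (E, d'')
```

Each row of `X_h` therefore describes one road link by the transformed features of its origin node followed by those of its destination node.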
[0042] Subsequently, the adaptive adjacency matrix $W_h$ is input into a hypergraph convolutional network to aggregate the higher-order information of the hyperedges. The calculation process is as follows:
[0043]
[0044] It should be noted that $\Theta = [\Theta_0, \ldots, \Theta_{N-1}] \in \mathbb{R}^{N \times d''}$ are the parameters to be learned. Further, the Reshape() function is used to map the output of the hypergraph convolutional network into a sparse matrix, calculated as follows:
[0045]
[0046] Based on this, the reverse correlation matrix is constructed, and a dynamic graph convolutional network is used to aggregate high-order local information about traffic flow, as shown below:
[0047]
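[0048] where $A_f = A / \mathrm{rowsum}(A)$ is the row-normalized adjacency matrix, which is used together with a column-normalized adjacency matrix and an adaptive adjacency matrix; $E_1, E_2 \in \mathbb{R}^{N \times c}$ are learnable parameters; and $\Theta'_f = [\Theta'_{0,f}, \ldots, \Theta'_{N-1,f}] \in \mathbb{R}^{N \times d \times 1}$, $\Theta'_b = [\Theta'_{0,b}, \ldots, \Theta'_{N-1,b}] \in \mathbb{R}^{N \times d \times 1}$, and $\Theta'_{adp} = [\Theta'_{0,adp}, \ldots, \Theta'_{N-1,adp}] \in \mathbb{R}^{N \times d \times 1}$ are learnable parameters;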
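The three transition matrices named above can be sketched as follows. Row normalization follows the stated $A_f = A / \mathrm{rowsum}(A)$. The column normalization (each column divided by its sum) and the adaptive matrix built as a row-wise softmax of $\mathrm{ReLU}(E_1 E_2^{\top})$ are assumptions borrowed from common practice in graph-based traffic models, since the source equation is not available; the function names and toy values are hypothetical.

```python
import numpy as np

def normalize(A, axis):
    """Divide by the sum along `axis`, leaving all-zero rows/columns at 0."""
    s = A.sum(axis=axis, keepdims=True)
    return np.divide(A, s, out=np.zeros_like(A), where=s != 0)

def row_normalize(A):
    return normalize(A, axis=1)      # A_f = A / rowsum(A)

def column_normalize(A):
    return normalize(A, axis=0)      # assumption: A / colsum(A)

def adaptive_adjacency(E1, E2):
    """Assumption: softmax(ReLU(E1 @ E2.T)) applied row-wise."""
    scores = np.maximum(E1 @ E2.T, 0.0)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=1, keepdims=True)

A = np.array([[0, 2, 2],
              [1, 0, 3],
              [4, 0, 0]], dtype=float)
A_f = row_normalize(A)
A_b = column_normalize(A)

rng = np.random.default_rng(0)
E1, E2 = rng.normal(size=(3, 2)), rng.normal(size=(3, 2))  # c = 2
A_adp = adaptive_adjacency(E1, E2)
```

The rows of `A_f` and `A_adp` and the columns of `A_b` each sum to 1, so all three can be read as transition weights.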
[0049] Then, multiple Dynamic Graph Convolutional Networks (DGCNs) are used to enable the model to comprehensively learn the high-order local spatial information of the multi-scale dual hypergraph. The specific calculation is as follows:
[0050] $Z_t = \mathrm{DGCN}_t(X),\quad Z_c = \mathrm{DGCN}_c(X),\quad Z_r = \mathrm{DGCN}_r(X)$,
[0051] where $Z_t$, $Z_c$, and $Z_r$ represent the higher-order spatial embeddings of the travel intention dual hypergraph, the community commuting dual hypergraph, and the regional transmission dual hypergraph, respectively.
[0052] Finally, a gating mechanism is introduced to avoid overfitting during the aggregation process of high-order spatial embeddings. The specific process is as follows:
[0053] $Z_{cr} = \mathrm{Sigmoid}(Z_c) \odot \mathrm{Tanh}(Z_r)$,
[0054] $g_h = \mathrm{Sigmoid}(Z_{cr} + Z_t)$,
[0055] $Z_h = g_h \odot Z_{cr} + (1 - g_h) \odot Z_t$,
[0056] where $\odot$ represents element-wise multiplication; the adaptive weight $g_h$ is obtained through the Sigmoid activation function and ranges from 0 to 1; $Z_{cr}$ represents the fused higher-order spatial embedding of the community commuting dual hypergraph and the regional transmission dual hypergraph; and $Z_h$ represents the final generated high-order local spatial embedding;
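The three gating equations translate directly into element-wise array operations. The sketch below follows the formulas as written; the function name `fuse_multiscale`, the tensor shapes, and the random inputs standing in for DGCN outputs are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_multiscale(Z_t, Z_c, Z_r):
    """Gate the meso and macro embeddings, then blend with the micro one."""
    Z_cr = sigmoid(Z_c) * np.tanh(Z_r)       # community x regional gating
    g_h = sigmoid(Z_cr + Z_t)                # adaptive weight in (0, 1)
    Z_h = g_h * Z_cr + (1.0 - g_h) * Z_t     # convex combination per element
    return Z_h, g_h

rng = np.random.default_rng(0)
shape = (12, 4, 8)                           # hypothetical (T, N, features)
Z_t, Z_c, Z_r = (rng.normal(size=shape) for _ in range(3))
Z_h, g_h = fuse_multiscale(Z_t, Z_c, Z_r)
```

Since `g_h` lies strictly between 0 and 1, every element of `Z_h` falls between the corresponding elements of `Z_cr` and `Z_t`.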
[0057] Furthermore, in step S2, the global spatial dependency relationship is modeled using a spatial multi-head attention mechanism, as detailed below:
[0058] First, from the input features $X \in \mathbb{R}^{T \times N \times d}$, the query matrix $Q_S$, the key matrix $K_S$, and the value matrix $V_S$ are generated through three independent linear transformation layers. The specific calculations are as follows:
[0059]
[0060] where the three weight matrices are the learnable parameter matrices of the linear layers used to generate the query matrix $Q_S$, the key matrix $K_S$, and the value matrix $V_S$;
[0061] Based on this, the attention score between nodes in the spatial dimension is calculated by introducing a scaling factor. To stabilize gradient calculations and prevent the softmax function from saturating due to excessively large dot product results, the attention score matrix is calculated as follows:
[0062]
[0063] where $K_S^{\top}$ denotes the transpose of $K_S$. Because the expressive power of a single attention mechanism is limited, a multi-head attention mechanism is further introduced to capture the feature interactions of different subspaces. Specifically, the outputs of multiple attention heads are concatenated and then linearly transformed to obtain the final output:
[0064]
[0065] where $m$ represents the number of attention heads, $W_S$ is the output projection matrix, and Concat() is the concatenation operation;
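The spatial multi-head attention steps above (linear projections, scaled scores, softmax, head concatenation, output projection) can be sketched as follows. Because the source equations are not available, the sketch uses the standard scaled dot-product formulation with scaling factor $\sqrt{d/m}$, which is an assumption about the exact form used here; the function name, the weight names `Wq`, `Wk`, `Wv`, `Wo`, and all sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_multi_head_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """X: (T, N, d). Attention is computed across the N nodes per time step."""
    T, N, d = X.shape
    assert d % num_heads == 0, "d must be divisible by the number of heads"
    d_head = d // num_heads

    def split(M):  # (T, N, d) -> (T, heads, N, d_head)
        return M.reshape(T, N, num_heads, d_head).transpose(0, 2, 1, 3)

    Q, K, V = split(X @ Wq), split(X @ Wk), split(X @ Wv)
    # Scaling keeps the dot products from saturating the softmax.
    scores = Q @ K.transpose(0, 1, 3, 2) / np.sqrt(d_head)  # (T, heads, N, N)
    attn = softmax(scores, axis=-1)
    heads = attn @ V                                        # (T, heads, N, d_head)
    concat = heads.transpose(0, 2, 1, 3).reshape(T, N, d)   # Concat over heads
    return concat @ Wo, attn

rng = np.random.default_rng(0)
T, N, d, m = 3, 5, 8, 2
X = rng.normal(size=(T, N, d))
Wq, Wk, Wv, Wo = (rng.normal(size=(d, d)) for _ in range(4))
out, attn = spatial_multi_head_attention(X, Wq, Wk, Wv, Wo, num_heads=m)
```

Each row of an attention map sums to 1, so every node's output is a weighted average over all nodes at the same time step, which is what lets the mechanism model global spatial dependencies.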
[0066] Finally, a feature interaction mechanism is designed to enhance the expressive power of local features. This mechanism stabilizes the training process through residual connections and layer normalization, while introducing nonlinear transformations to enhance feature representation, thereby obtaining global spatial features. The calculation process is as follows:
[0067]
[0068] Here, ReLU is the activation function, and Layernorm is the layer normalization function.
[0069] Furthermore, in step S2, an adaptive spatial fusion gate is introduced to dynamically adjust the weights of high-order local spatial embedding and global spatial features, as follows:
[0070] An Entangle fusion mechanism is introduced to adaptively balance high-order local spatial embeddings and global spatial dependencies. This mechanism dynamically adjusts the contribution of the two types of features through gating weights, thereby obtaining the final output features that simultaneously contain local and global spatial information.
[0071]
[0072] where the adaptive weight $g_s$ is obtained through the Sigmoid activation function and lies between 0 and 1;
[0073] Specifically, step S3 includes the following:
[0074] (3.1) Short-term fluctuation patterns of traffic flow are extracted through dynamic time-dilated causal convolution, specifically as follows:
[0075] Given the multi-scale spatially encoded feature sequence obtained in step S2 and a filter $F = \{f_1, f_2, \ldots, f_K\} \in \mathbb{R}^{K}$, the dilated causal convolution at time step $t$ is calculated as follows:
[0076]
[0077] where $u$ is the dilation factor, which controls the skip distance and hence the receptive field of the model in the time dimension, making it easier to learn the short-term temporal dependence of traffic flow data.
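A dilated causal convolution over a single time series can be sketched as below. The formula used, $y_t = \sum_{s=0}^{K-1} f_s\, x_{t - u s}$ with out-of-range inputs treated as zero, is the standard form and is an assumption about the missing source equation; the function name and the toy data are hypothetical.

```python
import numpy as np

def dilated_causal_conv(x, f, dilation):
    """y[t] = sum_s f[s] * x[t - dilation * s], with x[<0] treated as 0."""
    T, K = len(x), len(f)
    y = np.zeros(T)
    for t in range(T):
        for s in range(K):
            idx = t - dilation * s
            if idx >= 0:              # causal: only current and past inputs
                y[t] += f[s] * x[idx]
    return y

x = np.arange(1.0, 7.0)               # [1, 2, 3, 4, 5, 6]
f = np.array([1.0, -1.0])             # difference filter
y = dilated_causal_conv(x, f, dilation=2)   # y[t] = x[t] - x[t - 2]
```

With a kernel of size K and dilation factor u, each output depends on inputs up to u(K-1) steps back, so increasing u widens the temporal receptive field without adding parameters.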
[0078] Simultaneously, a gating mechanism is employed to control the output of the information flow. This utilizes gated activation units to extract short-term features from the output of the dilated causal convolutional layer in parallel, reducing the impact of external noise. Specifically:
[0079] $Z_T = \mathrm{Sigmoid}(\Phi_1 * X_S + a) \odot \mathrm{Tanh}(\Phi_2 * X_S + b)$
[0080] Here, $\Phi_1$ and $\Phi_2$ are time-independent one-dimensional dilated causal convolution operations, $a$ and $b$ are learnable offset parameters, Sigmoid() is the activation function that determines the proportion of information passed to the next layer, Tanh() is the activation function that performs the nonlinear transformation, and $Z_T \in \mathbb{R}^{N \times T \times d'}$ is the feature representation with enhanced temporal perception;
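The gated activation unit can be sketched by applying two separate dilated causal filters to the same input and multiplying a sigmoid gate with a tanh feature branch. For brevity the sketch applies one-dimensional filters along the last (time) axis of an array shaped (N, d', T), which is an assumed layout; the helper names, filter values, and scalar offsets are hypothetical.

```python
import numpy as np

def causal_conv(x, f, dilation):
    """Dilated causal 1-D convolution along the last axis, zero left padding."""
    K, T = len(f), x.shape[-1]
    pad = dilation * (K - 1)
    xp = np.pad(x, [(0, 0)] * (x.ndim - 1) + [(pad, 0)])
    # Tap s reads the input shifted back by dilation * s steps.
    return sum(f[s] * xp[..., pad - dilation * s: pad - dilation * s + T]
               for s in range(K))

def gated_temporal_unit(X_S, phi1, phi2, a, b, dilation):
    gate = 1.0 / (1.0 + np.exp(-(causal_conv(X_S, phi1, dilation) + a)))
    feat = np.tanh(causal_conv(X_S, phi2, dilation) + b)
    return gate * feat                # Z_T = Sigmoid(.) ⊙ Tanh(.)

rng = np.random.default_rng(0)
X_S = rng.normal(size=(4, 8, 12))     # assumed layout (N, d', T)
phi1 = np.array([0.5, -0.3])
phi2 = np.array([0.8, 0.2])
Z_T = gated_temporal_unit(X_S, phi1, phi2, a=0.1, b=-0.1, dilation=2)
```

The sigmoid branch scales each feature of the tanh branch by a factor between 0 and 1, which is how the unit limits how much of a noisy short-term signal is passed on.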
[0081] (3.2) A temporal multi-head attention mechanism is used to capture the long-term trend features of traffic flow data, specifically as follows:
[0082] First, the query matrix $Q_T$, the key matrix $K_T$, and the value matrix $V_T$ are obtained through three independent linear transformation layers. The specific definition is as follows:
[0083]
[0084] where the three weight matrices are the learnable parameter matrices used to generate the query matrix $Q_T$, the key matrix $K_T$, and the value matrix $V_T$;
[0085] Based on this, the attention score between nodes in the time dimension is calculated by introducing a scaling factor. To stabilize gradient calculations and prevent the softmax function from saturating due to excessively large dot product results, the attention score matrix is calculated as follows:
[0086]
[0087] where $K_T^{\top}$ denotes the transpose of the key matrix $K_T$. Considering the limited expressive power of a single attention mechanism, a multi-head attention mechanism is further introduced to capture the feature interactions of different subspaces. Specifically, the outputs of multiple attention heads are concatenated and then linearly transformed to obtain the final output:
[0088]
[0089] where $m$ represents the number of attention heads, $W_T$ is a linear transformation matrix used to map the output of temporal multi-head attention to the desired output dimension, and Concat() is the concatenation operation.
[0090] Finally, a feature interaction mechanism is designed to enhance the expressive power of local features and address the problem of the multi-head attention mechanism's insensitivity to short-term temporal changes, thereby obtaining long-term temporal dependent features. The calculation process is as follows:
[0091]
[0092] Here, ReLU() is the activation function, and Layernorm() is the layer normalization function.
[0093] (3.3) The short-term dynamics and long-term temporal dependencies of historical traffic flow data are integrated through an adaptive temporal fusion gate, as follows:
[0094] An Entangle fusion mechanism is introduced to dynamically adjust the short-term dynamics and long-term temporal dependence contributions of historical traffic flow data through gating weights, thereby obtaining a spatiotemporal feature representation that combines multi-scale spatial information and long- and short-term temporal dependence characteristics.
[0095]
[0096] where the adaptive weight $g_t$ is obtained through the Sigmoid activation function and takes values between 0 and 1;
[0097] Furthermore, step S4 specifically includes the following:
[0098] First, the multi-scale spatiotemporal feature representations of the outputs from each layer are fused through a skip connection mechanism. The calculation process is as follows:
[0099]
[0100] where $W_a$ and $W_b$ are weight parameters, $M$ represents the number of network layers, and Layernorm() is the layer normalization function. Finally, a 1×1 convolutional layer maps the fused features to the prediction result.
[0101]
[0102] where the output is the traffic flow prediction sequence for the next $P$ time steps.
[0103] Compared with the prior art, the present invention has the following beneficial effects:
[0104] Existing spatiotemporal prediction frameworks primarily rely on feature learning based on nodes and edges of predefined low-order graph structures or on hyperedge feature learning from single-scale hypergraphs. While these frameworks can capture certain spatiotemporal dependencies, they struggle to effectively represent the multi-scale, high-order spatial features prevalent in road networks, limiting their performance in practical traffic flow prediction. To address these issues, this invention proposes a traffic flow prediction method with multi-scale, high-order spatial perception, offering the following core advantages: First, by constructing a dual hypergraph representation framework encompassing microscopic (individual travel intentions), mesoscopic (community commuting), and macroscopic (regional traffic transmission) dimensions, accurate modeling of multi-scale, high-order spatial features of urban roads is achieved. Second, a spatiotemporal perception method based on multi-scale hypergraph feature fusion is designed, employing an adaptive fusion mechanism to effectively integrate spatial features at different scales, significantly enhancing the model's ability to capture complex spatiotemporal dynamic interaction features. Finally, the model achieves high-precision traffic flow prediction by dynamically fusing multi-scale, high-order spatial features with complex temporal dependencies. This method overcomes the limitations of traditional single-scale modeling, providing a more powerful prediction tool for intelligent transportation systems.
Description of the Attached Figures
[0105] To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention will be described in detail below with reference to the accompanying drawings.
[0106] Figure 1 is a diagram of the traffic flow prediction model based on multi-scale hypergraph fusion of the present invention;
[0107] Figure 2 is a schematic diagram of the construction of the multi-scale dual hypergraph of the present invention;
[0108] Figure 3 is a schematic diagram of the high-order spatial perception of the present invention;
[0109] Figure 4 is a schematic diagram of the spatial multi-head attention method of the present invention;
[0110] Figure 5 is a schematic diagram of the adaptive spatial fusion gate of the present invention;
[0111] Figure 6 is a schematic diagram of the temporal perception enhancement of the present invention;
[0112] Figure 7 is a schematic diagram of the temporal multi-head attention mechanism of the present invention;
[0113] Figure 8 is a schematic diagram of the adaptive temporal fusion gate of the present invention;
[0114] Figure 9 is a schematic diagram of the model output of the present invention.
Detailed Implementation
[0115] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of the embodiments. The components of the embodiments of this application described and shown in the accompanying drawings can generally be arranged and designed in various different configurations. Therefore, the detailed description of the embodiments of this application provided below with reference to the accompanying drawings is not intended to limit the scope of protection of the claimed application, but merely represents selected embodiments of this application. All other embodiments obtained by those skilled in the art based on the embodiments of this application without inventive effort are within the scope of protection of this application. The present invention will be further described below with reference to the accompanying drawings.
[0116] This invention proposes a traffic flow prediction method based on multi-scale dual hypergraph fusion to address the problem that existing traffic flow prediction models rely on predefined low-order graphs or single-scale hypergraphs, making it difficult to capture high-order spatial dependencies across multiple scales in the traffic network. First, the model constructs a dual hypergraph of urban roads at three scales: micro-level individual travel intentions, meso-level community commuting, and macro-level regional traffic transmission. Based on this, a traffic flow prediction method combining spatial and temporal perception is designed. The spatial perception module, composed of high-order spatial perception and spatial multi-head attention, is used to extract and fuse high-order local spatial features and global spatial dependencies from the multi-scale dual hypergraph. The temporal perception module, composed of temporal perception enhancement and temporal multi-head attention, is used to capture short-term fluctuations and long-term trends in traffic flow. Finally, by stacking the spatial and temporal perception modules and introducing a residual connection mechanism, multi-step spatiotemporal feature fusion and traffic flow prediction are achieved. This model adaptively fuses high-order spatial features at the micro, meso, and macro scales, significantly improving the accuracy and generalization ability of traffic flow prediction.
[0117] This embodiment selects four real-world traffic flow datasets (PEMS03, PEMS04, PEMS07, and PEMS08) collected by the California Department of Transportation through its Performance Measurement System (PeMS) as input sources of actual traffic operation data. These datasets are used to build and verify the multi-scale dual hypergraph construction process and the spatiotemporal feature fusion mechanism of the model of this invention. The basic statistical information of these four datasets is shown in Table 1.
[0118] Table 1
[0119]
[0120] After the required datasets have been obtained and processed, the overall model-building process is as shown in Figure 1, and the specific steps include:
[0122] Step 1: Model the actual traffic network topology as a weighted directed graph $G = (V, \varepsilon, A)$, where the node set $V = \{v_1, v_2, \ldots, v_N\}$ represents $N$ road segments, i.e., $|V| = N$; the edge set $\varepsilon = \{e_1, e_2, \ldots, e_E\}$ describes the connection relationships of $E$ roads, i.e., $|\varepsilon| = E$; the adjacency matrix $A \in \mathbb{R}^{N \times N}$ encodes the spatial distance weights between road segments; and the traffic feature matrix over $T$ time steps is $X \in \mathbb{R}^{T \times N \times d}$, where $d$ represents the traffic feature dimension of each road node (such as speed, flow, etc.);
[0123] Step 2: Constructing a Micro-scale Dual Hypergraph of Travel Intentions: Existing traffic flow prediction studies generally employ homogeneous scenario assumptions, such as assuming that all travelers during the morning rush hour are commuting to the central business district. This assumption forcibly compresses people's travel intentions into a single commuting purpose. However, in real traffic systems, multiple types of travel demands exist at the same time, such as school drop-off, medical treatment, and freight transport, and different groups differ significantly in route selection and time arrangements. Therefore, existing models struggle to accurately characterize the spatiotemporal distribution characteristics of traffic flow caused by the diversity of individual travel intentions. To better capture the impact of diverse individual travel intentions on traffic flow at the micro level, this invention proposes a dual hypergraph of travel intentions. Specifically, as shown in Figure 2(a), all nodes starting from the same road node and all nodes ending at the same road node form hyperedges. The dual hypergraph of travel intentions is represented as $G_t = (V_t, \varepsilon_t, H_t)$, and its adjacency matrix $H_t$ is defined as follows:
[0124]
[0125] where the element in the i-th row and j-th column of the adjacency matrix $H_t$ relates a road node to a hyperedge. The adjacency matrix $H_t$ of the travel intention dual hypergraph can be divided into two parts: one represents the hyperedge connections built from the starting point, and the other represents the hyperedge connections built from the destination. Together, they constitute the complete topological structure of the travel intention dual hypergraph.
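One possible reading of this construction can be sketched as follows. The sketch assumes that the origin hyperedge of road node i contains every node reached by an edge leaving i, that the destination hyperedge of road node i contains every node with an edge entering i, and that membership is binary. These are assumptions made for illustration; the function name `travel_intention_incidence` is hypothetical.

```python
import numpy as np

def travel_intention_incidence(A):
    """Build [H_src, H_dst] with rows = road nodes and columns = hyperedges.

    Assumed reading: column i of H_src marks the nodes reached from node i,
    column i of H_dst marks the nodes that lead into node i.
    """
    connected = A > 0
    H_src = connected.T.astype(float)   # H_src[j, i] = 1 if an edge i -> j exists
    H_dst = connected.astype(float)     # H_dst[j, i] = 1 if an edge j -> i exists
    return np.concatenate([H_src, H_dst], axis=1)   # shape (N, 2N)

# Toy directed network with edges 0->1, 0->3, 1->2, 2->3.
A = np.zeros((4, 4))
for i, j in [(0, 1), (0, 3), (1, 2), (2, 3)]:
    A[i, j] = 1.0

H_t = travel_intention_incidence(A)
```

In this toy case the origin hyperedge of node 0 contains nodes 1 and 3, and the destination hyperedge of node 3 contains nodes 0 and 2.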
[0126] Step 3: Construct the meso-scale functional community dual hypergraph. When modeling urban spatial relationships, existing traffic flow prediction studies generally neglect the high-order spatial correlations at the level of meso-scale functional communities. In real urban systems, however, communities do not strictly follow administrative divisions; they are dynamic functional units formed by spatiotemporal behavioral patterns such as high-frequency commuting corridors and periodic pick-up and drop-off flows. For example, regular commuting corridors form between educational districts and the surrounding residential areas. These high-order spatial interaction clusters based on commuting needs essentially define the functional boundaries of the community. To learn the impact of the spatial correlations of such dynamic functional units on traffic flow, the present invention proposes a functional community dual hypergraph. Specifically, as shown in Figure 2(b), the K road nodes with the highest destination overlap are first selected to form the destination hyperedge, and the K road nodes with the highest origin overlap are then selected to form the origin hyperedge. During node selection, the total spatial distance between the nodes within a hyperedge is minimized, which yields the dual hyperedges of the functional community hypergraph. To quantify the functional similarity between nodes, a new functional association matrix $A_c$ is reconstructed from the original weighted adjacency matrix $A$, calculated as follows:
[0127]
[0128] where the element in the i-th row and j-th column of the functional association matrix $A_c$ takes a value that is negatively correlated with the functional similarity between nodes $v_i$ and $v_j$, and $N$ is the number of nodes. On this basis, the K-nearest neighbor algorithm is applied to $A_c$, and the C most similar neighbors of each node $i$ are selected to form a functional community. Finally, the functional community dual hypergraph $G_c = (V_c, \varepsilon_c, H_c)$ is constructed, and its adjacency matrix $H_c$ is defined as follows:
[0129]
[0130] where the element in the i-th row and j-th column of the adjacency matrix $H_c$ relates a road node to a hyperedge. The adjacency matrix $H_c$ of the functional community dual hypergraph can be divided into two parts: one represents all hyperedge connections built from the overlap of starting nodes, and the other represents the hyperedge connections built from the overlap of arriving nodes. Together, they constitute the complete topological structure of the functional community dual hypergraph.
[0131] Step 4: Construct the macro-scale regional transmission dual hypergraph. Traditional urban spatial modeling methods often overlook the macro-level mechanisms by which traffic flow is transmitted across regions. For example, even if two regions have no direct road connection, a traffic accident in one region may lead drivers to detour, indirectly affecting traffic flow in the other. This dynamic macro-level transmission effect is difficult to capture with a static adjacency matrix. To address this, the present invention proposes a regional transmission dual hypergraph that learns the macro-level influence relationships between regions. Specifically, as shown in Figure 2(c), the K most influential neighbor nodes are first selected with the Top-K sampling method to construct the node influence adjacency matrix $A_r$, calculated as follows:
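[0132] $\mathrm{idx} = \text{Top-K}(-A_{i,:}, K)$, $\mathrm{idy} = \text{Top-K}(-A_{:,j}, K)$, $A_r = A[\mathrm{idx}, \mathrm{idy}]$ (4)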
[0133] where $\mathrm{idx}, \mathrm{idy} \in \mathbb{R}^{N \times K}$ denote the K most influential neighbor nodes selected based on the starting point and the K most influential neighbor nodes selected based on the ending point, respectively. Then, the set of Top-K influential neighbors of each node is taken as a hyperedge, and the regional transmission dual hypergraph $G_r = (V_r, \varepsilon_r, H_r)$ is constructed from the node influence matrix $A_r$; its adjacency matrix $H_r$ is defined as follows:
[0134]
[0135] where the element in the i-th row and j-th column of the adjacency matrix $H_r$ relates a road node to a hyperedge. The adjacency matrix $H_r$ of the regional transmission dual hypergraph can be divided into two parts, which correspond to the hyperedge relationships of the starting points and of the ending points, respectively. Together, they constitute the complete topological structure of the regional transmission dual hypergraph.
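The Top-K selection of this step can be sketched in NumPy as follows. The sketch assumes that Top-K applied to the negated distance weights picks the K closest neighbors, that a node is excluded from its own neighbor set, and that hyperedge membership is binary. The helper `top_k` and the toy matrix are hypothetical. How $A_r$ is then assembled from the two index sets is not reproduced here, so the sketch stops at the hyperedge membership.

```python
import numpy as np

def top_k(scores, k):
    """Column indices of the k largest scores in every row, best first."""
    return np.argsort(-scores, axis=1, kind="stable")[:, :k]

# Hypothetical distance-weighted adjacency; A[i, j] is the cost from i to j.
A = np.array([[0.0, 1.0, 4.0, 2.0],
              [1.0, 0.0, 3.0, 5.0],
              [4.0, 3.0, 0.0, 1.0],
              [2.0, 5.0, 1.0, 0.0]])
N, K = A.shape[0], 2

D = A.copy()
np.fill_diagonal(D, np.inf)      # assumption: a node is not its own neighbour

idx = top_k(-D, K)               # per row: the K closest successors of each node
idy = top_k(-D.T, K)             # per column: the K closest predecessors of each node

# Hyperedge n (one column per node) contains the Top-K neighbours of node n.
H_src = np.zeros((N, N))
H_dst = np.zeros((N, N))
H_src[idx, np.arange(N)[:, None]] = 1.0
H_dst[idy, np.arange(N)[:, None]] = 1.0
H_r = np.concatenate([H_src, H_dst], axis=1)   # shape (N, 2N)
```

Every column of `H_r` then contains exactly K ones, one per selected neighbor.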
[0136] Step 5: Progressively extract multi-scale high-order local spatial features through hypergraph convolution and dynamic graph convolution. As shown in the high-order spatial perception diagram of Figure 3, the multi-scale hyperedge feature matrix $X_h$ is first constructed by integrating the original traffic flow features $X \in \mathbb{R}^{T \times N \times d}$ into the constructed multi-scale dual hypergraphs, which can be formally expressed as follows:
[0137]
[0138] where $[\cdot\,;\cdot]$ is the concatenation operation and $W_1, W_2 \in \mathbb{R}^{T \times d \times d'}$ are learnable parameters that apply a linear transformation to the original traffic flow features $X$, with $d'$ the dimension of the hyperedge features. Applying them to $X$ gives $W_1 X, W_2 X \in \mathbb{R}^{N \times d'}$. Accordingly, $(W_1 X)[\mathrm{ind}_{src}, :]$ is the hyperedge feature slice tensor built from the starting-point hyperedge relationships, and $(W_2 X)[\mathrm{ind}_{dst}, :]$ is the hyperedge feature slice tensor built from the endpoint hyperedge relationships. Next, the interactions between hypernodes are further aggregated through a convolution operation:
[0139] $X_h' = \mathrm{Conv}_{1 \times 1}(X_h) \in \mathbb{R}^{E \times d''}$ (7)
[0140] where $\mathrm{Conv}_{1 \times 1}(\cdot)$ denotes convolution with a 1×1 kernel, used to update the hypernode features, and $d''$ is the hypernode feature dimension. To enhance the adaptability of the model, the diagonal adaptive learning parameters of the dual hypergraph are defined as follows:
[0141] $W_{adp} = \mathrm{diag}(L_{adp}) \in \mathbb{R}^{N \times N}$ (8)
[0142] where $L_{adp} \in \mathbb{R}^{N}$ is the weight vector used for adaptive learning of the hyperedges. On this basis, the adaptive adjacency matrix of the dual hypergraph is defined as follows:
[0143]
[0144] where $D_{hv}$ and $D_{he}$ are diagonal degree matrices of the dual hypergraph ($D_{he}$ being the diagonal matrix of hypernode degrees), and $H$ is the adjacency matrix of the dual hypergraph.
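For orientation, the sketch below shows the widely used HGNN-style normalized hypergraph propagation operator, $D_v^{-1/2} H W D_e^{-1} H^{\top} D_v^{-1/2}$, with a diagonal hyperedge weight matrix playing the role of $W_{adp}$. This is an assumed, generic form and not necessarily the adaptive adjacency matrix defined in this step; the function name and the toy incidence matrix are hypothetical.

```python
import numpy as np

def hypergraph_propagation(H, edge_weights):
    """HGNN-style operator D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2}.

    H: (num_hypernodes, num_hyperedges) binary incidence matrix.
    edge_weights: positive weight per hyperedge (stand-in for L_adp).
    """
    W = np.diag(edge_weights)
    d_v = H @ edge_weights               # weighted hypernode degrees
    d_e = H.sum(axis=0)                  # number of members of each hyperedge
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(d_v))
    De_inv = np.diag(1.0 / d_e)
    return Dv_inv_sqrt @ H @ W @ De_inv @ H.T @ Dv_inv_sqrt

# 4 hypernodes, 2 hyperedges; every hypernode belongs to at least one hyperedge.
H = np.array([[1, 0], [1, 1], [0, 1], [0, 1]], dtype=float)
L_adp = np.ones(2)                       # would be learned in a real model
P = hypergraph_propagation(H, L_adp)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))              # hypernode features
Theta = rng.normal(size=(3, 5))          # layer parameters
out = np.maximum(P @ X @ Theta, 0.0)     # one propagation layer with ReLU
```

The operator is symmetric, and with unit weights it leaves the vector of square-rooted hypernode degrees unchanged, which is a quick sanity check on the normalization.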
[0145] Subsequently, the adaptive adjacency matrix $W_h$ is input into a hypergraph convolutional network to aggregate the high-order information of the hyperedges. The calculation process is as follows:
[0146]
[0147] It should be noted that $\Theta = [\Theta_0, \ldots, \Theta_{N-1}] \in \mathbb{R}^{N \times d''}$ are the parameters to be learned. Further, the Reshape() function is used to map the output of the hypergraph convolutional network into a sparse matrix, calculated as follows:
[0148]
[0149] On this basis, the reverse correlation matrix is constructed, and a dynamic graph convolutional network is used to aggregate the high-order local information of traffic flow, as shown below:
[0150]
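[0151] where $A_f = A / \mathrm{rowsum}(A)$ is the row-normalized adjacency matrix, which is accompanied by a column-normalized adjacency matrix and an adaptive adjacency matrix; $E_1, E_2 \in \mathbb{R}^{N \times c}$ are learnable parameters; and $\Theta'_f = [\Theta'_{0,f}, \ldots, \Theta'_{N-1,f}] \in \mathbb{R}^{N \times d \times 1}$, $\Theta'_b = [\Theta'_{0,b}, \ldots, \Theta'_{N-1,b}] \in \mathbb{R}^{N \times d \times 1}$, and $\Theta'_{adp} = [\Theta'_{0,adp}, \ldots, \Theta'_{N-1,adp}] \in \mathbb{R}^{N \times d \times 1}$ are learnable parameters;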
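The three adjacency matrices named here can be illustrated with a short NumPy sketch. Only $A_f = A / \mathrm{rowsum}(A)$ is taken from the text. The backward matrix is assumed to be the row-normalized transpose of $A$ (equivalently, the transpose of the column-normalized $A$), and the adaptive matrix is assumed to follow the Graph WaveNet form $\mathrm{softmax}(\mathrm{ReLU}(E_1 E_2^{\top}))$. Both are assumptions, and the names `A_b` and `A_adp` are hypothetical.

```python
import numpy as np

def row_normalise(A):
    """Divide every row by its sum; rows that sum to zero stay all-zero."""
    sums = A.sum(axis=1, keepdims=True)
    return np.divide(A, sums, out=np.zeros_like(A), where=sums > 0)

def softmax_rows(M):
    M = M - M.max(axis=1, keepdims=True)     # subtract the row max for stability
    e = np.exp(M)
    return e / e.sum(axis=1, keepdims=True)

A = np.array([[0.0, 2.0, 2.0],
              [1.0, 0.0, 3.0],
              [0.0, 0.0, 0.0]])              # node 2 has no outgoing edge

A_f = row_normalise(A)                       # A / rowsum(A), as in the text
A_b = row_normalise(A.T)                     # assumed backward counterpart

rng = np.random.default_rng(0)
E1 = rng.normal(size=(3, 2))                 # node embeddings, c = 2
E2 = rng.normal(size=(3, 2))
A_adp = softmax_rows(np.maximum(E1 @ E2.T, 0.0))   # assumed adaptive adjacency
```

Guarding the division keeps isolated nodes from producing NaN rows, which matters for sensor graphs with dangling segments.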
[0152] Finally, multiple Dynamic Graph Convolutional Networks (DGCNs) are employed to enable the model to comprehensively learn the high-order local spatial information of the multi-scale dual hypergraph. The specific calculations are as follows:
[0153] $Z_t = \mathrm{DGCN}_t(X)$, $Z_c = \mathrm{DGCN}_c(X)$, $Z_r = \mathrm{DGCN}_r(X)$ (13)
[0154] where $Z_t$, $Z_c$, and $Z_r$ are the high-order spatial embeddings of the travel intention dual hypergraph, the community commuting dual hypergraph, and the regional transmission dual hypergraph, respectively.
[0155] Step 6: Achieve multi-scale fusion of high-order local spatial information through a gating mechanism. As shown in the high-order spatial perception diagram of Figure 3, a gating mechanism is introduced to avoid the overfitting that may occur when the high-order spatial embeddings are aggregated. The specific process is as follows:
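[0156] $Z_{cr} = \mathrm{Sigmoid}(Z_c) \odot \mathrm{Tanh}(Z_r)$, $g_h = \mathrm{Sigmoid}(Z_{cr} + Z_t)$, $Z_h = g_h \odot Z_{cr} + (1 - g_h) \odot Z_t$ (14)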
[0157] where $\odot$ denotes element-wise multiplication, the adaptive weight $g_h$ is obtained through the Sigmoid activation function and takes values between 0 and 1, and $Z_h$ is the final high-order local spatial embedding;
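The gating formulas of this step, as written out in claim 6, translate directly into a few lines of NumPy. The sketch below is illustrative only; the function name `fuse_scales` and the random inputs are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_scales(Z_t, Z_c, Z_r):
    """Gated fusion of the three scale embeddings (formulas as given in claim 6)."""
    Z_cr = sigmoid(Z_c) * np.tanh(Z_r)         # meso and macro scales combined
    g_h = sigmoid(Z_cr + Z_t)                  # adaptive weight in (0, 1)
    Z_h = g_h * Z_cr + (1.0 - g_h) * Z_t       # element-wise convex combination
    return Z_h, Z_cr, g_h

rng = np.random.default_rng(0)
Z_t, Z_c, Z_r = (rng.normal(size=(5, 8)) for _ in range(3))
Z_h, Z_cr, g_h = fuse_scales(Z_t, Z_c, Z_r)
```

Because $g_h$ lies strictly between 0 and 1, every entry of $Z_h$ falls between the corresponding entries of $Z_{cr}$ and $Z_t$.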
[0158] Step 7: Model global spatial dependencies using a multi-head attention mechanism. As shown in the spatial multi-head attention diagram of Figure 4, the input features $X \in \mathbb{R}^{T \times N \times d}$ are first passed through three independent linear transformation layers to generate the query matrix $Q_S$, the key matrix $K_S$, and the value matrix $V_S$. The specific calculations are as follows:
[0159]
[0160] where the three weight matrices are the learnable parameter matrices of the linear layers used to generate the query matrix $Q_S$, the key matrix $K_S$, and the value matrix $V_S$, respectively;
[0161] On this basis, the attention scores between nodes in the spatial dimension are calculated. A scaling factor is introduced to stabilize gradient computation and to prevent the softmax function from saturating when the dot products become excessively large. The attention score matrix is calculated as follows:
[0162]
[0163] where $K_S^{\top}$ denotes the transpose of $K_S$. Because a single attention mechanism has limited expressive power, a multi-head attention mechanism is further introduced to capture feature interactions in different subspaces. Specifically, the outputs of multiple attention heads are concatenated and then linearly transformed to obtain the final output:
[0164]
[0165] where $m$ is the number of attention heads, $W_S$ is the output projection matrix, and Concat() is the concatenation operation;
[0166] Finally, a feature interaction mechanism is designed to enhance the expressive power of local features. This mechanism stabilizes the training process through residual connections and layer normalization, while introducing nonlinear transformations to enhance feature representation, thereby obtaining global spatial features. The calculation process is as follows:
[0167]
[0168] Where ReLU is the activation function and Layernorm is the layer normalization function;
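A minimal NumPy sketch of multi-head attention over the node axis is given below. It assumes the standard scaled dot-product form with a scaling factor equal to the square root of the per-head dimension, and it omits the residual connection, layer normalization, and ReLU of the feature interaction stage. All weight names (`W_q`, `W_k`, `W_v`, `W_o`) and sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_multi_head_attention(X, W_q, W_k, W_v, W_o, num_heads):
    """X: (T, N, d). Attention is computed across the N nodes at each time step."""
    T, N, d = X.shape
    d_head = d // num_heads

    def split(M):                        # (T, N, d) -> (T, heads, N, d_head)
        return M.reshape(T, N, num_heads, d_head).transpose(0, 2, 1, 3)

    Q, K, V = split(X @ W_q), split(X @ W_k), split(X @ W_v)
    scores = Q @ K.transpose(0, 1, 3, 2) / np.sqrt(d_head)   # (T, heads, N, N)
    weights = softmax(scores, axis=-1)   # each row sums to 1 over the nodes
    heads = weights @ V                  # (T, heads, N, d_head)
    concat = heads.transpose(0, 2, 1, 3).reshape(T, N, d)
    return concat @ W_o, weights

rng = np.random.default_rng(0)
T, N, d, m = 3, 5, 8, 2
X = rng.normal(size=(T, N, d))
W_q, W_k, W_v, W_o = (rng.normal(size=(d, d)) * 0.1 for _ in range(4))
out, attn = spatial_multi_head_attention(X, W_q, W_k, W_v, W_o, num_heads=m)
```

Each head produces an N×N weight matrix per time step, so `attn` has shape (T, m, N, N) and the output keeps the input shape.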
[0169] Step 8: Dynamically adjust the weights of the high-order local spatial embedding and the global spatial features using an adaptive spatial fusion gate. As shown in the adaptive spatial fusion gate of Figure 5, an Entangle fusion mechanism is introduced to adaptively balance the high-order local spatial embedding and the global spatial dependencies. This mechanism dynamically adjusts the contributions of the two features through gating weights, yielding final output features that contain both local and global spatial information.
[0170]
[0171] where the adaptive weight $g_s$ is obtained through the Sigmoid activation function and takes values between 0 and 1. The output feature $X_S \in \mathbb{R}^{N \times T \times d}$ preserves high-order local spatial details while incorporating global spatial dependencies, providing a richer spatial feature representation for the subsequent traffic flow prediction.
[0172] Step 9: Extract short-term traffic flow fluctuation patterns using dynamic temporally dilated causal convolution. As shown in the time-aware enhancement diagram of Figure 6, given the multi-scale spatially encoded feature sequence obtained in step 8 and a filter, the dilated causal convolution at time step $t$ is calculated as follows:
[0173]
[0174] where $u$ is the dilation factor. It controls the skip distance and therefore the receptive field of the model in the time dimension, making it easier to learn the short-term temporal dependencies of traffic flow data.
[0175] Simultaneously, a gating mechanism is employed to control the output of the information flow. This utilizes gated activation units to extract short-term features from the output of the dilated causal convolutional layer in parallel, reducing the impact of external noise. Specifically:
[0176] $Z_T = \mathrm{Sigmoid}(\Phi_1 * X_S + a) \odot \mathrm{Tanh}(\Phi_2 * X_S + b)$ (21)
[0177] Here, $\Phi_1$ and $\Phi_2$ are time-independent one-dimensional dilated causal convolution operations, $a$ and $b$ are learnable bias parameters, Sigmoid() is the activation function that determines the proportion of information passed to the next layer, Tanh() is the activation function that performs the nonlinear transformation, and $Z_T \in \mathbb{R}^{N \times T \times d'}$ is the time-aware enhanced feature representation.
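The gated unit of formula (21) can be sketched for a single node and feature channel as follows. The dilated causal convolution is assumed to take the common form $y_t = \sum_s f(s)\, x_{t - u \cdot s}$ with zero padding on the left, which may differ in indexing from the convolution formula of this step. The kernels `phi1` and `phi2` and the input series are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dilated_causal_conv(x, kernel, dilation):
    """y[t] = sum_s kernel[s] * x[t - dilation * s], with zeros before t = 0."""
    T, k = len(x), len(kernel)
    pad = dilation * (k - 1)
    xp = np.concatenate([np.zeros(pad), x])      # left padding keeps the filter causal
    y = np.zeros(T)
    for s in range(k):
        start = pad - dilation * s
        y += kernel[s] * xp[start:start + T]
    return y

def gated_temporal_unit(x, phi1, phi2, a, b, dilation):
    """Z_T = Sigmoid(phi1 * x + a) ⊙ Tanh(phi2 * x + b), as in formula (21)."""
    gate = sigmoid(dilated_causal_conv(x, phi1, dilation) + a)
    signal = np.tanh(dilated_causal_conv(x, phi2, dilation) + b)
    return gate * signal

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = dilated_causal_conv(x, np.array([1.0, 1.0]), dilation=2)   # x[t] + x[t - 2]
z = gated_temporal_unit(x, np.array([0.5, -0.5]), np.array([0.2, 0.1]),
                        a=0.0, b=0.0, dilation=2)
```

With a kernel of size 2 and dilation 2, each output depends only on the current value and the value two steps earlier, so no future time step leaks into the result.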
[0178] Step 10: Capture the long-term trend features of traffic flow data using a temporal multi-head attention mechanism. As shown in the temporal multi-head attention diagram of Figure 7, the data is first mapped through three independent linear transformation layers to the query matrix $Q_T$, the key matrix $K_T$, and the value matrix $V_T$. The specific definition is as follows:
[0179]
[0180] where the three weight matrices are the learnable parameter matrices used to generate the query matrix $Q_T$, the key matrix $K_T$, and the value matrix $V_T$, respectively;
[0181] On this basis, the attention scores in the time dimension are calculated. A scaling factor is introduced to stabilize gradient computation and to prevent the softmax function from saturating when the dot products become excessively large. The attention score matrix is calculated as follows:
[0182]
[0183] where $K_T^{\top}$ denotes the transpose of $K_T$. Because a single attention mechanism has limited expressive power, a multi-head attention mechanism is further introduced to capture feature interactions in different subspaces. Specifically, the outputs of multiple attention heads are concatenated and then linearly transformed to obtain the final output:
[0184]
[0185] where $m$ is the number of attention heads, $W_T$ is a linear transformation matrix that maps the output of the temporal multi-head attention to the desired output dimension, and Concat() is the concatenation operation.
[0186] Finally, a feature interaction mechanism is designed to enhance the expressive power of local features and address the problem of the multi-head attention mechanism's insensitivity to short-term temporal changes, thereby obtaining long-term temporal dependent features. The calculation process is as follows:
[0187]
[0188] where ReLU() is the activation function and Layernorm() is the layer normalization function.
[0189] Step 11: Integrate the short-term dynamics and long-term temporal dependencies of historical traffic flow data through an adaptive temporal fusion gate. As shown in the adaptive temporal fusion gate of Figure 8, an Entangle fusion mechanism is introduced to dynamically adjust, through gating weights, the contributions of the short-term dynamics and the long-term temporal dependencies of historical traffic flow data, yielding a spatiotemporal feature representation that combines multi-scale spatial information with long-term and short-term temporal dependencies.
[0190]
[0191] where the adaptive weight $g_t$ is obtained through the Sigmoid activation function and takes values between 0 and 1.
[0192] The steps in S4 specifically include the following:
[0193] As shown in the model output schematic of Figure 9, a skip connection mechanism is used to fuse the multi-scale spatiotemporal feature representations output by each layer. The calculation process is as follows:
[0194]
[0195] where $W_a$ and $W_b$ are weight parameters, $M$ is the number of stacked spatiotemporal perception layers, and Layernorm() is the layer normalization function. Finally, a 1×1 convolutional layer maps the fused features to the prediction result.
[0196]
[0197] where the output is the predicted traffic flow sequence for the next $P$ time steps.
[0198] The present invention provides a spatiotemporal perception method based on multi-scale dual hypergraph fusion. First, a multi-scale urban road dual hypergraph construction method is proposed that represents the high-order spatial features of urban roads at three scales: micro-level individual travel intentions, meso-level community commuting, and macro-level regional traffic transmission. Specifically, the travel intention dual hypergraph $G_t$ captures micro-scale spatial relationships by analyzing individual travel intentions; the functional community dual hypergraph $G_c$ builds meso-scale functional synergy relationships based on community partitioning; and the regional transmission dual hypergraph $G_r$, built from the transmission patterns of road traffic flow, depicts large-scale spatial interactions at the macro scale. On this basis, a gating mechanism integrates the high-order spatial features of the micro, meso, and macro scales, as shown in formula (14). This enhances the traffic flow prediction model's ability to model complex spatiotemporal dependencies and thereby to capture dynamic changes in traffic flow. For example, when predicting traffic flow in a commercial area during the morning rush hour, the micro-scale $G_t$ captures the commuting behavior of individuals in the surrounding residential areas, the meso-scale $G_c$ reflects the collaborative relationship between the business district and adjacent office areas, and the macro-scale $G_r$ characterizes the traffic transmission effect of the city's main roads on the area. By integrating the three, the model can more accurately predict the traffic flow trend of the business district.
[0199] To fully verify the effectiveness and adaptability of the model under different traffic scenarios, this invention selects 14 representative traffic flow prediction methods as comparison baseline methods as shown in Table 2, and conducts comparative experiments on four datasets: PEMS03, PEMS04, PEMS07 and PEMS08.
[0200] Table 2
[0201]
| Model abbreviation | Year of publication | Description |
| --- | --- | --- |
| FC-LSTM | 2015 | Fully connected LSTM architecture, focusing on time series modeling |
| DCRNN | 2017 | Bidirectional random walk and encoder-decoder framework |
| STGCN | 2018 | Fully convolutional graph structure for efficient spatiotemporal modeling |
| Graph WaveNet | 2019 | Adaptive dependency matrix and dilated convolution |
| STSGCN | 2020 | Synchronous spatiotemporal modeling and handling of time-series heterogeneity |
| OGCRNN | 2020 | Data-driven graph structure optimization |
| STGODE | 2021 | Tensor differential equations and semantic adjacency matrices |
| STFGNN | 2021 | Spatiotemporal graph fusion and gated convolution |
| GMSDR | 2022 | Explicit modeling of multi-step dependencies |
| FEDformer | 2022 | Frequency domain enhancement and seasonal trend decomposition |
| HSTGCNT | 2023 | Hierarchical Transformer and hypergraph convolution fusion |
| AGFCRN | 2024 | Adaptive hypergraph fusion and attention mechanism |
| DS-STGCN | 2024 | Dynamic multi-scale spatiotemporal modeling |
| CDAGF | 2024 | Contrastive learning and residual enhancement decomposition |
[0202] To ensure the reliability and fairness of the comparison results, the experiment followed the publicly available optimal parameter configurations of each comparison model, and the original data was divided into training, validation, and test sets in a 6:2:2 ratio. Model performance was evaluated using three mainstream metrics: Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE), where smaller values for each metric indicate higher model prediction accuracy.
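The evaluation protocol described above can be sketched as follows. The sketch assumes a chronological (unshuffled) 6:2:2 split and a MAPE that skips targets equal to zero; both are common conventions for PEMS-style data but are assumptions here, and the function names and sample values are hypothetical.

```python
import numpy as np

def chronological_split(data, ratios=(0.6, 0.2, 0.2)):
    """Split along the first (time) axis into train / validation / test parts."""
    n = len(data)
    n_train = int(round(n * ratios[0]))
    n_val = int(round(n * ratios[1]))
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]

def mae(y, y_hat):
    return float(np.mean(np.abs(y - y_hat)))

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mape(y, y_hat, eps=1e-8):
    """Mean absolute percentage error in percent, ignoring near-zero targets."""
    mask = np.abs(y) > eps
    return float(np.mean(np.abs((y[mask] - y_hat[mask]) / y[mask])) * 100.0)

series = np.arange(100)
train, val, test = chronological_split(series)

y_true = np.array([100.0, 200.0, 0.0, 400.0])
y_pred = np.array([110.0, 180.0, 5.0, 400.0])
scores = (mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```

For all three metrics a smaller value indicates a more accurate prediction, which is why improvements are reported as relative reductions.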
[0203] Table 3 presents the comparative experimental results of each model on the PEMS03, PEMS04, PEMS07, and PEMS08 datasets for predicting traffic flow 60 minutes ahead. The proposed model uses a hypergraph convolution mechanism to integrate multi-scale high-order information (individual travel intentions, community commuting patterns, and regional traffic transmission) into the modeling process, characterizing the spatial correlations in complex urban traffic. The experimental results show that the proposed model achieves the best prediction performance on all four datasets. On PEMS03, for example, the proposed model achieves an MAE of 14.12, an RMSE of 22.52, and a MAPE of 12.95, improvements of 11.9% (MAE), 14.7% (RMSE), and 10.7% (MAPE) over the second-best baseline model. The same trend holds on the other datasets, further demonstrating the superiority, adaptability, and robustness of the proposed method across traffic scenarios. Moreover, because the hypergraph convolution structure can effectively capture high-order dependencies in complex traffic networks, the proposed model performs even better on large-scale datasets (such as PEMS03 and PEMS07) than on small-scale datasets (such as PEMS04 and PEMS08), further demonstrating its advantages in processing complex, large-scale traffic data.
[0204] Table 3
[0205]
[0206] The above description is merely a preferred embodiment of the present invention. It should be understood that the present invention is not limited to the forms disclosed herein and should not be construed as excluding other embodiments. It can be used in various other combinations, modifications, and improvements, and can be altered within the scope of the concept described herein through the above teachings or related technologies or knowledge. Modifications and variations made by those skilled in the art that do not depart from the spirit and scope of the present invention should be within the protection scope of the appended claims.
Claims
1. A traffic flow prediction method based on multi-scale dual hypergraph fusion, characterized in that the prediction method includes:
S1. Based on the actual traffic network topology, construct a travel intention dual hypergraph, a functional community dual hypergraph, and a regional transmission dual hypergraph at three scales (micro-level individual travel intention, meso-level community commuting, and macro-level regional traffic transmission), thereby constructing a multi-scale dual hypergraph of urban roads.
S2. Based on the three-scale dual hypergraphs constructed in step S1 and the historical traffic flow feature matrix, first extract multi-scale spatial features step by step using hypergraph convolution and dynamic graph convolution, and achieve multi-scale fusion of high-order local spatial information through a gating mechanism; then model global spatial dependencies with a spatial multi-head attention mechanism, and introduce an adaptive spatial fusion gate to dynamically adjust the weights of the high-order local spatial embedding and the global spatial features, finally generating a multi-scale spatially encoded feature sequence.
S3. Input the multi-scale spatially encoded feature sequence obtained in step S2 into a dynamic temporally dilated causal convolutional network to extract short-term fluctuation patterns of traffic flow; at the same time, use a temporal multi-head attention mechanism to capture the long-term trend features of traffic flow data; then adaptively integrate the short-term dynamics and long-term temporal dependencies of historical traffic flow data through an adaptive temporal fusion gate, forming a spatiotemporal feature representation that combines multi-scale spatial information with long-term and short-term temporal dependencies.
S4. Based on the spatiotemporal feature representation obtained in step S3, enhance the spatiotemporal feature fusion through residual connections and input the result into a convolutional prediction layer, finally outputting the multi-step traffic flow prediction result.
2. The traffic flow prediction method based on multi-scale dual hypergraph fusion according to claim 1, characterized in that the actual traffic network topology in step S1 is as follows: the actual traffic network topology is modeled as a weighted directed graph $G = (V, \varepsilon, A)$, where the node set $V = \{v_1, v_2, \ldots, v_N\}$ represents $N$ road segments, i.e., $|V| = N$; the edge set $\varepsilon = \{e_1, e_2, \ldots, e_E\}$ describes the connection relationships of $E$ roads, i.e., $|\varepsilon| = E$; and the adjacency matrix $A \in \mathbb{R}^{N \times N}$ encodes the spatial distance weights between road segments; furthermore, the historical traffic flow feature matrix over $T$ time steps is $X \in \mathbb{R}^{T \times N \times d}$, where $d$ is the traffic feature dimension of each road node.
3. The traffic flow prediction method based on multi-scale dual hypergraph fusion according to claim 1, characterized in that constructing the travel intention dual hypergraph in step S1 includes: hyperedges are formed by all nodes starting from the same road node and by all nodes ending at the same road node; the travel intention dual hypergraph is represented as $G_t = (V_t, \varepsilon_t, H_t)$, and its adjacency matrix $H_t$ is defined as follows: where the element in the i-th row and j-th column of the adjacency matrix $H_t$ relates a road node to a hyperedge. The adjacency matrix $H_t$ of the travel intention dual hypergraph can be divided into two parts: one represents the hyperedge connections built from the starting point, and the other represents the hyperedge connections built from the destination; together, they constitute the complete topological structure of the travel intention dual hypergraph.
4. The traffic flow prediction method based on multi-scale dual hypergraph fusion according to claim 1, characterized in that constructing the functional community dual hypergraph in step S1 includes: first, the K road nodes with the highest destination overlap are selected to form the destination hyperedge; then, the K road nodes with the highest origin overlap are selected to form the origin hyperedge. During node selection, the total spatial distance between the nodes within a hyperedge is constrained to be the shortest, which yields the dual hyperedges of the functional community dual hypergraph. To quantify the functional similarity between nodes, a new functional association matrix $A_c$ is reconstructed from the original weighted adjacency matrix $A$, calculated as follows: where the element in the i-th row and j-th column of the functional association matrix $A_c$ takes a value that is negatively correlated with the functional similarity between nodes $v_i$ and $v_j$, and $N$ is the number of nodes; on this basis, the K-nearest neighbor algorithm is applied to $A_c$, and the C most similar neighbors of each node $i$ are selected to form a functional community. Finally, the functional community dual hypergraph $G_c = (V_c, \varepsilon_c, H_c)$ is constructed, and its adjacency matrix $H_c$ is defined as follows: where the element in the i-th row and j-th column of the adjacency matrix $H_c$ relates a road node to a hyperedge. The adjacency matrix $H_c$ of the functional community dual hypergraph can be divided into two parts: one represents all hyperedge connections built from the overlap of starting nodes, and the other represents the hyperedge connections built from the overlap of arriving nodes; together, they constitute the complete topological structure of the functional community dual hypergraph.
5. The traffic flow prediction method based on multi-scale dual hypergraph fusion according to claim 1, characterized in that constructing the regional transmission dual hypergraph in step S1 includes: first, the K most influential neighbor nodes are selected with the Top-K sampling method to construct the node influence adjacency matrix $A_r$, calculated as follows: $\mathrm{idx} = \text{Top-K}(-A_{i,:}, K)$, $\mathrm{idy} = \text{Top-K}(-A_{:,j}, K)$, $A_r = A[\mathrm{idx}, \mathrm{idy}]$, where $\mathrm{idx}, \mathrm{idy} \in \mathbb{R}^{N \times K}$ denote the K most influential neighbor nodes selected based on the starting point and the K most influential neighbor nodes selected based on the ending point, respectively. Then, the set of Top-K influential neighbors of each node is taken as a hyperedge, and the regional transmission dual hypergraph $G_r = (V_r, \varepsilon_r, H_r)$ is constructed from the node influence matrix $A_r$; its adjacency matrix $H_r$ is defined as follows: where the element in the i-th row and j-th column of the adjacency matrix $H_r$ relates a road node to a hyperedge. The adjacency matrix $H_r$ of the regional transmission dual hypergraph can be divided into two parts, which correspond to the hyperedge relationships of the starting points and of the ending points, respectively; together, they constitute the complete topological structure of the regional transmission dual hypergraph.
6. The traffic flow prediction method based on multi-scale dual hypergraph fusion according to claim 1, characterized in that, In step S2, hypergraph convolution and dynamic graph convolution are used to gradually extract multi-scale spatial features, and a gating mechanism is used to achieve multi-scale fusion of high-order local spatial information, as detailed below: First, construct the multi-scale hyperedge feature matrix X. h The original traffic flow features X∈R T×N×d The process of integrating into the constructed multi-scale dual hypergraph can be formally expressed as follows: Where [·; ·] is the concatenation operation, W1, W2∈R T×d×d′ These are learnable parameters used to perform a linear transformation on the original traffic flow features X, where d′ is the dimension of the hyperedge features. After applying these parameters to X, the result is W1X, W2X∈R. N×d′ Therefore, (W1X)[ind src , :] is the hyperedge feature slice tensor constructed based on the hyperedge relationship at the starting point; (W2X)[ind dst , :] is the hyperedge feature slice tensor constructed based on the endpoint hyperedge relationship; next, the interaction relationships between hypernodes are further aggregated through convolution operations: X h ′=Conv 1×1 (X h )∈R E×d″ , Among them, Conv 1×1 (·) indicates that a 1×1 convolution kernel is used for convolution operation to update the features of the supernode, where d″ is the feature dimension of the supernode; to enhance the adaptability of the model, the diagonal adaptive learning parameters of the dual hypergraph are defined as follows: W adp =diag(L adp )∈R N×N , Among them, L adp ∈R N These are the weight vectors used for adaptive learning of hyperedges; based on this, the adaptive adjacency matrix of the dual hypergraph is defined as follows: Among them, D hv It is a hyper-diagonal matrix, D he H is the diagonal matrix of the degree of the supernodes, and H is the adjacency matrix of the dual hypergraph. 
Subsequently, the adaptive adjacency matrix $W_h$ is input into a hypergraph convolutional network to aggregate the high-order information of the hyperedges. The calculation is as follows: [formula omitted in source], where $\Theta = [\Theta_0, \ldots, \Theta_{N-1}] \in \mathbb{R}^{N \times d''}$ are the parameters to be learned. Further, the Reshape(·) function is used to map the output of the hypergraph convolutional network into a sparse matrix, calculated as follows: [formula omitted in source]. Based on this, the reverse correlation matrix is constructed, and a dynamic graph convolutional network is used to aggregate the high-order local information of the traffic flow, as follows: [formula omitted in source], where $A_f = A / \mathrm{rowsum}(A)$ is the row-normalized adjacency matrix, $A_b$ is the column-normalized adjacency matrix, $A_{adp}$ is the adaptive adjacency matrix, $E_1, E_2 \in \mathbb{R}^{N \times c}$ are learnable parameters, and $\Theta'_f = [\Theta'_{0,f}, \ldots, \Theta'_{N-1,f}] \in \mathbb{R}^{N \times d \times 1}$, $\Theta'_b = [\Theta'_{0,b}, \ldots, \Theta'_{N-1,b}] \in \mathbb{R}^{N \times d \times 1}$, and $\Theta'_{adp} = [\Theta'_{0,adp}, \ldots, \Theta'_{N-1,adp}] \in \mathbb{R}^{N \times d \times 1}$ are learnable parameters. Then, multiple dynamic graph convolutional networks (DGCNs) are used so that the model comprehensively learns the high-order local spatial information of the multi-scale dual hypergraph. The specific calculation is as follows: $Z_t = \mathrm{DGCN}_t(X)$, $Z_c = \mathrm{DGCN}_c(X)$, $Z_r = \mathrm{DGCN}_r(X)$, where $Z_t$, $Z_c$, and $Z_r$ are the high-order spatial embeddings of the travel intention dual hypergraph, the community commuting dual hypergraph, and the regional transmission dual hypergraph, respectively. Finally, a gating mechanism is introduced to avoid overfitting when the high-order spatial embeddings are aggregated. The specific process is as follows: $Z_{cr} = \mathrm{Sigmoid}(Z_c) \odot \mathrm{Tanh}(Z_r)$, $g_h = \mathrm{Sigmoid}(Z_{cr} + Z_t)$, $Z_h = g_h \odot Z_{cr} + (1 - g_h) \odot Z_t$, where $\odot$ denotes element-wise multiplication; the adaptive weight $g_h$ is obtained through the Sigmoid activation function and takes values between 0 and 1; $Z_{cr}$ is the high-order spatial embedding of the community commuting dual hypergraph and the regional transmission dual hypergraph; and $Z_h$ is the final high-order local spatial embedding.
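The gating step at the end of this claim is fully specified by its three formulas, so it can be sketched directly. The following NumPy snippet is a minimal illustration, not the patented implementation: the function name `gated_multiscale_fusion`, the toy shapes, and the random inputs are assumptions made for the example, and the DGCN outputs $Z_t$, $Z_c$, $Z_r$ are simply stand-in arrays.

```python
import numpy as np

def sigmoid(x):
    """Logistic function; maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def gated_multiscale_fusion(z_t, z_c, z_r):
    """Fuse the three scale-specific embeddings with the gate stated in the claim."""
    z_cr = sigmoid(z_c) * np.tanh(z_r)      # Z_cr = Sigmoid(Z_c) ⊙ Tanh(Z_r)
    g_h = sigmoid(z_cr + z_t)               # g_h  = Sigmoid(Z_cr + Z_t)
    z_h = g_h * z_cr + (1.0 - g_h) * z_t    # Z_h  = g_h ⊙ Z_cr + (1 - g_h) ⊙ Z_t
    return z_h, g_h, z_cr

# Hypothetical stand-ins for the DGCN outputs: 4 nodes, 3 feature channels.
rng = np.random.default_rng(0)
z_t, z_c, z_r = (rng.standard_normal((4, 3)) for _ in range(3))
z_h, g_h, z_cr = gated_multiscale_fusion(z_t, z_c, z_r)
print(z_h.shape)
```

Because $g_h$ lies strictly between 0 and 1, each entry of $Z_h$ is a convex combination of the corresponding entries of $Z_{cr}$ and $Z_t$, so the fused embedding never leaves the range spanned by its two inputs.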
7. The traffic flow prediction method based on multi-scale dual hypergraph fusion according to claim 1, characterized in that, in step S2, the global spatial dependencies are modeled using a spatial multi-head attention mechanism, as detailed below: First, the input features $X \in \mathbb{R}^{T \times N \times d}$ are passed through three independent linear transformation layers to generate the query matrix $Q_S$, the key matrix $K_S$, and the value matrix $V_S$. The specific calculations are as follows: [formula omitted in source], where the three weight matrices are the learnable parameter matrices of the linear layers used to generate the query matrix $Q_S$, the key matrix $K_S$, and the value matrix $V_S$, respectively. Based on this, the attention scores between nodes in the spatial dimension are calculated with a scaling factor, which stabilizes gradient computation and prevents the softmax function from saturating when the dot products become excessively large. The attention score matrix is calculated as follows: [formula omitted in source], where $K_S^{\top}$ denotes the transpose of $K_S$. Because the expressive power of a single attention mechanism is limited, a multi-head attention mechanism is further introduced to capture feature interactions in different subspaces. Specifically, the outputs of multiple attention heads are concatenated and then linearly transformed to obtain the final output: [formula omitted in source], where $m$ is the number of attention heads, $W_s$ is the output projection matrix, and Concat(·) is the concatenation operation. Finally, a feature interaction mechanism is designed to enhance the expressive power of local features. This mechanism stabilizes training through residual connections and layer normalization, and introduces nonlinear transformations to enhance the feature representation, thereby obtaining the global spatial features. The calculation is as follows: [formula omitted in source], where ReLU is the activation function and Layernorm is the layer normalization function.
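The attention pipeline described in this claim (three linear maps, scaled dot-product scores passed through softmax, concatenation of $m$ heads, and an output projection) can be sketched as follows. This is a hedged NumPy illustration that assumes the standard scaled dot-product form with scaling factor $\sqrt{d_{head}}$; the names `spatial_multihead_attention`, `w_q`, `w_k`, `w_v`, `w_o`, and the toy dimensions are assumptions of the example, and the final feature interaction step (residual connection, layer normalization, ReLU) is left out because its formula is not available in the text.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_multihead_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Attention across the N nodes, computed independently at each time step.

    x: (T, N, d); w_q, w_k, w_v, w_o: (d, d); d must be divisible by num_heads.
    """
    t, n, d = x.shape
    d_head = d // num_heads

    def split_heads(a):  # (T, N, d) -> (T, m, N, d_head)
        return a.reshape(t, n, num_heads, d_head).transpose(0, 2, 1, 3)

    q, k, v = split_heads(x @ w_q), split_heads(x @ w_k), split_heads(x @ w_v)
    # Scaled dot product between every pair of nodes: (T, m, N, N).
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(d_head)
    attn = softmax(scores, axis=-1)
    heads = attn @ v                                         # (T, m, N, d_head)
    concat = heads.transpose(0, 2, 1, 3).reshape(t, n, d)    # Concat over heads
    return concat @ w_o, attn                                # output projection

# Hypothetical toy setup: T=2 time steps, N=5 nodes, d=8 features, m=2 heads.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 5, 8))
w_q, w_k, w_v, w_o = (rng.standard_normal((8, 8)) * 0.1 for _ in range(4))
out, attn = spatial_multihead_attention(x, w_q, w_k, w_v, w_o, num_heads=2)
print(out.shape, attn.shape)
```

Each row of `attn` sums to 1, so every node's output is a weighted average of the value vectors of all nodes, which is what lets the mechanism capture dependencies between nodes that are not adjacent in the road network.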
8. The traffic flow prediction method based on multi-scale dual hypergraph fusion according to claim 1, characterized in that, in step S2, an adaptive spatial fusion gate is introduced to dynamically adjust the weights of the high-order local spatial embedding and the global spatial features, as detailed below: An Entangle fusion mechanism is introduced to adaptively balance the high-order local spatial embedding and the global spatial dependencies. This mechanism dynamically adjusts the contribution of the two types of features through gating weights, thereby obtaining final output features that contain both local and global spatial information, wherein the adaptive weight $g_s$ is obtained through the Sigmoid activation function and takes values between 0 and 1.
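The effect of a gate weight in (0, 1) on two feature sets can be seen with a small numerical example. The sketch below assumes the convex-combination form $g_s \odot Z_{local} + (1 - g_s) \odot Z_{global}$, by analogy with the gate in claim 6; the claim text does not state how the input to the Sigmoid is computed, so the gate logits are passed in as a plain argument, and all names (`gate_fuse`, `local_feat`, `global_feat`) are hypothetical.

```python
import numpy as np

def gate_fuse(local_feat, global_feat, gate_logits):
    """Blend two feature arrays with a Sigmoid gate (assumed convex-combination form)."""
    g = 1.0 / (1.0 + np.exp(-gate_logits))   # adaptive weight in (0, 1)
    return g * local_feat + (1.0 - g) * global_feat

local_feat = np.array([[1.0, 2.0], [3.0, 4.0]])
global_feat = np.array([[10.0, 20.0], [30.0, 40.0]])

# Zero logits give g = 0.5: an equal blend of both feature sets.
balanced = gate_fuse(local_feat, global_feat, np.zeros((2, 2)))
# Large positive logits push g toward 1: the output follows the local features.
mostly_local = gate_fuse(local_feat, global_feat, np.full((2, 2), 20.0))
# Large negative logits push g toward 0: the output follows the global features.
mostly_global = gate_fuse(local_feat, global_feat, np.full((2, 2), -20.0))
print(balanced)
```

Since the gate is applied element-wise, the model can favor local information for some nodes or channels and global information for others within the same forward pass.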
9. The traffic flow prediction method based on multi-scale dual hypergraph fusion according to claim 1, characterized in that step S3 includes the following:
(3.1) Short-term fluctuation patterns of the traffic flow are extracted through dynamic temporal dilated causal convolution, specifically: Given the multi-scale spatially encoded feature sequence $X_S$ obtained in step S2 and a filter, the dilated causal convolution at time step $t$ is calculated as follows: [formula omitted in source], where $u$ is the dilation factor, which controls the skip distance and thereby the receptive field of the model in the time dimension, making it easier to learn the short-term temporal dependencies of traffic flow data. At the same time, a gating mechanism is employed to control the output of the information flow: gated activation units extract short-term features from the output of the dilated causal convolutional layer in parallel, reducing the impact of external noise. Specifically: $Z_T = \mathrm{Sigmoid}(\Phi_1 * X_S + a) \odot \mathrm{Tanh}(\Phi_2 * X_S + b)$, where $\Phi_1$ and $\Phi_2$ are independent one-dimensional dilated causal convolution operations along the time dimension, $a$ and $b$ are learnable bias parameters, Sigmoid(·) is the activation function that determines the proportion of information passed to the next layer, and Tanh(·) is the activation function that performs the nonlinear transformation. $Z_T \in \mathbb{R}^{N \times T \times d'}$ is the feature representation with enhanced temporal awareness.
(3.2) The long-term trend features of the traffic flow data are captured by a temporal multi-head attention mechanism, specifically: First, the query matrix $Q_T$, the key matrix $K_T$, and the value matrix $V_T$ are obtained through three independent linear transformation layers. The specific definition is as follows: [formula omitted in source], where the three weight matrices are the learnable parameter matrices used to generate the query matrix $Q_T$, the key matrix $K_T$, and the value matrix $V_T$, respectively. Based on this, the attention scores between nodes in the time dimension are calculated with a scaling factor.
To stabilize gradient computation and prevent the softmax function from saturating when the dot products become excessively large, the attention score matrix is calculated as follows: [formula omitted in source], where $K_T^{\top}$ denotes the transpose of $K_T$. Because the expressive power of a single attention mechanism is limited, a multi-head attention mechanism is further introduced to capture feature interactions in different subspaces. Specifically, the outputs of multiple attention heads are concatenated and then linearly transformed to obtain the final output: [formula omitted in source], where $m$ is the number of attention heads, $W_T$ is a linear transformation matrix that maps the output of the temporal multi-head attention to the desired output dimension, and Concat(·) is the concatenation operation. Finally, a feature interaction mechanism is designed to enhance the expressive power of local features and to compensate for the insensitivity of the multi-head attention mechanism to short-term temporal changes, thereby obtaining the long-term temporal dependency features. The calculation is as follows: [formula omitted in source], where ReLU(·) is the activation function and Layernorm(·) is the layer normalization function.
(3.3) The short-term dynamics and long-term temporal dependencies of the historical traffic flow data are integrated through an adaptive temporal fusion gate, as follows: An Entangle fusion mechanism is introduced to dynamically adjust, through gating weights, the contributions of the short-term dynamics and the long-term temporal dependencies of the historical traffic flow data, thereby obtaining a spatiotemporal feature representation that combines multi-scale spatial information with long-term and short-term temporal dependencies, wherein the adaptive weight $g_t$ is obtained through the Sigmoid activation function and takes values between 0 and 1.
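The gated dilated causal convolution of step (3.1) can be sketched on a single scalar time series. The snippet assumes the commonly used definition $y[t] = \sum_k f[k]\, x[t - u \cdot k]$ with zero padding on the left, since the claim's own convolution formula is not available in the text; the function names, kernels, and toy series are illustrative assumptions, while the gate follows the stated form $Z_T = \mathrm{Sigmoid}(\Phi_1 * X_S + a) \odot \mathrm{Tanh}(\Phi_2 * X_S + b)$.

```python
import numpy as np

def dilated_causal_conv1d(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - dilation * k]; inputs before t = 0 count as zero."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for k, w in enumerate(kernel):
        shift = dilation * k
        if shift < len(x):
            # Output at time t only receives x[t - shift], never a future value.
            y[shift:] += w * x[: len(x) - shift]
    return y

def gated_temporal_unit(x, kernel_1, kernel_2, dilation, a=0.0, b=0.0):
    """Sigmoid branch decides how much of the Tanh branch passes through."""
    gate = 1.0 / (1.0 + np.exp(-(dilated_causal_conv1d(x, kernel_1, dilation) + a)))
    signal = np.tanh(dilated_causal_conv1d(x, kernel_2, dilation) + b)
    return gate * signal

# Hypothetical toy series with 6 time steps.
x = np.arange(6, dtype=float)
# With kernel [1, 1] and dilation u = 2, each output is x[t] + x[t - 2].
y = dilated_causal_conv1d(x, kernel=[1.0, 1.0], dilation=2)
z = gated_temporal_unit(x, [0.5, -0.5], [0.3, 0.2], dilation=2)
print(y)
```

Increasing the dilation factor widens the span of past time steps that a fixed-size kernel covers, without adding parameters and without ever reading future values.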
10. The traffic flow prediction method based on multi-scale dual hypergraph fusion according to claim 1, characterized in that step S4 specifically includes the following: First, the multi-scale spatiotemporal feature representations output by each layer are fused through a skip connection mechanism. The calculation is as follows: [formula omitted in source], where $W_a$ and $W_b$ [definition omitted in source], $M$ is the number of network layers, and Layernorm(·) is the layer normalization function. Finally, a 1×1 convolutional layer maps the fused features to the prediction result: [formula omitted in source], where the output is the traffic flow prediction sequence for the next $P$ time steps.
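A rough sketch of this output stage follows, under stated assumptions. The skip connection is simplified here to a plain sum of the $M$ layer outputs followed by layer normalization, because the roles of $W_a$ and $W_b$ cannot be recovered from the text. The 1×1 convolution is written as a matrix product over the channel dimension, which is what a 1×1 convolution computes at each position. All names and shapes (`skip_fuse_and_predict`, `w_out`, `b_out`, $M = 3$, $N = 4$, 8 channels, $P = 12$) are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize the last axis to zero mean and (approximately) unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def skip_fuse_and_predict(layer_outputs, w_out, b_out):
    """Sum per-layer features (simplified skip connection), normalize, then project.

    layer_outputs: list of M arrays of shape (N, C); w_out: (C, P); b_out: (P,).
    """
    fused = layer_norm(np.sum(layer_outputs, axis=0))   # (N, C)
    # A 1x1 convolution acts as the same linear map over channels at every node.
    prediction = fused @ w_out + b_out                  # (N, P)
    return prediction, fused

# Hypothetical toy setup: M=3 layers, N=4 nodes, C=8 channels, P=12 future steps.
rng = np.random.default_rng(0)
layer_outputs = [rng.standard_normal((4, 8)) for _ in range(3)]
w_out = rng.standard_normal((8, 12)) * 0.1
b_out = np.zeros(12)
prediction, fused = skip_fuse_and_predict(layer_outputs, w_out, b_out)
print(prediction.shape)
```

Mapping the channel dimension directly to $P$ outputs produces all future time steps in one pass rather than feeding each prediction back in as input for the next.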