An algorithm for highway event detection based on data situation analysis

By using an algorithm based on data situation analysis, multi-source traffic data is integrated for graph modeling and fuzzy label training. Graph neural networks are then used for prediction and abnormal subgraph detection. This solves the problem of insufficient multi-source data integration in highway event detection, enabling accurate identification and risk assessment of traffic emergencies and improving detection accuracy and adaptability.

CN121122007B | Active | Publication Date: 2026-06-19 | YUNNAN XUANHUI EXPRESSWAY CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
YUNNAN XUANHUI EXPRESSWAY CO LTD
Filing Date
2025-08-05
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing highway incident detection technologies suffer from insufficient fusion of multi-source heterogeneous data and a lack of spatiotemporal dynamic modeling capabilities, resulting in limited detection accuracy and coverage, making it difficult to achieve accurate perception and rapid identification of traffic emergencies.

Method used

An algorithm based on data situation analysis builds a dynamic graph structure from multi-source traffic data through unified graph modeling, then applies fuzzy label-guided training, graph neural network prediction, and abnormal subgraph detection and scoring. This enables intelligent perception and risk assessment of traffic conditions.

Benefits of technology

It significantly improves the accuracy and coverage of traffic incident detection, supports the identification of structured abnormal areas, enables accurate prediction and risk assessment of traffic conditions, reduces false alarm rate, and has good scalability and engineering applicability.

✦ Generated by Eureka AI based on patent content.

Figure CN121122007B_ABST

Abstract

This invention discloses a highway event detection algorithm based on data situation analysis, belonging to the field of intelligent transportation system technology. It constructs a dynamic graph structure by fusing multi-source data such as ETC gantries, toll stations, traffic detectors, and weather data. It then uses graph convolution and LSTM to extract spatiotemporal features, achieving fuzzy distribution prediction of traffic conditions. Based on the prediction results, it calculates the node score mutation rate, combines the probability of severe congestion to screen abnormal nodes, and extracts connected and structurally compact abnormal subgraphs as potential event regions. A multi-factor event scoring function is designed, fusing mutation intensity, structural density, and state offset direction to quantify risk levels. An adaptive boundary learner is introduced to dynamically determine alarms, avoiding the limitations of fixed threshold scenarios. The algorithm outputs structured event information, improving detection accuracy and alarm flexibility, and is suitable for intelligent perception and early warning of highway traffic events.

Description

Technical Field

[0001] This invention belongs to the field of intelligent transportation systems technology, specifically relating to an algorithm for highway event detection based on data situation analysis.

Background Technology

[0002] With the rapid expansion of the expressway network and the continuous growth of motor vehicle ownership, the road traffic safety situation is becoming increasingly severe. Traffic congestion, traffic accidents, severe weather, and other emergencies not only seriously threaten driving safety but also significantly reduce road traffic efficiency, posing enormous challenges to expressway operation and management.

[0003] Against this backdrop, constructing an efficient and accurate automatic traffic incident detection system has become a core issue that urgently needs to be addressed in the field of intelligent transportation. Existing traffic incident detection technologies mainly fall into two categories, each with its own limitations:

[0004] The first category is rule-based detection using fixed thresholds. These methods determine whether an anomaly has occurred on a road segment by pre-setting thresholds for traffic flow parameters such as speed and volume. However, they adapt poorly to different types of traffic events and are easily affected by data noise, which results in a high false alarm rate and fails to meet practical application needs.

[0005] The second category is classification and prediction methods based on traditional machine learning. These methods use algorithms such as support vector machines, decision trees, and random forests to model and analyze historical traffic data. While they improve detection performance to some extent, these methods primarily rely on static features for classification and lack effective modeling of the spatiotemporal evolution of traffic conditions, failing to accurately capture the complex dynamic relationships within the road network.

[0006] Furthermore, existing technologies generally suffer from a key deficiency: they fail to fully utilize the abundant multi-source heterogeneous data resources within highway systems. Data from ETC gantry systems, toll station entry and exit information, traffic detector data, and meteorological information are used independently, lacking an effective fusion and analysis mechanism, which limits detection accuracy and coverage.

[0007] Therefore, there is an urgent need to propose a highway event detection method that can deeply integrate multi-source traffic data, has spatiotemporal dynamic modeling capabilities, and supports the identification of structured abnormal areas, so as to achieve accurate perception, rapid identification, and intelligent early warning of traffic emergencies.

Summary of the Invention

[0008] To address the aforementioned issues, this invention proposes a highway event detection algorithm based on data situation analysis. This algorithm integrates key technologies such as multi-source traffic data graph modeling, fuzzy label-guided training, graph neural network prediction, score distribution mutation detection, abnormal subgraph scoring, and boundary discrimination. The pipeline covers data modeling, prediction, detection, and decision-making, enabling intelligent perception of highway traffic operation status, accurate identification of emergencies, and dynamic assessment of risk levels.

[0009] The technical solution adopted in this invention is as follows:

[0010] A highway event detection algorithm based on data situational analysis includes the following steps:

[0011] Step 1: Unified graph structure modeling of multi-source traffic data: Collect multi-source heterogeneous traffic data on highways, and map the multi-source heterogeneous traffic data into a unified dynamic graph structure through graph structure definition and edge relationship construction mechanism;

[0012] Step 2: Fuzzy Label-Guided Training Mechanism: Based on the dynamic graph structure, traffic states are divided into three fuzzy categories: smooth, moderately congested, and severely congested. Fuzzy labels are generated using triangular fuzzy membership functions. The fuzzy labels are used as supervision signals to construct a multi-task joint loss function and complete end-to-end training.

[0013] Step 3: Graph Neural Network Fuzzy Distribution Prediction: Construct a spatiotemporal joint modeling prediction framework. Based on the dynamic graph structure and multi-task joint loss function, use graph neural network (GNN) and recurrent neural network (LSTM) to collaboratively extract the spatial dependence and temporal evolution features of highway traffic state, and accurately predict the fuzzy distribution of traffic state at each node in the future.

[0014] Step 4: Anomaly Subgraph Detection Based on Distribution Abrupt Changes: An anomaly detection mechanism driven by predicting distribution abrupt changes is proposed. Based on the fuzzy distribution of traffic states at each node and the dynamic adjacency matrix in the dynamic graph structure, the set of anomaly candidate regions is identified by analyzing the time change rate of the fuzzy distribution of traffic states.

[0015] Step 5: Event Scoring and Alarm Determination: Based on the set of abnormal candidate regions, the risk level of each subgraph is quantified by a multi-factor event scoring function, and an adaptive alarm determination mechanism dynamically decides whether to trigger an alarm.

[0016] Furthermore, in step 1, the multi-source heterogeneous traffic data includes: ETC gantry traffic data, toll station entry and exit information, traffic detector data, weather information, and holiday markers.

[0017] In step 1, the graph structure is defined as follows: at any time $t$, the traffic network is modeled as a dynamic graph $G_t$;

[0018] The dynamic graph $G_t$ is represented as: $G_t = (V, E_t, X_t)$;

[0019] $V$ represents the set of spatial nodes in the transportation network, $V = \{v_1, v_2, \ldots, v_N\}$; each node represents a traffic location unit with data observations;

[0020] $E_t$ represents the relationships (edges) between nodes at time $t$;

[0021] $X_t$ represents the node feature matrix, $X_t \in \mathbb{R}^{N \times d}$; each row $x_i(t)$ is the multidimensional feature vector of node $v_i$ at time $t$;

[0022] The feature vector $x_i(t)$ of each node is formed by fusing multi-source heterogeneous traffic data, and is represented as:

[0023] $x_i(t) = [f_i^{ETC}(t), f_i^{Toll}(t), q_i(t), u_i(t), w_i(t), h_i(t)]$;

[0024] In the formula, $x_i(t)$ represents the feature vector of node $v_i$ at time $t$; $f_i^{ETC}(t)$ represents the frequency of vehicles passing through the ETC gantry node per unit time; $f_i^{Toll}(t)$ represents the difference between the numbers of vehicles entering and exiting the toll station; $q_i(t)$ represents the flow rate per unit time; $u_i(t)$ represents the average speed; $w_i(t)$ represents the weather category code (0 = sunny, 1 = rain, 2 = snow, 3 = fog); $h_i(t)$ indicates whether it is a holiday (0 = holiday, 1 = other); $i$ is the node number.

[0025] Furthermore, in step 1, the edge relationship construction mechanism includes: basic connection relationships, dynamic similarity enhancement, and fusion strategy to construct a dynamic adjacency matrix;

[0026] Basic connectivity: define the initial adjacency matrix $A_{phys}$ based on the real road network structure, represented as:

[0027] $$a_{ij}(t) = \begin{cases} 1, & \text{if } v_i \text{ and } v_j \text{ are directly connected in the physical road network} \\ 0, & \text{otherwise} \end{cases}$$

[0028] In the formula, $a_{ij}(t)$ indicates whether a connection, i.e. an edge, exists between node $v_i$ and node $v_j$ in the graph at time $t$; $i$ and $j$ are node indices denoting the $i$-th and $j$-th traffic observation points in the graph; $v_i$ and $v_j$ are the $i$-th and $j$-th spatial nodes, each corresponding to a geographic location unit on the highway that can collect data; the connection condition is whether $v_i$ and $v_j$ have a direct road connection in the physical road network, i.e. whether one can travel directly between them along the highway without detours; $t$ is the time variable, representing the current time point or time step.

[0029] Dynamic similarity enhancement: Introducing historical sequence similarity calculation based on traffic features;

[0030] The historical sequences of spatial nodes $v_i$ and $v_j$ within the window length $T$ take the following form:

[0031] $s_i = [f_i(t-T+1), \ldots, f_i(t)]$;

[0032] In the formula, $s_i$ represents the historical time series of spatial node $v_i$ along a given feature dimension, a vector of length $T$; $f_i(\cdot)$ is the node observation function, giving the value observed at spatial node $v_i$ at a given moment; $T$ is the time window length, i.e. the number of historical time steps used for modeling; $t-T+1$ is the starting time point of the historical sequence;

[0033] The similarity score between node pairs is calculated using cosine similarity:

[0034] $$\mathrm{sim}(i,j) = \frac{s_i \cdot s_j}{|s_i| \, |s_j|}$$

[0035] In the formula, $\mathrm{sim}(i,j)$ represents the similarity score between spatial node $v_i$ and spatial node $v_j$; $s_i$ is the historical feature sequence of spatial node $v_i$; $s_j$ is the historical feature sequence of spatial node $v_j$; $s_i \cdot s_j$ is the dot product of the vectors $s_i$ and $s_j$; $|s_i|$ is the modulus of the vector $s_i$; $|s_j|$ is the modulus of the vector $s_j$;
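
As a minimal illustration of the cosine similarity described above, the following Python sketch computes the score for two historical sequences. The function name `cosine_similarity`, the sample speed values, and the choice to return 0 for an all-zero sequence are assumptions made for this example and do not come from the patent text.

```python
import math

def cosine_similarity(s_i, s_j):
    """Cosine similarity between two historical sequences of equal length T."""
    if len(s_i) != len(s_j):
        raise ValueError("sequences must share the window length T")
    dot = sum(a * b for a, b in zip(s_i, s_j))
    norm_i = math.sqrt(sum(a * a for a in s_i))
    norm_j = math.sqrt(sum(b * b for b in s_j))
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0  # assumption: an all-zero sequence is treated as dissimilar
    return dot / (norm_i * norm_j)

# Hypothetical average-speed histories (km/h) for two nodes, window length T = 4.
speeds_a = [95.0, 92.0, 60.0, 35.0]
speeds_b = [90.0, 88.0, 58.0, 30.0]
sim = cosine_similarity(speeds_a, speeds_b)
```

Two nodes whose speeds drop together score close to 1, which is the situation the dynamic similarity enhancement is meant to capture.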

[0036] Fusion strategy: construct the dynamic adjacency matrix $A_t$, updating edge weights by combining physical connectivity and similarity:

[0037]

[0038] In the formula, $A_t(i,j)$ represents the edge weight between spatial nodes $v_i$ and $v_j$ in the dynamically adjusted adjacency matrix at time $t$; $\theta_{high}$ is the connection enhancement threshold, a preset lower limit of similarity; $\theta_{low}$ is the connection pruning threshold, a preset upper limit of similarity; $K\%$ is the dynamic edge ratio control parameter, indicating that the node pairs in the top $K\%$ and bottom $K\%$ of similarity among all node pairs are selected for special processing; $A_{phys}(i,j)$ is an element of the original physical adjacency matrix, indicating whether spatial nodes $v_i$ and $v_j$ have a direct road connection in the real highway network.
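
The fusion formula itself is not reproduced in this text, so the Python sketch below is only one plausible reading of the parameter descriptions, not the patented rule. It assumes that a node pair in the top K% by similarity with similarity at or above θ_high receives weight 1, that a pair in the bottom K% with similarity at or below θ_low receives weight 0, and that every other pair keeps its physical adjacency value. The function name `dynamic_adjacency` and the sample numbers are hypothetical.

```python
def dynamic_adjacency(a_phys, sim, theta_high, theta_low, k_percent):
    """Fuse physical adjacency with pairwise similarity (assumed rule, see lead-in)."""
    n = len(a_phys)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Rank node pairs from most to least similar.
    ranked = sorted(pairs, key=lambda p: sim[p[0]][p[1]], reverse=True)
    n_special = max(1, round(len(ranked) * k_percent / 100.0))
    top = set(ranked[:n_special])
    bottom = set(ranked[-n_special:])
    a_t = [row[:] for row in a_phys]
    for i, j in pairs:
        s = sim[i][j]
        if (i, j) in top and s >= theta_high:
            weight = 1.0                    # enhancement: add or strengthen the edge
        elif (i, j) in bottom and s <= theta_low:
            weight = 0.0                    # pruning: drop the edge
        else:
            weight = float(a_phys[i][j])    # keep the physical connection
        a_t[i][j] = a_t[j][i] = weight
    return a_t

# Three nodes on a line 0-1-2; nodes 0 and 2 behave alike, nodes 1 and 2 do not.
a_phys = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
sim = [[1.0, 0.5, 0.95], [0.5, 1.0, 0.1], [0.95, 0.1, 1.0]]
a_t = dynamic_adjacency(a_phys, sim, theta_high=0.9, theta_low=0.2, k_percent=34)
```

In this toy case the similar but physically unconnected pair (0, 2) gains an edge, while the dissimilar physical neighbours (1, 2) lose theirs.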

[0039] Dynamic graph structure: the final generated sequence of time-evolving graph structures is as follows:

[0040] $\{G_1, G_2, \ldots, G_t, \ldots\}$, where $G_t = (V, A_t, X_t)$;

[0041] In the formula, $G_t$ is the traffic graph constructed at time point $t$, representing a dynamically evolving graph structure; $A_t$ is the dynamic adjacency matrix that integrates historical traffic similarity with the original road network structure.

[0042] Furthermore, in step 2, when dividing into three fuzzy categories, all membership degrees satisfy the normalization condition:

[0043] $$\sum_{k=1}^{3} \mu_i^{(k)}(t) = 1$$

[0044] The fuzzy label of spatial node $v_i$ at time $t$ is defined as:

[0045] $$S_i(t) = [\mu_i^{(1)}(t), \mu_i^{(2)}(t), \mu_i^{(3)}(t)]$$

[0046] In the formula, $S_i(t)$ represents the fuzzy score vector of spatial node $v_i$ at time $t$, a three-dimensional probability distribution vector describing the degree of membership of the node's current traffic state in the three fuzzy categories; $i$ is the node index, denoting the $i$-th traffic observation node in the graph; $t$ is the time variable, representing the current time point or time step; $\mu_i^{(1)}(t)$, $\mu_i^{(2)}(t)$, and $\mu_i^{(3)}(t)$ represent the membership degrees of spatial node $v_i$ in the smooth flow, moderate congestion, and severe congestion states at time $t$, respectively; the superscripts (1), (2), (3) denote the three predefined fuzzy traffic state categories: smooth flow, moderate congestion, and severe congestion.

[0047] Furthermore, in step 2, fuzzy labels are generated from traffic parameters including speed, density, and flow rate using triangular fuzzy membership functions;

[0048] The triangular fuzzy membership functions are as follows:

[0049] Severe congestion $\mu^{(3)}(u)$:

[0050]

[0051] Moderate congestion $\mu^{(2)}(u)$:

[0052]

[0053] Smooth flow $\mu^{(1)}(u)$:

[0054]

[0055] In the formula, $u$ represents the average speed of a node at the current moment, which is the core input variable for generating fuzzy labels; $\mu^{(k)}(u)$ represents the membership degree of the road segment in the $k$-th traffic state when the average speed is $u$; $u_1$, $u_2$, and $u_3$ are three empirical speed thresholds satisfying $u_1 < u_2 < u_3$ that delimit the transition intervals between different traffic states.
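
The membership formulas themselves are not reproduced in this text. The Python sketch below shows one common piecewise-linear construction that is consistent with the stated thresholds $u_1 < u_2 < u_3$ and with the normalization condition, under the assumption that severe congestion saturates below $u_1$, moderate congestion peaks at $u_2$, and smooth flow saturates above $u_3$. The function name `fuzzy_label` and the thresholds 30, 60, and 90 km/h are hypothetical.

```python
def fuzzy_label(u, u1, u2, u3):
    """Return (smooth, moderate, severe) memberships for average speed u."""
    if not (u1 < u2 < u3):
        raise ValueError("thresholds must satisfy u1 < u2 < u3")
    if u <= u1:
        smooth, moderate, severe = 0.0, 0.0, 1.0
    elif u < u2:
        # Transition from severe to moderate congestion.
        moderate = (u - u1) / (u2 - u1)
        smooth, severe = 0.0, 1.0 - moderate
    elif u < u3:
        # Transition from moderate congestion to smooth flow.
        smooth = (u - u2) / (u3 - u2)
        moderate, severe = 1.0 - smooth, 0.0
    else:
        smooth, moderate, severe = 1.0, 0.0, 0.0
    return (smooth, moderate, severe)

label_slow = fuzzy_label(45.0, 30.0, 60.0, 90.0)   # halfway between u1 and u2
label_fast = fuzzy_label(75.0, 30.0, 60.0, 90.0)   # halfway between u2 and u3
```

By construction the three memberships always sum to 1, so each label can be used directly as a target distribution.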

[0056] Using fuzzy labels as supervision signals, a multi-task joint loss function is constructed:

[0057] $L_{total} = L_{pred} + \lambda_1 L_{fuzzy} + \lambda_2 L_{structure}$;

[0058] In the formula, $L_{total}$ represents the total loss function; $L_{pred}$ represents the mean square error of conventional continuous-value prediction, such as speed and flow rate; $L_{fuzzy}$ represents the fuzzy distribution estimation loss; $L_{structure}$ represents a term that keeps the predictions of neighboring nodes smooth over the graph structure; $\lambda_1$ and $\lambda_2$ are the weighting coefficients of the corresponding components in the loss function.
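
The text names the three loss components but does not give their exact forms. The Python sketch below is therefore only an illustration under stated assumptions: $L_{pred}$ is the mean squared error, $L_{fuzzy}$ is taken to be a cross-entropy between the fuzzy label and the predicted distribution, and $L_{structure}$ is taken to be the mean squared difference between the predicted distributions of adjacent nodes. The function name `joint_loss` and all sample values are hypothetical.

```python
import math

def joint_loss(y_true, y_pred, s_true, s_pred, edges, lam1, lam2, eps=1e-12):
    """L_total = L_pred + lam1 * L_fuzzy + lam2 * L_structure (assumed forms)."""
    n = len(y_true)
    # L_pred: mean squared error on a continuous target such as speed.
    l_pred = sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n
    # L_fuzzy (assumption): cross-entropy between fuzzy label and prediction.
    l_fuzzy = -sum(
        p * math.log(q + eps)
        for label, pred in zip(s_true, s_pred)
        for p, q in zip(label, pred)
    ) / n
    # L_structure (assumption): predictions of adjacent nodes should be close.
    l_structure = sum(
        sum((a - b) ** 2 for a, b in zip(s_pred[i], s_pred[j]))
        for i, j in edges
    ) / max(1, len(edges))
    return l_pred + lam1 * l_fuzzy + lam2 * l_structure

labels = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
good = joint_loss([80.0, 82.0], [80.0, 82.0], labels, labels, [(0, 1)], 1.0, 0.5)
bad = joint_loss([80.0, 82.0], [70.0, 82.0], labels,
                 [[0.2, 0.3, 0.5], [0.9, 0.05, 0.05]], [(0, 1)], 1.0, 0.5)
```

A perfect prediction drives all three terms to roughly zero, while errors in speed, in the fuzzy distribution, or in neighbourhood smoothness each raise the total.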

[0059] Further, step 3 includes the following steps:

[0060] Step 3.1 Graph convolution to extract spatial dependencies:

[0061] For the graph $G_t = (V, A_t, X_t)$ at each time $t$, spatial features are extracted using Chebyshev graph convolution (ChebNet). The Chebyshev graph convolution is calculated as follows:

[0062] $$H^{(l+1),t} = \sigma\left(\sum_{k=0}^{K} \theta_k T_k(\tilde{L}_t) H^{(l),t}\right)$$

[0063] In the formula, $H^{(l+1),t}$ is the output feature matrix of layer $l+1$ of the graph convolutional network at time $t$, with dimension $N \times d_{l+1}$; $H^{(l),t}$ is the input feature matrix of layer $l$ at time $t$, with dimension $N \times d_l$; $N$ is the total number of spatial nodes in the graph, i.e. the number of location units with observation capability on the highway; $d_l$ is the feature dimension of layer $l$, which varies with the layer; $\sigma(\cdot)$ is the activation function; $\sum_{k=0}^{K}$ denotes summation over $k$ from 0 to $K$, i.e. the filtering expansion of the graph signal using a $K$-order Chebyshev polynomial; $\theta_k$ are the learnable graph convolution kernel parameters, equivalent to the weight coefficients of a filter, which are optimized automatically through training and control the contribution of the different orders of the graph Laplacian operator; $T_k(\cdot)$ is the $k$-th order Chebyshev polynomial; $\tilde{L}_t$ is the normalized graph Laplacian matrix at time $t$; $K$ is the polynomial order.
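
To illustrate how a Chebyshev filter is evaluated, the following Python sketch applies the standard recurrence $T_0 = I$, $T_1 = L$, $T_k = 2 L T_{k-1} - T_{k-2}$ to a small feature matrix. It is a simplified sketch, not the patented layer: the coefficients are plain scalars rather than trained weight matrices, the activation is assumed to be ReLU, and a toy 2×2 matrix stands in for the normalized Laplacian. The names `matmul`, `add_scaled`, and `cheb_conv` are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (n x m) and b (m x p)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add_scaled(acc, mat, scale):
    """Return acc + scale * mat, element by element."""
    return [[x + scale * y for x, y in zip(ra, rm)] for ra, rm in zip(acc, mat)]

def cheb_conv(l_tilde, h, thetas):
    """ReLU(sum_k thetas[k] * T_k(l_tilde) @ h) with scalar coefficients."""
    t_prev = [row[:] for row in h]                      # T_0(L) h = h
    out = [[thetas[0] * x for x in row] for row in h]
    if len(thetas) > 1:
        t_curr = matmul(l_tilde, h)                     # T_1(L) h = L h
        out = add_scaled(out, t_curr, thetas[1])
        for k in range(2, len(thetas)):
            # Chebyshev recurrence: T_k = 2 L T_{k-1} - T_{k-2}
            doubled = [[2.0 * x for x in row] for row in matmul(l_tilde, t_curr)]
            t_next = add_scaled(doubled, t_prev, -1.0)
            out = add_scaled(out, t_next, thetas[k])
            t_prev, t_curr = t_curr, t_next
    return [[max(0.0, x) for x in row] for row in out]  # sigma = ReLU (assumed)

l_tilde = [[0.0, 1.0], [1.0, 0.0]]   # toy matrix standing in for the Laplacian
h0 = [[1.0], [2.0]]                  # one feature per node
h1 = cheb_conv(l_tilde, h0, thetas=[1.0, 1.0, 1.0])
```

The recurrence avoids forming matrix powers explicitly, which is the main practical reason for using the Chebyshev basis.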

[0064] Graph convolution is performed separately at each time step, and the output sequence is:

[0065] $$\mathcal{H} = \{H^{(L),1}, H^{(L),2}, \ldots, H^{(L),T}\}$$

[0066] In the formula, $\mathcal{H}$ is the set of spatially encoded outputs passed to time-series processing, containing the results for the $T$ time steps from $t=1$ to $t=T$; $H^{(L),t}$ is the final output feature matrix after $L$ layers of graph convolution at time $t$; $T$ is the length of the sliding time window; $h_i^t$ is the spatial encoding vector of spatial node $v_i$ at time $t$, i.e. the $i$-th row of $H^{(L),t}$, with $h_i^t \in \mathbb{R}^d$;

[0067] Step 3.2 Constructing the time series input:

[0068] The spatial encoding results of the historical T time steps are stacked to form the time series input for each node; the formula for calculating the spatial encoding sequence is as follows:

[0069] $$z_i = [h_i^{t-T+1}, \ldots, h_i^{t}] \in \mathbb{R}^{T \times d}$$

[0070] In the formula, $z_i$ represents the time series input of spatial node $v_i$, a two-dimensional matrix with dimensions $T \times d$; $i$ denotes the $i$-th spatial node in the graph; $t-T+1$ is the starting time point of the historical sequence; $\mathbb{R}^{T \times d}$ indicates that $z_i$ is a real matrix with $T$ rows and $d$ columns;

[0071] Step 3.3 LSTM extracts time-dependent features:

[0072] $z_i$ is input into a Long Short-Term Memory (LSTM) network to capture the long-term and short-term evolution patterns of traffic states; the comprehensive state evolution trend vector is calculated as follows:

[0073] $$\tilde{h}_i = \mathrm{LSTM}(z_i)$$

[0074] In the formula, $\tilde{h}_i$ represents the time-dependent feature vector of spatial node $v_i$ obtained after LSTM processing, with $\tilde{h}_i \in \mathbb{R}^d$; $i$ is the node index, denoting the $i$-th traffic observation node in the graph; $\mathrm{LSTM}(\cdot)$ is a Long Short-Term Memory network used to capture long-term and short-term time-dependent features in the input sequence;

[0075] Step 3.4: Fuse and output the fuzzy prediction distribution:

[0076] The comprehensive state evolution trend vector is input into the fully connected layer, and combined with the softmax activation function, the fuzzy state prediction for the future time t+1 is output; the softmax activation function is calculated as follows:

[0077] $$\hat{S}_i(t+1) = \mathrm{softmax}(W_o \tilde{h}_i + b_o)$$

[0078] In the formula, $\hat{S}_i(t+1)$ represents the predicted fuzzy score vector of spatial node $v_i$ at the future time $t+1$; $\tilde{h}_i$ is the comprehensive state evolution trend vector output by the LSTM; $\mathrm{softmax}(\cdot)$ is the softmax function, which converts a real vector into a probability distribution; $W_o \in \mathbb{R}^{3 \times d}$ is the prediction layer weight matrix; $b_o \in \mathbb{R}^3$ is the bias term, a learnable three-dimensional vector that adjusts the output baseline for each state and thereby improves the model's expressive power; $W_o \tilde{h}_i + b_o$ is the linear transformation result of the fully connected layer, a three-dimensional real vector.
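
The output layer described above can be sketched in a few lines of Python: a linear map from the $d$-dimensional trend vector to three logits, followed by softmax. The function names `softmax` and `predict_fuzzy` and the toy weights are assumptions for illustration; in the described method the weights would be learned during training.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of real numbers."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict_fuzzy(h, w_o, b_o):
    """Map a d-dimensional trend vector to a 3-class fuzzy distribution."""
    logits = [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(w_o, b_o)]
    return softmax(logits)

h_tilde = [1.0, -1.0]                         # toy trend vector, d = 2
w_o = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]    # hypothetical 3 x d weight matrix
b_o = [0.0, 0.0, 0.0]
s_hat = predict_fuzzy(h_tilde, w_o, b_o)      # order: smooth, moderate, severe
```

Subtracting the maximum logit before exponentiating leaves the result unchanged but avoids overflow for large activations.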

[0079] Furthermore, step 4 includes the following steps:

[0080] Step 4.1 Calculate the node score mutation rate:

[0081] The score mutation rate of spatial node $v_i$ at time $t$ is defined to quantify how drastically the traffic state distribution changes; the score mutation rate is calculated as follows:

[0082]

[0083] In the formula, $\Delta S_i(t)$ represents the score mutation rate of spatial node $v_i$ at time $t$; $\hat{S}_i^{(k)}(t)$ represents the membership degree predicted by the model for spatial node $v_i$ in traffic state $k$ at time $t$; $k=1$ denotes smooth traffic, $k=2$ moderate congestion, and $k=3$ severe congestion;
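
The exact mutation-rate formula is not reproduced in this text. As a hedged illustration, the Python sketch below assumes the rate is the sum of absolute changes in the three predicted membership degrees between two consecutive time steps (an L1 distance). Other distance measures would fit the description equally well. The function name `mutation_rate` and the sample distributions are hypothetical.

```python
def mutation_rate(s_prev, s_curr):
    """Sum of absolute membership changes between consecutive distributions.

    Assumption: an L1 distance; the source text does not state the exact form.
    """
    if len(s_prev) != len(s_curr):
        raise ValueError("distributions must have the same number of states")
    return sum(abs(c - p) for p, c in zip(s_prev, s_curr))

# A node that moves from mostly smooth flow to mostly severe congestion.
before = [0.80, 0.15, 0.05]
after = [0.20, 0.30, 0.50]
rate = mutation_rate(before, after)
```

Under this assumption the rate ranges from 0 (no change) to 2 (all probability mass moved to a different state).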

[0084] Step 4.2 Filter the set of candidate abnormal nodes:

[0085] By setting dual screening criteria, a seed set of suspected abnormal nodes is constructed, represented as follows:

[0086] $$V_\Delta = \{ v_i \mid \Delta S_i(t) > \theta_1 \ \text{and} \ \hat{S}_i^{(3)}(t+1) > \theta_2 \}$$

[0087] In the formula, $V_\Delta$ is the set of candidate abnormal nodes; $v_i$ is the $i$-th spatial node in the graph, each node corresponding to a geographic location unit on the highway with data collection capability; $\Delta S_i(t)$ is the score mutation rate of spatial node $v_i$ at time $t$; $\theta_1$ is the score mutation rate threshold, used to screen for nodes with significant transitions; $\hat{S}_i^{(3)}(t+1)$ is the predicted membership degree of spatial node $v_i$ in the severe congestion state at the next time step $t+1$; $\theta_2$ is the lower bound on the severe congestion score component, used to ensure that the risk state is explicit.
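
The dual screening criteria can be sketched as a simple filter: a node is kept only if its mutation rate exceeds θ1 and its predicted severe-congestion membership exceeds θ2. The strict inequalities, the function name `candidate_nodes`, and the sample values are assumptions for this example.

```python
def candidate_nodes(delta_s, severe_next, theta1, theta2):
    """Return the node ids that pass both screening criteria.

    delta_s:     dict node id -> score mutation rate at time t
    severe_next: dict node id -> predicted severe-congestion membership at t+1
    """
    return {
        node for node in delta_s
        if delta_s[node] > theta1 and severe_next[node] > theta2
    }

delta_s = {0: 1.2, 1: 0.1, 2: 0.9}
severe_next = {0: 0.5, 1: 0.7, 2: 0.1}
v_delta = candidate_nodes(delta_s, severe_next, theta1=0.5, theta2=0.3)
```

Node 1 is already congested but stable, and node 2 changes sharply without heading toward severe congestion, so only node 0 is selected.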

[0088] Step 4.3 Extracting the anomaly subgraph:

[0089] In the graph structure $G_t = (V, A_t, X_t)$, based on the nodes in $V_\Delta$, extract the connected subgraphs that satisfy the following two conditions:

[0090] Connectivity requirement: all nodes in the subgraph $G_{sub} = (V_{sub}, E_{sub})$ must be topologically reachable from each other;

[0091] Structural density constraint: define the edge density $\rho$ of the subgraph, which satisfies the following inequality:

[0092] $$\rho = \frac{2|E_{sub}|}{|V_{sub}|(|V_{sub}|-1)} \ge \theta_3$$

[0093] In the formula, $\rho$ is the edge density of the subgraph, i.e. the ratio of the actual number of edges in the subgraph to the maximum possible number of edges; $E_{sub}$ is the set of edges in the subgraph, i.e. the set of connections between all nodes in the subgraph; $V_{sub}$ is the set of nodes in the subgraph, i.e. the set of all nodes that constitute the candidate anomaly region; $\theta_3$ is the structural compactness threshold.
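
One simple way to realize the two conditions above is to take the connected components of the graph induced on the candidate nodes and keep those whose edge density reaches θ3. The Python sketch below does this with a breadth-first search. It assumes an undirected graph given as a neighbour dictionary, takes the density as the number of edges divided by the maximum possible for an undirected simple graph, and discards single-node components. These choices and the name `anomaly_subgraphs` are illustrative assumptions.

```python
from collections import deque

def anomaly_subgraphs(adj, candidates, theta3):
    """Connected, sufficiently dense components among the candidate nodes.

    adj: dict node -> set of neighbours (undirected; nonzero entries of A_t).
    """
    seen, result = set(), []
    for start in sorted(candidates):
        if start in seen:
            continue
        # Breadth-first search restricted to candidate nodes.
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            v = queue.popleft()
            for w in adj.get(v, ()):
                if w in candidates and w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        n = len(component)
        if n < 2:
            continue  # assumption: a single node is not treated as a region
        # Each undirected edge is seen from both endpoints, hence the // 2.
        n_edges = sum(1 for v in component
                      for w in adj.get(v, ()) if w in component) // 2
        density = 2.0 * n_edges / (n * (n - 1))
        if density >= theta3:
            result.append(component)
    return result

adj = {0: {1, 2, 5}, 1: {0, 2}, 2: {0, 1}, 3: {4}, 4: {3}, 5: {0}, 6: set()}
regions = anomaly_subgraphs(adj, candidates={0, 1, 2, 4, 6}, theta3=0.6)
```

Here the triangle formed by nodes 0, 1, and 2 is returned as one region, while candidates 4 and 6 have no candidate neighbours and are dropped.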

[0094] Step 4.4 Output the set of anomaly candidate regions: Collect all subgraphs that meet the conditions as follows:

[0095] $G_{sub} = \{G_1, G_2, \ldots, G_m\}$;

[0096] In the formula, $G_{sub}$ is the set of anomaly candidate regions; $G_j$ is the $j$-th anomaly candidate subgraph, representing a potential traffic anomaly impact area; $m$ is the total number of anomaly candidate subgraphs.

[0097] Furthermore, in step 5, the multi-factor event scoring function is defined as follows:

[0098] $E(G_{sub}) = \alpha \cdot \mathrm{Avg}(\Delta S_i) + \beta \cdot \mathrm{ClusteringCoeff}(G_{sub}) + \theta \cdot \mathrm{RatingShift}(G_{sub})$;

[0099] $$\mathrm{Avg}(\Delta S_i) = \frac{1}{|V_{sub}|} \sum_{v_i \in V_{sub}} \Delta S_i, \quad \mathrm{ClusteringCoeff}(G_{sub}) = \frac{1}{|V_{sub}|} \sum_{v_i \in V_{sub}} C_i, \quad C_i = \frac{2 T_i}{d_i (d_i - 1)}, \quad \mathrm{RatingShift}(G_{sub}) = \frac{1}{|V_{sub}|} \sum_{v_i \in V_{sub}} \left( \hat{S}_i^{(3)}(t+1) - \hat{S}_i^{(3)}(t) \right)$$

[0100] In the formula, $E(G_{sub})$ is the event score, used to comprehensively assess whether an alarm should be triggered for the current abnormal subgraph, with a higher score indicating a higher risk; $G_{sub}$ is a subgraph from the set of anomaly candidate regions; $\alpha$ is the mutation intensity weight; $\mathrm{Avg}(\Delta S_i)$ is the average score mutation rate of all nodes within the subgraph, reflecting the severity of the overall state transition; $\beta$ is the structural density weight; $\mathrm{ClusteringCoeff}(G_{sub})$ is the average local clustering coefficient of the subgraph, measuring the compactness and cohesion of its topology; $\theta$ is the weight of the state shift direction; $\mathrm{RatingShift}(G_{sub})$ is the average trend of the severe congestion membership degree of the spatial nodes within the subgraph, used to determine whether congestion is worsening; $\Delta S_i$ is the score mutation rate of spatial node $v_i$; $V_{sub}$ is the set of nodes in the subgraph, i.e. the set of all nodes that constitute the candidate anomaly region; $T_i$ is the number of triangles centered on spatial node $v_i$, i.e. the number of pairs of its neighbors that are also connected to each other; $d_i$ is the degree of spatial node $v_i$, i.e. the number of edges directly connected to it; $C_i$ is the local clustering coefficient of spatial node $v_i$; $\hat{S}_i^{(3)}(t)$ is the predicted severe congestion membership degree of spatial node $v_i$ at time $t$; $\hat{S}_i^{(3)}(t+1)$ is the predicted severe congestion membership degree of spatial node $v_i$ at the next time step $t+1$.

[0101] Furthermore, in step 5, the judgment rule of the adaptive alarm discrimination mechanism is as follows:

[0102] $$\mathrm{Alert}(G_{sub}) = \begin{cases} 1, & E(G_{sub}) > B(G_{sub}) \\ 0, & \text{otherwise} \end{cases}$$

[0103] In the formula, $\mathrm{Alert}(G_{sub})$ is the alarm trigger result, a binary output; $B(G_{sub})$ is the alarm boundary function; "otherwise" covers all cases that do not satisfy $E(G_{sub}) > B(G_{sub})$.
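
The scoring function and alarm rule can be sketched together in Python. The sketch assumes that each of the three factors is a plain average over the subgraph's nodes, that the local clustering coefficient is 0 for nodes with fewer than two neighbours inside the subgraph, and that the boundary $B(G_{sub})$ is supplied as a number by a learner that is outside the scope of the example. The function names and the weights 0.5, 0.3, and 0.2 are hypothetical.

```python
def avg_clustering(adj, nodes):
    """Average local clustering coefficient over the subgraph's nodes."""
    total = 0.0
    for v in nodes:
        nbrs = [w for w in adj.get(v, ()) if w in nodes]
        d = len(nbrs)
        if d < 2:
            continue  # coefficient taken as 0 with fewer than two neighbours
        # Count neighbour pairs that are themselves connected (triangles at v).
        triangles = sum(1 for a in range(d) for b in range(a + 1, d)
                        if nbrs[b] in adj.get(nbrs[a], ()))
        total += 2.0 * triangles / (d * (d - 1))
    return total / len(nodes)

def event_score(adj, nodes, delta_s, severe_now, severe_next, alpha, beta, theta):
    """E = alpha * Avg(dS) + beta * ClusteringCoeff + theta * RatingShift."""
    avg_delta = sum(delta_s[v] for v in nodes) / len(nodes)
    shift = sum(severe_next[v] - severe_now[v] for v in nodes) / len(nodes)
    return alpha * avg_delta + beta * avg_clustering(adj, nodes) + theta * shift

def alert(score, boundary):
    """Binary alarm: 1 if the score exceeds the boundary value, else 0."""
    return 1 if score > boundary else 0

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
nodes = {0, 1, 2}
delta_s = {0: 1.0, 1: 1.0, 2: 1.0}
severe_now = {0: 0.1, 1: 0.1, 2: 0.1}
severe_next = {0: 0.6, 1: 0.6, 2: 0.6}
score = event_score(adj, nodes, delta_s, severe_now, severe_next, 0.5, 0.3, 0.2)
fired = alert(score, boundary=0.7)   # boundary would come from the learner
```

Because the shift term is signed, a subgraph whose severe-congestion membership is falling lowers its own score, which is how a recovery process avoids being flagged as a new event.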

[0104] The present invention provides a highway event detection algorithm based on data situation analysis, which has the following significant advantages compared with the prior art:

[0105] 1. Improved event detection accuracy: This invention integrates multi-source heterogeneous data such as ETC gantries, toll stations, traffic detectors, weather and holidays to construct a dynamic graph structure. It fully explores the complex correlation between traffic conditions in spatial topology and temporal evolution, overcomes the detection blind spots caused by single data in traditional methods, and significantly improves the accuracy of identifying abnormal events such as traffic accidents and sudden congestion.

[0106] 2. Achieve refined modeling of traffic conditions: By introducing a fuzzy label-guided training mechanism, traffic conditions are transformed from traditional hard classification into a three-dimensional membership probability distribution of "smooth flow - moderate congestion - severe congestion", which supports the continuous expression of critical states and gradual processes, and enhances the model's ability to perceive and its robustness in the evolution of traffic conditions.

[0107] 3. Enhance spatiotemporal joint modeling capabilities: Construct a spatiotemporal prediction framework based on graph convolutional network (GCN) and LSTM. GCN extracts the spatial dependencies between nodes, while LSTM captures the dynamic evolution trend of time series. The two work together to achieve accurate prediction of traffic conditions, providing a high-quality pre-analysis foundation for anomaly detection.

[0108] 4. Supports structured anomaly region identification: No longer limited to point-like anomaly judgment, it filters candidate anomaly nodes by calculating the node score mutation rate and combining it with the probability of severe congestion. Furthermore, it extracts connected and structurally dense anomaly subgraphs based on graph structure, and identifies potential event regions with physical continuity and spatial clustering, thereby improving the spatial interpretability and practical application value of the detection results.

[0109] 5. Implement multi-factor fusion-based risk quantification assessment: Design a multi-factor event scoring function that integrates "mutation intensity," "structural density," and "state shift direction" to comprehensively assess the risk level of abnormal subgraphs. The scoring not only focuses on the severity of changes but also considers topological cohesion and congestion trends, avoiding misjudging recovery processes as sudden events and effectively reducing false alarm rates.

[0110] 6. Flexible and Adaptive Alarm Strategy: An adaptive boundary learner based on supervised learning is introduced, and the alarm threshold is dynamically adjusted according to the contextual features such as the road segment, time period, weather, and traffic flow of the sub-graph. This overcomes the problem of poor adaptability of traditional fixed thresholds in different scenarios, and realizes an intelligent alarm strategy of "relaxed during peak hours, sensitive during rainy days, and high alert for remote road segments", thereby improving the practicality and stability of the system.

[0111] 7. Output structured event information for decision support: When alarm conditions are met, the system outputs structured event information including timestamp, subgraph ID, comprehensive score, maximum mutation rate, status level, and alarm level. It supports visualization, manual review, and emergency dispatch linkage, providing timely, accurate, and operable decision-making basis for highway operation and management.

[0112] 8. Excellent scalability and engineering applicability: The algorithm framework is modularly designed, supports online streaming processing and rolling prediction, can be deployed on provincial traffic monitoring platforms, and is suitable for real-time event detection scenarios in large-scale road networks. It has good scalability and engineering implementation potential.

[0113] In summary, through a complete technical chain of "multi-source fusion modeling → fuzzy state learning → spatiotemporal prediction → structured detection → intelligent alarm," this invention achieves high-precision, early, interpretable, and adaptive detection of highway traffic incidents, significantly improving traffic safety and management efficiency, and has significant application value and promising prospects for promotion.

Attached Figure Description

[0114] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in this invention. For those skilled in the art, other drawings can be obtained based on these drawings.

[0115] Figure 1 is a flowchart of the highway event detection algorithm based on data situation analysis according to the present invention;

[0116] Figure 2 is a schematic diagram of highway ETC gantry data collection according to the present invention;

[0117] Figure 3 is a schematic diagram of the distribution evolution of the node fuzzy state prediction in this invention;

[0118] Figure 4 is a heat map of node score mutations in this invention.

Detailed Implementation

[0119] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort are within the scope of protection of the present invention.

[0120] Existing highway event detection methods rely on fixed thresholds or static feature modeling. They cannot fuse multi-source heterogeneous data or characterize the spatiotemporal evolution of traffic conditions, which leads to low anomaly identification accuracy and poor adaptability and makes accurate perception and intelligent early warning of emergencies difficult. This embodiment provides a highway event detection algorithm based on data situation analysis. This algorithm integrates key technologies such as multi-source traffic data graph modeling, fuzzy label-guided training, graph neural network prediction, score distribution mutation detection, anomaly subgraph scoring, and boundary discrimination. The pipeline covers data modeling, prediction, detection, and decision-making, enabling intelligent perception of highway traffic operation status, accurate identification of emergencies, and dynamic assessment of risk levels.

[0121] Specifically, as shown in Figure 1, this highway event detection algorithm based on data situation analysis includes the following steps:

[0122] Step 1: Unified graph structure modeling of multi-source traffic data:

[0123] This step unifies the modeling of scattered and heterogeneous multi-source traffic data in highways, transforming it into a spatiotemporal dynamic graph structure. This provides standardized input for subsequent graph neural network (GNN) processing, enabling structured representation and evolutionary analysis of traffic network states.

[0124] Step 1.1 Input data source:

[0125] As shown in Figure 2, multi-source heterogeneous traffic data are collected on the highway, as listed in the table below.

[0126]

| Data type | Specific content |
| --- | --- |
| ETC gantry data | Frequency of vehicles passing each node per unit time |
| Toll station data | Difference between the numbers of vehicles entering and exiting each station |
| Traffic detector data | Link flow $q_i(t)$ and average speed $u_i(t)$ |
| Weather information | Weather category code (e.g., 0 = sunny, 1 = rain, 2 = snow, 3 = fog) |
| Time information | Holiday flag (0 = holiday, 1 = non-holiday) |

[0127] Step 1.2 Graph Structure Definition:

[0128] At any time t, the traffic network is modeled as a dynamic graph $G_t$;

[0129] The dynamic graph $G_t$ is represented as: $G_t = (V, E_t, X_t)$;

[0130] $V$ represents the set of spatial nodes in the traffic network, $V = \{v_1, v_2, \ldots, v_N\}$. Each node represents a traffic location unit with data observations, such as an ETC gantry point or a detector location.

[0131] $E_t$ represents the set of edges, i.e., the connection relationships between nodes at time t;

[0132] $X_t$ represents the node feature matrix, $X_t \in \mathbb{R}^{N \times d}$. Each row $x_i(t)$ is the multidimensional feature vector of node $v_i$ at time t;

[0133] The feature vector $x_i(t)$ of each node is formed by fusing multi-source heterogeneous traffic data, and is represented as:

[0134] $x_i(t) = [f_i^{ETC}(t), f_i^{Toll}(t), q_i(t), u_i(t), w_i(t), h_i(t)]$;

[0135] In the formula, $x_i(t)$ represents the feature vector of node $v_i$ at time t; $f_i^{ETC}(t)$ represents the frequency of vehicles passing through the ETC gantry node per unit time; $f_i^{Toll}(t)$ represents the difference between the numbers of vehicles entering and exiting the toll station; $q_i(t)$ represents the flow per unit time; $u_i(t)$ represents the average speed; $w_i(t)$ represents the weather category code, 0 = sunny, 1 = rain, 2 = snow, 3 = fog; $h_i(t)$ indicates whether it is a holiday, 0 = holiday, 1 = other; i is the node number.

[0136] Step 1.3 Edge Relationship Construction Mechanism:

[0137] The edge relationship construction mechanism includes: basic connection relationship, dynamic similarity enhancement, and fusion strategy to construct dynamic adjacency matrix.

[0138] Basic connectivity, i.e., physical adjacency: The initial adjacency matrix $A_{phys}$ is defined based on the actual road network structure, and is represented as:

[0139] $A_{phys} = [a_{ij}(t)]_{N \times N}$, where $a_{ij}(t) = 1$ if $v_i$ and $v_j$ are directly connected in the physical road network, and $a_{ij}(t) = 0$ otherwise;

[0140] In the formula, $a_{ij}(t)$ indicates whether a connection, i.e., an edge, exists between node $v_i$ and node $v_j$ in the graph at time t; i and j are node index numbers, representing the i-th and j-th traffic observation points in the graph; $v_i$ and $v_j$ represent the i-th and j-th spatial nodes in the graph, respectively, each corresponding to a geographic location unit on the highway that is able to collect data; the connection condition indicates whether a direct road connection exists between $v_i$ and $v_j$ in the physical road network, i.e., whether one can travel directly between them along the highway without detours; t is a time variable, representing the current point in time or time step.

[0141] Example: $v_3$ represents the ETC gantry at K100+500 on a certain highway, and $v_5$ represents the detector at K105+000 downstream of it. The two lie on a continuous main road without branches or breakpoints, so $a_{35}(t) = 1$, indicating connectivity. If $v_2$ lies on another parallel but unconnected branch, then $a_{25}(t) = 0$, indicating that no connection exists.

[0142] Dynamic similarity enhancement, i.e., traffic sequence similarity: Introducing historical sequence similarity calculation based on traffic features to reflect dynamic functional associations;

[0143] Historical sequence similarity is calculated as follows:

[0144] For a spatial node $v_i$ (and likewise $v_j$), the historical sequence within a window of length T is:

[0145] $s_i = [f_i(t-T+1), \ldots, f_i(t)]$;

[0146] In the formula, $s_i$ represents the historical time series of spatial node $v_i$ along a certain feature dimension and is a vector of length T; $f_i(\cdot)$ represents the node observation function, i.e., the value observed at spatial node $v_i$ at a given moment; T is the length of the time window, representing the number of historical time steps used for modeling; t-T+1 is the starting time point of the historical sequence.

[0147] Example: spatial node $v_3$ is an ETC gantry; the observed variable is the number of vehicles passing every 5 minutes, i.e., $f_3(t)$ represents the traffic frequency in the t-th time period; the time window is T = 4, i.e., the 4 most recent time periods are used;

[0148] At the current time t = 10: $s_3 = [f_3(10-4+1), \ldots, f_3(10)] = [f_3(7), f_3(8), f_3(9), f_3(10)]$;

[0149] If the actual data are [15, 18, 22, 35] (vehicles / 5 minutes), then $s_3 = [15, 18, 22, 35]$;

[0150] This sequence can be used to calculate the cosine similarity with other nodes to determine whether their traffic state evolution is consistent.

[0151] The similarity score between node pairs is calculated using cosine similarity:

[0152] $\mathrm{sim}(i,j) = \dfrac{s_i \cdot s_j}{\|s_i\| \, \|s_j\|}$;

[0153] In the formula, sim(i,j) represents the similarity score between spatial node $v_i$ and spatial node $v_j$; $s_i$ is the historical feature sequence of spatial node $v_i$; $s_j$ is the historical feature sequence of spatial node $v_j$; $s_i \cdot s_j$ is the dot product of the vectors $s_i$ and $s_j$; $\|s_i\|$ is the modulus (Euclidean norm) of the vector $s_i$; $\|s_j\|$ is the modulus of the vector $s_j$.

[0154] Example: $s_3 = [15, 18, 22, 35]$ is the flow sequence of spatial node $v_3$; $s_5 = [16, 20, 24, 33]$ is the flow sequence of spatial node $v_5$;

[0155] then: $\mathrm{sim}(3,5) = \dfrac{15 \times 16 + 18 \times 20 + 22 \times 24 + 35 \times 33}{\sqrt{15^2 + 18^2 + 22^2 + 35^2} \, \sqrt{16^2 + 20^2 + 24^2 + 33^2}} = \dfrac{2283}{\sqrt{2258} \, \sqrt{2321}} \approx 0.997$;

[0156] Therefore, the traffic state evolution trends of nodes $v_3$ and $v_5$ are highly similar, and they may be located on the same road segment or have a strong propagation relationship.
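
The cosine similarity in the example above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `cosine_similarity` and the handling of an all-zero sequence are illustrative assumptions, not part of the described method.

```python
import math


def cosine_similarity(s_i, s_j):
    """Cosine similarity between two equal-length historical sequences."""
    if len(s_i) != len(s_j):
        raise ValueError("sequences must share the same window length T")
    dot = sum(a * b for a, b in zip(s_i, s_j))
    norm_i = math.sqrt(sum(a * a for a in s_i))
    norm_j = math.sqrt(sum(b * b for b in s_j))
    if norm_i == 0.0 or norm_j == 0.0:
        # Assumption: an all-zero sequence is treated as dissimilar.
        return 0.0
    return dot / (norm_i * norm_j)


# Flow sequences of nodes v3 and v5 from the worked example.
s3 = [15, 18, 22, 35]
s5 = [16, 20, 24, 33]
sim_35 = cosine_similarity(s3, s5)
print(round(sim_35, 3))  # 0.997
```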

[0157] The fusion strategy constructs the dynamic adjacency matrix $A_t$:

[0158] Combining physical connectivity and similarity, the edge weights are updated as follows:

[0159] $A_t(i,j) = \begin{cases} 1, & \mathrm{sim}(i,j) > \theta_{high} \text{ and } (i,j) \in \text{top } K\% \\ 0, & \mathrm{sim}(i,j) < \theta_{low} \text{ and } (i,j) \in \text{bottom } K\% \\ A_{phys}(i,j), & \text{otherwise} \end{cases}$

[0160] In the formula, $A_t(i,j)$ represents the edge weight between spatial nodes $v_i$ and $v_j$ in the dynamically adjusted adjacency matrix at time t; $\theta_{high}$ is the connection enhancement threshold, a preset lower limit of similarity above which a connection is forced; $\theta_{low}$ is the connection pruning threshold, a preset upper limit of similarity below which a connection is removed; K% is the dynamic edge ratio control parameter, indicating that the top K% and bottom K% of all node pairs ranked by similarity are selected for special processing; $A_{phys}(i,j)$ is the element of the original physical adjacency matrix, indicating whether a direct road connection exists between spatial nodes $v_i$ and $v_j$ in the real highway network.

[0161] Example: a certain highway has 100 observation nodes, giving $\binom{100}{2} = 4950$ spatial node pairs in total. Let K% = 5%; then 5% × 4950 ≈ 248 node pairs at each end of the similarity ranking (the most similar and the least similar) are given special processing. If a pair of spatial nodes (i,j) has sim(i,j) = 0.93 > $\theta_{high}$ = 0.9, the connection is forced, $A_t(i,j) = 1$, even if the two nodes are not on the same main line. If sim(i,j) = 0.2 < $\theta_{low}$ = 0.3, the connection is removed, $A_t(i,j) = 0$, even if the two nodes are adjacent road segments. If sim(i,j) = 0.6, which is in the middle range, then whether the nodes are connected depends on $A_{phys}(i,j)$.
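
The fusion rule can be sketched as follows. This is only an illustration under stated assumptions: the function name `fuse_adjacency`, the symmetric update, and rounding the K% share upward (so that 5% of 4950 pairs gives 248) are choices made for the example, and the 4-node matrices are hypothetical toy data.

```python
import math
from itertools import combinations


def fuse_adjacency(a_phys, sim, theta_high=0.9, theta_low=0.3, k_percent=5.0):
    """Fuse physical adjacency with similarity into a dynamic adjacency matrix."""
    n = len(a_phys)
    # All unordered node pairs, sorted from least to most similar.
    pairs = sorted(combinations(range(n), 2), key=lambda p: sim[p[0]][p[1]])
    # Assumption: the K% share is rounded up (5% of 4950 pairs -> 248).
    n_special = math.ceil(len(pairs) * k_percent / 100.0)
    bottom = set(pairs[:n_special])
    top = set(pairs[len(pairs) - n_special:])
    a_t = [list(row) for row in a_phys]
    for i, j in pairs:
        s = sim[i][j]
        if (i, j) in top and s > theta_high:
            a_t[i][j] = a_t[j][i] = 1      # force a functional connection
        elif (i, j) in bottom and s < theta_low:
            a_t[i][j] = a_t[j][i] = 0      # prune a physically adjacent edge
        # otherwise the physical adjacency value is kept
    return a_t


# Hypothetical 4-node chain 0-1-2-3 with a toy similarity matrix.
a_phys = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
sim = [[1.0, 0.2, 0.6, 0.93],
       [0.2, 1.0, 0.6, 0.6],
       [0.6, 0.6, 1.0, 0.6],
       [0.93, 0.6, 0.6, 1.0]]
a_t = fuse_adjacency(a_phys, sim, k_percent=20.0)
for row in a_t:
    print(row)
```

In this toy run the dissimilar neighbours 0 and 1 are disconnected, the highly similar but non-adjacent nodes 0 and 3 are connected, and the mid-range pairs keep their physical adjacency.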

[0162] Based on the above graph structure definition and edge relationship construction mechanism, multi-source heterogeneous traffic data are uniformly mapped into a dynamic graph structure;

[0163] The dynamic graph structure is: the final generated time evolution graph structure sequence, represented as:

[0164] $\{G_1, G_2, \ldots, G_t, \ldots\}$, where $G_t = (V, A_t, X_t)$;

[0165] In the formula, $G_t$ is the traffic graph constructed at time point t, representing a dynamically evolving graph structure; $A_t$ represents the dynamic adjacency matrix that integrates historical traffic similarity with the original road network structure.

[0166] This dynamic graph structure integrates multi-dimensional information such as traffic flow, environment, and time. At the same time, the adjacency relationship is adjusted according to the traffic state and can be adapted to the input format of deep learning models such as graph neural networks.

[0167] This step integrates the physical road network structure with historical traffic similarities to construct a multi-source traffic map structure with spatiotemporal dynamic characteristics, providing a unified and computable data foundation for subsequent intelligent analysis.

[0168] Step 2: Fuzzy Label-Guided Training Mechanism

[0169] This step transforms traditional "deterministic classification" such as: smooth / congested / blocked into "probabilistic distribution learning of traffic states". By introducing fuzzy membership labels, the model can perceive the gradual process of traffic state changes more precisely, improving its ability to model critical states and complex evolution scenarios.

[0170] Traditional method: each node is assigned a single discrete label at time t, such as "congestion". In this step, as shown in Figure 3, each node is assigned a three-dimensional fuzzy score vector at time t, representing its membership degree in each of the three states. The specific steps are as follows:

[0171] Step 2.1 Fuzzy Label Construction:

[0172] Traffic conditions are divided into three fuzzy categories:

[0173] $\mu_i^{(1)}(t)$: membership degree of the smooth (unobstructed) state;

[0174] $\mu_i^{(2)}(t)$: membership degree of the moderate congestion state;

[0175] $\mu_i^{(3)}(t)$: membership degree of the severe congestion state;

[0176] All membership degrees satisfy the normalization condition:

[0177] $\mu_i^{(1)}(t) + \mu_i^{(2)}(t) + \mu_i^{(3)}(t) = 1$;

[0178] The fuzzy label of spatial node $v_i$ at time t is:

[0179] $S_i(t) = [\mu_i^{(1)}(t), \mu_i^{(2)}(t), \mu_i^{(3)}(t)]$;

[0180] In the formula, $S_i(t)$ represents the fuzzy score vector of spatial node $v_i$ at time t, a three-dimensional probability distribution vector describing the degree of membership of the node's current traffic state in the three fuzzy categories; i is the node index, indicating the i-th traffic observation node in the graph; t is a time variable, representing the current time point or time step; $\mu_i^{(1)}(t)$ represents the membership degree of spatial node $v_i$ in the smooth state at time t; $\mu_i^{(2)}(t)$ represents its membership degree in the moderate congestion state at time t; $\mu_i^{(3)}(t)$ represents its membership degree in the severe congestion state at time t; the superscripts (1), (2), (3) denote the three predefined fuzzy traffic state categories: smooth flow, moderate congestion, and severe congestion.

[0181] Assumption: The fuzzy score of a spatial node v5 at time t is:

[0182] S5(t) = [0.1, 0.3, 0.6];

[0183] That is, the node belongs to the smooth state with degree 0.1 (10%), to the moderate congestion state with degree 0.3 (30%), and to the severe congestion state with degree 0.6 (60%).

[0184] Overall assessment: This node is currently in a state of severe congestion, but still retains some characteristics of moderate congestion, and may be in the process of congestion worsening.

[0185] Fuzzy scoring vectors reflect the uncertainty and transition of traffic conditions; for example, a road segment may simultaneously exhibit characteristics of "partially smooth" and "mildly congested".

[0186] Step 2.2 Membership Function Design:

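[0187] Taking average speed as an example, triangular fuzzy membership functions are used to automatically generate fuzzy labels from the measured speed u:

[0188] Severe congestion $\mu^{(3)}(u)$:

[0189] $\mu^{(3)}(u) = \begin{cases} 1, & u \le u_1 \\ \dfrac{u_2 - u}{u_2 - u_1}, & u_1 < u < u_2 \\ 0, & u \ge u_2 \end{cases}$

[0190] Moderate congestion $\mu^{(2)}(u)$:

[0191] $\mu^{(2)}(u) = \begin{cases} \dfrac{u - u_1}{u_2 - u_1}, & u_1 < u \le u_2 \\ \dfrac{u_3 - u}{u_3 - u_2}, & u_2 < u < u_3 \\ 0, & \text{otherwise} \end{cases}$

[0192] Smooth (unobstructed) $\mu^{(1)}(u)$:

[0193] $\mu^{(1)}(u) = \begin{cases} 0, & u \le u_2 \\ \dfrac{u - u_2}{u_3 - u_2}, & u_2 < u < u_3 \\ 1, & u \ge u_3 \end{cases}$

[0194] In the formula, u represents the average speed of a node at the current moment, which is the core input variable for generating fuzzy labels; $\mu^{(k)}(u)$ represents the degree of membership of the road segment in the k-th traffic state when the average speed is u; $u_1$, $u_2$, and $u_3$ are three empirical speed thresholds that satisfy $u_1 < u_2 < u_3$ and delimit the transition zones between different traffic states.

[0195] Example: $u_1 = 20$, $u_2 = 40$, $u_3 = 60$, and the average speed of a certain road section is u = 35 km/h, then:

[0196] $\mu^{(3)}(35) = \dfrac{40 - 35}{40 - 20} = 0.25$;

[0197] $\mu^{(2)}(35) = \dfrac{35 - 20}{40 - 20} = 0.75$;

[0198] $\mu^{(1)}(35) = 0$;

[0199] The resulting fuzzy label is S = [0, 0.75, 0.25], indicating that the road section is predominantly moderately congested, with some characteristics of severe congestion.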
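
A minimal sketch of the label generation is given below. It assumes the usual triangular (piecewise linear) shape with shoulders at both ends, chosen so that it reproduces the worked example (u = 35 gives [0, 0.75, 0.25]) and the normalization condition; the function name `fuzzy_label` and the default thresholds are illustrative.

```python
def fuzzy_label(u, u1=20.0, u2=40.0, u3=60.0):
    """Return [smooth, moderate, severe] memberships for average speed u (km/h)."""
    if not (u1 < u2 < u3):
        raise ValueError("thresholds must satisfy u1 < u2 < u3")
    # Severe congestion: 1 below u1, falling linearly to 0 at u2.
    if u <= u1:
        severe = 1.0
    elif u < u2:
        severe = (u2 - u) / (u2 - u1)
    else:
        severe = 0.0
    # Moderate congestion: triangle peaking at u2.
    if u <= u1 or u >= u3:
        moderate = 0.0
    elif u <= u2:
        moderate = (u - u1) / (u2 - u1)
    else:
        moderate = (u3 - u) / (u3 - u2)
    # Smooth flow: 0 below u2, rising linearly to 1 at u3.
    if u <= u2:
        smooth = 0.0
    elif u < u3:
        smooth = (u - u2) / (u3 - u2)
    else:
        smooth = 1.0
    return [smooth, moderate, severe]


print(fuzzy_label(35.0))  # [0.0, 0.75, 0.25]
```

With this shape the three memberships always sum to 1, so the label can be used directly as a target distribution.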

[0200] Step 2.3 Joint Training Mechanism:

[0201] The fuzzy label is used as the model supervision target and introduced into the neural network training to form a joint learning process with the continuous variable prediction task. In order for the model to learn continuous traffic parameter prediction and fuzzy state estimation simultaneously, this step introduces a joint training loss function:

[0202] $L_{total} = L_{pred} + \lambda_1 L_{fuzzy} + \lambda_2 L_{structure}$;

[0203] In the formula, $L_{total}$ represents the total loss function; $L_{pred}$ represents the mean square error of traditional continuous value prediction, such as speed and flow; $L_{fuzzy}$ represents the fuzzy distribution estimation loss; $L_{structure}$ is a graph structure regularization term that keeps the prediction results of neighboring nodes smooth; $\lambda_1$ and $\lambda_2$ are the weighting coefficients of the corresponding components of the loss function.

[0204] Among them, the fuzzy distribution loss $L_{fuzzy}$ uses the KL divergence to measure the difference between the predicted distribution $\hat{S}_i(t)$ and the label distribution $S_i(t)$:

[0205] $L_{fuzzy} = \sum_{i=1}^{N} \sum_{k=1}^{3} \mu_i^{(k)}(t) \log \dfrac{\mu_i^{(k)}(t)}{\hat{\mu}_i^{(k)}(t)}$;

[0206] In the formula, $\sum_{i=1}^{N}$ represents summing over all N nodes in the graph; $\sum_{k=1}^{3}$ represents summing over the three traffic states; $\hat{\mu}_i^{(k)}(t)$ is the membership degree predicted by the model, i.e., the predicted probability that node $v_i$ is in state k at time t; $\mu_i^{(k)}(t) \log \dfrac{\mu_i^{(k)}(t)}{\hat{\mu}_i^{(k)}(t)}$ is the core term of the KL divergence, measuring the information deviation of the true distribution relative to the predicted distribution.

[0207] Assumption: The true label and model prediction of a certain node are as follows:

[0208]

[0209]

[0210] Then its KL loss contribution is:

[0211]

[0212] If the result is large, it indicates that the model has a significant prediction bias in terms of smooth traffic and severe congestion, and adjustments are needed.
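
The KL-divergence term can be illustrated with a short sketch. The label and prediction values below are hypothetical, chosen only to exercise the function; the name `kl_fuzzy_loss`, the `eps` guard against a zero prediction, and the convention that a zero label component contributes nothing are assumptions of this example.

```python
import math


def kl_fuzzy_loss(labels, preds, eps=1e-12):
    """Sum over nodes and states of p * log(p / q), with p the label and q the prediction."""
    total = 0.0
    for s_true, s_pred in zip(labels, preds):
        for p, q in zip(s_true, s_pred):
            if p > 0.0:  # convention: 0 * log(0 / q) = 0
                total += p * math.log(p / max(q, eps))
    return total


# Hypothetical single-node example: label vs. model prediction.
label = [0.0, 0.75, 0.25]
pred = [0.1, 0.6, 0.3]
loss = kl_fuzzy_loss([label], [pred])
print(round(loss, 4))  # 0.1218
```

The loss is zero when the prediction equals the label and grows as probability mass is placed on the wrong states.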

[0213] This step constructs a three-dimensional fuzzy label based on triangular membership functions and combines KL divergence loss with graph structure regularization to achieve soft classification modeling of traffic states, significantly improving the model's ability to recognize complex and gradually changing traffic states and its training stability.

[0214] Step 3: Graph Neural Network Fuzzy Distribution Prediction:

[0215] This step constructs a spatiotemporal joint modeling prediction framework, which uses graph neural networks (GNN) and recurrent neural networks (LSTM) to collaboratively extract the spatial dependence and temporal evolution features of highway traffic states, thereby achieving accurate prediction of the fuzzy distribution of traffic states at each node in the future.

[0216] The prediction framework adopts a three-stage structure of "spatial encoding + temporal encoding + fusion output", and the specific implementation steps are as follows:

[0217] Step 3.1 Graph convolution to extract spatial dependencies:

[0218] For the graph $G_t = (V, A_t, X_t)$ at each time t, spatial features are extracted using Chebyshev graph convolution (ChebNet). The Chebyshev graph convolution is calculated as follows:

[0219] $H^{(l+1),t} = \sigma\left( \sum_{k=0}^{K} \theta_k T_k(\tilde{L}_t) H^{(l),t} \right)$;

[0220] In the formula, $H^{(l+1),t}$ represents the output feature matrix of the (l+1)-th layer of the graph convolutional network at time t, with dimension $\mathbb{R}^{N \times d_{l+1}}$.

[0221] $H^{(l),t}$ represents the input feature matrix of the l-th layer of the graph convolutional network at time t, with dimension $\mathbb{R}^{N \times d_l}$; N represents the total number of spatial nodes in the graph, i.e., the number of location units with observation capability on the highway; $d_l$ is the feature dimension of layer l, which varies with the layer; $\sigma(\cdot)$ is the activation function; $\sum_{k=0}^{K}$ sums over k from 0 to K, representing the filtering expansion of the graph signal using a K-order Chebyshev polynomial; $\theta_k$ are the learnable graph convolution kernel parameters, equivalent to the weight coefficients of a filter, which are automatically optimized through training and control the contribution of different orders of the graph Laplacian; $T_k(\cdot)$ represents the k-th order Chebyshev polynomial; $\tilde{L}_t$ is the normalized graph Laplacian matrix at time t; K is the polynomial order.
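
The Chebyshev expansion can be sketched in plain Python on a toy graph. Several simplifications are assumed here and are not stated in the text: the coefficients `theta[k]` are scalars (so the feature dimension does not change), the activation is ReLU, and the largest Laplacian eigenvalue is approximated by 2 so that the scaled Laplacian reduces to $-D^{-1/2} A D^{-1/2}$. The helper names are illustrative.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of a (n x m) and b (m x p)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def scaled_laplacian(adj):
    """Return L~ = L - I with L = I - D^{-1/2} A D^{-1/2} (assumes lambda_max ~ 2)."""
    n = len(adj)
    inv_sqrt = [1.0 / math.sqrt(sum(row)) if sum(row) > 0 else 0.0 for row in adj]
    return [[-inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def cheb_conv(adj, h, theta):
    """One layer: ReLU(sum_k theta[k] * T_k(L~) h), with scalar theta[k]."""
    lap = scaled_laplacian(adj)
    terms = [h]                                  # T_0(L~) h = h
    if len(theta) > 1:
        terms.append(matmul(lap, h))             # T_1(L~) h = L~ h
    for _ in range(2, len(theta)):
        lt = matmul(lap, terms[-1])
        # Chebyshev recurrence: T_k = 2 L~ T_{k-1} - T_{k-2}
        terms.append([[2.0 * lt[i][j] - terms[-2][i][j]
                       for j in range(len(h[0]))] for i in range(len(h))])
    out = [[0.0] * len(h[0]) for _ in h]
    for th, term in zip(theta, terms):
        for i in range(len(h)):
            for j in range(len(h[0])):
                out[i][j] += th * term[i][j]
    return [[max(0.0, x) for x in row] for row in out]


# Toy path graph 0-1-2 with one feature per node.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
h0 = [[1.0], [2.0], [3.0]]
h1 = cheb_conv(adj, h0, theta=[1.0, 0.5])        # K = 1
print([[round(x, 4) for x in row] for row in h1])
```

A full layer would replace each scalar `theta[k]` with a learnable weight matrix of shape $d_l \times d_{l+1}$; the recurrence itself stays the same.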

[0222] Graph convolution is performed separately at each time step, and the output sequence is:

[0223] $H = \{H^{(L),t}\}_{t=1}^{T}$;

[0224] In the formula, H is the set of spatial encoding outputs used for time-series processing, containing the results for T time steps from t=1 to t=T; $H^{(L),t}$ (abbreviated $H_t$) is the final output feature matrix after L layers of graph convolution at time t; T is the length of the sliding time window; $h_i^t$ is the spatial encoding vector of spatial node $v_i$ at time t, i.e., the i-th row of $H_t$, and belongs to $\mathbb{R}^d$.

[0225] Example: N = 100 ETC gantry nodes; T = 6, the most recent 6 time steps; d = 64, each node outputs 64-dimensional features; then $H_t \in \mathbb{R}^{100 \times 64}$, and the sequence $\{H_1, H_2, \ldots, H_6\}$ forms the input basis for the subsequent LSTM.

[0226] Step 3.2 Constructing the time series input:

[0227] The spatial encoding results of the historical T time steps are stacked to form the time series input for each node; the spatial encoding sequence is calculated as follows:

[0228] $z_i = [h_i^{t-T+1}, h_i^{t-T+2}, \ldots, h_i^{t}] \in \mathbb{R}^{T \times d}$;

[0229] In the formula, $z_i$ represents the time series input of spatial node $v_i$, a two-dimensional matrix with dimensions T×d; $h_i^{\tau}$ is the spatial encoding vector of node $v_i$ at time τ, i.e., the i-th row of the graph convolution output at that time; i denotes the i-th spatial node in the graph; t-T+1 is the starting time point of the historical sequence; $\mathbb{R}^{T \times d}$ indicates that $z_i$ is a real matrix with T rows and d columns.

[0230] Example: t = 100, the current time; T = 5, the last 5 time steps; d = 64, each node outputs 64-dimensional features; then $z_i = [h_i^{96}, h_i^{97}, h_i^{98}, h_i^{99}, h_i^{100}] \in \mathbb{R}^{5 \times 64}$. This 5×64 matrix is used as input to the LSTM, allowing the model to learn the state change trend of the node over the last 5 steps.
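
Stacking the per-step encodings into a node-level sequence can be sketched as below. The dictionary layout `encodings[tau][node]` and the function name `build_node_sequence` are assumptions made for illustration, and the toy encodings use d = 2 instead of 64 to keep the example readable.

```python
def build_node_sequence(encodings, node, t, window):
    """Stack the rows of `node` for tau = t-window+1 .. t into a window x d matrix."""
    steps = range(t - window + 1, t + 1)
    missing = [tau for tau in steps if tau not in encodings]
    if missing:
        raise KeyError(f"no spatial encoding for time steps {missing}")
    return [list(encodings[tau][node]) for tau in steps]


# Hypothetical encodings for N = 3 nodes, d = 2, time steps 96..100.
encodings = {tau: [[float(tau), float(n)] for n in range(3)]
             for tau in range(96, 101)}
z_1 = build_node_sequence(encodings, node=1, t=100, window=5)
print(len(z_1), len(z_1[0]))  # 5 2
print(z_1[0], z_1[-1])        # [96.0, 1.0] [100.0, 1.0]
```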

[0231] Step 3.3 LSTM extracts time-dependent features:

[0232] $z_i$ is input into a Long Short-Term Memory (LSTM) network to capture the long-term and short-term evolution patterns of traffic states; the comprehensive state evolution trend vector is calculated as follows:

[0233] $\tilde{h}_i = \mathrm{LSTM}(z_i)$;

[0234] In the formula, $\tilde{h}_i$ represents the time-dependent feature vector of spatial node $v_i$ obtained after LSTM processing, which belongs to $\mathbb{R}^d$; i is the node index, representing the i-th traffic observation node in the graph; LSTM() is a Long Short-Term Memory network used to capture long-term and short-term time-dependent features in the input sequence.

[0235] Example: $z_i \in \mathbb{R}^{6 \times 64}$ is the spatial encoding sequence of spatial node $v_i$ over the past 6 time steps, totaling 30 minutes; the LSTM hidden layer dimension is 64; then the input $z_i$ is a 6×64 matrix and the output $\tilde{h}_i$ is a 64-dimensional vector; this vector encodes trend information such as "whether the road segment is becoming congested" and "whether the congestion is accelerating".

[0236] Step 3.4: Fuse and output the fuzzy prediction distribution:

[0237] The comprehensive state evolution trend vector is input into the fully connected layer and, combined with the softmax activation function, yields the fuzzy state prediction for the future time t+1; it is calculated as follows:

[0238] $\hat{S}_i(t+1) = \mathrm{softmax}(W_o \tilde{h}_i + b_o)$;

[0239] In the formula, $\hat{S}_i(t+1)$ represents the predicted fuzzy score vector of spatial node $v_i$ at the future time t+1; softmax() is the softmax function, used to convert a real vector into a probability distribution; $\tilde{h}_i$ is the comprehensive state evolution trend vector output by the LSTM; $W_o \in \mathbb{R}^{3 \times d}$ is the prediction layer weight matrix; $b_o \in \mathbb{R}^3$ is the bias term, a learnable three-dimensional vector used to adjust the output baseline value for each state, thereby improving the model's expressive power; $W_o \tilde{h}_i + b_o$ is the linear transformation result of the fully connected layer, a three-dimensional real vector.

[0240] Example: the output of a certain node is $\hat{S}_i(t+1) = [0.1, 0.2, 0.7]$;

[0241] This means there is a 10% chance of being in a smooth traffic flow, a 20% chance of being in a moderately congested traffic flow, and a 70% chance of being in a severely congested traffic flow.

[0242] Overall assessment: This node is highly likely to experience a severe congestion event in the future, and can be used as input for anomaly detection.
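
The output layer can be sketched as a linear map followed by a softmax. The parameter values below are hypothetical and were picked only so that the result matches the [0.1, 0.2, 0.7] example; in a trained model the weights and bias would be learned, and the trend vector would have the LSTM hidden dimension rather than 4.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of real numbers."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_fuzzy_state(h_tilde, w_o, b_o):
    """softmax(W_o h + b_o): maps a trend vector to three state memberships."""
    logits = [sum(w * x for w, x in zip(row, h_tilde)) + b
              for row, b in zip(w_o, b_o)]
    return softmax(logits)


# Hypothetical parameters: zero weights, bias set to the log of the target shares.
h_tilde = [0.5, -1.0, 2.0, 0.0]
w_o = [[0.0] * 4 for _ in range(3)]
b_o = [math.log(0.1), math.log(0.2), math.log(0.7)]
s_hat = predict_fuzzy_state(h_tilde, w_o, b_o)
print([round(p, 3) for p in s_hat])  # [0.1, 0.2, 0.7]
```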

[0243] The input data for this step is the dynamic graph structure from step 1: $G_t = (V, A_t, X_t)$. During training, the graph neural network model relies on the fuzzy labels provided in step 2 as supervision; by minimizing the joint loss function, it achieves accurate modeling and parameter optimization of the spatiotemporal evolution of traffic states. This step constructs a hybrid "graph convolution-LSTM" model: first, at each time step, graph convolution is used to extract the spatial dependency features of traffic states; then, the LSTM models the temporal evolution trend, ultimately outputting the three-dimensional fuzzy state prediction distribution for future times. As shown in Figure 3, accurate spatiotemporal prediction of highway traffic conditions is thereby achieved.

[0244] Step 4: Anomaly subgraph detection based on distribution mutations:

[0245] This step breaks through the limitations of traditional "point-based alarms" or "fixed threshold rules" and proposes an anomaly detection mechanism driven by predictive distribution mutations. By analyzing the temporal change rate of the fuzzy distribution of traffic conditions, it identifies local areas with significant state transitions and compact spatial connectivity, thereby accurately locating the spatiotemporal range of potential traffic events.

[0246] Input sources for this step: (1) output from step 3: the fuzzy state prediction distributions $\hat{S}_i(t)$ of each spatial node $v_i$ at consecutive time points; (2) graph structure from step 1: the dynamic adjacency matrix $A_t$, which defines the connection relationships between nodes.

[0247] Specifically, step 4.1 calculates the node score mutation rate:

[0248] Define spatial node v i The score mutation rate at time t quantifies the degree of drastic change in the distribution of traffic conditions, and measures the total variation of the predicted distribution from t to t+1; the larger the value, the more drastic the change in condition, such as from smooth and fast traffic to severe congestion; this process does not depend on the absolute state, but focuses on the rate of change, and is sensitive to sudden events.

[0249] The formula for calculating the score mutation rate is as follows:

[0250] $\Delta S_i(t) = \sum_{k=1}^{3} \left| \hat{\mu}_i^{(k)}(t+1) - \hat{\mu}_i^{(k)}(t) \right|$;

[0251] In the formula, $\Delta S_i(t)$ represents the score mutation rate of spatial node $v_i$ at time t; $\hat{\mu}_i^{(k)}(t)$ represents the membership degree predicted by the model at time t for spatial node $v_i$ in the k-th traffic state; k=1 for smooth traffic, k=2 for moderate congestion, and k=3 for severe congestion.

[0252] Example: the prediction result for a certain spatial node $v_5$ is as follows:

[0253]

| State | Time t | Time t+1 | Absolute difference |
| --- | --- | --- | --- |
| Smooth $\mu^{(1)}$ | 0.8 | 0.1 | 0.7 |
| Moderate congestion $\mu^{(2)}$ | 0.2 | 0.3 | 0.1 |
| Severe congestion $\mu^{(3)}$ | 0.0 | 0.6 | 0.6 |
| Total $\Delta S_5(t)$ | - | - | 1.4 |

[0254] Therefore, the state of this node deteriorates drastically between t and t+1; with a score mutation rate as high as 1.4, it is highly likely to correspond to a traffic accident or congestion event.
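
The score mutation rate is the sum of absolute changes across the three membership components, which the following sketch reproduces for node v5. The function name `score_mutation_rate` is an illustrative choice.

```python
def score_mutation_rate(pred_t, pred_t1):
    """Sum of absolute changes between two predicted fuzzy distributions."""
    if len(pred_t) != len(pred_t1):
        raise ValueError("both distributions must have the same number of states")
    return sum(abs(b - a) for a, b in zip(pred_t, pred_t1))


# Node v5 from the worked example: [smooth, moderate, severe] at t and t+1.
delta_s5 = score_mutation_rate([0.8, 0.2, 0.0], [0.1, 0.3, 0.6])
print(round(delta_s5, 2))  # 1.4
```

Because both inputs are probability distributions, the value always lies between 0 (no change) and 2 (all mass moved to a different state).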

[0255] Step 4.2 Filter the set of candidate abnormal nodes:

[0256] To eliminate noise and isolated fluctuations, a dual screening condition is set to construct a seed set of suspected anomalous nodes, represented as follows:

[0257] $V_\Delta = \{ v_i \in V \mid \Delta S_i(t) > \theta_1 \ \text{and} \ \hat{\mu}_i^{(3)}(t+1) \ge \theta_2 \}$;

[0258] In the formula, $V_\Delta$ is the set of candidate anomalous nodes; $\Delta S_i(t)$ represents the score mutation rate of spatial node $v_i$ at time t; $\theta_1$ represents the score mutation rate threshold, used to screen for nodes with significant transitions; $\hat{\mu}_i^{(3)}(t+1)$ is the predicted membership degree of spatial node $v_i$ in the severe congestion state at the next time step t+1; $\theta_2$ represents the lower bound of the severe congestion score component, used to ensure that the risk state is explicit.

[0259] Assumptions: θ1 = 0.5, θ2 = 0.6;

[0260]

[0261] Step 4.3 Extracting the anomaly subgraph:

[0262] On the graph structure $G_t = (V, A_t)$, taking the nodes in $V_\Delta$ as the basis, extract the connected subgraphs that satisfy the following two conditions:

[0263] (1) Connectivity requirement: all nodes in the subgraph $G_{sub} = (V_{sub}, E_{sub})$ must be topologically reachable; that is, there must be a path between any two nodes, which ensures that the abnormal region is a physically continuous spatial block rather than a set of scattered points.

[0264] (2) Structural density constraint: define the edge density ρ of the subgraph, which must satisfy:

[0265] $\rho = \dfrac{2 |E_{sub}|}{|V_{sub}| \, (|V_{sub}| - 1)} > \theta_3$;

[0266] In the formula, ρ is the edge density of the subgraph, representing the ratio of the actual number of edges in the subgraph to the maximum possible number of edges; $E_{sub}$ is the edge set of the subgraph, i.e., the set of connections between all nodes in the subgraph; $V_{sub}$ is the node set of the subgraph, i.e., the set of all nodes that constitute the candidate anomaly region; $\theta_3$ represents the structural compactness threshold, used to prevent sparse noise from being clustered.

[0267] Example: $|V_{sub}| = 4$ nodes; $|E_{sub}| = 4$ edges;

[0268] Then the edge density is: $\rho = \dfrac{2 \times 4}{4 \times 3} = \dfrac{4}{6} \approx 0.67$;

[0269] If we set $\theta_3 = 0.5$, then ρ = 0.67 > 0.5, which satisfies the density constraint, so this subgraph is retained.
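
Steps 4.2 and 4.3 can be sketched together: screen seed nodes, group them into connected components of the induced subgraph, and keep the components that are dense enough. This is a minimal illustration, not the patented procedure itself; the comparison operators at the thresholds, the rejection of single-node components, and the 6-node toy graph are assumptions.

```python
from collections import deque


def candidate_nodes(delta_s, severe_next, theta1=0.5, theta2=0.6):
    """Seed set: large mutation rate and high predicted severe-congestion membership."""
    return {i for i in range(len(delta_s))
            if delta_s[i] > theta1 and severe_next[i] >= theta2}


def connected_components(adj, nodes):
    """Connected components of the subgraph induced by `nodes` (breadth-first search)."""
    remaining, components = set(nodes), []
    while remaining:
        start = remaining.pop()
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in list(remaining):
                if adj[u][v]:
                    remaining.remove(v)
                    comp.add(v)
                    queue.append(v)
        components.append(comp)
    return components


def edge_density(adj, comp):
    """Actual edges divided by the maximum possible number of edges."""
    nodes = sorted(comp)
    n = len(nodes)
    if n < 2:
        return 0.0  # assumption: a single node has no density
    edges = sum(1 for a in range(n) for b in range(a + 1, n)
                if adj[nodes[a]][nodes[b]])
    return 2.0 * edges / (n * (n - 1))


def anomaly_subgraphs(adj, seeds, theta3=0.5):
    """Connected seed regions whose edge density exceeds theta3."""
    return [c for c in connected_components(adj, seeds)
            if edge_density(adj, c) > theta3]


# Toy graph: nodes 0-3 form a 4-cycle, node 4 hangs off node 3, node 5 is isolated.
adj = [[0, 1, 0, 1, 0, 0],
       [1, 0, 1, 0, 0, 0],
       [0, 1, 0, 1, 0, 0],
       [1, 0, 1, 0, 1, 0],
       [0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0]]
delta_s = [0.9, 0.8, 0.7, 1.4, 0.1, 0.9]
severe_next = [0.7, 0.6, 0.65, 0.6, 0.1, 0.7]
seeds = candidate_nodes(delta_s, severe_next)
regions = anomaly_subgraphs(adj, seeds)
print(regions)  # [{0, 1, 2, 3}]
```

The 4-cycle has 4 nodes and 4 edges, so its density is 4/6 ≈ 0.67 as in the worked example, while the isolated seed node 5 is discarded.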

[0270] Step 4.4 Output the set of anomaly candidate regions: Collect all subgraphs that meet the conditions as follows:

[0271] $G_{sub} = \{G_1, G_2, \ldots, G_m\}$;

[0272] In the formula, $G_{sub}$ is the set of anomaly candidate regions; $G_j$ is the j-th anomaly candidate subgraph, representing a potential traffic anomaly impact area; m is the total number of anomaly candidate subgraphs.

[0273] Example: on a certain section of highway, the following are detected: $G_1$ consists of 5 nodes in the K100-K105 segment, with a high mutation rate and high density → a traffic accident may have occurred; $G_2$ consists of 3 nodes in the K150-K152 segment, which also meets the conditions → local congestion may have been caused by construction.

[0274] Then $G_{sub} = \{G_1, G_2\}$, m = 2; the system scores these two areas and makes alarm decisions for them in the subsequent steps.

[0275] This step filters candidate anomalous nodes by calculating node-level score mutation rates and combining them with the probability of severe congestion. It then extracts connected and densely structured anomalous subgraphs based on the graph structure, as shown in Figure 4. This enables a shift from "point mutation" to "structured anomalous region", identifying the impact range of potential traffic incidents.

[0276] Step 5: Event Scoring and Alarm Determination

[0277] After the set of anomaly candidate regions $G_{sub} = \{G_1, G_2, \ldots, G_m\}$ has been identified, this step further quantifies the risk level of each subgraph and dynamically determines whether to trigger an alarm, moving from anomaly detection to intelligent decision-making and addressing the poor adaptability of traditional fixed-threshold alarms across different road sections, time periods, and traffic conditions.

[0278] The input for this step is the output of step 4: the set of anomaly candidate regions $G_{sub}$. Each anomaly candidate region includes: the node set $V_{sub}$ of the subgraph, the edge set $E_{sub}$ of the subgraph, the predicted distributions $\hat{S}_i(t)$ and $\hat{S}_i(t+1)$ of each node, and the score mutation rate $\Delta S_i(t)$.

[0279] Follow these steps to complete the process:

[0280] Step 5.1 Define the multi-factor event scoring function:

[0281] For each anomaly candidate region $G_{sub}$, calculate a comprehensive event score $E(G_{sub})$, which integrates risk characteristics from three dimensions and is represented as:

[0282] $E(G_{sub}) = \alpha \cdot \mathrm{Avg}(\Delta S_i) + \beta \cdot \mathrm{ClusteringCoeff}(G_{sub}) + \theta \cdot \mathrm{RatingShift}(G_{sub})$;

[0283] In the formula, $E(G_{sub})$ is the comprehensive event score, used to assess whether the current anomalous subgraph should trigger an alarm; a higher value indicates a higher risk. α is the mutation intensity weight; $\mathrm{Avg}(\Delta S_i)$ represents the average score mutation rate of all nodes within the subgraph, reflecting the severity of the overall state transition; β is the structural density weight; $\mathrm{ClusteringCoeff}(G_{sub})$ represents the average local clustering coefficient of the subgraph, measuring the compactness and cohesion of its topology; θ is the weight of the state shift direction; $\mathrm{RatingShift}(G_{sub})$ represents the average trend of the severe congestion membership degree of the spatial nodes within the subgraph, used to determine whether congestion is worsening.

[0284] Among them, the mutation intensity term $\mathrm{Avg}(\Delta S_i)$ is calculated as:

[0285] $\mathrm{Avg}(\Delta S_i) = \dfrac{1}{|V_{sub}|} \sum_{v_i \in V_{sub}} \Delta S_i(t)$;

[0286] $\mathrm{Avg}(\Delta S_i)$ measures the average intensity of state change of the nodes within a subgraph; the larger the value, the more drastic the overall change, reflecting the suddenness of the event.

[0287] The structural compactness term $\mathrm{ClusteringCoeff}(G_{sub})$ is calculated as:

[0288] $\mathrm{ClusteringCoeff}(G_{sub}) = \dfrac{1}{|V_{sub}|} \sum_{v_i \in V_{sub}} C_i$, with $C_i = \dfrac{2 T_i}{d_i (d_i - 1)}$;

[0289] In the formula, $T_i$ is the number of triangles centered on spatial node $v_i$, i.e., the number of pairs of its neighbors that are also connected to each other; $d_i$ is the degree of spatial node $v_i$, i.e., the number of edges directly connected to it; $C_i$ is the local clustering coefficient of spatial node $v_i$.

[0290] The state shift direction term $\mathrm{RatingShift}(G_{sub})$ is calculated as:

[0291] $\mathrm{RatingShift}(G_{sub}) = \dfrac{1}{|V_{sub}|} \sum_{v_i \in V_{sub}} \left( \hat{\mu}_i^{(3)}(t+1) - \hat{\mu}_i^{(3)}(t) \right)$;

[0292] In the formula, $\hat{\mu}_i^{(3)}(t)$ is the predicted membership degree of spatial node $v_i$ in the severe congestion state at time t; $\hat{\mu}_i^{(3)}(t+1)$ is the predicted membership degree of spatial node $v_i$ in the severe congestion state at the next time step t+1.

[0293] $\mathrm{RatingShift}(G_{sub})$ measures the average growth trend of the "severe congestion membership" of the nodes within the subgraph; a positive value indicates that congestion is worsening, while a negative value indicates that congestion is easing; it is used to determine the direction of situation evolution and avoid false alarms.

[0294] The weight coefficients α, β, and θ are all non-negative hyperparameters. They can be set empirically, such as α = 0.5, β = 0.3, and θ = 0.2, or they can be adaptively learned through data-driven methods, such as grid search, gradient optimization, and reinforcement learning.

[0295] Assumption: A subgraph G1 contains 4 nodes, and the calculation yields:

[0296] $\mathrm{Avg}(\Delta S_i) = 0.8$, indicating drastic changes; $\mathrm{ClusteringCoeff}(G_{sub}) = 0.7$, indicating a compact structure; $\mathrm{RatingShift}(G_{sub}) = +0.5$, indicating continued deterioration; the weights are set as follows: α = 0.5, β = 0.3, θ = 0.2.

[0297] Then: E(G1)=0.5×0.8+0.3×0.7+0.2×0.5=0.71, which means: the overall score is high and an alarm is very likely to be triggered.
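
The scoring function can be sketched as below. It assumes the standard local clustering coefficient computed inside the subgraph and a plain mean for the two averaged terms; the function names and the triangle-shaped toy subgraph are illustrative, while the weights 0.5 / 0.3 / 0.2 are the example values from the text.

```python
def local_clustering(adj, nodes, i):
    """C_i = 2 T_i / (d_i (d_i - 1)), restricted to the subgraph `nodes`."""
    nbrs = [j for j in nodes if j != i and adj[i][j]]
    d = len(nbrs)
    if d < 2:
        return 0.0
    triangles = sum(1 for a in range(d) for b in range(a + 1, d)
                    if adj[nbrs[a]][nbrs[b]])
    return 2.0 * triangles / (d * (d - 1))


def combine_score(avg_delta, clustering, shift, alpha=0.5, beta=0.3, theta=0.2):
    """Weighted sum of the three risk terms."""
    return alpha * avg_delta + beta * clustering + theta * shift


def event_score(adj, nodes, delta_s, severe_t, severe_t1,
                alpha=0.5, beta=0.3, theta=0.2):
    """Compute Avg(dS), ClusteringCoeff and RatingShift for a subgraph, then combine."""
    nodes = sorted(nodes)
    n = len(nodes)
    avg_delta = sum(delta_s[i] for i in nodes) / n
    clustering = sum(local_clustering(adj, nodes, i) for i in nodes) / n
    shift = sum(severe_t1[i] - severe_t[i] for i in nodes) / n
    return combine_score(avg_delta, clustering, shift, alpha, beta, theta)


# Worked example from the text: 0.5*0.8 + 0.3*0.7 + 0.2*0.5.
print(round(combine_score(0.8, 0.7, 0.5), 2))  # 0.71

# Hypothetical triangle subgraph: every node's two neighbours are connected.
adj = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
score = event_score(adj, {0, 1, 2},
                    delta_s=[1.0, 1.0, 1.0],
                    severe_t=[0.2, 0.2, 0.2],
                    severe_t1=[0.7, 0.7, 0.7])
print(round(score, 2))  # 0.9
```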

[0298] Step 5.2 Adaptive Alarm Judgment Mechanism:

[0299] To overcome the problem of unstable performance of fixed threshold alarms under different road networks, time periods, and traffic volumes, an adaptive boundary learner is introduced.

[0300] The judgment rule is:

[0301] $\mathrm{Alert}(G_{sub}) = \begin{cases} 1, & E(G_{sub}) > B(G_{sub}) \\ 0, & \text{otherwise} \end{cases}$

[0302] In the formula, $\mathrm{Alert}(G_{sub})$ is the alarm trigger result, a binary output; $B(G_{sub})$ is the alarm boundary function; "otherwise" covers all cases that do not satisfy $E(G_{sub}) > B(G_{sub})$.

[0303] Step 4.3 Output:

[0304] When Alert(G_sub) = 1, the system outputs structured event information:

[0305] <Timestamp, Subgraph ID, Overall Score E, Maximum ΔS, Status Level, Alarm Level>. The timestamp is the time when the event occurred or was detected; the subgraph ID is a unique identifier for the abnormal area; the overall score E is E(G_sub), used for sorting and priority judgment; the maximum ΔS is the largest score mutation rate among the nodes in the subgraph, reflecting the point of most drastic change; the status level is a semantic label such as "moderate congestion spread" or "severe congestion formation"; the alarm level is, for example, level 1 (emergency), level 2 (attention), or level 3 (observation).
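One possible way to represent the structured event information is a small immutable record, sketched below. The class name, field names, ISO 8601 timestamp format, and the sample node values are assumptions made for illustration; the text only fixes the six fields and the meaning of the three alarm levels.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class EventRecord:
    timestamp: str        # time the event occurred or was detected
    subgraph_id: str      # unique identifier of the abnormal area
    overall_score: float  # E(G_sub), used for sorting and priority
    max_delta_s: float    # largest node score mutation rate in the subgraph
    status_level: str     # semantic label, e.g. "severe congestion formation"
    alarm_level: int      # 1 = emergency, 2 = attention, 3 = observation


def build_event_record(subgraph_id, score, node_delta_s, status_level,
                       alarm_level, detected_at=None):
    """Assemble the record; node_delta_s maps node id -> score mutation rate."""
    if alarm_level not in (1, 2, 3):
        raise ValueError("alarm_level must be 1, 2, or 3")
    detected_at = detected_at or datetime.now(timezone.utc)
    return EventRecord(
        timestamp=detected_at.isoformat(),
        subgraph_id=subgraph_id,
        overall_score=score,
        max_delta_s=max(node_delta_s.values()),
        status_level=status_level,
        alarm_level=alarm_level,
    )


# Illustrative node values whose average is 0.8, as in the G1 example.
record = build_event_record(
    "G1", 0.71, {"v1": 0.9, "v2": 0.7, "v3": 0.8, "v4": 0.8},
    "severe congestion formation", 1,
    detected_at=datetime(2025, 8, 5, 8, 30, tzinfo=timezone.utc),
)
payload = asdict(record)
```

Converting the record with `asdict` gives a plain dictionary that can be serialized or sorted by `overall_score` when several subgraphs raise alarms at the same time step.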

[0306] This step involves designing a multi-factor event scoring function that integrates mutation intensity, structural compactness, and state offset direction, and combining it with an adaptive alarm boundary learning mechanism based on subgraph features. This enables accurate quantitative assessment and intelligent alarm decision-making for traffic emergencies, significantly improving the accuracy and scenario adaptability of alarms.

[0307] In summary, this highway event detection algorithm based on data situation analysis integrates multi-source traffic data graph modeling, fuzzy label-guided training, graph neural network prediction, score distribution mutation detection, and abnormal subgraph scoring with boundary discrimination. The workflow covers the full chain of data modeling, prediction, detection, and decision-making, and is used to realize intelligent perception of highway traffic operation status, accurate identification of emergencies, and dynamic assessment of risk levels.

[0308] The foregoing has shown and described the basic principles, main features, and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited to the above embodiments. The embodiments and descriptions in the specification are merely illustrative of the principles of the invention. Various changes and modifications can be made to the invention without departing from its spirit and scope, and all such changes and modifications fall within the scope of the present invention as claimed. The scope of protection of this invention is defined by the appended claims and their equivalents.

Claims

1. A highway event detection algorithm based on data situation analysis, characterized in that it includes the following steps:
Step 1, unified graph structure modeling of multi-source traffic data: collect multi-source heterogeneous traffic data on highways, and map the multi-source heterogeneous traffic data into a unified dynamic graph structure through a graph structure definition and an edge relationship construction mechanism;
Step 2, fuzzy label-guided training mechanism: based on the dynamic graph structure, define three fuzzy categories (smooth, moderately congested, and severely congested); generate fuzzy labels using triangular fuzzy membership functions; and use the fuzzy labels as supervision signals to construct a multi-task joint loss function and complete end-to-end training;
Step 3, graph neural network fuzzy distribution prediction: construct a spatiotemporal joint modeling prediction framework; based on the dynamic graph structure and the multi-task joint loss function, use a graph neural network (GNN) and a long short-term memory (LSTM) recurrent neural network to jointly extract the spatial dependence and temporal evolution features of the highway traffic state, and accurately predict the future fuzzy distribution of the traffic state at each node;
Step 4, anomaly subgraph detection based on abrupt distribution changes: using an anomaly detection mechanism driven by abrupt changes in the predicted distribution, and based on the fuzzy distribution of traffic states at each node and the dynamic adjacency matrix of the dynamic graph structure, identify the set of anomaly candidate regions by analyzing the temporal rate of change of the fuzzy distribution of traffic states;
Step 5, event scoring and alarm judgment: based on the set of anomaly candidate regions, quantify the risk level of each subgraph by defining a multi-factor event scoring function and an adaptive alarm judgment mechanism, and dynamically determine whether to trigger an alarm.
In step 5, the multi-factor event scoring function is defined as: E(G_sub) = α·Avg(ΔS_i) + β·ClusteringCoeff(G_sub) + θ·RatingShift(G_sub). In the formula: E(G_sub) is the event score, which comprehensively assesses whether the current anomaly subgraph should trigger an alarm (the higher the score, the higher the risk); G_sub is a subgraph from the set of anomaly candidate regions; α is the weight of mutation intensity; Avg(ΔS_i) is the average score mutation rate of all nodes within the subgraph, reflecting the severity of the overall state transition; β is the weight of structural compactness; ClusteringCoeff(G_sub) is the average local clustering coefficient of the subgraph, measuring the compactness and cohesion of its topological structure; θ is the weight of the state offset direction; RatingShift(G_sub) is the average trend of the severe congestion membership degree of the spatial nodes within the subgraph, used to determine whether congestion is worsening; ΔS_i is the score mutation rate of spatial node v_i; the node set of the subgraph is the set of all nodes that constitute the anomaly candidate region; T_i is the number of triangles centered on spatial node v_i, i.e., the number of pairs of its neighbors that are also connected to each other; d_i is the degree of spatial node v_i, i.e., the number of edges directly connected to it; the local clustering coefficient of spatial node v_i is computed from T_i and d_i; and the two membership terms are the predicted membership degrees of the severe congestion state for spatial node v_i at time t and at the next time step t+1.

2. The highway event detection algorithm based on data situation analysis according to claim 1, characterized in that: in step 1, the multi-source heterogeneous traffic data includes ETC gantry passage data, toll station entry and exit information, traffic detector data, weather information, and holiday markers; in step 1, the graph structure is defined as follows: at any time t, the transportation network is modeled as a dynamic graph composed of three elements: the set of spatial nodes in the transportation network, where each node represents a traffic location unit with data observations; the edge set, which indicates the relationships between nodes at time t; and the node feature matrix, each row of which is the multidimensional feature vector of one node at time t. The feature vector of each node is formed by fusing the multi-source heterogeneous traffic data and comprises, for the node with index i at time t: the frequency of vehicles passing through the ETC gantry node per unit time; the difference between the numbers of vehicles entering and exiting the toll station; the flow rate per unit time; the average speed; the weather category code (0 = sunny, 1 = rain, 2 = snow, 3 = fog); and the holiday flag (0 = holiday, 1 = other).

3. The highway event detection algorithm based on data situation analysis according to claim 2, characterized in that: in step 1, the edge relationship construction mechanism includes a basic connection relationship, dynamic similarity enhancement, and a fusion strategy for constructing the dynamic adjacency matrix;
Basic connection relationship: an initial adjacency matrix is defined from the real road network structure. In the formula: each element indicates whether a connection, i.e., an edge, exists at time t between nodes v_i and v_j in the graph; i and j are node indices, denoting the i-th and j-th traffic observation points in the graph; v_i and v_j are the i-th and j-th spatial nodes, each corresponding to a geographic location unit on the highway with data collection capability; the element is set according to whether spatial nodes v_i and v_j are directly connected in the physical road network, i.e., whether one can be reached from the other via the highway without a detour; t is the time variable, denoting the current time point or time step;
Dynamic similarity enhancement: a historical sequence similarity calculation based on traffic features is introduced. For spatial nodes v_i and v_j, a historical sequence is built over a time window. In the formula: the historical time series of a spatial node for a given feature dimension is a sequence whose length equals the window length; the node observation function gives the value observed at the spatial node at a given moment; the time window length is the number of historical time steps used for modeling; and the sequence begins at the starting point of the historical window. The similarity score between a pair of nodes is calculated using cosine similarity. In the formula: the result is the similarity score between spatial node v_i and spatial node v_j; the inputs are the historical feature sequences of v_i and v_j; the numerator is the dot product of the two sequence vectors; and the denominator is the product of the norms of the two vectors;
Fusion strategy for constructing the dynamic adjacency matrix: edge weights are updated by combining physical connectivity and similarity. In the formula: the result is the edge weight between spatial nodes v_i and v_j in the dynamically adjusted adjacency matrix at time t; the connection enhancement threshold is a preset lower limit of similarity; the connection pruning threshold is a preset upper limit of similarity; the dynamic edge ratio control parameter indicates that, from the full set of node pairs, the given proportion of edges with the highest similarity and the given proportion with the lowest similarity are selected as special cases; and the element of the original physical adjacency matrix indicates whether spatial nodes v_i and v_j are directly connected by road in the real highway network;
Dynamic graph structure: the finally generated output is a sequence of time-evolving graph structures. In the formula: each element is the traffic graph constructed at one time point, a dynamically evolving graph structure; and its dynamic adjacency matrix integrates historical traffic similarity with the original road network structure.

4. The highway event detection algorithm based on data situation analysis according to claim 1, characterized in that: in step 2, when dividing into three fuzzy categories, all membership degrees satisfy the normalization condition; the fuzzy label of spatial node v_i at time t is defined as a fuzzy rating vector. In the formula: the fuzzy rating vector of spatial node v_i in the graph at time t is a three-dimensional probability distribution vector that describes the degree of membership of the node's current traffic state in the three fuzzy categories; i is the node index, denoting the i-th traffic observation node in the graph; t is the time variable, denoting the current time point or time step; the three components are the membership degrees of spatial node v_i at time t in the smooth state, the moderate congestion state, and the severe congestion state, respectively; and the three category symbols denote the three predefined fuzzy traffic state categories: smooth flow, moderate congestion, and severe congestion.

5. The highway event detection algorithm based on data situation analysis according to claim 4, characterized in that: in step 2, fuzzy labels are generated from traffic parameters including speed, density, and flow rate using triangular fuzzy membership functions, with one triangular membership function defined for each of severe congestion, moderate congestion, and smooth flow. In the formula: the average speed of a node at the current moment is the core input variable for generating fuzzy labels; each membership function gives the degree to which the road section belongs to the corresponding traffic class at that average speed; and three empirical speed thresholds, required to satisfy a fixed ordering, delineate the transition zones between the different traffic conditions. Using the fuzzy labels as supervision signals, a multi-task joint loss function is constructed. In the formula: the total loss combines the mean square error of traditional continuous value prediction, such as speed and flow rate; the loss of the fuzzy distribution estimate; and a term that keeps the prediction results smooth across neighborhoods in the graph structure; two weighting coefficients set the contribution of the different components of the loss function.

6. The highway event detection algorithm based on data situation analysis according to claim 3, characterized in that step 3 includes the following steps:
Step 3.1, graph convolution to extract spatial dependencies: for the graph at each time t, spatial features are extracted using Chebyshev graph convolution (ChebNet). In the Chebyshev graph convolution formula: the output is the output feature matrix of a graph convolution layer at time t, and the input is the input feature matrix of that layer at time t, each with one row per spatial node and one column per feature of the corresponding layer; the number of rows is the total number of spatial nodes in the graph, i.e., the number of location units with observation capability on the highway; the feature dimension varies from layer to layer; an activation function is applied to the result; the summation index runs from 0 to the polynomial order, meaning that a Chebyshev polynomial expansion of that order is used to filter the graph signal; the graph convolution kernel parameters are learnable, equivalent to the weight coefficients of a filter, are optimized automatically through training, and control the contribution of graph Laplacian operators of different orders; the remaining symbols denote the Chebyshev polynomial of each order, the normalized graph Laplacian matrix, and the order of the polynomial. Graph convolution is performed separately at each time step, and the outputs form a sequence. In the formula for this sequence: the sequence is the set of spatial encoding outputs used for time-series processing, containing the results of every time step in the sliding time window; each element is the final output feature matrix at one time step after all graph convolution layers; the window length is the length of the sliding time window; and the spatial encoding vector of spatial node v_i at time t is the i-th row of that time step's output feature matrix;
Step 3.2, constructing the time series input: the spatial encoding results of the historical time steps in the window are stacked to form the time series input of each node. In the formula for the spatial encoding sequence: the time series input of spatial node v_i is a two-dimensional real matrix with one row per historical time step and one column per spatial encoding feature; v_i is the i-th spatial node in the graph; and the sequence begins at the starting point of the historical window;
Step 3.3, LSTM extraction of time-dependent features: the time series input is fed into a long short-term memory (LSTM) network to capture the long-term and short-term evolution patterns of the traffic state. In the formula for the comprehensive state evolution trend vector: the output is the time-dependent feature vector of spatial node v_i obtained after LSTM processing, a real vector; i is the node index, denoting the i-th traffic observation node in the graph; and LSTM is the long short-term memory network used to capture long-term and short-term temporal dependencies in the input sequence;
Step 3.4, fusing and outputting the fuzzy prediction distribution: the comprehensive state evolution trend vector is input into a fully connected layer and, combined with the softmax activation function, yields the fuzzy state prediction for the future time step. In the formula: the output is the predicted fuzzy score vector of spatial node v_i at the future time step; the softmax function converts a real vector into a probability distribution; the prediction layer has a weight matrix and a bias term, the latter a learnable 3-dimensional vector used to adjust the output baseline of each state and improve the model's expressive power; and the linear transformation result of the fully connected layer is a 3-dimensional real vector.

7. The highway event detection algorithm based on data situation analysis according to claim 3, characterized in that step 4 includes the following steps:
Step 4.1, calculating the node score mutation rate: the score mutation rate of spatial node v_i at time t is defined to quantify how sharply its traffic state distribution changes. In the formula: ΔS_i is the score mutation rate of spatial node v_i at time t; the membership terms are the model's predicted membership degrees, at time t, of spatial node v_i in each traffic state category; and the three categories are smooth flow, moderate congestion, and severe congestion;
Step 4.2, filtering the set of candidate abnormal nodes: a seed set of suspected abnormal nodes is constructed using dual screening criteria. In the formula: the result is the set of candidate abnormal nodes; v_i is the i-th spatial node in the graph, corresponding to a geographic location unit on the highway with data collection capability; ΔS_i is the score mutation rate of spatial node v_i at time t; the score mutation rate threshold is used to screen for nodes with a significant transition; the second quantity is the predicted membership degree of the severe congestion state for spatial node v_i at the next time step t+1; and the lower bound on the severe congestion component ensures that the risk status is explicit;
Step 4.3, extracting anomaly subgraphs: on the graph structure, taking the candidate abnormal nodes as seeds, extract the connected subgraphs that satisfy the following two conditions: connectivity requirement: all nodes in the subgraph must be mutually reachable; structural density constraint: the edge density of the subgraph must satisfy the structural compactness threshold. In the formula: the edge density of the subgraph is the ratio of the actual number of edges in the subgraph to the maximum possible number of edges; the edge set of the subgraph is the set of connections between all nodes in the subgraph; the node set of the subgraph is the set of all nodes that constitute the anomaly candidate region; and the structural compactness threshold bounds the edge density;
Step 4.4, outputting the set of anomaly candidate regions: all subgraphs that meet the conditions are collected into one set. In the formula: the result is the set of anomaly candidate regions; each element is one anomaly candidate subgraph, representing a potential area affected by a traffic anomaly; and the count is the total number of anomaly candidate subgraphs.

8. The highway event detection algorithm based on data situation analysis according to claim 1, characterized in that: in step 5, the judgment rule of the adaptive alarm judgment mechanism is: Alert(G_sub) = 1 if E(G_sub) > B(G_sub), and Alert(G_sub) = 0 otherwise. In the formula: Alert(G_sub) is the alarm trigger result, a binary output; B(G_sub) is the alarm boundary function; and "otherwise" covers all cases that do not satisfy E(G_sub) > B(G_sub).