Deformed intersection recognition and feature extraction method based on integrated YOLOv5 and angle texture features

By combining quantitative and qualitative definition criteria for irregular intersections and training with the YOLOv5 backbone network, and incorporating angular texture features, automatic identification and feature extraction of irregular intersections were achieved. This solves the problems of inconsistent identification criteria and low efficiency in existing technologies, and improves traffic safety and road planning efficiency.

CN118196497B · Active · Publication Date: 2026-06-23 · TONGJI UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
TONGJI UNIV
Filing Date
2024-03-20
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing technologies lack unified definition criteria for identifying irregular intersections, resulting in low accuracy and efficiency. Reliance on manual identification is also inefficient, and insufficient samples make it difficult to capture the common patterns of the overall road network, leading to low traffic safety and efficiency.

Method used

We adopted a definition criterion combining quantitative and qualitative methods for irregular intersections, and achieved automatic identification through YOLOv5 backbone network training and weighted voting. We also extracted the geometric design features of irregular intersections by combining angle and texture features, and constructed a large sample dataset.

Benefits of technology

It enables automatic and accurate identification and positioning of irregular intersections, improves the safety and efficiency of the road system, and provides comprehensive large-sample data support.

✦ Generated by Eureka AI based on patent content.


Abstract

The present application relates to a method for identifying irregular (deformed) intersections and extracting their features based on integrated YOLOv5 and angle texture features, comprising the following steps: constructing a definition criterion for irregular intersections; collecting vector road network data from typical cities, building a raster tile dataset, and labeling the dataset according to the definition criterion; training multiple YOLOv5 backbone networks on the labeled dataset and integrating them by weighted voting to obtain a vector irregular intersection recognition model; inputting the vector road network to be detected into the recognition model to obtain the number of irregular intersections, their cropped images, and the cropping box information, and calculating the longitude and latitude of the center of each irregular intersection; and using the angle texture feature algorithm to identify the number of intersecting roads, the included angles, and the deflection angles in each cropped image, and constructing an irregular intersection feature distribution dataset. Compared with the prior art, the present application achieves fast identification and positioning of irregular intersections in large-scale vector road networks and can automatically extract their geometric features.

Description

Technical Field

[0001] This invention relates to the field of image detection technology, and in particular to a method for identifying and extracting features from deformed intersections based on integrated YOLOv5 and angular texture features.

Background Technology

[0002] Early road planning did not anticipate the pace of urbanization, leaving existing roads with irregular alignments and insufficient road clearance. Combined with topographical and other factors, numerous irregular intersections of various types have formed within the road network. With their non-standard geometric designs, numerous conflict points, and poor visibility, these irregular intersections reduce traffic safety and are high-risk areas for traffic accidents. Because of their unusual road conditions and the high driver workload they impose, they are also prone to congestion and reduced road efficiency. Strengthening the management and control of irregular intersections is therefore essential for improving traffic safety, alleviating congestion, and promoting sustainable urban development.

[0003] Before irregular intersections can be treated, their location and category must be determined. However, existing research on irregular intersection identification suffers from the following problems. First, the definition criteria lack adaptability: irregular intersections take diverse forms, and relying solely on design thresholds or template libraries cannot cover all possible scenarios. Furthermore, there is no unified definition standard for irregular intersections in the global traffic engineering field. Second, identification accuracy and efficiency are low: identification relies primarily on manual assessment, so results are subjective and depend on individual experience, and manual identification is clearly inefficient for large-scale inspection projects. Finally, research samples are insufficient: existing studies mainly focus on hazard analysis and improvement of a single or a few irregular intersections. Relying on limited samples tends to confine understanding to local special cases and hinders the capture of correlations and common patterns among irregular intersections, which restricts the formulation and implementation of improvement strategies for the overall road network.

Summary of the Invention

[0004] The purpose of this invention is to provide a method for identifying and extracting features from irregular intersections based on integrated YOLOv5 and angular texture features. It employs a qualitative and quantitative definition criterion for irregular intersections, providing a more comprehensive definition. Multiple YOLOv5 backbone networks are trained using an irregular intersection dataset, and a weighted voting method is used to integrate multiple models, achieving automatic and accurate identification and location of irregular intersections, and efficiently identifying safety hazards in the road network. The angular texture feature method is used to extract geometric design features such as intersection angles and deflection angles of irregular intersections in batches, constructing a large-sample dataset to provide a foundation for intersection planning, design, and improvement, thereby enhancing the overall safety and efficiency of the road system.

[0005] The objective of this invention can be achieved through the following technical solutions:

[0006] A method for identifying and extracting features from irregular intersections based on integrated YOLOv5 and angular texture features is proposed. This method automatically extracts irregular intersections from a large-scale urban vector road network and generates a dataset containing intersection coordinates and geometric design information for analysis. Specifically, the method includes the following steps:

[0007] Step 1) Construct a definition criterion for irregular intersections that combines quantitative and qualitative methods;

[0008] Step 2) Collect typical urban vector road network data and construct a raster tile dataset, and label the dataset according to the definition criteria in Step 1);

[0009] Step 3) Train multiple YOLOv5 backbone networks using the labeled dataset from Step 2), and integrate the trained backbone networks using a weighted voting method to obtain a vector irregular intersection recognition model;

[0010] Step 4) Input the vector road network to be detected into the vector irregular intersection recognition model to obtain the number of irregular intersections, their cropped images, and the cropping box information, and calculate the latitude and longitude coordinates of the center of each irregular intersection;

[0011] Step 5) Use the angle texture feature algorithm to identify the number of road intersections, the included angle, and the deflection angle in each cropped image of an irregular intersection, and construct an irregular intersection feature distribution dataset.

[0012] Step 1) includes the following steps:

[0013] Step 11) Based on the characteristics of vector intersections and with reference to intersection design specifications, define 4 quantitative indicators and 1 qualitative indicator. The quantitative indicators are the intersection angle, the deflection angle, the number of road intersections, and the misalignment amount. The qualitative indicator is poor horizontal alignment.

[0014] Step 12) Construct a definition criterion for irregular intersections based on quantitative and qualitative indicators. An intersection is considered an irregular intersection if it meets any of the following conditions:

[0015] a) The included angle of the intersection is less than 75° or greater than 115°, where the included angle is the angle enclosed by the two adjacent road edges;

[0016] b) The deflection angle is greater than 5°, where the deflection angle is the angle by which the straight-going vehicle flow line deviates as it passes through the intersection;

[0017] c) The misalignment is greater than 3 m and less than 50 m;

[0018] d) There is an access point within the physical area of the intersection;

[0019] e) Five or more roads meet at the intersection;

[0020] f) The intersection matches a template in the library compiled from other special irregular intersections.
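
The criteria a) to f) above can be read as a simple rule check. The following is a minimal sketch under stated assumptions, not the patented implementation: the class `IntersectionGeometry`, its field names, and the function `irregular_reasons` are hypothetical, and the geometric measurements are assumed to have been obtained beforehand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IntersectionGeometry:
    """Hypothetical container for the five indicators of one intersection."""
    n_legs: int                          # number of roads meeting at the intersection
    included_angles_deg: List[float]     # angles between adjacent road edges
    deflection_angles_deg: List[float]   # deviation of straight-going flow lines
    misalignment_m: float = 0.0          # offset between opposing legs, in meters
    has_access_point: bool = False       # access point inside the physical area
    matches_template: bool = False       # matched against the special-case library


def irregular_reasons(g: IntersectionGeometry) -> List[str]:
    """Return the letters of all criteria (a-f) that the intersection meets."""
    reasons = []
    if any(a < 75 or a > 115 for a in g.included_angles_deg):
        reasons.append("a")
    if any(d > 5 for d in g.deflection_angles_deg):
        reasons.append("b")
    if 3 < g.misalignment_m < 50:
        reasons.append("c")
    if g.has_access_point:
        reasons.append("d")
    if g.n_legs >= 5:
        reasons.append("e")
    if g.matches_template:
        reasons.append("f")
    return reasons


def is_irregular(g: IntersectionGeometry) -> bool:
    """An intersection is irregular if it meets any one criterion."""
    return bool(irregular_reasons(g))
```

Returning the list of triggered criteria rather than a bare boolean keeps the qualitative and quantitative grounds visible when labels are reviewed.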

[0021] Step 2) includes the following steps:

[0022] Step 21) Select several cities with typical road network layouts as data sources, such as Shanghai and Beijing, collect their vector road network data and preprocess them. The roads include highways and urban roads.

[0023] Step 22) Divide the urban road network data collected in Step 21) into units: using ArcGIS's Fishnet tool, the road network is divided into square grid tiles, each with a side length of 2.3 km. The tiles are exported in order of their IDs as Tagged Image File Format (TIFF) files with latitude and longitude coordinates, giving a raster image dataset of road network tiles.

[0024] Step 23) Based on the definition criteria for irregular intersections constructed in Step 1), use data annotation software such as Labelme and LabelImg to manually annotate various target intersections in the raster image dataset of road network slices. According to the number of different types of intersections, the dataset is divided into training set, validation set and test set. The types of intersection annotations include irregular intersections (DI), normal intersections (NI), grade-separated intersections (IC) and roundabouts (RA).

[0025] The preprocessing in step 21) involves deleting internal roads of buildings, unclassified highways, and other roads below the preset grade, retaining only the road sections that connect to the main line, and symbolizing each type of road with a different color to distinguish the roads and make use of their implicit features.

[0026] Step 3) includes the following steps:

[0027] Step 31) Construct multiple YOLOv5 backbone networks. The YOLOv5 model includes an input network, a backbone network, a neck network, and a detection network. YOLOv5 provides five main backbone variants (n, s, m, l, and x), which differ mainly in network depth and width.

[0028] Step 32) Feed the labeled road network raster images into the input end of the YOLOv5 model, where they are resized and augmented. Specifically, each image is scaled, cropped, and padded to the required 640×640 size, and the Mosaic method is used for data augmentation: four images are randomly cropped, scaled, flipped, and adjusted in saturation, and a matrix is then used to extract fixed regions of the four images and combine them into one new image, enriching the training samples.
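
As an illustration of the resizing part of this step, the sketch below computes the scale and padding needed to fit an image of arbitrary size into a 640×640 canvas while preserving its aspect ratio. Aspect-preserving padding is an assumption here; the text only states that scaling, cropping, and filling are used. The function name `letterbox_params` is hypothetical.

```python
def letterbox_params(width, height, target=640):
    """Scale and padding that fit a (width x height) image into target x target.

    Returns the scale factor, the resized (width, height), and the padding
    as (left, top, right, bottom) in pixels.
    """
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_x = target - new_w
    pad_y = target - new_h
    left, top = pad_x // 2, pad_y // 2
    return {
        "scale": scale,
        "size": (new_w, new_h),
        "pad": (left, top, pad_x - left, pad_y - top),
    }
```

For example, a 1280×720 tile would be scaled by 0.5 to 640×360 and padded with 140 pixels above and below.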

[0029] Step 33) Input the image processed in step 32) into the backbone network for feature extraction, wherein the backbone network includes a Focus module, a CBL (Convolution-Batch Normalization-LeakyReLU) module, a Cross Stage Partial Network (CSPNet) module, and a Spatial Pyramid Pooling (SPP) module.

[0030] Step 34) While the backbone network is extracting features, the neck network uses a Feature Pyramid Network (FPN) and a Path Aggregation Network (PAN) to extract and fuse intersection feature information, thereby enhancing the ability to detect various intersection targets.

[0031] Step 35) Input the feature-extracted road network image into the detection end to generate various intersection prediction boxes and confidence scores, wherein the detection end consists of convolutional layers, pooling layers and fully connected layers;

[0032] Step 36) Use the training set of the dataset constructed in Step 2) to train at least three YOLOv5 models with different backbone networks, and for each model select the parameters with the highest mean average precision (mAP) across the training rounds as an individual learner to be integrated.

[0033] Step 37) Integrate the individual learners by weighted voting. In an intersection prediction task, the label predicted by each AI expert is treated as a vote, and the weight of the vote is the product of that expert's mAP and its confidence in the prediction for the target. The label with the highest weighted vote value is the final output of the integrated model, which yields the integrated vector irregular intersection recognition model.

[0034] Step 33) includes the following steps:

[0035] Step 331) The Focus module reduces the spatial size of the road network image processed in step 32). The input road network image is 640×640×3. The Focus operation splits the input into four sub-maps and concatenates them along the channel direction, giving a size of 320×320×12. The result is then convolved and output, transforming spatial information into the channel dimension.
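
The slicing part of the Focus operation can be sketched in plain Python as follows. This is an illustrative sketch only: it uses nested lists instead of tensors, omits the subsequent convolution, and the sub-map ordering (even/even, odd/even, even/odd, odd/odd rows and columns) is an assumption. The function name `focus_slice` is hypothetical.

```python
def focus_slice(img):
    """Space-to-channel slicing: H x W x C nested lists -> (H/2) x (W/2) x 4C.

    Every 2x2 pixel neighborhood is folded into one output pixel whose
    channel list is the concatenation of the four neighbors' channels.
    H and W are assumed to be even.
    """
    h, w = len(img), len(img[0])
    out = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            row.append(
                img[i][j]            # even row, even column
                + img[i + 1][j]      # odd row, even column
                + img[i][j + 1]      # even row, odd column
                + img[i + 1][j + 1]  # odd row, odd column
            )
        out.append(row)
    return out
```

Applied to a 640×640×3 input, this slicing yields the 320×320×12 map described above: no pixel values are discarded, they are only rearranged.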

[0036] Step 332) Using the four CBL modules in the backbone network, perform convolution, batch normalization, and Leaky ReLU activation on the image processed in Step 331); adjust the number of convolution kernels in the CBL modules to determine the receptive field of the model;

[0037] Step 333) In the Cross Stage Partial Network (CSPNet) module of the backbone network, the input feature map is divided into two parts. The first part undergoes feature extraction through convolutional layers to obtain deeper feature information; the second part retains the original feature information and is concatenated with the convolved feature map of the first part, preserving more low-level and high-level contextual feature information. The number of residual blocks in CSP1 is adjusted to determine the network depth of the model.

[0038] Step 334) After the last convolutional layer of the backbone network, feature maps of different scales are processed using the Spatial Pyramid Pooling (SPP) module to obtain feature vectors of fixed length. SPP applies multiple pooling layers with different window sizes to the input feature maps: 1×1, 5×5, 9×9, and 13×13. After max pooling, feature maps of any size are transformed into outputs of 1+25+81+169 dimensions.

[0039] Step 34) includes the following steps:

[0040] Step 341) Use a feature pyramid network to process target intersections at different scales and locations in the same image. By upsampling and downsampling operations, feature maps at different levels are fused together to generate a multi-scale feature pyramid. Semantic and geometric information are then fused through lateral connections.

[0041] Step 342) To address the issue of poor information flow in FPN, a path aggregation network is used to obtain features from different levels of the backbone network. Low-level feature maps are then horizontally connected with high-level feature maps to retain more intersection details. Furthermore, an upsampling operation is used to increase the spatial resolution of shallow feature maps to the same level as deep feature maps, thereby achieving feature map fusion.

[0042] Bounding box regression loss and classification loss are used as the loss functions of the YOLOv5 model, where the bounding box regression loss adopts the GIoU loss function, and its expression is shown in equation (1):

[0043] $$L_{GIoU} = 1 - \left(\frac{I}{U} - \frac{C - U}{C}\right) \qquad (1)$$

[0044] Where I is the intersection area of the ground truth bounding box and the predicted bounding box, U is the union area of the ground truth bounding box and the predicted bounding box, and C is the area of the smallest rectangular region that encloses both the ground truth bounding box and the predicted bounding box.
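
For a concrete reading of I, U, and C, the following sketch computes the GIoU loss for two axis-aligned boxes. It assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates with positive area and uses the standard GIoU definition, loss = 1 − (I/U − (C − U)/C); the function name `giou_loss` is hypothetical.

```python
def giou_loss(box_a, box_b):
    """GIoU loss between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # I: intersection area (zero if the boxes do not overlap)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # U: union area
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter

    # C: area of the smallest rectangle enclosing both boxes
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    giou = inter / union - (enclose - union) / enclose
    return 1.0 - giou
```

Unlike a plain IoU loss, the second term still provides a signal when the boxes do not overlap: the further apart they are, the larger the enclosing rectangle and the larger the loss.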

[0045] The classification loss uses the binary cross-entropy loss function. Therefore, the complete loss function is expressed as:

[0046]

[0047] Wherein, the first term represents the bounding box regression loss, the second and third terms represent the confidence prediction loss, the fourth term represents the category prediction loss, GIoU Loss is the GIoU loss, S×S represents the number of grids into which the image is divided, B represents the number of bounding boxes predicted per grid cell, $C_i$ is the predicted confidence, and $p_i(c)$ is the probability with which the model predicts that the i-th bounding box contains category c.
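
The equation referred to in [0046] did not survive extraction. As a hedged reconstruction only, one commonly used form that is consistent with the four terms described in [0047] is given below. The indicator functions $I_{ij}^{obj}$ and $I_{ij}^{noobj}$, the weight $\lambda_{noobj}$, and the hatted ground-truth quantities $\hat{C}_i$ and $\hat{p}_i(c)$ are assumptions that do not appear in the surviving text.

```latex
\begin{aligned}
Loss ={}& \text{GIoU Loss} \\
&- \sum_{i=0}^{S^2}\sum_{j=0}^{B} I_{ij}^{obj}
   \left[\hat{C}_i \log C_i + (1-\hat{C}_i)\log(1-C_i)\right] \\
&- \lambda_{noobj}\sum_{i=0}^{S^2}\sum_{j=0}^{B} I_{ij}^{noobj}
   \left[\hat{C}_i \log C_i + (1-\hat{C}_i)\log(1-C_i)\right] \\
&- \sum_{i=0}^{S^2} I_{i}^{obj}\sum_{c \in classes}
   \left[\hat{p}_i(c)\log p_i(c) + (1-\hat{p}_i(c))\log(1-p_i(c))\right]
\end{aligned}
```

Here the second and third lines are binary cross-entropy terms over cells with and without an object, and the fourth line is the binary cross-entropy over categories.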

[0048] Threshold filtering is applied to the intersection prediction results: a threshold is set for the category probability, only intersection targets whose category probability exceeds the threshold are retained, and detections about which the model is uncertain are removed.

[0049] Where multiple overlapping candidate boxes exist for the same intersection target, non-maximum suppression is applied: the bounding box with the highest class probability is retained, and other bounding boxes that highly overlap with it are deleted.
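
The two post-processing steps above can be sketched as follows. This is a minimal greedy, class-agnostic version under stated assumptions: detections are dictionaries with hypothetical keys `"box"` (as `(x1, y1, x2, y2)`) and `"score"`, and the threshold values 0.25 and 0.45 are placeholders, since the text does not specify them.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(detections, score_thr=0.25, iou_thr=0.45):
    """Drop low-score detections, then greedily suppress overlapping boxes."""
    # Threshold filtering: keep only sufficiently confident detections
    confident = [d for d in detections if d["score"] > score_thr]
    # Non-maximum suppression: visit boxes from highest to lowest score
    confident.sort(key=lambda d: d["score"], reverse=True)
    kept = []
    for d in confident:
        if all(iou(d["box"], k["box"]) <= iou_thr for k in kept):
            kept.append(d)
    return kept
```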

[0050] In step 36), after the GPU and the rest of the hardware and software training environment are configured, hyperparameters such as the decay type, optimizer, and learning rate are set for the model according to the dataset size. Mean average precision (mAP) is selected as the evaluation metric for the recognition performance of the trained model, using the following formula:

[0051] $$mAP = \frac{1}{n}\sum_{i=1}^{n} AP_i$$

[0052] Where the average precision (AP) is the area enclosed by the precision-recall (PR) curve and the coordinate axes; n is 4, representing the four classes: irregular intersections, normal intersections, roundabouts, and grade-separated intersections; precision is $P = \frac{TP}{TP + FP}$ and recall is $R = \frac{TP}{TP + FN}$, where TP, FP, and FN are the numbers of true positives, false positives, and false negatives.
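
To make the metric concrete, the sketch below computes AP as the area under a PR curve and averages it over classes. It uses plain trapezoidal integration over given (recall, precision) points, which is an assumption: evaluation toolkits often use interpolated variants. The function names and the AP values in the test are illustrative only.

```python
def average_precision(recalls, precisions):
    """Area under the PR curve by the trapezoidal rule.

    recalls must be sorted in increasing order and have the same
    length as precisions.
    """
    area = 0.0
    for k in range(1, len(recalls)):
        width = recalls[k] - recalls[k - 1]
        area += width * (precisions[k] + precisions[k - 1]) / 2
    return area


def mean_average_precision(ap_by_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_by_class.values()) / len(ap_by_class)
```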

[0053] In step 37), the training in step 36) has produced multiple AI experts capable of recognizing and classifying intersections. These experts are then integrated using a weighted voting method to improve the model's recognition accuracy. Weighted voting is a learner combination strategy in ensemble learning. In an intersection prediction task, this method treats each AI expert's prediction label as a vote and assigns a corresponding weight to each vote. In this invention, the weight is the product of the AI expert's mAP and its confidence in predicting the target. The label with the highest vote value becomes the final output of the ensemble model's prediction.

[0054] For intersection x, the prediction result H(x) output by the ensemble model is shown in the following equation:

[0055] $$H(x) = c_{\arg\max_{j} \sum_{i} mAP_i \, C_i^j(x) \, h_i^j(x)}$$

[0056] In the formula, $h_i$ is the i-th AI expert and $mAP_i$ is the mAP of the i-th expert on the validation set. Expert $h_i$ predicts one category from the intersection category set {c0=NI, c1=DI, c2=RA, c3=IC}, and its predicted output for intersection x is represented as a 4-dimensional vector $(h_i^0(x), h_i^1(x), h_i^2(x), h_i^3(x))$: if $h_i$ predicts intersection x as category $c_j$, then $h_i^j(x)$ takes the value 1, otherwise 0. $C_i^j(x)$ is the confidence with which $h_i$ predicts intersection x as $c_j$.

[0057] Based on this integration strategy, an integrated YOLOv5 recognition model for vector irregular intersections based on weighted voting is constructed.
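
The weighted voting rule can be sketched in a few lines. The sketch assumes that the predictions of the individual experts for the same intersection have already been matched to each other, which the text does not describe; the function name `weighted_vote` and the numbers in the usage note are illustrative.

```python
CLASSES = ("NI", "DI", "RA", "IC")


def weighted_vote(predictions):
    """Combine expert predictions for one intersection.

    predictions: list of (label, confidence, expert_map) tuples, one per
    expert. Each vote is weighted by expert_map * confidence, and the
    label with the highest total weight wins.
    """
    scores = {c: 0.0 for c in CLASSES}
    for label, confidence, expert_map in predictions:
        scores[label] += expert_map * confidence
    winner = max(CLASSES, key=lambda c: scores[c])
    return winner, scores
```

For instance, with votes `("NI", 0.9, 0.9)`, `("DI", 0.5, 0.8)`, and `("DI", 0.4, 0.8)`, the weighted totals are 0.81 for NI and 0.72 for DI, so NI is returned even though two of the three experts voted DI. This is the difference from simple majority voting.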

[0058] Step 4) includes the following steps:

[0059] Step 41) Divide the vector road network within the target area to be detected into square grid tiles, and export the tiles in order of their IDs as Tagged Image File Format (TIFF) files with latitude and longitude coordinates.

[0060] Step 42) Input the road network grid tiles to be detected into the vector irregular intersection recognition model, obtain the recognition and classification results for the different types of intersections, count the number of each type, and filter out the irregular intersections, exporting their cropped images together with the center coordinates, width, and height of each cropping box;

[0061] Step 43) Using the geographic information tags of the road network raster image and the cropping box information generated in step 42), calculate the Long and Lat coordinates of the center of the irregular intersection:

[0062] $$Long = Long_0 + x \cdot w \cdot r, \qquad Lat = Lat_0 - y \cdot h \cdot r$$

[0063] Where $(Long_0, Lat_0)$ are the longitude and latitude of the top-left corner of the raster image; (x, y) are the center coordinates of the cropping box, expressed in a relative coordinate system whose origin is the top-left corner of the image and whose horizontal and vertical axes both range over [0, 1]; w and h are the width and height of the image in pixels, respectively; and r is the image resolution, that is, the longitude or latitude span corresponding to one pixel.
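
The coordinate conversion can be sketched as below, assuming a north-up image whose latitude decreases downward from the top-left corner and a single resolution r for both axes, as the definitions above suggest. The function name `box_center_to_lonlat` and the numbers in the usage note are illustrative.

```python
def box_center_to_lonlat(long0, lat0, x, y, w, h, r):
    """Convert a relative box center to longitude/latitude.

    long0, lat0: longitude/latitude of the image's top-left corner
    x, y:        box center in relative image coordinates, each in [0, 1]
    w, h:        image width and height in pixels
    r:           degrees of longitude/latitude per pixel
    """
    longitude = long0 + x * w * r   # moving right increases longitude
    latitude = lat0 - y * h * r     # moving down decreases latitude
    return longitude, latitude
```

For a 1000×1000 pixel tile with top-left corner (120.0, 31.5) and r = 0.00002 degrees per pixel, a box centered at (0.5, 0.25) maps to approximately (120.01, 31.495).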

[0064] Step 5) includes the following steps:

[0065] Step 51) Introduce the Angular Texture Signature (ATS) algorithm to extract features from the cropped image of each irregular intersection:

[0066] Step 511) Convert the cropped image into a binary image, then rotate a fan-shaped mask with an included angle of 1° about the center in steps of 1°, computing the average gray value under the mask at each position.

[0067] Step 512) Plot the angle on the horizontal axis and the average gray value on the vertical axis to obtain the angle texture map. The gray value valleys in the angle texture map correspond to the individual roads in the cropped image, and the number of valleys is the number of roads meeting at the intersection.

[0068] Step 513) Calculate the included angles from the difference between the horizontal coordinates of two adjacent valleys in the angle texture map; three-leg and four-leg irregular intersections have three and four included angles, respectively;

[0069] Step 514) Calculate the deflection angles from the difference between the horizontal coordinates of two adjacent valleys in the angle texture map; for example, a four-leg irregular intersection has two deflection angles.

[0070] Step 52) After extracting the geometric features of each irregular intersection in batches according to Step 51), together with the latitude and longitude coordinate fields generated in Step 43), they constitute the irregular intersection feature distribution dataset, which is used for in-depth research such as spatial distribution features and geometric feature distribution analysis.
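
Steps 512) to 514) can be illustrated on an already computed angle texture signature (one average gray value per degree). The sketch below is not the patented algorithm: the valley detection by a fixed threshold, the function names, and in particular the deflection angle, computed here as the deviation from 180° between opposite legs of a four-leg intersection, are assumptions made for illustration.

```python
def valley_centers(signature, threshold):
    """Center angles (degrees) of circular runs where signature < threshold."""
    n = len(signature)
    below = [v < threshold for v in signature]
    if all(below) or not any(below):
        return []
    # Start scanning just after a non-valley sample so that a valley
    # straddling 0/360 degrees is not split into two runs.
    start = next(i for i in range(n) if not below[i])
    centers, run = [], []
    for k in range(1, n + 1):
        if below[(start + k) % n]:
            run.append(start + k)          # unwrapped sample index
        elif run:
            mid = (run[0] + run[-1]) / 2
            centers.append((mid % n) * 360 / n)
            run = []
    return sorted(centers)


def included_angles(centers):
    """Angles between adjacent road directions, going around the circle."""
    m = len(centers)
    return [(centers[(i + 1) % m] - centers[i]) % 360 for i in range(m)]


def deflection_angles_four_leg(centers):
    """Assumed definition: deviation from 180 degrees between opposite legs."""
    return [abs((centers[i + 2] - centers[i]) % 360 - 180) for i in range(2)]
```

With valleys centered at 0°, 90°, 172°, and 270°, this would give included angles of 90°, 82°, 98°, and 90° and deflection angles of 8° and 0°.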

[0071] Compared with the prior art, the present invention has the following beneficial effects:

[0072] 1. This invention proposes a definition criterion for irregular intersections that combines quantitative and qualitative methods. Compared with traditional criteria based solely on quantitative or solely on qualitative methods, the criterion of this invention covers a more comprehensive range of irregular intersection types and applies to a wider variety of special cases.

[0073] 2. This invention proposes a method for identifying irregular intersections based on integrated YOLOv5, which can quickly and automatically extract irregular intersections from large-scale vector road networks. Filtering irregular intersections directly in a GIS-based computable road network struggles with complex situations such as poor connectivity of the vector road network and the convergence of multiple vector flow lines at an intersection. In contrast, the YOLOv5 algorithm, built on convolutional neural networks, can learn deeper semantic features of irregular intersections. It addresses the difficulty of computing the topological relationships of vector intersections by working on visual images, while also locating intersections accurately.

[0074] 3. This invention employs an angular texture feature algorithm to extract the geometric features of irregular intersections in batches. Combined with latitude and longitude coordinate fields, this forms a large-sample dataset of irregular intersection feature distributions. Research on irregular intersections relying solely on a limited sample is easily limited to local special cases, while a large sample facilitates in-depth analysis of common features, providing more comprehensive theoretical support for management. The generated dataset can be used for road network planning quality evaluation, and can also be linked to accident-prone intersections to analyze the correlation between irregular geometric design and accidents.

Attached Figure Description

[0075] Figure 1 is a flowchart of the method of the present invention;

[0076] Figure 2 is a set of illustrations of vector irregular intersections in one embodiment;

[0077] Figure 3 is a diagram of the YOLOv5 network framework in one embodiment;

[0078] Figure 4 shows the loss curve and mAP curve for YOLOv5 training in one embodiment;

[0079] Figure 5 is a comparison between the recognition results of the integrated YOLOv5 vector irregular intersection recognition model and actual images in one embodiment;

[0080] Figure 6 is a schematic diagram illustrating the calculation of latitude and longitude coordinates of an irregular intersection in one embodiment;

[0081] Figure 7 is a schematic diagram of the angle texture feature algorithm in one embodiment;

[0082] Figure 8 is a geometric feature distribution map of irregular intersections in the four main urban districts of Suzhou in one embodiment.

Detailed Implementation

[0083] The present invention will now be described in detail with reference to the accompanying drawings and specific embodiments. These embodiments are based on the technical solution of the present invention and provide detailed implementation methods and specific operating procedures. However, the scope of protection of the present invention is not limited to the following embodiments.

[0084] This embodiment provides a method for identifying irregular intersections and extracting their features based on integrated YOLOv5 and angular texture features. It combines quantitative and qualitative methods, proposing and defining five indicators for irregular intersections: intersection angle, deflection angle, number of road intersections, misalignment, and poor alignment. Vector road data from typical Chinese cities are collected, sliced, and converted into raster images carrying geographic information, and irregular intersections are labeled according to the definition criteria, forming the training and validation sets for the recognition model. After the parameters are tuned, the YOLOv5 v5s, v5m, and v5l backbone networks are trained under the same configuration environment, with mean average precision (mAP) as the performance evaluation metric. For each network, the training result with the highest mAP is selected as an individual learner to be integrated. A weighted voting method integrates the trained v5s, v5m, and v5l backbone networks into an irregular intersection recognition model. The recognition model is used to detect the vector road network in the study area and outputs information such as cropped images of irregular intersections, prediction boxes, and counts. Combined with the raster images carrying geographic information, the latitude and longitude coordinates of the intersections are matched. The angular texture feature method is then applied to the cropped images to calculate the intersection angle, deflection angle, and number of road intersections, finally forming an irregular intersection sample dataset containing location and geometric design parameters.

[0085] Specifically, as shown in Figure 1, the method includes the following steps:

[0086] Step 1) Construct a definition criterion for irregular intersections that combines quantitative and qualitative methods.

[0087] Step 11) Based on the characteristics of vector intersections and with reference to intersection design specifications, define 4 quantitative indicators and 1 qualitative indicator. The quantitative indicators are the intersection angle, the deflection angle, the number of road intersections, and the misalignment amount. The qualitative indicator is poor horizontal alignment.

[0088] Step 12) Construct a definition criterion for irregular intersections based on quantitative and qualitative indicators. An intersection is considered an irregular intersection if it meets any of the following conditions:

[0089] a) The included angle of the intersection is less than 75° or greater than 115°, where the included angle is the angle enclosed by the two adjacent road edges;

[0090] b) The deflection angle is greater than 5°, where the deflection angle is the angle by which the straight-going vehicle flow line deviates as it passes through the intersection;

[0091] c) The misalignment is greater than 3 m and less than 50 m;

[0092] d) There is an access point within the physical area of the intersection;

[0093] e) Five or more roads meet at the intersection;

[0094] f) The intersection matches a template in the preset library compiled from other special irregular intersections (as shown in Figure 2).

[0095] Step 2) Collect typical urban vector road network data and construct a raster tile dataset, and label the dataset according to the definition criteria in Step 1).

[0096] Step 21) Select five cities with typical road network layouts, namely Suzhou, Shanghai, Guangzhou, Beijing and Xi'an, as data sources, and collect their vector road network data (roads include highways and urban roads). Delete internal roads of buildings, unclassified highways, and other roads below the preset grade, retaining only the road sections that connect to the main line. Symbolize each type of road with a different color to distinguish the roads and make use of their implicit features.

[0097] Step 22) Divide the urban road network data collected in Step 21) into units.

[0098] Using ArcGIS's Fishnet tool, the road network was divided into several square grid tiles, each with a side length of 2.3 km. The grid tiles were then exported sequentially as Tagged Image File Format (TIFF) files with latitude and longitude coordinates, resulting in a raster image dataset of the road network tiles.

[0099] Step 23) Based on the definition criteria for irregular intersections established in Step 1), use data annotation software such as Labelme and LabelImg to manually annotate the various target intersections in the raster image dataset of road network tiles. According to the number of different types of intersections, the dataset is divided into a training set, a validation set, and a test set. The annotated intersection types are irregular intersections, normal intersections, grade-separated intersections, and roundabouts, corresponding to the categories "DI", "NI", "IC", and "RA", respectively. The labels, which carry category and location information, are saved in a text file.

[0100] Taking Suzhou as an example, this embodiment identifies irregular intersections in its urban area. Gusu District, Xiangcheng District, Wuzhong District, and Wujiang District constitute the main urban area of Suzhou. A total of 441 unannotated raster images were used as the test set. The labeled training set and validation set consisted of 1385 and 52 images, respectively. The training set included road networks from the five cities, including Suzhou and Shanghai, while the validation set included only Suzhou's road network. In total, 6671 normal intersections, 3691 irregular intersections, 316 roundabouts, and 687 grade-separated intersections were labeled in the training and validation sets.

[0101] Step 3) Train multiple YOLOv5 backbone networks using the labeled dataset from Step 2), and integrate the trained backbone networks using a weighted voting method to obtain a vector deformed intersection recognition model.

[0102] Step 31) Construct multiple YOLOv5 backbone networks. In this embodiment, the YOLO v5s, v5m, and v5l backbone networks are used. As shown in Figure 3, the YOLOv5 model includes an input network, a backbone network, a neck network, and a prediction network.

[0103] Step 32) Feed the labeled road network raster images to the input end of the YOLOv5 model, where they are resized and augmented. Specifically, each image is scaled, cropped, and padded so that its size meets the 640×640 requirement, and the Mosaic method is used for data augmentation: four images are randomly cropped, scaled, flipped, and adjusted in saturation, among other operations, and fixed regions of the four images are then extracted and combined into one new image to enrich the learning samples.

[0104] Step 33) Input the image processed in step 32) into the backbone network for feature extraction. The backbone network includes a Focus module, a CBL (Convolution-Batch Normalization-LeakyReLU) module, a Cross Stage Partial Network (CSPNet) module, and a Spatial Pyramid Pooling (SPP) module.

[0105] Step 331) The road network image processed in step 32) is reduced in size by the Focus module. The size of the input road network image is 640×640×3. The Focus operation divides the input feature map into four sub-maps and stitches them together according to the channel direction. The size becomes 320×320×12. Each sub-map is convolved and then output, realizing the transformation from spatial dimension to channel dimension.
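As a minimal sketch of the slicing in step 331), the following Python (NumPy) function splits an H×W×C image into four sub-maps and concatenates them along the channel axis. The every-other-pixel sampling pattern and the channels-last layout are assumptions that follow the common YOLOv5 Focus implementation; the text above does not spell out the pattern, and the subsequent convolution is omitted.

```python
import numpy as np


def focus_slice(x: np.ndarray) -> np.ndarray:
    """Rearrange an (H, W, C) array into (H/2, W/2, 4C) by interleaved sampling."""
    return np.concatenate(
        [
            x[0::2, 0::2],  # even rows, even columns
            x[1::2, 0::2],  # odd rows, even columns
            x[0::2, 1::2],  # even rows, odd columns
            x[1::2, 1::2],  # odd rows, odd columns
        ],
        axis=-1,
    )


image = np.zeros((640, 640, 3), dtype=np.float32)
sliced = focus_slice(image)  # shape (320, 320, 12), matching step 331)
```

No pixel is discarded: spatial resolution is halved in each direction while the channel count is multiplied by four.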

[0106] Step 332) Use the four CBL modules in the backbone network to perform convolution, batch normalization, and Leaky ReLU activation on the image processed in step 331). Adjust the number of CBL convolution kernels to determine the receptive field of the model. In this embodiment, the numbers of CBL convolution kernels in the YOLO v5s, v5m, and v5l backbone networks are set as shown in Table 1:

[0107] Table 1

[0108]
Width (number of convolution kernels) | YOLO v5s | YOLO v5m | YOLO v5l
P1 | 32 | 48 | 64
P2 | 64 | 96 | 128
P3 | 128 | 192 | 256
P4 | 256 | 384 | 512
P5 | 512 | 768 | 1024

[0109] Step 333) Using the Cross Stage Partial (CSP) module in the backbone network, the input feature map is divided into two parts. The first part undergoes feature extraction through convolutional layers to obtain deeper feature information; the second part retains the original feature information and is concatenated with the convolved feature map of the first part, so that more low-level and high-level contextual feature information is preserved. The number of residual blocks in CSP1 is adjusted to determine the network depth of the model. In this embodiment, the numbers of CSP components in the YOLO v5s, v5m, and v5l backbone networks are set as shown in Table 2:

[0110] Table 2

[0111]
Depth (number of components) | YOLO v5s | YOLO v5m | YOLO v5l
1st CSP1 | CSP1_1 | CSP1_2 | CSP1_3
2nd CSP1 | CSP1_3 | CSP1_6 | CSP1_9
3rd CSP1 | CSP1_3 | CSP1_6 | CSP1_9
1st CSP2 | CSP2_1 | CSP2_2 | CSP2_3
2nd CSP2 | CSP2_1 | CSP2_2 | CSP2_3
3rd CSP2 | CSP2_1 | CSP2_2 | CSP2_3
4th CSP2 | CSP2_1 | CSP2_2 | CSP2_3
5th CSP2 | CSP2_1 | CSP2_2 | CSP2_3

[0112] Step 334) After the last convolutional layer of the backbone network, feature maps of different scales are processed using the Spatial Pyramid Pooling (SPP) module to obtain fixed-length feature vectors. The SPP applies multiple pooling layers with different window sizes to the input feature maps: 1×1, 5×5, 9×9, and 13×13. After max pooling, feature maps of any size are transformed into outputs of 1+25+81+169 dimensions.

[0113] Step 34) While the backbone network is extracting features, the neck network uses a Feature Pyramid Network (FPN) and a Path Aggregation Network (PAN) to further extract and fuse intersection feature information, thereby enhancing the ability to detect various intersection targets.

[0114] Step 341) Use a feature pyramid network to process target intersections at different scales and locations in the same image. By upsampling and downsampling operations, feature maps at different levels are fused together to generate a multi-scale feature pyramid. Semantic and geometric information are then fused through lateral connections.

[0115] Step 342) To address the issue of poor information flow in FPN, a path aggregation network is used to obtain features from different levels of the backbone network. Low-level feature maps are then horizontally connected with high-level feature maps to retain more intersection details. Furthermore, an upsampling operation is used to increase the spatial resolution of shallow feature maps to the same level as deep feature maps, thereby achieving feature map fusion.

[0116] Step 35) Input the feature-extracted road network image into the detection end to generate various intersection prediction boxes and confidence scores, wherein the detection end consists of convolutional layers, pooling layers and fully connected layers.

[0117] Bounding box regression loss and classification loss are used as the loss functions of the YOLOv5 model to optimize its convergence speed and detection performance. The bounding box regression loss adopts the GIoU loss function, whose expression is shown in equation (1):

[0118] $$\mathrm{GIoU\ Loss} = 1 - \mathrm{GIoU} = 1 - \left(\frac{I}{U} - \frac{C - U}{C}\right) \tag{1}$$

[0119] Where I is the intersection area of the ground truth bounding box and the predicted bounding box, U is the union area of the ground truth bounding box and the predicted bounding box, and C is the area of the smallest rectangular region that encloses both the ground truth bounding box and the predicted bounding box.
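A small worked example may make the roles of I, U, and C concrete. The Python sketch below assumes the standard GIoU form, loss = 1 − (I/U − (C − U)/C), and boxes given as (x1, y1, x2, y2) corner coordinates; the function name and the box format are illustrative assumptions rather than details taken from this embodiment.

```python
def giou_loss(box_a, box_b):
    """GIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # I: intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # U: union area.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter

    # C: area of the smallest rectangle enclosing both boxes.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    return 1.0 - (inter / union - (enclose - union) / enclose)


same = giou_loss((0, 0, 1, 1), (0, 0, 1, 1))      # identical boxes
disjoint = giou_loss((0, 0, 1, 1), (2, 0, 3, 1))  # I = 0, U = 2, C = 3
```

For the disjoint pair the IoU is zero, yet the loss still depends on how far apart the boxes are through the (C − U)/C term, which is the reason for preferring GIoU over plain IoU as a regression loss.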

[0120] The classification loss uses the binary cross-entropy loss function. Therefore, the complete loss function is expressed as:

[0121]

[0122] Wherein the first term represents the bounding box regression loss, the second and third terms represent the confidence prediction loss, and the fourth term represents the category prediction loss; GIoU Loss is the GIoU loss, S×S is the number of grids into which the image is divided, B is the number of bounding boxes predicted per grid cell, $C_i$ is the predicted confidence, and $p_i(c)$ is the probability, predicted by the model, that the i-th bounding box contains category c.

[0123] Threshold filtering is applied to the intersection prediction results: a threshold is set on the category probability, only intersection targets whose category probability exceeds the threshold are retained, and detections about which the model is uncertain are removed.

[0124] To handle multiple overlapping candidate boxes for the same intersection target, non-maximum suppression is adopted: the bounding box with the highest class probability is retained, and other bounding boxes that highly overlap with it are deleted.
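The two post-processing steps above, threshold filtering followed by non-maximum suppression, can be sketched in Python with NumPy as follows. This is a generic greedy NMS, not the exact routine of this embodiment; the threshold values 0.25 and 0.45 and the (x1, y1, x2, y2) box format are assumptions chosen for illustration, since the text does not state them.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes in (x1, y1, x2, y2) format."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def filter_and_nms(boxes, scores, score_thresh=0.25, iou_thresh=0.45):
    """Drop low-score boxes, then greedily keep the best box of each overlapping group."""
    idx = np.flatnonzero(scores >= score_thresh)   # threshold filtering
    idx = idx[np.argsort(-scores[idx])]            # highest score first
    keep = []
    while idx.size:
        best, rest = idx[0], idx[1:]
        keep.append(int(best))
        if rest.size == 0:
            break
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        idx = rest[overlaps <= iou_thresh]         # discard boxes that overlap too much
    return keep


boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30], [0, 0, 5, 5]], dtype=float)
scores = np.array([0.9, 0.8, 0.7, 0.1])
kept = filter_and_nms(boxes, scores)
```

In this example the second box overlaps the first with an IoU of 0.81 and is suppressed, and the fourth box is removed earlier by the score threshold.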

[0125] Step 36) Use the training set of the dataset constructed in Step 2) to train YOLOv5 models with different backbone networks, and select, for each model, the parameters that achieve the highest mean average precision (mAP) over the multiple rounds of training as the individual learners to be integrated.

[0126] After configuring the training environment (GPU and other hardware), set hyperparameters such as decay type, optimizer, and learning rate for the model according to the dataset size. As a preferred implementation, the training environment used is: Linux Ubuntu 20.04, NVIDIA GeForce RTX 4090 (24GB, GPU), Intel(R) Xeon(R) Platinum 8375C @ 2.90GHz (CPU), 80GB RAM, Python 3.8, PyTorch 1.11.0, and CUDA 11.3.

[0127] Based on the characteristics of the dataset and the model, the training parameters are set as shown in Table 3:

[0128] Table 3

[0129]
Parameter | Value | Parameter | Value
Input dimensions | 640×640 | Number of iterations | 300
Batch size | 12 | Decay type | cos
Initial learning rate | 1×10^-2 | Minimum learning rate | 1×10^-4
Optimizer | Stochastic gradient descent | Momentum | 0.9
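To illustrate how the "cos" decay type in Table 3 might shape the learning rate, the Python sketch below evaluates a standard cosine annealing schedule between the initial and minimum learning rates over 300 iterations. The exact schedule used in this embodiment is not given, so the formula, the absence of warm-up, and the function name are assumptions.

```python
import math


def cosine_lr(step, total_steps=300, lr_init=1e-2, lr_min=1e-4):
    """Cosine-annealed learning rate at a given iteration (assumed schedule, no warm-up)."""
    cos_factor = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_init - lr_min) * cos_factor


schedule = [cosine_lr(s) for s in (0, 150, 300)]  # start, midpoint, end of training
```

The rate starts at the initial value, falls slowly at first, fastest around the midpoint, and flattens out as it approaches the minimum value.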

[0130] The mean average precision (mAP) is chosen as the evaluation metric for the recognition performance of the trained model, as shown in the following formula:

[0131] $$mAP = \frac{1}{n}\sum_{i=1}^{n} AP_i$$

[0132] Where the average precision $AP_i$ of the i-th class is the area enclosed by the precision-recall (PR) curve of that class and the coordinate axes, and n is 4, representing the four classes to be distinguished: irregular intersections, normal intersections, roundabouts, and grade-separated intersections. The PR curve is drawn from the precision and the recall of the model.
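The relation between the PR curve, AP, and mAP can be sketched numerically. The Python code below integrates a PR curve with the trapezoidal rule and averages the per-class results; detection toolkits often use interpolated variants of AP, so the plain trapezoidal area is an assumption, and the PR points and AP values in the example are made up for illustration and are not results of this embodiment.

```python
import numpy as np


def average_precision(recall, precision):
    """Area under a PR curve, computed with the trapezoidal rule."""
    r = np.asarray(recall, dtype=float)
    p = np.asarray(precision, dtype=float)
    order = np.argsort(r)                 # integrate along increasing recall
    r, p = r[order], p[order]
    return float(np.sum(np.diff(r) * (p[1:] + p[:-1]) / 2.0))


def mean_average_precision(ap_per_class):
    """mAP as the arithmetic mean of the per-class AP values."""
    return float(np.mean(ap_per_class))


# Hypothetical PR points for one class and hypothetical AP values for two classes.
ap_example = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.0])
map_example = mean_average_precision([0.75, 0.25])
```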

[0133] The model training results in this embodiment are shown in Figure 4. The YOLO v5s, v5m, and v5l models converged and stopped training at the 182nd, 212th, and 180th iterations, respectively. Their loss functions decreased rapidly in the early stages of training and then stabilized, indicating good learning performance. The mAP curves show that v5s, v5m, and v5l reached their maximum mAP of 0.697, 0.680, and 0.666 at the 127th, 152nd, and 80th iterations, respectively. Therefore, the backbone network parameters from these three iterations were selected as the individual learners to be integrated, i.e., the AI experts for identifying irregular intersections.

[0134] Step 37) Use the weighted voting method to integrate the trained v5s, v5m, and v5l backbone networks and construct an integrated vector irregular intersection recognition model.

[0135] The training in step 36) yields multiple AI experts capable of recognizing and classifying intersections. These experts are then integrated using the weighted voting method to improve the recognition accuracy of the model. Weighted voting is a learner combination strategy in ensemble learning. In an intersection prediction task, the predicted label of each AI expert is treated as a vote, and each vote is assigned a weight. In this invention, the weight is the product of the AI expert's mAP and its confidence in predicting the target. The label with the highest vote value becomes the final output of the ensemble model.

[0136] For intersection x, the prediction result H(x) output by the ensemble model is shown in the following equation:

[0137] $$H(x) = c_{\arg\max_{j} \sum_{i=1}^{T} mAP_i \cdot C_i^{j}(x) \cdot h_i^{j}(x)}$$

[0138] In the formula, $h_i$ is the i-th AI expert, T is the number of AI experts (three in this embodiment), and $mAP_i$ is the mAP of the i-th expert on the validation set. Expert $h_i$ predicts a category from the intersection category set $\{c_0 = NI, c_1 = DI, c_2 = RA, c_3 = IC\}$, and the prediction of $h_i$ at intersection x is represented as the 4-dimensional vector $(h_i^0(x), h_i^1(x), h_i^2(x), h_i^3(x))$: if $h_i$ predicts intersection x as category $c_j$, then $h_i^j(x)$ takes the value 1, otherwise 0. $C_i^j(x)$ is the confidence with which $h_i$ predicts intersection x as $c_j$.
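The weighted voting rule can be sketched in a few lines of Python. Each expert contributes its validation mAP multiplied by its confidence to the score of the class it predicts, and the class with the largest total wins. The mAP values 0.697, 0.680, and 0.666 are those reported for v5s, v5m, and v5l in this embodiment; the predicted labels and confidences in the example are hypothetical, as is the function name.

```python
CLASSES = ("NI", "DI", "RA", "IC")


def weighted_vote(predictions, expert_maps):
    """Combine (label, confidence) votes using mAP * confidence as the vote weight.

    predictions: one (label, confidence) pair per expert for the same intersection.
    expert_maps: validation mAP of each expert, in the same order.
    """
    scores = dict.fromkeys(CLASSES, 0.0)
    for (label, confidence), expert_map in zip(predictions, expert_maps):
        scores[label] += expert_map * confidence
    winner = max(scores, key=scores.get)
    return winner, scores


expert_maps = [0.697, 0.680, 0.666]                     # v5s, v5m, v5l
predictions = [("DI", 0.6), ("NI", 0.9), ("DI", 0.5)]   # hypothetical votes
label, scores = weighted_vote(predictions, expert_maps)
```

Here two moderately confident votes for DI (0.4182 + 0.3330) outweigh a single confident vote for NI (0.6120), so the ensemble outputs DI.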

[0139] Based on this integration strategy, an integrated YOLOv5 vector irregular intersection recognition model based on weighted voting was constructed. When the integrated YOLOv5 model was used to identify intersections in the 52 road network images of the validation set, its mAP was calculated to be 0.843. Compared with YOLO v5s, the individual model with the highest recognition accuracy (mAP 0.697), this is a relative improvement of 20.95%, indicating a significant gain in the recognition accuracy of the integrated model.

[0140] Step 4) Input the vector road network to be detected into the vector irregular intersection recognition model to identify the number of irregular intersections, the clipping diagram and clipping box information, and calculate the latitude and longitude coordinates of the center of each irregular intersection.

[0141] Step 41) Divide the vector road network within the target area to be detected into several square grid tiles, and export the tiles in order of grid ID as Tag Image File Format (TIFF) files with latitude and longitude coordinates.

[0142] In this embodiment, the Gusu, Xiangcheng, Wuzhong and Wujiang districts of Suzhou are used as the target range for this detection. The vector road network within the target range is exported as TIFF raster slices according to the method in step 22).

[0143] Step 42) Input the road network grid slice to be detected into the vector irregular intersection recognition model, obtain the recognition and classification results of different types of intersections, count the number of each type of intersection, filter out the irregular intersections and export their clipping images, as well as the center coordinates, length and width of the clipping frame.

[0144] The identification and classification results of the four types of intersections obtained in this embodiment are shown in Table 4:

[0145] Table 4

[0146]
Intersection type | Normal intersection | Irregular intersection | Interchange | Roundabout | Total
Quantity | 5420 | 1799 | 92 | 16 | 7327

[0147] The results show that there are a total of 7327 intersections in the four administrative districts of Suzhou, of which 1799 are irregular intersections, accounting for 24.55%. A comparison between the recognition results of the integrated YOLOv5 vector irregular intersection recognition model and the actual image maps is shown in Figure 5: the model has learned the detailed features of the various intersection types and can classify and locate them accurately.

[0148] Step 43) Using the geographic information tags of the road network raster image and the cropping box information generated in step 42), calculate the longitude and latitude coordinates (Long, Lat) of the center of each irregular intersection according to the principle shown in Figure 6:

[0149] $$Long = Long_0 + x \cdot w \cdot r, \qquad Lat = Lat_0 - y \cdot h \cdot r$$

[0150] Where $(Long_0, Lat_0)$ are the longitude and latitude coordinates of the top-left corner of the raster image; (x, y) are the center coordinates of the cropping box in a relative coordinate system whose origin is the top-left corner of the image and whose horizontal and vertical axes take values in [0, 1], with y measured downward from the top edge; w and h are the width and height of the image in pixels, respectively; and r is the image resolution, that is, the longitude or latitude value corresponding to one pixel.
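The coordinate conversion of step 43) amounts to scaling the relative box center to pixels and then to degrees. The Python sketch below assumes a north-up image, so that longitude increases to the right and latitude decreases downward, and a single resolution r for both axes; the function name and all numeric values in the example are hypothetical.

```python
def crop_center_to_lonlat(long0, lat0, x, y, w, h, r):
    """Convert a relative crop-box center (x, y) in [0, 1] to longitude and latitude.

    long0, lat0: coordinates of the top-left image corner, in degrees.
    w, h: image width and height in pixels; r: degrees per pixel.
    """
    lon = long0 + x * w * r   # move right from the left edge
    lat = lat0 - y * h * r    # move down (south) from the top edge
    return lon, lat


# Hypothetical tile: 1000 x 1000 pixels, 1e-5 degrees per pixel, box centered in the tile.
lon, lat = crop_center_to_lonlat(120.0, 31.5, 0.5, 0.5, 1000, 1000, 1e-5)
```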

[0151] Step 5) Use the angle texture feature algorithm to identify the number of road intersections, the included angle, and the deflection angle in the irregular intersection clipping image, and construct an irregular intersection feature distribution dataset.

[0152] Step 51) Introduce the Angular Texture Signature (ATS) algorithm, as shown in Figure 7, and extract features from the cropped image of each irregular intersection.

[0153] Step 511) Convert the cropped image into a binary image. With the intersection point of the roads as the center, rotate a fan-shaped mask with an included angle of 1° in steps of 1°, and compute the average gray value under the mask at each position.

[0154] Step 512) Plot the angle on the horizontal axis and the average gray value on the vertical axis to obtain the angle texture map. Each gray-value valley in the angle texture map corresponds to one road in the cropped image, so the number of valleys is the number of roads meeting at the intersection.

[0155] Step 513) Calculate the included angles of the intersection from the differences between the horizontal coordinates of adjacent valleys in the angle texture map; three-leg and four-leg irregular intersections have three and four included angles, respectively.

[0156] Step 514) Calculate the deflection angles of the intersection from the differences between the horizontal coordinates of two adjacent valleys in the angle texture map. For example, a four-leg irregular intersection has two deflection angles.
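Steps 511) to 513) can be sketched in Python with NumPy as follows. This is a simplified illustration rather than the implementation of this embodiment: roads are assumed to be dark (0) on a light (255) background, each 1° sector is approximated by three rays sampled at one-pixel radial spacing, valleys are found by thresholding a lightly smoothed signature at the midpoint between its minimum and maximum, and all function names and parameters are hypothetical. The deflection angle of step 514) is not computed, because its exact formula is not given above.

```python
import numpy as np


def angle_texture_signature(img, center, radius, step_deg=1.0, width_deg=1.0):
    """Mean gray value under a narrow sector rotated around `center`, one value per step."""
    cx, cy = center
    radii = np.arange(3, radius + 1, dtype=float)             # skip the junction core
    offsets = np.linspace(-width_deg / 2, width_deg / 2, 3)   # 3 rays approximate a sector
    angles = np.arange(0.0, 360.0, step_deg)
    signature = np.empty(len(angles))
    for k, a in enumerate(angles):
        theta = np.radians(a + offsets)[:, None]
        xs = np.rint(cx + radii * np.cos(theta)).astype(int)
        ys = np.rint(cy - radii * np.sin(theta)).astype(int)  # image y axis points down
        xs = np.clip(xs, 0, img.shape[1] - 1)
        ys = np.clip(ys, 0, img.shape[0] - 1)
        signature[k] = img[ys, xs].mean()
    return angles, signature


def find_valleys(angles, signature, smooth=5):
    """Return the center angle of each run of low gray values (one run per road)."""
    n, pad = len(signature), smooth // 2
    padded = signature[np.arange(-pad, n + pad) % n]          # circular padding
    sm = np.convolve(padded, np.ones(smooth) / smooth, mode="valid")
    below = sm < (sm.min() + sm.max()) / 2.0
    if below.all() or not below.any():
        return []
    start = int(np.argmin(below))        # rotate so that no run wraps past 360 degrees
    below, rolled = np.roll(below, -start), np.roll(angles, -start)
    runs, run = [], []
    for flag, a in zip(below, rolled):
        if flag:
            run.append(a)
        elif run:
            runs.append(run)
            run = []
    if run:
        runs.append(run)
    centers = []
    for r in runs:
        r = np.asarray(r)
        r = np.where(r < r[0], r + 360.0, r)   # unwrap angles that crossed 0 degrees
        centers.append(float(r.mean() % 360.0))
    return sorted(centers)


def included_angles(valleys):
    """Angles between adjacent roads, going once around the intersection."""
    v = np.sort(np.asarray(valleys, dtype=float))
    return np.diff(np.append(v, v[0] + 360.0))
```

The number of valleys returned by `find_valleys` gives the number of roads, and `included_angles` returns one angle per pair of adjacent roads, so the values always sum to 360°.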

[0157] Step 52) After extracting the geometric features of each irregular intersection in batches according to Step 51), together with the latitude and longitude coordinate fields generated in Step 43), they constitute the irregular intersection feature distribution dataset, which is used for in-depth research such as spatial distribution features and geometric feature distribution analysis.

[0158] This embodiment uses the ATS method to extract, in batches, the geometric features of the irregular intersections in the four main urban districts of Suzhou; the resulting geometric feature distributions are visualized in Figure 8. The geometric feature fields of each intersection are combined with the latitude and longitude coordinate fields generated in step 43) to form the irregular intersection feature distribution dataset. Further research using this dataset may include, but is not limited to: examining the mathematical distribution types of the included angles, deviation angles, and number of road intersections; determining the relationship between the parameters of each distribution type and the quality of road network planning; providing guidance for overall road network optimization; analyzing the spatial distribution characteristics of irregular intersections within the detection area using the latitude and longitude coordinates; and comprehensively analyzing the correlation between irregular intersections and traffic accidents using the latitude and longitude coordinates and the geometric features.

[0159] The preferred embodiments of the present invention have been described in detail above. It should be understood that those skilled in the art can make numerous modifications and variations based on the concept of the present invention without creative effort. Therefore, all technical solutions that can be obtained by those skilled in the art based on the concept of the present invention through logical analysis, reasoning, or limited experimentation on the basis of existing technology should be within the scope of protection defined by the claims.

Claims

1. A method for identifying and extracting features from deformed intersections based on integrated YOLOv5 and angular texture features, characterized in that it includes the following steps: Step 1) Construct a definition criterion for irregular intersections based on four quantitative indicators and one qualitative indicator, wherein the quantitative indicators include the included angle, the deviation angle, the number of road intersections, and the misalignment of the intersection, and the qualitative indicator is poor horizontal alignment; Step 2) Collect typical urban vector road network data and construct a raster tile dataset, and label the dataset according to the definition criteria in Step 1); Step 3) Use the training set in the dataset constructed in Step 2) to train YOLOv5 models with different backbone networks, and select the model parameters with the highest mean average precision over multiple rounds of training as the individual learners to be integrated; integrate the individual learners using the weighted voting method, in which, for an intersection prediction task, the predicted label of each AI expert is regarded as a vote, each vote is assigned a weight equal to the product of the mean average precision of the AI expert and its confidence in predicting the target, and the label with the highest vote value is taken as the final output of the integrated model, thereby constructing an integrated vector deformed intersection recognition model; Step 4) Input the vector road network to be detected into the vector irregular intersection recognition model to identify the number of irregular intersections, the cropped images and the cropping box information, and calculate the latitude and longitude coordinates of the center of each irregular intersection; Step 5) Use the angle texture feature algorithm to identify the number of road intersections, the included angle, and the deflection angle in the cropped image of each irregular intersection, and construct a feature distribution dataset for irregular intersections, wherein the angle texture feature algorithm specifically includes: Step 511) Convert the cropped image into a binary image and, with the intersection point of the vector intersection as the center, rotate a mask with a preset step size and a preset included angle and compute the average gray value under the mask at each position; Step 512) Plot the angle on the horizontal axis and the average gray value on the vertical axis to obtain the angle texture map, in which each gray-value valley corresponds to one road in the cropped image, so that the number of valleys is the number of roads meeting at the intersection; Step 513) Calculate the included angles of the intersection from the differences between the horizontal coordinates of adjacent valleys in the angle texture map, wherein three-leg and four-leg deformed intersections have three and four included angles, respectively; Step 514) Calculate the deflection angles of the intersection from the differences between the horizontal coordinates of two adjacent valleys in the angle texture map, wherein a four-leg deformed intersection has two deflection angles.

2. The method for identifying and extracting features of deformed intersections based on integrated YOLOv5 and angular texture features according to claim 1, characterized in that, Step 1) includes the following steps: Step 11) Based on the characteristics of vector intersections and referring to intersection design specifications, calibrate four quantitative indicators and one qualitative indicator; Step 12) An intersection is considered an irregular intersection if it meets any of the following conditions: a) The included angle of the intersection is less than 75° or greater than 115°, where the included angle is the angle enclosed by the two adjacent road edges; b) The deviation angle is greater than 5°, where the deviation angle is the angle at which the straight-moving vehicle flow line deviates in the crossroads; c) The misalignment is greater than 3m and less than 50m; d) An access point exists within the physical area of ​​the intersection; e) There are five or more road intersections; f) Matches with the template library formed by summarizing other special irregular intersections.

3. The method for identifying and extracting features of deformed intersections based on integrated YOLOv5 and angular texture features according to claim 1, characterized in that Step 2) includes the following steps: Step 21) Select several cities with typical road network layouts as data sources, and collect and preprocess their vector road network data, wherein the roads include highways and urban roads; Step 22) Divide the urban road network data collected in Step 21) into units: divide the road network into several square grid tiles, and export the tiles in order of grid ID as Tag Image File Format (TIFF) files with latitude and longitude coordinates, so as to obtain the raster image dataset of road network slices; Step 23) Based on the definition criteria for irregular intersections constructed in Step 1), use data annotation software to manually annotate the target intersections in the raster image dataset of road network slices, and divide the dataset into a training set, a validation set and a test set according to the number of intersections of each type, wherein the annotated intersection types include irregular intersections, normal intersections, grade-separated intersections and roundabouts.

4. The method for identifying and extracting features of deformed intersections based on integrated YOLOv5 and angular texture features according to claim 3, characterized in that the preprocessing in step 21) involves deleting internal roads of buildings, unclassified highways, and other roads whose class is lower than the preset class, retaining only the road sections that connect to the main line, and symbolizing each road class with a different color so that the classes can be distinguished and their implicit features exploited.

5. The method for identifying and extracting features of deformed intersections based on integrated YOLOv5 and angular texture features according to claim 1, characterized in that Step 3) also includes the following steps: Step 31) Construct multiple YOLOv5 backbone networks, wherein the YOLOv5 model includes an input network, a backbone network, a neck network, and a detection network; Step 32) Feed the labeled road network raster images to the input end of the YOLOv5 model, and perform size adjustment and data augmentation operations; Step 33) Input the image processed in step 32) into the backbone network for feature extraction, wherein the backbone network includes a Focus module, a CBL module, a cross stage partial network module, and a spatial pyramid pooling module; Step 34) While the backbone network is extracting features, the neck network uses a feature pyramid network and a path aggregation network to extract and fuse intersection feature information, thereby enhancing the ability to detect various intersection targets; Step 35) Input the feature-extracted road network image into the detection end to generate the intersection prediction boxes and confidence scores, wherein the detection end consists of convolutional layers, pooling layers and fully connected layers.

6. The method for identifying and extracting features of deformed intersections based on integrated YOLOv5 and angular texture features according to claim 5, characterized in that Step 33) includes the following steps: Step 331) The road network image processed in step 32) is divided into four sub-maps by the Focus module; after the sub-maps are stitched together along the channel direction, each sub-map is convolved and then output, realizing the transformation from the spatial dimension to the channel dimension; Step 332) Using the four CBL modules in the backbone network, perform convolution, batch normalization, and Leaky ReLU activation on the image processed in step 331); Step 333) Using the cross stage partial network module in the backbone network, the input feature map is divided into two parts, wherein the first part is processed by convolutional layers for feature extraction to obtain deeper feature information, and the second part retains the original feature information and is concatenated with the convolved feature map of the first part, so that more low-level and high-level contextual feature information is retained; Step 334) After the last convolutional layer of the backbone network, feature maps of different scales are processed by the spatial pyramid pooling module to obtain feature vectors of fixed length.

7. The method for identifying and extracting features of deformed intersections based on integrated YOLOv5 and angular texture features according to claim 5, characterized in that, Step 34) includes the following steps: Step 341) Use a feature pyramid network to process target intersections at different scales and locations in the same image. By upsampling and downsampling operations, feature maps at different levels are fused together to generate a multi-scale feature pyramid. Semantic and geometric information are then fused through lateral connections. Step 342) Use the path aggregation network to obtain features from different levels in the backbone network, connect the low-level feature maps with the high-level feature maps laterally to retain more intersection details, and use upsampling operation to increase the spatial resolution of the shallow feature maps to the same level as the deep feature maps, thereby achieving feature map fusion.

8. The method for identifying and extracting features of deformed intersections based on integrated YOLOv5 and angular texture features according to claim 5, characterized in that the YOLOv5 model uses bounding box regression loss and classification loss as its loss functions, wherein the bounding box regression loss adopts the GIoU loss function and the classification loss adopts the binary cross-entropy loss function, so that the complete loss function is expressed as follows: wherein the first term represents the bounding box regression loss, the second and third terms represent the confidence prediction loss, and the fourth term represents the category prediction loss; GIoU Loss is the GIoU loss, S×S is the number of grids into which the image is divided, B is the number of bounding boxes predicted for each grid cell, $C_i$ is the predicted confidence, and $p_i(c)$ is the probability, predicted by the model, that the i-th bounding box contains category c.

9. The method for identifying and extracting features of deformed intersections based on integrated YOLOv5 and angular texture features according to claim 1, characterized in that Step 4) includes the following steps: Step 41) Divide the vector road network within the target area to be detected into several square grid tiles, and export the tiles in order of grid ID as Tag Image File Format (TIFF) files with latitude and longitude coordinates; Step 42) Input the road network grid slices to be detected into the vector irregular intersection recognition model, obtain the recognition and classification results for the different types of intersections, count the number of intersections of each type, filter out the irregular intersections, and export their cropped images together with the center coordinates, length and width of each cropping box; Step 43) Using the geographic information tags of the road network raster image and the cropping box information generated in step 42), calculate the longitude and latitude coordinates Long and Lat of the center of each irregular intersection: $Long = Long_0 + x \cdot w \cdot r$, $Lat = Lat_0 - y \cdot h \cdot r$, wherein $(Long_0, Lat_0)$ are the longitude and latitude coordinates of the top-left corner of the raster image; (x, y) are the center coordinates of the cropping box in a relative coordinate system whose origin is the top-left corner of the image and whose horizontal and vertical axes take values in [0, 1]; w and h are the width and height of the image in pixels, respectively; and r is the image resolution, that is, the longitude or latitude value corresponding to one pixel.

10. The method for identifying and extracting features of deformed intersections based on integrated YOLOv5 and angular texture features according to claim 9, characterized in that, Step 5) includes the following steps: Step 51) Introduce the angle texture feature algorithm to extract the features of each deformed intersection clipping image; Step 52) After extracting the geometric features of each irregular intersection in batches according to Step 51), the features are combined with the latitude and longitude coordinate fields generated in Step 43) to form the irregular intersection feature distribution dataset.