Traffic light flashing tracking method and device, and unmanned vehicle

By using target detection algorithms and Kalman filter matching algorithms, the problem of autonomous vehicles being unable to recognize flashing traffic lights was solved, enabling accurate tracking and recognition of traffic lights and improving safety.

CN114998868B · Active · Publication Date: 2026-06-26 · SHENZHEN UNITY-DRIVE INNOVATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHENZHEN UNITY-DRIVE INNOVATION TECH CO LTD
Filing Date
2022-06-28
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Self-driving cars cannot accurately recognize flashing traffic lights, leading to reduced safety.

Method used

The target detection algorithm is used to obtain the traffic light detection boxes in the current frame image. The category of the detection box is obtained and the first detection box of the preset category is selected. The Kalman filter is combined to obtain the prediction boxes of the previous N frames of images, and the matching algorithm is used to confirm that the detection box is being tracked.

Benefits of technology

It enables accurate identification of traffic lights even when they are flashing, thus improving the safety of autonomous vehicles.

Abstract

The present application relates to the technical field of artificial intelligence, and in particular to a traffic light flashing tracking method and device and an unmanned vehicle. A detection box of a traffic light in the current frame image is acquired using a target detection algorithm; the category of the detection box is acquired, and a first detection box with a preset category is selected based on that category; a prediction box, mapped into the current frame image from the traffic lights tracked in the previous N frame images, is acquired; and the first detection box is matched with the prediction box. If the match succeeds, the state of the first detection box is determined as tracked. The traffic light can thus be accurately identified while it is flashing, which improves the safety of the unmanned vehicle.

Description

Technical Field

[0001] This invention relates to the field of artificial intelligence technology, and in particular to a method, device, and driverless car for tracking flashing traffic lights.

Background Technology

[0002] With the continuous development of artificial intelligence technology, self-driving cars are being widely used. However, tracking traffic lights on the road during the movement of self-driving cars is a major challenge. As the self-driving car moves, the duty cycle of the camera in the self-driving car is not consistent with the duty cycle of the traffic light flashing, which makes it impossible for the self-driving car to accurately identify the current status of the traffic light, thus reducing the safety of the self-driving car.

Summary of the Invention

[0003] The present invention provides a method, device and autonomous vehicle for tracking flashing traffic lights, aiming to solve the technical problems of autonomous vehicles being unable to accurately identify flashing traffic lights and having poor safety performance in the prior art.

[0004] To solve the above-mentioned technical problems, one technical solution adopted in the embodiments of the present invention is: to provide a method for tracking the flashing of traffic lights, applied to autonomous vehicles, the method comprising:

[0005] Based on the target detection algorithm, obtain the detection bounding box of the traffic light in the current frame image;

[0006] Obtain the category of the detection box, and obtain a first detection box with a preset category based on the category;

[0007] Obtain the predicted bounding boxes of the tracked traffic lights in the previous N frames and map them to the current frame image;

[0008] The first detection box is matched with the predicted box. If the first detection box and the predicted box match successfully, the state of the first detection box is determined to be tracked.

[0009] Optionally, matching the first detection box with the predicted box, and determining the state of the first detection box as being tracked if the first detection box and the predicted box match successfully, includes:

[0010] Obtain the confidence level of the first detection box;

[0011] Based on the confidence level of the first detection box, the first detection box is divided into high-scoring boxes and low-scoring boxes;

[0012] The high-scoring box is matched with the predicted box to obtain a first matching result;

[0013] The predicted bounding boxes that do not match the high-scoring bounding boxes are matched with the low-scoring bounding boxes to obtain a second matching result;

[0014] The first matching result and the second matching result are integrated to obtain the tracking result of the flashing traffic light.

[0015] Optionally, matching the high-scoring box with the predicted box to obtain a first matching result includes:

[0016] Obtaining first position information of the high-scoring box and second position information of the predicted box in the current frame image;

[0017] Calculating the overlap degree from the first position information and the second position information to obtain high-scoring boxes and predicted boxes whose overlap degree is greater than a first preset threshold, and high-scoring boxes and predicted boxes whose overlap degree is equal to a second preset threshold;

[0018] Matching the high-scoring boxes and predicted boxes whose overlap degree is greater than the first preset threshold, and, if a high-scoring box and a predicted box are successfully matched, confirming the state of the high-scoring box as tracked;

[0019] Calculating the Euclidean distance between the high-scoring box and the predicted box whose overlap degree is equal to the second preset threshold, and confirming that the high-scoring box is tracked when the Euclidean distance is less than the preset distance.

[0020] Optionally, matching the predicted bounding box that does not match the high-scoring bounding box with the low-scoring bounding box to obtain a second matching result includes:

[0021] Obtaining third position information of the low-scoring box and fourth position information of the predicted box that does not match the high-scoring box in the current frame image;

[0022] Calculating the overlap degree from the third position information and the fourth position information to obtain low-scoring boxes and unmatched predicted boxes whose overlap degree is greater than a third preset threshold, and low-scoring boxes and unmatched predicted boxes whose overlap degree is equal to the second preset threshold;

[0023] Matching the low-scoring boxes whose overlap degree is greater than the third preset threshold with the predicted boxes that do not match the high-scoring boxes, and, if a low-scoring box successfully matches such a predicted box, confirming that the low-scoring box is tracked;

[0024] Calculating the Euclidean distance between the low-scoring box whose overlap degree is equal to the second preset threshold and the predicted box that does not match the high-scoring box, and confirming that the low-scoring box is tracked when the Euclidean distance is less than the preset distance.

[0025] Optionally, after determining that the state of the first detection box is being tracked, the method further includes:

[0026] Obtain the category of the first tracked detection box and determine the duration for which the first tracked detection box remains within the category;

[0027] If the duration is less than a preset time threshold, the traffic light corresponding to the first detection box being tracked is determined to be a flashing traffic light.

[0028] If the duration is greater than or equal to the preset time threshold, the traffic light corresponding to the first detection box being tracked is determined to be a blinking traffic light.

[0029] Optionally, after determining that the state of the first detection box is being tracked, the method further includes:

[0030] Obtain the category of the first detected bounding box being tracked in the previous frame image, and update the category of the first detected bounding box being tracked to the category in the previous frame image;

[0031] The first detected bounding box that is tracked is used as the predicted bounding box for the traffic lights in the next frame of the current frame image, and the traffic lights in the next frame image are tracked based on the predicted bounding box.

[0032] Optionally, after obtaining the detection bounding box of the traffic light in the current frame image, the method further includes:

[0033] Obtain all detection boxes in the current frame image;

[0034] The overlap of the detection boxes is calculated to determine whether there are overlapping detection boxes in the current frame image;

[0035] When there are overlapping detection boxes in the current frame image, the detection boxes with higher confidence are retained based on the confidence of the detection boxes.

[0036] To solve the above-mentioned technical problems, another technical solution adopted in the embodiments of the present invention is: providing a traffic light flashing tracking device for use in autonomous vehicles, the device comprising:

[0037] The first acquisition module is used to acquire the detection box of the traffic light in the current frame image according to the target detection algorithm;

[0038] The second acquisition module is used to acquire the category of the detection box, and acquire a first detection box with a preset category based on the category;

[0039] The third acquisition module is used to acquire the predicted bounding boxes mapped from the tracked traffic lights in the previous N frames to the current frame image.

[0040] The matching module is used to match the first detection box with the predicted box. If the first detection box and the predicted box are successfully matched, the state of the first detection box is determined to be tracked.

[0041] To solve the above-mentioned technical problems, another technical solution adopted in the embodiments of the present invention is: providing an unmanned vehicle, comprising:

[0042] At least one processor; and,

[0043] A memory communicatively connected to the at least one processor; wherein,

[0044] The memory stores instructions that can be executed by the at least one processor to enable the at least one processor to perform the method as described above.

[0045] To solve the above-mentioned technical problems, another technical solution adopted in the embodiments of the present invention is to provide a non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium stores computer-executable instructions, and when the computer-executable instructions are executed by a processor, the processor performs the method described above.

[0046] Unlike related technologies, this invention provides a method, apparatus, and autonomous vehicle for tracking flashing traffic lights. It primarily uses a target detection algorithm to obtain detection boxes for traffic lights in the current frame image, then determines the category of the detection boxes, and obtains a first detection box with a preset category based on the category. Next, it obtains predicted boxes mapping the tracked traffic lights from the previous N frames to the current frame image. Finally, it matches the first detection box with the predicted box. If the first detection box and the predicted box match successfully, the state of the first detection box is determined to be tracked. This enables accurate identification of traffic lights when they are flashing, improving the safety of autonomous vehicles.

Attached Figure Description

[0047] One or more embodiments are illustrated by way of example with reference to the accompanying drawings. These illustrations do not constitute a limitation on the embodiments. Elements having the same reference numerals in the drawings are denoted as similar elements. Unless otherwise stated, the figures in the drawings are not to be limited by scale.

[0048] Figure 1 is a flowchart of a method for tracking traffic light flashing provided in an embodiment of the present invention;

[0049] Figure 2 is a flowchart of a method for tracking traffic light flashing according to another embodiment of the present invention;

[0050] Figures 3a-3b are schematic diagrams of consecutive frame images acquired during traffic light tracking, provided in an embodiment of the present invention;

[0051] Figure 4 is a structural block diagram of a traffic light flashing tracking device provided in an embodiment of the present invention;

[0052] Figure 5 is a structural diagram of an autonomous vehicle provided in an embodiment of the present invention.

Detailed Implementation

[0053] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the invention.

[0054] It should be noted that, unless otherwise specified, the various features in the embodiments of the present invention can be combined with each other, and all are within the protection scope of the present invention. Furthermore, although functional modules are divided in the device schematic diagram and a logical order is shown in the flowchart, in some cases, the steps shown or described may be performed in a different module division or in a different order than that shown in the device schematic diagram or the flowchart.

[0055] Unless otherwise defined, all technical and scientific terms used in this specification have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. The terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the invention. The term "and / or" as used in this specification includes any and all combinations of one or more of the associated listed items.

[0056] Please refer to Figure 1, which is a flowchart illustrating a method for tracking the flashing of traffic lights according to an embodiment of the present invention. This method is applied to autonomous vehicles. As shown in Figure 1, the method includes:

[0057] S11. Based on the target detection algorithm, obtain the detection box of the traffic light in the current frame image.

[0058] Specifically, the autonomous vehicle includes a camera. During its movement, the camera captures real-time images of the current road segment, thus obtaining a current frame image of the current road segment, where traffic lights are present. This current frame image is then input into a target detection algorithm to obtain detection boxes for the traffic lights within the current frame image.

[0059] Object detection algorithms can be divided into one-stage and two-stage algorithms. Two-stage algorithms include the R-CNN (Region-CNN) series, a technique for object detection based on convolutional neural networks (CNNs), linear regression, and other algorithms. A two-stage detector first processes the input image through a candidate box generation network to produce candidate boxes, and then classifies the content of the candidate boxes through a classification network. One-stage algorithms include the YOLO (You Only Look Once) series, an object detection system based on a single neural network. A one-stage detector processes the input image through one network to directly generate detection boxes, which include location and category information. Preferably, this embodiment uses a YOLOv5 network to process the current frame image to obtain the detection boxes for traffic lights in the current frame image.

[0060] In some embodiments, TensorRT can also be constructed within the object detection algorithm. TensorRT refers to an inference framework that can run on various GPU (graphics processing unit) hardware platforms. By converting the model trained using the framework into TensorRT format and then running the model using the TensorRT inference engine, the speed at which the model runs on the GPU is improved. Therefore, by constructing TensorRT, the detection speed of the object detection algorithm on the frame image can be improved, thereby achieving the goal of real-time detection of the frame image without missing any frames.

[0061] In some embodiments, after obtaining the detection bounding boxes of the traffic lights in the current frame image, all detection bounding boxes in the current frame image can be obtained. Then, the overlap of these detection bounding boxes is calculated to determine if there are any overlapping detection bounding boxes in the current frame image. When overlapping detection bounding boxes exist, the detection bounding box with the higher confidence level is retained based on its confidence score. This avoids the possibility of two detection bounding boxes for the same traffic light due to detection errors, thereby improving the accuracy of traffic light detection.
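
The overlap check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Detection` type, the `drop_overlapping` helper, and the overlap threshold of 0.5 are assumptions introduced here, since the text does not state what overlap counts as "overlapping".

```python
from dataclasses import dataclass


@dataclass
class Detection:
    x1: float
    y1: float
    x2: float
    y2: float
    category: str
    score: float


def iou(a, b):
    """Intersection over Union of two axis-aligned boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


def drop_overlapping(dets, overlap_threshold=0.5):
    """Keep the higher-confidence box of every overlapping pair.

    overlap_threshold is a hypothetical value chosen for this sketch.
    """
    kept = []
    # Visit boxes from most to least confident so the better one is kept first.
    for det in sorted(dets, key=lambda d: d.score, reverse=True):
        if all(iou(det, k) <= overlap_threshold for k in kept):
            kept.append(det)
    return kept


dets = [
    Detection(100, 50, 120, 100, "red", 0.90),
    Detection(102, 52, 122, 102, "black", 0.40),  # duplicate of the first light
    Detection(300, 60, 320, 110, "green", 0.80),
]
kept = drop_overlapping(dets)
```

Here the second detection overlaps the first heavily and has the lower confidence, so only the first and third boxes remain.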

[0062] S12. Obtain the category of the detection box, and obtain a first detection box with a preset category based on the category.

[0063] Specifically, the detection box includes the position, category, and confidence level of the corresponding traffic light. The position refers to the coordinates, height, and width of the traffic light in the frame image; the confidence level refers to the importance of the detection box in the frame image; and the category refers to the color of the traffic light. After obtaining the detection box, based on its category, it is determined whether the category of the detection box is a preset category. If the category of the detection box is a preset category, then the detection box is confirmed as the first detection box. Optionally, the preset category can be black. When the traffic light is flashing, the traffic light will alternate between bright and dark. At this time, the category of the traffic light can be red, green, yellow, or black. That is, when the category of the traffic light is black, it is confirmed that the traffic light is in a flashing state.

[0064] In some embodiments, after obtaining the detection bounding boxes of traffic lights in the current frame image, the detection bounding boxes need to be transformed to obtain the specific position, category, and score of the detection bounding box corresponding to the traffic light in the frame image. Specifically, the specific position (x1, y1, x2, y2) of the detection bounding box in the frame image is obtained based on the coordinates, height, and width of the detection bounding box, and the score of the detection bounding box is obtained based on the confidence level. The score can range from 0 to 1. The specific position, category, and score of the detection bounding box can be stored in a preset data structure, such as constructing an strack (stack) to store the specific position, category, and score of the detection bounding box. The stack refers to a specific register, a sequentially arranged data structure, where data items can only be inserted and deleted at one end, strictly following the "last-in, first-out" principle for storage and retrieval, and is used to temporarily store data and addresses.
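
As a rough sketch of steps S12 and the conversion above, the snippet below turns a detector output into corner coordinates and keeps only boxes of the preset category. The `Box` type and both helper names are hypothetical, and the snippet assumes the detector reports the box centre (as YOLO-style outputs usually do); the text only says "coordinates, height, and width".

```python
from typing import NamedTuple

PRESET_CATEGORY = "black"  # the optional preset category named in the text


class Box(NamedTuple):
    x1: float
    y1: float
    x2: float
    y2: float
    category: str
    score: float


def to_corner_box(cx, cy, w, h, category, confidence):
    """Convert centre/size to (x1, y1, x2, y2); assumes (cx, cy) is the centre."""
    score = min(1.0, max(0.0, confidence))  # keep the score within [0, 1]
    return Box(cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, category, score)


def select_first_boxes(boxes):
    """Return the 'first detection boxes', i.e. those of the preset category."""
    return [b for b in boxes if b.category == PRESET_CATEGORY]


raw = [(110, 75, 20, 50, "red", 0.9), (310, 85, 20, 50, "black", 0.6)]
boxes = [to_corner_box(*r) for r in raw]
first_boxes = select_first_boxes(boxes)
```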

[0065] S13. Obtain the predicted bounding boxes of the tracked traffic lights in the previous N frames and map them to the current frame.

[0066] The predicted bounding box can be obtained using a Kalman filter. Specifically, the detection bounding box of the tracked traffic light in the previous frame is obtained and input into the Kalman filter. Then, the estimated position of the tracked traffic light in the current frame is obtained, thus yielding the predicted bounding box in the current frame. The predicted bounding box includes position, category, and confidence level. The position is the estimated position, and the category and confidence level are derived from the category and confidence level of the detection bounding box corresponding to the tracked traffic light. The Kalman filter is an algorithm that uses the state equations of a linear system to optimally estimate the system state using system input and output observation data.
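
The prediction step can be illustrated with a small constant-velocity Kalman filter. This is a sketch under stated assumptions: the text does not specify the state model or noise parameters, so the per-coordinate (position, velocity) state, the values of `q` and `r`, and the class names are all chosen here for illustration. Only the box centre is filtered; the category and confidence would be carried over from the tracked detection as the paragraph describes.

```python
class ConstantVelocityKalman1D:
    """Kalman filter for one coordinate with state (position, velocity)."""

    def __init__(self, position, q=1e-2, r=1.0):
        self.p, self.v = position, 0.0
        self.P = [[1.0, 0.0], [0.0, 10.0]]  # velocity starts highly uncertain
        self.q, self.r = q, r  # process and measurement noise (assumed values)

    def predict(self):
        # State transition F = [[1, 1], [0, 1]] for a one-frame step.
        (a, b), (c, d) = self.P
        self.p += self.v
        self.P = [[a + b + c + d + self.q, b + d], [c + d, d + self.q]]
        return self.p

    def update(self, z):
        # Only the position is measured, so H = [1, 0].
        (a, b), (c, d) = self.P
        s = a + self.r
        k0, k1 = a / s, c / s
        y = z - self.p
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]


class CenterPredictor:
    """Predicts the next box centre from past detections of one traffic light."""

    def __init__(self, cx, cy):
        self.fx = ConstantVelocityKalman1D(cx)
        self.fy = ConstantVelocityKalman1D(cy)

    def predict(self):
        return self.fx.predict(), self.fy.predict()

    def update(self, cx, cy):
        self.fx.update(cx)
        self.fy.update(cy)


# A light drifting 5 px per frame to the right in the image.
predictor = CenterPredictor(100.0, 75.0)
for cx in (105.0, 110.0, 115.0):
    predictor.predict()
    predictor.update(cx, 75.0)
pred_cx, pred_cy = predictor.predict()  # estimate for the next frame
```

After three updates the filter has learned the drift, so the predicted centre lands close to x = 120 while y stays at 75.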

[0067] The state of the traffic light corresponding to the detection box in the current frame can be determined by the state of the detection box. After the detection boxes in the current frame image are matched by the prediction box, the state of the detection boxes in the current frame image can be obtained. The state of the detection box includes the tracked state, the added state, the lost state, and the removed state. The added state refers to a newly appearing detection box in the current frame image; the tracked state refers to the traffic light appearing in at least two frames; the lost state refers to the prediction box not being matched during the matching in the current frame image; and the removed state refers to the detection box in the lost state not being matched within a preset number of frames. That is, during the matching process between the detection box and the prediction box, if the detection box and the prediction box match successfully, the state of the detection box is the tracked state; if the detection box and the prediction box do not match successfully, the state of the detection box is the added state, and the state of the prediction box is the lost state; if the prediction box in the lost state has not been matched within a preset number of frames, the state of the prediction box is the removed state. Preferably, the preset frame count can be 30 frames. If the predicted bounding box of the lost state has not been matched for more than 30 frames, the predicted bounding box of the lost state is updated to the predicted bounding box of the removed state.
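
The four states and the 30-frame removal rule can be sketched as a small state machine. The `Track` class and its method names are hypothetical; only the state names and the preferred frame count come from the text.

```python
from enum import Enum


class TrackState(Enum):
    ADDED = "added"
    TRACKED = "tracked"
    LOST = "lost"
    REMOVED = "removed"


MAX_LOST_FRAMES = 30  # preferred preset frame count from the text


class Track:
    def __init__(self):
        self.state = TrackState.ADDED  # newly appearing detection box
        self.frames_since_match = 0

    def mark_matched(self):
        """A detection box matched this track's prediction box."""
        self.state = TrackState.TRACKED
        self.frames_since_match = 0

    def mark_unmatched(self):
        """No detection box matched this track's prediction box in this frame."""
        if self.state is TrackState.REMOVED:
            return
        self.frames_since_match += 1
        if self.frames_since_match > MAX_LOST_FRAMES:
            self.state = TrackState.REMOVED
        else:
            self.state = TrackState.LOST


track = Track()
track.mark_matched()
for _ in range(MAX_LOST_FRAMES):
    track.mark_unmatched()
state_after_30 = track.state
track.mark_unmatched()
state_after_31 = track.state
```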

[0068] S14. Match the first detection box with the prediction box. If the first detection box and the prediction box match successfully, determine that the state of the first detection box is being tracked.

[0069] For details, please refer to Figure 2, which is a flowchart of a method for tracking traffic light flashing according to another embodiment of the present invention. As shown in Figure 2, the method includes:

[0070] S141. Obtain the confidence level of the first detection box.

[0071] S142. Based on the confidence level of the first detection box, divide the first detection box into high-scoring boxes and low-scoring boxes.

[0072] Based on the score of the first detection box, a detection threshold is set. Then, the score of the first detection boxes in the stack is compared with the detection threshold. Finally, the first detection boxes with scores greater than or equal to the detection threshold are determined as high-scoring boxes, and the first detection boxes with scores less than the detection threshold are determined as low-scoring boxes. The detection threshold is an optimal value obtained after multiple experiments. By setting the detection threshold, priority is given to objects similar to the tracked traffic light in the target detection of the current frame image, thereby improving the accuracy of matching.
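
A minimal sketch of the split, assuming each box is a dictionary with a `score` key (a hypothetical representation) and using the value 0.7 that appears in the Figure 3b embodiment as the detection threshold:

```python
DETECTION_THRESHOLD = 0.7  # value used in the Figure 3b embodiment


def split_by_score(boxes, threshold=DETECTION_THRESHOLD):
    """Split first detection boxes into high-scoring and low-scoring boxes."""
    high = [b for b in boxes if b["score"] >= threshold]
    low = [b for b in boxes if b["score"] < threshold]
    return high, low


first_boxes = [
    {"id": "a", "score": 0.92},
    {"id": "b", "score": 0.70},
    {"id": "c", "score": 0.35},
]
high, low = split_by_score(first_boxes)
```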

[0073] S143. Match the high-scoring box with the predicted box to obtain a first matching result.

[0074] Specifically, after obtaining the predicted box based on the previous frame or the previous N frames, trajectory matching is performed between the high-scoring box and the predicted box. Trajectory matching refers to computing, based on Intersection over Union (IoU), the matching value between the first detection box and the predicted box corresponding to the tracked traffic light in the previous N frames. When the matching value is greater than a first preset threshold, the high-scoring box in the current frame is matched with the predicted box using the Hungarian matching algorithm to obtain a first matching result. The first matching result indicates that the high-scoring box and the predicted box are successfully matched. The Hungarian matching algorithm is a combinatorial optimization algorithm that solves the assignment problem in polynomial time. The first preset threshold is an optimal value obtained from multiple experiments. Using the first preset threshold improves the accuracy of tracking the high-scoring box while avoiding detection errors caused by an excessively high value.
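
The IoU-gated assignment can be sketched as below. To stay dependency-free, the sketch finds the optimal assignment by brute force over permutations, which gives the same result as the Hungarian algorithm on small inputs but is not the Hungarian algorithm itself. The function names are hypothetical, and the first preset threshold of 0.8 is taken from the Figure 3b embodiment.

```python
from itertools import permutations

FIRST_THRESHOLD = 0.8  # value used in the Figure 3b embodiment


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_boxes(dets, preds, threshold=FIRST_THRESHOLD):
    """Return (det_index, pred_index) pairs that maximise the total gated IoU."""
    if not dets or not preds:
        return []
    # Pairs below the threshold contribute nothing and are never reported.
    gain = [[(v if (v := iou(d, p)) >= threshold else 0.0) for p in preds] for d in dets]
    n, m = len(dets), len(preds)
    if n <= m:
        candidates = (list(enumerate(perm)) for perm in permutations(range(m), n))
    else:
        candidates = ([(i, j) for j, i in enumerate(perm)] for perm in permutations(range(n), m))
    best_total, best_pairs = -1.0, []
    for pairs in candidates:
        total = sum(gain[i][j] for i, j in pairs)
        if total > best_total:
            best_total, best_pairs = total, pairs
    return sorted((i, j) for i, j in best_pairs if gain[i][j] > 0.0)


high_boxes = [(100, 50, 120, 100), (300, 60, 320, 110)]
pred_boxes = [(301, 60, 321, 110), (100, 51, 120, 101)]
matches = match_boxes(high_boxes, pred_boxes)
```

In practice a polynomial-time solver would replace the permutation search once more than a handful of boxes are involved.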

[0075] In some embodiments, since the camera moves along with the autonomous vehicle, traffic light tracking is achieved by using a moving camera to track a static traffic light. Each camera movement during traffic light tracking causes a significant pixel displacement of the traffic light in the image. Therefore, when matching the overlap between the high-scoring box and the predicted box, if the matching value between the high-scoring box and the predicted box equals a second preset threshold, the Euclidean distance between the high-scoring box and the predicted box can be calculated and compared with a preset distance. When the Euclidean distance is less than the preset distance, the high-scoring box is determined to be in a tracked state. Furthermore, the overlap between the high-scoring box and the predicted box is modified to a third preset threshold to prevent a high-scoring box in a tracked state from being filtered out due to numerical settings, thereby improving the accuracy of traffic light tracking. The third preset threshold is less than the first preset threshold, and the second and third preset thresholds are optimal values obtained through multiple experiments. The Euclidean distance is the distance between the high-scoring box and the predicted box in the current frame image, and its unit is pixels. The preset distance is set based on the current speed of the autonomous vehicle. Preferably, the second preset threshold is 0, the third preset threshold is 0.5, and when the speed of the autonomous vehicle is 30-40 kilometers per hour, the preset distance is 50 pixels.
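
A sketch of the distance fallback follows. It measures the distance between box centres, which is an assumption: the text only says the distance is taken between the two boxes in pixels. The function names are hypothetical, and the defaults are the preferred values given above (second preset threshold 0, preset distance 50 pixels at 30-40 km/h).

```python
import math


def center(box):
    """Centre of a box given as (x1, y1, x2, y2)."""
    return (box[0] + box[2]) / 2, (box[1] + box[3]) / 2


def center_distance(a, b):
    (ax, ay), (bx, by) = center(a), center(b)
    return math.hypot(ax - bx, ay - by)


def fallback_match(det, pred, overlap, second_threshold=0.0, preset_distance=50.0):
    """Accept a pair whose overlap equals the second threshold if it is close enough."""
    return overlap == second_threshold and center_distance(det, pred) < preset_distance


det = (100, 50, 120, 100)
near_pred = (130, 50, 150, 100)  # no overlap, centres 30 px apart
far_pred = (200, 50, 220, 100)   # no overlap, centres 100 px apart
near_ok = fallback_match(det, near_pred, overlap=0.0)
far_ok = fallback_match(det, far_pred, overlap=0.0)
```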

[0076] Specifically, the system obtains the first position information of the high-scoring box in the current frame image and the second position information of the predicted box derived from the consecutive frame images. Then, based on the first and second position information, it calculates the overlap between the high-scoring box and the predicted box to obtain a matching value. If the matching value is less than the first preset threshold, it obtains the high-scoring box and the predicted box whose matching value equals the second preset threshold, calculates the Euclidean distance between them, and confirms that the high-scoring box is tracked when the Euclidean distance is less than the preset distance. If the matching value is greater than the first preset threshold, it applies the Hungarian matching algorithm to the high-scoring box and the predicted box. If the high-scoring box and the predicted box match successfully, it is determined that the predicted box is successfully tracked, and the traffic light corresponding to the high-scoring box is a flashing traffic light.

[0077] S144. Match the predicted box that does not match the high-scoring box with the low-scoring box to obtain a second matching result.

[0078] Among the predicted boxes, those that have no matching relationship with a high-scoring box are obtained, and the third position information of the low-scoring box and the fourth position information of these unmatched predicted boxes in the current frame image are obtained. The overlap degree between the low-scoring box and each unmatched predicted box is calculated from the third and fourth position information. This yields the pairs of low-scoring boxes and unmatched predicted boxes whose overlap degree is greater than the third preset threshold, as well as the pairs whose overlap degree is equal to the second preset threshold. The pairs whose overlap degree is greater than the third preset threshold are subjected to Hungarian matching. If a low-scoring box and an unmatched predicted box are successfully matched, the low-scoring box is determined to be in a tracked state, and the traffic light corresponding to the predicted box is a flashing traffic light. Next, the Euclidean distance between the low-scoring box and the unmatched predicted box whose overlap degree is equal to the second preset threshold is calculated. When the Euclidean distance is less than the preset distance, the low-scoring box is confirmed to be in a tracked state, and the traffic light corresponding to the predicted box is a flashing traffic light.
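
The two-pass association (high-scoring boxes first, then low-scoring boxes against the leftover predictions) can be sketched as below. To keep the example short it uses a greedy matcher as a stand-in for Hungarian matching, so it is not the matching algorithm named in the text. The function names are hypothetical; the thresholds 0.8 and 0.5 and the 50-pixel distance are the values from the described embodiment.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def center_distance(a, b):
    return math.hypot((a[0] + a[2] - b[0] - b[2]) / 2, (a[1] + a[3] - b[1] - b[3]) / 2)


def associate(dets, preds, pred_ids, iou_threshold, preset_distance=50.0):
    """Greedy stand-in for Hungarian matching.

    Returns ({det_index: pred_id}, [unmatched pred ids]).
    """
    matches, free = {}, list(pred_ids)
    for di, det in enumerate(dets):
        best = None
        for pid in free:
            overlap = iou(det, preds[pid])
            # Accept on overlap, or on distance when the boxes do not overlap at all.
            accepted = overlap >= iou_threshold or (
                overlap == 0.0 and center_distance(det, preds[pid]) < preset_distance
            )
            if accepted and (best is None or overlap > best[0]):
                best = (overlap, pid)
        if best is not None:
            matches[di] = best[1]
            free.remove(best[1])
    return matches, free


def two_stage_match(high, low, preds):
    first, remaining = associate(high, preds, range(len(preds)), iou_threshold=0.8)
    second, lost = associate(low, preds, remaining, iou_threshold=0.5)
    return first, second, lost


preds = [(100, 50, 120, 100), (300, 60, 320, 110), (500, 60, 520, 110)]
high = [(100, 51, 120, 101)]
low = [(302, 62, 322, 112)]
first, second, lost = two_stage_match(high, low, preds)
```

In this example the high-scoring box claims prediction 0, the low-scoring box claims prediction 1 in the second pass, and prediction 2 is left unmatched and would enter the lost state.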

[0079] S145. Integrate the first matching result and the second matching result to obtain the tracking result of the flashing traffic light.

[0080] After the first detection box and the prediction box are matched, the first detection box that is successfully matched with the prediction box is obtained. The first detection box is a flashing traffic light. When the match is successful, it is confirmed that the flashing traffic light corresponding to the first detection box is successfully tracked.

[0081] In some embodiments, after the flashing traffic light is successfully tracked, the autonomous vehicle cannot accurately determine the color of the traffic light because the category of the traffic light in the current frame is black. In this case, the category of the tracked first detection box in the previous frame is obtained, and the category of the first detection box is updated to the category in the previous frame. This avoids the inability to accurately identify the color of the traffic light when the first detection box belongs to the preset category, thereby improving the safety of flashing traffic light tracking.
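
The category update can be sketched in a few lines. The `resolve_category` helper is hypothetical; it simply carries the previous frame's category forward whenever the current frame reports the preset (black) category.

```python
def resolve_category(current_category, previous_category, preset_category="black"):
    """Replace the preset category with the category from the previous frame."""
    if current_category == preset_category and previous_category is not None:
        return previous_category
    return current_category


# Categories observed for one tracked light over five consecutive frames.
observed = ["green", "black", "black", "green", "black"]
resolved = []
previous = None
for category in observed:
    previous = resolve_category(category, previous)
    resolved.append(previous)
```

Because each resolved category feeds the next frame, a run of several dark frames still resolves to the last color that was actually seen.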

[0082] In some embodiments, after determining that the state of the first detection box is being tracked, the category of the tracked first detection box can be obtained, and the duration of the tracked first detection box within the category can be determined. If the duration is less than a preset time threshold, the traffic light corresponding to the tracked first detection box is determined to be a flashing traffic light. If the duration is greater than or equal to the preset time threshold, the traffic light corresponding to the tracked first detection box is determined to be a blinking traffic light. The tracking of the flashing traffic light includes tracking both flashing and blinking traffic lights. The flashing refers to the on / off state of the traffic light, and the blinking state refers to the color change of the traffic light. Optionally, the preset time threshold can be 0.3 seconds.
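
The duration rule can be sketched as follows. How the duration is measured is not specified in the text, so the sketch assumes a per-frame history of (timestamp, category) pairs and takes the span of the most recent run of identical categories; both helper names are hypothetical, and 0.3 seconds is the optional threshold given above.

```python
PRESET_TIME_THRESHOLD_S = 0.3  # optional preset time threshold from the text


def current_category_duration(history):
    """Seconds the latest category has been held.

    history is a list of (timestamp_s, category) pairs in time order.
    """
    last_time, last_category = history[-1]
    start_time = last_time
    for timestamp, category in reversed(history):
        if category != last_category:
            break
        start_time = timestamp
    return last_time - start_time


def classify_light(duration_s, threshold=PRESET_TIME_THRESHOLD_S):
    return "flashing" if duration_s < threshold else "blinking"


short_run = [(0.0, "green"), (0.1, "black"), (0.2, "black")]
long_run = [(0.0, "red"), (0.1, "red"), (0.2, "red"), (0.3, "red"), (0.4, "red")]
short_label = classify_light(current_category_duration(short_run))
long_label = classify_light(current_category_duration(long_run))
```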

[0083] In some embodiments, please refer to Figures 3a-3b, which are schematic diagrams of consecutive frame images acquired during traffic light tracking, as provided in an embodiment of the present invention. As shown in Figure 3a, the object detection algorithm obtains four detection boxes in the first frame image. These four detection boxes are then input into the Kalman filter to obtain four predicted boxes in the second frame image. Next, the object detection algorithm is used to obtain the detection boxes related to traffic lights in the second frame image, and the category of each detection box is determined. As shown in Figure 3b, a first detection box of the preset category exists among the detection boxes. Then, based on its confidence level, the first detection box is classified as a high-scoring box or a low-scoring box: a confidence level higher than 0.7 gives a high-scoring box, and a confidence level lower than 0.7 gives a low-scoring box. When the first detection box is a high-scoring box, the overlap between the high-scoring box and the predicted box is calculated. If the overlap is greater than or equal to 0.8, Hungarian matching is performed between the high-scoring box and the predicted box, and if the matching succeeds, the high-scoring box is considered successfully tracked. If the overlap is 0, the Euclidean distance between the high-scoring box and the predicted box is calculated. When the speed of the autonomous vehicle is 30-40 km/h, the high-scoring box is considered successfully tracked if the Euclidean distance is less than 50 pixels. If the first detection box is a low-scoring box, and a predicted box failed to match any high-scoring box during high-scoring box matching, the overlap between that unmatched predicted box and the low-scoring box is calculated. Low-scoring boxes with an overlap greater than or equal to 0.5 are then subjected to Hungarian matching. If the matching succeeds, the low-scoring box is confirmed to be successfully tracked. If the overlap is 0, the Euclidean distance between the low-scoring box and the predicted box is calculated, and the low-scoring box is considered successfully tracked when the Euclidean distance is less than 50 pixels. After the above matching, if the detection box matches the predicted box, the flashing traffic light is confirmed to be successfully tracked, and the category of the first detection box is updated from the preset category to the category of the detection box in the previous frame.
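The two-stage matching in this example can be sketched as below. This is only an illustration under stated assumptions: overlap is taken to be intersection over union (IoU), the Euclidean distance is measured between box centers, a confidence of exactly 0.7 (which the text leaves unspecified) is treated as low-scoring, and the 50 pixel limit is applied regardless of vehicle speed. To keep the sketch free of dependencies, an exhaustive search over one-to-one assignments stands in for Hungarian matching; it gives the same optimal assignment but is only practical for a handful of boxes. All function names are hypothetical.

```python
from itertools import permutations
from math import hypot

HIGH_SCORE = 0.7    # confidence split from the text
HIGH_IOU = 0.8      # overlap threshold for high-scoring boxes
LOW_IOU = 0.5       # overlap threshold for low-scoring boxes
MAX_DIST_PX = 50.0  # distance limit used when the overlap is 0


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def center_distance(a, b):
    """Euclidean distance between box centers (an assumed definition)."""
    return hypot((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2,
                 (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2)


def best_assignment(dets, preds, min_iou):
    """Exhaustive one-to-one assignment maximizing total IoU.

    Stand-in for Hungarian matching; only pairs with IoU >= min_iou count.
    """
    slots = list(range(len(preds))) + [None] * len(dets)
    best_score, best_pairs = -1.0, {}
    for perm in permutations(slots, len(dets)):
        pairs, score = {}, 0.0
        for i, j in enumerate(perm):
            if j is not None:
                overlap = iou(dets[i], preds[j])
                if overlap >= min_iou:
                    pairs[i] = j
                    score += overlap
        if score > best_score:
            best_score, best_pairs = score, pairs
    return best_pairs


def match_stage(dets, preds, min_iou):
    """IoU assignment first, then the distance fallback for zero overlap."""
    matches = best_assignment(dets, preds, min_iou)
    used = set(matches.values())
    for i, det in enumerate(dets):
        if i in matches:
            continue
        for j, pred in enumerate(preds):
            if j in used:
                continue
            if iou(det, pred) == 0.0 and center_distance(det, pred) < MAX_DIST_PX:
                matches[i] = j
                used.add(j)
                break
    return matches


def cascade_match(detections, predictions):
    """detections: list of (box, confidence); predictions: list of boxes.

    Returns {detection_index: prediction_index} for tracked boxes.
    """
    high = [i for i, (_, c) in enumerate(detections) if c > HIGH_SCORE]
    low = [i for i, (_, c) in enumerate(detections) if c <= HIGH_SCORE]
    result = {}
    free = list(range(len(predictions)))  # predictions not yet matched
    for group, min_iou in ((high, HIGH_IOU), (low, LOW_IOU)):
        stage = match_stage([detections[i][0] for i in group],
                            [predictions[j] for j in free], min_iou)
        for gi, fj in stage.items():
            result[group[gi]] = free[fj]
        matched = set(result.values())
        free = [j for j in free if j not in matched]
    return result
```

Matching the high-scoring boxes first means that the low-scoring boxes only compete for predicted boxes that the more reliable detections left unmatched.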

[0084] This invention provides a method, apparatus, and autonomous vehicle for tracking flashing traffic lights. The method primarily uses a target detection algorithm to obtain detection boxes for traffic lights in the current frame image. Then, it obtains the category of the detection boxes and, based on the category, obtains a first detection box belonging to a preset category. Next, it obtains predicted boxes mapping the tracked traffic lights from the previous N frames to the current frame image. Finally, it matches the first detection box with the predicted box. If the first detection box and the predicted box match successfully, the state of the first detection box is determined to be tracked. This enables accurate identification of traffic lights when they are flashing, improving the safety of autonomous vehicles.

[0085] Please refer to Figure 4, which is a structural block diagram of a traffic light flashing tracking device provided in an embodiment of the present invention. As shown in Figure 4, the traffic light flashing tracking device 400 includes a first acquisition module 41, a second acquisition module 42, a third acquisition module 43, and a matching module 44.

[0086] The first acquisition module 41 is used to acquire the detection box of the traffic light in the current frame image according to the target detection algorithm;

[0087] The second acquisition module 42 is used to acquire the category of the detection box, and acquire a first detection box with a preset category based on the category;

[0088] The third acquisition module 43 is used to acquire the prediction box mapped from the tracked traffic lights in the previous N frames to the current frame image;

[0089] The matching module 44 is used to match the first detection box with the prediction box. If the first detection box and the prediction box are successfully matched, the state of the first detection box is determined to be tracked.
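How the four modules could be composed in software is sketched below. This is a hypothetical arrangement for illustration only: the class name, the callable-based interfaces, the `(box, category)` detection format, and the "untracked" label for unmatched boxes are assumptions, not details given in the text.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]
Detection = Tuple[Box, str]  # (box, category), hypothetical format


@dataclass
class FlashTrackingDevice:
    # First acquisition module 41: frame -> detections
    detect: Callable[[Any], List[Detection]]
    # Third acquisition module 43: predicted boxes for the current frame
    predict: Callable[[], List[Box]]
    # Matching module 44: returns {first_box_index: predicted_box_index}
    match: Callable[[List[Box], List[Box]], Dict[int, int]]
    preset_category: str = "black"

    def select_first_boxes(self, detections: List[Detection]) -> List[Box]:
        """Second acquisition module 42: keep boxes of the preset category."""
        return [box for box, cat in detections if cat == self.preset_category]

    def process(self, frame: Any) -> List[Tuple[Box, str]]:
        """Run the four modules in order on one frame."""
        first_boxes = self.select_first_boxes(self.detect(frame))
        matches = self.match(first_boxes, self.predict())
        return [(box, "tracked" if i in matches else "untracked")
                for i, box in enumerate(first_boxes)]
```

Passing the detector, predictor, and matcher in as callables keeps the sketch independent of any particular detection network or filter.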

[0090] It should be noted that the traffic light flashing tracking device described above can execute the traffic light flashing tracking method provided in the embodiments of the present invention, and has the corresponding functional modules and beneficial effects of the method. Technical details not described in detail in the embodiments of the traffic light flashing tracking device can be found in the traffic light flashing tracking method provided in the embodiments of the present invention.

[0091] Please refer to Figure 5. An embodiment of the present invention provides an autonomous vehicle 30, which includes at least one processor 31 (Figure 5 takes one processor 31 as an example) and a memory 32 communicatively connected to the at least one processor 31 (Figure 5 takes a bus connection as an example).

[0092] The memory 32 stores instructions that can be executed by the at least one processor 31, which are executed by the at least one processor 31 to enable the at least one processor 31 to perform the above-described traffic light flashing tracking method.

[0093] The memory 32, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions / modules corresponding to the traffic light flashing tracking method in this embodiment of the invention. The processor 31 executes various functional applications and data processing of the autonomous vehicle 30 by running the non-volatile software programs, instructions, and modules stored in the memory 32, thereby implementing the traffic light flashing tracking method in the above-described method embodiment.

[0094] The memory 32 may include a program storage area and a data storage area, wherein the program storage area may store the operating system and application programs required for at least one function. Furthermore, the memory 32 may include high-speed random access memory and may also include non-volatile memory. For example, it may include at least one disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 32 may optionally include memory remotely located relative to the processor 31.

[0095] The one or more modules are stored in the memory 32. When executed by the one or more processors 31, they perform the traffic light flashing tracking method in any of the above method embodiments, for example, the method steps in Figure 1 and Figure 2 described above.

[0096] The aforementioned driverless vehicle can execute the method provided in the embodiments of the present invention and has corresponding functional modules for executing the method. Technical details not described in detail in this embodiment can be found in the method provided in the embodiments of the present invention.

[0097] This invention also provides a non-volatile computer-readable storage medium storing computer-executable instructions that are executed by one or more processors, for example, to perform the method steps in Figure 1 and Figure 2 described above and to implement the functions of the modules in Figure 4.

[0098] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs.

[0099] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented using software and a general-purpose hardware platform, or of course, using hardware. Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The program can be stored in a computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. The storage medium can be a magnetic disk, optical disk, read-only memory (ROM), or random access memory (RAM), etc.

[0100] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; under the concept of the present invention, the technical features of the above embodiments or different embodiments can also be combined, the steps can be implemented in any order, and there are many other variations of different aspects of the present invention as described above, which are not provided in detail for the sake of brevity; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for tracking the flashing of traffic lights, characterized in that, Applied to autonomous vehicles, the method includes: Based on the target detection algorithm, obtain the detection box of the traffic light in the current frame image; Obtain all detection boxes in the current frame image, calculate the overlap of the detection boxes, and if there are overlapping detection boxes, retain the detection box with the highest confidence based on the confidence of the detection boxes; Obtain the category of the retained detection box, and obtain a first detection box with a preset category based on the category, wherein the preset category is black, and black represents the traffic light being in the intermediate state of alternating bright and dark flashing; Obtain the predicted boxes that map the tracked traffic lights in the previous N frames to the current frame image; Match the first detection box with the predicted box, and if the first detection box and the predicted box match successfully, determine the state of the first detection box to be tracked; Based on the duration of the tracked first detection box in the black category, determine whether the traffic light is in a flashing or blinking state; Obtain the category of the tracked first detection box in the previous frame image, and update the category of the tracked first detection box to the category in the previous frame image; The step of determining whether the traffic light is in a flashing or blinking state based on the duration of the tracked first detection box in the black category includes: Obtain the category of the tracked first detection box and determine the duration for which the tracked first detection box remains in the black category; If the duration is less than a preset time threshold, the traffic light corresponding to the tracked first detection box is determined to be a flashing traffic light; If the duration is greater than or equal to the preset time threshold, the traffic light corresponding to the tracked first detection box is determined to be a blinking traffic light.

2. The method according to claim 1, characterized in that, The step of matching the first detection box with the predicted box, and determining the state of the first detection box as being tracked if the first detection box and the predicted box match successfully, includes: Obtain the confidence level of the first detection box; Based on the confidence level of the first detection box, the first detection box is classified as a high-scoring box or a low-scoring box; The high-scoring box is matched with the predicted box to obtain a first matching result; The predicted boxes that do not match the high-scoring boxes are matched with the low-scoring boxes to obtain a second matching result; The first matching result and the second matching result are integrated to obtain the tracking result of the flashing traffic light.

3. The method according to claim 2, characterized in that, The step of matching the high-scoring box with the predicted box to obtain a first matching result includes: The first position information of the high-scoring box and the second position information of the predicted box in the current frame image are obtained respectively; The overlap degree of the first position information and the second position information is calculated to obtain high-scoring boxes and predicted boxes with an overlap degree greater than a first preset threshold, and high-scoring boxes and predicted boxes with an overlap degree equal to a second preset threshold; High-scoring boxes and predicted boxes with an overlap degree greater than the first preset threshold are matched, and if the high-scoring box and the predicted box are successfully matched, the state of the high-scoring box is confirmed as being tracked; Calculate the Euclidean distance between the high-scoring box and the predicted box whose overlap degree is equal to the second preset threshold, and confirm that the high-scoring box is tracked when the Euclidean distance is less than the preset distance.

4. The method according to claim 3, characterized in that, The step of matching the predicted box that does not match the high-scoring box with the low-scoring box to obtain a second matching result includes: The third position information of the low-scoring box and the fourth position information of the predicted box that does not match the high-scoring box in the current frame image are obtained respectively; The overlap degree of the third position information and the fourth position information is calculated to obtain low-scoring boxes and unmatched predicted boxes with an overlap degree greater than a third preset threshold, and low-scoring boxes and unmatched predicted boxes with an overlap degree equal to the second preset threshold; Low-scoring boxes with an overlap degree greater than the third preset threshold are matched with the predicted boxes that do not match the high-scoring boxes, and if a low-scoring box successfully matches such a predicted box, the low-scoring box is confirmed to be tracked; Calculate the Euclidean distance between the low-scoring box whose overlap degree is equal to the second preset threshold and the predicted box that does not match the high-scoring box, and confirm that the low-scoring box is tracked when the Euclidean distance is less than the preset distance.

5. The method according to claim 1, characterized in that, After determining that the state of the first detection box is being tracked, the method further includes: The first detected bounding box that is tracked is used as the predicted bounding box for the traffic lights in the next frame of the current frame image, and the traffic lights in the next frame image are tracked based on the predicted bounding box.

6. The method according to claim 1, characterized in that, After obtaining the detection bounding box of the traffic light in the current frame image, the method further includes: Obtain all detection boxes in the current frame image; The overlap of the detection boxes is calculated to determine whether there are overlapping detection boxes in the current frame image; When there are overlapping detection boxes in the current frame image, the detection boxes with higher confidence are retained based on the confidence of the detection boxes.

7. A tracking device for traffic light flashing, characterized in that, For use in autonomous vehicles, the apparatus for performing a traffic light flashing tracking method as described in any one of claims 1-6 comprises: The first acquisition module is used to acquire the detection box of the traffic light in the current frame image according to the target detection algorithm; The second acquisition module is used to acquire the category of the detection box, and acquire a first detection box with a preset category based on the category; The third acquisition module is used to acquire the predicted bounding boxes mapped from the tracked traffic lights in the previous N frames to the current frame image. The matching module is used to match the first detection box with the predicted box. If the first detection box and the predicted box are successfully matched, the state of the first detection box is determined to be tracked.

8. An unmanned vehicle, characterized in that, The unmanned vehicle includes: At least one processor; and, A memory communicatively connected to the at least one processor; wherein, The memory stores instructions executable by the at least one processor, which, when executed by the at least one processor, enable the at least one processor to perform the method according to any one of claims 1-6.

9. A non-transitory computer-readable storage medium, characterized in that, The non-transitory computer-readable storage medium stores computer-executable instructions for causing a computer to perform the method as described in any one of claims 1-6.
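
The overlap filtering recited in claims 1 and 6 (retain the most confident box among overlapping detection boxes) can be illustrated with the following minimal sketch. It assumes that overlap is measured as intersection over union and that any overlap above a configurable `overlap_threshold` counts as overlapping; the claims do not specify a threshold, so the default of 0.0 and the function names are assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def suppress_overlaps(detections, overlap_threshold=0.0):
    """Keep the highest-confidence box among mutually overlapping boxes.

    detections: list of (box, confidence). Boxes are visited in order of
    decreasing confidence, and a box is kept only if it does not overlap
    an already kept box by more than `overlap_threshold` (assumed value).
    """
    kept = []
    for box, conf in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= overlap_threshold for kept_box, _ in kept):
            kept.append((box, conf))
    return kept
```

Visiting the boxes in order of decreasing confidence ensures that, within each group of overlapping boxes, the one that survives is the most confident.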