Traffic light occlusion state recognition method, device and electronic equipment
By combining target detection models and high-precision map data with multi-level constraint judgment in autonomous vehicles, the problem of inaccurate traffic light recognition caused by camera obstruction is solved, achieving accurate traffic light obstruction state recognition and ensuring autonomous driving safety.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ZHIDAO NETWORK TECH (BEIJING) CO LTD
- Filing Date
- 2023-04-03
- Publication Date
- 2026-06-16
AI Technical Summary
In existing technologies, cameras on autonomous vehicles are easily obstructed by large trucks, leading to inaccurate traffic light recognition and potentially causing red light violations, thus posing a risk of traffic accidents.
By acquiring the road image of the current frame and performing target detection using a preset target detection model, and combining high-precision map data and the occlusion status recognition results of the previous frame, different occlusion status recognition strategies are adopted to determine whether the traffic lights in the current frame are occluded, including multi-level constraint condition judgment.
It achieves accurate and robust judgment of traffic light occlusion status, assisting the planning and control system of autonomous vehicles in making correct decisions and reducing the possibility of misjudgment.
Figure CN116386004B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of autonomous driving technology, and in particular to a method, apparatus and electronic device for recognizing the occlusion state of traffic lights.
Background Technology
[0002] Traffic light recognition is an essential capability for autonomous driving. Because LiDAR cannot recognize colors and therefore cannot perform traffic light recognition, and V2N (Vehicle-to-Network) signals do not cover all intersections and are often delayed, the most reliable traffic light recognition capability for autonomous vehicles comes from visual perception.
[0003] However, in certain scenarios, large trucks, buses, and other vehicles may block the view of the autonomous vehicle's cameras, causing the vehicle to mistakenly conclude that there are no traffic lights at the intersection and proceed forward, potentially running a red light and, in severe cases, causing a traffic accident. For example, Figure 1 shows a comparison of a traffic light at an intersection before and after it is blocked.
[0004] The key to solving the above problem lies in the ability of the visual perception system to determine in a timely and accurate manner whether the traffic lights in the camera's view are obstructed. This allows the planning and control system to decelerate or stop the autonomous vehicle based on its current position, thus avoiding traffic accidents. However, the existing technology is not accurate enough in judging the obstruction of traffic lights, and the possibility of misjudgment is relatively high.
Summary of the Invention
[0005] This application provides a method, apparatus, and electronic device for identifying the occlusion status of traffic lights, so as to improve the accuracy of traffic light occlusion identification and assist the planning and control system of autonomous vehicles in making correct decisions.
[0006] The embodiments of this application adopt the following technical solutions:
[0007] In a first aspect, embodiments of this application provide a method for recognizing the occlusion state of a traffic light, wherein the method includes:
[0008] Acquire the road image of the current frame, and use a preset target detection model to perform target detection on the road image of the current frame to obtain the target detection result of the current frame;
[0009] Based on the road image of the current frame, obtain the corresponding high-precision map data and the occlusion status recognition result of the traffic lights in the previous frame;
[0010] Based on the occlusion status recognition result of the traffic lights in the previous frame, the target detection result of the current frame and the high-precision map data are used to determine the occlusion status recognition strategy of the traffic lights in the current frame.
[0011] The occlusion status of the traffic lights in the current frame is identified using the traffic light occlusion status identification strategy of the current frame, and the occlusion status identification result of the traffic lights in the current frame is obtained.
[0012] Optionally, determining the occlusion status recognition strategy of the traffic lights in the current frame based on the occlusion status recognition result of the traffic lights in the previous frame, using the target detection result of the current frame and the high-precision map data, includes:
[0013] If the occlusion status identification result of the traffic light in the previous frame is an unoccluded state, then the first occlusion status identification strategy of the traffic light in the current frame is determined based on the target detection result of the current frame and the high-precision map data.
[0014] If the occlusion status identification result of the traffic light in the previous frame is that it is occluded, then the second occlusion status identification strategy of the traffic light in the current frame is determined based on the target detection result of the current frame and the high-precision map data.
[0015] Optionally, the occlusion state recognition strategy for the traffic lights in the current frame is a first occlusion state recognition strategy, and the step of recognizing the occlusion state of the traffic lights in the current frame using the occlusion state recognition strategy for the current frame includes:
[0016] Determine, based on the target detection result of the current frame, whether the constraint condition of the preset occludeable target is triggered;
[0017] If the constraint condition of the preset occludeable target is triggered, determine whether the constraint condition of the preset region of interest is triggered based on the target detection result of the current frame and the preset region of interest of the road image.
[0018] Determine whether to trigger the constraint condition for the number of traffic lights based on the target detection result of the current frame and the high-precision map data;
[0019] If the constraints of the preset region of interest and the number of traffic lights are triggered, it is determined whether the constraints of the relative position are triggered based on the target detection results of the current frame, the preset region of interest of the road image, and the high-precision map data.
[0020] If the constraint condition of the relative position is triggered, the occlusion state of the traffic light in the current frame is determined to be an occluded state.
[0021] Otherwise, the occlusion state of the traffic light in the current frame is determined to be unoccluded.
[0022] Optionally, determining whether to trigger a preset constraint condition for an occludeable target based on the target detection result of the current frame includes:
[0023] Based on the target detection results of the current frame, determine whether a preset occludeable target is detected in the road image of the current frame;
[0024] If so, it is determined that the constraint condition of the preset occludeable target is triggered;
[0025] Otherwise, it is determined that the preset constraint condition for the occludeable target has not been triggered.
[0026] Optionally, the target detection result includes a detection box of the preset occludeable target. The step of determining, when the constraint condition of the preset occludeable target is triggered, whether the constraint condition of the preset region of interest is triggered based on the target detection result of the current frame and the preset region of interest of the road image includes:
[0027] Determine the height percentage of the detection box of the preset occludeable target in the road image of the current frame;
[0028] Determine whether the detection box of the preset occludeable target intersects with the preset region of interest of the road image;
[0029] If the height percentage is greater than a preset height percentage threshold, and the detection box of the preset occludeable target intersects with the preset region of interest of the road image, then the detection box of the preset occludeable target is taken as the region of interest detection box, and it is determined that the constraint condition of the preset region of interest is triggered.
[0030] Otherwise, it is determined that the constraint condition of the preset region of interest is not triggered.
[0031] Optionally, determining whether the constraint condition for the number of traffic lights is triggered based on the target detection result of the current frame and the high-precision map data includes:
[0032] The number of traffic lights detected in the road image of the current frame is determined based on the target detection results of the current frame, and the actual number of traffic lights at the intersection corresponding to the road image of the current frame is determined based on the high-precision map data.
[0033] If the number of traffic lights detected in the road image of the current frame is less than the actual number of traffic lights, it is determined that the constraint condition for the number of traffic lights is triggered.
[0034] Otherwise, it is determined that the constraint condition for the number of traffic lights is not triggered.
[0035] Optionally, the step of determining whether to trigger the relative position constraint condition when the constraints of the preset region of interest and the number of traffic lights are triggered includes:
[0036] The area occupied by the region of interest detection box is determined based on the target detection result of the current frame and the preset region of interest of the road image;
[0037] The absolute position of the traffic light at the intersection corresponding to the road image in the current frame is determined based on the high-precision map data, and the absolute position of the traffic light is projected onto the road image in the current frame to obtain the projected position of the traffic light.
[0038] Determine whether the area of the region of interest detection box intersects with the projected position of the traffic light;
[0039] If so, it is determined that the constraint condition of the relative position is triggered;
[0040] Otherwise, it is determined that the constraint condition for the relative position has not been triggered.
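The relative position check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: it assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, assumes the traffic light positions from the map have already been transformed into the camera coordinate frame (that transform is omitted), and treats each projected traffic light as a single pixel point. The names `Box`, `project_point` and `relative_position_triggered` are illustrative only.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Box(NamedTuple):
    """Axis-aligned detection box in pixels (hypothetical helper type)."""
    x1: float
    y1: float
    x2: float
    y2: float


def project_point(p_cam: Tuple[float, float, float],
                  fx: float, fy: float, cx: float, cy: float
                  ) -> Optional[Tuple[float, float]]:
    """Pinhole projection of a camera-frame point (x, y, z) to pixel (u, v)."""
    x, y, z = p_cam
    if z <= 0:
        # The point is behind the camera and cannot appear in the image.
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def relative_position_triggered(roi_box: Box,
                                lights_cam: Sequence[Tuple[float, float, float]],
                                fx: float, fy: float, cx: float, cy: float
                                ) -> bool:
    """True if any projected traffic light falls inside the ROI detection box."""
    for p_cam in lights_cam:
        uv = project_point(p_cam, fx, fy, cx, cy)
        if uv is None:
            continue
        u, v = uv
        if roi_box.x1 <= u <= roi_box.x2 and roi_box.y1 <= v <= roi_box.y2:
            return True
    return False
```

A real system would also have to account for the map, calibration and detection errors mentioned in the description, for example by enlarging the projected position into a small box before testing for overlap; that refinement is an assumption and is not shown here.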
[0041] Optionally, the occlusion state recognition strategy for the traffic lights in the current frame is a second occlusion state recognition strategy, and the step of recognizing the occlusion state of the traffic lights in the current frame using the occlusion state recognition strategy for the current frame includes:
[0042] Determine whether to trigger the constraint condition for the number of traffic lights based on the target detection result of the current frame and the high-precision map data;
[0043] If the constraint on the number of traffic lights is not triggered, the occlusion state of the traffic lights in the current frame is determined to be an unoccluded state.
[0044] Otherwise, the traffic light in the current frame is determined to be occluded.
[0045] In a second aspect, embodiments of this application also provide a traffic light occlusion status recognition device, wherein the device includes:
[0046] The target detection unit is used to acquire the road image of the current frame and perform target detection on the road image of the current frame using a preset target detection model to obtain the target detection result of the current frame;
[0047] The acquisition unit is used to acquire the corresponding high-precision map data and the occlusion status recognition result of the traffic lights in the previous frame based on the road image of the current frame.
[0048] The determining unit is used to determine the occlusion status recognition strategy of the traffic lights in the current frame based on the occlusion status recognition result of the traffic lights in the previous frame, using the target detection result of the current frame and the high-precision map data.
[0049] The identification unit is used to identify the occlusion status of the traffic lights in the current frame using the traffic light occlusion status identification strategy of the current frame, and obtain the traffic light occlusion status identification result of the current frame.
[0050] In a third aspect, embodiments of this application also provide an electronic device, including:
[0051] A processor; and
[0052] A memory configured to store computer-executable instructions, which, when executed, cause the processor to perform any of the methods described above.
[0053] In a fourth aspect, embodiments of this application also provide a computer-readable storage medium that stores one or more programs, which, when executed by an electronic device including multiple applications, cause the electronic device to perform any of the methods described above.
[0054] The at least one technical solution adopted in this application embodiment can achieve the following beneficial effects: The traffic light occlusion state recognition method of this application embodiment first acquires the road image of the current frame, and performs target detection on the road image of the current frame using a preset target detection model to obtain the target detection result of the current frame; then, based on the road image of the current frame, it acquires the corresponding high-precision map data and the traffic light occlusion state recognition result of the previous frame; then, based on the traffic light occlusion state recognition result of the previous frame, it determines the traffic light occlusion state recognition strategy of the current frame using the target detection result and high-precision map data of the current frame; finally, it uses the traffic light occlusion state recognition strategy of the current frame to identify the occlusion state of the traffic light in the current frame, and obtains the traffic light occlusion state recognition result of the current frame. The traffic light occlusion state recognition method of this application embodiment, based on the traffic light occlusion state recognition result of the previous frame, combines the target detection result and high-precision map data to determine the recognition strategy of the current frame, fully considers the impact of errors in the actual scene on the occlusion state judgment, and achieves accurate and robust judgment of traffic light occlusion state, thereby assisting the planning and control system of autonomous vehicles to make correct decisions.
Attached Figure Description
[0055] The accompanying drawings, which are included to provide a further understanding of this application and form part of this application, illustrate exemplary embodiments and are used to explain this application, but do not constitute an undue limitation of this application. In the drawings:
[0056] Figure 1 is a comparative illustration of traffic lights at an intersection before and after they are blocked;
[0057] Figure 2 is a flowchart illustrating a method for recognizing the occlusion state of a traffic light according to an embodiment of this application;
[0058] Figure 3 is a schematic diagram of a traffic light occlusion state recognition process in an embodiment of this application;
[0059] Figure 4 is a schematic diagram of the structure of a traffic light occlusion state recognition device according to an embodiment of this application;
[0060] Figure 5 is a schematic diagram of the structure of an electronic device according to an embodiment of this application.
Detailed Implementation
[0061] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of them. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0062] The technical solutions provided by the various embodiments of this application are described in detail below with reference to the accompanying drawings.
[0063] This application provides a method for recognizing the occlusion state of traffic lights. Figure 2 shows a flowchart of the traffic light occlusion state recognition method according to an embodiment of this application. The method includes at least the following steps S210 to S240:
[0064] Step S210: Obtain the road image of the current frame, and use a preset target detection model to perform target detection on the road image of the current frame to obtain the target detection result of the current frame.
[0065] In this embodiment of the application, when identifying the occlusion state of traffic lights, it is necessary to first acquire the road image of the current frame captured by the camera of the autonomous vehicle, and then use a pre-trained target detection model to perform 2D target detection on the road image of the current frame to obtain the 2D target detection result of the current frame.
[0066] The above-mentioned target detection model can be implemented based on existing convolutional neural networks such as YOLO. The target detection model trained in this application embodiment is mainly used to detect traffic lights and specific vehicle targets such as large trucks and buses in the image, and output the detection box position of each target.
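As an illustration of the detector output described above, the following sketch shows one possible in-memory representation of the 2D detection result and how it might be split into traffic lights and potentially occluding vehicles. It does not call any real detection library. The `Detection` type, the class labels `"traffic_light"`, `"truck"` and `"bus"`, and the `min_score` confidence cutoff are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Detection:
    """One 2D detection (hypothetical structure, not a real library type)."""
    label: str                               # class name assigned by the detector
    box: Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels
    score: float                             # detector confidence in [0, 1]


# Assumed label names for vehicles that may block the view of traffic lights.
OCCLUDER_LABELS = frozenset({"truck", "bus"})


def split_detections(detections: Sequence[Detection],
                     min_score: float = 0.5
                     ) -> Tuple[List[Detection], List[Detection]]:
    """Return (traffic_lights, occluders), dropping low-confidence detections.

    min_score is an illustrative cutoff, not a value from the application.
    """
    kept = [d for d in detections if d.score >= min_score]
    lights = [d for d in kept if d.label == "traffic_light"]
    occluders = [d for d in kept if d.label in OCCLUDER_LABELS]
    return lights, occluders
```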
[0067] Step S220: Obtain the corresponding high-precision map data and the occlusion status recognition result of the traffic lights in the previous frame based on the road image of the current frame.
[0068] After obtaining the road image of the current frame, it is also necessary to obtain the corresponding high-precision map data. The "corresponding high-precision map data" can be understood as the local high-precision map data of the intersection area that the autonomous vehicle is about to pass through, determined based on the current position and driving path of the autonomous vehicle. The high-precision map data can specifically include the number and absolute position of traffic lights at each intersection.
[0069] In addition, it is also necessary to obtain the occlusion status recognition result of the traffic light in the previous frame, that is, whether the previous frame was recognized as an occluded state or an unoccluded state. The recognition result of the previous frame affects which occlusion status constraints are selected for the current frame. This is mainly because entering the occluded state (a transition from unoccluded to occluded) and exiting it (a transition from occluded to unoccluded) place different requirements on how strict the constraints must be. Therefore, the occlusion status recognition result of the traffic light in the previous frame can be used as the basis for adopting different recognition strategies.
[0070] Step S230: Based on the occlusion status recognition result of the traffic lights in the previous frame, determine the occlusion status recognition strategy of the traffic lights in the current frame using the target detection result of the current frame and the high-precision map data.
[0071] Based on the traffic light occlusion status recognition results of the previous frame obtained from the aforementioned steps, the target detection results of the current frame and the corresponding high-precision map data are combined to further determine the traffic light occlusion status recognition strategy adopted in the current frame.
[0072] Step S240: Use the traffic light occlusion state recognition strategy of the current frame to identify the occlusion state of the traffic light in the current frame, and obtain the occlusion state recognition result of the traffic light in the current frame.
[0073] After determining the occlusion status recognition strategy for the current frame, the occlusion status of traffic lights can be identified. Since the target detection results can reflect whether specific vehicle targets that may obstruct traffic lights are detected in the road image of the current frame, as well as the traffic light situation, and the high-precision map data can provide the actual traffic light situation in the intersection area that the autonomous vehicle is about to pass through, combining this information can accurately identify whether the traffic lights in the road image of the current frame are obstructed.
[0074] The traffic light occlusion state recognition method of this application embodiment is based on the recognition result of the traffic light occlusion state of the previous frame, combined with the target detection result and high-precision map data to determine the recognition strategy of the current frame. It fully considers the impact of errors in the actual scene on the occlusion state judgment, and realizes accurate and robust judgment of traffic light occlusion state, thereby assisting the planning and control system of autonomous vehicles to make correct decisions.
[0075] In some embodiments of this application, the step of determining the occlusion status identification strategy of the traffic lights in the current frame based on the occlusion status identification result of the traffic lights in the previous frame, using the target detection result of the current frame and the high-precision map data, includes: if the occlusion status identification result of the traffic lights in the previous frame is an unoccluded state, then determining a first occlusion status identification strategy of the traffic lights in the current frame based on the target detection result of the current frame and the high-precision map data; if the occlusion status identification result of the traffic lights in the previous frame is an occluded state, then determining a second occlusion status identification strategy of the traffic lights in the current frame based on the target detection result of the current frame and the high-precision map data.
[0076] As mentioned earlier, the occlusion status recognition result of the traffic lights in the previous frame affects the specific constraints applied in the current frame. To ensure the accuracy of occlusion judgment and reduce the possibility of misjudgment, if the previous frame was unoccluded, the first occlusion status recognition strategy can adopt relatively complex and strict constraints to determine whether the occlusion status of the current frame has changed to an occluded state. Conversely, if the previous frame was occluded, the second occlusion status recognition strategy can adopt relatively simplified but still sufficiently strict constraints to determine whether the occlusion status of the current frame has changed to an unoccluded state. For example, multiple constraints can be set based on the target detection results and high-precision map data of the current frame. The state is considered occluded only when all of the constraints are satisfied; if any one of the constraints is not satisfied, the state is considered unoccluded.
[0077] The constraints for the transition from the unoccluded state to the occluded state are relatively complex and strict because of the various errors involved, such as errors in the high-precision map, errors in the calibrated intrinsic and extrinsic camera parameters, and detection errors of the target detection model. In practice, sufficiently comprehensive constraints are needed to avoid or reduce the possibility of misjudgment. As for the transition from the occluded state to the unoccluded state, a relatively comprehensive judgment has already been made in the previous frame, and the key to returning to the unoccluded state is whether all traffic lights can be identified in the image captured by the camera, so only this one constraint needs to be checked. The other conditions ceasing to hold is not convincing enough on its own, because that may itself be a misjudgment caused by errors.
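The asymmetric entry and exit behavior described above can be sketched as a small two-state update rule. This is a simplified illustration, not the implementation of this application: the per-frame inputs are reduced to two booleans, where `entry_constraints_met` stands for "all constraints of the first strategy are triggered" and `count_short` stands for "fewer traffic lights are detected than the map lists". The names `OcclusionState`, `step` and `run` are hypothetical.

```python
from enum import Enum
from typing import Iterable, List, Tuple


class OcclusionState(Enum):
    UNOCCLUDED = "unoccluded"
    OCCLUDED = "occluded"


def step(prev: OcclusionState,
         entry_constraints_met: bool,
         count_short: bool) -> OcclusionState:
    """Update the occlusion state for one frame."""
    if prev is OcclusionState.UNOCCLUDED:
        # First strategy: enter the occluded state only if every
        # entry constraint is triggered.
        if entry_constraints_met:
            return OcclusionState.OCCLUDED
        return OcclusionState.UNOCCLUDED
    # Second strategy: leave the occluded state only once the detected
    # traffic light count is no longer short of the map count.
    if count_short:
        return OcclusionState.OCCLUDED
    return OcclusionState.UNOCCLUDED


def run(frames: Iterable[Tuple[bool, bool]],
        initial: OcclusionState = OcclusionState.UNOCCLUDED
        ) -> List[OcclusionState]:
    """Apply step() over (entry_constraints_met, count_short) pairs."""
    states = []
    state = initial
    for entry_constraints_met, count_short in frames:
        state = step(state, entry_constraints_met, count_short)
        states.append(state)
    return states
```

In this sketch, a frame in which the occluding vehicle is no longer matched but some traffic lights are still missing keeps the state occluded, which mirrors the reasoning that only the recovery of the full traffic light count justifies returning to the unoccluded state.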
[0078] In some embodiments of this application, the occlusion state recognition strategy for traffic lights in the current frame is a first occlusion state recognition strategy. The step of recognizing the occlusion state of traffic lights in the current frame using this strategy includes: determining whether a preset constraint condition for an occludeable target is triggered based on the target detection result of the current frame; if the preset constraint condition for an occludeable target is triggered, determining whether a preset region of interest constraint condition is triggered based on the target detection result of the current frame and a preset region of interest (ROI) of the road image; determining whether a constraint condition for the number of traffic lights is triggered based on the target detection result of the current frame and the high-precision map data; if both the preset ROI constraint condition and the traffic light number constraint condition are triggered, determining whether a relative position constraint condition is triggered based on the target detection result of the current frame, the preset ROI of the road image, and the high-precision map data; if the relative position constraint condition is triggered, determining that the occlusion state of the traffic lights in the current frame is occluded; otherwise, determining that the occlusion state of the traffic lights in the current frame is unoccluded.
[0079] If the traffic light occlusion status in the previous frame was unoccluded, the occlusion status in the current frame can be determined using several predefined constraints:
[0080] 1) Based on the target detection results of the current frame, it can be determined whether vehicle targets that may obstruct traffic lights are identified in the road image of the current frame, i.e., preset occludeable targets, such as large trucks and buses, etc., so as to determine whether the constraint condition of preset occludeable targets is triggered.
[0081] 2) If the constraint condition of the preset occludeable target is triggered, it means that the preset occludeable target has been detected in the road image of the current frame. Then, the predefined region of interest of the road image can be further combined to determine whether there is a target among the currently identified preset occludeable targets that may occlude the traffic lights, that is, whether the constraint condition of the preset region of interest is triggered.
[0082] 3) Since the target detection results also include the detection results of traffic lights, such as the number of traffic lights, the number of traffic lights at the corresponding intersection provided in the high-precision map data can be used to determine whether the number of detected traffic lights has decreased, that is, whether the constraint condition of the number of traffic lights is triggered.
[0083] 4) If the constraints of the preset region of interest and the number of traffic lights are triggered, it means that the position of the preset occludeable target detected in the road image of the current frame is in a position that may occlude the traffic lights, and the number of detected traffic lights is less than the actual number of traffic lights. In this case, it is considered that the traffic lights in the current frame are likely to be occluded. In order to further improve the accuracy of the judgment, the relative position of the detection box of the vehicle target that may occlude the traffic lights and the traffic lights can be judged, that is, whether the relative position constraint is triggered.
[0084] 5) When all of the above constraints are triggered, the traffic light in the current frame can be considered to be in an occluded state; otherwise, it can be considered to be in an unoccluded state.
[0085] It should be noted that since the detection of the number of traffic lights involved in the above constraint 3) and the detection of the preset occludeable targets involved in constraint 1)-2) can be directly obtained based on the detection results of the target detection model, there is no strict order relationship between the above constraint 3) and constraint 1)-2), and they can be regarded as independent judgment conditions.
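The cascade of constraints 1) to 5) can be sketched as a single function. This is one possible arrangement under stated assumptions, not the implementation of this application: the region of interest check and the relative position check are passed in as zero-argument callables (`roi_check`, `position_check`, both hypothetical) so that they are only evaluated when the earlier constraints have been triggered, and the function also returns the name of the first constraint that was not triggered as a debugging aid. Because constraint 3) is independent of constraints 1) and 2), the order in which it is tested here is a free choice.

```python
from typing import Callable, Optional, Tuple


def first_strategy(occluder_detected: bool,
                   roi_check: Callable[[], bool],
                   count_short: bool,
                   position_check: Callable[[], bool]
                   ) -> Tuple[bool, Optional[str]]:
    """Return (occluded, name_of_first_untriggered_constraint_or_None)."""
    # Constraint 1: a preset occludeable target (e.g. truck, bus) is detected.
    if not occluder_detected:
        return False, "occludeable_target"
    # Constraint 2: evaluated only when constraint 1 is triggered.
    if not roi_check():
        return False, "region_of_interest"
    # Constraint 3: fewer traffic lights detected than the map lists.
    if not count_short:
        return False, "traffic_light_count"
    # Constraint 4: evaluated only when constraints 2 and 3 are triggered.
    if not position_check():
        return False, "relative_position"
    # Constraint 5: all constraints triggered, so the state is occluded.
    return True, None
```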
[0086] In some embodiments of this application, determining whether to trigger the constraint condition of a preset occludeable target based on the target detection result of the current frame includes: determining whether a preset occludeable target is detected in the road image of the current frame based on the target detection result of the current frame; if so, determining that the constraint condition of the preset occludeable target is triggered; otherwise, determining that the constraint condition of the preset occludeable target is not triggered.
[0087] Whether the constraint condition of the preset occludeable target is triggered can be determined directly from the target detection results of the current frame. If the target detection results of the current frame include a detection of a preset occludeable target, it means that there are large trucks or buses in front of or around the autonomous vehicle that may block the traffic lights. In this case, the constraint condition of the preset occludeable target is considered to be triggered.
[0088] In some embodiments of this application, the target detection result includes a detection box of the preset occludeable target. The step of determining, when the constraint condition of the preset occludeable target is triggered, whether the constraint condition of the preset region of interest is triggered based on the target detection result of the current frame and the preset region of interest of the road image includes: determining the height percentage of the detection box of the preset occludeable target in the road image of the current frame; determining whether the detection box of the preset occludeable target intersects with the preset region of interest of the road image; if the height percentage is greater than a preset height percentage threshold, and the detection box of the preset occludeable target intersects with the preset region of interest of the road image, then taking the detection box of the preset occludeable target as a region of interest detection box and determining that the constraint condition of the preset region of interest is triggered; otherwise, determining that the constraint condition of the preset region of interest is not triggered.
[0089] To determine the constraints of the preset region of interest, we can first determine the height percentage of the detection box of the preset occludeable target in the entire road image. If the height percentage is greater than the preset height percentage threshold, it means that the height of the detection box of the preset occludeable target may obstruct the traffic light. The preset height percentage threshold is an empirical value and can be flexibly set according to the actual situation. It is not specifically limited here.
[0090] In addition, a Region of Interest (ROI) can be predefined for the road image. The ROI can be understood as the location where traffic lights are likely to be detected, and therefore can be used as the basis for determining whether traffic lights will be occluded. For example, it can be set to the 70% area in the middle of the image, that is, the part remaining after truncating 15% of the width on both sides. Of course, those skilled in the art can flexibly adjust how to define the ROI based on the actual situation, and no specific limitation is made here. Then, it is determined whether the detection box of the preset occludeable target intersects with the ROI in the current road image, that is, whether there is at least a partial overlap. If there is an intersection, the detection box of the preset occludeable target can be used as the ROI detection box, that is, the preset occludeable target falling in the ROI is more likely to occlude the traffic light.
[0091] If the height percentage of the detection box of the preset occludeable target is greater than the preset height percentage threshold, and the detection box also intersects the region of interest, that is, a region of interest detection box ROI_box exists, then the constraint condition of the preset region of interest can be considered triggered; otherwise, the constraint condition of the preset region of interest can be considered not triggered.
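The region of interest check described above can be sketched as follows. This is a minimal illustration and not the implementation of this application: the `Box` type, the function names, and the 0.5 height threshold are assumptions introduced for the example, and only the 15% side crop comes from the example ROI given in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixels: (x1, y1) top-left, (x2, y2) bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float


def boxes_intersect(a: Box, b: Box) -> bool:
    """True if the two boxes overlap at least partially."""
    return a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2


def central_roi(image_w: int, image_h: int, side_crop: float = 0.15) -> Box:
    """ROI left after cutting `side_crop` of the width from each side."""
    return Box(image_w * side_crop, 0.0, image_w * (1.0 - side_crop), float(image_h))


def roi_constraint(occluder_boxes: List[Box], image_w: int, image_h: int,
                   height_ratio_threshold: float = 0.5) -> List[Box]:
    """Return the ROI_boxes; the constraint is triggered if the list is non-empty.

    `height_ratio_threshold` is a hypothetical value; the text only says
    that the threshold is empirical.
    """
    roi = central_roi(image_w, image_h)
    roi_boxes = []
    for box in occluder_boxes:
        height_ratio = (box.y2 - box.y1) / image_h
        # Both conditions must hold: tall enough AND overlapping the ROI.
        if height_ratio > height_ratio_threshold and boxes_intersect(box, roi):
            roi_boxes.append(box)
    return roi_boxes
```

Returning the list of ROI_boxes rather than a bare boolean keeps the boxes available for the later relative position check.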
[0092] In some embodiments of this application, the step of determining whether the constraint condition on the number of traffic lights is triggered based on the target detection result of the current frame and the high-precision map data includes: determining the number of traffic lights detected in the road image of the current frame based on the target detection result of the current frame, and determining the actual number of traffic lights at the intersection corresponding to the road image of the current frame based on the high-precision map data; if the number of traffic lights detected in the road image of the current frame is less than the actual number of traffic lights, determining that the constraint condition on the number of traffic lights is triggered; otherwise, determining that the constraint condition on the number of traffic lights is not triggered.
[0093] To determine the constraint on the number of traffic lights, we can first determine the number of traffic lights detected in the current frame based on the target detection results. At the same time, we can determine the actual number of traffic lights at the corresponding intersection based on the high-precision map data. If the number of traffic lights detected in the current frame is less than the actual number of traffic lights at the intersection, it means that the missing traffic lights may be due to the obstruction of targets such as large trucks, which may have prevented them from being detected. In this case, we can consider that the constraint on the number of traffic lights has been triggered.
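The comparison above reduces to a single inequality. The sketch below assumes, purely for illustration, that the detector outputs one class label per detection and that traffic lights carry the hypothetical label `"traffic_light"`; the map count is passed in as a plain integer.

```python
from typing import Iterable


def count_detected_lights(labels: Iterable[str],
                          light_label: str = "traffic_light") -> int:
    """Count detections whose class label marks a traffic light.

    The label name is an assumption made for this example.
    """
    return sum(1 for label in labels if label == light_label)


def count_constraint_triggered(detected_light_count: int,
                               map_light_count: int) -> bool:
    """Triggered when fewer lights are detected than the map says exist."""
    return detected_light_count < map_light_count
```

Note that detecting as many lights as the map lists, or more, leaves the constraint untriggered.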
[0094] In some embodiments of this application, the step of determining, when both the constraint condition of the preset region of interest and the constraint condition on the number of traffic lights are triggered, whether the relative position constraint condition is triggered based on the target detection result of the current frame, the preset region of interest of the road image, and the high-precision map data includes: determining the region position of the region of interest detection box based on the target detection result of the current frame and the preset region of interest of the road image; determining the absolute position of the traffic lights at the intersection corresponding to the road image of the current frame based on the high-precision map data, and projecting the absolute position of the traffic lights onto the road image of the current frame to obtain the projected position of the traffic lights; determining whether there is an intersection between the region position of the region of interest detection box and the projected position of the traffic lights; if so, determining that the relative position constraint condition is triggered; otherwise, determining that the relative position constraint condition is not triggered.
[0095] If the constraints in the aforementioned embodiments are triggered, that is, if on the one hand there is a detection box of an occludeable target in the region of interest of the road image, and on the other hand at least one traffic light in the road image is not detected, it can be concluded to a certain extent that the undetected traffic light may be occluded by the occludeable target. To further improve the accuracy of the judgment, embodiments of this application can use the relative position constraint for further verification.
[0096] On the one hand, combining the aforementioned embodiments, the location of all ROI_boxes can be determined using the target detection results of the current frame and the preset regions of interest (ROIs) in the road image. On the other hand, the projection positions of all traffic lights at the intersection in the road image can be determined based on the absolute positions of the traffic lights in the world coordinate system provided in the high-precision map data and the intrinsic and extrinsic parameter matrices of the camera. Then, the location of all ROI_boxes is compared with the projection positions of all traffic lights in the road image to determine if at least one traffic light's projection falls within the ROI_box area. If so, the occlusion state of the traffic lights in the current frame can be ultimately determined as occluded.
[0097] It should be noted that, since the absolute position information of traffic lights in high-precision map data and the pre-calibrated intrinsic and extrinsic parameter matrices of cameras have varying degrees of error, the intersection of the converted 2D projection position of the traffic lights with the ROI_box cannot be used as a sufficient condition for determining occlusion, but only as a necessary condition. This is also the main reason why this application embodiment adopts multi-level constraint conditions for strict judgment of occlusion state.
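The projection and intersection test can be sketched with a standard pinhole camera model. The sketch makes several assumptions that the text does not state: each traffic light is represented by a single 3D point, the extrinsic parameters are given as a world-to-camera rotation matrix and translation vector, lens distortion is ignored, and all function and parameter names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # x1, y1, x2, y2 in pixels


def project_point(p_world: Sequence[float],
                  rotation: Sequence[Sequence[float]],
                  translation: Sequence[float],
                  fx: float, fy: float, cx: float, cy: float
                  ) -> Optional[Tuple[float, float]]:
    """Project a world point into the image with a pinhole model.

    Returns None when the point lies behind the camera.
    """
    # Extrinsics: p_cam = R * p_world + t
    x, y, z = (
        sum(rotation[i][j] * p_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z <= 0:
        return None
    # Intrinsics: perspective division, then focal length and principal point.
    return (fx * x / z + cx, fy * y / z + cy)


def relative_position_constraint(light_positions_world: List[Sequence[float]],
                                 roi_boxes: List[Box],
                                 rotation: Sequence[Sequence[float]],
                                 translation: Sequence[float],
                                 fx: float, fy: float,
                                 cx: float, cy: float) -> bool:
    """True if at least one projected light falls inside some ROI_box."""
    for p in light_positions_world:
        uv = project_point(p, rotation, translation, fx, fy, cx, cy)
        if uv is None:
            continue
        u, v = uv
        if any(x1 <= u <= x2 and y1 <= v <= y2 for x1, y1, x2, y2 in roi_boxes):
            return True
    return False
```

Because map and calibration errors shift the projected point, a result of `True` here is treated only as one of several conditions, in line with the multi-level judgment described above.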
[0098] In some embodiments of this application, the traffic light occlusion state identification strategy for the current frame is a second occlusion state identification strategy. The step of identifying the occlusion state of the traffic lights in the current frame using the traffic light occlusion state identification strategy for the current frame includes: determining whether a constraint condition on the number of traffic lights is triggered based on the target detection result of the current frame and the high-precision map data; if the constraint condition on the number of traffic lights is not triggered, determining that the occlusion state of the traffic lights in the current frame is an unoccluded state; otherwise, determining that the occlusion state of the traffic lights in the current frame is an occluded state.
[0099] If the traffic light occlusion state recognition result of the previous frame is the occluded state, the current frame can be judged directly with the constraint condition on the number of traffic lights. This is because the occluded state recognized in the previous frame was itself determined after strict constraint condition judgment. Therefore, for the current frame, the occlusion can only be considered lifted if the constraint condition on the number of traffic lights is not triggered, that is, if the number of traffic lights detected in the road image is not less than the actual number of traffic lights in the high-precision map. Treating the release of any other constraint condition as the lifting of the occlusion may result in misjudgment.
[0100] For ease of understanding of the various embodiments of this application, Figure 3 illustrates a traffic light occlusion state recognition process according to an embodiment of this application. First, the road image of the current frame is acquired and target detection is performed to obtain the target detection result of the current frame. Then, the corresponding high-precision map data and the traffic light occlusion state recognition result of the previous frame are acquired.
[0101] If the occlusion state of the traffic lights in the previous frame is recognized as unoccluded, then based on the target detection result of the current frame, it is determined whether the constraint condition of the preset occludeable target is triggered. If the constraint condition of the preset occludeable target is triggered, then based on the target detection result of the current frame and the preset region of interest (ROI) of the road image, it is determined whether the constraint condition of the preset ROI is triggered, and based on the target detection result of the current frame and the high-precision map data, it is determined whether the constraint condition on the number of traffic lights is triggered. If both the preset ROI constraint condition and the constraint condition on the number of traffic lights are triggered, then based on the target detection result of the current frame, the preset ROI of the road image, and the high-precision map data, it is determined whether the relative position constraint condition is triggered. If the relative position constraint condition is triggered, the occlusion state of the traffic lights in the current frame is determined to be occluded; otherwise, the occlusion state of the traffic lights in the current frame is determined to be unoccluded.
[0102] If the traffic light occlusion status identification result of the previous frame is occluded, then determine whether the traffic light quantity constraint condition is triggered based on the target detection result of the current frame and the high-precision map data; if the traffic light quantity constraint condition is not triggered, determine that the traffic light occlusion status of the current frame is unoccluded; otherwise, determine that the traffic light occlusion status of the current frame is occluded.
[0103] The traffic light occlusion state recognition process of this application is based on the recognition result of the traffic light occlusion state of the previous frame. It combines the target detection result and high-precision map data to adopt different recognition strategies for the current frame. It fully considers the impact of various errors in the actual scene on the occlusion state judgment, and realizes accurate and robust judgment of traffic light occlusion state, thereby assisting the planning and control system of autonomous vehicles to make correct decisions.
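The overall flow can be summarized as a two-state decision function. The sketch below is illustrative only: it takes the four constraint results as precomputed booleans (the described process evaluates them step by step and can stop early), and the names `Occlusion`, `FrameConstraints`, and `recognize` are assumptions introduced for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Occlusion(Enum):
    UNOCCLUDED = "unoccluded"
    OCCLUDED = "occluded"


@dataclass(frozen=True)
class FrameConstraints:
    """Constraint results for one frame (hypothetical container)."""
    occluder_detected: bool   # preset occludeable target constraint
    roi_triggered: bool       # preset region of interest constraint
    count_triggered: bool     # number of traffic lights constraint
    position_triggered: bool  # relative position constraint


def recognize(previous: Occlusion, c: FrameConstraints) -> Occlusion:
    """Choose the strategy from the previous state and apply it."""
    if previous is Occlusion.OCCLUDED:
        # Second strategy: only the count constraint can lift the occlusion.
        return Occlusion.OCCLUDED if c.count_triggered else Occlusion.UNOCCLUDED
    # First strategy: every constraint level must be triggered.
    if (c.occluder_detected and c.roi_triggered
            and c.count_triggered and c.position_triggered):
        return Occlusion.OCCLUDED
    return Occlusion.UNOCCLUDED
```

The asymmetry is deliberate: entering the occluded state requires all four constraints, while leaving it requires only that the detected count is no longer short, so a missing light alone never triggers the occluded state from an unoccluded frame.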
[0104] An embodiment of this application also provides a traffic light occlusion state recognition device 400. Figure 4 shows a schematic diagram of the structure of a traffic light occlusion state recognition device according to an embodiment of this application. The device 400 includes: a target detection unit 410, an acquisition unit 420, a determination unit 430, and a recognition unit 440, wherein:
[0105] The target detection unit 410 is used to acquire the road image of the current frame and perform target detection on the road image of the current frame using a preset target detection model to obtain the target detection result of the current frame;
[0106] The acquisition unit 420 is used to acquire the corresponding high-precision map data and the occlusion status recognition result of the traffic lights in the previous frame based on the road image of the current frame.
[0107] The determining unit 430 is used to determine the occlusion status recognition strategy of the traffic lights in the current frame based on the occlusion status recognition result of the traffic lights in the previous frame, using the target detection result of the current frame and the high-precision map data.
[0108] The identification unit 440 is used to identify the occlusion state of the traffic lights in the current frame using the traffic light occlusion state identification strategy of the current frame, and obtain the traffic light occlusion state identification result of the current frame.
[0109] In some embodiments of this application, the determining unit 430 is specifically used to: if the occlusion status identification result of the traffic light in the previous frame is an unoccluded state, then determine a first occlusion status identification strategy for the traffic light in the current frame based on the target detection result of the current frame and the high-precision map data; if the occlusion status identification result of the traffic light in the previous frame is an occluded state, then determine a second occlusion status identification strategy for the traffic light in the current frame based on the target detection result of the current frame and the high-precision map data.
[0110] In some embodiments of this application, the occlusion state recognition strategy for the traffic lights in the current frame is a first occlusion state recognition strategy. The recognition unit 440 is specifically used to: determine whether a preset constraint condition for an occludeable target is triggered based on the target detection result of the current frame; if the preset constraint condition for an occludeable target is triggered, determine whether a preset constraint condition for a region of interest is triggered based on the target detection result of the current frame and a preset region of interest of the road image; determine whether a constraint condition for the number of traffic lights is triggered based on the target detection result of the current frame and the high-precision map data; if both the constraint condition for the preset region of interest and the constraint condition for the number of traffic lights are triggered, determine whether a constraint condition for relative position is triggered based on the target detection result of the current frame, the preset region of interest of the road image, and the high-precision map data; if the constraint condition for relative position is triggered, determine that the occlusion state of the traffic lights in the current frame is an occluded state; otherwise, determine that the occlusion state of the traffic lights in the current frame is an unoccluded state.
[0111] In some embodiments of this application, the identification unit 440 is specifically used to: determine whether a preset occludeable target is detected in the road image of the current frame based on the target detection result of the current frame; if so, determine that the constraint condition of the preset occludeable target is triggered; otherwise, determine that the constraint condition of the preset occludeable target is not triggered.
[0112] In some embodiments of this application, the target detection result includes a detection box of the preset occludeable target. The identification unit 440 is specifically used to: determine the height percentage of the detection box of the preset occludeable target in the road image of the current frame; determine whether the detection box of the preset occludeable target intersects with a preset region of interest (ROI) of the road image; if the height percentage is greater than a preset height percentage threshold, and the detection box of the preset occludeable target intersects with the preset ROI of the road image, take the detection box of the preset occludeable target as an ROI detection box and determine that the constraint condition of the preset ROI is triggered; otherwise, determine that the constraint condition of the preset ROI is not triggered.
[0113] In some embodiments of this application, the identification unit 440 is specifically used to: determine the number of traffic lights detected in the road image of the current frame based on the target detection result of the current frame, and determine the actual number of traffic lights at the intersection corresponding to the road image of the current frame based on the high-precision map data; if the number of traffic lights detected in the road image of the current frame is less than the actual number of traffic lights, determine that the constraint condition on the number of traffic lights is triggered; otherwise, determine that the constraint condition on the number of traffic lights is not triggered.
[0114] In some embodiments of this application, the identification unit 440 is specifically used to: determine the region position of the region of interest detection box based on the target detection result of the current frame and the preset region of interest of the road image; determine the absolute position of the traffic lights at the intersection corresponding to the road image of the current frame based on the high-precision map data, and project the absolute position of the traffic lights onto the road image of the current frame to obtain the projected position of the traffic lights; determine whether there is an intersection between the region position of the region of interest detection box and the projected position of the traffic lights; if so, determine that the relative position constraint condition is triggered; otherwise, determine that the relative position constraint condition is not triggered.
[0115] In some embodiments of this application, the traffic light occlusion state identification strategy of the current frame is a second occlusion state identification strategy. The identification unit 440 is specifically used to: determine whether the traffic light quantity constraint condition is triggered based on the target detection result of the current frame and the high-precision map data; if the traffic light quantity constraint condition is not triggered, determine that the traffic light occlusion state of the current frame is an unoccluded state; otherwise, determine that the traffic light occlusion state of the current frame is an occluded state.
[0116] It is understood that the traffic light occlusion state recognition device described above can realize each step of the traffic light occlusion state recognition method provided in the foregoing embodiments. The relevant explanations of the traffic light occlusion state recognition method are applicable to the traffic light occlusion state recognition device, and will not be repeated here.
[0117] Figure 5 is a schematic diagram of the structure of an electronic device according to an embodiment of this application. Referring to Figure 5, at the hardware level, the electronic device includes a processor, and optionally also includes an internal bus, a network interface, and memory. The memory may include main memory, such as high-speed random-access memory (RAM), and may also include non-volatile memory, such as at least one disk drive. Of course, the electronic device may also include other hardware required for other business operations.
[0118] The processor, network interface, and memory can be interconnected via an internal bus, which can be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, or an EISA (Extended Industry Standard Architecture) bus, etc. The bus can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, the bus is shown in Figure 5 as a single double-headed arrow, but this does not mean that there is only one bus or one type of bus.
[0119] Memory is used to store programs. Specifically, programs may include program code, which includes computer operation instructions. Memory may include main memory and non-volatile memory, and provides instructions and data to the processor.
[0120] The processor reads the corresponding computer program from non-volatile memory into main memory and then runs it, forming a traffic light occlusion recognition device at the logical level. The processor executes the program stored in memory and specifically performs the following operations:
[0121] Acquire the road image of the current frame, and use a preset target detection model to perform target detection on the road image of the current frame to obtain the target detection result of the current frame;
[0122] Based on the road image of the current frame, obtain the corresponding high-precision map data and the occlusion status recognition result of the traffic lights in the previous frame;
[0123] Based on the occlusion status recognition result of the traffic lights in the previous frame, the target detection result of the current frame and the high-precision map data are used to determine the occlusion status recognition strategy of the traffic lights in the current frame.
[0124] The occlusion status of the traffic lights in the current frame is identified using the traffic light occlusion status identification strategy of the current frame, and the occlusion status identification result of the traffic lights in the current frame is obtained.
[0125] The method executed by the traffic light occlusion state recognition device disclosed in the embodiment shown in Figure 2 of this application can be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip with signal processing capabilities. During implementation, each step of the above method can be completed by integrated logic circuits in the processor's hardware or by instructions in software form. The processor can be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it can also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of this application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software module can reside in a storage medium that is mature in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. This storage medium is located in memory, and the processor reads information from the memory and, in conjunction with its hardware, completes the steps of the above method.
[0126] The electronic device can also perform the method executed by the traffic light occlusion state recognition device in Figure 2 and realize the functions of the traffic light occlusion state recognition device in the embodiment shown in Figure 2, which are not described in detail here.
[0127] This application also proposes a computer-readable storage medium that stores one or more programs, the programs including instructions that, when executed by an electronic device including multiple applications, enable the electronic device to perform the method executed by the traffic light occlusion state recognition device in the embodiment shown in Figure 2, and specifically to perform:
[0128] Acquire the road image of the current frame, and use a preset target detection model to perform target detection on the road image of the current frame to obtain the target detection result of the current frame;
[0129] Based on the road image of the current frame, obtain the corresponding high-precision map data and the occlusion status recognition result of the traffic lights in the previous frame;
[0130] Based on the occlusion status recognition result of the traffic lights in the previous frame, the target detection result of the current frame and the high-precision map data are used to determine the occlusion status recognition strategy of the traffic lights in the current frame.
[0131] The occlusion status of the traffic lights in the current frame is identified using the traffic light occlusion status identification strategy of the current frame, and the occlusion status identification result of the traffic lights in the current frame is obtained.
[0132] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0133] This invention is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each process and/or box of the flowcharts and/or block diagrams, and combinations of processes and/or boxes in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and/or one or more boxes of the block diagram.
[0134] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more processes of the flowchart and/or one or more boxes of the block diagram.
[0135] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more boxes of the block diagram.
[0136] In a typical configuration, a computing device includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.
[0137] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM) and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.
[0138] Computer-readable media includes both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.
[0139] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0140] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0141] The above description is merely an embodiment of this application and is not intended to limit the scope of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of the claims of this application.
Claims
1. A method for recognizing the occlusion state of a traffic light, wherein the method includes: acquiring the road image of the current frame, and using a preset target detection model to perform target detection on the road image of the current frame to obtain the target detection result of the current frame; based on the road image of the current frame, obtaining the corresponding high-precision map data and the occlusion state recognition result of the traffic lights in the previous frame; based on the occlusion state recognition result of the traffic lights in the previous frame, determining the occlusion state recognition strategy of the traffic lights in the current frame using the target detection result of the current frame and the high-precision map data, which includes: if the occlusion state recognition result of the traffic lights in the previous frame is an unoccluded state, determining a first occlusion state recognition strategy for the traffic lights in the current frame based on the target detection result of the current frame and the high-precision map data; if the occlusion state recognition result of the traffic lights in the previous frame is an occluded state, determining a second occlusion state recognition strategy for the traffic lights in the current frame based on the target detection result of the current frame and the high-precision map data; and recognizing the occlusion state of the traffic lights in the current frame using the occlusion state recognition strategy of the current frame to obtain the occlusion state recognition result of the traffic lights in the current frame;
wherein the occlusion state recognition strategy for the traffic lights in the current frame is the first occlusion state recognition strategy, and the step of using this strategy to recognize the occlusion state of the traffic lights in the current frame includes: based on the target detection result of the current frame, determining whether the constraint condition of the preset occludeable target is triggered, the preset occludeable target being a vehicle that obstructs the traffic light; if the constraint condition of the preset occludeable target is triggered, determining whether the constraint condition of the preset region of interest is triggered based on the target detection result of the current frame and the preset region of interest of the road image; determining whether the constraint condition on the number of traffic lights is triggered based on the target detection result of the current frame and the high-precision map data; if both the constraint condition of the preset region of interest and the constraint condition on the number of traffic lights are triggered, determining whether the relative position constraint condition is triggered based on the target detection result of the current frame, the preset region of interest of the road image, and the high-precision map data; if the relative position constraint condition is triggered, determining that the occlusion state of the traffic lights in the current frame is an occluded state; otherwise, determining that the occlusion state of the traffic lights in the current frame is an unoccluded state.
2. The method according to claim 1, wherein determining, based on the target detection result of the current frame, whether the constraint condition of the preset occluding target is triggered comprises:
determining, based on the target detection result of the current frame, whether a preset occluding target is detected in the road image of the current frame;
if so, determining that the constraint condition of the preset occluding target is triggered;
otherwise, determining that the constraint condition of the preset occluding target is not triggered.
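A minimal sketch of the check in claim 2, assuming each detection is a dictionary with a `label` key. The label names in `OCCLUDER_LABELS` are hypothetical; the source only states that the target is a vehicle that can occlude the traffic light.

```python
# Hypothetical class labels for vehicles tall enough to hide a traffic light.
OCCLUDER_LABELS = {"truck", "bus"}


def occluder_constraint(detections: list) -> bool:
    """Triggered when at least one detection is a preset occluding target."""
    return any(d["label"] in OCCLUDER_LABELS for d in detections)
```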
3. The method according to claim 1, wherein the target detection result includes a detection box of the preset occluding target, and, when the constraint condition of the preset occluding target is triggered, determining whether the constraint condition of the preset region of interest is triggered based on the target detection result of the current frame and the preset region of interest of the road image comprises:
determining the height percentage of the detection box of the preset occluding target in the road image of the current frame;
determining whether the detection box of the preset occluding target intersects the preset region of interest of the road image;
if the height percentage is greater than a preset height percentage threshold and the detection box of the preset occluding target intersects the preset region of interest of the road image, taking the detection box of the preset occluding target as the region-of-interest detection box and determining that the constraint condition of the preset region of interest is triggered;
otherwise, determining that the constraint condition of the preset region of interest is not triggered.
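The two tests in claim 3 could look roughly like the sketch below. It assumes axis-aligned boxes in pixel coordinates and reads "height percentage" as the box height divided by the image height; the `Box` type, the function names and the default threshold of 0.3 are illustrative assumptions, not values from the source.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    x1: float  # left
    y1: float  # top
    x2: float  # right
    y2: float  # bottom


def boxes_intersect(a: Box, b: Box) -> bool:
    """True when two axis-aligned boxes overlap with non-zero area."""
    return a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2


def roi_constraint(occluder: Box, roi: Box, image_height: float,
                   min_height_ratio: float = 0.3) -> Optional[Box]:
    """Return the occluder box as the region-of-interest detection box when
    the constraint is triggered, otherwise None."""
    height_ratio = (occluder.y2 - occluder.y1) / image_height
    if height_ratio > min_height_ratio and boxes_intersect(occluder, roi):
        return occluder
    return None
```

Returning the box rather than a plain boolean keeps the region-of-interest detection box available for the later relative-position check.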
4. The method according to claim 1, wherein determining whether the constraint condition of the number of traffic lights is triggered based on the target detection result of the current frame and the high-precision map data comprises:
determining the number of traffic lights detected in the road image of the current frame based on the target detection result of the current frame, and determining the actual number of traffic lights at the intersection corresponding to the road image of the current frame based on the high-precision map data;
if the number of traffic lights detected in the road image of the current frame is less than the actual number of traffic lights, determining that the constraint condition of the number of traffic lights is triggered;
otherwise, determining that the constraint condition of the number of traffic lights is not triggered.
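The comparison in claim 4 reduces to counting detections and checking them against the map. A minimal sketch, assuming detections are dictionaries with a `label` key and that the map lookup has already produced the intersection's light count (`map_light_count` is a hypothetical parameter):

```python
def count_constraint(detections: list, map_light_count: int) -> bool:
    """Triggered when fewer lights are detected than the map says exist."""
    detected = sum(1 for d in detections if d["label"] == "traffic_light")
    return detected < map_light_count
```

For example, with three lights recorded in the map and only two detected, the constraint is triggered; with three or more detected, it is not.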
5. The method according to claim 1, wherein, when the constraint condition of the preset region of interest and the constraint condition of the number of traffic lights are triggered, determining whether the constraint condition of relative position is triggered based on the target detection result of the current frame, the preset region of interest of the road image, and the high-precision map data comprises:
determining the region occupied by the region-of-interest detection box based on the target detection result of the current frame and the preset region of interest of the road image;
determining the absolute position of the traffic light at the intersection corresponding to the road image of the current frame based on the high-precision map data, and projecting the absolute position of the traffic light onto the road image of the current frame to obtain the projected position of the traffic light;
determining whether the region occupied by the region-of-interest detection box intersects the projected position of the traffic light;
if so, determining that the constraint condition of relative position is triggered;
otherwise, determining that the constraint condition of relative position is not triggered.
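One way to picture claim 5 is a pinhole projection followed by a point-in-box test. The sketch below assumes the traffic light's map position has already been transformed into the camera frame (x right, y down, z forward) and treats the projected position as a single pixel; the intrinsics `fx`, `fy`, `cx`, `cy` and all function names are hypothetical, and the source does not specify the projection model.

```python
from typing import Optional, Tuple


def project_to_image(point_cam: Tuple[float, float, float],
                     fx: float, fy: float,
                     cx: float, cy: float) -> Optional[Tuple[float, float]]:
    """Pinhole projection of a camera-frame point; None if behind the camera."""
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def position_constraint(roi_box: Tuple[float, float, float, float],
                        projected: Optional[Tuple[float, float]]) -> bool:
    """Triggered when the projected light falls inside the box (x1, y1, x2, y2)."""
    if projected is None:
        return False
    u, v = projected
    x1, y1, x2, y2 = roi_box
    return x1 <= u <= x2 and y1 <= v <= y2
```

If the light were represented by a projected box instead of a point, the same check would become a box-to-box intersection test.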
6. The method according to claim 1, wherein the occlusion state recognition strategy for the traffic light in the current frame is the second occlusion state recognition strategy, and recognizing the occlusion state of the traffic light in the current frame using this strategy comprises:
determining whether the constraint condition of the number of traffic lights is triggered based on the target detection result of the current frame and the high-precision map data;
if the constraint condition of the number of traffic lights is not triggered, determining that the traffic light in the current frame is in the unoccluded state;
otherwise, determining that the traffic light in the current frame is in the occluded state.
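The second strategy in claim 6 depends only on the light count, so an occluded state persists until the detected count recovers to the map count. A small sketch under that reading, with hypothetical function names and per-frame detected counts supplied as a list:

```python
from typing import List, Optional


def second_strategy(detected_lights: int, map_lights: int) -> str:
    """Stay occluded while fewer lights are detected than the map records."""
    return "occluded" if detected_lights < map_lights else "unoccluded"


def frames_until_clear(detected_per_frame: List[int],
                       map_lights: int) -> Optional[int]:
    """Index of the first frame judged unoccluded, or None if none is."""
    for index, detected in enumerate(detected_per_frame):
        if second_strategy(detected, map_lights) == "unoccluded":
            return index
    return None
```

Read this way, entering the occluded state requires the full cascade of the first strategy, while leaving it requires only that all mapped lights be detected again.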
7. A device for recognizing the occlusion state of a traffic light, wherein the device comprises:
a target detection unit, configured to acquire a road image of a current frame and perform target detection on the road image of the current frame using a preset target detection model to obtain a target detection result of the current frame;
an acquisition unit, configured to obtain, based on the road image of the current frame, the corresponding high-precision map data and the occlusion state recognition result of the traffic light in the previous frame;
a determining unit, configured to determine, according to the occlusion state recognition result of the traffic light in the previous frame, an occlusion state recognition strategy for the traffic light in the current frame using the target detection result of the current frame and the high-precision map data, which comprises: if the occlusion state recognition result of the traffic light in the previous frame is an unoccluded state, determining a first occlusion state recognition strategy for the traffic light in the current frame based on the target detection result of the current frame and the high-precision map data; and if the occlusion state recognition result of the traffic light in the previous frame is an occluded state, determining a second occlusion state recognition strategy for the traffic light in the current frame based on the target detection result of the current frame and the high-precision map data;
a recognition unit, configured to recognize the occlusion state of the traffic light in the current frame using the occlusion state recognition strategy of the current frame to obtain the occlusion state recognition result of the traffic light in the current frame;
wherein the occlusion state recognition strategy for the traffic light in the current frame is the first occlusion state recognition strategy, and the recognition unit is specifically configured to:
determine, based on the target detection result of the current frame, whether a constraint condition of a preset occluding target is triggered, the preset occluding target being a vehicle capable of occluding the traffic light;
if the constraint condition of the preset occluding target is triggered, determine whether a constraint condition of a preset region of interest is triggered based on the target detection result of the current frame and the preset region of interest of the road image, and determine whether a constraint condition of the number of traffic lights is triggered based on the target detection result of the current frame and the high-precision map data;
if both the constraint condition of the preset region of interest and the constraint condition of the number of traffic lights are triggered, determine whether a constraint condition of relative position is triggered based on the target detection result of the current frame, the preset region of interest of the road image, and the high-precision map data;
if the constraint condition of relative position is triggered, determine that the traffic light in the current frame is in the occluded state; otherwise, determine that the traffic light in the current frame is in the unoccluded state.
8. An electronic device, comprising: a processor; and a memory configured to store computer-executable instructions which, when executed, cause the processor to perform the method of any one of claims 1 to 6.