Traffic signal light region detection method, device and equipment
By combining saliency detection and prior traffic light features with bright and dark frame analysis, the method addresses inaccurate traffic light area detection under poor exposure and improves detection accuracy and robustness.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HANGZHOU HIKVISION DIGITAL TECHNOLOGY CO LTD
- Filing Date
- 2021-07-21
- Publication Date
- 2026-06-23
AI Technical Summary
Existing technologies struggle to accurately detect traffic light areas under poor exposure conditions, especially at night when halos are present, making it difficult to determine their true color and shape.
By using saliency detection, segmentation thresholding, and prior feature information of traffic lights, combined with video image analysis of bright and dark frames, candidate traffic light regions are selected, and the final region is determined based on the number and frequency of their appearance in the video image.
It improves the accuracy of traffic light detection, avoids false detections caused by poor exposure, and enhances the robustness and adaptability of the detection.
Figure CN115690441B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of image processing technology, and in particular to a method, apparatus and equipment for detecting traffic signal light areas.
Background Technology
[0002] In road monitoring scenarios, traffic light detection is a crucial application. Currently, traffic light detection is primarily based on color and shape information. This involves first using color thresholding to roughly determine the initial location, then utilizing certain traffic light features to filter these areas, ultimately obtaining the detection region.
[0003] However, in real-world surveillance scenarios, poor exposure often occurs, so that traffic light colors in video images do not match their true colors (e.g., red may appear yellowish and green may appear whitish; at night, when halos are present, the true colors are almost impossible to determine). Using only color information makes it difficult to accurately locate traffic light areas in such situations. Furthermore, traditional traffic light area detection schemes are often limited to detecting circular traffic lights, but in real-world scenarios, traffic lights of other shapes are common (e.g., pedestrian traffic lights may be human-shaped, and vehicle traffic lights may have accompanying text; at night, when halos are present, the true shapes are almost impossible to determine). These traffic lights cannot be detected based on shape features.
Summary of the Invention
[0004] In view of this, this application provides a method, apparatus and equipment for detecting traffic signal light areas.
[0005] Specifically, this application is implemented through the following technical solution:
[0006] According to a first aspect of the embodiments of this application, a traffic signal light area detection method is provided, comprising:
[0007] The saliency detection process is performed on the video image frames to be detected to obtain a saliency map;
[0008] The saliency map is segmented according to a preset segmentation threshold to obtain a binarized image;
[0009] Based on the prior feature information of traffic lights, candidate traffic light regions in the binarized image are determined; the prior feature information of traffic lights refers to the feature information of the traffic lights that has been determined before processing the video image to be detected.
[0010] Based on the number of times and / or frequency of each candidate traffic light region appearing in a preset number of video images to be detected, a traffic light region is determined from the candidate traffic light regions.
[0011] According to a second aspect of the embodiments of this application, a traffic signal light area detection apparatus is provided, comprising: a processing unit, used to perform saliency detection processing on a video image frame to be detected to obtain a saliency map;
[0012] The segmentation unit is used to segment the saliency map according to a preset segmentation threshold to obtain a binarized image;
[0013] The first determining unit is used to determine candidate traffic light regions in the binarized image based on the prior feature information of the traffic lights; the prior feature information of the traffic lights is the feature information of the traffic lights that has been determined before processing the video image to be detected.
[0014] The second determining unit is used to determine the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of each candidate traffic light region appearing in a preset number of video images to be detected.
[0015] According to a third aspect of the present application, an electronic device is provided, including a processor and a machine-readable storage medium, wherein the machine-readable storage medium stores machine-executable instructions that can be executed by the processor, and the processor is configured to execute the machine-executable instructions to implement the above-described traffic signal light area detection method.
[0016] According to a fourth aspect of the present application, a machine-readable storage medium is provided, wherein machine-executable instructions are stored therein, and when the machine-executable instructions are executed by a processor, the above-described traffic light area detection method is implemented.
[0017] The traffic light region detection method of this application embodiment performs saliency detection processing on the video image frame to be detected to obtain a saliency map; segments the saliency map according to a preset segmentation threshold to obtain a binarized image; determines candidate traffic light regions in the binarized image based on the prior feature information of traffic lights; and determines the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of occurrence of each candidate traffic light region in a preset number of video images to be detected. By using the saliency features of traffic lights to detect traffic light regions, the method avoids the defect in actual monitoring scenarios where the color and shape cannot be accurately detected under poor exposure conditions, thereby improving the accuracy of traffic light detection.
Attached Figure Description
[0018] Figure 1 is a schematic flowchart of a traffic signal light area detection method according to an exemplary embodiment of this application;
[0019] Figure 2 is an overall system block diagram of a traffic signal light area detection scheme according to an exemplary embodiment of this application;
[0020] Figure 3 is a schematic structural diagram of a traffic signal light area detection device according to an exemplary embodiment of this application;
[0021] Figure 4 is a schematic diagram of the hardware structure of an electronic device according to an exemplary embodiment of this application.
Detailed Implementation
[0022] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.
[0023] The terminology used in this application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. The singular forms “a,” “said,” and “the” used in this application and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise.
[0024] To enable those skilled in the art to better understand the technical solutions provided in the embodiments of this application, and to make the above-mentioned objectives, features and advantages of the embodiments of this application more apparent and understandable, the technical solutions in the embodiments of this application will be further described in detail below with reference to the accompanying drawings.
[0025] Please refer to Figure 1, which is a schematic flowchart of a traffic signal light area detection method provided in an embodiment of this application. As shown in Figure 1, the traffic light area detection method may include the following steps:
[0026] Step S100: Perform saliency detection processing on the video image frames to be detected to obtain a saliency map.
[0027] In this embodiment of the application, the video image frame to be detected may include any video image frame in the video data for which traffic light area detection is required, or a sampled image frame obtained by sampling the video data for which traffic light area detection is required.
[0028] For example, the video data mentioned above could be video data from monitoring front-end equipment at intersections where traffic lights are deployed.
[0029] In this embodiment of the application, considering that traffic lights often differ significantly from their surroundings, in order to accurately detect traffic light areas, saliency detection processing can be performed on the video image frames to be detected to obtain a saliency feature map (which can be simply referred to as a saliency map). For example, for traffic lights, the saliency targets displayed in the saliency map are traffic lights in the on state (such as red, green, or yellow lights).
[0030] For example, mainstream saliency detection algorithms, such as the LC algorithm proposed by Yun Zhai et al., the HC algorithm proposed by Ming-Ming Cheng, and the FT and AC algorithms proposed by Radhakrishna Achanta et al., can be used to perform saliency detection processing on the video image frames to be detected.
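As an illustration of step S100, the following is a minimal sketch of an LC-style global contrast saliency computation on a single grayscale frame. It is a simplified assumption for illustration, not the implementation used in this application: the function name `lc_saliency`, the list-of-rows image representation, and the rescaling to 0..255 are all hypothetical choices.

```python
def lc_saliency(gray):
    """LC-style saliency for a grayscale frame (hypothetical helper).

    gray: list of rows, each a list of ints in 0..255.
    Returns a map of the same shape, rescaled to 0..255.
    """
    # Histogram of intensities over the whole frame.
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1

    # Saliency of intensity a = sum over intensities n of hist[n] * |a - n|,
    # i.e. the summed contrast of a pixel against every pixel in the frame.
    contrast = [sum(hist[n] * abs(a - n) for n in range(256)) for a in range(256)]

    # Rescale to 0..255 using only the intensities that occur in the frame.
    present = [contrast[a] for a in range(256) if hist[a]]
    lo, hi = min(present), max(present)
    span = (hi - lo) or 1  # avoid division by zero on a flat frame
    return [[(contrast[v] - lo) * 255 // span for v in row] for row in gray]


# A dark frame with one bright pixel: the bright pixel is the salient target.
frame = [[10] * 4 for _ in range(4)]
frame[1][2] = 250
saliency = lc_saliency(frame)
```

Because the contrast is computed per intensity level rather than per pixel, the cost is dominated by one pass over the frame plus a fixed 256 x 256 table, which is one reason histogram-based methods suit sampled surveillance frames.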
[0031] Step S110: Segment the saliency map according to the preset segmentation threshold to obtain a binarized image.
[0032] In this embodiment of the application, it is considered that, in addition to traffic lights, the saliency map may contain other salient targets whose colors differ significantly from the surrounding environment, such as orange or red rooftops or isolated small trees.
[0033] Meanwhile, considering that different salient targets differ in saliency and that traffic lights are generally more salient than such targets, a pixel threshold (referred to as the segmentation threshold in this document) can be set to segment the saliency map and screen out the less salient targets in it.
[0034] For example, the saliency map obtained in step S100 can be segmented according to a preset segmentation threshold to obtain a binarized image.
[0035] For example, based on the preset segmentation threshold, the pixel values of the pixel positions in the saliency map where the pixel value is greater than the segmentation threshold and the pixel values of the pixel positions where the pixel value is less than the segmentation threshold can be set to different pixel values to obtain a binarized image.
[0036] For example, the pixel values at positions in the saliency map where the pixel value is less than a preset segmentation threshold can be set as the first pixel value, such as 0, and the pixel values at positions in the saliency map where the pixel value is greater than the preset segmentation threshold can be set as the second pixel value, such as 255.
[0037] It should be noted that for pixel positions in the saliency map where the pixel value is equal to the preset segmentation threshold, the pixel value can be set to the first pixel value, or the pixel value can be set to the second pixel value.
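The segmentation in step S110 can be sketched as follows. This is a minimal example under stated assumptions: the function name `binarize` and the flag `equal_is_second` are hypothetical, and the values 0 and 255 are the example first and second pixel values mentioned above.

```python
def binarize(saliency, threshold, first_value=0, second_value=255,
             equal_is_second=False):
    """Segment a saliency map into a binarized image (hypothetical helper).

    Pixels above the threshold get second_value, pixels below it get
    first_value. Pixels equal to the threshold may go either way, so the
    choice is exposed through equal_is_second.
    """
    out = []
    for row in saliency:
        new_row = []
        for v in row:
            if v > threshold or (equal_is_second and v == threshold):
                new_row.append(second_value)
            else:
                new_row.append(first_value)
        out.append(new_row)
    return out


binary = binarize([[10, 128, 200]], threshold=128)
```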
[0038] Step S120: Based on the prior feature information of the traffic lights, determine the candidate traffic light regions in the binarized image.
[0039] In this embodiment of the application, it is considered that the traffic light area has specific prior feature information in the video image frame.
[0040] For example, the prior feature information of traffic lights is the feature information of the traffic lights that has been determined before the video image to be detected is processed. It can be used to filter candidate traffic light regions in the image. That is, the candidate traffic light regions in the image (such as a binarized image) need to meet the prior feature information of traffic lights. Regions that do not meet the prior feature information of traffic lights can be filtered out as non-candidate traffic light regions.
[0041] For example, traffic light areas are usually located in the upper half of a video image frame, and the size of traffic light areas is usually within a certain range, neither too large nor too small.
[0042] Therefore, in order to improve the accuracy and efficiency of traffic light area detection, regions in the binarized image that do not meet the requirements of the prior feature information can be screened out based on the prior feature information of the traffic lights to obtain candidate traffic light regions in the binarized image.
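A minimal sketch of the prior-feature screening in step S120 is shown below. The bounding-box format `(x, y, w, h)`, the function names, and all numeric limits (upper half of the frame, area between 4 and 400 pixels) are assumptions chosen for illustration; the application does not specify concrete values.

```python
def matches_prior(box, frame_height, max_center_y_ratio=0.5,
                  min_area=4, max_area=400):
    """Check one region against assumed position and size priors."""
    x, y, w, h = box
    center_y = y + h / 2
    # Prior location: traffic lights usually sit in the upper part of the frame.
    in_upper_part = center_y < frame_height * max_center_y_ratio
    # Prior size: neither too small nor too large.
    size_ok = min_area <= w * h <= max_area
    return in_upper_part and size_ok


def filter_by_prior(boxes, frame_height, **prior):
    """Keep only the regions that satisfy the prior feature information."""
    return [b for b in boxes if matches_prior(b, frame_height, **prior)]


boxes = [(10, 5, 4, 4), (10, 80, 4, 4), (0, 0, 50, 40), (3, 3, 1, 1)]
candidates = filter_by_prior(boxes, frame_height=100)
```

In this example only the first box survives: the second lies in the lower half of the frame, the third is too large, and the fourth is too small.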
[0043] Step S130: Determine the traffic light area from the candidate traffic light areas based on the number of times and / or frequency of each candidate traffic light area appearing in a preset number of video images to be detected.
[0044] In this embodiment of the application, considering that traffic lights usually switch between red, yellow and green states in a regular manner, the number of times and frequency of occurrence of the traffic light area detected in the video image frame to be detected are both regular.
[0045] Accordingly, candidate traffic light regions can be screened based on the pattern of the number of times and / or frequency of traffic light regions appearing in the video image to be detected.
[0046] For example, when candidate traffic light areas are determined in accordance with the manner described in steps S100 to S120, traffic light areas can be determined from the candidate traffic light areas based on the number of times and / or frequency of each candidate traffic light area appearing in a preset number of video images to be detected.
[0047] For example, the number of times and / or frequency of each candidate traffic light area in a preset number of video images to be detected can be determined based on the preset holding time of the traffic light (such as the holding time of the red light, the holding time of the yellow light, and the holding time of the green light) and the time interval between consecutive frames of the video images to be detected.
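The relationship between holding time, frame interval, and expected appearance count can be sketched as follows. This is an illustrative assumption about how the requirement might be derived, not a formula stated in this application; the function names, the 20% tolerance, and the example timings are hypothetical.

```python
def frames_per_phase(hold_seconds, frame_interval_seconds):
    """Number of sampled frames covered by one lit phase of a lamp."""
    return hold_seconds / frame_interval_seconds


def expected_appearances(num_frames, frame_interval_seconds,
                         hold_seconds, cycle_seconds):
    """Expected number of frames, out of num_frames, in which a lamp is lit."""
    window_seconds = num_frames * frame_interval_seconds
    full_cycles = window_seconds / cycle_seconds
    return full_cycles * frames_per_phase(hold_seconds, frame_interval_seconds)


def select_by_count(counts, expected, tolerance=0.2):
    """Keep candidate regions whose count is close to the expected count."""
    low, high = expected * (1 - tolerance), expected * (1 + tolerance)
    return [region for region, n in counts.items() if low <= n <= high]


# Assumed timings: red held for 30 s in a 60 s cycle, 1 frame every 2 s.
expected = expected_appearances(num_frames=90, frame_interval_seconds=2,
                                hold_seconds=30, cycle_seconds=60)
kept = select_by_count({"lamp": 44, "headlight": 3, "billboard": 90}, expected)
```

Under these assumed timings the lamp is expected in 45 of the 90 frames, which corresponds to a frequency of 0.5. A passing headlight appears far too rarely and a permanently bright billboard far too often, so both are rejected.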
[0048] It can be seen that the method shown in Figure 1 utilizes the salient features of traffic lights to detect traffic light areas, thus avoiding the inability to accurately detect traffic light areas by color and shape in actual monitoring scenarios with poor exposure, thereby improving the accuracy of traffic light detection.
[0049] In some embodiments, step S120, determining candidate traffic light regions in the binarized image based on prior feature information of the traffic lights, may include:
[0050] In the binarized image, the second region that matches the prior feature information is determined as the candidate traffic light region; wherein, the pixel value of each pixel position in the second region is the second pixel value.
[0051] For example, prior feature information of a traffic light area may include:
[0052] Prior location information and / or prior size information.
[0053] For example, when a binarized image is obtained in accordance with the method described in steps S100 to S110, for a region in the binarized image whose pixel value is the second pixel value (e.g., 255) (referred to as the second region in this document), it can be determined whether each second region matches the prior feature information of the traffic light region based on the position information and / or size information of each second region, and the second region that matches the prior feature information of the traffic light region is determined as a candidate traffic light region.
[0054] For example, the region in a binarized image where the pixel value is the first pixel value (such as 0) can be called the first region.
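Extracting the second regions from a binarized image can be sketched with a simple flood fill. The application does not say how regions are delimited, so the use of 4-connectivity, the function name `second_regions`, and the `(x, y, w, h)` bounding-box output are assumptions for illustration.

```python
from collections import deque


def second_regions(binary, second_value=255):
    """Return bounding boxes (x, y, w, h) of 4-connected second-value regions."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if binary[y0][x0] != second_value or seen[y0][x0]:
                continue
            # Breadth-first flood fill over one connected region.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx]
                            and binary[ny][nx] == second_value):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes


binary = [
    [0, 255, 255, 0, 0],
    [0, 255, 255, 0, 0],
    [0, 0, 0, 0, 255],
    [0, 0, 0, 0, 255],
]
regions = second_regions(binary)
```

The position and size of each returned box can then be compared with the prior location and size information to decide whether the region is a candidate traffic light region.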
[0055] In some embodiments, the video image to be detected includes a bright frame image and a dark frame image sampled from two video data streams at the same sampling rate, which are acquired by a monitoring front-end device according to different exposure parameters.
[0056] In step S100, the saliency detection processing of the video image frame to be detected may include:
[0057] Saliency detection processing is performed on the bright frame image and the dark frame image respectively.
[0058] For example, considering the changes in lighting conditions in the deployment environment of the monitoring front-end device, when the monitoring front-end device uses a single exposure parameter to acquire video data, the video image frames in the acquired video data may be too bright due to excessive external lighting, or too dark due to insufficient external lighting, thereby affecting the accuracy of traffic light area detection.
[0059] Accordingly, in order to improve the accuracy and scene adaptability of traffic light area detection, two video data streams can be acquired by setting different exposure parameters in the monitoring front-end equipment, and then processed in different ISP (Image Signal Processor) processes to obtain bright frame data and dark frame data of the same scene for subsequent traffic light area detection.
[0060] For example, considering the large amount of video data and the degree of redundancy, in order to reduce the amount of detection data, when detecting traffic light areas, a series of sampling frames (including bright frame images and dark frame images) can be obtained from the two video data at the same sampling rate (e.g., 1 frame every 2 seconds).
[0061] For example, the sampled bright frame images and dark frame images correspond one-to-one, that is, the corresponding bright frame images and dark frame images are obtained by sampling video images acquired at the same time (allowing for time deviations within a tolerable range).
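The sampling and one-to-one pairing of bright and dark frames can be sketched as follows. This example works on timestamps only; the function names, the 0.05 second tolerance, and the nearest-timestamp matching rule are assumptions, since the application only requires that the time deviation stay within a tolerable range.

```python
def sample_times(duration_seconds, interval_seconds=2.0):
    """Timestamps at which to sample a stream, e.g. 1 frame every 2 seconds."""
    count = int(duration_seconds // interval_seconds) + 1
    return [i * interval_seconds for i in range(count)]


def pair_frames(bright_times, dark_times, tolerance_seconds=0.05):
    """Pair each bright timestamp with the nearest dark timestamp.

    A pair is kept only if the two timestamps differ by at most the
    tolerance. The tolerance is assumed to be much smaller than the
    sampling interval, so no dark frame is matched twice.
    """
    pairs = []
    for tb in bright_times:
        if not dark_times:
            break
        td = min(dark_times, key=lambda t: abs(t - tb))
        if abs(td - tb) <= tolerance_seconds:
            pairs.append((tb, td))
    return pairs


times = sample_times(6)
pairs = pair_frames([0.0, 2.0, 4.0], [0.01, 2.02, 4.5])
```

In the usage example the third bright frame has no dark frame within tolerance, so it is left unpaired and would not take part in the later cross-check.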
[0062] For example, in order to perform traffic light area detection, saliency detection processing can be performed on the above-mentioned bright frame image and dark frame image respectively.
[0063] In one example, before determining the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of occurrence of each candidate traffic light region in a preset number of video images to be detected in step S130, the following may also be included:
[0064] For any candidate traffic light region in any bright frame image, if there is no candidate traffic light region in the corresponding dark frame image that matches the position of the candidate traffic light region, the candidate traffic light region is filtered out.
[0065] For example, considering that the corresponding bright frame image and dark frame image are acquired by the same monitoring front-end device at the same time using different exposure parameters for the same scene, the same target will exist simultaneously in both the bright frame image and the corresponding dark frame image. Therefore, the existence of the target can be determined based on whether it exists in both the bright frame image and the corresponding dark frame image, thereby reducing the false detection rate and improving the accuracy of traffic light area detection.
[0066] Accordingly, for any candidate traffic light region in any bright frame image, the position of the candidate traffic light region in the bright frame image can be used to determine whether there is a candidate traffic light region at the matching position in the corresponding dark frame image. If there is, the candidate traffic light region is determined to exist; otherwise, it is determined to be a false detection and the candidate traffic light region is filtered out.
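The bright / dark cross-check can be sketched as follows. The application only requires that the positions match; measuring the match by intersection over union (IoU) with a threshold of 0.3 is an assumption made for this sketch, as are the function names and the `(x, y, w, h)` box format.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def keep_confirmed(bright_boxes, dark_boxes, min_iou=0.3):
    """Keep bright-frame candidates that have a position match in the dark frame."""
    return [b for b in bright_boxes
            if any(iou(b, d) >= min_iou for d in dark_boxes)]


bright = [(10, 10, 6, 6), (50, 40, 8, 8)]
dark = [(11, 10, 5, 5)]
confirmed = keep_confirmed(bright, dark)
```

Here the second bright-frame candidate has no counterpart in the dark frame and is filtered out as a false detection.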
[0067] In some embodiments, after determining the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of occurrence of each candidate traffic light region in a preset number of video images to be detected in step S130, the method may further include:
[0068] Based on the determined traffic light area, identify the target color area in the traffic light area of the video image frame to be detected, where the target color is one of red, green, and yellow;
[0069] For any target color region, if the surrounding area of the target color region is not a black area, the traffic light region is filtered out; and / or, if the number of times the target color region in the traffic light region appears at a fixed position in multiple consecutive video image frames to be detected does not meet the requirements, the traffic light region is filtered out.
[0070] For example, considering that the colors of traffic lights mainly include red, green, and yellow (the color that is one of red, green, and yellow can be called the target color); in addition, considering that in actual scenarios, traffic lights are usually surrounded by a black cover, it is possible to determine whether the identified traffic light area is a false detection based on whether the area surrounding the target color area is a black area, thereby further improving the accuracy of traffic light area detection.
[0071] Accordingly, when the traffic light area is determined in accordance with the manner described in the above embodiments, the target color area in the traffic light area of the video image frame to be detected can be identified based on the determined traffic light area.
[0072] For any target color area, if the surrounding area of the target color area is not a black area, the traffic light area is filtered out.
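The black-surround check can be sketched as follows. The ring width, the darkness limit, and the required dark fraction are hypothetical parameters, and the image is assumed to be a list of rows of `(r, g, b)` tuples; none of these details are specified in this application.

```python
def has_black_surround(image, box, margin=2, dark_max=60, min_dark_fraction=0.8):
    """Check whether the ring of pixels around a color region is mostly black.

    image: list of rows of (r, g, b) tuples; box: (x, y, w, h).
    A pixel counts as black if all three channels are at most dark_max.
    """
    height, width = len(image), len(image[0])
    x, y, w, h = box
    dark = total = 0
    for py in range(max(0, y - margin), min(height, y + h + margin)):
        for px in range(max(0, x - margin), min(width, x + w + margin)):
            inside = x <= px < x + w and y <= py < y + h
            if inside:
                continue  # only the surrounding ring is examined
            total += 1
            if max(image[py][px]) <= dark_max:
                dark += 1
    return total > 0 and dark / total >= min_dark_fraction


# A red 2x2 lamp inside a dark housing.
image = [[(20, 20, 20)] * 10 for _ in range(10)]
for py in (4, 5):
    for px in (4, 5):
        image[py][px] = (230, 30, 30)
surrounded = has_black_surround(image, (4, 4, 2, 2))
```

Requiring a fraction of dark pixels rather than all of them is a design choice that leaves room for halo pixels around a lit lamp at night.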
[0073] Furthermore, considering that the lighting time of the red, yellow, and green lights of a traffic light follows a regular pattern, and that the positions of the different colored lights are generally fixed, it is possible to determine whether a traffic light area is a false detection based on whether its lighting time is regular.
[0074] Accordingly, if the number of times the target color region appears at a fixed position in multiple consecutive video image frames to be detected does not meet the requirements (which can be set according to the time interval between the video image frames to be detected and the lighting duration of the target color), then the traffic light region is filtered out.
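One possible way to express the fixed-position requirement is to compare the lengths of consecutive lit runs with the run length implied by the lighting duration and the frame interval. This is a sketch under assumptions: the function names, the one-frame slack, and the treatment of runs cut off by the observation window are all hypothetical. A region whose runs fall outside the expected range is treated as a false detection.

```python
def lit_runs(lit_flags):
    """Lengths of consecutive runs of frames in which the target color is lit."""
    runs, current = [], 0
    for lit in lit_flags:
        if lit:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def timing_is_regular(lit_flags, lighting_seconds, frame_interval_seconds,
                      slack_frames=1):
    """Check lit runs at a fixed position against the expected run length."""
    expected = lighting_seconds / frame_interval_seconds
    runs = lit_runs(lit_flags)
    if not runs:
        return False
    starts_lit, ends_lit = bool(lit_flags[0]), bool(lit_flags[-1])
    for i, n in enumerate(runs):
        # A run touching either end of the window may be cut short,
        # so only its upper bound is checked.
        truncated = (i == 0 and starts_lit) or (i == len(runs) - 1 and ends_lit)
        if n > expected + slack_frames:
            return False
        if not truncated and n < expected - slack_frames:
            return False
    return True


# Assumed timings: the lamp stays lit for 10 s, with 1 frame every 2 s.
lamp = [0] * 3 + [1] * 5 + [0] * 6 + [1] * 5 + [0] * 2
regular = timing_is_regular(lamp, lighting_seconds=10, frame_interval_seconds=2)
```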
[0075] To enable those skilled in the art to better understand the technical solutions provided in the embodiments of this application, the technical solutions provided in the embodiments of this application are described below with reference to specific examples.
[0076] Please refer to Figure 2, which is an overall system block diagram of a traffic signal light area detection scheme provided in an embodiment of this application. As shown in Figure 2, in this embodiment, the implementation of the traffic signal light area detection scheme includes the following processing modules: a bright / dark frame acquisition module, a saliency detection module, a threshold segmentation module, a preliminary area positioning module, an area filtering and merging module, and a supplementary detection module.
[0077] For example, the bright and dark frame acquisition module acquires image data of the current scene in bright and dark frame modes from the monitoring camera respectively; the threshold segmentation module performs threshold segmentation based on the saliency map obtained by the saliency detection module to obtain a segmentation map; the preliminary region localization module roughly locates the area where traffic lights may appear based on the segmented image, and then the region filtering and merging module filters and merges the initially located areas; finally, the supplementary detection module 106 performs supplementary detection on the traffic lights again based on color information to obtain a more comprehensive and complete detection area.
[0078] The functions of each module are explained below.
[0079] I. Bright and Dark Frame Acquisition Module
[0080] The bright and dark frame acquisition module can acquire two video streams by setting different exposure parameters in the monitoring camera (i.e., the aforementioned monitoring front-end device), and process them in different ISP processes to obtain bright and dark frame data of the same scene, which can be used for subsequent traffic light area detection.
[0081] For example, since the video data is relatively large and has a certain degree of redundancy, the sampling rate (which is set by the user in the settings interface, with the default being 1 frame every 2 seconds) can be set to sample the video to obtain a series of sampled frames.
[0082] II. Saliency Detection Module
[0083] The saliency detection module can process each sampled bright frame and dark frame with a saliency detection algorithm (including but not limited to the LC algorithm proposed by Yun Zhai et al., the HC algorithm proposed by Ming-Ming Cheng, and the FT and AC algorithms proposed by Radhakrishna Achanta et al.) to obtain a series of saliency maps. These saliency maps contain the parts of the scene that are visually significantly more prominent than their surroundings.
[0084] III. Threshold Segmentation Module
[0085] The threshold segmentation module can segment the saliency maps obtained on bright and dark frames by setting a segmentation threshold respectively. The part of the saliency map that is less than the segmentation threshold is set to zero and the part that is greater than the threshold is set to 255, thus obtaining a binarized image. This eliminates most of the parts with low saliency.
[0086] IV. Preliminary Area Positioning Module
[0087] Since the location and size of traffic lights have certain prior information (i.e., the aforementioned prior feature information, taking the prior feature information including prior location information and prior size information as an example), the preliminary area localization module can eliminate areas that are obviously not traffic lights in terms of location and size (i.e., areas whose location does not match the prior location information and / or whose size does not match the prior size information) based on this prior information, and locate and mark the location and size of the remaining possible areas (i.e., the aforementioned candidate traffic light areas).
[0088] V. Region Filtering and Merging Module
[0089] The region filtering and merging module can, based on the preliminary regions (i.e., candidate traffic light regions) recorded in each video sampling frame, retain the regions that appear in both the bright and dark frames at the same or similar positions. It also excludes regions that appear fewer than a certain number of times (such regions are likely vehicle lights, which do not stay at a fixed position because the vehicles move). Furthermore, it excludes regions whose frequency of occurrence is irregular, based on the periodicity of traffic light changes. Finally, it merges some adjacent regions (these regions typically belong to the red, yellow, and green lights of the same traffic light). The regions obtained after processing by the region filtering and merging module are the traffic light regions obtained through features other than color and shape.
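The final merging step of this module can be sketched as follows. The adjacency rule (boxes count as adjacent when they lie within a small pixel gap of each other), the gap of 2 pixels, and the function names are assumptions for illustration; the application does not define how adjacency is measured.

```python
def boxes_adjacent(a, b, gap=2):
    """True if two (x, y, w, h) boxes overlap or lie within `gap` pixels."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return (ax - gap <= bx + bw and bx <= ax + aw + gap and
            ay - gap <= by + bh and by <= ay + ah + gap)


def union_box(a, b):
    """Smallest box containing both input boxes."""
    x0, y0 = min(a[0], b[0]), min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0)


def merge_adjacent(boxes, gap=2):
    """Repeatedly merge adjacent boxes until no pair is adjacent."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if boxes_adjacent(merged[i], merged[j], gap):
                    merged[i] = union_box(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged


# Three vertically stacked lamps of one traffic light plus a distant region.
lamps = [(10, 10, 6, 6), (10, 18, 6, 6), (10, 26, 6, 6), (60, 10, 6, 6)]
merged = merge_adjacent(lamps)
```

In this example the three stacked lamp regions collapse into one region covering the whole traffic light, while the distant region is left unchanged.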
[0090] VI. Supplementary Detection Module
[0091] The supplementary detection module can segment the red area in the scene of the bright frame (taking the target color as red as an example) based on the traffic light area obtained through the above steps and according to the color information.
[0092] Since traffic lights are usually surrounded by a black casing, this feature can be used to eliminate red areas that act as interference in the scene. At the same time, the frequency of red areas appearing at fixed positions in the video sampling frame sequence can be used as a requirement. Red areas that do not meet the requirements (usually red areas on buildings in the background or car headlights) will also be eliminated. In this way, the traffic light area obtained through color information will serve as a supplement to the aforementioned detection area, further improving the accuracy of detection.
[0093] It should be noted that the traffic light detection area determined in the above manner can be marked with a rectangle on the monitoring screen for users to check whether the detection is accurate.
[0094] As can be seen, in the above traffic light area detection process, by simultaneously utilizing information from both bright and dark frames, false detections caused by single exposure of the surveillance camera can be avoided, greatly improving the robustness of the detection. Furthermore, using salient features for traffic light area detection avoids the limitation of not being able to detect traffic lights using color and shape in poorly exposed scenarios, effectively avoiding the problem of traditional traffic light area detection schemes relying entirely on the color and shape information of traffic lights. Finally, by utilizing the correlation information between consecutive video frames, other similar areas in the background and car lights can be effectively prevented from being mistakenly detected as traffic lights, improving the accuracy of traffic light area detection.
[0095] The method provided in this application has been described above. The apparatus provided in this application is described below:
[0096] Please refer to Figure 3, which is a schematic structural diagram of a traffic signal light area detection device provided in an embodiment of this application. As shown in Figure 3, the traffic signal light area detection device may include:
[0097] The processing unit 310 is used to perform saliency detection processing on the video image frame to be detected to obtain a saliency map;
[0098] The segmentation unit 320 is used to segment the saliency map according to a preset segmentation threshold to obtain a binarized image;
[0099] The first determining unit 330 is used to determine the candidate traffic light region in the binarized image based on the prior feature information of the traffic lights; the prior feature information of the traffic lights is the feature information of the traffic lights that has been determined before processing the video image to be detected.
[0100] The second determining unit 340 is used to determine the traffic light area from the candidate traffic light area based on the number of times and / or frequency of each candidate traffic light area appearing in a preset number of video images to be detected.
[0101] In some embodiments, the segmentation unit 320 segments the saliency map according to a preset segmentation threshold to obtain a binarized image, including:
[0102] The pixel values at the locations of pixels in the saliency map whose pixel values are less than the preset segmentation threshold are set as the first pixel values, and the pixel values at the locations of pixels in the saliency map whose pixel values are greater than the preset segmentation threshold are set as the second pixel values.
[0103] The first determining unit 330 determines candidate traffic light regions in the binarized image based on prior feature information of the traffic lights, including:
[0104] The second region in the binarized image that matches the prior feature information is determined as a candidate traffic light region; wherein, the pixel value of each pixel position in the second region is the second pixel value;
[0105] The prior feature information includes:
[0106] Prior location information and / or prior size information.
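The segmentation and prior-feature matching described above can be sketched as follows. This is an illustrative sketch only: the helper names, the choice of 0 and 255 as the first and second pixel values, the treatment of pixels equal to the threshold, the use of 4-connected regions, and the encoding of the priors as a rectangle `roi` and a `size_range` are all assumptions not stated in this application.

```python
def binarize(saliency, threshold, first_value=0, second_value=255):
    """Pixels below the threshold get first_value, the others second_value."""
    # Assumption: a pixel exactly equal to the threshold joins the second group.
    return [[second_value if v >= threshold else first_value for v in row]
            for row in saliency]


def connected_regions(binary, value=255):
    """Bounding boxes (x, y, w, h) of 4-connected regions equal to value."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != value or seen[y][x]:
                continue
            stack, seen[y][x] = [(x, y)], True
            x0 = x1 = x
            y0 = y1 = y
            while stack:  # iterative flood fill
                cx, cy = stack.pop()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and not seen[ny][nx] and binary[ny][nx] == value):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            boxes.append((x0, y0, x1 - x0 + 1, y1 - y0 + 1))
    return boxes


def matches_prior(box, roi=None, size_range=None):
    """Check a box against hypothetical prior location and size information."""
    x, y, w, h = box
    if roi is not None:  # prior location: box must lie inside the rectangle
        rx, ry, rw, rh = roi
        if not (rx <= x and ry <= y and x + w <= rx + rw and y + h <= ry + rh):
            return False
    if size_range is not None:  # prior size: ((min_w, min_h), (max_w, max_h))
        (min_w, min_h), (max_w, max_h) = size_range
        if not (min_w <= w <= max_w and min_h <= h <= max_h):
            return False
    return True


def candidate_regions(saliency, threshold, roi=None, size_range=None):
    """Binarize the saliency map and keep second-value regions matching the priors."""
    binary = binarize(saliency, threshold)
    return [b for b in connected_regions(binary) if matches_prior(b, roi, size_range)]
```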
[0107] In some embodiments, the video image to be detected includes a bright frame image and a dark frame image sampled from two video data streams at the same sampling rate, wherein the two video data streams are acquired by a monitoring front-end device according to different exposure parameters;
[0108] The processing unit 310 performs saliency detection processing on the video image frames to be detected, including:
[0109] Saliency detection processing is performed on the bright frame image and the dark frame image respectively.
[0110] In some embodiments, before the second determining unit 340 determines the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of occurrence of each candidate traffic light region in a preset number of video images to be detected, it further includes:
[0111] For any candidate traffic light region in any bright frame image, if there is no candidate traffic light region in the corresponding dark frame image that matches the position of the candidate traffic light region, the candidate traffic light region is filtered out.
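A minimal sketch of this bright-frame and dark-frame cross-check follows. The application only requires that the positions match; using intersection-over-union (IoU) with a `min_iou` threshold as the matching criterion is an assumption made here, as are the function names.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0  # boxes do not overlap
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)


def filter_by_dark_frame(bright_candidates, dark_candidates, min_iou=0.5):
    """Drop bright-frame candidates with no position-matched dark-frame candidate."""
    return [b for b in bright_candidates
            if any(iou(b, d) >= min_iou for d in dark_candidates)]
```

For example, a bright-frame candidate at `(10, 10, 10, 20)` survives when the corresponding dark frame contains a candidate at `(11, 10, 10, 20)`, while a candidate with no overlapping counterpart in the dark frame is removed.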
[0112] In some embodiments, after the second determining unit 340 determines the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of occurrence of each candidate traffic light region in a preset number of video images to be detected, it further includes:
[0113] Based on the determined traffic light area, identify the target color area in the traffic light area of the video image frame to be detected, wherein the target color is one of red, green and yellow;
[0114] For any target color area, if the surrounding area of the target color area is not a black area, the traffic light area is filtered out; and / or
[0115] If the number of times the target color region in the traffic light area appears at a fixed position in multiple consecutive video image frames to be detected does not meet the requirements, the traffic light area is filtered out.
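The two post-checks above can be sketched as follows. The application does not define how "black" is measured or what count "meets the requirements", so the parameters `margin`, `max_brightness`, `min_dark_ratio`, `min_hits`, and `tolerance`, the function names, and the choice of the first observed box as the reference position are all assumptions for illustration.

```python
def surround_is_black(image, box, margin=2, max_brightness=50, min_dark_ratio=0.8):
    """Check whether the ring of pixels around a target color box is mostly dark.

    image: rows of (r, g, b) tuples; box: (x, y, w, h) of the target color area.
    """
    x, y, w, h = box
    height, width = len(image), len(image[0])
    dark = total = 0
    for py in range(max(0, y - margin), min(height, y + h + margin)):
        for px in range(max(0, x - margin), min(width, x + w + margin)):
            if x <= px < x + w and y <= py < y + h:
                continue  # skip the lit area itself
            total += 1
            if max(image[py][px]) <= max_brightness:
                dark += 1
    return total > 0 and dark / total >= min_dark_ratio


def appears_at_fixed_position(boxes_per_frame, min_hits, tolerance=2):
    """Count frames whose target color box stays near the first observed box.

    boxes_per_frame: one (x, y, w, h) box or None per consecutive frame.
    """
    present = [b for b in boxes_per_frame if b is not None]
    if not present:
        return False
    ref = present[0]  # assumption: the first observation defines the fixed position
    hits = sum(1 for b in present
               if all(abs(p - q) <= tolerance for p, q in zip(b, ref)))
    return hits >= min_hits
```

A traffic light area would be kept only if the checks selected by the embodiment pass; either check can be used alone or both can be combined, matching the "and / or" wording above.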
[0116] Please refer to Figure 4, which is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of this application. The electronic device may include a processor 401 and a machine-readable storage medium 402 storing machine-executable instructions. The processor 401 and the machine-readable storage medium 402 can communicate via a system bus 403. Furthermore, by reading and executing the machine-executable instructions corresponding to the traffic light area detection control logic in the machine-readable storage medium 402, the processor 401 can execute the traffic light area detection method described above.
[0117] The machine-readable storage medium 402 mentioned herein can be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, etc. For example, a machine-readable storage medium can be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, storage drive (such as hard disk drive), solid-state drive, any type of storage disk (such as optical disc, DVD, etc.), or similar storage media, or combinations thereof.
[0118] In some embodiments, a machine-readable storage medium is also provided, which stores machine-executable instructions that, when executed by a processor, implement the traffic light area detection method described above. For example, the machine-readable storage medium may be a ROM, RAM, CD-ROM, magnetic tape, floppy disk, or optical data storage device.
[0119] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0120] The above description is merely a preferred embodiment of this application and is not intended to limit this application. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of protection of this application.
Claims
1. A method for detecting traffic signal light areas, characterized in that it comprises: performing saliency detection processing on the video image frames to be detected to obtain a saliency map; segmenting the saliency map according to a preset segmentation threshold to obtain a binarized image; determining candidate traffic light regions in the binarized image based on the prior feature information of traffic lights, the prior feature information of traffic lights being the feature information of the traffic lights that has been determined before processing the video image to be detected; and determining a traffic light region from the candidate traffic light regions based on the number of times and / or frequency of each candidate traffic light region appearing in a preset number of video images to be detected; wherein the video images to be detected include bright frame images and dark frame images sampled from two video data streams at the same sampling rate, and the two video data streams are acquired by a monitoring front-end device according to different exposure parameters; performing saliency detection processing on the video image frames to be detected to obtain a saliency map includes: performing saliency detection processing on the bright frame image and the dark frame image respectively; and before determining the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of occurrence of each candidate traffic light region in a preset number of video images to be detected, the method further includes: for any candidate traffic light region in any bright frame image, if there is no candidate traffic light region in the corresponding dark frame image that matches the position of the candidate traffic light region, filtering out the candidate traffic light region.
2. The method according to claim 1, characterized in that, before determining the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of occurrence of each candidate traffic light region in a preset number of video images to be detected, the method further includes: for any candidate traffic light region in any bright frame image, if there is no candidate traffic light region in the corresponding dark frame image that matches the position of the candidate traffic light region, filtering out the candidate traffic light region.
3. The method according to claim 1, characterized in that, after determining the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of occurrence of each candidate traffic light region in a preset number of video images to be detected, the method further includes: identifying, based on the determined traffic light area, the target color area in the traffic light area of the video image frame to be detected, wherein the target color is one of red, green and yellow; for any target color area, if the surrounding area of the target color area is not a black area, filtering out the traffic light area; and / or, if the number of times the target color region in the traffic light area appears at a fixed position in multiple consecutive video image frames to be detected does not meet the requirements, filtering out the traffic light area.
4. A traffic signal light area detection device, characterized in that it comprises: a processing unit, used to perform saliency detection processing on the video image frames to be detected to obtain a saliency map; a segmentation unit, used to segment the saliency map according to a preset segmentation threshold to obtain a binarized image; a first determining unit, used to determine candidate traffic light regions in the binarized image based on the prior feature information of the traffic lights, the prior feature information of the traffic lights being the feature information of the traffic lights that has been determined before processing the video image to be detected; and a second determining unit, used to determine the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of each candidate traffic light region appearing in a preset number of video images to be detected; wherein the segmentation unit segmenting the saliency map according to a preset segmentation threshold to obtain a binarized image includes: setting the pixel values at the locations of pixels in the saliency map whose pixel values are less than the preset segmentation threshold to the first pixel value, and setting the pixel values at the locations of pixels in the saliency map whose pixel values are greater than the preset segmentation threshold to the second pixel value; the first determining unit determining candidate traffic light regions in the binarized image based on prior feature information of the traffic lights includes: determining a second region in the binarized image that matches the prior feature information as a candidate traffic light region, wherein the pixel value of each pixel position in the second region is the second pixel value; the prior feature information includes prior location information and / or prior size information; and before the second determining unit determines the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of occurrence of each candidate traffic light region in a preset number of video images to be detected, it further includes: for any candidate traffic light region in any bright frame image, if there is no candidate traffic light region in the corresponding dark frame image that matches the position of the candidate traffic light region, filtering out the candidate traffic light region.
5. The apparatus according to claim 4, characterized in that the segmentation unit segmenting the saliency map according to a preset segmentation threshold to obtain a binarized image includes: setting the pixel values at the locations of pixels in the saliency map whose pixel values are less than the preset segmentation threshold to the first pixel value, and setting the pixel values at the locations of pixels in the saliency map whose pixel values are greater than the preset segmentation threshold to the second pixel value; the first determining unit determining candidate traffic light regions in the binarized image based on prior feature information of the traffic lights includes: determining a second region in the binarized image that matches the prior feature information as a candidate traffic light region, wherein the pixel value of each pixel position in the second region is the second pixel value; and the prior feature information includes prior location information and / or prior size information.
6. The apparatus according to claim 4, characterized in that, after the second determining unit determines the traffic light region from the candidate traffic light regions based on the number of times and / or frequency of occurrence of each candidate traffic light region in a preset number of video images to be detected, it further includes: identifying, based on the determined traffic light area, the red area in the traffic light area of the video image frame to be detected; for any red area, if the surrounding area of the red area is not a black area, filtering out the traffic light area; and / or, if the number of times the red area in the traffic light region appears at a fixed position in multiple consecutive video image frames to be detected does not meet the requirement, filtering out the traffic light region.
7. An electronic device, characterized in that it includes a processor and a memory, the memory storing machine-executable instructions that can be executed by the processor, and the processor executing the machine-executable instructions to implement the method as described in any one of claims 1-3.