Road condition image processing method, driving recorder and storage medium
By adopting a differentiated image processing method based on driving risk weights, the problem of excessive computing power consumption of dashcams in low-light environments has been solved, achieving more efficient image processing and more stable device operation, and improving the detection accuracy and real-time warning of key targets.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHENZHEN FANTTIK TECHNOLOGY INNOVATION CO LTD
- Filing Date
- 2026-03-24
- Publication Date
- 2026-06-19

Figure CN122244976A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of dashcam technology, and in particular to a method for processing road condition images, a dashcam, and a storage medium.
Background Technology
[0002] A dashcam is a device that records video and audio information while a vehicle is in motion. The road condition images captured by a dashcam can serve as evidence in the event of a vehicle accident. In low-light environments, such as at night or in rain, the dashcam's system-on-a-chip (SoC) typically needs to perform deep noise reduction and high dynamic range enhancement on the raw road condition images to improve image clarity.
[0003] However, performing deep noise reduction and high dynamic range enhancement on an entire high-resolution (such as 4K / 8K) raw road condition image consumes excessive computing power on the system-on-a-chip. This leads to a sharp increase in power consumption and severe heat generation, and triggers the system's frequency reduction protection mechanism, which in turn reduces the processing speed of the road condition image, increases latency, and affects the stable operation of the dashcam.
Summary of the Invention
[0004] This application provides a method for processing road condition images, a dashcam, and a storage medium. It can perform differentiated image processing on each region of interest in a road condition image based on the driving risk weight of each traffic element. This is beneficial for allocating the computing resources of the system-on-a-chip (SoC) to each region of interest as needed, thereby reducing the computing power consumption of the SoC.
[0005] To address the aforementioned technical problems, the embodiments of this application provide the following technical solutions: In a first aspect of this application, a method for processing road condition images is provided. The method includes: acquiring a road condition image of a vehicle using a camera, the road condition image including at least one traffic element; obtaining feature parameters of each traffic element among the at least one traffic element, the feature parameters including category, physical distance relative to the vehicle, and relative speed vector; determining a driving risk weight of each traffic element based on the feature parameters of each traffic element; and performing differentiated image processing on each region of interest of the road condition image based on the magnitude of the driving risk weight of each traffic element; wherein the driving risk weight of each traffic element is used to characterize the potential threat level of each traffic element to the driving safety of the vehicle, and each region of interest is the region where each traffic element is located in the road condition image.
[0006] In the embodiments of this application, the processor determines the driving risk weight of each traffic element based on its category, physical distance relative to the vehicle, and relative speed vector in the road condition image. Based on the magnitude of the driving risk weight of each traffic element, the processor performs differentiated image processing on each region of interest in the road condition image. By performing differentiated image processing on each region of interest, it is beneficial to allocate the computing resources of the system-on-a-chip (SoC) to each region of interest on demand, reducing the computing power consumed by the SoC during the processing of the road condition image. For example, computing resources can be preferentially allocated to the regions in the road condition image where traffic elements with higher driving risk weights are located, according to the magnitude of their driving risk weights.
[0007] In some embodiments, the categories of each traffic element and the semantic importance factors of each traffic element have a mapping relationship; determining the driving risk weight of each traffic element based on the characteristic parameters of each traffic element includes: determining the semantic importance factors of each traffic element based on the categories of each traffic element and the mapping relationship; calculating the driving risk weight of each traffic element based on the semantic importance factors of each traffic element, the physical distance relative to the vehicle, and the relative speed vector; wherein, the semantic importance factors of each traffic element are used to characterize the driving risk level of the vehicle corresponding to the category of each traffic element.
[0008] In some embodiments, calculating the driving risk weight of each traffic element based on its semantic importance factor, physical distance relative to the vehicle, and relative speed vector includes: calculating the product of the semantic importance factor of the i-th traffic element and the relative speed vector relative to the vehicle; calculating the sum of the physical distance of the i-th traffic element relative to the vehicle and a smoothing coefficient; dividing the product of the semantic importance factor of the i-th traffic element and the relative speed vector relative to the vehicle by the sum of the physical distance of the i-th traffic element relative to the vehicle and the smoothing coefficient to obtain the driving risk weight of the i-th traffic element; wherein the i-th traffic element is any one of the at least one traffic element.
[0009] In some embodiments, obtaining the feature parameters of each traffic element in the at least one traffic element includes: downsampling the road condition image to obtain a thumbnail image; extracting the semantic features of the thumbnail image using a lightweight perception operator to obtain the semantic information of the thumbnail image; and obtaining the feature parameters of each traffic element in the at least one traffic element based on the semantic information of the thumbnail image.
[0010] In some embodiments, performing differentiated image processing on each region of interest in the road condition image based on the driving risk weight of each traffic element includes: constructing a weight matrix of the road condition image based on the driving risk weight of each traffic element; and performing differentiated image processing on each region of interest in the road condition image based on the weight matrix; wherein the road condition image includes P*Q pixel blocks, the weight matrix includes P*Q matrix elements, and the pixel block in the p-th row and q-th column of the road condition image corresponds to the matrix element in the p-th row and q-th column of the weight matrix, where P and Q are positive integers greater than 1, p is a positive integer not greater than P, and q is a positive integer not greater than Q; if the pixel block in the p-th row and q-th column is located in the area where any one of the traffic elements is located in the road condition image, the value of the matrix element in the p-th row and q-th column is a target value, which is positively correlated with the driving risk weight of that traffic element; if the pixel block in the p-th row and q-th column is located in the background area, the value of the matrix element in the p-th row and q-th column is a baseline value, which is less than the target value; the background area is the area of the road condition image other than the areas where the traffic elements are located.
[0011] In some embodiments, the differential image processing of each region of interest in the road condition image based on the weight matrix includes: parsing the weight matrix through a hardware scheduler to divide the road condition image into several image slices, and performing differential image processing on the several image slices; combining the several image slices after differential image processing to obtain the processed road condition image; wherein, the several image slices include image slices of target regions of interest and image slices of non-target regions of interest; the target regions of interest are the regions in the road condition image where each traffic element with a driving risk weight greater than a preset weight threshold is located, and the non-target regions of interest include the regions in the road condition image where each traffic element with a driving risk weight not greater than the preset weight threshold is located and the background region.
[0012] In some embodiments, the differential image processing of the plurality of image slices includes: invoking a deep neural network operator to perform nonlinear feature reconstruction on the image slices of the target region of interest; and performing linear processing on the image slices of the non-target region of interest.
[0013] In some embodiments, the traffic image includes the m-th frame traffic image, where m is a positive integer not less than 1, and the image slice of the target region of interest includes the image slice of the target region of interest in the m-th frame image; the method further includes: during the nonlinear feature reconstruction of the image slice of the target region of interest in the m-th frame traffic image using the deep neural network operator, extracting the intermediate layer feature map output by the intermediate layer of the deep neural network operator; extracting the semantic features of the intermediate layer feature map to obtain the semantic information of the intermediate layer feature map; re-acquiring the feature parameters of each traffic element in the m-th frame traffic image based on the semantic information of the intermediate layer feature map to obtain the updated feature parameters of each traffic element in the m-th frame traffic image; updating the weight matrix of the m-th frame traffic image based on the updated feature parameters of each traffic element in the m-th frame traffic image to obtain the updated weight matrix; if the (m+1)-th frame traffic image is acquired by the camera, then differential image processing is performed on each region of interest in the (m+1)-th frame traffic image based on the updated weight matrix.
[0014] In a second aspect of this application, a dashcam is also provided, the dashcam comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the method described in the first aspect.
[0015] In a third aspect of this application, a non-volatile computer-readable storage medium is also provided, the computer-readable storage medium storing computer-executable instructions that, when executed, enable the execution of the method described in the first aspect.
[0016] It should be understood that the description in the Summary of the Invention section is not intended to limit the key or essential features of this disclosure, nor is it intended to restrict the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.
Attached Figure Description
[0017] To more clearly illustrate the technical solutions of the embodiments of the present invention, the accompanying drawings used in the embodiments of the present invention will be briefly described below. Obviously, the drawings described below are merely some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without any creative effort.
[0018] Figure 1 is a schematic diagram of the structure of a road condition image processing system provided in some embodiments of this application;
Figure 2 is a schematic diagram of the hardware structure of a dashcam for performing a road condition image processing method, provided in some embodiments of this application;
Figure 3 is a schematic flowchart of a road condition image processing method provided in some embodiments of this application;
Figure 4 is a schematic diagram of the system-on-a-chip provided in some embodiments of this application.
Reference numerals: 100, road condition image processing system; 200, vehicle; 300, dashcam; 301, system-on-a-chip; 302, camera; 303, display screen; 304, external memory; 305, communication module; 306, USB interface; 710, processor; 720, memory.
Detailed Implementation
[0019] The principles and spirit of this disclosure will be described below with reference to several exemplary embodiments illustrated in the accompanying drawings. It should be understood that these specific embodiments are described merely to enable those skilled in the art to better understand and implement this disclosure, and are not intended to limit the scope of this disclosure in any way. In the following description and claims, unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art.
[0020] As used herein, the term "comprising" and similar terms should be understood as open-ended inclusion, i.e., "including but not limited to". The term "based on" should be understood as "at least partially based on". The term "an embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first", "second", etc., may refer to different or the same objects and are used only to distinguish the objects referred to, without implying a particular spatial order, temporal order, order of importance, etc., of the objects referred to.
[0021] Figure 1 is a schematic diagram of the structure of the road condition image processing system provided in some embodiments of this application. As shown in Figure 1, the system 100 includes a vehicle 200 and a dashcam 300 installed on the vehicle 200.
[0022] Figure 2 is a schematic diagram of the hardware structure of the dashcam 300 according to some embodiments of this application. As shown in Figure 2, the dashcam 300 includes a system-on-a-chip (SoC) 301 and, communicatively connected to the SoC 301, one or more cameras 302, a display screen 303, an external memory 304, a communication module 305, and a universal serial bus (USB) interface 306.
[0023] It should be understood that the structure illustrated in Figure 2 does not constitute a specific limitation on the dashcam 300. In other embodiments of this application, the dashcam 300 may include more or fewer components than illustrated, or combine some components, or split some components, or have different component arrangements. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
[0024] The system-on-a-chip 301's processor may include one or more processing units, such as a main control processor, a graphics processing unit (GPU), a traditional image signal processor (ISP), an artificial intelligence image signal processor (AIISP), a visual processing unit (VPU), and / or a neural network processing unit (NPU). Different processing units can be independent devices or integrated into one or more processors. The NPU, by referencing biological neural network structures, enables intelligent cognitive applications in the dashcam 300, such as image recognition and voice recognition.
[0025] The dashcam 300 achieves its display function through a GPU, a display screen 303, and a main control processor. The GPU is a microprocessor for image processing, connected to the display screen 303 and the main control processor. The GPU is used to perform mathematical and geometric calculations and for graphics rendering. The processor may include one or more GPUs, which execute program instructions to generate or modify display information.
[0026] Camera 302 is used to capture still images or videos. The dashcam 300 may include multiple cameras 302, such as a front-view camera and a rear-view camera; wherein, the front-view camera is used to capture road condition images in front of the vehicle 200, and the rear-view camera is used for reversing assistance and to capture road condition images behind the vehicle 200. It should be understood that in some other embodiments, the dashcam 300 may also include more cameras 302, such as an interior camera, which is located inside the vehicle 200's cabin and used to capture images of the interior of the vehicle 200's cabin.
[0027] The display screen 303 is used to display images captured by each camera 302 of the at least one camera 302. The display screen 303 includes a display panel, which may be a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
[0028] The external memory 304 may include double data rate synchronous dynamic random-access memory (DDR memory), an embedded multi-media card (eMMC), a TF card (TransFlash card), and the like. The DDR memory serves as system memory and cache: it stores program instructions and real-time data during system operation, such as image data, intermediate layer feature maps, and driving risk weight data, and provides high-speed read/write cache space for modules such as the ISP, AIISP, and NPU.
[0029] The communication module 305 includes a wired communication module and a wireless communication module. The wireless communication module can provide solutions for wireless communication applications on the dashcam 300, including wireless local area network (WLAN), Wi-Fi, Bluetooth (BT), and Global Navigation Satellite System (GNSS). GNSS can include Global Positioning System (GPS), Global Navigation Satellite System (GLONASS), and / or BeiDou Navigation Satellite System.
[0030] USB interface 306 is used to enable data transmission, software upgrades and power supply interaction between the dashcam and external devices, such as exporting still images or video data captured by at least one camera 302 to external devices.
[0031] Dashcam images of road conditions are frequently used to provide effective evidence in vehicle accidents. In low-light or complex lighting conditions, raw road condition images captured by cameras often suffer from severe noise, blurred dark areas, and overexposure in bright areas, resulting in reduced image clarity. Currently, to improve the clarity of road condition images, deep noise reduction and high dynamic range enhancement are typically performed on the raw images using an AIISP. However, performing full-image AI enhancement on the entire high-resolution (e.g., 4K / 8K) road condition image can lead to excessive system-on-a-chip (SoC) computing power consumption, a sharp increase in power consumption, severe overheating, and forced system frequency reduction, affecting device stability and lifespan.
[0032] In existing technologies, processors process each region of interest in a road image based solely on the speed or position of individual traffic elements, without distinguishing the categories of these elements. For example, pedestrians stationary on the road are also key targets, but in traditional algorithms, they have low weight and are difficult to prioritize for enhancement, resulting in insufficient detail of key targets and affecting subsequent identification and evidence collection.
[0033] Traditional road condition image processing systems employ a serial processing chain of ISP image enhancement, image encoding and reconstruction, and target recognition. The raw-domain road condition image must first be reconstructed and encoded before the complete image is transmitted to the perception and recognition module. This chain involves numerous data processing steps and long transmission paths, resulting in significant end-to-end latency. In high-speed driving scenarios, substantial perception latency can prevent Advanced Driver Assistance Systems (ADAS) modules from issuing warnings in time, leading to delayed or even failed warnings and posing a driving safety hazard.
[0034] In view of this, embodiments of this application provide a method for processing road condition images, a dashcam, and a storage medium. Based on the driving risk weight of each traffic element, differentiated image processing is performed on each region of interest in the road condition image. This facilitates the priority allocation of computing resources to the regions in the road condition image where traffic elements with higher driving risk weights are located, according to their driving risk weights. To facilitate the reader's understanding of this application, specific embodiments are described below.
[0035] Figure 3 is a schematic flowchart of a road condition image processing method provided in some embodiments of this application. The method is applied to a dashcam, for example the dashcam 300 shown in Figure 1 and Figure 2. As shown in Figure 3, the method includes the following steps: Step S11: The processor acquires a road condition image of the vehicle through a camera, and the road condition image includes at least one traffic element.
[0036] In this embodiment, the processor acquires road condition images of the vehicle via a camera. These road condition images can be raw domain data, that is, the original road condition images captured by the camera. The road condition images include at least one traffic element, such as pedestrians, animals, electric vehicles, bicycles, cars, buses, trees, or traffic signs.
[0037] Specifically, in some embodiments, the road condition image can be a road condition image in front of the vehicle or a road condition image behind the vehicle. For example, when the vehicle is moving forward, the processor can acquire a road condition image in front of the vehicle using a front-view camera. When the vehicle is reversing, the processor can acquire a road condition image behind the vehicle using a rear-view camera.
[0038] Step S12: The processor acquires the feature parameters of each traffic element of the at least one traffic element.
[0039] In this embodiment, the processor can acquire the feature parameters of each traffic element in the road condition image. These feature parameters include the category of each traffic element, the physical distance of each traffic element relative to the vehicle, and the relative speed vector of each traffic element relative to the vehicle. Specifically, the category of a traffic element can be pedestrian, motor vehicle, non-motor vehicle, traffic sign, and so on. The processor can preset a corresponding category for each traffic element; for example, bicycles and electric vehicles can be categorized as non-motor vehicles.
[0040] In some embodiments, to reduce computational power consumption and memory bandwidth usage, step S12 specifically includes: the processor performing downsampling processing on the road condition image to obtain a thumbnail image, and the processor acquiring feature parameters of each traffic element based on the thumbnail image. In this embodiment, by performing downsampling processing on the road condition image, the processor can compress the spatial dimension of the road condition image, thereby reducing the size and resolution of the road condition image simultaneously, resulting in a thumbnail image. Compared to the road condition image without downsampling processing, the thumbnail image can reduce the amount of data while preserving the main structure of each traffic element, thus reducing the computational power required for subsequent extraction of semantic features.
[0041] In some embodiments, the processor obtains the feature parameters of each traffic element based on the thumbnail image, specifically including: the processor extracts the semantic features of the thumbnail image through a lightweight perception operator to obtain the semantic information of the thumbnail image; the processor obtains the feature parameters of each traffic element in at least one traffic element based on the semantic information of the thumbnail image.
[0042] NPUs typically include lightweight perception operators and high-precision perception operators. The high-precision perception operators run on the high-performance cores of the NPU, used for fine-grained perception and accurate recognition of full-resolution road condition images. The lightweight perception operators, deployed on the low-power cores of the NPU and implemented as miniature visual perception computing units based on a lightweight network structure, are used for coarse-grained feature extraction from low-resolution road condition images. They can quickly identify traffic elements such as pedestrians, vehicles, and traffic signs and output preliminary semantic features. In this embodiment, the processor uses lightweight perception operators to extract preliminary semantic features of traffic elements such as pedestrians, other vehicles, and traffic signs from a relatively low-resolution thumbnail image to initially identify the location and category of each traffic element, without needing to identify high-precision bounding boxes or fine-grained features of each traffic element.
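The downsampling step described above can be illustrated with a minimal sketch. The source does not specify the downsampling method, so block averaging is an assumption here, and the function name `downsample`, the grayscale list-of-rows image representation, and the `factor` parameter are all hypothetical choices made for illustration.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values (assumed format)


def downsample(image: Image, factor: int) -> Image:
    """Shrink an image by averaging each factor x factor block of pixels.

    Block averaging is an illustrative assumption; the source only says
    that the road condition image is downsampled into a thumbnail.
    Trailing rows and columns that do not fill a whole block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    rows = len(image) // factor
    cols = len(image[0]) // factor if image else 0
    thumbnail: Image = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Collect the pixels of one factor x factor block.
            block = [
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        thumbnail.append(out_row)
    return thumbnail
```

With `factor = 2`, the thumbnail holds one quarter of the original pixels, which is the kind of data reduction that lets a lightweight perception operator run coarse-grained detection cheaply.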
[0043] Step S13: The processor determines the driving risk weight of each traffic element based on the feature parameters of each traffic element.
[0044] In some embodiments, there is a mapping relationship between the category of each traffic element and the semantic importance factor of each traffic element, wherein the semantic importance factor of each traffic element is used to characterize the driving risk level that the category of the traffic element poses to the vehicle. Step S13 specifically includes: the processor determining the semantic importance factor of each traffic element according to the category of each traffic element and the mapping relationship; and calculating the driving risk weight of each traffic element based on the semantic importance factor of each traffic element, the physical distance relative to the vehicle, and the relative speed vector.
[0045] In some embodiments, there is a one-to-one correspondence between categories and semantic importance factors. Different categories correspond to different semantic importance factors. Those skilled in the art can pre-set the semantic importance factors corresponding to the categories of each traffic element according to actual needs. If the semantic importance factor of a traffic element is larger, the driving risk level of the vehicle corresponding to that category is higher. For example, the semantic importance factors corresponding to pedestrians, motor vehicles, and traffic signs can be set to 1.5, 1.0, and 0.8, respectively; in this case, the driving risk levels of the vehicles corresponding to pedestrians, motor vehicles, and traffic signs decrease sequentially. Assigning different driving risk levels to different categories of traffic elements intuitively reflects the level difference of "pedestrian risk is higher than vehicle risk, and vehicle risk is higher than traffic sign risk," providing an important benchmark for the calculation of driving risk weights.
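The category-to-factor mapping can be sketched as a simple lookup table. The three values below are the example values given above; the dictionary name, the category keys, and the choice to raise an error for an unconfigured category are hypothetical and would be set according to actual needs.

```python
# Example values from the description: pedestrian 1.5, motor vehicle 1.0,
# traffic sign 0.8. The key names are hypothetical.
SEMANTIC_IMPORTANCE = {
    "pedestrian": 1.5,
    "motor_vehicle": 1.0,
    "traffic_sign": 0.8,
}


def semantic_importance(category: str) -> float:
    """Return the preset semantic importance factor for a category."""
    try:
        return SEMANTIC_IMPORTANCE[category]
    except KeyError:
        # Assumption: an unconfigured category is treated as a setup error.
        raise ValueError(
            f"no semantic importance factor configured for {category!r}"
        ) from None
```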
[0046] In some embodiments, the processor calculates the driving risk weight of each traffic element based on its semantic importance factor, physical distance relative to the vehicle, and relative speed vector. This includes: the processor calculating the product of the semantic importance factor of the i-th traffic element and its relative speed vector relative to the vehicle; the processor calculating the sum of the physical distance of the i-th traffic element relative to the vehicle and a smoothing coefficient; and the processor dividing that product by that sum to obtain the driving risk weight of the i-th traffic element. Here, the i-th traffic element is any one of the at least one traffic element.
[0047] In some embodiments, the road condition image specifically includes N traffic elements, where N is a positive integer not less than 1; if N = 1, then i = 1; if N is a positive integer not less than 2, then i = 1, 2, ..., N. The processor calculates the driving risk weight of each traffic element based on its semantic importance factor, physical distance relative to the vehicle, and relative speed vector. Specifically, the processor calculates the driving risk weight of the i-th traffic element using the following formula:
$$W_i = \frac{S_i \cdot v_i}{d_i + \varepsilon}$$
[0048] where $W_i$ is the driving risk weight of the i-th traffic element; $v_i$ is the relative speed vector of the i-th traffic element relative to the vehicle; $S_i$ is the semantic importance factor of the i-th traffic element; $d_i$ is the physical distance of the i-th traffic element relative to the vehicle; and $\varepsilon$ is the smoothing coefficient.
[0049] In the embodiments of this application, the driving risk weight of each traffic element is used to characterize the potential threat posed by each traffic element to the vehicle's driving safety. The larger the driving risk weight of a traffic element, the greater the potential threat it poses to the vehicle's driving safety. The relative speed vector is used to measure how quickly each traffic element approaches the vehicle; the larger the relative speed vector, the higher the driving risk. The smoothing coefficient is a constant greater than 0, such as 0.1, 0.05, or 0.2. The smoothing coefficient is used to avoid division-by-zero errors, smooth out the increase in driving risk weights in extremely close-range scenarios, and offset the weight fluctuations caused by small errors in physical distance measurement.
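The calculation described above (semantic importance factor times relative speed, divided by distance plus smoothing coefficient) can be sketched as follows. The source does not state how the relative speed vector is reduced to a scalar, so using its magnitude is an assumption; the function name, the 2-D vector representation, and the units are likewise hypothetical.

```python
import math
from typing import Tuple


def driving_risk_weight(
    semantic_factor: float,
    relative_velocity: Tuple[float, float],
    distance: float,
    smoothing: float = 0.1,
) -> float:
    """Risk weight = semantic factor * relative speed / (distance + smoothing).

    Assumption: the relative speed vector enters the formula through its
    magnitude. The default smoothing value 0.1 is one of the example
    constants mentioned in the description.
    """
    if smoothing <= 0:
        raise ValueError("smoothing coefficient must be greater than 0")
    if distance < 0:
        raise ValueError("distance must be non-negative")
    speed = math.hypot(*relative_velocity)
    # The positive smoothing term keeps the denominator non-zero even
    # when the measured distance is 0.
    return semantic_factor * speed / (distance + smoothing)
```

For instance, under these assumptions a pedestrian (factor 1.5) closing at 5 m/s from 9.9 m away gets a weight of about 0.75, while the same pedestrian at 0.9 m gets about 7.5, which shows how the weight grows as the distance shrinks.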
[0050] Step S14: The processor performs differentiated image processing on each region of interest in the road condition image based on the magnitude of the driving risk weight of each traffic element; wherein each region of interest is the region where each traffic element is located in the road condition image.
[0051] In some embodiments, step S14 specifically includes: the processor constructing a weight matrix for the road condition image based on the driving risk weight of each traffic element; and performing differentiated image processing on each region of interest in the road condition image based on the weight matrix. The road condition image includes P*Q pixel blocks, the weight matrix includes P*Q matrix elements, and the pixel block in the p-th row and q-th column of the road condition image corresponds to the matrix element in the p-th row and q-th column of the weight matrix, where P and Q are positive integers greater than 1, p is a positive integer not greater than P, and q is a positive integer not greater than Q. If the pixel block in the p-th row and q-th column is located in the area where any one of the traffic elements is located in the road condition image, the value of the matrix element in the p-th row and q-th column is a target value, which is positively correlated with the driving risk weight of that traffic element; if the pixel block in the p-th row and q-th column is located in the background area, the value of the matrix element in the p-th row and q-th column is a baseline value, which is less than the target value. The background area is the area of the road condition image other than the areas where the traffic elements are located.
[0052] Specifically, the pixel array of the road condition image comprises P rows and Q columns of pixel blocks, each pixel block including at least one pixel. The weight matrix comprises P rows and Q columns of matrix elements. There is a one-to-one correspondence between the pixel blocks of the road condition image and the matrix elements of the weight matrix. Any pixel block of the road condition image corresponds to a matrix element in the same position in the weight matrix. This weight matrix can distinguish between different regions of interest and background regions in the road condition image by the gradient of the numerical values of the matrix elements. Thus, the processor can perform differentiated image processing on different regions of interest in the road condition image based on the weight matrix, realizing on-demand allocation of computing resources.
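The correspondence between pixel blocks and matrix elements described above can be sketched as follows. This is an illustration under stated assumptions only: the bounding-box representation in pixel-block indices, the choice of target value as baseline plus risk weight, and the rule of keeping the larger value where boxes overlap are not specified in this application, and build_weight_matrix is a hypothetical name.

```python
def build_weight_matrix(rows, cols, elements, baseline=0.0):
    """Build a rows x cols weight matrix (P rows, Q columns).

    elements: list of ((top, left, bottom, right), risk_weight) pairs, where the box
    is given in pixel-block indices and bottom/right are exclusive (assumption).
    """
    matrix = [[baseline] * cols for _ in range(rows)]
    for (top, left, bottom, right), risk_weight in elements:
        if risk_weight <= 0:
            raise ValueError("risk_weight must be positive so the target exceeds the baseline")
        # Assumed mapping: the target value grows with the risk weight and is above baseline.
        target = baseline + risk_weight
        for p in range(max(top, 0), min(bottom, rows)):
            for q in range(max(left, 0), min(right, cols)):
                # Where boxes overlap, keep the larger value (assumption).
                matrix[p][q] = max(matrix[p][q], target)
    return matrix


weights = build_weight_matrix(3, 4, [((0, 1, 2, 3), 2.0)], baseline=1.0)
```

Blocks inside the box hold the target value and all other blocks hold the baseline value, so the matrix separates regions of interest from the background by value alone.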
[0053] In some embodiments, the processor performs differential image processing on each region of interest in the road condition image based on a weight matrix, including: the processor parses the weight matrix through a hardware scheduler to divide the road condition image into several image slices, and performs differential image processing on the several image slices; the processor combines the several image slices after differential image processing to obtain a processed road condition image; wherein, the several image slices include image slices of the target region of interest and image slices of non-target regions of interest; the target region of interest is the region in the road condition image where each traffic element has a driving risk weight greater than a preset weight threshold, and the non-target regions of interest include the regions in the road condition image where each traffic element has a driving risk weight not greater than a preset weight threshold. In some other embodiments, the non-target regions of interest may also include background regions.
[0054] In some embodiments, the processor performs differential image processing on several image slices, including: the processor invokes a deep neural network operator to perform nonlinear feature reconstruction on the image slices of the target region of interest through the deep neural network operator; and the processor performs linear processing on the image slices of non-target regions of interest.
[0055] In the embodiments of this application, image slices of non-target regions of interest can be linearly processed by the ISP. The hardware scheduler can slice the road condition image irregularly according to the weight matrix to obtain several image slices, and dynamically decide whether each image slice undergoes nonlinear reconstruction by the deep neural network operator on the NPU or linear processing by the ISP, thereby realizing intelligent allocation of computing resources.
[0056] Specifically, deep neural network operators can be AI Noise Reduction (AINR) operators and image super-resolution operators. The processor uses deep neural network operators to perform non-linear feature reconstruction on image slices representing the target region of interest (ROI), ensuring clear features of ROIs (such as license plates and faces) in low-light nighttime road condition images. Image slices representing non-ROI regions are directly passed to the ISP for linear image processing such as white balance, eliminating the need for complex AI calculations and feature reconstruction via the NPU. This effectively reduces the number of read / write operations in DDR memory, lowering DDR memory access frequency and system power consumption.
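The routing decision described above can be illustrated with a short software sketch. It is not the hardware scheduler itself: the row-wise run-length slicing is a simplified stand-in for the irregular slicing, the threshold is assumed to be expressed in the same units as the matrix values, and route_for, slice_rows and the labels "npu" and "isp" are hypothetical.

```python
def route_for(weight, threshold):
    """Blocks above the threshold go to the NPU operator, all others to the ISP."""
    return "npu" if weight > threshold else "isp"


def slice_rows(weight_matrix, threshold):
    """Merge horizontally adjacent blocks with the same route into slices.

    Returns (row, start_col, end_col, route) tuples, with end_col exclusive.
    """
    slices = []
    for p, row in enumerate(weight_matrix):
        start = 0
        for q in range(1, len(row) + 1):
            route = route_for(row[start], threshold)
            # Close the current slice at the end of the row or when the route changes.
            if q == len(row) or route_for(row[q], threshold) != route:
                slices.append((p, start, q, route))
                start = q
    return slices


plan = slice_rows([[1.0, 3.0, 3.0, 1.0]], threshold=2.0)
```

Only the slices labelled "npu" would incur deep neural network computation in this sketch; the remaining slices take the cheaper linear path.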
[0057] In some embodiments, the traffic image includes the m-th frame traffic image, where m is a positive integer not less than 1, and the image slice of the target region of interest includes the image slice of the target region of interest of the m-th frame image. The method further includes: during the nonlinear feature reconstruction of the image slice of the target region of interest of the m-th frame traffic image using a deep neural network operator, the processor extracts the intermediate layer feature map output by the intermediate layer of the deep neural network operator; the processor extracts the semantic features of the intermediate layer feature map to obtain the semantic information of the intermediate layer feature map; the processor re-acquires the feature parameters of each traffic element in the m-th frame traffic image based on the semantic information of the intermediate layer feature map to obtain the updated feature parameters of each traffic element in the m-th frame traffic image; the processor updates the weight matrix of the m-th frame traffic image based on the updated feature parameters of each traffic element in the m-th frame traffic image to obtain the updated weight matrix; if the processor acquires the (m+1)-th frame traffic image through a camera, the processor performs differential image processing on each region of interest in the (m+1)-th frame traffic image based on the updated weight matrix.
[0058] Specifically, before image reconstruction, the intermediate layer feature map is directly transmitted to the Advanced Driver Assistance System (ADAS) module using zero-copy memory technology. The processor then extracts the semantic features of the intermediate layer feature map through the ADAS module.
[0059] In the embodiments of this application, the processor can update the weight matrix of the m-th frame traffic image based on the intermediate layer feature map corresponding to the m-th frame traffic image, thereby obtaining the updated weight matrix and realizing dynamic updating of the weight matrix. Compared with the method of directly extracting the semantic features of the (m+1)-th frame traffic image to obtain the feature parameters of each traffic element in the (m+1)-th frame traffic image and constructing a weight matrix to perform differential image processing on each region of interest in the (m+1)-th frame traffic image, the method of the processor performing differential image processing on each region of interest in the (m+1)-th frame traffic image based on the updated weight matrix in the embodiments of this application can effectively reduce encoding and decoding latency and improve the processing efficiency of traffic images.
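The frame-to-frame reuse of the weight matrix described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions: enhance and update_matrix are hypothetical callables standing in for the deep neural network operator (which also returns an intermediate layer feature map) and for the semantic extraction and matrix update steps; neither name comes from this application.

```python
def run_pipeline(frames, initial_matrix, enhance, update_matrix):
    """Process frames in order, carrying the updated weight matrix to the next frame.

    enhance(frame, matrix) -> (processed_frame, intermediate_features)
    update_matrix(matrix, intermediate_features) -> updated matrix
    """
    matrix = initial_matrix
    outputs = []
    for frame in frames:
        # Frame m is processed with the matrix available at this point.
        processed, features = enhance(frame, matrix)
        outputs.append(processed)
        # The matrix updated from frame m's features is reused for frame m+1,
        # so frame m+1 needs no separate semantic extraction pass before processing.
        matrix = update_matrix(matrix, features)
    return outputs, matrix


# Stub callables with integers, purely to show the data flow.
outputs, final_matrix = run_pipeline(
    [1, 2, 3],
    0,
    enhance=lambda frame, matrix: ((frame, matrix), frame),
    update_matrix=lambda matrix, features: matrix + features,
)
```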
[0060] In the embodiments of this application, the raw road condition data collected by the camera is stored in the DDR cache after basic image signal processing by a traditional ISP. The cached data in the DDR memory is processed along two paths: on one path, it is rendered by the GPU and output to the display screen, or encoded by the VPU and stored in the memory; on the other path, it is intelligently enhanced by the AIISP and sent to the NPU for neural network inference, and the output result is used by the ADAS module to realize driving assistance perception. The ADAS module sends the perception result back to the AIISP, forming a closed-loop architecture in which the weight matrix obtained by the AIISP is corrected in real time by the perception result of the ADAS module, thereby improving the image quality and perception robustness in complex scenes.
[0061] The embodiments of this application have the following beneficial effects: In terms of power consumption, compared with the full-image AIISP processing method, the power consumption of the dashcam system-on-a-chip can be reduced by about 45%, effectively reducing heat generation and extending the service life of the hardware; in terms of perception accuracy, the detection confidence of key targets such as license plates and pedestrians in low-light environments at night is improved by 30%, improving the accuracy of ADAS perception; in terms of response speed, the end-to-end latency from light signal acquisition to ADAS warning is reduced by 15ms to 25ms, improving the real-time performance of warnings and the efficiency of system operation.
[0062] For example, Figure 4 shows a schematic diagram of the hardware structure of the system-on-a-chip (SoC). As shown in Figure 4, the system-on-a-chip 301 includes one or more processors 710 and a memory 720; Figure 4 takes one processor 710 as an example.
[0063] The processor 710 and memory 720 can be connected via a bus or other means. Figure 4 takes connection via a bus as an example.
[0064] The memory 720, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions / modules corresponding to the methods in the embodiments of this application. The processor 710 executes various functional applications and data processing of the dashcam by running the non-volatile software programs, instructions, and modules stored in the memory 720, thereby implementing the methods in the above-described method embodiments.
[0065] The memory 720 may include a program storage area and a data storage area. The program storage area may store the operating system and applications required for at least one function; the data storage area may store data created based on the use of the dashcam. Furthermore, the memory 720 may include high-speed random access memory and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 720 may optionally include memory remotely located relative to the processor 710, and such remote memory may be connected to the dashcam via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
[0066] The one or more modules are stored in the memory 720. When executed by the one or more processors 710, they perform the methods in any of the above method embodiments, for example, method steps S31-S34 in Figure 3 described above.
[0067] The above-described product can perform the methods provided in the embodiments of this application, and has the corresponding functional modules and beneficial effects for performing the methods. Technical details not described in detail in this embodiment can be found in the methods provided in the embodiments of this application.
[0068] This application provides a non-volatile computer-readable storage medium storing computer-executable instructions. When executed by one or more processors, for example one processor 710 in Figure 4, the instructions can cause the one or more processors to perform the methods in any of the above method embodiments, for example, method steps S31-S34 in Figure 3 described above.
[0069] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs.
[0070] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented using software and a general-purpose hardware platform, or of course, using hardware. Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The program can be stored in a computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. The storage medium can be a magnetic disk, optical disk, read-only memory (ROM), or random access memory (RAM), etc.
[0071] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; under the concept of the present invention, the technical features of the above embodiments or different embodiments can also be combined, the steps can be implemented in any order, and there are many other variations of different aspects of the present invention as described above, which are not provided in detail for the sake of brevity; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. A method for processing road condition images, characterized in that, The method includes: The vehicle captures road condition images using a camera, and the road condition images include at least one traffic element; Obtain feature parameters for each traffic element in the at least one traffic element, the feature parameters including category, physical distance relative to the vehicle, and relative speed vector; The driving risk weight of each traffic element is determined based on the characteristic parameters of each traffic element. Based on the driving risk weights of each traffic element, differentiated image processing is performed on each region of interest in the road condition image. The driving risk weights of each traffic element are used to characterize the potential threat level of each traffic element to the driving safety of the vehicle, and each region of interest is the region where each traffic element is located in the road condition image.
2. The method according to claim 1, characterized in that, The categories of each traffic element and the semantic importance factors of each traffic element have a mapping relationship; The step of determining the driving risk weight of each traffic element based on its characteristic parameters includes: The semantic importance factor of each traffic element is determined based on the category of each traffic element and the mapping relationship. Based on the semantic importance factor of each traffic element, the physical distance relative to the vehicle, and the relative speed vector, the driving risk weight of each traffic element is calculated. The semantic importance factor of each traffic element is used to characterize the driving risk level of the vehicle corresponding to the category of each traffic element.
3. The method according to claim 2, characterized in that, The calculation of driving risk weights for each traffic element based on its semantic importance factor, physical distance relative to the vehicle, and relative speed vector includes: Calculate the product of the semantic importance factor of the i-th traffic element and the relative speed vector of the i-th traffic element relative to the vehicle; Calculate the sum of the physical distance of the i-th traffic element relative to the vehicle and the smoothing coefficient; The driving risk weight of the i-th traffic element is obtained by dividing the product by the sum. Wherein, the i-th traffic element is any one of the at least one traffic element.
4. The method according to claim 1, characterized in that, The step of obtaining the feature parameters of each traffic element in the at least one traffic element includes: The road condition image is downsampled to obtain a thumbnail image; The semantic features of the thumbnail image are extracted using a lightweight perceptual operator to obtain the semantic information of the thumbnail image; Based on the semantic information of the thumbnail image, the feature parameters of each traffic element in the at least one traffic element are obtained.
5. The method according to any one of claims 1-4, characterized in that, The differential image processing of each region of interest in the road condition image based on the driving risk weights of each traffic element includes: A weight matrix for the road condition image is constructed based on the driving risk weights of each traffic element; and Differential image processing is performed on each region of interest in the road condition image based on the weight matrix; The road condition image comprises P*Q pixel blocks, and the weight matrix comprises P*Q matrix elements. The pixel block in the p-th row and q-th column of the road condition image corresponds to the matrix element in the p-th row and q-th column of the weight matrix, where P and Q are positive integers greater than 1, 1 ≤ p ≤ P, and 1 ≤ q ≤ Q; If the pixel block in row p and column q is located in the area where any of the traffic elements is located in the road condition image, then the value of the matrix element in row p and column q is the target value, and the target value is positively correlated with the driving risk weight of that traffic element. If the pixel block in the p-th row and q-th column is located in the background region, then the value of the matrix element in the p-th row and q-th column is the reference value, which is less than the target value. The background region is the area in the road condition image other than the areas where the traffic elements are located.
6. The method according to claim 5, characterized in that, The differential image processing based on the weight matrix for each region of interest in the road condition image includes: The weight matrix is parsed by a hardware scheduler to divide the road condition image into several image slices, and the image slices are then subjected to differentiated image processing. The processed image slices are combined to obtain the processed road condition image. The plurality of image slices include image slices of the target region of interest and image slices of non-target regions of interest; The target region of interest is the region in the road condition image where each traffic element with a driving risk weight greater than a preset weight threshold is located. The non-target region of interest includes the region in the road condition image where each traffic element with a driving risk weight not greater than the preset weight threshold is located and the background region.
7. The method according to claim 6, characterized in that, The differential image processing of the plurality of image slices includes: The deep neural network operator is invoked to perform nonlinear feature reconstruction on the image slices of the target region of interest. Linear processing is performed on the image slices of the non-target region of interest.
8. The method according to claim 7, characterized in that, The road condition image includes the m-th frame road condition image, where m is a positive integer not less than 1, and the image slice of the target region of interest includes the image slice of the target region of interest of the m-th frame image; The method further includes: During the process of nonlinear feature reconstruction of the image slices of the target region of interest in the m-th frame road condition image using the deep neural network operator, the intermediate layer feature map output by the intermediate layer of the deep neural network operator is extracted. Extract the semantic features of the intermediate layer feature map to obtain the semantic information of the intermediate layer feature map; Based on the semantic information of the intermediate layer feature map, the feature parameters of each traffic element in the m-th frame traffic image are re-obtained to obtain the updated feature parameters of each traffic element in the m-th frame traffic image. Based on the feature parameters of each traffic element in the updated m-th frame traffic image, the weight matrix of the m-th frame traffic image is updated to obtain the updated weight matrix. If the (m+1)th frame of the road condition image is captured by the camera, then differentiated image processing is performed on each region of interest in the (m+1)th frame of the road condition image based on the updated weight matrix.
9. A dashcam, characterized in that, The dashcam includes: At least one processor; and A memory communicatively connected to the at least one processor; wherein, The memory stores instructions executable by the at least one processor, which, when executed by the at least one processor, enables the at least one processor to perform the method according to any one of claims 1-8.
10. A non-volatile computer-readable storage medium, characterized in that, The computer-readable storage medium stores computer-executable instructions that, when executed, enable the execution of the method described in any one of claims 1-8.