An intelligent judgment system for obscured license plates of over-limit and overloaded road transportation vehicles
Through multi-source data fusion and intelligent analysis, efficient identification and accurate identity assessment of vehicles with obscured license plates have been achieved, solving the problems of low identification accuracy and incomplete evidence chains in existing technologies, and improving law enforcement efficiency and accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- NANJING GUANWEI INTELLIGENT SOFTWARE TECH CO LTD
- Filing Date
- 2026-03-24
- Publication Date
- 2026-06-19
Figure CN122244445A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of intelligent traffic enforcement technology, specifically to an intelligent analysis system for identifying overloaded and oversized vehicles that obscure their license plates.
Background Technology
[0002] In road traffic enforcement, overloading and obscuring license plates are common violations. Traditional enforcement methods rely on single devices (such as weighing systems or cameras) for data collection, which are limited in function and lack a complete chain of evidence, making it difficult to accurately identify vehicles, especially when license plates are obscured. Existing technologies, such as single-frame image-based recognition methods, suffer from poor robustness in complex scenarios involving target occlusion, changes in lighting, and differences in viewing angles, easily leading to tracking loss and identification errors.
[0003] For example, patent document CN120766260A discloses an overloaded vehicle license plate occlusion recognition system based on AI visual analysis of surveillance video. Its purpose is to trigger a multi-view camera in the acquisition layer to capture images of overloaded vehicles through dynamic weighing equipment, and accurately locate the license plate area by combining vehicle contour comparison; it uses a Gaussian kernel algorithm that integrates small-scale fine edges and large-scale coarse structure edges to adaptively enhance image clarity, and constructs a license plate generation matrix based on ORB feature matching and Hamming distance and selects the optimal combination to achieve intelligent reconstruction of occluded characters.
[0004] However, its multi-view image processing workflow and outdated feature extraction methods leave it with severely insufficient accuracy and practicality when identifying heavily occluded vehicles in real, complex road network enforcement scenarios. Specifically: 1. Locating the license plate region solely by comparing the vehicle contour against a fixed template implicitly assumes that all camera views are fixed and all vehicle postures are standard: the captured vehicle contour is compared with a pre-stored template, and the license plate region is mapped directly from the template to the current image. In actual road networks, cameras at different checkpoints vary significantly in installation position, angle, and focal length, and vehicles undergo posture changes during movement. This mapping method therefore introduces severe pixel-level misalignment, so that the character regions subsequently extracted from images taken from different viewpoints fail to correspond accurately.
[0005] 2. When license plates are severely obscured by mud, stickers, etc., the texture area available for extracting stable ORB feature points is significantly reduced, and the feature descriptions are easily rendered ineffective by interference from the occlusions. The license plate character matching stage of the cited solution calculates similarity using ORB feature matching combined with Hamming distance, which is a traditional local feature descriptor approach. Under severe occlusion, dirt, or low lighting in the license plate area, feature point detection and description are prone to failure, directly resulting in matching failure or mismatch. The cited solution performs ORB matching between the key region corresponding to the target and the blurred image. However, if the occlusion covers most of the license plate area, the number of ORB feature points decreases sharply, making stable matching impossible and causing the license plate generation matrix to fail or to generate incorrect license plate combinations.
[0006] Therefore, there is an urgent need for a system that can integrate multi-source data, achieve automatic evidence collection and intelligent analysis, and improve law enforcement efficiency and accuracy.
Summary of the Invention
[0007] To address the shortcomings of existing technologies, the present invention aims to provide an intelligent judgment system for identifying obscured license plates of overloaded and oversized vehicles in road transportation. Through multi-source data fusion, algorithm-driven judgment, and automated evidence collection, the system achieves real-time identification of illegal acts, accurate judgment of vehicle identity, and complete and secure evidence chain, thereby solving the problems mentioned in the background technology.
[0008] To achieve the above objectives, the present invention provides the following technical solution: an intelligent judgment system for detecting obscured license plates of overloaded and oversized vehicles in road transportation, comprising:
The multi-source data acquisition module, which, through integrated front-end sensing devices, collects and transmits vehicle information in real time, including multi-view images and associated structured metadata.
The visual feature extraction module, which receives image data and processes asynchronously captured multi-view image sequences to generate vehicle feature fingerprint data packets. The processing includes: first, parsing image sequences from multiple checkpoints, deriving initial displacement estimates between images based on the structured metadata and a vehicle motion model, and achieving sub-pixel-level spatial alignment of each frame in the image sequence relative to a selected reference frame; second, locating the license plate region and conducting a preliminary occlusion probability assessment on the aligned images, and reliably verifying the occlusion status of the license plate region through at least one physical verification mechanism, where, if the verification confirms occlusion, a pixel-level precision binary occlusion mask is generated; finally, based on the binary occlusion mask, extracting a vehicle global feature vector robust to occlusion interference.
The intelligent analysis module, which receives the vehicle feature fingerprint data packet, compares it with the pre-stored multi-source vehicle feature database, and outputs a list of candidate vehicle identities.
The evidence chain management module, which encapsulates the entire chain of evidence and provides visual review.
[0009] Compared with the prior art, the beneficial effects of the present invention are as follows: This invention analyzes metadata such as the geographic coordinates, lane direction angle, and timestamp of the checkpoint device, and estimates displacement priors by combining vehicle motion models. This guides multi-frame asynchronous image capture to achieve automatic alignment with sub-pixel accuracy. The method effectively eliminates scale, rotation, and translation transformations caused by differences in the deployment position, viewing angle, and focal length of different cameras in the road network, as well as the vehicle's own movement. It unifies images from multiple perspectives and times into the same geometric coordinate system, solving the problem of feature misalignment and matching failure caused by differences in viewing angle in traditional single-frame analysis.
[0010] This invention employs a lightweight network to quickly locate the license plate region and initially determine the occlusion probability. It then performs parallel uniformity analysis based on local texture variance and character continuity checks based on projection segmentation, forming a dual verification mechanism. Triggering either verification mechanism confirms the occlusion state and generates a pixel-level precision binary occlusion mask. During feature extraction, a weighting mechanism actively strengthens the feature contribution of unoccluded areas while simultaneously suppressing noise in occluded areas. This ensures that the extracted vehicle fingerprint can consistently identify the vehicle body and headlights, even if the license plate information is completely missing, resulting in a more discriminative and robust vehicle fingerprint.
Attached Figure Description
[0011] The disclosure of this invention is illustrated with reference to the accompanying drawings. It should be understood that the drawings are for illustrative purposes only and are not intended to limit the scope of protection of this invention. In the drawings, the same reference numerals are used to refer to the same parts. Wherein:
Figure 1 is a block diagram of the intelligent judgment system for obscured license plates of overloaded and oversized vehicles in road transportation proposed in one embodiment of the present invention;
Figure 2 is a schematic diagram of the processing flow of the image preprocessing unit proposed in one embodiment of the present invention;
Figure 3 is a schematic diagram of the process by which the intelligent occlusion determination unit proposed in one embodiment of the present invention performs reliable determination and fine annotation after image alignment is completed;
Figure 4 is a schematic diagram illustrating an example of images captured from the same vehicle according to one embodiment of the present invention.
Detailed Implementation
[0012] It is readily understood that, based on the technical solution of this invention, those skilled in the art can propose various interchangeable structural methods and implementations without altering the essential spirit of the invention. Therefore, the following detailed embodiments and accompanying drawings are merely illustrative examples of the technical solution of this invention and should not be considered as the entirety of the invention or as limitations or restrictions on the technical solution of this invention.
[0013] The present invention will be further described in detail below with reference to the accompanying drawings, but this is not intended to limit the scope of the invention.
[0014] As shown in Figures 1-3, this invention proposes an intelligent analysis system for detecting obscured license plates of overloaded vehicles in road transportation. The system comprises a multi-source data acquisition module, a visual feature extraction module, an intelligent analysis module, and an evidence chain management module connected in sequence. This enables fully automated processing from violation data collection, vehicle feature generation, and intelligent identity analysis through to electronic case file solidification. Specifically: The multi-source data acquisition module integrates and deploys front-end equipment such as dynamic weighing, video surveillance, radar speed measurement, and vehicle positioning to collect, in real time and synchronously, multi-dimensional raw data such as vehicle weight, speed, multi-view images, and continuous spatiotemporal trajectories.
[0015] The visual feature extraction module receives the above multi-view image sequence and its associated data. First, it performs intelligent registration and alignment on the images to eliminate the viewpoint changes caused by differences in device deployment. Then, it accurately detects the license plate area and determines the occlusion status, generating a refined occlusion mask. Finally, guided by this mask, it reconstructs and extracts the vehicle's deep feature fingerprint, which is highly robust to occlusion.
[0016] The intelligent analysis module efficiently compares extracted feature fingerprints with vehicle registration databases, violation history databases, and other sources, and performs correlation analysis based on vehicle spatiotemporal trajectories. By integrating multi-dimensional evidence such as feature similarity and behavioral consistency, it outputs high-confidence identification results and a ranked list of suspected vehicles.
[0017] The evidence chain management module automatically and structurally integrates, digitally signs, and encapsulates the entire chain of evidence generated in the aforementioned stages, including images, data, features, and analytical conclusions, forming a standardized and tamper-proof electronic case file. Simultaneously, a visual review interface assists law enforcement personnel in completing evidence review and disposal decisions.
[0018] In one embodiment of the present invention, the proposed multi-source data acquisition module is used to provide comprehensive and real-time data sources. It mainly collects multi-dimensional data such as vehicle weight, speed, images and continuous trajectories in real time through dynamic weighing systems, video surveillance, license plate recognition cameras, radar speed measurement equipment, vehicle-mounted Beidou / GPS terminals, etc.; and transmits the collected video streams, images and related data in real time and reliably to ensure that the subsequent processing modules can obtain the original information in a timely manner.
[0019] In real-world vehicle analysis scenarios, challenges arise due to scale and perspective distortions caused by vehicle movement, drastic lighting changes (such as day-night cycles and sudden changes in tunnel brightness), and the prevalent issue of occlusion. In particular, intentional occlusion of license plates directly leads to the loss of crucial identification information. Furthermore, the presence of similar background interference and inter-class similarity in complex road networks makes traditional methods based on single frames or simple models highly susceptible to tracking loss and identification errors.
[0020] To address the problems caused by factors such as multiple sources, heterogeneity, and occlusion, the proposed visual feature extraction module includes an image preprocessing unit, an intelligent occlusion determination unit, and a feature vector extraction unit.
[0021] In one embodiment of the present invention, the image preprocessing unit first receives a sequence of suspected occluded vehicle images transmitted by the multi-source data acquisition module, and is responsible for performing preliminary occlusion detection, image quality enhancement, and crucial multi-frame image alignment. The specific process is as follows:
Step 1: Data Normalization Preprocessing
In this step, structured image data packets are received via a high-speed interface. Each data packet corresponds to an independent vehicle tracking event and contains M original images of the same vehicle asynchronously captured by N checkpoint devices within a short period of time. Each image frame is accompanied by structured metadata including device ID, high-precision timestamp, geographic coordinates, and lane direction angle.
[0022] First, to effectively preserve structural information while reducing image processing complexity, each frame of color image is converted into a grayscale image. This conversion process employs a weighted model that conforms to the characteristics of human visual perception. Specifically, the intensity values of the red, green, and blue channels of each pixel are assigned weights of 0.299, 0.587, and 0.114 respectively, and then summed to calculate the corresponding grayscale value. The intensity values of the red, green, and blue channels are all integers between 0 and 255.
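As a non-limiting sketch, the weighted grayscale conversion described above can be written as follows. The function names and the rounding of the result to an integer are assumptions made for illustration; the text only specifies the three channel weights and the 0-255 channel range.

```python
def rgb_to_gray(pixel):
    """Weighted sum of one (R, G, B) pixel whose channels are integers in 0..255."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def image_to_gray(image):
    """Convert a row-major image of (R, G, B) tuples to rows of grayscale values.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return [[round(rgb_to_gray(px)) for px in row] for row in image]


# Tiny 2x2 example frame: white, red / green, blue.
frame = [[(255, 255, 255), (255, 0, 0)],
         [(0, 255, 0), (0, 0, 255)]]
gray = image_to_gray(frame)
```

The green channel dominates the result, which matches the stated intent of following human visual perception.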
[0023] Simultaneously, the structured metadata attached to each frame is parsed and extracted, and key prior physical quantities are calculated, including the time difference between adjacent frames and the geographical distance between checkpoint devices; from these, the average driving speed of the vehicle between road segments is estimated.
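By way of illustration only, the prior quantities named above could be derived from two frames' metadata as sketched below. The `FrameMeta` record, its field names, and the use of the haversine great-circle formula for the geographical distance are assumptions of this sketch; the text does not state how the distance is computed.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used by the haversine formula


@dataclass
class FrameMeta:
    """Hypothetical per-frame metadata record (field names are illustrative)."""
    device_id: str
    timestamp_s: float      # capture time in seconds
    lat_deg: float          # checkpoint latitude
    lon_deg: float          # checkpoint longitude
    lane_angle_deg: float   # lane direction angle


def haversine_m(a, b):
    """Great-circle distance in metres between the checkpoints of two frames."""
    phi1, phi2 = math.radians(a.lat_deg), math.radians(b.lat_deg)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon_deg - a.lon_deg)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def adjacent_priors(a, b):
    """Return (time difference, checkpoint distance, average speed) for frames a -> b."""
    dt = b.timestamp_s - a.timestamp_s
    dist = haversine_m(a, b)
    v_avg = dist / dt if dt > 0 else float("nan")
    return dt, dist, v_avg


cam_a = FrameMeta("cam-A", 0.0, 0.0, 0.0, 90.0)
cam_b = FrameMeta("cam-B", 50.0, 0.0, 0.01, 90.0)
dt, dist, v_avg = adjacent_priors(cam_a, cam_b)
```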
[0024] Step 2: Select a reference frame
Due to differences in viewing angle and focal length between different checkpoint devices, as well as attitude changes caused by the vehicle's own movement, the original image sequence exhibits significant spatial misalignment (such as translation, rotation, and scale differences) on the image plane. To establish a unified geometric reference frame, a frame needs to be selected from the image sequence as a reference, and all subsequent images need to be aligned to this reference.
[0025] The sharpness evaluation function $Q_{\mathrm{clarity}}$, based on gradient statistics, is used to evaluate each frame of the image. For any grayscale image $I_t$, its sharpness score is calculated as follows: First, the Sobel operator is used to compute the gradient vector of the image at each pixel coordinate. Its amplitude is: $$G(x,y) = \sqrt{G_x(x,y)^2 + G_y(x,y)^2}$$ In the formula, $G_x$ and $G_y$ are the gradient components in the horizontal and vertical directions, respectively.
[0026] Secondly, calculate the arithmetic mean of the gradient magnitudes of all pixels in the entire image. The higher this score, the clearer the image edges and texture details, and the better the overall visual quality.
[0027] Traverse all M frames in the sequence and calculate the Qclarity score for each frame, resulting in a set {Q1, Q2, ..., QM}. Select the frame with the highest score as the geometric reference frame Iref, and establish its image coordinate system as the unified reference coordinate system for the entire processing flow to ensure the consistency of all subsequent spatial transformations.
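The sharpness scoring and reference frame selection can be sketched as follows, as one possible reading of the description. Evaluating the Sobel operator on interior pixels only (skipping the one-pixel border) is an assumption of this sketch, as are the function names.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def q_clarity(gray):
    """Mean Sobel gradient magnitude over the interior pixels of a grayscale image."""
    h, w = len(gray), len(gray[0])
    total, count = 0.0, 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    v = gray[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * v
                    gy += SOBEL_Y[j][i] * v
            total += math.hypot(gx, gy)
            count += 1
    return total / count if count else 0.0


def select_reference(frames):
    """Return (index of the sharpest frame, list of all sharpness scores)."""
    scores = [q_clarity(f) for f in frames]
    best = max(range(len(frames)), key=scores.__getitem__)
    return best, scores


flat = [[10] * 5 for _ in range(5)]              # no edges at all
edge = [[0, 0, 255, 255, 255] for _ in range(5)]  # one vertical step edge
ref_index, scores = select_reference([flat, edge])
```

A frame with no edges scores zero, so any frame containing visible structure is preferred as the reference.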
[0028] Step 3: Extract robust high-level semantic features
Raw pixel matching is sensitive to illumination, noise, and deformation. To overcome this limitation, a VGG16 convolutional neural network pre-trained on a large-scale image dataset is used to map images from the variable pixel space to a more discriminative and robust deep semantic feature space.
[0029] Each grayscale image $I_t$ is resampled to the standard input size of the VGG16 network, typically 224×224 pixels, using bilinear interpolation. Since VGG16 is a three-channel input network, the system copies the single-channel grayscale value to the three color channels, constructing a pseudo-RGB image $I_t^{\mathrm{RGB}}$. This image is input to a restructured VGG16 network (the fully connected layers used for classification have been removed, retaining only the feature extraction backbone consisting of convolutional and pooling layers). Forward propagation is performed, and the outputs of the network's intermediate layers are extracted as the feature representation $F_t$ for this frame: $$F_t = \Phi_{\mathrm{VGG16}}\left(I_t^{\mathrm{RGB}}\right) \in \mathbb{R}^{H_f \times W_f \times D}$$ In the formula, $\Phi_{\mathrm{VGG16}}$ represents the mapping function from the input image to the selected feature layer. The output feature map $F_t$ is a three-dimensional tensor whose dimensions $H_f$ (height), $W_f$ (width), and $D$ (number of channels) are determined by the selected network layer. Typical parameter values are $H_f=14$, $W_f=14$, and $D=512$. The first two dimensions correspond to the spatial arrangement of features, and the third dimension corresponds to different semantic feature channels.
[0030] Perform the same operation on all M frames to obtain a feature map set {F1,F2,...,FM}, where the feature map corresponding to the reference frame Iref is labeled Fref, which prepares for subsequent feature matching and image correspondence.
[0031] Step 4: Feature matching and displacement estimation based on prior constraints
To improve computational efficiency, the prior physical information provided by the metadata attached to the image is used to reasonably constrain the displacement search range of feature matching, thereby reducing the global search that would otherwise be performed over the entire feature map to fast matching within a small local area.
[0032] First, based on the geographical distance $D_{\mathrm{station}}$ between adjacent checkpoint devices, the estimated average vehicle speed $V_{\mathrm{avg}}$, and the time difference $\Delta t$ between adjacent frames, the displacement amplitude on the original image plane caused by physical motion is estimated. The estimate uses a conversion factor from actual physical distance to image pixel distance, whose value is determined by the installation parameters of the monitoring equipment and the image resolution.
[0033] During image matching, road topology information and metadata such as lane direction angle θlane are combined to estimate the displacement amplitude. The displacement is decomposed into the horizontal and vertical directions of the image coordinate system to form an initial two-dimensional displacement vector estimate dinitial=(ΔXest,ΔYest).
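A minimal sketch of how an initial two-dimensional displacement prior might be formed is given below. It rests on stated assumptions: the amplitude is taken as conversion factor × average speed × time difference, and the lane direction angle is measured from the image's horizontal axis. Neither the exact amplitude formula nor the angle convention is given in the text, and the parameter names are hypothetical.

```python
import math


def initial_displacement(v_avg_mps, dt_s, px_per_m, lane_angle_deg):
    """Return an assumed prior (dX_est, dY_est) in original-image pixels.

    ASSUMPTIONS of this sketch (not taken from the text):
      * amplitude = px_per_m * v_avg_mps * dt_s
      * lane_angle_deg is measured from the image x-axis towards the y-axis
    """
    amplitude_px = px_per_m * v_avg_mps * dt_s
    theta = math.radians(lane_angle_deg)
    return amplitude_px * math.cos(theta), amplitude_px * math.sin(theta)


# Example: 20 m/s for 0.04 s at 50 px per metre, lane aligned with the x-axis.
d_initial = initial_displacement(20.0, 0.04, 50.0, 0.0)
```

Whatever the exact formula, the point of the prior is only to centre a small search window, so moderate errors in it are absorbed by the subsequent matching step.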
[0034] Subsequently, this initial vector is scaled to the feature map scale according to the spatial resolution ratio between the feature map and the original image, and a rectangular region is defined around its coordinate position as the constrained search region, the size of which is determined by ... together with the prediction error. This limits the matching calculation to this local range. Furthermore, within the constrained search range, a normalized cross-correlation algorithm is used to calculate the similarity of the current feature map $F_t$ relative to the reference frame feature map $F_{\mathrm{ref}}$ at each candidate displacement $(\Delta u, \Delta v)$. The normalized correlation coefficient $\rho(\Delta u, \Delta v)$ is calculated as follows: $$\rho(\Delta u,\Delta v)=\frac{\sum_{i,j,d}\left(F_{\mathrm{ref}}(i,j,d)-\bar{F}_{\mathrm{ref}}\right)\left(F_t(i+\Delta u,j+\Delta v,d)-\bar{F}_t\right)}{\sqrt{\sum_{i,j,d}\left(F_{\mathrm{ref}}(i,j,d)-\bar{F}_{\mathrm{ref}}\right)^2\,\sum_{i,j,d}\left(F_t(i+\Delta u,j+\Delta v,d)-\bar{F}_t\right)^2}}$$ In the formula, $(i,j)$ traverses all spatial positions of the feature map, and $d$ traverses all feature channels. $\bar{F}_{\mathrm{ref}}$ and $\bar{F}_t$ represent the average feature values of the reference frame and the current frame within the calculation window, respectively.
[0035] In this embodiment, the correlation coefficient ρ lies in the range [-1, 1]. The closer ρ is to 1, the more similar the patterns of the two feature windows; the closer it is to -1, the more opposite the patterns; and the closer it is to 0, the weaker their linear relationship. In actual matching, the search aims to maximize ρ.
[0036] Traverse all integer candidate displacements (Δu, Δv) within the constrained search region, calculate the corresponding ρ value, and find the integer pixel displacement (Δu*, Δv*) that maximizes ρ. This position is used as the initial matching position obtained on the feature map scale.
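The constrained integer search can be illustrated with the sketch below. It assumes feature maps stored as nested lists indexed `[row][column][channel]`, takes Δu as the column (horizontal) offset and Δv as the row (vertical) offset, and evaluates the correlation over the overlapping part of the two maps. These conventions, the square search window, and the function names are assumptions made for illustration.

```python
import math


def ncc_at(f_ref, f_cur, du, dv):
    """Normalized correlation between f_ref[r][c] and f_cur[r + dv][c + du]."""
    rows, cols, chans = len(f_ref), len(f_ref[0]), len(f_ref[0][0])
    ref_vals, cur_vals = [], []
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dv, c + du
            if 0 <= rr < rows and 0 <= cc < cols:  # keep only the overlap
                for d in range(chans):
                    ref_vals.append(f_ref[r][c][d])
                    cur_vals.append(f_cur[rr][cc][d])
    if len(ref_vals) < 2:
        return 0.0
    m_ref = sum(ref_vals) / len(ref_vals)
    m_cur = sum(cur_vals) / len(cur_vals)
    num = sum((a - m_ref) * (b - m_cur) for a, b in zip(ref_vals, cur_vals))
    den = math.sqrt(sum((a - m_ref) ** 2 for a in ref_vals)
                    * sum((b - m_cur) ** 2 for b in cur_vals))
    return num / den if den > 0 else 0.0


def best_integer_shift(f_ref, f_cur, prior, radius):
    """Search integer shifts within +/- radius of the prior; return (rho, du, dv)."""
    u0, v0 = prior
    best = None
    for du in range(u0 - radius, u0 + radius + 1):
        for dv in range(v0 - radius, v0 + radius + 1):
            rho = ncc_at(f_ref, f_cur, du, dv)
            if best is None or rho > best[0]:
                best = (rho, du, dv)
    return best
```

With a search radius of 2 around the prior, only 25 candidate displacements are evaluated instead of every possible offset on the feature map.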
[0037] Step 5: Optimize sub-pixel level displacement accuracy
Since the spatial resolution of the feature map is significantly lower than that of the original image, typically 1/16 to 1/32, integer pixel displacements (Δu*, Δv*) alone are insufficient to meet the requirements for high-quality image alignment. This quantization error, when directly mapped to the original image space, can lead to alignment deviations of several pixels, severely impacting subsequent processing. Therefore, a quadratic surface fitting is performed on the sampled correlation coefficient values within the local neighborhood of the integer pixel displacement (Δu*, Δv*).
[0038] In practice, a 5×5 neighborhood window is selected around the integer pixel displacement point $(\Delta u^*, \Delta v^*)$, and the correlation coefficients $\rho(\Delta u, \Delta v)$ at the corresponding 25 integer coordinate points within this window are obtained. A quadratic polynomial surface model is used to fit the sampling points: $$\hat{\rho}(\delta u, \delta v) = a\,\delta u^2 + b\,\delta v^2 + c\,\delta u\,\delta v + d\,\delta u + e\,\delta v + f$$ In the formula, $(\delta u, \delta v) = (\Delta u - \Delta u^*, \Delta v - \Delta v^*)$ is the offset from the integer pixel displacement, and $a, b, c, d, e, f$ are the fitting coefficients to be determined.
[0039] Furthermore, to improve the fitting accuracy, the weighted least squares method is used to solve for the fitting coefficients, and the weighting function adopts a Gaussian kernel function centered at $(\Delta u^*, \Delta v^*)$: $$w(\Delta u, \Delta v) = \exp\left(-\frac{(\Delta u - \Delta u^*)^2 + (\Delta v - \Delta v^*)^2}{2\sigma^2}\right)$$ In the formula, $\sigma$ controls the weight decay rate and is set to 0.5-1.0 feature map pixels. This weight setting gives the sampled data closer to the center integer pixel a greater impact on the surface fitting.
[0040] After obtaining the optimal fitting coefficients $\{a,b,c,d,e,f\}$, the extremum of the fitted surface is solved. Setting the gradient to zero yields the sub-pixel level displacement correction analytically: $$\delta u^* = \frac{c\,e - 2\,b\,d}{4\,a\,b - c^2}, \qquad \delta v^* = \frac{c\,d - 2\,a\,e}{4\,a\,b - c^2}$$
[0041] Adding the initial integer pixel displacement to the sub-pixel correction amount yields the high-precision final displacement at the feature map scale. This improves the matching accuracy to the level of 0.1 pixels.
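The sub-pixel refinement can be sketched as a Gaussian-weighted least squares fit of a quadratic surface to the 5×5 correlation window, followed by solving for the point of zero gradient. The ordering of the six coefficients, the window layout (`window[row][col]`, centre at `[2][2]`, columns as the Δu offset and rows as the Δv offset), the default σ of 0.75, and the fallback to a zero correction when the fit is not a strict maximum are all assumptions of this sketch.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def subpixel_correction(window, sigma=0.75):
    """Return (du_corr, dv_corr) from a 5x5 window of correlation values."""
    ata = [[0.0] * 6 for _ in range(6)]
    atb = [0.0] * 6
    for row in range(5):
        for col in range(5):
            x, y = col - 2, row - 2                      # offsets from the integer peak
            w = math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            basis = (x * x, y * y, x * y, x, y, 1.0)     # a, b, c, d, e, f terms
            for p in range(6):
                atb[p] += w * basis[p] * window[row][col]
                for q in range(6):
                    ata[p][q] += w * basis[p] * basis[q]
    a, b, c, d, e, _f = solve_linear(ata, atb)
    det = 4 * a * b - c * c
    if det <= 0 or a >= 0:        # fitted surface has no strict maximum
        return 0.0, 0.0
    return (c * e - 2 * b * d) / det, (c * d - 2 * a * e) / det
```

Adding the returned correction to the integer displacement gives the refined displacement at the feature map scale.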
[0042] Step 6: Geometric Transformation and Resampling in Image Space
Based on the sampling ratio between the feature map and the original image, the high-precision displacement is mapped back to the original image space, and the final geometric transformation is performed to generate an image sequence that is precisely aligned with the reference frame.
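[0043] Let the spatial resolution of the original image be W×H (width×height), and the spatial resolution of the feature map be Wf×Hf. Then the scaling factors from the feature map coordinates to the original image coordinates are: $$s_x = \frac{W}{W_f}, \qquad s_y = \frac{H}{H_f}$$
[0044] Calculate the displacement at the original image scale based on the scaling factors: $$\Delta x = s_x \cdot \Delta u_f, \qquad \Delta y = s_y \cdot \Delta v_f$$ In the formula, $(\Delta u_f, \Delta v_f)$ denotes the high-precision final displacement at the feature map scale.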
[0045] The displacements (Δx, Δy) mentioned above represent the distances that each pixel in the current frame needs to be translated in the horizontal and vertical directions to align with the reference frame.
[0046] Furthermore, using the coordinate system of the reference frame Iref as a unified target, the aligned image is calculated by reverse mapping: for each target pixel with coordinates (x, y), its corresponding source coordinates (x-Δx, y-Δy) in the original grayscale image $I_t$ are found: $$I_t^{\mathrm{aligned}}(x, y) = I_t(x - \Delta x,\; y - \Delta y)$$
[0047] Since the inverse mapping coordinates are usually non-integer, a bicubic interpolation algorithm is typically used to perform weighted calculations based on the pixel values of the surrounding 4×4 neighborhood to obtain an image sequence with sub-pixel-level precise positioning.
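The mapping back to image space and the resampling can be sketched as follows. The sketch uses the Keys cubic convolution kernel with parameter -0.5 as one common form of bicubic interpolation and clamps coordinates at the image border; the text does not specify either choice, so both are assumptions, as are the function names.

```python
import math


def cubic_kernel(t, a=-0.5):
    """Keys cubic convolution kernel; a = -0.5 is an assumed, common choice."""
    t = abs(t)
    if t <= 1:
        return (a + 2) * t ** 3 - (a + 3) * t ** 2 + 1
    if t < 2:
        return a * t ** 3 - 5 * a * t ** 2 + 8 * a * t - 4 * a
    return 0.0


def sample_bicubic(img, x, y):
    """Interpolate img at real-valued (x, y) from the surrounding 4x4 neighborhood."""
    h, w = len(img), len(img[0])
    x0, y0 = math.floor(x), math.floor(y)
    total = 0.0
    for m in range(-1, 3):
        wy = cubic_kernel(y - (y0 + m))
        yy = min(max(y0 + m, 0), h - 1)          # clamp rows at the border
        for n in range(-1, 3):
            wx = cubic_kernel(x - (x0 + n))
            xx = min(max(x0 + n, 0), w - 1)      # clamp columns at the border
            total += wx * wy * img[yy][xx]
    return total


def align_frame(img, shift_feat, feat_size):
    """Scale a feature-map shift (du, dv) to image pixels and inverse-map the frame."""
    h, w = len(img), len(img[0])
    wf, hf = feat_size
    dx = shift_feat[0] * w / wf                  # s_x = W / Wf
    dy = shift_feat[1] * h / hf                  # s_y = H / Hf
    return [[sample_bicubic(img, x - dx, y - dy) for x in range(w)]
            for y in range(h)]
```

For example, with an 8-pixel-wide image and a 4-cell-wide feature map, a feature-scale shift of 0.5 corresponds to a one-pixel shift in the image.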
[0048] In this embodiment, the intelligent occlusion determination unit, after completing image alignment, accurately locates the license plate area from each frame of the image based on the image sequence, and reliably determines and finely marks whether the area is occluded.
[0049] Step 1: License Plate Region Detection Based on Deep Learning
To achieve accurate license plate area recognition and preliminary occlusion judgment, a lightweight target detection network based on the MobileNetV3 architecture is adopted, which ensures high detection accuracy while having low computational complexity, making it suitable for real-time scenarios.
[0050] The network input is a fixed-size RGB image. A pseudo-RGB input is constructed by copying the single-channel grayscale value to three color channels, and a feature pyramid structure is adopted to adapt to the change of license plate size at different imaging distances.
[0051] The network output contains two independent branches: a bounding box regression branch that predicts the rectangular coordinates of the license plate region, and a classification branch that assesses the probability of occlusion in the region. Specifically: The bounding box regression branch outputs a license plate region parameter $R_t$ for each frame of the image, defined in rectangular coordinates. The coordinate system has its origin (0,0) at the top left corner of the image, with the positive x-axis pointing to the right and the positive y-axis pointing downwards.
[0052] Meanwhile, the classification branch outputs the occlusion probability $P_b$. This value is obtained through the Softmax function: $$P_b = \frac{e^{Z_{oc}}}{e^{Z_c} + e^{Z_{oc}}}$$ In the formula, $Z_c$ and $Z_{oc}$ are the original scores of the network output layer for the clear and occluded states, respectively. $P_b$ takes values in [0,1]. The closer the value is to 1, the higher the initial confidence that the license plate area is obstructed.
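As a brief illustration, the two-class Softmax that turns the clear and occluded scores into an occlusion probability can be computed as below. Subtracting the larger score before exponentiating is a standard numerical-stability step added in this sketch; the function and parameter names are illustrative.

```python
import math


def occlusion_probability(z_clear, z_occluded):
    """Two-class softmax: probability assigned to the occluded state."""
    m = max(z_clear, z_occluded)          # shift scores to avoid overflow in exp
    e_clear = math.exp(z_clear - m)
    e_occ = math.exp(z_occluded - m)
    return e_occ / (e_clear + e_occ)


p_equal = occlusion_probability(1.2, 1.2)       # equal scores
p_high = occlusion_probability(-2.0, 3.0)       # occluded score dominates
```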
[0053] Step 2: Occlusion state determination based on dual verification mechanism
To improve the robustness of occlusion determination and avoid misjudgment by a single model, a dual physical verification mechanism is adopted to cross-validate the area defined by the license plate region parameters obtained in Step 1. If either verification determines that the license plate area is obscured, the license plate area of that frame is marked as obscured. The first verification is texture uniformity analysis. Normal license plates show drastic local grayscale changes and complex textures because characters alternate with the background color. In contrast, obscured areas (such as those covered by stickers or mud) tend to have smooth grayscale changes and uniform textures, resulting in significantly reduced local grayscale variation.
[0054] In practice, the grayscale sub-image within the detected rectangular license plate area is first extracted. The grayscale value $I_i$ of each pixel within that region is linearly normalized to the range [0,1], and the average grayscale value $\mu$ is calculated. Then, the local contrast variance $\sigma_{\mathrm{tex}}^2$ of the region is calculated: $$\mu = \frac{1}{N}\sum_{i=1}^{N} I_i, \qquad \sigma_{\mathrm{tex}}^2 = \frac{1}{N}\sum_{i=1}^{N}\left(I_i - \mu\right)^2$$ In the formula, $N$ is the total number of pixels in the region. A higher $\sigma_{\mathrm{tex}}^2$ indicates a more uneven texture, making the region more likely to be a normal license plate; a lower value indicates a more uniform texture, making it more likely to be obscured. Based on statistical analysis of a large number of labeled samples, an empirical threshold $T_{\mathrm{variance}}$ can be set. When $\sigma_{\mathrm{tex}}^2 < T_{\mathrm{variance}}$, the texture is considered abnormally uniform, indicating occlusion. This threshold can be dynamically adjusted according to the license plate style and imaging characteristics in the actual scene to adapt to the differences in texture characteristics of license plates from different regions and of different models.
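The texture uniformity check can be sketched as follows. Normalizing by dividing 8-bit values by 255 and the threshold value 0.01 are assumptions made purely for illustration; the text states only that values are linearly normalized to [0,1] and that the threshold is set empirically.

```python
def texture_variance(region):
    """Variance of the normalized grayscale values of a plate region.

    ASSUMPTION: 8-bit input, normalized to [0, 1] by dividing by 255.
    """
    pixels = [v / 255.0 for row in region for v in row]
    n = len(pixels)
    mean = sum(pixels) / n
    return sum((v - mean) ** 2 for v in pixels) / n


def occluded_by_texture(region, t_variance=0.01):
    """True when the texture is abnormally uniform (variance below the threshold).

    The default threshold is a hypothetical placeholder, not a value from the text.
    """
    return texture_variance(region) < t_variance


uniform_patch = [[120] * 6 for _ in range(3)]            # e.g. a sticker or mud
character_patch = [[0, 255] * 3, [255, 0] * 3]           # strong character contrast
```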
[0055] While normal license plates can be correctly segmented into individual characters, occlusions in the license plate area often disrupt character continuity or introduce abnormal blank areas, leading to character segmentation algorithm failure or abnormal segmentation results. In practice, a joint vertical and horizontal projection segmentation method is used within the detected license plate area to segment the license plate characters. The specific steps include: The license plate area image is binarized, and an adaptive thresholding method is used to determine the segmentation thresholds for the foreground (characters) and background (background color). The vertical projection histogram is calculated, and the number of foreground points in each column of pixels is counted. The vertical segmentation lines between characters are determined by finding the troughs of the projection histogram. For each vertically segmented candidate character region, the horizontal projection histogram is calculated to determine the upper and lower boundaries of the character.
[0056] Furthermore, anomalies in the spacing between consecutive characters during the segmentation process are investigated. First, the average spacing of all characters is calculated. Then, the intervals between consecutive characters are checked for any that exceed the set average value. If such abnormal intervals exist, the character segmentation failure rate is determined to be too high, indicating that there may be obstruction or interference in the license plate area that prevents the characters from being segmented normally.
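A simplified sketch of the projection-based continuity check is given below. It covers only the vertical projection and the spacing test on an already binarized plate image (1 for character pixels, 0 for background). Treating a gap as abnormal when it exceeds a multiple of the average gap, the multiple of 2.0, and treating fewer than two detected characters as a segmentation failure are assumptions of this sketch.

```python
def vertical_projection(binary):
    """Number of foreground pixels in each column of a 0/1 image."""
    return [sum(col) for col in zip(*binary)]


def character_spans(projection, min_count=1):
    """Column ranges (start, end) where the projection reaches min_count."""
    spans, start = [], None
    for x, count in enumerate(projection):
        if count >= min_count and start is None:
            start = x
        elif count < min_count and start is not None:
            spans.append((start, x - 1))
            start = None
    if start is not None:
        spans.append((start, len(projection) - 1))
    return spans


def has_abnormal_gap(spans, ratio=2.0):
    """True if any inter-character gap exceeds ratio x the average gap.

    ASSUMPTIONS: the ratio is a hypothetical setting, and finding fewer than
    two characters is treated as a segmentation failure.
    """
    gaps = [b[0] - a[1] - 1 for a, b in zip(spans, spans[1:])]
    if not gaps:
        return True
    mean_gap = sum(gaps) / len(gaps)
    return any(g > ratio * mean_gap for g in gaps)


normal_row = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
covered_row = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
normal_spans = character_spans(vertical_projection([normal_row] * 3))
covered_spans = character_spans(vertical_projection([covered_row] * 3))
```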
[0057] Step 3: Occlusion Status Marking and Mask Generation
In existing technologies, occlusion interference prevents the system from calculating reliable character associations within the license plate generation matrix. To address this issue, the present invention adopts the following approach. For frames identified as occluded, a refined binary occlusion mask image is generated to precisely pinpoint the pixel-level location of the occlusion, rather than simply labeling the entire license plate area. By distinguishing between clear and occluded portions within the license plate area, a weighted mechanism actively strengthens the feature contribution of unoccluded areas during feature extraction, while simultaneously suppressing noise in occluded areas. This ensures that the extracted vehicle fingerprint can consistently identify stable components such as the vehicle body and headlights. Even if license plate information is completely missing, high-confidence identity assessment can still be performed based on other stable features of the vehicle, significantly improving the ability to handle severe occlusion situations.
[0058] The system generates a binary occlusion mask image of the same resolution as the original image. The mask generation strategy is selected according to the verification mechanism that triggered the determination: if it was triggered by texture uniformity analysis, the connected regions within the license plate region whose local texture variance consistently falls below a threshold are designated as occluded areas; if it was triggered by character segmentation failure, the abnormal locations in the projection histogram are mapped to the potentially occluded areas within the region.
[0059] Finally, the binary occlusion mask image O_t(p, q) corresponding to each frame and the license plate region parameters R_t are output. Together, they constitute the license plate location and occlusion spatial information required for feature extraction.
[0060] In this embodiment, the feature vector extraction unit finally uses the output license plate region parameters R_t and the binary occlusion mask image O_t(p, q) to extract, from the aligned and denoised image sequence, vehicle appearance feature vectors that are strongly discriminative and robust to occlusion. The specific implementation steps are as follows: First, taking the detected license plate rectangle R_t as the center, expand it outward by 20% of its side length to form a larger local vehicle cropping area that includes the license plate and the surrounding stable body components. Then, use the mask O_t(p, q) to generate the feature extraction weight map W_t(x, y) for this region. The assignment strategy is as follows: a. Within the local vehicle cropping area, R_t and the mask O_t(p, q) are used to filter and extract all clear, unoccluded pixels located within the license plate area, forming the key feature source point set S_c; this localizes the crucial clear pixel sources. In the formula, a mask value of 0 indicates a clear (foreground) pixel, and 255 or any other non-zero value indicates occlusion.
[0061] b. For each pixel within the vehicle's local cropping area, a dynamic base contribution that varies continuously with its spatial location is calculated. The weighting is designed so that pixels closer to a reliable information source (clear license plate pixels) and with a stronger visual association have a higher potential contribution to the final vehicle feature representation. The specific calculation is as follows: for each target pixel (x, y) in the cropped image, its dynamic initial weight is obtained by accumulating the individual contribution of each sharp source point (p, q) in the key feature source point set S_c. In the formula, Δx and Δy are the coordinate differences between the target pixel and the source pixel in the horizontal and vertical directions of the image, respectively, where the vertical difference corresponds to the offset between character lines; r_qp is the Euclidean distance, used to quantify the spatial distance between the target pixel and the sharp source point; σ_s is the spatial attenuation coefficient, set to 0.5 times the average character width to ensure consistency of local features; α is the contribution amplitude normalization coefficient, set to 1.0; and β is the attenuation adjustment factor, set to 0.5. The denominator term ensures that when the distance r_qp is large the contribution decreases slowly in an inverse relationship, so that pixels in the extended area, such as the headlights and the air intake grille, still obtain a non-zero but low weight based on their spatial relationship even though they are far away.
[0062] c. Output the final weight map used to guide deep feature extraction: iterate through each pixel (x, y) in the cropped image and apply the above formula to generate a continuous and smooth dynamic weight distribution map; then, for each target pixel location, query the corresponding binary occlusion mask value O_t(x, y). Specifically, a pixel marked as occluded by the mask (O_t(x, y) = 255) is penalized by multiplying its initial weight by a very small penalty factor, such as 0.05, which reduces its weight to near zero and so keeps the contribution of the occluded pixel very low; a pixel marked as clear by the mask (O_t(x, y) = 0) retains its original weight, which completes the occlusion correction of the weights; similarly, clear pixels in the extended area (the cropped area minus the license plate area) retain the dynamic weight calculated in step b. Normalization is then performed to map the values to the range [0, 1], yielding the feature extraction weight map W_t(x, y).
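Steps a to c can be sketched as follows. The contribution formula itself is not reproduced in the text above, so the kernel `alpha / (1 + beta * r / sigma_s)` used here is only an assumed form chosen to match the description (inverse decay with distance, never reaching zero). The function name, the `plate_box` layout, and the division by the maximum as the normalization step are likewise assumptions.

```python
import numpy as np

def weight_map(mask, plate_box, sigma_s, alpha=1.0, beta=0.5, penalty=0.05):
    """Build a feature extraction weight map in [0, 1] for one cropped frame.

    mask:      2-D uint8 array in crop coordinates, 0 = clear, non-zero = occluded.
    plate_box: (x0, y0, x1, y1) license plate rectangle inside the crop (ASSUMED layout).
    sigma_s:   spatial attenuation coefficient (0.5 x average character width in the text).
    """
    h, w = mask.shape
    x0, y0, x1, y1 = plate_box
    inside = np.zeros((h, w), dtype=bool)
    inside[y0:y1, x0:x1] = True
    # Step a: clear pixels inside the plate form the source point set S_c.
    src_y, src_x = np.nonzero(inside & (mask == 0))
    ys, xs = np.mgrid[0:h, 0:w]
    weights = np.zeros((h, w), dtype=float)
    # Step b: accumulate the contribution of every source point (p, q).
    for q, p in zip(src_y, src_x):
        r = np.hypot(xs - p, ys - q)                     # Euclidean distance r_qp
        weights += alpha / (1.0 + beta * r / sigma_s)    # ASSUMED kernel form
    # Step c: penalize occluded pixels, then normalize to [0, 1].
    weights[mask != 0] *= penalty
    peak = weights.max()
    return weights / peak if peak > 0 else weights
```

When the plate is fully occluded, S_c is empty in this sketch and the map is all zeros; paragraph [0066] describes drawing the source points from clear body parts in that case, which a caller would have to supply separately.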
[0063] Secondly, the cropped region is resized to a fixed size, such as 512×512 pixels, and then input into a pre-trained ResNet50 network to extract multi-scale feature maps, one per layer (where l is the layer index).
[0064] Next, the feature maps of each layer are restored to the same scale as the cropped region using bilinear interpolation. Based on the feature extraction weight map, a weighted average is calculated for each channel of each feature map layer to obtain a weighted feature vector for each layer. Subsequently, each weighted feature vector is normalized using the L2 norm, and the normalized vectors of all layers are concatenated sequentially to form a feature vector representing the appearance of a single frame of the vehicle. Finally, temporal average pooling is performed on the feature vectors of all frames, and principal component analysis is used to reduce the feature dimension to 512 dimensions, thereby obtaining a global feature vector representing the overall appearance of the queried vehicle.
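The per-layer weighted pooling, L2 normalization, concatenation, and temporal averaging described in paragraph [0064] can be sketched as below. The function names are hypothetical, the feature maps are assumed to have already been resized to the weight map's resolution, and the final PCA reduction to 512 dimensions is omitted.

```python
import numpy as np

def weighted_descriptor(feature_maps, weight):
    """Single-frame appearance vector from multi-scale feature maps.

    feature_maps: list of arrays shaped (C_l, H, W), already resized to the crop.
    weight:       (H, W) feature extraction weight map with values in [0, 1].
    """
    total = weight.sum()
    if total <= 0:
        raise ValueError("weight map has no positive mass")
    parts = []
    for fmap in feature_maps:
        # Weighted average of each channel over all pixel locations.
        vec = (fmap * weight).sum(axis=(1, 2)) / total
        norm = np.linalg.norm(vec)
        parts.append(vec / norm if norm > 0 else vec)    # L2 normalization
    return np.concatenate(parts)                          # layer-wise concatenation

def sequence_descriptor(frame_descriptors):
    """Temporal average pooling over the descriptors of all frames."""
    return np.mean(np.stack(frame_descriptors), axis=0)
```

Because occluded pixels carry weights near zero, they contribute almost nothing to the channel averages, which is what makes the resulting vector robust to occlusion.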
[0065] For ease of understanding, as shown in Figure 4, the images form a sequence of multiple original images of the same target vehicle with temporal and spatial continuity. Specifically, the image sequence a1, a2, a3 of the vehicle was first captured at 12:15 on Tongdu Avenue (development zone direction); subsequently, the image sequence b1, b2, b3 was captured at 13:41 in the Jinhu direction. When license plate detection was performed on these registered and aligned images, the license plate area in frame a1 was found to exhibit an abnormally uniform texture, and the character segmentation algorithm failed continuously. It was therefore determined that this area was physically occluded, and the corresponding binary occlusion mask O_t was generated: the pixel value of the area confirmed as occluded is set to 255, while the clearly identifiable body areas in frames a2, a3, b1, b2, and b3 (such as the complete license plate, headlights, air intake grille, and the edge of the front tires) are marked as 0 (clear).
[0066] In the subsequent feature extraction, taking frame a1 as an example, the occluded license plate area (O_t = 255) cannot provide reliable key feature source points internally, so the pixels of the source point set S_c come primarily from sharp, unobstructed parts of the stable vehicle body within the frame (such as bumper edges and parts of the headlight outlines). Based on this set, an initial weight is calculated for each pixel within the cropped area using the aforementioned contribution formula, resulting in a distribution map in which the weights are higher at sharp parts of the vehicle body and decrease smoothly in the background and distant areas. Subsequently, the mask value O_t(x, y) at each pixel location (x, y) is queried and the weights are corrected: for the license plate occlusion area in a1 (O_t = 255), the initial weight is multiplied by a penalty factor of 0.05 to suppress it to an extremely low level; for the clear vehicle body areas in a2, a3, b2, b3 and the complete license plate area in b1 (O_t = 0), the higher, dynamically calculated weight is retained. The final weight map guides the deep network to focus on the vehicle appearance features that remain clear and stable throughout the multi-frame sequence (such as specific headlight shapes, tire edge textures, and body color distribution), and the global feature vector of the vehicle occluded in frame a1 is generated.
[0067] In one embodiment of the present invention, the intelligent judgment and analysis module includes a feature comparison unit, a trajectory analysis unit, and a judgment output unit. Taking Figure 4 as an example again, after the global feature vector of the vehicle occluded in frame a1 has been generated, the intelligent judgment and analysis module compares the features of the queried vehicle against a massive multi-source database. Even though the license plate information in frame a1 is lost, the highly consistent vehicle appearance features extracted from frames a2, a3, b1, b2, and b3, combined with association verification against a reasonable driving trajectory that matches the timestamps, allow the vehicle to be matched with high similarity to a medium-sized truck registered in the database, which is therefore listed as the primary suspect in the output list of high-confidence candidate vehicles.
[0068] In this embodiment, the feature comparison unit is used to receive vehicle feature fingerprint data packets, and to solve the problem of target vehicle identification by performing efficient and detailed comparison with the background multi-source database (including vehicle registration database, historical violation database, and enterprise filing database), and output a list of candidate vehicles with high confidence.
[0069] First, the input vehicle feature fingerprint data packet is parsed to obtain the global feature vector Vq, visual attributes, and original image index of the queried vehicle. Based on the image index, the original image of the suspect vehicle is retrieved from the centralized storage system, and a Haar-like local feature extraction process is executed to obtain a fine-grained representation of the vehicle's local structure that complements the global appearance features. This process is as follows: A region of interest of a fixed size (e.g., 512×512 pixels) is extracted from the vehicle detection box in the image, converted to a grayscale image, and then subjected to normalization preprocessing such as histogram equalization.
[0070] For each predefined black-and-white rectangular template targeting key vehicle components (such as headlights and grille), the system calculates its feature value h_k as follows: first, the grayscale intensity values of the pixels within all white areas of the template are summed; then, the sum of the grayscale intensity values of the pixels within all black areas is subtracted. The difference is the feature value of the template. The feature values calculated from all templates are then concatenated sequentially to form a high-dimensional original local structure feature vector.
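The white-minus-black computation of a template feature value h_k can be sketched as follows. The use of an integral image (summed-area table) is a standard speed-up assumed here rather than something stated in the text, and the rectangle format `(x, y, w, h)` and the function names are hypothetical.

```python
import numpy as np

def integral_image(gray):
    """Summed-area table with a zero border row and column."""
    ii = np.zeros((gray.shape[0] + 1, gray.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = gray.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of gray values in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_value(ii, white_rects, black_rects):
    """Feature value h_k: sum over white rectangles minus sum over black ones."""
    white = sum(rect_sum(ii, *r) for r in white_rects)
    black = sum(rect_sum(ii, *r) for r in black_rects)
    return white - black
```

Concatenating `haar_value` over all templates would give the original local structure feature vector; each rectangle sum costs four table lookups regardless of its size.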
[0071] To improve the efficiency of retrieving massive amounts of data, an index generated by pre-clustering all local features of vehicles in the feature database is used. The distance from the local structural feature vector of the queried vehicle to each cluster center is calculated, and the vehicle is assigned to the cluster with the smallest distance, thus obtaining the corresponding cluster code CLS_q. Using this cluster code as the primary index, all vehicle records belonging to the same cluster in the feature library are quickly retrieved to form an initial candidate set, thereby significantly narrowing the comparison range.
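A minimal sketch of the cluster-code lookup follows. Euclidean distance is an assumption (the text only says "distance"), and the function names and the flat list of per-record cluster codes are hypothetical stand-ins for the feature library's index.

```python
import numpy as np

def assign_cluster(query_vec, centers):
    """Return the index of the nearest cluster center (the cluster code CLS_q)."""
    dists = np.linalg.norm(centers - query_vec, axis=1)   # ASSUMED Euclidean metric
    return int(np.argmin(dists))

def initial_candidates(cluster_id, record_clusters):
    """Indices of all library records that share the query's cluster code."""
    return [i for i, c in enumerate(record_clusters) if c == cluster_id]
```

Only the records returned by `initial_candidates` need to enter the two-level matching that follows, which is where the narrowing of the comparison range comes from.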
[0072] Furthermore, based on the narrowed candidate set, a progressive two-level matching process is performed, and multi-dimensional evidence is integrated for comprehensive scoring.
[0073] First, a coarse screening is performed on the initial candidate set: the cosine similarity between the query global feature vector v_q and the global feature vector v_i of each vehicle in the database is calculated, and the vehicles that satisfy the similarity threshold are selected to form the coarse candidate set.
[0074] Furthermore, a fine comparison is performed for each candidate vehicle in the coarse candidate set: the improved Hellinger distance between the original local features of the query vehicle and the original local features of the candidate vehicle is calculated and converted into a similarity score.
[0075] To compensate for the limitations of image features under occlusion or extreme conditions, attributes such as vehicle model, color, and brand are compared between the query vehicle and the candidate vehicles. Rule-based judgment is used to calculate the attribute matching degree a_i: a perfect match earns 1 point; otherwise, 0 points or partial points are awarded.
[0076] Finally, for each candidate vehicle, the final matching degree S_i is calculated by combining the scores of the above three factors, and the vehicles whose S_i exceeds the threshold form the fine-match candidate set. The vehicles in the fine-match candidate set are sorted in descending order of their overall matching score S_i, the Top-K (e.g., K=5) results are selected as the final output, and a structured report is generated for each candidate vehicle containing the license plate number, vehicle information, matching degree of each sub-item, and comprehensive score.
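The two-level matching and Top-K selection can be sketched as below. The fusion formula and thresholds are not reproduced in the text, so the linear weighted sum, the weights `(0.5, 0.3, 0.2)`, and both threshold values are assumptions made for illustration. The local similarity derived from the improved Hellinger distance is taken as a precomputed input because its exact definition is not given.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_candidates(v_q, candidates, coarse_thr=0.5, fine_thr=0.6,
                    weights=(0.5, 0.3, 0.2), top_k=5):
    """Return up to top_k (id, S_i) pairs sorted by descending score.

    candidates: list of dicts with keys 'id', 'v' (global feature vector),
                'local_sim' and 'attr' (both assumed to lie in [0, 1]).
    All thresholds and weights are ASSUMED illustrative values.
    """
    w_g, w_l, w_a = weights
    scored = []
    for c in candidates:
        g = cosine(v_q, c["v"])
        if g < coarse_thr:                   # coarse screening on global features
            continue
        s = w_g * g + w_l * c["local_sim"] + w_a * c["attr"]   # ASSUMED linear fusion
        if s > fine_thr:                     # fine-match threshold on S_i
            scored.append((c["id"], s))
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:top_k]
```

Keeping the per-factor scores alongside S_i, rather than only the fused value, would support the structured report with sub-item matching degrees that the paragraph describes.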
[0077] In this embodiment, the trajectory analysis unit performs in-depth verification of the candidate identities based on the high-confidence candidate vehicle list selected by the feature comparison unit. This is achieved by reconstructing the continuous driving trajectory of the vehicle to be assessed and matching it with its historical trajectories, frequently used routes, and driving behavior patterns.
[0078] To provide physical constraints for trajectory analysis, a digital road network motion knowledge base is first constructed. The road network is divided by function into different types of regions Z_m (such as highways, expressways, and main roads), and physical motion constraint parameters are defined for each type of region, including a reasonable speed range, a reasonable acceleration range, and the connectivity topology between regions. At the same time, multi-source trajectory data is collected, including recent checkpoint passage records of the vehicle to be analyzed, historical passage records of each candidate vehicle, and GIS data of the frequently used routes of vehicles registered with enterprises.
[0079] Since checkpoint records are discrete point sequences, they need to be reconstructed into continuous and reasonable trajectories. Therefore, using the latest checkpoint point P0 of the vehicle to be analyzed as the benchmark, the historical records within a preceding time window are retrieved backward to form the candidate preceding point set.
[0080] For each candidate preceding point, the shortest path distance L and the time difference between it and the checkpoint point P0 are calculated to obtain the required average speed. It is then verified whether this speed falls within the intersection of the reasonable speed ranges of the regions traversed by the path. Only points that pass this verification are included in the reliable initial trajectory sequence.
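The speed plausibility check can be sketched as below. The tuple layout of the candidate points, the units (seconds, meters, meters per second), and the function name are assumptions; the shortest path length and the traversed region types are taken as already computed from the road network.

```python
def plausible_predecessors(p0_time, candidates, speed_ranges):
    """Keep preceding points whose required average speed is physically reasonable.

    p0_time:      timestamp of the latest checkpoint P0, in seconds.
    candidates:   list of (point_id, time_s, path_length_m, region_types).
    speed_ranges: dict mapping region type -> (v_min, v_max) in m/s.
    """
    kept = []
    for point_id, t, length, regions in candidates:
        dt = p0_time - t
        if dt <= 0:
            continue                          # not a preceding point
        v = length / dt                       # required average speed
        # Intersection of the reasonable speed ranges along the path.
        lo = max(speed_ranges[r][0] for r in regions)
        hi = min(speed_ranges[r][1] for r in regions)
        if lo <= v <= hi:
            kept.append((point_id, v))
    return kept
```

Taking the maximum of the lower bounds and the minimum of the upper bounds yields the intersection of the ranges; an empty intersection (lower bound above upper bound) rejects the point automatically.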
[0081] For the validated discrete point sequence, an Extended Kalman Filter (EKF) is applied for smoothing and state estimation. Process noise is constrained by the acceleration range of the road region, and observation noise is modeled by the checkpoint positioning error (e.g., 10 meters). The final output is a smoothed, continuous trajectory state estimate, which includes continuous estimates of position, velocity, and acceleration together with their covariance matrices. This optimized trajectory serves as the benchmark for all subsequent correlation analyses.
[0082] Furthermore, the optimized trajectory is used to perform multi-dimensional correlation calculations for each candidate vehicle j. First, the Dynamic Time Warping (DTW) algorithm is used to calculate the similarity between the optimized trajectory of the vehicle to be analyzed and the historical trajectory of candidate vehicle j, and the result is converted into an intuitive similarity score.
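A textbook DTW computation over two point sequences is sketched below. The conversion `1 / (1 + d / scale)` from accumulated distance to a similarity score is an assumption, since the text does not give the conversion, and `scale` is a hypothetical tuning parameter.

```python
import math

def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])         # point-to-point distance
            # Extend the cheapest of the three admissible warping steps.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def dtw_similarity(a, b, scale=1.0):
    """Map the DTW cost to (0, 1]; ASSUMED conversion, 1.0 means identical shape."""
    return 1.0 / (1.0 + dtw_distance(a, b) / scale)
```

Because DTW allows points to be repeated during alignment, two trajectories along the same route sampled at different checkpoint densities can still score as highly similar.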
[0083] Secondly, for vehicles registered by enterprises, the spatial overlap between the optimized vehicle trajectory and the frequently used routes of the candidate vehicle is calculated and used as the route matching degree.
[0084] Microscopic driving behavior features (such as speed distribution and frequency of rapid acceleration and deceleration) are extracted from the optimized vehicle trajectory and compared with the typical behavior patterns corresponding to the candidate vehicle's type and purpose to calculate the behavior fit score. The system thus generates independent trajectory association evidence for each candidate vehicle j, consisting of the trajectory similarity score, the route matching degree, and the behavior fit score.
[0085] In this embodiment, the analysis output unit serves as the decision-making terminal for identity analysis. It performs evidence fusion and comprehensive decision-making on all multi-dimensional evidence generated in the early stages, and finally outputs a high-confidence vehicle identity determination conclusion and a complete analysis report.
[0086] For each candidate vehicle j, the evaluation weight W_j is calculated. This weight is a comprehensive quantitative assessment of the vehicle's appearance feature matching degree and its spatiotemporal behavior correlation. In the calculation formula, the weighting coefficients λ are preset according to the importance attached to the different dimensions of evidence in actual business operations. The candidate vehicle with the highest final weight is selected as the most likely identity.
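The evidence fusion step can be sketched as below. The formula for W_j is not reproduced in the text, so a linear weighted sum of the four evidence scores is assumed here, and the coefficient values in `lam` are purely illustrative.

```python
def fuse_evidence(candidates, lam=(0.4, 0.3, 0.15, 0.15)):
    """Rank candidates by an ASSUMED linear fusion of their evidence scores.

    candidates: dict mapping candidate id -> (feature, trajectory, route, behavior)
                scores, each assumed to lie in [0, 1].
    lam:        preset weighting coefficients (illustrative values only).
    Returns a list of (id, W_j) pairs in descending order of W_j.
    """
    weights = {cid: sum(l * s for l, s in zip(lam, scores))
               for cid, scores in candidates.items()}
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
```

The first entry of the returned list corresponds to the most likely identity, and its weight can be reported as the confidence level described in the next paragraph.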
[0087] Finally, an analysis report is output, including the most likely identity determination and its confidence level (i.e., the highest weight value); a weighted ranking list of all candidate vehicles and a detailed breakdown of evidence (showing the feature matching score, trajectory similarity score, route matching score, and behavior consistency score for each vehicle); an optimized visualization chart of the vehicle's spatiotemporal trajectory; and clear recommendations for subsequent handling (such as confirming control measures if there is sufficient evidence, associating with historical violations, or recommending manual verification, etc.).
[0088] In one embodiment of the present invention, the evidence chain management module realizes the complete solidification and efficient review of the evidence chain. It includes an evidence chain encapsulation unit which, based on the processing results of the previous modules, structurally integrates and packages the vehicle violation evidence to form a complete electronic evidence file. The specific implementation steps are as follows: S1: Obtain all evidence associated with a specific violation event ID from each business module, including at least the vehicle feature fingerprint data packet from the visual feature extraction module; the original captured images and spatiotemporal data from the checkpoint system; the weight and over-limit data from the dynamic weighing system; and the complete report output by the intelligent analysis module. At the same time, mark all evidence with the violation event ID and a timestamp.
[0089] S2: Bind all evidence items to the same case through the violation event ID to generate a structured HTML analysis report. The report includes violation spatiotemporal information, vehicle images, interactive spatiotemporal trajectory maps, detailed weighing and speed data, license plate obstruction analysis results, and complete vehicle identity analysis conclusions.
[0090] S3: Package all of the above analysis reports into a ZIP file according to the preset directory structure, digitally sign the file using a digital certificate, and finally upload it to the storage system.
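Step S3 can be sketched with the Python standard library as follows. The directory layout (one top-level folder named after the event ID), the `manifest.json` file, and the function name are assumptions. The SHA-256 digest returned here is not a digital signature; it is only the value that a certificate-based signing step, which is outside this sketch, would sign.

```python
import hashlib
import io
import json
import zipfile

def package_evidence(event_id, files):
    """Bundle evidence files into an in-memory ZIP and return (zip_bytes, sha256_hex).

    event_id: violation event ID used as the top-level folder (ASSUMED layout).
    files:    dict mapping archive-relative path -> file content as bytes.
    """
    manifest = {
        "event_id": event_id,
        # Per-file digests let a reviewer detect later modification.
        "files": {name: hashlib.sha256(data).hexdigest()
                  for name, data in files.items()},
    }
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(f"{event_id}/{name}", data)
        zf.writestr(f"{event_id}/manifest.json", json.dumps(manifest, indent=2))
    payload = buf.getvalue()
    return payload, hashlib.sha256(payload).hexdigest()
```

The returned bytes could then be signed and uploaded to the storage system; keeping the manifest inside the archive ties every evidence item to the same violation event ID, as step S2 requires.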
[0091] The review and interaction unit displays all pending cases using a visual card grid data structure. Each card grid includes at least a vehicle thumbnail, license plate (or obscured markings), violation time, location, type, evidence completeness score, and assessment confidence level. Law enforcement officers can then combine and quickly search based on multiple criteria such as time, region, and violation type. The system automatically matches and compares current cases with similar historical cases, identifying recurring violations or clues to habitual offenders based on feature and trajectory similarities.
[0092] The technical scope of this invention is not limited to the content described above. Those skilled in the art can make various modifications and variations to the above embodiments without departing from the technical concept of this invention, and all such modifications and variations should fall within the protection scope of this invention.
Claims
1. An intelligent judgment system for detecting obscured license plates of overloaded and oversized vehicles in road transportation, characterized in that it includes: a multi-source data acquisition module which, through integrated front-end sensing devices, collects and transmits vehicle information in real time, including multi-view images and associated structured metadata; a visual feature extraction module, which receives image data and processes multi-view, asynchronously captured image sequences to generate vehicle feature fingerprint data packets, the processing including: first, parsing image sequences from multiple checkpoints, deriving initial displacement estimates between images based on the structured metadata and a vehicle motion model, and achieving sub-pixel-level spatial alignment of each frame in the image sequence relative to a selected reference frame; second, locating the license plate region and conducting a preliminary occlusion probability assessment on the aligned images, and reliably verifying the occlusion status of the license plate region through at least one physical verification mechanism, wherein, if the verification confirms occlusion, a pixel-level precision binary occlusion mask is generated; and finally, based on the binary occlusion mask, extracting a vehicle global feature vector that is robust to occlusion interference; an intelligent analysis module, which is used to receive the vehicle feature fingerprint data packet, compare it with the pre-stored multi-source vehicle feature database, and output a list of candidate vehicle identities; and an evidence chain management module, which is used to encapsulate the entire chain of evidence and provide visual review.
2. The intelligent analysis system according to claim 1, characterized in that, Extracting the global feature vector of a vehicle based on the binary occlusion mask includes: within the local cropped area of the vehicle, locating and extracting all clear pixels based on the license plate area parameters and the binary occlusion mask to form a set of key feature source points; using the mask to generate a feature extraction weight map; and performing weighted fusion of multi-scale depth features based on the weight map to generate the global feature vector. The specific process for generating the weight graph is as follows: For each target pixel in the cropped image, an initial weight continuously related to its spatial location is calculated by accumulating its dynamic spatial contribution to all sharp source points in the key feature source point set, where the contribution is set to decrease as the Euclidean distance increases. Correction is performed based on the target pixel's own mask value, penalizing pixels marked as occluded within the license plate area to reduce their weights to near zero, while retaining the calculated weights for pixels marked as sharp within the license plate area and sharp pixels outside the license plate area. The corrected weight values are then normalized to form a feature extraction weight map. The Euclidean distance is used to quantify the spatial distance between the target pixel and the sharp source pixel.
3. The intelligent analysis system according to claim 1, characterized in that, The structured metadata includes device geographic coordinates, timestamps, and lane direction angles. The average vehicle speed is estimated based on the device geographic spacing and image time difference. Based on the average speed, time difference, and imaging parameters, the displacement amplitude between images is estimated. The displacement amplitude estimate is decomposed into an initial two-dimensional displacement vector in the image coordinate system. The initial two-dimensional displacement vector is used to define the constraint search range for feature matching.
4. The intelligent analysis system according to claim 1 or 3, characterized in that, The subpixel-level spatial alignment includes: within the constrained search range, performing feature matching through normalized cross-correlation calculation to obtain integer pixel displacement; performing subpixel-level optimization on the integer pixel displacement to obtain high-precision displacement parameters; and performing geometric transformation and resampling on each frame image based on the high-precision displacement parameters to achieve spatial alignment.
5. The intelligent analysis system according to claim 4, characterized in that, The subpixel-level optimization includes: sampling multiple normalized cross-correlation values within the integer coordinate neighborhood of the integer pixel displacement; fitting the sampled cross-correlation values using a quadratic surface model; obtaining subpixel-level displacement corrections by solving for the extreme points of the fitted surface; and summing the integer pixel displacement with the displacement corrections to obtain the high-precision displacement parameters.
6. The intelligent analysis system according to claim 1, characterized in that, The reference frame is a single frame image with the highest visual quality score selected from an asynchronously captured image sequence of the same target vehicle based on a sharpness evaluation function; the coordinate system of this frame image is established as a unified geometric reference for the entire multi-frame image processing flow.
7. The intelligent judgment system according to claim 1, characterized in that, The process of locating the license plate region and assessing the occlusion probability includes: processing the input image using a lightweight convolutional neural network, wherein the network output includes the rectangular coordinate parameters of the license plate region output by the bounding box regression branch, and the occlusion probability value calculated by the Softmax function output by the classification branch.
8. The intelligent analysis system according to claim 1, characterized in that, The physical verification mechanism includes texture uniformity analysis verification and / or character readability evaluation verification; The texture uniformity analysis and verification is as follows: extract the grayscale sub-image within the license plate area and calculate its local contrast variance. If the variance is lower than a preset threshold, it is determined to be a texture abnormality and occlusion is confirmed. The character readability assessment and verification is performed as follows: character segmentation is performed within the license plate area and character intervals are analyzed. If consecutive character intervals exceed a set threshold, the segmentation is deemed to have failed and occlusion is confirmed.
9. The intelligent analysis system according to claim 1, characterized in that, The intelligent analysis module includes a feature comparison unit, which is used to: extract local structural feature vectors of the query vehicle and perform rapid coarse screening using an index generated by pre-clustering based on local features to significantly narrow down the range of vehicles to be compared; within the narrowed candidate set, first perform coarse screening based on global feature similarity, and then calculate the local feature similarity and visual attribute matching degree in parallel on the coarse screening results; fuse the above matching degrees according to preset weights, and filter and sort the candidates based on the comprehensive score to output the candidate list.
10. The intelligent analysis system according to claim 9, characterized in that, The intelligent judgment and analysis module also includes a trajectory analysis unit, used to perform trajectory-driven deep verification: A knowledge base containing road types and their physical motion constraints is constructed, and based on this, the physical rationality of discrete checkpoint records of vehicles to be analyzed is verified and the trajectory is reconstructed. The continuous trajectory estimate is obtained through filtering optimization. Based on the continuous trajectory estimation, the association analysis of each candidate vehicle is performed to calculate its similarity to historical trajectories, evaluate its matching degree with frequently traveled routes, and conduct a consistency analysis based on driving behavior characteristics, so as to generate multi-dimensional behavioral evidence.