An image clustering method and apparatus

By calculating the average vehicle speed using time and geographic location in image clustering and comparing it based on vehicle feature information, the clustering problem of images without license plates or facial information is solved, and the accuracy of vehicle trajectory analysis is improved.

CN115841584B · Active · Publication Date: 2026-06-12 · ZHEJIANG DAHUA TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
ZHEJIANG DAHUA TECH CO LTD
Filing Date
2022-12-21
Publication Date
2026-06-12


Abstract

Embodiments of the present application provide an image clustering method and device to solve the problem of clustering vehicle data without license plate and face information. The method provided by the present application comprises: for any two adjacent images in the image archive, determining a plurality of first to-be-clustered images between the time ranges corresponding to the two adjacent images and between the corresponding geographical positions from the to-be-clustered images; determining K second to-be-clustered images with vehicle average speed within the driving speed range from the plurality of first to-be-clustered images; for each second to-be-clustered image, comparing the vehicle feature information with the vehicle feature information of L images in the image archive, which have a deviation between the vehicle angle information of the second to-be-clustered image and the vehicle angle information of the L images less than a set deviation threshold, to obtain L feature similarities; when it is determined according to the L feature similarities that the second to-be-clustered image meets the clustering condition, clustering the second to-be-clustered image into the image archive.

Description

Technical Field

[0001] This application relates to the field of image processing technology, and in particular to an image clustering method and apparatus.

Background Technology

[0002] To facilitate the management of motor vehicles and their occupants, relevant data for the same vehicle can be aggregated to create a "one vehicle, one file" system. In the security field, this data primarily refers to visual data, such as license plates, vehicle images, and images of occupants. By arranging the data within the same file chronologically, based on the time and location of each image capture, the vehicle's trajectory can be obtained. Currently, a common technique is to cluster images based on license plates, ensuring that the images in each file share the same license plate. In addition, when images lack license plate information, occupant images can be used as an aid: facial features are highly robust, so vehicles can be classified based on occupant facial features. However, these methods can only cluster images that contain license plate or facial information; for images without license plate or facial information, there is currently no mature solution.

Summary of the Invention

[0003] This application provides an image clustering method and apparatus to solve the problem of clustering vehicle data without license plates or facial information.

[0004] In a first aspect, this application provides an image clustering method, including:

[0005] For any two adjacent images in an image archive, multiple first images to be clustered are determined from the images to be clustered, the first images to be clustered being those that fall within the time range and between the geographical locations corresponding to the two adjacent images; the image archive consists of multiple images of the same vehicle collected by multiple acquisition devices and clustered according to license plate information and facial information; the two adjacent images are adjacent images after being sorted according to time information; the images to be clustered are images, collected by the multiple acquisition devices corresponding to the image archive, that cannot be clustered into the image archive.

[0006] From the plurality of first images to be clustered, K second images to be clustered are determined where the average vehicle speed is within the driving speed range; the average vehicle speed is the average speed between the geographical locations corresponding to the first images to be clustered and the target image, the target image is any one of the two adjacent images, and the driving speed range is determined based on the time information and location information corresponding to the two adjacent images respectively;

[0007] For each second image to be clustered, the vehicle feature information of the second image to be clustered is compared with the vehicle feature information of L images in the image archive to obtain L feature similarities. The deviation between the vehicle angle information of each of the L images and the vehicle angle information of the second image to be clustered is less than a set deviation threshold.

[0008] When the second image to be clustered is determined to meet the clustering conditions based on the L feature similarities, the second image to be clustered is clustered into the image archive.

[0009] Based on the above scheme, images without license plate information and without facial information can be clustered according to clustering conditions, thereby making the vehicle trajectory in the image archive obtained after clustering more accurate and facilitating subsequent analysis and processing.

[0010] In one possible implementation, determining that the second image to be clustered meets the clustering conditions based on the L feature similarities includes: determining the number of images among the L images whose feature similarity is greater than a similarity threshold, and determining the proportion of similar images based on that number; when the proportion of similar images is greater than a set proportion threshold, the second image to be clustered is determined to meet the clustering conditions.
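The proportion test described in paragraph [0010] can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names and the default threshold values (0.8 and 0.5) are hypothetical, since the application does not give numeric values.

```python
from typing import Sequence


def similar_proportion(similarities: Sequence[float], similarity_threshold: float) -> float:
    """Fraction of the L feature similarities that exceed the similarity threshold."""
    if not similarities:
        return 0.0  # no comparable archive images, so nothing counts as similar
    similar_count = sum(1 for s in similarities if s > similarity_threshold)
    return similar_count / len(similarities)


def meets_clustering_condition(
    similarities: Sequence[float],
    similarity_threshold: float = 0.8,  # assumed value, not from the application
    proportion_threshold: float = 0.5,  # assumed value, not from the application
) -> bool:
    """True when the proportion of similar images exceeds the proportion threshold."""
    return similar_proportion(similarities, similarity_threshold) > proportion_threshold
```

With these assumed thresholds, a candidate whose similarities to four archive images are 0.9, 0.85, 0.4, and 0.95 has three similar images out of four, so it would pass.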

[0011] In one possible implementation, the second image to be clustered has a similarity ratio in other image archives, and the clustering condition further includes that the similarity ratio of the second image to be clustered in other image archives is less than the similarity ratio of the second image to be clustered in the image archive itself.

[0012] Based on the above scheme, images without license plate information and without facial information can be clustered into only one image file, ensuring the accuracy of the image file.
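One way to express the additional condition of paragraph [0011] is to compare the similar-image proportion across all candidate archives and accept only a strict winner. The sketch below is an assumption-laden illustration: `select_archive`, the dictionary layout, and the default threshold are hypothetical names and values, and rejecting ties is one possible reading of "less than".

```python
from typing import Dict, Optional


def select_archive(
    proportions: Dict[str, float],
    proportion_threshold: float = 0.5,  # assumed value, not from the application
) -> Optional[str]:
    """Return the single archive the image may join, or None.

    `proportions` maps an archive identifier to the proportion of similar
    images computed for that archive.
    """
    if not proportions:
        return None
    best_id = max(proportions, key=lambda archive_id: proportions[archive_id])
    best = proportions[best_id]
    if best <= proportion_threshold:
        return None  # the best archive does not meet the proportion condition
    others = [p for archive_id, p in proportions.items() if archive_id != best_id]
    if any(p >= best for p in others):
        return None  # another archive is not strictly lower, so stay unassigned
    return best_id
```

Under this reading, an image is never placed in two archives, which matches the stated goal of clustering it into only one image file.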

[0013] In one possible implementation, the images acquired by the multiple acquisition devices include multiple first-type images containing license plate information, and second-type images containing facial information but not license plate information. The image archive is determined as follows:

[0014] Based on the license plate information, the multiple first-class images are clustered to obtain multiple initial image files;

[0015] Based on facial information, the multiple second-type images are clustered with the multiple initial image files to obtain multiple image files.

[0016] In one possible implementation, the process of clustering the multiple first-class images according to license plate information to obtain multiple initial image files includes:

[0017] The multiple first-class images are clustered according to license plate information to determine multiple license plate image files;

[0018] For any given license plate image file, filter out erroneous license plate images that do not meet the conditions of spatiotemporal rationality and vehicle body similarity.

[0019] Based on the probability of character errors in the license plate information of the erroneous license plate image, and the probability that the characters in the license plate information are predicted as other characters, multiple predicted license plates corresponding to the license plate information and the license plate probability corresponding to each predicted license plate are determined.

[0020] According to the order of the probability of each predicted license plate from high to low, if the erroneous license plate image satisfies the spatiotemporal rationality condition and the facial similarity condition in the license plate image file corresponding to the predicted license plate, then the erroneous license plate image is clustered into the license plate image file corresponding to the predicted license plate information to obtain the initial image file.

[0021] In one possible implementation, if the erroneous license plate image does not meet the spatiotemporal rationality condition and vehicle body similarity condition in the license plate image files corresponding to the multiple predicted license plates, then the erroneous license plate image is placed in the license plate image file where the erroneous license plate image was before it was filtered, so as to obtain the initial image file.

[0022] Based on the above scheme, when clustering using license plate information, the correct license plate is determined based on the probability and likelihood of the predicted license plate corresponding to the license plate, thereby improving the accuracy of image archives.
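The candidate-plate generation in paragraphs [0019] and [0020] can be illustrated with a small sketch. The application does not state how the two probabilities are combined, so the code assumes single-character substitutions scored as the product of the per-position error probability and the confusion probability; `predict_plates` and its parameters are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def predict_plates(
    plate: str,
    char_error_probs: Sequence[float],
    confusion: Dict[str, Dict[str, float]],
) -> List[Tuple[str, float]]:
    """List predicted plates with a score, highest score first.

    char_error_probs[i] is the assumed probability that character i was misread.
    confusion[c][o] is the assumed probability that a misread `c` was really `o`.
    Only single-character substitutions are considered in this sketch.
    """
    candidates: List[Tuple[str, float]] = []
    for i, ch in enumerate(plate):
        for other, p_confused in confusion.get(ch, {}).items():
            score = char_error_probs[i] * p_confused
            if other == ch or score <= 0.0:
                continue  # skip identity substitutions and impossible candidates
            candidates.append((plate[:i] + other + plate[i + 1:], score))
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates
```

The candidates would then be tried in this order: the erroneous image joins the first candidate's archive in which it satisfies the stated conditions, and otherwise returns to its original archive as described in paragraph [0021].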

[0023] In one possible implementation, the spatiotemporal rationality condition is that the average speed determined by the geographical location of any two adjacent images in the license plate image archive is less than a speed threshold, and the vehicle body similarity condition is that the vehicle body attribute information of any two adjacent images is the same. The vehicle body attribute information includes at least one of the following: vehicle body color, vehicle model, and interior decorations; the two adjacent images are adjacent images in the license plate image archive after being sorted according to time information.
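The two conditions in paragraph [0023] can be sketched as checks over time-sorted captures. The following is an illustration only: the `Capture` fields, the 120 km/h default threshold, and the use of great-circle (haversine) distance instead of road distance are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Capture:
    timestamp_s: float  # capture time in seconds
    lat: float          # latitude in degrees
    lon: float          # longitude in degrees
    body_color: str
    model: str


def haversine_km(a: Capture, b: Capture) -> float:
    """Great-circle distance in kilometers (assumed stand-in for route length)."""
    r = 6371.0
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def is_spatiotemporally_rational(captures: Sequence[Capture], speed_threshold_kmh: float = 120.0) -> bool:
    """Every pair of time-adjacent captures must imply a speed below the threshold."""
    ordered = sorted(captures, key=lambda c: c.timestamp_s)
    for prev, cur in zip(ordered, ordered[1:]):
        hours = (cur.timestamp_s - prev.timestamp_s) / 3600.0
        dist = haversine_km(prev, cur)
        if hours == 0:
            if dist > 0:
                return False  # two places at the same instant
            continue
        if dist / hours >= speed_threshold_kmh:
            return False
    return True


def has_same_body(captures: Sequence[Capture]) -> bool:
    """Every pair of time-adjacent captures must share the same body attributes."""
    ordered = sorted(captures, key=lambda c: c.timestamp_s)
    return all(
        (p.body_color, p.model) == (c.body_color, c.model)
        for p, c in zip(ordered, ordered[1:])
    )
```

An image that fails either check against its neighbors in a license plate archive would be treated as an erroneous license plate image and filtered out.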

[0024] In one possible implementation, the step of clustering the plurality of second-type images with the plurality of initial image files according to facial information to obtain a plurality of image files includes: clustering the second-type images and the first-type images according to facial information to obtain a plurality of facial image files; sorting the images in each facial image file according to time order, and for each second-type image in the facial image file, determining the adjacent first-type image whose time information is closest to that of the second-type image; when the second-type image meets the spatiotemporal rationality condition and the vehicle body similarity condition in the initial image file corresponding to the adjacent first-type image, clustering the second-type image into the initial image file corresponding to the adjacent first-type image to obtain an image file.

[0025] In one possible implementation, the method further includes: if the second type of image does not meet the spatiotemporal rationality condition and the vehicle body similarity condition in the initial image file corresponding to the adjacent first type of image, then determine the next adjacent first type of image in the face image file, wherein the next adjacent first type of image and the adjacent first type of image are the first type of images in the face image file that are temporally closest to the second type of image, and the next adjacent first type of image and the adjacent first type of image are respectively distributed before and after the time corresponding to the second type of image; if the second type of image meets the spatiotemporal rationality condition and the vehicle body similarity condition in the initial image file corresponding to the next adjacent first type of image, then cluster the second type of image into the initial image file corresponding to the next adjacent first type of image to obtain an image file.

[0026] Based on the above scheme, images containing only facial information and images containing license plate information can be clustered. The images are then filtered layer by layer based on spatiotemporal rationality and vehicle body similarity conditions, making the clustered image archives more accurate.
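The neighbor search of paragraphs [0024] and [0025] can be sketched as follows. The tuple layout, the function names, and the `fits_archive` callback (which stands in for the spatiotemporal rationality and vehicle body similarity checks) are hypothetical and used only for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

PlateImage = Tuple[float, str]  # (timestamp in seconds, initial archive id)


def nearest_plate_neighbors(face_time: float, plate_images: Sequence[PlateImage]) -> List[PlateImage]:
    """Closest first-type image before and after the face-only image, nearest first."""
    before = [p for p in plate_images if p[0] <= face_time]
    after = [p for p in plate_images if p[0] > face_time]
    neighbors: List[PlateImage] = []
    if before:
        neighbors.append(max(before, key=lambda p: p[0]))
    if after:
        neighbors.append(min(after, key=lambda p: p[0]))
    neighbors.sort(key=lambda p: abs(p[0] - face_time))
    return neighbors


def assign_face_image(
    face_time: float,
    plate_images: Sequence[PlateImage],
    fits_archive: Callable[[str], bool],
) -> Optional[str]:
    """Try the nearest neighbor's archive first, then the neighbor on the other side."""
    for _, archive_id in nearest_plate_neighbors(face_time, plate_images):
        if fits_archive(archive_id):
            return archive_id
    return None  # neither neighboring archive accepts the image
```

Passing the checks as a callback keeps the ordering logic separate from the conditions themselves, which are defined elsewhere in the application.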

[0027] In a second aspect, embodiments of this application provide an image clustering apparatus, including:

[0028] The first determining module is used to, for any two adjacent images in an image archive, determine multiple first images to be clustered from the images to be clustered, the first images to be clustered being those that fall within the time range and between the geographical locations corresponding to the two adjacent images; the image archive consists of multiple images of the same vehicle collected by multiple acquisition devices and clustered according to license plate information and facial information; the two adjacent images are images that are adjacent after being sorted according to time information; the images to be clustered are images, collected by the multiple acquisition devices corresponding to the image archive, that cannot be clustered into the image archive.

[0029] The second determining module determines K second images from the plurality of first images to be clustered, in which the average vehicle speed is within the driving speed range; the average vehicle speed is the average speed between the geographical locations corresponding to the first images to be clustered and the target image, the target image is any one of the two adjacent images, and the driving speed range is determined based on the time information and location information corresponding to the two adjacent images respectively;

[0030] The comparison module is used to compare the vehicle feature information of each second image to be clustered with the vehicle feature information of L images in the image archive to obtain L feature similarities. The deviation between the vehicle angle information of each of the L images and the vehicle angle information of the second image to be clustered is less than a set deviation threshold.

[0031] The second determining module is further configured to cluster the second image to be clustered into the image archive when it is determined that the second image to be clustered meets the clustering conditions based on the L feature similarities.

[0032] In a third aspect, embodiments of this application provide an execution device, including:

[0033] Memory, used to store program instructions;

[0034] A processor is configured to invoke program instructions stored in the memory and execute the method described in the first aspect and different implementations of the first aspect according to the obtained program instructions.

[0035] In a fourth aspect, embodiments of this application provide a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform the methods described in the first aspect and different implementations of the first aspect.

[0036] Furthermore, for the technical effects of any implementation of the second to fourth aspects, refer to the technical effects of the first aspect and its different implementations, which are not repeated here.

Description of the Drawings

[0037] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0038] Figure 1A is a schematic diagram of an application scenario provided by an embodiment of this application;

[0039] Figure 1B is a schematic diagram of a server structure provided by an embodiment of this application;

[0040] Figure 2 is a flowchart of an image clustering method provided by an embodiment of this application;

[0041] Figure 3 is a schematic diagram of an image clustering apparatus provided by an embodiment of this application;

[0042] Figure 4 is a schematic diagram of an execution device provided by an embodiment of this application.

Detailed Description

[0043] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. The components of the embodiments of this application described and shown in the accompanying drawings can be arranged and designed in various different configurations.

[0044] Therefore, the following detailed description of the embodiments of this application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely to illustrate selected embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of this application without inventive effort are within the scope of protection of this application.

[0045] It should be noted that relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0046] Current technologies primarily cluster vehicle images by combining license plate numbers with vehicle exterior features. Generally, initial screening can be performed using the license plate, followed by further determination of whether the images belong to the same vehicle based on the similarity of the vehicle body or license plate exterior features. In some scenarios, facial information of drivers and passengers may also be used to assist in refining vehicle clustering. However, current technologies lack specific solutions for vehicle images where neither the license plate nor high-quality facial information of drivers and passengers has been captured.

[0047] To address the aforementioned issues, this application provides an image clustering method and apparatus, which filters images to be clustered without license plate information based on vehicle average speed and vehicle feature information. When the filtered images meet the clustering conditions, the images to be clustered are clustered into an image archive.

[0048] The following is a brief introduction to the application scenarios to which the technical solutions of the embodiments of this application are applicable. It should be noted that the application scenarios described below are only for illustrating the embodiments of this application and are not intended to limit the scope. In specific implementation, the technical solutions provided by the embodiments of this application can be flexibly applied according to actual needs.

[0049] The image clustering method provided in this application can be implemented by an execution device. In some embodiments, the execution device may be an electronic device, which may be implemented by one or more servers. Figure 1A takes the server 100 as an example. Referring to Figure 1A, which illustrates a possible application scenario provided by an embodiment of this application, the scenario includes a server 100 and a data acquisition device 200. The server 100 can be implemented as a physical server or a virtual server. The server can be implemented as a single server or as a server cluster consisting of multiple servers; the image clustering method provided in this application can be implemented using either a single server or a server cluster. The data acquisition device 200 is a device with image acquisition capabilities, including electronic police equipment, electronic monitoring equipment, surveillance cameras, video recorders, etc. The data acquisition device 200 can send the acquired images to be clustered to the server 100 via a network. Optionally, the server 100 can be connected to a terminal device 300, receive image clustering tasks sent by the terminal device 300, and perform image clustering on the images to be clustered received from the data acquisition device 200. In some scenarios, the server 100 can send the image clustering results to the terminal device 300. The terminal device 300 can be a television, mobile phone, tablet computer, personal computer, etc. In some embodiments, after the acquisition device 200 acquires images, it can send the acquired images to the server 100. The server 100 then performs image parsing and vehicle image aggregation on the images to be clustered, and saves the resulting groups of images as image archives. Each image archive includes vehicle images of the same vehicle in different scenes within a set time period. In some scenarios, the acquisition device 200 can send the acquired images to the server 100 in real time. After receiving the images to be clustered from the acquisition device 200, the server 100 can analyze the image content using computer vision algorithms to detect faces, vehicle exteriors, and license plate numbers, and record the acquisition information and geographical location information. Furthermore, using corresponding feature and attribute extraction algorithms, it can obtain face features, vehicle exterior features, vehicle color, vehicle model, and vehicle orientation (roughly distinguishing between approaching and departing).

[0050] As an example, as shown in Figure 1B, server 100 may include processor 110, communication interface 120, and memory 130. Of course, server 100 may also include other components not shown in Figure 1B.

[0051] The communication interface 120 is used to communicate with the acquisition device 200 and the terminal device 300, to receive images to be clustered sent by the acquisition device 200, or to receive image clustering tasks sent by the terminal device 300, or to send image clustering results to the terminal device 300.

[0052] In the embodiments of this application, the processor 110 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components, capable of implementing or executing the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application can be directly manifested as being executed by a hardware processor, or executed by a combination of hardware and software modules within the processor.

[0053] Processor 110 is the control center of server 100, connecting various parts of server 100 through various interfaces and routes. It executes various functions and processes data by running or executing software programs and / or modules stored in memory 130, and by calling data stored in memory 130. Optionally, processor 110 may include one or more processing units. Processor 110 may be, for example, a processor, microprocessor, controller, or other control component. It may be a general-purpose central processing unit (CPU), a general-purpose processor, a digital signal processing unit (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof.

[0054] The memory 130 can be used to store software programs and modules. The processor 110 executes various functional applications and data processing by running the software programs and modules stored in the memory 130. The memory 130 may mainly include a program storage area and a data storage area. The program storage area may store the operating system, application programs required for at least one function, etc.; the data storage area may store data created according to business processing, etc. As a non-volatile computer-readable storage medium, the memory 130 can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 130 may include at least one type of storage medium, such as flash memory, hard disk, multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic memory, magnetic disk, optical disk, etc. The memory 130 may also be any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 130 in this embodiment can also be a circuit or any other means capable of implementing storage functions for storing program instructions and/or data.

[0055] In other embodiments, the execution device may be a terminal device. In some scenarios, the terminal device can receive the image to be clustered sent by the acquisition device 200, and perform image clustering based on the image to be clustered to obtain the clustering result of the image to be clustered. In some embodiments, the terminal device may include a display device, which may be a liquid crystal display, an organic light-emitting diode (OLED) display, a projection display device, etc., and this application does not specifically limit it in this regard.

[0056] It should be noted that the structures shown in Figure 1A and Figure 1B above are merely examples, and the embodiments of this application are not limited thereto.

[0057] This application provides an image clustering method. Figure 2 illustrates the flow of the image clustering method, which can be executed by an image clustering apparatus. The apparatus may be, for example, the server 100 shown in Figure 1B or the processor 110 in the server 100; it may also be a terminal device. The specific process is as follows:

[0058] 201. For any two adjacent images in an image archive, determine multiple first images to be clustered from the images to be clustered, based on the time range and geographical location of the two adjacent images.

[0059] In some embodiments, the image to be clustered is an image frame captured by an acquisition device. The acquisition device can be an electronic police device, electronic monitoring device, surveillance camera, video recorder, etc. For example, after the acquisition device captures the image frame, the server obtains the image to be clustered from the acquisition device.

[0060] In some embodiments, the server receives image files sent by the acquisition device, the image files including images to be clustered. These image files can be encoded image files. The server can then decode the received image files to obtain the images to be clustered. Encoding the images effectively reduces the file size, facilitating transmission. This improves the image transmission speed, thereby increasing the efficiency of subsequent image clustering. The encoded bitstream data can be acquired using any applicable method, including but not limited to: Real-Time Streaming Protocol (RTSP), Open Network Video Interface Forum (ONVIF) standards, or proprietary protocols.

[0061] In some embodiments, the image archive consists of multiple images of the same vehicle captured by multiple acquisition devices and clustered according to license plate information and facial information. Adjacent images are those that are adjacent after being sorted according to time information. The images to be clustered are images captured by multiple acquisition devices corresponding to the image archive that cannot be clustered into the image archive. In some scenarios, the images to be clustered are vehicle images captured by acquisition devices that only include vehicle body information, or images that only include vehicle body information and facial information but cannot be clustered into the image archive.

[0062] As an example, if the time information corresponding to two adjacent images in the image archive is 10:30 and 10:36 respectively, and the two images correspond to intersection A and intersection B respectively, then based on the location information of intersection A and intersection B, the multiple images to be clustered collected by all the acquisition devices distributed between the two intersections during the time period of 10:30-10:36 are determined as the first images to be clustered.
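Step 201 amounts to a filter over capture time and capture device. The sketch below assumes that the set of devices located between the two intersections is already known; the `Candidate` record and `select_first_images` function are hypothetical names, and the dates are arbitrary.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Candidate:
    image_id: str
    device_id: str
    captured_at: datetime


def select_first_images(
    candidates: Iterable[Candidate],
    start: datetime,
    end: datetime,
    devices_between: Set[str],
) -> List[Candidate]:
    """Keep unclustered images captured in [start, end] by devices between the two locations."""
    return [
        c
        for c in candidates
        if start <= c.captured_at <= end and c.device_id in devices_between
    ]
```

For the example above, `start` and `end` would be 10:30 and 10:36, and `devices_between` would hold the identifiers of the devices distributed between intersection A and intersection B.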

[0063] 202. From multiple first images to be clustered, determine K second images to be clustered where the average vehicle speed is within the driving speed range.

[0064] In some embodiments, multiple images in an image archive can be sorted according to their time information, and the driving speed range is determined based on the time and location information corresponding to two adjacent images, respectively. Specifically, the average speed between two adjacent images can be determined based on their respective time and location information. For example, two adjacent images are image A and image B. The time information for image A is 10:30, and the time information for image B is 10:36. The geographical locations of images A and B are intersection A and intersection B, respectively. The driving route length between intersection A and intersection B is 4 kilometers, so the average speed from intersection A to intersection B is 40 kilometers per hour (4 kilometers in 6 minutes). Further, the lengths of other driving routes between intersection A and intersection B can be determined, and the average speed for each such route can be determined from its length. Once the average speed of at least one driving route between intersection A and intersection B is determined, the driving speed range can be determined based on the average speed of the at least one driving route. As an example, if there are three routes between intersection A and intersection B, with average speeds of 40 km/h, 43 km/h, and 37 km/h respectively, then the speed range between intersection A and intersection B can be set to 37-43 km/h. In some scenarios, to allow for error tolerance, the speed range can be appropriately widened. For example, the speed range between intersection A and intersection B can be set to 35-45 km/h. In some scenarios, the speed range can also be determined based on the mean of the average speeds corresponding to multiple routes. For example, if the average speeds of the three routes are 40 km/h, 43 km/h, and 37 km/h respectively, then the mean of the three average speeds is 40 km/h. Further, the speed range can be set as V ± T, where V is the mean of the average speeds and T is a relatively small tolerance value.
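The two ways of deriving the driving speed range can be written out as a short sketch. The function names and the `margin_kmh` and `tolerance_kmh` parameters are hypothetical; the route lengths are assumed to be available from a map or routing service that the application does not name.

```python
from typing import List, Sequence, Tuple


def route_speeds_kmh(route_lengths_km: Sequence[float], elapsed_s: float) -> List[float]:
    """Average speed on each route, given the time elapsed between the two adjacent images."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    if not route_lengths_km:
        raise ValueError("at least one route is required")
    return [length * 3600.0 / elapsed_s for length in route_lengths_km]


def speed_range_from_routes(
    route_lengths_km: Sequence[float], elapsed_s: float, margin_kmh: float = 0.0
) -> Tuple[float, float]:
    """Range spanned by the per-route speeds, optionally widened by a margin."""
    speeds = route_speeds_kmh(route_lengths_km, elapsed_s)
    return (min(speeds) - margin_kmh, max(speeds) + margin_kmh)


def speed_range_around_mean(
    route_lengths_km: Sequence[float], elapsed_s: float, tolerance_kmh: float
) -> Tuple[float, float]:
    """Range V ± T, where V is the mean of the per-route average speeds."""
    speeds = route_speeds_kmh(route_lengths_km, elapsed_s)
    mean = sum(speeds) / len(speeds)
    return (mean - tolerance_kmh, mean + tolerance_kmh)
```

With routes of 4.0 km, 4.3 km, and 3.7 km covered in 6 minutes (360 s), the per-route speeds are 40, 43, and 37 km/h, so a margin of 2 km/h reproduces the widened 35-45 km/h range from the example.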

[0065] In some embodiments, the average vehicle speed is the average speed between the geographical locations corresponding to the first image to be clustered and the target image, where the target image is any one of two adjacent images. As an example, the two adjacent images are image A and image B, where the time information for image A is 10:30 and the time information for image B is 10:36. The geographical locations of images A and B are intersection A and intersection B, respectively. The geographical location corresponding to one of the multiple first images to be clustered is intersection C, and the time information for this first image to be clustered is 10:32. The average vehicle speed of this first image to be clustered can be determined based on the time difference and route length between intersection C and intersection A, or between intersection C and intersection B. Taking intersection A and intersection C as an example, if the length of the travel route between intersection A and intersection C is 1.2 kilometers, then the average vehicle speed from intersection A to intersection C is 36 kilometers per hour (1.2 kilometers in 2 minutes). If the average vehicle speed of the first image to be clustered falls within the range of 35-45 km/h, then the first image to be clustered is a second image to be clustered. In some embodiments, K second images to be clustered can be determined from multiple first images to be clustered using the method described above.
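Step 202 can be sketched as a speed filter over the first images to be clustered. The `FirstImage` record and function names are hypothetical, and the route length to the target image is assumed to be supplied by an external map or routing source.

```python
from typing import List, NamedTuple, Sequence, Tuple


class FirstImage(NamedTuple):
    image_id: str
    route_km_to_target: float  # assumed known route length to the target image's location
    timestamp_s: float         # capture time in seconds


def average_speed_kmh(route_km: float, target_time_s: float, candidate_time_s: float) -> float:
    """Average speed between the target image and a candidate image."""
    elapsed_s = abs(candidate_time_s - target_time_s)
    if elapsed_s == 0:
        raise ValueError("images captured at the same instant have no defined speed")
    return route_km * 3600.0 / elapsed_s


def filter_second_images(
    first_images: Sequence[FirstImage],
    target_time_s: float,
    speed_range_kmh: Tuple[float, float],
) -> List[FirstImage]:
    """Keep the first images whose average speed lies within the driving speed range."""
    low, high = speed_range_kmh
    kept = []
    for img in first_images:
        if img.timestamp_s == target_time_s:
            continue  # speed undefined; skipped in this sketch
        speed = average_speed_kmh(img.route_km_to_target, target_time_s, img.timestamp_s)
        if low <= speed <= high:
            kept.append(img)
    return kept
```

Taking image A (10:30) as the target at time 0, a candidate captured 120 s later and 1.2 km away yields 36 km/h and is kept for the 35-45 km/h range.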

[0066] In some embodiments, before the L images in the image archive are determined based on vehicle angle information, a vehicle exterior check can be performed. Specifically, a similarity check of vehicle body features (including local features) can be performed: the similarity between the second image to be clustered and each vehicle image in the image archive from a preset time period (e.g., the past week) is calculated based on the vehicle body features. If the number of archive images whose similarity to the second image to be clustered is greater than a similarity threshold exceeds a set threshold, the second image to be clustered is retained. This ensures that the vehicle in the second image to be clustered and the vehicle in the image archive are at least the same model and color, and filters out second images to be clustered that do not meet the requirements.

[0067] In some embodiments, after the second images to be clustered have been filtered based on the driving speed range, further filtering can be performed using the vehicle orientation information in the images. Specifically, second images to be clustered whose vehicle orientation does not match the vehicle orientation in the image archive can be filtered out.

[0068] 203. For each second image to be clustered, the vehicle feature information of the second image to be clustered is compared with the vehicle feature information of L images in the image archive to obtain L feature similarities.

[0069] In some embodiments, after the K second images to be clustered are determined, L images can be determined from the image archive for each second image to be clustered based on vehicle angle information. The deviation between the vehicle angle information of each of the L images and the vehicle angle information of that second image to be clustered is less than a set deviation threshold.
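As a non-limiting illustration, the selection of the L images by vehicle angle can be sketched in Python. This application does not specify how vehicle angle information is represented; the sketch assumes a heading in degrees that wraps around at 360, and the dictionary keys and function names are hypothetical.

```python
def angle_deviation(a_deg, b_deg):
    """Smallest absolute difference between two headings, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360
    return min(diff, 360 - diff)


def select_comparable_images(archive_images, candidate_angle, deviation_threshold):
    """Return the archive images whose angle deviation is below the threshold."""
    return [
        img
        for img in archive_images
        if angle_deviation(img["angle"], candidate_angle) < deviation_threshold
    ]


archive_images = [
    {"id": "front", "angle": 5},
    {"id": "front_wrapped", "angle": 355},
    {"id": "rear", "angle": 180},
]
# The candidate vehicle is seen at 350 degrees; the 30 degree threshold is assumed.
comparable = select_comparable_images(archive_images, candidate_angle=350, deviation_threshold=30)
comparable_ids = [img["id"] for img in comparable]
```

The number of images returned plays the role of L, so L can differ from one second image to be clustered to the next.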

[0070] Furthermore, the vehicle feature information of the second image to be clustered can be compared with the vehicle feature information of L images in the image archive to obtain L feature similarities.

[0071] 204. When it is determined, based on the L feature similarities, that the second image to be clustered meets the clustering condition, the second image to be clustered is clustered into the image archive.

[0072] In some embodiments, determining whether a second image to be clustered meets the clustering condition based on the L feature similarities includes: determining the number of images among the L images whose feature similarity is greater than a similarity threshold, and determining the proportion of similar images based on that number. When the proportion of similar images is greater than a set proportion threshold, the second image to be clustered is determined to meet the clustering condition. For example, if the number of images among the L images with feature similarity greater than the similarity threshold is 7, and L = 10, then the proportion of similar images is 0.7. If the set proportion threshold is 0.6, the proportion of similar images is greater than the set proportion threshold, so the second image to be clustered meets the clustering condition.
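As a non-limiting illustration, the clustering condition above can be sketched in Python. The function names, the individual similarity scores, and the similarity threshold of 0.75 are hypothetical values chosen so that 7 of the 10 scores exceed the threshold, as in the example.

```python
def similar_image_proportion(similarities, similarity_threshold):
    """Fraction of the L feature similarities that exceed the similarity threshold."""
    if not similarities:
        return 0.0
    similar = sum(1 for s in similarities if s > similarity_threshold)
    return similar / len(similarities)


def meets_clustering_condition(similarities, similarity_threshold, proportion_threshold):
    """True when the proportion of similar images exceeds the proportion threshold."""
    proportion = similar_image_proportion(similarities, similarity_threshold)
    return proportion > proportion_threshold


# L = 10 made-up feature similarities, 7 of which exceed the assumed threshold 0.75.
similarities = [0.9, 0.85, 0.95, 0.8, 0.88, 0.91, 0.83, 0.4, 0.5, 0.3]
proportion = similar_image_proportion(similarities, 0.75)
accepted = meets_clustering_condition(similarities, 0.75, 0.6)
```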

[0073] In some embodiments, the second image to be clustered also has a proportion of similar images in other image archives. In that case, the clustering condition further includes that the proportion of similar images of the second image to be clustered in each other image archive is less than its proportion of similar images in the current image archive. For example, the proportion of similar images of the second image to be clustered in the current image archive is 0.7, and its proportions of similar images in three other image archives are 0.5, 0.55, and 0.6. Because each of these is less than the proportion in the current image archive, the second image to be clustered is clustered into the current image archive. This scheme guarantees that each second image to be clustered is clustered into only one image archive.
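As a non-limiting illustration, the comparison across image archives can be sketched in Python. The function name and archive identifiers are hypothetical, and returning no archive when two archives tie is an assumption of this sketch rather than something this application states.

```python
def select_archive(proportions, proportion_threshold):
    """Pick the single archive into which the image may be clustered.

    proportions maps an archive identifier to the proportion of similar images
    that the second image to be clustered has in that archive.
    """
    if not proportions:
        return None
    best_id = max(proportions, key=proportions.get)
    best = proportions[best_id]
    if best <= proportion_threshold:
        return None
    # Every other archive must have a strictly smaller proportion.
    if any(p >= best for a, p in proportions.items() if a != best_id):
        return None
    return best_id


proportions = {"current": 0.7, "other_1": 0.5, "other_2": 0.55, "other_3": 0.6}
chosen = select_archive(proportions, proportion_threshold=0.6)
```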

[0074] In some embodiments, before determining the similarity ratio of images, the vehicle body attributes of the second image to be clustered, such as ornament information and front tag information, are compared with the vehicle body attribute data of the images in the image archive. If there are more than a preset proportion of mismatched vehicle bodies, the second image to be clustered is removed. In some scenarios, when the vehicle body attribute data of an image in the image archive is marked as unknown, it does not participate in the vehicle body matching check. When determining the similarity ratio of the second image to be clustered, images that do not participate in the vehicle body matching check are not included.

[0075] In some embodiments, when there are multiple second images to be clustered between two adjacent images in an image archive, the average speed between the two adjacent second images to be clustered can be determined based on the time information and location information corresponding to the two adjacent second images to be clustered respectively. When the average speed between the two adjacent second images to be clustered is not within the above-mentioned driving speed range, the second image to be clustered is deleted.

[0076] In some embodiments, when the image to be clustered includes vehicle information and facial information but cannot be clustered into an image archive, it can be clustered into one of multiple face archives based on the facial information, with each face archive containing the facial information of the same person. The images in each face archive are sorted according to time information. When the image to be clustered is an image from a face archive, the image archive of the license plate corresponding to each image in the face archive can be recorded, and the frequency of each license plate can be counted. For example, suppose a face archive contains 10 images that contain only facial information, the first 4 adjacent images correspond to license plate 1, the last 6 adjacent images correspond to license plate 2, and the frequency threshold for the face archive is 3. The most frequent license plate (license plate 2) then appears more often than the frequency threshold, so the 6 adjacent images corresponding to license plate 2 are clustered into the image archive corresponding to license plate 2.
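As a non-limiting illustration, the license plate frequency count can be sketched in Python using the standard library `collections.Counter`. The function name and the plate labels are hypothetical.

```python
from collections import Counter


def dominant_plate(plates, frequency_threshold):
    """Return the most frequent plate if it appears more often than the threshold."""
    if not plates:
        return None
    plate, count = Counter(plates).most_common(1)[0]
    return plate if count > frequency_threshold else None


# The 10 images of the face archive, in time order, labelled by associated plate.
plates_in_face_archive = ["plate_1"] * 4 + ["plate_2"] * 6
target_plate = dominant_plate(plates_in_face_archive, frequency_threshold=3)
# Indices of the images that would be clustered into the archive of that plate.
selected = [i for i, p in enumerate(plates_in_face_archive) if p == target_plate]
```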

[0077] In some embodiments, the images acquired by multiple acquisition devices include multiple first-type images containing license plate information and second-type images containing facial information but not license plate information. Image archives can be determined as follows: First, based on license plate information, the multiple first-type images are clustered to obtain multiple initial image archives. Further, based on facial information, the multiple second-type images are clustered with the multiple initial image archives to obtain multiple image archives.

[0078] In some embodiments, multiple first-type images are clustered according to license plate information to obtain multiple initial image files. Specifically, this can be achieved by clustering multiple first-type images according to license plate information to determine multiple license plate image files. The first-type images are images containing both facial and license plate information, or images containing only license plate information and no facial information. For the first-type images, clustering can be performed according to license plate information, grouping images with the same license plate number into one license plate image file to obtain multiple license plate image files. Further, for any given license plate image file, erroneous license plate images that do not meet the spatiotemporal rationality condition and the vehicle body similarity condition can be filtered out. The spatiotemporal rationality condition is that the average speed determined by the geographical locations of any two adjacent images in the license plate image file is less than a speed threshold. The vehicle body similarity condition is that any two adjacent images have the same vehicle body attribute information, which includes at least one of the following: vehicle body color, vehicle model, or interior decorations. The two adjacent images are images in the license plate image file that are adjacent after being sorted according to time information.

[0079] As an example, the images in the license plate image archive can be sorted according to time information. For the Mth and (M+1)th images, spatiotemporal rationality is judged by obtaining the location information of the two images and calculating the average speed from the travel distance between their locations and the time between them. If the average speed exceeds a maximum speed threshold, the pair is considered unreasonable. Further, vehicle body similarity can be judged for the Mth and (M+1)th images. Specifically, vehicle body attribute information can be obtained, including at least one of the following: vehicle color, vehicle model, and interior decorations. If at least one item of vehicle body attribute information is consistent between the Mth and (M+1)th images, the two images are determined to meet the vehicle body similarity condition. Otherwise, the image is identified as an erroneous license plate image.
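As a non-limiting illustration, the spatiotemporal rationality check over adjacent images can be sketched in Python. The dictionary keys, the `route_km` callback that returns the travel distance between two locations, the distance table, and the 120 km/h threshold are all hypothetical; treating zero elapsed time with a non-zero distance as unreasonable is also an assumption of this sketch.

```python
from datetime import datetime


def implausible_pairs(images, route_km, max_speed_kmh):
    """Return (earlier_id, later_id) for adjacent images implying too high a speed."""
    ordered = sorted(images, key=lambda img: img["time"])
    flagged = []
    for prev, nxt in zip(ordered, ordered[1:]):
        hours = (nxt["time"] - prev["time"]).total_seconds() / 3600
        distance = route_km(prev["location"], nxt["location"])
        if hours == 0:
            # Two different places at the same instant cannot be the same vehicle.
            if distance > 0:
                flagged.append((prev["id"], nxt["id"]))
            continue
        if distance / hours > max_speed_kmh:
            flagged.append((prev["id"], nxt["id"]))
    return flagged


# Made-up travel distances between capture locations, in kilometres.
distances = {frozenset(("X", "Y")): 4.0, frozenset(("Y", "Z")): 10.0}
images = [
    {"id": "c", "time": datetime(2022, 12, 21, 10, 7), "location": "Z"},
    {"id": "a", "time": datetime(2022, 12, 21, 10, 0), "location": "X"},
    {"id": "b", "time": datetime(2022, 12, 21, 10, 6), "location": "Y"},
]
flagged = implausible_pairs(images, lambda p, q: distances[frozenset((p, q))], 120)
```

Here the pair (a, b) implies about 40 km/h and passes, while (b, c) implies about 600 km/h and is flagged for the subsequent vehicle body similarity check.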

[0080] In some embodiments, when the Mth image includes both facial and license plate information, the next image including both facial and license plate information is determined from the license plate image archive based on time information, and this image is denoted as image C. If the average speed determined based on the position information of the Mth image and image C is not lower than a lower speed threshold and the quality scores of the facial information in both the Mth image and image C are not lower than a set quality threshold, then feature similarity is judged on the facial information of the Mth image and image C. When at least one facial similarity is not lower than the set threshold, the Mth image and image C are considered similar; otherwise, the image is removed from the license plate image archive. In some scenarios, when faces are present in both the driver and passenger seats in the Mth image and in image C, two feature similarities are determined based on the driver's facial information in the Mth image and the driver and passenger facial information in image C, respectively. Then, two further feature similarities are determined based on the passenger's facial information in the Mth image and the driver and passenger facial information in image C, respectively. When all four feature similarities are lower than the set threshold, the image is considered an erroneous license plate image.
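As a non-limiting illustration, the pairwise face comparison between the Mth image and image C can be sketched in Python. This application does not name a similarity measure; cosine similarity over made-up two-dimensional feature vectors is used here purely as a stand-in, and the function names and the 0.9 threshold are hypothetical.

```python
import math
from itertools import product


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (a stand-in similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def any_face_matches(faces_m, faces_c, threshold):
    """True when at least one face pair reaches the similarity threshold.

    With a driver and a passenger in each image this evaluates up to four pairs.
    """
    return any(
        cosine_similarity(f, g) >= threshold for f, g in product(faces_m, faces_c)
    )


# Driver and passenger features in the Mth image and in image C (made-up values).
faces_m = [[1.0, 0.0], [0.0, 1.0]]
faces_c = [[0.0, 1.0], [1.0, 1.0]]
same_occupants = any_face_matches(faces_m, faces_c, threshold=0.9)
different_occupants = any_face_matches([[1.0, 0.0]], [[0.0, 1.0]], threshold=0.9)
```

Comparing every face in one image with every face in the other allows for the driver and passenger having swapped seats between the two captures.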

[0081] In some embodiments, multiple predicted license plates and their corresponding probabilities can be determined based on the probability of character errors in the license plate information of the erroneous license plate image and the probability that a character in the license plate information is predicted to be another character. Further, the erroneous license plate image can be clustered into the license plate image file corresponding to the predicted license plate information, according to the descending order of the predicted license plate probabilities, if the erroneous license plate image satisfies the spatiotemporal plausibility condition and the facial similarity condition in the license plate image file corresponding to the predicted license plate information, to obtain an initial image file.

[0082] As an example, when performing license plate recognition, it's easy to confuse B with 8, and 0 with O. Each character has a probability of being incorrectly identified and a probability that the character in the license plate information will be predicted as another character. Multiplying these two probabilities gives the probability of each predicted license plate. For example, for the license plate "Jiang A 8Z7L3", the probability of incorrectly identifying the character 8 is 5%, the probability of predicting it as B is 80%, and the probability of predicting it as G is 20%; the probability of incorrectly identifying the letter Z is 2%, the probability of predicting it as 2 is 90%, and the probability of predicting it as 9 is 10%. After determining the license plate probabilities, they can be sorted according to their probability. The probability of predicting the license plate "Jiang A BZ7L3" is 4%, the probability of predicting the license plate "Jiang A 827L3" is 1.8%, the probability of predicting the license plate "Jiang A GZ7L3" is 1%, and the probability of predicting the license plate "Jiang A 897L3" is 0.2%. A preset probability threshold is used. For license plates with probabilities higher than the threshold, images of erroneous license plates are sequentially placed into the corresponding image archives for inspection, based on their probability values. This determines whether the erroneous license plate images meet the spatiotemporal rationality and vehicle body similarity conditions. Specifically, if the vehicle body attributes, including body color, vehicle model, and interior decorations, are all identical, the vehicle body similarity condition is considered met. In this case, the erroneous license plate images can be clustered into the image archive corresponding to the predicted license plate to obtain the initial image archive. 
For example, an incorrect license plate image can be placed in the vehicle image file corresponding to "Jiang A BZ7L3". If the spatiotemporal rationality condition and the vehicle body similarity condition are met, the incorrect license plate image is clustered into the vehicle image file corresponding to "Jiang A BZ7L3". If the incorrect license plate image is placed in the vehicle image file corresponding to "Jiang A BZ7L3" but the spatiotemporal rationality condition and the vehicle body similarity condition are not met, the incorrect license plate image is placed in the vehicle image file corresponding to "Jiang A 827L3" to determine whether the incorrect license plate image meets the spatiotemporal rationality condition and the vehicle body similarity condition in that vehicle image file.
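As a non-limiting illustration, the predicted license plate probabilities in the example above can be reproduced with a short Python sketch. The function name and the layout of the confusion table are hypothetical, and the sketch assumes that exactly one character is substituted per predicted plate, as in the example. Only the serial part of the plate is processed; the prefix is carried over unchanged.

```python
def predicted_plates(serial, confusions):
    """List (predicted_serial, probability) pairs, most probable first.

    confusions maps a character position to a pair of
    (probability that the character was misread, {replacement: probability}).
    """
    candidates = []
    for index, (error_prob, replacements) in confusions.items():
        for char, replace_prob in replacements.items():
            predicted = serial[:index] + char + serial[index + 1:]
            candidates.append((predicted, error_prob * replace_prob))
    return sorted(candidates, key=lambda item: item[1], reverse=True)


prefix = "Jiang A "
serial = "8Z7L3"
confusions = {
    0: (0.05, {"B": 0.8, "G": 0.2}),  # the character 8
    1: (0.02, {"2": 0.9, "9": 0.1}),  # the character Z
}
ranked = predicted_plates(serial, confusions)
ranked_plates = [prefix + predicted for predicted, _ in ranked]
```

The resulting order is the order in which the erroneous license plate image would be tried against the corresponding archives, subject to the preset probability threshold.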

[0083] In some embodiments, if the erroneous license plate image does not meet the spatiotemporal rationality condition and vehicle body similarity condition in the license plate image files corresponding to multiple predicted license plates, the erroneous license plate image is placed in the license plate image file where the erroneous license plate image was before it was filtered, so as to obtain the initial image file.

[0084] In some embodiments, multiple second-type images and multiple initial image files are clustered according to facial information to obtain multiple image files. This can be achieved through the following steps: clustering the second-type images and first-type images according to facial information to obtain multiple facial image files; sorting the images in each facial image file in chronological order, and for each second-type image in the facial image file, determining the adjacent first-type image whose chronological information is closest to that of the second-type image; when the second-type image meets the spatiotemporal rationality condition and vehicle body similarity condition in the initial image file corresponding to the adjacent first-type image, clustering the second-type image into the initial image file corresponding to the adjacent first-type image to obtain an image file.

[0085] In some embodiments, if a second-type image does not meet the spatiotemporal plausibility and vehicle body similarity conditions in the initial image archive corresponding to an adjacent first-type image, then the next adjacent first-type image in the face image archive is determined. If a second-type image meets the spatiotemporal plausibility and vehicle body similarity conditions in the initial image archive corresponding to the next adjacent first-type image, then the second-type image is clustered into the initial image archive corresponding to the next adjacent first-type image to obtain an image archive. The next adjacent first-type image and the adjacent first-type image are the first-type images in the face image archive that are temporally closest to the second-type image, and the next adjacent first-type image and the adjacent first-type image are respectively distributed before and after the time corresponding to the second-type image.

[0086] As an example, for a face image archive containing first-type images, the images in the archive can be sorted chronologically. For each second-type image, the first-type image closest in time is determined based on the time information, and the corresponding initial image archive is identified. The second-type image can then be inserted into that initial image archive according to time, and it is determined whether it meets the spatiotemporal rationality condition and the vehicle body similarity condition there. If it does not, the next nearest first-type image in the face image archive is identified, and it is determined whether the second-type image meets the two conditions in the initial image archive corresponding to that image. For example, in a face image archive, a second-type image is preceded by first-type images captured 2 minutes and 3 minutes earlier and followed by a first-type image captured 4 minutes later. The second-type image is first placed in the initial image archive corresponding to the first-type image captured 2 minutes earlier, and the spatiotemporal rationality condition and vehicle body similarity condition are judged. If it does not meet them, the second-type image is placed in the initial image archive corresponding to the first-type image captured 4 minutes later, and the two conditions are judged again.
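As a non-limiting illustration, the order in which neighbouring first-type images are tried can be sketched in Python. The dictionary keys and the function name are hypothetical. Following the preceding paragraphs, the sketch takes at most one first-type image before and one after the second-type image and tries the temporally closer one first.

```python
from datetime import datetime, timedelta


def neighbour_first_type_images(face_archive, target_time):
    """Return the nearest first-type image before and after target_time, closest first."""
    first_type = [img for img in face_archive if img["has_plate"]]
    before = [img for img in first_type if img["time"] < target_time]
    after = [img for img in first_type if img["time"] > target_time]
    nearest_before = max(before, key=lambda img: img["time"], default=None)
    nearest_after = min(after, key=lambda img: img["time"], default=None)
    found = [img for img in (nearest_before, nearest_after) if img is not None]
    return sorted(found, key=lambda img: abs(img["time"] - target_time))


t = datetime(2022, 12, 21, 10, 30)  # capture time of the second-type image
face_archive = [
    {"id": "3_min_before", "time": t - timedelta(minutes=3), "has_plate": True},
    {"id": "2_min_before", "time": t - timedelta(minutes=2), "has_plate": True},
    {"id": "second_type", "time": t, "has_plate": False},
    {"id": "4_min_after", "time": t + timedelta(minutes=4), "has_plate": True},
]
try_order = [img["id"] for img in neighbour_first_type_images(face_archive, t)]
```

The image captured 3 minutes earlier is never tried, because the second attempt is made on the other side of the second-type image in time.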

[0087] In some embodiments, the images in a face image archive that does not include any first-type image can be used as images to be clustered.

[0088] Based on the same technical concept, this application provides an image clustering device 300, as shown in Figure 3. The device 300 can perform each step in the above-described image clustering method. The device 300 includes a first determining module 301, a second determining module 302, and a comparison module 303.

[0089] The first determining module 301 is used to determine, for any two adjacent images in an image archive, multiple first images to be clustered from the images to be clustered, corresponding to the time range and geographical location of the two adjacent images; the image archive consists of multiple images of the same vehicle collected by multiple acquisition devices and clustered according to license plate information and facial information; the two adjacent images are images that are adjacent after being sorted according to time information; the images to be clustered are images that cannot be clustered into the image archive collected by the multiple acquisition devices corresponding to the image archive.

[0090] The second determining module 302 determines K second images to be clustered from the plurality of first images to be clustered, wherein the average vehicle speed is within the driving speed range; the average vehicle speed is the average speed between the geographical locations corresponding to the first images to be clustered and the target image, the target image is any one of the two adjacent images, and the driving speed range is determined based on the time information and location information corresponding to the two adjacent images respectively;

[0091] The comparison module 303 is used to compare the vehicle feature information of the second image to be clustered with the vehicle feature information of L images in the image archive for each second image to be clustered to obtain L feature similarities. The deviation between the vehicle angle information of each of the L images and the vehicle angle information of the second image to be clustered is less than a set deviation threshold.

[0092] The second determining module 302 is further configured to cluster the second image to be clustered into the image archive when it is determined that the second image to be clustered meets the clustering conditions based on the L feature similarities.

[0093] In some embodiments, when the second determining module 302 determines that the second image to be clustered meets the clustering conditions based on the L feature similarities, it is specifically used for:

[0094] Determine the number of images among the L images whose feature similarity is greater than a similarity threshold, and determine the proportion of similar images based on that number; when the proportion of similar images is greater than a set proportion threshold, determine that the second image to be clustered meets the clustering conditions.

[0095] In some embodiments, the second image to be clustered has a similarity ratio in other image archives, and the clustering condition also includes that the similarity ratio of the second image to be clustered in other image archives is less than the similarity ratio of the second image to be clustered in the image archive.

[0096] In some embodiments, the images acquired by the plurality of acquisition devices include a plurality of first-type images containing license plate information, and a second-type image containing facial information but not license plate information. The first determining module 301 is further configured to determine the image file in the following manner:

[0097] Based on the license plate information, the multiple first-type images are clustered to obtain multiple initial image files;

[0098] Based on facial information, the multiple second-type images are clustered with the multiple initial image files to obtain multiple image files.

[0099] In some embodiments, the first determining module 301, when clustering the plurality of first-type images according to license plate information to obtain a plurality of initial image files, is specifically used for: clustering the plurality of first-type images according to license plate information to determine a plurality of license plate image files; for any license plate image file, filtering out erroneous license plate images in the license plate image file that do not meet the spatiotemporal rationality condition and the vehicle body similarity condition; determining a plurality of predicted license plates corresponding to the license plate information and the license plate probability corresponding to each predicted license plate based on the character error probability in the license plate information of the erroneous license plate image and the probability that the character in the license plate information is predicted to be other characters; and, according to the license plate probability of each predicted license plate in descending order, if the erroneous license plate image meets the spatiotemporal rationality condition and the face similarity condition in the license plate image file corresponding to the predicted license plate information, then clustering the erroneous license plate image into the license plate image file corresponding to the predicted license plate information to obtain an initial image file.

[0100] In some embodiments, if the erroneous license plate image does not meet the spatiotemporal rationality condition and vehicle body similarity condition in the license plate image files corresponding to the multiple predicted license plates, the erroneous license plate image is placed in the license plate image file where the erroneous license plate image was before it was filtered, so as to obtain an initial image file.

[0101] In some embodiments, the spatiotemporal rationality condition is that the average speed determined by the geographical location of any two adjacent images in the license plate image archive is less than a speed threshold, and the vehicle body similarity condition is that the vehicle body attribute information of any two adjacent images is the same. The vehicle body attribute information includes at least one of the following: vehicle body color, vehicle model, and interior decorations; the two adjacent images are adjacent images in the license plate image archive after being sorted according to time information.

[0102] In some embodiments, the first determining module 301, when clustering the plurality of second-type images with the plurality of initial image files according to facial information to obtain a plurality of image files, is specifically configured to: cluster the second-type images and the first-type images according to facial information to obtain a plurality of facial image files; sort the images in each facial image file according to time order, and for each second-type image in the facial image file, determine the adjacent first-type image whose time information is closest to that of the second-type image; when the second-type image meets the spatiotemporal rationality condition and the vehicle body similarity condition in the initial image file corresponding to the adjacent first-type image, cluster the second-type image into the initial image file corresponding to the adjacent first-type image to obtain an image file.

[0103] In some embodiments, the first determining module 301 is further configured to: if the second type of image does not meet the spatiotemporal rationality condition and the vehicle body similarity condition in the initial image file corresponding to the adjacent first type of image, then determine the next adjacent first type of image in the face image file, wherein the next adjacent first type of image and the adjacent first type of image are the first type of images in the face image file that are closest to the second type of image in time, and the next adjacent first type of image and the adjacent first type of image are respectively distributed before and after the time corresponding to the second type of image; if the second type of image meets the spatiotemporal rationality condition and the vehicle body similarity condition in the initial image file corresponding to the next adjacent first type of image, then cluster the second type of image into the initial image file corresponding to the next adjacent first type of image to obtain an image file.

[0104] Based on the same technical concept, this application provides an execution device 400, as shown in Figure 4. The device 400 can perform the various steps in the image clustering method described above. The device 400 includes a memory 401 and a processor 402.

[0105] The memory 401 is used to store program instructions;

[0106] The processor 402 is used to call the program instructions stored in the memory and execute any step of the above image clustering method according to the obtained program.

[0107] In the embodiments of this application, the processor 402 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, capable of implementing or executing the various methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor, etc. The steps of the methods disclosed in the embodiments of this application can be directly manifested as being executed by a hardware processor, or being executed by a combination of hardware and software modules in the processor.

[0108] Memory 401, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. Memory 401 may include at least one type of storage medium, such as flash memory, hard disk, multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic storage, magnetic disk, optical disk, etc. Memory 401 can be any other medium capable of carrying or storing desired program code in the form of instructions or data structures that can be accessed by a computer, but is not limited thereto. In the embodiments of this application, memory 401 can also be a circuit or any other device capable of implementing storage functions for storing program instructions and / or data.

[0109] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0110] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to this application. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more blocks of the flowchart illustrations and / or one or more blocks of the block diagrams.

[0111] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flowcharts and / or one or more block diagrams.

[0112] These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions, which execute on the computer or other programmable apparatus, provide steps for implementing the functions specified in one or more flowcharts and / or one or more block diagrams.

[0113] Obviously, those skilled in the art can make various modifications and variations to this application without departing from the spirit and scope of this application. Therefore, if such modifications and variations fall within the scope of the claims of this application and their equivalents, this application also intends to include such modifications and variations.

Claims

1. An image clustering method, characterized in that it comprises:
for any two adjacent images in an image archive, determining, from images to be clustered, a plurality of first images to be clustered that lie within the time range and between the geographical locations corresponding to the two adjacent images; wherein the image archive consists of multiple images of the same vehicle that were collected by multiple acquisition devices and clustered according to license plate information and facial information; the two adjacent images are adjacent after the images are sorted by time information; the images to be clustered are images, collected by the multiple acquisition devices corresponding to the image archive, that could not be clustered into the image archive and that do not include license plate information; and the images collected by the multiple acquisition devices include multiple first-type images containing license plate information and second-type images containing facial information but not license plate information;
wherein the image archive is determined as follows: clustering the multiple first-type images according to license plate information to determine multiple license plate image archives; for any license plate image archive, filtering out erroneous license plate images that do not meet a spatiotemporal rationality condition and a vehicle body similarity condition; based on the probability of character errors in the license plate information of the erroneous license plate images and the probability that characters in the license plate information are predicted as other characters, clustering the erroneous license plate images into a license plate image archive to obtain an initial image archive; and clustering, based on facial information, the multiple second-type images with the multiple initial image archives to obtain multiple image archives;
determining, from the plurality of first images to be clustered, K second images to be clustered whose average vehicle speed is within a driving speed range; wherein the average vehicle speed is the average speed between the geographical locations corresponding to the first image to be clustered and a target image, the target image is either of the two adjacent images, and the driving speed range is determined based on the time information and location information corresponding to each of the two adjacent images;
for each second image to be clustered, comparing the vehicle feature information of the second image to be clustered with the vehicle feature information of L images in the image archive to obtain L feature similarities, wherein the deviation between the vehicle angle information of each of the L images and the vehicle angle information of the second image to be clustered is less than a set deviation threshold; and
when it is determined, based on the L feature similarities, that the second image to be clustered meets the clustering condition, clustering the second image to be clustered into the image archive.
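
By way of non-limiting illustration only, the speed-based screening recited in claim 1 might be sketched as follows. The claim does not state how the driving speed range is derived, so this sketch assumes it is a tolerance band around the archive's own average speed between the two adjacent images. The `Capture` type, its fields, the `tolerance` parameter, and the use of the haversine distance are all hypothetical choices made for the example. The geographical filter is omitted for brevity, and only the time window is applied before the speed check.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Capture:
    t: float    # capture time in seconds (hypothetical field)
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


def haversine_km(a: Capture, b: Capture) -> float:
    """Great-circle distance between two capture locations, in kilometres."""
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dlat = p2 - p1
    dlon = math.radians(b.lon - a.lon)
    h = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def average_speed_kmh(a: Capture, b: Capture) -> float:
    """Average speed implied by travelling between two captures."""
    hours = abs(b.t - a.t) / 3600.0
    return math.inf if hours == 0 else haversine_km(a, b) / hours


def select_second_images(prev: Capture, nxt: Capture, pending, tolerance=0.5):
    """Keep pending captures whose implied speed fits the assumed speed range."""
    # Assumption: the driving speed range is a band around the speed
    # between the two adjacent archive images.
    reference = average_speed_kmh(prev, nxt)
    low, high = reference * (1 - tolerance), reference * (1 + tolerance)
    # First images to be clustered: captured strictly between the two archive images.
    first = [c for c in pending if prev.t < c.t < nxt.t]
    # Second images to be clustered: speed relative to the target image (here: prev).
    return [c for c in first if low <= average_speed_kmh(prev, c) <= high]
```

Here `prev` plays the role of the target image; the claim allows either of the two adjacent images to be used.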

2. The method as described in claim 1, characterized in that determining, based on the L feature similarities, that the second image to be clustered meets the clustering condition comprises: determining the number of images among the L images whose feature similarity is greater than a similarity threshold, and determining a proportion of similar images based on that number; and when the proportion of similar images is greater than a set proportion threshold, determining that the second image to be clustered meets the clustering condition.
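
By way of non-limiting illustration only, the angle-restricted comparison and the proportion test of claim 2 might be sketched as follows. The sketch assumes that images are dictionaries with hypothetical `angle` (degrees) and `feature` (numeric vector) entries, that feature similarity is cosine similarity, and that the proportion is the number of similar images divided by L. None of these choices, nor the default thresholds, is specified by the claims.

```python
import math


def angle_deviation(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def cosine_similarity(u, v) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_proportion(candidate, archive, angle_threshold=30.0, similarity_threshold=0.8):
    """Fraction of the L angle-compatible archive images that resemble the candidate."""
    # The L images: archive images whose vehicle angle is close to the candidate's.
    comparable = [
        img for img in archive
        if angle_deviation(img["angle"], candidate["angle"]) < angle_threshold
    ]
    if not comparable:
        return 0.0
    similar = sum(
        cosine_similarity(candidate["feature"], img["feature"]) > similarity_threshold
        for img in comparable
    )
    return similar / len(comparable)


def meets_clustering_condition(candidate, archive, proportion_threshold=0.5):
    """Claim 2 test: proportion of similar images exceeds the set threshold."""
    return similar_proportion(candidate, archive) > proportion_threshold
```

Restricting the comparison to images taken from a similar angle means a front view of a vehicle is not scored against a rear view of the same vehicle.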

3. The method as described in claim 2, characterized in that the clustering condition further includes: the proportion of similar images of the second image to be clustered in other image archives is less than the proportion of similar images of the second image to be clustered in the image archive itself.
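
By way of non-limiting illustration only, the additional condition of claim 3 might be sketched as follows, assuming the proportion of similar images has already been computed for each candidate archive. The function name, the dictionary layout, and the handling of ties (a tie is treated as failing the condition) are assumptions made for the example.

```python
def choose_archive(proportions, proportion_threshold=0.5):
    """Pick the archive for a second image to be clustered, or None.

    `proportions` maps a (hypothetical) archive id to the proportion of
    similar images the candidate obtained in that archive.
    """
    if not proportions:
        return None
    best_id = max(proportions, key=proportions.get)
    best = proportions[best_id]
    # Claim 2 condition: the proportion must exceed the set threshold.
    if best <= proportion_threshold:
        return None
    # Claim 3 condition: every other archive must score strictly lower.
    if any(p >= best for aid, p in proportions.items() if aid != best_id):
        return None
    return best_id
```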

4. The method as described in claim 1, characterized in that clustering the erroneous license plate images into a license plate image archive, based on the probability of character errors in the license plate information of the erroneous license plate images and the probability that characters in the license plate information are predicted as other characters, to obtain an initial image archive comprises: determining, based on the probability of character errors in the license plate information of the erroneous license plate image and the probability that characters in the license plate information are predicted as other characters, multiple predicted license plates corresponding to the license plate information and a license plate probability corresponding to each predicted license plate; and, proceeding through the predicted license plates in descending order of license plate probability, if the erroneous license plate image satisfies the spatiotemporal rationality condition and the facial similarity condition in the license plate image archive corresponding to a predicted license plate, clustering the erroneous license plate image into the license plate image archive corresponding to that predicted license plate to obtain the initial image archive.

5. The method as described in claim 4, characterized in that, if the erroneous license plate image does not meet the spatiotemporal rationality condition and the vehicle body similarity condition in any of the license plate image archives corresponding to the multiple predicted license plates, the erroneous license plate image is returned to the license plate image archive it belonged to before being filtered out, so as to obtain the initial image archive.
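
By way of non-limiting illustration only, the plate prediction of claim 4 and the fallback of claim 5 might be sketched as follows. The sketch assumes that only single-character substitutions are considered and that a predicted plate's probability is the product of the character's error probability and the probability of it being confused with the substitute. The names `char_error_prob`, `confusion`, and the `fits` callback (standing in for the condition checks) are hypothetical.

```python
def predict_plates(plate, char_error_prob, confusion):
    """Return (predicted_plate, probability) pairs, most probable first.

    char_error_prob[i] is the assumed probability that character i was misread;
    confusion[c][d] is the assumed probability that true character d was read as c.
    """
    candidates = []
    for i, ch in enumerate(plate):
        for alt, p_confused in confusion.get(ch, {}).items():
            predicted = plate[:i] + alt + plate[i + 1:]
            candidates.append((predicted, char_error_prob[i] * p_confused))
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates


def reassign_plate(plate, char_error_prob, confusion, archives, fits):
    """Choose the archive key for an erroneous license plate image.

    `archives` maps plate text to that plate's image archive; `fits(archive)`
    is a caller-supplied check of the conditions recited in the claims.
    """
    for predicted, _prob in predict_plates(plate, char_error_prob, confusion):
        if predicted in archives and fits(archives[predicted]):
            return predicted
    # Claim 5 fallback: keep the image in the archive it was filtered from.
    return plate
```

Trying candidates in descending probability means the most plausible correction is accepted first, and the conditions act as a safeguard against a likely but wrong correction.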

6. The method as described in claim 4 or 5, characterized in that the spatiotemporal rationality condition is that the average speed determined from the geographical locations of any two adjacent images in the license plate image archive is less than a speed threshold; the vehicle body similarity condition is that the vehicle body attribute information of any two adjacent images is the same; the vehicle body attribute information includes at least one of the following: vehicle body color, vehicle model, and interior decorations; and the two adjacent images are images that are adjacent in the license plate image archive after sorting by time information.
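
By way of non-limiting illustration only, the two conditions of claim 6 might be checked as follows. The sketch assumes that each image is a dictionary with a hypothetical time `t` in seconds, a planar position `pos_km` in kilometres, and body attributes stored under the keys `color` and `model`; the default speed threshold of 120 km/h is likewise an assumed value.

```python
import math


def _adjacent_pairs(images):
    """Yield time-adjacent image pairs after sorting by capture time."""
    ordered = sorted(images, key=lambda im: im["t"])
    return zip(ordered, ordered[1:])


def is_spatiotemporally_rational(images, speed_threshold_kmh=120.0):
    """True if every adjacent pair implies a speed below the threshold."""
    for a, b in _adjacent_pairs(images):
        hours = (b["t"] - a["t"]) / 3600.0
        dist_km = math.dist(a["pos_km"], b["pos_km"])
        if hours == 0:
            # Two locations at the same instant imply an unbounded speed.
            if dist_km > 0:
                return False
            continue
        if dist_km / hours >= speed_threshold_kmh:
            return False
    return True


def has_same_body(images, keys=("color", "model")):
    """True if every adjacent pair agrees on the chosen body attributes."""
    return all(a[k] == b[k] for a, b in _adjacent_pairs(images) for k in keys)
```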

7. The method as described in claim 1, characterized in that clustering, based on facial information, the multiple second-type images with the multiple initial image archives to obtain multiple image archives comprises: clustering the second-type images and the first-type images according to facial information to obtain multiple facial image archives; sorting the images in each facial image archive in chronological order and, for each second-type image in the facial image archive, determining the adjacent first-type image that is closest to the second-type image in time; and when the second-type image meets the spatiotemporal rationality condition and the vehicle body similarity condition in the initial image archive corresponding to the adjacent first-type image, clustering the second-type image into the initial image archive corresponding to the adjacent first-type image to obtain the image archive.

8. The method as described in claim 7, characterized in that the method further comprises: if the second-type image does not meet the spatiotemporal rationality condition and the vehicle body similarity condition in the initial image archive corresponding to the adjacent first-type image, determining the next adjacent first-type image in the facial image archive, wherein the next adjacent first-type image and the adjacent first-type image are the first-type images in the facial image archive that are closest in time to the second-type image and lie respectively before and after the time corresponding to the second-type image; and if the second-type image meets the spatiotemporal rationality condition and the vehicle body similarity condition in the initial image archive corresponding to the next adjacent first-type image, clustering the second-type image into the initial image archive corresponding to the next adjacent first-type image to obtain an image archive.
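
By way of non-limiting illustration only, the neighbour selection of claims 7 and 8 might be sketched as follows. The sketch assumes that images in a facial image archive are dictionaries with hypothetical `t`, `has_plate`, and `archive_id` entries, and that `fits` is a caller-supplied check of the spatiotemporal rationality and vehicle body similarity conditions.

```python
def nearest_plate_neighbours(face_archive, target):
    """Closest first-type image before and after the target, nearest first."""
    before = [im for im in face_archive if im["has_plate"] and im["t"] <= target["t"]]
    after = [im for im in face_archive if im["has_plate"] and im["t"] > target["t"]]
    picks = []
    if before:
        picks.append(max(before, key=lambda im: im["t"]))
    if after:
        picks.append(min(after, key=lambda im: im["t"]))
    # Try the neighbour that is closest in time first (claim 7), then the
    # one on the other side of the target (claim 8).
    picks.sort(key=lambda im: abs(im["t"] - target["t"]))
    return picks


def assign_second_type(face_archive, target, fits):
    """Return the initial archive id for a second-type image, or None."""
    for neighbour in nearest_plate_neighbours(face_archive, target):
        if fits(neighbour["archive_id"], target):
            return neighbour["archive_id"]
    return None
```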

9. An image clustering device, characterized in that it comprises:
a first determining module, configured to, for any two adjacent images in an image archive, determine, from images to be clustered, a plurality of first images to be clustered that lie within the time range and between the geographical locations corresponding to the two adjacent images; wherein the image archive consists of multiple images of the same vehicle that were collected by multiple acquisition devices and clustered according to license plate information and facial information; the two adjacent images are adjacent after the images are sorted by time information; the images to be clustered are images, collected by the multiple acquisition devices corresponding to the image archive, that could not be clustered into the image archive and that do not include license plate information; and the images collected by the multiple acquisition devices include multiple first-type images containing license plate information and second-type images containing facial information but not license plate information;
wherein the image archive is determined as follows: the multiple first-type images are clustered according to license plate information to determine multiple license plate image archives; for any license plate image archive, erroneous license plate images that do not meet the spatiotemporal rationality condition and the vehicle body similarity condition are filtered out; based on the probability of character errors in the license plate information of the erroneous license plate images and the probability that characters in the license plate information are predicted as other characters, the erroneous license plate images are clustered into a license plate image archive to obtain an initial image archive; and, based on facial information, the multiple second-type images are clustered with the multiple initial image archives to obtain multiple image archives;
a second determining module, configured to determine, from the plurality of first images to be clustered, K second images to be clustered whose average vehicle speed is within a driving speed range; wherein the average vehicle speed is the average speed between the geographical locations corresponding to the first image to be clustered and a target image, the target image is either of the two adjacent images, and the driving speed range is determined based on the time information and location information corresponding to each of the two adjacent images; and
a comparison module, configured to compare, for each second image to be clustered, the vehicle feature information of the second image to be clustered with the vehicle feature information of L images in the image archive to obtain L feature similarities, wherein the deviation between the vehicle angle information of each of the L images and the vehicle angle information of the second image to be clustered is less than a set deviation threshold;
wherein the second determining module is further configured to cluster the second image to be clustered into the image archive when it is determined, based on the L feature similarities, that the second image to be clustered meets the clustering condition.

10. An execution device, characterized in that it comprises: a memory, configured to store program instructions; and a processor, configured to invoke the program instructions stored in the memory and to execute, according to the obtained program instructions, the method as described in any one of claims 1-8.

11. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions that, when executed on a computer, cause the computer to perform the method as described in any one of claims 1-8.