A vehicle tracking method, apparatus, device and medium

By obtaining vehicle feature vectors and region categories from image frames, and utilizing training models and verification algorithms, the problem of low accuracy in vehicle feature matching was solved, thereby improving the precision and accuracy of vehicle tracking.

CN115272988B · Active · Publication Date: 2026-06-16 · CHINA TELECOM CLOUD TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: CHINA TELECOM CLOUD TECH CO LTD
Filing Date: 2022-07-18
Publication Date: 2026-06-16

IPC: G06V20/54; G06V10/40; G06V10/75; G06V10/764

CPC: G06V20/54; G06V10/764; G06V10/40; G06V10/75; G06V2201/08

AI Tagging

Application Domain

Character and pattern recognition

AI Technical Summary

⚠Technical Problem

In existing technologies, the accuracy of vehicle feature matching is low, which leads to a decrease in vehicle tracking accuracy. In particular, when crossing image acquisition devices, the vehicle ID recognition error is large, resulting in tracking failure.

⚗Method used

By acquiring vehicle feature vectors and region categories from image frames, the trained model is used for matching to determine vehicle IDs. Kalman filtering and the Hungarian algorithm are then combined to verify vehicle consistency, thereby improving feature matching accuracy.

🎯Benefits of technology

It improves the accuracy of vehicle feature matching, enhances the accuracy of vehicle tracking, and ensures the correctness of vehicle ID recognition across image acquisition devices.

✦ Generated by Eureka AI based on patent content.

Figure CN115272988B_ABST

Abstract

Embodiments of this application provide a vehicle tracking method, apparatus, device, and medium. The electronic device determines, for each vehicle in the current image frame, the sub-feature vector corresponding to each region category and the region category corresponding to the captured area of the vehicle. The electronic device can then match each vehicle in the current image frame against the vehicles in the previous image frame using the vehicle's target region category and target feature vector, which improves the accuracy of vehicle feature matching and, in turn, the accuracy of vehicle tracking.

Description

Technical Field

[0001] This application relates to the field of image processing technology, and in particular to a vehicle tracking method, apparatus, device, and medium.

Background Technology

[0002] Multi-object tracking algorithms, given an image sequence, determine the position of each object in each image within the sequence, and then determine the trajectory of each object. Based on this, electronic devices can achieve vehicle tracking across image acquisition devices by utilizing the positions of the image acquisition devices and the features of each vehicle in the image frames acquired by each device.

[0003] In existing technologies, when extracting vehicle features, each image frame captured by an image acquisition device within a certain time period is typically acquired. For each vehicle, each feature of that vehicle in each image frame is determined. The features are then weighted or averaged, and the result is taken as the vehicle's feature. This feature is matched against the vehicle features determined in the previous time period to identify the target vehicle that matches the vehicle, thus determining the vehicle's ID. However, because vehicles move quickly, the relative position of the vehicle and the image acquisition device changes rapidly, so the vehicle's features differ significantly across the image frames captured within the time period. This reduces the accuracy of vehicle feature matching and, in turn, the accuracy of vehicle tracking. Furthermore, when tracking vehicles across image acquisition devices, vehicle ID re-identification must be performed using vehicle features. The shooting angles of multiple image acquisition devices generally differ, so the extracted vehicle features differ significantly. This leads to large errors in vehicle ID re-identification and to tracking failures when tracking across image acquisition devices.

Summary of the Invention

[0004] This application provides a vehicle tracking method, apparatus, device, and medium to solve the problem in the prior art where the extracted vehicle features in each frame have too large differences, leading to a decrease in the accuracy of vehicle feature matching and further a decrease in vehicle tracking accuracy.

[0005] In a first aspect, embodiments of this application provide a vehicle tracking method, the method comprising:

[0006] Acquire the current image frame captured by the image acquisition device;

[0007] The current image frame is input into the trained model to obtain the feature vector of each vehicle in the current image frame output by the model, and the region category corresponding to the area where each vehicle is collected. The feature vector includes a sub-feature vector corresponding to each region category.

[0008] For each vehicle contained in the current image frame, a target vehicle matching the vehicle in the previous image frame is determined based on the target feature vector and target region category corresponding to the vehicle; the ID corresponding to the target vehicle is determined as the ID of the vehicle in the current image frame.

[0009] In a second aspect, embodiments of this application also provide a vehicle tracking device, the device comprising:

[0010] The acquisition module is used to acquire the current image frame captured by the image acquisition device;

[0011] The feature extraction module is used to input the current image frame into the trained model, obtain the feature vector of each vehicle contained in the current image frame output by the model, and the region category corresponding to the area where each vehicle is collected, wherein the feature vector includes a sub-feature vector corresponding to each region category;

[0012] The matching module is used to determine the target vehicle that matches the vehicle in the previous image frame for each vehicle contained in the current image frame, based on the target feature vector and target region category corresponding to the vehicle; and to determine the ID corresponding to the target vehicle as the ID of the vehicle in the current image frame.

[0013] In a third aspect, embodiments of this application also provide an electronic device, the electronic device including a processor, the processor being configured to execute a computer program stored in a memory to implement the steps of the vehicle tracking method as described above.

[0014] In a fourth aspect, embodiments of this application also provide a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the vehicle tracking method described above.

[0015] In this embodiment, the electronic device acquires the current image frame captured by the image acquisition device, inputs the current image frame into the trained model, and obtains the feature vector of each vehicle contained in the current image frame, as well as the region category corresponding to the region where each vehicle is acquired. The feature vector includes a sub-feature vector corresponding to each region category. For each vehicle in the current image frame, based on the target feature vector and target region category corresponding to the vehicle, the target vehicle matching the vehicle in the previous image frame is determined. The ID corresponding to the target vehicle is then determined as the ID of the vehicle in the current image frame. That is, in this embodiment, the electronic device determines the sub-feature vector corresponding to each region category of each vehicle in the current image frame, as well as the region category corresponding to the region where each vehicle is acquired. The electronic device can match the vehicle in the previous image frame with the target region category and target feature vector of the vehicle in the current image frame, improving the accuracy of vehicle feature matching and further improving the accuracy of vehicle tracking.

Attached Figure Description

[0016] To more clearly illustrate the technical solutions of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0017] Figure 1 is a schematic diagram of a vehicle tracking process provided in an embodiment of this application;

[0018] Figure 2 is a schematic diagram of the current image frame captured by the image acquisition device, as provided in the embodiments of this application;

[0019] Figure 3 is a schematic diagram of vehicles of each region category, as provided in the embodiments of this application;

[0020] Figure 4a shows the region category of vehicle a in the first image frame acquired by image acquisition device 1, as provided in the embodiments of this application;

[0021] Figure 4b shows the region category of vehicle a in the second image frame acquired by image acquisition device 1, as provided in the embodiments of this application;

[0022] Figure 4c shows the region category of vehicle a in the third image frame acquired by image acquisition device 2, as provided in the embodiments of this application;

[0023] Figure 4d shows the region category of vehicle a in the fourth image frame acquired by image acquisition device 2, as provided in the embodiments of this application;

[0024] Figure 5 is a schematic diagram of the model structure provided in the embodiments of this application;

[0025] Figure 6 is a schematic diagram illustrating the appearance of a vehicle in an image frame acquired by an image acquisition device, as provided in an embodiment of this application;

[0026] Figure 7 is a schematic diagram of a vehicle tracking device provided in an embodiment of this application;

[0027] Figure 8 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0028] To make the objectives, technical solutions, and advantages of this application clearer, the application will be further described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0029] To improve the accuracy of vehicle feature matching and vehicle tracking, this application provides a vehicle tracking method, apparatus, device, and medium.

[0030] Example 1:

[0031] Figure 1 is a schematic diagram of a vehicle tracking process provided in an embodiment of this application. The process includes:

[0032] S101: Obtain the current image frame acquired by the image acquisition device.

[0033] The vehicle tracking method provided in this application is applied to an electronic device, which may be a server, PC, or image acquisition device, etc.

[0034] In this embodiment, the image acquisition device monitors the scene and objects within the monitoring range in real time and acquires image frames. The electronic device acquires the current image frame acquired by the image acquisition device in real time and tracks the vehicle based on the current image frame.

[0035] Figure 2 is a schematic diagram of the current image frame acquired by the image acquisition device provided in the embodiments of this application. As shown in Figure 2, the image acquisition device monitors the scene and objects within the monitoring range and acquires image frames.

[0036] S102: Input the current image frame into the trained model, obtain the feature vector of each vehicle contained in the current image frame output by the model, and the region category corresponding to the area where each vehicle is collected, wherein the feature vector includes a sub-feature vector corresponding to each region category.

[0037] After acquiring the current image frame, the electronic device can determine each vehicle contained in the current image frame, as well as the feature vector corresponding to each vehicle and the region category corresponding to the area where each vehicle is captured, so that the electronic device can perform vehicle tracking for each vehicle in the current image frame.

[0038] Specifically, when the electronic device performs vehicle tracking, the position of the same vehicle relative to the image acquisition device may change significantly across the image frames acquired by that device. To track each vehicle more reliably, in this embodiment the electronic device determines the region category corresponding to the area of each vehicle that is captured, and generates the vehicle's feature vector based on each region category of the vehicle. That is, the feature vector of each vehicle includes a sub-feature vector corresponding to each region category. In this embodiment, the region categories of a vehicle include at least the front, body, and rear of the vehicle.

[0039] Figure 3 is a schematic diagram of vehicles of each region category provided in the embodiments of this application. As shown in Figure 3, the region category of each vehicle in the first row is the front of the vehicle, the region category of each vehicle in the second row is the body of the vehicle, and the region category of each vehicle in the third row is the rear of the vehicle.

[0040] Specifically, in this embodiment of the application, the electronic device inputs the current image frame into the trained model. The model identifies each vehicle contained in the image frame, determines the feature vector of each vehicle in the current image frame, the region category corresponding to the area where each vehicle is collected, and outputs the feature vector and region category corresponding to each vehicle.
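As a rough illustration of the model output described above, the sketch below represents one detected vehicle as a feature vector split into per-region sub-feature vectors, plus the region category that was actually captured. The names `RegionCategory`, `VehicleObservation`, and `target_feature` are hypothetical, the three-value vectors are made-up placeholders, and treating the target feature vector as the sub-feature vector of the captured region category is an assumption rather than something this passage states.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List


class RegionCategory(IntEnum):
    """Region categories named in the text; the numeric codes are illustrative."""
    FRONT = 0
    BODY = 1
    REAR = 2


@dataclass
class VehicleObservation:
    """One vehicle in one image frame (hypothetical container)."""
    sub_features: Dict[RegionCategory, List[float]]  # one sub-vector per category
    region: RegionCategory                            # category captured in this frame

    def target_feature(self) -> List[float]:
        # Assumption: the "target" feature is the sub-vector of the captured region.
        return self.sub_features[self.region]


obs = VehicleObservation(
    sub_features={
        RegionCategory.FRONT: [0.9, 0.1, 0.0],
        RegionCategory.BODY: [0.2, 0.7, 0.1],
        RegionCategory.REAR: [0.0, 0.3, 0.7],
    },
    region=RegionCategory.BODY,
)
print(obs.target_feature())
```

Keeping the sub-vectors keyed by category means a later matching step can compare like with like, for example body against body, instead of one averaged vector.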

[0041] Figure 4a shows the region category of vehicle a in the first image frame acquired by image acquisition device 1, as provided in the embodiments of this application. As shown in Figure 4a, the region category corresponding to the captured area of vehicle a is the front of the vehicle.

[0042] Figure 4b shows the region category of vehicle a in the second image frame acquired by image acquisition device 1, as provided in the embodiments of this application. As shown in Figure 4b, the region category corresponding to the captured area of vehicle a is the body of the vehicle.

[0043] Figure 4c shows the region category of vehicle a in the third image frame acquired by image acquisition device 2, as provided in the embodiments of this application. As shown in Figure 4c, the region category corresponding to the captured area of vehicle a is the body of the vehicle.

[0044] Figure 4d shows the region category of vehicle a in the fourth image frame acquired by image acquisition device 2, as provided in the embodiments of this application. As shown in Figure 4d, the region category corresponding to the captured area of vehicle a is the rear of the vehicle.

[0045] S103: For each vehicle contained in the current image frame, determine the target vehicle that matches the vehicle in the previous image frame based on the target feature vector and target region category corresponding to the vehicle; and determine the ID corresponding to the target vehicle as the ID of the vehicle in the current image frame.

[0046] In this embodiment, for each vehicle contained in the current image frame, the electronic device determines a target vehicle matching the vehicle in the previous image frame based on the target region category and target feature vector of the vehicle in the current image frame, and sets the ID of the target vehicle as the ID of the vehicle in the current image frame. The previous image frame and the current image frame are acquired by the same image acquisition device.
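The matching step can be sketched as a nearest-neighbor search restricted to one region category. This is a minimal sketch under stated assumptions: cosine similarity as the matching score, a hypothetical `min_similarity` cutoff of 0.8, and previous-frame vehicles stored as a mapping from ID to per-category sub-feature vectors. None of these choices is specified in the text.

```python
import math
from typing import Dict, List, Optional

FRONT, BODY, REAR = 0, 1, 2  # illustrative codes for the region categories


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_previous(
    target_feature: List[float],
    target_region: int,
    previous: Dict[int, Dict[int, List[float]]],
    min_similarity: float = 0.8,  # hypothetical cutoff
) -> Optional[int]:
    """Return the ID of the best-matching previous-frame vehicle, or None."""
    best_id, best_score = None, min_similarity
    for vehicle_id, sub_features in previous.items():
        candidate = sub_features.get(target_region)
        if candidate is None:
            continue  # this vehicle has no sub-vector for the target category
        score = cosine(target_feature, candidate)
        if score >= best_score:
            best_id, best_score = vehicle_id, score
    return best_id


previous_frame = {
    7: {FRONT: [1.0, 0.0], BODY: [0.6, 0.8]},
    9: {FRONT: [0.0, 1.0], BODY: [0.8, 0.6]},
}
matched_id = match_previous([0.59, 0.81], BODY, previous_frame)
print(matched_id)
```

The current vehicle would then inherit the ID returned here. A `None` result would correspond to a vehicle with no sufficiently similar predecessor in the previous frame.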

[0047] Furthermore, in this embodiment, after determining the target vehicle that matches the vehicle, the electronic device can further verify whether the vehicle and the target vehicle are the same vehicle. Specifically, based on the position of each vehicle in the previous image frame, the electronic device uses Kalman filtering and the Hungarian algorithm to predict the motion state of each vehicle in the previous image frame and predict the position of each vehicle in the current image frame. Based on the prediction result, it further verifies whether the vehicle and the target vehicle are the same vehicle. This prediction process is prior art and will not be described in detail here.
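Since the text treats this verification as prior art, the following is only a minimal sketch of the usual pattern: a Kalman filter predicts where each previous-frame vehicle should appear, and an optimal one-to-one assignment pairs predictions with detections. It makes several assumptions: positions are tracked along a single image axis for brevity, the noise parameters are arbitrary, and a brute-force search over permutations stands in for the Hungarian algorithm (it finds the same minimum-cost assignment, but only scales to tiny inputs). `Kalman1D` and `best_assignment` are hypothetical names.

```python
from itertools import permutations
from typing import List, Tuple


class Kalman1D:
    """Constant-velocity Kalman filter on one coordinate; state is [position, velocity]."""

    def __init__(self, position: float, velocity: float = 0.0,
                 process_var: float = 1.0, measure_var: float = 4.0):
        self.p, self.v = position, velocity
        self.P = [[10.0, 0.0], [0.0, 10.0]]  # initial uncertainty (assumed)
        self.q, self.r = process_var, measure_var

    def predict(self, dt: float = 1.0) -> float:
        # x' = F x with F = [[1, dt], [0, 1]]
        self.p += dt * self.v
        (a, b), (c, d) = self.P
        # P' = F P F^T + Q, with Q = q * I as a simplification
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z: float) -> None:
        # Only position is measured: H = [1, 0]
        (a, b), (c, d) = self.P
        s = a + self.r                  # innovation variance
        k0, k1 = a / s, c / s           # Kalman gain
        y = z - self.p                  # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


def best_assignment(cost: List[List[float]]) -> Tuple[int, ...]:
    """Minimum-cost one-to-one assignment for a square cost matrix, by brute force."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda cols: sum(cost[i][cols[i]] for i in range(n)))


# Two tracks, each initialized from one frame and updated with the next.
tracks = [Kalman1D(100.0), Kalman1D(300.0)]
for kf, z in zip(tracks, [110.0, 290.0]):
    kf.predict()
    kf.update(z)

predicted = [kf.predict() for kf in tracks]   # expected positions in the current frame
detections = [281.0, 119.0]                   # detected positions in the current frame
cost = [[abs(p - d) for d in detections] for p in predicted]
assignment = best_assignment(cost)            # assignment[i] = detection index for track i

# Suppose feature matching paired detection 1 with track 0; accept only if motion agrees.
same_vehicle = assignment[0] == 1
print(assignment, same_vehicle)
```

In a larger system the brute-force search would be replaced by a proper Hungarian implementation, and the cost would typically use two-dimensional box positions rather than one coordinate.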

[0048] In this embodiment, the electronic device determines the sub-feature vector corresponding to each region category of each vehicle in the current image frame, as well as the region category corresponding to the region where each vehicle is collected. The electronic device can match the vehicle in the previous image frame with the target region category and target feature vector of the vehicle in the current image frame, thereby improving the accuracy of vehicle feature matching and further improving the accuracy of vehicle tracking.

[0049] Example 2:

[0050] To improve the accuracy of vehicle feature matching, based on the above embodiments, in this embodiment, the step of inputting the current image frame into the trained model to obtain the feature vector of each vehicle contained in the current image frame output by the model, and the region category corresponding to the region where each vehicle is sampled, includes:

[0051] The current image frame is input into the first sub-model of the model to determine the first intermediate image frame carrying the position information of each vehicle;

[0052] The first intermediate image frame is input into the second sub-model of the model to determine the second intermediate image frame carrying the feature vector of each vehicle;

[0053] The second intermediate image frame is input into the third sub-model of the model to determine the region category corresponding to each vehicle's region, and output the feature vector and region category corresponding to each vehicle.

[0054] In this embodiment of the application, when the model determines the feature vector and region category corresponding to each vehicle in the current image frame, the model includes at least three sub-models: a first sub-model for determining the location information of each vehicle in the current image frame, a second sub-model for determining the feature vector corresponding to each vehicle, and a third sub-model for determining the region category of each vehicle.

[0055] Specifically, in this embodiment, the model inputs the current image frame into a first sub-model, which includes at least modules such as a heatmap, center offset, and bounding box size. The heatmap is used to determine whether an object in the current image frame is a vehicle. The center offset is used to select each vehicle in the current image frame using a rectangle or similar shape. The bounding box size is used to construct a coordinate system in the current image frame according to a pre-configured coordinate system construction method, and to determine the coordinates of preset vertices and the length and width of the rectangle corresponding to each vehicle. The rectangle corresponding to each vehicle, along with the coordinates of its preset vertices and its length and width, constitutes the vehicle's position information. The first sub-model outputs a first intermediate image frame carrying the position information of each vehicle.
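One way to picture how a heatmap, a center offset, and a bounding box size combine into position information is the center-point decoding used by CenterNet-style detectors. The text does not confirm that the first sub-model decodes its outputs this way, so the sketch below is an assumption. The function name `decode_boxes`, the 0.5 threshold, the 3x3 peak test, and the toy grids are all hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_top_left, y_top_left, width, height)


def decode_boxes(heatmap, offsets, sizes, threshold: float = 0.5) -> List[Box]:
    """Turn per-cell head outputs into boxes.

    heatmap[r][c] is a vehicle score, offsets[r][c] is the (dx, dy) sub-cell shift
    of the center, and sizes[r][c] is (width, height). Grid cells are treated as pixels.
    """
    boxes = []
    rows, cols = len(heatmap), len(heatmap[0])
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Keep only local maxima over the 3x3 neighbourhood.
            neighbours = [heatmap[rr][cc]
                          for rr in range(max(0, r - 1), min(rows, r + 2))
                          for cc in range(max(0, c - 1), min(cols, c + 2))]
            if score < max(neighbours):
                continue
            dx, dy = offsets[r][c]
            w, h = sizes[r][c]
            cx, cy = c + dx, r + dy          # refined center
            boxes.append((cx - w / 2, cy - h / 2, w, h))
    return boxes


heatmap = [[0.0, 0.1, 0.2, 0.0],
           [0.1, 0.6, 0.9, 0.1],
           [0.0, 0.1, 0.2, 0.0]]
offsets = [[(0.0, 0.0)] * 4 for _ in range(3)]
sizes = [[(0.0, 0.0)] * 4 for _ in range(3)]
offsets[1][2] = (0.25, 0.5)
sizes[1][2] = (2.0, 1.0)

boxes = decode_boxes(heatmap, offsets, sizes)
print(boxes)
```

Each returned tuple matches the position information described above: the coordinates of one preset vertex (here the top-left corner) plus the length and width of the rectangle.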

[0056] The model inputs the first intermediate image frame into the second sub-model. Based on the location information of each vehicle carried in the first intermediate image frame, the second sub-model locates each vehicle within the first intermediate image frame and determines the corresponding feature vector for each vehicle. The second sub-model outputs a second intermediate image frame carrying the feature vector corresponding to each vehicle.

[0057] The model inputs the second intermediate image frame into a third sub-model, which determines the region category for each vehicle. The model outputs a feature vector and region category for each vehicle.

[0058] Figure 5 is a schematic diagram of the model structure provided in the embodiments of this application. As shown in Figure 5, the model is a Deep Layer Aggregation (DLA) network model that includes a first sub-model, a second sub-model, and a third sub-model. The first sub-model is a vehicle detection module, which includes a heatmap, a center offset, and a bounding box size. The heatmap is used to determine whether each object in the current image frame is a vehicle, the center offset is used to determine the position of each vehicle in the current image frame and select each vehicle, and the bounding box size is used to determine the position information of the bounding box corresponding to each vehicle. The second sub-model is a re-identification (ReID) module, which is used to determine the feature vector corresponding to each vehicle in the current image frame. The third sub-model is a vehicle morphology classification module, which is used to determine the region category corresponding to the area where each vehicle is captured.
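The data flow through the three sub-models can be sketched as a simple chain of callables. This is a structural illustration only: `run_model` and the three `toy_*` stand-ins are hypothetical, the toy functions do nothing like what real network heads on a shared backbone would do, and passing the feature vector directly to the classifier simplifies the "second intermediate image frame" described in the text.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]          # (x, y, width, height)
Frame = List[List[float]]                # toy grayscale image


def run_model(
    frame: Frame,
    detect: Callable[[Frame], List[Box]],
    embed: Callable[[Frame, Box], List[float]],
    classify: Callable[[List[float]], int],
) -> List[Dict[str, object]]:
    """Chain the three sub-models: detection, then ReID features, then region category."""
    results = []
    for box in detect(frame):                 # first sub-model: position information
        feature = embed(frame, box)           # second sub-model: feature vector
        region = classify(feature)            # third sub-model: region category
        results.append({"box": box, "feature": feature, "region": region})
    return results


# Toy stand-ins so the sketch runs end to end.
def toy_detect(frame: Frame) -> List[Box]:
    return [(0, 0, 2, 2)]                     # pretend one vehicle was found


def toy_embed(frame: Frame, box: Box) -> List[float]:
    x, y, w, h = box
    patch = [frame[r][x:x + w] for r in range(y, y + h)]
    return [sum(row) / w for row in patch]    # mean intensity per row of the crop


def toy_classify(feature: List[float]) -> int:
    return max(range(len(feature)), key=lambda i: feature[i])  # index of largest value


frame = [[0.2, 0.4, 0.0],
         [0.8, 0.6, 0.0],
         [0.0, 0.0, 0.0]]
outputs = run_model(frame, toy_detect, toy_embed, toy_classify)
print(outputs)
```

The point of the sketch is the ordering: features are computed only for regions the detector has located, and the region category is derived after the features are available.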

[0059] Example 3:

[0060] To better train the model and enable the trained model to better determine the feature vector and region category corresponding to each vehicle, based on the above embodiments, the training process of the model in this embodiment includes:

[0061] Obtain each sample image frame stored in the training sample set, wherein each sample image frame carries the initial position information and initial region category corresponding to each vehicle;

[0062] For each sample image frame, the sample image frame is input into the first sub-model of the model to determine a first intermediate image frame carrying the predicted location information of each vehicle; the first intermediate image frame is input into the second sub-model of the model, the second sub-model determines the predicted feature vector of each vehicle, assigns a prediction number to each vehicle based on the predicted feature vector, and determines a second intermediate image frame carrying the predicted feature vector of each vehicle; the second intermediate image frame is input into the third sub-model of the model to determine the predicted region category corresponding to the region of each vehicle.

[0063] Based on the initial position information and predicted position information of each vehicle in each sample image frame, determine the first loss value corresponding to the first sub-model;

[0064] The second loss value corresponding to the second sub-model is determined based on whether there are at least two vehicles with the same predicted number in each sample image frame.

[0065] The third loss value corresponding to the third sub-model is determined based on the initial region category and the predicted region category corresponding to each vehicle in each sample image frame;

[0066] The parameters of the model are adjusted based on the first loss value, the second loss value, and the third loss value.

[0067] In this embodiment, the training process of the model includes: acquiring each sample image frame in the training sample set, wherein the sample image frame carries the initial position information of each vehicle and the initial region category of each vehicle, and the initial position information includes the initial vehicle bounding box and the initial information corresponding to the initial vehicle bounding box. For each sample image frame, the sample image frame is input into the model, and the first sub-model in the model predicts the position information of the vehicles in the sample image frame and outputs a predicted first intermediate image frame carrying the predicted position of the vehicles; the second sub-model in the model determines the predicted feature vector corresponding to each vehicle in the predicted first intermediate image frame based on the predicted first intermediate image frame, compares each predicted feature vector, identifies vehicles with similar predicted feature vectors as the same vehicle, assigns a prediction number to each vehicle, wherein the prediction number of the same vehicle is the same, and the second sub-model outputs a predicted second intermediate image frame carrying the predicted feature vector of the vehicle; the third sub-model determines the predicted region category of each vehicle based on the predicted second intermediate image frame.

[0068] The electronic device determines a first loss value for the first sub-model based on the initial and predicted position information of each vehicle; it determines a second loss value for the second sub-model based on the fact that no two vehicles with the same number can exist in an image frame and the predicted number of each vehicle predicted by the second sub-model; and it determines a third loss value for the third sub-model based on the initial and predicted region categories of each vehicle. The electronic device then determines the sum of these three loss values and sets this sum as the total loss value of the model, adjusting the model's parameters accordingly.

[0069] In this embodiment, when determining the first loss value corresponding to the first sub-model, the electronic device determines the first sub-loss value based on whether each object in the sample image frame is a vehicle; and determines the second sub-loss value based on each initial position information and predicted position information. The electronic device determines the sum of the first sub-loss value and the second sub-loss value, and uses this sum as the first loss value corresponding to the first sub-model.

[0070] The first sub-loss value can be calculated using the following formula:

[0071]

[0072] Here, $L_{heatmap}$ denotes the first sub-loss value. A pre-saved value indicates whether each object is a vehicle: it is 1 if the object is a vehicle and takes another value otherwise. $M$ indicates whether each object predicted by the first sub-model is a vehicle: it is 1 if the object is a vehicle and takes another value otherwise. $N$ denotes the number of input sample image frames, and $\alpha$ and $\beta$ are preset values.

[0073] The second sub-loss value can be calculated using the following formula:

[0074]

[0075] Here, $L_{box}$ denotes the second sub-loss value. For the i-th vehicle, $o$ denotes the predicted offset of the predicted vehicle bounding box, and its counterpart denotes the initial offset of the initial vehicle bounding box, which is defined in terms of the center coordinates of the i-th initial vehicle bounding box. Likewise, $s$ denotes the length of the i-th predicted vehicle bounding box, and its counterpart denotes the length of the i-th initial vehicle bounding box, which is defined in terms of the coordinates of the top-left and top-right corners of that bounding box. $N$ denotes the number of input sample image frames.

[0076] The second loss value can be calculated using the following formula:

[0077]

[0078] Here, $L_{identity}$ denotes the second loss value. $p(k)$ represents the probability that the predicted number of the k-th vehicle is the same as the predicted number of another vehicle: $p(k)$ is 1 if it is the same and 0 if it is not. $L(k)$ is the first preset value and is the same for each vehicle.
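The indicator p(k) is straightforward to compute from the predicted numbers in one frame, as the sketch below shows. How p(k) and L(k) are combined in the formula is not recoverable from the text, so the sum of p(k) times L(k) in `identity_penalty` is only one plausible reading and is labeled as an assumption. Both function names and the sample numbers are hypothetical.

```python
from collections import Counter
from typing import List


def duplicate_indicators(predicted_numbers: List[int]) -> List[int]:
    """p(k) = 1 if vehicle k shares its predicted number with another vehicle in the frame."""
    counts = Counter(predicted_numbers)
    return [1 if counts[n] > 1 else 0 for n in predicted_numbers]


def identity_penalty(predicted_numbers: List[int], preset: float = 1.0) -> float:
    # Assumed combination: sum over vehicles of p(k) * L(k), with L(k) = preset for all k.
    return preset * sum(duplicate_indicators(predicted_numbers))


frame_numbers = [3, 5, 3, 8]   # the first and third vehicles were given the same number
print(duplicate_indicators(frame_numbers))
print(identity_penalty(frame_numbers))
```

The penalty is zero exactly when every vehicle in the frame has a distinct predicted number, which is the constraint the second loss value is meant to enforce.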

[0079] The third loss value can be calculated using the following formula:

[0080]

[0081] Here, $L_{car\_shape\_classes}$ denotes the third loss value, $y_i$ is the pre-defined initial region category of the i-th vehicle, its predicted counterpart is the predicted region category of the i-th vehicle, and $N$ is the total number of samples. $L(y_i)$ is the second preset value and is the same for each vehicle. In this embodiment, the region category can be represented by numbers: for example, the front of the vehicle corresponds to 0, the body of the vehicle corresponds to 1, the rear of the vehicle corresponds to 2, and so on.
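A common choice for a classification loss over numeric categories like these is cross-entropy. The formula itself is not reproduced in the text, so using cross-entropy here is an assumption, as are the function names and the sample logits. The labels follow the numbering given above (0 for front, 1 for body, 2 for rear).

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def region_cross_entropy(labels: List[int], logits: List[List[float]]) -> float:
    """Mean negative log-probability assigned to each labelled region category."""
    losses = [-math.log(softmax(row)[y]) for y, row in zip(labels, logits)]
    return sum(losses) / len(losses)


labels = [0, 2]                                   # front, rear
logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 3.0]]       # made-up classifier scores
loss = region_cross_entropy(labels, logits)
print(loss)
```

The loss shrinks as the score for the labelled category grows relative to the others, and it equals log(3) when all three categories are scored equally.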

[0082] The total loss value can be calculated using the following formula:

[0083] $L_{total} = L_{heatmap} + L_{box} + L_{identity} + L_{car\_shape\_classes}$

[0084] Here, $L_{total}$ denotes the total loss value, $L_{heatmap}$ the first sub-loss value, $L_{box}$ the second sub-loss value, $L_{identity}$ the second loss value, and $L_{car\_shape\_classes}$ the third loss value.
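As a small worked example of the unweighted sum, the sketch below adds four component values. The numbers are invented for illustration and `total_loss` is a hypothetical helper name. The dictionary keys mirror the subscripts of the loss terms.

```python
from typing import Dict

COMPONENTS = ("heatmap", "box", "identity", "car_shape_classes")


def total_loss(parts: Dict[str, float]) -> float:
    """Unweighted sum of the four loss terms; raises KeyError if one is missing."""
    return sum(parts[name] for name in COMPONENTS)


example = {"heatmap": 0.5, "box": 0.75, "identity": 0.25, "car_shape_classes": 0.5}
total = total_loss(example)
print(total)
```

Because the terms are summed without weights, a component on a much larger numeric scale would dominate parameter updates. The text does not describe any rescaling.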

[0085] Example 4:

[0086] To achieve cross-camera vehicle tracking, based on the above embodiments, in this embodiment, before determining the ID corresponding to the target vehicle as the ID of the vehicle in the current image frame, the method includes:

[0087] Determine whether the number of times the target vehicle appears in the image frames already acquired by the image acquisition device exceeds a preset first number threshold.

[0088] If not, then determine the first image acquisition device preceding the image acquisition device according to the pre-saved order of each image acquisition device;

[0089] Obtain the vehicle information of each first candidate vehicle that leaves the monitoring range of the first image acquisition device, wherein the vehicle information includes the candidate ID and candidate feature vector of the first candidate vehicle;

[0090] Based on the candidate feature vector carried in the vehicle information of each first candidate vehicle and the feature vector of the target vehicle, determine the target first candidate vehicle that matches the target vehicle in each first candidate vehicle;

[0091] Based on the candidate feature vector of the target first candidate vehicle and the target feature vector of the vehicle, determine whether the target first candidate vehicle matches the vehicle; if they match, update the second count, which records the number of times the target vehicle and the target first candidate vehicle have matched;

[0092] If the updated second count exceeds the preset second count threshold, determine the target candidate ID of the target first candidate vehicle as the ID of the target vehicle.
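The steps above can be sketched as a small stateful resolver. This is a simplified reading under stated assumptions: cosine similarity with a hypothetical 0.8 cutoff decides whether two vehicles match, the two comparisons in the steps (candidate against target vehicle, then candidate against the current vehicle) are collapsed into one, and the threshold values are arbitrary. `CrossCameraResolver`, `previous_camera`, and the camera names are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def previous_camera(order: List[str], camera: str) -> Optional[str]:
    """Camera that precedes `camera` in the pre-saved order, if any."""
    index = order.index(camera)
    return order[index - 1] if index > 0 else None


class CrossCameraResolver:
    def __init__(self, first_threshold: int = 3, second_threshold: int = 2,
                 min_similarity: float = 0.8):
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold
        self.min_similarity = min_similarity
        self.match_counts: Dict[Tuple[int, int], int] = {}

    def resolve(self, target_id: int, appearances: int, feature: List[float],
                departed: Dict[int, List[float]]) -> Optional[int]:
        """Return the inherited candidate ID once enough matches accumulate, else None."""
        if appearances > self.first_threshold or not departed:
            return None  # established in this camera, or nothing to inherit from
        best_id = max(departed, key=lambda cid: cosine(feature, departed[cid]))
        if cosine(feature, departed[best_id]) < self.min_similarity:
            return None
        key = (target_id, best_id)
        self.match_counts[key] = self.match_counts.get(key, 0) + 1
        if self.match_counts[key] > self.second_threshold:
            return best_id
        return None


order = ["cam_1", "cam_2"]
departed_by_camera = {"cam_1": {41: [1.0, 0.0], 42: [0.0, 1.0]}}
resolver = CrossCameraResolver()
upstream = previous_camera(order, "cam_2")
results = [resolver.resolve(target_id=5, appearances=n, feature=[0.9, 0.1],
                            departed=departed_by_camera[upstream])
           for n in (1, 2, 3)]
print(results)
```

Requiring several consecutive matches before an ID is inherited guards against a single partial view of the vehicle producing a wrong cross-camera identity.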

[0093] In this embodiment of the application, there may be multiple image acquisition devices on a road, and each image acquisition device has a different monitoring range. When a vehicle leaves the monitoring range of one image acquisition device, the vehicle will enter the monitoring range of another image acquisition device. The electronic device needs to ensure that the vehicle ID remains unchanged regardless of which image acquisition device captures the vehicle.

[0094] When a vehicle enters the monitoring range of an image acquisition device, the portion of the vehicle captured in the image frames may change gradually. Figure 6 is a schematic diagram, provided in the embodiments of this application, illustrating how a vehicle appears in the image frames acquired by an image acquisition device. As shown in Figure 6, image frame a is the first image frame in which the vehicle appears, at which point only 1/4 of the vehicle's area appears in image frame a; image frame b is the second image frame in which the vehicle appears, at which point 1/2 of the vehicle's area appears in image frame b; image frame c is the third image frame in which the vehicle appears, at which point 3/4 of the vehicle's area appears in image frame c; and image frame d is the fourth image frame in which the vehicle appears, at which point the entire vehicle appears in image frame d.

[0095] In this embodiment, when determining the ID of a target vehicle, the electronic device checks whether the first number of times the target vehicle appears in the image frames already captured by the image acquisition device exceeds the preset first number threshold. This threshold indicates whether the image acquisition device has captured the vehicle completely. If the first number does not exceed the threshold, the image acquisition device has not yet captured the vehicle completely: the vehicle has only just entered the monitoring range, and the detected vehicle feature vector is insufficient. In that case, the electronic device matches the target vehicle multiple times against the vehicles captured by the first image acquisition device preceding the current one in order to determine the target vehicle's ID. The image frames covered by the first number threshold can also be referred to as the feature matching buffer.

[0096] Specifically, based on the installation location of the image acquisition devices, the electronic device pre-stores the sequence corresponding to each image acquisition device. The electronic device determines the first image acquisition device preceding the current one according to this sequence and obtains the vehicle information of each first candidate vehicle that leaves the monitoring range of that first image acquisition device. This vehicle information includes the candidate ID and candidate feature vector of the first candidate vehicle. It should be noted that the electronic device is pre-configured with a storage area, which is divided into multiple sub-storage areas. Each sub-storage area corresponds to a specific image acquisition device and stores the vehicle information of each first candidate vehicle that leaves the monitoring range of that image acquisition device. This storage area can be named the cross-camera trajectory container.
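The storage layout described above can be sketched as follows. This is a minimal illustration only: the names `VehicleInfo` and `CrossCameraContainer`, the string camera identifiers, and the reading of "preceding" as "immediately preceding in the pre-saved order" are assumptions and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VehicleInfo:
    """Information saved when a vehicle leaves a camera's monitoring range."""
    candidate_id: int
    candidate_feature: List[float]


@dataclass
class CrossCameraContainer:
    """Hypothetical cross-camera trajectory container: one sub-storage area per camera."""
    camera_order: List[str]  # pre-saved order of the image acquisition devices
    areas: Dict[str, Dict[int, VehicleInfo]] = field(default_factory=dict)

    def preceding_camera(self, camera: str) -> Optional[str]:
        """Return the camera immediately before `camera`, or None for the first one."""
        index = self.camera_order.index(camera)
        return self.camera_order[index - 1] if index > 0 else None

    def save_departed(self, camera: str, info: VehicleInfo) -> None:
        """Store a vehicle that has left the monitoring range of `camera`."""
        self.areas.setdefault(camera, {})[info.candidate_id] = info

    def candidates_before(self, camera: str) -> List[VehicleInfo]:
        """First candidate vehicles: those that left the preceding camera."""
        previous = self.preceding_camera(camera)
        if previous is None:
            return []
        return list(self.areas.get(previous, {}).values())

    def delete(self, camera: str, candidate_id: int) -> None:
        """Remove a saved record, for example after a successful match."""
        self.areas.get(camera, {}).pop(candidate_id, None)
```

Keying each sub-storage area by candidate ID keeps deletion of a matched record cheap, which fits the stated goal of limiting storage pressure.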

[0097] The electronic device determines the target first candidate vehicle that matches the target vehicle based on the candidate feature vector carried in the vehicle information of each first candidate vehicle and the feature vector of the target vehicle. It then determines whether the target first candidate vehicle matches the target vehicle based on the candidate feature vector of the target first candidate vehicle and the target feature vector of the target vehicle. If they match, the second number of matches between the target vehicle and the target first candidate vehicle is updated. If the updated second number exceeds a preset second number threshold, the target candidate ID of the target first candidate vehicle is determined as the ID of the target vehicle.
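The cross-camera decision in this embodiment can be illustrated with a short sketch. The application does not specify the similarity measure or any threshold values, so the cosine similarity, the default thresholds, and the names `update_handoff` and `match_counts` are assumptions made purely for illustration.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def update_handoff(
    appearance_count: int,                    # first number: appearances in this camera
    target_feature: Sequence[float],
    candidates: Dict[int, Sequence[float]],   # candidate ID -> candidate feature vector
    match_counts: Dict[int, int],             # candidate ID -> second number so far
    first_threshold: int = 4,                 # assumed value
    second_threshold: int = 2,                # assumed value
    similarity_threshold: float = 0.8,        # assumed value
) -> Optional[int]:
    """Return the candidate ID to adopt as the target vehicle's ID, or None."""
    if appearance_count > first_threshold or not candidates:
        return None  # outside the feature matching buffer, or nothing to compare
    # Pick the target first candidate vehicle: the most similar candidate.
    best_id = max(
        candidates,
        key=lambda cid: cosine_similarity(target_feature, candidates[cid]),
    )
    if cosine_similarity(target_feature, candidates[best_id]) < similarity_threshold:
        return None  # best candidate does not match in this frame
    match_counts[best_id] = match_counts.get(best_id, 0) + 1
    if match_counts[best_id] > second_threshold:
        return best_id
    return None
```

Calling the function once per image frame while the vehicle is still inside the feature matching buffer accumulates the second number; the ID is adopted only after repeated agreement, which guards against a single spurious match on a partially visible vehicle.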

[0098] To alleviate the storage pressure on electronic devices, based on the above embodiments, the method in this application embodiment further includes:

[0099] Delete the saved vehicle information of the target first candidate vehicle.

[0100] In this embodiment of the application, in order to reduce the storage pressure on the electronic device, after it is determined that the target vehicle and the target first candidate vehicle are successfully matched, the electronic device deletes the vehicle information of the target first candidate vehicle that has been saved.

[0101] Example 5:

[0102] To improve the accuracy of vehicle matching, based on the above embodiments, in this embodiment, determining the target vehicle matching the vehicle in the previous image frame according to the target feature vector and target region category corresponding to the vehicle includes:

[0103] Determine the target sub-feature vector corresponding to the target region category in the target feature vector, and the sub-feature vector corresponding to the target region category in the feature vector of each second candidate vehicle in the previous image frame;

[0104] The similarity between the target sub-feature vector and each sub-feature vector is determined, and the second candidate vehicle corresponding to the highest similarity exceeding the similarity threshold is determined as the target vehicle.

[0105] In this embodiment, to improve the accuracy of vehicle matching, when determining the target vehicle to be matched, the electronic device determines the target sub-feature vector corresponding to the target region category in the target feature vector of the vehicle, and the sub-feature vector corresponding to the target region category in the feature vector of each second candidate vehicle in the previous image frame. The electronic device determines the similarity between the target sub-feature vector and each of these sub-feature vectors, and determines the second candidate vehicle corresponding to the highest similarity exceeding the similarity threshold as the target vehicle.

[0106] In this embodiment, the feature vector corresponding to a vehicle contains multiple sub-feature vectors, each of which corresponds to a region category. When the electronic device determines the target vehicle matching the vehicle in the previous image frame based on the target feature vector and the target region category, it only needs to determine the similarity between the target sub-feature vector and the sub-feature vector corresponding to the target region category in the feature vector of each second candidate vehicle. This improves the accuracy of vehicle matching and tracking, while also reducing the computational burden on the electronic device.
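A minimal sketch of this region-restricted matching follows. It assumes that a feature vector is stored as a mapping from region category to sub-feature vector and that similarity is measured by cosine similarity; the region names `"front"` and `"rear"`, the default threshold, and the function names are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence

# Hypothetical representation: region category -> sub-feature vector.
FeatureVector = Dict[str, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_in_previous_frame(
    target_feature: FeatureVector,
    target_region: str,
    previous_frame: Dict[int, FeatureVector],  # vehicle ID -> feature vector
    similarity_threshold: float = 0.7,         # assumed value
) -> Optional[int]:
    """Return the ID of the best-matching second candidate vehicle, or None."""
    target_sub = target_feature[target_region]
    best_id, best_score = None, similarity_threshold
    for vehicle_id, feature in previous_frame.items():
        # Compare only the sub-feature vectors of the target region category.
        score = cosine_similarity(target_sub, feature[target_region])
        if score > best_score:
            best_id, best_score = vehicle_id, score
    return best_id
```

Because only one sub-feature vector per candidate is compared, the cost per candidate scales with the length of a sub-feature vector rather than the full feature vector.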

[0107] Example 6:

[0108] To further track vehicles, based on the above embodiments, in this embodiment of the application, the method further includes:

[0109] Based on the ID of each vehicle in the current image frame and the ID of the vehicle in each image frame already acquired by the image acquisition device, determine whether there is a third candidate vehicle that does not appear in the current image frame.

[0110] If it exists, the number of first frames in which the third candidate vehicle appears and the number of second frames in which it disappears in the image frames already acquired by the image acquisition device are counted. If the number of first frames exceeds a preset threshold for the number of appearing frames and the number of second frames exceeds a preset threshold for the number of disappearing frames, then the third candidate vehicle is determined to be a vehicle that has left the monitoring range of the image acquisition device.

[0111] Obtain vehicle information comprising the feature vector of the third candidate vehicle in the last image frame in which it appears and the ID of the third candidate vehicle, and save the vehicle information.

[0112] In this embodiment, there may be a situation where a vehicle appeared in the previous image frame but not in the current frame, meaning a vehicle has left the monitoring range of the image acquisition device. To enable vehicle tracking across image acquisition devices, in this embodiment, the electronic device stores vehicle information for vehicles that have left the monitoring range of the image acquisition device. Furthermore, to eliminate false detections, this embodiment pre-stores a filtering mechanism in the electronic device. Only when the number of frames in which a vehicle appears reaches a threshold for appearance, and the number of frames in which a vehicle disappears reaches a threshold for disappearance, will the electronic device consider the vehicle to have left the monitoring range of the image acquisition device.

[0113] Specifically, in this embodiment, the electronic device determines whether there is a third candidate vehicle that does not appear in the current image frame based on the ID of each vehicle in the current image frame and the ID of each vehicle in each image frame already acquired by the image acquisition device. If there is, the electronic device counts, over the image frames already acquired by the image acquisition device, the first frame count in which the third candidate vehicle appears and the second frame count in which it is absent. If the first frame count exceeds the preset appearance frame threshold and the second frame count exceeds the preset disappearance frame threshold, the electronic device determines that the third candidate vehicle has left the monitoring range of the image acquisition device. The electronic device then obtains vehicle information containing the feature vector of the third candidate vehicle in the last image frame in which it appears and the ID of the third candidate vehicle, and saves the vehicle information.
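The appearance and disappearance filter can be sketched as below. The threshold values, the `Track` structure, and the choice to count absent frames cumulatively (rather than consecutively) are assumptions; the application only states that both counts must exceed preset thresholds.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Track:
    appeared_frames: int = 0     # first frame count: frames in which the vehicle appears
    disappeared_frames: int = 0  # second frame count: frames in which it is absent


def update_tracks(
    tracks: Dict[int, Track],
    current_ids: Set[int],
    appear_threshold: int = 3,     # assumed value
    disappear_threshold: int = 2,  # assumed value
) -> List[int]:
    """Update the counters for one image frame and return the IDs judged to have left."""
    for vehicle_id in current_ids:
        tracks.setdefault(vehicle_id, Track()).appeared_frames += 1
    departed = []
    for vehicle_id, track in tracks.items():
        if vehicle_id in current_ids:
            continue
        track.disappeared_frames += 1
        # Both thresholds must be exceeded, which filters out false detections
        # that appear in only a few frames.
        if (track.appeared_frames > appear_threshold
                and track.disappeared_frames > disappear_threshold):
            departed.append(vehicle_id)
    for vehicle_id in departed:
        del tracks[vehicle_id]  # the caller would save the vehicle information here
    return departed
```

A detection seen in only one or two frames never satisfies the appearance threshold, so it is never written to storage as a departed vehicle.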

[0114] Example 7:

[0115] Figure 7 is a schematic diagram of a vehicle tracking device provided by this application. The device includes:

[0116] The acquisition module 701 is used to acquire the current image frame acquired by the image acquisition device;

[0117] The feature extraction module 702 is used to input the current image frame into the trained model, and obtain the feature vector of each vehicle contained in the current image frame output by the model, as well as the region category corresponding to the region where each vehicle is collected, wherein the feature vector includes a sub-feature vector corresponding to each region category.

[0118] The matching module 703 is used to determine, for each vehicle contained in the current image frame, a target vehicle that matches the vehicle in the previous image frame based on the target feature vector and target region category corresponding to the vehicle; and to determine the ID corresponding to the target vehicle as the ID of the vehicle in the current image frame.

[0119] In one possible implementation, the feature extraction module 702 is specifically used to input the current image frame into the first sub-model of the model to determine a first intermediate image frame carrying the location information of each vehicle; input the first intermediate image frame into the second sub-model of the model to determine a second intermediate image frame carrying the feature vector of each vehicle; input the second intermediate image frame into the third sub-model of the model to determine the region category corresponding to the region of each vehicle, and output the feature vector and region category corresponding to each vehicle.
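The chaining of the three sub-models can be sketched as a simple composition. Everything below is hypothetical: the intermediate image frames are modeled as dictionaries, and `locate`, `embed`, and `classify` are stand-in functions with fixed outputs, not trained networks.

```python
from typing import Any, Callable, Dict, List

Intermediate = Dict[str, Any]  # hypothetical: the frame plus per-vehicle annotations


def run_model(
    frame: Any,
    first_sub_model: Callable[[Any], Intermediate],
    second_sub_model: Callable[[Intermediate], Intermediate],
    third_sub_model: Callable[[Intermediate], List[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    """Chain the three sub-models and return one record per vehicle."""
    first_intermediate = first_sub_model(frame)                  # carries vehicle positions
    second_intermediate = second_sub_model(first_intermediate)   # carries feature vectors
    return third_sub_model(second_intermediate)                  # adds region categories


# Stand-in sub-models with fixed outputs, for illustration only.
def locate(frame: Any) -> Intermediate:
    return {"frame": frame, "boxes": [(0, 0, 10, 10)]}


def embed(first_intermediate: Intermediate) -> Intermediate:
    features = [
        {"front": [1.0, 0.0], "rear": [0.0, 1.0]}
        for _ in first_intermediate["boxes"]
    ]
    return {**first_intermediate, "features": features}


def classify(second_intermediate: Intermediate) -> List[Dict[str, Any]]:
    return [
        {"feature": feature, "region": "front"}
        for feature in second_intermediate["features"]
    ]
```

For example, `run_model("frame-0", locate, embed, classify)` yields one record holding the feature vector and region category of the single stand-in vehicle.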

[0120] In one possible implementation, the device further includes:

[0121] The training module 704 is used to: acquire each sample image frame stored in the training sample set, wherein each sample image frame carries the initial position information and initial region category corresponding to each vehicle; for each sample image frame, input the sample image frame into the first sub-model of the model to determine a predicted first intermediate image frame carrying the predicted position information of each vehicle; input the predicted first intermediate image frame into the second sub-model of the model, wherein the second sub-model determines the predicted feature vector of each vehicle, assigns a prediction number to each vehicle according to its predicted feature vector, and determines a predicted second intermediate image frame carrying the predicted feature vector of each vehicle; input the predicted second intermediate image frame into the third sub-model of the model to determine the predicted region category corresponding to the region of each vehicle; determine a first loss value corresponding to the first sub-model based on the initial position information and predicted position information corresponding to each vehicle in each sample image frame; determine a second loss value corresponding to the second sub-model based on whether at least two vehicles in each sample image frame have the same prediction number; determine a third loss value corresponding to the third sub-model based on the initial region category and predicted region category corresponding to each vehicle in each sample image frame; and adjust the parameters of the model based on the first loss value, the second loss value, and the third loss value.
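The application does not give formulas for the loss values or say how they are combined. Purely as an illustration, the sketch below assumes that the second loss penalizes the fraction of vehicles in a frame that share a prediction number with another vehicle, and that the three loss values are combined as a weighted sum; both choices, and the names `duplicate_number_penalty` and `total_loss`, are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def duplicate_number_penalty(prediction_numbers: Sequence[int]) -> float:
    """Fraction of vehicles in one frame whose prediction number is shared."""
    if not prediction_numbers:
        return 0.0
    counts = Counter(prediction_numbers)
    duplicated = sum(count for count in counts.values() if count >= 2)
    return duplicated / len(prediction_numbers)


def total_loss(
    first_loss: float,
    second_loss: float,
    third_loss: float,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),  # assumed equal weighting
) -> float:
    """Combine the three loss values as a weighted sum (an assumed scheme)."""
    w1, w2, w3 = weights
    return w1 * first_loss + w2 * second_loss + w3 * third_loss
```

A frame in which every vehicle receives a distinct prediction number contributes a penalty of zero, reflecting the idea that two different vehicles in the same frame should not be assigned the same number.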

[0122] In one possible implementation, the device further includes:

[0123] The processing module 705 is configured to: determine whether the first number of times the target vehicle appears in the image frames already acquired by the image acquisition device exceeds a preset first number threshold; if not, determine the first image acquisition device preceding the image acquisition device according to the pre-saved order of the image acquisition devices; obtain the vehicle information of each first candidate vehicle that has left the monitoring range of the first image acquisition device, wherein the vehicle information includes the candidate ID and candidate feature vector of the first candidate vehicle; determine, among the first candidate vehicles, the target first candidate vehicle that matches the target vehicle according to the candidate feature vector carried in the vehicle information of each first candidate vehicle and the feature vector of the target vehicle; determine whether the target first candidate vehicle matches the target vehicle according to the candidate feature vector of the target first candidate vehicle and the target feature vector of the target vehicle; if they match, update the second number of times the target vehicle and the target first candidate vehicle have matched; and if the updated second number exceeds the preset second number threshold, determine the target candidate ID of the target first candidate vehicle as the ID of the target vehicle.

[0124] In one possible implementation, the processing module 705 is further configured to delete the saved vehicle information of the target first candidate vehicle.

[0125] In one possible implementation, the matching module 703 is specifically used to determine the target sub-feature vector corresponding to the target region category in the target feature vector, and the sub-feature vector corresponding to the target region category in the feature vector corresponding to each second candidate vehicle in the previous image frame; determine the similarity between the target sub-feature vector and each sub-feature vector, and determine the second candidate vehicle corresponding to the highest similarity exceeding the similarity threshold as the target vehicle.

[0126] In one possible implementation, the processing module 705 is further configured to determine whether there is a third candidate vehicle that does not appear in the current image frame based on the ID of each vehicle in the current image frame and the ID of the vehicle in each image frame already acquired by the image acquisition device; if there is, the module counts the number of first frames in which the third candidate vehicle appears and the number of second frames in which it disappears in the image frames already acquired by the image acquisition device; if the number of first frames exceeds a preset threshold for the number of appearance frames and the number of second frames exceeds a preset threshold for the number of disappearance frames, the module determines that the third candidate vehicle is a vehicle that has left the monitoring range of the image acquisition device; the module obtains the feature vector of the third candidate vehicle in the last image frame in which it appears and the vehicle information with the ID of the third candidate vehicle, and saves the vehicle information.

[0127] Example 8:

[0128] Based on the above embodiments, this application also provides an electronic device. Figure 8 is a schematic structural diagram of an electronic device provided by this application. As shown in Figure 8, the electronic device includes a processor 801, a communication interface 802, a memory 803, and a communication bus 804, wherein the processor 801, the communication interface 802, and the memory 803 communicate with each other through the communication bus 804.

[0129] The memory 803 stores a computer program. When the program is executed by the processor 801, the processor 801 performs the following steps:

[0130] Acquire the current image frame captured by the image acquisition device;

[0131] The current image frame is input into the trained model to obtain the feature vector of each vehicle in the current image frame output by the model, and the region category corresponding to the area where each vehicle is collected. The feature vector includes a sub-feature vector corresponding to each region category.

[0132] For each vehicle contained in the current image frame, a target vehicle matching the vehicle in the previous image frame is determined based on the target feature vector and target region category corresponding to the vehicle; the ID corresponding to the target vehicle is determined as the ID of the vehicle in the current image frame.

[0133] In one possible implementation, the processor is further configured to:

[0134] The current image frame is input into the first sub-model of the model to determine the first intermediate image frame carrying the position information of each vehicle;

[0135] The first intermediate image frame is input into the second sub-model of the model to determine the second intermediate image frame carrying the feature vector of each vehicle;

[0136] The second intermediate image frame is input into the third sub-model of the model to determine the region category corresponding to each vehicle's region, and output the feature vector and region category corresponding to each vehicle.

[0137] In one possible implementation, the processor is further configured to:

[0138] Obtain each sample image frame stored in the training sample set, wherein each sample image frame carries the initial position information and initial region category corresponding to each vehicle;

[0139] For each sample image frame, the sample image frame is input into the first sub-model of the model to determine a first intermediate image frame carrying the predicted location information of each vehicle; the first intermediate image frame is input into the second sub-model of the model, the second sub-model determines the predicted feature vector of each vehicle, assigns a prediction number to each vehicle based on the predicted feature vector, and determines a second intermediate image frame carrying the predicted feature vector of each vehicle; the second intermediate image frame is input into the third sub-model of the model to determine the predicted region category corresponding to the region of each vehicle.

[0140] Based on the initial position information and predicted position information of each vehicle in each sample image frame, determine the first loss value corresponding to the first sub-model;

[0141] The second loss value corresponding to the second sub-model is determined based on whether there are at least two vehicles with the same predicted number in each sample image frame.

[0142] The third loss value corresponding to the third sub-model is determined based on the initial region category and the predicted region category corresponding to each vehicle in each sample image frame;

[0143] The parameters of the model are adjusted based on the first loss value, the second loss value, and the third loss value.

[0144] In one possible implementation, the processor is further configured to:

[0145] Determine whether the number of times the target vehicle appears in the image frames already acquired by the image acquisition device exceeds a preset first number threshold.

[0146] If not, then determine the first image acquisition device preceding the image acquisition device according to the pre-saved order of each image acquisition device;

[0147] Obtain the vehicle information of each first candidate vehicle that leaves the monitoring range of the first image acquisition device, wherein the vehicle information includes the candidate ID and candidate feature vector of the first candidate vehicle;

[0148] Based on the candidate feature vector carried in the vehicle information of each first candidate vehicle and the feature vector of the target vehicle, determine the target first candidate vehicle that matches the target vehicle in each first candidate vehicle;

[0149] Based on the candidate feature vector of the target first candidate vehicle and the target feature vector of the target vehicle, determine whether the target first candidate vehicle matches the target vehicle; if they match, update the second number of times the target vehicle and the target first candidate vehicle have matched;

[0150] If the updated second number exceeds the preset second number threshold, determine the target candidate ID of the target first candidate vehicle as the ID of the target vehicle.

[0151] In one possible implementation, the processor is further configured to:

[0152] Delete the saved vehicle information of the target first candidate vehicle.

[0153] In one possible implementation, the processor is further configured to:

[0154] Determine the target sub-feature vector corresponding to the target region category in the target feature vector, and the sub-feature vector corresponding to the target region category in the feature vector of each second candidate vehicle in the previous image frame;

[0155] The similarity between the target sub-feature vector and each sub-feature vector is determined, and the second candidate vehicle corresponding to the highest similarity exceeding the similarity threshold is determined as the target vehicle.

[0156] In one possible implementation, the processor is further configured to:

[0157] Based on the ID of each vehicle in the current image frame and the ID of the vehicle in each image frame already acquired by the image acquisition device, determine whether there is a third candidate vehicle that does not appear in the current image frame.

[0158] If it exists, the number of first frames in which the third candidate vehicle appears and the number of second frames in which it disappears in the image frames already acquired by the image acquisition device are counted. If the number of first frames exceeds a preset threshold for the number of appearing frames and the number of second frames exceeds a preset threshold for the number of disappearing frames, then the third candidate vehicle is determined to be a vehicle that has left the monitoring range of the image acquisition device.

[0159] Obtain vehicle information comprising the feature vector of the third candidate vehicle in the last image frame in which it appears and the ID of the third candidate vehicle, and save the vehicle information.

[0160] Since the principle of the above-mentioned electronic device in solving the problem is similar to that of the vehicle tracking method, the implementation of the above-mentioned electronic device can be found in the embodiments of the method, and repeated details will not be described again.

[0161] The communication bus mentioned in the aforementioned electronic device can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. This communication bus can be divided into address bus, data bus, control bus, etc. For ease of illustration, only one thick line is used in the figure, but this does not indicate that there is only one bus or one type of bus. The communication interface 802 is used for communication between the aforementioned electronic device and other devices. The memory can include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk storage device. Optionally, the memory can also be at least one storage device located remotely from the aforementioned processor.

[0162] The processors mentioned above can be general-purpose processors, including central processing units, network processors (NPs), etc.; they can also be digital signal processors (DSPs), application-specific integrated circuits, field-programmable gate arrays or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.

[0163] Example 9:

[0164] Based on the above embodiments, this invention also provides a computer-readable storage medium storing a computer program executable by a processor. When the program runs on the processor, it causes the processor to perform the following steps:

[0165] Acquire the current image frame captured by the image acquisition device;

[0166] The current image frame is input into the trained model to obtain the feature vector of each vehicle in the current image frame output by the model, and the region category corresponding to the area where each vehicle is collected. The feature vector includes a sub-feature vector corresponding to each region category.

[0167] For each vehicle contained in the current image frame, a target vehicle matching the vehicle in the previous image frame is determined based on the target feature vector and target region category corresponding to the vehicle; the ID corresponding to the target vehicle is determined as the ID of the vehicle in the current image frame.

[0168] In one possible implementation, the step of inputting the current image frame into the trained model to obtain the feature vector of each vehicle contained in the current image frame output by the model, and the region category corresponding to the region where each vehicle is sampled, includes:

[0169] The current image frame is input into the first sub-model of the model to determine the first intermediate image frame carrying the position information of each vehicle;

[0170] The first intermediate image frame is input into the second sub-model of the model to determine the second intermediate image frame carrying the feature vector of each vehicle;

[0171] The second intermediate image frame is input into the third sub-model of the model to determine the region category corresponding to each vehicle's region, and output the feature vector and region category corresponding to each vehicle.

[0172] In one possible implementation, the training process of the model includes:

[0173] Obtain each sample image frame stored in the training sample set, wherein each sample image frame carries the initial position information and initial region category corresponding to each vehicle;

[0174] For each sample image frame, the sample image frame is input into the first sub-model of the model to determine a first intermediate image frame carrying the predicted location information of each vehicle; the first intermediate image frame is input into the second sub-model of the model, the second sub-model determines the predicted feature vector of each vehicle, assigns a prediction number to each vehicle based on the predicted feature vector, and determines a second intermediate image frame carrying the predicted feature vector of each vehicle; the second intermediate image frame is input into the third sub-model of the model to determine the predicted region category corresponding to the region of each vehicle.

[0175] Based on the initial position information and predicted position information of each vehicle in each sample image frame, determine the first loss value corresponding to the first sub-model;

[0176] The second loss value corresponding to the second sub-model is determined based on whether there are at least two vehicles with the same predicted number in each sample image frame.

[0177] The third loss value corresponding to the third sub-model is determined based on the initial region category and the predicted region category corresponding to each vehicle in each sample image frame;

[0178] The parameters of the model are adjusted based on the first loss value, the second loss value, and the third loss value.

[0179] In one possible implementation, before determining the ID corresponding to the target vehicle as the ID of the vehicle in the current image frame, the method includes:

[0180] Determine whether the number of times the target vehicle appears in the image frames already acquired by the image acquisition device exceeds a preset first number threshold.

[0181] If not, then determine the first image acquisition device preceding the image acquisition device according to the pre-saved order of each image acquisition device;

[0182] Obtain the vehicle information of each first candidate vehicle that leaves the monitoring range of the first image acquisition device, wherein the vehicle information includes the candidate ID and candidate feature vector of the first candidate vehicle;

[0183] Based on the candidate feature vector carried in the vehicle information of each first candidate vehicle and the feature vector of the target vehicle, determine the target first candidate vehicle that matches the target vehicle in each first candidate vehicle;

[0184] Based on the candidate feature vector of the target first candidate vehicle and the target feature vector of the target vehicle, determine whether the target first candidate vehicle matches the target vehicle; if they match, update the second number of times the target vehicle and the target first candidate vehicle have matched;

[0185] If the updated second number exceeds the preset second number threshold, determine the target candidate ID of the target first candidate vehicle as the ID of the target vehicle.

[0186] In one possible implementation, the method further includes:

[0187] Delete the saved vehicle information of the target first candidate vehicle.

[0188] In one possible implementation, determining the target vehicle matching the vehicle in the previous image frame based on the target feature vector corresponding to the vehicle and the target region category includes:

[0189] Determine the target sub-feature vector corresponding to the target region category in the target feature vector, and the sub-feature vector corresponding to the target region category in the feature vector of each second candidate vehicle in the previous image frame;

[0190] The similarity between the target sub-feature vector and each sub-feature vector is determined, and the second candidate vehicle corresponding to the highest similarity exceeding the similarity threshold is determined as the target vehicle.

[0191] In one possible implementation, the method further includes:

[0192] Based on the ID of each vehicle in the current image frame and the ID of the vehicle in each image frame already acquired by the image acquisition device, determine whether there is a third candidate vehicle that does not appear in the current image frame.

[0193] If it exists, the number of first frames in which the third candidate vehicle appears and the number of second frames in which it disappears in the image frames already acquired by the image acquisition device are counted. If the number of first frames exceeds a preset threshold for the number of appearing frames and the number of second frames exceeds a preset threshold for the number of disappearing frames, then the third candidate vehicle is determined to be a vehicle that has left the monitoring range of the image acquisition device.

[0194] Obtain the feature vector and vehicle information of the third candidate vehicle in the last image frame in which it appears, and save the vehicle information.
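The departure check in the steps above can be sketched as a small per-camera bookkeeping structure. The names `TrackRecord` and `DepartureDetector` are hypothetical, and the sketch assumes that the second frame count is the number of frames since the vehicle was last seen (it resets when the vehicle reappears), which is one possible reading of the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TrackRecord:
    appear_frames: int = 0   # first frame count: frames in which the vehicle appeared
    missing_frames: int = 0  # second frame count: frames absent since last seen (assumed)
    last_feature: List[float] = field(default_factory=list)


class DepartureDetector:
    """Hypothetical helper that flags vehicles that have left one camera's range."""

    def __init__(self, appear_threshold: int, missing_threshold: int):
        self.appear_threshold = appear_threshold
        self.missing_threshold = missing_threshold
        self.tracks: Dict[str, TrackRecord] = {}
        # vehicle ID -> feature vector from the last frame in which it appeared
        self.departed: Dict[str, List[float]] = {}

    def update(self, current: Dict[str, List[float]]) -> List[str]:
        """Process one frame (ID -> feature vector); return IDs newly judged departed."""
        for vehicle_id, feature in current.items():
            record = self.tracks.setdefault(vehicle_id, TrackRecord())
            record.appear_frames += 1
            record.missing_frames = 0
            record.last_feature = feature
        newly_departed: List[str] = []
        for vehicle_id in list(self.tracks):
            if vehicle_id in current:
                continue
            record = self.tracks[vehicle_id]
            record.missing_frames += 1
            if (record.appear_frames > self.appear_threshold
                    and record.missing_frames > self.missing_threshold):
                # Save the last-seen feature vector for later cross-camera matching.
                self.departed[vehicle_id] = record.last_feature
                del self.tracks[vehicle_id]
                newly_departed.append(vehicle_id)
        return newly_departed
```

Requiring both thresholds filters out short-lived false detections (too few appearances) and brief occlusions (too few missing frames). In this sketch a track that never reaches the appearance threshold is simply kept; a real system would likely expire it.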

[0195] Because the computer-readable storage medium solves the problem on the same principle as the vehicle tracking method, its implementation can refer to the method embodiments, and repeated details are not described again.

[0196] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0197] This application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to this application. It should be understood that each process and/or block of the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

[0198] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

[0199] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

[0200] Obviously, those skilled in the art can make various modifications and variations to this application without departing from the spirit and scope of this application. Therefore, if such modifications and variations fall within the scope of the claims of this application and their equivalents, this application also intends to include such modifications and variations.

Claims

1. A vehicle tracking method, characterized in that the method includes:
acquiring the current image frame captured by the image acquisition device;
inputting the current image frame into the trained model to obtain the feature vector of each vehicle contained in the current image frame output by the model, as well as the region category corresponding to the captured region of each vehicle, wherein the feature vector includes a sub-feature vector corresponding to each region category, and the region categories include the front of the vehicle, the body of the vehicle, and the rear of the vehicle;
for each vehicle contained in the current image frame, determining, based on the target feature vector and target region category corresponding to the vehicle, the target vehicle in the previous image frame that matches the vehicle, and determining the ID corresponding to the target vehicle as the ID of the vehicle in the current image frame;
wherein the step of inputting the current image frame into the trained model to obtain the feature vector of each vehicle contained in the current image frame output by the model, and the region category corresponding to the captured region of each vehicle, includes:
inputting the current image frame into the first sub-model of the model to determine the first intermediate image frame carrying the position information of each vehicle;
inputting the first intermediate image frame into the second sub-model of the model to determine the second intermediate image frame carrying the feature vector of each vehicle;
inputting the second intermediate image frame into the third sub-model of the model to determine the region category corresponding to the region of each vehicle, and outputting the feature vector and region category corresponding to each vehicle;
and wherein the step of determining the target vehicle in the previous image frame that matches the vehicle, based on the target feature vector and target region category corresponding to the vehicle, includes:
determining the target sub-feature vector corresponding to the target region category in the target feature vector, and the sub-feature vector corresponding to the target region category in the feature vector of each second candidate vehicle in the previous image frame;
determining the similarity between the target sub-feature vector and each sub-feature vector, and determining as the target vehicle the second candidate vehicle with the highest similarity, provided that this similarity exceeds the similarity threshold.

2. The method according to claim 1, characterized in that the training process of the model includes:
obtaining each sample image frame stored in the training sample set, wherein each sample image frame carries the initial position information and initial region category corresponding to each vehicle;
for each sample image frame, inputting the sample image frame into the first sub-model of the model to determine a first intermediate image frame carrying the predicted position information of each vehicle; inputting the first intermediate image frame into the second sub-model of the model, wherein the second sub-model determines the predicted feature vector of each vehicle, assigns a predicted number to each vehicle based on the predicted feature vector, and determines a second intermediate image frame carrying the predicted feature vector of each vehicle; and inputting the second intermediate image frame into the third sub-model of the model to determine the predicted region category corresponding to the region of each vehicle;
determining the first loss value corresponding to the first sub-model based on the initial position information and predicted position information of each vehicle in each sample image frame;
determining the second loss value corresponding to the second sub-model based on whether at least two vehicles in each sample image frame have the same predicted number;
determining the third loss value corresponding to the third sub-model based on the initial region category and the predicted region category corresponding to each vehicle in each sample image frame;
adjusting the parameters of the model based on the first loss value, the second loss value, and the third loss value.

3. The method according to claim 1, characterized in that, before determining the ID corresponding to the target vehicle as the ID of the vehicle in the current image frame, the method includes:
determining whether the number of times the target vehicle appears in the image frames already acquired by the image acquisition device exceeds a preset first count threshold;
if not, determining the first image acquisition device preceding the image acquisition device according to the pre-saved order of the image acquisition devices;
obtaining the vehicle information of each first candidate vehicle that has left the monitoring range of the first image acquisition device, wherein the vehicle information includes the candidate ID and candidate feature vector of the first candidate vehicle;
determining, based on the candidate feature vector carried in the vehicle information of each first candidate vehicle and the feature vector of the target vehicle, the target first candidate vehicle that matches the target vehicle among the first candidate vehicles;
determining, based on the candidate feature vector of the target first candidate vehicle and the target feature vector of the vehicle, whether the target first candidate vehicle matches the vehicle, and if they match, updating the second count, that is, the number of times the target vehicle and the target first candidate vehicle have matched;
if the updated second count exceeds the preset second count threshold, determining the candidate ID of the target first candidate vehicle as the ID of the target vehicle.

4. The method according to claim 3, characterized in that the method further includes: deleting the saved vehicle information of the target first candidate vehicle.

5. The method according to claim 1, characterized in that the method further includes:
determining, based on the ID of each vehicle in the current image frame and the ID of the vehicle in each image frame already acquired by the image acquisition device, whether there is a third candidate vehicle that does not appear in the current image frame;
if such a vehicle exists, determining, among the image frames already acquired by the image acquisition device, a first frame count (the number of frames in which the third candidate vehicle appears) and a second frame count (the number of frames in which it is absent), and if the first frame count exceeds a preset appearance-frame threshold and the second frame count exceeds a preset disappearance-frame threshold, determining the third candidate vehicle to be a vehicle that has left the monitoring range of the image acquisition device;
obtaining the feature vector and vehicle information of the third candidate vehicle in the last image frame in which it appears, and saving the vehicle information.

6. A vehicle tracking device, characterized in that the device includes:
an acquisition module, used to acquire the current image frame captured by the image acquisition device;
a feature extraction module, used to input the current image frame into the trained model and obtain the feature vector of each vehicle contained in the current image frame output by the model, as well as the region category corresponding to the captured region of each vehicle, wherein the feature vector includes a sub-feature vector corresponding to each region category, and the region categories include the front of the vehicle, the body of the vehicle, and the rear of the vehicle;
a matching module, used to determine, for each vehicle contained in the current image frame and based on the target feature vector and target region category corresponding to the vehicle, the target vehicle in the previous image frame that matches the vehicle, and to determine the ID corresponding to the target vehicle as the ID of the vehicle in the current image frame;
wherein the feature extraction module is specifically used to input the current image frame into the first sub-model of the model to determine a first intermediate image frame carrying the position information of each vehicle; input the first intermediate image frame into the second sub-model of the model to determine a second intermediate image frame carrying the feature vector of each vehicle; and input the second intermediate image frame into the third sub-model of the model to determine the region category corresponding to the region of each vehicle, and output the feature vector and region category corresponding to each vehicle;
and wherein the matching module is specifically used to determine the target sub-feature vector corresponding to the target region category in the target feature vector, and the sub-feature vector corresponding to the target region category in the feature vector of each second candidate vehicle in the previous image frame; determine the similarity between the target sub-feature vector and each sub-feature vector; and determine as the target vehicle the second candidate vehicle with the highest similarity, provided that this similarity exceeds the similarity threshold.

7. An electronic device, characterized in that the electronic device includes a processor that executes a computer program stored in a memory to implement the steps of the vehicle tracking method as described in any one of claims 1-5.

8. A computer-readable storage medium, characterized in that it stores a computer program that, when executed by a processor, implements the steps of the vehicle tracking method as described in any one of claims 1-5.