Intelligent parking state sensing method and device based on target detection tracking

By deploying vehicle target detection and tracking models on edge computing gateways, and combining improved deep learning and optimization algorithms, the problems of network dependence, detection and tracking disconnect, and weak model generalization ability of existing parking status perception technologies are solved, achieving efficient and stable parking status perception.

CN121545115B · Active · Publication Date: 2026-06-23 · BEIJING TEDA ZHIYUAN ENG TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: BEIJING TEDA ZHIYUAN ENG TECH CO LTD
Filing Date: 2025-11-13
Publication Date: 2026-06-23

Application Information

Patent Timeline

13 Nov 2025: Application
23 Jun 2026: Publication (CN121545115B)

IPC: G06V20/52; G06V20/40; G06V10/25; G06V10/80; G06V10/82; G06V10/44

AI Tagging

Application Domain

Character and pattern recognition

Technology Topics

Cloud processing; Tracking model

AI Technical Summary

Technical Problem

Existing parking status perception technologies rely on cloud processing, which demands high network bandwidth and adds latency. Detection and tracking are disconnected, status perception is simplistic, and models generalize poorly, so it is difficult to meet requirements for real-time performance and adaptability to complex scenarios.

Method used

Vehicle target detection and tracking models are deployed to edge computing gateways, and combined with improved deep learning and optimization algorithms, vehicle target detection, tracking, and state determination are performed. The model is optimized through a federated learning mechanism to achieve multi-objective optimization decision-making.

Benefits of technology

It reduces network bandwidth dependence, reduces data transmission latency, provides second-level parking space status awareness, improves the stability and accuracy of awareness, adapts to various complex scenarios, and reduces model deployment costs.

✦ Generated by Eureka AI based on patent content.

Smart Images

Figure CN121545115B_ABST

Patent Text Reader

Abstract

The application discloses a parking state intelligent sensing method and device based on target detection tracking, and relates to the technical field of image recognition. The method comprises the following steps: deploying a vehicle target detection model and a vehicle target tracking model in a cloud server to all edge computing gateways of a parking lot monitoring system; inputting a video stream collected by a camera into the vehicle target detection model of the edge computing gateway, performing vehicle target detection, and obtaining a plurality of vehicle targets; inputting the plurality of vehicle targets into the vehicle target tracking model of the edge computing gateway, performing vehicle target tracking, and obtaining active trajectories of the plurality of vehicle targets; and using an improved optimization algorithm to perform dynamic optimization according to the active trajectories of the plurality of vehicle targets, and obtaining a parking state list of the parking lot. The method solves the problems of the prior art, such as dependence on cloud processing, disconnection between detection and tracking, simple state sensing, and weak model generalization ability.

Description

Technical Field

[0001] This invention relates to the field of image recognition technology, and in particular to a parking status intelligent perception method and device based on target detection and tracking.

Background Technology

[0002] With the acceleration of urbanization and the surge in car ownership, parking difficulties have become a common problem restricting urban development. Smart parking systems are key infrastructure for solving this problem, and they depend on sensing the occupancy status of each parking space accurately and in real time.

[0003] Currently, mainstream parking status sensing technologies mainly include:

Geomagnetic / inductive loop detection: This method detects magnetic field or physical changes caused by vehicles by burying sensors under the parking space. While this method offers high accuracy, it is complex to install, costly, difficult to maintain, and cannot obtain additional information such as vehicle identification.

Ultrasonic / infrared detection: This method installs sensors above the parking space to determine occupancy by emitting and receiving sound or infrared waves. This method is relatively low-cost but susceptible to environmental interference (such as rain, snow, dust, and temperature) and has limited installation locations.

Pure video analysis technology: This method uses cameras to capture images and then uses image processing techniques to determine the parking space status. Traditional methods typically employ background subtraction, inter-frame subtraction, or simple image recognition (such as identifying whether a vehicle is within the parking space lines). These methods have poor robustness in complex scenarios such as changes in lighting, shadows, and occlusion, resulting in high false positive and false negative rates.

[0004] However, existing technologies still have the following shortcomings:

[0005] 1) Reliance on cloud processing: Uploading all video streams to cloud servers for analysis requires high network bandwidth, has high transmission latency, makes it difficult to meet real-time requirements, and poses a risk of data privacy leakage.

[0006] 2) Disconnect between detection and tracking: Most solutions rely solely on single-frame images for detection, lacking continuous analysis of vehicle movement trajectories. When a vehicle is briefly obscured or detection jitter occurs, it can easily lead to repeated jumps in parking space status judgment, resulting in insufficient stability.

[0007] 3) Simple state perception: After obtaining the vehicle detection results, simple rules (such as the intersection over union (IoU) between the detection box and the parking space area) are usually used to determine the parking space occupancy. This decision-making method is prone to errors in complex situations such as vehicles straddling parking spaces or dense parking, and lacks intelligent comprehensive decision-making capabilities.

[0008] 4) Weak model generalization ability: When deployed in different parking lots (for example, indoor or outdoor sites, different lighting conditions, and different camera angles), a large amount of labeled data is required for retraining, resulting in high costs for model migration and optimization.

Summary of the Invention

[0009] This invention provides a parking status intelligent perception method and device based on target detection and tracking, which solves the problems of existing technologies such as reliance on cloud processing, disconnect between detection and tracking, simple status perception, and weak model generalization ability.

[0010] In a first aspect, embodiments of the present invention provide a parking state intelligent perception method based on target detection and tracking, the method comprising:

[0011] Deploy the vehicle target detection model and vehicle target tracking model from the cloud server to all edge computing gateways of the parking lot monitoring system;

[0012] The video stream captured by the camera is input into the vehicle target detection model of the edge computing gateway to perform vehicle target detection and obtain several vehicle targets;

[0013] Several vehicle targets are input into the vehicle target tracking model of the edge computing gateway to perform vehicle target tracking and obtain the active trajectories of several vehicle targets;

[0014] Based on the active trajectories of several vehicle targets, an improved optimization algorithm is used to dynamically optimize and obtain a parking status list for the parking lot.

[0015] The technical solution provided in this application has at least the following beneficial effects:

[0016] Employing a cloud-based training and edge-based inference architecture, complex model training and aggregation are performed in the cloud, while real-time video analysis tasks (detection, tracking, and state determination) are offloaded to the edge computing gateway. This significantly reduces the dependence of video stream uploads on network bandwidth, minimizes data transmission latency, and achieves second-level parking space status awareness, while simultaneously ensuring the privacy and security of local video data.

Through analysis of vehicle activity trajectories and decision-making based on optimization algorithms, the system is insensitive to jitter and transient errors in single-frame detection. Combined with the final state confirmation step, it can output smooth, stable, and persistent parking space status, avoiding frequent state jumps and providing reliable data support for upper-layer applications (such as parking guidance, reverse vehicle search, and billing management).

The parking space status awareness problem is modeled as a multi-objective problem. By comprehensively considering multiple pieces of evidence, the algorithm can make intelligent decisions that are closer to human cognition, accurately distinguish the occupancy status of adjacent parking spaces, greatly reduce the false positive and false negative rates, and flexibly adapt to various complex parking scenarios. Whether it is angled parking spaces, mechanical garages, or dynamic processes such as vehicle entry and exit and reversing into parking spaces, the algorithm can achieve the optimal perception effect in the scenario by adjusting the weights of the fitness function.

The cloud uses a dynamic federated weight mechanism (which can dynamically adjust the aggregated weights according to the model evaluation metrics of each training server) to perform secure aggregation, so that the final model integrates knowledge from multiple scenarios and has stronger generalization ability. When deploying in a new parking lot, only a small amount of local data is needed to participate in federated fine-tuning for rapid adaptation, which greatly reduces the cost and cycle of model deployment and optimization.

[0017] In one alternative implementation, the vehicle target detection model and vehicle target tracking model from the cloud server are deployed to all edge computing gateways of the parking lot monitoring system, including:

[0018] Using deep learning algorithms, an initial vehicle target detection model and an initial vehicle target tracking model are built in a cloud server, and the initial vehicle target detection model and the initial vehicle target tracking model are deployed to all training servers.

[0019] Based on the federated learning mechanism, local training image datasets are used in training servers for different scenarios to train the corresponding initial vehicle target detection model and initial vehicle target tracking model locally, and the model update and model evaluation metrics after local training are uploaded to the cloud server.

[0020] Based on the model updates and corresponding model evaluation metrics of each training server, the initial vehicle target detection model and the initial vehicle target tracking model of the cloud server are securely aggregated according to the dynamic federated weight mechanism to obtain the final vehicle target detection model and the final vehicle target tracking model.

[0021] The final vehicle target detection model and the final vehicle target tracking model are deployed to all edge computing gateways connected to the cloud server.

[0022] In one alternative implementation, the vehicle target detection model is built based on the improved YOLOv8 algorithm, and the vehicle target detection model includes a backbone network based on the MobileNetV3 algorithm, a neck network built based on the BiFPN algorithm, and a multi-task unified detection head. A dynamic receptive field attention module is set between the backbone network and the neck network. The dynamic receptive field attention module includes parallel local feature branches, global context branches, and dynamic receptive field branches.

[0023] The vehicle target tracking model is built based on the GAT-KMA algorithm and includes a spatiotemporal graph construction module, an inference and association module based on the GAT algorithm, and a trajectory generation module based on the KMA algorithm.

[0024] In one optional implementation, the video stream captured by the camera is input to the vehicle target detection model of the edge computing gateway to perform vehicle target detection, resulting in several vehicle targets, including:

[0025] The video stream is captured by cameras installed at key locations in the parking lot monitoring system and then input to the edge computing gateway.

[0026] Frames are extracted from the video stream and preprocessed to obtain consecutive frames of image data to be identified, which are then input into the vehicle target detection model.

[0027] The backbone network of the vehicle target detection model is used to extract the original feature map of the image data to be identified in each frame;

[0028] Using the dynamic receptive field attention module of the vehicle target detection model, the local feature map, global context feature map, and dynamic receptive field feature map of the original feature map are extracted, and then fused by a concatenation operation according to the dynamically generated attention weights to obtain the first fused feature map.

[0029] The neck network of the vehicle target detection model is used to perform a second fusion of the original feature map and the first fusion feature map to obtain the second fusion feature map.

[0030] Based on the second fused feature map, the multi-task unified detection head of the vehicle target detection model is used to detect vehicle targets and obtain several initial vehicle targets in the image data to be identified.

[0031] Post-processing is performed on several initial vehicle targets to obtain several final vehicle targets.

[0032] In one optional implementation, several vehicle targets are input into the vehicle target tracking model of the edge computing gateway to perform vehicle target tracking and obtain the active trajectories of the several vehicle targets, including:

[0033] Several vehicle targets are input into the edge computing gateway, the initial trajectory set corresponding to the image data to be identified in the previous frame is extracted, and a spatiotemporal graph including several node features and edge features is constructed using the spatiotemporal graph construction module of the vehicle target tracking model based on several vehicle targets.

[0034] Using the reasoning and association module of the vehicle target tracking model, reasoning and association are performed on the spatiotemporal graph to obtain the association matrix;

[0035] Using the trajectory generation module of the vehicle target tracking model, the association matrix is used as the cost matrix to find the optimal trajectory node for each detection node in the spatiotemporal graph.

[0036] Traverse all frames of the image data to be identified, and based on the initial trajectory set, perform trajectory management on several trajectory nodes of the image data to be identified in each frame to obtain the active trajectories of several vehicle targets.
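The matching step in [0035] can be illustrated with a small sketch. The source names the Kuhn-Munkres algorithm for this step; the code below swaps in an exhaustive search over permutations, which reaches the same minimum-cost assignment on tiny matrices but does not scale. The function name `associate` and the example cost values are hypothetical, and the sketch assumes that lower cost means a better match and that there are no more detections than trajectories.

```python
from itertools import permutations


def associate(cost):
    """Return (assignment, total_cost) minimizing the summed cost.

    cost[i][j] is the cost of linking detection i to trajectory j.
    Exhaustive search stands in for Kuhn-Munkres here, so this is only
    practical for a handful of targets.
    """
    n_det, n_trk = len(cost), len(cost[0])
    assert n_det <= n_trk, "sketch assumes no more detections than trajectories"
    best, best_total = None, float("inf")
    # Try every way of giving each detection a distinct trajectory.
    for cols in permutations(range(n_trk), n_det):
        total = sum(cost[i][j] for i, j in enumerate(cols))
        if total < best_total:
            best, best_total = cols, total
    return list(best), best_total


# Hypothetical cost matrix: 2 detections, 3 existing trajectories.
cost = [
    [0.1, 0.9, 0.8],
    [0.7, 0.2, 0.9],
]
assignment, total = associate(cost)
```

Here `assignment[i]` is the trajectory index chosen for detection `i`. A production system would more likely use a polynomial-time Kuhn-Munkres implementation and add handling for unmatched detections (new trajectories) and unmatched trajectories (lost targets), which the trajectory management in [0036] implies but does not detail.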

[0037] In one alternative implementation, an improved optimization algorithm is used to dynamically optimize based on the active trajectories of several vehicle targets, resulting in a parking status list for the parking lot, including:

[0038] Based on the active trajectories of several vehicle targets, parking space areas are defined and evidence is extracted to obtain several parking spaces and a multi-dimensional evidence vector for each parking space.

[0039] Based on the multidimensional evidence vectors of several parking spaces, the problem of intelligent perception of parking space status is modeled as a multi-objective optimization problem. An improved optimization algorithm is used to perform dynamic optimization to obtain the set of parking status of the parking spaces.

[0040] The parking status set of parking spaces is confirmed to obtain a parking status list.
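The source does not say how the confirmation in [0040] is carried out. One common way to obtain the stable, non-jumping output described in [0016] is to accept a state change only after it has been observed for several consecutive frames. The sketch below shows that idea under this assumption; the class name `StateConfirmer`, the state labels, and the `hold` parameter are all hypothetical.

```python
class StateConfirmer:
    """Confirm a parking-space state only after it persists for `hold` frames.

    This debouncing rule is an assumption, not the patent's stated method.
    """

    def __init__(self, initial="free", hold=3):
        self.confirmed = initial
        self.hold = hold
        self._candidate = None
        self._count = 0

    def update(self, observed):
        if observed == self.confirmed:
            # Agreement with the confirmed state cancels any pending change.
            self._candidate, self._count = None, 0
        else:
            if observed == self._candidate:
                self._count += 1
            else:
                self._candidate, self._count = observed, 1
            if self._count >= self.hold:
                self.confirmed = observed
                self._candidate, self._count = None, 0
        return self.confirmed


confirmer = StateConfirmer(initial="free", hold=3)
observations = ["occupied", "free", "occupied", "occupied", "occupied", "free"]
history = [confirmer.update(o) for o in observations]
```

In this run the single "occupied" blip in the first frame is ignored, and the state switches only once three consecutive "occupied" observations have arrived.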

[0041] In one optional implementation, parking space areas are defined and evidence is extracted based on the active trajectories of several vehicle targets, resulting in several parking spaces and a multi-dimensional evidence vector for each parking space, including:

[0042] Define parking space areas in the parking lot to obtain a number of parking spaces;

[0043] Perform evidence initialization to obtain the parking space state set format and multidimensional evidence format;

[0044] Based on the state set format and multidimensional evidence format, evidence is extracted and quantified according to the active trajectories of several vehicle targets to generate a multidimensional evidence vector for each parking space.

[0045] In one alternative implementation, based on the multidimensional evidence vectors of several parking spaces, the intelligent perception problem of parking space status is modeled as a multi-objective optimization problem. An improved optimization algorithm is used to dynamically optimize and obtain a set of parking space parking states, including:

[0046] Based on the multidimensional evidence format of the intelligent perception problem of parking space status, a fitness function for a multi-objective optimization problem is set, and the IACO population parameters and maximum number of iterations are set.

[0047] Based on the IACO population parameters, the population is initialized using the Tent chaotic mapping sequence to obtain the initial IACO population; each IACO individual in the IACO population corresponds to a set of alternative parking states for a parking space.

[0048] Perform pheromone initialization to obtain the initial pheromone of each initial IACO individual;

[0049] A solution is constructed for the initial IACO population. Once all initial IACO individuals have constructed solutions, a global pheromone update is performed, and the global optimal solution is retained.

[0050] Based on the updated pheromones, the initial IACO population is iteratively updated to obtain the updated IACO population, and the global optimal solution is updated.

[0051] When the number of iterations reaches the maximum number of iterations or the fitness value of the global optimum meets the requirements, the iterative update of the IACO population is terminated, and the global optimum of the current iteration is output.

[0052] Decode the individual vectors of the IACO individuals corresponding to the global optimal solution to obtain the optimal parking state set of the parking space.
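Steps [0046] to [0052] can be sketched in a few dozen lines. The source does not expand "IACO"; the pheromone steps suggest an improved ant colony optimization, and the sketch assumes that reading. Everything concrete here is hypothetical: the binary state encoding (0 = free, 1 = occupied), the single scalar `evidence` value per space, the scalarized `fitness` function, the skewed Tent map parameter `a`, and the evaporation rate `rho`. A real implementation would use the full multidimensional evidence vector and the patent's own fitness function, which this text does not give.

```python
import random


def tent_sequence(n, x0=0.37, a=0.7):
    """Generate n values from a skewed Tent chaotic map (a != 0.5 avoids
    the rapid collapse to zero that the symmetric map shows in floating point)."""
    xs, x = [], x0
    for _ in range(n):
        x = x / a if x < a else (1.0 - x) / (1.0 - a)
        xs.append(x)
    return xs


def fitness(states, evidence):
    """Hypothetical scalar fitness: agreement between states and evidence."""
    return sum(e if s == 1 else 1.0 - e for s, e in zip(states, evidence))


def init_population(n_ants, n_spaces):
    """Tent-map initialization: one candidate state list per individual."""
    xs = tent_sequence(n_ants * n_spaces)
    return [[1 if xs[k * n_spaces + i] > 0.5 else 0 for i in range(n_spaces)]
            for k in range(n_ants)]


def iaco(evidence, n_ants=10, max_iter=30, rho=0.1, seed=0):
    rng = random.Random(seed)
    n = len(evidence)
    population = init_population(n_ants, n)
    tau = [[1.0, 1.0] for _ in range(n)]  # pheromone per (space, state)
    best = max(population, key=lambda s: fitness(s, evidence))
    for _ in range(max_iter):
        # Global pheromone update: evaporate, then reinforce the best solution.
        for i in range(n):
            tau[i] = [(1.0 - rho) * t for t in tau[i]]
            tau[i][best[i]] += fitness(best, evidence) / n
        # Each ant builds a new solution, sampling states by pheromone share.
        population = [
            [0 if rng.random() < tau[i][0] / (tau[i][0] + tau[i][1]) else 1
             for i in range(n)]
            for _ in range(n_ants)
        ]
        candidate = max(population, key=lambda s: fitness(s, evidence))
        if fitness(candidate, evidence) > fitness(best, evidence):
            best = candidate  # retain the global optimum
    return best


# Hypothetical evidence that each of four spaces is occupied.
evidence = [0.9, 0.1, 0.8, 0.2]
best_states = iaco(evidence)
```

The returned list plays the role of the decoded individual vector in [0052]. The early-termination condition on the fitness value from [0051] is omitted for brevity; only the iteration limit is applied.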

[0053] In one alternative implementation, the multidimensional evidence vector includes spatial evidence, constraint evidence, temporal evidence, and behavioral evidence.
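The four evidence types are named in the source, but this text does not say how each one is quantified. The sketch below shows one possible quantification from an active trajectory, purely as an illustration: spatial evidence as box overlap (IoU) with the space, constraint evidence as lack of overlap with a neighbouring space, temporal evidence as the share of frames spent over the space, and behavioral evidence as the share of frames in which the vehicle barely moved. The names `Evidence`, `iou`, `extract`, and the thresholds `still_px` and `min_iou` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    spatial: float     # overlap between the latest vehicle box and the space
    constraint: float  # 1 minus overlap with a neighbouring space
    temporal: float    # fraction of frames the vehicle overlapped the space
    behavioral: float  # fraction of frame-to-frame steps with little motion


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def extract(track, space, neighbour, still_px=2.0, min_iou=0.3):
    """Build an evidence vector; `track` is a list of boxes, oldest first."""
    overlaps = [iou(box, space) for box in track]
    # Manhattan displacement of the top-left corner between consecutive frames.
    moves = [abs(b2[0] - b1[0]) + abs(b2[1] - b1[1])
             for b1, b2 in zip(track, track[1:])]
    return Evidence(
        spatial=overlaps[-1],
        constraint=1.0 - iou(track[-1], neighbour),
        temporal=sum(o >= min_iou for o in overlaps) / len(track),
        behavioral=sum(m <= still_px for m in moves) / len(moves) if moves else 0.0,
    )


# Hypothetical scene: a vehicle drives in from the right and stops in the space.
space = (0, 0, 4, 4)
neighbour = (4, 0, 8, 4)
track = [(10, 0, 14, 4), (5, 0, 9, 4), (0, 0, 4, 4), (0, 0, 4, 4)]
ev = extract(track, space, neighbour)
```

Axis-aligned boxes keep the sketch short; angled parking spaces, which the source mentions, would need polygon overlap instead.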

[0054] In a second aspect, embodiments of the present invention provide a parking status intelligent sensing device based on target detection and tracking, used to implement an image recognition method, the device comprising:

[0055] The model deployment unit is used to deploy the vehicle target detection model and the vehicle target tracking model in the cloud server to all edge computing gateways of the parking lot monitoring system.

[0056] The vehicle target detection unit is used to input the video stream captured by the camera into the vehicle target detection model of the edge computing gateway to perform vehicle target detection and obtain several vehicle targets;

[0057] The vehicle target tracking unit is used to input several vehicle targets into the vehicle target tracking model of the edge computing gateway, perform vehicle target tracking, and obtain the active trajectories of several vehicle targets;

[0058] The parking status generation unit is used to dynamically optimize the parking status list of the parking lot based on the active trajectories of several vehicle targets using an improved optimization algorithm.

[0059] A third aspect of this invention provides an electronic device, which includes:

[0060] At least one processor; and a memory communicatively connected to the at least one processor; wherein,

[0061] The memory stores instructions that can be executed by at least one processor, such that the at least one processor can perform the method proposed in the first aspect of the present invention.

[0062] A fourth aspect of the present invention provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the method as described in the first aspect of the present invention.

Attached Figure Description

[0063] Figure 1 is a schematic diagram of the electronic device structure of the hardware operating environment involved in the embodiments of the present invention.

[0064] Figure 2 is a flowchart illustrating the steps of an intelligent parking status perception method based on target detection and tracking provided in an embodiment of the present invention.

[0065] Figure 3 is a functional unit diagram of a parking status intelligent sensing device based on target detection and tracking provided in an embodiment of the present invention.

Detailed Implementation

[0066] To make the above-mentioned objects, features, and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort are within the scope of protection of the present invention.

[0067] The present invention will be further described below with reference to the accompanying drawings.

[0068] Referring to Figure 1, Figure 1 is a schematic diagram of the electronic device structure of the hardware operating environment involved in the embodiments of the present invention.

[0069] As shown in Figure 1, the electronic device may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable communication between these components. The user interface 1003 may include a display screen or an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface or a wireless interface. The network interface 1004 may optionally include a standard wired interface or a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk drive. The memory 1005 may also optionally be a storage device independent of the aforementioned processor 1001.

[0070] Those skilled in the art will understand that the structure shown in Figure 1 does not constitute a limitation on the electronic device, which may include more or fewer components than shown, combine certain components, or have a different component arrangement.

[0071] As shown in Figure 1, the memory 1005, which serves as a storage medium, may include an operating system, a data storage module, a network communication module, a user interface module, and electronic programs.

[0072] In the electronic device shown in Figure 1, the network interface 1004 is mainly used for data communication with the network server; the user interface 1003 is mainly used for data interaction with the user; the processor 1001 and the memory 1005 in the electronic device of the present invention can be set in the electronic device. The electronic device calls the parking state intelligent sensing device based on target detection and tracking stored in the memory 1005 through the processor 1001 and executes the parking state intelligent sensing method based on target detection and tracking provided in the embodiment of the present invention.

[0073] Referring to Figure 2, the present invention provides an intelligent parking status perception method based on target detection and tracking, the method comprising:

[0074] S201: Deploy the vehicle target detection model and vehicle target tracking model from the cloud server to all edge computing gateways of the parking lot monitoring system;

[0075] S202: Input the video stream captured by the camera into the vehicle target detection model of the edge computing gateway to perform vehicle target detection and obtain several vehicle targets;

[0076] S203: Input several vehicle targets into the vehicle target tracking model of the edge computing gateway to perform vehicle target tracking and obtain the active trajectories of several vehicle targets;

[0077] S204: Based on the active trajectories of several vehicle targets, an improved optimization algorithm is used to dynamically optimize and obtain a parking status list for the parking lot.

[0080] In one alternative implementation, the vehicle target detection model and vehicle target tracking model from the cloud server are deployed to all edge computing gateways of the parking lot monitoring system, including:

[0081] S2011: Using deep learning algorithms, build the initial vehicle target detection model and the initial vehicle target tracking model in the cloud server, and deploy the initial vehicle target detection model and the initial vehicle target tracking model to all training servers.

[0082] S2012: Based on the federated learning mechanism, local training image datasets are used in training servers for different scenarios to train the corresponding initial vehicle target detection model and initial vehicle target tracking model locally, and the model update and model evaluation metrics after local training are uploaded to the cloud server.

[0083] The local training image dataset includes images taken in the current scene, with manually annotated bounding boxes and predicted labels, resulting in an image dataset for scene-based training;

[0084] S2013: Based on the model updates and corresponding model evaluation metrics of each training server, and using the dynamic federated weight mechanism, the initial vehicle target detection model and the initial vehicle target tracking model of the cloud server are securely aggregated to obtain the final vehicle target detection model and the final vehicle target tracking model.

[0085] The formula is:

[0086]

[0087] In the formula, the terms are, in order: the global model parameters at iterations t+1 and t; the model parameter update of the k-th training server; the current iteration number t; the dynamic federated weight of the k-th training server at iteration t; and the number of training servers K.

[0088]

[0089] In the formula, the terms are: the performance evaluation weight of training server k, generated from its model evaluation metrics; the current iteration number t; the index y of the model evaluation metric, where the metrics include training data volume, model training accuracy, model training efficiency, importance of the model training scenario, etc.; the number of model evaluation metrics Y; and the weight assigned to each model evaluation metric.

[0090]

[0091] In the formula, the terms are the y-th model evaluation metric of training server k and the number of training servers K.

[0092] S2014: Deploy the final vehicle target detection model and the final vehicle target tracking model to all edge computing gateways connected to the cloud server.
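The formulas in [0086], [0088], and [0090] did not survive in this text, so the aggregation in S2013 can only be sketched from the symbol definitions. The code below is one plausible reading, not a reconstruction of the patent's formulas: each metric is normalized across the K servers, a per-server score is formed as a weighted sum over the Y metrics, the scores are normalized into dynamic federated weights, and the global parameters are updated by adding the weighted server updates. The names `dynamic_weights`, `aggregate`, `metrics`, `lambdas`, and the sample numbers are hypothetical, and the secure-aggregation layer mentioned in the source is left out entirely.

```python
def dynamic_weights(metrics, lambdas):
    """Turn per-server evaluation metrics into aggregation weights.

    metrics[k][y] is the y-th evaluation metric of training server k.
    lambdas[y] is the weight given to metric y.
    The sum-normalization used here is an assumption.
    """
    n_servers, n_metrics = len(metrics), len(lambdas)
    # Normalize each metric across servers so different scales are comparable.
    col_sums = [sum(metrics[k][y] for k in range(n_servers))
                for y in range(n_metrics)]
    scores = [
        sum(lambdas[y] * metrics[k][y] / col_sums[y] for y in range(n_metrics))
        for k in range(n_servers)
    ]
    total = sum(scores)
    return [s / total for s in scores]


def aggregate(global_w, updates, alphas):
    """Assumed update rule: new_w = w + sum over k of alpha_k * update_k,
    applied element-wise to flat parameter lists."""
    return [
        w + sum(a * u[i] for a, u in zip(alphas, updates))
        for i, w in enumerate(global_w)
    ]


# Hypothetical metrics per server: (training data volume, training accuracy).
metrics = [[100, 0.8], [300, 0.9]]
lambdas = [0.5, 0.5]
alphas = dynamic_weights(metrics, lambdas)

# Hypothetical flat parameter vector and per-server updates.
new_w = aggregate([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], alphas)
```

With these numbers the second server, which has more data and higher accuracy, receives the larger weight, which matches the stated intent of adjusting aggregation weights according to each server's evaluation metrics.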

[0093] In one alternative implementation, the vehicle target detection model is built based on the improved YOLOv8 algorithm, and the vehicle target detection model includes a backbone network based on the MobileNetV3 algorithm, a neck network built based on the Bidirectional Feature Pyramid Network (BiFPN) algorithm, and a multi-task unified detection head. A dynamic receptive field attention module is set between the backbone network and the neck network. The dynamic receptive field attention module includes parallel local feature branches, global context branches, and dynamic receptive field branches.

[0094] Local feature branch: A 3x3 depthwise separable convolution is responsible for capturing local detail features of the vehicle, such as headlights and grille;

[0095] Global context branch: A global average pooling layer followed by two 1x1 convolutions, responsible for capturing the overall contour and semantic information of the vehicle;

[0096] Dynamic receptive field branch: A deformable convolutional layer whose kernel sampling points are not fixed, but are learned through an additional sub-network to obtain offsets. This allows the kernel to dynamically adjust its receptive field according to the actual shape of the vehicle (such as when it is partially occluded), focusing on the visible part.

[0097] The improved YOLOv8 model is adopted, and its MobileNetV3 backbone network is lightweight and suitable for edge deployment. The innovative dynamic receptive field attention module can adaptively fuse local details, global context and multi-scale receptive field information, effectively cope with occlusion, illumination changes and scale differences, and improve the accuracy of vehicle detection.
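The adaptive fusion of the three branches can be pictured with plain feature vectors. The sketch below is only illustrative: it assumes the attention weights come from a softmax over one score per branch and that the weighted branch outputs are concatenated. How the scores are produced is not specified in the text, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of branch scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(
    local: Sequence[float],
    global_ctx: Sequence[float],
    dynamic: Sequence[float],
    branch_scores: Sequence[float],
) -> List[float]:
    """Scale each branch by its attention weight, then concatenate.

    branch_scores holds one score per branch in the order
    (local, global context, dynamic receptive field); this scoring
    scheme is an assumption made for illustration.
    """
    weights = softmax(branch_scores)
    fused: List[float] = []
    for weight, features in zip(weights, (local, global_ctx, dynamic)):
        fused.extend(weight * value for value in features)
    return fused
```

A higher score for the dynamic receptive field branch, for instance under occlusion, shifts more of the fused representation toward that branch.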

[0098] The vehicle target tracking model is built on a combination of the Graph Attention Network (GAT) and the Kuhn-Munkres algorithm (KMA), referred to as GAT-KMA, and includes a spatiotemporal graph construction module, an inference and association module based on the GAT algorithm, and a trajectory generation module based on the KMA algorithm.

[0099] By leveraging GAT to uncover complex high-order relationships between detected targets and historical trajectories, and combining it with KMA for optimal matching, we can effectively handle complex scenarios such as target occlusion, intersection, disappearance, and reappearance, generating continuous and stable vehicle activity trajectories, and providing high-quality temporal input for subsequent state judgment.

[0100] In one optional implementation, the video stream captured by the camera is input to the vehicle target detection model of the edge computing gateway to perform vehicle target detection, resulting in several vehicle targets, including:

[0101] S2021: Use cameras installed at key locations in the parking lot monitoring system to capture video streams and input the video streams to the edge computing gateway;

[0102] High-definition network cameras supporting H.265 encoding, with a resolution of no less than 1920x1080 and a frame rate of no less than 25fps should be selected. The cameras should be deployed on the top of the parking lot or on the pillars, using a top-down or oblique top-down angle to ensure that the field of view can completely cover the target monitoring area and minimize mutual obstruction between vehicles. Precise geometric calibration is required during deployment to obtain the camera's intrinsic parameters (focal length, principal point, distortion coefficient) and extrinsic parameters (rotation matrix, translation vector) for subsequent image coordinate to world coordinate conversion.

[0103] S2022: Perform frame capture and image preprocessing on the video stream to obtain continuous frame image data to be identified, and input the continuous frame image data to be identified into the vehicle target detection model;

[0104] Frame capture involves capturing images from the video stream according to a fixed time window to obtain initial image data to be identified for consecutive frames. Image preprocessing is then performed on the initial image data to be identified for consecutive frames to obtain final image data to be identified for consecutive frames. Finally, the final image data to be identified for consecutive frames is input into the vehicle target detection model.

[0105] Image preprocessing includes:

[0106] Distortion correction: Using camera intrinsic parameters and distortion coefficients, distortion correction is performed on the initial image data to be identified in each frame to eliminate barrel or pincushion distortion caused by wide-angle lenses and ensure the authenticity of vehicle shape and spatial relationship.

[0107] Perspective transformation: Define the four vertices of the monitored area in the image (such as the four corners of a parking lot) and their corresponding four vertices in the bird's-eye view. Calculate the perspective transformation matrix, apply the matrix to each frame of the corrected image, and generate a bird's-eye view. In the bird's-eye view, the size, orientation, and spacing of vehicles are more intuitive, which greatly simplifies the subsequent detection, tracking, and parking space association tasks.

[0108] Image Enhancement: To address the common problems of insufficient lighting or overexposure in parking lots, adaptive image enhancement algorithms are employed. For example, contrast-limited adaptive histogram equalization is used to enhance details in dark areas while suppressing noise. For strong light scenes, the Retinex algorithm can be used to separate the illumination component and the reflection component, and the reflection component is enhanced to restore details in overexposed areas.

[0109] S2023: Use the backbone network of the vehicle target detection model to extract the original feature map of the image data to be identified in each frame;

[0110] S2024: Using the dynamic receptive field attention module of the vehicle target detection model, extract the local feature map, global context feature map and dynamic receptive field feature map of the original feature map, and perform a fusion operation through concatenation based on the dynamically generated attention weights to obtain the first fused feature map;

[0111] S2025: Using the neck network of the vehicle target detection model, the original feature map and the first fused feature map are fused a second time to obtain the second fused feature map;

[0112] S2026: Based on the second fused feature map, use the multi-task unified detection head of the vehicle target detection model to perform vehicle target detection and obtain several initial vehicle targets in the image data to be identified;

[0113] S2027: Post-process several initial vehicle targets to obtain several final vehicle targets;

[0114] Post-processing includes:

[0115] Confidence filtering: First, set a confidence threshold (e.g., 0.5) to filter out all predicted boxes with a confidence level below the threshold, thus removing a large number of obvious background false detections;

[0116] Non-maximum suppression: For the remaining predicted boxes, non-maximum suppression is performed according to the category. The purpose is to solve the problem of a vehicle being detected by multiple boxes at the same time. The steps are as follows:

[0117] Sort all boxes by confidence level from highest to lowest;

[0118] Select the box with the highest confidence level and denote it as A;

[0119] Calculate the intersection over union (IoU) of box A with all other boxes;

[0120] Delete all boxes whose IoU with box A is greater than a set threshold (e.g., 0.45);

[0121] Repeat the above process for the remaining boxes until all boxes have been processed;

[0122] Finally, for each frame of image data to be identified, several final vehicle targets are obtained;
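The post-processing steps above (confidence filtering followed by non-maximum suppression) can be sketched as follows. This is a minimal single-class version that uses the example thresholds from the text (0.5 and 0.45). Boxes are assumed to be axis-aligned `(x1, y1, x2, y2)` tuples, and per-category handling is omitted for brevity.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def postprocess(
    boxes: Sequence[Box],
    scores: Sequence[float],
    conf_thresh: float = 0.5,
    iou_thresh: float = 0.45,
) -> List[int]:
    """Return indices of boxes kept after confidence filtering and NMS."""
    # Confidence filtering, then sort from highest to lowest confidence.
    candidates = sorted(
        (i for i, s in enumerate(scores) if s >= conf_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep: List[int] = []
    while candidates:
        best = candidates.pop(0)  # box A: highest remaining confidence
        keep.append(best)
        # Delete every box that overlaps box A by more than the threshold.
        candidates = [i for i in candidates if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep
```

The returned indices refer to the original `boxes` list, so the caller can recover both the kept boxes and their confidence values.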

[0123] In one optional implementation, several vehicle targets are input into the vehicle target tracking model of the edge computing gateway to perform vehicle target tracking and obtain the active trajectories of the several vehicle targets, including:

[0124] S2031: Input several vehicle targets into the edge computing gateway, extract the initial trajectory set corresponding to the image data to be identified in the previous frame, and construct a spatiotemporal graph including several node features and edge features based on several vehicle targets using the spatiotemporal graph construction module of the vehicle target tracking model.

[0125] It is worth noting that the detection box of each vehicle target among several vehicle targets is regarded as a detection node in the graph, and the last predicted position of the vehicle target's trajectory (predicted by motion models such as Kalman filtering) is regarded as a trajectory node in the graph.

[0126] Node features:

[0127] The node features of the detected node are composed of three parts:

[0128] 1) First spatial feature: normalized center coordinates of the detection box; 2) First motion feature: if the detection box is associated with a detection box in the previous frame, calculate its instantaneous velocity, otherwise it is zero; 3) First appearance feature: the image patch cropped from the detection box is input into a pre-trained re-identification (ReID) network to obtain a high-dimensional (e.g., 2048-dimensional) feature vector.

[0129] The node features of a trajectory node are composed of three parts: 1) Second spatial features: the current frame center coordinates predicted by its Kalman filter; 2) Second motion features: the velocity vector predicted by the Kalman filter; 3) Second appearance features: the appearance feature descriptor of the trajectory at the time of the most recent successful match.

[0130] Edge definition: Construct a fully connected bipartite graph between all detection nodes and all trajectory nodes, with edges connecting detection nodes and trajectory nodes;

[0131] Edge features: Edge features are used to measure the similarity between the detection box and the trajectory of the vehicle target, and consist of two parts:

[0132] Motion similarity: the Mahalanobis distance between the center point of the detection box and the predicted center point of the trajectory;

[0133] Appearance similarity: the cosine similarity between the appearance features of the detection box and the appearance features of the trajectory;

[0134] The final edge feature is a concatenation of motion similarity and appearance similarity, or a scalar value that combines the two.

[0135] By combining all nodes and their node features, and edges and their edge features, a spatiotemporal graph is constructed.
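The two edge features can be computed as in the sketch below. It assumes a 2D center point with a 2x2 covariance matrix taken from the trajectory's Kalman filter, and appearance descriptors given as plain lists of floats. The function names and the tuple returned by `edge_feature` are hypothetical choices for illustration.

```python
import math
from typing import Sequence, Tuple


def mahalanobis_2d(
    point: Sequence[float],
    mean: Sequence[float],
    cov: Sequence[Sequence[float]],
) -> float:
    """Mahalanobis distance between a detection center and a predicted center."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form v^T S^-1 v using the closed-form 2x2 inverse.
    quad = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(max(quad, 0.0))


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two appearance descriptors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def edge_feature(det_center, det_appearance, trk_center, trk_cov, trk_appearance) -> Tuple[float, float]:
    """Concatenated edge feature: (motion distance, appearance similarity)."""
    return (
        mahalanobis_2d(det_center, trk_center, trk_cov),
        cosine_similarity(det_appearance, trk_appearance),
    )
```

Note that the motion term is a distance (smaller means more similar) while the appearance term is a similarity (larger means more similar), so a scalar combination would need to invert one of them first.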

[0136] S2032: Using the reasoning and association module of the vehicle target tracking model, reasoning and association are performed on the spatiotemporal graph to obtain the association matrix;

[0137] Message passing: In each layer of the GAT network, node information is aggregated along edges. For example, a trajectory node collects information from all connected detection nodes, and the importance of each piece of information is determined from the edge features (i.e., similarity), for which the GAT network learns an attention coefficient.

[0138] Node update: The new feature of a trajectory node is the weighted sum of its own feature and the features of all neighboring nodes, with the weights being the attention coefficients. After multiple layers of message passing, the node's features encode the context information of its entire neighborhood.

[0139] Association matrix generation: The output of the GAT network is the updated node features. In particular, we focus on the updated edge features or a dedicated edge classifier. For each edge, a multilayer perceptron is used to map its updated features to an association score, which represents the probability that the detection box belongs to the trajectory. The scores of all edges constitute an association matrix.

[0140] S2033: Using the trajectory generation module of the vehicle target tracking model, the association matrix is used as the cost matrix to find the optimal trajectory node for each detection node in the spatiotemporal graph;

[0141] It is worth noting that the association matrix serves as the cost matrix, and the Hungarian algorithm (another name for the Kuhn-Munkres algorithm) finds the optimal one-to-one match between detections and trajectories that maximizes the total association score.
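The matching objective can be illustrated on a small association matrix. The sketch below deliberately swaps in an exhaustive search over permutations instead of the Kuhn-Munkres algorithm: it finds the same optimum but is only practical for a handful of nodes. The `min_score` gate for rejecting weak pairs is an added assumption.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def best_assignment(
    scores: Sequence[Sequence[float]], min_score: float = 0.0
) -> List[Tuple[int, int]]:
    """Return (detection, trajectory) pairs maximizing the total score.

    scores[d][t] is the association score between detection d and
    trajectory t. Brute force stand-in for Kuhn-Munkres; small inputs only.
    """
    n_det = len(scores)
    n_trk = len(scores[0]) if n_det else 0
    if n_det == 0 or n_trk == 0:
        return []
    if n_det <= n_trk:
        # Choose a distinct trajectory for every detection.
        best = max(
            permutations(range(n_trk), n_det),
            key=lambda p: sum(scores[d][t] for d, t in enumerate(p)),
        )
        pairs = list(enumerate(best))
    else:
        # More detections than trajectories: choose a detection per trajectory.
        best = max(
            permutations(range(n_det), n_trk),
            key=lambda p: sum(scores[d][t] for t, d in enumerate(p)),
        )
        pairs = [(d, t) for t, d in enumerate(best)]
    # Pairs below the gate are treated as unmatched.
    return sorted((d, t) for d, t in pairs if scores[d][t] >= min_score)
```

Detections and trajectories that do not appear in the returned pairs are the "unmatched" cases handled by the track status update. A greedy pick of the single highest score can miss the optimum, which is why a global assignment is used.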

[0142] Track status update:

[0143] Match successful: For a matched detection box-trajectory pair, update the Kalman filter of the trajectory with the actual observation value of the detection box, add the position of the detection box to the trajectory point sequence, and update the appearance descriptor of the trajectory with the appearance features of the detection box (moving average can be used). The trajectory remains active and its continuous matching counter is incremented by 1.

[0144] Unmatched trajectories: For active trajectories that do not match any detection, the state becomes temporarily lost. Their Kalman filters only predict and do not update, and their continuous loss counter is incremented by 1. If the loss counter exceeds a preset threshold, the trajectory is marked as removed and deleted from the graph.

[0145] Unmatched detections: For detection boxes that do not match any trajectory, treat them as new targets, initialize a new trajectory, assign a new unique ID to it, set the initial state of the Kalman filter, extract appearance features, and set the state to temporarily lost. Only when the new trajectory successfully matches the detection for several consecutive frames will its state be promoted to active; otherwise, it will be deleted.
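The track status rules above amount to a small state machine, sketched here without the Kalman filter or appearance update. The state names, the `confirm_hits` and `max_misses` thresholds, and the choice to reactivate a lost track on a new match are assumptions; in particular the provisional state of a new track is called `tentative` here to keep it distinct from a lost track.

```python
from dataclasses import dataclass


@dataclass
class Track:
    track_id: int
    state: str = "tentative"  # tentative -> active -> lost -> removed
    hits: int = 1             # consecutive matched frames
    misses: int = 0           # consecutive unmatched frames


def update_track(
    track: Track, matched: bool, confirm_hits: int = 3, max_misses: int = 30
) -> Track:
    """Advance one track by one frame given its matching result."""
    if matched:
        track.hits += 1
        track.misses = 0
        if track.state == "tentative" and track.hits >= confirm_hits:
            track.state = "active"   # confirmed after several consecutive matches
        elif track.state == "lost":
            track.state = "active"   # reappeared target (assumed behaviour)
    elif track.state == "tentative":
        track.state = "removed"      # unconfirmed new track is deleted
    else:
        track.hits = 0
        track.misses += 1
        track.state = "removed" if track.misses > max_misses else "lost"
    return track
```

In a full tracker this function would be called once per frame for every track, with `matched` taken from the assignment step.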

[0146] S2034: Traverse all frames of image data to be identified, and based on the initial trajectory set, perform trajectory management on several trajectory nodes of the image data to be identified in each frame to obtain the active trajectories of several vehicle targets.

[0147] In one alternative implementation, an improved optimization algorithm is used to dynamically optimize based on the active trajectories of several vehicle targets, resulting in a parking status list for the parking lot, including:

[0148] S2041: Based on the active trajectories of several vehicle targets, define parking space areas and extract evidence to obtain several parking spaces and a multi-dimensional evidence vector for each parking space.

[0149] S2042: Based on the multidimensional evidence vectors of several parking spaces, the problem of intelligent perception of parking space status is modeled as a multi-objective optimization problem. An improved optimization algorithm is used to perform dynamic optimization to obtain the set of parking status of the parking spaces.

[0150] S2043: Confirm the parking status set of parking spaces to obtain a parking status list;

[0151] It is worth noting that, in order to prevent frequent switching of status due to momentary interference, a time window filter is applied to the parking status set. For example, only when a parking space is judged as "occupied" for N consecutive seconds (such as 3 seconds) will its status be finally confirmed as "occupied" and updated to the parking status list. The final parking status list (including parking space ID, status, update time, etc.) can be provided to parking guidance screens, user apps, or parking management systems through API interfaces to realize intelligent applications.
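The time window filter can be sketched as a small debouncer that confirms a new status only after it has been observed continuously for the hold time (3 seconds in the example above). The class and attribute names are hypothetical, and timestamps are assumed to be in seconds.

```python
from typing import Optional


class StatusDebouncer:
    """Confirm a parking space status only after it persists for hold_seconds."""

    def __init__(self, hold_seconds: float = 3.0, initial: str = "idle") -> None:
        self.confirmed = initial
        self.hold_seconds = hold_seconds
        self._candidate = initial
        self._since: Optional[float] = None  # when the candidate was first seen

    def update(self, status: str, timestamp: float) -> str:
        """Feed one per-frame status; return the currently confirmed status."""
        if status == self.confirmed:
            # Momentary interference ended; discard any pending candidate.
            self._candidate, self._since = status, None
            return self.confirmed
        if status != self._candidate or self._since is None:
            # A new candidate status starts its own time window.
            self._candidate, self._since = status, timestamp
        if timestamp - self._since >= self.hold_seconds:
            self.confirmed = status
            self._since = None
        return self.confirmed
```

One debouncer instance per parking space would sit between the optimizer output and the published parking status list.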

[0152] In one optional implementation, parking space areas are defined and evidence is extracted based on the active trajectories of several vehicle targets, resulting in several parking spaces and a multi-dimensional evidence vector for each parking space, including:

[0153] S20411: Define parking space areas in the parking lot to obtain a number of parking spaces;

[0154] During initialization, a calibration tool is used to define a precise polygonal area for each parking space on the monitoring screen and assign it a unique ID. This area information is stored in a configuration file, resulting in a number of parking spaces.

[0155] S20412: Perform evidence initialization to obtain the parking space state set format and multidimensional evidence format;

[0156] Maintain a state machine for each parking space, with the state set format S = {idle, occupied, entering, leaving, uncertain}. Initially, all parking spaces are in the idle state.

[0157] Define a multidimensional evidence vector for each parking space, which is used for intelligent sensing of the parking space status over time. The format of the multidimensional evidence vector is: multidimensional evidence vector = [spatial evidence, constraint evidence, temporal evidence, behavioral evidence], where each component is the evidence of that type corresponding to state S;

[0158] S20413: Based on the state set format and multidimensional evidence format, evidence is extracted and quantified according to the active trajectories of several vehicle targets to generate a multidimensional evidence vector for each parking space.

[0159] In this embodiment, the multidimensional evidence vector includes spatial evidence, constraint evidence, temporal evidence, and behavioral evidence;

[0160] Spatial evidence: For each vehicle target in the current frame, calculate the IoU between its position and each parking space region. If the IoU is greater than the maximum IoU threshold, the vehicle is considered to be within the parking space and the spatial evidence score is high; if the IoU is less than the minimum IoU threshold, the vehicle is considered not to be in the parking space; if the IoU is between the two thresholds, the vehicle is considered to be entering or leaving the parking space and the spatial evidence score is medium;

[0161] Constraint evidence: A parking space can only be occupied by one vehicle at a time. If the spatial evidence score is high but another vehicle with a high IoU is also detected, the constraint evidence score decreases, indicating a conflict that requires further observation.

[0162] Temporal evidence: Records the historical state of the parking space. If the previous state was idle and a vehicle is currently detected entering, the temporal evidence supports the transition to the entering state; if the previous state was occupied and the vehicle is currently detected leaving, the temporal evidence supports the transition to the leaving state. Temporal evidence provides inertia for state changes, preventing state jumps caused by single-frame detection jitter;

[0163] Behavioral evidence: Combining tracking information, analyze the vehicle's trajectory. If the trajectory moves smoothly from outside the parking space to inside within the most recent few frames and the vehicle eventually stops (with a speed close to 0), the behavioral evidence strongly supports the occupied state; conversely, if the trajectory moves from inside to outside, it supports the leaving state. Behavioral evidence is the strongest evidence and can effectively filter out false detections (such as water stains or shadows on the ground being mistaken for vehicles) and brief stops (such as a vehicle briefly passing through an adjacent space when reversing into a parking space).
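Three of the evidence terms can be quantified as in the sketch below. The numeric scores (1.0, 0.5, 0.0, and a signed behavioral score), the threshold values, and the function names are assumptions made for illustration, since the text only describes the scores qualitatively.

```python
from typing import Sequence


def spatial_evidence(iou_value: float, min_iou: float = 0.2, max_iou: float = 0.6) -> float:
    """High score inside the space, medium while entering or leaving, else low."""
    if iou_value > max_iou:
        return 1.0
    if iou_value < min_iou:
        return 0.0
    return 0.5


def constraint_evidence(vehicle_ious: Sequence[float], max_iou: float = 0.6) -> float:
    """Lower the score when more than one vehicle strongly overlaps the space."""
    strong = sum(1 for value in vehicle_ious if value > max_iou)
    return 1.0 if strong <= 1 else 1.0 / strong


def behavioral_evidence(
    inside_flags: Sequence[bool], speeds: Sequence[float], stop_speed: float = 0.1
) -> float:
    """+1 supports occupied, -1 supports leaving, 0 is inconclusive.

    inside_flags and speeds describe the last few frames of one trajectory.
    """
    if len(inside_flags) < 2 or len(speeds) < 1:
        return 0.0
    entered = (not inside_flags[0]) and inside_flags[-1]
    left = inside_flags[0] and not inside_flags[-1]
    if entered and speeds[-1] <= stop_speed:
        return 1.0
    if left:
        return -1.0
    return 0.0
```

A vehicle that merely crosses the space without stopping yields an inconclusive behavioral score, which is how brief pass-throughs are kept from flipping the status.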

[0164] In one alternative implementation, based on the multidimensional evidence vectors of several parking spaces, the intelligent perception problem of parking space status is modeled as a multi-objective optimization problem. An improved ant colony optimization (IACO) algorithm is used for dynamic optimization to obtain a set of parking space parking states, including:

[0165] S20421: Based on the multidimensional evidence format of the intelligent perception problem of parking space status, set the fitness function of the multi-objective optimization problem, and set the IACO population parameters (number of parking spaces) and the maximum number of iterations;

[0166] It is worth noting that this also includes: initializing the pheromone matrix (usually a small constant); setting the pheromone evaporation coefficient; and setting the heuristic information weights.

[0167]

[0168] In the formula, the quantities are: the fitness value of IACO individual X; the spatial evidence, constraint evidence, temporal evidence, and behavioral evidence corresponding to IACO individual X; the fitness weighting coefficients, which can be adjusted according to actual needs; the IACO individual X; and the scoring function used in the calculation;
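One plausible reading of the fitness function is a weighted sum of the four evidence scores. The sketch below assumes exactly that, with the scoring function taken as the identity. The dictionary keys and example weights are hypothetical.

```python
from typing import Dict

EVIDENCE_KEYS = ("spatial", "constraint", "temporal", "behavioral")


def fitness(evidence: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the four evidence scores for one candidate solution.

    Assumes higher fitness is better and that all scores share one scale.
    """
    return sum(weights[key] * evidence[key] for key in EVIDENCE_KEYS)


def best_candidate(candidates: Dict[str, Dict[str, float]], weights: Dict[str, float]) -> str:
    """Pick the candidate state whose evidence gives the highest fitness."""
    return max(candidates, key=lambda state: fitness(candidates[state], weights))
```

Giving the behavioral term the largest weight reflects the statement that behavioral evidence is the strongest of the four, but the actual coefficients are left to tuning.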

[0169] S20422: Based on the IACO population parameters, the population is initialized using the Tent chaotic mapping sequence to obtain the initial IACO population; each IACO individual in the IACO population corresponds to a set of alternative parking states for a parking space.

[0170] The formula is:

[0171] $$X_i = lb + z_i\,(ub - lb)$$

[0172] In the formula, $X_i$ is the i-th initial IACO individual in the initial IACO population; $z_i$ is the i-th chaotic variable; $ub$ and $lb$ are the upper and lower bounds of the search space; i is the IACO individual index;

[0173] $$z_i = \begin{cases} 2\,z_{i-1}, & 0 \le z_{i-1} < 0.5 \\ 2\,(1 - z_{i-1}), & 0.5 \le z_{i-1} \le 1 \end{cases}$$

[0174] In the formula, $z_{i-1}$ is the (i-1)-th chaotic variable. Compared with random initialization, chaotic initialization distributes the population more evenly over the solution space, thus enhancing diversity.
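Tent-map initialization can be sketched as follows, assuming the standard Tent map with breakpoint 0.5 and a linear mapping of each chaotic value into the search bounds. The seed `z0`, the re-seeding guard, and the function names are assumptions. The guard is needed because the doubling map loses one bit of precision per step in binary floating point and would otherwise collapse to 0 after a few dozen iterations.

```python
import random
from typing import List


def tent_sequence(length: int, z0: float = 0.37) -> List[float]:
    """Generate chaotic values in (0, 1) with the Tent map."""
    values: List[float] = []
    z = z0
    for step in range(length):
        z = 2.0 * z if z < 0.5 else 2.0 * (1.0 - z)
        if z <= 0.0 or z >= 1.0:
            # Floating-point collapse guard (an assumption, not from the source):
            # re-seed deterministically so the sequence keeps moving.
            z = random.Random(step).uniform(0.01, 0.99)
        values.append(z)
    return values


def init_population(
    pop_size: int, dim: int, lower: float, upper: float, z0: float = 0.37
) -> List[List[float]]:
    """Map the chaotic sequence into [lower, upper] for every individual."""
    chaos = tent_sequence(pop_size * dim, z0)
    return [
        [lower + chaos[i * dim + j] * (upper - lower) for j in range(dim)]
        for i in range(pop_size)
    ]
```

Each individual here is a real-valued vector. Decoding such a vector into discrete parking states is a separate step that this sketch does not cover.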

[0175] S20423: Perform pheromone initialization to obtain the initial pheromone of each initial IACO individual;

[0176] S20424: Construct solutions for the initial IACO population until all initial IACO individuals have constructed solutions, then perform a global pheromone update and retain the global optimal solution;

[0177] Trajectory i selects parking space j with probability $p_{ij}^{t}$, where i is the trajectory index and j is the parking space index. The formula is:

[0178] $$p_{ij}^{t} = \frac{\tau_j^{t}\,\eta_j^{\beta}}{\sum_{l \in A_i} \tau_l^{t}\,\eta_l^{\beta}}, \quad j \in A_i \subseteq \{1, \dots, N\}$$

[0179] In the formula, $p_{ij}^{t}$ is the selection probability of the i-th IACO individual at iteration t; $\tau_j^{t}$ is the pheromone concentration of the j-th parking space; $\eta_j$ is the heuristic information of the j-th parking space; j is the parking space index; N is the number of parking spaces; $A_i$ is the set of parking spaces accessible to the i-th trajectory; $\beta$ is the weight of the heuristic information; t is the current iteration number;
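The selection step can be sketched with the usual ant colony transition rule: each accessible parking space is scored by its pheromone multiplied by its heuristic information raised to a weight, and a space is then drawn by roulette wheel. The exact rule in the source is not recoverable, so the form below, the `beta` default, and the function names are assumptions.

```python
import random
from typing import Dict, Sequence


def selection_probabilities(
    pheromone: Sequence[float],
    heuristic: Sequence[float],
    allowed: Sequence[int],
    beta: float = 2.0,
) -> Dict[int, float]:
    """Probability of choosing each accessible parking space."""
    scores = {j: pheromone[j] * (heuristic[j] ** beta) for j in allowed}
    total = sum(scores.values())
    if total <= 0.0:
        # Degenerate case: fall back to a uniform choice.
        return {j: 1.0 / len(scores) for j in scores}
    return {j: score / total for j, score in scores.items()}


def choose_space(probabilities: Dict[int, float], rng: random.Random) -> int:
    """Roulette-wheel draw over the probability table."""
    threshold = rng.random()
    cumulative = 0.0
    chosen = -1
    for space, probability in probabilities.items():
        chosen = space
        cumulative += probability
        if threshold < cumulative:
            break
    return chosen
```

Passing an explicit `random.Random` instance keeps the draw reproducible, which helps when comparing runs of the optimizer.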

[0180] During the solution construction process of IACO individuals, at each step, the pheromones along that path are immediately locally updated to increase the diversity of subsequent IACO individuals exploring other paths. The formula for pheromone volatilization is:

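[0181] $$\tau^{t+1} = (1 - \rho)\,\tau^{t} + \rho\,\tau_0$$

[0182] In the formula, $\tau^{t+1}$ is the pheromone after evaporation at iteration number t+1; $\tau^{t}$ is the pheromone at iteration number t; $\rho$ is the convergence factor; $\tau_0$ is the initial pheromone;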

[0183]

[0184] In the formula, the quantities are: the maximum and minimum values of the convergence factor; the maximum number of iterations; the current number of iterations t; two adjustment parameters; and the hyperbolic tangent function;

[0185] After all IACO individuals have constructed solutions, calculate the fitness of each IACO individual and find the optimal solution and the global optimal solution in this iteration.

[0186] The formula for global pheromone updates is:

[0187]

[0188] In the formula, the quantities are the updated pheromone at iteration number t+1 and the total pheromone increment.

[0189]

[0190] In the formula, the quantity defined is the fitness value of the i-th IACO individual;
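The pheromone updates can be sketched as below. The local and global rules follow the common ant colony pattern of evaporation plus deposit. The tanh schedule for the convergence factor is a hypothetical form chosen only to show a factor that decays smoothly from its maximum to its minimum, and it is not the formula of the source. All names and default values are assumptions.

```python
import math
from typing import List, Sequence


def convergence_factor(
    t: int, max_iter: int, rho_min: float = 0.1, rho_max: float = 0.9,
    a: float = 4.0, b: float = 0.5,
) -> float:
    """Hypothetical tanh schedule: near rho_max early, near rho_min late."""
    progress = t / max_iter
    return rho_min + (rho_max - rho_min) * 0.5 * (1.0 - math.tanh(a * (progress - b)))


def local_update(tau: float, rho: float, tau0: float) -> float:
    """Pull a just-used pheromone value back toward the initial level."""
    return (1.0 - rho) * tau + rho * tau0


def global_update(
    pheromone: Sequence[float],
    rho: float,
    solutions: Sequence[Sequence[int]],
    fitness_values: Sequence[float],
) -> List[float]:
    """Evaporate, then deposit each individual's fitness on the spaces it used."""
    delta = [0.0] * len(pheromone)
    for spaces, fit in zip(solutions, fitness_values):
        for space in spaces:
            delta[space] += fit
    return [(1.0 - rho) * tau + d for tau, d in zip(pheromone, delta)]
```

A large factor early on keeps pheromone values close together and favors exploration, while a small factor late in the run lets good solutions accumulate pheromone and favors convergence.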

[0191] S20425: Based on the updated pheromones, iteratively update the initial IACO population to obtain the updated IACO population, and update the global optimal solution.

[0192] S20426: When the number of iterations reaches the maximum number of iterations or the fitness value of the global optimal solution meets the requirements, terminate the iterative update of the IACO population and output the global optimal solution of the current iteration;

[0193] S20427: Decode the individual vectors of the IACO individuals corresponding to the global optimal solution to obtain the optimal parking state set of the parking space;

[0194] In this embodiment, the parking space status perception problem is modeled as a multi-objective optimization problem. By extracting four-dimensional evidence of space, constraints, time series and behavior, and using the improved IACO algorithm for dynamic optimization, it can comprehensively weigh multiple uncertain factors and make a globally optimal decision, which significantly reduces the misjudgment rate in complex situations such as vehicles crossing lanes and dense parking.

[0195] This invention also provides a parking status intelligent sensing device based on target detection and tracking. Referring to Figure 3, which shows a functional unit diagram of a parking status intelligent sensing device 300 based on target detection and tracking according to the present invention, the device may include the following units:

[0196] The model deployment unit 301 is used to deploy the vehicle target detection model and the vehicle target tracking model in the cloud server to all edge computing gateways of the parking lot monitoring system.

[0197] The vehicle target detection unit 302 is used to input the video stream captured by the camera into the vehicle target detection model of the edge computing gateway to perform vehicle target detection and obtain several vehicle targets;

[0198] The vehicle target tracking unit 303 is used to input several vehicle targets into the vehicle target tracking model of the edge computing gateway, perform vehicle target tracking, and obtain the active trajectories of several vehicle targets.

[0199] The parking status generation unit 304 is used to dynamically optimize based on the active trajectories of several vehicle targets using an improved optimization algorithm to obtain a parking status list for the parking lot.

[0200] Based on the same inventive concept, another embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus.

[0201] Memory, used to store computer programs;

[0202] When the processor executes the program stored in the memory, it implements the intelligent parking state perception method based on target detection and tracking of the present invention.

[0203] The communication bus mentioned above can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. This communication bus can be divided into address bus, data bus, control bus, etc. For ease of representation, only one thick line is used in the diagram, but this does not indicate that there is only one bus or one type of bus. The communication interface is used for communication between the aforementioned terminal and other devices. The memory can include Random Access Memory (RAM) or non-volatile memory, such as at least one disk storage device. Optionally, the memory can also be at least one storage device located remotely from the aforementioned processor.

[0204] The processors mentioned above can be general-purpose processors, including central processing units (CPUs), network processors (NPs), etc.; they can also be digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.

[0205] Furthermore, to achieve the above objectives, embodiments of the present invention also propose a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the intelligent parking state perception method based on target detection and tracking according to embodiments of the present invention.

[0206] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, apparatus, or computer program products. Therefore, embodiments of the present invention can take the form of entirely hardware embodiments, entirely software embodiments, or embodiments combining software and hardware aspects. Furthermore, embodiments of the present invention can take the form of computer program products implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0207] The embodiments of the present invention are described with reference to flowchart illustrations and / or block diagrams of methods, terminal devices (apparatus), and computer program products according to embodiments of the invention. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, produce a device for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0208] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0209] These computer program instructions can also be loaded onto a computer or other programmable data processing terminal equipment, causing a series of operational steps to be performed on the computer or other programmable terminal equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable terminal equipment provide steps for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0210] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. "And / or" indicates that either one or both can be chosen. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes the element.

[0211] The above are merely specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed in the present invention, and these modifications or substitutions should all be covered within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.

Claims

1. An intelligent parking status perception method based on target detection and tracking, characterized in that the method comprises:
deploying a vehicle target detection model and a vehicle target tracking model from a cloud server to all edge computing gateways of a parking lot monitoring system, wherein:
the vehicle target detection model is built on an improved YOLOv8 algorithm and includes a backbone network based on the MobileNetV3 algorithm, a neck network based on the BiFPN algorithm, and a multi-task unified detection head;
a dynamic receptive field attention module is arranged between the backbone network and the neck network, and includes a local feature branch, a global context branch, and a dynamic receptive field branch arranged in parallel; and
the vehicle target tracking model is built on the GAT-KMA algorithm and includes a spatiotemporal graph construction module, a reasoning and association module based on the GAT algorithm, and a trajectory generation module based on the KMA algorithm;
inputting the video stream captured by a camera into the vehicle target detection model of the edge computing gateway to perform vehicle target detection and obtain several vehicle targets;
inputting the several vehicle targets into the vehicle target tracking model of the edge computing gateway to perform vehicle target tracking and obtain the active trajectories of the several vehicle targets, comprising:
after the several vehicle targets are input into the edge computing gateway, extracting the initial trajectory set corresponding to the previous frame of image data to be identified, and constructing, with the spatiotemporal graph construction module of the vehicle target tracking model, a spatiotemporal graph including several node features and edge features based on the several vehicle targets;
performing reasoning and association on the spatiotemporal graph with the reasoning and association module of the vehicle target tracking model to obtain an association matrix;
using the association matrix as the cost matrix, finding, with the trajectory generation module of the vehicle target tracking model, the optimal trajectory node for each detection node in the spatiotemporal graph; and
traversing all frames of image data to be identified and, based on the initial trajectory set, performing trajectory management on the trajectory nodes of each frame of image data to be identified to obtain the active trajectories of the several vehicle targets; and
based on the active trajectories of the several vehicle targets, performing dynamic optimization with an improved optimization algorithm to obtain a parking status list for the parking lot, comprising:
based on the active trajectories of the several vehicle targets, defining parking space areas and extracting evidence to obtain several parking spaces and a multidimensional evidence vector for each parking space;
based on the multidimensional evidence vectors of the several parking spaces, modeling the intelligent perception problem of parking space status as a multi-objective optimization problem, and performing dynamic optimization with the improved optimization algorithm to obtain a parking status set for the parking spaces, comprising:
based on the multidimensional evidence format of the intelligent perception problem of parking space status, setting a fitness function for the multi-objective optimization problem, and setting the IACO population parameters and the maximum number of iterations;
based on the IACO population parameters, initializing the population with a Tent chaotic mapping sequence to obtain an initial IACO population, wherein each IACO individual in the IACO population corresponds to one set of candidate parking states for the parking spaces;
performing pheromone initialization to obtain the initial pheromone of each initial IACO individual;
constructing solutions for the initial IACO population and, once all initial IACO individuals have constructed their solutions, performing a global pheromone update and retaining the global optimal solution;
based on the updated pheromones, iteratively updating the initial IACO population to obtain an updated IACO population, and updating the global optimal solution;
when the number of iterations reaches the maximum number of iterations or the fitness value of the global optimal solution meets the requirement, terminating the iterative update of the IACO population and outputting the global optimal solution of the current iteration; and
decoding the individual vector of the IACO individual corresponding to the global optimal solution to obtain the optimal parking status set for the parking spaces; and
confirming the parking status set for the parking spaces to obtain the parking status list.
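
The optimization loop recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the two-state set, the scalar fitness (a single support score per state rather than a true multi-objective function), the Tent map parameter `a = 0.7`, the evaporation rate `rho`, and all function names are assumptions chosen for illustration only.

```python
import random

STATES = ("vacant", "occupied")  # hypothetical parking state set


def tent_sequence(x0, n, a=0.7):
    """Return n successive values of the Tent chaotic map on [0, 1]."""
    xs, x = [], x0
    for _ in range(n):
        x = x / a if x < a else (1.0 - x) / (1.0 - a)
        xs.append(x)
    return xs


def init_population(n_ants, n_spaces, x0=0.37):
    """Tent-map initialization: each individual holds one state index per space."""
    seq = tent_sequence(x0, n_ants * n_spaces)
    top = len(STATES) - 1
    return [[min(int(seq[ant * n_spaces + s] * len(STATES)), top)
             for s in range(n_spaces)] for ant in range(n_ants)]


def fitness(solution, evidence):
    """Hypothetical scalar fitness: summed evidence support of the chosen states."""
    return sum(evidence[i][state] for i, state in enumerate(solution))


def iaco(evidence, n_ants=8, max_iter=20, rho=0.3, target=None, seed=0):
    """Sketch of the loop: initialize, update pheromone, rebuild, keep the best."""
    rng = random.Random(seed)
    n_spaces = len(evidence)
    population = init_population(n_ants, n_spaces)
    best = max(population, key=lambda s: fitness(s, evidence))
    tau = [[1.0] * len(STATES) for _ in range(n_spaces)]  # initial pheromone
    for _ in range(max_iter):
        if target is not None and fitness(best, evidence) >= target:
            break  # fitness requirement met
        # Global pheromone update: evaporate, then reinforce the best solution.
        for i in range(n_spaces):
            for k in range(len(STATES)):
                tau[i][k] *= 1.0 - rho
            tau[i][best[i]] += fitness(best, evidence) / n_spaces
        # Every ant builds a new solution, sampling states by pheromone level.
        population = [[rng.choices(range(len(STATES)), weights=tau[i])[0]
                       for i in range(n_spaces)] for _ in range(n_ants)]
        candidate = max(population, key=lambda s: fitness(s, evidence))
        if fitness(candidate, evidence) > fitness(best, evidence):
            best = candidate  # retain the global optimum
    return [STATES[k] for k in best]  # decode the individual vector
```

Calling `iaco([[0.1, 0.9], [0.8, 0.2]])` returns one decoded state per parking space. Because the global best is only ever replaced by a strictly better candidate, the returned solution is never worse than the best Tent-initialized individual.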

2. The intelligent parking status perception method based on target detection and tracking according to claim 1, characterized in that deploying the vehicle target detection model and the vehicle target tracking model from the cloud server to all edge computing gateways of the parking lot monitoring system comprises:
building an initial vehicle target detection model and an initial vehicle target tracking model in the cloud server using deep learning algorithms, and deploying the initial vehicle target detection model and the initial vehicle target tracking model to all training servers;
based on a federated learning mechanism, training the corresponding initial vehicle target detection model and initial vehicle target tracking model locally on the training servers of different scenarios using local training image datasets, and uploading the model updates and model evaluation metrics obtained after local training to the cloud server;
based on the model updates and corresponding model evaluation metrics of each training server, securely aggregating the initial vehicle target detection model and the initial vehicle target tracking model of the cloud server according to a dynamic federated weight mechanism to obtain a final vehicle target detection model and a final vehicle target tracking model; and
deploying the final vehicle target detection model and the final vehicle target tracking model to all edge computing gateways connected to the cloud server.
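
The aggregation step of claim 2 can be sketched as follows. The claim does not state how the dynamic federated weights are computed, so this sketch assumes, purely for illustration, that each training server's weight is its evaluation metric normalized over all servers. The function names are hypothetical, and the secure aggregation protocol itself is not modeled.

```python
def dynamic_weights(metrics):
    """Assumed rule: weight each server by its share of the total metric."""
    total = sum(metrics)
    if total <= 0:
        # Fall back to uniform weights when no server reports a positive metric.
        return [1.0 / len(metrics)] * len(metrics)
    return [m / total for m in metrics]


def aggregate(global_params, updates, metrics):
    """Add the metric-weighted sum of server updates to the global parameters.

    global_params: flat list of floats held by the cloud server.
    updates: one flat list of parameter deltas per training server.
    metrics: one evaluation score per training server (for example accuracy).
    """
    weights = dynamic_weights(metrics)
    return [
        g + sum(w * update[j] for w, update in zip(weights, updates))
        for j, g in enumerate(global_params)
    ]


# Two training servers; the better-scoring server dominates the aggregate.
new_params = aggregate([1.0, 2.0], [[0.2, -0.4], [0.4, 0.0]], [0.6, 0.2])
```

With metrics of 0.6 and 0.2 the weights are 0.75 and 0.25, so the first parameter moves by 0.75 × 0.2 + 0.25 × 0.4 = 0.25.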

3. The intelligent parking status perception method based on target detection and tracking according to claim 2, characterized in that inputting the video stream captured by the camera into the vehicle target detection model of the edge computing gateway to perform vehicle target detection and obtain several vehicle targets comprises:
capturing the video stream with cameras installed at key locations in the parking lot monitoring system and inputting it into the edge computing gateway;
performing frame extraction and preprocessing on the video stream to obtain consecutive frames of image data to be identified, and inputting the consecutive frames of image data to be identified into the vehicle target detection model;
extracting, with the backbone network of the vehicle target detection model, the original feature map of each frame of image data to be identified;
extracting, with the dynamic receptive field attention module of the vehicle target detection model, the local feature map, the global context feature map, and the dynamic receptive field feature map of the original feature map, and fusing them by a concatenation operation according to dynamically generated attention weights to obtain a first fused feature map;
performing, with the neck network of the vehicle target detection model, a second fusion of the original feature map and the first fused feature map to obtain a second fused feature map;
based on the second fused feature map, detecting vehicle targets with the multi-task unified detection head of the vehicle target detection model to obtain several initial vehicle targets in the image data to be identified; and
post-processing the several initial vehicle targets to obtain several final vehicle targets.
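
The weighted concatenation performed by the dynamic receptive field attention module can be illustrated on plain Python lists. This is a sketch under stated assumptions: real feature maps are multi-channel tensors and the attention weights would come from a learned gating layer, whereas here the gating score is simply the mean activation of each branch, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(local, global_ctx, dynamic_rf):
    """Scale each branch by its attention weight, then concatenate.

    Each argument is a flat list of feature values from one branch.
    Returns the fused feature list and the three attention weights.
    """
    branches = [local, global_ctx, dynamic_rf]
    # Assumed gating: mean activation of each branch as its attention score.
    scores = [sum(branch) / len(branch) for branch in branches]
    weights = softmax(scores)
    fused = []
    for weight, branch in zip(weights, branches):
        fused.extend(weight * value for value in branch)
    return fused, weights


fused, weights = fuse_branches([1.0, 1.0], [1.0, 1.0], [1.0, 1.0])
```

Concatenation keeps the three branches in separate channels, so the fused length is the sum of the branch lengths; the attention weights only rescale each branch's contribution.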

4. The intelligent parking status perception method based on target detection and tracking according to claim 3, characterized in that defining parking space areas and extracting evidence based on the active trajectories of the several vehicle targets to obtain several parking spaces and a multidimensional evidence vector for each parking space comprises:
defining parking space areas in the parking lot to obtain several parking spaces;
performing evidence initialization to obtain a parking space state set format and a multidimensional evidence format; and
based on the state set format and the multidimensional evidence format, extracting and quantifying evidence from the active trajectories of the several vehicle targets to generate the multidimensional evidence vector for each parking space.

5. The intelligent parking status perception method based on target detection and tracking according to claim 4, characterized in that the multidimensional evidence vector includes spatial evidence, constraint evidence, temporal evidence, and behavioral evidence.
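
Claims 4 and 5 name the four evidence components without defining how each is quantified. The sketch below shows one possible quantification for a single trajectory and a single rectangular parking space. Every definition in it is an assumption made for illustration (overlap ratio as spatial evidence, box center inside the space as constraint evidence, dwell fraction as temporal evidence, stationarity as behavioral evidence), as are the thresholds and function names.

```python
import math


def overlap_ratio(box, space):
    """Fraction of the parking space rectangle covered by the vehicle box."""
    x1, y1 = max(box[0], space[0]), max(box[1], space[1])
    x2, y2 = min(box[2], space[2]), min(box[3], space[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = (space[2] - space[0]) * (space[3] - space[1])
    return inter / area if area > 0 else 0.0


def center(box):
    """Center point of an (x1, y1, x2, y2) box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def evidence_vector(trajectory, space, min_overlap=0.5, max_step=2.0):
    """Return (spatial, constraint, temporal, behavioral), each in [0, 1].

    trajectory: list of (x1, y1, x2, y2) boxes, one per frame.
    space: (x1, y1, x2, y2) rectangle of the parking space.
    """
    n = len(trajectory)
    if n == 0:
        return (0.0, 0.0, 0.0, 0.0)
    ratios = [overlap_ratio(box, space) for box in trajectory]
    centers = [center(box) for box in trajectory]
    spatial = sum(ratios) / n  # mean coverage of the space
    constraint = sum(
        space[0] <= cx <= space[2] and space[1] <= cy <= space[3]
        for cx, cy in centers
    ) / n  # share of frames with the box center inside the space
    temporal = sum(r >= min_overlap for r in ratios) / n  # dwell fraction
    steps = [math.dist(p, q) for p, q in zip(centers, centers[1:])]
    # Share of frame-to-frame moves small enough to count as stationary.
    behavioral = sum(s <= max_step for s in steps) / len(steps) if steps else 0.0
    return (spatial, constraint, temporal, behavioral)


parked = evidence_vector([(0, 0, 10, 10)] * 3, (0, 0, 10, 10))
```

A vehicle that sits exactly on the space for every frame scores 1.0 in all four components, while a vehicle parked elsewhere scores 0.0 in the first three and 1.0 only in the stationarity component.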

6. An intelligent parking status perception device based on target detection and tracking, used to implement the intelligent parking status perception method according to any one of claims 1-5, characterized in that the device comprises:
a model deployment unit, configured to deploy the vehicle target detection model and the vehicle target tracking model from the cloud server to all edge computing gateways of the parking lot monitoring system;
a vehicle target detection unit, configured to input the video stream captured by the camera into the vehicle target detection model of the edge computing gateway to perform vehicle target detection and obtain several vehicle targets;
a vehicle target tracking unit, configured to input the several vehicle targets into the vehicle target tracking model of the edge computing gateway to perform vehicle target tracking and obtain the active trajectories of the several vehicle targets; and
a parking status generation unit, configured to perform dynamic optimization with an improved optimization algorithm, based on the active trajectories of the several vehicle targets, to obtain the parking status list of the parking lot.
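
The trajectory generation step that the vehicle target tracking unit carries out (claim 1: use the association matrix as the cost matrix and find the optimal trajectory node for each detection node) amounts to a minimum-cost one-to-one assignment. The sketch below reads "KMA" as the Kuhn-Munkres (Hungarian) algorithm, which is an assumption, and the function name and matrix layout (rows are detections, columns are trajectory nodes) are hypothetical.

```python
def min_cost_assignment(cost):
    """Kuhn-Munkres assignment in O(n^2 * m) time.

    cost[i][j] is the cost of matching detection i to trajectory node j.
    Requires len(cost) <= len(cost[0]). Returns a list whose entry i is
    the trajectory index assigned to detection i.
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("need at least as many trajectory nodes as detections")
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    match = [0] * (m + 1)  # match[j] = row (1-based) matched to column j
    way = [0] * (m + 1)    # previous column on the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so that one more column becomes reachable.
            for j in range(m + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the matches along the augmenting path back to column 0.
        while j0 != 0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if match[j] != 0:
            assignment[match[j] - 1] = j - 1
    return assignment


# Two detections, two trajectory nodes: the cheapest pairing is 0->1, 1->0.
pairs = min_cost_assignment([[4, 1], [2, 3]])
```

If the association matrix stores affinities rather than costs, it would first need to be converted, for example by negating it. A practical tracker would also reject assignments whose cost exceeds a gating threshold and start new trajectories for the unmatched detections, which this sketch leaves out.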