A target tracking method and system based on unmanned aerial vehicle ID tracing
By constructing a combination of multi-factor identity identification and spatiotemporal graph neural network, the problems of single identity identification and data association in UAV target tracking technology in complex scenarios are solved, realizing the reliability of identity traceability and continuous tracking at all times, and adapting to resource constraints in different environments.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHENZHEN LONGING INNOVATION AVIATION TECH CO LTD
- Filing Date
- 2026-03-24
- Publication Date
- 2026-06-19
AI Technical Summary
Existing UAV target tracking technologies suffer from several problems in complex scenarios, including single identity identification, lack of multi-factor cross-validation capabilities, weak anti-spoofing ability, shallow data association, failure to fully explore high-order interaction relationships between targets, high ambiguity in association in dense scenes, disconnect between identity and tracking, and poor identity retention capabilities across sensors and occluded areas.
By constructing a multi-factor identity identifier that dynamically binds the Remote ID to a radio frequency fingerprint, using a spatiotemporal graph neural network for target detection and association, and employing a three-level cascaded strategy for matching and re-identification, the continuity and reliability of identity and trajectory are achieved.
It improves the reliability and anti-spoofing ability of drone identity tracing, solves the problem of identity preservation in complex environments, ensures continuous and real-time tracking at all times, and adapts to resource constraints in different scenarios.
Figure CN122244098A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of low-altitude security technology, specifically to a target tracking method and system based on UAV ID tracing.
Background Technology
[0002] With the rapid development of the low-altitude economy, drones are widely used in logistics delivery, inspection, monitoring, and similar applications. However, security incidents such as unauthorized ("black flight") intrusions and malicious interference with air traffic occur frequently, creating an urgent need for target tracking technology based on drone identity tracing. Current drone target tracking mainly relies on two types of schemes. The first is the broadcast recognition scheme based on Remote ID, which obtains the identity by parsing the remote identification code actively broadcast by the drone. The second is the visual association scheme based on multi-target tracking (MOT) algorithms, which uses a detection-association framework to maintain the target identity. The former complies with FAA, EASA, and other regulatory requirements, but faces limitations such as short detection range, susceptibility to forgery, and inability to identify targets without an ID. The latter relies on algorithms such as DeepSORT and ByteTrack, which achieve data association through Kalman filter prediction and appearance re-identification. However, these mainstream solutions have significant drawbacks in complex scenarios. The Remote ID solution only utilizes network-layer identity identification and lacks physical-layer verification methods, making it difficult to cope with ID spoofing attacks. At the same time, its detection range is usually less than 100 meters, which is insufficient to meet the beyond-line-of-sight requirements for key area defense. The visual tracking solution assumes linear target movement and uses the Hungarian algorithm for shallow bipartite matching. In drone swarm scenarios, the high maneuverability of the targets and frequent mutual occlusion lead to serious identity switching problems.
In addition, existing solutions separate identity recognition from target tracking, failing to take advantage of the short-term stability of the relative geometric structure during drone swarm flight, and failing to establish a dynamic binding mechanism between Remote ID and multi-source identities such as radio frequency fingerprints, resulting in the traceability chain being prone to breakage in complex environments. In summary, the current technology has three main problems: First, the identity identifier is single and lacks multi-factor cross-validation capability, resulting in weak anti-spoofing ability; second, the data association is shallow and does not fully explore the high-order interaction relationship between targets, leading to high ambiguity in association in dense scenes; and third, identity and tracking are disconnected, with poor ability to maintain identity across sensors and across occluded areas. Therefore, there is an urgent need for a tracking method that integrates multi-source identity tracing with spatiotemporal graph neural network association to improve the reliability of UAV identity preservation and tracing in complex environments.
Summary of the Invention
[0003] To address the aforementioned technical issues, a target tracking method and system based on UAV ID tracing is provided. This technical solution solves the problems of single identity identifiers, lack of multi-factor cross-validation capabilities, weak anti-spoofing ability, shallow data association, insufficient exploration of high-order interaction relationships between targets, high ambiguity in dense scenes, and disconnect between identity and tracking, resulting in poor identity retention capabilities across sensors and occluded areas.
[0004] To achieve the above objectives, the technical solution adopted by the present invention is as follows: A target tracking method based on UAV ID tracing includes:
S1. Detect the target in the current frame image to obtain the detection box and detection confidence. At the same time, receive the Remote ID information of the UAV, collect flight control signals or image transmission signals, and extract radio frequency fingerprint features. Dynamically bind the Remote ID and radio frequency fingerprint to construct a multi-factor identity identifier.
S2. For each UAV target detected in the current frame, construct a local neighborhood centered on its spatial location, calculate the relative geometric relationship with other targets in the neighborhood, including Euclidean distance, relative azimuth angle, and velocity vector difference, encode it into geometric feature vectors, and construct a spatial graph structure.
S3. Expand the continuous multi-frame spatial graph along the time dimension to construct a spatiotemporal heterogeneous graph. Embed the multi-factor identity identifier as a node attribute into the spatiotemporal heterogeneous graph and input it into the spatiotemporal graph neural network. Aggregate neighborhood geometric features through the spatial attention mechanism to complete the occluded target features. Fuse historical trajectory and current observation features through the temporal attention mechanism. Output the association confidence matrix that fuses identity similarity and geometric similarity.
S4. Adopt a three-level cascaded strategy based on the association confidence matrix: high-confidence targets are deterministically matched by joint judgment of strong geometric constraints and identity consistency; medium-confidence targets are finely associated using the fusion association confidence output by the spatiotemporal graph neural network; unmatched targets are re-identified by comparing their radio frequency fingerprints against the historical feature database, to restore their identity or initialize a new trajectory.
S5. Update the trajectory status and multi-factor identity of each successfully associated target, trigger an alarm for any target with identity switching, and output the trajectory sequence with multi-factor identity for flight tracing.
[0005] Preferably, step S1, which involves constructing a multi-factor identity identifier, includes: The Remote ID and radio frequency fingerprint are hashed and concatenated to generate a composite identity code, and the binding confidence of the two is calculated. When the binding confidence is lower than a preset threshold, an identity anomaly alarm is triggered. For targets without Remote ID, an anonymous identity is constructed solely based on radio frequency fingerprints, and cross-sensor identity maintenance of anonymous targets is achieved through cross-site radio frequency fingerprint comparison.
[0006] Preferably, step S2 specifically includes: Using the spatial location of each target in the current frame as the center, an adaptive k-nearest neighbor strategy is used to select the k nearest targets to form a local neighborhood, where the value of k is dynamically adjusted according to the target density in the current frame. Calculate the relative geometric relationship between the central target and each target in the neighborhood, including Euclidean distance, relative azimuth angle, velocity vector difference and acceleration vector difference, and encode the relative geometric relationship into a multidimensional geometric feature vector; A spatial graph structure is constructed based on all targets and their geometric feature vectors in the current frame, where nodes are targets and the weights of edges are determined by geometric similarity. The geometric similarity is negatively correlated with the distance between targets and positively correlated with the consistency of velocity direction. For a target that is occluded, its spatial position is inferred by its relative geometric relationship with neighboring targets, and the geometric feature vector of the occluded target is completed.
[0007] Preferably, in step S3, the step of expanding the spatial graph of consecutive multiple frames according to the time dimension to construct a spatiotemporal heterogeneous graph, and embedding the multi-factor identity identifier as a node attribute into the spatiotemporal heterogeneous graph specifically includes: Stack the spatial graphs of multiple consecutive frames in temporal order, establish temporal edges between target nodes in adjacent frames, and determine the weight of the temporal edges by motion prediction deviation and appearance similarity to construct a spatiotemporal heterogeneous graph containing spatial and temporal edges. The Remote ID in the multi-factor identity identifier is encoded as a one-hot vector and the radio frequency fingerprint is encoded as a low-dimensional feature vector. The two vectors are concatenated, and the result is mapped through a linear transformation into node attributes that are embedded in the spatiotemporal heterogeneous graph. Independent feature transformation matrices are configured for the spatial and temporal edges respectively, and the embedded spatiotemporal heterogeneous graph is input into the spatiotemporal graph neural network.
[0008] Preferably, in step S3, the inference and output of the spatiotemporal graph neural network specifically include: The attention coefficients between neighboring targets within a single frame are calculated using a spatial attention mechanism. Neighboring geometric features are aggregated to generate a complete node feature representation, which is used to infer the spatial state of occluded targets. The attention coefficients between targets across frames are calculated using a temporal attention mechanism, and historical trajectory features are aggregated with current observation features to generate temporal fusion features. The identity similarity in the node feature representation and the geometric similarity in the temporal fusion feature are weighted and fused to output the association confidence matrix.
[0009] Preferably, in step S4, the hierarchical determination of the three-level cascading strategy includes: Targets with a detection confidence higher than the first threshold are classified as high-confidence targets, targets with a detection confidence between the first threshold and the second threshold are classified as medium-confidence targets, and targets with a detection confidence lower than the second threshold are classified as low-confidence targets and treated as unmatched targets. The first threshold and the second threshold are dynamically and adaptively adjusted according to weather conditions, target size, or detection signal-to-noise ratio.
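The tiering rule above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the default numbers, and in particular the rule in `adapt_thresholds` (lowering both thresholds as the signal-to-noise ratio drops) are assumptions, since the text only states that the thresholds adapt to weather, target size, or signal-to-noise ratio. The handling of confidences exactly equal to a threshold is also an assumption.

```python
def classify_tier(confidence: float, first_threshold: float, second_threshold: float) -> str:
    """Map one detection confidence to a cascade tier.

    Above the first threshold -> 'high'; between the two -> 'medium';
    below the second threshold -> 'unmatched' (low confidence).
    """
    if second_threshold > first_threshold:
        raise ValueError("second_threshold must not exceed first_threshold")
    if confidence > first_threshold:
        return "high"
    if confidence >= second_threshold:  # boundary handling is an assumption
        return "medium"
    return "unmatched"


def adapt_thresholds(base_first: float, base_second: float, snr_db: float,
                     reference_snr_db: float = 20.0, max_shift: float = 0.15):
    """Hypothetical adaptation rule: relax both thresholds when SNR is poor."""
    deficit = max(0.0, reference_snr_db - snr_db) / reference_snr_db
    shift = min(max_shift, max_shift * deficit)
    return base_first - shift, base_second - shift


if __name__ == "__main__":
    first, second = adapt_thresholds(0.7, 0.4, snr_db=10.0)
    for conf in (0.9, 0.5, 0.2):
        print(conf, classify_tier(conf, first, second))
```

Keeping the adaptation separate from the classification makes it possible to swap in a different rule (for example one driven by target size) without touching the tiering logic.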
[0010] Preferably, in step S4, the high-confidence target is determined by a joint judgment of strong geometric constraints and identity consistency, and the medium-confidence target is determined by a fine association using the fusion association confidence score output by the spatiotemporal graph neural network. Specifically, this includes: High-confidence targets utilize strong geometric constraints to verify spatial position deviation and velocity direction consistency, and use identity consistency to verify Remote ID matching and RF fingerprint similarity. When both geometric constraints and identity consistency are satisfied, deterministic matching is performed. The medium confidence target utilizes the fusion association confidence output by the spatiotemporal graph neural network to solve for the optimal match using the Hungarian algorithm. The fusion association confidence is a weighted fusion of identity similarity and geometric similarity, with the weights adaptively adjusted according to the degree of occlusion.
[0011] Preferably, in step S4, the process of restoring the identity or initializing a new trajectory of the unmatched target based on re-identification comparison with the historical feature database using radio frequency fingerprints includes: Extract the radio frequency fingerprint features of the current frame of the unmatched target, calculate the similarity with the last known radio frequency fingerprint of each trajectory in the historical trajectory database, and determine that the target is the same and restore its identity when the maximum similarity exceeds the preset threshold; otherwise, initialize a new trajectory based on multi-factor identity. The re-identification comparison takes precedence over the initialization of the new trajectory, and the radio frequency fingerprint features remain stable under occlusion and appearance degradation scenarios.
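A minimal sketch of this re-identification step is given below. Cosine similarity and the threshold value 0.9 are assumptions (the text only speaks of "similarity" and a "preset threshold"), and the dictionary-based track store and the function names are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def reidentify(fingerprint: Sequence[float],
               last_known: Dict[str, Sequence[float]],
               threshold: float = 0.9) -> Tuple[Optional[str], float]:
    """Compare an unmatched target's RF fingerprint with each track's last known one.

    Returns (track_id, similarity) when the best similarity exceeds the
    threshold, otherwise (None, best_similarity) so that the caller can
    initialize a new trajectory.
    """
    best_id, best_sim = None, -1.0
    for track_id, stored in last_known.items():
        sim = cosine_similarity(fingerprint, stored)
        if sim > best_sim:
            best_id, best_sim = track_id, sim
    if best_id is not None and best_sim > threshold:
        return best_id, best_sim
    return None, best_sim


if __name__ == "__main__":
    tracks = {"T1": [1.0, 0.0], "T2": [0.0, 1.0]}
    print(reidentify([0.99, 0.05], tracks))  # close to T1, identity restored
    print(reidentify([1.0, 1.0], tracks))    # ambiguous, new trajectory
```

Returning `None` rather than creating the trajectory inside the function keeps the ordering stated in the text explicit: comparison first, initialization only as the fallback.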
[0012] Preferably, step S5 specifically includes: Update the trajectory status and multi-factor identity of the successfully associated target, and append the radio frequency fingerprint features of the current frame to the identity feature database to improve the identity profile; An alarm is triggered for the identity switching target. The determination conditions for the identity switching include the inconsistency between the current frame multi-factor identity identifier and the predicted trajectory identity identifier, or the radio frequency fingerprint similarity is lower than the security threshold, or the Remote ID changes. The output trajectory sequence with multi-factor identification is used for flight tracing. The trajectory sequence includes timestamp, spatial location, Remote ID, radio frequency fingerprint hash and association confidence, which is used for post-flight path reconstruction and control source localization.
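The identity-switch conditions and the output record listed above can be sketched as follows. The class and function names are hypothetical; the fields and the three alarm conditions follow the text, while the concrete types are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TrackPoint:
    """One entry of the output trajectory sequence."""
    timestamp: float
    position: Tuple[float, float, float]
    remote_id: Optional[str]          # None for targets without Remote ID
    rf_fingerprint_hash: str
    association_confidence: float


def identity_switch_detected(current_identity: str,
                             predicted_identity: str,
                             rf_similarity: float,
                             security_threshold: float,
                             current_remote_id: Optional[str],
                             previous_remote_id: Optional[str]) -> bool:
    """True if any of the three stated conditions for an identity switch holds."""
    return (current_identity != predicted_identity
            or rf_similarity < security_threshold
            or current_remote_id != previous_remote_id)


if __name__ == "__main__":
    point = TrackPoint(12.5, (100.0, 50.0, 30.0), "UAV-123", "ab12", 0.93)
    print(point)
    print(identity_switch_detected("c1", "c1", 0.95, 0.8, "UAV-123", "UAV-123"))
```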
[0013] A target tracking system based on UAV ID tracing includes the following modules:
The multi-source identity feature extraction module is used to obtain the detection box and detection confidence, receive Remote ID information and extract radio frequency fingerprint features, and construct multi-factor identity identifiers;
The geometric topology encoding module is used to construct local neighborhoods, calculate relative geometric relationships and encode them into geometric feature vectors, and construct a spatial graph structure;
The spatiotemporal graph neural network inference module is used to construct spatiotemporal heterogeneous graphs, embed multi-factor identity identifiers, and output the association confidence matrix through the spatial attention mechanism and temporal attention mechanism;
The cascaded identity association module is used to perform deterministic matching, fine association, and re-identification comparison using a three-level cascaded strategy;
The identity retention and traceability output module is used to update the trajectory status and identity identifier, trigger identity switching alarms, and output trajectory sequences with identity identifiers.
[0014] Compared with the prior art, the beneficial effects of the present invention are as follows: This invention constructs a multi-factor identity identifier to achieve dynamic binding between Remote ID and the radio frequency fingerprint, solving the problems of easy forgery of single identity identifiers and inability to identify targets without IDs, thus improving the reliability and anti-spoofing capability of identity traceability. Through a dynamic threshold adjustment strategy, it achieves environmental adaptability of the hierarchical association mechanism, solving the performance rigidity problem of fixed parameters under complex weather conditions and expanding the scope of applicable scenarios of the system. Through radio frequency fingerprint re-identification and comparison, it achieves identity preservation in occlusion and appearance degradation scenarios, solving the problem of failure of pure visual tracking features and ensuring continuous tracking at all times. Through a three-level cascaded association architecture, it achieves differentiated processing of high-confidence fast matching, medium-confidence fine inference, and unmatched target recovery, resolving the contradiction between real-time performance and accuracy, and meeting the resource constraints of edge deployment.
Attached Figure Description
[0015] Figure 1 is a flowchart of the method of the present invention; Figure 2 is a system framework diagram of the present invention.
Detailed Implementation
[0016] The following description is intended to disclose the invention and enable those skilled in the art to implement it. The preferred embodiments described below are merely examples, and other obvious variations will occur to those skilled in the art.
[0017] As shown in Figure 1, a target tracking method based on UAV ID tracing includes: S1. Detect the target in the current frame image to obtain the detection box and detection confidence. At the same time, receive the Remote ID information of the UAV, collect flight control signals or image transmission signals, and extract radio frequency fingerprint features. Dynamically bind the Remote ID and radio frequency fingerprint to construct a multi-factor identity identifier. Target detection is responsible for obtaining the spatial location and observation confidence of the UAV target from the visual input. A lightweight YOLOv8 neural network is used as the basic detector. The input is the current frame image captured by a visible light or infrared camera, and the output is a set of detection boxes and corresponding confidence scores. Each detection box is represented as a four-dimensional vector (x, y, w, h), giving the center point coordinates and the width and height, respectively. To address the small-target characteristics of UAVs, a BiFPN feature fusion structure is introduced in the network's neck to enhance the multi-scale feature expression capability. The detection head adopts a decoupled design, independently optimizing the classification and localization tasks. The detection confidence is obtained by thresholding the Sigmoid output of the classification branch, reflecting the quality level of the current observation and providing a hierarchical basis for the subsequent cascaded association strategy. In edge deployment scenarios, the detector is accelerated by TensorRT quantization, and the single-frame inference latency is controlled within 15 milliseconds to meet the real-time requirements.
[0018] Step S1, which involves constructing a multi-factor identity, includes: The Remote ID and radio frequency fingerprint are hashed and concatenated to generate a composite identity code, and the binding confidence of the two is calculated. When the binding confidence is lower than a preset threshold, an identity anomaly alarm is triggered. The Remote ID receiving module captures remote identification information broadcast by the drone via a radio link. The module is equipped with a multi-band software-defined radio (SDR) receiver, covering the 2.4GHz, 5.8GHz, and 978MHz / 1090MHz aviation frequency bands, and is compatible with WiFi, Bluetooth, ADS-B, and vendor proprietary protocols. The receiver scans the frequency bands in real time and parses Remote ID data packets conforming to the ASTM F3411 or ASD-STAN prEN 4709-002 standards. The extracted fields include: drone serial number, operator registration number, current location (latitude, longitude, and altitude), velocity vector, timestamp, and emergency status indicator. The parsed Remote ID information is output in a structured data format and aligned with the visual detection box through the timestamp to establish a preliminary identity-observation association. For packet loss caused by signal obstruction or interference, the module enables forward error correction and retransmission mechanisms to ensure the complete reception of key identity fields.
[0019] For targets without a Remote ID, an anonymous identity is constructed solely based on radio frequency fingerprinting. Cross-sensor identity maintenance of anonymous targets is achieved through cross-site radio frequency fingerprint comparison. Specifically, the radio frequency fingerprint acquisition layer extracts inherent hardware features from UAV flight control signals or image transmission signals to construct a hard-to-forge identity. It shares a radio frequency front-end with the radio frequency analysis layer and extracts microscopic features through an independent signal processing link. The acquired signal undergoes down-conversion, filtering, and sampling before entering the feature extraction stage. Radio frequency fingerprint features include: carrier frequency offset (caused by local oscillator frequency deviation), phase noise (characterizing phase-locked loop stability), modulation error vector magnitude (EVM, reflecting power amplifier nonlinearity), pulse rise / fall time (transient response of radio frequency switches), and spurious emission spectrum features. These features are denoised using a wavelet transform and concatenated into a 64-dimensional raw fingerprint vector. Principal component analysis is then used to reduce the dimensionality to 16, eliminating redundant information while preserving individual device differences. Because the radio frequency fingerprint is tied to the individual hardware, it is difficult to tamper with via software, thus providing a low-level verification method to combat ID spoofing.
[0020] To further clarify, the multi-factor identity fusion layer achieves deep fusion of the Remote ID and the radio frequency fingerprint to construct a composite identity. First, the fusion confidence score is calculated: by comparing the timestamp consistency and signal strength correlation of the Remote ID and the radio frequency fingerprint received in the same time period, an initial fusion score is generated. If the fusion score is higher than a preset threshold (e.g., 0.9), the Remote ID string and the radio frequency fingerprint vector are hashed and concatenated to generate a fixed-length composite identity code; if it is lower than the threshold, an identity anomaly alarm is triggered, marking the target as a suspicious object. For "black flight" targets that do not broadcast a Remote ID, an anonymous identity is constructed solely from the radio frequency fingerprint. Cross-site radio frequency fingerprint similarity comparison enables identity maintenance and relay tracking of anonymous targets in the multi-sensor network. The multi-factor identity is embedded as a node attribute into the subsequent spatiotemporal graph neural network, allowing the graph reasoning process to draw on both network-layer identity semantics and physical-layer hardware features, improving the robustness of associations in complex environments.
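The hash-and-concatenate step can be sketched as follows. Several details are assumptions made for illustration only: SHA-256 as the hash function, the quantization of the fingerprint before hashing (so that small measurement noise does not usually change the code; values near a cell boundary can still flip), the treatment of a score exactly equal to the threshold as a pass, and all function names.

```python
import hashlib
import struct
from typing import Optional, Sequence


def quantize(fingerprint: Sequence[float], step: float = 0.01) -> bytes:
    """Round each feature to a grid cell and serialize the cell indices."""
    cells = [round(value / step) for value in fingerprint]
    return struct.pack(f"<{len(cells)}q", *cells)


def composite_identity_code(remote_id: str, fingerprint: Sequence[float],
                            step: float = 0.01) -> str:
    """Hash both factors separately and concatenate the fixed-length digests."""
    id_digest = hashlib.sha256(remote_id.encode("utf-8")).hexdigest()
    fp_digest = hashlib.sha256(quantize(fingerprint, step)).hexdigest()
    return id_digest + fp_digest  # 64 + 64 hexadecimal characters


def bind_identity(remote_id: Optional[str], fingerprint: Sequence[float],
                  fusion_score: float, threshold: float = 0.9) -> dict:
    """Return the identity code, or raise the anomaly flag when binding fails."""
    if remote_id is None:
        # No Remote ID broadcast: anonymous identity from the fingerprint alone.
        code = hashlib.sha256(quantize(fingerprint)).hexdigest()
        return {"code": code, "anonymous": True, "alarm": False}
    if fusion_score < threshold:
        return {"code": None, "anonymous": False, "alarm": True}
    return {"code": composite_identity_code(remote_id, fingerprint),
            "anonymous": False, "alarm": False}


if __name__ == "__main__":
    print(bind_identity("UAV-123", [0.101, 0.2], fusion_score=0.95))
    print(bind_identity("UAV-123", [0.101, 0.2], fusion_score=0.50))
    print(bind_identity(None, [0.101, 0.2], fusion_score=0.0))
```

Hashing the two factors separately keeps the code length fixed regardless of the Remote ID string length, and lets either half be compared on its own.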
[0021] S2. For each UAV target detected in the current frame, construct a local neighborhood centered on its spatial location, calculate the relative geometric relationship with other targets in the neighborhood, including Euclidean distance, relative azimuth angle and velocity vector difference, encode it into geometric feature vectors and construct a spatial graph structure; Step S2 specifically includes: Using the spatial location of each target in the current frame as the center, an adaptive k-nearest neighbor strategy is used to select the k nearest targets to form a local neighborhood, where the value of k is dynamically adjusted according to the target density in the current frame. In the neighborhood dynamic construction phase, each target's spatial location is used as the center, and an adaptive k-nearest neighbor strategy is employed to select locally associated targets. The value of k is not a fixed constant but dynamically adjusted based on the target density of the current frame: when the target density is higher than a preset threshold, the k value is increased to capture more topological information; when the target density is lower than the preset threshold, the k value is decreased to avoid introducing irrelevant noise. The neighborhood radius is synchronously adaptive, ensuring reasonable local connectivity in both dense clusters and sparse distribution scenarios. This dynamic mechanism makes the geometric topology environmentally adaptable, avoiding over-connectivity or under-connectivity problems caused by fixed parameters. 
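A minimal sketch of the adaptive neighborhood selection follows. The two-level choice of k, the density measure (targets per unit of monitored area), and all names and default values are assumptions; the text only states that k increases above a density threshold and decreases below it.

```python
import math
from typing import List, Sequence


def adaptive_k(num_targets: int, area: float, density_threshold: float,
               k_sparse: int = 3, k_dense: int = 6) -> int:
    """Pick a larger k in dense frames and a smaller k in sparse frames."""
    density = num_targets / area
    k = k_dense if density > density_threshold else k_sparse
    return max(0, min(k, num_targets - 1))  # cannot exceed the other targets


def k_nearest(positions: Sequence[Sequence[float]], index: int, k: int) -> List[int]:
    """Indices of the k targets closest to positions[index], nearest first."""
    center = positions[index]
    ranked = sorted((math.dist(center, p), j)
                    for j, p in enumerate(positions) if j != index)
    return [j for _, j in ranked[:k]]


if __name__ == "__main__":
    positions = [(0, 0), (1, 0), (5, 0), (0, 2), (10, 10)]
    k = adaptive_k(len(positions), area=1000.0, density_threshold=0.01)
    print(k, k_nearest(positions, 0, k))
```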
Calculate the relative geometric relationship between the central target and each target in the neighborhood, including Euclidean distance, relative azimuth angle, velocity vector difference and acceleration vector difference, and encode the relative geometric relationship into a multidimensional geometric feature vector; A spatial graph structure is constructed based on all targets and their geometric feature vectors in the current frame, where nodes are targets and the weights of edges are determined by geometric similarity. The geometric similarity is negatively correlated with the distance between targets and positively correlated with the consistency of velocity direction. In the geometric relationship quantification process, the multidimensional relative geometric relationships between the central target and its neighboring targets are calculated. Euclidean distance reflects the degree of spatial proximity, relative azimuth angle represents the relative orientation of the target, velocity vector difference reveals motion consistency, and acceleration vector difference captures maneuvering trends. After normalization, the above physical quantities are spliced and encoded into multidimensional geometric feature vectors. A spatial graph structure is constructed based on all targets in the current frame and their geometric feature vectors. Nodes represent target entities, and the weights of edges are determined by geometric similarity. Geometric similarity is negatively correlated with the distance between targets and positively correlated with the consistency of velocity direction. Quantification is achieved through the composite calculation of exponential decay function and cosine similarity. This weighting mechanism strengthens the connection strength of cooperating targets and weakens the false associations of randomly distributed targets. 
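The edge weighting described above can be sketched as follows. The text names an exponential decay function and cosine similarity but does not give the exact composite form, so the product used here, the rescaling of the cosine from [-1, 1] to [0, 1], the neutral value for stationary targets, and the `distance_scale` parameter are all assumptions.

```python
import math
from typing import Sequence


def geometric_similarity(p_i: Sequence[float], p_j: Sequence[float],
                         v_i: Sequence[float], v_j: Sequence[float],
                         distance_scale: float = 50.0) -> float:
    """Edge weight in [0, 1]: decays with distance, grows with heading agreement."""
    decay = math.exp(-math.dist(p_i, p_j) / distance_scale)
    norm_i = math.sqrt(sum(c * c for c in v_i))
    norm_j = math.sqrt(sum(c * c for c in v_j))
    if norm_i == 0.0 or norm_j == 0.0:
        direction = 0.5  # a stationary target gives no heading information
    else:
        cosine = sum(a * b for a, b in zip(v_i, v_j)) / (norm_i * norm_j)
        direction = 0.5 * (1.0 + cosine)  # rescale [-1, 1] to [0, 1]
    return decay * direction


if __name__ == "__main__":
    # Two nearby targets flying in formation versus two crossing targets.
    print(geometric_similarity((0, 0), (10, 0), (5, 0), (5, 0)))
    print(geometric_similarity((0, 0), (10, 0), (5, 0), (-5, 0)))
```

With this form, targets flying in formation keep a strong edge even at moderate distances, while targets on opposing headings are effectively disconnected.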
For a target that is occluded, its spatial position is inferred by its relative geometric relationship with neighboring targets, and the geometric feature vector of the occluded target is completed.
[0022] In the occlusion feature inference step, for targets in an occluded state, the spatial state is inferred by using the relative geometric relationship between the target and visible neighboring targets. When the target detection confidence drops sharply or visual features are lost, occlusion determination is triggered. Based on the prior geometric constraints between the target and neighboring targets, combined with the current observation position of the neighboring targets, the system infers the possible position of the occluded target through geometric projection and motion extrapolation. The inference result is supplemented to the geometric feature vector, maintaining the integrity of node attributes and ensuring the continuity of the graph structure and the effectiveness of subsequent association reasoning during occlusion. This mechanism transforms the feature loss caused by occlusion into a topological reasoning problem, significantly reducing the risk of ID switch.
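A minimal sketch of this inference is shown below, assuming that the "prior geometric constraints" are the offsets from each neighbor to the target recorded before occlusion, that motion extrapolation uses a constant-velocity model, and that the two estimates are blended with a fixed weight. None of these choices is specified in the text, and the names are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def infer_occluded_position(last_position: Sequence[float],
                            last_velocity: Sequence[float],
                            dt: float,
                            neighbor_positions: Dict[str, Sequence[float]],
                            prior_offsets: Dict[str, Sequence[float]],
                            geometry_weight: float = 0.5) -> Tuple[float, ...]:
    """Estimate where an occluded target is from its neighbors and its own motion.

    prior_offsets[n] is (target - neighbor n) as observed before occlusion.
    """
    # Motion extrapolation: constant-velocity prediction from the last observation.
    extrapolated = tuple(p + v * dt for p, v in zip(last_position, last_velocity))

    # Geometric projection: each visible neighbor "votes" for a target position.
    votes = [tuple(n + o for n, o in zip(neighbor_positions[name], offset))
             for name, offset in prior_offsets.items()
             if name in neighbor_positions]
    if not votes:
        return extrapolated

    mean_vote = tuple(sum(v[d] for v in votes) / len(votes)
                      for d in range(len(extrapolated)))
    return tuple(geometry_weight * g + (1.0 - geometry_weight) * e
                 for g, e in zip(mean_vote, extrapolated))


if __name__ == "__main__":
    estimate = infer_occluded_position(
        last_position=(0.0, 0.0), last_velocity=(1.0, 0.0), dt=1.0,
        neighbor_positions={"a": (5.0, 0.0)},
        prior_offsets={"a": (-2.0, 0.0)})
    print(estimate)
```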
[0023] S3. Expand the continuous multi-frame spatial graph along the time dimension to construct a spatiotemporal heterogeneous graph. Embed the multi-factor identity identifier as a node attribute into the spatiotemporal heterogeneous graph and input it into the spatiotemporal graph neural network. Aggregate neighborhood geometric features through the spatial attention mechanism to complete the occluded target features. Fuse historical trajectory and current observation features through the temporal attention mechanism. Output the association confidence matrix that fuses identity similarity and geometric similarity. In step S3, expanding the spatial graphs of consecutive frames along the time dimension and embedding the multi-factor identity identifier as a node attribute specifically includes the following. A spatiotemporal heterogeneous graph is constructed by stacking consecutive spatial graphs in temporal order and establishing temporal edges between target nodes in adjacent frames. Spatial edges connect geometrically related target nodes within the same frame, carrying local topological information; temporal edges connect motion-prediction-related target nodes between adjacent frames, carrying temporal evolution information. The weight of a temporal edge is determined by both motion prediction deviation and appearance similarity: a smaller prediction deviation and higher appearance similarity result in a higher weight, and vice versa. This dual-type edge design allows the graph structure to carry both spatial geometric constraints and temporal motion constraints, providing heterogeneous propagation paths for the subsequent attention mechanisms.
[0024] The Remote ID in the multi-factor identity identifier is encoded as a one-hot vector, and the radio frequency fingerprint is encoded as a low-dimensional feature vector. These are concatenated and mapped through a linear transformation to form the node attributes embedded in the spatiotemporal heterogeneous graph. The Remote ID, after one-hot encoding, is mapped to a fixed-dimensional vector representing the network layer's identity semantics. The radio frequency fingerprint, after dimensionality reduction, represents the physical layer's hardware features. The concatenation of these two vectors is then mapped through a linear transformation to the same dimensional space as the geometric features, forming a unified node attribute representation. This embedding method allows identity information and geometric information to be computed within the same mathematical space, supporting subsequent cross-modal attention computation. Independent feature transformation matrices are configured for spatial and temporal edges to adapt to the different propagation characteristics of heterogeneous edges.
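The encoding and linear mapping can be sketched as follows. In a trained network the weight matrix and bias would be learned parameters; the fixed values here are purely illustrative, and the all-zero one-hot vector for targets without a Remote ID is an assumption, as are the function names.

```python
from typing import List, Optional, Sequence


def one_hot_remote_id(remote_id: Optional[str], known_ids: Sequence[str]) -> List[float]:
    """One-hot encode a Remote ID; unknown or missing IDs give an all-zero vector."""
    return [1.0 if rid == remote_id else 0.0 for rid in known_ids]


def linear(weights: Sequence[Sequence[float]], bias: Sequence[float],
           x: Sequence[float]) -> List[float]:
    """Compute y = W x + b for a row-major weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def embed_identity(remote_id: Optional[str], known_ids: Sequence[str],
                   rf_fingerprint: Sequence[float],
                   weights: Sequence[Sequence[float]],
                   bias: Sequence[float]) -> List[float]:
    """Concatenate the two identity factors and project them to the node space."""
    x = one_hot_remote_id(remote_id, known_ids) + list(rf_fingerprint)
    if any(len(row) != len(x) for row in weights):
        raise ValueError("weight matrix width must equal the concatenated length")
    return linear(weights, bias, x)


if __name__ == "__main__":
    known = ["UAV-A", "UAV-B"]
    W = [[1.0, 0.0, 0.0, 0.0],   # illustrative values, not learned weights
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 1.0]]
    b = [0.0, 0.0, 0.1]
    print(embed_identity("UAV-A", known, [0.5, -0.5], W, b))
    print(embed_identity(None, known, [0.5, -0.5], W, b))
```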
[0025] Independent feature transformation matrices are configured for the spatial and temporal edges respectively, and the embedded spatiotemporal heterogeneous graph is input into the spatiotemporal graph neural network.
[0026] In step S3, the inference and output of the spatiotemporal graph neural network specifically include: The attention coefficients between neighboring targets within a single frame are calculated using a spatial attention mechanism, and neighborhood geometric features are aggregated to generate a completed node feature representation, which is used to infer the spatial state of occluded targets. The attention coefficients between targets across frames are calculated using a temporal attention mechanism, and historical trajectory features are aggregated with current observation features to generate temporal fusion features. The spatiotemporal attention mechanism thus performs feature aggregation and inference through hierarchical attention calculation. The spatial attention mechanism dynamically assigns weights based on geometric similarity and occlusion status. The temporal attention mechanism assigns weights based on motion consistency and identity similarity and establishes long-term temporal dependencies. The two attention layers are computed independently and then cascaded for output, which preserves the integrity of the topological structure within a single frame and keeps identity continuous across frames. The association confidence matrix is output by weighted fusion of the identity similarity in the node feature representation and the geometric similarity in the temporal fusion features.
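The spatial attention step can be sketched in simplified form as a softmax over neighbor scores followed by a weighted sum of neighbor features. This is not a learned attention layer: the raw score (geometric similarity minus a fixed `occlusion_penalty` for neighbors that are themselves occluded) is an assumption made only to show how weights depending on geometric similarity and occlusion status can complete the feature of an occluded node.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def complete_feature(neighbor_feats, geom_sims, neighbor_occluded,
                     occlusion_penalty=2.0):
    """Estimate an occluded node's feature from its neighbors.

    neighbor_feats:    feature vectors of the neighbors in the same frame
    geom_sims:         geometric similarity to each neighbor
    neighbor_occluded: True where the neighbor is itself occluded
    occlusion_penalty: hypothetical score reduction for occluded neighbors
    """
    scores = [s - (occlusion_penalty if occ else 0.0)
              for s, occ in zip(geom_sims, neighbor_occluded)]
    alphas = softmax(scores)  # attention coefficients, summing to 1
    dim = len(neighbor_feats[0])
    completed = [sum(a * f[d] for a, f in zip(alphas, neighbor_feats))
                 for d in range(dim)]
    return completed, alphas

feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
completed, alphas = complete_feature(feats, [0.9, 0.2, 0.9],
                                     [False, False, True])
```

Here the first and third neighbors are equally similar, but the third is occluded and therefore receives the smallest coefficient.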
Specifically, the association confidence fusion step generates the final association confidence matrix by weighted fusion of the identity similarity in the node feature representation and the geometric similarity in the temporal fusion features. Identity similarity is calculated by combining the Remote ID matching degree and the RF fingerprint similarity, while geometric similarity is quantified by spatial position deviation and velocity direction consistency. The weight coefficients are adjusted adaptively according to the degree of occlusion: the weight of identity similarity is increased when occlusion is severe, and the weight of geometric similarity is increased when visibility is good. The output association confidence matrix serves as the input to the cascaded identity association module, supporting the subsequent hierarchical decisions: deterministic matching for high-confidence targets, fine association for medium-confidence targets, and re-identification of unmatched targets.
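The occlusion-adaptive fusion can be sketched as follows. The specification fixes the ingredients (Remote ID match and RF fingerprint similarity for identity; position deviation and velocity direction consistency for geometry) and the direction of the weight adjustment, but not the formulas. The equal shares, the exponential position term, and the linear weight schedule from 0.3 to 0.8 below are assumptions.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def identity_similarity(remote_id_match, rf_sim, id_share=0.5):
    """Combine Remote ID match (bool) with RF fingerprint similarity in [0, 1]."""
    return id_share * (1.0 if remote_id_match else 0.0) + (1.0 - id_share) * rf_sim

def geometric_similarity(pos_dev, v_track, v_det, pos_scale=10.0):
    """Combine position deviation with velocity direction consistency."""
    pos_term = math.exp(-pos_dev / pos_scale)        # 1.0 at zero deviation
    dir_term = 0.5 * (1.0 + cosine(v_track, v_det))  # map [-1, 1] to [0, 1]
    return 0.5 * (pos_term + dir_term)

def fused_confidence(id_sim, geo_sim, occlusion):
    """occlusion in [0, 1]; the identity weight grows linearly from 0.3 to 0.8."""
    w_id = 0.3 + 0.5 * occlusion
    return w_id * id_sim + (1.0 - w_id) * geo_sim

# Strong identity evidence, weak geometric evidence
visible = fused_confidence(0.9, 0.2, occlusion=0.0)
occluded = fused_confidence(0.9, 0.2, occlusion=1.0)
```

With strong identity evidence and weak geometry, the fused score rises as occlusion increases, which is the behaviour the paragraph describes.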
[0027] S4. A three-level cascaded strategy is adopted based on the association confidence matrix: high-confidence targets are deterministically matched by joint judgment of strong geometric constraints and identity consistency; medium-confidence targets are finely associated using the fused association confidence output by the spatiotemporal graph neural network; unmatched targets are re-identified by comparing their radio frequency fingerprints against the historical feature database, in order to restore their identity or initialize a new trajectory. In step S4, the tier determination of the three-level cascaded strategy includes: Targets with a detection confidence higher than the first threshold are classified as high-confidence targets, targets with a detection confidence between the first threshold and the second threshold are classified as medium-confidence targets, and targets with a detection confidence lower than the second threshold are classified as low-confidence targets and treated as unmatched targets. The first threshold and the second threshold are dynamically and adaptively adjusted according to weather conditions, target size, or detection signal-to-noise ratio.
[0028] It should be further explained that the dynamic threshold adjustment strategy sets the tier thresholds adaptively from environmental perception parameters. The first and second thresholds are not fixed constants but are calculated dynamically from weather conditions, target size, and detection signal-to-noise ratio (SNR). Weather conditions include light intensity and visibility level, which are quantified by image entropy and contrast estimation. Target size is measured by the pixel area of the detection box. Detection SNR covers both image SNR and radio frequency signal SNR. After normalization, these three factors are input into a preset mapping model, which outputs the adjusted pair of thresholds. In clear, high-visibility scenarios, the default thresholds of 0.8 and 0.5 are used. In foggy, low-visibility, or low-SNR scenarios, the thresholds are reduced to 0.65 and 0.35 to prevent excessive missed detections. In scenarios with dense small targets, both thresholds are lowered and the geometric constraints are tightened at the same time to suppress false alarms. This adaptive mechanism makes the cascaded strategy robust to environmental changes and avoids the rigid performance caused by fixed parameters.
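A minimal sketch of the threshold adaptation and tier assignment follows. The two threshold pairs (0.8, 0.5) and (0.65, 0.35) come from the paragraph above; the preset mapping model is not specified, so the linear interpolation driven by the worst of the three normalized factors is an assumption, as is treating a confidence exactly equal to a threshold as medium.

```python
def adaptive_thresholds(visibility, target_size, snr,
                        clear=(0.8, 0.5), degraded=(0.65, 0.35)):
    """Map normalized conditions in [0, 1] to (first, second) thresholds.

    Assumed mapping: the worst of the three factors sets a quality score,
    and the thresholds interpolate linearly between the two pairs.
    """
    quality = min(max(min(visibility, target_size, snr), 0.0), 1.0)
    first = degraded[0] + quality * (clear[0] - degraded[0])
    second = degraded[1] + quality * (clear[1] - degraded[1])
    return first, second

def tier(det_conf, first, second):
    """Assign a detection to one of the three tiers of the cascade."""
    if det_conf > first:
        return "high"
    if det_conf >= second:
        return "medium"
    return "unmatched"

clear_first, clear_second = adaptive_thresholds(1.0, 1.0, 1.0)
fog_first, fog_second = adaptive_thresholds(0.0, 1.0, 1.0)

tier_clear = tier(0.7, clear_first, clear_second)  # same detection, clear sky
tier_fog = tier(0.7, fog_first, fog_second)        # same detection, fog
```

The same detection confidence of 0.7 lands in the medium tier under clear conditions and in the high tier under fog, because the first threshold drops from 0.8 to 0.65.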
[0029] In step S4, deterministically matching high-confidence targets by joint judgment of strong geometric constraints and identity consistency, and finely associating medium-confidence targets using the fused association confidence output by the spatiotemporal graph neural network, specifically includes: High-confidence targets use strong geometric constraints to verify spatial position deviation and velocity direction consistency, and use identity consistency to verify Remote ID matching and RF fingerprint similarity. When both the geometric constraints and identity consistency are satisfied, deterministic matching is performed. The deterministic matching algorithm applies strict joint verification to high-confidence targets, that is, targets whose detection confidence is higher than the dynamically adjusted first threshold. During matching, the strong geometric constraints are checked first: the spatial position deviation between the target and the predicted trajectory must be less than the motion threshold, and the angle between their velocity directions must be less than the angle threshold. Identity consistency is then checked: the Remote ID must match exactly and the RF fingerprint similarity must exceed the security threshold. When both checks pass, deterministic matching establishes a high-confidence trajectory association and skips the more complex computation, improving real-time performance. If either condition is not met, the target is downgraded to the medium-confidence processing flow to avoid erroneous association.
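The joint verification for high-confidence targets might look like the sketch below. The dictionary field names and the numeric values of the motion, angle, and security thresholds are hypothetical; the specification names the thresholds but gives no values.

```python
import math

def angle_between(v1, v2):
    """Angle in radians between two velocity vectors; pi if either is zero."""
    dot = sum(a * b for a, b in zip(v1, v2))
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        return math.pi
    return math.acos(max(-1.0, min(1.0, dot / (n1 * n2))))

def deterministic_match(track, det, motion_thr=2.0,
                        angle_thr=math.radians(15.0), security_thr=0.9):
    """Return 'matched' only if geometry and identity both pass."""
    geometric_ok = (
        math.dist(track["predicted_pos"], det["pos"]) < motion_thr
        and angle_between(track["velocity"], det["velocity"]) < angle_thr
    )
    identity_ok = (
        track["remote_id"] == det["remote_id"]   # exact Remote ID match
        and det["rf_similarity"] > security_thr  # RF fingerprint check
    )
    return "matched" if geometric_ok and identity_ok else "downgrade_to_medium"

track = {"predicted_pos": (100.0, 50.0, 30.0), "velocity": (5.0, 0.0, 0.0),
         "remote_id": "UAV-A"}
det_ok = {"pos": (100.8, 50.3, 30.0), "velocity": (4.8, 0.4, 0.0),
          "remote_id": "UAV-A", "rf_similarity": 0.96}
det_spoofed = dict(det_ok, rf_similarity=0.6)  # right ID, wrong hardware

result_ok = deterministic_match(track, det_ok)
result_spoofed = deterministic_match(track, det_spoofed)
```

The second detection broadcasts the correct Remote ID but fails the RF fingerprint check, so it is passed to the medium-confidence flow instead of being matched.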
[0030] The medium confidence target utilizes the fusion association confidence output by the spatiotemporal graph neural network to solve for the optimal match using the Hungarian algorithm. The fusion association confidence is a weighted fusion of identity similarity and geometric similarity, with the weights adaptively adjusted according to the degree of occlusion.
[0031] The fine association algorithm performs deeper inference-based matching for medium-confidence targets, that is, targets whose detection confidence lies between the first and second thresholds. During matching, the algorithm directly uses the fused association confidence output by the spatiotemporal graph neural network and applies the Hungarian algorithm to find the globally optimal match. The fused association confidence is a weighted fusion of identity similarity and geometric similarity, with the weights adaptively adjusted according to the degree of occlusion: geometric weights dominate in the absence of occlusion, while identity weights dominate under heavy occlusion. The Hungarian algorithm maximizes the total association confidence, resolves multi-target competition where associations are ambiguous, and outputs the refined matching decisions. Its computational cost is higher than that of deterministic matching but far lower than that of exhaustive search, striking a balance between accuracy and efficiency.
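To make the fine association step concrete, here is a self-contained sketch of the Hungarian algorithm in its standard O(n^3) potentials form, applied to a fused confidence matrix whose rows are tracks and whose columns are detections. Maximization is done by negating the confidences. The function name `hungarian_max` and the requirement that rows not outnumber columns are choices of this sketch, not details from the specification.

```python
def hungarian_max(confidence):
    """Return sorted (row, col) pairs that maximize total confidence.

    Assumes len(confidence) <= len(confidence[0]); transpose otherwise.
    """
    n, m = len(confidence), len(confidence[0])
    if n > m:
        raise ValueError("expected rows <= columns; transpose the matrix first")
    cost = [[-c for c in row] for row in confidence]  # maximize via negation
    inf = float("inf")
    u = [0.0] * (n + 1)  # row potentials
    v = [0.0] * (m + 1)  # column potentials
    p = [0] * (m + 1)    # p[j]: 1-based row assigned to column j, 0 if free
    way = [0] * (m + 1)  # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:  # reached a free column: augmenting path found
                break
        while j0 != 0:      # flip assignments back along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0)

# Track 0 prefers detection 0, but the global optimum gives it detection 1
confidence = [[0.90, 0.80],
              [0.85, 0.10]]
matches = hungarian_max(confidence)  # [(0, 1), (1, 0)]
```

A greedy per-track choice would pair track 0 with detection 0 for a total of 1.0, whereas the global assignment reaches 1.65, which illustrates why a global solver is used when targets compete.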
[0032] In step S4, the process of restoring the identity or initializing a new trajectory of the unmatched target based on the re-identification comparison between the radio frequency fingerprint and the historical feature database includes: Extract the radio frequency fingerprint features of the current frame of the unmatched target, calculate the similarity with the last known radio frequency fingerprint of each trajectory in the historical trajectory database, and determine that the target is the same and restore its identity when the maximum similarity exceeds the preset threshold; otherwise, initialize a new trajectory based on multi-factor identity. The re-identification comparison takes precedence over the initialization of the new trajectory, and the radio frequency fingerprint features remain stable under occlusion and appearance degradation scenarios.
[0033] The re-identification and comparison algorithm performs identity restoration or new trajectory initialization for unmatched targets. Unmatched targets include those with low-confidence detection and those failing fine-grained association. The algorithm prioritizes re-identification and comparison: extracting the radio frequency fingerprint features of the current frame and calculating the similarity with the last known radio frequency fingerprints of each trajectory in the historical trajectory database, using cosine similarity or Euclidean distance as a metric. When the maximum similarity exceeds a preset threshold, the target is identified as the same entity, its identity is restored, and it is associated with the corresponding historical trajectory; otherwise, a new trajectory is initialized based on multi-factor identity identification. Re-identification and comparison takes precedence over new trajectory initialization, leveraging the stability of radio frequency fingerprints in occlusion and appearance degradation scenarios to minimize identity fragmentation. For targets with no historical records, a new trajectory is directly initialized and an identity profile is established for subsequent tracking and management.
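The re-identification comparison can be sketched as a nearest-fingerprint lookup using cosine similarity, one of the two metrics named above. The `history` mapping from trajectory ID to last known fingerprint and the threshold value of 0.9 are illustrative assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two fingerprint vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def reidentify(rf_fingerprint, history, threshold=0.9):
    """Compare against the last known fingerprint of every historical track.

    Returns ("restored", track_id, similarity) when the best similarity
    exceeds the threshold, otherwise ("new", None, best_similarity).
    """
    best_id, best_sim = None, None
    for track_id, last_fp in history.items():
        sim = cosine_similarity(rf_fingerprint, last_fp)
        if best_sim is None or sim > best_sim:
            best_id, best_sim = track_id, sim
    if best_sim is not None and best_sim > threshold:
        return ("restored", best_id, best_sim)
    return ("new", None, best_sim)  # caller initializes a new trajectory

history = {"T1": [1.0, 0.0, 0.0], "T2": [0.0, 1.0, 0.0]}
restored = reidentify([0.98, 0.05, 0.0], history)  # close to T1
unknown = reidentify([0.6, 0.6, 0.5], history)     # close to neither
```

With an empty history the function falls through to the new-trajectory branch, matching the case of targets with no historical records.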
[0034] S5. Update the trajectory status and multi-factor identity of the successfully associated target, trigger an alarm for the target with identity switching, and output the trajectory sequence with multi-factor identity for flight tracing.
[0035] Step S5 specifically includes: Update the trajectory status and multi-factor identity identifier of each successfully associated target, and append the radio frequency fingerprint features of the current frame to the identity feature database to enrich the identity profile. Trigger an alarm for any target whose identity has switched. An identity switch is determined when the multi-factor identity identifier in the current frame is inconsistent with the identity identifier of the predicted trajectory, or the radio frequency fingerprint similarity is lower than the security threshold, or the Remote ID changes. Output the trajectory sequence with multi-factor identity identifiers for flight tracing. The trajectory sequence includes the timestamp, spatial location, Remote ID, radio frequency fingerprint hash, and association confidence, and is used for post-flight path reconstruction and control source localization.
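The identity-switch conditions and the traced output record can be sketched as below. The three alarm conditions and the five record fields come from the paragraph above; the field names, the use of SHA-256 over a rounded fingerprint, and the threshold value are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class TrajectoryPoint:
    """One entry of the output trajectory sequence."""
    timestamp: float
    position: tuple
    remote_id: str
    rf_fingerprint_hash: str
    association_confidence: float

def fingerprint_hash(rf_fingerprint, digits=3):
    """Hash a fingerprint after rounding, so tiny numeric noise is ignored."""
    text = ",".join(f"{x:.{digits}f}" for x in rf_fingerprint)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def identity_switch_reasons(track, observation, security_thr=0.9):
    """Return the list of triggered alarm conditions (empty if none)."""
    reasons = []
    if observation["composite_id"] != track["composite_id"]:
        reasons.append("composite_id_mismatch")
    if observation["rf_similarity"] < security_thr:
        reasons.append("rf_similarity_below_threshold")
    if observation["remote_id"] != track["remote_id"]:
        reasons.append("remote_id_changed")
    return reasons

track = {"composite_id": "abc123", "remote_id": "UAV-A"}
observation = {"composite_id": "abc123", "remote_id": "UAV-A",
               "rf_similarity": 0.55}
alarms = identity_switch_reasons(track, observation)

point = TrajectoryPoint(
    timestamp=1700000000.0,
    position=(100.8, 50.3, 30.0),
    remote_id=observation["remote_id"],
    rf_fingerprint_hash=fingerprint_hash([0.98, 0.05, 0.0]),
    association_confidence=0.76,
)
```

In this example the Remote ID and composite code are unchanged, yet the low RF fingerprint similarity alone raises an alarm, which is the anti-spoofing case the multi-factor identifier is meant to catch.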
[0036] Referring to Figure 2, a target tracking system based on UAV ID tracing includes the following modules: the multi-source identity feature extraction module is used to obtain the detection box and detection confidence, receive Remote ID information, extract radio frequency fingerprint features, and construct multi-factor identity identifiers; the geometric topology encoding module is used to construct local neighborhoods, calculate relative geometric relationships, encode them into geometric feature vectors, and construct a spatial graph structure; the spatiotemporal graph neural network inference module is used to construct spatiotemporal heterogeneous graphs, embed multi-factor identity identifiers, and output the association confidence matrix through the spatial attention mechanism and the temporal attention mechanism; the cascaded identity association module is used to perform deterministic matching, fine association, and re-identification comparison using the three-level cascaded strategy; and the identity retention and traceability output module is used to update the trajectory status and identity identifier, trigger identity switching alarms, and output trajectory sequences with identity identifiers.
[0037] The foregoing has shown and described the basic principles, main features, and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited to the above embodiments. The embodiments and the description in the specification merely illustrate the principles of the invention. Various changes and modifications can be made to the invention without departing from its spirit and scope, and all such changes and modifications fall within the scope of the claimed invention. The scope of protection claimed is defined by the appended claims and their equivalents.
Claims
1. A target tracking method based on UAV ID tracing, characterized in that it includes: S1. Detect the targets in the current frame image to obtain detection boxes and detection confidences. At the same time, receive the Remote ID information of the UAV, collect flight control signals or image transmission signals, and extract radio frequency fingerprint features. Dynamically bind the Remote ID and the radio frequency fingerprint to construct a multi-factor identity identifier; S2. For each UAV target detected in the current frame, construct a local neighborhood centered on its spatial location, calculate the relative geometric relationship with other targets in the neighborhood, including Euclidean distance, relative azimuth angle and velocity vector difference, encode it into geometric feature vectors and construct a spatial graph structure; S3. Expand the spatial graphs of consecutive frames along the time dimension to construct a spatiotemporal heterogeneous graph. Embed the multi-factor identity identifier into the spatiotemporal heterogeneous graph as a node attribute and input the graph into the spatiotemporal graph neural network. Aggregate neighborhood geometric features through a spatial attention mechanism to complete the features of occluded targets. Fuse historical trajectory features and current observation features through a temporal attention mechanism. Output an association confidence matrix that fuses identity similarity and geometric similarity; S4. A three-level cascaded strategy is adopted based on the association confidence matrix: high-confidence targets are deterministically matched by joint judgment of strong geometric constraints and identity consistency; medium-confidence targets are finely associated using the fused association confidence output by the spatiotemporal graph neural network; unmatched targets are re-identified by comparing their radio frequency fingerprints against the historical feature database, in order to restore their identity or initialize a new trajectory; S5. Update the trajectory status and multi-factor identity identifier of each successfully associated target, trigger an alarm for any target whose identity has switched, and output the trajectory sequence with multi-factor identity identifiers for flight tracing.
2. The target tracking method based on UAV ID tracing according to claim 1, characterized in that, in step S1, constructing the multi-factor identity identifier includes: The Remote ID and the radio frequency fingerprint are hashed and concatenated to generate a composite identity code, and the binding confidence of the two is calculated. When the binding confidence is lower than a preset threshold, an identity anomaly alarm is triggered. For targets without a Remote ID, an anonymous identity is constructed solely from the radio frequency fingerprint, and cross-sensor identity maintenance of anonymous targets is achieved through cross-site radio frequency fingerprint comparison.
3. The target tracking method based on UAV ID tracing according to claim 1, characterized in that, Step S2 specifically includes: Using the spatial location of each target in the current frame as the center, an adaptive k-nearest neighbor strategy is used to select the k nearest targets to form a local neighborhood, where the value of k is dynamically adjusted according to the target density in the current frame. Calculate the relative geometric relationship between the central target and each target in the neighborhood, including Euclidean distance, relative azimuth angle, velocity vector difference and acceleration vector difference, and encode the relative geometric relationship into a multidimensional geometric feature vector; A spatial graph structure is constructed based on all targets and their geometric feature vectors in the current frame, where nodes are targets and the weights of edges are determined by geometric similarity. The geometric similarity is negatively correlated with the distance between targets and positively correlated with the consistency of velocity direction. For a target that is occluded, its spatial position is inferred by its relative geometric relationship with neighboring targets, and the geometric feature vector of the occluded target is completed.
4. The target tracking method based on UAV ID tracing according to claim 1, characterized in that, In step S3, the step of expanding the spatial graph of consecutive multiple frames according to the time dimension to construct a spatiotemporal heterogeneous graph, and embedding the multi-factor identity identifier as a node attribute into the spatiotemporal heterogeneous graph specifically includes: Stack the spatial graphs of multiple consecutive frames in temporal order, establish temporal edges between target nodes in adjacent frames, and determine the weight of the temporal edges by motion prediction bias and appearance similarity to construct a spatiotemporal heterogeneous graph containing spatial and temporal edges. The Remote ID in the multi-factor identity identifier is encoded as a one-hot vector and the radio frequency fingerprint is encoded as a low-dimensional feature vector. After concatenation, the concatenation is mapped to a spatiotemporal heterogeneous graph with node attributes embedded through a linear transformation. Independent feature transformation matrices are configured for the spatial and temporal edges respectively, and the embedded spatiotemporal heterogeneous graph is input into the spatiotemporal graph neural network.
5. The target tracking method based on UAV ID tracing according to claim 1, characterized in that, In step S3, the inference and output of the spatiotemporal graph neural network specifically include: The attention coefficients between neighboring targets within a single frame are calculated using a spatial attention mechanism. Neighboring geometric features are aggregated to generate a complete node feature representation, which is used to infer the spatial state of occluded targets. The attention coefficients between targets across frames are calculated using a temporal attention mechanism, and historical trajectory features are aggregated with current observation features to generate temporal fusion features. The identity similarity in the node feature representation and the geometric similarity in the temporal fusion feature are weighted and fused to output the association confidence matrix.
6. The target tracking method based on UAV ID tracing according to claim 1, characterized in that, In step S4, the hierarchical determination of the three-level cascading strategy includes: Targets with a detection confidence level higher than the first threshold are classified as high-confidence targets, targets with a detection confidence level between the first threshold and the second threshold are classified as medium-confidence targets, and targets with a detection confidence level lower than the second threshold are classified as low-confidence targets and classified as unmatched targets. The first threshold and the second threshold are dynamically and adaptively adjusted according to weather conditions, target size, or detection signal-to-noise ratio.
7. The target tracking method based on UAV ID tracing according to claim 6, characterized in that, In step S4, the high-confidence target is determined by a joint judgment of strong geometric constraints and identity consistency, and the medium-confidence target is determined by a fine-grained association using the fusion association confidence output by the spatiotemporal graph neural network. Specifically, this includes: High-confidence targets utilize strong geometric constraints to verify spatial position deviation and velocity direction consistency, and use identity consistency to verify Remote ID matching and RF fingerprint similarity. When both geometric constraints and identity consistency are satisfied, deterministic matching is performed. The medium confidence target utilizes the fusion association confidence output by the spatiotemporal graph neural network to solve for the optimal match using the Hungarian algorithm. The fusion association confidence is a weighted fusion of identity similarity and geometric similarity, with the weights adaptively adjusted according to the degree of occlusion.
8. The target tracking method based on UAV ID tracing according to claim 6, characterized in that, in step S4, re-identifying the unmatched target by comparing its radio frequency fingerprint against the historical feature database, in order to restore its identity or initialize a new trajectory, includes: Extract the radio frequency fingerprint features of the unmatched target in the current frame, calculate their similarity with the last known radio frequency fingerprint of each trajectory in the historical trajectory database, and, when the maximum similarity exceeds the preset threshold, determine that the target is the same target and restore its identity; otherwise, initialize a new trajectory based on the multi-factor identity identifier. The re-identification comparison takes precedence over the initialization of a new trajectory, and the radio frequency fingerprint features remain stable under occlusion and appearance degradation.
9. The target tracking method based on UAV ID tracing according to claim 1, characterized in that, Step S5 specifically includes: Update the trajectory status and multi-factor identity of the successfully associated target, and append the radio frequency fingerprint features of the current frame to the identity feature database to improve the identity profile; An alarm is triggered for the identity switching target. The determination conditions for the identity switching include the inconsistency between the current frame multi-factor identity identifier and the predicted trajectory identity identifier, or the radio frequency fingerprint similarity is lower than the security threshold, or the Remote ID changes. The output trajectory sequence with multi-factor identification is used for flight tracing. The trajectory sequence includes timestamp, spatial location, Remote ID, radio frequency fingerprint hash and association confidence, which is used for post-flight path reconstruction and control source localization.
10. A target tracking system based on UAV ID tracing, applied to the target tracking method based on UAV ID tracing as described in any one of claims 1-9, characterized in that it includes the following modules: the multi-source identity feature extraction module is used to obtain the detection box and detection confidence, receive Remote ID information, extract radio frequency fingerprint features, and construct multi-factor identity identifiers; the geometric topology encoding module is used to construct local neighborhoods, calculate relative geometric relationships, encode them into geometric feature vectors, and construct a spatial graph structure; the spatiotemporal graph neural network inference module is used to construct spatiotemporal heterogeneous graphs, embed multi-factor identity identifiers, and output the association confidence matrix through the spatial attention mechanism and the temporal attention mechanism; the cascaded identity association module is used to perform deterministic matching, fine association, and re-identification comparison using the three-level cascaded strategy; and the identity retention and traceability output module is used to update the trajectory status and identity identifier, trigger identity switching alarms, and output trajectory sequences with identity identifiers.