A multi-gimbal cooperative dual-spectrum low-altitude bird detection and continuous tracking method
Using a global visibility mapping model and transient disturbance rejection control based on dual-spectral signal-to-noise ratio state variables, the method solves the tracking loss and handover mismatch that target occlusion and sudden interference cause in multi-gimbal systems in low-altitude airspace, thereby achieving continuous and accurate bird detection and tracking.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING JIRUIXIANG AVIATION TECH CO LTD
- Filing Date
- 2026-05-14
- Publication Date
- 2026-06-30
Figure CN122308469A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of photoelectric detection and target tracking technology, and in particular to a multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking method.
Background Technology
[0002] In bird detection and tracking missions in low-altitude airspace, dual-spectral electro-optical gimbals are typically used for continuous observation. Due to the abundance of buildings and vegetation in the low-altitude environment, targets are easily physically obstructed, making it difficult for a single device to maintain tracking throughout the entire flight path. Therefore, multi-gimbal collaborative networking has become a major development direction. Existing multi-gimbal collaborative systems mostly employ a passive handover mechanism, meaning that the search guidance of peripheral devices is only triggered after the master gimbal has completely lost sight of the target. This approach lacks advance prediction of target spatial visibility and does not consider system communication time and mechanical operational delays, resulting in the target often leaving the preset field of view by the time the relay gimbal completes its repositioning, causing a break in the tracking chain. Furthermore, in complex weather or background conditions, dual-spectral devices often face sudden drops in signal-to-clutter ratio on one side of the spectrum due to abrupt changes in illumination or localized heat sources. Existing fixed-weight fusion tracking strategies cannot quickly isolate disturbed channels, easily leading to abnormal fluctuations in the gimbal servo system.
[0003] Furthermore, birds are highly mobile and often move in groups. During the joint observation phase involving target handover across multiple devices, slight deviations in the installation references of each gimbal and backlash in mechanical gear operation mean that relying solely on conventional three-dimensional coordinate mapping for target matching results in significant projection errors. When multiple birds or similar moving objects simultaneously appear within the field of view of the relay gimbal, conventional systems lack robust geometric constraints and kinematic disambiguation mechanisms, making target mismatch highly likely and severely reducing the accuracy and reliability of continuous cross-device tracking.
[0004] Therefore, this invention proposes a multi-gimbal collaborative dual-spectral low-altitude bird detection and continuous tracking method to address the shortcomings of existing technologies.
Summary of the Invention
[0005] The purpose of this invention is to provide a dual-spectral low-altitude bird detection and continuous tracking method using multi-gimbal coordination, which solves the problems of easy tracking loss in multi-gimbal systems under complex low-altitude backgrounds due to target occlusion or sudden interference, as well as handover mismatch caused by system delay and interference from similar targets during collaborative relay.
[0006] To achieve the above objectives, the present invention provides the following technical solution: a multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking method, comprising the following steps:
[0007] Establish a geodetic coordinate system, obtain a three-dimensional elevation model and a digital surface model, discretize the low-altitude airspace into a three-dimensional voxel set, calculate the visibility weight of the dual-spectral photoelectric gimbal to the three-dimensional voxels in the three-dimensional voxel set, and construct a global visibility mapping model.
[0008] Receive initial target detection data, wake up the main control gimbal to lock onto the target according to the global visibility mapping model, extract the dual-spectral signal-to-noise ratio state variables, and use the Kalman filter algorithm to estimate the target's three-dimensional motion vector and jerk vector, and construct the target dynamic equation to generate a three-dimensional predicted trajectory;
[0009] Calculate the expected value of the visibility line integral and the visibility lower bound of the three-dimensional predicted trajectory in the global visibility mapping model, and perform transient disturbance rejection control by combining the first-order gradient of the dual-spectral signal-to-noise ratio state variables.
[0010] When the expected value of the visibility line integral falls below the integration safety threshold or the visibility lower bound drops to zero, calculate the comprehensive tracking cost of the surrounding candidate dual-spectrum opto-gimbals and select a relay gimbal from the surrounding candidate dual-spectrum opto-gimbals.
[0011] The predicted 3D trajectory is compensated for by combining the total system delay time to generate predicted 3D coordinates and send them to the relay gimbal. During the joint observation overlap period, theoretical epipolar lines and dynamic epipolar line tolerance bands are generated. The orthogonal distance from the centroid of the candidate target pixel to the theoretical epipolar line is calculated. Based on the dynamic epipolar line tolerance band, matching targets are selected from the candidate targets to complete the handover.
[0012] Preferably, the step of calculating the visibility weights of the dual-spectral photoelectric gimbal to the three-dimensional voxels in the three-dimensional voxel set includes:
[0013] Static building coordinates and vegetation coordinates are extracted from the digital surface model to construct the obstacle bounding boxes, and a preset spatial expansion coefficient is introduced to apply edge tolerance processing to the obstacle bounding boxes.
[0014] Construct a spatial line of sight connecting the optical center of the dual-spectrum opto-gimbal and the center point of the three-dimensional voxel. Determine whether there is geometric interference between the spatial line of sight and the bounding box of the obstacle after edge tolerance processing, and generate a rigid occlusion Boolean value.
[0015] The visibility weight is obtained by multiplying the rigid occlusion Boolean value, the gimbal distance attenuation function, and the dual-spectral atmospheric penetration attenuation coefficient.
[0016] Preferably, before obtaining the visibility weight, the process includes:
[0017] Based on the maximum physical focal length parameter of the lens of the dual-spectrum opto-gimbal, the preset reference physical feature size, the pixel center spacing of the opto-sensor, and the minimum pixel span threshold, a quantitative model of the optical resolution attenuation function as the gimbal distance attenuation function is constructed.
[0018] By acquiring atmospheric visibility parameters, the center wavelength of the current working channel, the wavelength scattering coefficient negatively correlated with atmospheric visibility parameters, and the slant path atmospheric thickness correction factor, the dual-spectral atmospheric penetration attenuation coefficient is calculated.
[0019] Preferably, the steps of estimating the target's three-dimensional motion vector and jerk vector using the Kalman filter algorithm and constructing the target's dynamic equations include:
[0020] A constant jerk model containing position, velocity, acceleration and jerk components is used as the state vector of the Kalman filter algorithm to continuously estimate the target's three-dimensional motion vector and jerk vector.
[0021] Based on the filtered and corrected three-dimensional position vector estimate, three-dimensional velocity vector, three-dimensional acceleration vector, jerk vector, and derivation step size at the current observation time, the target dynamic equation is constructed through joint integration.
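As a non-limiting illustration, the constant jerk model and the joint integration described above can be sketched as follows. The function names and the per-axis state ordering [position, velocity, acceleration, jerk] are assumptions made for this sketch, and the full Kalman update (gain and covariance) is omitted.

```python
def constant_jerk_transition(dt):
    """Per-axis state transition matrix for the state [position, velocity, acceleration, jerk]."""
    return [
        [1.0, dt, dt**2 / 2.0, dt**3 / 6.0],
        [0.0, 1.0, dt, dt**2 / 2.0],
        [0.0, 0.0, 1.0, dt],
        [0.0, 0.0, 0.0, 1.0],
    ]


def extrapolate_position(position, velocity, acceleration, jerk, dt):
    """Integrate the filtered state forward by the derivation step size dt (3-D tuples)."""
    return tuple(
        p + v * dt + a * dt**2 / 2.0 + j * dt**3 / 6.0
        for p, v, a, j in zip(position, velocity, acceleration, jerk)
    )


# Illustrative values only: one motion component per axis, step size of 2 s.
predicted = extrapolate_position(
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 6.0), 2.0
)
```

Evaluating `extrapolate_position` over a sequence of step sizes would give the sampled three-dimensional predicted trajectory.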
[0022] Preferably, the step of performing transient disturbance rejection control by combining the first-order gradient of the dual-spectral signal-to-noise ratio state quantity includes:
[0023] Extract the first-order gradient of the signal-to-noise ratio (SNR) of the visible light channel and the first-order gradient of the SNR of the infrared thermal imaging channel from the dual-spectral SNR state variables.
[0024] When the expected value of the visibility line integral and the visibility lower bound meet the set visibility conditions, and the absolute value of the first-order signal-to-noise ratio gradient of the visible light channel or of the infrared thermal imaging channel exceeds the preset mutation threshold, it is determined that one-sided transient interference has occurred.
[0025] The servo control weight of the disturbed channel, whose gradient exceeds the preset mutation threshold, is set to zero, and the servo control weight of the normal channel, whose gradient does not exceed the preset mutation threshold, is raised to full scale. Once the first-order signal-to-noise ratio gradient of the disturbed channel returns to the set range, dual-channel joint weighted control is restored.
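A minimal sketch of this weight-switching logic is given below. The function name, the nominal weights of 0.5 each, and the behavior when both channels exceed the threshold (keeping the nominal weights) are assumptions for illustration; the text above only specifies the one-sided case.

```python
def servo_channel_weights(grad_visible, grad_infrared, mutation_threshold,
                          visibility_ok=True, nominal=(0.5, 0.5)):
    """Return (visible_weight, infrared_weight) for the servo control loop."""
    if not visibility_ok:
        # Visibility conditions not met: interference logic is not evaluated here.
        return nominal
    visible_disturbed = abs(grad_visible) > mutation_threshold
    infrared_disturbed = abs(grad_infrared) > mutation_threshold
    if visible_disturbed and not infrared_disturbed:
        return (0.0, 1.0)  # isolate the visible channel, infrared at full scale
    if infrared_disturbed and not visible_disturbed:
        return (1.0, 0.0)  # isolate the infrared channel, visible at full scale
    # Neither channel disturbed (or both, an unspecified case): joint weighted control.
    return nominal


weights_during_glare = servo_channel_weights(-4.2, 0.3, mutation_threshold=2.0)
weights_after_recovery = servo_channel_weights(0.1, 0.3, mutation_threshold=2.0)
```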
[0026] Preferably, the step of calculating the comprehensive tracking cost of surrounding candidate dual-spectrum opto-gimbals and selecting a relay gimbal from the surrounding candidate dual-spectrum opto-gimbals includes:
[0027] The system obtains the spatial straight-line distance from the optical center of the surrounding candidate dual-spectral opto-gimbal to the current three-dimensional coordinates of the target, the three-dimensional spatial angle between the servo axis pointing and the predicted intersection position of the target, the remaining observation time window, and the expected value of the visibility line integral for the predicted trajectory of the target.
[0028] Calculate the required servo adjustment angular velocity of the surrounding candidate dual-spectrum opto-gimbals. If the servo adjustment angular velocity exceeds the maximum rated angular velocity of the servo motor of the surrounding candidate dual-spectrum opto-gimbal, assign an out-of-bounds penalty constant value to the surrounding candidate dual-spectrum opto-gimbal.
[0029] Using weighted logic, the spatial straight-line distance, the three-dimensional spatial angle, the remaining observation time window, the expected value of the visibility line integral, and the out-of-bounds penalty constant are combined to obtain the instantaneous cost score;
[0030] A moving average is applied to the instantaneous cost score within a continuous sampling period, and the dual-spectrum opto-gimbal with the smallest cost function value is selected as the relay gimbal.
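The selection steps above can be sketched as follows. The weight values, the sign convention (a longer time window and a higher visibility expectation lower the cost), the `lead_time_s` parameter used to derive the required angular velocity, and all names are assumptions for this sketch and are not specified in the text.

```python
from collections import deque


def instantaneous_cost(distance_m, angle_rad, time_window_s, visibility_expectation,
                       lead_time_s, max_rate_rad_s,
                       weights=(1e-3, 1.0, 0.1, 1.0), penalty=1e3):
    """Weighted cost of one candidate gimbal at one sampling instant."""
    required_rate = angle_rad / lead_time_s  # servo adjustment angular velocity
    out_of_bounds = penalty if required_rate > max_rate_rad_s else 0.0
    w_dist, w_angle, w_time, w_vis = weights
    return (w_dist * distance_m + w_angle * angle_rad
            - w_time * time_window_s - w_vis * visibility_expectation
            + out_of_bounds)


class RelaySelector:
    """Keeps a moving average of the cost per gimbal and picks the smallest."""

    def __init__(self, window=5):
        self.window = window
        self.history = {}

    def update(self, gimbal_id, cost):
        samples = self.history.setdefault(gimbal_id, deque(maxlen=self.window))
        samples.append(cost)
        return sum(samples) / len(samples)

    def best(self):
        return min(self.history,
                   key=lambda g: sum(self.history[g]) / len(self.history[g]))


selector = RelaySelector(window=3)
cost_a = instantaneous_cost(500.0, 0.2, 10.0, 0.9, lead_time_s=1.0, max_rate_rad_s=1.0)
cost_b = instantaneous_cost(300.0, 2.0, 10.0, 0.9, lead_time_s=1.0, max_rate_rad_s=1.0)
selector.update("A", cost_a)
selector.update("B", cost_b)
relay = selector.best()
```

In this example gimbal B is closer to the target but would need to slew faster than its rated angular velocity, so the penalty constant removes it from contention.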
[0031] Preferably, the steps of compensating for the 3D predicted trajectory based on the total system delay time, generating predicted 3D coordinates, and sending them to the relay gimbal include:
[0032] Extract the system's absolute timestamp at the moment the handover command is issued and the absolute timestamp at the moment the relay gimbal reports its pre-start status, and quantify the network communication time by taking the difference between the two;
[0033] The system obtains the three-dimensional spatial angle between the current optical axis of the relay gimbal and the estimated intersection point, the real-time angular velocity of the servo motor, and the preset compensation dead zone time margin. Combined with the network communication time consumption, the total system delay time is calculated.
[0034] Extract the magnitude of the jerk vector estimated by the Kalman filter algorithm, map the magnitude of the jerk vector to the set effective damping range, and generate the maneuvering damping attenuation coefficient;
[0035] The total system delay time and the maneuvering damping attenuation coefficient are substituted into the extrapolation integral term to suppress it by exponential attenuation, and the predicted three-dimensional coordinates are calculated by combining the result with the target's three-dimensional position vector.
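One possible reading of these steps is sketched below. The additive form of the total delay, the linear clipping used to map jerk magnitude to the damping range, and the specific damped extrapolation (an exponential factor applied to the velocity and acceleration terms) are all assumptions for illustration; the exact expressions are not given in this section.

```python
import math


def total_delay(t_command, t_ack, angle_rad, angular_rate_rad_s, dead_zone_margin_s):
    """Network time + slewing time + preset dead-zone margin (assumed additive)."""
    network_time = t_ack - t_command
    slew_time = angle_rad / angular_rate_rad_s
    return network_time + slew_time + dead_zone_margin_s


def damping_coefficient(jerk_magnitude, jerk_max, damping_range=(0.0, 0.5)):
    """Map jerk magnitude linearly onto the effective damping range, with clipping."""
    low, high = damping_range
    ratio = min(max(jerk_magnitude / jerk_max, 0.0), 1.0)
    return low + ratio * (high - low)


def predicted_coordinates(position, velocity, acceleration, delay_s, damping):
    """Extrapolate over the delay, attenuating the extrapolated term exponentially."""
    decay = math.exp(-damping * delay_s)
    return tuple(
        p + decay * (v * delay_s + 0.5 * a * delay_s**2)
        for p, v, a in zip(position, velocity, acceleration)
    )


delay = total_delay(10.0, 10.25, 0.5, 1.0, 0.25)
gamma = damping_coefficient(0.0, jerk_max=10.0)
target = predicted_coordinates((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 0.0, 2.0), delay, gamma)
```

With a larger jerk magnitude the damping coefficient grows, which shortens the extrapolated displacement and limits divergence for strongly maneuvering targets.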
[0036] Preferably, the steps for generating the theoretical epipolar line and the dynamic epipolar line tolerance band include:
[0037] Extract the target two-dimensional pixel coordinates captured by the main control gimbal at the same aligned sampling time, as well as the real-time dynamic intrinsic parameter matrix of the main control gimbal and the relay gimbal;
[0038] The fundamental matrix is calculated by combining the pre-calibrated relative rotation matrix and translation vector, and the coordinates of the observation points on the image plane of the main control gimbal are projected forward onto the image plane of the relay gimbal to generate the theoretical epipolar line;
[0039] Based on the statistical confidence scaling factor, geometric projection magnification factor, trace of the two-way mechanical backlash covariance matrix, fixed variance constant term caused by non-parallelism of reference mounting planes between devices, and pixel compensation constant, the dynamic epipolar tolerance band defining the deviation envelope on both sides of the theoretical epipolar line is calculated.
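The epipolar projection can be illustrated with the standard two-view relation F = K_relay^(-T) [t]x R K_main^(-1). The sketch below assumes the convention X_relay = R X_main + t and uses NumPy; the names and example intrinsics are hypothetical, and the tolerance band formula is not reproduced because its exact form is not given in this section.

```python
import numpy as np


def skew(t):
    """Cross-product matrix [t]x of a 3-vector."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def fundamental_matrix(K_main, K_relay, R, t):
    """F maps a main-gimbal pixel to its epipolar line in the relay image."""
    essential = skew(t) @ R
    return np.linalg.inv(K_relay).T @ essential @ np.linalg.inv(K_main)


def epipolar_line(F, pixel_main):
    """Line coefficients (a, b, c) with a*u + b*v + c = 0 in the relay image."""
    return F @ np.array([pixel_main[0], pixel_main[1], 1.0])


def orthogonal_distance(line, pixel):
    """Perpendicular pixel distance from a candidate centroid to the line."""
    a, b, c = line
    return abs(a * pixel[0] + b * pixel[1] + c) / np.hypot(a, b)


# Illustrative setup: identical intrinsics, no rotation, pure horizontal baseline.
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
F = fundamental_matrix(K, K, np.eye(3), np.array([1.0, 0.0, 0.0]))
line = epipolar_line(F, (700.0, 400.0))
```

For this configuration the epipolar line is the horizontal row v = 400 in the relay image, so a candidate centroid at (900, 410) lies 10 pixels from it.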
[0040] Preferably, the steps for extracting the target two-dimensional pixel coordinates captured by the main control gimbal at the same aligned sampling time, and the real-time dynamic intrinsic parameter matrix of the main control gimbal and the relay gimbal include:
[0041] When a singularity is detected in the main diagonal element of the real-time dynamic intrinsic parameter matrix, the intrinsic parameter verification logic is triggered.
[0042] The intrinsic parameter state of the previous valid frame, which is cached within the smoothing filter, is retrieved to participate in the calculation of the fundamental matrix.
[0043] Preferably, the steps for selecting matching targets from candidate targets and completing the handover based on the dynamic epipolar line tolerance band include:
[0044] Based on the dynamic epipolar tolerance band, a preliminary subset of candidate targets whose orthogonal distances satisfy the envelope constraint is selected;
[0045] Extract the plane velocity vector of the target image captured by the main gimbal, and the initial plane velocity vector of the candidate targets in the candidate target subset within the field of view of the relay gimbal;
[0046] The cosine similarity of the apparent motion trajectory is calculated by the dot product of the velocity vector in the target image plane and the velocity vector in the initial image plane.
[0047] A multidimensional weighted disambiguation criterion is constructed by combining the normalized distance penalty term corresponding to orthogonal distance with cosine similarity, and the candidate target with the highest weighted score is selected as the matching target.
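A minimal sketch of this two-stage screening follows. The equal weights, the form of the normalized distance term (1 minus distance over band half-width), and the candidate tuple layout are assumptions for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two image-plane velocity vectors."""
    norm_u, norm_v = math.hypot(*u), math.hypot(*v)
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # undefined direction: treat as neutral
    return (u[0] * v[0] + u[1] * v[1]) / (norm_u * norm_v)


def select_match(candidates, main_velocity, band_half_width, w_dist=0.5, w_cos=0.5):
    """candidates: iterable of (id, orthogonal_distance_px, image_plane_velocity)."""
    best_id, best_score = None, -math.inf
    for cand_id, distance, velocity in candidates:
        if distance > band_half_width:
            continue  # outside the epipolar tolerance envelope
        score = (w_dist * (1.0 - distance / band_half_width)
                 + w_cos * cosine_similarity(main_velocity, velocity))
        if score > best_score:
            best_id, best_score = cand_id, score
    return best_id


candidates = [
    ("a", 2.0, (1.0, 0.0)),    # near the line, moving with the target
    ("b", 1.0, (-1.0, 0.0)),   # nearer the line, but moving the opposite way
    ("c", 30.0, (1.0, 0.0)),   # outside the tolerance band
]
match = select_match(candidates, main_velocity=(1.0, 0.0), band_half_width=10.0)
```

Candidate "b" is closest to the epipolar line but moves against the tracked target, so the kinematic term rejects it in favor of "a".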
[0048] In summary, the present invention has at least one of the following beneficial technical effects:
[0049] 1. This invention constructs a global visibility mapping model and combines it with the first-order gradient of the dual-spectral signal-to-noise ratio state variables to perform transient disturbance rejection control. This technical solution can quantitatively evaluate the visibility status of a target in a three-dimensional environment in real time. When sudden environmental interference causes an abrupt change in the signal-to-noise ratio of one spectral channel, the system dynamically adjusts the servo control weights of the disturbed channel and the normal channel, effectively avoiding servo tracking loss of the photoelectric gimbal due to one-sided interference and ensuring the continuity and stability of bird target tracking in complex low-altitude backgrounds.
[0050] 2. This invention actively triggers a handover mechanism by monitoring the expected value of the visibility line integral and the visibility lower bound of the target's predicted trajectory, and sends the predicted three-dimensional coordinates to the relay gimbal in conjunction with the total system delay time and the maneuvering damping attenuation coefficient. This design allows the system to select the optimal relay device in advance before the target is completely obscured by obstacles. At the same time, the maneuvering damping attenuation coefficient suppresses the divergence error caused by long-distance extrapolation calculations and offsets the inherent delay caused by network communication and mechanical dead zones, ensuring that the relay gimbal can accurately adjust its position in advance and enter the pre-aiming state.
[0051] 3. During the overlapping period of joint observation by the main gimbal and the relay gimbal, this invention generates theoretical epipolar lines and dynamic epipolar tolerance bands, and filters candidate targets by combining the cosine similarity of the image plane velocity vectors. This feature uses epipolar geometry constraints to eliminate, in the spatial position dimension, background noise that does not conform to binocular imaging rules, and then combines this with trajectory similarity in the kinematic dimension for comprehensive discrimination. This effectively overcomes the handover mismatch that easily occurs when multiple bird flocks are densely active or when similar moving objects interfere, thus improving the relay accuracy of multi-gimbal collaborative observation tasks.
Attached Figure Description
[0052] Figure 1 is a schematic diagram of the system architecture of the present invention;
[0053] Figure 2 is a schematic diagram of the method flow of the present invention;
[0054] Figure 3 is a schematic diagram of the two-dimensional spatial field-of-view blind zone prediction and target trajectory prediction of the present invention;
[0055] Figure 4 is a schematic diagram of the first-order gradient time series comparison of the dual-spectral signal-to-noise ratio of the present invention;
[0056] Figure 5 is a box plot of the present invention comparing the lifecycle of the continuous target tracking link across three schemes.
[0057] In the figures, 10 is a radar detector; 20 is a dual-spectrum optoelectronic gimbal; 30 is a collaborative control server; 100 is a spatial modeling module; 200 is a tracking and early warning module; 300 is a collaborative handover module; and 400 is a geometric disambiguation module.
Detailed Implementation
[0058] The present invention is described in further detail below with reference to Figures 1 to 5.
[0059] This invention provides a multi-gimbal collaborative dual-spectral low-altitude bird detection and continuous tracking system, including a radar detector 10, multiple dual-spectral opto-gimbals 20, and a collaborative control server 30.
[0060] The collaborative control server 30 is communicatively connected to the radar detector 10 and multiple dual-spectrum electro-optical gimbals 20. The radar detector 10 is used to acquire the initial three-dimensional coordinates and velocity vectors of targets within the airspace. The dual-spectrum electro-optical gimbals 20 include a visible light sensor, an infrared thermal imaging sensor, and a servo motor with an absolute encoder. The system achieves clock synchronization between the radar detector 10, each dual-spectrum electro-optical gimbal 20, and the collaborative control server 30 through the Precision Time Protocol.
[0061] The collaborative control server 30 is internally configured with multiple logical modules. Specifically, the collaborative control server 30 includes a spatial modeling module 100, a tracking and early warning module 200, a collaborative handover module 300, and a geometric disambiguation module 400.
[0062] The spatial modeling module 100 acquires the three-dimensional elevation model and digital surface model of the deployment area and establishes a geodetic coordinate system. The spatial modeling module 100 discretizes the low-altitude airspace into a set of three-dimensional voxels, calculates the visibility weight of each dual-spectral optoelectronic gimbal 20 for each three-dimensional voxel, and generates a global visibility mapping model.
[0063] The tracking and early warning module 200 receives the initial target detection data sent by the radar detector 10 and, based on the global visibility mapping model, wakes up the corresponding dual-spectrum optoelectronic gimbal 20 as the main control gimbal to lock onto the target. During the tracking process, the tracking and early warning module 200 extracts the dual-spectral signal-to-noise ratio state variables, outputs the target's three-dimensional instantaneous velocity vector and acceleration vector, and constructs the target dynamic equation.
[0064] The collaborative handover module 300 extrapolates the predicted trajectory within a future time window based on the target dynamics equation. It calculates the expected value of the line integral of the predicted trajectory in the global visibility mapping model and the lower bound of visibility, and executes anti-interference logic in conjunction with the gradient change of the dual-spectral signal-to-noise ratio state variable. When preset handover conditions are met, the collaborative handover module 300 selects the dual-spectral optoelectronic gimbal 20 with the lowest cost as the relay gimbal, and sends the predicted three-dimensional coordinates to the relay gimbal based on the total delay time and the maneuvering damping attenuation coefficient.
[0065] During the joint observation overlap period, the geometric disambiguation module 400 calculates the fundamental matrix and generates the theoretical epipolar line on the image plane of the relay gimbal. The geometric disambiguation module 400 integrates the mechanical error covariance of the main gimbal and the relay gimbal to construct a dynamic epipolar tolerance band, calculates the orthogonal distance from the pixel centroid of the candidate target to the theoretical epipolar line, and selects matching targets based on the dynamic epipolar tolerance band to complete the identity transfer.
[0066] Referring to Figure 2, this invention provides a multi-gimbal collaborative dual-spectral low-altitude bird detection and continuous tracking method, comprising the following steps:
[0067] S100: Obtain the three-dimensional elevation model and digital surface model, establish a geodetic coordinate system, calculate the visibility weights of each three-dimensional voxel for the dual-spectral photoelectric gimbal 20, and construct a global visibility mapping model.
[0068] S200: Receive initial target detection data, wake up the main control gimbal for dual-spectral locking, extract the dual-spectral signal-to-noise ratio state variables, estimate the target's three-dimensional motion vector, and construct the target dynamic equation.
[0069] S300: Extrapolate the predicted target trajectory, calculate the expected value of the visibility line integral and the visibility lower bound, and perform transient disturbance rejection control in combination with the dual-spectral signal-to-noise ratio gradient.
[0070] S400: When the expected value of the visibility line integral falls below the safety threshold or the visibility lower bound drops to zero, calculate the comprehensive tracking cost of the surrounding candidate gimbals, select the relay gimbal, and trigger the soft handover mechanism.
[0071] S500: Combining the total system delay time and the maneuvering damping attenuation coefficient, generate predicted coordinates and send them to the relay gimbal, controlling the relay gimbal to adjust its servo attitude in advance and enter the joint observation overlap period.
[0072] S600: During the joint observation overlap period, generate the theoretical epipolar line, construct the dynamic epipolar tolerance band, calculate the orthogonal distance from the pixel centroid of each candidate target within the field of view to the theoretical epipolar line, and select the unique matching target to complete the handover.
[0073] To further clarify the implementation of each technical aspect of the present invention, the following will provide a detailed description of the implementation of each functional module involved above and its internal processing flow.
[0074] In this embodiment, to achieve digital characterization of complex low-altitude physical environments, the specific implementation of S100 includes the following sub-steps:
[0075] S101, the spatial modeling module 100 acquires the three-dimensional elevation model and digital surface model of the deployment area, using them as a spatial registration reference to establish a unified geodetic coordinate system across the entire region. In practical engineering applications, considering the balance between target detection accuracy and system edge computing power, a preferred implementation is to divide the designated low-altitude airspace into multiple three-dimensional voxels according to a preset spatial resolution, thereby generating a set of three-dimensional voxels. Each three-dimensional voxel is uniquely characterized by the three-dimensional coordinates (x, y, z) of its center point. Based on the established geodetic coordinate system, the spatial modeling module 100 simultaneously acquires the absolute geodetic coordinates (X, Y, Z) of the optical centers of each dual-spectral photoelectric gimbal 20 deployed in a specific space. This process ensures that the spatial positions of subsequent multi-source sensors are strictly aligned, avoiding spatial physical distortion caused by inconsistent references.
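The discretization in S101 can be sketched as follows, assuming for illustration that the designated airspace is an axis-aligned box in the geodetic coordinate system and that voxels are cubes of a single preset resolution; the function name and example dimensions are hypothetical.

```python
from itertools import product


def voxel_centers(bounds_min, bounds_max, resolution):
    """Return the center coordinates (x, y, z) of every voxel in the airspace box."""
    counts = [max(1, int((hi - lo) // resolution))
              for lo, hi in zip(bounds_min, bounds_max)]
    axes = [
        [lo + (i + 0.5) * resolution for i in range(n)]
        for lo, n in zip(bounds_min, counts)
    ]
    return list(product(*axes))


# A 100 m x 100 m x 50 m airspace at 25 m resolution gives 4 x 4 x 2 voxels.
centers = voxel_centers((0.0, 0.0, 0.0), (100.0, 100.0, 50.0), 25.0)
```

A finer resolution improves localization of blind zones at the cost of a voxel count that grows cubically, which is the accuracy versus computing power trade-off noted above.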
[0076] After completing the discretization and reconstruction of the basic space, the system needs to further account for the obstruction of the observation line of sight by terrain and buildings. Specifically, in step S102, the spatial modeling module 100 extracts the coordinates of static buildings and vegetation from the digital surface model and generates, through a spatial intersection algorithm, a rigid occlusion Boolean value reflecting the physical blind zone. For any dual-spectrum opto-gimbal 20 and any three-dimensional voxel, the spatial modeling module 100 constructs a spatial line of sight connecting the optical center of the dual-spectrum opto-gimbal 20 and the center point of the three-dimensional voxel. In a real gridded computing scenario, testing the line of sight for intersection directly is prone to abrupt state changes at building edges due to floating-point precision. To avoid such boundary misjudgments, the system introduces a preset spatial expansion coefficient to apply edge tolerance processing to the extracted obstacle bounding boxes, and then determines whether the spatial line of sight geometrically interferes with the processed obstacle bounding boxes. If geometric interference occurs, the three-dimensional voxel is determined to lie in the physical blind zone of that dual-spectrum opto-gimbal 20, and the corresponding rigid occlusion Boolean value is set to 0; if no interference occurs, the rigid occlusion Boolean value is set to 1. The specific construction of the obstacle bounding boxes and the intersection test between the three-dimensional ray and a bounding box can be implemented by those skilled in the art using existing bounding volume hierarchy tree structures and well-known ray intersection techniques, and will not be elaborated here.
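For illustration, the interference test against a single expanded bounding box can be written with the well-known slab method. This sketch assumes axis-aligned bounding boxes and treats the spatial expansion coefficient as an additive margin in meters; both are assumptions, and a practical system would wrap this test in a bounding volume hierarchy.

```python
def rigid_occlusion_flag(optical_center, voxel_center, box_min, box_max, expansion=0.0):
    """Return 0 if the sight segment hits the expanded box (occluded), else 1."""
    low = [m - expansion for m in box_min]
    high = [m + expansion for m in box_max]
    t_enter, t_exit = 0.0, 1.0  # parameter range of the segment center -> voxel
    for origin, end, lo, hi in zip(optical_center, voxel_center, low, high):
        direction = end - origin
        if abs(direction) < 1e-12:
            # Segment is parallel to this slab: it must already lie inside it.
            if origin < lo or origin > hi:
                return 1
            continue
        t0, t1 = (lo - origin) / direction, (hi - origin) / direction
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter, t_exit = max(t_enter, t0), min(t_exit, t1)
        if t_enter > t_exit:
            return 1  # slab intervals do not overlap: no intersection
    return 0


gimbal, voxel = (0.0, 0.0, 10.0), (100.0, 0.0, 10.0)
blocked = rigid_occlusion_flag(gimbal, voxel, (40.0, -5.0, 0.0), (60.0, 5.0, 20.0))
grazing = rigid_occlusion_flag(gimbal, voxel, (40.0, 0.5, 0.0), (60.0, 10.0, 20.0))
grazing_expanded = rigid_occlusion_flag(gimbal, voxel, (40.0, 0.5, 0.0), (60.0, 10.0, 20.0),
                                        expansion=1.0)
```

The last two calls show the effect of the edge tolerance: a line of sight that passes 0.5 m from a building edge is treated as visible without expansion and as occluded with a 1 m margin.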
[0077] Simply obtaining the absolute occlusion relationship in three-dimensional geometry is insufficient to assess the true observation quality. This is because photoelectric sensors are inevitably limited by both atmospheric absorption and distance attenuation in actual outdoor environments. Therefore, in stage S103, based on the physical attenuation principles of atmospheric radiative transmission and optical imaging, the spatial modeling module 100 comprehensively considers the gimbal distance attenuation function, the dual-spectral atmospheric penetration attenuation coefficient, and the aforementioned rigid occlusion Boolean value to calculate the visibility weight of each three-dimensional voxel. This allows for the construction of a global visibility mapping model based on multi-dimensional observation quality assessment. Specifically, the spatial modeling module 100 establishes the weight values of each three-dimensional voxel according to the following visibility weight formula:
[0078] $W_{i,j} = B_{i,j} \cdot f_{\mathrm{res}}(d_{i,j}) \cdot \tau_{\mathrm{atm}}(d_{i,j})$;
[0079] In the formula, $W_{i,j}$ represents the visibility weight of the $i$-th dual-spectral opto-gimbal for the $j$-th target three-dimensional voxel. To facilitate unified measurement in the subsequent cost function, the system usually normalizes the result to the real interval [0,1]. $B_{i,j}$ represents the rigid occlusion Boolean value calculated in step S102; $f_{\mathrm{res}}$ is the gimbal distance attenuation function and $\tau_{\mathrm{atm}}$ is the dual-spectral atmospheric penetration attenuation coefficient; $d_{i,j}$ represents the straight-line distance in space from the optical center of the dual-spectrum opto-gimbal to the center point of the target three-dimensional voxel. Note that when the physical target is extremely close to the device and $d_{i,j}$ approaches zero, the system clamps it to a lower safety limit equal to the minimum physical focusing distance of the lens, to prevent division by zero or function divergence at extremely close range.
[0080] This embodiment quantifies the optical resolution attenuation function $f_{\mathrm{res}}(d)$ in the above formula. The system constructs a quantitative model of the optical resolution attenuation function based on the Johnson criterion and the geometric projection relationship of optical imaging. The specific calculation formula is as follows:
[0081] $f_{\mathrm{res}}(d) = \min\left(1,\ \dfrac{f_{\max} \cdot H_{\mathrm{ref}}}{d \cdot p \cdot N_{\min}}\right)$;
[0082] In the formula, $f_{\max}$ is the maximum physical focal length parameter of the lens of the dual-spectrum opto-gimbal; $H_{\mathrm{ref}}$ is the preset reference physical feature size of the target to be tracked, usually taken as 0.5 meters to 1.5 meters based on the average profile of a typical UAV or low-altitude aircraft; $p$ is the pixel center spacing of the internal photoelectric sensor; $N_{\min}$ is the minimum pixel span threshold required for effective target recognition, which this embodiment typically sets to 6 to 10 pixels based on empirical requirements for reliable recognition; and $d$ is the straight-line distance from the optical center to the voxel center. The reason for using this model is as follows. When the ratio on the right-hand side is greater than 1, the number of pixels the target occupies at the current distance exceeds the recognition threshold, and taking the upper limit of 1 indicates that distance causes no observational degradation at the resolution level. As the distance $d$ increases, the ratio gradually falls below 1, which reflects the discretization and loss of target detail on the sensor surface. This physical causal relationship provides a firm optical basis for system handover.
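As a non-limiting numerical illustration, the clipped ratio described above can be computed as below. The sketch assumes the ratio is the number of pixels the reference feature spans (focal length times feature size, divided by distance times pixel pitch) over the minimum pixel span threshold; the parameter names and example values are hypothetical.

```python
def resolution_attenuation(distance_m, focal_length_m, feature_size_m,
                           pixel_pitch_m, min_pixel_span, min_focus_distance_m=1.0):
    """Optical resolution attenuation in [0, 1] for a target at the given distance."""
    # Clamp very short distances to the minimum focusing distance of the lens.
    distance = max(distance_m, min_focus_distance_m)
    pixels_on_target = focal_length_m * feature_size_m / (distance * pixel_pitch_m)
    return min(1.0, pixels_on_target / min_pixel_span)


# Hypothetical lens: 300 mm focal length, 5 um pixel pitch, 0.5 m feature, 8-pixel threshold.
near = resolution_attenuation(1000.0, 0.3, 0.5, 5e-6, 8)
far = resolution_attenuation(7500.0, 0.3, 0.5, 5e-6, 8)
```

With these values the target spans 30 pixels at 1 km, so the function saturates at 1, while at 7.5 km it spans about 4 pixels and the value falls to about 0.5.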
[0083] Meanwhile, regarding the dual-spectral atmospheric penetration attenuation coefficient Existing macroscopic radiation models often lack clear boundary constraints and parameter definitions. Therefore, this embodiment introduces specific empirical calculation benchmarks and performs precise decomposition by combining the Krusk model with slant-path atmospheric transport characteristics:
[0084] ;
[0085] In the formula, the atmospheric visibility parameter is obtained in real time from external weather stations and is expressed in kilometers. The wavelength term is the center wavelength of the current operating channel in micrometers; for example, the visible light channel typically uses 0.55 micrometers, while the infrared thermal imaging channel uses a specific value in the range of 8 to 14 micrometers, depending on the detector material. The wavelength scattering coefficient depends on the visibility: when visibility is less than 6 kilometers, its value is computed from the visibility itself; when visibility is greater than or equal to 6 kilometers and less than 50 kilometers, its value is 1.3; when visibility is greater than or equal to 50 kilometers, its value is 1.6. This setting corrects for the Mie scattering effect under high aerosol concentrations. Furthermore, the formula contains a slant-path atmospheric thickness correction factor tied to the observation elevation angle. Because an increase in elevation angle causes a nonlinear decrease in the thickness of the high-concentration aerosol layer traversed by the line of sight when tracking low-altitude targets, the system defines this factor in terms of the elevation angle and the local atmospheric vertical attenuation constant, which typically ranges from 0.1 to 0.3. Through this decomposition, the system quantifies the physical attenuation differences between the visible and infrared spectra under complex weather conditions and couples them to the atmospheric layering effects caused by changes in gimbal attitude.
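As a non-limiting illustration, the attenuation described above can be sketched in Python as follows. The extinction constant 3.912 and the low-visibility exponent 0.585·V^(1/3) are taken from the commonly published form of the Kruse model and are assumptions here, not values stated in this text. The exponential form of the slant-path correction factor and all function names are likewise hypothetical placeholders.

```python
import math


def kruse_exponent(visibility_km):
    """Wavelength scattering exponent q of the commonly published Kruse model."""
    if visibility_km < 6.0:
        return 0.585 * visibility_km ** (1.0 / 3.0)  # assumed standard low-visibility form
    if visibility_km < 50.0:
        return 1.3
    return 1.6


def slant_correction(elevation_deg, beta=0.2):
    """ASSUMED form: path thickness shrinks as elevation grows (beta in 0.1 to 0.3)."""
    return math.exp(-beta * math.sin(math.radians(elevation_deg)))


def transmittance(distance_km, visibility_km, wavelength_um,
                  elevation_deg=0.0, beta=0.2):
    """Atmospheric transmittance in (0, 1] along a slant path."""
    q = kruse_exponent(visibility_km)
    # Extinction coefficient in 1/km, referenced to 0.55 micrometers.
    extinction = (3.912 / visibility_km) * (wavelength_um / 0.55) ** (-q)
    return math.exp(-extinction * distance_km * slant_correction(elevation_deg, beta))


if __name__ == "__main__":
    print("visible :", transmittance(3.0, 8.0, 0.55, elevation_deg=15.0))
    print("infrared:", transmittance(3.0, 8.0, 10.0, elevation_deg=15.0))
```

Under these assumptions the infrared channel retains a higher transmittance than the visible channel for the same path, which is the difference the embodiment exploits.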
[0086] After the above process, the spatial modeling module 100 traverses every combination of dual-spectral electro-optical gimbal 20 and three-dimensional voxel, completes the global calculation, and stores the results as the global visibility mapping model, which provides environmental prior data for the subsequent gimbal scheduling and collaborative handover logic.
[0087] In this embodiment, to ensure the cooperative guidance accuracy of multi-source heterogeneous sensors in complex spatial domains, the specific implementation of S200 includes the following sub-steps:
[0088] S201, based on the timestamp alignment mechanism of the Precision Time Protocol, the tracking and early warning module 200 receives the initial target detection data, stamped with an absolute timestamp, sent by the radar detector 10. This data includes the initial three-dimensional coordinates of the target output by the radar detector 10 in a globally unified geodetic coordinate system. Although the radar detector 10 can provide the approximate spatial position of the target during wide-area search, its data lacks fine texture features, so the system must transfer tracking responsibility to the optical devices through efficient scheduling logic. Based on the initial three-dimensional coordinates, the tracking and early warning module 200 maps the target to the corresponding three-dimensional voxel in the global visibility mapping model and queries the visibility weight values of all dual-spectral electro-optical gimbals 20 at that voxel. Because simple geometric visibility cannot fully represent the actual usability of the equipment in real urban or field environments, as a preferred method the system performs a weighted evaluation of each device's hardware health vector and visibility weight value and selects the device with the highest comprehensive score as the main control gimbal. Subsequently, the main control gimbal receives servo pointing commands from the tracking and early warning module 200 and adjusts the pitch and azimuth angles of its internal servo motors to achieve an initial field-of-view lock on the target. For the spatial transformation between the polar coordinate system of the radar detector 10 and the system's geodetic coordinate system, those skilled in the art can use existing known techniques such as spatial coordinate rotation and translation matrices, which will not be elaborated upon here.
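As a non-limiting illustration, the weighted selection of the main control gimbal can be sketched in Python as follows. The text does not specify the scoring rule, so the linear blend of the visibility weight with the mean of the health vector, the blend factor, and all names are assumptions made only for this example.

```python
def select_main_gimbal(candidates, visibility_share=0.6):
    """Pick the gimbal with the highest blended score.

    candidates maps a gimbal id to (visibility_weight, health_vector), where
    both the visibility weight and each health entry are assumed to lie in [0, 1].
    """
    def score(entry):
        visibility, health = entry
        mean_health = sum(health) / len(health)
        return visibility_share * visibility + (1.0 - visibility_share) * mean_health

    return max(candidates, key=lambda gid: score(candidates[gid]))


if __name__ == "__main__":
    pool = {
        "G1": (0.9, [0.2, 0.3]),   # good line of sight, degraded hardware
        "G2": (0.8, [1.0, 0.9]),   # slightly worse view, healthy hardware
    }
    print(select_main_gimbal(pool))
```

The example shows why geometric visibility alone is not decisive: a device with a slightly worse line of sight but healthy hardware can still receive the higher score.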
[0089] After the main control gimbal enters continuous tracking mode, step S202 extracts the dual-spectral signal-to-noise ratio (SNR) state variables of its visible light channel and infrared thermal imaging channel in order to quantify the target observation quality in real time and identify potential visual interference. Computing the two channels in parallel gives the system a robust decision basis under fog, strong light, or drastic fluctuations in background noise. Within the image plane, the tracking and early warning module 200 delineates an inner target region centered on the centroid of the target's two-dimensional pixels and an outer annular background region, and quantifies them according to the following SNR formula:
[0090] $\mathrm{SNR}_{c}=\dfrac{\left|\mu_{T}-\mu_{B}\right|}{\sigma_{B}+\varepsilon}$;
[0091] In the formula, $\mathrm{SNR}_{c}$ is the SNR state variable of the channel currently being calculated, indicated by the subscript $c$: $c=vis$ denotes the visible light channel and $c=ir$ denotes the infrared thermal imaging channel; $\mu_{T}$ is the arithmetic mean of the gray values of all pixels within the inner target region and reflects the real-time imaging intensity of the target; $\mu_{B}$ is the arithmetic mean of the gray values of all pixels within the outer annular background region; $\sigma_{B}$ is the standard deviation of the gray values of all pixels within the outer annular background region and measures the intensity of clutter in the local background; $\varepsilon$ is a small positive compensation term preset by the system to prevent overflow, whose value is limited to between $10^{-5}$ and $10^{-3}$ in this embodiment. The purpose of $\varepsilon$ is to prevent a division failure when the observation background is extremely uniform and $\sigma_{B}$ approaches zero. Through the above processing, the system acquires the SNR time series of the two independent channels synchronously within each video frame period, providing a quantitative quality reference for the subsequent determination of transient optical interference and cross-device handover.
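As a non-limiting illustration, the per-channel SNR computation can be sketched in Python as follows. The sketch assumes that the contrast is taken as the absolute difference between the target mean and the background mean and that the population standard deviation is used for the background; the function name and the flat pixel lists are hypothetical simplifications.

```python
from statistics import fmean, pstdev


def channel_snr(target_pixels, background_pixels, eps=1e-4):
    """SNR state variable of one channel from two flat lists of gray values.

    eps is the small positive compensation term (1e-5 to 1e-3 in the text).
    """
    mu_target = fmean(target_pixels)
    mu_background = fmean(background_pixels)
    sigma_background = pstdev(background_pixels)
    # Assumption: absolute contrast, so dark-on-bright targets also score positively.
    return abs(mu_target - mu_background) / (sigma_background + eps)


if __name__ == "__main__":
    target = [200, 210, 190]
    noisy_background = [90, 110, 90, 110]
    flat_background = [100, 100, 100, 100]
    print(channel_snr(target, noisy_background))
    print(channel_snr(target, flat_background))  # finite thanks to eps
```

The second call shows the role of the compensation term: a perfectly uniform background yields a large but finite value instead of a division error.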
[0092] Two-dimensional image-plane features alone are insufficient to support the system's forward-looking warning logic in three-dimensional physical space. Therefore, step S203 lifts the discrete observation data into the motion space. Before deriving the dynamics, the system must filter out the measurement white noise caused by sensor mechanical vibration and atmospheric disturbances, following the recursive estimation principle of discrete-time linear dynamic systems. The tracking and early warning module 200 uses a Kalman filter to continuously estimate the target's motion parameters in the geodetic coordinate system and establishes the target's continuous kinematic equation accordingly. In this embodiment, the system uses a constant jerk model whose filter state vector contains position, velocity, acceleration, and jerk components, which supports the subsequent extraction of the target's nonlinear maneuvering characteristics. In each filtering cycle, the tracking and early warning module 200 combines the laser ranging feedback with the real-time servo motor angles to calculate the instantaneous observed coordinates of the target, and performs state prediction and update on these measurements. To keep the computation robust when the measurement noise covariance matrix is ill-conditioned or degenerate, the module adds a small regularization scalar when computing the inverse term of the Kalman gain, which avoids divergence caused by matrix singularity. Using the posterior estimate obtained after the filter converges, the system establishes the target dynamic equation of the following form:
[0093] $\mathbf{P}(t+\Delta t)=\mathbf{P}(t)+\mathbf{V}(t)\,\Delta t+\tfrac{1}{2}\mathbf{A}(t)\,\Delta t^{2}+\tfrac{1}{6}\mathbf{J}(t)\,\Delta t^{3}$;
[0094] In the formula, $\mathbf{P}(t+\Delta t)$ is the three-dimensional predicted position vector of the target at the predicted time $t+\Delta t$; $\mathbf{P}(t)$ is the filtered and corrected estimate of the three-dimensional position vector at the current observation time $t$; $\mathbf{V}(t)$ is the three-dimensional velocity vector at the current moment; $\mathbf{A}(t)$ is the three-dimensional acceleration vector at the current moment; $\mathbf{J}(t)$ is the three-dimensional jerk vector at the current moment. The parameter $\Delta t$ is the preset inference step size of the system, fixed at 0.02 seconds based on the response frequency of the servo control link. By integrating velocity, acceleration, and jerk jointly, the system transforms instantaneous discrete detection points into a continuous predicted trajectory with physical inertia, providing future spatial coordinates to guide the subsequent cross-gimbal soft handover scheduling. The specific state transition matrix design within the Kalman filter can be implemented by those skilled in the art using existing standard extended Kalman filter techniques, and will not be elaborated upon here.
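As a non-limiting illustration, the constant jerk extrapolation can be sketched in Python as follows. The sketch assumes the usual third-order Taylor form of a constant jerk model and takes the filtered position, velocity, acceleration, and jerk as given; the Kalman filter itself is not shown, and all function names are hypothetical.

```python
def predict_position(pos, vel, acc, jerk, dt):
    """Extrapolate a 3-D position by dt seconds under constant jerk."""
    return tuple(
        p + v * dt + 0.5 * a * dt ** 2 + j * dt ** 3 / 6.0
        for p, v, a, j in zip(pos, vel, acc, jerk)
    )


def predict_trajectory(pos, vel, acc, jerk, window_s, step_s=0.02):
    """Sample the predicted trajectory every step_s seconds over window_s seconds."""
    count = int(round(window_s / step_s))
    return [predict_position(pos, vel, acc, jerk, k * step_s)
            for k in range(1, count + 1)]


if __name__ == "__main__":
    track = predict_trajectory((0.0, 0.0, 50.0), (12.0, 3.0, 0.5),
                               (0.4, -0.2, 0.0), (0.0, 0.0, 0.0), window_s=2.0)
    print(len(track), track[-1])
```

Sampling the same equation at multiples of the 0.02-second step gives the discrete point sequence that a later visibility lookup can consume.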
[0095] To prevent invalid handovers of the photoelectric detection system caused by localized smoke or direct strong light, the collaborative handover module 300 in this embodiment employs proactive forward-looking prediction logic. The specific implementation of S300 includes the following sub-steps:
[0096] S301, to avoid misjudging transient environmental disturbances as target loss and triggering unnecessary handover actions, the collaborative handover module 300 uses the target dynamic equation established in the preceding steps to extrapolate a three-dimensional predicted trajectory over a preset time window into the future. Considering the maneuvering characteristics of low-altitude targets and the inherent mechanical inertia of the gimbal servo response, as a preferred approach this preset time window is typically set to 2 to 5 seconds. Field tests have shown that this range keeps the predicted trajectory statistically meaningful while suppressing the divergence errors caused by long-term extrapolation. Subsequently, the collaborative handover module 300 discretizes the three-dimensional predicted trajectory at the system sampling frequency to obtain a sequence of spatial point coordinates. The system maps these coordinates into the global visibility mapping model and extracts the visibility weight value corresponding to each point. To comprehensively evaluate visibility along the target's path ahead, the system calculates the expected value of the visibility line integral and the lower bound of visibility over the trajectory segment. The specific calculation formulas are as follows:
[0097] $E_{vis}=\dfrac{1}{T_{w}}\sum_{k=1}^{N}W(\mathbf{P}_{k})\,\delta t$;
[0098] $W_{\min}=\min_{1\le k\le N}W(\mathbf{P}_{k})$;
[0099] In the formula, $E_{vis}$ is the expected value of the visibility line integral of the three-dimensional predicted trajectory, which assesses the overall stability of the observation quality of the target over the coming period; $T_{w}$ is the preset time window span for extrapolation; $N$ is the total number of discrete sampling points within the time window; $W(\mathbf{P}_{k})$ is the visibility weight value at the predicted point $\mathbf{P}_{k}$, obtained by querying the global visibility mapping model; $\delta t$ is the time step between adjacent discrete points; $W_{\min}$ is the lower bound of visibility within the predicted trajectory segment. The purpose of this lower bound is to detect whether an extremely narrow region of complete physical occlusion exists along the future path, which an averaged value alone could mask. In this embodiment, the calculated lower bound $W_{\min}$ is compared with a preset occlusion safety threshold (e.g., 0.2). If $W_{\min}$ remains above the safety threshold, the target is determined to satisfy the geometric conditions for continuous observation.
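As a non-limiting illustration, the two trajectory statistics can be sketched in Python as follows. The sketch assumes that the visibility weights of the predicted points have already been looked up, that the points are evenly spaced, and that the window span equals the number of points times the step; the function names are hypothetical.

```python
def visibility_statistics(weights, step_s):
    """Return (expected value of the visibility line integral, lower bound).

    weights holds the visibility weight of each predicted point in time order.
    """
    if not weights:
        raise ValueError("at least one predicted point is required")
    window_s = len(weights) * step_s  # assumed: window = N * step
    expectation = sum(w * step_s for w in weights) / window_s
    return expectation, min(weights)


def geometrically_observable(lower_bound, occlusion_threshold=0.2):
    """True when no point on the predicted path drops to the occlusion threshold."""
    return lower_bound > occlusion_threshold


if __name__ == "__main__":
    expectation, lower = visibility_statistics([1.0, 0.9, 0.05, 0.9, 1.0], 0.02)
    print(expectation, lower, geometrically_observable(lower))
```

The sample path has a high average but contains one nearly occluded point, so the lower bound flags a gap that the expectation alone would hide.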
[0100] S302. After confirming visibility at the spatial geometric level, the system still needs to identify transient physical interference through feature fluctuations at the image level. Observations show that the Mie scattering coefficients of the infrared and visible light bands differ significantly for aerosols of different concentrations. This orthogonal characteristic at the spectral level provides the physical basis for identifying non-occluding interference. On this basis, the collaborative handover module 300 calculates in real time the first-order gradient of the dual-spectral SNR state variables of the visible and infrared channels. The specific formula is as follows:
[0101] $G_{c}(t_{n})=\dfrac{\mathrm{SNR}_{c}(t_{n})-\mathrm{SNR}_{c}(t_{n-1})}{T_{s}+\varepsilon}$;
[0102] In the formula, $G_{c}(t_{n})$ is the calculated first-order SNR gradient, whose sign reflects the trend of the imaging quality of the current channel; $\mathrm{SNR}_{c}(t_{n})$ is the SNR state variable of a given channel at the current observation time, where the subscript $c$ is $vis$ for the visible light channel or $ir$ for the infrared channel; $\mathrm{SNR}_{c}(t_{n-1})$ is the SNR state variable at the previous sampling time; $T_{s}$ is the system's fixed sampling period; $\varepsilon$ is a very small regularization constant preset by the system. It is introduced because extreme packet loss or timestamp alignment anomalies under harsh operating conditions can drive the effective value of $T_{s}$ toward zero, and the compensation term then prevents a program crash caused by division by zero. By continuously monitoring how far $G_{vis}$ and $G_{ir}$ diverge from each other, the system can separate genuine physical occlusion of the target from single-channel optical interference. Specifically, when the target is truly occluded by a building, the gradients of both channels usually show a sharp negative drop simultaneously; conversely, if only localized strong light or smoke is encountered, an abnormal abrupt change usually appears in the gradient of a single channel only.
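As a non-limiting illustration, the gradient computation and the occlusion-versus-interference distinction can be sketched in Python as follows. The threshold value, the label strings, and the handling of the case where both channels jump without both dropping are assumptions added for this example.

```python
def snr_gradient(snr_now, snr_prev, sample_period_s, eps=1e-6):
    """First-order SNR gradient with a small regularization term in the denominator."""
    return (snr_now - snr_prev) / (sample_period_s + eps)


def classify_disturbance(grad_vis, grad_ir, jump_threshold):
    """Label the current frame from the two channel gradients."""
    vis_jump = abs(grad_vis) > jump_threshold
    ir_jump = abs(grad_ir) > jump_threshold
    if vis_jump and ir_jump:
        # Both channels collapsing together points to a rigid obstruction.
        return "occlusion" if grad_vis < 0 and grad_ir < 0 else "ambiguous"
    if vis_jump:
        return "vis_disturbed"
    if ir_jump:
        return "ir_disturbed"
    return "nominal"


if __name__ == "__main__":
    g_vis = snr_gradient(4.0, 9.0, 0.02)   # visible channel washed out by glare
    g_ir = snr_gradient(8.1, 8.0, 0.02)    # infrared channel essentially unchanged
    print(classify_disturbance(g_vis, g_ir, jump_threshold=100.0))
```

Here only the visible channel changes abruptly, so the frame is labeled as a single-channel disturbance rather than an occlusion.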
[0103] S303, combining the feedback from spatial visibility prediction and image gradient monitoring, the collaborative handover module 300 constructs orthogonal anti-interference logic with multi-dimensional weighted judgment. During operation, the system continuously compares the prediction results with predetermined safety thresholds. In this embodiment, based on statistical analysis of numerous observation samples of low-altitude flying targets under different weather conditions, the system sets the integral safety threshold to 0.6 and the lower-bound safety threshold to 0.2. When the expected value of the visibility line integral and the lower bound of visibility both meet their respective thresholds, the physical path ahead of the target is judged to be open and unobstructed. Under this premise, if the system detects that the absolute value of the first-order SNR gradient of one channel exceeds the preset mutation threshold, it determines that the current tracking process has encountered a one-sided transient disturbance.
[0104] In response to such a one-sided disturbance, the collaborative handover module 300 immediately issues a forced handover-suppression command. As a preferred control implementation, the system intervenes by dynamically adjusting the weight allocation matrix in the underlying servo control loop. Specifically, the system resets the servo control weight of the disturbed channel directly to zero and simultaneously raises the weight of the other, normally imaging channel to the full-scale value of 1.0. This weight reset is achieved by modifying the input gain term of the main control gimbal's PID controller. Its technical significance is that it stops the centroid drift noise generated by the disturbed channel from entering the servo feedback loop, thereby preventing severe jitter or irreversible mistracking by the gimbal motor. The system maintains this one-sided suppression state until the SNR gradient of the disturbed channel stabilizes and returns to the safe threshold range. Only after confirming that the interference has dissipated does the system gradually restore the joint weighted control of the two channels according to a preset smoothing decay function. With this closed-loop anti-interference logic, the system keeps the main control gimbal locked continuously and stably and eliminates invalid cross-device handovers caused by local environmental fluctuations, improving the continuity and robustness of target tracking.
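As a non-limiting illustration, the weight suppression and gradual restoration can be sketched in Python as follows. The text does not specify the smoothing decay function or the nominal joint weights, so the exponential blend, the time constant, the equal nominal weights, and the function names are assumptions made only for this example.

```python
import math


def suppressed_weights(disturbed_channel):
    """Return (w_vis, w_ir) with the disturbed channel zeroed and the other at 1.0."""
    if disturbed_channel == "vis":
        return (0.0, 1.0)
    if disturbed_channel == "ir":
        return (1.0, 0.0)
    raise ValueError("disturbed_channel must be 'vis' or 'ir'")


def recovering_weights(disturbed_channel, elapsed_s, nominal=(0.5, 0.5), tau_s=0.5):
    """Blend from the suppressed weights back to the nominal joint weights.

    elapsed_s is the time since the interference was confirmed to have dissipated.
    The exponential decay with time constant tau_s is an assumed smoothing function.
    """
    decay = math.exp(-elapsed_s / tau_s)
    start = suppressed_weights(disturbed_channel)
    return tuple(n + (s - n) * decay for s, n in zip(start, nominal))


if __name__ == "__main__":
    print(suppressed_weights("vis"))
    for t in (0.0, 0.5, 2.0):
        print(t, recovering_weights("vis", t))
```

Because both weights decay toward values that sum to one, the combined servo gain stays constant throughout the recovery in this sketch.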
[0105] In this embodiment, to address the risk of tracking interruption caused by the high maneuverability of low-altitude targets and the random distribution of obstructions, the specific implementation of S400 includes the following sub-steps:
[0106] S401. In complex urban or wilderness scenarios, the target's trajectory inevitably passes behind obstacles such as buildings and trees. To respond early, the system continuously monitors the expected value of the visibility line integral and the lower bound of visibility obtained from the visibility prediction of the target's future three-dimensional trajectory. When the expected value of the visibility line integral falls below the integral safety threshold set above, or when the lower bound of visibility reaches zero, the system determines that the line of sight of the main control gimbal will be completely blocked within the preset time window, meaning that the target is about to enter the rigid occlusion blind zone of the current main control gimbal. At this moment, the collaborative handover module 300 immediately triggers the cross-device handover mechanism. By using the time margin before the target actually disappears, the system offsets the servo response delay incurred by the subsequent heterogeneous sensors during pointing adjustment, which is crucial for maintaining the continuity of the tracking link for highly maneuverable targets.
[0107] S402, after the handover mechanism is triggered, the system must find the most suitable sensor in the candidate device pool. Because a simple spatial distance criterion cannot account for both the line-of-sight quality and the dynamic response capability of a device, as a preferred approach the collaborative handover module 300 constructs a comprehensive tracking cost function. This function integrates spatial position, motion matching, and imaging environment quality through weighting. The formula for the comprehensive tracking cost function is as follows:
[0108] ;
[0109] In the formula, the left-hand side is the comprehensive tracking cost score of the i-th peripheral candidate dual-spectral electro-optical gimbal. The distance term is the distance between the optical center of the candidate gimbal and the current three-dimensional coordinates of the target; it is normalized to a dimensionless quantity by the maximum effective detection distance preset by the system, which this embodiment sets to 3000 meters to 5000 meters based on the nominal detection envelope of the optical lens. The angle term is the three-dimensional angle between the current servo axis pointing of the candidate gimbal and the predicted intersection position of the target. The remaining observation time window is obtained by subtracting the current system timestamp from the absolute timestamp of the predicted occlusion. A small positive constant with a value of $10^{-4}$ is preset to prevent overflow and avoids divergence when the denominator approaches zero near the critical intersection point. The visibility term is the expected value of the visibility line integral calculated for the i-th candidate gimbal over the target's predicted future trajectory. The three weighting coefficients are set to 0.3, 0.4, and 0.3 respectively in this embodiment, chosen to ensure that the candidate gimbal has sufficient dynamic response speed. The last term is an out-of-limit penalty determined by the hard physical limits of the device: the system calculates in real time the servo adjustment angular velocity required of each candidate gimbal, and if it exceeds the maximum rated angular velocity of that device's servo motor (e.g., 60 degrees/second), the penalty is assigned a constant value of 9999, far above the normal range of the cost. This logic ensures that the relay instruction is physically executable by eliminating devices with insufficient hardware performance at the algorithm level.
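As a non-limiting illustration, one possible form of the comprehensive tracking cost can be sketched in Python as follows. The exact formula is not reproduced in this text, so the additive structure, the normalization of the required angular rate by the rated maximum, and the use of one minus the visibility expectation (so that better visibility lowers the cost) are all assumptions; the function and parameter names are hypothetical.

```python
def tracking_cost(distance_m, angle_deg, time_remaining_s, visibility_expectation,
                  max_range_m=4000.0, max_rate_deg_s=60.0,
                  weights=(0.3, 0.4, 0.3), eps=1e-4, penalty_value=9999.0):
    """Lower is better. Returns the cost score of one candidate gimbal."""
    w_distance, w_motion, w_visibility = weights
    # Angular rate the servo would need to reach the intersection in time.
    required_rate = angle_deg / (time_remaining_s + eps)
    # Hard constraint: a device that cannot slew fast enough is effectively excluded.
    penalty = penalty_value if required_rate > max_rate_deg_s else 0.0
    return (w_distance * distance_m / max_range_m
            + w_motion * required_rate / max_rate_deg_s      # assumed normalization
            + w_visibility * (1.0 - visibility_expectation)  # assumed sign convention
            + penalty)


if __name__ == "__main__":
    print(tracking_cost(2000.0, 30.0, 3.0, 0.8))   # feasible candidate
    print(tracking_cost(1500.0, 90.0, 1.0, 0.9))   # needs 90 deg/s, penalized
```

The second candidate is closer and has better visibility, yet the penalty removes it from contention because its servo could not reach the intersection in time.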
[0110] S403, the instantaneous cost scores of the candidate gimbals calculated above still require temporal smoothing to eliminate fluctuations caused by network jitter or sensor measurement noise, because instantaneous minimum values are often random in real operating environments. Therefore, the collaborative handover module 300 applies a moving average to the cost score of each candidate gimbal over five consecutive sampling periods (a time span of approximately 0.1 seconds). By comparing the averaged cost scores, the system selects the dual-spectral electro-optical gimbal 20 with the smallest cost function value as the designated relay gimbal. After the relay device is determined, the collaborative handover module 300 sends the predicted dynamic parameters of the target and the handover time reference to the relay gimbal. Upon receiving the instruction, the relay gimbal first starts its servo system to pre-aim at the predicted handover position. Through this multi-dimensional trade-off and temporal smoothing, the system transfers master control authority smoothly between heterogeneous devices, thereby reducing the risk of target loss caused by line-of-sight obstruction at a single point.
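As a non-limiting illustration, the five-period moving average and relay selection can be sketched in Python as follows. The class name, the dictionary interface, and the choice to withhold a decision until a full window is available are assumptions made only for this example.

```python
from collections import deque


class RelaySelector:
    """Smooth per-gimbal cost scores and pick the lowest averaged cost."""

    def __init__(self, window=5):
        self.window = window
        self.history = {}

    def update(self, costs):
        """Record one sampling period of instantaneous costs {gimbal_id: cost}."""
        for gimbal_id, cost in costs.items():
            self.history.setdefault(gimbal_id, deque(maxlen=self.window)).append(cost)

    def select(self):
        """Return the gimbal with the smallest averaged cost, or None if no window is full."""
        averaged = {gid: sum(h) / len(h)
                    for gid, h in self.history.items() if len(h) == self.window}
        return min(averaged, key=averaged.get) if averaged else None


if __name__ == "__main__":
    selector = RelaySelector()
    for cost_b in (0.5, 0.5, 0.1, 0.5, 0.5):   # B has one spurious instantaneous minimum
        selector.update({"A": 0.3, "B": cost_b})
    print(selector.select())
```

Candidate B briefly shows the lowest instantaneous cost, but the averaged comparison still selects A, which is the behavior the smoothing is meant to provide.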
[0111] Field tests show that in a distributed optoelectronic defense network there is an unavoidable time lag between the issuance of a control command by the master control node and its actual execution by the electromechanical equipment. For highly maneuverable low-altitude targets, ignoring this lag causes the optical axis of the relay equipment to point at spatial coordinates that deviate significantly from the target's actual position. Against this engineering background, in this embodiment the collaborative handover module 300 introduces spatiotemporal compensation logic with time decay characteristics, which aims to overcome the divergence of the predicted trajectory caused by the combination of system delay and nonlinear target maneuvering. The specific implementation of S500 includes the following sub-steps:
[0112] S501. The spatiotemporal compensation is based on precise quantification of the time deviations caused by network communication and mechanical servo response. Observations show that network routing jitter between distributed nodes is often highly random. Furthermore, inherent differences in startup inertia among servo motor models make the delay of each scheduling operation dynamic and nonlinear. To establish a unified time reference among heterogeneous devices, the system deploys an absolute clock synchronization architecture based on the Precision Time Protocol across the entire network.
[0113] In this embodiment, the collaborative handover module 300 records the system's absolute timestamp at the moment the instruction packet is issued and takes its difference from the absolute timestamp at which the relay gimbal reports its servo pre-start state, thereby quantifying the network communication time of a single scheduling operation. This is only the network-level latency. The system also combines the current optical axis deflection angle reported by the relay gimbal's underlying driver with the estimated angular deviation of the handover point to derive the mechanical servo response time at the physical level. To incorporate these multiple latency sources into one calculation framework, the total system latency is calculated as follows:
[0114] $T_{d}=\left(t_{ack}-t_{cmd}\right)+\dfrac{\Delta\theta}{\omega}+T_{dz}$;
[0115] In the formula, $T_{d}$ is the calculated total system delay time; $t_{cmd}$ is the system absolute timestamp at the moment the collaborative handover module 300 issues the handover command; $t_{ack}$ is the absolute timestamp at which the relay gimbal has received the instruction and completed its internal state initialization; $\Delta\theta$ is the three-dimensional angle between the current optical axis of the relay gimbal and the estimated intersection point; $\omega$ is the real-time angular velocity of the relay gimbal servo motor; $T_{dz}$ is a preset dead-zone time margin that compensates for static friction during motor startup and mechanical backlash in the gears. As a preferred approach, based on a large amount of measured calibration data from field optoelectronic equipment, this time margin is typically set between 0.02 seconds and 0.05 seconds.
[0116] In the above calculation, special attention must be paid to the boundary behavior of the division term under extreme operating conditions. When the motor is completely stationary and its real-time angular velocity approaches zero, substituting it directly into the formula would cause a division-by-zero failure in the underlying program. The system therefore has a dedicated fault-tolerance mechanism: when the detected angular velocity is less than a preset minimum positive threshold, the system instead uses the rated starting angular acceleration of the motor to make an equivalent time estimate. Regarding the underlying implementation of clock synchronization based on the Precision Time Protocol, those skilled in the art can use existing well-known technologies such as IEEE 1588 to align network timestamps, which will not be elaborated upon here.
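As a non-limiting illustration, the total delay calculation with its stationary-motor fallback can be sketched in Python as follows. The text states only that the rated starting angular acceleration is used for an equivalent time estimate; the constant-acceleration-from-rest formula used below, the threshold value, and all names are assumptions made for this example.

```python
import math


def total_delay(t_cmd_s, t_ack_s, angle_deg, rate_deg_s, rated_accel_deg_s2,
                dead_zone_s=0.03, min_rate_deg_s=1e-3):
    """Network delay plus mechanical slew time plus dead-zone margin, in seconds."""
    network = t_ack_s - t_cmd_s
    if abs(rate_deg_s) < min_rate_deg_s:
        # Fallback for a stationary motor (assumed model): angle = 0.5 * accel * t^2.
        mechanical = math.sqrt(2.0 * angle_deg / rated_accel_deg_s2)
    else:
        mechanical = angle_deg / abs(rate_deg_s)
    return network + mechanical + dead_zone_s


if __name__ == "__main__":
    print(total_delay(10.000, 10.010, 30.0, 60.0, 240.0))  # motor already slewing
    print(total_delay(10.000, 10.010, 30.0, 0.0, 240.0))   # motor at rest
```

The second call exercises the fault-tolerance branch and returns a finite estimate where a direct division would fail.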
[0117] Once the total system latency has been obtained, the system can extrapolate the spatial coordinates. S502 provides the specific solution logic that avoids prediction divergence. In low-altitude defense scenarios, a constant-acceleration integral term without attenuation grows as the prediction horizon lengthens and causes the predicted coordinates to diverge. Because a real target is physically constrained by aerodynamic drag and structural overload limits, it cannot maintain peak acceleration indefinitely. On this basis, the collaborative handover module 300 reduces the confidence placed in acceleration over long horizons. Specifically, a maneuvering damping attenuation coefficient is introduced into the original extrapolation integral term, giving the following predicted coordinate formula with good convergence characteristics:
[0118] $\mathbf{P}_{pred}=\mathbf{P}(t)+\mathbf{V}(t)\,T_{d}+\tfrac{1}{2}\mathbf{A}(t)\,T_{d}^{2}\,e^{-\lambda T_{d}}$;
[0119] In the formula, $\mathbf{P}_{pred}$ is the predicted target coordinates after damping correction; $\mathbf{P}(t)$ is the filtered three-dimensional spatial coordinates of the target captured by the main control gimbal at the current observation time $t$; $\mathbf{V}(t)$ and $\mathbf{A}(t)$ are the target's three-dimensional velocity vector and three-dimensional acceleration vector at the current moment, respectively; $T_{d}$ is the total system delay time obtained above; $e$ is the base of the natural logarithm; $\lambda$ is the maneuvering damping attenuation coefficient introduced by the system.
[0120] The purpose of the attenuation coefficient is to make the influence of the acceleration term on the long-term trajectory decay exponentially, in line with physical limits on sustained maneuvering. To avoid the poor scenario adaptability of a single constant empirical value, in this embodiment the coefficient is mapped adaptively from the target's current acceleration state. Specifically, the system extracts in real time the magnitude of the target's acceleration vector from the state vector output by the Kalman filter and maps it to an effective damping range of 0.5 to 2.0. When the target shows a strong tendency to change direction, the system assigns the coefficient a relatively large value close to 2.0, which limits the quadratic divergence trend and makes the long-term prediction model degenerate smoothly into a uniform linear motion model; conversely, if the target's course remains stable, the coefficient approaches the minimum value of 0.5, preserving the effect of the original inertial integral to the greatest extent. This adaptive adjustment based on actual maneuver characteristics keeps the output guidance coordinates within a reasonable physical motion envelope regardless of fluctuations in network latency.
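As a non-limiting illustration, the damped extrapolation and the adaptive coefficient can be sketched in Python as follows. The text gives only the range 0.5 to 2.0 for the coefficient, so the linear saturating mapping, the reference acceleration, and the function names are assumptions; the sketch also assumes that the exponential damping multiplies the acceleration term only.

```python
import math


def damping_coefficient(accel_vec, accel_ref=20.0, lam_min=0.5, lam_max=2.0):
    """Map acceleration magnitude to [lam_min, lam_max] (assumed linear, saturating)."""
    magnitude = math.sqrt(sum(a * a for a in accel_vec))
    fraction = min(magnitude / accel_ref, 1.0)
    return lam_min + (lam_max - lam_min) * fraction


def damped_prediction(pos, vel, acc, delay_s):
    """Predicted coordinates after delay_s with an exponentially damped acceleration term."""
    lam = damping_coefficient(acc)
    accel_gain = 0.5 * delay_s ** 2 * math.exp(-lam * delay_s)
    return tuple(p + v * delay_s + a * accel_gain for p, v, a in zip(pos, vel, acc))


if __name__ == "__main__":
    steady = damped_prediction((0.0, 0.0, 50.0), (10.0, 0.0, 0.0), (0.5, 0.0, 0.0), 0.5)
    turning = damped_prediction((0.0, 0.0, 50.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0), 0.5)
    print(steady, turning)
```

For a hard maneuver the acceleration contribution is strongly attenuated, so the prediction stays closer to straight-line motion than an undamped quadratic extrapolation would.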
[0121] After the converged, latency-compensated coordinates are obtained, the handover task enters the cross-device execution phase (S503). The collaborative handover module 300 packages the calculated predicted coordinates and sends them directly to the designated relay gimbal. Traditional security monitoring systems typically wait passively for the target to re-enter the frame before initiating servo tracking. This image-lagged response mechanism is highly prone to breaking the tracking link completely in complex environments.
[0122] In this embodiment, after receiving the coordinate data, the relay gimbal acts in advance within the time window given by the total system delay time, driving its servo system to align the optical axis precisely with the predicted spatial position. At the same time, the underlying control unit of the relay gimbal computes the straight-line distance between the predicted coordinates and its own optical center and applies the preset lens focal-length mapping curve to this distance to complete the mechanical position adjustment of the zoom lens group.
[0123] This sequence of pre-adjustment actions ensures that once the target emerges from behind a physical obstruction, it immediately falls within the sharp depth of field and near the center of the observation area. Thanks to the pre-focusing and pre-aiming mechanisms described above, the relay gimbal is already in an observation-ready state when the main control gimbal loses the target to rigid occlusion. This staggered compensation strategy, which combines temporal pre-adjustment with spatial calculation, allows the system to construct an overlapping joint observation period on both sides of the physical blind spot through spatiotemporal coordinate alignment, thereby maintaining the continuity of the target tracking link even under extreme conditions.
[0124] In actual field defense missions, after the relay gimbal completes pre-positioning in space and successfully captures the image according to predetermined instructions, multiple moving targets with extremely similar physical size and infrared radiation characteristics usually exist simultaneously within its field of view. Faced with such complex cluster interference backgrounds, relying solely on traditional image feature matching is prone to failure when there are drastic changes in lighting or large-scale switching of the observation perspective. To accurately and uniquely inherit the previous tracking identity of the main control gimbal, in this embodiment, the geometry disambiguation module 400 introduces a multi-dimensional verification mechanism based on epipolar geometric constraints and a joint error propagation model, thereby fundamentally eliminating the risk of identity confusion. Its specific implementation logic includes the following key steps.
[0125] The specific execution of S601 is based on the strict geometric mapping relationship between heterogeneous optoelectronic devices. Because the main gimbal and the relay gimbal are typically in high-speed servo motion and continuous zoom at the moment of handover, even a millisecond-level error in the spatiotemporal alignment of multi-source data can lead to significant deviations in the subsequent geometric projection. To address this, the geometric disambiguation module 400 relies on the absolute timestamps of a precision time protocol to extract, at the same aligned sampling moment, the two-dimensional pixel coordinates of the target on the main gimbal and the real-time dynamic intrinsic parameter matrices of the two devices. Combined with the pre-calibrated relative rotation matrix and translation vector of the two gimbals in the world coordinate system, the system can then calculate the fundamental matrix that describes the two-view geometric constraint.
[0126] It is worth noting that the derivation of the fundamental matrix relies on inverting the intrinsic parameter matrices. The hardware driver layer might report abnormal transient data in which the main diagonal elements of either intrinsic parameter matrix (i.e., the equivalent physical focal lengths) approach zero. To mitigate the risk of matrix singularity, the system is configured with intrinsic parameter verification logic: when a singularity is detected in an intrinsic parameter matrix, the system retrieves the intrinsic parameter state of the previous valid frame from the buffer in the smoothing filter to participate in the calculation, thus avoiding program crashes caused by matrix singularity. After ensuring that the system state is non-singular, and based on the classical epipolar geometry mapping principle, the coordinates of the observation point on the main gimbal image plane are forward-projected onto the image plane of the relay gimbal, thereby generating the corresponding theoretical epipolar line equation:
[0127] $\mathbf{l}_r = \mathbf{F}\,\tilde{\mathbf{x}}_m$;
[0128] In the formula, $\mathbf{l}_r$ is the theoretical epipolar line vector on the image plane of the relay gimbal, typically represented in the form $[a, b, c]^{T}$; $\mathbf{F}$ is the fundamental matrix obtained after spatiotemporal alignment; $\tilde{\mathbf{x}}_m$ is the homogeneous coordinate vector of the target pixel observed by the main gimbal. From a physical perspective, this equation defines the ideal projected trajectory that the line of sight of the main gimbal should present in the field of view of the relay gimbal.
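The fundamental matrix calculation, the intrinsic parameter fallback, and the epipolar line projection can be sketched as follows. The sketch assumes the standard two-view relation F = K_relay^-T [t]x R K_main^-1 with the convention X_relay = R X_main + t, zero-skew intrinsic matrices, and a hypothetical near-zero focal length limit (`min_focal_px`); none of these details are stated explicitly in the embodiment.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose of a 3x3 matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def skew(t):
    """Cross-product matrix [t]x of a 3-vector."""
    return [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]

def intrinsic_inverse(k):
    """Closed-form inverse of K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]."""
    fx, fy, cx, cy = k[0][0], k[1][1], k[0][2], k[1][2]
    return [[1.0 / fx, 0.0, -cx / fx],
            [0.0, 1.0 / fy, -cy / fy],
            [0.0, 0.0, 1.0]]

def validated_intrinsics(k, last_valid, min_focal_px=1e-3):
    """Fall back to the previous valid frame when a focal length is near zero."""
    if abs(k[0][0]) < min_focal_px or abs(k[1][1]) < min_focal_px:
        return last_valid
    return k

def fundamental_matrix(k_main, k_relay, rotation, translation):
    """F = K_relay^-T [t]x R K_main^-1, assuming X_relay = R X_main + t."""
    essential = matmul(skew(translation), rotation)
    left = matmul(transpose(intrinsic_inverse(k_relay)), essential)
    return matmul(left, intrinsic_inverse(k_main))

def epipolar_line(f, pixel):
    """Line coefficients (a, b, c) on the relay image for a main-gimbal pixel."""
    x = (pixel[0], pixel[1], 1.0)
    return tuple(sum(f[i][j] * x[j] for j in range(3)) for i in range(3))
```

Calling `validated_intrinsics` on both matrices before `fundamental_matrix` keeps the inversion away from the singular case described above; the resulting line satisfies a·u + b·v + c = 0 for the true target pixel (u, v) on the relay image.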
[0129] However, due to the mechanical backlash of the gimbal gear transmission mechanism, the geometric magnification effect of the telephoto lens, and the slight installation deformation of the support under wind load, the actual pixel centroid of the target in the relay gimbal image deviates from the theoretical epipolar line. Since the fixed distance constraint threshold is prone to interference at the wide-angle end due to lenient conditions, and prone to rejecting legitimate targets at the telephoto end due to mechanical shake, in the specific implementation of S602, the geometric disambiguation module 400 integrates the above-mentioned multi-source physical errors and specifically constructs a joint error propagation model, based on which the dynamic epipolar tolerance band threshold is derived. The specific calculation formula is as follows:
[0130] ;
[0131] The quantities in the formula are as follows. The result is the threshold of the dynamic epipolar tolerance band; its technical purpose is to quantitatively define the reasonable physical deviation envelope on both sides of the theoretical epipolar line. The preset statistical confidence scaling factor is typically set between 2.5 and 3.5 in this embodiment, following the three-sigma principle for normally distributed errors and a large amount of field data. The geometric projection magnification factor is positively correlated with the current focal length of the relay gimbal; it is defined as the ratio of the current focal length to the sensor pixel size and quantifies how strongly minute angular deviations are magnified in the image plane at long focal lengths. The trace of the bidirectional mechanical backlash covariance matrix characterizes the overall angular displacement error of the servo system; the corresponding standard deviation is generally calibrated to be between 0.005 degrees and 0.03 degrees when the equipment is shipped from the factory. The fixed variance constant term accounts for the non-parallelism of the reference mounting planes between devices. The pixel compensation constant compensates for the sampling error of the discretized grid of the image sensor and typically ranges from 2 to 5 pixels. Through this adaptive threshold mechanism, which ties together mechanical and optical parameters, the system obtains an elastic decision boundary that automatically widens as the focal length increases and tightens as mechanical precision improves.
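One plausible way to combine the quantities listed above is sketched below. The combination rule is an assumption made for illustration, not the embodiment's formula: the two angular variance terms are added, converted to a standard deviation in radians, scaled by the magnification factor and the confidence factor, and the pixel constant is added last. Parameter names and units are hypothetical.

```python
import math

def projection_magnification(focal_length_mm, pixel_pitch_um):
    """Focal length divided by pixel size, i.e. pixels per radian."""
    return (focal_length_mm * 1e-3) / (pixel_pitch_um * 1e-6)

def tolerance_band_px(k_conf, focal_length_mm, pixel_pitch_um,
                      backlash_trace_deg2, mount_var_deg2, pixel_const_px):
    """Assumed form: k * G(f) * sqrt(tr(Sigma) + sigma_mount^2) + pixel constant."""
    sigma_deg = math.sqrt(backlash_trace_deg2 + mount_var_deg2)
    sigma_rad = math.radians(sigma_deg)
    gain = projection_magnification(focal_length_mm, pixel_pitch_um)
    return k_conf * gain * sigma_rad + pixel_const_px
```

Under this assumed form the band widens linearly with focal length and shrinks toward the pixel constant as the backlash variance falls, which matches the qualitative behaviour described in the paragraph above.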
[0132] After establishing the dynamic epipolar tolerance band, the geometric disambiguation module 400 calculates the orthogonal perpendicular distance from the pixel centroid of the i-th candidate target, extracted by image segmentation within the field of view of the relay gimbal, to the theoretical epipolar line. Furthermore, when an extreme spatial position of the target causes the theoretical epipolar line to degenerate into a point, i.e., the norm of the line normal in the denominator of the point-to-line distance formula approaches zero, the traditional formula would cause a division-by-zero error. To avoid this, the system regularizes the distance formula by adding a regularization constant to the denominator; this constant is set to 10^-6, a minimal positive real number that prevents overflow.
[0133] After obtaining the distance of each candidate, the system first selects the subset of candidate targets that satisfy the envelope constraint. If only the minimum orthogonal perpendicular distance were used as a single extreme feature for the final decision, abnormal identity reassignment could occur, because multiple targets may overlap or occlude each other at close range near the epipolar line. Therefore, this embodiment introduces kinematic features as a second-level decision dimension. Combining the normalized distance penalty term with the consistency of the apparent motion direction under the two camera views, the geometric disambiguation module 400 constructs the following multi-dimensional weighted disambiguation criterion:
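The regularized point-to-line distance can be sketched as follows, assuming the epipolar line is given by coefficients (a, b, c) of a·u + b·v + c = 0 and that the regularization constant is added to the denominator; the function name is illustrative.

```python
import math

EPSILON = 1e-6  # regularization constant from the embodiment

def epipolar_distance(line, centroid, eps=EPSILON):
    """Perpendicular pixel distance from a centroid (u, v) to the line
    a*u + b*v + c = 0, with eps guarding against a degenerate line."""
    a, b, c = line
    u, v = centroid
    return abs(a * u + b * v + c) / (math.hypot(a, b) + eps)
```

With a well-conditioned line the constant changes the result only negligibly, while a fully degenerate line with a = b = 0 returns a finite value instead of raising an error.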
[0134] ;
[0135] The quantities in the formula are as follows. The result is the index of the candidate that is finally confirmed, after the above multi-dimensional cross-validation, as the unique and legitimate relay target. The normalized distance term reflects the normalized spatial distance from the candidate target to the epipolar line. The two velocity terms are the image plane velocity vector of the target captured by the main control gimbal in the previous control cycle and the initial image plane velocity vector of the i-th candidate target in the field of view of the relay gimbal; their dot product is used to calculate the cosine similarity of the apparent motion trajectories. The two weighting coefficients apply to the spatial position dimension and the kinematic dimension, respectively. In the complex cluster crossing scenario described in this embodiment, the target's motion direction is often more stable and reliable than its instantaneous spatial position, so these two weights are preferably set to 0.4 and 0.6, respectively. This weighted decision logic, which integrates the hard constraints of epipolar geometry with the soft correlation of kinematic features, reduces the false tracking rate under extreme conditions such as long focal lengths and multiple overlapping maneuvering targets, thereby ensuring the consistency of the target's physical identity before and after the handover.
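A minimal sketch of the two-stage decision is given below. It assumes that the distance penalty enters the score as 1 minus the distance divided by the tolerance band, and that the candidate with the highest score wins; the exact form of the penalty term is an assumption, and the function names are illustrative.

```python
import math

def cosine_similarity(v1, v2):
    """Cosine of the angle between two 2D image plane velocity vectors."""
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        return 0.0  # direction undefined for a stationary target
    return (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)

def select_relay_target(candidates, main_velocity, tolerance_px,
                        w_dist=0.4, w_vel=0.6):
    """candidates: list of (distance_px, velocity) pairs.
    Returns the index of the best candidate, or None if none passes the gate."""
    best_index, best_score = None, -math.inf
    for index, (distance_px, velocity) in enumerate(candidates):
        if distance_px > tolerance_px:
            continue  # hard gate: outside the epipolar tolerance band
        score = (w_dist * (1.0 - distance_px / tolerance_px)
                 + w_vel * cosine_similarity(main_velocity, velocity))
        if score > best_score:
            best_index, best_score = index, score
    return best_index
```

In the test scenario used here, a candidate that sits closest to the epipolar line but flies in the opposite direction loses to a slightly more distant candidate whose apparent motion agrees with the main gimbal track, which is the behaviour the 0.4 and 0.6 weighting is meant to produce.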
[0136] Specific application examples:
[0137] In this embodiment, the system is deployed in the take-off and landing sensitive area of a typical civil airport. The airport is surrounded by control towers, terminal buildings, and large areas of protective forest belts. These physical barriers are scattered and the environmental characteristics are complex, which can easily interfere with the continuous tracking of low-altitude targets.
[0138] During the project implementation phase, one Ku-band phased array radar was deployed around the runway and at high points, along with four sets of dual-spectrum electro-optical gimbals 20, numbered PTZ-1 to PTZ-4 respectively. To achieve consistent spatial coordinate system representation among the heterogeneous devices, the spatial modeling module 100 in this embodiment reads a high-precision digital surface model of the area, finely dividing the entire airport's low-altitude airspace into a three-dimensional voxel grid with sides of 2 meters. Based on a preset optical resolution attenuation function and obstacle intersection logic, the system calculates a visibility mapping model covering the entire area. Based on this underlying data structure, the system can pre-calibrate airspace nodes with excellent observation conditions before target detection and clearly define the absolute physical blind zone range of buildings.
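The obstacle intersection logic behind the visibility mapping model can be illustrated with a segment versus axis-aligned bounding box test (the slab method). This is a sketch under assumptions: obstacles are modelled as axis-aligned boxes, the attenuation function is passed in as a hypothetical callable, and the voxel indexing scheme is illustrative.

```python
def voxel_center(index, origin, size=2.0):
    """Center of the voxel with integer index (i, j, k) on a grid of the given side."""
    return tuple(origin[a] + (index[a] + 0.5) * size for a in range(3))

def segment_hits_box(p0, p1, box_min, box_max):
    """True if the segment p0 -> p1 intersects the axis-aligned box."""
    t_enter, t_exit = 0.0, 1.0
    for axis in range(3):
        d = p1[axis] - p0[axis]
        if abs(d) < 1e-12:
            # Segment is parallel to this slab: it must start inside it.
            if p0[axis] < box_min[axis] or p0[axis] > box_max[axis]:
                return False
            continue
        t0 = (box_min[axis] - p0[axis]) / d
        t1 = (box_max[axis] - p0[axis]) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter, t_exit = max(t_enter, t0), min(t_exit, t1)
        if t_enter > t_exit:
            return False
    return True

def visibility_weight(optical_center, voxel, obstacles, attenuation):
    """Zero if any obstacle blocks the line of sight, else the attenuation value."""
    for box_min, box_max in obstacles:
        if segment_hits_box(optical_center, voxel, box_min, box_max):
            return 0.0
    return attenuation(optical_center, voxel)
```

Evaluating `visibility_weight` once per gimbal and voxel yields a static lookup table, so blind zones behind buildings are known before any target appears.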
[0139] During actual operation, the radar detected a large bird of prey posing a significant threat of bird strike at an altitude of approximately 150 meters above the ground. After receiving the initial three-dimensional coordinates with absolute timestamps, the collaborative control server 30 immediately compared them with the global visibility model and activated the PTZ-1, which currently has the highest line-of-sight weight and is in normal working order, as the master control gimbal for initial guidance.
[0140] When the target crosses the runway while a large passenger aircraft is taxiing for take-off, the high-temperature engine exhaust and localized smoke can cause severe transient interference in the field of view of the photoelectric sensor. In conventional systems, such sudden localized occlusion can easily cause violent fluctuations in the image centroid, leading to abnormal oscillations of the servo motor or even target loss. Field measurements, together with the dual-spectral signal-to-noise ratio gradient monitoring curves shown in Figure 4, indicate that different spectral bands have different penetration capabilities for aerosols. When smoke interference is encountered (as shown at 12.5 seconds in Figure 4), the dual-channel signal-to-noise ratio state extracted in real time shows a sharp drop in the gradient curve of the visible light channel, while the gradient of the infrared channel remains stable. Based on this physical feedback, the system dynamically adjusts the feedback gain of the PID controller, clearing the servo control weight of the visible light channel to zero and raising the weight of the infrared channel to its full-scale value of 1.0. With this unilateral suppression strategy, the main control gimbal maintains a stable lock on the target, and the servo system does not generate redundant mechanical jitter throughout the anti-interference process.
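The unilateral suppression logic can be sketched as a small weight selector. The mutation threshold, the nominal joint weights, and the choice to keep the joint weights when both channels are disturbed are assumptions for illustration only.

```python
def first_gradient(previous_snr, current_snr, dt_s):
    """Finite-difference first-order gradient of a signal-to-noise ratio sample."""
    return (current_snr - previous_snr) / dt_s

def channel_weights(grad_visible, grad_infrared, threshold,
                    nominal=(0.5, 0.5)):
    """Return (visible_weight, infrared_weight) for servo control."""
    visible_disturbed = abs(grad_visible) > threshold
    infrared_disturbed = abs(grad_infrared) > threshold
    if visible_disturbed and not infrared_disturbed:
        return (0.0, 1.0)  # isolate visible light, infrared at full scale
    if infrared_disturbed and not visible_disturbed:
        return (1.0, 0.0)  # isolate infrared, visible light at full scale
    return nominal  # no one-sided disturbance: joint weighted control

def fused_centroid(visible_centroid, infrared_centroid, weights):
    """Weighted image centroid used as the servo tracking reference."""
    wv, wi = weights
    return tuple(wv * visible_centroid[k] + wi * infrared_centroid[k]
                 for k in range(2))
```

Because the weights return to `nominal` as soon as the gradient falls back inside the threshold, the selector restores joint control without any extra state.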
[0141] As the target continues its highly maneuvering flight, the system utilizes Kalman filtering to smooth observation noise and objectively assesses the severity of the target's maneuvering by directly extracting the posterior jerk state variables continuously output by the filter. Based on the target's dynamic equations, the system extrapolates the predicted trajectory into the future time domain and calculates the expected value Ev of the visibility line integral within that trajectory segment. The prediction results indicate that the target will enter the rigid obstruction blind zone of the terminal building in approximately 3 seconds, and the lower bound of visibility will soon reach zero.
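The trajectory extrapolation and the visibility evaluation along it can be sketched as follows. The sketch assumes a constant-jerk extrapolation, approximates the expected value of the visibility line integral by the mean of equally spaced samples, and takes the visibility model as a hypothetical callable that maps a 3D point to a weight.

```python
def extrapolate(p, v, a, j, t):
    """Constant-jerk extrapolation of a 3D position over t seconds."""
    return tuple(p[i] + v[i] * t + 0.5 * a[i] * t * t + j[i] * t ** 3 / 6.0
                 for i in range(3))

def visibility_along_trajectory(visibility, p, v, a, j, horizon_s, step_s=0.1):
    """Return (mean visibility, minimum visibility) over the prediction horizon."""
    steps = int(round(horizon_s / step_s))
    samples = [visibility(extrapolate(p, v, a, j, k * step_s))
               for k in range(steps + 1)]
    return sum(samples) / len(samples), min(samples)
```

A handover would then be triggered when the mean falls below the integration safety threshold or the minimum reaches zero, as in the scenario described above.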
[0142] In response to the impending absolute occlusion, the system proactively triggers a cross-device soft handover process. The collaborative handover module, considering both the spatial angle and the remaining observation time, selects PTZ-2 as the relay device. Given the inherent total system delay due to network routing and motor startup (estimated to be approximately 150 milliseconds), directly using constant acceleration for long-distance position extrapolation could easily cause the prediction point to diverge. Therefore, as shown in the schematic diagram of blind zone prediction and two-dimensional trajectory compensation in Figure 3, the system dynamically maps the maneuvering damping attenuation coefficient λ from the extracted posterior jerk value and superimposes it on the trajectory prediction formula. This attenuation suppression model conforms to aerodynamic laws (as shown by the dashed line in Figure 3), which allows PTZ-2 to accurately pre-aim its lens axis at the expected emergence point even under delayed conditions and to complete pre-focusing while the target is still within the blind zone.
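The delay compensation can be sketched as below. Several details are assumptions for illustration: the total delay is taken as the sum of network time, slew time, and a dead zone margin; λ is mapped linearly from the jerk magnitude onto a hypothetical range; and the exponential attenuation is applied to the acceleration and jerk terms of the extrapolation only.

```python
import math

def total_delay(network_s, slew_angle_deg, slew_rate_deg_s, dead_zone_margin_s):
    """Network time plus servo slew time plus a fixed safety margin."""
    return network_s + slew_angle_deg / slew_rate_deg_s + dead_zone_margin_s

def damping_coefficient(jerk, jerk_ref=50.0, lam_min=0.0, lam_max=5.0):
    """Map the jerk magnitude linearly onto [lam_min, lam_max], saturating at jerk_ref."""
    magnitude = math.sqrt(sum(c * c for c in jerk))
    ratio = min(magnitude / jerk_ref, 1.0)
    return lam_min + (lam_max - lam_min) * ratio

def predict_position(p, v, a, j, delay_s, lam):
    """Extrapolate over the delay, damping the higher-order terms by exp(-lam * delay)."""
    decay = math.exp(-lam * delay_s)
    return tuple(
        p[i] + v[i] * delay_s
        + decay * (0.5 * a[i] * delay_s ** 2 + j[i] * delay_s ** 3 / 6.0)
        for i in range(3))
```

A stronger maneuver gives a larger λ, so the acceleration and jerk contributions are trusted less over the delay interval, which limits the divergence that a plain polynomial extrapolation would show.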
[0143] When the target finally penetrates the physical blind zone and enters the PTZ-2's pre-aiming field of view, another bird of similar size appears simultaneously in the frame due to the change in observation angle. Simply relying on image template matching often fails under such large changes in perspective. To ensure the robustness of the handover link, the geometric disambiguation module in this embodiment integrates real-time intrinsic parameters from both devices and bidirectional mechanical backlash to calculate the fundamental matrix and generate the theoretical epipolar line on the PTZ-2 image plane, thereby constructing a dynamic epipolar tolerance band that adaptively scales with focal length.
[0144] Subsequently, the system calculates the orthogonal perpendicular distance from the pixel centroid of each candidate target within the field of view to the theoretical epipolar line, initially eliminating interference terms that do not conform to the epipolar geometry constraints. Combining the cosine similarity of the apparent motion velocity direction of the target under dual-camera views, the system performs multi-dimensional weighted scoring, with the highest score confirmed as the original target. This mechanism effectively suppresses the risk of identity confusion under complex cluster interference and spatial occlusion backgrounds, achieving highly reliable identity continuity verification.
[0145] Furthermore, the overall effectiveness of the aforementioned collaborative control mechanism is verified by the system tracking robustness distribution comparison chart in Figure 5. Compared with the single-gimbal radar guidance scheme, with an average duration of only 18.6 seconds, and the conventional range-greedy handover scheme, with an average duration of 42.3 seconds, the embodiment of this invention, which integrates adaptive damping compensation and epipolar disambiguation mechanisms, achieves a median continuous stable tracking time of over 162.5 seconds. Its experimental data distribution is also more compact, demonstrating that the proposed solution has excellent continuous tracking capability for low-altitude, highly maneuverable targets and has practical engineering value.
[0146] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.
Claims
1. A method for multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking, characterized in that, Includes the following steps: Establish a geodetic coordinate system, obtain a three-dimensional elevation model and a digital surface model, discretize the low-altitude airspace into a three-dimensional voxel set, calculate the visibility weight of the dual-spectral photoelectric gimbal to the three-dimensional voxels in the three-dimensional voxel set, and construct a global visibility mapping model. Receive initial target detection data, wake up the main control gimbal to lock onto the target according to the global visibility mapping model, extract the dual-spectral signal-to-noise ratio state variables, and use the Kalman filter algorithm to estimate the target's three-dimensional motion vector and jerk vector, and construct the target dynamic equation to generate a three-dimensional predicted trajectory; Calculate the expected value of the line integral of the visibility of the three-dimensional predicted trajectory in the global visibility mapping model and the lower limit of visibility, and perform transient disturbance rejection control by combining the first-order gradient of the bispectral signal-to-noise ratio state variable. When the expected value of the visibility line integral falls below the integration safety threshold or the visibility lower bound returns to zero, calculate the comprehensive tracking cost of the surrounding candidate dual-spectrum opto-gimbals and select a relay gimbal from the surrounding candidate dual-spectrum opto-gimbals. The predicted 3D trajectory is compensated for by combining the total system delay time to generate predicted 3D coordinates and send them to the relay gimbal. During the joint observation overlap period, theoretical epipolar lines and dynamic epipolar line tolerance bands are generated. 
The orthogonal distance from the centroid of the candidate target pixel to the theoretical epipolar line is calculated. Based on the dynamic epipolar line tolerance band, matching targets are selected from the candidate targets to complete the handover.
2. The method for multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking according to claim 1, characterized in that, the steps for calculating the visibility weights of a dual-spectral opto-gimbal on three-dimensional voxels in a three-dimensional voxel set include: Static building coordinates and vegetation coordinates are extracted from the digital surface model to construct the bounding box of obstacles, and a preset spatial expansion coefficient is introduced to process the edge tolerance of the bounding box of obstacles. Construct a spatial line of sight connecting the optical center of the dual-spectrum opto-gimbal and the center point of the three-dimensional voxel. Determine whether there is geometric interference between the spatial line of sight and the bounding box of the obstacle after edge tolerance processing, and generate a rigid occlusion Boolean value. The visibility weight is obtained by multiplying the rigid occlusion Boolean value, the gimbal distance attenuation function, and the dual-spectral atmospheric penetration attenuation coefficient.
3. The method for multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking according to claim 2, characterized in that, Before obtaining visibility weights, the following is included: Based on the maximum physical focal length parameter of the lens of the dual-spectrum opto-gimbal, the preset reference physical feature size, the pixel center spacing of the opto-sensor, and the minimum pixel span threshold, a quantitative model of the optical resolution attenuation function as the gimbal distance attenuation function is constructed. By acquiring atmospheric visibility parameters, the center wavelength of the current working channel, the wavelength scattering coefficient negatively correlated with atmospheric visibility parameters, and the slant path atmospheric thickness correction factor, the dual-spectral atmospheric penetration attenuation coefficient is calculated.
4. The method for multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking according to claim 1, characterized in that, The steps for estimating the target's three-dimensional motion vector and jerk vector using the Kalman filter algorithm and constructing the target's dynamic equations include: A constant jerk model containing position, velocity, acceleration and jerk components is used as the state vector of the Kalman filter algorithm to continuously estimate the target's three-dimensional motion vector and jerk vector. Based on the filtered and corrected three-dimensional position vector estimate, three-dimensional velocity vector, three-dimensional acceleration vector, jerk vector, and derivation step size at the current observation time, the target dynamic equation is constructed through joint integration.
5. The method for multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking according to claim 1, characterized in that, The steps for implementing transient disturbance rejection control by combining the first-order gradient of the dual-spectral signal-to-noise ratio state variables include: Extract the first-order gradient of the signal-to-noise ratio (SNR) of the visible light channel and the first-order gradient of the SNR of the infrared thermal imaging channel from the dual-spectral SNR state variables. When the expected value of the line integral of visibility and the lower limit of visibility meet the set visibility conditions, and the absolute value of the first gradient of the signal-to-noise ratio of the visible light channel or the absolute value of the first gradient of the signal-to-noise ratio of the infrared thermal imaging channel exceeds the preset mutation threshold, it is determined that a one-sided transient interference has been encountered. The servo control weights of the disturbed channels that exceed the preset mutation threshold are cleared to zero, and the servo control weights of the normal channels that do not exceed the preset mutation threshold are compensated to the full scale state until the signal-to-noise ratio first gradient of the disturbed channels returns to the set range, and the dual-channel joint weighted control is restored.
6. The method for multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking according to claim 1, characterized in that, The steps for calculating the overall tracking cost of surrounding candidate dual-spectrum electro-optical gimbals and selecting a relay gimbal from these candidates include: The system obtains the spatial straight-line distance from the optical center of the surrounding candidate dual-spectral opto-gimbal to the current three-dimensional coordinates of the target, the three-dimensional spatial angle between the servo axis pointing and the predicted intersection position of the target, the remaining observation time window, and the expected value of the visibility line integral for the predicted trajectory of the target. Calculate the required servo adjustment angular velocity of the surrounding candidate dual-spectrum opto-gimbals. If the servo adjustment angular velocity exceeds the maximum rated angular velocity of the servo motor of the surrounding candidate dual-spectrum opto-gimbal, assign an out-of-bounds penalty constant value to the surrounding candidate dual-spectrum opto-gimbal. By using weighted logic, the linear distance in space, the angle in three-dimensional space, the remaining observation time window, the expected value of the integral of the visibility line, and the constant value of the over-boundary penalty are integrated to obtain the instantaneous cost score; A moving average is applied to the instantaneous cost score within a continuous sampling period, and the dual-spectrum opto-gimbal with the smallest cost function value is selected as the relay gimbal.
7. The method for multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking according to claim 1, characterized in that, the steps for compensating for the total system delay time to generate predicted three-dimensional coordinates and sending them to the relay gimbal include: Extract the system's absolute timestamp at the moment the handover command is issued and the absolute timestamp at the moment the relay gimbal feeds back the pre-start status, and quantify the network communication time consumption through differential calculation; The system obtains the three-dimensional spatial angle between the current optical axis of the relay gimbal and the estimated intersection point, the real-time angular velocity of the servo motor, and the preset compensation dead zone time margin. Combined with the network communication time consumption, the total system delay time is calculated. Extract the magnitude of the jerk vector estimated by the Kalman filter algorithm, map the magnitude of the jerk vector to the set effective damping range, and generate the maneuver damping attenuation coefficient; The total system delay time and the maneuver damping attenuation coefficient are substituted into the extrapolation integral term for exponential attenuation suppression, and the predicted three-dimensional coordinates are calculated by combining the target's three-dimensional position vector.
8. The method for multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking according to claim 1, characterized in that, the steps for generating the theoretical epipolar line and the dynamic epipolar tolerance band include: Extract the target two-dimensional pixel coordinates captured by the main control gimbal at the same aligned sampling time, as well as the real-time dynamic intrinsic parameter matrices of the main control gimbal and the relay gimbal; The fundamental matrix is calculated by combining the pre-calibrated relative rotation matrix and translation vector, and the coordinates of the observation points on the image plane of the main control gimbal are projected forward onto the image plane of the relay gimbal to generate the theoretical epipolar line; Based on the statistical confidence scaling factor, geometric projection magnification factor, trace of the two-way mechanical backlash covariance matrix, fixed variance constant term caused by non-parallelism of reference mounting planes between devices, and pixel compensation constant, the dynamic epipolar tolerance band that defines the deviation envelope on both sides of the theoretical epipolar line is calculated.
9. The method for multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking according to claim 8, characterized in that, The steps for extracting the target's two-dimensional pixel coordinates captured by the main gimbal at the same aligned sampling time, as well as the real-time dynamic intrinsic parameter matrices of the main gimbal and the relay gimbal, include: When a singularity is detected in the main diagonal element of the real-time dynamic intrinsic parameter matrix, the intrinsic parameter verification logic is triggered. The intrinsic parameter state of the previous valid frame, which is cached within the smoothing filter, is retrieved to participate in the calculation of the fundamental matrix.
10. The method for multi-gimbal coordinated dual-spectral low-altitude bird detection and continuous tracking according to claim 1, characterized in that, the steps for selecting matching targets from candidate targets and completing the handover based on the dynamic epipolar tolerance band include: Based on the dynamic epipolar tolerance band, a preliminary subset of candidate targets whose orthogonal distances satisfy the envelope constraint is selected; Extract the image plane velocity vector of the target captured by the main control gimbal, and the initial image plane velocity vector of the candidate targets in the candidate target subset within the field of view of the relay gimbal; The cosine similarity of the apparent motion trajectory is calculated by the dot product of the target image plane velocity vector and the initial image plane velocity vector. A multidimensional weighted disambiguation criterion is constructed by combining the normalized distance penalty term corresponding to orthogonal distance with cosine similarity, and the candidate target with the highest weighted score is selected as the matching target.