A BeiDou video behavior early warning method and system based on deep learning
By combining deep learning technology with satellite and video data processing, the BeiDou satellite signals are separated and verified, solving the problem of inaccurate target positioning in urban environments and achieving accurate early warning in dynamic scenarios.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications (China)
- Current Assignee / Owner
- ANHUI ZHENGXIN INFORMATION TECH CO LTD
- Filing Date
- 2026-05-15
- Publication Date
- 2026-06-12
AI Technical Summary
Between urban buildings, the BeiDou satellite signal is affected by multipath reflection and non-line-of-sight propagation, resulting in inaccurate target coordinate positioning and thus triggering false alarms from the early warning system.
By combining deep learning technology with satellite elevation trend prediction, video reflection plane geometric constraints, and short-term sliding window feature fusion, direct, single reflection, and multiple reflection signals are separated. Satellite and video frame data are used to verify the target position, construct a set of positioning equations, and eliminate positioning drift errors.
It achieves accurate target positioning when the signal reflection state changes within seconds in dynamic scenarios, avoiding misjudgment and ensuring the reliability and real-time performance of the early warning system.
Figure CN122194218A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of satellite navigation and positioning technology, and in particular to a BeiDou video behavior early warning method and system based on deep learning.
Background Technology
[0002] With the rapid spread of smart city, park security, construction site management and similar scenarios, the requirements for target positioning and behavior early warning have become extremely high. Existing technologies therefore often use the BeiDou Navigation Satellite System for target positioning. However, between urban buildings satellite signals are subject to multipath reflection and are severely affected by non-line-of-sight propagation, while traditional positioning relies only on direct signals. Because of the orbital motion of the satellite and the movement of the target terminal, the two form a dual dynamic motion system. The relative positions of the obstructions around the target, the reflecting planes and the signal propagation path change rapidly in real time, so the direct and reflected states of the signal change irregularly. A direct signal can suddenly become a single reflection because of an obstruction, a single reflection can suddenly become a multiple reflection because of additional obstructions, or a reflected signal can suddenly become direct again when the obstruction disappears. Traditional signal separation results lag far behind the actual signal state, so direct signals are misjudged as reflections and reflections are misjudged as direct signals, which leads to target coordinate drift and causes false alarms in the early warning system.
[0003] Therefore, it is necessary to propose a deep learning-based BeiDou video behavior early warning method and system to solve the above problems.
Summary of the Invention
[0004] The purpose of this invention is to provide a BeiDou video behavior early warning method and system based on deep learning, so as to solve the problem of inaccurate target coordinate positioning caused by rapid changes in signal state, which in turn leads to false alarms in the early warning system.
[0005] To achieve the above objectives, the present invention provides the following technical solution: A deep learning-based BeiDou video behavior early warning method includes the following steps: For satellite signals received by the positioning terminal at the target location, the signal state is predicted by the satellite elevation angle trend, the number of intersections between the signal and the reflection plane in the video frame is used to constrain the scene, and the signal characteristics are calculated by a short-time sliding window to separate the signal. The separated signals include direct signals, single reflection signals and multiple reflection signals. The direct or reflected path corresponding to each signal is determined, and the three-dimensional coordinates of the satellite are obtained. Using the satellite's elevation angle and the signal's angle of arrival, the three-dimensional coordinates of the reflection points on the reflection path of a single reflected signal are calibrated. The reflection is segmented based on the reflection points on the reflection path of multiple reflected signals. The reflection points of the reflection segments are adjusted using the path length residuals calculated from the reflection segments to obtain the three-dimensional coordinates of the reflection points on the reflection path of multiple reflected signals. Combining the direct path length and the reflection path length, a set of equations for the direct signal and a set of equations for the reflected signal are constructed. After solving them, the real-time target coordinates are obtained. By using deep learning technology, the video positioning trajectory of the target can be obtained by combining the pixel coordinates and motion trajectory of the target in the video frame with the camera installation parameters. 
The real-time target coordinates are verified using the video positioning trajectory to obtain the target's true location. The target's true location is compared with the preset warning area to determine whether to issue a warning.
[0006] Preferably, the step of using satellite elevation angle trends to predict signal state, using the number of intersections between the signal and the reflection plane in the video frame to constrain the scene, and using a short-time sliding window to calculate signal features for signal separation includes: The satellite orbital parameters are obtained, and the satellite's elevation angle and elevation angle change rate are calculated to obtain the elevation angle trend. The elevation angle trend is used to predict the signal state and obtain the prediction results. The prediction results include elevation angle prediction of a direct signal, elevation angle prediction of single reflection, and elevation angle prediction of multiple reflections. Video images of the actual scene are acquired, video frames are extracted from the video images, all pixel regions that can reflect signals are marked in the video frames, and the pixel regions are mapped onto a spatial plane to obtain the reflecting plane. Based on the azimuth and elevation angles of the satellite in the satellite orbit parameters, combined with the location of the positioning terminal, a ray signal is generated. The ray signal is then projected onto the video image to obtain the pixel path of the ray signal. Along the pixel path of the ray signal, the number of intersections with the reflection plane is determined to obtain the constraint result, which includes no reflection, single-surface reflection, and multi-surface reflection.
Set up a data buffer window for the satellite, cache the feature values in the original satellite signal into the data buffer window in time sequence, and update the feature values in the data buffer window according to the sliding step size of the short-time sliding window; For the feature values of the data buffer window, calculate the mean and variance of the feature values of the data buffer window to represent the overall stability of the signal within the current data buffer window; For the data buffer window, the incremental delay jitter and incremental phase distortion of the feature values are calculated according to the latest set of feature values updated by the sliding step size, which are used to represent the changing trend of the signal in the current data buffer window; Based on the mean and variance of the feature values of the data cache window, incremental delay jitter and incremental phase distortion are fused to obtain feature results, which include feature stability, feature distortion and feature severe distortion. Signal separation is performed based on the prediction results, constraint results, and feature results.
[0007] Preferably, the step of performing signal separation based on the prediction result, constraint result, and feature result includes: If the elevation angle prediction is direct, the constraint result is no reflection, and the feature result is stable, the signal is determined to be a direct signal. If the elevation angle prediction is single reflection, the constraint result is single-surface reflection, and the feature result is distortion, the signal is determined to be a single reflection signal. If the elevation angle prediction is multiple reflections, the constraint result is multi-surface reflection, and the feature result is severe distortion, the signal is determined to be a multiple reflection signal.
[0008] Preferably, the step of calibrating the three-dimensional coordinates of the reflection point on the reflection path of a single reflected signal using the satellite's elevation angle and the signal's angle of arrival includes: Based on the satellite's elevation angle, calculate the direction vector of the incident ray from the satellite to the reflection point, and construct the incident ray parameter equation using the satellite's three-dimensional coordinates as the starting point of the incident ray. Based on the signal arrival angle, calculate the direction vector of the reflected ray from the reflection point to the positioning terminal, and construct the reflected ray parameter equation using the positioning terminal's three-dimensional coordinates as the ending point of the reflected ray. Based on the reflection plane of a single reflected signal, the spatial plane parameters of the reflection plane are extracted, and combined with the coordinate system of the satellite, the plane equation of the reflection plane of the single reflected signal is determined. Since the reflection point lies on both the incident ray and the reflected ray, we set the coordinates of the reflection point on the incident ray and the reflected ray to be equal. After parameter elimination, we obtain the spatial line equation of the reflection point. We then solve the spatial line equation and the plane equation of the reflection plane simultaneously to obtain the three-dimensional coordinates of the reflection point.
[0009] Preferably, the step of segmenting the reflection based on the reflection points along the reflection path of the multiple reflected signals, and adjusting the reflection points of the segmented reflections using the path length residuals calculated from the reflection segments to obtain the three-dimensional coordinates of the reflection points along the reflection path of the multiple reflected signals includes: Based on the propagation sequence and reflection points of the signal, the reflection path of the multiple reflected signal is divided into continuous and non-overlapping reflection segments. For each reflection segment, the corresponding path length is calculated. The measured path length of each reflection segment is determined from the satellite data, and the difference between the calculated path length of each reflection segment and its measured path length is computed to generate the piecewise residual. Using the piecewise residual as a correction amount, the three-dimensional coordinates of the reflection points in the reflection segments are corrected to obtain the adjusted three-dimensional coordinates of the reflection points.
[0010] Preferably, the step of using the piecewise residual as a correction amount to correct the three-dimensional coordinates of the reflection points in the reflection segments, thereby obtaining the adjusted three-dimensional coordinates of the reflection points, includes: Extract the normal vector of the reflection plane on which the reflection point is located. Determine the adjustment direction of the reflection point along the normal vector from the sign of the piecewise residual of each reflection segment. Based on the adjustment direction, generate the coordinate adjustment amount from the length correction of the piecewise residual along the normal vector, combined with the magnitude of the normal vector. Calculate the adjusted three-dimensional coordinates of the reflection point from the coordinate adjustment amount.
[0011] Preferably, the step of obtaining the video positioning trajectory of the target using deep learning technology, by combining the pixel coordinates and motion trajectory of the target in the video frame with the camera installation parameters, includes: A deep learning model is used to detect video frames, locate the pixel box of the target in the video frame image, and extract the center pixel coordinates of the pixel box. By concatenating the center pixel coordinates of consecutive frames according to the time sequence, a sequence of motion trajectory points is obtained; Obtain the camera's installation angle parameters and, in conjunction with the camera's intrinsic parameters, construct a mapping model between pixel coordinates and geographic coordinates. Based on the mapping model, the center pixel coordinates of the motion trajectory point sequence are converted into geographic coordinates to generate a geographic coordinate point sequence; Based on the time series, the geographic coordinate point sequence of consecutive frames is time-series sliding to generate the video positioning trajectory of the target.
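The pixel-to-ground mapping in this step can be illustrated with a minimal sketch. It assumes a pinhole camera with no lens distortion, a flat ground plane, and a camera that is only pitched downward; the function names, the parameters `fx`, `fy`, `cx`, `cy`, `height_m` and `pitch_rad`, and the local metric output frame are hypothetical choices for illustration and are not taken from the patent. Conversion from this local frame to geographic coordinates is omitted.

```python
import math

def bbox_center(x_min, y_min, x_max, y_max):
    """Center pixel of a detection box (the box itself would come from a detector)."""
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)

def pixel_to_ground(u, v, fx, fy, cx, cy, height_m, pitch_rad):
    """Project pixel (u, v) onto the ground plane.

    Returns (right, forward) in meters relative to the point directly below
    the camera, or None when the pixel ray does not hit the ground.
    """
    xc = (u - cx) / fx          # normalized image coordinate, right
    yc = (v - cy) / fy          # normalized image coordinate, down
    descent = yc * math.cos(pitch_rad) + math.sin(pitch_rad)
    if descent <= 0:
        return None             # ray points at or above the horizon
    t = height_m / descent      # scale at which the ray reaches the ground
    forward = t * (math.cos(pitch_rad) - yc * math.sin(pitch_rad))
    return (t * xc, forward)

def video_track(boxes, **camera):
    """Map a time-ordered list of boxes to ground points (assumed camera model)."""
    return [pixel_to_ground(*bbox_center(*box), **camera) for box in boxes]
```

For example, a camera mounted 10 m high and pitched down by 45 degrees maps its principal point to a ground point about 10 m in front of the camera.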
[0012] Preferably, the step of verifying the real-time target coordinates using the video positioning trajectory to obtain the target's true location includes: Spatiotemporal alignment of real-time target coordinates and video positioning trajectory is performed to match real-time target coordinates and video positioning trajectory at the same moment. Calculate the spatial difference between the real-time target coordinates and the video positioning trajectory at the same moment, and compare it with a preset deviation threshold for judgment: If the spatial position difference is less than or equal to the deviation threshold, the real-time target coordinates are deemed valid and used as the target's true position. If the spatial position difference is greater than the deviation threshold, it is determined that there is an anomaly in the real-time target coordinates. The real-time target coordinates are then adjusted using the video positioning trajectory to obtain the adjusted true position of the target.
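A minimal sketch of this verification step follows. It assumes that both position sources are already expressed in the same planar metric frame, that time alignment is done by nearest timestamp, and that an anomalous satellite fix is simply replaced by the video position. The patent does not specify the adjustment rule, so that replacement, the function names, and the `max_dt_s` tolerance are assumptions.

```python
import math

def nearest_sample(track, t):
    """Return the (timestamp, (x, y)) sample of the video track closest to time t."""
    return min(track, key=lambda sample: abs(sample[0] - t))

def verify_fix(fix_time, fix_xy, track, threshold_m, max_dt_s=0.1):
    """Check a satellite fix against the video trajectory.

    Returns (position, is_valid), or None when no video sample is close
    enough in time to compare against.
    """
    sample_time, video_xy = nearest_sample(track, fix_time)
    if abs(sample_time - fix_time) > max_dt_s:
        return None
    if math.dist(fix_xy, video_xy) <= threshold_m:
        return fix_xy, True       # deviation within threshold: keep the fix
    return video_xy, False        # assumed adjustment: fall back to video position
```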
[0013] Preferably, the step of comparing the target's actual location with a preset warning area to determine whether to issue a warning includes: Perform geospatial modeling on the preset warning area to generate a set of geographic coordinate boundaries for the warning area; The system determines the inclusion relationship between the geographic coordinates of the target's actual location and the set of geographic coordinate boundaries. If the geographic coordinates are included in the set of geographic coordinate boundaries, an alert is triggered.
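The inclusion test can be sketched with the standard ray-casting (even-odd) rule, assuming the warning area is modeled as a simple polygon of planar coordinates. The function names are hypothetical, and the treatment of points that lie exactly on the boundary is not specified by the patent and is left undefined here.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting: count crossings of a ray going in the +x direction."""
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        if (y1 > y) != (y2 > y):                      # edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def should_warn(position, warning_area):
    """Trigger a warning when the verified position lies inside the area."""
    return point_in_polygon(position[0], position[1], warning_area)
```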
[0014] A deep learning-based BeiDou video behavior early warning system includes: The signal separation module is used to separate the raw satellite signals received by the positioning terminal into direct signals, single-reflection signals, and multiple-reflection signals; The calibration module is used to calibrate the coordinates of the reflection point of a single reflection signal, as well as to perform reflection segmentation and path residual calculation for multiple reflection signals, and adjust the coordinates of the reflection point along the normal vector of the reflection plane; The positioning module is used to calculate the lengths of the direct path and the reflected path based on the coordinates of the reflection point. By constructing a set of positioning equations for the direct signal and the reflected signal, the real-time target coordinates are calculated. The trajectory generation module is used to identify targets in video frames. Based on the center coordinates of the target pixel box and the camera installation parameters and intrinsic parameters, the generated mapping model is used to convert the pixel trajectory into a video positioning trajectory. The verification module is used to verify the real-time target coordinates against the video positioning trajectory and determine the validity of the target coordinates. The early warning module is used to determine the spatial inclusion relationship between the target's actual location and the early warning area in order to determine whether an early warning should be triggered.
[0015] The technical effects and advantages of the present invention in the above technical solution are as follows: 1. By fusing satellite elevation trend prediction, video reflection plane geometric constraints, and short-time sliding window features, the system can separate signals even when the signal reflection state changes within a fraction of a second. This solves the lag and distortion of traditional signal separation that lead to misjudgment. By distinguishing between direct, single-reflection, and multiple-reflection signals, the system makes full use of satellite signal resources and avoids discarding effective signals, which could otherwise cause positioning failure.
[0016] 2. This invention constructs a joint positioning equation set of direct and reflected signals by calibrating the coordinates of a single reflection point and using piecewise residual compensation to correct multiple reflection points. This transforms previously unusable reflected signals into effective positioning observations, eliminates positioning drift errors caused by multipath reflections, and ensures the accuracy of target positioning in dynamic scenarios.
[0017] 3. This invention generates the target's video positioning trajectory through deep learning, verifies the deviation of the target's real-time coordinates, automatically identifies and corrects positioning drift anomalies, and integrates the continuity of satellite positioning with the authenticity of video positioning to obtain a reliable true target position, thus solving the problem that positioning based on satellite positioning alone is easily distorted.
Attached Figure Description
[0018] Figure 1 is a flowchart of a BeiDou video behavior early warning method based on deep learning according to the present invention.
[0019] Figure 2 is a structural diagram of a BeiDou video behavior early warning system based on deep learning according to the present invention.
Detailed Implementation
[0020] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0021] Embodiment 1: As shown in Figure 1, this embodiment provides a BeiDou video behavior early warning method based on deep learning, including the following steps: S1: For satellite signals received by the positioning terminal at the target location, the signal state is predicted by the satellite elevation angle trend, the number of intersections between the signal and the reflection plane in the video frame is used to constrain the scene, and the signal features are calculated by a short-time sliding window to separate the signal. The separated signals include direct signals, single reflection signals and multiple reflection signals. S2: Determine the direct or reflected path for each signal, obtain the satellite's three-dimensional coordinates, and use the satellite's elevation angle and the signal's angle of arrival to calibrate the three-dimensional coordinates of the reflection points on the reflection path of a single reflected signal. Perform reflection segmentation based on the reflection points on the reflection path of multiple reflected signals, and use the path length residual calculated from the reflection segmentation to adjust the reflection points of the reflection segments to obtain the three-dimensional coordinates of the reflection points on the reflection path of multiple reflected signals. Combine the direct path length and the reflection path length to construct the direct signal equation system and the reflection signal equation system, and solve them to obtain the real-time target coordinates.
S3: Using deep learning technology, the target's video positioning trajectory is obtained by combining the pixel coordinates and motion trajectory of the target in the video frame with the camera's installation parameters; S4: Verify the real-time target coordinates using the video positioning trajectory to obtain the target's true location; S5: Based on the target's actual location, compare it with the preset warning area to determine whether to issue a warning.
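The equation solving at the end of step S2 is not spelled out in this section, so the following is only one possible reading, offered as a sketch. It assumes each direct signal gives a range to the satellite, and each reflected signal gives a range to its last calibrated reflection point (the total path minus the part travelled before that point). The terminal position is then found by Gauss-Newton least squares. Receiver clock bias is ignored, and all function names are hypothetical.

```python
import math

def reflected_observation(total_path, satellite, reflection_points):
    """Turn a reflected path into (anchor, range) using the last reflection point."""
    chain = [satellite, *reflection_points]
    travelled = sum(math.dist(a, b) for a, b in zip(chain, chain[1:]))
    return reflection_points[-1], total_path - travelled

def solve3(a, b):
    """Solve the 3x3 linear system a*x = b with Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(a)
    solution = []
    for k in range(3):
        m = [row[:] for row in a]
        for i in range(3):
            m[i][k] = b[i]
        solution.append(det(m) / d)
    return solution

def locate(anchors, ranges, guess, iterations=20):
    """Gauss-Newton fit of a 3D position to ranges from known anchor points."""
    x = list(guess)
    for _ in range(iterations):
        jtj = [[0.0] * 3 for _ in range(3)]
        jtr = [0.0] * 3
        for anchor, rho in zip(anchors, ranges):
            dist = math.dist(x, anchor)
            unit = [(x[i] - anchor[i]) / dist for i in range(3)]  # Jacobian row
            residual = rho - dist
            for i in range(3):
                jtr[i] += unit[i] * residual
                for j in range(3):
                    jtj[i][j] += unit[i] * unit[j]
        step = solve3(jtj, jtr)
        x = [x[i] + step[i] for i in range(3)]
    return tuple(x)
```

At least three anchors that are not collinear with the guess are needed for the normal matrix to be invertible; a production solver would also estimate the receiver clock bias as a fourth unknown.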
[0022] In one embodiment of the present invention, the step of predicting the signal state using satellite elevation angle trends, constraining the scene by the number of intersections between the signal and the reflection plane in the video frame, and calculating signal features using a short-time sliding window for signal separation includes: S11: Obtain satellite orbit parameters, calculate satellite elevation angle and elevation angle change rate to obtain the elevation angle trend, use the elevation angle trend to predict the signal state, and obtain prediction results. The prediction results include elevation angle prediction of a direct signal, elevation angle prediction of single reflection, and elevation angle prediction of multiple reflections. S12: Acquire video images of the actual scene, extract video frames from the video images, mark all pixel regions that can reflect signals in the video frames, and map the pixel regions onto the spatial plane to obtain the reflecting plane; S13: Based on the azimuth and elevation angles of the satellite in the satellite orbit parameters, combined with the location of the positioning terminal, a ray signal is generated, and the ray signal is projected onto the video image to obtain the pixel path of the ray signal; S14: Along the pixel path of the ray signal, determine the number of intersections with the reflection plane to obtain the constraint result, which includes no reflection, single-surface reflection, and multi-surface reflection; S15: Set up a data buffer window for the satellite, cache the feature values in the original satellite signal into the data buffer window in time sequence, and update the feature values in the data buffer window according to the sliding step size of the short-time sliding window; S16: For the feature values of the data buffer window, calculate the mean and variance of the feature values of the data buffer window to represent the overall stability of the signal within the current data buffer window;
S17: For the data buffer window, calculate the incremental delay jitter and incremental phase distortion of the feature values according to the latest set of feature values updated by the sliding step size, which are used to represent the changing trend of the signal in the current data buffer window. S18: Based on the mean and variance of the feature values of the data cache window, the incremental delay jitter and incremental phase distortion are fused to obtain the feature results, which include feature stability, feature distortion and feature severe distortion. S19: Separate the signal based on the prediction results, constraint results, and feature results.
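Step S11 can be sketched as follows, assuming the satellite's line of sight is already available as an east/north/up vector relative to the positioning terminal. The 35° and 15° bands are the ones quoted later in this embodiment. The source does not quantify "stable", "slowly" or "rapidly" for the change rate, so this sketch computes the rate but classifies on the elevation band alone. All function names are hypothetical.

```python
import math

# Elevation bands quoted in the embodiment (degrees).
DIRECT_MIN_DEG = 35.0
SINGLE_MIN_DEG = 15.0

def elevation_deg(east, north, up):
    """Elevation of the satellite above the local horizon, in degrees."""
    return math.degrees(math.atan2(up, math.hypot(east, north)))

def elevation_rate(previous_deg, current_deg, dt_s):
    """Finite-difference elevation change rate in degrees per second."""
    return (current_deg - previous_deg) / dt_s

def predict_state(el_deg):
    """Coarse prediction from the elevation band only (rate criteria omitted)."""
    if el_deg > DIRECT_MIN_DEG:
        return "direct"
    if el_deg >= SINGLE_MIN_DEG:
        return "single"
    return "multiple"
```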
[0023] The step of performing signal separation based on the prediction result, constraint result, and feature result includes: S191: If the elevation angle prediction is direct, the constraint result is no reflection, and the feature result is stable, the signal is determined to be a direct signal; S192: If the elevation angle prediction is single reflection, the constraint result is single-surface reflection, and the feature result is distortion, the signal is determined to be a single reflection signal; S193: If the elevation angle prediction is multiple reflections, the constraint result is multi-surface reflection, and the feature result is severe distortion, the signal is determined to be a multiple reflection signal.
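The three-way agreement rule of S191 to S193 can be written as a small lookup, shown here as a sketch. The string labels are hypothetical encodings of the three results. The source does not say what happens when the three indicators disagree, so returning `None` in that case is an assumption.

```python
# Each key is (elevation prediction, constraint result, feature result).
SEPARATION_RULES = {
    ("direct", "no_reflection", "stable"): "direct",
    ("single", "single_surface", "distorted"): "single_reflection",
    ("multiple", "multi_surface", "severely_distorted"): "multiple_reflection",
}

def separate_signal(prediction, constraint, feature):
    """Return the signal class when all three indicators agree, else None."""
    return SEPARATION_RULES.get((prediction, constraint, feature))
```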
[0024] In this embodiment of the invention, as shown in steps S11-S19 above, the following signal separation method is designed to address the lag and distortion in signal separation caused by instantaneous changes in the reflection state of satellite signals in dynamic scenarios. Satellite orbit parameters are obtained from the BeiDou satellite navigation message. Combined with the position of the positioning terminal, the instantaneous elevation angle and elevation angle change rate of each satellite are calculated to obtain the elevation angle trend, and analyzing this trend allows a preliminary prediction of the signal type. Specifically, when the elevation angle θ > 35° and the elevation angle change rate is stable, for example..., the signal is predicted to be a direct signal; if 15° ≤ θ ≤ 35° and the elevation angle change rate decreases slowly, it is predicted to be a single reflection signal; if θ < 15° and the elevation angle change rate decreases rapidly, it is predicted to be a multiple reflection signal. The satellite orbital parameters specifically include the satellite elevation angle, satellite azimuth angle, and satellite three-dimensional coordinates. To constrain the signal's scene, all pixel regions capable of reflecting the signal are marked in the video frames. These pixel regions include vertical and horizontal reflective surfaces. Using parameters built into the camera, these pixel regions are mapped onto a geospatial occlusion plane, thus obtaining the reflecting plane. The generated ray is projected onto the video image, and the pixel path of the ray signal is obtained by determining the area it passes through in the video frame. Here, the ray refers to the theoretical direct ray from the satellite to the positioning terminal.
Along the pixel path of the ray through the video frame, the number of intersections with the reflecting planes is determined. Specifically, for no reflection, it must be confirmed that the incident signal does not intersect any reflecting plane; for single-surface reflection, that it intersects exactly one reflecting plane; and for multi-surface reflection, that it intersects two or more reflecting planes. This constrains the scene of the incident signal. To compute signal features over a short-time sliding window that reflects signal changes, a data buffer window is set up to cache the feature data of the signal. The data buffer window includes a delay jitter sub-window and a phase distortion sub-window, corresponding to the delay jitter and phase distortion features of the signal. The two sub-windows are cached, updated, and calculated synchronously, and follow a first-in, first-out (FIFO) caching rule. The maximum cache capacity of each sub-window is 5 sets, corresponding to a 50 ms duration at a 10 ms sampling interval. Feature calculation is performed only when the amount of cached data in the window is greater than or equal to 3 sets. Sub-window data are updated under the FIFO rule in 10 ms increments. When a new feature is sent to the sub-window and the cached data has reached the maximum capacity, the oldest set of feature data is automatically removed; if the cached data has not reached the maximum capacity, the new feature data is simply appended until the sub-window is full. Feature calculation covers statistical features and incremental features.
Statistical features are the mean and variance of the features cached in the sub-window, which reflect the signal's state within 50 ms. Incremental features are the incremental delay jitter and incremental phase distortion, which reflect the real-time changes of the signal. Signal separation is performed based on the prediction results, constraint results, and feature results. If the elevation angle prediction is direct, the constraint result is no reflection, and the feature result is stable, the signal is determined to be a direct signal. If the elevation angle prediction is single reflection, the constraint result is single-surface reflection, and the feature result is distortion, the signal is determined to be a single reflection signal. If the elevation angle prediction is multiple reflections, the constraint result is multi-surface reflection, and the feature result is severe distortion, the signal is determined to be a multiple reflection signal. Through this signal separation, changes in the signal reflection state in dynamic scenes can be predicted in advance and tracked, thereby eliminating signal separation lag and result distortion. Furthermore, the short-time sliding window reduces the computational load of signal separation, meeting the real-time requirements of behavioral early warning.
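The sub-window behavior described in this embodiment (FIFO, capacity of 5 sets, calculation once at least 3 sets are cached) can be sketched with a bounded deque. The class and method names are hypothetical. The source does not define how the incremental feature is computed or which variance estimator is used, so the difference between the two newest samples and the population variance are assumptions.

```python
from collections import deque
from statistics import fmean, pvariance

class FeatureWindow:
    """One sub-window, for example delay jitter or phase distortion samples."""

    def __init__(self, capacity=5, min_samples=3):
        # A deque with maxlen drops the oldest entry automatically when full.
        self.samples = deque(maxlen=capacity)
        self.min_samples = min_samples

    def push(self, value):
        """Cache one new feature value (one 10 ms step in the embodiment)."""
        self.samples.append(value)

    def features(self):
        """Return (mean, variance, increment), or None below the minimum fill."""
        if len(self.samples) < self.min_samples:
            return None
        values = list(self.samples)
        increment = values[-1] - values[-2]   # assumed incremental feature
        return fmean(values), pvariance(values), increment
```

One such window would be kept per feature and per satellite, with both sub-windows pushed in the same 10 ms step so that they stay synchronized.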
[0025] In one embodiment of the present invention, the step of calibrating the three-dimensional coordinates of the reflection point on the reflection path of a single reflected signal using the satellite's elevation angle and the signal's angle of arrival includes: S21: Based on the satellite's elevation angle, calculate the direction vector of the incident ray from the satellite to the reflection point, and construct the incident ray parameter equation using the satellite's three-dimensional coordinates as the starting point of the incident ray. Based on the signal arrival angle, calculate the direction vector of the reflected ray from the reflection point to the positioning terminal, and construct the reflected ray parameter equation using the positioning terminal's three-dimensional coordinates as the ending point of the reflected ray. S22: Based on the reflection plane of a single reflection signal, extract the spatial plane parameters of the reflection plane, and combine them with the coordinate system of the satellite to determine the plane equation of the reflection plane of the single reflection signal. S23: Since the reflection point is simultaneously on both the incident ray and the reflected ray, let the coordinates of the reflection point of the incident ray and the reflected ray be equal. After parameter elimination, the spatial line equation of the reflection point is obtained. Solve the spatial line equation and the plane equation of the reflection plane simultaneously to obtain the three-dimensional coordinates of the reflection point.
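Steps S21 to S23 amount to intersecting two rays with the reflection plane and requiring that they meet at the same point. A minimal sketch follows, assuming the plane is given as `normal · x + d = 0`, the incident direction points away from the satellite, and the arrival direction points from the reflection point toward the terminal. The function names and the tolerance `tol` are hypothetical.

```python
import math

def intersect_ray_plane(origin, direction, normal, d):
    """Point where origin + t*direction (t >= 0) meets the plane normal.x + d = 0."""
    denom = sum(n * k for n, k in zip(normal, direction))
    if abs(denom) < 1e-12:
        return None                      # ray is parallel to the plane
    t = -(sum(n * o for n, o in zip(normal, origin)) + d) / denom
    if t < 0:
        return None                      # plane lies behind the ray origin
    return tuple(o + t * k for o, k in zip(origin, direction))

def reflection_point(satellite, incident_dir, terminal, arrival_dir, normal, d, tol=1e-6):
    """Reflection point shared by the incident ray, the reflected ray and the plane."""
    p_incident = intersect_ray_plane(satellite, incident_dir, normal, d)
    back_dir = tuple(-k for k in arrival_dir)   # trace back from the terminal
    p_reflected = intersect_ray_plane(terminal, back_dir, normal, d)
    if p_incident is None or p_reflected is None:
        return None
    if math.dist(p_incident, p_reflected) > tol:
        return None                      # the two rays disagree: no consistent point
    return p_incident
```

With noisy angle measurements the two intersection points will rarely coincide exactly, so a practical tolerance would be much larger than the default used here.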
[0026] The step of segmenting the reflection based on the reflection points along the reflection path of the multiple reflected signals, adjusting the reflection points of the segmented reflections using the path length residuals calculated from the reflection segments, and obtaining the three-dimensional coordinates of the reflection points along the reflection path of the multiple reflected signals includes: S24: Based on the propagation sequence and reflection point of the signal, the reflection path of the multiple reflected signals is divided into continuous and non-overlapping reflection segments; S25: For each reflection segment, calculate the corresponding path length; S26: Determine the measured path length of the reflection segment using satellite data, and calculate the difference between the path length of each reflection segment and the actual path length to generate the segment residual; S27: Use the piecewise residual as a correction amount to correct the three-dimensional coordinates of the reflection points in the reflection segments, and obtain the adjusted three-dimensional coordinates of the reflection points.
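Steps S24 to S26 can be sketched as follows, assuming the path is an ordered list of 3D points from the satellite through the reflection points to the terminal, and that a measured length is available for each segment. The residual is taken here as computed length minus measured length; the source only says "difference", so this sign convention and the function names are assumptions.

```python
import math

def segment_lengths(path_points):
    """Lengths of consecutive, non-overlapping segments along the path."""
    return [math.dist(a, b) for a, b in zip(path_points, path_points[1:])]

def piecewise_residuals(path_points, measured_lengths):
    """Computed minus measured length for each reflection segment."""
    computed = segment_lengths(path_points)
    if len(computed) != len(measured_lengths):
        raise ValueError("one measured length is required per segment")
    return [c - m for c, m in zip(computed, measured_lengths)]
```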
[0027] The step of using the piecewise residual as a correction factor to correct the three-dimensional coordinates of the reflection points in the reflection segments, and obtaining the adjusted three-dimensional coordinates of the reflection points, includes: S271: Extract the normal vector of the reflection plane based on the reflection plane where the reflection point is located; S272: Determine the adjustment direction of the reflection point along the normal vector based on the sign of the residual of each segment of the reflection segment; S273: Based on the adjustment direction, the coordinate adjustment amount is generated by using the length correction amount of the piecewise residual on the normal vector and combining it with the magnitude of the normal vector; S274: Calculate the three-dimensional coordinates of the reflection point after adjustment based on the coordinate adjustment amount.
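Steps S24-S27 and S271-S274 can be sketched as follows. This is a hedged illustration rather than the method as filed: the function names, the damping factor `gain`, the made-up measured lengths, and the sign convention (normal oriented toward the terminal side of the plane) are assumptions. Summing the residuals of the two segments that adjoin the reflection point is likewise an illustrative choice.

```python
import math

def segment_lengths(path_points):
    """Theoretical length of each consecutive segment of a polyline path."""
    return [math.dist(p, q) for p, q in zip(path_points, path_points[1:])]

def segment_residuals(theoretical, measured):
    """Residual per segment: theoretical length minus measured length."""
    return [t - m for t, m in zip(theoretical, measured)]

def adjust_along_normal(point, normal, residual, gain=0.5):
    """Shift a reflection point along the unit normal of its plane.

    Assumption: the normal points toward the side of the plane where the
    terminal is, so a positive residual (path too long) moves the point
    along +normal, which shortens the path; a negative residual moves it
    the other way. `gain` is a hypothetical damping factor.
    """
    magnitude = math.sqrt(sum(c * c for c in normal))
    step = gain * residual / magnitude  # divide by |normal| to normalise
    return tuple(p + step * c for p, c in zip(point, normal))

# Toy path (hypothetical values): satellite -> reflection point -> terminal.
satellite = (-990.0, 0.0, 1010.0)
reflection = (10.0, 0.0, 10.0)
terminal = (0.0, 0.0, 0.0)
normal = (-2.0, 0.0, 0.0)  # deliberately non-unit, points toward the terminal

theoretical = segment_lengths([satellite, reflection, terminal])
measured = [theoretical[0] - 0.4, theoretical[1] - 0.6]  # made-up measurements
residuals = segment_residuals(theoretical, measured)
adjusted = adjust_along_normal(reflection, normal, sum(residuals))
new_total = sum(segment_lengths([satellite, adjusted, terminal]))
```

With a positive total residual the point moves toward the terminal side and the recomputed path becomes shorter. In practice the adjustment would be repeated until the residual falls below a threshold.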
[0028] In this embodiment of the invention, as shown in steps S21-S27 above, the three-dimensional coordinates of the reflection point on the reflection path of a single reflected signal are obtained as follows. The spatial direction vector from the satellite to the terminal is calculated using the satellite elevation and azimuth angles, and, based on the law of reflection, the unit vector of the incident direction from the satellite to the reflection point is derived. The three-dimensional coordinates of the satellite are obtained from the satellite orbit parameters and are used as the starting point of the incident ray to construct the parametric equation of the satellite incident ray. The unit vector of the reflection direction from the reflection point to the terminal is calculated from the azimuth and elevation angles at which the signal arrives at the terminal, and the initial coordinates of the terminal are used as the endpoint of the reflected ray to construct the parametric equation of the terminal reflected ray. Based on the spatial constraints of the signal, the spatial plane parameters of the single effective reflection plane of the single reflected signal are extracted and combined with the coordinate system of the satellite to obtain a general plane equation for the reflection plane, which the reflection point must satisfy. Because the reflection point lies simultaneously on the incident ray, the reflected ray, and the single reflection plane, the parametric equations of the incident ray and the reflected ray are solved jointly with the plane equation. The reflection-point coordinates given by the incident ray and by the reflected ray are set equal, and the parameters of the two parametric equations are eliminated to generate the spatial straight-line equation of the reflection point. 
This equation is then solved jointly with the plane equation to obtain the unique intersection point coordinates, which are the three-dimensional coordinates of the reflection point. After obtaining the coordinates of the reflection point, it is necessary to verify the validity of the reflection point and the reflection plane. The incident vector of the incident ray and the reflection vector of the reflected ray are calculated. The incident angle is calculated by the angle between the incident vector and the normal vector of the reflection plane; the reflection angle is calculated by the angle between the reflection vector and the normal vector of the reflection plane. When the absolute value of the difference between the incident angle and the reflection angle is less than the corresponding threshold, the reflection point is deemed valid, and its initial coordinates are retained. For the reflection plane, it is necessary to verify whether the reflection point is within the reflection plane region. The coordinate range of the valid region of the reflection plane is determined using the video semantic segmentation results. If the initial coordinates of the reflection point are exactly within this coordinate range, the reflection point is considered valid and can be used as the final 3D coordinates of the reflection point. If the initial coordinates of the reflection point are not within this coordinate range, the reflection point and its coordinates can be projected to the nearest point in the valid region of the reflection plane as the final reflection point. For the reflection segments of a multi-reflection signal path, the entire multi-reflection path is divided according to the signal propagation order. 
For example, the first segment is the path from the satellite to the first reflection point, the second segment is the path from the first reflection point to the second reflection point, and the third segment is the path from the second reflection point to the positioning terminal. For each reflection segment, the theoretical path length is calculated using the three-dimensional spatial distance formula. The measured path length of each reflection segment is then determined from satellite data, specifically as the product of the measured signal propagation time and the speed of light in a vacuum. The segment residual is obtained by subtracting the measured path length from the theoretical path length of the reflection segment. When the residual is positive, the theoretically calculated path is too long, so the reflection point needs to move inward along the path; when the residual is negative, the theoretically calculated path is too short, so the reflection point needs to move outward along the path. It should be understood that "inward" refers to the direction toward the propagation path from the satellite to the terminal, and "outward" refers to the direction away from that propagation path. As shown in steps S271-S274 above, the normal vector of the reflection plane is extracted based on the reflection plane where the reflection point is located. The direction of adjustment of the reflection point along the normal vector is determined by the sign of the piecewise residual. In this embodiment, "inward" means fine-tuning the reflection point inward along the normal vector, and "outward" means fine-tuning the reflection point outward along the normal vector. 
By projecting the piecewise residual onto the X, Y, and Z axes of the normal vector, the length correction amount is obtained. Combined with the magnitude of the normal vector, the coordinate adjustment amount is calculated, and the three-dimensional coordinates of the adjusted reflection point are calculated based on the coordinate adjustment amount. After the reflection points on the reflection path of the multiple reflected signals are adjusted, the planar constraints and reflection constraints must also be satisfied. That is, the planar constraint is that the adjusted reflection points are still within the original reflection plane, and the reflection constraint is that the incident angle of the signal rays in each reflection segment is approximately equal to the reflection angle, with a deviation of less than 5%. For calculating the direct path length and the reflected path length, the corresponding path lengths are calculated using the three-dimensional spatial distance formula. For the direct path of a direct signal, the three-dimensional straight-line distance between the satellite and the positioning terminal is calculated to obtain the direct path length. For the reflected path of a single reflected signal, the two three-dimensional straight-line distances from the satellite to the reflection point and from the reflection point to the positioning terminal are calculated separately, and the two distances are summed to obtain the single reflected path length. For the reflected path of a multiple reflected signal, the three-dimensional straight-line distance of each segment is calculated sequentially according to the reflection segment, and all segment distances are accumulated to obtain the multiple reflected path length. The equations for direct and reflected signals are constructed and solved to obtain the real-time target coordinates. 
Specifically, based on the satellite's three-dimensional coordinates, the three-dimensional coordinates of the reflection point, and the lengths of the direct and reflected paths, positioning equations for the direct signal, as well as positioning equations for single and multiple reflections, are constructed respectively. The equations for the direct and reflected signals are then merged, and the weighted least squares method is used to iteratively solve the equations to obtain the optimal solution for the target's three-dimensional coordinates, i.e., the real-time target coordinates.
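The iterative weighted least squares solution mentioned above can be sketched as a Gauss-Newton loop over range equations. The text does not give the form of the positioning equations, so the model here is an assumption: each signal contributes one equation |x - e_i| = r_i, where e_i is the last emitter seen by the terminal (the satellite for a direct signal, the final reflection point for a reflected one) and r_i is the remaining path length. Receiver clock bias is ignored, and the anchor positions, weights, and function names are hypothetical toy values.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system a . x = b with Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(a)
    solution = []
    for k in range(3):
        m = [row[:] for row in a]
        for i in range(3):
            m[i][k] = b[i]  # replace column k with the right-hand side
        solution.append(det(m) / d)
    return solution

def wls_position(anchors, ranges, weights, x0, iterations=10):
    """Gauss-Newton iterations on the weighted range residuals."""
    x = list(x0)
    for _ in range(iterations):
        jwj = [[0.0] * 3 for _ in range(3)]   # J^T W J
        jwr = [0.0] * 3                       # J^T W r
        for e, r, w in zip(anchors, ranges, weights):
            diff = [xi - ei for xi, ei in zip(x, e)]
            dist = math.sqrt(sum(c * c for c in diff))
            row = [c / dist for c in diff]    # gradient of |x - e|
            res = r - dist                    # observed minus predicted
            for i in range(3):
                jwr[i] += w * row[i] * res
                for j in range(3):
                    jwj[i][j] += w * row[i] * row[j]
        dx = solve3(jwj, jwr)
        x = [xi + di for xi, di in zip(x, dx)]
    return tuple(x)

# Toy example: noise-free ranges to four emitters (hypothetical values).
true_position = (1.0, 2.0, 3.0)
anchors = [(100.0, 0.0, 0.0), (0.0, 100.0, 0.0),
           (0.0, 0.0, 100.0), (100.0, 100.0, 100.0)]
ranges = [math.dist(true_position, a) for a in anchors]
weights = [1.0, 1.0, 0.5, 0.5]  # e.g. lower weight for reflected signals
estimate = wls_position(anchors, ranges, weights, (0.0, 0.0, 0.0))
```

Giving reflected signals a lower weight than direct ones is one plausible design choice, since their path lengths depend on calibrated reflection points that carry extra uncertainty.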
[0029] In one embodiment of the present invention, the step of obtaining the video positioning trajectory of the target by utilizing deep learning technology, using the pixel coordinates and motion trajectory of the target in a video frame, and in conjunction with the camera installation parameters, includes: S31: Use a deep learning model to detect video frames, locate the pixel box of the target in the video frame image, and extract the center pixel coordinates of the pixel box; S32: According to the time sequence, concatenate the center pixel coordinates of consecutive frames to obtain the motion trajectory point sequence; S33: Obtain the camera's installation angle parameters, and combine them with the camera's intrinsic parameters to construct a mapping model between pixel coordinates and geographic coordinates; S34: Based on the mapping model, convert the center pixel coordinates of the motion trajectory point sequence into geographic coordinates to generate a geographic coordinate point sequence; S35: Based on the time series, perform temporal sliding on the sequence of geographic coordinate points of consecutive frames to generate the video positioning trajectory of the target.
[0030] In this embodiment of the invention, as shown in steps S31-S35 above, in order to perform low-latency, high-frequency target detection on video frames, the deep learning model is the lightweight YOLOv8 target detection model. The monitoring video frames acquired in real time (RGB images) are input into the model, which extracts features from them: the backbone network extracts the visual features of the target, the neck network fuses multi-scale features, and the head network predicts the target information, outputting the target category, detection confidence, and pixel box coordinates. Pixel boxes of effective warning targets are retained according to a preset confidence threshold. The retained pixel boxes are located in the video frames, and their center pixel coordinates are calculated as the mean of the diagonal corner coordinates. Across consecutive video frames, the points corresponding to the center pixel coordinates are concatenated to obtain the motion trajectory point sequence. The installation angle parameters and intrinsic parameters of the camera are then obtained. The installation angle parameters include the pitch angle, azimuth angle, and installation height. The intrinsic parameters include the focal length, the position of the image optical center in the pixel coordinate system, and the lens distortion coefficients. Based on the pinhole camera imaging model and ground plane constraints, a mapping model from pixel coordinates to the BeiDou geographic coordinate system is constructed. The center pixel coordinates of each frame in the motion trajectory point sequence are substituted into the mapping model to calculate the corresponding geographic coordinates. The geographic coordinates are bound to the video frame timestamps to ensure temporal alignment. A temporal sliding window of 5 consecutive frames is used to extract consecutive geographic coordinate points in time order. 
For each new frame of geographic coordinate points, the window slides forward one position, continuously outputting coordinate points to obtain the target's video positioning trajectory.
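The geometric part of steps S31-S35 can be sketched in Python as below. The sketch rests on several assumptions that are not stated in the text: lens distortion and camera roll are ignored, pitch is measured downward from the horizontal, azimuth is measured clockwise from north, and the output is a local (east, north) offset in metres from the point below the camera rather than a BeiDou geographic coordinate. The text also does not say how points inside the 5-frame window are combined, so the window mean is used here as one option. All names are hypothetical, and the detector itself is omitted.

```python
import math
from collections import deque

def box_center(x1, y1, x2, y2):
    """Center of a pixel box as the mean of two diagonal corners."""
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def pixel_to_ground(u, v, fx, fy, cx, cy, height, pitch_deg, azimuth_deg):
    """Project a pixel onto a flat ground plane with a pinhole model.

    Returns (east, north) offsets from the point below the camera,
    or None if the pixel ray does not hit the ground.
    """
    a = (u - cx) / fx  # rightward component of the pixel ray
    b = (v - cy) / fy  # downward (image) component of the pixel ray
    pitch = math.radians(pitch_deg)
    down = b * math.cos(pitch) + math.sin(pitch)
    if down <= 0:
        return None  # ray points at or above the horizon
    scale = height / down
    forward = scale * (math.cos(pitch) - b * math.sin(pitch))
    right = scale * a
    az = math.radians(azimuth_deg)
    east = forward * math.sin(az) + right * math.cos(az)
    north = forward * math.cos(az) - right * math.sin(az)
    return (east, north)

class SlidingTrack:
    """Keeps the latest `size` ground points and emits their mean."""

    def __init__(self, size=5):
        self.window = deque(maxlen=size)

    def push(self, point):
        self.window.append(point)  # the oldest point drops out when full
        n = len(self.window)
        return tuple(sum(p[i] for p in self.window) / n for i in range(2))

# Toy usage: camera 10 m high, pitched 45 degrees down, facing east.
center = box_center(860, 440, 1060, 640)
ground = pixel_to_ground(center[0], center[1], 1000.0, 1000.0,
                         960.0, 540.0, 10.0, 45.0, 90.0)
track = SlidingTrack(size=5)
outputs = [track.push((float(i), float(i))) for i in range(6)]
```

A pixel at the principal point maps to the spot where the optical axis meets the ground, which for this camera is 10 m to the east.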
[0031] In one embodiment of the present invention, the step of verifying the real-time target coordinates using video positioning trajectory to obtain the true location of the target includes: S41: Perform spatiotemporal alignment between the real-time target coordinates and the video positioning trajectory to match the real-time target coordinates and the video positioning trajectory at the same moment; S42: Calculate the spatial difference between the real-time target coordinates and the video positioning trajectory at the same moment, and compare it with a preset deviation threshold for judgment: S43: If the spatial position difference is less than or equal to the deviation threshold, the real-time target coordinates are determined to be valid and the real-time target coordinates are taken as the true position of the target. S44: If the spatial position difference is greater than the deviation threshold, it is determined that the real-time target coordinates are abnormal. The real-time target coordinates are adjusted using the video positioning trajectory to obtain the adjusted true position of the target.
[0032] In this embodiment of the invention, as shown in steps S41-S44 above, spatiotemporal alignment of the real-time target coordinates and the video positioning trajectory is performed by unifying the coordinate system and matching timestamps, to obtain the corresponding real-time target coordinate point and video positioning trajectory point at the same moment. The spatial position difference between the two is calculated using the three-dimensional spatial distance formula. This difference is compared with a preset deviation threshold to determine whether the real-time target coordinates are valid. If the difference is not greater than the threshold, the real-time target coordinates are reliable and are taken as the true position of the target. If the difference exceeds the threshold, the real-time target coordinates have a positioning drift abnormality caused by signal reflection or occlusion. In that case, the video positioning trajectory point at the same moment after spatiotemporal alignment is used as the correction benchmark, the historical coordinates and motion trend information of the target are extracted, and a weighted constraint correction algorithm adaptively adjusts the drifting real-time target coordinates toward the video positioning trajectory point to obtain the adjusted true position of the target. It should be understood that the deviation threshold is set by a person skilled in the art, who collects multiple sets of sample data and sets a corresponding preset ratio coefficient for each set. The preset ratio coefficients and the collected sample data are substituted into the formula, and any two formulas form a system of linear equations in two unknowns. The calculated coefficients are then filtered and averaged to obtain the threshold value. The size of the coefficient is a specific value obtained by quantifying each parameter to facilitate subsequent comparison. 
The size of the coefficient depends on the amount of sample data and the preset ratio coefficient initially set by a person skilled in the art for each set of sample data, as long as it does not affect the proportional relationship between the parameter and the quantified value.
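Steps S41-S44 can be illustrated with the short Python sketch below. It is a simplified stand-in, not the weighted constraint correction algorithm itself, whose details are not given: alignment is done by nearest timestamp, and a drifting coordinate is replaced by a fixed-weight blend toward the video point. The names `nearest_by_time` and `verify_position`, the `max_gap` tolerance, and the `video_weight` value are hypothetical.

```python
import math

def nearest_by_time(track, t, max_gap=0.1):
    """Return the trajectory point whose timestamp is closest to t.

    `track` is a list of (timestamp_seconds, (x, y, z)) pairs. None is
    returned when even the closest point is more than `max_gap` away.
    """
    ts, point = min(track, key=lambda item: abs(item[0] - t))
    return point if abs(ts - t) <= max_gap else None

def verify_position(gnss_point, video_point, threshold, video_weight=0.8):
    """Compare the two positions and correct the GNSS one if it drifts.

    Returns (position, is_valid). When the deviation exceeds the
    threshold, the result is a weighted blend pulled toward the video
    point; the weight is an assumed tuning value.
    """
    deviation = math.dist(gnss_point, video_point)
    if deviation <= threshold:
        return gnss_point, True
    corrected = tuple(video_weight * v + (1.0 - video_weight) * g
                      for g, v in zip(gnss_point, video_point))
    return corrected, False

# Toy usage with made-up coordinates in a shared local frame (metres).
video_track = [(0.00, (0.0, 0.0, 0.0)),
               (0.04, (1.0, 1.0, 0.0)),
               (0.08, (2.0, 2.0, 0.0))]
matched = nearest_by_time(video_track, 0.05)
accepted = verify_position((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), threshold=6.0)
corrected = verify_position((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), threshold=2.0)
```

A fuller version would use the target's historical coordinates and motion trend to choose the weight adaptively, as the paragraph above describes.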
[0033] In one embodiment of the present invention, the step of comparing the target's actual location with a preset warning area to determine whether to issue a warning includes: S51: Perform geospatial modeling on the preset warning area to generate a set of geographic coordinate boundaries for the warning area; S52: Determine the inclusion relationship between the geographic coordinates of the target's actual location and the set of geographic coordinate boundaries. If the geographic coordinates are included in the set of geographic coordinate boundaries, an alert is triggered.
[0034] In this embodiment of the invention, as shown in steps S51-S52 above, geospatial modeling is performed on the preset warning area to generate a closed boundary composed of several geographic coordinate inflection points, forming an electronic fence. The spatial inclusion relationship between the geographic coordinates of the target's real location and the geographic coordinate boundary of the warning area is evaluated to determine whether the target's real location is within the warning area. If the target's real location is contained within the geographic coordinate boundary of the warning area, a warning operation is triggered immediately. If the target is located on or outside the boundary of the warning area, the target's real location continues to be monitored dynamically so that a warning can be issued at any time.
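The inclusion test of steps S51-S52 can be sketched with a standard ray-casting point-in-polygon check. This is an illustrative choice, since the text does not name a specific algorithm. The sketch assumes the fence vertices and the target are expressed in a planar coordinate frame, and it treats points on the boundary as outside, matching the behaviour described above. The function names and the square fence are hypothetical.

```python
def on_segment(p, a, b, eps=1e-9):
    """True if point p lies on the segment from a to b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if abs(cross) > eps:
        return False  # p is not collinear with a and b
    return (min(a[0], b[0]) - eps <= p[0] <= max(a[0], b[0]) + eps
            and min(a[1], b[1]) - eps <= p[1] <= max(a[1], b[1]) + eps)

def strictly_inside(point, polygon):
    """Ray casting: count crossings of a ray going in the +x direction.

    `polygon` is a list of (x, y) vertices in order. Points on an edge
    or a vertex are reported as not inside.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        a, b = polygon[i], polygon[(i + 1) % n]
        if on_segment(point, a, b):
            return False
        if (a[1] > y) != (b[1] > y):  # edge straddles the ray's height
            x_cross = a[0] + (y - a[1]) * (b[0] - a[0]) / (b[1] - a[1])
            if x < x_cross:
                inside = not inside
    return inside

def should_alert(target, fence):
    """Trigger a warning only when the target is inside the fence."""
    return strictly_inside(target, fence)

# Toy fence: a 10 m square in a local planar frame (hypothetical values).
fence = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
alert_inside = should_alert((5.0, 5.0), fence)
alert_edge = should_alert((10.0, 5.0), fence)
alert_outside = should_alert((15.0, 5.0), fence)
```

For fences defined in longitude and latitude over small areas, the same test can be applied after projecting the coordinates to a local planar frame.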
[0035] Embodiment 2: As shown in Figure 2, this embodiment provides a deep learning-based BeiDou video behavior early warning system that implements the deep learning-based BeiDou video behavior early warning method of Embodiment 1 above. The system includes: a signal separation module, used to separate the raw satellite signals received by the positioning terminal into direct signals, single-reflection signals, and multiple-reflection signals; a calibration module, used to calibrate the coordinates of the reflection point of a single-reflection signal, to perform reflection segmentation and path residual calculation for multiple-reflection signals, and to adjust the coordinates of the reflection points along the normal vector of the reflection plane; a positioning module, used to calculate the lengths of the direct path and the reflected path based on the coordinates of the reflection points, and to calculate the real-time target coordinates by constructing a set of positioning equations for the direct signal and the reflected signals; a trajectory generation module, used to identify targets in video frames and, based on the center coordinates of the target pixel box and the camera installation parameters and intrinsic parameters, to convert the pixel trajectory into a video positioning trajectory using the generated mapping model; a verification module, used to verify the real-time target coordinates against the video positioning trajectory and determine the validity of the target coordinates; and an early warning module, used to determine the spatial inclusion relationship between the target's actual location and the early warning area in order to decide whether an early warning should be triggered.
[0036] In this embodiment of the invention, the signal separation module takes the BeiDou satellite signals collected by the positioning terminal as its processing object. By combining the satellite elevation angle and azimuth angle with signal phase and delay characteristics, it separates the mixed direct, single-reflection, and multiple-reflection signals. This achieves early prediction and tracking of signal reflection state changes in dynamic scenarios, eliminates signal separation lag and result distortion, and reduces the computational load of signal separation through a short-time sliding window. For single-reflection signals, the calibration module calculates the initial coordinates of the reflection point based on the satellite spatial coordinates and the signal angle of arrival. For multiple-reflection signals, it performs reflection segmentation based on the actual number of reflecting surfaces, calculates the path residual for each segment, and corrects the coordinates of each segment's reflection point using the reflection plane normal vector as the adjustment direction, eliminating path fitting errors and outputting reflection point coordinates that meet the geometric constraints and residual threshold requirements. The positioning module calculates the physical lengths of the direct transmission path and of each segment of the reflected transmission path based on the reflection point coordinates and the satellite's real-time position, and from these constructs the systems of positioning observation equations corresponding to the direct signals, single-reflection signals, and multiple-reflection signals, which are solved simultaneously to output the real-time target coordinates. 
This not only eliminates positioning drift errors caused by signal reflection in complex environments and improves target coordinate accuracy, but also makes full use of available satellite signals to avoid the loss of effective reflected signals, which could lead to inaccurate or ineffective positioning. The trajectory generation module uses a deep learning target detection model to identify targets in camera video frames, obtain the center coordinates of the target pixel box, and combine parameters such as camera installation height, installation pitch angle, installation azimuth angle, and internal parameters such as focal length and principal point coordinates to construct a mapping model from pixel coordinates to geographic coordinates. Using this model, the target pixel trajectory is converted into a video positioning trajectory in geographic space. The verification module spatially aligns the real-time target coordinates output by the positioning module with the video positioning trajectory output by the trajectory generation module, calculates the spatial deviation between the two, and determines the validity of the real-time target coordinates by using a preset deviation threshold. The system outputs the true location of the target and provides it to the early warning module for judgment to determine whether to issue an early warning. The system meets the practical needs of scenarios requiring early warning and control.
[0037] The foregoing has shown and described the basic principles, main features, and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited to the above embodiments. The embodiments and descriptions in the specification merely illustrate the principles of the invention. Various changes and modifications can be made to the invention without departing from its spirit and scope, and all such changes and modifications fall within the scope of the claimed invention. The scope of protection claimed is defined by the appended claims and their equivalents.
Claims
1. A deep learning-based BeiDou video behavior early warning method, characterized in that, Includes the following steps: For satellite signals received by the positioning terminal at the target location, the signal state is predicted by the satellite elevation angle trend, the number of intersections between the signal and the reflection plane in the video frame is used to constrain the scene, and the signal characteristics are calculated by a short-time sliding window to separate the signal. The separated signals include direct signals, single reflection signals and multiple reflection signals. The direct or reflected path corresponding to each signal is determined, and the three-dimensional coordinates of the satellite are obtained. Using the satellite's elevation angle and the signal's angle of arrival, the three-dimensional coordinates of the reflection points on the reflection path of a single reflected signal are calibrated. The reflection is segmented based on the reflection points on the reflection path of multiple reflected signals. The reflection points of the reflection segments are adjusted using the path length residuals calculated from the reflection segments to obtain the three-dimensional coordinates of the reflection points on the reflection path of multiple reflected signals. Combining the direct path length and the reflection path length, a set of equations for the direct signal and a set of equations for the reflected signal are constructed. After solving them, the real-time target coordinates are obtained. By using deep learning technology, the video positioning trajectory of the target can be obtained by combining the pixel coordinates and motion trajectory of the target in the video frame with the camera installation parameters. 
The real-time target coordinates are verified using video positioning trajectory to obtain the target's true location; The target's actual location is compared with the preset warning area to determine whether to issue a warning.
2. The BeiDou video behavior early warning method based on deep learning according to claim 1, characterized in that, The step of predicting the signal state using the satellite elevation angle trend, constraining the scene by the number of intersections between the signal and the reflection plane in the video frame, and calculating signal features using a short-time sliding window for signal separation includes: Obtain the satellite orbital parameters, and calculate the satellite's elevation angle and elevation angle change rate to obtain the elevation angle trend; use the elevation angle trend to predict the signal state and obtain the prediction results, which include an elevation-angle prediction of a direct signal, an elevation-angle prediction of single reflection, and an elevation-angle prediction of multiple reflections; Acquire video images of the actual scene, extract video frames from the video images, mark all pixel regions that can reflect signals in the video frames, and map the pixel regions onto a spatial plane to obtain the reflection plane; Based on the azimuth and elevation angles of the satellite in the satellite orbit parameters, combined with the location of the positioning terminal, generate a ray signal, and project the ray signal onto the video image to obtain the pixel path of the ray signal; 
Along the path of the ray signal pixels, determine the number of intersections with the reflection plane to obtain the constraint result, which includes no reflection, single-sided reflection, and multi-sided reflection; Set up a data buffer window for the satellite, cache the feature values in the original satellite signal into the data buffer window in time sequence, and update the feature values in the data buffer window according to the sliding step size of the short-time sliding window; For the feature values of the data buffer window, calculate the mean and variance of the feature values of the data buffer window to represent the overall stability of the signal within the current data buffer window; For the data buffer window, the incremental delay jitter and incremental phase distortion of the feature values are calculated according to the latest set of feature values updated by the sliding step size, which are used to represent the changing trend of the signal in the current data buffer window; Based on the mean and variance of the feature values of the data cache window, incremental delay jitter and incremental phase distortion are fused to obtain feature results, which include feature stability, feature distortion and feature severe distortion. Signal separation is performed based on the prediction results, constraint results, and feature results.
3. The BeiDou video behavior early warning method based on deep learning according to claim 2, characterized in that, The step of performing signal separation based on the prediction result, constraint result, and feature result includes: If the signal satisfies the elevation-angle prediction of a direct signal, the no-reflection constraint, and feature stability, it is determined to be a direct signal. If the signal satisfies the elevation-angle prediction of single reflection, the single-sided reflection constraint, and feature distortion, it is determined to be a single reflection signal. If the signal satisfies the elevation-angle prediction of multiple reflections, the multi-sided reflection constraint, and severe feature distortion, it is determined to be a multiple reflection signal.
4. The BeiDou video behavior early warning method based on deep learning according to claim 1, characterized in that, The step of calibrating the three-dimensional coordinates of the reflection point on the reflection path of a single reflected signal using the satellite's elevation angle and the signal's angle of arrival includes: Based on the satellite's elevation angle, calculate the direction vector of the incident ray from the satellite to the reflection point, and construct the incident ray parameter equation using the satellite's three-dimensional coordinates as the starting point of the incident ray. Based on the signal arrival angle, calculate the direction vector of the reflected ray from the reflection point to the positioning terminal, and construct the reflected ray parameter equation using the positioning terminal's three-dimensional coordinates as the ending point of the reflected ray. Based on the reflection plane of a single reflected signal, the spatial plane parameters of the reflection plane are extracted, and combined with the coordinate system of the satellite, the plane equation of the reflection plane of the single reflected signal is determined. Since the reflection point lies on both the incident ray and the reflected ray, we set the coordinates of the reflection point on the incident ray and the reflected ray to be equal. After parameter elimination, we obtain the spatial line equation of the reflection point. We then solve the spatial line equation and the plane equation of the reflection plane simultaneously to obtain the three-dimensional coordinates of the reflection point.
5. The BeiDou video behavior early warning method based on deep learning according to claim 1, characterized in that, The step of segmenting the reflection based on the reflection points along the reflection path of the multiple reflected signals, adjusting the reflection points of the segmented reflections using the path length residuals calculated from the reflection segments, and obtaining the three-dimensional coordinates of the reflection points along the reflection path of the multiple reflected signals includes: Based on the propagation sequence and reflection points of the signal, the reflection path of the multiple reflected signals is divided into continuous and non-overlapping reflection segments. For each reflection segment, calculate the corresponding path length; The measured path length of each reflection segment is determined using satellite data, and the difference between the calculated path length and the measured path length of each reflection segment is calculated to generate the segment residual. Using the piecewise residual as a correction amount, the three-dimensional coordinates of the reflection points in the reflection segments are corrected to obtain the adjusted three-dimensional coordinates of the reflection points.
6. The BeiDou video behavior early warning method based on deep learning according to claim 5, characterized in that, The step of using the piecewise residual as a correction amount to correct the three-dimensional coordinates of the reflection points in the reflection segments, and obtaining the adjusted three-dimensional coordinates of the reflection points, includes: Extract the normal vector of the reflection plane based on the reflection plane where the reflection point is located. The direction of adjustment of the reflection point along the normal vector is determined by the sign of the piecewise residual of each reflection segment. Based on the adjustment direction, the coordinate adjustment amount is generated by using the length correction amount of the piecewise residual on the normal vector and combining it with the magnitude of the normal vector. The three-dimensional coordinates of the reflection point after adjustment are calculated based on the coordinate adjustment amount.
7. The BeiDou video behavior early warning method based on deep learning according to claim 1, characterized in that the step of obtaining the video positioning trajectory of the target using deep learning technology, by combining the pixel coordinates and motion trajectory of the target in the video frame with the camera installation parameters, includes: using a deep learning model to detect the target in the video frames, locating the pixel box of the target in each video frame image, and extracting the center pixel coordinates of the pixel box; concatenating the center pixel coordinates of consecutive frames in time order to obtain a sequence of motion trajectory points; obtaining the camera's installation angle parameters and, in conjunction with the camera's intrinsic parameters, constructing a mapping model between pixel coordinates and geographic coordinates; based on the mapping model, converting the center pixel coordinates of the motion trajectory point sequence into geographic coordinates to generate a geographic coordinate point sequence; and applying time-series sliding to the geographic coordinate point sequence of consecutive frames, in time order, to generate the video positioning trajectory of the target.
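The mapping model of claim 7 is not specified in detail. A minimal Python sketch under simplifying assumptions is a pinhole camera mounted at a known height above flat ground, pitched downward by a known angle, with a heading measured clockwise from north. The function names, the flat-ground assumption, and the local east/north output are all hypothetical; converting to latitude and longitude would additionally require the camera's surveyed position.

```python
import math

def box_center(x_min, y_min, x_max, y_max):
    """Center pixel coordinates of a detection box."""
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)

def pixel_to_ground(u, v, fx, fy, cx, cy, height_m, pitch_deg, yaw_deg=0.0):
    """Project pixel (u, v) onto flat ground.

    Returns (east, north) in metres relative to the point directly
    below the camera. fx, fy, cx, cy are pinhole intrinsics in pixels.
    """
    a = (u - cx) / fx  # normalized offset to the right of the optical axis
    b = (v - cy) / fy  # normalized offset downward in the image
    pitch = math.radians(pitch_deg)  # tilt below the horizontal
    denom = b * math.cos(pitch) + math.sin(pitch)
    if denom <= 0.0:
        raise ValueError("pixel ray does not reach the ground")
    t = height_m / denom  # ray scale at which the ray meets the ground
    right = t * a
    forward = t * (math.cos(pitch) - b * math.sin(pitch))
    yaw = math.radians(yaw_deg)  # heading, clockwise from north
    east = right * math.cos(yaw) + forward * math.sin(yaw)
    north = -right * math.sin(yaw) + forward * math.cos(yaw)
    return east, north

def track_to_ground(centers, **camera):
    """Convert a time-ordered list of center pixels to ground points."""
    return [pixel_to_ground(u, v, **camera) for u, v in centers]
```

With a camera 10 m high and pitched down 45 degrees, the principal point maps to a ground point about 10 m in front of the camera, which is a quick sanity check for the installation parameters.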
8. The BeiDou video behavior early warning method based on deep learning according to claim 1, characterized in that the step of verifying the real-time target coordinates using the video positioning trajectory to obtain the target's true location includes: performing spatiotemporal alignment of the real-time target coordinates and the video positioning trajectory, so that the real-time target coordinates are matched with the point of the video positioning trajectory at the same moment; calculating the spatial position difference between the real-time target coordinates and the video positioning trajectory at the same moment, and comparing it with a preset deviation threshold; if the spatial position difference is less than or equal to the deviation threshold, deeming the real-time target coordinates valid and using them as the target's true location; and if the spatial position difference is greater than the deviation threshold, determining that the real-time target coordinates are anomalous and adjusting them using the video positioning trajectory to obtain the adjusted true location of the target.
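The verification logic of claim 8 can be sketched in Python as follows. The sketch assumes both sources are expressed in the same planar metric frame, aligns them by nearest timestamp within a hypothetical `max_gap_s` tolerance, and, when the threshold is exceeded, simply falls back to the video position. The claim only states that the coordinates are adjusted using the video trajectory, so that fallback is an assumption.

```python
import math

def nearest_sample(track, t, max_gap_s=0.5):
    """Return the (x, y) of the track sample closest in time to t.

    track is a list of (timestamp_s, x, y) tuples. Returns None when
    the track is empty or the closest sample is more than max_gap_s away.
    """
    if not track:
        return None
    best = min(track, key=lambda s: abs(s[0] - t))
    if abs(best[0] - t) > max_gap_s:
        return None
    return best[1:]

def verify_position(satellite_xy, video_xy, threshold_m):
    """Return (true_location, is_valid) for one aligned pair of positions."""
    deviation = math.dist(satellite_xy, video_xy)
    if deviation <= threshold_m:
        return satellite_xy, True
    # Assumption: an anomalous satellite fix is replaced by the video position.
    return video_xy, False

# Hypothetical video trajectory sampled once per second.
video_track = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 2.0, 0.0)]
video_xy = nearest_sample(video_track, 1.1)
location, valid = verify_position((1.5, 0.0), video_xy, threshold_m=1.0)
```

A weighted blend of the two positions would be another way to realize the adjustment; the replacement used here is only the simplest option.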
9. The BeiDou video behavior early warning method based on deep learning according to claim 1, characterized in that the step of comparing the target's actual location with a preset warning area to determine whether to issue a warning includes: performing geospatial modeling on the preset warning area to generate a set of geographic coordinate boundaries for the warning area; determining the inclusion relationship between the geographic coordinates of the target's actual location and the set of geographic coordinate boundaries; and triggering a warning if the geographic coordinates are included in the set of geographic coordinate boundaries.
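The inclusion test of claim 9 can be illustrated with a standard ray-casting point-in-polygon check. The Python sketch below assumes the warning area is modeled as a single simple polygon whose vertices are planar coordinates (for example, longitude and latitude treated as planar over a small area); the claim does not prescribe this representation, and the function names are hypothetical. Points lying exactly on the boundary are not handled specially.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside the polygon.

    polygon is a list of (x, y) vertices in order, without repeating
    the first vertex at the end.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def should_warn(target_xy, warning_area):
    """Trigger a warning when the target lies inside the warning area."""
    return point_in_polygon(target_xy, warning_area)

# Hypothetical square warning area.
area = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
alarm = should_warn((2.0, 2.0), area)
```

Ray casting also works for concave areas, which matters for warning zones drawn around irregular site boundaries.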
10. A BeiDou video behavior early warning system based on deep learning, employing the BeiDou video behavior early warning method based on deep learning according to any one of claims 1-9, characterized by comprising: a signal separation module, used to separate the raw satellite signals received by the positioning terminal into direct signals, single-reflection signals, and multiple-reflection signals; a calibration module, used to calibrate the coordinates of the reflection point of a single-reflection signal, to perform reflection segmentation and path residual calculation for multiple-reflection signals, and to adjust the coordinates of the reflection points along the normal vector of the reflection plane; a positioning module, used to calculate the lengths of the direct path and the reflected path based on the coordinates of the reflection points, and to calculate the real-time target coordinates by constructing a set of positioning equations for the direct signal and the reflected signals; a trajectory generation module, used to identify targets in video frames and, based on the center coordinates of the target pixel box and the camera installation parameters and intrinsic parameters, to convert the pixel trajectory into a video positioning trajectory using the generated mapping model; a verification module, used to verify the real-time target coordinates against the video positioning trajectory and determine the validity of the target coordinates; and an early warning module, used to determine the spatial inclusion relationship between the target's actual location and the early warning area in order to determine whether an early warning should be triggered.