An airport surface activity target monitoring system based on video technology
The airport surface activity target surveillance system based on video technology utilizes cameras and video processing servers for target identification and tracking, solving the equipment dependence and radio communication problems of existing systems, and achieving high-frequency, interference-free, and low-cost surveillance effects.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- THE SECOND RES INST OF CIVIL AVIATION ADMINISTRATION OF CHINA
- Filing Date
- 2023-03-10
- Publication Date
- 2026-06-19
AI Technical Summary
Existing airport surface activity target surveillance systems rely on cooperative equipment and radio communication, which suffer from problems such as surveillance interruption when equipment is turned off, low information update frequency, insufficient radio signal coverage, and susceptibility to interference.
An airport surface activity target surveillance system based on video technology is adopted. It uses fixed and rotatable cameras to acquire video data, combines video processing servers for target identification and tracking, calculates the geographic coordinates of the target through deep learning and coordinate transformation, and uses smoothing filtering and coordinate fusion technology to process multi-camera data.
This enables high-frequency surveillance without cooperative equipment or radio communication, covers radio blind spots, is immune to radio interference, and is inexpensive.
Figure CN116229370B_ABST
Description
Technical Field
[0001] This invention relates to the field of airport surface activity target surveillance technology, and more specifically to an airport surface activity target surveillance system based on video technology.
Background Technology
[0002] Airport surface activity target surveillance refers to the real-time acquisition of the identity and location information of targets moving on the airport surface. To achieve this, airports generally need to build multilateration (multi-point positioning) systems or rely on ADS-B systems. A common characteristic of these systems is that they are cooperative technologies: cooperative equipment must be installed on the monitored targets and operated according to specifications. This surveillance method has the following drawbacks:
[0003] 1. Surveillance is interrupted once the monitored target switches off its cooperative equipment;
[0004] 2. These surveillance systems all rely on radio communication. Because radio bandwidth is limited, the information update frequency is low, on the order of seconds;
[0005] 3. Targets moving in areas of the airport that radio signals cannot reach cannot be monitored;
[0006] 4. They are susceptible to radio interference.
Summary of the Invention
[0007] In view of the technical deficiencies mentioned in the background art, the purpose of this invention is to provide an airport surface activity target monitoring system based on video technology.
[0008] To achieve the above objectives, embodiments of the present invention provide an airport surface activity target surveillance system based on video technology, comprising:
[0009] Fixed video shooting equipment, used to acquire video data of the airport surface activity area;
[0010] A video processing server, used to receive the video data over a signal transmission network, run a target detection program on the video data to identify the surface activity targets, and track those targets with a target tracking program.
[0011] As one specific implementation of this application, the video processing server is used for:
[0012] After a surface activity target is identified, its target type is obtained; the target type includes one or more of aircraft, vehicles, personnel, and non-powered equipment;
[0013] The target type is used as the identity information of the current surface activity target. A target number is assigned to the target and the target tracking program is started. At the same time, the identity information and location information of the target are marked in the video data in real time.
[0014] As one specific implementation of this application, the video processing server is further configured to calculate the geographic coordinates of the scene activity target, specifically:
[0015] After a surface activity target is identified, a deep learning regression model is used to calculate the image coordinates of all key points of the target;
[0016] Based on the identified target, the corresponding target 3D model is obtained, and the ideal 3D coordinates of all key points in that model are retrieved from the database;
[0017] The position information of the target in the coordinate system of the shooting device is calculated from the image coordinates and the ideal 3D coordinates;
[0018] The parameters of the fixed video shooting equipment are calibrated, and the transformation relationship between the shooting device coordinate system and the airport geographic coordinate system is calculated;
[0019] Based on the transformation relationship, the position information of the target is transformed from the shooting device coordinate system to the airport geographic coordinate system to obtain the geographic coordinates of the target.
[0020] As a specific implementation of this application, the video processing server is further configured to adjust the geographic coordinates of the scene activity target using a smoothing filtering method, specifically:
[0021] Extract the geographic coordinates of the surface activity target at the previous N time points, and fit a smooth curve to them using the B-spline algorithm;
[0022] Calculate the distance between each of the previous N geographic coordinates and the smooth curve, and mark any coordinates whose distance exceeds a set threshold as unreliable positioning points;
[0023] Remove the unreliable positioning points and refit a new smooth curve using the B-spline algorithm;
[0024] Find the point on the new smooth curve that is closest to the current position, and use the geographic coordinates of that point as the final geographic coordinates of the surface activity target.
[0025] In a preferred implementation of this application, the system further includes a display device for:
[0026] A map interface is provided, which displays the identity and location information of multiple activity target points;
[0027] It receives user clicks on any activity target point to obtain real-time monitoring and tracking videos of that target point.
[0028] As one specific implementation of this application, when at least two fixed video recording devices are connected to the video processing server, the video processing server is used for:
[0029] For the first video data, the surface activity targets are detected, and the first geographic coordinates are calculated from the image coordinates of the targets and the calibration parameters of the first fixed video shooting device;
[0030] For the second video data, the surface activity targets are detected, and the second geographic coordinates are calculated from the image coordinates of the targets and the calibration parameters of the second fixed video shooting device;
[0031] The first and second geographic coordinates are compared to check for duplicates. If the distance between them is less than a threshold, the same target is considered to appear in the fields of view of different fixed video shooting devices and is assigned the same target number;
[0032] If the same scene or target is captured by both devices, the first and second geographic coordinates are fused.
[0033] Specifically, the coordinate fusion of the first and second geographic coordinates is performed as follows:
[0034] Attach error value 1 to the first geographic coordinates;
[0035] Attach error value 2 to the second geographic coordinates;
[0036] Calculate the fused coordinates from the first geographic coordinates, the second geographic coordinates, error value 1, and error value 2;
[0037] Update the error values periodically.
[0038] In a preferred implementation of this application, the video processing server is further configured to access data from an external surface activity monitoring system and process it as follows:
[0039] Surface activity target information is obtained from the data of the external surface activity monitoring system and matched with the target in this system whose geographic coordinates are nearest. If the distance between the two sets of geographic coordinates does not exceed the allowable error range, the match is successful;
[0040] After a successful match, the geographic coordinates of the surface activity target are updated using a coordinate fusion algorithm.
[0041] Compared with existing technologies, this invention proposes a novel airport surface activity target surveillance system based on video surveillance and intelligent analysis technologies. The system is non-cooperative: it requires no cooperative equipment on the monitored targets and no radio communication, and it offers a high update frequency, no need for radio base station coverage, and immunity to radio interference. Furthermore, because it relies on the large number of cameras already installed at the airport, the system is also low in cost.
Attached Figure Description
[0042] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the accompanying drawings used in the description of the specific embodiments or the prior art will be briefly introduced below.
[0043] Figure 1 is a structural diagram of an airport surface activity target monitoring system based on video technology provided in an embodiment of the present invention.
Detailed Implementation
[0044] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0045] It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "including" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or collections thereof.
[0046] Referring to Figure 1, the airport surface activity target surveillance system based on video technology provided in this embodiment of the invention may include:
[0047] Fixed video shooting equipment, used to acquire video data of the airport surface activity area;
[0048] A video processing server, used to receive the video data over a signal transmission network, run a target detection program on the video data to identify the surface activity targets, and track those targets with a target tracking program.
[0049] Fixed video recording equipment can be the numerous cameras already installed at the airport, enabling full coverage surveillance of the airport's surface activity area. In this system, rotatable video recording equipment, such as long-focal-length cameras, can be used to track and film aircraft takeoffs and landings, as well as capture details of ground-based moving targets.
[0050] Furthermore, video processing servers are primarily used for:
[0051] 1. Identify and track targets in the scene.
[0052] Specifically, the video processing server runs a scene moving target detection program on each video signal from the fixed video shooting equipment. After identifying the scene moving target, it obtains the target type, uses the target type as the basic identity information of the current scene moving target, assigns a target number to the current scene moving target, and starts the target tracking program. At the same time, it marks the identity information and location information of the current moving target in the video data in real time.
[0053] The target types include one or more of the following: aircraft, vehicles, personnel, and non-powered equipment.
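The bookkeeping described in this step (take the detected type as the initial identity, assign a target number, keep the latest position for on-video labelling) can be sketched as follows. This is a minimal illustration only: the detector and tracker themselves are not shown, and the names `TargetType`, `Track`, and `TrackRegistry` are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count


class TargetType(Enum):
    """The four target types listed in the text."""
    AIRCRAFT = "aircraft"
    VEHICLE = "vehicle"
    PERSONNEL = "personnel"
    NON_POWERED_EQUIPMENT = "non-powered equipment"


@dataclass
class Track:
    number: int                 # target number assigned on first detection
    identity: str               # starts out as the target type
    positions: list = field(default_factory=list)  # image positions (u, v)


class TrackRegistry:
    """Hypothetical registry: one Track per detected target."""

    def __init__(self):
        self._numbers = count(1)
        self.tracks = {}

    def register(self, target_type, image_position):
        # A new detection gets the next free number; its type is its identity.
        number = next(self._numbers)
        track = Track(number, target_type.value, [image_position])
        self.tracks[number] = track
        return track

    def label(self, number):
        # Text that would be overlaid on the video next to the target.
        track = self.tracks[number]
        u, v = track.positions[-1]
        return f"#{track.number} {track.identity} @ ({u}, {v})"


registry = TrackRegistry()
first = registry.register(TargetType.AIRCRAFT, (640, 360))
print(registry.label(first.number))
```

Keeping the identity as a plain string rather than the enum is a deliberate choice in this sketch, so that it can later be overwritten by a richer identity such as a flight number.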
[0054] 2. Calculate and adjust the geographic coordinates of the scene's activity targets.
[0055] In this embodiment, a coordinate transformation method is mainly used to calculate the geographic coordinates of the target in the scene, and a smoothing filtering method is used to adjust the geographic coordinates. The coordinate transformation method is described as follows:
[0056] By combining knowledge of 3D models, the location information of targets in the scene's activities is calculated from image information. The specific method is as follows:
[0057] (1) Use a deep learning regression calculation model to detect the image coordinates (u, v) of all key points of the moving target. Taking an aircraft as an example, key points can be the apex of the forewing, the apex of the tail, the center point of the engine, etc.
[0058] (2) Based on the target recognition results, retrieve the ideal three-dimensional coordinates (X, Y, Z) of all key points in the three-dimensional model of the target from the database.
[0059] (3) According to the imaging principle, the relationship between the three-dimensional coordinates (X, Y, Z) of the key points of the scene's moving target and its image imaging coordinates (u, v) is as follows:
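[0060] $$ s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \left( R \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + T \right) $$ where s is the projective scale factor.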
[0061] In this system, the rotation matrix R and the translation matrix T move the 3D model from its initial position to any shooting position; K is the camera's imaging matrix, projecting points in 3D space onto the imaging plane according to the camera's internal parameters. For cameras with the same parameters, K is identical and can be calculated through camera calibration. The program uses multiple sets of (X, Y, Z) and (u, v) correspondences, and solves for R and T according to the minimum average error constraint. The solved T represents the position information of the moving target in the shooting equipment's coordinate system.
[0062] (4) The camera equipment is calibrated, and its installation position and orientation are calculated. Then, the conversion relationship between the camera equipment coordinate system and the airport geographic coordinate system is calculated. According to the conversion relationship, the location information of the scene activity target is converted from the camera equipment coordinate system to the airport geographic coordinate system.
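Steps (3) and (4) can be illustrated with a small sketch, assuming the standard pinhole relation s·[u, v, 1]ᵀ = K(R·[X, Y, Z]ᵀ + T), which is consistent with the definitions of K, R and T above. The code only evaluates the mean reprojection error that a pose solver would minimize and then applies a camera-to-airport transform; the solver itself is not shown, and all function names, matrices, and numbers are illustrative assumptions.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def project(K, R, T, point_3d):
    """Project a model key point (X, Y, Z) to image coordinates (u, v)."""
    cam = [a + b for a, b in zip(mat_vec(R, point_3d), T)]  # camera frame
    x, y, s = mat_vec(K, cam)                               # s is the scale
    return (x / s, y / s)


def mean_reprojection_error(K, R, T, model_points, image_points):
    """Average pixel distance between projected and detected key points."""
    total = 0.0
    for point, (u, v) in zip(model_points, image_points):
        pu, pv = project(K, R, T, point)
        total += ((pu - u) ** 2 + (pv - v) ** 2) ** 0.5
    return total / len(model_points)


def camera_to_airport(R_cam, t_cam, position_cam):
    """Map a camera-frame position into the airport frame (assumed rigid)."""
    return [a + b for a, b in zip(mat_vec(R_cam, position_cam), t_cam)]


# Illustrative intrinsics, key points, and detections (all made up).
K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model_points = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
image_points = [(640.0, 360.0), (840.0, 360.0), (640.0, 460.0)]

# The correct pose reproduces the detections; a wrong T does not.
good = mean_reprojection_error(K, IDENTITY, [0.0, 0.0, 50.0], model_points, image_points)
bad = mean_reprojection_error(K, IDENTITY, [0.0, 0.0, 100.0], model_points, image_points)

# Assumed camera installation: rotated 90 degrees about the vertical axis.
R_CAM = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
T_CAM = [100.0, 200.0, 10.0]
geo = camera_to_airport(R_CAM, T_CAM, [3.0, 4.0, 0.0])
print(good, bad, geo)
```

In practice the search over R and T would be delegated to an existing perspective-n-point solver; the error function above is only the quantity such a solver drives toward its minimum.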
[0063] It should be noted that in this embodiment, calculating the geographic coordinates of the surface activity target is equivalent to locating the surface activity target. Traditional video positioning uses bounding boxes, which can only obtain the image area where the surface activity target (aircraft) is located, resulting in inaccurate positioning. In contrast, the method in this embodiment achieves precise point positioning of the aircraft based on the spatial relationship between the aircraft's key points and its 3D model. The positioning method of this embodiment is more accurate than traditional video positioning. Furthermore, both this embodiment and the cooperative system use the positioning antenna position as the positioning point; therefore, this system can maintain consistency with the point positioning of the cooperative system.
[0064] Furthermore, the smoothing filtering method is described as follows:
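[0065] (1) Extract the position information of the surface activity target at the previous N time points, and fit a smooth curve using the B-spline algorithm. The parametric expression is P(w):
[0066] $$ P(w) = \sum_{i=0}^{N-1} P_i \, B_{i,k}(w) $$
[0067] Where P_i is the position information at the previous N time points, B is a recursive function controlling the smoothness, and the subscript k is the recursion order, typically 1 or 2. With w_i denoting the knots of the parameter w, the recursive function satisfies the following constraints:
[0068] $$ B_{i,0}(w) = \begin{cases} 1, & w_i \le w < w_{i+1} \\ 0, & \text{otherwise} \end{cases} $$
[0069] $$ B_{i,k}(w) = \frac{w - w_i}{w_{i+k} - w_i} B_{i,k-1}(w) + \frac{w_{i+k+1} - w}{w_{i+k+1} - w_{i+1}} B_{i+1,k-1}(w) $$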
[0070] (2) Calculate the distance between each of the N positions and the smooth curve, and treat points whose distance exceeds the set threshold as unreliable positioning points.
[0071] (3) After removing the unreliable positioning points, refit a smooth curve using the B-spline algorithm.
[0072] (4) Find the point on the refitted smooth curve that is closest to the current position and use it as the display point for the current position in the user interface; the original, unsmoothed data is still what gets recorded.
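The four smoothing steps can be sketched in plain Python as below. Several choices here are assumptions rather than details from the text: the recursive function B is taken to be the standard Cox-de Boor recursion, the N positions are used directly as control points, the knot vector is clamped and uniform, and distances to the curve are measured against 200 sampled curve points. The function names and the sample data are hypothetical.

```python
import math


def knot_vector(n, k):
    """Clamped uniform knots for n control points and degree k (needs n > k)."""
    inner = n - k - 1
    return ([0.0] * (k + 1)
            + [j / (inner + 1) for j in range(1, inner + 1)]
            + [1.0] * (k + 1))


def basis(i, k, w, knots):
    """Cox-de Boor recursion for the basis function B_{i,k}(w)."""
    if k == 0:
        return 1.0 if knots[i] <= w < knots[i + 1] else 0.0
    left = right = 0.0
    d1 = knots[i + k] - knots[i]
    if d1 > 0:
        left = (w - knots[i]) / d1 * basis(i, k - 1, w, knots)
    d2 = knots[i + k + 1] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + k + 1] - w) / d2 * basis(i + 1, k - 1, w, knots)
    return left + right


def curve_point(points, k, w):
    """Evaluate P(w) = sum_i P_i * B_{i,k}(w) for w in [0, 1]."""
    if w >= 1.0:  # the half-open basis intervals exclude the end point
        return tuple(points[-1])
    knots = knot_vector(len(points), k)
    weights = [basis(i, k, w, knots) for i in range(len(points))]
    return (sum(b * p[0] for b, p in zip(weights, points)),
            sum(b * p[1] for b, p in zip(weights, points)))


def sample_curve(points, k, samples=200):
    return [curve_point(points, k, j / samples) for j in range(samples + 1)]


def nearest_on_curve(curve, p):
    return min(curve, key=lambda q: math.dist(p, q))


def reliable_points(history, k, threshold):
    """Steps (1)-(2): fit, then keep points within threshold of the curve."""
    curve = sample_curve(history, k)
    return [p for p in history
            if math.dist(p, nearest_on_curve(curve, p)) <= threshold]


def smooth_position(history, current, k=2, threshold=5.0):
    """Steps (3)-(4): refit without outliers, snap current fix to the curve."""
    kept = reliable_points(history, k, threshold)
    if len(kept) <= k:  # too few points left to refit; fall back to all
        kept = history
    return nearest_on_curve(sample_curve(kept, k), current)


# Seven fixes along a straight taxi path, one of them a gross outlier.
history = [(0, 0), (10, 0), (20, 0), (30, 40), (40, 0), (50, 0), (60, 0)]
kept = reliable_points(history, k=2, threshold=5.0)
display = smooth_position(history, current=(45, 3))
print(kept, display)
```

In this sketch a fix that lies beyond the end of the fitted curve snaps to the curve's end point; whether the newest fix should itself be one of the N fitted points is not stated in the text, so the sketch keeps it separate.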
[0073] 3. When at least two fixed video shooting devices are connected to the video processing server, resolve the target fusion problem.
[0074] In this embodiment, there are multiple fixed video shooting devices, and the video processing server performs target detection and tracking on each video signal from the fixed video shooting devices. This leads to the problem of cross-camera tracking. This embodiment uses a target fusion method and a coordinate fusion method to solve this problem, as follows:
[0075] Target fusion methods mainly include:
[0076] (1) After a new surface activity target is detected, its geographic coordinates are first compared with those of the existing surface activity targets. If the distance between the two is less than a threshold, the same target is considered to appear in the fields of view of different video shooting devices and is assigned the same target number;
[0077] (2) If the same scene is captured by two shooting devices, the coordinates from the two devices are calculated separately and then fused using the coordinate fusion method.
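The target fusion rule above amounts to a nearest-neighbour check in geographic coordinates. The following is a minimal sketch under stated assumptions: coordinates are planar and in metres, the threshold value is arbitrary, and the class name `TargetTable` is hypothetical. For brevity the sketch simply overwrites the stored position where the described system would fuse the two coordinates.

```python
import math
from itertools import count


class TargetTable:
    """Hypothetical table of known targets keyed by target number."""

    def __init__(self, threshold_m):
        self.threshold_m = threshold_m
        self.positions = {}          # target number -> latest coordinates
        self._numbers = count(1)

    def associate(self, coords):
        """Return the number of the matching target, or assign a new one."""
        best, best_distance = None, self.threshold_m
        for number, known in self.positions.items():
            distance = math.dist(coords, known)
            if distance < best_distance:   # strictly less than the threshold
                best, best_distance = number, distance
        if best is None:
            best = next(self._numbers)     # not seen before: new number
        self.positions[best] = coords      # a real system would fuse here
        return best


table = TargetTable(threshold_m=10.0)
from_camera_a = table.associate((100.0, 200.0))   # first sighting
from_camera_b = table.associate((104.0, 203.0))   # 5 m away: same target
elsewhere = table.associate((300.0, 50.0))        # far away: new target
print(from_camera_a, from_camera_b, elsewhere)
```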
[0078] Coordinate fusion methods mainly include:
[0079] (1) Attach error value 1 to geographic coordinates 1 calculated by this system. Error value 1 is the product of an error coefficient and the distance between the video shooting equipment and the target; that is, the error is proportional to that distance.
[0080] (2) Attach error value 2 to geographic coordinates 2 provided by other systems. The initial value of error value 2 is an empirical value set per region; generally, the error is larger in areas with poor radio signal.
[0081] (3) The fused geographic coordinates are the weighted sum of the geographic coordinates of the two systems, with weights inversely proportional to their errors, and can be calculated using the following formula.
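[0082] $$ P = \frac{e_2 P_1 + e_1 P_2}{e_1 + e_2} $$ where P_1 and P_2 are geographic coordinates 1 and 2, and e_1 and e_2 are error values 1 and 2.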
[0083] (4) Periodically update the error value for each region: record the errors between the fused geographic coordinates and the geographic coordinates from other systems, calculate the average of all errors recorded in the current update cycle, and use it to replace the previous error value for that region.
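The coordinate fusion steps can be sketched as follows, assuming that "inversely proportional to their errors" means each coordinate is weighted by the reciprocal of its error value. The error coefficient, distances, and regional error used in the example are made-up numbers, and the function names are hypothetical.

```python
def video_error(error_coefficient, distance_m):
    """Error value 1: proportional to the camera-to-target distance."""
    return error_coefficient * distance_m


def fuse(coords_1, error_1, coords_2, error_2):
    """Weighted sum with weights 1/error, normalised to sum to one."""
    w1, w2 = 1.0 / error_1, 1.0 / error_2
    return tuple((w1 * a + w2 * b) / (w1 + w2)
                 for a, b in zip(coords_1, coords_2))


def updated_region_error(recorded_errors, previous_error):
    """Replace a region's error value by the mean error of the last cycle."""
    if not recorded_errors:      # nothing recorded: keep the old value
        return previous_error
    return sum(recorded_errors) / len(recorded_errors)


error_1 = video_error(error_coefficient=0.01, distance_m=200.0)  # 2 m
error_2 = 6.0                                  # assumed regional value
fused = fuse((100.0, 200.0), error_1, (108.0, 204.0), error_2)
new_region_error = updated_region_error([3.0, 5.0], previous_error=error_2)
print(fused, new_region_error)
```

With these numbers the video-derived coordinates carry three times the weight of the external ones, so the fused point lies a quarter of the way from the first position to the second.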
[0084] 4. Access and process data from external field activity monitoring systems, mainly including:
[0085] (1) Obtain the target information of the scene activity from the data of the external scene activity monitoring system, and match the target information of the scene activity with the scene activity target with the nearest geographical coordinates in this system. If the distance between the geographical coordinates of the two does not exceed the allowable error range, the match is successful.
[0086] (2) After a successful match, the geographic coordinates of the scene activity target are updated using the aforementioned coordinate fusion method;
[0087] (3) If the identity information of the matching target is different, after three consecutive successful matches, the identity information of this system will be replaced with the identity information of other systems. For example, if the identity information provided by this system is "aircraft" and the identity information provided by other systems is "CA1234", after three consecutive successful matches, the identity information of this system will be replaced with "CA1234", and the identity information displayed on the map interface will also be "CA1234".
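The matching and identity replacement logic can be sketched as below. The rule of three consecutive successful matches comes from the text; resetting the counter when a match fails, the planar distance in metres, and the names `LocalTarget` and `match_external` are assumptions made for the illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class LocalTarget:
    number: int
    identity: str
    coords: tuple
    consecutive_matches: int = 0


def match_external(local_targets, ext_identity, ext_coords,
                   max_error_m, required_matches=3):
    """Match one external report to the nearest local target, if close enough."""
    if not local_targets:
        return None
    nearest = min(local_targets,
                  key=lambda t: math.dist(t.coords, ext_coords))
    if math.dist(nearest.coords, ext_coords) > max_error_m:
        nearest.consecutive_matches = 0      # assumed: a miss breaks the run
        return None
    if nearest.identity != ext_identity:
        nearest.consecutive_matches += 1
        if nearest.consecutive_matches >= required_matches:
            nearest.identity = ext_identity  # adopt the external identity
            nearest.consecutive_matches = 0
    return nearest


targets = [LocalTarget(1, "aircraft", (100.0, 200.0)),
           LocalTarget(2, "vehicle", (400.0, 50.0))]

identities = []
for _ in range(3):
    match_external(targets, "CA1234", (102.0, 201.0), max_error_m=10.0)
    identities.append(targets[0].identity)
print(identities)
```

The coordinate update after a successful match is left out here; it would reuse whatever coordinate fusion routine the system applies to its own multi-camera data.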
[0088] As a preferred implementation of this application, the system of this embodiment may further include a display device for:
[0089] A map interface is provided, which displays the identity and location information of multiple activity target points;
[0090] It receives user clicks on any activity target point to obtain real-time monitoring and tracking videos of that target point.
[0091] Based on the foregoing description, the airport surface activity target surveillance system based on video technology provided in this embodiment of the invention can be described as follows:
[0092] The system mainly consists of fixed video shooting equipment, rotatable video shooting equipment, signal transmission network, video processing server, and system software. It can operate independently or be connected to an external system.
[0093] 1. In standalone mode, the general functional description is as follows:
[0094] (1) The system runs a scene moving target detection program on each video signal captured by the fixed video shooting equipment. The detected scene moving target types can be one or more of aircraft, vehicles, personnel, and non-powered equipment.
[0095] (2) Upon detecting a target in the scene, the system uses the target type as basic identity information, assigns a target number, and initiates the target tracking program. Simultaneously, the system marks the target's identity and location information in real-time on the video for easy viewing.
[0096] (3) The system calculates the geographic coordinates of the moving targets using a coordinate transformation method based on the image coordinates of the moving targets and the calibration parameters of the shooting camera. A smoothing filter method is then used to adjust the geographic coordinates of the moving targets.
[0097] (4) The system uses a target fusion method to solve the problem of cross-camera tracking.
[0098] (5) The system software displays the identity and location information of the target in the scene in real time on the map interface.
[0099] (6) Clicking on any surface activity target brings up the video feed from the fixed video shooting device currently filming that target; alternatively, a rotatable video shooting device can be selected to track and film the target.
[0100] 2. When integrating data from other scene activity monitoring systems, the approximate functional description is as follows:
[0101] (1) After obtaining the target information of the scene activities from other systems, match it with the target with the nearest geographical coordinates in this system. If the geographical coordinate distance between the two does not exceed the allowable error range, the match is successful.
[0102] (2) After a successful match, the target geographic coordinates are updated using the coordinate fusion method (described above).
[0103] (3) If the identity information of the matching target is different, after three consecutive successful matches, the identity information of this system will be replaced with the identity information of other systems.
[0104] As described above, this invention proposes a novel airport surface activity target surveillance system based on video surveillance and intelligent analysis technologies. The system is non-cooperative: it requires no cooperative equipment on the monitored targets and no radio communication, and it offers a high update frequency, no need for radio base station coverage, and immunity to radio interference. Furthermore, because it relies on the large number of cameras already installed at the airport, the system is also low in cost.
[0105] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed in the present invention, and these modifications or substitutions should all be covered within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.
Claims
1. An airport surface activity target monitoring system based on video technology, characterized in that the system is non-cooperative, requiring no cooperative equipment to be installed on the monitored targets and no radio communication; the system includes: fixed video shooting equipment, used to acquire video data of the airport surface activity area; and a video processing server, used to receive the video data through a signal transmission network, run a target detection program on the video data to identify the surface activity targets, and track those targets using a target tracking program. The video processing server is used for: after a surface activity target is identified, obtaining its target type, the target type including one or more of aircraft, vehicles, personnel, and non-powered equipment; using the target type as the identity information of the current surface activity target, assigning a target number to the target, and starting the target tracking program, while marking the identity information and location information of the target in the video data in real time.
The video processing server is also used to calculate the geographic coordinates of the surface activity target, specifically: after a surface activity target is identified, a deep learning regression model is used to calculate the image coordinates (u, v) of all key points of the target; based on the identified target, the target 3D model is obtained, and the ideal 3D coordinates (X, Y, Z) of all key points in the target 3D model are retrieved from the database; the image coordinates (u, v) and the ideal 3D coordinates (X, Y, Z) correspond through the rotation matrix R, the translation matrix T, and the camera's imaging matrix K; based on this correspondence, R and T are solved under the minimum average error constraint; the solved T is the position information of the surface activity target in the coordinate system of the shooting device; the parameters of the fixed video shooting equipment are calibrated, and the transformation relationship between the shooting device coordinate system and the airport geographic coordinate system is calculated.
Based on the transformation relationship, the position information of the surface activity target is transformed from the shooting device coordinate system to the airport geographic coordinate system to obtain the geographic coordinates of the target. The video processing server is also used to adjust the geographic coordinates of the surface activity target using a smoothing filtering method, specifically: extract the geographic coordinates of the target at the previous N time points, and use the B-spline algorithm to fit a smooth curve with parametric expression P(w), wherein P_i is the position information at the previous N time points, B is a recursive function that controls the smoothness and is subject to its constraint condition, and the subscript k is the recursion order and is 1; calculate the distance between each of the previous N geographic coordinates and the smooth curve, and mark any coordinates whose distance exceeds a set threshold as unreliable positioning points; remove the unreliable positioning points and refit a new smooth curve using the B-spline algorithm; find the point on the new smooth curve that is closest to the current position, and use the geographic coordinates of that point as the final geographic coordinates of the surface activity target.
2. The system as described in claim 1, characterized in that the system also includes a display device for: providing a map interface that displays the identity and location information of multiple activity target points; and receiving a user's click on any activity target point to obtain the real-time monitoring and tracking video of that target point.
3. The system as described in claim 1, characterized in that, when at least two fixed video shooting devices are connected to the video processing server, the video processing server is used for: for the first video data, detecting the surface activity targets and calculating the first geographic coordinates from the image coordinates of the targets and the calibration parameters of the first fixed video shooting device; for the second video data, detecting the surface activity targets and calculating the second geographic coordinates from the image coordinates of the targets and the calibration parameters of the second fixed video shooting device; comparing the first and second geographic coordinates to check for duplicates, where, if the distance between them is less than a threshold, the same target is considered to appear in the fields of view of different fixed video shooting devices and is assigned the same target number; and, if the same scene or target is captured by both devices, fusing the first and second geographic coordinates.
4. The system as described in claim 3, characterized in that the first and second geographic coordinates are fused as follows: attach error value 1 to the first geographic coordinates; attach error value 2 to the second geographic coordinates; calculate the fused coordinates from the first geographic coordinates, the second geographic coordinates, error value 1, and error value 2; update the error values periodically.
5. The system as described in claim 1, characterized in that the video processing server is also used to access data from an external surface activity monitoring system and process it as follows: surface activity target information is obtained from the data of the external surface activity monitoring system and matched with the target in this system whose geographic coordinates are nearest; if the distance between the two sets of geographic coordinates does not exceed the allowable error range, the match is successful; after a successful match, the geographic coordinates of the surface activity target are updated using a coordinate fusion algorithm.
6. The system as described in claim 5, characterized in that the geographic coordinates of the surface activity target are updated using a coordinate fusion algorithm as follows: attach error value 1 to the first geographic coordinates calculated by this system; attach error value 2 to the second geographic coordinates calculated by the external surface activity monitoring system; calculate the fused coordinates from the first geographic coordinates, the second geographic coordinates, error value 1, and error value 2; update the error values periodically.