A multi-platform multi-view image splicing method in an airport large-scale environment
By employing deep learning and multi-view image deformation network reconstruction technology, the problems of ghosting and artifacts in image stitching under large-scale airport environments have been solved, achieving high-quality image stitching and supporting real-time monitoring and security early warning at airports.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIHANG UNIV
- Filing Date
- 2022-07-04
- Publication Date
- 2026-06-16
AI Technical Summary
Traditional image stitching methods suffer from ghosting, artifacts, and poor stitching results in the large-scale environment of airports, especially when processing images with low overlap.
By employing a multi-level constrained deep learning approach, a multi-view image deformation network and a reconstruction network are established to generate an airport edge structure framework map, thereby performing image deformation and reconstruction and ultimately achieving high-quality stitching of airport multi-view and multi-platform images.
It improves the accuracy and quality of image stitching, effectively eliminates artifacts, generates high-quality panoramic airport images, and supports real-time airport monitoring and security early warning.
Figure CN115222595B_ABST
Description
Technical Field
[0001] This invention provides a method for multi-platform, multi-view image stitching in a large-scale airport environment. Airport remote sensing images are processed with generative adversarial techniques to generate an edge structure framework map of the airport. Guided by this framework map, a multi-view image deformation network and a reconstruction network are established to deform and reconstruct the airport's multi-view, multi-platform images, ultimately producing a stitched image of the large-scale airport environment. The method is significant for applications such as security detection and data collection and analysis under airport surveillance, and belongs to the field of aviation surveillance.
Background Technology
[0002] With the development of computer vision, image processing has been widely used. However, in many scenarios, there are problems such as ghosting and artifacts in image stitching, which greatly limits the development space of detection technology.
[0003] Furthermore, as economic conditions improve, the number of passenger flights keeps increasing, and airport scheduling has gradually become a key focus of airport information monitoring. Airport towers are elevated, so cameras mounted on them can effectively cover the entire airport. Installing cameras of appropriate resolution on the tower makes it possible to monitor the movement of personnel, vehicles, and flights within the airport in real time, providing strong support for real-time airport scheduling and safety early warning.
[0004] However, each of several airport cameras may capture only part of the same target. To obtain complete information about the target and avoid potential hazards, the images captured from the airport tower must be stitched together. Traditional image stitching methods, such as those based on SIFT, suffer from ghosting, obvious seams, poor stitching quality, and severe image distortion when processing large-area, low-overlap images like those of airports. A multi-platform, multi-view image stitching method for large-scale airport environments is therefore needed.
Summary of the Invention
[0005] Airport tower cameras capture images of the entire airport from different perspectives during operation. However, due to these varying perspectives, image stitching in many scenarios suffers from issues such as ghosting and artifacts. By imposing multi-faceted constraints on multi-view images and performing pixel-level reconstruction of specific parts, stitching accuracy can be effectively improved. To address these problems, this invention provides a multi-platform, multi-view image stitching method for large-scale airport environments. A deep learning approach, employing multi-level constraints to improve feature extraction and stitching accuracy, is used to train an image reconstruction and stitching model for multi-view images captured by airport tower cameras. This significantly improves stitching accuracy and is of great significance in applications such as monitoring real-time airport dynamics and predicting security risks within the airport. This invention first processes airport remote sensing images using edge feature extraction to generate an edge structure framework map of the airport. Guided by this edge structure framework map, a multi-view image deformation network and a reconstruction network are established to deform and reconstruct multi-view, multi-platform airport images, ultimately obtaining a panoramic image stitched from a large-scale airport environment.
[0006] This invention provides a method for multi-platform, multi-view image stitching in a large-scale airport environment, including:
[0007] Collect multi-view images of the airport. Obtain scene images to be stitched together from multiple perspectives through airport tower monitoring.
[0008] A framework map is obtained from airport GIS images. The remote sensing images of the airport itself are processed to obtain a general framework map of the airport.
[0009] Image coarse alignment. Using the obtained framework map as a constraint, rotation, translation, and flipping operations are applied to the original airport images to transform them and obtain an updated dataset.
[0010] Image stitching and reconstruction. A neural network is built to process the coarsely aligned airport images from each viewpoint and eliminate artifacts. Constraints are introduced to improve the similarity between the processed images and the real images. The processed images are then decoded and reconstructed to obtain a pixel-level stitched image.
Brief Description of the Drawings
[0011] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0012] Figure 1 is a flowchart of training the detection network model in this invention;
[0013] Figure 2 is a flowchart of the multi-platform, multi-view image stitching method for large-scale airport environments in this invention.
Detailed Implementation
[0014] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0015] This invention provides a multi-platform, multi-view image stitching method for large-scale airport environments. An image reconstruction and stitching model trained through deep learning is applied to the multi-view images captured by airport tower cameras, effectively improving stitching accuracy. This method is of great significance in applications such as monitoring real-time airport dynamics and predicting potential security risks within the airport.
[0016] First, the dataset is fed into the network for training. After a certain number of iterations, the network can be tested and used. The specific process, as shown in Figure 1, includes:
[0017] S101: Acquire multi-view images of the airport. Obtain multi-view scene images to be stitched together through airport tower monitoring.
[0018] S102: Using a generative adversarial approach, image generation is performed on the satellite remote sensing image of the airport to obtain the general framework map of the airport, which is then saved as the backbone and constraint for subsequent image reconstruction and stitching.
[0019] S103: Using the airport framework structure diagram obtained in S102 as a constraint, feature extraction is performed on all original images to obtain the corresponding homography matrix. The relationship between the matrices is used to perform deformation processing on two highly similar images.
[0020] S104: The airport image dataset processed from S103 is sampled at low resolution, and the images are reconstructed and analyzed through filtering, encoding, and decoding to eliminate artifacts, thereby obtaining high-quality stitching results.
[0021] Specifically, in S101, cameras installed on the airport tower can capture videos of the entire airport from different perspectives, and there are overlapping areas between the images from different perspectives and camera positions. These images can be stored for analysis and processing to obtain an airport image database.
[0022] Specifically, in S102, a conditional generative adversarial network (CGAN) is used to generate images from the airport satellite remote sensing images. The generated images are then saved as conditions for subsequent image reconstruction and stitching.
[0023] Specifically, in S103, the airport framework structure diagram obtained in S102 is used as a constraint to extract features from all original images and obtain the corresponding homography matrix.
[0024] Specifically, when the airport framework map guides feature extraction, three sets are involved: the final feature point set, the set of feature points in the airport framework map, and the set of feature points extracted from the airport camera images by conventional methods. To ensure the accuracy of the final result, these sets should satisfy a set relation [relation not reproduced in the source text]. In addition, the pixel distances of a point in the original image and of a point in the framework map, each measured relative to a reference point, are used to represent the difference in pixel features between the two points. A function established on this difference [function not reproduced in the source text] measures the discrepancy between each pair of feature points and serves as feedback to guide the selection of the feature point set. Minimizing this value yields more accurate feature points.
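The framework-guided selection described above can be illustrated with a minimal sketch. Because the set relation and the feedback function are not reproduced in the source text, everything below is an assumption made for illustration: the function name `select_guided_points`, the `max_dist` threshold, the use of nearest-neighbour Euclidean distance as the pairwise difference, and the premise that camera feature points have already been mapped into the framework map's coordinate system.

```python
import math

def select_guided_points(camera_pts, framework_pts, max_dist):
    """Keep camera feature points within max_dist pixels of some framework point.

    Returns the kept points and the mean nearest-neighbour distance of the
    kept points, used here as a stand-in feedback value to minimise.
    (Hypothetical helper; not the patent's actual function.)
    """
    kept, dists = [], []
    for cx, cy in camera_pts:
        # Distance from this camera point to its closest framework point.
        nearest = min(math.hypot(cx - fx, cy - fy) for fx, fy in framework_pts)
        if nearest <= max_dist:
            kept.append((cx, cy))
            dists.append(nearest)
    cost = sum(dists) / len(dists) if dists else float("inf")
    return kept, cost

# Toy data: three framework points and three camera points (one is an outlier).
framework = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
camera = [(0.0, 3.0), (10.0, 4.0), (50.0, 50.0)]
kept, cost = select_guided_points(camera, framework, max_dist=5.0)
```

In this toy case the outlier at (50, 50) is discarded, and the returned cost could be compared across candidate point sets to prefer the one that agrees best with the framework map.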
[0025] Then, the relationship between the homography matrices is used to warp the two similar images.
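As a rough illustration of how a homography relates pixel coordinates in two views, the sketch below maps a single pixel through a 3x3 matrix using homogeneous coordinates. The matrices `H_shift` and `H_persp` are made-up values chosen for easy arithmetic, and `apply_homography` is a hypothetical helper; the source does not state how its homographies are estimated or applied.

```python
def apply_homography(H, point):
    """Map a pixel (x, y) through a 3x3 homography H given as nested lists."""
    x, y = point
    # Homogeneous multiplication: [u, v, w]^T = H [x, y, 1]^T
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by w to return to ordinary pixel coordinates.
    return (u / w, v / w)

# Assumed example 1: pure translation by (+100, +20) pixels.
H_shift = [[1.0, 0.0, 100.0],
           [0.0, 1.0, 20.0],
           [0.0, 0.0, 1.0]]
shifted = apply_homography(H_shift, (10.0, 5.0))

# Assumed example 2: a projective term in the last row changes the scale w.
H_persp = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0],
           [0.5, 0.0, 1.0]]
warped = apply_homography(H_persp, (2.0, 1.0))
```

Warping a whole image amounts to applying this mapping (or its inverse) to every pixel and resampling; in practice a library routine would normally be used for that step.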
[0026] Let the counterclockwise rotation angle be $\theta$. Before rotation, a point A on a training sample has coordinates $(x, y)$; after rotation, point A has coordinates $(x', y')$. Then
[0027] $x' = x\cos\theta - y\sin\theta$
[0028] $y' = x\sin\theta + y\cos\theta$
[0029] Finally, the training samples after rotation are flipped to form a new training set.
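The rotation and flipping steps can be sketched as follows. This is a minimal illustration that assumes rotation about the coordinate origin, a horizontal (left-right) flip, and a new training set that contains both the original and the flipped samples; the source does not specify the rotation centre, the flip axis, or whether the originals are retained. The helper names are hypothetical.

```python
import math

def rotate_point(x, y, theta_deg):
    """Rotate (x, y) counterclockwise about the origin by theta_deg degrees."""
    t = math.radians(theta_deg)
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

def flip_horizontal(image):
    """Mirror an image, stored as a list of pixel rows, left to right."""
    return [row[::-1] for row in image]

def augment(images):
    """Return the given samples followed by their horizontal flips."""
    return list(images) + [flip_horizontal(img) for img in images]

# Rotating the point (1, 0) by 90 degrees should give approximately (0, 1).
x_new, y_new = rotate_point(1.0, 0.0, 90.0)

# A tiny 2x3 "image" standing in for a rotated training sample.
sample = [[1, 2, 3],
          [4, 5, 6]]
training_set = augment([sample])
```

Note that in image coordinates with the y axis pointing down, a mathematically counterclockwise rotation appears clockwise on screen, so the sign convention should be checked against the image library in use.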
[0030] Specifically, in S104, the airport image dataset processed in S103 is sampled at low resolution. Convolutional layers are designed for filtering and encoding. Then, an encoding and decoding network is designed to reconstruct and analyze the image. The logic and method of image deformation during image stitching are learned to eliminate artifacts to the greatest extent and obtain high-quality stitching results.
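The low-resolution sampling at the start of S104 can be pictured with a small sketch. Average pooling is used here purely as an assumed stand-in, since the source does not say which downsampling method or factor is applied; `downsample` is a hypothetical helper operating on a grayscale image stored as a list of rows.

```python
def downsample(image, factor):
    """Average-pool a grayscale image (list of rows) by an integer factor."""
    h, w = len(image), len(image[0])
    if h % factor or w % factor:
        raise ValueError("image size must be divisible by factor")
    out = []
    for r in range(0, h, factor):
        row = []
        for c in range(0, w, factor):
            # Collect the factor x factor block and replace it with its mean.
            block = [image[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

img = [[0, 2, 4, 4],
       [2, 4, 4, 4],
       [8, 8, 1, 3],
       [8, 8, 5, 7]]
low_res = downsample(img, 2)
```

The reduced image would then be passed to the convolutional encoder; working at lower resolution cuts computation, with the decoder responsible for restoring pixel-level detail.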
[0031] During network reconstruction, constraints are introduced to guide the reconstruction process. They fall into two groups, content constraints and gap constraints, which are realized by introducing two parameters [parameter symbols not reproduced in the source text]. The difference between the result and the ground-truth image under the content and gap constraints is measured by an overall loss function [expression not reproduced in the source text]; minimizing this value brings the final deformation result close to the ground truth in both image features and pixel values. At this stage, the airport GIS image and the generated airport framework map are used together as constraints. Finally, the non-overlapping areas are decoded and restored, completing the transition from features to pixels and reconstructing a complete panoramic mosaic of the airport.
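One plausible way to combine a content term and a gap term into a single loss is a weighted sum, sketched below. This is an assumption, not the patent's formula: the source omits the loss expression and parameter symbols, so the weights `w_content` and `w_gap`, the use of mean absolute pixel difference for both terms, and the `seam_mask` marking the stitching gap region are all hypothetical simplifications (the source's content constraint may operate on learned features rather than raw pixels).

```python
def mean_abs_diff(a, b, mask=None):
    """Mean absolute pixel difference between images a and b (lists of rows).

    If mask is given, only pixels where mask is truthy are counted.
    """
    total, count = 0.0, 0
    for r, (row_a, row_b) in enumerate(zip(a, b)):
        for c, (pa, pb) in enumerate(zip(row_a, row_b)):
            if mask is None or mask[r][c]:
                total += abs(pa - pb)
                count += 1
    return total / count if count else 0.0

def stitching_loss(output, target, seam_mask, w_content=1.0, w_gap=1.0):
    """Weighted sum of a whole-image term and a seam-region term (assumed form)."""
    content = mean_abs_diff(output, target)        # difference over all pixels
    gap = mean_abs_diff(output, target, seam_mask) # difference near the seam only
    return w_content * content + w_gap * gap

# Toy 2x3 images that differ in one pixel lying on the assumed seam column.
output = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
target = [[1.0, 4.0, 3.0], [1.0, 2.0, 3.0]]
seam = [[0, 1, 0], [0, 1, 0]]
loss = stitching_loss(output, target, seam, w_content=1.0, w_gap=2.0)
```

Raising `w_gap` relative to `w_content` penalizes errors along the seam more heavily, which matches the stated goal of suppressing visible stitching artifacts.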
[0032] By following the steps above, the required image stitching model for airport towers under multiple perspectives can be trained. In practical applications, this model can be used to stitch together the collected images of target objects to obtain a panoramic image of the airport.
[0033] The specific implementation process in practical applications, as shown in Figure 2, is as follows:
[0034] S101: Acquire multi-view images of the airport. Obtain multi-view scene images to be stitched together through airport tower monitoring.
[0035] S102: Using a generative adversarial approach, image generation is performed on the satellite remote sensing image of the airport to obtain a general framework map of the airport, which is then saved as the backbone and constraint for subsequent image reconstruction and stitching.
[0036] S103: Using the airport framework structure diagram obtained in S102 as a constraint, feature extraction is performed on all original images to obtain the corresponding homography matrix;
[0037] Specifically, when the airport framework map guides feature extraction, three sets are involved: the final feature point set, the set of feature points in the airport framework map, and the set of feature points extracted from the airport camera images by conventional methods. To ensure the accuracy of the final result, these sets should satisfy a set relation [relation not reproduced in the source text]. In addition, the pixel distances of a point in the original image and of a point in the framework map, each measured relative to a reference point, are used to represent the difference in pixel features between the two points. A function established on this difference [function not reproduced in the source text] measures the discrepancy between each pair of feature points and serves as feedback to further guide the selection of the feature point set, ultimately yielding feature points with high accuracy.
[0038] Then, the relationship between the homography matrices is used to warp the two similar images.
[0039] Let the counterclockwise rotation angle be $\theta$. Before rotation, a point A on a training sample has coordinates $(x, y)$; after rotation, point A has coordinates $(x', y')$. Then
[0040] $x' = x\cos\theta - y\sin\theta$
[0041] $y' = x\sin\theta + y\cos\theta$
[0042] Finally, the training samples after rotation are flipped to form a new training set.
[0043] S104: The airport image dataset processed in S103 is sampled at low resolution. Convolutional layers are designed for filtering and encoding, and then deconvolutional layers are designed to reconstruct and analyze the images. The logic and methods of image deformation during image stitching are learned to eliminate artifacts to the greatest extent possible, so as to obtain high-quality stitching results.
[0044] During network reconstruction, constraints are introduced to guide the reconstruction process. They fall into two groups, content constraints and gap constraints, which are realized by introducing two parameters [parameter symbols not reproduced in the source text]. The difference between the result and the ground-truth image under the content and gap constraints is measured by an overall loss function [expression not reproduced in the source text]; minimizing this value brings the final deformation result close to the ground truth in both image features and pixel values. At this stage, the airport GIS image and the generated airport framework map are used together as constraints. Finally, the non-overlapping areas are decoded and restored, completing the transition from features to pixels and reconstructing a complete panoramic mosaic of the airport.
[0045] By following the steps above, the required image stitching model for airport towers under multiple perspectives can be trained. In practical applications, this model can be used to stitch together the collected images of target objects to obtain a panoramic image of the airport.
[0046] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. A multi-platform, multi-view image stitching method for large-scale airport environments, characterized by comprising:
S101: acquiring multi-view images of the airport, wherein multi-view scene images to be stitched together are obtained through airport tower monitoring;
S102: using a generative adversarial approach, performing image generation on the satellite remote sensing image of the airport to obtain the general framework map of the airport, which is then saved as the backbone and constraint for subsequent image reconstruction and stitching;
S103: using the airport framework map obtained in S102 as a constraint, performing feature extraction on all original images to obtain the corresponding homography matrix; specifically, in the process of using the airport framework map to guide feature extraction, the final feature point set, the set of feature points in the airport framework map, and the set of feature points extracted from airport camera images by general methods satisfy a set relation [relation not reproduced in the source text]; simultaneously, the pixel distances of a point in the original image and of a point in the framework map, each measured relative to a reference point, are used to represent the difference in pixel features between the two points, and a function [function not reproduced in the source text] is established to measure the difference between each pair of feature points and to serve as feedback guiding the selection of the feature point set, ultimately yielding feature points with high accuracy; then, the relationship between the matrices is used to warp the two similar images; finally, the training samples after rotation transformation are flipped to form a new training set;
S104: sampling the airport image dataset processed in S103 at low resolution, designing convolutional layers for filtering and encoding, and then designing deconvolutional layers to reconstruct and analyze the images, whereby the logic and method of image deformation during image stitching are learned to eliminate artifacts to the greatest extent and obtain high-quality stitching results.
By following the steps above, the required multi-view image stitching model for airport towers can be trained. In practical applications, this model can be used to stitch together the collected images of target objects to obtain a panoramic image of the airport.