A background motion compensation method for complex-background infrared images based on image registration
By combining the quadtree algorithm and LGB descriptor with position and grayscale information, the problem of inaccurate feature point matching in infrared image registration is solved, the effect of background motion compensation is improved, and a better foundation is laid for infrared weak target detection.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHANGHAI AEROSPACE CONTROL TECH INST
- Filing Date
- 2024-08-07
- Publication Date
- 2026-06-23
AI Technical Summary
Existing infrared image registration algorithms struggle to accurately match feature points against complex backgrounds, resulting in poor background motion compensation and impacting the detection of weak infrared targets.
The quadtree algorithm is used to remove redundant feature points, generate LGB descriptors that combine location and grayscale information, map feature point coordinates through a hash function, optimize matching feature point pairs using a random sampling consensus algorithm, and finally calculate background motion parameters for compensation.
It improves the accuracy and discriminative power of feature point matching, reduces the probability of false matching, enhances the effect of background motion compensation, and provides better conditions for subsequent infrared weak target detection.
Figure CN118967762B_ABST
Description
Technical Field
[0001] This invention relates to a method for background motion compensation in complex background infrared images based on image registration, belonging to the field of image processing.
Background Technology
[0002] Infrared target detection against complex backgrounds has crucial application value in key areas such as military defense. However, its application faces numerous challenges, the most prominent being the accurate discrimination of targets with significantly lower energy against complex backgrounds (such as clouds, terrain, and buildings). In some complex infrared scenes, targets are small, signal strength is weak, and background clutter interference is severe. Classic spatial-domain infrared small target detection algorithms often perform poorly in such scenarios, but background motion compensation can make up for these algorithms' shortcomings when handling complex background images. Background motion compensation is generally achieved by calculating background motion parameters. Differencing the current frame image with the previous frame image after background motion compensation can effectively eliminate the background, providing better conditions for subsequent infrared target detection.
[0003] Among numerous background motion compensation techniques, image registration methods are widely used due to their high accuracy. The core of image registration lies in detecting feature points in the image and generating descriptors for these feature points. The most commonly used image registration algorithm in engineering is the ORB (Oriented FAST and Rotated BRIEF) algorithm, which combines the FAST (Features from Accelerated Segment Test) feature point detection algorithm with the BRIEF (Binary Robust Independent Elementary Features) descriptor. By adding orientation information to the feature points, it improves rotation invariance. This algorithm is favored for its low computational cost and high accuracy. However, the ORB algorithm still performs poorly in environments with complex contrast variations and dense feature points, exhibiting problems such as large feature matching errors.
[0004] Besides the BRIEF descriptor, FREAK (Fast Retina Keypoint) and BEBLID (Boosted Efficient Binary Local Image Descriptor) are also commonly used descriptors, both offering advantages in computational and storage efficiency. The FREAK descriptor mimics the sampling pattern of the human retina to more effectively capture and compare image feature points; the BEBLID descriptor utilizes the AdaBoost algorithm and imbalanced datasets for training to improve its accuracy and efficiency. However, these descriptors primarily form feature vectors based on the grayscale distribution around feature points. For complex backgrounds with repetitive textures in infrared images, the grayscale distributions around feature points are highly similar, resulting in similar descriptors. This similarity can make it difficult to clearly distinguish feature points during matching, potentially leading to mismatches. This, in turn, significantly impacts the calculation of the transformation matrix, ultimately affecting the background motion compensation effect.
Summary of the Invention
[0005] The technical problem solved by this invention is to overcome the shortcomings of the prior art and provide a background motion compensation method for complex background infrared images based on image registration, which improves the effect and efficiency of background motion compensation and provides better conditions for subsequent infrared weak target detection.
[0006] The technical solution of this invention is: a method for background motion compensation of complex background infrared images based on image registration, comprising:
[0007] Load two adjacent infrared images, use the FAST feature point detection algorithm to detect feature points in the images, and then use the quadtree algorithm to remove redundant feature points, yielding the preprocessed feature points;
[0008] Generate location descriptors and grayscale descriptors, and concatenate the location descriptors and grayscale descriptors with the BEBLID descriptor to generate the LGB descriptor; use the LGB descriptor to describe the features of all preprocessed feature points in the two images;
[0009] Each image is divided into four quadrants. The hash key value of each feature point in the two images is calculated. The coordinates of feature points in different quadrants are mapped to different key values through a hash function, and the coordinates of feature points in the same quadrant are mapped to the same key value through a hash function. The feature points in the two images are matched according to the hash key values to obtain matching feature point pairs. The random sampling consensus algorithm is used to further filter the matching feature point pairs. The final background motion parameters are calculated using the selected matching feature point pairs. Based on the final background motion parameters, an affine transformation is performed on the previous frame of the two images to achieve background motion compensation.
[0010] Preferably, the method of using the quadtree algorithm to remove redundant feature points is as follows:
[0011] S1-1 first creates a parent node, which is the entire image region of a frame. The feature point set of the parent node is the set of all feature points obtained by the FAST feature point detection algorithm on the entire image.
[0012] S1-2 divides the parent node into four child nodes according to the four quadrants, with each child node being a region corresponding to one of the quadrants; based on the coordinate quadrant regions, the feature points in the feature point set of the parent node are assigned to the feature point sets of the corresponding child nodes;
[0013] S1-3 creates a variable COUNT to count the number of child nodes whose feature point set is not empty. If COUNT does not reach the expected value or the number of feature points in each child node is greater than 1, then child nodes with at most 1 feature point are marked as not needing further segmentation and are no longer processed, while child nodes with more than 1 feature point are used as new parent nodes, and the process returns to S1-2 to continue iterating for each new parent node.
[0014] S1-4: When COUNT reaches the expected value or the number of feature points in each child node is less than or equal to 1, the feature point with the largest response value in each child node is extracted, and the quadtree algorithm ends. The response value of a feature point is defined as the sum of the absolute values of the differences between the gray values of all pixels within a circle with a radius of 3 pixels centered at the feature point and the gray value of the feature point.
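The response value defined in S1-4 can be illustrated with a minimal Python sketch. It assumes the image is a list of rows of integer gray values; the function name `response_value` and the choice to skip pixels that fall outside the image are illustrative assumptions, not details given in the text.

```python
def response_value(image, x, y, radius=3):
    """Sum of |gray(q) - gray(p)| over all pixels q inside the circle around p = (x, y)."""
    height, width = len(image), len(image[0])
    center = image[y][x]
    total = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx * dx + dy * dy > radius * radius:
                continue  # pixel lies outside the circle of the given radius
            qx, qy = x + dx, y + dy
            # Assumption: pixels beyond the image border are simply ignored.
            if 0 <= qx < width and 0 <= qy < height:
                total += abs(image[qy][qx] - center)
    return total
```

A feature point on a flat patch scores 0, while an isolated bright or dark point scores high, so keeping the largest response per node favors the most distinctive point.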
[0015] Preferably, the attributes of the parent node include the coordinates of the top left corner, the top right corner, the bottom left corner, the bottom right corner, and the set of feature points it contains.
[0016] Preferably, the attributes of each child node include the coordinates of the top left corner, top right corner, bottom left corner, and bottom right corner of its corresponding quadrant, as well as the set of feature points in the quadrant region it belongs to.
[0017] Preferably, the location descriptor D_L(p) is:
[0018]
[0019] Where 'a' represents dividing the entire image into a grid of size a×a, where the size of 'a' is half the number of bits in the descriptor; 'm' and 'n' represent the row and column numbers of the grid, respectively; ':=' means "defined as", that is, the expression on the right is defined as the name on the left.
[0020] Preferably, the grayscale descriptor D_g(p) is:
[0021]
[0022] where ⌊·⌋ indicates rounding down; G(p) represents the gray value of feature point P; and ':=' means "defined as", that is, the expression on the right is defined as the name on the left.
[0023] Preferably, the hash key value H(P) of each feature point is calculated as follows:
[0024]
[0025] where ⌊·⌋ represents rounding down; (x, y) represents the coordinates of feature point P; W and H represent the width and height of the image, respectively.
[0026] Preferably, when the coordinates of feature points in the same quadrant region are mapped to the same key value through a hash function, a unique hash key value is assigned to the feature points in the quadrant boundary region, and the size of the quadrant boundary region is determined based on the difference in the field of view between the two frames.
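The formula for H(P) is not reproduced in this text, so the following Python sketch is only one mapping that is consistent with the description: points in the same quadrant share a key, points in different quadrants get different keys, and points in a boundary band get one extra key. The expression ⌊2x/W⌋ + 2⌊2y/H⌋, the constant `BOUNDARY_KEY`, and the `margin` parameter are all assumptions made for illustration.

```python
BOUNDARY_KEY = 4  # hypothetical key reserved for the quadrant boundary region


def hash_key(x, y, width, height, margin=0):
    """Map pixel coordinates to a quadrant key 0..3, or BOUNDARY_KEY near a dividing line."""
    # Assumption: the boundary region is a band of +/- margin pixels around each dividing line.
    if abs(x - width / 2) < margin or abs(y - height / 2) < margin:
        return BOUNDARY_KEY
    col = (2 * x) // width   # 0 for the left half, 1 for the right half
    row = (2 * y) // height  # 0 for the top half, 1 for the bottom half
    return int(col + 2 * row)
```

In practice `margin` would be chosen from the expected field-of-view shift between the two frames, as the paragraph above states.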
[0027] Preferably, feature points in two images are matched based on hash key values to obtain matching feature point pairs between the two images, specifically as follows:
[0028] Based on each feature point in the first image, feature points with the same hash key are searched in the second image. The similarity of the matching is evaluated by calculating the Hamming distance between the LGB binary descriptors of the two feature points. Finally, the feature point corresponding to the descriptor with the smallest Hamming distance is selected as the matching pair.
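The key-restricted nearest-neighbor search described above can be sketched in Python as follows. The sketch assumes each feature point is given as a pair `(hash_key, descriptor)` with the descriptor stored as a non-negative integer; the function names are illustrative.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of differing bits between two integer-encoded binary descriptors."""
    return bin(a ^ b).count("1")


def match_by_key(points_a, points_b):
    """Match each point of image A to the closest descriptor in image B with the same key.

    Returns a list of (index_a, index_b, distance) tuples.
    """
    buckets = defaultdict(list)
    for j, (key, desc) in enumerate(points_b):
        buckets[key].append((j, desc))
    matches = []
    for i, (key, desc) in enumerate(points_a):
        candidates = buckets.get(key, [])
        if not candidates:
            continue  # no point in image B shares this hash key
        j, best = min(candidates, key=lambda c: hamming(desc, c[1]))
        matches.append((i, j, hamming(desc, best)))
    return matches
```

Because each query only scans its own bucket, the number of Hamming comparisons drops roughly in proportion to the number of keys when points are spread evenly.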
[0029] Preferably, a random sampling consensus algorithm is used to select matching feature point pairs, and the final background motion parameters are calculated using the selected matching feature point pairs, specifically:
[0030] The random sampling consensus algorithm is used for iteration. In each iteration, four matching feature point pairs are randomly selected as subsets to estimate the background motion parameters. The number of matching pairs among all matching feature point pairs that satisfy the background motion parameters obtained in this iteration is evaluated. The subset with the largest number of matching pairs that satisfy the estimated background motion parameters is selected to calculate the final background motion parameters.
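A minimal pure-Python sketch of this iteration is given below, assuming the background motion parameters are the six coefficients of an affine model u = a·x + b·y + c, v = d·x + e·y + f. The least-squares fit over the four sampled pairs, the inlier threshold, the iteration count, and all function names are assumptions for illustration.

```python
import random


def solve3(m, v):
    """Solve the 3x3 linear system m * p = v by Cramer's rule; None if singular."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    if abs(d) < 1e-9:
        return None  # degenerate sample, e.g. collinear points
    out = []
    for k in range(3):
        mk = [row[:] for row in m]
        for r in range(3):
            mk[r][k] = v[r]
        out.append(det(mk) / d)
    return out


def fit_affine(pairs):
    """Least-squares affine fit to pairs ((x, y), (u, v)); returns (a, b, c, d, e, f)."""
    rows = [(x, y, 1.0) for (x, y), _ in pairs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):  # k = 0 fits u, k = 1 fits v
        atb = [sum(r[i] * dst[k] for r, (_, dst) in zip(rows, pairs)) for i in range(3)]
        sol = solve3(ata, atb)
        if sol is None:
            return None
        params.extend(sol)
    return tuple(params)


def residual(model, pair):
    """Euclidean distance between the mapped source point and its matched point."""
    (x, y), (u, v) = pair
    a, b, c, d, e, f = model
    return ((a * x + b * y + c - u) ** 2 + (d * x + e * y + f - v) ** 2) ** 0.5


def ransac_affine(pairs, iterations=200, threshold=1.5, seed=0):
    """Return (model, inliers) for the sampled subset with the most consistent pairs."""
    if len(pairs) < 4:
        return None, []
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit_affine(rng.sample(pairs, 4))  # four randomly selected pairs
        if model is None:
            continue
        inliers = [p for p in pairs if residual(model, p) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

A common variant refits the model on all inliers of the winning subset; the sketch keeps the parameters of the winning subset itself, which is the closer reading of the paragraph above.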
[0031] Compared with the prior art, the present invention has the following advantages:
[0032] (1) The background motion compensation algorithm proposed in this invention incorporates position and grayscale information into the descriptor construction method, which can achieve better matching results in environments with complex contrast changes, dense feature points, and repetitive textures. It improves the accuracy and discrimination ability of feature point matching, reduces the probability of mismatch, enhances the effect of background motion compensation, and provides better conditions for subsequent infrared weak target detection.
[0033] (2) The present invention maps the coordinates of feature points in different regions to corresponding key values through a hash function. When matching, only feature points with the same key value are queried, which reduces the search range and time during matching and saves computation.
Attached Figure Description
[0034] Figure 1 is a flowchart of the method of the present invention;
[0035] Figure 2 is a schematic diagram of the quadtree algorithm for removing redundant feature points according to the present invention;
[0036] Figure 3 is a schematic diagram illustrating the generation of the location descriptor in this invention;
[0037] Figure 4 is a schematic diagram of grayscale descriptor generation according to the present invention;
[0038] Figure 5 is a feature point distribution map before the use of the quadtree algorithm in this invention;
[0039] Figure 6 is a feature point distribution map after using the quadtree algorithm in this invention;
[0040] Figure 7 is the infrared image of the previous frame after background motion compensation in this invention;
[0041] Figure 8 is a difference image between the current frame image and the previous frame image after background motion compensation in this invention.
Detailed Implementation
[0042] The purpose of this invention is to propose a background motion compensation method for complex background infrared images based on image registration, which improves the effect and efficiency of background motion compensation and provides better conditions for subsequent infrared weak target detection.
[0043] The objective of this invention is achieved through the following technical solution: First, a quadtree algorithm is used to solve the problem of dense feature points, and an LGB (Location-Gray-BEBLID) descriptor is proposed, incorporating the location and grayscale information of feature points as key distinguishing features into the construction of the binary descriptor. When matching feature points, this invention employs a block-based matching strategy. By constructing a hash function, the coordinates of feature points in different regions of the image are mapped to corresponding key values. During the search, only feature points within the corresponding key values are searched, avoiding point-by-point matching of feature points in one image with all feature points in another image, thus reducing computational complexity.
[0044] The flowchart of this invention is shown in Figure 1. The details are as follows:
[0045] Step 1: Load two adjacent infrared images, use the FAST feature point detection algorithm to detect feature points, and then use the quadtree algorithm to remove redundant feature points, obtaining the preprocessed feature points. The removal process is illustrated in Figure 2. The specific steps are as follows:
[0046] S1-1 first creates a parent node, which corresponds to the entire image region of a frame. The parent node's attributes include the coordinates of its top-left, top-right, bottom-left, and bottom-right corners, as well as the set of feature points it contains. Assuming the image width is W and the height is H, then the parent node's top-left corner is (0, 0), top-right corner is (W-1, 0), bottom-left corner is (0, H-1), and bottom-right corner is (W-1, H-1); all coordinates mentioned below refer to their positions within the entire image. The set of feature points in the parent node is the set of all feature points obtained from the FAST feature point detection algorithm across the entire image. Then, operation S1-2 is performed on the parent node.
[0047] S1-2 divides the parent node into four child nodes according to the four quadrants. Each child node represents a region corresponding to one of the quadrants. The attributes of each child node include the coordinates of the top left, top right, bottom left, and bottom right corners of its corresponding quadrant, as well as the set of feature points in its region. For each feature point in the parent node, it is assigned to the corresponding child node based on its coordinate position (the quadrant it belongs to).
[0048] S1-3 creates a variable COUNT to count the number of child nodes whose feature point set is not empty. If COUNT does not reach the expected value or the number of feature points in each child node is greater than 1, then child nodes with at most 1 feature point are marked as not needing further segmentation and are no longer processed, while child nodes with more than 1 feature point are used as new parent nodes, and the process returns to S1-2 to continue iterating for each new parent node.
[0049] S1-4: When COUNT reaches the expected value or the number of feature points in each child node is less than or equal to 1, the feature point with the largest response value in each child node is extracted, and the quadtree algorithm ends. The response value of a feature point is defined as the sum of the absolute values of the differences between the gray values of all pixels within a circle with a radius of 3 pixels centered at the feature point and the gray value of the feature point.
[0050] The preprocessed feature points are obtained after removing redundant feature points.
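Steps S1-1 to S1-4 can be sketched in Python as below. The sketch assumes each feature point is a tuple `(x, y, response)` with a precomputed response value, and it stops as soon as COUNT reaches `expected` or no node can be split further. The one-pixel minimum node size, which guards against points sharing the same pixel, and the function names are assumptions not stated in the text.

```python
def can_split(node):
    """A node is split only if it holds several points and is larger than one pixel."""
    x0, y0, x1, y1, pts = node
    return len(pts) > 1 and (x1 - x0 > 1 or y1 - y0 > 1)


def quadtree_filter(points, width, height, expected):
    """Thin feature points (x, y, response) to at most one point per quadtree node."""
    nodes = [(0, 0, width, height, list(points))]  # S1-1: parent node covers the whole image
    while True:
        count = sum(1 for n in nodes if n[4])  # COUNT: nodes with a non-empty point set
        splittable = [n for n in nodes if can_split(n)]
        if count >= expected or not splittable:
            break  # S1-4 stopping condition
        next_nodes = [n for n in nodes if not can_split(n)]  # S1-3: no further segmentation
        for x0, y0, x1, y1, pts in splittable:  # S1-2: split into four quadrants
            xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
            quads = {(False, False): [], (True, False): [],
                     (False, True): [], (True, True): []}
            for p in pts:
                quads[(p[0] >= xm, p[1] >= ym)].append(p)
            for (right, lower), sub in quads.items():
                if sub:  # only non-empty child nodes are kept
                    next_nodes.append((xm if right else x0, ym if lower else y0,
                                       x1 if right else xm, y1 if lower else ym, sub))
        nodes = next_nodes
    # S1-4: keep the point with the largest response value in each node
    return [max(n[4], key=lambda p: p[2]) for n in nodes if n[4]]
```

With `expected` set to the desired number of feature points, clusters of nearby detections collapse to their strongest member while isolated points survive unchanged.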
[0051] Step 2: Based on the BEBLID descriptor, fuse the position and grayscale information of the feature points to generate the LGB descriptor. Use the LGB descriptor to describe the features of all the preprocessed feature points obtained in Step 1. The specific steps for generating the LGB descriptor are as follows:
[0052] 1) Generate location descriptors:
[0053]
[0054] Where 'a' represents dividing the entire image into a grid of size a×a (this invention constructs a 16-bit binary descriptor, so 'a' is set to 8); 'm' and 'n' represent the position of the feature point within the grid, i.e., the grid cell located in the m-th row and n-th column; ':=' means "defined as", that is, the expression on the right is defined as the name on the left. A schematic diagram is shown in Figure 3.
[0055] 2) Generate grayscale descriptors:
[0056]
[0057] where ⌊·⌋ indicates rounding down; G(p) represents the gray value of feature point P. A schematic diagram is shown in Figure 4.
[0058] 3) The 16-bit position descriptor, the 16-bit grayscale descriptor, and the 32-bit BEBLID descriptor are concatenated and combined to form a 64-bit LGB binary descriptor.
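The concatenation in step 3) can be sketched in Python as follows. The formulas for the location and grayscale descriptors are not reproduced in this text, so the sketch takes all three parts as already-computed integers; the bit order (location, then grayscale, then BEBLID) and the function name `concat_lgb` are assumptions.

```python
def concat_lgb(location_bits, gray_bits, beblid_bits):
    """Pack a 16-bit location, 16-bit grayscale and 32-bit BEBLID part into 64 bits."""
    if not (0 <= location_bits < 1 << 16
            and 0 <= gray_bits < 1 << 16
            and 0 <= beblid_bits < 1 << 32):
        raise ValueError("descriptor part out of range")
    # Assumed layout: [location | grayscale | BEBLID], most significant bits first.
    return (location_bits << 48) | (gray_bits << 32) | beblid_bits
```

Because the result is a single 64-bit word, the Hamming distance between two LGB descriptors is one XOR followed by a bit count.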
[0059] Step 3: Divide each image into four regions according to the four quadrants. Calculate the hash key value of each feature point in both images. Map the coordinates of feature points in different regions to different key values using a hash function, and map the coordinates of feature points in the same quadrant region to the same key value using a hash function. Match the feature points in the two images based on the hash key values and LGB descriptors to obtain matching feature point pairs. Use the random sampling consensus algorithm to further filter and optimize the matched feature point pairs. Calculate the final background motion parameters using the pixel position information of the filtered matched feature point pairs in the corresponding images. Perform an affine transformation on the previous frame of the two images based on the final background motion parameters to achieve background motion compensation. The specific steps are as follows:
[0060] 1) Divide each image into four regions according to the four quadrants. Map the coordinates of feature points in different quadrant regions to different key values using a hash function. Map feature points in the same region to the same key value using a hash function. Calculate the hash key value H(P) corresponding to feature point P:
[0061]
[0062] where ⌊·⌋ denotes rounding down; (x, y) represents the coordinates of feature point P; W and H represent the width and height of the entire image, respectively. This formula yields the same key value for feature points at different locations within the same region. To prevent feature points near a region boundary, whose true match may lie in an adjacent quadrant, from causing matching failures or errors, feature points in the quadrant boundary region are assigned a unique hash key value. The size of the quadrant boundary region is determined based on the difference in the field of view between the two frames.
[0063] 2) Matching feature points in two images: During the matching process, feature points with the same hash key value are searched in the second image based on each feature point in the first image. The similarity of the matching is evaluated by calculating the Hamming distance between the LGB binary descriptors of the two feature points. Finally, the feature point corresponding to the descriptor with the smallest Hamming distance is selected as the matching feature point pair.
[0064] 3) The Random Sample Consensus (RANSAC) algorithm is used to estimate the background motion parameters by iteratively and randomly selecting four matching feature point pairs as subsets. The number of matching pairs among all matching feature point pairs that satisfy the background motion parameters obtained in this iteration is evaluated, and the subset with the largest number of matching pairs is selected to calculate the final background motion parameters.
[0065] 4) Perform an affine transformation on the previous frame of the two images based on the background motion parameters to achieve background motion compensation.
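Step 4) and the subsequent frame differencing can be sketched in Python as below. The sketch assumes the background motion parameters are `(a, b, c, d, e, f)` mapping previous-frame coordinates (x, y) to current-frame coordinates u = a·x + b·y + c, v = d·x + e·y + f, and that images are lists of rows of gray values. Nearest-neighbor sampling, the fill value, and the function names are illustrative choices.

```python
def warp_affine(prev, model, fill=0):
    """Resample the previous frame into the current frame's coordinates."""
    a, b, c, d, e, f = model
    det = a * e - b * d
    if abs(det) < 1e-12:
        raise ValueError("affine model is not invertible")
    height, width = len(prev), len(prev[0])
    out = [[fill] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            # Inverse mapping: find the previous-frame pixel that lands on (u, v).
            x = (e * (u - c) - b * (v - f)) / det
            y = (-d * (u - c) + a * (v - f)) / det
            xi, yi = round(x), round(y)  # nearest-neighbor sampling
            if 0 <= xi < width and 0 <= yi < height:
                out[v][u] = prev[yi][xi]
    return out


def frame_difference(curr, compensated):
    """Absolute per-pixel difference between the current and the compensated frame."""
    return [[abs(p - q) for p, q in zip(r1, r2)] for r1, r2 in zip(curr, compensated)]
```

After compensation the static background cancels in the difference image, so the remaining non-zero pixels are candidates for moving targets.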
[0066] This invention first employs a quadtree algorithm to remove redundant feature points, addressing the issue of dense feature point distribution. Furthermore, the location information of each feature point within the entire image is used as a key distinguishing feature to construct a 16-bit binary descriptor. To preserve the grayscale information of each feature point and use it as a key distinguishing feature, this invention converts the grayscale value of each feature point into a 16-bit binary descriptor to enhance the uniqueness of the feature point. The location descriptor, grayscale descriptor, and 32-bit BEBLID descriptor are concatenated and combined to form a 64-bit LGB binary descriptor.
[0067] Meanwhile, this invention maps the coordinates of feature points in different regions to corresponding key values using a hash function. During matching, only feature points with the same key value are queried, reducing the search range and time during matching.
[0068] Current descriptors only consider the gray-level distribution around feature points. However, in an image, if there are repetitive or periodic structures, feature points at the same location within these structures are likely to have similar or even identical descriptors. This similarity in descriptors can lead to misclassification when matching feature points, significantly impacting the calculation of the transformation matrix. The background motion compensation algorithm proposed in this invention incorporates position and gray-level information into the descriptor construction process, improving the accuracy and discriminative power of feature point matching, thereby enhancing the effect of background motion compensation and providing better conditions for subsequent infrared weak target detection.
[0069] The specific implementation of the present invention will be described below using two adjacent frames of infrared images with complex backgrounds:
[0070] Step 1: Load two adjacent infrared images, perform feature point detection using the FAST feature point detection algorithm, and then use the quadtree algorithm to remove redundant feature points. The feature point distribution before using the quadtree algorithm is shown in Figure 5, and the feature point distribution after using the quadtree algorithm is shown in Figure 6.
[0071] Step 2: Perform feature description on the processed feature points, and fuse the position and grayscale information of the feature points based on the BEBLID descriptor to generate the LGB descriptor.
[0072] Step 3: Map the coordinates of feature points in different regions to different keys using a hash function. During matching, only feature points with the same key are queried, reducing the search range and time. The Random Sample Consensus (RANSAC) algorithm is then used to further filter and optimize the matching results. Background motion compensation is achieved by calculating background motion parameters using the obtained feature point information. The previous-frame infrared image after background motion compensation is shown in Figure 7, and the difference image between the current frame image and the previous frame image after background motion compensation is shown in Figure 8.
[0073] The contents not described in detail in this specification are existing technologies known to those skilled in the art.
Claims
1. A method for background motion compensation in infrared images with complex backgrounds based on image registration, characterized in that it comprises: loading two adjacent infrared images, performing feature point detection on the images, and then removing redundant feature points to obtain preprocessed feature points; generating location descriptors and grayscale descriptors, and concatenating the location descriptors and grayscale descriptors with the BEBLID descriptor to generate the LGB descriptor; using the LGB descriptor to describe the features of all preprocessed feature points in the two images; dividing each image into four quadrant regions; calculating the hash key value of each feature point in the two images, wherein the coordinates of feature points in different quadrant regions are mapped to different key values through a hash function, and the coordinates of feature points in the same quadrant region are mapped to the same key value through the hash function; matching the feature points in the two images according to the hash key values and LGB descriptors to obtain the matching feature point pairs in the two images; using the random sampling consensus algorithm to further filter the matching feature point pairs; calculating the final background motion parameters using the selected matching feature point pairs; and performing an affine transformation on the previous frame of the two images according to the final background motion parameters to achieve background motion compensation.
2. The method for background motion compensation of complex background infrared images based on image registration according to claim 1, characterized in that: the quadtree algorithm is used to remove redundant feature points, specifically: S1-1 first creates a parent node, which represents the entire image region of a frame. The feature point set of the parent node is the set of all detected feature points on the entire image. Here, the FAST feature point detection algorithm is used for feature point detection. S1-2 divides the parent node into four child nodes according to the four quadrants, with each child node being a region corresponding to one of the quadrants; based on the coordinate quadrant regions, the feature points in the feature point set of the parent node are assigned to the feature point sets of the corresponding child nodes. S1-3 creates a variable COUNT to count the number of child nodes whose feature point set is not empty. If COUNT does not reach the expected value or the number of feature points in each child node is greater than 1, then child nodes with at most 1 feature point are marked as not needing further segmentation and are no longer processed, while child nodes with more than 1 feature point are used as new parent nodes, and the process returns to S1-2 to continue iterating for each new parent node. S1-4: When COUNT reaches the expected value or the number of feature points in each child node is less than or equal to 1, the feature point with the largest response value in each child node is extracted, and the quadtree algorithm ends. The response value of a feature point is defined as the sum of the absolute values of the differences between the gray value of the feature point and the gray values of all pixels within a circle centered at the feature point whose radius is a set number of pixels.
3. The method for background motion compensation of complex background infrared images based on image registration according to claim 2, characterized in that: The attributes of the parent node include the coordinates of the top left corner, top right corner, bottom left corner, bottom right corner, and the set of feature points it contains.
4. The method for background motion compensation of complex background infrared images based on image registration according to claim 2, characterized in that: Each child node's attributes include the coordinates of its upper left, upper right, lower left, and lower right corners in its corresponding quadrant, as well as the set of feature points in the quadrant region it belongs to.
5. The method for background motion compensation of complex background infrared images based on image registration according to claim 1, characterized in that: the location descriptor D_L(p) is: where 'a' represents dividing the entire image into a grid of size a×a, where the size of 'a' is half the number of bits in the descriptor's binary representation; 'm' and 'n' represent the row and column numbers of the grid, respectively; ':=' means "defined as", that is, the expression on the right is defined as the name on the left.
6. The method for background motion compensation of complex background infrared images based on image registration according to claim 1, characterized in that: the grayscale descriptor D_g(p) is: where ⌊·⌋ indicates rounding down; G(p) represents the gray value of feature point P; and ':=' means "defined as", that is, the expression on the right is defined as the name on the left.
7. The method for background motion compensation of complex background infrared images based on image registration according to claim 1, characterized in that: the hash key value H(P) of each feature point is calculated as follows: where ⌊·⌋ represents rounding down; (x, y) represents the coordinates of feature point P; W and H represent the width and height of the image, respectively.
8. The method for background motion compensation of complex background infrared images based on image registration according to claim 1, characterized in that: When the coordinates of feature points in the same quadrant region are mapped to the same key value through a hash function, a unique hash key value is assigned to the feature points in the quadrant boundary region. The size of the quadrant boundary region is determined based on the difference in the field of view between the two frames.
9. The method for background motion compensation of complex background infrared images based on image registration according to claim 1, characterized in that: Feature points in two images are matched based on hash keys and LGB descriptors to obtain matching feature point pairs between the two images, specifically: Based on each feature point in the first image, feature points with the same hash key are searched in the second image. The similarity of the matching is evaluated by calculating the Hamming distance between the LGB descriptors of the two feature points. Finally, the feature point corresponding to the descriptor with the smallest Hamming distance is selected as the matching pair.
10. The method for background motion compensation of complex background infrared images based on image registration according to claim 1, characterized in that: the random sampling consensus algorithm is used to select matching feature point pairs, and the selected matching feature point pairs are used to calculate the final background motion parameters, specifically: the random sampling consensus algorithm is used for iteration. In each iteration, four matching feature point pairs are randomly selected as subsets to estimate the background motion parameters. The number of matching pairs among all matching feature point pairs that satisfy the background motion parameters obtained in this iteration is evaluated. The subset with the largest number of matching pairs that satisfy the estimated background motion parameters is selected to calculate the final background motion parameters.