A loop closure detection method based on block feature uniform weighting and distance sorting in outdoor complex environment
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- GUANGDONG OCEAN UNIVERSITY
- Filing Date
- 2026-04-28
- Publication Date
- 2026-06-23
Figure CN122265733A_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of visual simultaneous localization and mapping and computer vision, and in particular relates to a loop closure detection method based on uniform weighting of block features and distance sorting for use in complex outdoor environments.
Background Technology
[0002] Visual simultaneous localization and mapping (visual SLAM) and visual localization have become key supporting technologies in fields such as autonomous robot navigation and augmented reality because of their low equipment cost and rich information content. Loop closure detection, a core component of visual SLAM systems, aims to identify whether the camera has revisited a previously mapped location, so that accumulated errors can be eliminated and a globally consistent trajectory and map can be constructed. In complex outdoor environments, however, visual systems often face challenges such as drastic changes in lighting conditions. Under these conditions the quality of the images acquired by the camera sensor is easily degraded, and traditional feature extraction algorithms that rely on local gray-level gradients struggle to extract a sufficient number of stable, repeatable feature points between consecutive frames, which reduces loop closure detection accuracy or causes detection to fail.
[0003] Existing loop closure detection methods have improved matching efficiency to some extent by incorporating techniques such as the bag-of-words model, but they still face significant challenges in complex outdoor environments. First, overexposure or underexposure caused by strong lighting or shadows severely interferes with the detection and description of local features. The result is fewer extracted feature points, an uneven distribution of those points, and a large number of false features along lighting boundaries, all of which degrade the accuracy of subsequent image similarity calculations. Second, unpredictable changes in lighting introduce a large amount of invalid local information into the image, leading to false matches in loop closure detection methods based on global features. Third, when the scene contains many repetitive textures, local features extracted at different locations are highly similar in appearance, so traditional methods cannot reliably distinguish them; this causes perceptual aliasing and false positives in loop closure detection. These limitations reduce the accuracy and robustness of traditional visual simultaneous localization and mapping systems in complex outdoor environments, making it difficult to meet the needs of mobile robots for long-term autonomous operation.
Summary of the Invention
[0004] To address the aforementioned technical problems, this invention proposes a loop closure detection method based on uniform weighting of block features and distance sorting for use in complex outdoor environments, thereby resolving the issues present in the prior art.
[0005] In a first aspect, to achieve the above objectives, this invention provides a loop closure detection method based on uniform weighting of block features and distance sorting for use in complex outdoor environments, comprising the following steps:
preprocessing the image frames acquired by the vision sensor by grayscale conversion and Gaussian filtering to remove noise, obtaining preprocessed image frames;
dividing each preprocessed image frame evenly into multiple non-overlapping sub-image blocks;
extracting local feature information from each sub-image block and constructing a feature descriptor;
for corresponding sub-image blocks in two frames, calculating the feature distance value between their feature descriptors;
summing the feature distance values of all corresponding sub-image blocks in the two frames;
sorting the feature distance values of all sub-image blocks in ascending order;
based on the sorting result, selecting the batch of sub-image blocks with the smallest feature distance values as key matching blocks, so as to eliminate mismatched sub-image blocks introduced by changes in illumination;
combining the feature distance values of all key matching blocks to calculate a global similarity score; and
when the global similarity score is lower than a preset loop closure detection threshold, determining that the two frames constitute a loop closure.
[0006] Optionally, the process of extracting local feature information from each sub-image block and constructing a feature descriptor includes: extracting feature points in each sub-image block using the oFAST corner detection algorithm, and calculating the binary description vector of the feature points using the rBRIEF descriptor to generate the feature descriptor of the sub-image block; and simultaneously endowing the feature descriptor with scale invariance by constructing an image pyramid.
[0007] Optionally, the process of calculating the feature distance value between the feature descriptors of corresponding sub-image blocks in two frames includes:
$$D(X, Y) = 1 - \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}} \quad (1)$$
where $X$ and $Y$ represent the feature descriptors extracted from corresponding sub-image blocks in the two frames, $n$ is the dimension of the feature vectors, $x_i$ is the $i$-th component of the first feature vector $X$, $y_i$ is the $i$-th component of the second feature vector $Y$, and $D(X, Y)$ is the distance between the feature vectors.
[0008] Optionally, the process of calculating the global similarity score by combining the feature distance values of all key matching blocks includes:
$$S = \sum_{i=1}^{K} D\big(F(A_i), F(B_i)\big) \quad (2)$$
where $A_i$ and $B_i$ denote the corresponding sub-image blocks at the same coordinate position $i$ in the two frames, $K$ is the total number of sub-image blocks into which each image is divided, $D$ is the feature distance value between two feature descriptors, $F(\cdot)$ denotes the feature operator applied to the $i$-th sub-image block, and $S$ is the sum of the feature similarities of all sub-image blocks in the image pair.
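[0009] Optionally, the process of sorting the feature distance values of all sub-image blocks in ascending order includes:
$$D_{\text{sorted}} = \operatorname{sort}(\mathcal{D}), \quad \mathcal{D} = \{d_1, d_2, \dots, d_K\}, \quad d_{(1)} \le d_{(2)} \le \dots \le d_{(K)} \quad (3)$$
where $d_i$ is the feature distance value of the $i$-th pair of sub-image blocks, $K$ is the total number of sub-image blocks, and $\mathcal{D}$ is the original set consisting of the feature distance values of all sub-image blocks.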
[0010] Optionally, the process of selecting the batch of sub-image blocks with the smallest feature distance value as key matching blocks based on the sorting results includes: determining the top N sub-image blocks with the smallest feature distance value, i.e. the highest local structural similarity, as the key matching blocks based on the sorting results.
[0011] Optionally, the process of selecting the batch of sub-image blocks with the smallest feature distance value as key matching blocks based on the sorting results further includes: setting a distance threshold and retaining sub-image blocks with feature distance values lower than the distance threshold as key matching blocks.
[0012] Optionally, the process of geometrically verifying the two frames that are determined to constitute a loop closure includes: using the random sample consensus (RANSAC) algorithm to verify the loop closure candidates in order to eliminate geometrically inconsistent mismatches.
[0013] In a second aspect, the present invention also provides a computer terminal device, comprising: one or more processors; and a memory, coupled to the processors, for storing one or more programs; wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the steps of the loop closure detection method based on uniform weighting of block features and distance sorting described in the first aspect.
[0014] In a third aspect, the present invention also provides a computer-readable storage medium having a computer program stored thereon, wherein, when the computer program is executed by a processor, it implements the steps of the loop closure detection method based on uniform weighting of block features and distance sorting for use in complex outdoor environments described in the first aspect.
[0015] Compared with the prior art, the present invention has the following advantages and technical effects: This invention provides a loop closure detection method based on uniform weighting of block features and distance sorting for complex outdoor environments. By dividing an image evenly into multiple non-overlapping sub-image blocks, extracting features from each block, and applying a uniform weighting strategy, the method preserves local image structure information and overcomes the tendency of traditional global feature extraction to lose spatial information. This improves the accuracy of image similarity measurement, provides a reliable matching basis for loop closure detection, and enhances the system's feature recognition capability in complex outdoor environments. The invention calculates the feature distance of sub-image blocks based on cosine similarity and combines ascending sorting of the feature distances with a threshold filtering mechanism to quantify the structural similarity of sub-image blocks. This effectively eliminates false matches caused by sudden changes in illumination, improves the accuracy and robustness of loop closure detection, reduces the probability of false positives, and helps maintain the system's positioning accuracy and stability.
Brief Description of the Drawings
[0016] The accompanying drawings, which form part of this invention, are used to provide a further understanding of the invention. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not constitute an undue limitation of the invention. In the drawings:
Figure 1 is a flowchart of a loop closure detection method based on uniform weighting of block features and distance sorting in complex outdoor environments, according to an embodiment of the present invention;
Figure 2 is a schematic diagram of loop closure detection for visual camera pose estimation according to an embodiment of the present invention;
Figure 3 is a schematic diagram of the results before applying the loop closure detection method based on uniform weighting of block features and distance sorting to a visual camera, according to an embodiment of the present invention;
Figure 4 is a schematic diagram of the results after applying the loop closure detection method based on uniform weighting of block features and distance sorting to a visual camera, according to an embodiment of the present invention.
Detailed Description
[0017] It should be noted that, unless otherwise specified, the embodiments and features described in the present invention can be combined with each other. The present invention will now be described in detail with reference to the accompanying drawings and embodiments.
[0018] It should be noted that the steps shown in the flowchart in the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowchart, in some cases the steps shown or described may be executed in a different order from that shown here.
[0019] Embodiment 1
As shown in Figure 1, this embodiment provides a loop closure detection method based on uniform weighting of block features and distance sorting for use in complex outdoor environments, including:
preprocessing the image frames acquired by the vision sensor by grayscale conversion and Gaussian filtering to remove noise, obtaining preprocessed image frames;
dividing each preprocessed image frame evenly into multiple non-overlapping sub-image blocks;
extracting local feature information from each sub-image block and constructing a feature descriptor;
for corresponding sub-image blocks in two frames, calculating the feature distance value between their feature descriptors;
summing the feature distance values of all corresponding sub-image blocks in the two frames;
sorting the feature distance values of all sub-image blocks in ascending order;
based on the sorting result, selecting the batch of sub-image blocks with the smallest feature distance values as key matching blocks, so as to eliminate mismatched sub-image blocks introduced by changes in illumination;
combining the feature distance values of all key matching blocks to calculate a global similarity score; and
when the global similarity score is lower than a preset loop closure detection threshold, determining that the two frames constitute a loop closure.
[0020] Furthermore, the process of extracting local feature information from each sub-image block and constructing a feature descriptor includes: extracting feature points in each sub-image block using the oFAST corner detection algorithm, and calculating the binary description vector of the feature points using the rBRIEF descriptor to generate the feature descriptor of the sub-image block; at the same time, the feature descriptor is given scale invariance by constructing an image pyramid.
[0021] Specifically, the implementation process of this embodiment includes: Step 1: Acquire image data in the outdoor environment in real time. Each frame is converted to grayscale and then denoised using Gaussian filtering to suppress noise interference caused by changes in illumination, yielding a pair of preprocessed images.
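The preprocessing in Step 1 can be illustrated with a minimal, dependency-free sketch. The BT.601 luminance weights, the 3×3 kernel, the border replication, and the function names are assumptions made for illustration; the source does not specify any of them.

```python
# Illustrative sketch only: grayscale conversion followed by 3x3 Gaussian
# smoothing on an image stored as nested lists. Weights, kernel size and
# border handling are assumptions, not taken from the source.

GAUSS_3X3 = [[1, 2, 1],
             [2, 4, 2],
             [1, 2, 1]]  # integer kernel, entries sum to 16


def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to luminance (ITU-R BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def gaussian_blur_3x3(gray):
    """Smooth a 2-D grayscale image; border pixels are replicated."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to image bounds
                    xx = min(max(x + dx, 0), w - 1)
                    acc += GAUSS_3X3[dy + 1][dx + 1] * gray[yy][xx]
            out[y][x] = acc / 16.0
    return out


def preprocess(rgb_image):
    """Step 1: grayscale conversion, then Gaussian denoising."""
    return gaussian_blur_3x3(to_grayscale(rgb_image))
```

A production system would more likely rely on an optimized image library; the nested-list form is used here only to keep the example self-contained.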
[0022] Step 2: Divide each preprocessed image frame evenly into Q×Q non-overlapping sub-image blocks. Set the size of each sub-image block to M×N pixels to ensure that each sub-image block contains sufficient local structural information for subsequent feature extraction and similarity calculation.
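The block division in Step 2 might look like the following sketch. The function name `split_into_blocks` is hypothetical, and dropping trailing rows or columns that do not fill a whole block is an assumption; the source only states that the frame is divided evenly into Q×Q blocks of M×N pixels.

```python
def split_into_blocks(image, q):
    """Split a 2-D image (list of rows) into q*q non-overlapping blocks.

    Blocks are returned in row-major order, so index i refers to the same
    coordinate position in every frame of the same size. Rows or columns
    left over after integer division are dropped (an assumption).
    """
    h, w = len(image), len(image[0])
    bh, bw = h // q, w // q  # block height (M) and width (N) in pixels
    if bh == 0 or bw == 0:
        raise ValueError("image is too small for a q x q grid")
    blocks = []
    for by in range(q):
        for bx in range(q):
            block = [row[bx * bw:(bx + 1) * bw]
                     for row in image[by * bh:(by + 1) * bh]]
            blocks.append(block)
    return blocks


# Usage: a 4x4 image split into a 2x2 grid of 2x2-pixel blocks.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
blocks = split_into_blocks(image, 2)
```

Keeping a fixed row-major order is what allows blocks from two frames to be paired by index in the later distance calculation.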
[0023] Step 3: Extract feature information from the corresponding sub-image blocks of each pair of preprocessed images. In each sub-image block, a feature descriptor for that block is constructed by calculating the local gradient histogram and is represented as a feature vector.
[0024] In Step 3, the feature information specifically refers to ORB features. In each sub-image block, feature points are extracted using the oFAST corner detection algorithm, and the binary description vector of the feature points is calculated using the rBRIEF descriptor to generate the feature descriptor for that sub-image block. At the same time, an image pyramid is constructed to give the ORB features scale invariance, so that they adapt to the image scale changes caused by camera movement in outdoor environments.
[0025] Furthermore, for corresponding sub-image blocks in two frames, calculating the feature distance value between their feature descriptors includes: Step 4: Let the feature vectors extracted from a pair of corresponding sub-image blocks be $X = (x_1, x_2, \dots, x_n)$ and $Y = (y_1, y_2, \dots, y_n)$. The distance between the two feature vectors is calculated using cosine similarity, specifically as follows:
$$D(X, Y) = 1 - \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}} \quad (1)$$
where $X$ and $Y$ represent the feature descriptors extracted from corresponding sub-image blocks in the two frames, $n$ is the dimension of the feature vectors, $x_i$ is the $i$-th component of the first feature vector $X$, $y_i$ is the $i$-th component of the second feature vector $Y$, and $D(X, Y)$ is the distance between the feature vectors. Its value lies in the range [0, 1]; the closer it is to 0, the more similar the two feature vectors are.
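As a rough illustration of Step 4, the sketch below computes a cosine-based distance as one minus the cosine similarity. It assumes descriptors are given as sequences of non-negative numbers (for example, binary descriptors unpacked to 0/1 values), which keeps the result in [0, 1]; the zero-vector handling is likewise an assumption, since the source does not discuss blocks without features.

```python
import math


def cosine_distance(x, y):
    """Return 1 - cos(x, y); 0 means the two descriptors point the same way."""
    if len(x) != len(y):
        raise ValueError("descriptors must have the same dimension")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        # Assumption: a block with an all-zero descriptor is treated as
        # maximally distant instead of dividing by zero.
        return 1.0
    return 1.0 - dot / (norm_x * norm_y)


# Usage: identical directions give 0, orthogonal descriptors give 1.
same = cosine_distance([1, 0, 1, 1], [1, 0, 1, 1])
orthogonal = cosine_distance([1, 0], [0, 1])
```

Because the measure depends only on direction, two descriptors that differ by a positive scale factor are treated as identical.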
[0026] Furthermore, the process of calculating the global similarity score by combining the feature distance values of all key matching blocks includes: summing the feature distance values of all key matching blocks and using the summation result as the global similarity score.
[0027] Specifically, the implementation process of this embodiment includes: Step 5: Using the feature distance of each sub-image block obtained in Step 4, uniformly weight the feature similarity of each pair of corresponding sub-image blocks in the image pair to obtain the overall similarity score of the image pair, expressed as:
$$S = \sum_{i=1}^{K} D\big(F(A_i), F(B_i)\big) \quad (2)$$
where $A_i$ and $B_i$ denote the corresponding sub-image blocks at the same coordinate position $i$ in the two frames, $K$ is the total number of sub-image blocks into which each image is divided, $D$ is the feature distance value between two feature descriptors, $F(\cdot)$ denotes the feature operator applied to the $i$-th sub-image block, and $S$ is the sum of the feature similarities of all sub-image blocks in the image pair.
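The uniform weighting in Step 5 amounts to giving every block pair a weight of 1 and summing the per-block distances. The sketch below assumes the descriptors have already been extracted and are paired by block index; the helper names and the L1 stand-in distance are hypothetical and used only to keep the example self-contained.

```python
def block_distances(descriptors_a, descriptors_b, distance):
    """Distance between descriptors of blocks at the same grid position."""
    if len(descriptors_a) != len(descriptors_b):
        raise ValueError("both frames must have the same number of blocks")
    return [distance(x, y) for x, y in zip(descriptors_a, descriptors_b)]


def uniform_weighted_score(distances):
    """Uniform weighting: every block contributes with weight 1."""
    return sum(distances)


def l1_distance(x, y):
    """Stand-in distance for the demo; any block distance can be passed in."""
    return sum(abs(p - q) for p, q in zip(x, y))


# Usage: two frames with two blocks each; only the second block differs.
dists = block_distances([[1, 2], [3, 4]], [[1, 2], [3, 6]], l1_distance)
score = uniform_weighted_score(dists)
```

A smaller score indicates a more similar image pair, since the score accumulates distances.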
[0028] Furthermore, sorting the feature distance values of all sub-image blocks in ascending order includes: Step 6: Sort the feature vector distance values of all sub-image blocks calculated for each pair of images in ascending order, as shown below:
$$D_{\text{sorted}} = \operatorname{sort}(\mathcal{D}), \quad \mathcal{D} = \{d_1, d_2, \dots, d_K\}, \quad d_{(1)} \le d_{(2)} \le \dots \le d_{(K)} \quad (3)$$
where $d_i$ is the feature distance value of the $i$-th pair of sub-image blocks, $K$ is the total number of sub-image blocks, and $\mathcal{D}$ is the original set consisting of the feature distance values of all sub-image blocks.
[0029] Furthermore, the process of selecting the batch of sub-image blocks with the smallest feature distance value as key matching blocks based on the sorting results includes: determining the top N sub-image blocks with the smallest feature distance value, i.e. the highest local structural similarity, as the key matching blocks based on the sorting results.
[0030] Furthermore, the process of selecting the batch of sub-image blocks with the smallest feature distance value as key matching blocks based on the sorting results also includes: setting a distance threshold and retaining sub-image blocks with feature distance values lower than the distance threshold as key matching blocks.
[0031] Specifically, the implementation process of this embodiment includes: Step 7: Based on the sorting results from Step 6, select the top N sub-image patches with the smallest feature vector distance (i.e., the highest similarity) as key matching patches. Sub-image patches with small feature vector distances indicate high local structural similarity, and vice versa. By setting a distance threshold T, only sub-image patches with distance values lower than T are retained for the final matching judgment, eliminating mismatched patches caused by changes in illumination.
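Steps 6 and 7 can be sketched together: sort the per-block distances in ascending order, keep the top N, and then discard any of those at or above the distance threshold T. The function name and the example values are illustrative assumptions.

```python
def select_key_blocks(distances, top_n, distance_threshold):
    """Return (block_index, distance) pairs for the key matching blocks.

    Blocks are ranked by ascending distance (Step 6); the top_n closest
    are kept, and of those only blocks with a distance strictly below
    distance_threshold are retained (Step 7).
    """
    ranked = sorted(enumerate(distances), key=lambda item: item[1])
    return [(i, d) for i, d in ranked[:top_n] if d < distance_threshold]


# Usage: five block distances, keep at most 3 blocks with distance < 0.3.
distances = [0.9, 0.1, 0.4, 0.05, 0.7]
key_blocks = select_key_blocks(distances, top_n=3, distance_threshold=0.3)
```

Blocks with large distances, such as those disturbed by illumination changes, never reach the final score because they fall outside the retained set.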
[0032] As shown in Figures 2-4, Step 8 integrates the feature distances of all key matching blocks to calculate the global similarity score between the current frame and the historical frames. When the sum of the feature vector distances of all participating sub-image blocks reaches its minimum value, and this minimum value is lower than the preset loop closure detection threshold, it is determined that the current frame and the historical frame constitute a loop closure. The loop closure detection result is then output, and the back-end optimization module is triggered to perform pose graph optimization to eliminate accumulated errors.
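The decision in Step 8 can be sketched as picking the historical frame with the minimum score and accepting it only if that score is below the loop closure threshold. The dictionary layout, frame identifiers, and threshold value are assumptions made for illustration.

```python
def detect_loop(scores_by_frame, loop_threshold):
    """Return the id of the best-matching historical frame, or None.

    scores_by_frame maps a historical frame id to its global similarity
    score, i.e. the sum of key-block distances against the current frame.
    """
    if not scores_by_frame:
        return None
    best_id = min(scores_by_frame, key=scores_by_frame.get)
    if scores_by_frame[best_id] < loop_threshold:
        return best_id
    return None


# Usage: frame 42 has the smallest score and is below the threshold.
match = detect_loop({10: 1.2, 42: 0.3, 57: 0.8}, loop_threshold=0.5)
```

Because the score is a sum, a frame pair with very few retained key blocks also yields a small score; an implementation might additionally require a minimum number of key blocks, although the source does not describe such a check.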
[0033] Furthermore, the process of geometrically verifying the two frames that are determined to constitute a loop closure includes: using the random sample consensus (RANSAC) algorithm to verify the loop closure candidates in order to eliminate geometrically inconsistent mismatches.
[0034] Specifically, the implementation process of this embodiment includes: Step 9: Perform geometric verification on the detected loop closures. The RANSAC algorithm is used to eliminate possible false matches, the verified loop closure information is added to the loop closure set, and the map topology is updated. The specific implementation process is as follows. First, for the feature point pairs matched between the initially detected candidate loop closure frame and the current frame, RANSAC is combined with the epipolar geometric constraint for geometric verification: the fundamental matrix is estimated by random sampling, and the number of inliers that satisfy the matrix model is counted, eliminating false matches caused by similar appearance. Then, for candidate loop closure frames that pass the epipolar geometric verification, a Perspective-n-Point (PnP) problem is constructed from the 3D spatial points of the historical image frame and the 2D projection points of the current frame. RANSAC is applied again to solve for the pose of the current camera relative to the historical frame and to count the inliers consistent with this pose. The loop closure geometric verification is considered successful only when the number of inliers after PnP verification exceeds a preset threshold. Finally, the successfully verified loop closure information is added to the loop closure set, and these constraints are used to perform pose graph optimization on the global map, thereby updating the map topology and eliminating accumulated drift error.
[0035] In this embodiment, the method is validated on an image dataset containing outdoor scenes. A comparison of the results in Figure 3 and Figure 4 shows that, with the traditional method (Figure 3), interference from changes in illumination causes deviations in loop closure detection, resulting in a large cumulative error; with the method of this invention (Figure 4), the accuracy of loop closure detection is significantly improved, and the optimized pose graph has smaller trajectory errors and stronger global consistency. The results show that the block feature extraction and uniform weighting strategy proposed in this invention, combined with a distance-based proportional threshold filtering mechanism, can effectively eliminate mismatched sub-blocks caused by complex outdoor environments and improve the accuracy of loop closure detection, thereby reducing the cumulative error of the visual simultaneous localization and mapping system and improving localization accuracy and global trajectory consistency.
[0036] Embodiment 2
In this embodiment, a computer terminal device is provided, including: one or more processors; and a memory, coupled to the processors, for storing one or more programs; wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the steps of the above-described loop closure detection method based on uniform weighting of block features and distance sorting in complex outdoor environments.
[0037] In this embodiment, a computer-readable storage medium is also provided, on which a computer program is stored. When the computer program is executed by a processor, it implements the steps of the above-described loop closure detection method based on block feature uniform weighting and distance sorting in complex outdoor environments.
[0038] This invention provides a loop closure detection method based on uniform weighting of block features and distance sorting for complex outdoor environments. By dividing an image evenly into multiple non-overlapping sub-image blocks, extracting features from each block, and applying a uniform weighting strategy, the method preserves local image structure information and overcomes the tendency of traditional global feature extraction to lose spatial information. This improves the accuracy of image similarity measurement, provides a reliable matching basis for loop closure detection, and enhances the system's feature recognition capability in complex outdoor environments. The invention calculates the feature distance of sub-image blocks based on cosine similarity and combines ascending sorting of the feature distances with a threshold filtering mechanism to quantify the structural similarity of sub-image blocks. This effectively eliminates false matches caused by sudden changes in illumination, improves the accuracy and robustness of loop closure detection, reduces the probability of false positives, and helps maintain the system's positioning accuracy and stability.
[0039] The above are merely preferred embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in the present invention should be included within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.
Claims
1. A loop closure detection method based on uniform weighting of block features and distance sorting in complex outdoor environments, characterized in that it includes the following steps:
preprocessing the image frames acquired by the vision sensor by grayscale conversion and Gaussian filtering to remove noise, obtaining preprocessed image frames;
dividing each preprocessed image frame evenly into multiple non-overlapping sub-image blocks;
extracting local feature information from each sub-image block and constructing a feature descriptor;
for corresponding sub-image blocks in two frames, calculating the feature distance value between their feature descriptors;
summing the feature distance values of all corresponding sub-image blocks in the two frames;
sorting the feature distance values of all sub-image blocks in ascending order;
based on the sorting result, selecting the batch of sub-image blocks with the smallest feature distance values as key matching blocks, so as to eliminate mismatched sub-image blocks introduced by changes in illumination;
combining the feature distance values of all key matching blocks to calculate a global similarity score; and
when the global similarity score is lower than a preset loop closure detection threshold, determining that the two frames constitute a loop closure.
2. The method according to claim 1, characterized in that the process of extracting local feature information from each sub-image block and constructing a feature descriptor includes: extracting feature points in each sub-image block using the oFAST corner detection algorithm, and calculating the binary description vector of the feature points using the rBRIEF descriptor to generate the feature descriptor of the sub-image block; at the same time, the feature descriptor is given scale invariance by constructing an image pyramid.
3. The method according to claim 1, characterized in that the process of calculating the feature distance value between the feature descriptors of corresponding sub-image blocks in two frames includes:
$$D(X, Y) = 1 - \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}} \quad (1)$$
where $X$ and $Y$ represent the feature descriptors extracted from corresponding sub-image blocks in the two frames, $n$ is the dimension of the feature vectors, $x_i$ is the $i$-th component of the first feature vector $X$, $y_i$ is the $i$-th component of the second feature vector $Y$, and $D(X, Y)$ is the distance between the feature vectors.
4. The method according to claim 1, characterized in that the process of calculating the global similarity score by combining the feature distance values of all key matching blocks includes:
$$S = \sum_{i=1}^{K} D\big(F(A_i), F(B_i)\big) \quad (2)$$
where $A_i$ and $B_i$ denote the corresponding sub-image blocks at the same coordinate position $i$ in the two frames, $K$ denotes the total number of sub-image blocks into which each image is divided, $D$ denotes the feature distance value between two feature descriptors, $F(\cdot)$ denotes the feature operator applied to the $i$-th sub-image block, and $S$ denotes the cumulative value of the feature similarity of all sub-image blocks in the image pair.
5. The method according to claim 1, characterized in that the process of sorting the feature distance values of all sub-image blocks in ascending order includes:
$$D_{\text{sorted}} = \operatorname{sort}(\mathcal{D}), \quad \mathcal{D} = \{d_1, d_2, \dots, d_K\}, \quad d_{(1)} \le d_{(2)} \le \dots \le d_{(K)} \quad (3)$$
where $d_i$ denotes the feature distance value of the $i$-th pair of sub-image blocks, $K$ denotes the total number of sub-image blocks, and $\mathcal{D}$ denotes the original set of all sub-image block feature distance values.
6. The method according to claim 1, characterized in that the process of selecting the batch of sub-image blocks with the smallest feature distance values as key matching blocks based on the sorting result includes: determining the top N sub-image blocks with the smallest feature distance values, i.e. the highest local structural similarity, as the key matching blocks based on the sorting result.
7. The method according to claim 1, characterized in that the process of selecting the batch of sub-image blocks with the smallest feature distance values as key matching blocks based on the sorting result also includes: setting a distance threshold and retaining the sub-image blocks with feature distance values lower than the distance threshold as the key matching blocks.
8. The method according to claim 1, characterized in that the process of geometrically verifying the two frames that are determined to constitute a loop closure includes: using the random sample consensus (RANSAC) algorithm to verify the loop closure candidates in order to eliminate geometrically inconsistent mismatches.
9. A computer terminal device, characterized in that it includes: one or more processors; and a memory, coupled to the processors, for storing one or more programs; wherein, when the one or more programs are executed by the one or more processors, the one or more processors perform the steps of the method as described in any one of claims 1-8.
10. A computer-readable storage medium having stored thereon a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the method as described in any one of claims 1-8.