A relocation method, apparatus, device and readable storage medium
By obtaining global and region descriptors of an image and combining them with pose encoding information for retrieval and matching, the method resolves the localization confusion that relocalization methods suffer in similar-looking scenes and achieves more accurate localization.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- MIGU COMIC CO LTD
- Filing Date
- 2023-10-18
- Publication Date
- 2026-06-30
AI Technical Summary
Existing relocation methods are prone to localization confusion when dealing with similar scenarios.
By obtaining the global descriptor and region descriptor of the image to be queried, and combining them with the keyframe database for retrieval and matching, the pose encoding information in the global descriptor is used to classify positive and negative examples, and local feature matching is performed through the region descriptor to avoid localization confusion.
During the relocation process, it can effectively distinguish similar scenes that are geographically distant, reduce positioning errors, and improve positioning accuracy.
Figure CN117370597B_ABST (abstract drawing)
Abstract
Description
Technical Field
[0001] This application relates to the field of image processing technology, and in particular to a relocation method, apparatus, device, and readable storage medium.
Background Technology
[0002] Simultaneous Localization and Mapping (SLAM) relocalization has been widely used in many fields. However, existing relocalization methods may make errors when handling similar scenes, potentially leading to localization confusion.
Summary of the Invention
[0003] This application provides a relocation method, apparatus, device, and readable storage medium to avoid the problem of location confusion.
[0004] In a first aspect, embodiments of this application provide a relocation method, including:
[0005] Obtain the first global descriptor of the image to be queried, wherein the first global descriptor is obtained based on the image to be queried and the pose of the image to be queried;
[0006] Obtain the first region descriptor of the image to be queried;
[0007] The target keyframe is obtained by searching and matching the first global descriptor and the first region descriptor with the keyframe database.
[0008] Pose calculation is performed based on the target keyframe.
[0009] In a second aspect, embodiments of this application also provide a repositioning device, comprising:
[0010] The first acquisition module is used to acquire a first global descriptor of the image to be queried, wherein the first global descriptor is obtained based on the image to be queried and the pose of the image to be queried;
[0011] The second acquisition module is used to acquire the first region descriptor of the image to be queried;
[0012] The first matching module is used to perform retrieval and matching with the keyframe database based on the first global descriptor and the first region descriptor to obtain the target keyframe.
[0013] The first processing module is used to perform pose calculation based on the target keyframe.
[0014] In a third aspect, embodiments of this application also provide an electronic device, including: a memory, a processor, and a program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps in the relocation method described above.
[0015] In a fourth aspect, embodiments of this application also provide a readable storage medium storing a program that, when executed by a processor, implements the steps in the relocation method described above.
[0016] In this embodiment, the first global descriptor and the first region descriptor of the query image are used to perform retrieval and matching with a keyframe database to obtain a target keyframe, and pose calculation is performed based on the target keyframe. Since the first global descriptor is obtained based on the query image and its pose, and the region descriptor can perform local feature matching on the image, the relocalization process can consider not only the similarity between images but also the distance between their poses, thereby avoiding the problem of localization confusion.
Attached Figure Description
[0017] Figure 1 is a flowchart of the relocation method provided in the embodiments of this application;
[0018] Figure 2 is a first structural schematic diagram of the global descriptor model provided in the embodiments of this application;
[0019] Figure 3 is a second structural schematic diagram of the global descriptor model provided in the embodiments of this application;
[0020] Figure 4 is a structural diagram of the repositioning device provided in the embodiments of this application.
Detailed Implementation
[0021] In the embodiments of this application, the term "and / or" describes the relationship between associated objects, indicating that three relationships can exist. For example, A and / or B can represent three cases: A alone, A and B simultaneously, and B alone. The character " / " generally indicates that the preceding and following associated objects have an "or" relationship.
[0022] In the embodiments of this application, the term "multiple" refers to two or more, and other quantifiers are similar.
[0023] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of the embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.
[0024] Referring to Figure 1, Figure 1 is a flowchart of the relocation method provided in the embodiments of this application. As shown in Figure 1, the method includes the following steps:
[0025] Step 101: Obtain the first global descriptor of the image to be queried, wherein the first global descriptor is obtained based on the image to be queried and the pose of the image to be queried.
[0026] In this embodiment, the pose of the image to be queried is estimated based on the previous frame and sensor data, and the image to be queried and its pose are input into a global descriptor model to obtain the first global descriptor. If the image to be queried is the first frame, this processing is not required. The sensor data includes, but is not limited to, data from accelerometers, gyroscopes, and other sensors.
[0027] In practical applications, data acquisition tools based on ARKit or ARCore can be used to acquire images and their pose information. These images and pose information serve as input to the model, and after training, a global descriptor model is obtained. Through this global descriptor model, a global descriptor vector, which aggregates image descriptors and pose encoding information, is obtained and used for subsequent global matching.
[0028] In this embodiment of the application, the structure of the global descriptor model can be as shown in Figure 2. The image is processed by a convolutional neural network and NetVLAD (a vector of locally aggregated descriptors based on a convolutional neural network) to obtain a vector V_d (K×D) containing image information. The pose information of the image is normalized and processed by a multilayer perceptron (MLP) to obtain the pose encoding vector V_T (1×D). The vector V_d (K×D) and the pose encoding vector V_T (1×D) are fused to obtain the global descriptor vector V_G ((K+1)×D).
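The shapes in the paragraph above suggest that the fusion stacks the pose encoding as one extra row under the image descriptor. The following is a minimal sketch under that assumption; the function name `fuse_global_descriptor` and the toy values are hypothetical, and the CNN, NetVLAD and MLP stages are taken as already computed.

```python
from typing import List

Matrix = List[List[float]]


def fuse_global_descriptor(v_d: Matrix, v_t: List[float]) -> Matrix:
    """Stack the 1 x D pose encoding under the K x D image descriptor.

    Assumption: "fused" means row-wise concatenation, which is the simplest
    operation that turns K x D and 1 x D into (K + 1) x D.
    """
    if not v_d:
        raise ValueError("v_d must contain at least one row")
    dim = len(v_d[0])
    if any(len(row) != dim for row in v_d) or len(v_t) != dim:
        raise ValueError("all rows must share the same dimension D")
    return [list(row) for row in v_d] + [list(v_t)]


# Toy example with K = 2 cluster rows and D = 3 dimensions.
v_d = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
v_t = [0.7, 0.8, 0.9]
v_g = fuse_global_descriptor(v_d, v_t)
print(len(v_g), len(v_g[0]))  # (K + 1) rows, D columns
```

Keeping the pose encoding as its own row means the image part and the pose part of V_G can still be addressed separately, which the positive and negative example classification relies on.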
[0029] Pose information is introduced because using only the vector V_d, which contains image information alone, easily leads to errors when determining the positive and negative examples of an image. For instance, in scenes with symmetrical and similar structures, buildings that look similar might be mistakenly identified as positive examples, even though they could be quite far apart. This can lead to incorrect identification of positive and negative examples in similar scenes, affecting model training and causing subsequent relocalization errors. Therefore, in this embodiment, incorporating pose information makes it possible to distinguish between similar scenes that are geographically distant, thus aiding the classification of positive and negative examples of images.
[0030] A preliminary classification of positive and negative examples based on pose encoding vectors avoids the above situation. In the embodiments of this application, the classification method for positive and negative examples is as follows:
[0031]
[0032]
[0033] Here, P_q is the set of positive examples of the image q to be queried; N_q is the set of negative examples of the image q to be queried; d_E is the Euclidean distance function, applied to the pose encoding vector of the image to be queried and the pose encoding vectors of image i and image j in the keyframe database; λ is a user-defined scaling parameter, which can take values in [0.1, 0.4] depending on the data distribution.
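The classification expressions themselves are not reproduced in this text, so the following is only an illustrative sketch of a pose-based split. It assumes, hypothetically, that a keyframe is a positive example when its pose-encoding distance to the query is at most λ times the largest such distance in the database, and a negative example otherwise. The function names and the thresholding rule are assumptions, not the patent's formula.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance d_E between two equally sized vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def split_examples(
    query_pose: Sequence[float],
    db_poses: Dict[str, Sequence[float]],
    lam: float = 0.2,
) -> Tuple[List[str], List[str]]:
    """Split keyframes into positives (P_q) and negatives (N_q) by pose distance.

    Assumed rule: threshold = lam * (largest distance to the query).
    """
    if not db_poses:
        return [], []
    dists = {key: euclidean(query_pose, pose) for key, pose in db_poses.items()}
    threshold = lam * max(dists.values())
    positives = sorted(k for k, d in dists.items() if d <= threshold)
    negatives = sorted(k for k, d in dists.items() if d > threshold)
    return positives, negatives


# Toy pose encodings: "c" looks similar in this story but is far away.
poses = {"a": [0.1, 0.0], "b": [1.0, 0.0], "c": [10.0, 0.0]}
p_q, n_q = split_examples([0.0, 0.0], poses, lam=0.2)
print(p_q, n_q)
```

Because the split uses pose encodings only, two visually similar but distant keyframes cannot both end up in the positive set.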
[0034] The loss function for the global descriptor model is defined based on positive and negative examples as follows:
[0035]
[0036] Here, l is the hinge function l(x) = max(x, 0). The loss involves the pose encoding vector of the image q to be queried and the pose encoding vector of image j in the negative example set N_q, together with the global descriptor of the image q to be queried, the global descriptor of image i in the positive example set P_q, and the global descriptor of image j in the negative example set N_q. In this embodiment, the margin m is defined as an expression of the pose information: the farther away the negative example is, the larger the margin.
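Since the loss expression is not reproduced here, the sketch below shows one plausible shape for it: a triplet-style ranking loss using the hinge l(x) = max(x, 0), with a margin that grows with the pose distance to each negative example. The margin form `base_margin + alpha * distance`, the use of the closest positive, and all names are assumptions made for illustration; descriptors are flattened to plain vectors.

```python
import math
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hinge(x: float) -> float:
    """l(x) = max(x, 0), as defined in the text."""
    return max(x, 0.0)


def pose_aware_triplet_loss(
    g_q: Sequence[float],
    g_pos: List[Sequence[float]],
    g_neg: List[Sequence[float]],
    t_q: Sequence[float],
    t_neg: List[Sequence[float]],
    base_margin: float = 0.1,
    alpha: float = 0.5,
) -> float:
    """Sum of hinge terms over negatives, each with a pose-dependent margin."""
    if not g_pos:
        raise ValueError("at least one positive example is required")
    best_pos = min(sq_dist(g_q, g_i) for g_i in g_pos)
    loss = 0.0
    for g_j, t_j in zip(g_neg, t_neg):
        # Assumed margin: farther negatives (in pose) get a larger margin.
        margin = base_margin + alpha * math.sqrt(sq_dist(t_q, t_j))
        loss += hinge(best_pos + margin - sq_dist(g_q, g_j))
    return loss


g_q, t_q = [0.0, 0.0], [0.0, 0.0]
easy = pose_aware_triplet_loss(g_q, [[1.0, 0.0]], [[3.0, 0.0]], t_q, [[2.0, 0.0]])
hard = pose_aware_triplet_loss(g_q, [[1.0, 0.0]], [[1.0, 1.0]], t_q, [[4.0, 0.0]])
print(easy, hard)
```

In the second call the negative looks almost as close as the positive in descriptor space but is far away in pose, so the enlarged margin produces a non-zero penalty.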
[0037] By incorporating pose encoding information into the positive / negative example judgment and loss function, the model can focus not only on the similarity between images during training but also on the distance between image poses. When images with high similarity appear in both positive and negative samples, the model can place greater emphasis on pose distance for judgment and training, thereby further avoiding the problem of localization confusion.
[0038] Based on the previous frame of the image to be queried and sensor data, the pose of the image to be queried (rotation R_c, translation t_c) is estimated; that is, the other sensors are used to make a preliminary estimate of the rotation R_c and translation t_c of the image to be queried. Let the current frame to be queried (relocalized) be F_c, let the previous frame be F_p, and let the pose of F_p (rotation R_p, translation t_p) be T_p. Then:
[0039]
[0040]
[0041] Here, w_k is the angular velocity at time k and f_k is the acceleration at time k, both of which can be obtained from the readings of the inertial measurement unit; b_k and n_k are the zero bias and noise of the inertial measurement unit, which can be obtained through pre-calibration; Δt is the time interval between the two frames, obtained by subtracting their timestamps; v_k is the velocity at time k; and g is the acceleration due to gravity.
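The propagation equations are not reproduced in this text. As a rough illustration, the sketch below uses the textbook one-step inertial integration with the symbols defined above: the rotation is advanced by the bias-corrected angular velocity, and the translation by the current velocity plus the bias-corrected, gravity-compensated acceleration. This is an assumed standard form, not necessarily the patent's exact expression; the noise n_k is treated as zero-mean and omitted, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Vec3 = List[float]
Mat3 = List[List[float]]


def mat_vec(m: Mat3, v: Vec3) -> Vec3:
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def mat_mul(a: Mat3, b: Mat3) -> Mat3:
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def rotation_from_vector(phi: Vec3) -> Mat3:
    """Rodrigues' formula: rotation matrix for a rotation vector in radians."""
    theta = math.sqrt(sum(x * x for x in phi))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = (p / theta for p in phi)
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [
        [c + x * x * k, x * y * k - z * s, x * z * k + y * s],
        [y * x * k + z * s, c + y * y * k, y * z * k - x * s],
        [z * x * k - y * s, z * y * k + x * s, c + z * z * k],
    ]


def propagate_pose(r_p: Mat3, t_p: Vec3, v_k: Vec3, w_k: Vec3, f_k: Vec3,
                   gyro_bias: Vec3, accel_bias: Vec3, gravity: Vec3,
                   dt: float) -> Tuple[Mat3, Vec3, Vec3]:
    """One integration step from the previous frame pose to the current one."""
    w = [w_k[i] - gyro_bias[i] for i in range(3)]
    f = [f_k[i] - accel_bias[i] for i in range(3)]
    # Acceleration in the world frame, with gravity added back.
    acc = [a + g for a, g in zip(mat_vec(r_p, f), gravity)]
    r_c = mat_mul(r_p, rotation_from_vector([x * dt for x in w]))
    t_c = [t_p[i] + v_k[i] * dt + 0.5 * acc[i] * dt * dt for i in range(3)]
    v_next = [v_k[i] + acc[i] * dt for i in range(3)]
    return r_c, t_c, v_next


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zero = [0.0, 0.0, 0.0]
# Constant-velocity motion: the accelerometer only senses the reaction to gravity.
r_c, t_c, v_next = propagate_pose(identity, zero, [1.0, 0.0, 0.0], zero,
                                  [0.0, 0.0, 9.81], zero, zero,
                                  [0.0, 0.0, -9.81], 0.1)
print(t_c)
```

The result is only a preliminary estimate: it seeds the pose input of the global descriptor model and is refined later by the pose calculation on the target keyframe.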
[0042] Step 102: Obtain the first region descriptor of the image to be queried.
[0043] To obtain the first region descriptor of the image to be queried, the image can be divided into multiple regions, and the region descriptor model shown in Figure 3 can be used to extract a first region descriptor for each region. This region descriptor is used to match local features of the image, which improves the retrieval results for similar scenes in adjacent locations and reduces subsequent localization errors.
[0044] The region descriptor model builds on the model of Figure 2: multiple small local regions are densely sampled with a sliding window on multiple convolutional layers for local feature matching. As shown in Figure 3, the input image is processed by a convolutional neural network and divided into multiple regions by the sliding window. These regions are then processed by VLAD (Vector of Locally Aggregated Descriptors) and PCA (Principal Component Analysis) to obtain a region descriptor for each region.
[0045] Where dx and dy are the width and height of the sliding window sampling, which can generally be 1, 3, 5, 8, etc., depending on the required receptive field size; s is the sliding step size, which can generally be 1, but if performance is prioritized over accuracy, s = dx can be used. For each extracted region, a corresponding region descriptor can be obtained.
[0046] For the image to be queried, sliding-window sampling yields n_p regions:
[0047] where H and W represent the height and width of the image to be queried, respectively.
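The expression for n_p is not reproduced in this text. A common way to count sliding windows of size dy × dx with step s over an H × W grid is ⌊(H − dy)/s + 1⌋ × ⌊(W − dx)/s + 1⌋; the sketch below assumes that form and also enumerates the window origins so the count can be checked. Function names are hypothetical.

```python
from typing import List, Tuple


def count_regions(height: int, width: int, dy: int, dx: int, stride: int) -> int:
    """Number of dy x dx windows that fit in a height x width grid (assumed formula)."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    if dy > height or dx > width:
        return 0
    rows = (height - dy) // stride + 1
    cols = (width - dx) // stride + 1
    return rows * cols


def region_origins(height: int, width: int, dy: int, dx: int,
                   stride: int) -> List[Tuple[int, int]]:
    """Top-left (row, column) corner of every sliding window."""
    return [
        (top, left)
        for top in range(0, height - dy + 1, stride)
        for left in range(0, width - dx + 1, stride)
    ]


# Dense sampling (s = 1) versus the faster non-overlapping choice (s = dx).
dense = count_regions(30, 40, 5, 5, 1)
sparse = count_regions(30, 40, 5, 5, 5)
print(dense, sparse)
```

The two calls illustrate the trade-off mentioned above: setting s = dx cuts the number of regions, and therefore descriptors to match, by roughly a factor of dx × dy.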
[0048] Once the region descriptors are obtained, local feature matching can be performed on the image; this gives better matching results, particularly for images that differ only in some local features.
[0049] Step 103: Based on the first global descriptor and the first region descriptor, perform a search and matching with the keyframe database to obtain the target keyframe.
[0050] In this step, the target keyframe can be determined in the following ways.
[0051] Calculate the Euclidean distance between the first global descriptor and the second global descriptor of each frame in the keyframe database. Select candidate keyframes from the keyframe database, wherein the Euclidean distance between the second global descriptor of each candidate keyframe and the first global descriptor is less than a preset value. Obtain the second region descriptors of the candidate keyframes, and select the target keyframe from the candidate keyframes based on the first region descriptor and the second region descriptors.
[0052] For each image frame in the keyframe database, a corresponding second global descriptor can also be obtained using the aforementioned global descriptor model. The global descriptors of all frames in the keyframe database form the global descriptor database {V_G^db}.
[0053] The global descriptor of the image to be queried is matched against each global descriptor in the global descriptor database. Specifically, the Euclidean distance between the global descriptor of the image to be queried and each global descriptor in the database is calculated, and the candidate keyframes are obtained based on these distances: a keyframe is a candidate if the Euclidean distance between its second global descriptor and the first global descriptor is less than a preset value, which can be set as needed. The candidate keyframes can be regarded as the keyframes most similar to the image to be queried. In this embodiment, the similarity between two image frames is represented by the Euclidean distance: the larger the distance, the lower the similarity, and vice versa.
[0054] The set of candidate keyframes, DB_candidate, can be represented as:
[0055] where the expression involves the global descriptor of the image to be queried and any global descriptor in the global descriptor database, and d_E is the Euclidean distance function.
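The candidate selection described above can be sketched as a distance threshold over the descriptor database. This is a minimal illustration: descriptors are flattened to plain vectors, and the names `select_candidates` and `max_dist` (the preset value) are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance d_E between two equally sized vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_candidates(
    query_desc: Sequence[float],
    db_descs: Dict[str, Sequence[float]],
    max_dist: float,
) -> List[Tuple[str, float]]:
    """Return (keyframe id, distance) pairs with distance below max_dist.

    Results are sorted by distance, so the most similar keyframe comes first.
    """
    scored = [(kf, euclidean(query_desc, desc)) for kf, desc in db_descs.items()]
    kept = [(kf, dist) for kf, dist in scored if dist < max_dist]
    return sorted(kept, key=lambda item: item[1])


database = {
    "kf1": [0.1, 0.0],
    "kf2": [3.0, 4.0],
    "kf3": [0.0, 0.5],
}
candidates = select_candidates([0.0, 0.0], database, max_dist=1.0)
print([kf for kf, _ in candidates])
```

Only the surviving candidates go on to the more expensive region-level matching, so the preset value controls the trade-off between recall and matching cost.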
[0056] To obtain the second region descriptors of a candidate keyframe, the candidate keyframe is divided into multiple regions, and the region descriptor model shown in Figure 3 can be used to extract a second region descriptor for each region.
[0057] For a candidate keyframe, the number of regions it is divided into is also n_p.
[0058] When selecting the target keyframe from the candidate keyframes based on the first region descriptor and the second region descriptor, the following process may be included:
[0059] S1. Cluster the second region descriptor to obtain K cluster centers, where K is an integer greater than or equal to 1.
[0060] For example, given the set of region descriptors of a candidate keyframe, applying a clustering algorithm (such as K-means) to this set yields K cluster centers, denoted as Kcluster.
[0061] S2. Based on the K cluster centers, calculate the descriptor weights of the second region descriptor.
[0062] For a region descriptor f in the set of region descriptors for candidate keyframes, its corresponding weight w(f) is calculated as follows:
[0063]
[0064] That is, for the region descriptor f, the n cluster centers nearest to f in cosine distance are found, and the sum of those distances is taken as the weight of f. In this way, regions that are more distinctive and appear less frequently in the image are given higher weights. Conversely, regions with high similarity and frequent occurrences, such as the background, sky, and grass, receive very low weights, which reduces their impact on subsequent accurate matching.
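The weighting rule in words (sum of the cosine distances from f to its n nearest cluster centers) can be sketched as follows. The cluster centers are assumed to come from a prior clustering step such as K-means; the function names and the convention that cosine distance is 1 minus cosine similarity are assumptions.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; defined as 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def descriptor_weight(f: Sequence[float], centers: List[Sequence[float]],
                      n_nearest: int) -> float:
    """w(f): sum of cosine distances from f to its n_nearest cluster centers."""
    dists = sorted(cosine_distance(f, c) for c in centers)
    return sum(dists[:n_nearest])


# Two cluster centers standing in for frequent content (e.g. sky, grass).
centers = [[1.0, 0.0], [0.0, 1.0]]
common = descriptor_weight([1.0, 0.0], centers, n_nearest=1)
distinctive = descriptor_weight([1.0, 1.0], centers, n_nearest=1)
print(common, distinctive)
```

A descriptor sitting on a cluster center gets weight zero, while one that lies away from every center gets a larger weight, matching the intent that distinctive regions count more.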
[0065] S3. Based on the first region descriptor and the second region descriptor, determine a first region from the image to be queried and a second region from the candidate keyframes, wherein the first region and the second region are nearest neighbors to each other.
[0066] The concept of "nearest neighbor" can be understood as two regions being the closest in Euclidean distance to each other. For example, for two regions, region 1 and region 2, region 2 is the region with the smallest Euclidean distance to region 1 among its related regions, and region 1 is the region with the smallest Euclidean distance to region 2 among its related regions. These related regions can be understood as multiple regions within the image containing region 1, or multiple regions within the image containing region 2, etc.
[0067] In this embodiment of the application, for the third region in the candidate keyframe, if the Euclidean distance between the region descriptor of the third region and the region descriptor of region j of the image to be queried is the smallest, and the Euclidean distance between the region descriptor of the third region and the region descriptor of region i in the candidate keyframe is the smallest, then region j is taken as the first region, and region i is taken as the second region; wherein i and j are integers greater than or equal to 0.
[0068] Suppose the set of region descriptors of the image to be queried and the set of region descriptors of the candidate keyframe are given. The two sets are matched as follows:
[0069]
[0070] Here, iff denotes "if and only if": a match holds if and only if the j-th region in the image to be queried and the i-th region in the candidate keyframe are each other's nearest neighbors. f1 represents the region descriptor of the third region; the remaining two symbols represent the region descriptor of the j-th region in the image to be queried and the region descriptor of the i-th region in the candidate keyframe.
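The mutual nearest neighbor test can be sketched directly from its definition: region j of the query and region i of the candidate match only if each is the other's closest region in Euclidean distance. The function names below are hypothetical, and ties are broken by lowest index.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_index(f: Sequence[float], others: List[Sequence[float]]) -> int:
    """Index of the descriptor in `others` closest to f."""
    return min(range(len(others)), key=lambda k: euclidean(f, others[k]))


def mutual_nearest_neighbors(
    query_regions: List[Sequence[float]],
    cand_regions: List[Sequence[float]],
) -> List[Tuple[int, int]]:
    """Return (j, i) pairs where query region j and candidate region i
    are each other's nearest neighbors."""
    if not query_regions or not cand_regions:
        return []
    pairs = []
    for j, f_q in enumerate(query_regions):
        i = nearest_index(f_q, cand_regions)
        if nearest_index(cand_regions[i], query_regions) == j:
            pairs.append((j, i))
    return pairs


query = [[0.0, 0.0], [10.0, 10.0], [0.4, 0.0]]
candidate = [[0.1, 0.0], [9.0, 9.0]]
matches = mutual_nearest_neighbors(query, candidate)
print(matches)
```

Query region 2 is dropped in this example: its nearest candidate region is region 0, but that region is closer to query region 0, so the relation is not mutual.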
[0071] S4. Calculate the score of the candidate keyframe based on the first region, the second region, and the descriptor weight.
[0072] Define a first parameter h_d and a second parameter v_d for each pair of matched regions:
[0073] These parameters are computed from the x and y coordinates of the center of region i and the x and y coordinates of the center of region j; the mean values of h_d and v_d are also used.
[0074] The score can be represented as:
[0075]
[0076] Here, the weight term is the weight of the region descriptor of the candidate keyframe; h_{d,j} is the first parameter of region j; h_{d,i} is the first parameter of region i; v_{d,j} is the second parameter of region j; v_{d,i} is the second parameter of region i; and n_p denotes the number of regions.
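The score expression is not reproduced in this text. The sketch below shows one plausible weighted spatial-consistency score built from the quantities named above: h_d and v_d are taken, as an assumption, to be the horizontal and vertical offsets between matched region centers, and each match contributes more when its offset agrees with the mean offset. The exact form, the normalization by the number of matches, and all names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def spatial_score(matches: List[Tuple[Point, Point]],
                  weights: Sequence[float]) -> float:
    """Weighted spatial-consistency score for one candidate keyframe.

    matches: ((x, y) center in the query, (x, y) center in the candidate).
    weights: one descriptor weight per match.
    """
    n = len(matches)
    if n == 0:
        return 0.0
    if len(weights) != n:
        raise ValueError("one weight per match is required")
    h_d = [cand[0] - qry[0] for qry, cand in matches]  # horizontal offsets
    v_d = [cand[1] - qry[1] for qry, cand in matches]  # vertical offsets
    h_mean, v_mean = sum(h_d) / n, sum(v_d) / n
    h_max = max(abs(h) for h in h_d)
    v_max = max(abs(v) for v in v_d)
    total = 0.0
    for w, h, v in zip(weights, h_d, v_d):
        # Offsets close to the mean keep the full (max - deviation) term.
        total += w * ((h_max - abs(h - h_mean)) ** 2
                      + (v_max - abs(v - v_mean)) ** 2)
    return total / n


# All matches shifted by the same (2, 1): geometrically consistent.
consistent = spatial_score(
    [((0.0, 0.0), (2.0, 1.0)), ((5.0, 5.0), (7.0, 6.0))], [1.0, 1.0])
# Matches shifted in opposite directions: geometrically inconsistent.
scattered = spatial_score(
    [((0.0, 0.0), (2.0, 1.0)), ((5.0, 5.0), (3.0, 4.0))], [1.0, 1.0])
print(consistent, scattered)
```

Under this assumed form, a candidate whose matched regions all move the same way relative to the query scores higher than one whose matches are scattered, and low-weight regions such as sky contribute little either way.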
[0077] S5. The candidate keyframe with the highest score is taken as the target keyframe; that is, the target keyframe is the final matching result obtained after the differentiated region feature matching.
[0078] Step 104: Perform pose calculation based on the target keyframe.
[0079] In this embodiment, the first global descriptor and first region descriptor of the obtained query image are used to perform a retrieval and matching with a keyframe database to obtain a target keyframe, and pose calculation is performed based on the target keyframe. Since the first global descriptor is obtained based on the query image and its pose, and the region descriptor can match local features of the image, the relocalization process can consider not only the similarity between images but also the distance between their poses, thus avoiding localization confusion. The solution in this embodiment can be used for keyframe retrieval and localization in scenarios with high similarity, reducing the possibility of localization errors.
[0080] Referring to Figure 4, Figure 4 is a structural diagram of the repositioning device provided in an embodiment of this application. As shown in Figure 4, the repositioning device includes:
[0081] The first acquisition module 401 is used to acquire a first global descriptor of the image to be queried, wherein the first global descriptor is obtained based on the image to be queried and the pose of the image to be queried; the second acquisition module 402 is used to acquire a first region descriptor of the image to be queried; the first matching module 403 is used to perform retrieval and matching with the keyframe database based on the first global descriptor and the first region descriptor to obtain a target keyframe; and the first processing module 404 is used to perform pose calculation based on the target keyframe.
[0082] Optionally, the first acquisition module can also be used for:
[0083] Based on the previous frame of the image to be queried and sensor data, the pose of the image to be queried is estimated;
[0084] The image to be queried and its pose are input into the global descriptor model to obtain the first global descriptor.
[0085] Optionally, the first matching module includes:
[0086] The first calculation submodule is used to calculate the Euclidean distance between the first global descriptor and the second global descriptor of each frame image in the keyframe database.
[0087] The first selection submodule is used to select candidate keyframes from the keyframe database, wherein the Euclidean distance between the second global descriptor and the first global descriptor of the candidate keyframe is less than a preset value.
[0088] The first acquisition submodule is used to acquire the second region descriptor of the candidate keyframe;
[0089] The second selection submodule is used to select the target keyframe from the candidate keyframes based on the first region descriptor and the second region descriptor.
[0090] Optionally, the first matching module is further configured to:
[0091] The image to be queried is divided into multiple regions; the first region descriptor of each region is extracted using a region descriptor model.
[0092] Optionally, the first acquisition submodule is further configured to:
[0093] The candidate keyframes are divided into multiple regions; the second region descriptor is extracted for each region using a region descriptor model.
[0094] Optionally, the second selection submodule includes:
[0095] The first unit is used to cluster the second region descriptor to obtain K cluster centers, where K is an integer greater than or equal to 1;
[0096] The second unit is used to calculate the descriptor weight of the second region descriptor based on the K cluster centers;
[0097] The third unit is used to determine a first region from the image to be queried and a second region from the candidate keyframes based on the first region descriptor and the second region descriptor, wherein the first region and the second region are nearest neighbors to each other;
[0098] The fourth unit is used to calculate the score of the candidate keyframe based on the first region, the second region, and the descriptor weight;
[0099] The fifth unit is used to select the candidate keyframe with the highest score as the target keyframe.
[0100] Optionally, the third unit is further configured to:
[0101] For the third region in the candidate keyframe, if the Euclidean distance between the region descriptor of the third region and the region descriptor of region j in the image to be queried is the smallest, and the Euclidean distance between the region descriptor of the third region and the region descriptor of region i in the candidate keyframe is the smallest, then region j is taken as the first region, and region i is taken as the second region.
[0102] Where i and j are integers greater than or equal to 0.
[0103] Optionally, the fourth unit is further configured to:
[0104] The score of the candidate keyframe is calculated based on the coordinates of the center point of the first region, the coordinates of the center point of the second region, the descriptor weight, and the number of regions.
[0105] The apparatus provided in this application embodiment can execute the above method embodiment, and its implementation principle and technical effect are similar, so it will not be described again here.
[0106] It should be noted that the division of units in the embodiments of this application is illustrative and only represents one logical functional division. In actual implementation, other division methods may be used. Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated units described above can be implemented in hardware or as software functional units.
[0107] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a processor-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0108] This application provides an electronic device, including: a memory, a processor, and a program stored in the memory and executable on the processor; the processor is configured to read the program from the memory to implement the steps in the relocation method described above.
[0109] This application also provides a readable storage medium storing a program. When executed by a processor, this program implements the various processes of the relocation method embodiments described above and achieves the same technical effect. To avoid repetition, it will not be described again here. The readable storage medium can be any available medium or data storage device that the processor can access, including but not limited to magnetic storage (e.g., floppy disks, hard disks, magnetic tapes, magneto-optical disks (MO), etc.), optical storage (e.g., CDs, DVDs, BDs, HVDs, etc.), and semiconductor storage (e.g., ROMs, EPROMs, EEPROMs, non-volatile memory (NAND flash), solid-state drives (SSDs)).
[0110] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element.
[0111] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, disk, optical disk) and includes several instructions to cause a terminal (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of this application.
[0112] The embodiments of this application have been described above with reference to the accompanying drawings. However, this application is not limited to the specific embodiments described above. The specific embodiments described above are merely illustrative and not restrictive. Those skilled in the art can make many other forms under the guidance of this application without departing from the spirit and scope of the claims, and all of these forms are within the protection scope of this application.
Claims
1. A relocation method, characterized by comprising:
obtaining the first global descriptor of the image to be queried, wherein the first global descriptor is obtained based on the image to be queried and the pose of the image to be queried;
obtaining the first region descriptor of the image to be queried;
performing retrieval and matching with the keyframe database based on the first global descriptor and the first region descriptor to obtain the target keyframe;
performing pose calculation based on the target keyframe;
wherein the step of performing retrieval and matching with the keyframe database based on the first global descriptor and the first region descriptor to obtain the target keyframe includes:
calculating the Euclidean distance between the first global descriptor and the second global descriptor of each frame in the keyframe database;
selecting candidate keyframes from the keyframe database, wherein the Euclidean distance between the second global descriptor of the candidate keyframe and the first global descriptor is less than a preset value;
obtaining the second region descriptor of the candidate keyframe;
selecting the target keyframe from the candidate keyframes based on the first region descriptor and the second region descriptor;
wherein the step of selecting the target keyframe from the candidate keyframes based on the first region descriptor and the second region descriptor includes:
clustering the second region descriptor to obtain K cluster centers, where K is an integer greater than or equal to 1;
calculating the descriptor weights of the second region descriptor based on the K cluster centers;
determining a first region from the image to be queried and a second region from the candidate keyframes based on the first region descriptor and the second region descriptor, wherein the first region and the second region are nearest neighbors to each other;
calculating the score of the candidate keyframe based on the first region, the second region, and the descriptor weight;
selecting the candidate keyframe with the highest score as the target keyframe;
wherein the step of determining a first region from the image to be queried and a second region from the candidate keyframes based on the first region descriptor and the second region descriptor includes:
for the third region in the candidate keyframe, if the Euclidean distance between the region descriptor of the third region and the region descriptor of region j in the image to be queried is the smallest, and the Euclidean distance between the region descriptor of the third region and the region descriptor of region i in the candidate keyframe is the smallest, then region j is taken as the first region, and region i is taken as the second region;
wherein i and j are integers greater than or equal to 0.
2. The method according to claim 1, characterized in that the step of obtaining the first global descriptor of the image to be queried comprises: estimating the pose of the image to be queried based on the previous frame of the image to be queried and sensor data; and inputting the image to be queried and its pose into a global descriptor model to obtain the first global descriptor.
3. The method according to claim 1, characterized in that the step of obtaining the first region descriptor of the image to be queried comprises: dividing the image to be queried into multiple regions; and extracting, using a region descriptor model, the first region descriptor of each region; or the step of obtaining the second region descriptor of the candidate keyframe comprises: dividing the candidate keyframe into multiple regions; and extracting, using the region descriptor model, the second region descriptor of each region.
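As an illustration of the region-division step in claim 3, the sketch below splits an image into a regular grid and records each region's center point. The claim does not specify how the regions are formed, so the grid layout, the `Region` type, and the `divide_into_regions` function are assumptions. The region descriptor model itself is not sketched.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    """Axis-aligned pixel rectangle; x1 and y1 are exclusive bounds."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def center(self) -> Tuple[float, float]:
        """Center point of the region in pixel coordinates."""
        return ((self.x0 + self.x1) / 2.0, (self.y0 + self.y1) / 2.0)


def divide_into_regions(width: int, height: int,
                        rows: int, cols: int) -> List[Region]:
    """Split a width x height image into rows x cols grid regions.

    Integer boundaries are spread so the regions tile the image exactly,
    even when the size is not divisible by the grid dimensions.
    """
    if not (1 <= rows <= height and 1 <= cols <= width):
        raise ValueError("grid dimensions must fit inside the image")
    regions = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            regions.append(Region(x0, y0, x1, y1))
    return regions
```

Each region's pixels would then be passed to the region descriptor model. The center points are kept because claim 4 uses the center coordinates of matched regions when scoring a candidate keyframe.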
4. The method according to claim 1, characterized in that the step of calculating the score of the candidate keyframe based on the first region, the second region, and the descriptor weight comprises: calculating the score of the candidate keyframe based on the coordinates of the center point of the first region, the coordinates of the center point of the second region, the descriptor weight, and the number of regions.
5. A relocation apparatus, characterized by comprising:
a first acquisition module, used to acquire a first global descriptor of an image to be queried, wherein the first global descriptor is obtained based on the image to be queried and the pose of the image to be queried;
a second acquisition module, used to acquire a first region descriptor of the image to be queried;
a first matching module, used to perform retrieval and matching against a keyframe database based on the first global descriptor and the first region descriptor to obtain a target keyframe; and
a first processing module, used to perform pose calculation based on the target keyframe;
wherein the first matching module comprises:
a first calculation submodule, used to calculate the Euclidean distance between the first global descriptor and a second global descriptor of each frame in the keyframe database;
a first selection submodule, used to select candidate keyframes from the keyframe database, wherein the Euclidean distance between the second global descriptor of each candidate keyframe and the first global descriptor is less than a preset value;
a first acquisition submodule, used to acquire a second region descriptor of the candidate keyframe; and
a second selection submodule, used to select the target keyframe from the candidate keyframes based on the first region descriptor and the second region descriptor;
wherein the second selection submodule comprises:
a first unit, used to cluster the second region descriptor to obtain K cluster centers, where K is an integer greater than or equal to 1;
a second unit, used to calculate the descriptor weight of the second region descriptor based on the K cluster centers;
a third unit, used to determine a first region from the image to be queried and a second region from the candidate keyframe based on the first region descriptor and the second region descriptor, wherein the first region and the second region are nearest neighbors of each other;
a fourth unit, used to calculate a score of the candidate keyframe based on the first region, the second region, and the descriptor weight; and
a fifth unit, used to select the candidate keyframe with the highest score as the target keyframe;
wherein the third unit is further used to: for a third region in the candidate keyframe, if the Euclidean distance between the region descriptor of the third region and the region descriptor of region j in the image to be queried is the smallest, and the Euclidean distance between the region descriptor of the third region and the region descriptor of region i in the candidate keyframe is the smallest, take region j as the first region and region i as the second region, where i and j are integers greater than or equal to 0.
6. An electronic device, comprising: a memory, a processor, and a program stored in the memory and executable on the processor; characterized in that the processor is configured to read the program from the memory to implement the steps of the relocation method according to any one of claims 1 to 4.
7. A readable storage medium for storing a program, characterized in that, when the program is executed by a processor, it implements the steps of the relocation method according to any one of claims 1 to 4.