Visual positioning method and device, computer readable medium and electronic equipment
A visual positioning method and device, applied in the field of visual positioning, that addresses problems such as no acquisition, positioning failure, and weak texture, thereby improving accuracy and avoiding positioning failure.
Examples
Embodiment 1
[0047] The first frame of the video data is taken as the initial key frame, and the acquisition pose of the first device, in the coordinate system of the first device, at the moment it captures the initial key frame (the first frame) is determined.
[0048] The pose change of the first device relative to the acquisition pose of the first frame is detected in real time. When the pose change equals the preset change, the video frame captured at that moment is taken as the first target key frame and added to the key frame set.
[0049] The first target key frame then replaces the first frame as the initial key frame. The acquisition pose of the first device, in the coordinate system of the first device, when capturing this new initial key frame (the first target key frame) is obtained, and the above process of real-time detection, capture, and addition to the key frame set continues until the number of video key frames in the key frame set meets the requirement.
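The selection loop in [0047] to [0049] can be sketched as follows. This is only an illustration under stated assumptions: each pose is reduced to a 3-D position, the pose change is taken as the Euclidean distance between positions, and "equal to the preset change" is implemented as "reaches or exceeds" it, since an exact match is unlikely with sampled frames. The function name `select_key_frames` and its parameters are hypothetical and do not come from the patent.

```python
import numpy as np


def select_key_frames(poses, preset_change, max_key_frames):
    """Return the indices of frames selected as target key frames.

    Assumption: `poses` holds one 3-D position per frame, standing in
    for the full acquisition pose described in the text.
    """
    if len(poses) == 0:
        return []
    key_frames = []
    # The first frame serves as the initial key frame (the reference pose).
    reference = np.asarray(poses[0], dtype=float)
    for index in range(1, len(poses)):
        # Stop once the key frame set holds the required number of frames.
        if len(key_frames) >= max_key_frames:
            break
        current = np.asarray(poses[index], dtype=float)
        change = np.linalg.norm(current - reference)
        if change >= preset_change:
            # This frame becomes a target key frame and the new reference.
            key_frames.append(index)
            reference = current
    return key_frames


poses = [(0.0, 0, 0), (0.4, 0, 0), (1.0, 0, 0), (1.5, 0, 0), (2.1, 0, 0), (3.2, 0, 0)]
selected = select_key_frames(poses, preset_change=1.0, max_key_frames=2)
print(selected)
```

In this sketch the first frame acts only as the reference and is not itself added to the returned set; whether it belongs in the key frame set is not settled by the text above.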
[0050] It should be noted that when detecting...
Embodiment 2
[0066] To establish the first global feature, DoG features of the video key frames can first be extracted at multiple scales, and the corresponding SIFT descriptors computed. The number of scales can be calculated from the resolution of the video key frame using the following formula (3):
[0067]
[0068] where $N_{octave}$ is the number of scales, $R_{img,x}$ is the pixel width in the horizontal direction of the image matrix corresponding to the video key frame, and $R_{img,y}$ is the pixel width in the vertical direction of that image matrix.
[0069] After the SIFT descriptors are obtained, the SIFT bag-of-words model features of the video key frame can be calculated using the pre-trained dictionary tree. Specifically, first pass each SIFT descriptor in the video key frame through the dictionary tree to find the closest word, then calculate the norm distance between the descriptor and the word, and perform n...
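The word lookup in [0069] can be illustrated with a small sketch. It makes several assumptions: the pre-trained dictionary tree is replaced by a flat vocabulary searched by brute force (a tree would give the same nearest word more cheaply on large vocabularies), the norm distance is taken to be the L2 norm, and the toy vectors are 2-D rather than real SIFT descriptors. The function name `nearest_words` is hypothetical. The step that follows the distance calculation is cut off in the text and is not sketched here.

```python
import numpy as np


def nearest_words(descriptors, vocabulary):
    """Find the closest vocabulary word for each descriptor.

    Assumption: a flat array of words searched exhaustively stands in
    for the pre-trained dictionary tree.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    vocabulary = np.asarray(vocabulary, dtype=float)
    # Pairwise L2 distances, shape (number of descriptors, number of words).
    diffs = descriptors[:, None, :] - vocabulary[None, :, :]
    distances = np.linalg.norm(diffs, axis=2)
    # Index of the closest word for each descriptor.
    word_ids = distances.argmin(axis=1)
    # Norm distance between each descriptor and its closest word.
    word_distances = distances[np.arange(len(descriptors)), word_ids]
    return word_ids, word_distances


vocabulary = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
descriptors = [[1.0, 0.0], [9.0, 0.0], [0.0, 7.0]]
word_ids, word_distances = nearest_words(descriptors, vocabulary)
print(word_ids, word_distances)
```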
Embodiment 3
[0090] Each first key frame is compared with each candidate key frame through the following process, yielding a second comparison result between the first key frame and the candidate key frame. The comparison of a first key frame 1 with a corresponding candidate key frame A is described in detail below:
[0091] The first step is to determine the first matching pair.
[0092] The third preset condition may be set as the smallest distance c being less than 0.8 times the second smallest distance d.
[0093] For a first local feature 1 of the first key frame 1, the second distances between the first local feature 1 and each of the second local features in the candidate key frame A are calculated to obtain a second distance set. When the second distances in the second distance set satisfy the above-mentioned third preset condition, the first matching pair is generated according to the second local feature A corresponding to the smalle...
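The third preset condition in [0092] and [0093] is a ratio test on the two smallest distances, and can be sketched as below. This is a minimal illustration rather than the patented procedure: the second distance is assumed to be Euclidean, the toy features are 2-D, and the function name `ratio_test_matches` is hypothetical.

```python
import numpy as np


def ratio_test_matches(first_features, second_features, ratio=0.8):
    """Return (first_index, second_index) pairs that pass the ratio test.

    A pair is kept when the smallest distance c is less than `ratio`
    times the second smallest distance d.
    """
    first = np.asarray(first_features, dtype=float)
    second = np.asarray(second_features, dtype=float)
    matches = []
    # At least two candidates are needed to have a second smallest distance.
    if len(second) < 2:
        return matches
    for i, feature in enumerate(first):
        # Distance from this first local feature to every second local feature.
        dists = np.linalg.norm(second - feature, axis=1)
        order = np.argsort(dists)
        c, d = dists[order[0]], dists[order[1]]
        if c < ratio * d:
            matches.append((i, int(order[0])))
    return matches


first = [[0.0, 0.0], [5.0, 5.0]]
second = [[0.1, 0.0], [4.0, 0.0], [5.0, 4.0], [5.0, 6.0]]
matches = ratio_test_matches(first, second)
print(matches)
```

In the toy data the first feature has one clearly closest candidate and is matched, while the second feature is equidistant from two candidates and is rejected as ambiguous.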


