Pose estimation method, device and medium for a structural line segment-based VIO system

By utilizing the rotation matrix transformation between the global coordinate system and the building coordinate system, as well as updating the state vector and covariance matrix of the structural line segments in the VIO system, pose estimation is optimized, solving the problem of accuracy degradation in V-SLAM and VIO systems in complex scenes, and improving the positioning accuracy and robustness of the system.

CN116704022B · Active · Publication Date: 2026-06-12 · INST OF AUTOMATION CHINESE ACAD OF SCI

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
INST OF AUTOMATION CHINESE ACAD OF SCI
Filing Date
2023-04-18
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

Existing V-SLAM and VIO systems suffer from decreased positioning accuracy or reduced robustness in complex scenarios, especially in man-made environments.

Method used

By transforming the initial state vector of the VIO system to the building coordinate system using rotation matrix transformation based on the global coordinate system and the building coordinate system, the state vector and covariance matrix are updated using the global orientation of the structural line segments, and the pose estimation is optimized by combining the method of minimizing reprojection error.

Benefits of technology

It improves the pose estimation accuracy of the VIO system in complex scenarios, enhances the system's robustness and adaptability to complex environments, and performs better in artificial environments.

Generated by Eureka AI based on patent content.

Abstract

The application provides a pose estimation method, device and medium of a structure line segment-based VIO system, the method comprising: transforming an initial state vector of the VIO system to a building coordinate system based on a rotation matrix between a global coordinate system and the building coordinate system; determining a structure line segment and a global direction of the structure line segment based on image data; updating a predicted state vector and a covariance matrix based on the structure line segment and the global direction; and determining a pose of a camera based on the updated state vector and covariance matrix. The pose estimation method of the structure line segment-based VIO system provided by the application uses the global direction of the structure line segment to perform pose estimation, and since the structure line segment on the image is a direct observation of the global direction of the environment, the structure line segment can be used to eliminate the cumulative error existing in the rotation estimation of the VIO system, thereby improving the pose estimation accuracy.

Description

Technical Field

[0001] This invention relates to the field of computer vision technology, and in particular to a pose estimation method, apparatus and medium for a VIO system based on structural line segments.

Background Technology

[0002] In recent years, Visual Simultaneous Localization and Mapping (V-SLAM) and Visual-Inertial Odometry (VIO) systems have been widely used in robotics, autonomous driving, virtual reality, and augmented reality. Currently, V-SLAM and VIO systems can achieve high positioning accuracy in scenarios with rich scene textures and simple motion patterns.

[0003] However, in certain complex scenarios (such as man-made environments), these systems may experience a decrease in accuracy or a deterioration in robustness.

Summary of the Invention

[0004] To address the problems existing in the prior art, embodiments of the present invention provide a pose estimation method, apparatus, and medium for a VIO system based on structural line segments.

[0005] This invention provides a pose estimation method for a VIO system based on structural line segments, comprising:

[0006] The initial state vector of the VIO system is transformed to the building coordinate system based on the rotation matrix between the global coordinate system and the building coordinate system; the rotation matrix is determined based on the image data acquired by the VIO system.

[0007] Based on the image data, structural line segments and their global directions are determined; the structural line segments are line segments whose directions are the same as the coordinate axes of the building coordinate system.

[0008] Based on the structural line segments and the global direction, the predicted state vector and covariance matrix are updated; the predicted state variables and covariance matrix are determined based on the initial state vector after coordinate transformation.

[0009] The camera pose is determined based on the updated state vector and covariance matrix.

[0010] In some embodiments, before transforming the initial state vector of the VIO system to the building coordinate system based on the rotation matrix between the global coordinate system and the building coordinate system, the method further includes:

[0011] Among the vanishing points corresponding to the image data, determine the first vanishing point, the second vanishing point, and the third vanishing point, which correspond to the coordinate axis directions of the building coordinate system, respectively;

[0012] Based on the first vanishing point, the second vanishing point, and the third vanishing point, determine the first and second included angles between the building coordinate system and the global coordinate system in the X-axis and Y-axis directions;

[0013] The rotation matrix is determined based on the first included angle and the second included angle.

[0014] In some embodiments, determining the structural line segment and its global direction based on the image data includes:

[0015] Based on the direction vectors of the coordinate axes of the global coordinate system, determine the fourth, fifth, and sixth vanishing points corresponding to the direction vectors in the image coordinate system;

[0016] Feature extraction is performed on the image data to determine the target line segment in the image data;

[0017] Based on the midpoint of the target line segment, the fourth vanishing point, the fifth vanishing point, and the sixth vanishing point, the first line segment, the second line segment, and the third line segment are determined;

[0018] The structural line segment and the global direction are determined based on the third included angle between the target line segment and the first line segment, the second line segment, and the third line segment.

[0019] In some embodiments, updating the predicted state vector and covariance matrix based on the structural line segment and the global direction includes:

[0020] Based on the global direction, determine the first projection plane corresponding to the structural line segment;

[0021] The second projection plane is determined based on the first projection plane and the camera optical center;

[0022] Based on the first Plücker coordinates of the structural line segment in the global coordinate system and the second projection plane, the initial projection coordinates of the structural line segment are determined;

[0023] The initial projection coordinates are optimized based on the method of minimizing reprojection error to determine the target projection coordinates;

[0024] Based on the target projection coordinates, the predicted state vector and covariance matrix are updated.

[0025] In some embodiments, updating the predicted state vector and covariance matrix based on the target projected coordinates includes:

[0026] Based on the target projection coordinates and the global direction, determine the target intersection point between the structural line segment and the first projection plane;

[0027] Based on the target intersection point and the global direction, determine the second Plücker coordinates of the structural line segment in the global coordinate system;

[0028] Based on the second Plücker coordinates, determine the third Plücker coordinates of the structural line segment in the camera coordinate system;

[0029] Based on the third Plücker coordinates, the structural line segment is projected from the camera coordinate system onto the image plane to determine the reprojection error of the structural line segment;

[0030] Based on the reprojection error, the predicted state vector and covariance matrix are updated.

[0031] In some embodiments, updating the predicted state vector and covariance matrix based on the reprojection error includes:

[0032] Determine the Jacobian matrix of the reprojection error with respect to the camera pose and the target projection coordinates;

[0033] Based on the reprojection error and the Jacobian matrix, the predicted state vector and covariance matrix are updated.

[0034] The present invention also provides a pose estimation device for a VIO system based on structural line segments, comprising:

[0035] The transformation module is used to transform the initial state vector of the VIO system to the building coordinate system based on a rotation matrix between the global coordinate system and the building coordinate system; the rotation matrix is determined based on the image data acquired by the VIO system.

[0036] The first determining module is used to determine the structural line segment and the global direction of the structural line segment based on the image data; the structural line segment is a line segment with the same direction as the coordinate axis of the building coordinate system;

[0037] An update module is used to update the predicted state vector and covariance matrix based on the structural line segment and the global direction; the predicted state variables and covariance matrix are determined based on the initial state vector after coordinate transformation.

[0038] The second determination module is used to determine the camera pose based on the updated state vector and covariance matrix.

[0039] The present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor executes the program to implement the pose estimation method of the VIO system based on structural line segments as described above.

[0040] The present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the pose estimation method of the VIO system based on structural line segments as described above.

[0041] The present invention also provides a computer program product, including a computer program that, when executed by a processor, implements the pose estimation method of the VIO system based on structural line segments as described above.

[0042] The pose estimation method, apparatus and medium of the VIO system based on structural line segments provided by the present invention perform pose estimation by utilizing the global direction of structural line segments. Since the structural line segments on the image are a direct observation of the global direction of the environment, the accumulated error in the rotation estimation of the VIO system can be eliminated by using structural line segments, thereby improving the pose estimation accuracy.

Attached Figure Description

[0043] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0044] Figure 1 is one of the flowcharts illustrating the pose estimation method for a VIO system based on structural line segments provided in an embodiment of the present invention;

[0045] Figure 2 is a schematic diagram of the coordinate alignment method provided in an embodiment of the present invention;

[0046] Figure 3 is a schematic diagram of a typical Manhattan world model provided by existing technology;

[0047] Figure 4 is the second flowchart illustrating the pose estimation method for a VIO system based on structural line segments provided in an embodiment of the present invention;

[0048] Figure 5 is a schematic diagram of the pose estimation device of the VIO system based on structural line segments provided in an embodiment of the present invention;

[0049] Figure 6 is a schematic diagram of the structure of the electronic device provided in an embodiment of the present invention.

Detailed Implementation

[0050] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of this invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this invention. All other embodiments obtained by those skilled in the art based on the embodiments of this invention without creative effort are within the scope of protection of this invention.

[0051] The terms "first," "second," etc., used in the specification and claims of this invention are used to distinguish similar objects and not to describe a specific order or sequence. It should be understood that such terms can be used interchangeably where appropriate so that embodiments of the invention can be implemented in orders other than those illustrated or described herein, and the objects distinguished by "first" and "second" are generally of the same class, not limited in number; for example, the first object can be one or more. Furthermore, in the specification and claims, "and / or" indicates at least one of the connected objects, and the character " / " generally indicates that the preceding and following objects are in an "or" relationship.

[0052] Figure 1 is one of the flowcharts illustrating the pose estimation method for a VIO system based on structural line segments provided in an embodiment of the present invention. As shown in Figure 1, the pose estimation method for a VIO system based on structural line segments provided in this embodiment includes:

[0053] Step 101: Based on the rotation matrix between the global coordinate system and the building coordinate system, transform the initial state vector of the VIO system to the building coordinate system; the rotation matrix is determined based on the image data acquired by the VIO system.

[0054] Step 102: Based on the image data, determine the structural line segments and their global direction; the structural line segments are line segments whose direction is the same as the coordinate axes of the building coordinate system.

[0055] Step 103: Update the predicted state vector and covariance matrix based on the structural line segment and the global direction; the predicted state variables and covariance matrix are determined based on the initial state vector after coordinate transformation.

[0056] Step 104: Determine the camera pose based on the updated state vector and covariance matrix.

[0057] It should be noted that the subject executing the pose estimation method for the VIO system based on structural line segments provided by this invention can be an electronic device, a component within an electronic device, an integrated circuit, or a chip. The electronic device can be a mobile electronic device or a non-mobile electronic device. For example, mobile electronic devices can be mobile phones, tablets, laptops, in-vehicle electronic devices, wearable devices, ultra-mobile personal computers (UMPCs), netbooks, or personal digital assistants (PDAs), etc., while non-mobile electronic devices can be servers, network-attached storage (NAS), personal computers (PCs), televisions (TVs), ATMs, or self-service machines, etc. This invention does not impose specific limitations.

[0058] In step 101, the initial state vector of the VIO system is transformed to the building coordinate system based on the rotation matrix between the global coordinate system and the building coordinate system.

[0059] VIO is an algorithm that integrates camera and inertial measurement unit (IMU) data to achieve simultaneous localization and mapping (SLAM).

[0060] The VIO system is initialized by using sensor data to estimate the initial values of the system state variables (carrier velocity, gravity direction, and accelerometer and gyroscope biases), i.e., the initial state vector.

[0061] The global coordinate system G is the coordinate system corresponding to the VIO system. In buildings that conform to the assumptions of the Manhattan world model, the three principal directions of the building provide a natural coordinate system, namely the building coordinate system B.

[0062] The three coordinate axes of the global coordinate system G of the VIO system are represented as X', Y', and Z', and the three coordinate axes of the building coordinate system B are represented as X, Y, and Z.

[0063] After VIO initialization, the Z' axis of the global coordinate system G is aligned with the direction of gravity, but the directions of the X' and Y' axes are arbitrarily distributed. That is, the Z' axis of the global coordinate system G is consistent with the Z-axis of the building coordinate system B, but the two coordinate systems differ by a rotation angle θ around the Z-axis.

[0064] Using 11 consecutive frames of images after initialization, the rotation matrix between the two coordinate systems can be calculated, and the obtained rotation matrix can be used to transform all parameters in the initial state vector of the VIO system from the global coordinate system G to the building coordinate system B.

[0065] In some embodiments, before transforming the initial state vector of the VIO system to the building coordinate system based on the rotation matrix between the global coordinate system and the building coordinate system, the method further includes:

[0066] Among the vanishing points corresponding to the image data, determine the first vanishing point, the second vanishing point, and the third vanishing point, which correspond to the coordinate axis directions of the building coordinate system, respectively;

[0067] Based on the first vanishing point, the second vanishing point, and the third vanishing point, determine the first and second included angles between the building coordinate system and the global coordinate system in the X-axis and Y-axis directions;

[0068] The rotation matrix is determined based on the first included angle and the second included angle.

[0069] For the input image, the first step is to detect the corresponding vanishing points. The vanishing point detection algorithm is also based on the Manhattan world model assumption; therefore, three vanishing points can be calculated for each image, with homogeneous coordinates $\{v_i,\ i = 1, 2, 3\}$.

[0070] At this point, the vanishing points $v_i$ have been obtained, but the correspondence between the three vanishing points on the image and the principal directions of the building is still unknown and must be determined.

[0071] Define $e_x$, $e_y$, and $e_z$ as the unit vectors in the global coordinate system G along the X', Y', and Z' axes, respectively, as follows:

[0072] $e_x = [1, 0, 0]^T,\quad e_y = [0, 1, 0]^T,\quad e_z = [0, 0, 1]^T \qquad (1)$

[0073] Let ${}^{C}_{G}R$ denote the rotation matrix from the current global coordinate system G to the camera coordinate system. Since the Z-axis directions of the global coordinate system G and the building coordinate system B are consistent, the first vanishing point $v_z$, corresponding to the Z-axis direction of the building coordinate system B, can be determined using the following formula:

[0074] $v_z = \arg\min_{v_u,\ u \in \{1, 2, 3\}} \angle\left(K^{-1} v_u,\ {}^{C}_{G}R\, e_z\right) \qquad (2)$

[0075] Where $K$ represents the pre-calibrated camera intrinsic parameter matrix, $v_u$ represents a vanishing point detected in the image, ${}^{C}_{G}R$ represents the rotation matrix from the global coordinate system G to the camera coordinate system, and $e_z$ represents the unit vector along the Z' axis in the global coordinate system G.

[0076] Formula (2) can be used to determine the vanishing point in the Z' axis direction. The method is as follows: First, rotate the Z' axis direction in the global coordinate system G to the current camera coordinate system. Then, transform the three vanishing points extracted from the image from the pixel coordinate system to the camera coordinate system. Finally, calculate the angle between the Z' axis direction in the camera coordinate system and the directions of the three vanishing points. The vanishing point corresponding to the smallest angle is the vanishing point in the Z axis direction of the corresponding architectural coordinate system B.
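The association step described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `angle_between` and `pick_z_vanishing_point` are hypothetical, and taking the absolute value of the cosine (so that a direction and its opposite count as the same line direction) is an assumption made here because a vanishing point fixes a direction only up to sign.

```python
import numpy as np

def angle_between(a, b):
    """Unsigned angle (radians) between two 3D directions, ignoring sign.

    Assumption: a direction and its opposite are treated as equivalent.
    """
    cos = abs(np.dot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, 0.0, 1.0)))

def pick_z_vanishing_point(K, R_cg, vanishing_points):
    """Return the index of the vanishing point closest to the Z' axis.

    K: 3x3 camera intrinsic matrix.
    R_cg: 3x3 rotation from the global frame G to the camera frame.
    vanishing_points: three homogeneous pixel coordinates (3-vectors).
    """
    z_cam = R_cg @ np.array([0.0, 0.0, 1.0])   # Z' axis rotated into the camera frame
    K_inv = np.linalg.inv(K)
    # Back-project each vanishing point from pixel to camera coordinates
    angles = [angle_between(K_inv @ np.asarray(v, dtype=float), z_cam)
              for v in vanishing_points]
    return int(np.argmin(angles))
```

For a camera whose optical axis happens to coincide with Z' (identity rotation), the vanishing point at the principal point is selected, whatever the order in which the three points are supplied.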

[0077] After determining the vanishing point corresponding to the Z-axis direction of the building coordinate system B, the remaining two vanishing points correspond to the X-axis and Y-axis directions of the building coordinate system B, respectively. The second vanishing point $v_x$, corresponding to the X-axis direction of the building coordinate system B, is then determined from these two remaining vanishing points using the following formula:

[0078]

[0079] After determining the vanishing points corresponding to the X and Z axes, the remaining vanishing point is the third vanishing point $v_y$, corresponding to the Y-axis direction of the building coordinate system B.

[0080] After determining the first, second, and third vanishing points, the direction vectors of the three coordinate axes of the building coordinate system B can be obtained. Based on these direction vectors, two angles can be determined in the camera coordinate system: the first included angle $\theta_x$, between the X-axis of the building coordinate system B and the X'-axis of the current global coordinate system G, and the second included angle $\theta_y$, between the Y-axis of the building coordinate system B and the Y'-axis of the global coordinate system G.

[0081] The first included angle $\theta_x$ is calculated as follows:

[0082] $\theta_x = \angle\left(K^{-1} v_x,\ {}^{C}_{G}R\, e_x\right) \qquad (4)$

[0083] Where ${}^{C}_{G}R$ represents the rotation matrix from the global coordinate system G to the camera coordinate system, $K$ represents the pre-calibrated camera intrinsic parameter matrix, $e_x$ represents the unit vector along the X' axis in the global coordinate system G, and $v_x$ represents the second vanishing point.

[0084] The second included angle $\theta_y$ is calculated as follows:

[0085] $\theta_y = \angle\left(K^{-1} v_y,\ {}^{C}_{G}R\, e_y\right) \qquad (5)$

[0086] Where ${}^{C}_{G}R$ represents the rotation matrix from the global coordinate system G to the camera coordinate system, $K$ represents the pre-calibrated camera intrinsic parameter matrix, $e_y$ represents the unit vector along the Y' axis in the global coordinate system G, and $v_y$ represents the third vanishing point.

[0087] In theory, $\theta_x$ should equal $\theta_y$. However, the Z-axis directions of the two coordinate systems are not perfectly aligned, so in practice $\theta_x \neq \theta_y$. Therefore, the average of the first and second included angles is taken as the initial value of the rotation angle, $\theta_{init} = (\theta_x + \theta_y) / 2$.

[0088] To reduce the impact of random errors on coordinate alignment accuracy, the initial value of the rotation angle, $\theta_{init}$, can be calculated for each of 11 consecutive frames of images. The average of all valid results, $\theta_{avg}$, is taken as the final rotation angle.

[0089] Then, based on the final $\theta_{avg}$, the rotation matrix between the VIO system's global coordinate system G and the building coordinate system B is constructed.

[0090] The expression for the rotation matrix is as follows:

[0091]

[0092] This rotation matrix is used to transform all parameters of the current system state vector from the global coordinate system G to the building coordinate system B, completing the coordinate alignment process.
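As a rough illustration of the averaging and alignment steps, the following Python sketch averages the per-frame angles and builds a rotation about the shared Z axis. The function names are hypothetical, and the sign convention (whether G is rotated to B by $+\theta_{avg}$ or $-\theta_{avg}$) is not recoverable from the text, so the caller has to choose it; the sketch only shows the structure of the computation.

```python
import numpy as np

def rotation_about_z(theta):
    """Right-handed rotation matrix about the Z axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def average_rotation_angle(angle_pairs):
    """Average the per-frame initial angles.

    angle_pairs: (theta_x, theta_y) in radians for each valid frame.
    Per frame, theta_init is the mean of the two included angles; the
    final angle is the mean over frames. A plain arithmetic mean is used,
    which assumes the angles do not straddle the +/- pi wrap-around.
    """
    theta_init = [0.5 * (tx + ty) for tx, ty in angle_pairs]
    return float(np.mean(theta_init))

def align_vector(R_bg, vec_g):
    """Rotate a 3-vector (e.g. position or velocity) from G into B."""
    return R_bg @ np.asarray(vec_g, dtype=float)
```

Because the rotation is about Z only, the gravity-aligned component of every state quantity is left unchanged, which matches the statement that the two coordinate systems already share their Z axis.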

[0093] Figure 2 is a schematic diagram of the coordinate alignment method provided in an embodiment of the present invention. As shown in Figure 2, after the coordinate alignment step, the global coordinate system G coincides with the building coordinate system B, thus making it easier to utilize structural line segments. Furthermore, this coordinate alignment method aligns the global coordinate system G of the VIO system with the three principal directions of the Manhattan world model corresponding to the environment, greatly simplifying the parameterization of structural line segments and the calculation of reprojection errors and the Jacobian matrix.

[0094] In step 102, based on the image data, the structural line segments and their global orientation are determined.

[0095] By extracting line segment features from the image data acquired by the VIO system, the target line segment in the image data can be extracted.

[0096] From the extracted target line segments, their direction can be used to determine whether a target line segment is a structural line segment. A structural line segment is a line segment whose direction is the same as the coordinate axis of the building coordinate system B. The direction of the coordinate axis of the building coordinate system is the main direction in the environment, including three types of structural line segments: structural line segments in the X-axis direction, structural line segments in the Y-axis direction, and structural line segments in the Z-axis direction.

[0097] The main difference between structural and non-structural line segments is that structural line segments encode global direction information of the environment, and structural line segments on an image are a direct observation of the global direction of the environment.

[0098] Figure 3 is a schematic diagram of a typical Manhattan world model provided by existing technology. As shown in Figure 3, the scene contains three main directions (X-axis, Y-axis, and Z-axis), each with multiple line segments. Line segments in the environment whose direction aligns with one of the main directions are defined as structural line segments, while those whose direction does not align with any of the three main directions are defined as non-structural line segments. Based on their direction, structural line segments can be divided into three categories: structural line segments along the X-axis, structural line segments along the Y-axis, and structural line segments along the Z-axis.

[0099] After determining the direction of the structural line segment, the first projection plane corresponding to the structural line segment can be determined. The two-dimensional coordinates of the intersection point of the structural line segment and the first projection plane are the 2D projection coordinates of the structural line segment, and these coordinates are used to represent the structural line segment in the VIO system.

[0100] As shown in Figure 3, line segments ② and ⑧ are structural line segments in the X-axis direction, and their corresponding projection plane is YOZ; line segments ①, ③, ④ and ⑥ are structural line segments in the Z-axis direction, and their corresponding projection plane is XOY; line segments ⑤ and ⑦ are structural line segments in the Y-axis direction, and their corresponding projection plane is XOZ.

[0101] In some embodiments, determining the structural line segment and its global direction based on the image data includes:

[0102] Based on the direction vectors of the coordinate axes of the global coordinate system, determine the fourth, fifth, and sixth vanishing points corresponding to the direction vectors in the image coordinate system;

[0103] Feature extraction is performed on the image data to determine the target line segment in the image data;

[0104] Based on the midpoint of the target line segment, the fourth vanishing point, the fifth vanishing point, and the sixth vanishing point, the first line segment, the second line segment, and the third line segment are determined;

[0105] The structural line segment and the global direction are determined based on the third included angle between the target line segment and the first line segment, the second line segment, and the third line segment.

[0106] After the coordinate alignment step, the global coordinate system G of the VIO system coincides with the building coordinate system B. Projecting the direction vectors of the three coordinate axes of the global coordinate system G onto the image coordinate system C yields the vanishing points $v'_i$ corresponding to those direction vectors. The projection formula for $v'_i$ is as follows:

[0107] $v'_i = K\, {}^{C}_{I}R\, {}^{I}_{G}R\, e_i,\quad i \in \{x, y, z\} \qquad (7)$

[0108] Where $K$ represents the pre-calibrated camera intrinsic parameter matrix, ${}^{I}_{G}R$ represents the rotation matrix from the global coordinate system G to the current IMU coordinate system, ${}^{C}_{I}R$ represents the rotation matrix between the IMU and the camera, and $e_i$ represents the direction vector of a coordinate axis of the global coordinate system G.

[0109] That is, the homogeneous coordinates of the vanishing points in the image coordinate system C can be obtained by formula (7).
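The projection of the three axis directions can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `predicted_vanishing_points` and the argument names are hypothetical, and the IMU-camera rotation is assumed here to map IMU coordinates into camera coordinates.

```python
import numpy as np

def predicted_vanishing_points(K, R_ci, R_ig):
    """Project the global axis directions into homogeneous image points.

    K: 3x3 intrinsic matrix.
    R_ci: 3x3 rotation from the IMU frame to the camera frame (assumed direction).
    R_ig: 3x3 rotation from the global frame G to the current IMU frame.
    Returns [v'_x, v'_y, v'_z] as homogeneous 3-vectors.
    """
    R_cg = R_ci @ R_ig                 # global frame -> camera frame
    axes = np.eye(3)                   # rows are e_x, e_y, e_z
    return [K @ R_cg @ e for e in axes]
```

A third component of zero in a returned vector means the vanishing point lies at infinity in the image, which happens whenever the corresponding axis is parallel to the image plane.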

[0110] For any target line segment $l_j$ extracted from the image, let $m_j$ be its midpoint. Connecting the midpoint $m_j$ to the fourth vanishing point $v'_x$ gives the first line segment $l_{xj}$; connecting $m_j$ to the fifth vanishing point $v'_y$ gives the second line segment $l_{yj}$; and connecting $m_j$ to the sixth vanishing point $v'_z$ gives the third line segment $l_{zj}$.

[0111] Thus, the included angle $\delta_{xj}$ between the target line segment $l_j$ and the first line segment $l_{xj}$, the included angle $\delta_{yj}$ between $l_j$ and the second line segment $l_{yj}$, and the included angle $\delta_{zj}$ between $l_j$ and the third line segment $l_{zj}$ can be obtained.

[0112] If $\delta_{xj}$, $\delta_{yj}$, and $\delta_{zj}$ are all greater than 3 degrees, the target line segment $l_j$ is determined not to be a structural line segment. Otherwise, the target line segment $l_j$ is determined to be a structural line segment.

[0113] When the target line segment is determined to be a structural line segment, the coordinate axis direction corresponding to the smallest of $\delta_{xj}$, $\delta_{yj}$, and $\delta_{zj}$ is selected as the global direction of the structural line segment.
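The classification rule above can be sketched as follows. This is a hedged illustration, not the patented code: `line_angle_deg` and `classify_segment` are hypothetical names, the handling of vanishing points at infinity (third homogeneous component equal to zero) is an assumption added so the example runs on such inputs, and only the 3-degree threshold comes from the text.

```python
import numpy as np

def line_angle_deg(d1, d2):
    """Unsigned angle in degrees, in [0, 90], between two 2D line directions."""
    n = np.linalg.norm(d1) * np.linalg.norm(d2)
    if n == 0.0:
        return 90.0                      # degenerate input: treat as no match
    cos = abs(np.dot(d1, d2)) / n
    return float(np.degrees(np.arccos(np.clip(cos, 0.0, 1.0))))

def classify_segment(p0, p1, vanishing_points, thresh_deg=3.0):
    """Return 'X', 'Y', 'Z', or None for a 2D segment with endpoints p0, p1.

    vanishing_points: [v'_x, v'_y, v'_z] in homogeneous pixel coordinates.
    """
    p0, p1 = np.asarray(p0, dtype=float), np.asarray(p1, dtype=float)
    mid = 0.5 * (p0 + p1)
    seg_dir = p1 - p0
    angles = []
    for v in vanishing_points:
        v = np.asarray(v, dtype=float)
        if abs(v[2]) < 1e-12:
            d = v[:2]                    # point at infinity: direction is fixed
        else:
            d = v[:2] / v[2] - mid       # direction from midpoint to vanishing point
        angles.append(line_angle_deg(seg_dir, d))
    best = int(np.argmin(angles))
    # Non-structural if even the smallest angle exceeds the threshold
    return None if angles[best] > thresh_deg else "XYZ"[best]
```

Since the smallest angle decides both whether the segment is structural and which axis it is assigned to, a single `argmin` followed by one threshold test covers paragraphs [0112] and [0113] together.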

[0114] Furthermore, to improve the accuracy of line segment orientation classification, the orientation consistency of structural line segments can be detected across multiple image frames. Assuming a structural line L in three-dimensional space can be observed in m consecutive image frames, errors in the line segment detection, tracking, and classification steps may lead to the line segment being classified in different orientations on different images.

[0115] Suppose that, among the m frames, the direction most often assigned to the structural line segment L is the X-axis direction, in k frames. The line segment is determined to be a structural line segment in the X-axis direction only when the ratio k / m is greater than a certain threshold. Otherwise, the line segment is classified as a non-structural line segment.

[0116] Optionally, embodiments of the present invention employ different thresholds in the three main directions, with the thresholds for the X and Y axes being 0.75 and the threshold for the Z axis being 0.90. The reason for setting these thresholds is that, for images captured by handheld cameras or drones in artificial environments, extracting and tracking horizontal line segments is more difficult than extracting and tracking vertical line segments. If the same threshold is set, the number of structural line segments that the system can utilize in the horizontal direction is extremely limited.
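The multi-frame consistency check can be sketched as a simple majority vote. The threshold values 0.75 and 0.90 and the strict "greater than" comparison are taken from the text; the function name `consistent_direction` and the use of `None` for frames in which the segment was not classified as structural are assumptions of this sketch.

```python
from collections import Counter

# Per-direction acceptance thresholds on k / m, as given in the embodiment
THRESHOLDS = {"X": 0.75, "Y": 0.75, "Z": 0.90}

def consistent_direction(labels):
    """Decide the direction of a tracked line from its per-frame labels.

    labels: one entry per observing frame, each 'X', 'Y', 'Z', or None.
    Returns the majority direction if its share k / m strictly exceeds the
    threshold for that direction, else None (non-structural).
    """
    m = len(labels)
    votes = Counter(label for label in labels if label is not None)
    if m == 0 or not votes:
        return None
    direction, k = votes.most_common(1)[0]
    return direction if k / m > THRESHOLDS[direction] else None
```

With these thresholds, a line labelled X in 8 of 10 frames is accepted, while a line labelled Z in 9 of 10 frames is rejected, because 0.90 is not strictly greater than the Z threshold.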

[0117] In step 103, the predicted state vector and covariance matrix are updated based on the structural line segment and the global direction.

[0118] In some embodiments, updating the predicted state vector and covariance matrix based on the structural line segment and the global direction includes:

[0119] Based on the global direction, determine the first projection plane corresponding to the structural line segment;

[0120] The second projection plane is determined based on the first projection plane and the camera optical center;

[0121] Based on the first Plücker coordinates of the structural line segment in the global coordinate system and the second projection plane, the initial projection coordinates of the structural line segment are determined;

[0122] The initial projection coordinates are optimized based on the method of minimizing reprojection error to determine the target projection coordinates;

[0123] Based on the target projection coordinates, the predicted state vector and covariance matrix are updated.

[0124] To utilize the features of structural line segments in a VIO system, it is necessary to first calculate the position of the structural line segments in three-dimensional space. Currently, there is no specific triangulation method for structural line segments, resulting in the ineffective utilization of the global orientation information of the structural line segments during the triangulation process. Therefore, this invention proposes a triangulation method specifically for structural line segments.

[0125] First, based on the global direction of the structural line segment, determine the first projection plane corresponding to the structural line segment. The 2D projection coordinates of the structural line segment are defined as the intersection of the structural line segment and its corresponding first projection plane. The first projection plane is perpendicular to the structural line segment and passes through the origin of the coordinate system. For example, if the global direction of the structural line segment is the X-axis, then the first projection plane is YOZ.

[0126] Then, treating the structural line segment as a regular line segment and using the traditional line segment triangulation algorithm (specifically, the multi-view constrained line segment triangulation method), the first Plücker matrix $L^{*}$ of the structural line segment L in the global coordinate system G can be obtained ($L^{*}$ is a 4×4 matrix). Because this process does not use the direction information of the structural line segment L, the first Plücker coordinates contain both angular and positional errors, so neither the obtained line segment direction nor its position is accurate.

[0127] The greater the distance between the structural line segment and the projection plane, the larger the difference between the initial projected coordinates and the true values. Therefore, a plane parallel to the first projection plane and passing through the optical center of the current camera is used as the alternative projection plane, i.e., the second projection plane. The distance between the alternative projection plane and the structural line segment is smaller, and the error of the initial projected coordinates calculated using the alternative projection plane is also smaller.

[0128] For example, suppose structural line segment L is observed by camera $C_n$ and the optical center position of the camera is ... Then the alternative projection plane $\pi_P$ passing through the camera's optical center is expressed as follows:

[0129]

[0130] The intersection point of structural line segment L and the alternative projection plane $\pi_P$ is:

[0131] $Q = L^{*}\pi_P = [X_P\ Y_P\ Z_P\ W_P]^T$ (9)

[0132] Here, $L^{*}$ represents the first Plücker coordinates, and $\pi_P$ represents the second projection plane.

[0133] The initial projected coordinates $q_{init}$ of the structural line segment L are:

[0134]
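The plane expression and the expression for $q_{init}$ are not reproduced in this text. The sketch below shows one way the initial projection could be computed, under these stated assumptions: the 4×4 matrix is the point-form Plücker matrix $AB^T - BA^T$, for which the product with a plane gives the homogeneous intersection point; the alternative plane is written as $[d^T, -d^T c]^T$ for axis direction d and optical center c; and $q_{init}$ keeps the two coordinates of the intersection point that are not along the global direction. All function names are hypothetical.

```python
AXES = {"x": 0, "y": 1, "z": 2}

def plucker_matrix(a, b):
    """4x4 point-form Pluecker matrix A B^T - B A^T of the line through
    3D points a and b (used here only to build a test input)."""
    A = list(a) + [1.0]
    B = list(b) + [1.0]
    return [[A[i] * B[j] - B[i] * A[j] for j in range(4)] for i in range(4)]

def alternative_plane(axis, optical_center):
    """Plane perpendicular to `axis` through the camera optical center."""
    d = [0.0, 0.0, 0.0]
    d[AXES[axis]] = 1.0
    return d + [-optical_center[AXES[axis]]]

def initial_projection(L, axis, optical_center):
    """Intersect the line with the alternative plane (Q = L * pi), then
    drop the coordinate along the global direction to obtain q_init."""
    plane = alternative_plane(axis, optical_center)
    Q = [sum(L[i][j] * plane[j] for j in range(4)) for i in range(4)]
    if abs(Q[3]) < 1e-12:
        raise ValueError("line is parallel to the projection plane")
    point = [Q[i] / Q[3] for i in range(3)]  # dehomogenize
    return [point[i] for i in range(3) if i != AXES[axis]]
```

With a slightly tilted line through (5, 1, 2) and (9, 1.4, 2) and a camera at x = 7, the intersection with the plane x = 7 is (7, 1.2, 2), so the sketch returns approximately [1.2, 2.0].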

[0135] After obtaining the initial projected coordinates of structural line segment L through the above steps, since the global direction of structural line segment L is known, the Plücker coordinates of structural line segment L can be directly constructed from the initial projected coordinates. The constructed Plücker coordinates only have positional errors and no directional errors, meaning that the global direction information of the structural line segment is utilized in the triangulation process.

[0136] Then, by optimizing the position of the structural line segments using the method of minimizing reprojection error, more accurate target projection coordinates can be obtained. The optimization objective function used in this embodiment of the invention is as follows:

[0137]

[0138] Here, $e_l$ represents the reprojection error of structural line segment L. The objective function is optimized using the Levenberg-Marquardt (LM) algorithm, which yields more accurate target projection coordinates. These target projection coordinates can be used to construct a map of structural line segments in the current environment.
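The objective function itself is not reproduced in this text. One form consistent with the description, assuming the structural line segment is observed in a set of frames $\mathcal{K}$ (a symbol introduced here for illustration) and that $e_l^{(k)}(q)$ denotes the reprojection error in frame k as a function of the projected coordinates q, would be a sum of squared reprojection errors:

```latex
q^{*} = \arg\min_{q} \sum_{k \in \mathcal{K}} \left\| e_l^{(k)}(q) \right\|^{2}
```

Because the global direction is fixed, only the two components of q are free in such a problem, which is what makes the refinement a position-only optimization.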

[0139] The pose estimation method for a VIO system based on structural line segments provided in this invention optimizes the position of the structural line segments by utilizing the global orientation information of the structural line segments after determining the initial projected coordinates of the structural line segments, thereby improving the accuracy of the position of the structural line segments and further improving the pose estimation accuracy.

[0140] In some embodiments, updating the predicted state vector and covariance matrix based on the target projected coordinates includes:

[0141] Based on the target projection coordinates and the global direction, determine the target intersection point between the structural line segment and the first projection plane;

[0142] Based on the target intersection point and the global direction, determine the second Plücker coordinates of the structural line segment in the global coordinate system;

[0143] Based on the second Plücker coordinates, determine the third Plücker coordinates of the structural line segment in the camera coordinate system;

[0144] Based on the third Plücker coordinates, the structural line segment is projected from the camera coordinate system onto the image plane to determine the reprojection error of the structural line segment;

[0145] Based on the reprojection error, the predicted state vector and covariance matrix are updated.

[0146] In some embodiments, updating the predicted state vector and covariance matrix based on the reprojection error includes:

[0147] Determine the Jacobian matrix of the reprojection error with respect to the camera pose and the target projection coordinates;

[0148] Based on the reprojection error and the Jacobian matrix, the predicted state vector and covariance matrix are updated.

[0149] For structural line segment L, the target projection coordinate is q, and the 3D intersection point of structural line segment L and the first projection plane is Q.

[0150] After determining the global direction of the structural line segment L and the target projection coordinates q, the 3D intersection point of the structural line segment L and the first projection plane, i.e., the target intersection point Q, can be calculated using the following formula:

[0151] $Q_G = P^T q$ (12)

[0152] Where P represents the transformation matrix from 2D projected coordinates to the 3D intersection point.

[0153] The expression for P is as follows:

[0154]

[0155] Q is a point on the structural line segment L. Since the global direction of the structural line segment L is known, the Plücker coordinates $L_G$ of the structural line segment L in the global coordinate system G, that is, the second Plücker coordinates, can be obtained. $L_G$ is expressed as follows:

[0156]

[0157] Where n represents the normal vector of the plane formed by the structural line segment L and the origin O; $d = e_i$, $i \in \{x, y, z\}$, represents the global direction of the structural line segment L; and ... represents the antisymmetric matrix of the three-dimensional intersection point $Q_G = [q_1\ q_2\ q_3]^T$ of the structural line segment and its projection plane.

[0158] The expression is as follows:

[0159]

[0160] That is, after the above steps, the Plücker coordinates $L_G$ of the structural line segment L in the global coordinate system G are obtained.
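The construction of the second Plücker coordinates from the 2D projected coordinates can be sketched as follows. The expressions for P and for the line coordinates are not reproduced in this text, so the sketch assumes that $P^T q$ re-inserts a zero along the global direction (the first projection plane passes through the origin) and that the normal vector is the cross product of the intersection point and the direction, with the result stored as [n, d]. The function names and the ordering of the two components of q are illustrative choices.

```python
AXES = {"x": 0, "y": 1, "z": 2}

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def lift_projection(q, axis):
    """Q_G = P^T q: place the two projected coordinates back in 3D, with
    a zero along `axis`."""
    Q = [float(q[0]), float(q[1])]
    Q.insert(AXES[axis], 0.0)
    return Q

def plucker_from_projection(q, axis):
    """Return [n, d] with d the unit axis direction and n = Q_G x d."""
    Q = lift_projection(q, axis)
    d = [0.0, 0.0, 0.0]
    d[AXES[axis]] = 1.0
    return cross(Q, d) + d
```

Since n is a cross product with d, the result always satisfies the Plücker constraint n · d = 0, and the direction part carries no error by construction.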

[0161] To calculate the reprojection error of the structural line segment, it is necessary to obtain the projection of the structural line segment L onto the image.

[0162] First, the Plücker coordinates $L_C$ of the structural line segment L in the current camera coordinate system, that is, the third Plücker coordinates, are calculated using the camera pose. $L_C$ is expressed as follows:

[0163]

[0164] Here, ... and ... represent the rotation and translation of the current camera in the global coordinate system G, respectively, and $L_G$ represents the second Plücker coordinates.

[0165] Next, the structural line segment L is projected from the camera coordinate system onto the image plane. The expression for the projected line segment l is as follows:

[0166]

[0167] Here, $f_x$ and $f_y$ represent the camera's focal lengths, and $c_x$ and $c_y$ represent the principal point position of the camera; their specific values can be determined through pre-calibration.

[0168] The reprojection error $e_l$ of structural line segment L is the distance from the two endpoints $x_s$ and $x_e$ of the tracked line segment to the projected line segment l. $e_l$ is calculated as follows:

[0169]

[0170] Where $x_s = [u_s\ v_s\ 1]^T$ and $x_e = [u_e\ v_e\ 1]^T$ are the homogeneous coordinates of the endpoints of the tracked line segment in the pixel coordinate system.
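The expressions for the camera-frame transform, the line projection, and the error are not reproduced in this text. The sketch below uses forms that are common for Plücker lines with a pinhole camera and should be read as assumptions: `R_GC` rotates camera-frame vectors into the global frame, `p_GC` is the camera position in the global frame, the camera-frame normal is $n_C = R_{GC}^T((Q - p_{GC}) \times d)$, the projected line is $l = \mathcal{K} n_C$ with $\mathcal{K}$ built from $f_x, f_y, c_x, c_y$, and the error is the signed point-to-line distance of each endpoint. All names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def line_reprojection_error(Q, d, R_GC, p_GC, fx, fy, cx, cy, x_s, x_e):
    """Signed distances of the tracked endpoints x_s, x_e (pixels) to the
    projection of the 3D line through point Q with direction d."""
    rel = [Q[i] - p_GC[i] for i in range(3)]
    n_g = cross(rel, d)
    # Normal of the plane through the line and the optical center, in the
    # camera frame: n_c = R_GC^T * n_g.
    n_c = [sum(R_GC[r][c] * n_g[r] for r in range(3)) for c in range(3)]
    # Line projection l = K_line * n_c for a pinhole camera.
    l = [fy * n_c[0],
         fx * n_c[1],
         -fy * cx * n_c[0] - fx * cy * n_c[1] + fx * fy * n_c[2]]
    norm = math.hypot(l[0], l[1])
    if norm == 0.0:
        raise ValueError("line projects to a degenerate image line")
    return tuple((pt[0] * l[0] + pt[1] * l[1] + l[2]) / norm for pt in (x_s, x_e))
```

For a camera at the origin with identity rotation and a line along X through (0, 1, 5), with $f_x = f_y = 500$ and principal point (320, 240), the projected line is v = 340, so an endpoint at (300, 340) has zero error and one at (400, 343) has an error of 3 pixels.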

[0171] Based on the reprojection error, we can obtain the Jacobian matrix of the reprojection error with respect to the camera pose, and the Jacobian matrix of the reprojection error with respect to the target projection coordinates.

[0172] Therefore, the predicted state vector and covariance matrix can be updated based on the reprojection error and the Jacobian matrix.
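The update equations are not spelled out in this text. Since the covariance is described in terms of a Kalman filter, a standard extended Kalman filter update is one plausible reading. In the sketch below, r is the stacked residual formed from $e_l$, $H_x$ and $H_q$ are the Jacobians with respect to the camera pose and the target projection coordinates, n is measurement noise with covariance R, and H denotes the Jacobian that remains after the $H_q\,\delta q$ term has been handled. These symbols are introduced here for illustration, and how that term is handled is an open assumption.

```latex
r \approx H_x\,\delta x + H_q\,\delta q + n
\\
K = P H^{T}\left(H P H^{T} + R\right)^{-1}, \qquad
\delta\hat{x} = K r, \qquad
P \leftarrow (I - K H)\,P
```

The correction $\delta\hat{x}$ is applied to the predicted state vector, and the covariance shrinks accordingly, which matches the statement that the update step reduces the uncertainty of the system.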

[0173] Optionally, the prediction process is as follows: First, acquire the IMU data between the current image time and the previous image time. Then, use the IMU dynamics equations to integrate the system state vector and covariance from the previous time to the current time. Second, since new images are added to the system, state augmentation is required. This involves adding the current camera pose to the initial state vector and expanding the covariance matrix to obtain the predicted state vector and covariance matrix.

[0174] The covariance matrix is a matrix associated with the state vector; it describes the uncertainty of the Kalman filter state estimate and the degree of correlation between the estimation errors.

[0175] In step 104, the camera pose is determined based on the updated state vector and covariance matrix.

[0176] The camera pose can be accurately determined based on the updated state vector and covariance matrix.

[0177] The pose estimation method for VIO systems based on structural line segments provided in this invention uses the global orientation of structural line segments for pose estimation. Since structural line segments on the image are direct observations of the global orientation of the environment, the accumulated error in rotation estimation of the VIO system can be eliminated by using structural line segments, thereby improving the pose estimation accuracy.

[0178] Figure 4 is the second flowchart illustrating the pose estimation method for a VIO system based on structural line segments provided in this embodiment of the invention. As shown in Figure 4, the method mainly includes four steps: VIO initialization and coordinate alignment, image processing, system prediction, and system update.

[0179] S1, VIO initialization and coordinate alignment, includes two steps: VIO system initialization and coordinate alignment.

[0180] S1.1, VIO system initialization mainly completes the estimation of the initial values of gravity direction, velocity, position and scale.

[0181] S1.2 After the VIO system initialization is completed, the coordinate alignment step uses multi-frame image information to calculate the rotation matrix between the current VIO global coordinate system G and the building coordinate system B. Then, the state variables of the VIO system are transformed from the global coordinate system G to the building coordinate system B. After the transformation is completed, the three coordinate axes of the global coordinate system G of the VIO system are aligned with the three principal directions of the Manhattan world model.
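The coordinate alignment can be sketched as applying one rotation to every state quantity expressed in the global frame. The text does not list which state variables are transformed, so the choice of position, velocity and orientation below is an assumption, as are the function names and the convention that `R_BG` maps global-frame vectors into the building frame.

```python
def mat_vec(R, v):
    """3x3 matrix times 3-vector."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

def mat_mul(A, B):
    """3x3 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def align_state_to_building(R_BG, position_G, velocity_G, R_G_body):
    """Re-express global-frame state quantities in the building frame B."""
    return {
        "position": mat_vec(R_BG, position_G),
        "velocity": mat_vec(R_BG, velocity_G),
        "orientation": mat_mul(R_BG, R_G_body),  # body-to-building rotation
    }
```

After this step the axes of the frame in which the state is expressed coincide with the three principal directions, so a structural line segment's direction is simply one of the unit axis vectors.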

[0182] S2. Image processing includes five steps: point feature extraction, point feature tracking, line segment feature extraction, line segment feature tracking, and line segment direction classification.

[0183] Point feature extraction and tracking use the FAST (Features from Accelerated Segment Test) corner detection algorithm and the KLT (Kanade-Lucas-Tomasi) optical flow tracking algorithm, respectively; line segment feature extraction and matching use the Line Segment Detector (LSD) algorithm and the Line Band Descriptor (LBD) algorithm, respectively.

[0184] Specifically, in order to utilize structural information in the environment, the results of line segment tracking need to be classified to determine whether a line segment is a structural line segment. If it is a structural line segment, its corresponding global direction needs to be further determined.

[0185] S3. The system prediction step uses IMU information between two frames of images to predict the latest system state vector and covariance. This process increases the uncertainty of the system.

[0186] The prediction process of the VIO system is as follows: First, acquire the IMU data between the current image time and the previous image time, and then use the IMU dynamics equation to integrate the system state vector and covariance from the previous time to the current time. Second, since a new image has been added to the system, state expansion is required, that is, adding the current camera pose to the system state vector and expanding the covariance matrix.
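The covariance expansion in the state augmentation step can be sketched as follows. This is the usual block form for appending a new state that is a function of the existing state with Jacobian J; the text does not give the formula, so this form and the function names are assumptions.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def augment_covariance(P, J):
    """Return [[P, P J^T], [J P, J P J^T]] for a new state with Jacobian J
    with respect to the existing state."""
    JP = matmul(J, P)
    PJt = transpose(JP)  # equals P J^T because P is symmetric
    JPJt = matmul(JP, transpose(J))
    top = [list(P[i]) + PJt[i] for i in range(len(P))]
    bottom = [JP[i] + JPJt[i] for i in range(len(J))]
    return top + bottom
```

With P = [[2, 1], [1, 3]] and J = [[1, 0]], the new state is a copy of the first component, so the added row and column repeat the first row and column of P.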

[0187] S4. The update step of the proposed structural line segment-assisted VIO system utilizes the observation information of both point features and structural line segments, which reduces the uncertainty of the system.

[0188] Specifically, for structural line segments, the proposed triangulation method is first used to calculate their projected coordinates. Then, the reprojection error and Jacobian matrix of the structural line segment features are calculated using the projected coordinates. Finally, the system state vector and covariance are updated.

[0189] The update process of the VIO system is as follows: After obtaining the correlation between point features and structural line segment features in continuous images through image processing steps, in order to utilize the observation information of point features and structural line segment features in the VIO system, it is first necessary to triangulate the point features and line segment features respectively to obtain their coordinates in three-dimensional space. Secondly, using the observation models of point features and structural line segment features, the reprojection error and Jacobian matrix of point features and structural line segments are calculated respectively, thereby updating the system state vector and covariance.

[0190] After the above steps, the global direction information of the structural line segments is fully utilized by the VIO system.

[0191] This invention provides a pose estimation method for a VIO system based on structural line segments. The system first performs VIO initialization, then coordinate alignment, and finally, after image processing, prediction, and updating, obtains the pose estimation result. Since structural line segments encode global orientation information of the environment, they can be used to eliminate accumulated errors in rotation estimation of the VIO system, thereby improving the pose estimation accuracy. Furthermore, by using the observation information of structural line segments, the robustness of the odometry system and its adaptability to complex environments (weak texture, motion blur, and lighting variations) can be improved. Because man-made architectural environments generally exhibit relatively obvious structural regularities, this invention has a wide range of applications.

[0192] The pose estimation device for a VIO system based on structural line segments provided by the present invention will be described below. The pose estimation device for a VIO system based on structural line segments described below can be referred to in correspondence with the pose estimation method for a VIO system based on structural line segments described above.

[0193] Figure 5 is a schematic diagram of the pose estimation device of the VIO system based on structural line segments provided in an embodiment of the present invention. As shown in Figure 5, the pose estimation device for a VIO system based on structural line segments provided in this embodiment of the invention includes:

[0194] The transformation module 510 is used to transform the initial state vector of the VIO system to the building coordinate system based on the rotation matrix between the global coordinate system and the building coordinate system; the rotation matrix is determined based on the image data acquired by the VIO system.

[0195] The first determining module 520 is used to determine the structural line segment and the global direction of the structural line segment based on the image data; the structural line segment is a line segment with the same direction as a coordinate axis of the building coordinate system;

[0196] The update module 530 is used to update the predicted state vector and covariance matrix based on the structural line segment and the global direction; the predicted state variables and covariance matrix are determined based on the initial state vector after coordinate transformation.

[0197] The second determination module 540 is used to determine the camera pose based on the updated state vector and covariance matrix.

[0198] It should be noted that the pose estimation device for the VIO system based on structural line segments provided in this embodiment of the invention can implement all the method steps implemented in the above-mentioned pose estimation method embodiment for the VIO system based on structural line segments, and can achieve the same technical effect. Here, the parts that are the same as those in the method embodiment and the beneficial effects will not be described in detail.

[0199] Optionally, the device further includes a third determining module, used for:

[0200] Among the vanishing points corresponding to the image data, determine the first vanishing point, the second vanishing point, and the third vanishing point, which correspond to the coordinate axis directions of the building coordinate system, respectively;

[0201] Based on the first vanishing point, the second vanishing point, and the third vanishing point, determine the first and second included angles between the building coordinate system and the global coordinate system in the X-axis and Y-axis directions;

[0202] The rotation matrix is determined based on the first included angle and the second included angle.

[0203] Optionally, the first determining module 520 is specifically used for:

[0204] Based on the direction vectors of the coordinate axes of the global coordinate system, determine the fourth, fifth, and sixth vanishing points corresponding to the direction vectors in the image coordinate system;

[0205] Feature extraction is performed on the image data to determine the target line segment in the image data;

[0206] Based on the midpoint of the target line segment, the fourth vanishing point, the fifth vanishing point, and the sixth vanishing point, the first line segment, the second line segment, and the third line segment are determined;

[0207] The structural line segment and the global direction are determined based on the third included angle between the target line segment and the first line segment, the second line segment, and the third line segment.

[0208] Optionally, the update module 530 is specifically used for:

[0209] Based on the global direction, determine the first projection plane corresponding to the structural line segment;

[0210] The second projection plane is determined based on the first projection plane and the camera optical center;

[0211] Based on the first Plücker coordinates of the structural line segment in the global coordinate system and the second projection plane, the initial projection coordinates of the structural line segment are determined;

[0212] The initial projection coordinates are optimized based on the method of minimizing reprojection error to determine the target projection coordinates;

[0213] Based on the target projection coordinates, the predicted state vector and covariance matrix are updated.

[0214] Optionally, the update module 530 is specifically used for:

[0215] Based on the target projection coordinates and the global direction, determine the target intersection point between the structural line segment and the first projection plane;

[0216] Based on the target intersection point and the global direction, determine the second Plücker coordinates of the structural line segment in the global coordinate system;

[0217] Based on the second Plücker coordinates, determine the third Plücker coordinates of the structural line segment in the camera coordinate system;

[0218] Based on the third Plücker coordinates, the structural line segment is projected from the camera coordinate system onto the image plane to determine the reprojection error of the structural line segment;

[0219] Based on the reprojection error, the predicted state vector and covariance matrix are updated.

[0220] Optionally, the update module 530 is specifically used for:

[0221] Determine the Jacobian matrix of the reprojection error with respect to the camera pose and the target projection coordinates;

[0222] Based on the reprojection error and the Jacobian matrix, the predicted state vector and covariance matrix are updated.

[0223] Figure 6 is a schematic diagram of the physical structure of an electronic device. As shown in Figure 6, the electronic device may include a processor 610, a communications interface 620, a memory 630, and a communication bus 640. The processor 610, communications interface 620, and memory 630 communicate with each other via the communication bus 640. The processor 610 can call logical instructions in the memory 630 to execute a pose estimation method for a VIO system based on structural line segments. This method includes: transforming the initial state vector of the VIO system to the building coordinate system based on a rotation matrix between the global coordinate system and the building coordinate system; the rotation matrix is determined based on image data acquired by the VIO system; determining structural line segments and their global orientation based on the image data; the structural line segments are line segments with the same orientation as the coordinate axes of the building coordinate system; updating the predicted state vector and covariance matrix based on the structural line segments and the global orientation; the predicted state variables and covariance matrix are determined based on the initial state vector after coordinate transformation; and determining the camera pose based on the updated state vector and covariance matrix.

[0224] Furthermore, the logical instructions in the aforementioned memory 630 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, essentially, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0225] On the other hand, the present invention also provides a computer program product, which includes a computer program that can be stored on a non-transitory computer-readable storage medium. When the computer program is executed by a processor, the computer can execute the pose estimation method for a VIO system based on structural line segments provided by the above methods. The method includes: transforming the initial state vector of the VIO system to the building coordinate system based on a rotation matrix between the global coordinate system and the building coordinate system; the rotation matrix is determined based on image data acquired by the VIO system; determining structural line segments and their global orientation based on the image data; the structural line segments are line segments with the same orientation as the coordinate axes of the building coordinate system; updating the predicted state vector and covariance matrix based on the structural line segments and the global orientation; the predicted state variables and covariance matrix are determined based on the initial state vector after coordinate transformation; and determining the camera pose based on the updated state vector and covariance matrix.

[0226] In another aspect, the present invention also provides a non-transitory computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the pose estimation method for a VIO system based on structural line segments provided by the methods described above. This method includes: transforming the initial state vector of the VIO system to the building coordinate system based on a rotation matrix between a global coordinate system and a building coordinate system; the rotation matrix being determined based on image data acquired by the VIO system; determining structural line segments and their global orientation based on the image data; the structural line segments being line segments with the same orientation as the coordinate axes of the building coordinate system; updating the predicted state vector and covariance matrix based on the structural line segments and the global orientation; the predicted state variables and covariance matrix being determined based on the initial state vector after coordinate transformation; and determining the camera pose based on the updated state vector and covariance matrix.

[0227] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Those skilled in the art can understand and implement this without any creative effort.

[0228] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus necessary general-purpose hardware platforms, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments.

[0229] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A pose estimation method for a VIO system based on structural line segments, characterized in that it comprises: transforming the initial state vector of the VIO system to the building coordinate system based on the rotation matrix between the global coordinate system and the building coordinate system, the rotation matrix being determined based on image data acquired by the VIO system; determining structural line segments and their global directions based on the image data, the structural line segments being line segments whose directions are the same as the coordinate axes of the building coordinate system; updating the predicted state vector and covariance matrix based on the structural line segments and the global direction, the predicted state variables and covariance matrix being determined based on the initial state vector after coordinate transformation; and determining the camera pose based on the updated state vector and covariance matrix.

2. The pose estimation method for a VIO system based on structural line segments according to claim 1, characterized in that, before transforming the initial state vector of the VIO system to the building coordinate system based on the rotation matrix between the global coordinate system and the building coordinate system, the method further comprises: determining, among the vanishing points corresponding to the image data, the first vanishing point, the second vanishing point, and the third vanishing point, which correspond to the coordinate axis directions of the building coordinate system, respectively; determining, based on the first vanishing point, the second vanishing point, and the third vanishing point, the first and second included angles between the building coordinate system and the global coordinate system in the X-axis and Y-axis directions; and determining the rotation matrix based on the first included angle and the second included angle.

3. The pose estimation method for a VIO system based on structural line segments according to claim 1, characterized in that determining the structural line segment and its global direction based on the image data comprises: determining, based on the direction vectors of the coordinate axes of the global coordinate system, the fourth, fifth, and sixth vanishing points corresponding to the direction vectors in the image coordinate system; performing feature extraction on the image data to determine the target line segment in the image data; determining the first line segment, the second line segment, and the third line segment based on the midpoint of the target line segment, the fourth vanishing point, the fifth vanishing point, and the sixth vanishing point; and determining the structural line segment and the global direction based on the third included angle between the target line segment and the first line segment, the second line segment, and the third line segment.

4. The pose estimation method for a VIO system based on structural line segments according to claim 1, characterized in that updating the predicted state vector and covariance matrix based on the structural line segment and the global direction comprises: determining, based on the global direction, a first projection plane corresponding to the structural line segment; determining a second projection plane based on the first projection plane and the camera optical center; determining initial projection coordinates of the structural line segment based on the second projection plane and on first Plücker coordinates of the structural line segment in the global coordinate system; optimizing the initial projection coordinates by minimizing the reprojection error to determine target projection coordinates; and updating the predicted state vector and covariance matrix based on the target projection coordinates.

5. The pose estimation method for a VIO system based on structural line segments according to claim 4, characterized in that updating the predicted state vector and covariance matrix based on the target projection coordinates comprises: determining, based on the target projection coordinates and the global direction, a target intersection point between the structural line segment and the first projection plane; determining, based on the target intersection point and the global direction, second Plücker coordinates of the structural line segment in the global coordinate system; determining, based on the second Plücker coordinates, third Plücker coordinates of the structural line segment in the camera coordinate system; projecting, based on the third Plücker coordinates, the structural line segment from the camera coordinate system onto the image plane to determine the reprojection error of the structural line segment; and updating the predicted state vector and covariance matrix based on the reprojection error.
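The chain in claim 5 (point and direction, Plücker coordinates in the global frame, Plücker coordinates in the camera frame, image line, reprojection error) can be sketched with the standard Plücker line formulas. The sketch assumes a pinhole camera with intrinsics `fx, fy, cx, cy`, a global-to-camera pose `(R_cw, t_cw)` such that p_c = R_cw · p_w + t_cw, and a reprojection error defined as the signed distance of each observed endpoint to the projected line. These conventions and all function names are assumptions; the claim does not fix them.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def matvec(M, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]

def plucker_from_point_dir(p, d):
    """Plücker coordinates (n, d) of the line through p with direction d."""
    return cross(p, d), list(d)

def plucker_to_camera(n_w, d_w, R_cw, t_cw):
    """Global-frame line to camera frame: n_c = R n_w + t x (R d_w), d_c = R d_w."""
    d_c = matvec(R_cw, d_w)
    Rn = matvec(R_cw, n_w)
    t_x_d = cross(t_cw, d_c)
    return [Rn[i] + t_x_d[i] for i in range(3)], d_c

def project_line(n_c, fx, fy, cx, cy):
    """Image line l = (l1, l2, l3) with l1*u + l2*v + l3 = 0 (pinhole model)."""
    return [fy * n_c[0],
            fx * n_c[1],
            -fy * cx * n_c[0] - fx * cy * n_c[1] + fx * fy * n_c[2]]

def reprojection_error(line, endpoints):
    """Signed pixel distance from each observed endpoint (u, v) to the line."""
    norm = math.hypot(line[0], line[1])
    return [(line[0] * u + line[1] * v + line[2]) / norm for u, v in endpoints]
```

Only the normal part `n_c` of the camera-frame Plücker coordinates enters the projection, which is why `project_line` takes no direction argument.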

6. The pose estimation method for a VIO system based on structural line segments according to claim 5, characterized in that updating the predicted state vector and covariance matrix based on the reprojection error comprises: determining the Jacobian matrix of the reprojection error with respect to the camera pose and the target projection coordinates; and updating the predicted state vector and covariance matrix based on the reprojection error and the Jacobian matrix.
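The claim does not say which filter equations perform the update. As one plausible reading, the sketch below applies a standard extended Kalman filter update for a single scalar residual, which avoids a matrix inverse; a two-component reprojection error can be processed one component at a time when the measurement noise is assumed uncorrelated. Here `h_row` stands for one row of the Jacobian, `residual` is the observed value minus the predicted value, and `meas_var` is an assumed measurement noise variance. None of these names come from the patent.

```python
def ekf_scalar_update(x, P, h_row, residual, meas_var):
    """One EKF update step for a scalar measurement.

    x: state vector (list of n floats).
    P: symmetric n x n covariance (nested lists).
    h_row: Jacobian row of the measurement with respect to x.
    residual: observed minus predicted measurement.
    meas_var: measurement noise variance (assumed value).
    """
    n = len(x)
    # P H^T, an n-vector because the measurement is scalar.
    PHt = [sum(P[i][j] * h_row[j] for j in range(n)) for i in range(n)]
    # Innovation variance S = H P H^T + R.
    S = sum(h_row[i] * PHt[i] for i in range(n)) + meas_var
    # Kalman gain K = P H^T / S.
    K = [v / S for v in PHt]
    x_new = [x[i] + K[i] * residual for i in range(n)]
    # P_new = (I - K H) P; for symmetric P the row vector H P equals PHt.
    P_new = [[P[i][j] - K[i] * PHt[j] for j in range(n)] for i in range(n)]
    return x_new, P_new
```

With the reprojection error e used as the predicted measurement and zero as the observed value, the residual passed in would be -e under this convention.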

7. A pose estimation device for a VIO system based on structural line segments, characterized in that it comprises: a transformation module configured to transform an initial state vector of the VIO system to a building coordinate system based on a rotation matrix between a global coordinate system and the building coordinate system, the rotation matrix being determined based on image data acquired by the VIO system; a first determination module configured to determine a structural line segment and a global direction of the structural line segment based on the image data, the structural line segment being a line segment whose direction is the same as that of a coordinate axis of the building coordinate system; an update module configured to update a predicted state vector and covariance matrix based on the structural line segment and the global direction, the predicted state vector and covariance matrix being determined based on the initial state vector after the coordinate transformation; and a second determination module configured to determine a camera pose based on the updated state vector and covariance matrix.

8. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when executing the program, the processor implements the pose estimation method for a VIO system based on structural line segments according to any one of claims 1 to 6.

9. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the computer program implements the pose estimation method for a VIO system based on structural line segments according to any one of claims 1 to 6.

10. A computer program product comprising a computer program, characterized in that, when executed by a processor, the computer program implements the pose estimation method for a VIO system based on structural line segments according to any one of claims 1 to 6.