Planar estimation method, apparatus, device and storage medium
By estimating the triangular plane and normal vector histogram of the target image on mobile devices, and combining SLAM and pyramid optical flow tracking algorithms, the problem of insufficient computing resources for plane recognition on mobile devices is solved, and efficient and accurate plane detection is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- GUANGZHOU SHIYUAN ELECTRONICS CO LTD
- Filing Date
- 2024-12-24
- Publication Date
- 2026-06-26
AI Technical Summary
Existing computer vision systems struggle to efficiently identify and estimate planes in 3D space on mobile devices, especially horizontal and vertical planes, due to limited computing resources.
By estimating the triangular plane of the target image in the spatial coordinate system, calculating the normal vector and generating the normal vector histogram, the same plane region in the image is determined using the normal vector histogram, and feature point tracking is performed by combining SLAM information and pyramid optical flow tracking algorithm to filter out noise location points, thereby achieving plane detection.
Efficient plane detection is achieved on devices with limited computing resources, reducing computational load, saving hardware costs and computing resources, and improving the accuracy and speed of plane recognition.
Figure CN122289366A_ABST
Description
Technical Field
[0001] This application relates to the field of image processing, and more particularly to a plane estimation method, apparatus, device, and storage medium.
Background Technology
[0002] With the development of virtual reality (VR), augmented reality (AR), and mixed reality (MR) technologies, spatial computing needs to include not only head pose estimation but also rich environmental information. In existing computer vision systems, the estimation of object surfaces in 3D space, especially the identification and estimation of horizontal and vertical planes, typically relies on complex algorithms and substantial computational resources. These systems often suffer from limitations in processing speed and accuracy, particularly on mobile devices where hardware computing resources restrict performance. How to extract the planar positions of rough surfaces such as floors, walls, and desktops in space using the limited computing resources of mobile terminals has become a challenge and a difficult problem for the industry.
Summary of the Invention
[0003] This application provides a plane estimation method, apparatus, device, and storage medium for performing plane estimation.
[0004] In a first aspect, a plane estimation method is provided, including:
[0005] Acquire a target image, wherein the target image is an image containing a plane;
[0006] Estimate the triangular planes of the target image in the spatial coordinate system to obtain multiple spatial planes. The spatial planes are obtained based on the triangular image regions in the target image, where the triangular image regions are the image regions formed by connecting three feature points in the target image.
[0007] Calculate the normal vector of each of the multiple spatial planes to obtain multiple normal vectors;
[0008] Based on the multiple normal vectors, a normal vector histogram is generated. The normal vector histogram includes the number of normal vectors corresponding to different angle intervals. The angle in the angle interval is the angle between the normal vector and the preset reference object.
[0009] The first triangular image region and the second triangular image region in the target image are determined as the same plane region in the target image. The first triangular image region and the second triangular image region are two adjacent triangular image regions in the target image. Furthermore, the normal vector of the spatial plane corresponding to the first triangular image region and the normal vector of the spatial plane corresponding to the second triangular image region correspond to the same angle interval in the normal vector histogram.
[0010] In this technical solution, after acquiring the target image, multiple spatial planes are obtained by estimating the triangular planes of the target image in the spatial coordinate system. Then, the normal vector of each spatial plane is calculated to obtain multiple normal vectors. Based on the multiple normal vectors, a normal vector histogram is generated. The normal vector histogram includes normal vector data corresponding to different angle intervals. Finally, two adjacent triangular image regions in the same angle interval of the corresponding normal vector histogram in the target image are determined as the same plane region in the target image, thus realizing the detection of planes in the image. Since the same plane region in the image is determined based on the normal vector histogram, only normal vector calculation and statistics are required. Compared with plane estimation through deep learning, the amount of computation can be reduced and computing resources can be saved, thus enabling plane detection on devices with limited computing resources (such as mobile devices).
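The final grouping step can be sketched as follows. This is a minimal illustration, not the application's implementation: it assumes triangles are given as vertex-index triples, that "adjacent" means sharing an edge (the text does not define adjacency), and that each triangle already has the index of its angle interval. The names `group_coplanar_triangles` and `bin_of` are hypothetical.

```python
from collections import defaultdict

def group_coplanar_triangles(triangles, bin_of):
    """Merge adjacent triangles whose normals fall in the same angle interval.

    triangles: list of (i, j, k) vertex-index triples (hypothetical layout).
    bin_of:    angle-interval index of each triangle's normal vector.
    Returns one region label per triangle; equal labels mean "same plane region".
    """
    parent = list(range(len(triangles)))

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Map each undirected edge to the triangles that contain it.
    edge_to_tris = defaultdict(list)
    for t, (i, j, k) in enumerate(triangles):
        for a, b in ((i, j), (j, k), (k, i)):
            edge_to_tris[(min(a, b), max(a, b))].append(t)

    # Assumption: two triangles are adjacent when they share an edge.
    for tris in edge_to_tris.values():
        for other in tris[1:]:
            if bin_of[tris[0]] == bin_of[other]:
                parent[find(other)] = find(tris[0])

    return [find(t) for t in range(len(triangles))]

triangles = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
labels = group_coplanar_triangles(triangles, bin_of=[5, 5, 0])
```

In this example the first two triangles share an edge and an angle interval, so they receive the same label, while the third stays separate.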
[0011] In conjunction with the first aspect, estimating the triangular planes of the target image in a spatial coordinate system to obtain multiple spatial planes includes: determining target feature points in the target image, wherein the target feature points exist in a series of consecutive observation images corresponding to the target image, and the series of observation images includes the target image; determining the spatial position corresponding to the target feature points based on the observation pose information of the series of observation images, wherein the observation pose information reflects the camera pose corresponding to the observation images; extracting triangular patches from the target image based on the target feature points in the target image to obtain multiple triangular image regions in the target image, wherein the vertices of each triangular image region are three target feature points in the target image, and the multiple triangular image regions do not overlap with each other; and mapping the multiple triangular image regions to a spatial coordinate system based on the spatial positions corresponding to the target feature points to obtain the multiple spatial planes.
[0012] The spatial position of a feature point in an image is determined from the observation pose information of multiple consecutive frames of images, and the triangular image regions in the image are then mapped to obtain spatial planes. This approach estimates the spatial positions of feature points from simultaneous localization and mapping (SLAM) information, whose globally consistent spatial information helps ensure the accuracy of the spatial position estimation. It also eliminates the need for an additional depth sensor to determine the depth information of feature points, which saves hardware costs.
[0013] In conjunction with the first aspect, in one possible implementation, determining the spatial position corresponding to the target feature point based on the observation pose information of the multi-frame observation images includes: combining any two frames of observation images to obtain multiple observation image combinations; triangulating the target feature point based on the observation pose information of the observation images in the target observation image combination to obtain triangulated position points corresponding to the target observation image combination, wherein the target observation image combination is any one of the multiple observation image combinations, and the triangulated position points are used to reflect the spatial position of the target feature point; filtering out noise position points from the multiple triangulated position points corresponding to the target feature point, wherein the multiple triangulated position points are the triangulated position points corresponding to the multiple observation image combinations, and the noise position points are triangulated position points that do not belong to the fitting plane corresponding to the multiple triangulated position points; and determining the spatial position corresponding to the target feature point based on the remaining triangulated position points excluding the noise position points from the multiple triangulated position points.
[0014] In the process of spatial location estimation of feature points in an image based on SLAM information, the accuracy of spatial location estimation can be improved by filtering out noisy location points obtained through triangulation.
[0015] In conjunction with the first aspect, in one possible implementation, filtering out noise position points from the plurality of triangulated position points corresponding to the target feature point includes: selecting any three triangulated position points from the plurality of triangulated position points as three inliers of a position plane, constructing the position plane, and determining an outer point set, wherein the outer point set includes at least one outer point, an outer point being a triangulated position point other than the three selected triangulated position points; starting from the first outer point in the outer point set, obtaining a target outer point from the outer point set; calculating the distance between the target outer point and the position plane; if the distance is less than a preset threshold, determining the target outer point as an inlier of the position plane; if the number of inliers in the position plane is less than a preset threshold, then...; if the number of inliers is greater than the maximum number of inliers, updating the maximum number of inliers to the number of inliers in the position plane, updating the position plane based on all inliers in the position plane, obtaining the next outer point after the target outer point from the outer point set as the new target outer point, and executing again the step of calculating the distance between the target outer point and the position plane; if the distance is greater than or equal to the preset threshold, obtaining the next outer point after the target outer point from the outer point set as the new target outer point, and executing again the step of calculating the distance between the target outer point and the position plane; and filtering out, from the plurality of triangulated position points, the triangulated position points that do not belong to the target position plane, wherein the target position plane is the position plane with the largest number of inliers obtained in the final update.
[0016] Filtering out noisy location points based on random sample consensus (RANSAC) can improve the accuracy of filtering, thereby improving the accuracy of spatial location estimation.
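The idea of plane-based noise filtering can be illustrated with a textbook RANSAC loop. Note that this is a simplified stand-in, not the incremental procedure described above: it repeatedly samples three points, counts inliers, and keeps the largest inlier set without refitting the plane. The function names, the 0.02 distance threshold, and the iteration count are all assumptions made for the sketch.

```python
import math
import random

def plane_from_points(p, q, r):
    """Return (unit normal n, offset d) with n.x + d = 0, or None if degenerate."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = math.sqrt(sum(c * c for c in n))
    if norm < 1e-12:  # the three points are (nearly) collinear
        return None
    n = tuple(c / norm for c in n)
    d = -sum(n[i] * p[i] for i in range(3))
    return n, d

def filter_noise_points(points, dist_threshold=0.02, iterations=50, seed=0):
    """Keep the triangulated points that lie on the best-supported plane."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [pt for pt in points
                   if abs(sum(n[i] * pt[i] for i in range(3)) + d) < dist_threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

# Five points on the plane z = 0 plus one noisy triangulation result.
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (0.5, 0.2, 0), (0.3, 0.7, 1.0)]
kept = filter_noise_points(points)
```

The surviving points could then be averaged or otherwise combined to give the spatial position of the feature point; the text leaves that choice open.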
[0017] In conjunction with the first aspect, in one possible implementation, determining the target feature points in the target image includes: performing feature point tracking on the multi-frame observation images based on the pyramid optical flow tracking algorithm to obtain the tracked feature points in each frame of the multi-frame observation images; and determining the tracked feature points in the target image as the target feature points in the target image.
[0018] The pyramid optical flow tracking algorithm is used to track and identify feature points in images without having to calculate feature points frame by frame, which can improve the speed of feature point recognition.
[0019] In conjunction with the first aspect, in one possible implementation, generating a normal vector histogram based on the plurality of normal vectors includes: calculating the angle between the target normal vector and the reference plane to obtain the target angle, wherein the target normal vector is any one of the plurality of normal vectors; incrementing the number of normal vectors corresponding to the angle interval to which the target angle belongs by one; and generating the normal vector histogram based on each angle interval and the number of normal vectors corresponding to each angle interval.
[0020] By calculating the angle between the normal vector and the reference plane, and counting the number of normal vectors in different angle intervals, a normal vector histogram is generated. This method is simple and can save computation.
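A minimal sketch of the histogram step follows. It assumes the reference is a fixed direction such as the gravity-aligned up axis (the text only says "preset reference object" or "reference plane"), and it assumes 10-degree angle intervals over 0 to 180 degrees; both choices, and the function names, are illustrative.

```python
import math

def angle_to_reference_deg(normal, reference=(0.0, 0.0, 1.0)):
    """Angle in degrees between a normal vector and the reference direction."""
    dot = sum(a * b for a, b in zip(normal, reference))
    len_n = math.sqrt(sum(a * a for a in normal))
    len_r = math.sqrt(sum(b * b for b in reference))
    cos_angle = max(-1.0, min(1.0, dot / (len_n * len_r)))  # guard rounding
    return math.degrees(math.acos(cos_angle))

def build_normal_histogram(normals, bin_width_deg=10.0):
    """Count normals per angle interval; also return each normal's interval index."""
    num_bins = int(math.ceil(180.0 / bin_width_deg))
    counts = [0] * num_bins
    bin_of = []
    for normal in normals:
        angle = angle_to_reference_deg(normal)
        index = min(int(angle // bin_width_deg), num_bins - 1)
        counts[index] += 1  # increment the interval the angle belongs to
        bin_of.append(index)
    return counts, bin_of

# Two roughly horizontal surfaces and two roughly vertical ones.
normals = [(0, 0, 1), (0.05, 0, 1), (1, 0, 0.1), (0, 1, 0.12)]
counts, bin_of = build_normal_histogram(normals)
```

With fixed intervals, two nearly parallel normals can still fall on opposite sides of an interval boundary, so the interval width trades robustness against angular resolution.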
[0021] In conjunction with the first aspect, in one possible implementation, the method is applied to a target device, which includes a plane estimation thread and a listening thread. The plane estimation thread is used to execute the above method to obtain plane estimation data; the listening thread is used to save the latest plane estimation data and send the latest plane estimation data to the application that needs the plane estimation data.
[0022] By setting up a separate listening thread to store the latest plane estimation data and send it to applications that need the plane estimation data, data processing and data application can be decoupled, ensuring that data processing is not disturbed.
[0023] In a second aspect, a plane estimation device is provided, comprising:
[0024] The image acquisition module is used to acquire a target image, wherein the target image is an image containing a plane;
[0025] The spatial plane estimation module is used to estimate the triangular planes of the target image in the spatial coordinate system to obtain multiple spatial planes. The spatial planes are obtained based on the triangular image regions in the target image, and the triangular image regions are the image regions formed by connecting three feature points in the target image.
[0026] The normal vector calculation module is used to calculate the normal vector of each of the multiple spatial planes to obtain multiple normal vectors;
[0027] The histogram generation module is used to generate a normal vector histogram based on the plurality of normal vectors. The normal vector histogram includes the number of normal vectors corresponding to different angle intervals, and the angle in the angle interval is the angle between the normal vector and a preset reference object.
[0028] The planar region determination module is used to determine the first triangular image region and the second triangular image region in the target image as the same planar region in the target image. The first triangular image region and the second triangular image region are two adjacent triangular image regions in the target image. Furthermore, the normal vector of the spatial plane corresponding to the first triangular image region and the normal vector of the spatial plane corresponding to the second triangular image region correspond to the same angle interval in the normal vector histogram.
[0029] In a third aspect, a computer device is provided, including a memory and a processor, the memory being connected to the processor, and the processor being configured to execute one or more computer programs stored in the memory, wherein, when executing the one or more computer programs, the processor causes the computer device to implement the plane estimation method of the first aspect described above.
[0030] In a fourth aspect, a computer-readable storage medium is provided, which stores a computer program, the computer program including program instructions which, when executed by a processor, cause the processor to perform the plane estimation method of the first aspect.
[0031] This application can achieve the following technical effects: it realizes the detection of planes in images; since the same plane region in the image is determined based on the normal vector histogram, only the normal vector calculation and statistics need to be performed. Compared with the plane estimation through deep learning, it can reduce the amount of computation and save computing resources, thus enabling plane detection on devices with limited computing resources (such as mobile devices).
Attached Figure Description
[0032] To more clearly illustrate the technical solutions of the embodiments of this application, the drawings used in the description of the embodiments of this application will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0033] Figure 1 is a schematic diagram of the threads of the target device provided in an embodiment of this application;
[0034] Figure 2 is a schematic flowchart of a plane estimation method provided in an embodiment of this application;
[0035] Figure 3 is a schematic flowchart of filtering out noise position points provided in an embodiment of this application;
[0036] Figure 4 is a schematic diagram of the normal vector histogram provided in an embodiment of this application;
[0037] Figure 5 is a schematic diagram of the structure of a plane estimation device provided in an embodiment of this application;
[0038] Figure 6 is a schematic diagram of the structure of a computer device provided in an embodiment of this application.
Detailed Implementation
[0039] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the scope of this application. All other embodiments obtained by those skilled in the art based on the embodiments in this application without inventive effort are within the scope of protection of this application.
[0040] It should be noted that, unless there is a conflict, the various features in the embodiments of this application can be combined with each other, all of which are within the protection scope of this application. Furthermore, although functional modules are divided in the device schematic diagram and a logical order is shown in the flowchart, in some cases, the steps shown or described can be executed in a different order than the module division in the device or the order in the flowchart. Moreover, the terms "first," "second," and "third" used in this application do not limit the data or execution order, but only distinguish identical or similar items with substantially the same function and effect.
[0041] The technical solution of this application is applicable to scenarios that require plane estimation. These scenarios can be, for example, AR scenarios, where plane estimation is used to detect and track planes in the real world so that virtual content can be overlaid on the planes; or autonomous driving scenarios, where plane estimation is used to identify road planes to help vehicles understand the geometric features of the road for path planning and obstacle avoidance; or medical image analysis scenarios, where plane estimation is used to identify and analyze planar structures in medical images, such as bones and organs; and the examples are not limited to those described here.
[0042] In some feasible technical solutions, deep learning is used to estimate planes in images. Taking PlaneNet as an example, its core is a deep learning model based on a convolutional neural network (CNN). It predicts the depth value of each pixel in an image using this deep learning model and then performs plane detection and segmentation based on these depth values. This approach requires complex computations and significant computing resources, making it unsuitable for mobile devices with limited computing resources, such as some VR devices.
[0043] In view of this, this application proposes a plane estimation scheme. By estimating the triangular planes of the target image in the spatial coordinate system, multiple spatial planes are obtained. Then, the normal vector of each spatial plane is calculated to obtain multiple normal vectors. Based on the multiple normal vectors, a normal vector histogram is generated. The normal vector histogram includes the number of normal vectors corresponding to different angle intervals. Finally, two adjacent triangular image regions in the same angle interval in the corresponding normal vector histogram of the target image are determined as the same plane region in the target image. This is equivalent to determining the same plane region in the image based on the normal vector histogram. Only normal vector calculation and statistics are required. Compared with plane estimation through deep learning, it can reduce the amount of computation and save computing resources, thus enabling plane detection on devices with limited computing resources (such as mobile devices).
[0044] The technical solution of this application can be applied to a target device, such as a mobile device. As shown in Figure 1, the target device includes an image sensor and an inertial measurement unit (IMU), which are synchronized to the same clock source. The image sensor acquires images, and the IMU measures inertial data. The image sensor can be, for example, a fisheye camera, and the IMU includes a gyroscope and an accelerometer. The target device also includes a plane estimation thread and a listening thread. The plane estimation thread receives the sensor data acquired by the image sensor and the IMU and combines it with the image pose information output by the SLAM module in the target device to perform plane estimation and obtain plane estimation data. The listening thread coordinates the output of the plane estimation thread with application requests: it saves the latest plane estimation data output by the plane estimation thread and sends that data to the applications that need it. Specifically, the listening thread can read the latest plane estimation data output by the plane estimation thread at a fixed frame rate, using read-write locks to control the write and read operations of the application. At any given time, the listening thread allows only one thread to perform a write operation and allows multiple threads to perform read operations. The listening thread can start when the plane estimation thread starts and shut down when the plane estimation thread shuts down. By setting up a separate listening thread to store the latest plane estimation data and send it to applications that need it, data processing and data application can be decoupled, ensuring that data processing is not disturbed.
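The decoupling between the estimation thread and its consumers can be sketched as a shared "latest result" slot. This is only an illustration under stated assumptions: the class and function names are hypothetical, the published dictionaries are placeholders rather than real plane data, and because the Python standard library has no read-write lock, a plain mutex (`threading.Lock`) is substituted for the read-write lock mentioned in the text.

```python
import threading

class LatestPlaneData:
    """Holds only the most recent plane estimation result."""

    def __init__(self):
        self._lock = threading.Lock()  # stand-in for a read-write lock
        self._data = None
        self._version = 0

    def publish(self, planes):
        # Writer side: called by the plane estimation thread.
        with self._lock:
            self._data = planes
            self._version += 1

    def snapshot(self):
        # Reader side: called on behalf of applications.
        with self._lock:
            return self._version, self._data

def estimation_worker(store, frame_ids):
    for frame_id in frame_ids:
        # Placeholder result; a real worker would run the plane estimation here.
        store.publish({"frame": frame_id, "planes": []})

store = LatestPlaneData()
worker = threading.Thread(target=estimation_worker, args=(store, range(5)))
worker.start()
worker.join()
version, latest = store.snapshot()
```

Because readers only ever see the newest snapshot, a slow application cannot stall the estimation thread for longer than one short critical section.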
[0045] The technical solution of this application is described in detail below. Refer to Figure 2, which is a schematic flowchart of a plane estimation method provided in an embodiment of this application. As shown in Figure 2, the method includes the following steps:
[0046] S101, acquire the target image.
[0047] Here, the target image is an image containing a plane, and the target image is the image from which the plane needs to be estimated. The target image can be obtained by capturing an image using a camera in the target device.
[0048] S102, estimate the triangular plane of the target image in the spatial coordinate system to obtain multiple spatial planes.
[0049] Here, each of the multiple spatial planes is derived from a triangular image region in the target image. A triangular image region in the target image is an image region formed by connecting three feature points in the target image. Feature points in the target image refer to points in the target image that possess distinct characteristics, effectively reflect the essential features of the target image, and can identify objects in the target image. Feature points in the target image include, but are not limited to, corner points and edge points. In the target image, the triangular region formed by connecting three feature points is the triangular image region; mapping the three vertices of the triangular image region to spatial position points in a spatial coordinate system, the triangular plane formed by connecting the three spatial position points corresponding to the three vertices is the spatial plane.
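As a small worked example, the spatial plane spanned by three mapped vertices can be represented by its unit normal, obtained from the cross product of two triangle edges. The function name `triangle_normal` and the degeneracy tolerance are assumptions for this sketch.

```python
import math

def triangle_normal(p0, p1, p2):
    """Unit normal of the spatial triangle with vertices p0, p1, p2."""
    u = [p1[i] - p0[i] for i in range(3)]
    v = [p2[i] - p0[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(c * c for c in n))
    if length < 1e-12:
        raise ValueError("degenerate triangle: vertices are collinear")
    return tuple(c / length for c in n)

# Three spatial position points lying on the floor plane z = 0.
floor_normal = triangle_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))
```

The sign of the normal depends on the vertex order, so a consistent winding (or a sign-insensitive angle measure) is needed before normals are compared.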
[0050] In one feasible implementation, the triangular planes of the target image in the spatial coordinate system can be estimated through the following steps A1-A4 to obtain multiple spatial planes:
[0051] A1. Identify the target feature points in the target image.
[0052] Here, the target feature points exist in a series of consecutive observation images corresponding to the target image. An observation image is an image used to observe the positional changes of the feature points. The series of consecutive observation images corresponding to the target image includes the target image itself, which can be a series of consecutive frames captured by a camera. For example, if the series of consecutive observation images corresponding to the target image consists of N consecutive frames captured by a camera (N≥2), then the target image can be the last frame in the N consecutive frames, or it can be the first frame in the N consecutive frames. The target feature points refer to the feature points that exist within the N frames.
[0053] There are multiple target feature points in the target image.
[0054] In one possible implementation, the target feature points in the target image can be determined through the following steps A11-A12:
[0055] A11. Based on the pyramid optical flow tracking algorithm, feature point tracking is performed on multiple frames of observation images to obtain the tracking feature points in each frame of the multiple observation images.
[0056] Here, the pyramid optical flow tracking algorithm is an algorithm for target tracking. The pyramid optical flow tracking algorithm decomposes the observed image into multiple pyramid images with different resolutions. On each pyramid image layer, the motion vector between adjacent pixels is calculated using the optical flow method, thereby obtaining the motion vector of the feature point between two adjacent observation images.
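The multi-resolution decomposition can be sketched as below. This is a simplification: production implementations usually smooth with a Gaussian kernel before subsampling, whereas this sketch uses a plain 2x2 box average, and it represents an image as a list of rows of floats. The names `downsample_2x` and `build_pyramid` are hypothetical.

```python
def downsample_2x(image):
    """Halve the resolution by averaging each 2x2 block (box filter)."""
    height, width = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(width)]
            for r in range(height)]

def build_pyramid(image, levels=3):
    """Return [full resolution, 1/2, 1/4, ...] with at most `levels` entries."""
    pyramid = [image]
    for _ in range(levels - 1):
        top = pyramid[-1]
        if len(top) < 2 or len(top[0]) < 2:
            break  # cannot shrink any further
        pyramid.append(downsample_2x(top))
    return pyramid

image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, levels=3)
```

Tracking typically starts at the coarsest level, where large motions shrink to a few pixels, and the estimate is refined level by level down to full resolution.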
[0057] Feature point tracking based on the pyramid optical flow tracking algorithm can include the following steps (1)-(6):
[0058] (1) Use the initial image as the observation image.
[0059] Here, the initial image can be the first frame captured by the camera, or a pre-set image from which feature point tracking is to begin.
[0060] (2) Extract feature points from the observed image to obtain the first feature point in the observed image.
[0061] Here, corner detection algorithms such as FAST and Harris can be used to extract feature points from the observed image to obtain the first feature point in the observed image.
[0062] When the observed image is the initial image, feature points can be extracted from the entire observed image to obtain the first feature point in the observed image. When the observed image is not the initial image, the observed image contains a second feature point. The second feature point is the feature point obtained by tracking the feature points in the previous frame of the observed image. Feature points can be extracted from the blank feature regions in the observed image to obtain the first feature point. The blank feature regions in the observed image refer to the image regions that do not contain the second feature point.
[0063] In this process, after extracting feature points from the observed image to obtain the first feature points, non-maximum suppression can be applied to the first feature points in each local region of the observed image. That is, for each local region of the observed image, only the first feature point with the highest score is retained. In this way, the number of feature points in the observed image can be reduced, making the feature points in the observed image more trackable, and the amount of computation required for tracking can also be reduced.
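One simple way to realize this per-region suppression is a grid: the image is divided into square cells and only the highest-scoring feature point in each cell is kept. The cell size of 32 pixels, the `(u, v, score)` tuple layout, and the function name are assumptions for this sketch.

```python
def grid_non_max_suppression(keypoints, cell_size=32):
    """Keep the highest-scoring keypoint in each cell_size x cell_size cell.

    keypoints: list of (u, v, score) tuples (hypothetical layout).
    """
    best = {}
    for u, v, score in keypoints:
        cell = (int(u // cell_size), int(v // cell_size))
        if cell not in best or score > best[cell][2]:
            best[cell] = (u, v, score)
    return sorted(best.values())

candidates = [(5, 5, 0.9), (10, 12, 0.4), (40, 8, 0.7), (45, 20, 0.8)]
kept = grid_non_max_suppression(candidates)
```

Here the first two candidates share a cell, as do the last two, so one point survives per cell.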
[0064] (3) Based on the first feature point in the observed image, determine the feature points to be matched in the observed image.
[0065] When the observed image is the initial image, the first feature point in the observed image can be identified as the feature point to be matched in the observed image; when the observed image is not the initial image, the first feature point and the second feature point in the observed image can be identified as the feature points to be matched in the observed image.
[0066] (4) Based on the pyramid optical flow tracking algorithm, determine the optical flow information between the observed image and the next frame of the observed image.
[0067] Here, the optical flow information reflects the amount of motion, in the two directions of the image coordinate system, of the feature points to be matched in the observed image. The optical flow information between the observed image and the next frame includes the optical flow vector of each feature point to be matched in the observed image. The optical flow vector of a feature point to be matched in the observed image can be represented as $(\Delta u_i, \Delta v_i)$, which is the optical flow vector of the i-th feature point to be matched in the observed image.
[0068] In the process of determining the optical flow information between the observed image and the next frame image based on the pyramid optical flow tracking algorithm, initial optical flow information can be determined based on gyroscope data, and this initial optical flow information can be used to solve for the optical flow information between the observed image and the next frame image. This can accelerate the determination of the optical flow vector.
[0069] In addition, by combining the rotation information of the gyroscope data, the pixels near the feature point can be rotated before matching, thereby finding more effective matching points.
[0070] (5) Based on the optical flow information between the observed image and the next frame of the observed image, determine the position of each feature point to be matched in the observed image in the next frame of the observed image, and obtain the second feature point in the next frame of the observed image.
[0071] Here, the position of the i-th feature point to be matched in the next frame of the observed image can be represented as $(u2_i, v2_i)$, where $u2_i = u1_i + \Delta u_i$ and $v2_i = v1_i + \Delta v_i$, and $(u1_i, v1_i)$ represents the position coordinates of the i-th feature point to be matched in the observed image.
[0072] For a target feature point to be matched in the observed image (that is, any feature point to be matched in the observed image): if $(u2_i, v2_i)$ lies within the image's position coordinate range, it is determined that a second feature point matching the target feature point exists in the next frame of the observed image, and $(u2_i, v2_i)$ represents the position coordinates of that second feature point; if $(u2_i, v2_i)$ exceeds the image's position coordinate range, it is determined that there is no second feature point in the next frame of the observed image that matches the target feature point. By performing the same discrimination process for each feature point to be matched in the observed image, all second feature points in the next frame of the observed image can be determined.
[0073] If the target feature point to be matched is the second feature point in the observed image, or if there is a second feature point in the next frame of the observed image that matches the target feature point, then the target feature point to be matched is determined as the tracking feature point in the observed image.
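Step (5) amounts to adding each optical flow vector to its point and discarding results that leave the image. A minimal sketch, assuming pixel coordinates with u in [0, width) and v in [0, height) and hypothetical names:

```python
def propagate_points(points, flows, width, height):
    """Apply u2 = u1 + du, v2 = v1 + dv and keep matches that stay in the image.

    points: list of (u1, v1) positions in the current observed image.
    flows:  list of (du, dv) optical flow vectors, one per point.
    Returns a list of ((u1, v1), (u2, v2)) pairs for successfully tracked points.
    """
    tracked = []
    for (u1, v1), (du, dv) in zip(points, flows):
        u2, v2 = u1 + du, v1 + dv
        if 0 <= u2 < width and 0 <= v2 < height:
            tracked.append(((u1, v1), (u2, v2)))
    return tracked

matches = propagate_points(
    points=[(10.0, 20.0), (630.0, 100.0)],
    flows=[(2.5, -1.0), (15.0, 0.0)],
    width=640,
    height=480,
)
```

The second point moves to u = 645, outside a 640-pixel-wide image, so it has no match in the next frame and is dropped.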
[0074] (6) Take the next frame of the observed image as the observed image and execute step (2).
[0075] A12. Identify the tracking feature points in the target image as target feature points in the target image.
[0076] Here, the tracking feature points determined by performing steps (2)-(5) above with the target image as the observed image are the target feature points in the target image.
[0077] In steps A11-A12 above, feature point tracking and recognition of the image are performed based on the pyramid optical flow tracking algorithm. This eliminates the need to calculate the feature points of the image frame by frame, thus improving the speed of feature point recognition.
[0078] A2. Determine the spatial location of the target feature point based on the observation pose information of multiple frames of observation images.
[0079] Here, the observation pose information of an observed image reflects the camera pose corresponding to that image, that is, the camera's pose when capturing the observed image. The observation pose information of the multiple frames of observed images is calculated by the SLAM module in the target device. The observation pose information of the i-th observed image can be represented as $T_i$, where $T_i$ represents the transformation relationship between the world coordinate system and the camera coordinate system when the camera captures the i-th observed image.
[0080] Specifically, the spatial location of the target feature point can be determined using triangulation based on the observation pose information of multiple frames of observation images.
[0081] If the number of observed images is two, in one example, the spatial location of the target feature point can satisfy the following relationship (hereinafter referred to as Relation 1):
[0082]
[0083] (v1, u1) represents the position coordinates of the target feature point in the first observation image, which is one of two observation images. r11, r21, and r31 represent the observation pose information of the first observation image. (v2, u2) represents the position coordinates of the target feature point in the second observation image, which is the other of two observation images. r12, r22, and r32 represent the observation pose information of the second observation image. (x, y, z, 1) represents the homogeneous coordinates of the target feature point in the world coordinate system.
[0084] If the number of observed images exceeds two frames, in one example, the spatial location of the target feature point can satisfy the following relationship (hereinafter referred to as Relation 2):
[0085]
[0086] $(v_i, u_i)$ and $(v_j, u_j)$ represent the position coordinates of the target feature point in the i-th and j-th observation images among the multi-frame observation images; $r1_i$, $r2_i$, and $r3_i$ represent the observation pose information of the i-th observation image; $r1_j$, $r2_j$, and $r3_j$ represent the observation pose information of the j-th observation image; and (x, y, z, 1) represents the homogeneous coordinates of the target feature point in the world coordinate system.
[0087] Since the pose information of the observed image and the position coordinates of the target feature point in the observed image are known quantities, the homogeneous coordinates of the target feature point in the world coordinate system can be solved according to the above relation 1 or relation 2, thereby obtaining the spatial position coordinates of the target feature point.
[0088] It should be understood that Relations 1 and 2 mentioned above are only one implementation of using the triangulation method to solve for the spatial position of feature points. There are other implementations of using the triangulation method to solve for the spatial position of feature points, and this application does not limit them.
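Relations 1 and 2 themselves are not reproduced in this text. As a hedged illustration only, and not necessarily the form used in this application, one common linear triangulation constraint for two observation images can be written under the assumption that $r1_i$, $r2_i$, and $r3_i$ denote the three rows of a $3 \times 4$ projection matrix for the i-th observation image (camera pose combined with camera intrinsics), with $u_i$ taken as the horizontal and $v_i$ as the vertical pixel coordinate:

```latex
\begin{bmatrix}
u_1\, r3_1 - r1_1 \\
v_1\, r3_1 - r2_1 \\
u_2\, r3_2 - r1_2 \\
v_2\, r3_2 - r2_2
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
= \mathbf{0}
```

Under the same assumption, each additional observation image contributes two more rows of this form, and the homogeneous coordinates are typically solved in the least-squares sense, for example by singular value decomposition.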
[0089] When the number of observed images exceeds two frames, the spatial location of the target feature point can also be determined through the following steps A21-A24:
[0090] A21. Combine any two frames of observation images from multiple observation images to obtain multiple observation image combinations.
[0091] The number of observation image combinations is $C_N^2 = N(N-1)/2$, where N represents the number of observation images in the multiple frames.
[0092] A22. Based on the observation pose information of the observation images in the target observation image combination, triangulate the target feature points to obtain the triangulated position points corresponding to the target observation image combination.
[0093] Here, the target observation image combination is any one of the multiple observation image combinations, and the triangulated position points are used to reflect the spatial position of the target feature points.
[0094] In one feasible implementation, the triangulated position point corresponding to the target observation image combination can be determined by referring to Relation 1 above, and can be represented as (x, y, z, 1) in Relation 1. For each of the multiple observation image combinations, the target feature point is triangulated by referring to Relation 1 to obtain the triangulated position point corresponding to that combination, thereby obtaining multiple triangulated position points corresponding to the target feature point.
[0095] A23. Among the multiple triangulated location points corresponding to the target feature point, filter out the noise location points.
[0096] Here, the multiple triangulated position points corresponding to the target feature point are the triangulated position points corresponding to the multiple observation image combinations, and the noise position points are the triangulated position points that do not belong to the fitting plane corresponding to the multiple triangulated position points.
[0097] In one feasible implementation, noise position points can be filtered out from the multiple triangulated position points corresponding to the target feature point through the process shown in Figure 3, which includes the following steps a1 to a16:
[0098] a1. Set the number of plane iterations to one.
[0099] a2. Among the multiple triangulated position points corresponding to the target feature point, select any three triangulated position points as three interior points.
[0100] a3. Construct the position plane based on the three interior points.
[0101] A position plane can be constructed by connecting three interior points, where the three interior points are interior points in the position plane.
[0102] a4. Take the first outer point in the outer point set as the target outer point.
[0103] Here, the outer point set includes at least one outer point; the outer points are the triangulated position points, among the multiple triangulated position points corresponding to the target feature point, other than the three interior points used to construct the position plane.
[0104] a5. Calculate the distance between the target outer point and the position plane.
[0105] a6. Determine whether the distance between the target outer point and the position plane is less than a preset threshold.
[0106] If the distance between the target outer point and the position plane is less than the preset threshold, proceed to step a11; if the distance between the target outer point and the position plane is greater than or equal to the preset threshold, it indicates that the target outer point is a triangulated position point that does not belong to the position plane, and proceed to step a7.
[0107] a7. Determine whether the target outer point is the last outer point in the outer point set.
[0108] If the target outer point is not the last outer point in the outer point set, proceed to step a8; if the target outer point is the last outer point in the outer point set, proceed to step a9.
[0109] a8. Take the next outer point of the target outer point in the outer point set as the target outer point and execute step a5.
[0110] a9. Determine whether the number of plane iterations has reached the preset number.
[0111] If the number of plane iterations reaches the preset number, proceed to step a15; if the number of plane iterations does not reach the preset number, proceed to step a10.
[0112] a10. Increment the number of plane iterations by one and execute step a2.
[0113] a11. Determine the target outer point as an interior point in the position plane.
[0114] a12. Determine whether the number of interior points in the position plane is greater than the maximum number of interior points.
[0115] The initial value for the maximum number of interior points can be set to 3.
[0116] If the number of interior points in the position plane is greater than the maximum number of interior points, proceed to step a13; if the number of interior points in the position plane is less than or equal to the maximum number of interior points, proceed to step a14.
[0117] a13. Update the maximum number of interior points to the number of interior points in the position plane.
[0118] a14. Update the position plane based on the interior points in the position plane, and execute step a7.
[0119] a15. Determine the position plane with the most interior points obtained from the final update as the target position plane.
[0120] Here, the target position plane is the position plane corresponding to the maximum number of interior points determined during multiple plane iterations, that is, the position plane updated to obtain the maximum number of interior points.
[0121] a16. Among the multiple triangulated position points corresponding to the target feature point, filter out the triangulated position points that do not belong to the target position plane.
[0122] In steps a1-a16 above, filtering out noisy location points based on RANSAC can improve the accuracy of filtering, thereby improving the accuracy of spatial location estimation.
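The flow of steps a1-a16 can be sketched in Python roughly as follows. This is a simplified illustration under stated assumptions, not the implementation of this application: the names `plane_through` and `filter_noise_points`, the default threshold and iteration count, and the fixed random seed are hypothetical; the plane is built with a cross product; and the plane refit of step a14 is omitted, so each candidate plane stays the one defined by its three sampled points.

```python
import math
import random


def plane_through(p, q, r):
    """Plane through three points as (unit normal, d) with n.x + d = 0.

    Returns None when the three points are (nearly) collinear.
    """
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length < 1e-12:
        return None
    nx, ny, nz = nx / length, ny / length, nz / length
    return (nx, ny, nz), -(nx * p[0] + ny * p[1] + nz * p[2])


def filter_noise_points(points, threshold=0.05, iterations=50, seed=0):
    """Keep the points belonging to the plane with the most interior points."""
    if len(points) < 4:
        return list(points)  # too few points to separate noise from a plane
    rng = random.Random(seed)  # fixed seed: an assumption for repeatability
    max_inliers = 3            # initial maximum number of interior points
    best = None
    for _ in range(iterations):
        sample = rng.sample(range(len(points)), 3)   # step a2
        plane = plane_through(*(points[i] for i in sample))  # step a3
        if plane is None:
            continue
        (nx, ny, nz), d = plane
        # Steps a4-a11: an outer point closer than the threshold becomes interior.
        inliers = [
            i for i, (x, y, z) in enumerate(points)
            if i in sample or abs(nx * x + ny * y + nz * z + d) < threshold
        ]
        if len(inliers) > max_inliers:               # steps a12-a13
            max_inliers, best = len(inliers), inliers
    if best is None:
        return list(points)  # no plane gained interior points; keep everything
    return [points[i] for i in best]                 # step a16
```

For example, six points lying on the plane z = 0 plus one point at z = 5 would, with these defaults, be reduced to the six coplanar points.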
[0123] A24. Determine the spatial location of the target feature point based on the remaining triangulated location points (excluding noise location points) among the multiple triangulated location points corresponding to the target feature point.
[0124] After filtering out noise locations and obtaining the remaining triangulated locations (hereinafter referred to as remaining locations) among multiple triangulated locations corresponding to the target feature point, the mean of all remaining locations can be used to determine the spatial location corresponding to the target feature point. The spatial location corresponding to the target feature point is as follows:
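[0125] $x_d = \frac{1}{n}\sum_{k=1}^{n} x_k,\quad y_d = \frac{1}{n}\sum_{k=1}^{n} y_k,\quad z_d = \frac{1}{n}\sum_{k=1}^{n} z_k$
[0126] $(x_d, y_d, z_d)$ represents the spatial position coordinates of the target feature point, $(x_k, y_k, z_k)$ represents the position coordinates of the k-th remaining position point, and n represents the number of remaining position points.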
[0127] Alternatively, after filtering out noise location points, it can be determined whether the first triangulated location point has been filtered out. The first triangulated location point is a triangulated location point determined by the triangulation method based on the observation pose information of the target image. If the first triangulated location point has not been filtered out, then any first triangulated location point is determined as the spatial location corresponding to the target feature point. If the first triangulated location point has been filtered out, then the second triangulated location point can be determined as the spatial location corresponding to the target feature point. The second triangulated location point is a triangulated location point determined by the triangulation method based on the observation pose information of the observation image adjacent to the target image.
[0128] The spatial location of the target feature point can also be determined in other ways based on the triangulated location points remaining after filtering; this application is not limited to the methods described above.
[0129] In steps A21-A24 above, during the process of estimating the spatial location of feature points in the image based on SLAM information, filtering out some noisy location points obtained by triangulation can improve the accuracy of spatial location estimation.
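Steps A21-A24 can be summarized by the following Python sketch, which pairs the observation images, triangulates each pair, filters noise, and averages the remaining points. The function name `estimate_spatial_position` and the two callables `triangulate_pair` and `filter_noise` are hypothetical placeholders supplied by the caller; this application does not prescribe such an interface.

```python
from itertools import combinations


def estimate_spatial_position(observations, triangulate_pair, filter_noise):
    """Estimate one feature point's (x, y, z) from N per-frame observations.

    triangulate_pair(a, b) -> (x, y, z) and filter_noise(points) -> points
    are caller-supplied placeholders (assumptions of this sketch).
    """
    # A21: every unordered pair of frames, N * (N - 1) / 2 combinations.
    pairs = list(combinations(observations, 2))
    # A22: one triangulated position point per combination.
    candidates = [triangulate_pair(a, b) for a, b in pairs]
    # A23: drop the noise position points.
    remaining = filter_noise(candidates)
    if not remaining:
        raise ValueError("no triangulated position points remain after filtering")
    # A24: mean of the remaining position points.
    n = len(remaining)
    return tuple(sum(p[k] for p in remaining) / n for k in range(3))
```

With four observation images, the sketch forms six combinations, which matches the count N(N-1)/2 for N = 4.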
[0130] For each target feature point in the target image, the corresponding spatial position is determined according to step A2 above. This yields the spatial positions of all target feature points in the target image, which form the three-dimensional point cloud data of the target image.
[0131] A3. Based on the target feature points in the target image, perform triangular patch extraction on the target image to obtain multiple triangular image regions in the target image.
[0132] Here, the vertices of the triangular image regions in the target image are the three target feature points in the target image, and the multiple triangular image regions in the target image do not overlap with each other.
[0133] Specifically, based on the target feature points in the target image, the Delaunay triangulation algorithm can be used to extract triangular patches from the target image, resulting in multiple triangular image regions in the target image.
[0134] A4. Based on the spatial location of the target feature points, map multiple triangular image regions in the target image to a spatial coordinate system to obtain multiple spatial planes.
[0135] Specifically, for the target triangular image region in the target image, the three vertices can be determined in the spatial coordinate system based on their spatial positions and connected to each other, thereby mapping the target triangular image region to the spatial coordinate system. The target triangular image region is any one of the multiple triangular image regions in the target image.
[0136] For example, if the spatial coordinates of the three vertices of the target triangular image region are position coordinate 1 = (x1, y1, z1), position coordinate 2 = (x2, y2, z2), and position coordinate 3 = (x3, y3, z3), then position coordinate 1, position coordinate 2, and position coordinate 3 can be connected in the spatial coordinate system to obtain the spatial plane corresponding to the target triangular image region.
[0137] For each triangular image region in the target image, multiple spatial planes can be obtained by mapping in the same way.
[0138] In steps A1-A4 above, the spatial position of feature points in the image is determined based on the observation pose information of consecutive multi-frame observation images, and the triangular image regions in the image are then mapped to obtain spatial planes. Estimating the spatial position of feature points based on SLAM information draws on globally consistent spatial information, which helps ensure the accuracy of the spatial position estimation. It also eliminates the need for an additional depth sensor to determine the depth information of feature points, saving hardware costs.
[0139] S103, calculate the normal vector of each of the multiple spatial planes to obtain multiple normal vectors.
[0140] In one feasible implementation, the normal vector of the space plane can be determined based on the three vertices of the space plane.
[0141] Taking the three vertices of the spatial plane as (x1, y1, z1), (x2, y2, z2), and (x3, y3, z3) as an example, the normal vector of the spatial plane can be solved according to the following system of equations:
[0142]
[0143] (dx, dy, dz) represents the normal vector of the space plane, and (x1, y1, z1), (x2, y2, z2), and (x3, y3, z3) are known quantities. By solving for the unknown quantities (dx, dy, dz), the normal vector of the space plane can be calculated.
[0144] By calculating the normal vector for each of the multiple spatial planes in the same way, multiple normal vectors can be obtained.
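The system of equations referred to above is not reproduced in this text. As a hedged alternative that yields a vector perpendicular to the spatial plane, the normal can be computed as the cross product of two edge vectors of the triangle. The function name `triangle_normal` and the choice to return a unit-length vector are assumptions of this sketch.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def triangle_normal(p1: Vec3, p2: Vec3, p3: Vec3) -> Vec3:
    """Unit normal (dx, dy, dz) of the plane through p1, p2, p3."""
    # Two edge vectors lying in the plane.
    e1 = (p2[0] - p1[0], p2[1] - p1[1], p2[2] - p1[2])
    e2 = (p3[0] - p1[0], p3[1] - p1[1], p3[2] - p1[2])
    # The cross product is perpendicular to both edges.
    nx = e1[1] * e2[2] - e1[2] * e2[1]
    ny = e1[2] * e2[0] - e1[0] * e2[2]
    nz = e1[0] * e2[1] - e1[1] * e2[0]
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("the three vertices are collinear; no unique normal")
    return (nx / length, ny / length, nz / length)
```

The sign of the result depends on the vertex order; if only the angle interval matters, a consistent ordering convention would be needed, which this application does not specify.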
[0145] S104, generate a normal vector histogram based on the multiple normal vectors.
[0146] Here, the normal vector histogram includes the number of normal vectors corresponding to different angle intervals, where the angle in an angle interval is the angle between a normal vector and a preset reference object. The horizontal axis of the normal vector histogram represents the angle intervals, and the vertical axis represents the number of normal vectors within each interval. The preset reference object is a vector or plane used as a reference in the spatial coordinate system. The angle intervals on the horizontal axis can be obtained by dividing the angle range according to a preset interval. For example, using 5° as the preset interval yields 36 angle intervals: the first angle interval = [0°, 5°], the second angle interval = (5°, 10°], the third angle interval = (10°, 15°], ..., and the thirty-sixth angle interval = (175°, 180°]. For example, the normal vector histogram can be as shown in Figure 4.
[0147] In one feasible implementation, the normal vector histogram can be generated through the following steps B1-B3:
[0148] B1. Calculate the angle between the target normal vector and the reference plane to obtain the target angle.
[0149] Here, the target normal vector is any normal vector among multiple normal vectors. The reference plane can be a horizontal plane in the spatial coordinate system, or a plane perpendicular to the horizontal plane in the spatial coordinate system; or it can be any other plane, which is not limited in this application.
[0150] Specifically, the projection vector of the target normal vector onto the reference plane can be calculated to obtain the target projection vector; the normal vector of the reference plane can be calculated to obtain the reference normal vector; and then the angle between the target projection vector and the reference normal vector can be calculated to obtain the target angle.
[0151] B2. Increment the number of normal vectors corresponding to the included angle interval to which the target included angle belongs by one.
[0152] Here, the included angle interval to which the target angle belongs refers to the included angle interval that includes the target angle.
[0153] For example, suppose the target angle is 16° and the angle intervals are divided at a preset interval of 5°, giving 36 angle intervals: the first angle interval = [0°, 5°], the second angle interval = (5°, 10°], the third angle interval = (10°, 15°], ..., and the thirty-sixth angle interval = (175°, 180°]. The angle interval to which the target angle belongs is then the fourth angle interval = (15°, 20°], and the number of normal vectors corresponding to the fourth angle interval is incremented by one.
[0154] For each of the multiple normal vectors, the number of normal vectors corresponding to the included angle interval is counted according to the above method B1-B2, and then the number of normal vectors corresponding to each included angle interval can be obtained.
[0155] B3. Generate a normal vector histogram based on each included angle interval and the number of normal vectors corresponding to each included angle interval.
[0156] In this process, each included angle interval can be used as the horizontal axis, and the number of normal vectors corresponding to each included angle interval can be used as the vertical axis to draw a histogram, thereby generating a normal vector histogram.
[0157] In steps B1-B3 above, the angle between the normal vector and the reference plane is calculated, and the number of normal vectors in different angle intervals is counted based on the angle to generate a normal vector histogram. This method is simple and can save computation.
[0158] Optionally, multiple angles can be calculated between multiple normal vectors and a reference vector (e.g., a vector parallel to the X, Y, or Z axis in a spatial coordinate system), resulting in multiple angles; the angle interval to which each angle belongs can be determined, and the number of normal vectors in each angle interval can be incremented by one to obtain the number of normal vectors corresponding to each angle interval; then, a normal vector histogram can be generated based on each angle interval and the number of normal vectors corresponding to each angle interval. This application does not impose any restrictions on the method of generating the normal vector histogram.
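A minimal Python sketch of the histogram generation is given below, using the reference-vector variant described above. The function names, the default reference vector parallel to the Z axis, and the zero-based interval index are assumptions of this sketch; the interval boundaries follow the [0°, 5°], (5°, 10°], ... convention used in the example.

```python
import math


def angle_to_reference_deg(normal, reference):
    """Angle in degrees, within [0, 180], between a normal and a reference vector."""
    dot = sum(a * b for a, b in zip(normal, reference))
    norm = math.sqrt(sum(a * a for a in normal)) * math.sqrt(sum(b * b for b in reference))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding drift
    return math.degrees(math.acos(cos_theta))


def interval_index(angle_deg, width_deg=5.0):
    """Zero-based index of the interval: [0, w] -> 0, (w, 2w] -> 1, and so on."""
    return max(0, math.ceil(angle_deg / width_deg) - 1)


def normal_histogram(normals, reference=(0.0, 0.0, 1.0), width_deg=5.0):
    """Count of normal vectors per angle interval over [0, 180] degrees."""
    counts = [0] * int(round(180.0 / width_deg))
    for normal in normals:
        angle = angle_to_reference_deg(normal, reference)
        counts[interval_index(angle, width_deg)] += 1
    return counts
```

With these definitions, a target angle of 16° maps to index 3, that is, the fourth angle interval (15°, 20°].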
[0159] S105, determine the first triangular image region and the second triangular image region in the target image as the same plane region in the target image.
[0160] Here, the first triangular image region and the second triangular image region are two adjacent triangular image regions in the target image, and the normal vector of the spatial plane corresponding to the first triangular image region (hereinafter referred to as the first normal vector) and the normal vector of the spatial plane corresponding to the second triangular image region (hereinafter referred to as the second normal vector) correspond to the same angle interval in the normal vector histogram. That is, the angle calculated based on the first normal vector and the angle calculated based on the second normal vector belong to the same angle interval in the normal vector histogram. The first and second triangular image regions being adjacent means that, among the three feature points connected to form the first triangular image region, two are also feature points connected to form the second triangular image region; that is, the two regions share two vertices.
[0161] Specifically, for any two adjacent triangular regions in the target image, it can be determined whether their corresponding normal vectors fall within the same angle interval in the normal vector histogram. If they do, the two adjacent triangular regions are considered to be on the same plane in the target image; otherwise, they are not. This process allows for the extraction of planes from the target image.
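The pairwise check described above can be sketched in Python as follows. Triangles are given as tuples of three feature point indices, and each triangle carries the index of the angle interval of its normal vector. Merging pairs transitively into larger regions with a union-find structure is an assumption of this sketch, as is the function name `group_coplanar_triangles`; this application only states the pairwise criterion.

```python
from itertools import combinations


def group_coplanar_triangles(triangles, interval_indices):
    """Group triangle indices into plane regions.

    Two triangles are joined when they share exactly two vertices (adjacent)
    and their normal vectors fall in the same angle interval.
    """
    parent = list(range(len(triangles)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(triangles)), 2):
        adjacent = len(set(triangles[i]) & set(triangles[j])) == 2
        if adjacent and interval_indices[i] == interval_indices[j]:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(triangles)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

The pairwise loop is quadratic in the number of triangles; a practical version would likely look up neighbors through shared edges instead.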
[0162] In the technical solution corresponding to Figure 2 above, after the target image is acquired, multiple spatial planes are obtained by estimating the triangular planes of the target image in the spatial coordinate system. Then, the normal vector of each spatial plane is calculated to obtain multiple normal vectors. Based on the multiple normal vectors, a normal vector histogram is generated, which includes the number of normal vectors corresponding to different angle intervals. Finally, two adjacent triangular image regions in the target image whose normal vectors correspond to the same angle interval in the normal vector histogram are determined as the same plane region in the target image, thus realizing the detection of planes in the image. Since the same plane region in the image is determined based on the normal vector histogram, only normal vector calculation and statistics are required. Compared with plane estimation through deep learning, the amount of computation can be reduced and computing resources can be saved, thus enabling plane detection on devices with limited computing resources (such as mobile devices).
[0163] The method of this application has been described above; the apparatus of this application will be described below.
[0164] Referring to Figure 5, Figure 5 is a schematic diagram of the structure of a plane estimation device provided in an embodiment of this application. As shown in Figure 5, the plane estimation device 20 includes:
[0165] Image acquisition module 201 is used to acquire a target image, wherein the target image is an image containing a plane;
[0166] The spatial plane estimation module 202 is used to estimate the triangular plane of the target image in the spatial coordinate system to obtain multiple spatial planes. The spatial planes are obtained based on the triangular image regions in the target image. The triangular image regions are image regions formed by connecting three feature points in the target image.
[0167] The normal vector calculation module 203 is used to calculate the normal vector of each of the multiple spatial planes to obtain multiple normal vectors;
[0168] The histogram generation module 204 is used to generate a normal vector histogram based on the plurality of normal vectors. The normal vector histogram includes the number of normal vectors corresponding to different angle intervals. The angle in the angle interval is the angle between the normal vector and a preset reference object.
[0169] The planar region determination module 205 is used to determine the first triangular image region and the second triangular image region in the target image as the same planar region in the target image. The first triangular image region and the second triangular image region are two adjacent triangular image regions in the target image. Furthermore, the normal vector of the spatial plane corresponding to the first triangular image region and the normal vector of the spatial plane corresponding to the second triangular image region correspond to the same angle interval in the normal vector histogram.
[0170] In one possible design, the aforementioned spatial plane estimation module 202 is specifically used for: determining target feature points in the target image, wherein the target feature points exist in a series of consecutive observation images corresponding to the target image, and the series of observation images includes the target image; determining the spatial position corresponding to the target feature points based on the observation pose information of the series of observation images, wherein the observation pose information reflects the camera pose corresponding to the observation image; extracting triangular patches from the target image based on the target feature points in the target image to obtain multiple triangular image regions in the target image, wherein the vertices of each triangular image region are three target feature points in the target image, and the multiple triangular image regions do not overlap with each other; and mapping the multiple triangular image regions to a spatial coordinate system based on the spatial position corresponding to the target feature points to obtain the multiple spatial planes.
[0171] In one possible design, the aforementioned spatial plane estimation module 202 is specifically used for: combining any two frames of observation images from the multiple observation images to obtain multiple observation image combinations; triangulating the target feature points according to the observation pose information of the observation images in the target observation image combination to obtain the triangulated position points corresponding to the target observation image combination, wherein the target observation image combination is any one of the multiple observation image combinations, and the triangulated position points are used to reflect the spatial position of the target feature points; filtering out noise position points from the multiple triangulated position points corresponding to the target feature points, wherein the multiple triangulated position points are the triangulated position points corresponding to the multiple observation image combinations, and the noise position points are triangulated position points that do not belong to the fitting plane corresponding to the multiple triangulated position points; and determining the spatial position corresponding to the target feature point based on the remaining triangulated position points excluding the noise position points from the multiple triangulated position points.
[0172] In one possible design, the aforementioned spatial plane estimation module 202 is specifically used for: selecting any three triangulated position points from the plurality of triangulated position points as three interior points in the position plane, constructing the position plane, and determining an outer point set, wherein the outer point set includes at least one outer point, which is a triangulated position point other than the arbitrary three triangulated position points among the plurality of triangulated position points; starting from the first outer point in the outer point set, obtaining a target outer point from the outer point set; calculating the distance between the target outer point and the position plane; if the distance is less than a preset threshold, determining the target outer point as an interior point in the position plane; if the number of interior points in the position plane is greater than the maximum number of interior points, updating the maximum number of interior points to the number of interior points in the position plane; updating the position plane based on all interior points in the position plane; obtaining the next outer point of the target outer point from the outer point set as the target outer point, and executing the step of calculating the distance between the target outer point and the position plane; if the distance is greater than or equal to the preset threshold, obtaining the next outer point of the target outer point from the outer point set as the target outer point, and executing the step of calculating the distance between the target outer point and the position plane; and filtering out, from the plurality of triangulated position points, the triangulated position points that do not belong to the target position plane, wherein the target position plane is the position plane with the largest number of interior points obtained in the final update.
[0173] In one possible design, the aforementioned spatial plane estimation module 202 is specifically used to: perform feature point tracking on the multi-frame observation images based on the pyramid optical flow tracking algorithm to obtain the tracking feature points in each frame of the multi-frame observation images; and determine the tracking feature points in the target image as the target feature points in the target image.
[0174] In one possible design, the histogram generation module 204 is specifically used to: calculate the angle between the target normal vector and the reference plane to obtain the target angle, wherein the target normal vector is any one of the plurality of normal vectors; increment the number of normal vectors corresponding to the angle interval to which the target angle belongs by one; and generate the normal vector histogram based on each angle interval and the number of normal vectors corresponding to each angle interval.
[0175] In one possible design, the above-described plane estimation device is applied to a target device, which includes a plane estimation thread and a listening thread. The plane estimation thread is used to execute the method of the above method embodiment to obtain plane estimation data; the listening thread is used to save the latest plane estimation data and send the latest plane estimation data to the application that needs the plane estimation data.
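The two-thread arrangement can be illustrated with the following Python sketch, in which one thread produces plane estimation data and another keeps the latest result and forwards it to a consumer. The function `run_plane_pipeline`, the callables `estimate_planes` and `deliver`, and the queue-based hand-off are hypothetical; this application does not specify how the threads communicate.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the estimation stream


def run_plane_pipeline(frames, estimate_planes, deliver):
    """Run a plane estimation thread and a listening thread until frames end.

    estimate_planes(frame) and deliver(data) are caller-supplied placeholders.
    Returns the latest plane estimation data saved by the listening thread.
    """
    results = queue.Queue()
    latest = {"data": None}

    def estimation_worker():
        try:
            for frame in frames:
                results.put(estimate_planes(frame))
        finally:
            results.put(_STOP)  # always release the listening thread

    def listening_worker():
        while True:
            item = results.get()
            if item is _STOP:
                break
            latest["data"] = item  # save the latest plane estimation data
            deliver(item)          # send it to the application that needs it

    workers = [
        threading.Thread(target=estimation_worker),
        threading.Thread(target=listening_worker),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return latest["data"]
```

Separating the two roles in this way lets the application read the most recent result without waiting on the estimation itself.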
[0176] It should be noted that, for any content not mentioned in the embodiment corresponding to Figure 5, please refer to the description of the foregoing method embodiments, which will not be repeated here.
[0177] The aforementioned device, after acquiring the target image, estimates the triangular planes of the target image in the spatial coordinate system to obtain multiple spatial planes. Then, it calculates the normal vector of each spatial plane to obtain multiple normal vectors. Based on these multiple normal vectors, it generates a normal vector histogram, which includes the number of normal vectors corresponding to different angle intervals. Finally, it determines two adjacent triangular image regions in the target image whose normal vectors correspond to the same angle interval in the normal vector histogram as the same plane region in the target image, thus realizing the detection of planes in the image. Since the same plane region in the image is determined based on the normal vector histogram, only normal vector calculation and statistics are required. Compared with plane estimation through deep learning, this reduces the amount of computation and saves computing resources, enabling plane detection on devices with limited computing resources (such as mobile devices).
[0178] See Figure 6. Figure 6 is a schematic diagram of the structure of a computer device 30 provided in an embodiment of this application. The computer device 30 includes a processor 301 and a memory 302. The memory 302 is connected to the processor 301, for example, via a bus.
[0179] Processor 301 is configured to support the computer device 30 in performing the corresponding functions in the methods described in the above method embodiments. Processor 301 may be a central processing unit (CPU), a network processor (NP), a hardware chip, or any combination thereof. The aforementioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The aforementioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
[0180] Memory 302 is used to store program code, etc. Memory 302 may include volatile memory (VM), such as random access memory (RAM); memory 302 may also include non-volatile memory (NVM), such as read-only memory (ROM), flash memory, hard disk drive (HDD), or solid-state drive (SSD); memory 302 may also include combinations of the above types of memory.
[0181] The memory 302 is used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions / modules corresponding to the plane estimation method in the embodiments of this application. By running the non-volatile software programs, instructions, and modules stored in the memory, the processor executes the various functional applications and data processing of the plane estimation method, thereby realizing the plane estimation method provided in the above method embodiments.
[0182] Memory 302 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and application programs required for at least one function. The data storage area may store data created based on the use of the plane estimation device. In some embodiments, the memory may include memory remotely located relative to the processor, which may be connected to the plane estimation device via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
[0183] The one or more modules are stored in the memory. When executed by the one or more processors, they perform the plane estimation method in any of the above method embodiments. For example, they perform the method steps described in the above method embodiments to realize the functions of the modules described in the above device embodiments.
[0184] This application also provides a computer-readable storage medium storing a computer program, the computer program including program instructions, which, when executed by a computer, cause the computer to perform the method described in the foregoing embodiments.
[0185] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The program can be stored in a computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. The storage medium can be a magnetic disk, optical disk, read-only memory (ROM), or random access memory (RAM), etc.
[0186] The above-disclosed embodiments are merely preferred embodiments of this application and should not be construed as limiting the scope of this application. Therefore, any equivalent variations made in accordance with the claims of this application shall still fall within the scope of this application.
Claims
1. A plane estimation method, characterized by comprising: acquiring a target image, wherein the target image is an image containing a plane; estimating the triangular planes of the target image in the spatial coordinate system to obtain multiple spatial planes, wherein the spatial planes are obtained based on the triangular image regions in the target image, and each triangular image region is an image region formed by connecting three feature points in the target image; calculating the normal vector of each of the multiple spatial planes to obtain multiple normal vectors; generating a normal vector histogram based on the multiple normal vectors, wherein the normal vector histogram includes the number of normal vectors corresponding to different angle intervals, and the angle in an angle interval is the angle between a normal vector and a preset reference object; and determining a first triangular image region and a second triangular image region in the target image as the same plane region in the target image, wherein the first triangular image region and the second triangular image region are two adjacent triangular image regions in the target image, and the normal vector of the spatial plane corresponding to the first triangular image region and the normal vector of the spatial plane corresponding to the second triangular image region correspond to the same angle interval in the normal vector histogram.
2. The method according to claim 1, characterized in that estimating the triangular planes of the target image in the spatial coordinate system to obtain multiple spatial planes comprises: determining target feature points in the target image, wherein the target feature points exist in multiple consecutive frames of observation images corresponding to the target image, and the multi-frame observation images include the target image; determining the spatial position corresponding to each target feature point based on the observation pose information of the multi-frame observation images, wherein the observation pose information reflects the camera pose corresponding to each observation image; extracting triangular patches from the target image based on the target feature points in the target image to obtain multiple triangular image regions in the target image, wherein the vertices of each triangular image region are three target feature points in the target image, and the multiple triangular image regions do not overlap with each other; and mapping the multiple triangular image regions to the spatial coordinate system based on the spatial positions corresponding to the target feature points to obtain the multiple spatial planes.
3. The method according to claim 2, characterized in that determining the spatial position corresponding to the target feature point based on the observation pose information of the multi-frame observation images comprises: combining any two frames of the multi-frame observation images to obtain multiple observation image combinations; triangulating the target feature point based on the observation pose information of the observation images in a target observation image combination to obtain the triangulated position point corresponding to the target observation image combination, wherein the target observation image combination is any one of the multiple observation image combinations, and the triangulated position point reflects the spatial position of the target feature point; filtering out noise position points from the multiple triangulated position points corresponding to the target feature point, wherein the multiple triangulated position points are the triangulated position points corresponding to the multiple observation image combinations, and the noise position points are the triangulated position points that do not belong to the fitting plane corresponding to the multiple triangulated position points; and determining the spatial position corresponding to the target feature point based on the triangulated position points remaining after the noise position points are excluded from the multiple triangulated position points.
4. The method according to claim 3, characterized in that filtering out noise position points from the multiple triangulated position points corresponding to the target feature point comprises: selecting any three triangulated position points from the multiple triangulated position points as three interior points of a position plane to construct the position plane, and determining a set of exterior points, wherein the set of exterior points includes at least one exterior point, and an exterior point is a triangulated position point among the multiple triangulated position points other than the three selected triangulated position points; starting from the first exterior point in the set of exterior points, obtaining a target exterior point from the set of exterior points; calculating the distance between the target exterior point and the position plane; if the distance is less than a preset threshold, determining the target exterior point as an interior point of the position plane; if the number of interior points of the position plane is greater than the maximum number of interior points, updating the maximum number of interior points to the number of interior points of the position plane, updating the position plane according to all interior points of the position plane, obtaining the exterior point following the target exterior point from the set of exterior points as the new target exterior point, and performing the step of calculating the distance between the target exterior point and the position plane; if the distance is greater than or equal to the preset threshold, obtaining the exterior point following the target exterior point from the set of exterior points as the new target exterior point, and performing the step of calculating the distance between the target exterior point and the position plane; and filtering out, from the multiple triangulated position points, the triangulated position points that do not belong to a target position plane, wherein the target position plane is the position plane with the largest number of interior points obtained in the final update.
5. The method according to claim 2, characterized in that determining the target feature points in the target image comprises: performing feature point tracking on the multi-frame observation images based on the pyramid optical flow tracking algorithm to obtain the tracking feature points in each frame of the multi-frame observation images; and determining the tracking feature points in the target image as the target feature points in the target image.
6. The method according to claim 1, characterized in that generating a normal vector histogram based on the multiple normal vectors comprises: calculating the angle between a target normal vector and the reference plane to obtain a target angle, wherein the target normal vector is any one of the multiple normal vectors; incrementing by one the number of normal vectors corresponding to the angle interval to which the target angle belongs; and generating the normal vector histogram based on each angle interval and the number of normal vectors corresponding to each angle interval.
7. The method according to any one of claims 1-6, characterized in that the method is applied to a target device, which includes a plane estimation thread and a listening thread; the plane estimation thread is used to execute the method described in any one of claims 1-6 to obtain plane estimation data; and the listening thread is used to save the latest plane estimation data and send the latest plane estimation data to the application that needs the plane estimation data.
8. A plane estimation device, characterized by comprising: an image acquisition module, used to acquire a target image, wherein the target image is an image containing a plane; a spatial plane estimation module, used to estimate the triangular planes of the target image in the spatial coordinate system to obtain multiple spatial planes, wherein the spatial planes are obtained based on the triangular image regions in the target image, and each triangular image region is an image region formed by connecting three feature points in the target image; a normal vector calculation module, used to calculate the normal vector of each of the multiple spatial planes to obtain multiple normal vectors; a histogram generation module, used to generate a normal vector histogram based on the multiple normal vectors, wherein the normal vector histogram includes the number of normal vectors corresponding to different angle intervals, and the angle in an angle interval is the angle between a normal vector and a preset reference object; and a plane region determination module, used to determine a first triangular image region and a second triangular image region in the target image as the same plane region in the target image, wherein the first triangular image region and the second triangular image region are two adjacent triangular image regions in the target image, and the normal vector of the spatial plane corresponding to the first triangular image region and the normal vector of the spatial plane corresponding to the second triangular image region correspond to the same angle interval in the normal vector histogram.
9. A computer device, characterized in that the device includes a memory and a processor, the memory being connected to the processor, and the processor being configured to execute one or more computer programs stored in the memory, wherein the processor, when executing the one or more computer programs, causes the computer device to perform the method as described in any one of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, the computer program including program instructions that, when executed by a processor, cause the processor to perform the method as described in any one of claims 1-7.