A method and apparatus for determining a target point
By calibrating the cameras and stereo matching feature points in a binocular vision positioning system, and by using the epipolar constraint relationship and the intersection of optical axes to determine the world coordinates of the target point, the method reduces target-object positioning errors in the binocular vision positioning system and achieves higher-precision positioning.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GUANGZHOU AIMUYI TECH CO LTD
- Filing Date
- 2022-09-13
- Publication Date
- 2026-06-19
AI Technical Summary
When a binocular vision positioning system is used to determine a target object, stereo matching can produce non-real target objects, which introduces positioning or tracking errors and lowers recognition accuracy.
Feature points are acquired from the left and right cameras and stereo matched, and the target point is determined using the epipolar constraint relationship. This includes calibrating the cameras to obtain camera parameters and determining the world coordinates of the target point through the intersection of the optical axes.
It improves the accuracy of target point recognition, enabling more accurate positioning of target objects, especially in the medical field where it assists in locating patient lesions and tracking the position of surgical instruments in real time.
Figure CN115546262B_ABST
Description
Technical Field
[0001] This application relates to the field of computer vision, and more specifically, to a method and apparatus for determining target points in the field of computer vision.
Background Technology
[0002] With the high demand for location information technology, positioning technology has developed significantly. This includes binocular vision positioning systems based on the parallax principle.
[0003] This binocular vision positioning system includes two cameras, each with its optical center located on the same baseline. The two cameras capture images of a target object from different angles within the same field of view, obtaining the target object's projection points on their respective camera images. The system then performs stereo matching of these projection points to determine the target object's spatial position. Based on the positions of these projection points and the principle of similar triangle measurement, the system obtains the target object's spatial coordinates for localization or tracking.
[0004] In the process of determining target objects using a binocular vision positioning system, stereo matching of projection points that are acquired by the two cameras and satisfy an epipolar constraint can yield multiple target objects. Among these multiple target objects, there may be non-real target objects, which introduces errors in the positioning or tracking of the target objects.
Summary of the Invention
[0005] This application provides a method and apparatus for determining target points. This method can improve the recognition accuracy of real target points and more accurately locate target objects.
[0006] In a first aspect, a method for determining target points is provided, applied to a binocular vision positioning device, which includes a left camera and a right camera for acquiring images. The method includes: acquiring N left feature points in the left image of the target scene captured by the left camera, and M right feature points in the right image of the target scene captured by the right camera, wherein the N left feature points and the M right feature points have an epipolar constraint relationship, M≥N, and M and N are integers greater than 0; performing stereo matching processing on any one of the N left feature points and M-N+1 right feature points from the M right feature points to determine the target point.
[0007] In the above technical solution, the method can identify the target point when there is an epipolar constraint relationship between the N left feature points in the left image captured by the left camera and the M right feature points in the right image captured by the right camera. That is, when M and N are equal, the real spatial point, i.e., the target point, is obtained directly; when M and N are not equal, the recognition accuracy of the target point can be improved, and the target point can be located more accurately.
[0008] In conjunction with the first aspect, in some possible implementations, performing stereo matching processing on any one of the N left feature points and M-N+1 right feature points from the M right feature points to determine the target point includes: performing stereo matching processing between the i-th left feature point of the N left feature points and the i-th right feature point and the (i+1)-th to (i+M-N)-th right feature points of the M right feature points to determine the target point. The i-th left feature point is any one of the N left feature points, where N≥i≥1.
[0009] The above technical solution describes the specific process of obtaining the target point from any one of the N left feature points and M-N+1 of the M right feature points. The target point is determined by performing stereo matching between the i-th left feature point among the N left feature points and the i-th and (i+1)-th to (i+M-N)-th right feature points among the M right feature points. This improves the accuracy of target point recognition.
[0010] In combination with the first aspect and the above implementations, in some possible implementations, performing stereo matching processing on any one of the N left feature points and M-N+1 right feature points from the M right feature points to determine the target point includes: determining the i-th left optical axis based on the line connecting the optical center of the left camera to the i-th left feature point among the N left feature points; determining the i-th right optical axis and the (i+1)-th to (i+M-N)-th right optical axes based on the lines connecting the optical center of the right camera to the i-th right feature point and the (i+1)-th to (i+M-N)-th right feature points among the M right feature points; and determining the intersection points of the i-th left optical axis with the i-th right optical axis and the (i+1)-th to (i+M-N)-th right optical axes as the target point.
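The intersection step described above can be sketched as follows. This is only an illustrative sketch, not the application's implementation: it treats each "optical axis" as a 3D line from an optical center through a feature point, and, because two lines reconstructed from measured pixels rarely intersect exactly, it returns the midpoint of the shortest segment between them. That midpoint rule is an assumption introduced here, as are the function and parameter names.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def _scale(a, s):
    return tuple(x * s for x in a)

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def closest_point_between_rays(c_left, d_left, c_right, d_right, eps=1e-12):
    """Midpoint of the shortest segment joining two 3D lines.

    Each line is given as origin + s * direction, with the origin at the
    camera's optical center (hypothetical inputs, world coordinates).
    When the lines truly intersect, the result is the intersection point.
    """
    w = _sub(c_left, c_right)
    a = _dot(d_left, d_left)
    b = _dot(d_left, d_right)
    c = _dot(d_right, d_right)
    d = _dot(d_left, w)
    e = _dot(d_right, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("lines are parallel; no unique closest point")
    s = (b * e - c * d) / denom   # parameter along the left line
    t = (a * e - b * d) / denom   # parameter along the right line
    p_left = _add(c_left, _scale(d_left, s))
    p_right = _add(c_right, _scale(d_right, t))
    return _scale(_add(p_left, p_right), 0.5)
```

For example, with optical centers at (0, 0, 0) and (1, 0, 0) and both lines aimed at (0.5, 0, 2), the function returns that point up to floating-point rounding.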
[0011] In combination with the first aspect and the above implementation methods, in some possible implementation methods, the method further includes: calibrating the left camera and the right camera with the optical center of the left camera or the optical center of the right camera as the origin of the world coordinate system, and determining the camera parameters of the left camera and the right camera.
[0012] It should be understood that camera parameters include the extrinsic and intrinsic parameters of the left camera, and the extrinsic and intrinsic parameters of the right camera. The camera's intrinsic parameters are determined by the camera itself and are only related to the camera itself. Specifically, they include the camera's focal length, lens distortion parameters, and pixel size. The focal lengths of the left and right cameras are generally equal. The lens distortion parameters represent the magnitude of radial distortion. Pixel size refers to the actual length and width represented by one pixel. The camera's extrinsic parameters refer to the camera's pose in the world coordinate system, determined by the relative pose relationship between the camera and the world coordinate system. Specifically, they include rotation and translation vectors. The rotation vector describes the direction of the world coordinate system's coordinate axes corresponding to the camera's coordinate axes; the translation vector describes the position of the origin of the world coordinate system in the camera's coordinate system.
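As a rough illustration of how these parameters fit together, the sketch below projects a world point to pixel coordinates with a simple pinhole model. It is an assumption-laden sketch: lens distortion is ignored, the rotation is passed as a 3x3 matrix rather than a rotation vector, a principal point (not mentioned in the text) is added, and all names are hypothetical.

```python
def project_point(point_world, rotation, translation,
                  focal_mm, pixel_size_mm, principal_px):
    """Project a 3D world point to pixel coordinates (pinhole, no distortion).

    rotation:      3x3 matrix (list of rows) mapping world axes to camera axes
    translation:   position of the world origin in camera coordinates
    focal_mm:      focal length in millimetres
    pixel_size_mm: (width, height) of one pixel in millimetres
    principal_px:  (cx, cy), assumed principal point in pixels
    """
    # Extrinsics: X_cam = R * X_world + t
    xc, yc, zc = (
        sum(r * p for r, p in zip(row, point_world)) + t
        for row, t in zip(rotation, translation)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsics: focal length expressed in pixels along each image axis
    fx = focal_mm / pixel_size_mm[0]
    fy = focal_mm / pixel_size_mm[1]
    u = fx * xc / zc + principal_px[0]
    v = fy * yc / zc + principal_px[1]
    return (u, v)
```

Taking one camera's optical center as the world origin, as the calibration step above does, makes that camera's rotation the identity and its translation zero.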
[0013] In the above technical solution, calibrating the left and right cameras is to reconstruct a binocular vision positioning geometric model based on the left image captured by the left camera and the right image captured by the right camera. This model allows us to obtain the spatial geometric relationship between the left and right cameras, as well as their intrinsic parameters. The spatial geometric relationship between the left and right cameras refers to their extrinsic parameters. Therefore, the obtained camera parameters can contribute to the subsequent determination of the world coordinates of the target point.
[0014] In combination with the first aspect and the above implementation, in some possible implementations, the method further includes: determining the world coordinates of any one of the target points based on the pixel coordinates, in the left image, of any one of the N left feature points, the pixel coordinates, in the right image, of the M-N+1 right feature points among the M right feature points, and the camera parameters of the left camera and the right camera.
[0015] In the above technical solution, based on the pixel coordinates of any left feature point in the left image, the pixel coordinates of M-N+1 right feature points out of M right feature points in the right image, the camera parameters of the left camera, and the camera parameters of the right camera, the world coordinates of any one of multiple target points can be obtained. In other words, through this process, the precise location of the target point can be determined, which has significant application value in the medical field. For example, in the medical field, it can assist in locating the position of diseased tissue in patients and can track the position of surgical instruments in real time.
[0016] Combining the first aspect and the above implementation methods, in some possible implementation methods, both the left camera and the right camera are infrared cameras.
[0017] In summary, this application proposes a method for determining target points. This method can identify target points when there is an epipolar constraint relationship between N left feature points in the left image captured by the left camera and M right feature points in the right image captured by the right camera. When M and N are equal, the actual spatial point, i.e., the target point, is obtained directly; when M and N are not equal, the accuracy of target point recognition can be improved, and the target point can be located more accurately.
[0018] Furthermore, the process of obtaining the target point from any one of the N left feature points and M-N+1 of the M right feature points is described in detail.
[0019] It should be understood that before acquiring images using the left and right cameras included in a binocular vision positioning device, the left and right cameras need to be calibrated. This is to obtain camera parameters, namely the spatial geometrical positional relationship between the left and right cameras, as well as their intrinsic parameters. These camera parameters can then contribute to the subsequent determination of the world coordinates of the target point.
[0020] Finally, based on the pixel coordinates of any left feature point in the left image, the pixel coordinates of M-N+1 right feature points out of M right feature points in the right image, the camera parameters of the left camera, and the camera parameters of the right camera, the world coordinates of any one of the multiple target points can be obtained. In other words, through the above process, the precise location of the target point can be determined, which has significant application value in the medical field. For example, in the medical field, it can assist in locating the position of diseased tissue in patients and can track the position of surgical instruments in real time.
[0021] In a second aspect, an apparatus for determining a target point is provided. The apparatus includes: an acquisition module, used to acquire N left feature points in the left image of the target scene captured by a left camera, and M right feature points in the right image of the target scene captured by a right camera, wherein the N left feature points and the M right feature points have an epipolar constraint relationship, M≥N, and M and N are integers greater than 0; and a determination module, used to perform stereo matching processing between any one of the N left feature points and M-N+1 right feature points from the M right feature points to determine the target point.
[0022] In conjunction with the second aspect, in some possible implementations, the determining module is specifically used to perform stereo matching processing between the i-th left feature point among the N left feature points and the i-th right feature point and the (i+1)-th to (i+M-N)-th right feature points among the M right feature points to determine the target point, where the i-th left feature point is any one of the N left feature points, and N≥i≥1.
[0023] In conjunction with the second aspect and the above implementations, in some possible implementations, the determining module is also specifically used to determine the i-th left optical axis based on the line connecting the optical center of the left camera to the i-th left feature point among the N left feature points; to determine the i-th right optical axis and the (i+1)-th to (i+M-N)-th right optical axes based on the lines connecting the optical center of the right camera to the i-th right feature point and the (i+1)-th to (i+M-N)-th right feature points among the M right feature points; and to determine the intersection points of the i-th left optical axis with the i-th right optical axis and the (i+1)-th to (i+M-N)-th right optical axes as multiple target points.
[0024] In conjunction with the second aspect and the above implementation methods, in some possible implementation methods, the determining module is further used to calibrate the left camera and the right camera with the optical center of the left camera or the optical center of the right camera as the origin of the world coordinate system, and determine the camera parameters of the left camera and the right camera.
[0025] In combination with the second aspect and the above implementations, in some possible implementations, the determining module is further used to determine the world coordinates of any one of the target points, based on the pixel coordinates, in the left image, of any one of the N left feature points, the pixel coordinates, in the right image, of the M-N+1 right feature points among the M right feature points, and the camera parameters of the left camera and the right camera.
[0026] Combining the second aspect and the above implementation methods, in some possible implementations, both the left camera and the right camera are infrared cameras.
[0027] In a third aspect, a device for determining a target point is provided, comprising a memory, a processor, and a computer program stored in the memory and running on the processor, wherein when the processor executes the computer program, the device for determining the target point performs the method described in the first aspect or any possible implementation thereof.
[0028] In a fourth aspect, a computer-readable storage medium is provided that stores instructions which, when executed on a computer or processor, cause the computer or processor to perform the method described in the first aspect or any possible implementation thereof.
[0029] In a fifth aspect, a computer program product containing instructions is provided, which, when run on a computer or processor, causes the computer or processor to perform the method described in the first aspect or any possible implementation thereof.
Brief Description of the Drawings
[0030] Figure 1 is a schematic diagram of the structure of a binocular positioning device provided in an embodiment of this application;
[0031] Figure 2 is a schematic diagram of a stereo matching process provided in an embodiment of this application;
[0032] Figure 3 is a schematic diagram provided in an embodiment of this application to describe the epipolar constraint relationship between multiple projection points;
[0033] Figure 4 is a schematic flowchart illustrating a method for determining a target point provided in an embodiment of this application;
[0034] Figure 5 is a schematic diagram illustrating the determination of a target point according to an embodiment of this application;
[0035] Figure 6 is a schematic diagram illustrating how to determine the world coordinates of a target point according to an embodiment of this application;
[0036] Figure 7 is a schematic diagram of the structure of a device for determining a target point provided in an embodiment of this application;
[0037] Figure 8 is a schematic diagram of another device for determining target points provided in an embodiment of this application.
Detailed Implementation
[0038] The technical solutions of this application will now be described clearly and in detail with reference to the accompanying drawings. In the description of the embodiments of this application, unless otherwise stated, "multiple" refers to two or more.
[0039] Hereinafter, the terms "first" and "second" are used for descriptive purposes only and should not be construed as implying or suggesting relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature.
[0040] Figure 1 is a schematic diagram of a binocular vision positioning device provided in an embodiment of this application. The device includes a left camera and a right camera that capture images of a target object from different shooting angles within the same field of view, obtaining the projection points of the target object onto the respective images of the two cameras. The binocular vision positioning device performs stereo matching on the projection points of the target object captured by the two cameras to obtain the spatial position of the target object; and obtains the spatial coordinates of the target object based on the positions of the projection points of the target object captured by the two cameras and the principle of similar triangle measurement.
[0041] For example, as shown in Figure 1, C_l is the optical center of the left camera and C_r is the optical center of the right camera. The projection point of the target object P captured by the left camera on the left image is P1, and the projection point of the target object P captured by the right camera on the right image is P2. The world coordinates of the target object P are obtained based on P1, P2, and the principle of similar triangles.
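The similar-triangle relation can be illustrated with the textbook case of two rectified cameras whose optical axes are parallel, where depth equals focal length times baseline divided by disparity. The application does not state this formula or restrict itself to that configuration, so the sketch below is an assumption meant only to show the principle; the names are hypothetical.

```python
def depth_from_disparity(focal_px, baseline, x_left_px, x_right_px):
    """Depth of a point seen by two rectified, parallel cameras.

    focal_px:   focal length in pixels (assumed equal for both cameras)
    baseline:   distance between the two optical centers
    x_left_px:  horizontal pixel coordinate of the projection in the left image
    x_right_px: horizontal pixel coordinate of the projection in the right image

    By similar triangles, depth / baseline = focal_px / disparity.
    The result is in the same unit as the baseline.
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline / disparity
```

With a focal length of 1000 pixels, a baseline of 0.1 m, and projections at x = 700 and x = 650, the disparity is 50 pixels and the depth is about 2 m.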
[0042] It should be understood that the implementation principle of binocular vision positioning devices is the same as the principle of human visual perception. The projection points corresponding to the same object seen by the left and right eyes differ. Therefore, it is necessary to match the projection points seen by the left and right eyes respectively, and determine the position of the target object based on the matching results. Thus, stereo matching processing is introduced into binocular vision positioning devices. Stereo matching processing is the process of matching one or more projection points captured by the left camera with one or more projection points captured by the right camera to determine the target object. Taking the example where the target object captured by both the left and right cameras has two projection points in each image, the process of determining the target object will be described in detail.
[0043] Figure 2 is a schematic diagram of a stereo matching process provided in an embodiment of this application.
[0044] For example, when the left image and the right image each contain two projection points, as shown in Figure 2, assume P1 and P2 are two target objects in space. The projection points of the target objects captured by the left camera in the left image are P_l1 and P_l2, and the projection points of the target objects captured by the right camera in the right image are P_r1 and P_r2. An epipolar constraint relationship exists between P_l1, P_l2 and P_r1, P_r2. Matching P_l1 with P_r1 yields P1: C_l and P_l1 form the first optical axis, P_r1 and C_r form the second optical axis, and the intersection of the first and second optical axes determines P1. Similarly, matching P_l1 with P_r2 yields the point M, matching P_l2 with P_r1 yields the point N, and matching P_l2 with P_r2 yields P2. (In this example, M and N label points in Figure 2, not the numbers of feature points.)
[0045] It should also be understood that epipolar constraints describe the constraints formed by the projection point and the optical center of the camera under the projection model when a target object is projected onto images from two different viewpoints. Epipolar constraints can be described as follows: the first plane formed by the projection point of the target object on the left image, the optical center of the left camera, and the optical center of the right camera intersects with the right image to form an intersection line. Therefore, the projection point on the right image captured by the right camera will always lie on the intersection line.
[0046] Figure 3 is a schematic diagram provided in an embodiment of this application to describe the epipolar constraint relationship between multiple projection points.
[0047] For example, as shown in Figure 3, the projection point on the left image is P_l1. P_l1, C_l, and C_r form a first plane, and the line of intersection between the first plane and the right image is l. The projection points on the right image captured by the right camera must lie on the line of intersection l; that is, P_r1, P_r2, and P_r3 are located on l.
[0048] Epipolar constraints exist between multiple projection points on the left image and multiple projection points on the right image. During the process of determining the target point using stereo matching, it is possible to obtain non-real target points. For example, in Figure 2, P1 and P2 are two target objects in space. Stereo matching yields four candidate points, of which the points M and N obtained through matching are not actual target points. Therefore, processing multiple projection points captured by the left and right cameras using stereo matching may result in non-real target objects.
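The plane-based description of the epipolar constraint can be checked numerically. The sketch below is a minimal illustration under stated assumptions: both viewing lines are expressed as direction vectors in world coordinates, and the constraint is tested with a scalar triple product, which is zero exactly when the two lines and the baseline lie in one plane. The function names are hypothetical and the application does not prescribe this test.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def epipolar_residual(c_left, c_right, dir_left, dir_right):
    """Scalar triple product baseline . (dir_left x dir_right).

    c_left, c_right:     optical centers of the two cameras
    dir_left, dir_right: directions from each optical center through its
                         projection point, in world coordinates
    A value of zero means the right projection lies in the plane spanned by
    the left projection point and the two optical centers.
    """
    baseline = _sub(c_right, c_left)
    return _dot(baseline, _cross(dir_left, dir_right))

def satisfies_epipolar_constraint(c_left, c_right, dir_left, dir_right, tol=1e-9):
    # The tolerance is an arbitrary illustrative value; real data needs a
    # threshold matched to the measurement noise.
    return abs(epipolar_residual(c_left, c_right, dir_left, dir_right)) <= tol
```

Note that several right projection points can pass this test for the same left point, which is how the non-real matches discussed in the text arise.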
[0049] Figure 4 is a schematic flowchart illustrating a method for determining a target point provided in an embodiment of this application.
[0050] It should be understood that the method for determining a target point provided in the embodiments of this application can be applied to the binocular vision positioning device shown in Figure 1, which includes a left camera and a right camera for acquiring images. Alternatively, this method for determining target points can also be applied to a positioning system that includes this binocular vision positioning device.
[0051] For example, as shown in Figure 4, the method 400 includes:
[0052] 401. A binocular vision positioning device acquires N left feature points in the left image of the target scene captured by the left camera, and M right feature points in the right image of the target scene captured by the right camera. The N left feature points and the M right feature points are in an epipolar constraint relationship, M≥N, and M and N are integers greater than 0.
[0053] It should be understood that the left feature points in the above scheme refer to the projection points in the left image shown in Figure 1, and the right feature points refer to the projection points in the right image shown in Figure 1.
[0054] It should also be understood that the binocular vision positioning device may also include a bracket connecting the left and right cameras. Both the left and right cameras in this binocular vision positioning device are cameras with adjustable shooting angles, and this application does not limit the types of the left and right cameras. The left and right cameras can be of the same type. For example, both the left and right cameras can be infrared cameras, low-light cameras, or wide dynamic range cameras, etc.; the left and right cameras can also be of different types. For example, when the left camera is an infrared camera, the right camera can be a low-light camera or a wide dynamic range camera, etc.
[0055] In one possible implementation, before step 401, the binocular vision positioning device calibrates the left and right cameras using the optical center of the left camera or the optical center of the right camera as the origin of the world coordinate system, and determines the camera parameters of the left and right cameras.
[0056] It should be understood that camera parameters include the extrinsic and intrinsic parameters of the left camera, and the extrinsic and intrinsic parameters of the right camera. The camera's intrinsic parameters are determined by the camera itself and are only related to the camera itself. Specifically, they include the camera's focal length, lens distortion parameters, and pixel size. The focal lengths of the left and right cameras are generally equal. The lens distortion parameters represent the magnitude of radial distortion. Pixel size refers to the actual length and width represented by one pixel. The camera's extrinsic parameters refer to the camera's pose in the world coordinate system, determined by the relative pose relationship between the camera and the world coordinate system. Specifically, they include rotation and translation vectors. The rotation vector describes the direction of the world coordinate system's coordinate axes corresponding to the camera's coordinate axes; the translation vector describes the position of the origin of the world coordinate system in the camera's coordinate system.
[0057] In the above technical solution, calibrating the left and right cameras is to reconstruct a binocular vision positioning geometric model based on the left image captured by the left camera and the right image captured by the right camera. This model allows us to obtain the spatial geometric relationship between the left and right cameras, as well as their intrinsic parameters. The spatial geometric relationship between the left and right cameras refers to their extrinsic parameters. Therefore, the obtained camera parameters can contribute to the subsequent determination of the world coordinates of the target point.
[0058] 402. The binocular vision positioning device performs stereo matching processing on any one of the N left feature points and M-N+1 right feature points from the M right feature points to determine the target point.
[0059] It should be understood that the target point in the above scheme refers to the target object in space shown in Figure 1. The left image contains N left feature points. After performing stereo matching between the N left feature points and M right feature points using method 400, N*(M-N+1) target points can be obtained. Compared with the N*M target points obtained by the existing approach, method 400 of this application obtains fewer target points, thus increasing the accuracy of target point recognition to some extent.
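The two counts in this paragraph can be compared directly. The helper below is a small illustrative sketch with hypothetical names; it only evaluates the two expressions N*M and N*(M-N+1) stated in the text.

```python
def candidate_counts(n_left, m_right):
    """Return (exhaustive, method_400) candidate target-point counts.

    exhaustive: every left feature point matched with every right one, N*M
    method_400: each left feature point matched with M-N+1 right ones
    """
    if not (m_right >= n_left >= 1):
        raise ValueError("requires M >= N >= 1")
    exhaustive = n_left * m_right
    method_400 = n_left * (m_right - n_left + 1)
    return exhaustive, method_400
```

For N = 2 and M = 3 the counts are 6 and 4; for N = M = 2 they are 4 and 2, so the two non-real points of the Figure 2 example are not generated.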
[0060] It should also be understood that the number of target points in step 402 is one or more. When both N and M are 1, a single real spatial point, i.e., a target point, is obtained directly; when N and M are not equal, the number of target points determined is multiple.
[0061] It should also be understood that in step 401, M ≥ N, where M is the number of right feature points in the right image of the target scene captured by the right camera, and N is the number of left feature points in the left image of the target scene captured by the left camera. When N > M, that is, when the number of left feature points is greater than the number of right feature points, the solution in step 402 can be replaced by "the binocular vision positioning device performs stereo matching processing on any one of the M right feature points and N-M+1 left feature points from the N left feature points to determine the target point." Furthermore, the distinction between the left and right cameras mentioned above is only based on orientation; the positions of the left and right cameras can be interchanged.
[0062] In one possible implementation, step 402 includes: the binocular vision positioning device performs stereo matching processing on the i-th left feature point among N left feature points and the i-th right feature point and the (i+1)-th to (i+M-N)-th right feature points among M right feature points to determine the target point, where the i-th left feature point is any one of the N left feature points, and N≥i≥1.
[0063] Figure 5 is a schematic diagram of determining a target point provided in an embodiment of this application.
[0064] For example, when the number of left feature points obtained from the left image is 2 and the number of right feature points obtained from the right image is 3, as shown in Figure 5, the feature points on the left image obtained by the left camera are P_l1 and P_l2, and the feature points on the right image obtained by the right camera are P_r1, P_r2, and P_r3. The binocular vision positioning device performs stereo matching between the first of the two left feature points and the first and second of the three right feature points, respectively, to obtain the target points P_11 and P_12, and performs stereo matching between the second of the two left feature points and the second and third of the three right feature points to obtain the target points P_22 and P_23. Therefore, the target points are P_11, P_12, P_22, and P_23.
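The index pattern in this example generalizes as follows: the i-th left feature point is paired with the i-th through (i+M-N)-th right feature points. The sketch below enumerates those pairs using 1-based indices, as in the text. It assumes the feature points in each image are already numbered in a consistent order, which the text does not spell out, and the function name is hypothetical.

```python
def windowed_pairs(n_left, m_right):
    """List the (left_index, right_index) pairs matched in step 402.

    Indices are 1-based. Left point i is paired with right points
    i, i+1, ..., i+(M-N), giving N*(M-N+1) pairs in total.
    """
    if not (m_right >= n_left >= 1):
        raise ValueError("requires M >= N >= 1")
    window = m_right - n_left
    return [(i, j)
            for i in range(1, n_left + 1)
            for j in range(i, i + window + 1)]

# Figure 5 case: two left points, three right points.
print(windowed_pairs(2, 3))  # [(1, 1), (1, 2), (2, 2), (2, 3)]
```

The four pairs correspond to the target points P_11, P_12, P_22, and P_23 in the example.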
[0065] In another possible implementation, step 402 includes: the binocular vision positioning device determining the i-th left optical axis based on the line connecting the optical center of the left camera to the i-th left feature point among N left feature points; the binocular vision positioning device determining the i-th right optical axis and the (i+1)-to-(i+MN ...
[0066] It should be understood that both of the above possible ways of determining the target point can be regarded as specific implementations of step 402, and the second of them can be regarded as a specific process for carrying out the first.
[0067] Figure 5 is a schematic diagram of determining a target point provided in an embodiment of this application.
[0068] For example, continuing with the situation shown in Figure 5, the number of left feature points obtained from the left image is 2 and the number of right feature points obtained from the right image is 3. As shown in Figure 5, the feature points on the left image obtained by the left camera are P_l1 and P_l2, and the feature points on the right image obtained by the right camera are P_r1, P_r2, and P_r3. The binocular vision positioning device determines the first left optical axis from the line connecting the optical center of the left camera to the first of the two left feature points; it determines the first and second right optical axes from the lines connecting the optical center of the right camera to the first and second of the three right feature points; and it determines the intersection points of the first left optical axis with the first and second right optical axes, i.e., P_11 and P_12, as target points. Likewise, the binocular vision positioning device determines the second left optical axis from the line connecting the optical center of the left camera to the second of the two left feature points; it determines the second and third right optical axes from the lines connecting the optical center of the right camera to the second and third of the three right feature points; and it determines the intersection points of the second left optical axis with the second and third right optical axes, i.e., P_22 and P_23, as target points. Therefore, the target points are P_11, P_12, P_22, and P_23.
[0069] In one possible implementation, after step 402, the binocular vision positioning system uses the pixel coordinates of any one of the N left feature points on the left image, the pixel coordinates of the M-N+1 right feature points on the right image, and the camera parameters of the left and right cameras to determine the world coordinates of any one of the target points.
[0070] In the above technical solution, based on the pixel coordinates of any left feature point in the left image, the pixel coordinates of M-N+1 right feature points out of M right feature points in the right image, the camera parameters of the left camera, and the camera parameters of the right camera, the world coordinates of any one of multiple target points can be obtained. In other words, through this process, the precise location of the target point can be determined, which has significant application value in the medical field. For example, in the medical field, it can assist in locating the position of a patient's diseased tissue and can track the position of surgical instruments in real time.
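As a rough geometric sketch of how a target point's world coordinates follow from two optical axes, the Python below intersects the line through the left optical center and a left feature point with the line through the right optical center and a right feature point. Several things here are assumptions and not taken from the source: the function name `intersect_optical_axes` is hypothetical, the feature points are assumed to be already expressed as 3D positions in the world coordinate system (obtained from their pixel coordinates and the camera parameters), and, because measured axes rarely meet exactly, the midpoint of the shortest segment between the two lines is returned as the intersection.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def intersect_optical_axes(center_l: Vec3, feature_l: Vec3,
                           center_r: Vec3, feature_r: Vec3,
                           eps: float = 1e-12) -> Vec3:
    """Return the point closest to both optical axes (their intersection if they meet)."""
    d1 = _sub(feature_l, center_l)   # direction of the left optical axis
    d2 = _sub(feature_r, center_r)   # direction of the right optical axis
    w = _sub(center_l, center_r)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b            # zero when the axes are parallel
    if denom <= eps * a * c:
        raise ValueError("optical axes are parallel or degenerate")
    s = (b * e - c * d) / denom      # parameter along the left axis
    t = (a * e - b * d) / denom      # parameter along the right axis
    p1 = tuple(center_l[k] + s * d1[k] for k in range(3))
    p2 = tuple(center_r[k] + t * d2[k] for k in range(3))
    return tuple((p1[k] + p2[k]) / 2.0 for k in range(3))


# Illustrative numbers only: left optical center at the world origin,
# right optical center 100 units along x, image planes at z = 10.
target = intersect_optical_axes((0.0, 0.0, 0.0), (1.0, 0.4, 10.0),
                                (100.0, 0.0, 0.0), (99.0, 0.4, 10.0))
print(target)
```

With these made-up inputs the two axes meet at approximately (50, 20, 500).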
[0071] It should be understood that the camera parameters include the extrinsic and intrinsic parameters of the left camera and the extrinsic and intrinsic parameters of the right camera. A camera's intrinsic parameters are determined by, and depend only on, the camera itself. Specifically, they include the camera's focal length, lens distortion parameters, and pixel size. The focal lengths of the left and right cameras are generally equal. The lens distortion parameters represent the magnitude of radial distortion. Pixel size refers to the actual length and width represented by one pixel. A camera's extrinsic parameters describe the camera's pose in the world coordinate system and are determined by the relative pose between the camera and the world coordinate system. Specifically, they include a rotation vector and a translation vector. The rotation vector describes the orientation of the world coordinate system's axes relative to the camera's coordinate axes; the translation vector describes the position of the origin of the world coordinate system in the camera's coordinate system.
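To make the roles of these parameters concrete, the following Python sketch applies the extrinsic parameters to a world point and then projects it with a simple pinhole model. It is a sketch under stated assumptions, not the calibration model of this application: the rotation is passed as a 3x3 matrix instead of a rotation vector, lens distortion is ignored, the principal point (`cx`, `cy`) is an added assumption not listed in the text, and the function names are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def world_to_camera(R: Sequence[Sequence[float]], t: Vec3, p_world: Vec3) -> Vec3:
    """Apply the extrinsic parameters: p_cam = R * p_world + t."""
    return tuple(
        sum(R[r][c] * p_world[c] for c in range(3)) + t[r] for r in range(3)
    )


def project(f: float, pixel_w: float, pixel_h: float,
            cx: float, cy: float, p_cam: Vec3) -> Tuple[float, float]:
    """Project a camera-frame point to pixel coordinates (no lens distortion).

    f, pixel_w and pixel_h share one length unit (e.g. millimetres);
    cx, cy are the assumed principal point in pixels.
    """
    x, y, z = p_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    u = (f / pixel_w) * x / z + cx
    v = (f / pixel_h) * y / z + cy
    return u, v


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# The world origin maps to t, matching the description of the translation vector.
print(world_to_camera(identity, (1.0, 2.0, 3.0), (0.0, 0.0, 0.0)))

# Illustrative values: f = 8 mm, 0.005 mm square pixels, principal point (640, 480).
print(project(8.0, 0.005, 0.005, 640.0, 480.0, (100.0, 50.0, 1000.0)))
```

Dividing the focal length by the pixel size converts it into pixel units, which is why both quantities appear among the intrinsic parameters.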
[0072] Figure 6 is a schematic diagram illustrating how to determine the world coordinates of a target point according to an embodiment of this application.
[0073] For example, taking P_11 as the target point, as shown in Figure 6, the process of determining the world coordinates of the target point P_11 is described below. The pixel coordinates of the left feature point P_l1 on the left image acquired by the left camera are (u1, v1), the pixel coordinates of the right feature point P_r1 on the right image acquired by the right camera are (u2, v2), and the focal lengths of both the left and right cameras are f. The world coordinates (x_w, y_w, z_w) of the target point P_11 are determined using the following formulas.
[0074]
[0075]
[0076] Where z1 and z2 are the distances from the left image to the origin of the left camera coordinate system and from the right image to the origin of the right camera coordinate system, respectively; A_l and A_r are the camera intrinsics of the left and right cameras, respectively; and R_l, R_r, t_l, and t_r are the rotation vector of the left camera, the rotation vector of the right camera, the translation vector of the left camera, and the translation vector of the right camera, respectively.
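The formula bodies of paragraphs [0074] and [0075] do not appear in this text. As a hedged reconstruction based only on the symbols defined in [0076] and on the standard pinhole camera model, the two relations would typically take the form below. This is an assumption and may differ from the formulas in the original filing. R_l and R_r are treated here as the 3×3 rotation matrices corresponding to the rotation vectors named in [0076].

```latex
\begin{aligned}
z_1 \begin{bmatrix} u_1 \\ v_1 \\ 1 \end{bmatrix}
  &= A_l \begin{bmatrix} R_l & t_l \end{bmatrix}
     \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix}, \\
z_2 \begin{bmatrix} u_2 \\ v_2 \\ 1 \end{bmatrix}
  &= A_r \begin{bmatrix} R_r & t_r \end{bmatrix}
     \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix}.
\end{aligned}
```

Read this way, the two relations give six scalar equations in the five unknowns x_w, y_w, z_w, z1, and z2, so the world coordinates of P_11 can be solved for once the camera parameters are known.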
[0077] Figure 7 is a schematic diagram of a device for determining a target point provided in an embodiment of this application.
[0078] For example, as shown in Figure 7, the device 700 includes:
[0079] The acquisition module 701 is used to acquire N left feature points in the left image of the target image captured by the left camera, and M right feature points in the right image of the target image captured by the right camera. The N left feature points and the M right feature points have a common pole line constraint relationship, M≥N, and M and N are integers greater than 0.
[0080] The determination module 702 is used to perform stereo matching processing between any one of the N left feature points and the M-N+1 right feature points among the M right feature points to determine the target point.
[0081] Optionally, the determining module 702 is specifically used to perform stereo matching processing between the i-th left feature point among the N left feature points and the i-th through (i+M-N)-th right feature points among the M right feature points to determine the target point, where the i-th left feature point is any one of the N left feature points, and N≥i≥1.
[0082] Optionally, the determining module 702 is further specifically used to determine the i-th left optical axis based on the line connecting the optical center of the left camera to the i-th left feature point among the N left feature points; to determine the i-th through (i+M-N)-th right optical axes based on the lines connecting the optical center of the right camera to the i-th through (i+M-N)-th right feature points among the M right feature points; and to determine the intersection points of the i-th left optical axis with the i-th through (i+M-N)-th right optical axes as target points.
[0083] Optionally, the determining module 702 is further configured to calibrate the left camera and the right camera using the optical center of the left camera or the optical center of the right camera as the origin of the world coordinate system, and determine the camera parameters of the left camera and the right camera.
[0084] Optionally, the determining module 702 is further configured to determine the world coordinates of any one of the target points based on the pixel coordinates of any one of the N left feature points in the left image, the pixel coordinates of the M-N+1 right feature points in the right image, and the camera parameters of the left camera and the right camera.
[0085] Optionally, both the left and right cameras can be infrared cameras.
[0086] Figure 8 is a schematic diagram of the structure of a device for determining a target point provided in an embodiment of this application.
[0087] For example, as shown in Figure 8, the device 800 for determining a target point includes: a memory 801, a processor 802, and a computer program 803 stored in the memory 801 and executable on the processor 802, wherein when the processor 802 executes the computer program 803, the device for determining a target point can perform any of the methods for determining a target point described above.
[0088] This embodiment can divide the device for determining the target point into functional modules based on the above method example. For example, each function can be assigned to a separate module, or two or more functions can be integrated into one processing module. The integrated module can be implemented in hardware. It should be noted that the module division in this embodiment is illustrative and only represents one logical functional division. In actual implementation, there may be other division methods.
[0089] When each functional module is divided according to its corresponding function, the device for determining the target point may include an acquisition module and a determination module, etc. It should be noted that all relevant content of each step involved in the above method embodiments can be referenced from the functional descriptions of the corresponding functional modules, and will not be repeated here.
[0090] The device for determining target points provided in this embodiment is used to execute the above-described method for determining target points, and thus can achieve the same effect as the above-described implementation method.
[0091] When using integrated units, the device for determining the target point may include a processing module and a storage module. The processing module can be used to control and manage the actions of the device for determining the target point. The storage module can be used to support the execution of program code and data by the device for determining the target point.
[0092] The processing module may be a processor or a controller, which can implement or execute various exemplary logic blocks, modules, and circuits as disclosed in this application. The processor may also be a combination of computing functions, such as a combination of one or more microprocessors, a combination of digital signal processing (DSP) and microprocessors, etc., and the storage module may be a memory.
[0093] The device for determining a target point provided in the embodiments of this application may specifically be a chip, component, or module. The device for determining a target point may include a connected processor and a memory. The memory is used to store instructions. When the device for determining a target point is running, the processor may call and execute the instructions to cause the chip to execute any of the aforementioned methods for determining a target point.
[0094] This embodiment provides a computer-readable storage medium storing instructions that, when executed on a computer or processor, cause the computer or processor to perform any of the methods for determining a target point described above.
[0095] This embodiment also provides a computer program product containing instructions that, when run on a computer or processor, causes the computer or processor to perform the aforementioned related steps to implement any of the methods for determining a target point described above.
[0096] In this embodiment, the device for determining the target point, the computer-readable storage medium, the computer program product containing instructions, and the chip are all used to execute the corresponding methods provided above. Therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding methods provided above, which are not repeated here.
[0097] Through the above description of the embodiments, those skilled in the art will understand that, for the sake of convenience and brevity, only the division of the above functional modules is used as an example. In actual applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
[0098] In the embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of modules or units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or other forms.
[0099] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A method for determining a target point, characterized in that the method is applied to a binocular vision positioning device, the binocular vision positioning device including a left camera and a right camera for acquiring images, and the method includes: obtaining N left feature points in the left image of the target scene captured by the left camera, and M right feature points in the right image of the target scene captured by the right camera, wherein the N left feature points and the M right feature points have a common pole line constraint relationship, M≥N, and M and N are integers greater than 0; determining the i-th left optical axis based on the line connecting the optical center of the left camera to the i-th left feature point among the N left feature points, wherein the i-th left feature point is the i-th feature point from left to right among the N left feature points, and N≥i≥1; determining the i-th right optical axis and the (i+1)-th to (i+M-N)-th right optical axes based on the lines connecting the optical center of the right camera to the i-th right feature point and the (i+1)-th to (i+M-N)-th right feature points among the M right feature points, wherein the i-th to (i+M-N)-th right feature points are the i-th to (i+M-N)-th feature points from left to right among the M right feature points; and determining the intersection points of the i-th left optical axis with the i-th right optical axis and with the (i+1)-th to (i+M-N)-th right optical axes as the target points.
2. The method according to claim 1, characterized in that the method further includes: using the optical center of the left camera or the optical center of the right camera as the origin of the world coordinate system, calibrating the left camera and the right camera to determine their camera parameters.
3. The method according to claim 1, characterized in that the method further includes: for the i-th target point, determining the world coordinates of the i-th target point based on the pixel coordinates of the i-th left feature point among the N left feature points on the left image, the pixel coordinates of the i-th right feature point among the M right feature points on the right image, and the camera parameters of the left camera and the right camera.
4. The method according to claim 1, characterized in that both the left and right cameras are infrared cameras.
5. A device for determining a target point, characterized in that the device includes: an acquisition module used to acquire N left feature points in the left image of the target scene captured by the left camera, and M right feature points in the right image of the target scene captured by the right camera, wherein the N left feature points and the M right feature points have a common pole line constraint relationship, M≥N, and M and N are integers greater than 0; and a determination module used to determine the i-th left optical axis based on the line connecting the optical center of the left camera to the i-th left feature point among the N left feature points, where the i-th left feature point is the i-th feature point from left to right among the N left feature points, N≥i≥1; and to determine the i-th right optical axis and the (i+1)-to-(i+M-N ...
6. A device for determining a target point, characterized in that the device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the computer program, it causes the device for determining the target point to perform the method for determining the target point as described in any one of claims 1 to 4.
7. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions that, when executed on a computer or processor, cause the computer or processor to perform the method for determining a target point as described in any one of claims 1 to 4.
8. A computer program product containing instructions, characterized in that when the computer program product is run on a computer or processor, it causes the computer or processor to perform the method for determining a target point as described in any one of claims 1 to 4.
Citation Information
Patent Citations
Binocular vision stereo matching method and device based on homonymous mark points
CN111028284A