Method and system for determining a predetermined point in an input image

By applying multiple shifts to the input image and combining machine learning with GPU computing, the method addresses inaccurate calibration of digital imaging devices in automotive applications, enabling efficient and accurate detection of predetermined points and improving the quality of camera calibration.

CN115222811B · Active · Publication Date: 2026-06-30 · APTIV TECHNOLOGIES AG

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
APTIV TECHNOLOGIES AG
Filing Date
2022-04-13
Publication Date
2026-06-30


Abstract

This disclosure relates to methods and systems for determining predetermined points in an input image, and more particularly to a computer-implemented method for determining predetermined points in an input image, the method comprising the following steps performed by computer hardware components: applying multiple shifts to the input image to obtain multiple shifted images; detecting a predetermined point in each shifted image; re-shifting the detected predetermined point; and determining the predetermined point based on the re-shifted detected predetermined point.

Description

Technical Field

[0001] This disclosure relates to methods and systems for determining predetermined points in an input image.

Background Technology

[0002] Digital imaging devices (such as digital cameras) are used for a variety of tasks in automotive applications. However, proper calibration of digital imaging devices may be necessary.

[0003] Therefore, there is a need to provide an effective and reliable method for camera calibration.

Summary of the Invention

[0004] This disclosure provides a computer-implemented method, a computer system, and a non-transitory computer-readable medium.

[0005] In one aspect, this disclosure relates to a computer-implemented method for determining a predetermined point in an input image, the method comprising the following steps performed (in other words, implemented) by computer hardware components: applying multiple shifts to the input image to obtain a plurality of shifted images; detecting the predetermined point in each shifted image; re-shifting the detected predetermined point; and determining the predetermined point based on the re-shifted detected predetermined point.

[0006] In other words, detection can be performed on the shifted image, and the detection results can be shifted back (or re-shifted) to the original coordinates, so that a predetermined point can be determined based on the re-shifted detection results.

[0007] One of the shifts can be zero, so that a shifted image can be the original image.

[0008] According to an embodiment, detecting the predetermined point includes determining the coordinates of the predetermined point in the shifted image.

[0009] According to an embodiment, the re-shifting includes shifting the coordinates to the coordinate system of the input image.

[0010] According to an embodiment, detecting the predetermined point includes determining the probability that the predetermined point is present at the corresponding coordinates.

[0011] According to an embodiment, determining the predetermined point based on the re-shifted detected predetermined point includes accumulating the probabilities of the re-shifted detected predetermined points.
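The shift, detect, re-shift and accumulate steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `shift_image`, `toy_detect` and `determine_point` are hypothetical names, the toy detector (brightest pixel, probability 1) merely stands in for the machine learning detector, and the sign convention of the shift is an assumption.

```python
def shift_image(image, dx, dy, fill=0):
    """Return a copy whose pixel (x, y) holds the input pixel (x + dx, y + dy).

    Pixels that would come from outside the input are set to `fill`.
    The sign convention is an assumption made for this sketch.
    """
    h, w = len(image), len(image[0])
    return [
        [image[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else fill
         for x in range(w)]
        for y in range(h)
    ]


def toy_detect(image):
    """Hypothetical stand-in for the detector: brightest pixel, probability 1."""
    best = max(
        ((x, y) for y in range(len(image)) for x in range(len(image[0]))),
        key=lambda xy: image[xy[1]][xy[0]],
    )
    return [(best[0], best[1], 1.0)]


def determine_point(image, shifts, detect):
    """Apply each (diagonal) shift, detect, re-shift, and accumulate probabilities."""
    accumulated = {}
    for s in shifts:
        shifted = shift_image(image, s, s)
        for x, y, prob in detect(shifted):
            # Re-shift: map the detection back to input-image coordinates.
            key = (x + s, y + s)
            accumulated[key] = accumulated.get(key, 0.0) + prob
    # The location with the highest accumulated probability wins.
    return max(accumulated.items(), key=lambda item: item[1])


# Toy input: an 8x8 image with a single bright pixel at x=3, y=5.
IMAGE = [[0] * 8 for _ in range(8)]
IMAGE[5][3] = 1
RESULT = determine_point(IMAGE, [-2, 0, 2], toy_detect)
```

In this toy case all three shifted runs agree after re-shifting, so the three probabilities accumulate at the same input-image location.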

[0012] According to an embodiment, the input image is divided into multiple units (cells). Each unit can be a rectangular sub-part of the input image. All units can have the same size, or they can have different sizes. The units may or may not overlap.

[0013] According to an embodiment, the shifted image is shifted by at least approximately 1% of the cell size, or at least approximately 2% of the cell size, or at least approximately 5% of the cell size, or at least approximately 10% of the cell size.

[0014] For example, the cell size can be 8×8 (in other words: 8 pixels wide and 8 pixels high). The shift can be an input parameter of the method. The shift can be an integer (positive or negative), and the absolute value of the shift can be less than the cell size. The method can be executed in parallel, so that several shifts can be applied simultaneously; for example, the shifts can be selected from the following list: [-6, -4, -2, 0, 2, 4, 6], and the results from all shifts can be combined and used together.

[0015] Shifting can be performed in two components, such as vertical and horizontal. For example, the length of a vertical shift can be the same as the length of a horizontal shift (thus providing a diagonal shift).
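As an illustration of the shift parameters described above, the following sketch builds diagonal shifts from the example list and expresses a shift as a fraction of the cell size. The names `CELL_SIZE`, `SHIFTS`, `diagonal_shifts` and `shift_fraction` are hypothetical, and the validation rule simply restates the constraint that the absolute shift is an integer smaller than the cell size.

```python
CELL_SIZE = 8                        # example cell size: 8 x 8 pixels
SHIFTS = [-6, -4, -2, 0, 2, 4, 6]    # example shift list; 0 keeps the original image


def diagonal_shifts(shifts, cell_size):
    """Return (horizontal, vertical) pairs with equal components (diagonal shifts)."""
    for s in shifts:
        if not isinstance(s, int) or abs(s) >= cell_size:
            raise ValueError(f"shift {s!r} must be an integer with |shift| < {cell_size}")
    return [(s, s) for s in shifts]


def shift_fraction(shift, cell_size):
    """Absolute shift expressed as a fraction of the cell size."""
    return abs(shift) / cell_size


PAIRS = diagonal_shifts(SHIFTS, CELL_SIZE)
```

With these example values, the smallest non-zero shift (2 pixels on an 8-pixel cell) is 25% of the cell size, which is above each of the percentage thresholds mentioned above.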

[0016] According to an embodiment, the image includes (in other words: shows; in other words: contains) a checkerboard pattern.

[0017] According to an embodiment, the predetermined point includes a saddle point of the checkerboard pattern.

[0018] According to an embodiment, a saddle point is the intersection of the boundaries between the black and white fields of the checkerboard pattern.

[0019] According to an embodiment, a machine learning method is used to determine the predetermined points in each shifted image.

[0020] According to an embodiment, the machine learning method includes an artificial neural network.

[0021] In another aspect, this disclosure relates to a computer system comprising a plurality of computer hardware components configured to perform some or all of the steps of the computer-implemented method described herein.

[0022] A computer system may include multiple computer hardware components (e.g., a processor, such as a processing unit or processing network; at least one memory, such as a memory unit or memory network; and at least one non-transitory data storage device). It should be understood that additional computer hardware components may be provided and used to perform the steps of the computer-implemented method within the computer system. The non-transitory data storage and/or the memory unit may include a computer program that instructs the computer, for example using the processing unit and the at least one memory unit, to perform some or all of the steps or aspects of the computer-implemented method described herein.

[0023] According to one implementation, the computer system also includes a camera configured to acquire images.

[0024] In another aspect, this disclosure relates to a non-transitory computer-readable medium comprising instructions for performing several or all of the steps or aspects of the computer-implemented method described herein. The computer-readable medium may be configured as: an optical medium, such as a compact disc (CD) or digital versatile disc (DVD); a magnetic medium, such as a hard disk drive (HDD); a solid-state drive (SSD); a read-only memory (ROM), such as flash memory; and so on. Furthermore, the computer-readable medium may be configured as a data storage device accessible via a data connection such as an internet connection. The computer-readable medium may, for example, be an online database or cloud storage.

[0025] This disclosure also relates to a computer program for instructing a computer to perform some or all of the steps or aspects of the computer-implemented method described herein.

[0026] Using the methods and apparatus described herein, a technique can be provided to identify saddle points in camera calibration patterns using machine learning.

Description of the Attached Figures

[0027] This document describes exemplary embodiments and functions of the present disclosure in conjunction with the following schematically illustrated figures:

[0028] Figure 1A is a diagram illustrating an example of input data;

[0029] Figure 1B is a diagram illustrating another example of input data;

[0030] Figure 2 is a diagram illustrating an example of a saddle point;

[0031] Figure 3A is a diagram illustrating the shifting of input data according to various embodiments;

[0032] Figure 3B is a diagram illustrating alignment according to various embodiments; and

[0033] Figure 4 is a flowchart illustrating a method for determining predetermined points in an input image according to various embodiments.

[0034] List of reference numerals

[0035] 100 Illustration of an example of input data

[0036] 150 Illustration of another example of input data

[0037] 200 Illustration of an example of a saddle point

[0038] 300 Illustration of input data shifting according to various embodiments

[0039] 302 Unit

[0040] 304 Unit

[0041] 306 Unit

[0042] 308 Unit

[0043] 310 Shift

[0044] 312 Shifted units

[0045] 350 Illustration of alignment according to various embodiments

[0046] 352 Detected point

[0047] 354 Point after alignment

[0048] 400 Flowchart illustrating a method for determining predetermined points in an input image according to various embodiments

[0049] 402 Step of applying multiple shifts to the input image to obtain multiple shifted images

[0050] 404 Step of detecting predetermined points in each shifted image

[0051] 406 Step of re-shifting the detected predetermined points

[0052] 408 Step of determining the predetermined point based on the re-shifted detected predetermined points

Detailed Implementation

[0053] According to various embodiments, non-maximum suppression methods for saddle point detectors can be provided.

[0054] Saddle points on an image can be one of the most distinctive features extracted from the image.

[0055] Saddle points can be widely used for the intrinsic calibration of cameras, the extrinsic calibration of cameras, and/or matching feature points.

[0056] Illustration 100 in Figure 1A shows an example of input data (i.e., an input image). The amount of distortion present in the image can be seen. The image shows a checkerboard pattern.

[0057] Illustration 150 in Figure 1B shows another example of input data. It can be seen that the checkerboard pattern can be arranged in more than one plane.

[0058] Illustration 200 in Figure 2 shows an example of a saddle point. At the top of Figure 2, pixel values are plotted over the pixel coordinates. At the bottom, the image is shown, with the saddle point located at the intersection of the boundaries between the dark and bright areas.
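To make the notion of a saddle point in pixel values concrete, the sketch below evaluates a finite-difference Hessian determinant on a small synthetic patch; a negative determinant indicates a saddle. This is purely illustrative: the disclosure detects saddle points with a machine learning method, not with this test, and `PATCH` and `hessian_determinant` are hypothetical names.

```python
# Synthetic 5x5 patch with intensity (x - 2) * (y - 2): two opposite quadrants
# are bright and the other two are dark, as at a smoothed checkerboard corner.
PATCH = [[(x - 2) * (y - 2) for x in range(5)] for y in range(5)]


def hessian_determinant(patch, x, y):
    """Finite-difference Hessian determinant at an interior pixel (x, y)."""
    fxx = patch[y][x + 1] - 2 * patch[y][x] + patch[y][x - 1]
    fyy = patch[y + 1][x] - 2 * patch[y][x] + patch[y - 1][x]
    fxy = (patch[y + 1][x + 1] - patch[y + 1][x - 1]
           - patch[y - 1][x + 1] + patch[y - 1][x - 1]) / 4.0
    return fxx * fyy - fxy * fxy


# At the patch centre the surface rises along one diagonal and falls along the
# other, so the determinant is negative.
CENTER_DET = hessian_determinant(PATCH, 2, 2)
```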

[0059] The accuracy of saddle point location can have a significant impact on the quality of camera calibration. Furthermore, it may be desirable to detect all saddle points, as failure to do so can lead to serious problems in later stages of the calibration process (related to issues with matching saddle points and their predicted locations).

[0060] In methods that rely on fixed-size regions of interest to detect and locate saddle points, problems can arise when the saddle point is located on the boundary between two or more regions of interest. This location often leads to a situation where a particular saddle point is neither associated with a region of interest nor detected. According to various implementations, recall of saddle points located on the boundary between regions of interest can be increased.

[0061] According to various embodiments, a non-maximum suppression method can be provided that can increase the recall of a neural network for saddle points. The non-maximum suppression (NMS) method can ensure that no more than one pixel (or one location, which may have sub-pixel resolution) is identified as the maximum, even if multiple pixels in the image may take the maximum value.

[0062] Standard NMS methods may require "all versus all" matching, which can lead to a time complexity of O(n²), where n is the width (or height) of the image. In contrast, by exploiting the grid structure, a time complexity of O(n) can be achieved according to various embodiments.

[0063] Depending on the embodiment, vectorization techniques (SIMD (Single Instruction, Multiple Data) instructions) can be used to achieve even higher speeds. Furthermore, computation on grid structures with independent cells can be easily ported to the GPU (Graphics Processing Unit). This may mean that when the main method (which can be referred to as the kernel method) runs on the GPU, NMS can also be executed on the GPU without sending data back to the CPU (Central Processing Unit). Exchanging data between the CPU and GPU can be time-consuming. Depending on the embodiment, the amount of data transferred between the CPU and GPU can be minimized.

[0064] When a saddle point occurs very close to the edge or corner of a cell, it often goes undetected. According to various implementations, this problem can be overcome by passing the image through a neural network several times. Each time the image passes through the neural network, the frames can be positioned differently, causing the top-left corner of the photograph (or image) to shift from its original position.

[0065] This method can move saddle points near cell boundaries away from the boundaries. However, it may require the use of a non-maximum suppression (NMS) method, because most saddle points may be detected repeatedly, and the NMS method is computationally expensive.

[0066] According to various embodiments, this problem can be overcome by using the grid structure to keep the method fast. Outputs from different runs with different offsets can be aligned such that they return values in a common reference system. Thus, for each cell, the final position of the saddle point in the cell can be computed independently and in parallel using a weighted average, where the weights are determined by the probability of the point's existence.
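The per-cell weighted average described above can be sketched as follows, assuming the detections from all shifted runs have already been aligned to the reference coordinate system. `fuse_per_cell` is a hypothetical name, square cells of equal size are assumed, and the sketch assumes at most one saddle point per reference cell.

```python
from collections import defaultdict


def fuse_per_cell(detections, cell_size):
    """Fuse aligned detections (x, y, probability) into one point per cell.

    Each detection is assigned to the reference cell containing it; within a
    cell the position is the probability-weighted average of its detections.
    Returns {cell_index: (x, y, accumulated_probability)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])  # [sum p*x, sum p*y, sum p]
    for x, y, prob in detections:
        cell = (int(x // cell_size), int(y // cell_size))
        acc = sums[cell]
        acc[0] += prob * x
        acc[1] += prob * y
        acc[2] += prob
    return {
        cell: (sx / sp, sy / sp, sp)
        for cell, (sx, sy, sp) in sums.items()
        if sp > 0.0
    }


# Two runs report the same saddle point at slightly different positions in
# cell (1, 1); a third detection lies in cell (2, 0).
FUSED = fuse_per_cell([(9.0, 10.0, 0.5), (11.0, 10.0, 0.5), (20.0, 3.0, 1.0)], 8)
```

Because every cell is handled independently, the loop does a single pass over the detections instead of comparing every detection with every other one, and the per-cell work could in principle be distributed across parallel workers.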

[0067] Figure 3A shows a diagram 300 illustrating input data shifting according to various embodiments. The input image can be divided into multiple units, such as four units 302, 304, 306, and 308. For example, the first unit 302 can be the upper left corner unit. Saddle points can be detected for multiple shifted images (or shifted units). For example, for the first unit 302, a shift 310 can be applied to generate multiple shifted units 312 corresponding to the first unit 302.

[0068] Illustration 350 in Figure 3B shows alignment according to various embodiments. The alignment process is illustrated by examining how the saddle point moves during alignment.

[0069] Let the (detected) saddle point 352 be located within the lower right cell (in one of the sub-regions designated 1, 2, 3, and 4), and let the shift be equal to the offset shown in Figure 3B.

[0070] The saddle point is located in one of the four sub-regions, and the aligned point 354 can be in the same region or in a different region:

[0071] If the point is within region 1, it may move to cell 1' after alignment. The position in cell 1' after alignment can be calculated based on a given offset (but in the opposite direction). The same applies to sub-regions 2 and 3. For sub-region 4, the detected point may still be in the same cell (the cell to the lower right), and only the position may be affected by the offset.
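The alignment of a detection from a shifted grid back to the reference grid can be sketched as below. `align` is a hypothetical helper; the sign convention (the reference coordinate is the shifted coordinate plus the shift) and the cell indexing are assumptions made for this illustration and are not taken from Figure 3B.

```python
def align(cell, local, shift, cell_size):
    """Map a detection from the shifted grid to the reference grid.

    cell  -- (column, row) index of the cell in the shifted grid
    local -- (u, v) position inside that cell, in pixels
    shift -- (sx, sy) shift that was applied to the image
    Returns (reference_cell, reference_local).
    """
    (ci, cj), (u, v), (sx, sy) = cell, local, shift
    # Absolute position in reference coordinates (assumed sign convention).
    x = ci * cell_size + u + sx
    y = cj * cell_size + v + sy
    # Floor division also handles coordinates pushed across a cell boundary.
    ref_cell = (int(x // cell_size), int(y // cell_size))
    ref_local = (x - ref_cell[0] * cell_size, y - ref_cell[1] * cell_size)
    return ref_cell, ref_local


# A point close to the left edge of cell (1, 1) ends up in the neighbouring cell.
MOVED = align((1, 1), (1.0, 6.0), (-2, -2), 8)
# A point well inside the cell stays in the same cell; only its position changes.
STAYED = align((1, 1), (5.0, 6.0), (-2, -2), 8)
```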

[0072] In highly distorted regions of an image, saddle points may be stretched, making their size larger than the cell size. In such cases, saddle point detectors may struggle to produce correct output when the center of the saddle point is far from the center of the cell. While conventionally used methods may miss saddle points, methods according to various embodiments can correctly detect all saddle points.

[0073] The methods according to various embodiments can be fast and can increase recall from 98% to 99.9% (reducing the number of missed saddle points by more than 20 times).

[0074] Ground truth (GT) data can be created semi-automatically. Results from the "standard detector" can be manually corrected. False positives can be removed, and missing detections can be added. A potential problem is the difficulty in accurately determining the location of saddle points. When creating ground truth data, saddle point locations can only be estimated to within ±1 px. A prediction less than two pixels away from the GT can be considered a true positive (TP). In this case, recall can be defined as: #TP / (#all GTs on the image). It should be understood that the # symbol represents "number of".

[0075] As mentioned above, it can be difficult to assess the “position error” of a prediction. The locations of the saddle points can be used for the intrinsic calibration of the camera. Therefore, two sets of detected saddle points on the same image can be compared by checking how well the camera calibration performs (in other words, for which set of saddle points the camera calibration (model optimization) residuals are smaller).
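The recall definition above can be illustrated with a short sketch. `recall` is a hypothetical helper; the greedy one-to-one matching of predictions to ground truth points is an assumption made here, since the text only states the two-pixel criterion.

```python
import math


def recall(predictions, ground_truth, max_dist=2.0):
    """Recall = #TP / #GT, where a GT point counts as found if an unused
    prediction lies less than `max_dist` pixels away from it."""
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    used = set()          # indices of predictions already matched
    true_positives = 0
    for gx, gy in ground_truth:
        best = None       # (distance, prediction index) of the closest match
        for k, (px, py) in enumerate(predictions):
            if k in used:
                continue
            dist = math.hypot(px - gx, py - gy)
            if dist < max_dist and (best is None or dist < best[0]):
                best = (dist, k)
        if best is not None:
            used.add(best[1])
            true_positives += 1
    return true_positives / len(ground_truth)


GT = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
PRED = [(10.5, 10.5), (21.9, 20.0), (40.0, 40.0)]
EXAMPLE_RECALL = recall(PRED, GT)   # two of the three GT points are matched
```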

[0076] Figure 4 shows a flowchart 400 illustrating a method for determining a predetermined point in an input image according to various embodiments. At 402, multiple shifts can be applied to the input image to obtain multiple shifted images. At 404, a predetermined point can be detected in each shifted image. At 406, the detected predetermined point can be re-shifted. At 408, the predetermined point can be determined based on the re-shifted detected predetermined point.

[0077] According to various embodiments, detecting a predetermined point may include or may be determining coordinates.

[0078] According to various embodiments, re-shifting may include or may be shifting the coordinates to the coordinate system of the input image.

[0079] According to various embodiments, detecting a predetermined point may include or may be determining a probability.

[0080] According to various embodiments, determining the predetermined point based on the re-shifted detected predetermined point may include or may be accumulating the probabilities of the re-shifted detected predetermined points.

[0081] According to various embodiments, the input image can be divided into multiple units.

[0082] According to various embodiments, the shifted image may be shifted by at least approximately 1% of the cell size, or at least approximately 2% of the cell size, or at least approximately 5% of the cell size, or at least approximately 10% of the cell size.

[0083] According to various embodiments, the image may include or may show a checkerboard pattern.

[0084] According to various embodiments, the predetermined point may include or may be a saddle point of the checkerboard pattern.

[0085] According to various embodiments, a saddle point can be the intersection of the boundaries between the black and white fields of the checkerboard pattern.

[0086] According to various embodiments, machine learning methods can be used to determine predetermined points in each shifted image.

[0087] According to various embodiments, the machine learning method may include or may be an artificial neural network.

[0088] Each of steps 402, 404, 406, 408, and the other steps described above, can be performed by computer hardware components.

Claims

1. A computer-implemented method for determining a predetermined point in an input image, the computer-implemented method comprising the following steps performed by computer hardware components:
- applying (402) multiple shifts to the input image to obtain multiple shifted images;
- detecting (404) a predetermined point in each of the shifted images;
- re-shifting (406) the detected predetermined points; and
- determining (408) the predetermined point based on the re-shifted detected predetermined points;
wherein the step of detecting the predetermined point includes determining coordinates;
wherein the re-shifting step includes shifting the coordinates to the coordinate system of the input image;
wherein the step of detecting the predetermined point includes determining the probability that the predetermined point is present at the corresponding coordinates;
wherein the step of determining the predetermined point based on the re-shifted detected predetermined points includes accumulating the probabilities of the re-shifted detected predetermined points;
wherein the image includes a checkerboard pattern;
wherein the predetermined point includes a saddle point of the checkerboard pattern; and
wherein the predetermined point in each of the shifted images is determined using a machine learning method.

2. The computer-implemented method according to claim 1, wherein the input image is divided into multiple units.

3. The computer-implemented method according to claim 2, wherein the shifted image is shifted by at least 1% of the height or width of the units.

4. The computer-implemented method according to claim 2, wherein the shifted image is shifted by at least 2% of the height or width of the units.

5. The computer-implemented method according to claim 2, wherein the shifted image is shifted by at least 5% of the height or width of the units.

6. The computer-implemented method according to claim 2, wherein the shifted image is shifted by at least 10% of the height or width of the units.

7. The computer-implemented method according to claim 1, wherein the saddle point is the intersection of the boundaries between the black and white fields of the checkerboard pattern.

8. The computer-implemented method according to claim 1, wherein the machine learning method includes an artificial neural network.

9. A computer system comprising a plurality of computer hardware components configured to perform the steps of the computer-implemented method according to any one of claims 1 to 8.

10. The computer system according to claim 9, further comprising: A camera configured to acquire images.

11. A non-transitory computer-readable medium comprising instructions for performing a computer-implemented method according to any one of claims 1 to 8.