Structured-light three-dimensional reconstruction method and apparatus, electronic device, and storage medium

By combining integer coding patterns and line-shifting stripe patterns, the shortcomings of existing structured light 3D reconstruction technology in terms of accuracy and efficiency are solved, achieving high-precision 3D reconstruction, especially accurate reconstruction in complex scenes.

WO2026138359A1 · PCT designated stage · Publication Date: 2026-07-02 · HANGZHOU HIKROBOT TECH CO LTD

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
HANGZHOU HIKROBOT TECH CO LTD
Filing Date
2025-11-28
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

Existing structured light 3D reconstruction technology has shortcomings in terms of accuracy and efficiency, especially in achieving high-precision 3D reconstruction when dealing with complex scenes.

Method used

By acquiring images of multi-frame integer-coded patterns and line-shifted fringe patterns of the object under test, and using the integer-coded patterns to assist the line-shifted fringe patterns in phase unwrapping, the accuracy of phase unwrapping is improved, ultimately achieving high-precision 3D reconstruction.

Benefits of technology

Improving the accuracy of phase unwrapping enables high-precision 3D reconstruction, so that the 3D shape of objects can be reconstructed more accurately, especially in complex scenes.

✦ Generated by Eureka AI based on patent content.


Abstract

The present application provides a structured-light three-dimensional reconstruction method and apparatus, an electronic device, and a storage medium. In an example of the present application, the structured-light three-dimensional reconstruction method comprises: acquiring images of an object to be measured; performing phase extraction processing on first-type images among the images of the object, so as to obtain a first phase extraction result; performing phase extraction processing on second-type images among the images of the object, so as to obtain a second phase extraction result; on the basis of the second phase extraction result, performing phase unwrapping processing on the first phase extraction result, so as to obtain an absolute phase map; and performing three-dimensional reconstruction processing on the basis of the absolute phase map.

Description

Structured light 3D reconstruction method, apparatus, electronic device and storage medium

Technical Field

[0001] This application relates to the field of machine vision technology, and in particular to a structured light 3D reconstruction method, apparatus, electronic device and storage medium.

Background Technology

[0002] Structured light 3D reconstruction is a common 3D imaging technique that uses a structured light source (such as a laser or projector) to project a specific light pattern and thereby capture the shape and depth information of an object's surface.

[0003] Structured light 3D reconstruction has broad application prospects in fields such as industrial manufacturing, virtual reality, and cultural heritage protection.

Summary of the Invention

[0004] In view of this, this application provides a structured light three-dimensional reconstruction method, apparatus, electronic device, and storage medium.

[0005] According to a first aspect of the embodiments of this application, a structured light 3D reconstruction method is provided, comprising: acquiring images of an object under test, wherein a specified pattern is projected onto the object under test, the specified pattern including multiple frames of integer-coded patterns and multiple frames of line-shifted fringe patterns projected sequentially, the number of integer-coded patterns being consistent with the period of the line-shifted fringe pattern in the spatial domain; performing phase decoding on first-type images among the images of the object under test to obtain a first phase decoding result, wherein the first-type images are images of the object under test acquired while the multiple frames of line-shifted fringe patterns are projected sequentially; performing phase decoding on second-type images among the images of the object under test to obtain a second phase decoding result, wherein the second-type images are images of the object under test acquired while the multiple frames of integer-coded patterns are projected sequentially; unwrapping the first phase decoding result based on the second phase decoding result to obtain an absolute phase map; and performing 3D reconstruction based on the absolute phase map.

[0006] According to a second aspect of the embodiments of this application, a structured light three-dimensional reconstruction apparatus is provided, comprising: an acquisition unit for acquiring images of an object under test, wherein a specified pattern is projected onto the object under test, the specified pattern including multiple frames of integer-coded patterns and multiple frames of line-shifted fringe patterns projected sequentially, the number of integer-coded patterns being consistent with the period of the line-shifted fringe pattern in the spatial domain; a phase decoding unit for performing phase decoding on first-type images among the images of the object under test to obtain a first phase decoding result, wherein the first-type images are images of the object under test acquired while the multiple frames of line-shifted fringe patterns are projected sequentially; the phase decoding unit being further configured to perform phase decoding on second-type images among the images of the object under test to obtain a second phase decoding result, wherein the second-type images are images of the object under test acquired while the multiple frames of integer-coded patterns are projected sequentially; an unwrapping unit for unwrapping the first phase decoding result based on the second phase decoding result to obtain an absolute phase map; and a reconstruction unit for performing three-dimensional reconstruction based on the absolute phase map.

[0007] According to a third aspect of the present application, an electronic device is provided, including a processor and a memory, the memory storing machine-executable instructions executable by the processor, the processor being configured to execute the machine-executable instructions to implement the method provided in the first aspect.

[0008] According to a fourth aspect of the embodiments of this application, a machine-readable storage medium is provided, wherein machine-executable instructions are stored therein, and when the machine-executable instructions are executed by a processor, the method provided in the first aspect is implemented.

[0009] The structured light 3D reconstruction method of this application acquires images of an object under test onto which a specified pattern is projected, obtaining multiple frames of first-type images and multiple frames of second-type images. Phase decoding is performed on the first-type images to obtain a first phase decoding result, and on the second-type images to obtain a second phase decoding result. Based on the second phase decoding result, the first phase decoding result is unwrapped to obtain an absolute phase map, and 3D reconstruction is then performed based on the absolute phase map. Because the integer-coded patterns assist the phase unwrapping of the line-shifted fringe patterns, the accuracy of phase unwrapping is improved, thereby achieving high-precision 3D reconstruction.

Description of the Drawings

[0010] Figure 1 is a flowchart illustrating a structured light three-dimensional reconstruction method provided in an embodiment of this application.

[0011] Figures 2A to 2D are schematic diagrams of the line-shifted stripe patterns provided in the embodiments of this application.

[0012] Figures 3A and 3B are schematic diagrams of an integral calculation process provided in an embodiment of this application.

[0013] Figure 4 is a schematic diagram of the temporal grayscale and temporal template integration result of a single pixel provided in an embodiment of this application.

[0014] Figure 5 is a schematic diagram of the periodic extension of an integration result provided in an embodiment of this application.

[0015] Figure 6 is a flowchart illustrating a structured light three-dimensional reconstruction method provided in an embodiment of this application.

[0016] Figure 7 is a schematic diagram of a structured light three-dimensional reconstruction device provided in an embodiment of this application.

[0017] Figure 8 is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0018] To enable those skilled in the art to better understand the technical solutions provided in the embodiments of this application, some terms involved in the embodiments of this application will be briefly explained below.

[0019] 1. Integer-coded patterns: Images are encoded using binary numbers, with each pixel in the image corresponding to an integer code value. This application collectively refers to projection patterns such as Gray code and XOR code used to assist in phase unwrapping of line shifting fringes as integer-coded patterns.

[0020] 2. Gray code: A type of integer-coded pattern. Gray code, also called reflected binary code, is a special binary encoding in which two adjacent values differ in only one binary bit.

[0021] 3. XOR code: A type of integer-coded pattern. XOR (exclusive-OR) code, also known as parity check code, is a special binary encoding in which, as in Gray code, two adjacent values differ in only one bit.

[0022] 4. Phase Decoding: The process of calculating the code value of the acquired coded image according to the structured light pattern encoding method, also known as decoding.

[0023] 5. Temporal template integration: Calculate the integration result of the gray value of the acquired sequence of line-shifted stripe images in the temporal domain with a given template (which can be called the temporal template).

[0024] 6. Line-shifted stripe pattern: A set of specially designed images modulated by binary stripes (that is, each pixel value is either 0 or 1, black or white). The stripe width and period are specially designed; see Figures 2A to 2D for examples.
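The one-bit-difference property of Gray code described in term 2 can be checked with a short sketch. The function name `to_gray` is illustrative and does not come from the application; the conversion `n ^ (n >> 1)` is the standard binary-to-Gray mapping.

```python
def to_gray(n):
    """Return the reflected binary (Gray) code of a non-negative integer."""
    return n ^ (n >> 1)

# Gray codes of 0..7, written as 3-bit strings.
codes = [to_gray(i) for i in range(8)]
bit_strings = [format(c, "03b") for c in codes]

# Number of differing bits between each pair of neighbouring codes.
hamming = [bin(a ^ b).count("1") for a, b in zip(codes, codes[1:])]

print(bit_strings)
print(hamming)
```

Every entry of `hamming` is 1, which is the property that limits a decoding error at a stripe boundary to a single code step.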

[0025] To make the above-mentioned objectives, features and advantages of the embodiments of this application more apparent and understandable, the technical solutions of the embodiments of this application will be further described in detail below with reference to the accompanying drawings.

[0026] It should be noted that the sequence number of each step in the embodiments of this application does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.

[0027] Please refer to Figure 1, which is a flowchart of a structured light three-dimensional reconstruction method provided in an embodiment of this application. As shown in Figure 1, the structured light three-dimensional reconstruction method may include steps S100 to S140.

[0028] Step S100: Acquire an image of the object under test; wherein the object under test is projected with a specified pattern, the specified pattern including a multi-frame integer-coded pattern and a multi-frame line-shifted stripe pattern projected sequentially, the number of integer-coded patterns being consistent with the period of the line-shifted stripe pattern in the spatial domain. The consistency between the number of integer-coded patterns and the period of the line-shifted stripe pattern in the spatial domain can be understood as: the number of integer-coded patterns being consistent with the number of pixels included in the horizontal direction within one period of the line-shifted stripe pattern in the spatial domain.

[0029] In this embodiment of the application, in order to achieve three-dimensional reconstruction of the object under test, a multi-frame integer coded pattern and a multi-frame line-shifted stripe pattern can be sequentially projected onto the object under test using a structured light source.

[0030] For example, in any line-shifted stripe pattern, the pixel values in the same column are the same, the pixel values in the same row are not all the same, and the pattern is periodic (and may include multiple periods).

[0031] For example, a schematic diagram of a line-shifted stripe pattern can be shown in Figure 2A (Figure 2A takes three cycles in the horizontal direction as an example). That is, the line-shifted stripe pattern can include multiple cycles in the spatial domain (i.e., in the horizontal direction of a frame of line-shifted stripe image), and there are multiple pixels in one cycle. For example, the cycle shown in Figure 2A along the horizontal direction includes 8 pixels.

[0032] As shown in Figure 2A, the period of the line-shifting stripe pattern in the spatial domain is 8, meaning that the line-shifting stripe pattern can be shifted 8 times in one period. The period of 8 in the spatial domain means that the pattern repeats a complete waveform pattern once every 8 pixels in physical space (such as a projection plane or image plane).

[0033] For example, a multi-frame line-shifted stripe pattern projected in sequence may include a set of line-shifted stripe patterns obtained by cyclic shifting (which may be referred to as a sequence of line-shifted stripe patterns).

[0034] For example, each pattern in the sequence of line-shifted stripe patterns is shifted by one pixel relative to the previous one, so the sequence advances by one pixel per frame in the time domain.

[0035] For example, the above-mentioned multi-frame line-shifted stripe patterns may include 8 frames: the line-shifted stripe pattern shown in Figure 2A and the new line-shifted stripe patterns obtained by cyclically shifting it. Two of these frames are shown schematically in Figures 2B and 2C.
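The cyclic-shift construction can be sketched for a single image row. The 4-on/4-off layout in `BASE` and the rightward shift direction are assumptions chosen so that the top-left pixel sees the temporal code 1, 0, 0, 0, 0, 1, 1, 1 used later in the description; the names `BASE` and `stripe_row` are hypothetical.

```python
PERIOD = 8        # pixels per spatial period, as in Figure 2A
NUM_PERIODS = 3   # Figure 2A is described as showing three periods
BASE = [1, 1, 1, 1, 0, 0, 0, 0]  # assumed layout of one period in frame 0

def stripe_row(shift):
    """One row of the pattern after shifting it `shift` pixels to the right."""
    width = PERIOD * NUM_PERIODS
    return [BASE[(x - shift) % PERIOD] for x in range(width)]

# Eight frames, each shifted one pixel further than the previous one.
frames = [stripe_row(t) for t in range(PERIOD)]

# Temporal code seen by the left-most pixel across the eight frames.
temporal_code_px0 = [frame[0] for frame in frames]

# Temporal codes of the eight pixels inside the first period.
codes_in_period = {tuple(f[x] for f in frames) for x in range(PERIOD)}

print(temporal_code_px0)
print(len(codes_in_period))
```

Each of the eight pixels in a period receives a different temporal code, which is what lets the decoding step locate a pixel within its period.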

[0036] For example, an image of the object being tested projected with the specified pattern can be obtained.

[0037] For example, the acquired images of the object under test include: an image of the object under test acquired when a line shift stripe pattern is projected (which may be referred to as a first type image), and an image of the object under test acquired when an integer encoded pattern is projected (which may be referred to as a second type image).

[0038] For example, images of the object under test obtained by projecting different line shift stripe patterns correspond to different first-type images; images of the object under test obtained by projecting different integer encoding patterns correspond to different second-type images.

[0039] For example, an image of the object being measured can be acquired using a monocular camera or a binocular camera.

[0040] For example, the first type of image and the second type of image have the same resolution.

[0041] For example, the first type of image and the second type of image are acquired using the same camera, and the object under test and the structured light system must be relatively stationary during the acquisition of the first type of image and the second type of image.

[0042] Step S110: Perform phase decoding on the first-type images among the images of the object under test to obtain the first phase decoding result.

[0043] Step S120: Perform phase decoding on the second-type images among the images of the object under test to obtain the second phase decoding result.

[0044] In this embodiment of the application, phase decoding can be performed on the acquired first-type images to obtain the corresponding phase decoding result (referred to in this document as the first phase decoding result).

[0045] Phase decoding can likewise be performed on the acquired second-type images to obtain the corresponding phase decoding result (referred to in this document as the second phase decoding result).

[0046] It should be noted that, in order to improve the accuracy of subsequent processing, the acquired images of the object under test can be preprocessed, for example by filtering and distortion correction, before phase decoding and other subsequent processing.

[0047] For example, Gaussian filtering can be used to preprocess the acquired image to reduce image noise interference, and distortion correction can be performed on the image based on the camera calibration results.

[0048] For example, for a binocular system, epipolar correction can also be performed to obtain a distortion-free image with aligned rows of images from the left and right cameras.

[0049] Step S130: Based on the second phase decoding result, unwrap the first phase decoding result to obtain the absolute phase map.

[0050] In this embodiment, the line-shifted stripe image is used to achieve subpixel-level encoding.

[0051] Because line-shifted fringe images are periodic, the phase decoding results are also periodic. For example, if the line-shifted fringe image has 128 periods in the horizontal direction, the phase decoding results also have 128 periods. The decoded phase is therefore wrapped: its values are constrained to a specific range (one period), which yields accurate sub-pixel-level encoding within a period but leaves the period index undetermined.

[0052] As can be seen, the phase obtained by decoding the first-type images is a wrapped phase, which needs to be unwrapped to obtain the absolute phase.

[0053] Integer-coded images are used to determine the order of line-shifted fringes (i.e., the integer part of the absolute phase) to assist in phase unwrapping of the line-shifted fringe pattern.

[0054] Accordingly, after the first and second phase decoding results are obtained in the manner described above, the first phase decoding result can be unwrapped based on the second phase decoding result to obtain the absolute phase map.
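Since the integer code supplies the fringe order, described above as the integer part of the absolute phase, unwrapping can be sketched as adding that order to the wrapped value. This sketch assumes the wrapped result has been normalized to [0, 1) within each period and that the second result is already a per-pixel fringe order; the function name `unwrap` and the sample values are hypothetical.

```python
def unwrap(order_map, wrapped_map):
    """Combine per-pixel fringe order and wrapped phase into absolute phase."""
    return [
        [k + phi for k, phi in zip(order_row, wrapped_row)]
        for order_row, wrapped_row in zip(order_map, wrapped_map)
    ]

# One image row crossing three fringe periods (illustrative values).
order_map = [[0, 0, 1, 1, 2]]
wrapped_map = [[0.25, 0.75, 0.25, 0.75, 0.25]]

absolute = unwrap(order_map, wrapped_map)
print(absolute)  # [[0.25, 0.75, 1.25, 1.75, 2.25]]
```

The wrapped values repeat from period to period, while the absolute values increase monotonically along the row.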

[0055] For example, the absolute phase map includes the absolute phase at each pixel location.

[0056] Step S140: Perform three-dimensional reconstruction based on the obtained absolute phase map.

[0057] In this embodiment of the application, after obtaining the absolute phase map in the manner described above, three-dimensional reconstruction processing can be performed based on the obtained absolute phase map.

[0058] For example, for a binocular system (i.e., the image of the object being measured is acquired by a binocular camera in step S100), point cloud and depth map can be calculated by binocular stereo matching and triangulation. The specific implementation can be described in detail below.

[0059] As can be seen, in the method flow shown in Figure 1, images of the object under test onto which the specified pattern is projected are acquired, yielding multiple frames of first-type images and multiple frames of second-type images. Phase decoding is performed on the first-type images to obtain a first phase decoding result, and on the second-type images to obtain a second phase decoding result. Based on the second phase decoding result, the first phase decoding result is unwrapped to obtain an absolute phase map. Three-dimensional reconstruction is then performed based on the absolute phase map, achieving high-precision three-dimensional reconstruction.

[0060] In some embodiments, performing phase decoding on the first-type images among the images of the object under test to obtain the first phase decoding result may include: for any pixel position in the first-type images, performing an integration operation on the grayscale values of the different first-type images at that pixel position with each temporal template, to obtain the integral response value corresponding to each temporal template, wherein the temporal templates include an initial template and new templates obtained by cyclically shifting the initial template, the number of temporal templates is consistent with the period of the line-shifted fringe pattern in the spatial domain, and the initial template is determined based on the theoretical encoding value of a specified pixel position in the temporal domain; and determining the first phase decoding result corresponding to the pixel position based on the position of the maximum integral response value in the temporal domain. The consistency between the number of temporal templates and the period of the line-shifted fringe pattern in the spatial domain can be understood as follows: the number of temporal templates equals the number of pixels in the horizontal direction within one period of the line-shifted fringe pattern in the spatial domain.

[0061] For example, in order to improve the decoding accuracy of line-shifted stripe patterns, the theoretical encoding value of a single pixel position in the time domain can be used as a template. By performing time-domain template integration on the sequence of line-shifted stripe images, phase decoding of the line-shifted stripe images can be realized based on the sub-pixel position corresponding to the maximum integral response value, thereby improving the phase decoding accuracy and thus the accuracy of structured light 3D reconstruction.

[0062] For example, the time-domain template may include an initial template and a new template obtained by cyclically shifting the initial template.

[0063] The initial template is determined based on the theoretical encoding value of the specified pixel position in the time domain.

[0064] Taking the line-shifted stripe pattern shown in Figures 2A to 2C as an example, assuming that the initial template corresponds to the pixel position at the upper left corner of the image, the theoretical encoding value of this pixel position in the time domain is 1, 0, 0, 0, 0, 1, 1, 1 in sequence, and the initial template can be [1, 0, 0, 0, 0, 1, 1, 1].

[0065] The time-domain template may also include a new template obtained by cyclically shifting the initial template, which may include [1,1,0,0,0,0,1,1], [1,1,1,0,0,0,0,1], [1,1,1,1,0,0,0,0], [0,1,1,1,1,0,0,0], [0,0,1,1,1,1,0,0], [0,0,0,1,1,1,1,0], and [0,0,0,0,1,1,1,1].

[0066] It should be noted that, in this embodiment of the application, considering that the integration result is similar to a Gaussian distribution, increasing the amplitude of the integral response value can more accurately determine the maximum value of the integral response value and its corresponding position; when the time-domain template is represented by 1 and 0, it is equivalent to only accumulating the gray value corresponding to 1 during the integration process. Therefore, in order to increase the amplitude of the response value, 0 in the time-domain template can be set to -1. That is, the initial template can be [1, -1, -1, -1, -1, 1, 1, 1].
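The template set described above can be sketched as follows: the initial 0/1 template is cyclically shifted to the right and every 0 is replaced by -1. The helper names `cyclic_right` and `bipolar` are hypothetical.

```python
INITIAL = [1, 0, 0, 0, 0, 1, 1, 1]  # temporal code of the reference pixel

def cyclic_right(seq, k):
    """Cyclically shift a list k positions to the right."""
    k %= len(seq)
    return list(seq) if k == 0 else seq[-k:] + seq[:-k]

def bipolar(template):
    """Replace 0 with -1 so dark frames subtract instead of being ignored."""
    return [1 if v else -1 for v in template]

# One template per shift; the count equals the period (8 here).
templates = [bipolar(cyclic_right(INITIAL, k)) for k in range(len(INITIAL))]

for t in templates:
    print(t)
```

With the -1 entries, a frame that is bright where the template expects dark lowers the response, which widens the gap between matching and non-matching shifts.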

[0067] For example, for any pixel location in a first type of image, an integration operation is performed based on the grayscale value of that pixel location in different first type images and the temporal template to obtain the integral response value corresponding to that pixel location under each temporal template.

[0068] For example, taking the aforementioned temporal templates as an example, a set of line-shifted stripe patterns may include 8 frames (3 of which can be as shown in Figures 2A to 2C, and the remaining 5 frames can be obtained by shifting). An image of the object under test (i.e., the aforementioned first-type image) is acquired as each of the 8 line-shifted stripe patterns is projected onto the object under test. For any pixel position, the grayscale values of the 8 first-type images at that pixel position are obtained and integrated with each temporal template to obtain 8 integral response values. Taking the line-shifted coding pattern in Figure 2D as an example, white pixels in the figure have the value 1 and black pixels have the value 0. Each row in Figure 2D represents one image; within each image, all pixels in a column have the same value (so the figure can be understood as showing one row of each image). Successive images differ only by a horizontal shift: the second image is the first image with every column moved one pixel to the right, which is the meaning of "line shift". The pattern in Figure 2D contains 4 periods in the spatial domain, each period having 8 pixels (4 black and 4 white), for a total of 32 pixels. There are 8 images in the time domain; that is, the time-domain period is 8.

[0069] Taking the initial template [1, -1, -1, -1, -1, 1, 1, 1] as an example, assume that for a certain pixel position the gray values of the 8 first-type images at that pixel position are 28, 50, 24, 0, 0, 0, 0, 10 respectively. [28, 50, 24, 0, 0, 0, 0, 10] can then be integrated with each temporal template to obtain the integral response value corresponding to each temporal template at that pixel position. The specific implementation can be seen in Figures 3A and 3B.

[0070] For example, for the initial template, the integral response value = 28*1 + 50*(-1) + 24*(-1) + 0*(-1) + 0*(-1) + 0*(-1) + 0*(-1) + 10*1 = -36. For the second time-domain template (i.e., the initial template is shifted right by 1 bit), the integral response value is 28*1+50*1+24*(-1)+0*(-1)+0*(-1)+0*(-1)+0*1+10*1=64; for the third time-domain template (i.e., the initial template is shifted right by 2 bits), the integral response value is 28*1+50*1+24*1+0*(-1)+0*(-1)+0*(-1)+0*(-1)+10*1=112; and so on, until the eighth time-domain template (i.e., the initial template is shifted right by 7 bits), the integral response value is 28*(-1)+50*(-1)+24*(-1)+0*(-1)+0*1+0*1+0*1+10*1=-92.
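The arithmetic above can be reproduced with a few lines. This is a minimal sketch in which "integration" is taken to be the sum of element-wise products, as in the worked values; the function name `integral_responses` is hypothetical.

```python
def integral_responses(gray, initial):
    """Dot product of the temporal gray values with each right-shifted template."""
    n = len(initial)
    out = []
    for k in range(n):
        # Template k is the initial template cyclically shifted right by k.
        template = [initial[(i - k) % n] for i in range(n)]
        out.append(sum(g * t for g, t in zip(gray, template)))
    return out

gray = [28, 50, 24, 0, 0, 0, 0, 10]
initial = [1, -1, -1, -1, -1, 1, 1, 1]

resp = integral_responses(gray, initial)
peak = max(range(len(resp)), key=lambda i: resp[i])

print(resp)      # [-36, 64, 112, 92, 36, -64, -112, -92]
print(peak + 1)  # 3, i.e. the third template gives the largest response
```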

[0071] For example, once the integral response value corresponding to each time-domain template is determined, the first phase decoding result corresponding to the pixel position can be determined based on the position of the maximum integral response value in the time domain.

[0072] For example, assuming that the non-initial temporal templates are obtained by cyclically shifting the initial template to the right, the integral response value corresponding to each temporal template is calculated. The shift with the largest integral response value is the one at which the shifted template best matches the actual pixel intensity in the temporal domain (i.e., the actually acquired pixel grayscale values). In other words, the position of the largest integral response value uniquely identifies a pixel within a single period.

[0073] For example, for an image obtained by projecting the pattern shown in Figure 2A onto the object under test, each of the eight pixels within one period along the horizontal direction can be assigned a unique fractional value. For instance, if the time-domain period is 8 and the integral response values calculated for a certain pixel over the eight images in the time domain are 50, 80, 100, 60, 20, 0, 0, 0, the maximum at integer positions is 100, which corresponds to position 3 in the time-domain period. To improve the accuracy of phase decoding, a more precise fractional position needs to be calculated. Using the gray-scale centroid method near the maximum of the integral response, a fractional maximum (say, 105.5) and its corresponding position (say, 2.8) are calculated. The phase decoding result for this pixel is then the fractional position 2.8.

[0074] In one example, determining the first phase decoding result corresponding to a pixel position based on the location of the maximum integral response value in the time domain can include: determining the search range based on the location of the maximum integral response value in the time domain and a preset window size; within the search range, determining the precise location of the maximum corresponding to the pixel position using the gray-scale centroid method; and normalizing that precise location to obtain the first phase decoding result corresponding to the pixel position.

[0075] For example, once the location of the maximum of the integral response in the time domain is determined, the gray-scale centroid method can be used to determine the precise location of the maximum in the time domain corresponding to that pixel position, and thus the first phase decoding result corresponding to that pixel position. The precise location of the maximum can be understood as analogous to a sub-pixel location.

[0076] Accordingly, the search range can be determined based on the location of the maximum of the integral response in the time domain and the preset window size. For example, taking the initial template [1, -1, -1, -1, -1, 1, 1, 1], suppose that for a certain pixel position the gray values of the eight first-type images at that pixel position are 28, 50, 24, 0, 0, 0, 0, 10 respectively. Integrating [28, 50, 24, 0, 0, 0, 0, 10] with each time-domain template gives the integral response values at that pixel position, [-36, 64, 112, 92, 36, -64, -112, -92]. The maximum of the integral response in the time domain is the 3rd value, and the preset window size is 5, so the search range is determined to be [-36, 64, 112, 92, 36].

[0077] The actual temporal grayscale value of a single pixel and the integral result of the temporal template can be seen in Figure 4. As shown in Figure 4, the lower curve represents the temporal grayscale value of a single pixel, and the upper curve represents the integral response value of that single pixel under various temporal templates. The maximum value of the integral response corresponds to the edge where the pixel changes from bright to dark in the temporal domain of the line shift stripe.

[0078] The maximum of the integral response may fall at either boundary of a period, as in the integral response values [112, 92, 36, -64, -112, -92, -36, 64] or [92, 36, -64, -112, -92, -36, 64, 112]. Periodic extension must therefore be performed before applying the gray-scale centroid method: the latter half of the integration result is copied to the beginning and the first half is copied to the end, so that the data wraps around at both ends of the period, which facilitates the gray-scale centroid calculation.

[0079] For example, a schematic diagram of the periodic extension of the integral result can be shown in Figure 5.
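The periodic extension can be sketched directly from that description. The function name `extend_periodically` is hypothetical; the example uses the first boundary case given above, where the maximum sits at the start of the period.

```python
def extend_periodically(resp):
    """Prepend the latter half and append the first half of one period."""
    half = len(resp) // 2
    return resp[half:] + resp + resp[:half]

resp = [112, 92, 36, -64, -112, -92, -36, 64]  # maximum at the period boundary
ext = extend_periodically(resp)

# The original peak at index 0 now sits at index len(resp) // 2.
centre = len(resp) // 2
window = ext[centre - 2:centre + 3]  # window size 5 around the peak

print(ext)
print(window)  # [-36, 64, 112, 92, 36]
```

After extension the 5-sample window around the peak is complete even though the peak lies on the boundary, and it matches the search range of the earlier example.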

[0080] The formula for calculating the precise location of the maximum value corresponding to a pixel position using the grayscale centroid method can be as follows:
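The equation itself is not reproduced in this text. A standard gray-scale centroid consistent with the symbols defined in the next paragraph would take the following form; this is an assumed reconstruction, not the applicant's equation, and the weighting actually used in the application may differ.

```latex
\varphi_t \;=\;
\frac{\displaystyle\sum_{i=\mathrm{max}_{id}-h}^{\mathrm{max}_{id}+h} i \cdot pos(i)}
     {\displaystyle\sum_{i=\mathrm{max}_{id}-h}^{\mathrm{max}_{id}+h} pos(i)}
```

Here the sums run over the search window of width 2h + 1 centred on the coarse maximum, and pos(i) is the periodically extended integral response at shift i.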

[0081] Where h is the width of the half-window, max id The position with the largest integral response value is i, which corresponds to the template movement position, and pos is the integral result (the integral result after period expansion).

[0082] For φ t Normalization can be performed to obtain the corresponding first solution phase result, where:

[0083] Where N is the number of line shift steps, i.e., the number of projected images (consistent with the number of values in the time-domain template). For example, for an 8-step line shift, i.e., N=8, the time-domain template includes 8 values.

[0084] It should be noted that the gray-scale centroid method is only one example of how the precise location of the maximum value corresponding to the pixel position can be calculated in this application embodiment, and does not limit how that location is determined. In this application embodiment, other methods can also be used to calculate the precise location of the maximum value after integrating the line-shifted stripes, such as quadratic curve fitting, the specific implementation of which will not be elaborated here.
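The gray-scale centroid variant, including the period expansion, can be sketched as follows. This is a minimal illustration, not the application's implementation: the centroid is assumed to be the response-weighted mean of the template positions in the window, and the normalization is assumed to be a division by the number of steps N with wrap-around into [0, 1). All function names are hypothetical.

```python
def extend_period(responses):
    """Prepend the latter half and append the first half of the responses."""
    n = len(responses)
    tail = responses[n // 2:]
    head = responses[:n // 2]
    return tail + responses + head, len(tail)


def centroid_peak(responses, half_window):
    """Sub-position of the maximum via a response-weighted centroid."""
    n = len(responses)
    max_id = max(range(n), key=responses.__getitem__)
    extended, offset = extend_period(responses)
    num = 0.0
    den = 0.0
    for i in range(max_id - half_window, max_id + half_window + 1):
        value = extended[i + offset]  # offset maps position i into the extended data
        num += i * value
        den += value
    return num / den  # assumes the window sum is non-zero


def normalize(peak, n_steps):
    """Assumed normalization: wrap into one period, then divide by N."""
    return (peak % n_steps) / n_steps


inner = centroid_peak([-36, 64, 112, 92, 36, -64, -112, -92], 2)
left_edge = centroid_peak([112, 92, 36, -64, -112, -92, -36, 64], 2)
right_edge = centroid_peak([92, 36, -64, -112, -92, -36, 64, 112], 2)
print(inner, left_edge, right_edge)
print(normalize(inner, 8))
```

The three sequences are cyclic shifts of one another, so the three peaks differ by whole positions, which shows that the expansion lets the window straddle the period boundary without changing the sub-position estimate.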

[0085] In some embodiments, performing phase extraction processing on the second type of image among the images of the object under test to obtain the second phase extraction result may include: for any pixel position in the second type of image, binarizing the gray values of the different second-type images at that pixel position according to the binarization threshold of that pixel position, to obtain the binarization results of the different second-type images at that pixel position; and determining the second phase extraction result of that pixel position based on those binarization results.

[0086] For example, for any pixel position in a second-type image, the gray values of the multiple second-type images at that pixel position are binarized according to the binarization threshold of that pixel position, to obtain the binarization results of the multiple second-type images at that pixel position.

[0087] For example, assuming that the sequentially projected integer-coded patterns include 8 frames, after projecting these 8 different integer-coded patterns onto the object under test, 8 frames of second-type images corresponding to these 8 different integer-coded patterns can be acquired. For any pixel position, the grayscale value of the 8 frames of second-type images at that pixel position can be binarized according to the binarization threshold of that pixel position.

[0088] For example, for each of the 8 frames of second-type images, if its gray value at that pixel position is greater than the binarization threshold of that pixel position, the binarization result can be set to 1; otherwise, it can be set to 0.

[0089] For example, the second phase extraction result at the pixel position can be determined based on the binarization results of the multiple second-type images at that pixel position.

[0090] For example, assuming that the binarization results of the 8 frames of second-type images at this pixel position are, in sequence, 00001111, a mapping rule for the binarization result can be determined according to the encoding method of the integer-coded pattern, and the binarization result can be mapped according to that rule, for example onto sequentially increasing integer values. The second phase extraction result can then be determined from the mapped result.
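As one illustration of such a mapping rule, the sketch below assumes the integer-coded patterns are a standard reflected-binary Gray code with the most significant bit projected first. The application leaves the encoding open (it also mentions XOR code), so this is only one possible rule, and the function name is hypothetical.

```python
def gray_bits_to_int(bits):
    """Decode Gray-code bits (most significant first) into an integer."""
    value = 0
    bit = 0
    for g in bits:
        bit ^= g                      # binary bit = XOR of all Gray bits so far
        value = (value << 1) | bit
    return value


print(gray_bits_to_int([0, 0, 0, 0, 1, 1, 1, 1]))  # 10
```

Under this assumption, adjacent code words differ in a single bit, and decoding them yields consecutive integers, which is the "sequentially increasing" result the mapping is meant to produce.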

[0091] In one example, for any pixel position in the multiple second-type images, the binarization threshold for that pixel position can be determined as follows: determine the maximum and minimum gray values from the gray values of each first-type image at that pixel position; then determine the binarization threshold for that pixel position from the maximum and minimum gray values.

[0092] To enable those skilled in the art to better understand the technical solutions provided in the embodiments of this application, the technical solutions provided in the embodiments of this application are described below in conjunction with specific application scenarios.

[0093] In this embodiment, as shown in Figure 6, structured light 3D reconstruction may include steps such as coded image projection and acquisition, image preprocessing, image correction, image phase extraction, and point cloud and depth map calculation.

[0094] The implementation of each step will be explained below.

[0095] I. Encoded Image Projection and Acquisition

[0096] Using a structured light source such as a projector (or laser galvanometer), multiple frames of integer-coded patterns and multiple frames of line-shifted stripe patterns are sequentially projected onto the surface of the object being measured. Images of the object are then simultaneously acquired by a camera, such as a binocular camera, for the purpose of 3D reconstruction of the object.

[0097] For example, the integer-coded pattern may include projection patterns such as Gray code or XOR code, which are used to assist in unwrapping the wrapped phase of the line-shifted stripe pattern.

[0098] II. Image Preprocessing

[0099] For example, Gaussian filtering can be performed on the acquired image to reduce image noise interference.

[0100] III. Image Correction

[0101] For example, distortion correction can be performed on the image based on the camera's calibration results.

[0102] For example, for a binocular system, epipolar correction is also required to obtain a distortion-free image with aligned rows of images from the left and right cameras.

[0103] IV. Image Phase Extraction

[0104] For example, phase extraction can be performed separately on the integer-coded images (i.e., the second type of image mentioned above) and the line-shifted stripe images (i.e., the first type of image mentioned above).

[0105] For example, line-shifted stripe images are used to achieve sub-pixel level encoding. The phase extraction result of the first type of image is the wrapped phase, which must be unwrapped to obtain the absolute phase. The accuracy of this phase extraction directly affects the accuracy of 3D reconstruction.

[0106] Integer-coded images are used to determine the fringe order of the line-shifted stripes, which is then used to unwrap their wrapped phase. The integer decoding result only selects the fringe order and does not itself affect the sub-pixel accuracy of the reconstruction.

[0107] For example, for phase extraction of a line-shifted stripe image, the theoretical encoded value of the line-shifted stripe image in the time domain can be used as an initial template. This template is then integrated with the gray values of the actual line-shifted stripe images captured by the camera. The initial template is cyclically shifted one position to the right each time to calculate the integral response value of the time-domain template at each position. The position with the largest integral response value is the position at which the shifted template is closest to the actual pixel intensity in the time domain.

[0108] For example, the initial template can be set to be consistent with the theoretical encoding value. The non-initial template is obtained by cyclically shifting the initial template to the right. Based on the new time-domain template (non-initial template), the integral response value corresponding to the new time-domain template can be calculated. The position with the largest response value can be used to uniquely identify the pixel in a single period.

[0109] Assuming the theoretical encoding value of the line-shifted stripe image in the time domain is [1, 0, 0, 0, 0, 1, 1, 1], in order to increase the amplitude of the response value, the template can be set to [1, -1, -1, -1, -1, 1, 1, 1].
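The mapping from the 0/1 theoretical code to the signed template is a simple rescaling. The sketch below (function name hypothetical) also compares the response range of both templates on the gray values used in the earlier example.

```python
def bipolar_template(theoretical_code):
    """Map a 0/1 theoretical code to a +1/-1 template (0 -> -1, 1 -> +1)."""
    return [2 * c - 1 for c in theoretical_code]


def response_range(gray_values, template):
    """Min and max integral response over all cyclic right shifts."""
    n = len(template)
    values = []
    for k in range(n):
        shifted = [template[(j - k) % n] for j in range(n)]
        values.append(sum(g * t for g, t in zip(gray_values, shifted)))
    return min(values), max(values)


code = [1, 0, 0, 0, 0, 1, 1, 1]
gray = [28, 50, 24, 0, 0, 0, 0, 10]
print(bipolar_template(code))                        # [1, -1, -1, -1, -1, 1, 1, 1]
print(response_range(gray, code))                    # (0, 112)
print(response_range(gray, bipolar_template(code)))  # (-112, 112)
```

For these values the signed template doubles the spread of the response, which is the amplitude increase the paragraph refers to.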

[0110] The temporal template integral calculation process for a single pixel can be shown in Figures 3A and 3B. The position with the largest integral response value is then found by traversing the temporal domain. A window (e.g., 5 positions) is selected, centered on the position with the largest integral response value, and the precise position of the maximum value corresponding to the pixel position is calculated within it using the gray-scale centroid method. After normalization, this position is the decoding result of the pixel.

[0111] As shown in Figure 4, the maximum value of the integral response corresponds to the edge where the pixel changes from bright to dark in the time domain of the line shift stripe. The maximum value of the integral response may appear at the two boundaries of the period. Therefore, when using the gray-scale centroid method for calculation, period expansion must be performed first to make the data on both sides of the period form a loop, so that the gray-scale centroid method can be calculated normally.

[0112] In Figure 4, the horizontal axis represents the position of the template movement in the time domain, and the vertical axis represents the integral response value.

[0113] For example, during the phase extraction of the line-shifted stripe images, the maximum and minimum gray values of each pixel position in the time domain can be calculated to generate maximum and minimum gray value images of the measured object, which can be used to derive the thresholds for binarization of the integer-coded images.

[0114] The traditional approach projects a completely black image and a completely white image to determine the binarization threshold during the phase extraction of an integer-coded image. Compared with that approach, the method provided in this application embodiment can reduce the number of projected images. In addition, in the traditional approach the brightness of the actually projected integer-coded image and the brightness of the completely black/white image may not be completely consistent, which may lead to binarization errors in some complex scenarios. The method provided in this application embodiment can improve the accuracy of binarization while reducing the number of projected black and white patterns.

[0115] For example, for any pixel location, the average of the maximum and minimum gray values at that pixel location can be used as the binarization threshold for the integer-coded pattern.
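A minimal per-pixel sketch of this thresholding follows. The second-type gray values are made-up illustration data, and the function names are not taken from the application.

```python
def binarization_threshold(first_type_grays):
    """Average of the max and min gray values across the first-type images."""
    return (max(first_type_grays) + min(first_type_grays)) / 2


def binarize(second_type_grays, threshold):
    """1 where the gray value exceeds the threshold, otherwise 0."""
    return [1 if g > threshold else 0 for g in second_type_grays]


first_type = [28, 50, 24, 0, 0, 0, 0, 10]    # line-shifted stripe images at one pixel
second_type = [3, 5, 2, 4, 48, 45, 50, 47]   # hypothetical integer-coded images, same pixel
threshold = binarization_threshold(first_type)
print(threshold)                         # 25.0
print(binarize(second_type, threshold))  # [0, 0, 0, 0, 1, 1, 1, 1]
```

Because the threshold comes from images that are captured anyway, no additional all-black or all-white frame is needed.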

[0116] After completing the phase extraction of the integer-coded images (the second type of image) and the phase extraction of the line-shifted stripe images (the first type of image, which yields the wrapped phase), the wrapped phase is unwrapped according to the following formula to obtain the absolute phase: δ = k + φ

[0117] Where k is the phase extraction result (or integer decoding result) of the integer-coded image, φ is the phase extraction result (or fractional decoding result) of the line-shifted stripe image, and δ is the absolute phase value.
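Applied per pixel, the formula is a plain element-wise sum. The sketch below uses nested lists as stand-ins for the two maps; the values and the function name are illustrative only.

```python
def unwrap_phase(order_map, wrapped_map):
    """Absolute phase = integer decoding result k + fractional result phi, per pixel."""
    return [
        [k + phi for k, phi in zip(k_row, phi_row)]
        for k_row, phi_row in zip(order_map, wrapped_map)
    ]


order_map = [[3, 3, 4], [3, 4, 4]]                      # hypothetical k values
wrapped_map = [[0.25, 0.75, 0.0], [0.5, 0.125, 0.625]]  # hypothetical phi values in [0, 1)
print(unwrap_phase(order_map, wrapped_map))
# [[3.25, 3.75, 4.0], [3.5, 4.125, 4.625]]
```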

[0118] V. Point Cloud and Depth Map Calculation

[0119] For example, given an absolute phase map, three-dimensional reconstruction can be performed in a variety of ways.

[0120] For example, for a monocular system, the point cloud and depth map can be calculated based on a pre-calibrated phase-height or phase-disparity curve.

[0121] For binocular systems, point clouds and depth maps can be calculated through binocular stereo matching and triangulation.

[0122] Take a binocular system as an example. First, binocular stereo matching can be performed. Since binocular epipolar correction has already been completed, the images from the left and right cameras are row-aligned. Therefore, it is only necessary to complete the column-direction matching of the images acquired by the binocular system.

[0123] First, the phase values of each row in the phase maps of the left and right cameras are sorted. Based on the sorting results, the two neighboring integer pixel coordinates in the other camera's row that bracket each pixel's phase value are determined. Then, sub-pixel coordinates are calculated using linear interpolation based on the absolute phase values, thereby obtaining the binocular disparity value for each pixel. Finally, the 3D coordinates of the point cloud are calculated based on the camera imaging model and the triangulation principle to complete the reconstruction of the point cloud and depth map.
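The column-direction matching for a single rectified row can be sketched as below. This is a simplified illustration: it assumes the right row's absolute phase is already strictly increasing along the columns (so no explicit sort is needed), defines disparity as the left column minus the interpolated right column, and returns `None` where a phase has no bracket in the right row. The function name is hypothetical.

```python
from bisect import bisect_right


def match_row(left_phase, right_phase):
    """Per-column disparity for one row, using linear interpolation on phase."""
    disparities = []
    last = len(right_phase) - 1
    for u_left, p in enumerate(left_phase):
        j = bisect_right(right_phase, p)  # right_phase[j-1] <= p < right_phase[j]
        if j == 0:
            disparities.append(None)      # phase below the right row's range
        elif j > last:
            # Only an exact hit on the last column can still be matched.
            disparities.append(float(u_left - last) if right_phase[last] == p else None)
        else:
            p0, p1 = right_phase[j - 1], right_phase[j]
            u_right = (j - 1) + (p - p0) / (p1 - p0)  # sub-pixel column
            disparities.append(u_left - u_right)
    return disparities


left_row = [0.5, 1.5, 2.5, 3.5]
right_row = [2.0, 3.0, 4.0, 5.0]
print(match_row(left_row, right_row))  # [None, None, 1.5, 1.5]
```

A production implementation would also need to handle invalid pixels and non-monotonic phase caused by occlusion or decoding errors, which this sketch ignores.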

[0124] The imaging model of a camera can be represented as:
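The equation itself is not reproduced in this text. Assuming the standard pinhole model with a single focal length f, a form consistent with the symbol definitions that follow is:

```latex
u = f\,\frac{x}{z} + c_x, \qquad v = f\,\frac{y}{z} + c_y
```

The application may state the same model in matrix form; this scalar version is a reconstruction, not a quotation.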

[0125] Where f is the focal length and (c_x, c_y) are the principal point coordinates, which can be obtained through camera calibration; (u, v) are the pixel coordinates; and (x, y, z) are the three-dimensional coordinates of the point cloud.

[0126] According to the triangulation principle, we can obtain:
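The equation is not reproduced in this text. For a rectified binocular pair, the standard triangulation relation consistent with the symbols defined next is:

```latex
z = \frac{f \cdot b}{d}
```

This is the textbook form and is given here as an assumption about what the omitted equation states.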

[0127] Where b is the baseline length, d is the disparity value, and f is the focal length.

[0128] The formulas for calculating x and y are as follows:
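The equations are not reproduced in this text. Inverting a pinhole model with a single focal length gives x = (u - c_x) * z / f and y = (v - c_y) * z / f, with z = f * b / d from triangulation. The sketch below assumes exactly those forms; the function name, the units (pixels for f and d, millimeters for b), and the sample numbers are illustrative.

```python
def triangulate(u, v, d, f, b, cx, cy):
    """Return (x, y, z) for pixel (u, v) with disparity d under a pinhole model."""
    if d <= 0:
        raise ValueError("disparity must be positive")
    z = f * b / d          # depth from triangulation
    x = (u - cx) * z / f   # back-project the column
    y = (v - cy) * z / f   # back-project the row
    return x, y, z


# Hypothetical calibration: f = 1000 px, baseline 100 mm, principal point (640, 360).
print(triangulate(u=740, v=460, d=50, f=1000.0, b=100.0, cx=640.0, cy=360.0))
# (200.0, 200.0, 2000.0)
```

Running this for every pixel that has a valid disparity produces the point cloud, and the z values alone form the depth map.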

[0129] The method provided in this application has been described above. The apparatus provided in this application is described below:

[0130] Please refer to Figure 7, which is a structural schematic diagram of a structured light 3D reconstruction device provided in an embodiment of this application. As shown in Figure 7, the structured light 3D reconstruction device may include: an acquisition unit 710, used to acquire an image of an object under test, wherein a specified pattern is projected onto the object under test, the specified pattern including multiple frames of integer-coded patterns and multiple frames of line-shifted stripe patterns projected sequentially, the number of the integer-coded patterns being consistent with the period of the line-shifted stripe pattern in the spatial domain; a phase extraction unit 720, used to perform phase extraction processing on a first type of image among the images of the object under test to obtain a first phase extraction result, wherein the first type of image is an image of the object under test obtained by sequentially projecting the multiple frames of line-shifted stripe patterns; the phase extraction unit 720 being further configured to perform phase extraction processing on a second type of image among the images of the object under test to obtain a second phase extraction result, wherein the second type of image is an image of the object under test obtained by sequentially projecting the multiple frames of integer-coded patterns; an unwrapping unit 730, configured to perform unwrapping processing on the first phase extraction result based on the second phase extraction result to obtain an absolute phase map; and a reconstruction unit 740, configured to perform three-dimensional reconstruction processing based on the absolute phase map.

[0131] In some embodiments, the phase extraction unit 720 performing phase extraction processing on the first type of image among the images of the object under test to obtain the first phase extraction result includes: for any pixel position in the first type of image, performing an integration operation on the gray values of the different first-type images at that pixel position and the temporal templates to obtain the integral response value corresponding to each temporal template, wherein the temporal templates include an initial template and new templates obtained by cyclically shifting the initial template, the number of temporal templates is consistent with the period of the line-shifted stripe pattern in the spatial domain, and the initial template is determined based on the theoretical encoding value of a specified pixel position in the temporal domain; and determining the first phase extraction result corresponding to the pixel position based on the position of the maximum integral response value in the temporal domain.

[0132] In some embodiments, the phase extraction unit 720 determining the first phase extraction result corresponding to the pixel position based on the position of the maximum integral response value in the time domain includes: determining a search range based on the position of the maximum integral response value in the time domain and a preset window size; determining the precise location of the maximum value within the search range using the gray-scale centroid method; and normalizing the precise location of the maximum value to obtain the first phase extraction result corresponding to the pixel position.

[0133] In some embodiments, the phase extraction unit 720 performing phase extraction processing on the second type of image among the images of the object under test to obtain the second phase extraction result includes: for any pixel position in the second type of image, binarizing the gray values of the different second-type images at that pixel position according to the binarization threshold of that pixel position to obtain the binarization results of the different second-type images at that pixel position; and determining the second phase extraction result of that pixel position based on those binarization results.

[0134] In some embodiments, for any pixel position in the second type of image, the binarization threshold of that pixel position is determined by: determining the maximum gray value and the minimum gray value based on the gray value of each first-type image at that pixel position; and determining the binarization threshold of that pixel position based on the maximum gray value and the minimum gray value.

[0135] This application provides an electronic device including a processor and a memory, wherein the memory stores machine-executable instructions that can be executed by the processor, and the processor executes the machine-executable instructions to implement the structured light 3D reconstruction method described above.

[0136] Please refer to Figure 8, which is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of this application. The electronic device may include a processor 801 and a memory 802 storing machine-executable instructions. The processor 801 and the memory 802 can communicate via a system bus 803. Furthermore, by reading and executing the machine-executable instructions in the memory 802 corresponding to the structured light 3D reconstruction logic, the processor 801 can execute the structured light 3D reconstruction method described above.

[0137] The memory 802 mentioned in this document can be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, etc. For example, machine-readable storage media can be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, storage drives (such as hard disk drives), solid-state drives, any type of storage disk (such as optical discs, DVDs, etc.), or similar storage media, or combinations thereof.

[0138] In some embodiments, a machine-readable storage medium is also provided, such as memory 802 in Figure 8, which stores machine-executable instructions that, when executed by a processor, implement the structured light 3D reconstruction method described above. For example, the storage medium may be ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.

[0139] The above are some embodiments of this application and are not intended to limit this application. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of protection of this application.

Claims

1. A structured light 3D reconstruction method, comprising: acquiring an image of an object under test, wherein a specified pattern is projected onto the object under test, the specified pattern including multiple frames of integer-coded patterns and multiple frames of line-shifted stripe patterns projected in sequence, the number of the integer-coded patterns being consistent with the period of the line-shifted stripe pattern in the spatial domain; performing phase extraction processing on a first type of image among the images of the object under test to obtain a first phase extraction result, wherein the first type of image is an image of the object under test obtained by sequentially projecting the multiple frames of line-shifted stripe patterns; performing phase extraction processing on a second type of image among the images of the object under test to obtain a second phase extraction result, wherein the second type of image is an image of the object under test obtained by sequentially projecting the multiple frames of integer-coded patterns; unwrapping the first phase extraction result based on the second phase extraction result to obtain an absolute phase map; and performing three-dimensional reconstruction processing based on the absolute phase map.

2. The method according to claim 1, wherein performing phase extraction processing on the first type of image among the images of the object under test to obtain the first phase extraction result includes: for any pixel position in the first type of image, performing an integration operation on the gray values of the different first-type images at that pixel position and the temporal templates to obtain the integral response value corresponding to each temporal template, wherein the temporal templates include an initial template and new templates obtained by cyclically shifting the initial template, the number of the temporal templates is consistent with the period of the line-shifted stripe pattern in the spatial domain, and the initial template is determined based on the theoretical encoding value of a specified pixel position in the temporal domain; and determining the first phase extraction result corresponding to the pixel position based on the position of the maximum integral response value in the time domain.

3. The method according to claim 2, wherein determining the first phase extraction result corresponding to the pixel position based on the position of the maximum integral response value in the time domain includes: determining a search range based on the position of the maximum integral response value in the time domain and a preset window size; determining the precise location of the maximum value within the search range using the gray-scale centroid method; and normalizing the precise location of the maximum value to obtain the first phase extraction result corresponding to that pixel position.

4. The method according to claim 1, wherein performing phase extraction processing on the second type of image among the images of the object under test to obtain the second phase extraction result includes: for any pixel position in the second type of image, binarizing the gray values of the different second-type images at that pixel position according to the binarization threshold of that pixel position to obtain the binarization results of the different second-type images at that pixel position; and determining the second phase extraction result at that pixel position based on those binarization results.

5. The method according to claim 4, wherein, for any pixel position in the second type of image, the binarization threshold for that pixel position is determined by: determining the maximum and minimum gray values based on the gray value of each first-type image at that pixel position; and determining the binarization threshold for that pixel position based on the maximum and minimum gray values.

6. A structured light three-dimensional reconstruction device, comprising: an acquisition unit, used to acquire an image of an object under test, wherein a specified pattern is projected onto the object under test, the specified pattern including multiple frames of integer-coded patterns and multiple frames of line-shifted stripe patterns projected in sequence, the number of the integer-coded patterns being consistent with the period of the line-shifted stripe pattern in the spatial domain; a phase extraction unit, used to perform phase extraction processing on a first type of image among the images of the object under test to obtain a first phase extraction result, wherein the first type of image is an image of the object under test obtained by sequentially projecting the multiple frames of line-shifted stripe patterns; the phase extraction unit being further configured to perform phase extraction processing on a second type of image among the images of the object under test to obtain a second phase extraction result, wherein the second type of image is an image of the object under test obtained by sequentially projecting the multiple frames of integer-coded patterns; an unwrapping unit, used to unwrap the first phase extraction result based on the second phase extraction result to obtain an absolute phase map; and a reconstruction unit, used to perform three-dimensional reconstruction processing based on the absolute phase map.

7. The apparatus according to claim 6, wherein the phase extraction unit performing phase extraction processing on the first type of image among the images of the object under test to obtain the first phase extraction result includes: for any pixel position in the first type of image, performing an integration operation on the gray values of the different first-type images at that pixel position and the temporal templates to obtain the integral response value corresponding to each temporal template, wherein the temporal templates include an initial template and new templates obtained by cyclically shifting the initial template, the number of the temporal templates is consistent with the period of the line-shifted stripe pattern in the spatial domain, and the initial template is determined based on the theoretical encoding value of a specified pixel position in the temporal domain; and determining the first phase extraction result corresponding to the pixel position based on the position of the maximum integral response value in the time domain; and wherein the phase extraction unit determining the first phase extraction result corresponding to the pixel position based on the position of the maximum integral response value in the time domain includes: determining a search range based on the position of the maximum integral response value in the time domain and a preset window size; determining the precise location of the maximum value within the search range using the gray-scale centroid method; and normalizing the precise location of the maximum value to obtain the first phase extraction result corresponding to that pixel position.

8. The apparatus according to claim 6, wherein the phase extraction unit performing phase extraction processing on the second type of image among the images of the object under test to obtain the second phase extraction result includes: for any pixel position in the second type of image, binarizing the gray values of the different second-type images at that pixel position according to the binarization threshold of that pixel position to obtain the binarization results of the different second-type images at that pixel position; and determining the second phase extraction result at that pixel position based on those binarization results; and wherein, for any pixel position in the second type of image, the binarization threshold for that pixel position is determined by: determining the maximum and minimum gray values based on the gray value of each first-type image at that pixel position; and determining the binarization threshold for that pixel position based on the maximum and minimum gray values.

9. An electronic device, comprising a processor and a memory, the memory storing machine-executable instructions that can be executed by the processor, the processor executing the machine-executable instructions to implement the method as described in any one of claims 1-5.

10. A machine-readable storage medium, wherein, The machine-readable storage medium stores machine-executable instructions, which, when executed by a processor, implement the method as described in any one of claims 1-5.