Striped image enhancement method, system, device and medium based on complementary modulation information
By leveraging the complementary modulation properties of high-exposure and low-exposure striped images through FGNet, the problem of HDR region reconstruction failure was solved, achieving efficient 3D reconstruction results.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SICHUAN UNIV
- Filing Date
- 2026-03-24
- Publication Date
- 2026-06-12
Figure CN122199283A_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of three-dimensional surface measurement technology, and in particular relates to a method, system, device and medium for stripe image enhancement based on complementary modulation information.
Background Technology
[0002] Fringe projection profilometry (FPP) is a high-precision optical 3D measurement technique. Due to its advantages such as simple hardware structure, high measurement speed, and non-contact measurement, it has been widely used in fields such as industrial defect detection, 3D reconstruction of biomedical tissues, and cultural heritage morphology measurement. A typical structured light FPP measurement system usually consists of a DLP projector and a CMOS camera. The projector projects a sinusoidal fringe pattern onto the surface of the object being measured, and the camera acquires the fringe image modulated by the object's surface. FPP first demodulates the phase of the acquired fringe image, and then maps the phase information to height information using a phase-height mapping model, thereby obtaining the surface height of the object being measured and ultimately generating a reconstructed point cloud.
[0003] High-quality 3D reconstruction relies on proper exposure of the fringe image. However, in actual measurements, high dynamic range (HDR) issues are common due to the uneven distribution of surface reflectivity. Excessive reflectivity can cause brightness to exceed the camera's response range, resulting in truncation of fringe information; conversely, insufficient reflectivity can lead to insufficient brightness, preventing the camera from effectively capturing fringe information. Both situations can cause reconstruction failure in the HDR region.
[0004] To address HDR issues and obtain higher-quality reconstructed point clouds, researchers have proposed several methods. Polarization-based measurement methods utilize polarization information to suppress specular reflection and saturation, but require additional optics and precise calibration. Adaptive exposure methods mitigate overexposure and underexposure by dynamically adjusting imaging parameters, but are sensitive to parameter settings and still have limitations in complex HDR scenes.
[0005] Multi-exposure fusion methods utilize the complementarity of grayscale intensity, modulation, or phase information in fringe images acquired under different exposure conditions to restore and enhance HDR regions. While multi-exposure fusion methods typically achieve superior reconstruction quality compared to traditional methods, the fringe acquisition process in real-world measurement scenarios is often cumbersome, leading to low measurement efficiency. In recent years, deep learning technology has developed rapidly and has been widely applied in fringe projection profilometry (FPP). However, experimental results show that such methods, which rely solely on neighborhood pixel information, struggle to achieve effective restoration when processing fringe images with extreme exposures.
[0006] In summary, while traditional multi-exposure methods can improve reconstruction results to some extent, their data acquisition process is complex and time-consuming. In recent years, deep learning methods, due to their strong feature representation capabilities, have been widely applied to the restoration of HDR stripe images and have shown advantages in measurement efficiency. However, most existing methods use only a single overexposed stripe image as input, making it difficult for them to effectively restore large HDR regions with severe saturation or low modulation, which limits the accuracy of 3D reconstruction. Therefore, to address these issues, this invention proposes a stripe image enhancement scheme based on complementary modulation information.
Summary of the Invention
[0007] The purpose of this invention is to provide a stripe image enhancement method, system, device, and medium based on complementary modulation information to solve the problems existing in the prior art.
[0008] In a first aspect, to achieve the above objectives, the present invention provides a stripe image enhancement method based on complementary modulation information, comprising: Acquire a double exposure stripe image to be processed, the double exposure stripe image to be processed including a high exposure stripe image and a corresponding low exposure stripe image; The double-exposure stripe image to be processed is input into a fusion-guided repair network for image enhancement, and an enhanced stripe image is output. The fusion-guided repair network includes a modulation complement fusion module and a guided repair module connected in sequence. The modulation complement fusion module includes a dual-branch structure and a terminal convolutional layer connected in sequence. The dual-branch structure includes a symmetrically arranged low-exposure modulation-aware sub-branch and a high-exposure modulation-aware sub-branch. The guided repair module is built based on the UNet network.
[0009] This invention proposes a Fusion-Guided Restoration Network (FGNet) designed to effectively utilize the complementary modulation characteristics between high-exposure and low-exposure fringe image pairs. By exploiting this complementarity, FGNet reduces the number of images acquired through multiple exposures while mitigating common reconstruction errors in single-exposure input methods. Specifically, the framework first extracts and fuses complementary information from image pairs using a Modulation-Complementary Fusion Module (MCFM), and then uses a Guided Restoration Module (GRM) to enhance and refine the HDR regions in the fused result, ultimately generating a high-quality enhanced fringe image. Experimental results show that FGNet can effectively utilize complementary modulation information to improve the restoration effect of HDR regions and achieve high-precision 3D point cloud reconstruction. Compared with traditional multi-exposure fusion methods, the proposed method requires only two sets of input fringe images to obtain high-quality output and achieves higher reconstruction accuracy than existing comparative methods while covering a wide range of exposure conditions.
[0010] Optionally, the training process of the fusion-guided repair network specifically includes: Acquire training data, which includes double-exposure stripe training images and corresponding real labels; Construct an initial fusion-guided repair network and input the training data into it for image enhancement. The network is trained with the goal of minimizing the loss between the enhanced training output and the real label corresponding to the double-exposure stripe training image, which yields the trained fusion-guided repair network.
[0011] Optionally, the processing of the fusion-guided repair network specifically includes: The low-exposure stripe image is input into the low-exposure modulation-aware sub-branch to learn modulation-aware weights. The high-exposure stripe image is input into the high-exposure modulation-aware sub-branch to learn modulation-aware weights. The terminal convolutional layer performs modulation-complementary fusion on the outputs of the two sub-branches to obtain a modulation-complementary feature map. The guided repair module performs targeted enhancement and repair on the low-modulation regions of the modulation-complementary feature map and outputs an enhanced stripe image.
[0012] Optionally, the process of obtaining the modulation complement feature map specifically includes:
[0013] In the formula, 𝐼𝑓𝑢𝑠𝑖𝑜𝑛 is the modulation-complementary feature map, 𝑤1 is the weighting coefficient for the medium-to-high reflectivity regions of the low-exposure stripe image 𝐼𝑢𝑒, 𝑤2 is the weighting coefficient for the low-to-medium reflectivity regions of the high-exposure stripe image 𝐼𝑜𝑒, ⊙ represents the element-wise Hadamard product, and 𝜖 is a small constant used to ensure numerical stability.
[0014] Optionally, the process of acquiring the enhanced stripe image specifically includes:
[0015] In the formula, the left-hand side denotes the output of the i-th upsampling layer of the backbone UNet, that is, the enhanced feature map obtained under the guidance of the extracted guiding information. The remaining terms denote, respectively, the i-th upsampling operation of the backbone UNet, the feature concatenation operation, the output of the i-th downsampling layer of the backbone UNet, the cross-attention mechanism, and the guided features extracted by the guided feature extractor at the i-th layer.
[0016] In a second aspect, to achieve the above objectives, the present invention provides a stripe image enhancement system based on complementary modulation information, comprising: A data acquisition module, used to acquire the double-exposure stripe image to be processed, which includes a high-exposure stripe image and a corresponding low-exposure stripe image. A fusion-guided repair module, used to input the double-exposure stripe image to be processed into a fusion-guided repair network for image enhancement and to output an enhanced stripe image. The fusion-guided repair network includes a modulation-complementary fusion module and a guided repair module connected in sequence. The modulation-complementary fusion module includes a dual-branch structure and a terminal convolutional layer connected in sequence. The dual-branch structure includes a symmetrically arranged low-exposure modulation-aware sub-branch and a high-exposure modulation-aware sub-branch. The guided repair module is built based on the UNet network.
[0017] In a third aspect, to achieve the above objectives, the present invention provides an electronic device, including a memory and a processor, wherein the memory is used to store a computer program, and the processor runs the computer program to cause the electronic device to perform the stripe image enhancement method based on complementary modulation information according to the first aspect.
[0018] In a fourth aspect, to achieve the above objectives, the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the stripe image enhancement method based on complementary modulation information as described in the first aspect.
[0019] The technical effects of this invention are as follows: This invention fully utilizes the complementary modulation characteristics between double-exposure fringe images. First, a fusion module extracts and integrates complementary information from high- and low-exposure images to obtain a fusion result with complementary modulation. Then, a guided repair module is introduced to perform targeted enhancement and fine repair on the low-modulation regions in the fusion result, thereby generating a high-quality enhanced fringe image for subsequent phase calculation and 3D reconstruction.
Brief Description of the Drawings
[0020] The accompanying drawings, which form part of this application, are used to provide a further understanding of this application. The illustrative embodiments and descriptions of this application are used to explain this application and do not constitute an undue limitation of this application. In the drawings:
Figure 1 is a schematic diagram of the overall network structure of FGNet in an embodiment of the present invention;
Figure 2 shows the measurement system and data samples in an embodiment of the present invention, wherein Figure 2(a) is a real structured light FPP measurement system, Figure 2(b) is a Blender-based simulated FPP measurement system, and Figure 2(c) is a dataset example;
Figure 3 is a schematic diagram illustrating the analysis of the ceramic block reconstruction results in an embodiment of the present invention, wherein Figure 3(a) is the ceramic block, Figure 3(b) is a low-exposure stripe image of the ceramic block, Figure 3(c) is a high-exposure stripe image of the ceramic block, Figure 3(d) is the stripe image restored by FGNet, and Figure 3(e) shows the grayscale values extracted from row 128 of the low-exposure, high-exposure, and FGNet-enhanced stripe images of the ceramic block;
Figure 4 shows the comparison of phase cross-section curves obtained by different methods in an embodiment of the present invention, wherein Figure 4(a) is the method proposed in this embodiment, Figure 4(b) is the low-exposure stripe image, Figure 4(c) is the high-exposure stripe image, and Figure 4(d) is the multi-exposure fusion method;
Figure 5 shows the absolute phase error maps obtained by different methods in an embodiment of the present invention, wherein Figure 5(a) is the method proposed in this embodiment, Figure 5(b) is the low-exposure stripe image, and Figure 5(c) is the high-exposure stripe image;
Figure 6 is a schematic diagram illustrating the comparison of the reconstructed point clouds of the ceramic block in an embodiment of the present invention, wherein Figure 6(a) is the low-exposure stripe image, Figure 6(b) is the high-exposure stripe image, Figure 6(c) is the method proposed in this embodiment, and Figure 6(d) is the multi-exposure fusion method;
[0021] Figure 7 is a visualization of reconstruction error and modulation in an embodiment of the present invention, wherein Figures 7(a)–(c) are the reconstruction errors of the method proposed in this embodiment, the low-exposure image, and the high-exposure image, respectively, and Figures 7(d)–(f) are the corresponding modulation maps of the method proposed in this embodiment, the low-exposure image, and the high-exposure image, respectively;
Figure 8 shows the measurement object and its processing result in an embodiment of the present invention, wherein Figure 8(a) is the input low-exposure image, Figure 8(b) is the input high-exposure image, Figure 8(c) is the enhanced image after processing by the method proposed in this embodiment, Figure 8(d) is the grayscale curve of the selected row, and Figure 8(e) is the absolute phase map obtained by the method proposed in this embodiment;
Figure 9 shows point clouds of metal cover plates obtained by different methods in an embodiment of the present invention;
Figure 10 shows point cloud error maps of metal cover plates obtained by different methods in an embodiment of the present invention;
Figure 11 shows point clouds of carbon fiber plates obtained by different methods in an embodiment of the present invention;
Figure 12 shows point cloud error maps of carbon fiber plates obtained by different methods in an embodiment of the present invention;
Figure 13 shows point clouds of glitter surfaces obtained by different methods in an embodiment of the present invention;
Figure 14(a) is a sequence of stripe images under different exposure intensities, and Figure 14(b) shows, from left to right, the point cloud reconstructed by the multi-exposure fusion method and a magnified view of the HDR region point cloud;
Figure 15 shows the reconstruction results of different image pairs under the proposed method in an embodiment of the present invention, wherein Figure 15(a) is the high-exposure stripe image used as input, Figure 15(b) is the low-exposure stripe image used as input, Figure 15(c) is the stripe image enhanced by the proposed method, Figure 15(d) shows the 3D reconstructed point cloud results for each exposure image pair, Figures 15(e) and 15(f) are magnified views of the high dynamic range (HDR) region in the reconstructed point cloud, and Figure 15(g) is the 3D reconstruction error distribution map corresponding to each exposure image pair;
Figure 16 is a comparison of the point cloud reconstruction results of the ablation experiment in an embodiment of the present invention, wherein Figure 16(a) is the point cloud reconstructed using the multi-exposure fusion method and Figure 16(b) is the point cloud reconstructed by the method proposed in this embodiment;
Figure 17 shows the reconstruction error distribution of the ablation experiment in an embodiment of the present invention, wherein Figures 17(b), (c), (d), (e), and (f) are the reconstruction error maps corresponding to the proposed method, the fused image, the fused image enhanced only by UNet, the low-exposure image, and the high-exposure image, respectively;
[0022] Figure 18 is a flowchart illustrating the implementation of an embodiment of the present invention.
Detailed Implementation
[0023] Various exemplary embodiments of the present invention will now be described in detail. This detailed description should not be considered as a limitation of the present invention, but rather as a more detailed description of certain aspects, features, and embodiments of the present invention.
[0024] It should be understood that the terminology used in this invention is merely for describing particular embodiments and is not intended to limit the invention. Furthermore, with respect to numerical ranges in this invention, it should be understood that each intermediate value between the upper and lower limits of the range is also specifically disclosed. Every smaller range between any stated value or intermediate value within a stated range, and any other stated value or intermediate value within said range, is also included in this invention. The upper and lower limits of these smaller ranges may be independently included or excluded from the range.
[0025] Various modifications and variations can be made to the specific embodiments described in this specification without departing from the scope or spirit of the invention, as will be apparent to those skilled in the art. Other embodiments derived from this specification will also be obvious to those skilled in the art. This application specification and embodiments are merely exemplary.
[0026] The terms “include,” “including,” “have,” “contain,” etc., used herein are all open-ended terms, meaning that they include but are not limited to.
[0027] It should be noted that, unless otherwise specified, the embodiments and features described in this application can be combined with each other. This application will now be described in detail with reference to the accompanying drawings and embodiments.
Example
[0028] As shown in Figures 1 to 18, this embodiment provides a stripe image enhancement method based on complementary modulation information, including: acquiring a double-exposure stripe image to be processed, the double-exposure stripe image to be processed including a high-exposure stripe image and a corresponding low-exposure stripe image; inputting the double-exposure stripe image to be processed into a fusion-guided repair network for image enhancement, and outputting an enhanced stripe image; wherein the fusion-guided repair network includes a modulation-complementary fusion module and a guided repair module connected in sequence, the modulation-complementary fusion module includes a dual-branch structure and a terminal convolutional layer connected in sequence, and the dual-branch structure includes a symmetrically arranged low-exposure modulation-aware sub-branch and a high-exposure modulation-aware sub-branch; the guided repair module is built based on the UNet network.
[0029] To address the problems of existing technologies, this embodiment proposes a deep learning framework that explicitly learns pixel-level modulation-aware reliability maps from high-exposure and low-exposure stripe images. This framework combines a modulation-complementary fusion module with a guided repair module, fully utilizing the complementary characteristics of the two images. Experimental results show that the proposed method can effectively utilize modulation complementarity to recover HDR stripe images and achieves a significant improvement in reconstruction accuracy over the compared methods. For example, in experiments with metal cover plates, the proposed method achieves a reconstruction MAE as low as 0.0113 mm, an approximately fivefold improvement in accuracy over the other methods.
[0030] The specific implementation process of this embodiment is as follows: Fringe projection profilometry: A projector projects phase-shifted sinusoidal fringe patterns onto the surface of the object being measured, while a camera acquires the fringe images modulated by the object's surface. The acquired fringe images are then phase-demodulated, and the object's surface height is calculated using a phase-height mapping model, ultimately generating a reconstructed point cloud. The projected fringe image can be represented as:
[0031] The acquired deformed stripe image is as follows:
[0032] Here, the two coordinate pairs denote the pixel coordinates of the projector and of the camera, respectively. The two intensity terms denote the intensity of the n-th fringe image on the projector side and on the camera side, the two background terms denote the background intensity, and the period term is the width of one fringe period. This embodiment uses a four-step phase-shift method (4 phase-shift steps) with a fringe frequency of 64. The two modulation terms denote the modulation intensity. The formula for calculating the wrapped phase is as follows:
[0033] After the wrapped phase is calculated, the absolute phase is solved, and the surface is reconstructed using a phase-height mapping model.
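As a minimal sketch of the four-step phase-shift demodulation described above, the following Python snippet synthesizes four fringe images and recovers the wrapped phase and the modulation. The notation is an assumption, since the equations themselves are not reproduced in this text: the n-th image is modelled as A + B·cos(φ + 2πn/4), where A is the background intensity and B the modulation intensity. All function names are illustrative only.

```python
import numpy as np

def make_fringes(phase, background=0.5, modulation=0.4, steps=4):
    """Synthesize phase-shifted images I_n = A + B*cos(phase + 2*pi*n/steps) (assumed model)."""
    shifts = 2.0 * np.pi * np.arange(steps) / steps
    return np.stack([background + modulation * np.cos(phase + s) for s in shifts])

def wrapped_phase_4step(images):
    """Wrapped phase in (-pi, pi] from four images shifted by pi/2 each."""
    i0, i1, i2, i3 = images
    # With the model above: i3 - i1 = 2B*sin(phase), i0 - i2 = 2B*cos(phase).
    return np.arctan2(i3 - i1, i0 - i2)

def modulation_4step(images):
    """Fringe modulation B recovered from the same four images."""
    i0, i1, i2, i3 = images
    return 0.5 * np.sqrt((i3 - i1) ** 2 + (i0 - i2) ** 2)

# A phase ramp that stays inside (-pi, pi), so no unwrapping is needed here.
true_phase = np.linspace(-3.0, 3.0, 256)
images = make_fringes(true_phase)
estimated = wrapped_phase_4step(images)
recovered_modulation = modulation_4step(images)
```

In a real measurement the wrapped phase would still need to be unwrapped to an absolute phase before the phase-height mapping is applied; that step is outside this sketch.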
[0034] Network Structure: Figure 1 illustrates the overall network structure of the proposed FGNet. This network employs a two-stage design, first global and then local, for reconstructing high dynamic range (HDR) stripe images. The first stage is the Modulation-Complementary Fusion Module (MCFM), which fuses information from low-exposure stripe images (𝐼𝑢𝑒) and high-exposure stripe images (𝐼𝑜𝑒) through a specially designed mechanism to generate a high-fidelity initial feature map. Specifically, this stage uses a symmetrical bi-branch structure to directly learn modulation-aware weight maps 𝑤1 and 𝑤2. These weight maps serve as spatial reliability indicators, characterizing the effective modulation of stripes in the low-exposure and high-exposure input stripe images.
[0035] By identifying and enhancing high-modulation regions (specifically, high-reflectivity regions in the low-exposure stripe image 𝐼𝑢𝑒 and low-reflectivity regions preserved in the high-exposure stripe image 𝐼𝑜𝑒), the network can collaboratively compensate for the loss of stripe information due to local saturation and low signal-to-noise ratio. In Figure 1, the corresponding weight distribution is visualized as heatmaps 𝑤1 and 𝑤2.
[0036] It is important to emphasize that the weight coefficients 𝑤1 and 𝑤2 are not preset based on exposure parameters, but are adaptively learned from fringe modulation-related features in a completely data-driven manner. In fringe projection profilometry, the fringe modulation itself reflects the combined effects of exposure variations, surface reflectivity, and noise interference. Therefore, by learning a pixel-level modulation confidence map, the proposed modulation-complementary fusion module (MCFM) enables the network to implicitly adapt to different exposure conditions and object surface characteristics, selectively emphasizing the input fringe image with higher modulation fidelity at each spatial location.
[0037] For input stripe images 𝐼𝑢𝑒 and 𝐼𝑜𝑒 with dimensions of 4 × 256 × 256, the modulation-complementary fusion can be expressed as:
[0038] Here, ⊙ represents the element-wise Hadamard product, and 𝜖 is used to ensure numerical stability. The resulting 𝐼𝑓𝑢𝑠𝑖𝑜𝑛 encapsulates the comprehensive feature representation optimized by the modulation.
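The fusion step can be illustrated with a short NumPy sketch. The exact expression is not reproduced in this text, so the form used here is an assumption: a pixel-wise weighted sum of the two inputs, normalized by the sum of the weights, with 𝜖 guarding against division by zero. The hand-made weight maps below merely stand in for the learned maps 𝑤1 and 𝑤2, and the function name is hypothetical.

```python
import numpy as np

def modulation_complementary_fusion(i_ue, i_oe, w1, w2, eps=1e-6):
    """Pixel-wise weighted fusion of low- and high-exposure fringe stacks.

    Assumed form (not confirmed by the source): a weighted sum of the two
    inputs, normalized by the sum of the weights; eps keeps the division
    stable where both weights vanish.
    """
    return (w1 * i_ue + w2 * i_oe) / (w1 + w2 + eps)

rng = np.random.default_rng(0)
i_ue = rng.random((4, 256, 256))  # low-exposure stack, 4 phase-shift channels
i_oe = rng.random((4, 256, 256))  # high-exposure stack, 4 phase-shift channels

# Stand-in weights: trust the low-exposure image on the left half of the
# frame and the high-exposure image on the right half.
w1 = np.zeros((1, 256, 256))
w1[..., :128] = 1.0
w2 = 1.0 - w1

i_fusion = modulation_complementary_fusion(i_ue, i_oe, w1, w2)
```

With these binary weights the fused stack simply copies each input in the region where it is trusted; learned, continuous weights would blend the two inputs smoothly.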
[0039] Structurally, MCFM utilizes convolutional layers for basic feature extraction, followed by a multi-scale pooling strategy (combining average pooling and max pooling) to obtain multi-scale modulation-aware features. These features are further refined through upsampling and convolutional blocks, and residual connections are introduced into the network to mitigate feature degradation. Finally, terminal convolutional layers generate pixel-level modulation weights to drive the complementary fusion process of the striped image.
[0040] The complementary modulation feature map (CMM) provides a comprehensive characterization that preserves core fringe information from both low- and high-exposure fringe images. However, although CMM effectively integrates data from different sources, it may still contain residual modulation distortion or exposure unevenness biases introduced by the sensor's nonlinear response. To correct these artifacts, this embodiment introduces a Guided Restoration Module (GRM).
[0041] GRM employs a UNet backbone network to perform global modulation normalization and stripe detail optimization on the stripe images, thereby enhancing the overall structural fidelity of the stripe image. Simultaneously, a guided feature extractor is introduced to extract spatial prior information from the high-exposure image 𝐼𝑜𝑒. Although local regions may be saturated, the high-exposure image retains richer global illumination and structural information. Unlike the initial fusion, these guided features are integrated into the UNet backbone network through an inter-layer cross-attention mechanism. This method adaptively fuses high-resolution contextual information from the source image into the decoded features and injects it layer by layer into the upsampling stage to guide stripe restoration.
[0042] This layered injection mechanism effectively corrects the complementary modulation features to the optimal exposure baseline while restoring the stripe contrast in low-modulation regions. The detailed network structure is shown in Figure 1, and the guided restoration process can be expressed mathematically as:
[0043] Here, the left-hand side denotes the output of the i-th upsampling layer of the backbone UNet, that is, the enhanced feature map obtained under the guidance of the extracted guiding information. The remaining terms denote, respectively, the i-th upsampling operation, the feature concatenation operation, the output of the i-th downsampling layer of the backbone UNet, the cross-attention mechanism, and the guided features extracted by the guided feature extractor at the i-th layer. It should be noted that the designed guided feature extractor consists only of downsampling paths.
[0044] The proposed method employs multi-head depthwise-convolution transposed attention, the operation of which can be represented as follows:
[0045] Here, 𝑥 and 𝑦 represent the two input feature maps of the attention module. The operators in the formula denote, respectively, a 1×1 convolution, layer normalization LN(·), a 3×3 depthwise separable convolution, and a feature dimension rearrangement operation. 𝑄, 𝐾, and 𝑉 represent the query, key, and value feature maps, respectively, each with dimensions 𝐶×𝐻×𝑊. 𝑄ˆ, 𝐾ˆ, and 𝑉ˆ represent the rearranged query, key, and value, with dimensions 𝐻𝑊×𝐶, 𝐶×𝐻𝑊, and 𝐻𝑊×𝐶, respectively. The output of the attention module has dimensions 𝐶×𝐻×𝑊. A learnable scaling parameter is used to enhance training stability.
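To make the tensor shapes concrete, here is a simplified single-head NumPy sketch of transposed (channel-wise) cross-attention using the dimensions stated above. It rests on several assumptions that the text does not confirm: the 1×1 and depthwise convolutions and the layer normalization are omitted, the query is taken from 𝑥 and the key and value from 𝑦, the scores are divided by a fixed scalar `alpha` in place of the learnable scaling parameter, and the softmax normalizes over the key channels.

```python
import numpy as np

def softmax(z, axis):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def transposed_cross_attention(x, y, alpha=1.0):
    """Channel-wise cross-attention between two (C, H, W) feature maps.

    Assumption: x supplies the query, y supplies the key and the value.
    The attention map is C x C instead of HW x HW, so its cost grows with
    the number of channels rather than with the number of pixels.
    """
    c, h, w = x.shape
    q_hat = x.reshape(c, h * w).T          # HW x C
    k_hat = y.reshape(c, h * w)            # C x HW
    v_hat = y.reshape(c, h * w).T          # HW x C
    scores = (k_hat @ q_hat) / alpha       # C x C, rows index keys, columns index queries
    attn = softmax(scores, axis=0)         # each query channel sums to 1 over key channels
    out = v_hat @ attn                     # HW x C
    return out.T.reshape(c, h, w)          # back to C x H x W

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))
y = rng.standard_normal((8, 16, 16))
out = transposed_cross_attention(x, y)
```

Because every output channel is a convex combination of the value channels, the output at each pixel stays within the range spanned by the channels of `y` at that pixel.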
[0046] Loss Function: To ensure that the fused and enhanced striped image retains both sinusoidal characteristics and structural consistency, the loss function design includes MAE loss, SSIM loss, and cosine similarity loss. Specifically, SSIM loss constrains the structural consistency between the reconstructed result and the reference image while maintaining local contrast; cosine similarity loss constrains the consistency of the overall intensity distribution, reducing the impact of intensity scale variations caused by different exposure levels. The combined use of these two losses provides complementary supervision for local structural fidelity and global intensity distribution. Furthermore, to ensure that the grayscale distribution of the final enhanced image is within a reasonable exposure range, a background intensity constraint loss is introduced. The definitions of each loss term are as follows:
[0047] Here, the two mean terms are the means of the two images being compared, the two variance terms are their variances, and the covariance term is the covariance between them.
[0048]
[0049] Here, the index in the formula is the channel index. Since this algorithm is based on four-step phase-shift fringes, the number of channels is 4. The final enhanced output of the proposed network and the corresponding ground truth each have dimensions 4×256×256.
[0050] The total loss is:
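The loss terms named above can be sketched in NumPy as follows. The definitions and weights are not reproduced in this text, so everything here is an assumption for illustration: SSIM is computed over a single global window with the common constants for intensities in [0, 1], the background intensity is taken as the mean over the four phase-shift channels, and all four terms are weighted equally.

```python
import numpy as np

def mae_loss(pred, target):
    """Mean absolute error over all channels and pixels."""
    return np.mean(np.abs(pred - target))

def ssim_loss(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - SSIM with one global window (a simplification of windowed SSIM)."""
    mu_p, mu_t = pred.mean(), target.mean()
    var_p, var_t = pred.var(), target.var()
    cov = np.mean((pred - mu_p) * (target - mu_t))
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2))
    return 1.0 - ssim

def cosine_loss(pred, target, eps=1e-8):
    """1 - cosine similarity of the flattened images (insensitive to intensity scale)."""
    p, t = pred.ravel(), target.ravel()
    return 1.0 - np.dot(p, t) / (np.linalg.norm(p) * np.linalg.norm(t) + eps)

def background_loss(pred, target):
    """MAE between background intensities, assumed to be the mean over the 4 channels."""
    return np.mean(np.abs(pred.mean(axis=0) - target.mean(axis=0)))

def total_loss(pred, target, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four terms; equal weights are a placeholder."""
    terms = (mae_loss(pred, target), ssim_loss(pred, target),
             cosine_loss(pred, target), background_loss(pred, target))
    return sum(w * t for w, t in zip(weights, terms))

rng = np.random.default_rng(0)
target = rng.random((4, 64, 64))
pred = np.clip(target + 0.05 * rng.standard_normal(target.shape), 0.0, 1.0)
loss_noisy = total_loss(pred, target)
loss_perfect = total_loss(target, target)
```

A training implementation would use differentiable tensor operations in a deep learning framework; this version only shows how the terms relate to one another.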
[0051] Dataset: The dataset constructed in this embodiment consists of two parts: one part is simulated data generated using the digital twin technology of the Blender platform, and the other part is real data collected in the actual measurement environment, with corresponding labels generated using a multi-exposure fusion method. The simulated dataset contains 1,215 samples, and the real dataset contains 1,000 samples, covering typical and extreme exposure conditions. The Blender virtual image acquisition system and the real measurement system are shown in Figures 2(a) and (b), respectively. Both systems consist of orthogonally arranged projectors and cameras. In the real measurement system, a DLP LightCrafter 4500 projector with a resolution of 912 × 1140 and a CMOS camera (MER2-502-79U3M) with a resolution of 2448 × 2048 are used. The 3D model used for the simulated data comes from the Thingi10K dataset, while the real data is collected from metal parts with high dynamic range surfaces. Figure 2(c) shows examples of simulated and real data and their corresponding labels (the first row is the synthetic dataset, and the second row is the data collected in the real measurement scenario. From left to right, they are low-exposure images, high-exposure images, and their corresponding real labels). Each data sample consists of a pair of high-exposure and low-exposure striped images and their corresponding labels. Each image is 4 × 256 × 256 pixels (four-step phase-shift grayscale stripe pattern), and the stripe frequency is set to 64.
[0052] To verify the effectiveness of the proposed method, the experiments were conducted on an NVIDIA GeForce RTX 4090 GPU using the PyTorch framework. The network was trained for 200 epochs using the Adam optimizer with an initial learning rate of 2e-4. The experiments were organized as follows: The reconstruction results of the ceramic block were analyzed to evaluate the ability of the proposed method to recover the sinusoidal characteristics of the fringe image.
[0053] Comparative experiments were conducted with recent representative deep learning methods to evaluate the performance of the proposed method in HDR region restoration.
[0054] The generalization ability of the proposed method across input pairs with different exposure differences was evaluated.
[0055] Ablation experiments were conducted to verify the effectiveness of each component in the proposed network structure.
[0056] Analysis of Reconstruction Results Using Ceramic Blocks: To evaluate the ability of the proposed method to recover the sinusoidal characteristics of fringe images, a 3D reconstruction experiment was conducted using ceramic blocks as the test object. The surface height of the ceramic block was 3 mm, and the measurement error tolerance was ±1 mm. The fringe images and their corresponding phase results were analyzed in the experiment. Due to the high reflectivity of the ceramic surface, overexposure is prone to occur during the measurement process, making it a typical sample for verifying the effectiveness of the proposed method.
[0057] Figure 3(d) shows the fringe image enhanced and restored using the proposed method. To evaluate the restoration of the fringe sinusoidal characteristics, the grayscale values of row 128 (shown by the red line in Figure 3(d)) were extracted from the low-exposure, high-exposure, and enhanced images for comparative analysis, and their grayscale curves are shown in Figure 3(e). It can be observed that in the high-exposure image, the fringe peaks are severely truncated due to overexposure; while in the low-exposure image, the fringe valleys are truncated to zero because the intensity is too low to be effectively captured by the camera. After enhancement using the proposed method, the grayscale values of the fringe image are restored to the normal exposure range, and the sinusoidal characteristics of the fringes are preserved.
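The two truncation effects described above (clipped peaks under high exposure, valleys forced to zero under low exposure) can be reproduced with a toy camera model. The sketch below is illustrative only: the linear gain, the dark-level offset `dark`, the fringe period, and the normalized sensor range [0, 1] are assumptions and do not describe the actual camera of the embodiment.

```python
import numpy as np

def capture_row(gain, width=256, period=16, dark=0.05):
    """Simulate one image row of a sinusoidal fringe (hypothetical sensor model).

    The ideal signal gain * (0.5 + 0.4 cos) is reduced by a dark-level
    offset and then clipped to the normalized sensor range [0, 1].
    """
    x = np.arange(width)
    ideal = gain * (0.5 + 0.4 * np.cos(2.0 * np.pi * x / period))
    return np.clip(ideal - dark, 0.0, 1.0)

low_row = capture_row(gain=0.3)    # valleys fall below the dark level
high_row = capture_row(gain=2.0)   # peaks exceed the sensor range

print("low exposure : min =", low_row.min(), " max =", round(float(low_row.max()), 3))
print("high exposure: min =", round(float(high_row.min()), 3), " max =", high_row.max())
print("saturated fraction (high exposure):", float(np.mean(high_row >= 1.0)))
```

In this model the low-exposure row keeps its peaks but loses its valleys, and the high-exposure row keeps its valleys but loses its peaks, which is the complementary behavior the grayscale curves illustrate.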
[0058] Figure 4 shows the absolute phase curves along the red line (the areas within the red and yellow boxes show magnified views of the phase curves). Figures 4(b) and (c) show the absolute phase curves of the underexposed and overexposed images, respectively, where phase retrieval errors result in a sawtooth phase distribution. Figure 4(a) shows the phase curve after processing by the proposed method. After FGNet enhancement, the absolute phase curve recovers a smooth, increasing trend and is highly consistent with the phase curve obtained by the multi-exposure fusion method (Figure 4(d)), indicating that the proposed method can effectively improve the phase accuracy in the HDR region. Figure 5 shows the absolute phase error map. Both the underexposed image (Figure 5(b)) and the overexposed image (Figure 5(c)) exhibit step-like phase errors, indicating phase retrieval failures in the HDR region. This phenomenon stems from severe intensity saturation or signal loss in these regions, which lowers the fringe modulation and makes the wrapped phase estimate unreliable. The accumulated phase retrieval errors manifest as discontinuous step-like structures and uneven error distributions. In contrast, the phase error map of the proposed method (Figure 5(a)) shows a significant reduction in phase error.
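To make the link between saturation and phase error concrete, the following sketch applies the standard four-step phase-shifting formula to an ideal fringe stack and to a clipped copy of it. The frame model I_n = A + B cos(φ + nπ/2), the gain of 1.5, and the clipping level of 1 are assumptions chosen for illustration; the source does not state the phase-shift convention it uses.

```python
import numpy as np

def wrapped_phase(stack):
    """Wrapped phase for four frames I_n = A + B*cos(phi + n*pi/2), n = 0..3.

    With this convention I3 - I1 = 2B*sin(phi) and I0 - I2 = 2B*cos(phi).
    """
    i0, i1, i2, i3 = stack
    return np.arctan2(i3 - i1, i0 - i2)

phi = np.linspace(-3.0, 3.0, 601)                  # true phase, inside (-pi, pi)
shifts = np.arange(4)[:, None] * np.pi / 2.0
ideal = 0.5 + 0.4 * np.cos(phi[None, :] + shifts)  # unsaturated frames
saturated = np.clip(1.5 * ideal, 0.0, 1.0)         # hypothetical overexposure

err_ideal = np.abs(wrapped_phase(ideal) - phi).max()
err_saturated = np.abs(wrapped_phase(saturated) - phi).max()
print("max phase error, ideal frames    :", err_ideal)
print("max phase error, saturated frames:", err_saturated)
```

The ideal frames return the true phase up to floating-point error, while the clipped frames do not, which is the mechanism behind the unreliable wrapped phase described above.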
[0059] In this embodiment, a 3D reconstruction of the ceramic block was performed, and the resulting point cloud is shown in Figure 6. The reconstruction results from both underexposed and overexposed stripe images exhibited step-like and jagged artifacts. In contrast, the proposed method (Figure 6(c)) can completely reconstruct the HDR surface, and its results are comparable to those of the multi-exposure fusion method (Figure 6(d)).
[0060] Figure 7(a-c) shows the reconstruction error maps, with the proposed method achieving an MAE of 0.0034 mm and an RMSE of 0.0072 mm. In contrast, the point cloud error distribution of the low-exposure and high-exposure images is uneven, mainly because local intensity saturation or insufficient illumination in HDR regions degrades fringe quality, makes phase estimation unreliable, and in turn causes reconstruction errors. The modulation maps of the low-exposure and high-exposure images (Figure 7(d-f)) exhibit significant complementarity: under low exposure, the modulation of high-reflectivity regions is higher, while under high exposure, the modulation of low-reflectivity regions dominates. The proposed method can effectively fuse these complementary high-modulation regions and obtain a globally enhanced modulation map.
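The complementarity of the two modulation maps can be checked numerically with a small simulation. The sketch below uses the standard four-step modulation estimate; the scene (a dark left half and a bright right half), the two gains, and the clipping sensor model are hypothetical values chosen so that the bright half saturates completely under high exposure.

```python
import numpy as np

def modulation(stack):
    """Fringe modulation B for four frames I_n = A + B*cos(phi + n*pi/2)."""
    i0, i1, i2, i3 = stack
    return 0.5 * np.sqrt((i3 - i1) ** 2 + (i0 - i2) ** 2)

def capture(reflectivity, gain, phase):
    """Hypothetical camera: scaled fringe clipped to the range [0, 1]."""
    shifts = np.arange(4).reshape(4, 1, 1) * np.pi / 2.0
    ideal = gain * reflectivity * (0.6 + 0.4 * np.cos(phase + shifts))
    return np.clip(ideal, 0.0, 1.0)

h, w = 8, 64
phase = np.tile(2.0 * np.pi * np.arange(w) / 16.0, (h, 1))
# Left half: low reflectivity (0.1). Right half: high reflectivity (1.0).
reflectivity = np.where(np.arange(w) < w // 2, 0.1, 1.0) * np.ones((h, 1))
bright = reflectivity > 0.5

mod_low = modulation(capture(reflectivity, 0.8, phase))    # low exposure
mod_high = modulation(capture(reflectivity, 6.0, phase))   # high exposure

print("bright region: low-exp", mod_low[bright].mean(), "high-exp", mod_high[bright].mean())
print("dark region  : low-exp", mod_low[~bright].mean(), "high-exp", mod_high[~bright].mean())
```

In this toy scene the low-exposure capture carries the usable modulation in the bright half and the high-exposure capture carries it in the dark half.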
[0061] The experimental results above show that the proposed method can effectively recover HDR stripe images, reconstruct the sinusoidal characteristics of the stripes, and achieve complete three-dimensional reconstruction based on the enhanced stripe images.
[0062] Comparative Experiments: To evaluate the performance of the proposed method, comparative experiments were conducted, selecting recent deep learning-based HDR surface reconstruction algorithms as controls. The reconstruction results obtained through multi-exposure fusion were used as the reference standard (Label), and evaluation metrics included MAE, RMSE, and standard deviation (SD). Specific comparison methods included DC-UNet, HDR-Net, UNN, and Y-FFC. Except for the UNN method, all other methods were trained and tested using the same input configuration.
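For reference, the three metrics named above can be computed from a reconstructed height map and its label as in the sketch below. The source does not define SD precisely; it is taken here as the standard deviation of the signed error, and missing points are assumed to be encoded as NaN. Both choices, as well as the function name, are assumptions.

```python
import numpy as np

def reconstruction_metrics(pred, label, valid=None):
    """Return (MAE, RMSE, SD) of pred - label over valid, finite points.

    SD is the standard deviation of the signed error (an assumption),
    so that RMSE**2 == mean_error**2 + SD**2.
    """
    err = np.asarray(pred, dtype=float) - np.asarray(label, dtype=float)
    if valid is not None:
        err = err[np.asarray(valid, dtype=bool)]
    err = err[np.isfinite(err)]            # drop missing points stored as NaN
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    sd = err.std()
    return mae, rmse, sd

label = np.zeros(4)
pred = np.array([0.01, -0.01, 0.03, np.nan])   # last point is missing
mae, rmse, sd = reconstruction_metrics(pred, label)
print(f"MAE={mae:.4f} mm  RMSE={rmse:.4f} mm  SD={sd:.4f} mm")
```

Because points missing from the reconstruction are excluded, a metric computed this way should be read together with the completeness of the point cloud.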
[0063] In this embodiment, sequins, metal cover plates, and carbon fiber plates were selected as reconstruction objects to compare the performance of different methods. As shown in Figure 8, the surface reflectivity of these three objects varies greatly: when the dark areas are properly exposed, the bright areas are severely overexposed; while when the bright areas are properly exposed, the dark areas are underexposed.
[0064] Figure 9 shows the reconstruction results of the metal cover plate. In underexposed areas, all comparison methods showed good recovery results; however, in overexposed areas, the reconstructed point cloud exhibited varying degrees of fluctuation or missing points. Specifically, Y-FFC could detect high-reflectance areas, but the separated diffuse reflection components had large errors, resulting in missing points in these areas; DC-UNet had a certain suppression effect on overexposed areas, but point cloud fluctuations were still obvious; the UNN method fitted overexposed areas through mathematical relationships, but the performance of a single constraint was limited in the case of large-area overexposure, resulting in random fluctuations and missing points in the point cloud; HDR-Net generated more complete and smoother point clouds in these areas, but point cloud fluctuations and streaking artifacts still existed. In contrast, the proposed method fused high and low exposure stripe information in complementary areas and used a guided recovery module to enhance and refine stripe features, thereby obtaining HDR region reconstruction results that were closer to the labeled point cloud.
[0065] Figure 10 shows the point cloud error map of the metal cover plate. The error distribution indicates that the reconstruction error of the comparison methods is mainly concentrated in the overexposed areas. In contrast, the proposed method exhibits the best recovery performance in these overexposed areas.
[0066] Table 1 summarizes the quantitative indicators. The MAE of the proposed method is only 0.0113 mm, an accuracy improvement of about five times over the best-performing comparison methods, Y-FFC and DC-UNet.
[0067] Table 1 Comparison of quantitative results for metal cover plates
[0068] The reconstruction comparison results of the carbon fiber plate are shown in Figure 11. It can be observed that UNN exhibits large-area point cloud loss in underexposed areas, while DC-UNet, Y-FFC, and HDR-Net also show point cloud loss in these areas. Although these methods generate relatively complete point clouds in overexposed areas, they fail to accurately reconstruct the subtle surface defects of the carbon fiber plate (the indentations in the overexposed areas of the label). In contrast, the proposed method achieves better reconstruction results in terms of overall quality and detail preservation, demonstrating its ability to effectively recover HDR areas while maximizing the preservation of detail information in the source image.
[0069] The reconstruction error map of the carbon fiber plate is shown in Figure 12. The error distribution shows that existing methods exhibit significant errors in both overexposed and underexposed areas, indicating limited ability to simultaneously recover these two types of areas. The proposed method, however, can effectively reconstruct both underexposed and overexposed areas while preserving surface texture details. A summary of the quantitative reconstruction indices for the carbon fiber plate is shown in Table 2. The proposed method achieves the lowest values across all indices: MAE 0.0227 mm, RMSE 0.0374 mm, and standard deviation 0.0297 mm. Compared to the comparative methods, the reconstruction accuracy is approximately twice as high.
[0070] Table 2 Comparison of quantitative results for carbon fiber plates
[0071] The reconstructed point cloud of the sequin surface is shown in Figure 13. Although the comparison methods can reconstruct the HDR region, significant processing artifacts still exist on complex surfaces, typically manifesting as streaks. In particular, the reconstruction results of DC-UNet and HDR-Net exhibit obvious striping effects on the surface, which severely affect the reconstruction accuracy and indicate that existing methods are insufficient in terms of fidelity and detail preservation. In contrast, the proposed method uses a guided recovery module to perform stripe contrast correction and exposure compensation on the fusion result while simultaneously refining the stripe image. This better preserves the stripe characteristics and texture details in the enhanced image and generates a more complete and smoother point cloud, with results close to those of the multi-exposure fusion method. The quantitative indicators are summarized in Table 3, with MAE reduced to 0.0045 mm and RMSE reduced to 0.0178 mm, further demonstrating the advantages of the proposed method in the reconstruction of complex HDR surfaces.
[0072] Table 3 Comparison of quantitative reconstruction results of the sequin surface
[0073] Image Pair Exposure Difference Analysis: This embodiment designed an exposure difference analysis experiment to evaluate the generalization ability of the proposed method for low-exposure/high-exposure image pairs under different exposure conditions. In the experiment, the camera exposure time was fixed at 50 ms, the projector brightness was set to grayscale values of H = 50, 80, 110, 140, 170, 200, and 230 (increment ΔH = 30), and striped images were acquired sequentially. In this exposure sequence, the low-exposure image was well imaged in high-reflectivity areas but underexposed in low-reflectivity areas; the high-exposure image was normally imaged in low-reflectivity areas but overexposed in high-reflectivity areas.
[0074] Furthermore, in all image pairs, the striped image with H = 50 is fixed as the low-exposure input, and the remaining images are used in turn as high-exposure inputs, thereby evaluating the generalization performance of the proposed method under different exposure difference conditions.
[0075] The reconstruction results of the different image pairs are shown in Figure 15. For image pairs with different exposure differences, the proposed method can effectively guide the fringe pattern into the normal exposure range while recovering fringe information in both low-reflectivity and high-reflectivity regions, as shown in Figure 15(c). Figures 15(d), (e), and (f) show the corresponding reconstructed point clouds and magnified views of the high-reflectivity regions, respectively. For all image pairs, the proposed method can achieve complete reconstruction of both normal-exposure and HDR regions, and no significant differences were observed between the reconstructed point clouds obtained under different exposure conditions. Figure 15(g) shows the reconstruction error distribution relative to Figure 14(b), indicating that the proposed method maintains high reconstruction accuracy across the different image pairs. Table 4 presents the quantitative reconstruction results for image pairs with different exposures. No significant abrupt changes occurred in any of the quantitative indicators under the different exposure combinations. Specifically, the maximum change in MAE was only 0.0048 mm, and the maximum changes in RMSE and standard deviation were 0.0112 mm and 0.0101 mm, respectively. Notably, when the exposure difference (ΔH) exceeds 90, all indicators remain basically stable. These results demonstrate that the proposed method maintains consistent and reliable reconstruction accuracy under different exposure difference conditions, verifying its good generalization ability for image pairs with different exposure differences.
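The image pairs used in this experiment follow directly from the brightness sequence: the level 50 is always the low-exposure input and each remaining level serves once as the high-exposure input. A short enumeration, with variable names chosen only for illustration, makes the resulting exposure differences explicit.

```python
# Projector grayscale levels from the text: 50 to 230 in steps of 30.
levels = list(range(50, 231, 30))
low_level = levels[0]                     # fixed low-exposure input

# (low level, high level, exposure difference) for every image pair.
pairs = [(low_level, high, high - low_level) for high in levels[1:]]
for low, high, diff in pairs:
    print(f"low={low:3d}  high={high:3d}  difference={diff:3d}")

# Pairs whose exposure difference exceeds 90.
large_difference = [p for p in pairs if p[2] > 90]
print(len(pairs), "pairs in total,", len(large_difference), "with difference > 90")
```

This yields six pairs with differences from 30 to 180, three of which lie above the threshold of 90 mentioned in the text.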
[0076] Table 4 Reconstruction performance metrics for different image pairs
[0077] Ablation Experiment: To verify whether the proposed method can effectively utilize the complementary information between high- and low-exposure image pairs, and to evaluate the contribution of the guided restoration mechanism, the following ablation settings were designed in this embodiment: Removing the guided restoration module: stripe restoration and 3D reconstruction are performed using only the fusion result of the high- and low-exposure images, without introducing a guided restoration mechanism to further correct and enhance stripe contrast, modulation, and local details.
[0078] Removing the guided feature extractor: In this setting, guided information is not used for feature extraction. Instead, a standard UNet is used to enhance and restore the fusion result, thereby evaluating the impact of guided features on HDR stripe restoration and reconstruction performance.
[0079] A metal cover plate was selected as the reconstruction test object. The comparative experiments included: reconstruction results based on low-exposure images, reconstruction results based on high-exposure images, reconstruction results based on fused images, reconstruction results after enhancing the fused results using only UNet, and reconstruction results obtained using the method proposed in this embodiment.
[0080] Figure 16(e) shows the reconstruction result based solely on the low-exposure image. It can be seen that high-reflectivity regions can be reconstructed well, while low-reflectivity regions, due to underexposure and low fringe modulation, show significant reconstruction failure. Figure 16(f) shows the reconstruction result based solely on the high-exposure image. In this case, the low-reflectivity regions are normally exposed, and the point cloud reconstruction is relatively complete. However, the high-reflectivity regions are severely overexposed, leading to fringe saturation and obvious reconstruction gaps. Figure 16(c) shows the reconstruction result of the fused image. It can be observed that the point cloud in the high-reflectivity regions is basically complete, and the gaps in the low-reflectivity regions are significantly reduced compared to the low-exposure result, indicating that the fusion module can integrate the complementary modulation information from the high and low-exposure image pairs to a certain extent. However, residual point cloud gaps still exist in the low-modulation regions, indicating that relying solely on the fusion stage is insufficient to completely recover the severely degraded fringe information. Figure 16(d) shows the reconstruction result after enhancing the fusion result using standard UNet. Compared to using only the fused image, the overall quality of the point cloud is further improved in both high and low reflectivity regions. However, significant outliers and missing points are still observed in the low reflectivity region, indicating that UNet's recovery capability remains limited in the absence of explicit guiding information. Figure 16(b) shows the reconstruction results of the proposed method in this embodiment. 
By extracting guiding features from the high-exposure image, the guided inpainting module can effectively guide the fused image to the normal exposure range and further refine the stripe details in the HDR region, especially significantly improving the point cloud quality in the low reflectivity region. The above results show that the proposed method can not only fully exploit the complementary information between high and low exposure image pairs to achieve effective recovery of the HDR region, but also perform global enhancement and local fine-tuning of the fusion result. Compared with the fusion result enhancement method based solely on UNet, the method in this embodiment achieves a more accurate and complete point cloud reconstruction effect in the HDR region.
[0081] Figure 17(e) shows the reconstruction error distribution of the low-exposure image. It can be seen that the error is mainly concentrated in the low reflectivity region, due to underexposure leading to a decrease in fringe modulation, thus causing reconstruction failure. High reflectivity regions, under normal exposure conditions, can be correctly reconstructed with smaller errors. Figure 17(f) shows the reconstruction error distribution of the high-exposure image. Here, the error is mainly distributed in the high reflectivity region, where overexposure causes severe saturation of fringe information. Low reflectivity regions, due to normal exposure, have relatively small reconstruction errors. Figure 17(c) shows the reconstruction error of the fused image. The results indicate that the fusion module can extract and integrate complementary information from the high and low exposure image pairs to a certain extent, but significant reconstruction errors still exist in the low reflectivity region. This suggests that relying solely on the fusion stage is insufficient to fully recover the severely degraded fringe information, and further guided enhancement and refinement are still needed. Figure 17(d) shows the reconstruction error after enhancing the fused image using standard UNet. Compared to Figure 17(c), the error in the low reflectivity region is reduced, indicating that UNet has a certain recovery capability. However, significant residual errors remain, indicating that UNet struggles to accurately recover severely degraded regions in the absence of explicit guidance information.
[0082] Figure 17(b) shows the reconstruction error map of the proposed method. Its error distribution is highly consistent with the error characteristics of the normally exposed areas in low-exposure and high-exposure images. The overall reconstruction error is significantly reduced, and the reconstruction result is closer to the reference label. The above results further demonstrate that the proposed method can not only effectively fuse complementary information in high- and low-exposure image pairs, but also effectively enhance and finely restore HDR regions with the help of a guided inpainting mechanism.
[0083] Table 5 Quantitative Results of Ablation Experiments
[0084] As shown in Table 5, both low-exposure and high-exposure images exhibit significant reconstruction errors due to the presence of high dynamic range (HDR) regions. By first fusing complementary information from the high- and low-exposure image pairs, the RMSE is reduced to 0.0669 mm. Enhancing the fusion result using only UNet reduces the RMSE slightly further, to 0.0665 mm. In contrast, the proposed method achieves superior reconstruction performance, with the RMSE reduced to 0.0591 mm and the lowest MAE (0.0113 mm), indicating a significant improvement in reconstruction accuracy. These numerical results demonstrate that each module in the proposed network plays a positive role in the high-precision 3D reconstruction of HDR surfaces.
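The relative size of these gains can be read off with a short calculation that uses only the three RMSE values quoted above; the dictionary keys are descriptive labels chosen for illustration, not names from Table 5.

```python
# RMSE values (mm) quoted in the text for the metal cover plate ablation.
rmse_mm = {
    "fusion only": 0.0669,
    "fusion + plain UNet": 0.0665,
    "proposed method": 0.0591,
}

baseline = rmse_mm["fusion only"]
# Relative RMSE reduction with respect to the fusion-only result, in percent.
reduction_pct = {
    name: 100.0 * (baseline - value) / baseline
    for name, value in rmse_mm.items()
}

for name, pct in reduction_pct.items():
    print(f"{name:22s} {pct:5.1f} % lower RMSE than fusion only")
```

Relative to the fusion-only result, the plain UNet lowers the RMSE by about 0.6 %, whereas the guided variant lowers it by about 11.7 %.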
[0085] In summary, this embodiment proposes a high dynamic range (HDR) surface 3D reconstruction method based on high- and low-exposure fringe image pairs. The method fully utilizes the complementary modulation characteristics of the two exposures. First, a fusion module extracts and integrates complementary information from the high- and low-exposure images to obtain a fusion result with complementary modulation. Then, a guided inpainting module is introduced to specifically enhance and finely repair the low-modulation regions in the fusion result, thereby generating a high-quality enhanced fringe image for subsequent phase calculation and 3D reconstruction.
[0086] Experimental results demonstrate that the proposed method can effectively recover fringe information within the HDR region while preserving the sinusoidal characteristics of the fringes, achieving complete and high-precision point cloud reconstruction of surfaces with varying reflectivity. Comparative experiments further validate that this method outperforms existing deep learning methods in both reconstruction accuracy and detail preservation within the HDR region. Ablation experiments show that complementary modulation information fusion and guided inpainting mechanisms play crucial roles in improving reconstruction performance. Compared to traditional multi-exposure fusion methods, the proposed method requires only two sets of input fringe images to obtain high-quality reconstruction results, providing a high-precision and efficient solution for HDR surface 3D measurement.
[0087] The above description is merely a preferred embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A stripe image enhancement method based on complementary modulation information, characterized in that it comprises: acquiring a double-exposure stripe image to be processed, the double-exposure stripe image to be processed including a high-exposure stripe image and a corresponding low-exposure stripe image; and inputting the double-exposure stripe image to be processed into a fusion-guided repair network for image enhancement and outputting an enhanced stripe image, wherein the fusion-guided repair network includes a modulation complement fusion module and a guided repair module connected in sequence; the modulation complement fusion module includes a dual-branch structure and a terminal convolutional layer connected in sequence; the dual-branch structure includes a symmetrically arranged low-exposure modulation-aware sub-branch and a high-exposure modulation-aware sub-branch; and the guided repair module is built based on the UNet network.
2. The stripe image enhancement method based on complementary modulation information according to claim 1, characterized in that the training process of the fusion-guided repair network specifically includes: acquiring training data, which includes double-exposure stripe training images and corresponding real labels; constructing an initial fusion-guided repair network; inputting the training data into the fusion-guided repair network for image enhancement; and training the network with the goal of minimizing the loss between the initial training result after image enhancement and the real label corresponding to the double-exposure stripe training image, thereby obtaining the trained fusion-guided repair network.
3. The stripe image enhancement method based on complementary modulation information according to claim 1, characterized in that the processing procedure of the fusion-guided repair network specifically includes: The high-exposure stripe image is input into the low-exposure modulation perception sub-branch to learn modulation perception weights. The low-exposure stripe image is input into the high-exposure modulation perception sub-branch to learn modulation perception weights. Modulation complement fusion is performed on the outputs of the low-exposure modulation perception sub-branch and the high-exposure modulation perception sub-branch through the terminal convolutional layer to obtain a modulation complement feature map. The guided repair module performs targeted enhancement and repair on the low-modulation regions in the modulation complement feature map and outputs an enhanced stripe image.
4. The stripe image enhancement method based on complementary modulation information according to claim 3, characterized in that the process of obtaining the modulation complement feature map specifically includes: (formula not reproduced in this text) In the formula, the terms denote the modulation complement feature map; the low-exposure striped image and its weighting coefficients for the medium-to-high reflectivity regions; and the high-exposure striped image and its weighting coefficients for the low-to-medium reflectivity regions; ⊙ represents the element-wise Hadamard product, and 𝜖 is used to ensure numerical stability.
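The formula of this claim is not reproduced in the text, so its exact form is unknown. One common construction that matches the listed ingredients (two weight maps, element-wise products, and a stabilizing 𝜖) is a normalized weighted sum. The sketch below assumes that form and substitutes the measured fringe modulation for the learned weights; both choices, and all names in the code, are assumptions for illustration and are not the claimed formula.

```python
import numpy as np

def modulation(stack):
    """Fringe modulation for four frames I_n = A + B*cos(phi + n*pi/2)."""
    i0, i1, i2, i3 = stack
    return 0.5 * np.sqrt((i3 - i1) ** 2 + (i0 - i2) ** 2)

def fuse(low, high, eps=1e-6):
    """Assumed fusion: (W_low*I_low + W_high*I_high) / (W_low + W_high + eps).

    The weights here are modulation maps, standing in for the weights that
    the network would learn; eps avoids division by zero.
    """
    w_low = modulation(low)[None]      # (1, H, W), broadcast over the 4 frames
    w_high = modulation(high)[None]
    return (w_low * low + w_high * high) / (w_low + w_high + eps)

phase = np.linspace(0.0, 4.0 * np.pi, 64)[None, :] * np.ones((8, 1))
shifts = np.arange(4).reshape(4, 1, 1) * np.pi / 2.0
low = 0.4 + 0.3 * np.cos(phase + shifts)   # well-exposed low-exposure stack
high = np.ones_like(low)                   # fully saturated high-exposure stack

fused = fuse(low, high)
print(fused.shape, float(np.abs(fused - low).max()))
```

Where one input is fully saturated its modulation weight vanishes, so the fused result falls back to the other input, which is the behavior a modulation-complementary fusion is meant to provide.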
5. The stripe image enhancement method based on complementary modulation information according to claim 3, characterized in that the process of acquiring the enhanced stripe image specifically includes: (formula not reproduced in this text) In the formula, the terms denote: the output of an upsampling layer of the backbone UNet, which is the enhanced feature map obtained under the guidance of the extracted guiding information; the upsampling operation of that layer of the backbone UNet; the feature concatenation operation; the output of the corresponding downsampling layer of the backbone UNet; the cross-attention mechanism; and the guided features extracted by the guided feature extractor at that layer.
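Since the formula of this claim is likewise missing, the way its ingredients are composed cannot be recovered from the text. The sketch below shows one plausible arrangement of the named operations (upsampling, cross-attention between skip features and guided features, and concatenation) in plain NumPy. The ordering of the operations, the absence of learned projections, the nearest-neighbor upsampling, and every function name are assumptions made for illustration only.

```python
import numpy as np

def cross_attention(query_feat, guide_feat):
    """Single-head cross-attention without learned projections (a simplification).

    Queries come from query_feat; keys and values come from guide_feat.
    Both inputs have shape (C, H, W); the output has shape (C, H, W).
    """
    c, h, w = query_feat.shape
    q = query_feat.reshape(c, -1).T                 # (N, C), N = H*W
    k = guide_feat.reshape(c, -1).T                 # (N, C)
    scores = q @ k.T / np.sqrt(c)                   # (N, N)
    scores = scores - scores.max(axis=1, keepdims=True)   # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (weights @ k).T.reshape(c, h, w)

def guided_decoder_step(prev, skip, guide):
    """Upsample prev by 2, attend skip features to guide features, concatenate."""
    up = prev.repeat(2, axis=1).repeat(2, axis=2)   # nearest-neighbor upsampling
    attended = cross_attention(skip, guide)
    return np.concatenate([up, attended], axis=0)

rng = np.random.default_rng(0)
prev = rng.normal(size=(8, 4, 4))     # coarser decoder features
skip = rng.normal(size=(4, 8, 8))     # downsampling-path (skip) features
guide = np.ones((4, 8, 8))            # guided features (constant for the demo)

out = guided_decoder_step(prev, skip, guide)
print(out.shape)
```

In a trainable network the queries, keys, and values would pass through learned projections and the concatenated features would be followed by convolutions; these are omitted here to keep the data flow visible.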
6. A stripe image enhancement system based on complementary modulation information, characterized in that it comprises: a data acquisition module, used to acquire the double-exposure stripe image to be processed, which includes a high-exposure stripe image and a corresponding low-exposure stripe image; and a fusion-guided inpainting module, used to input the double-exposure stripe image to be processed into a fusion-guided inpainting network for image enhancement and output an enhanced stripe image. The fusion-guided inpainting network includes a modulation-complementary fusion module and a guided inpainting module connected in sequence. The modulation-complementary fusion module includes a dual-branch structure and a terminal convolutional layer connected in sequence. The dual-branch structure includes a symmetrically arranged low-exposure modulation-aware sub-branch and a high-exposure modulation-aware sub-branch. The guided inpainting module is built based on the UNet network.
7. An electronic device, characterized in that, The device includes a memory and a processor, the memory being used to store a computer program, and the processor running the computer program to cause the electronic device to perform a stripe image enhancement method based on complementary modulation information according to any one of claims 1-5.
8. A computer-readable storage medium, characterized in that, It stores a computer program that, when executed by a processor, implements a stripe image enhancement method based on complementary modulation information as described in any one of claims 1-5.