A method, apparatus, device and storage medium for under-display image processing

By incorporating visible light and infrared cameras within the under-display camera and combining image fusion and diffraction restoration technologies, the problem of poor image quality in under-display cameras has been solved, achieving higher-quality under-display shooting results.

CN116264630B · Active · Publication Date: 2026-06-30 · GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP LTD
Filing Date: 2021-12-14
Publication Date: 2026-06-30

Application Information

IPC: H04N23/50; H04N23/95; H04N23/741; H04N5/265

AI Tagging

Technology Topics

Imaging processing Computer graphics (images)

Similar Technology Patents

Anomaly detection method, device and medium for a printed circuit board assembly
CN122265128A · Image analysis · Imaging processing · Anomaly detection
system
JP2026105331A · Imaging processing · Medicine
An image processing method, apparatus and electronic device
CN116912124B · understand spatial structure · Good image denoising effect · Image enhancement · Character and pattern recognition · Image denoising · Imaging processing
A multi-modal image semantic segmentation method based on a convolutional neural network
CN122313049A · Imaging processing · Image resolution
Method for parallel image processing and routing
US20260172584A1 · Closed circuit television systems · Digital video signal modification · Imaging processing · Image manipulation

AI Technical Summary

Technical Problem

The poor image quality of under-display cameras, with uneven light transmittance leading to issues such as image blurring, overexposure of light sources, loss of detail, and color cast.

Method used

Two cameras sensing different wavelength bands are placed beneath the screen. A visible light camera and an infrared camera each collect image information, and image quality is improved through image fusion, diffraction repair and noise processing.

Benefits of technology

It effectively compensates for image loss caused by differences in light transmittance of under-display cameras, and improves the shooting quality and detail performance of under-display cameras.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

This application discloses an under-display image processing method, apparatus, device, and storage medium. The method includes: acquiring a first image captured by a first under-display camera of a shooting device at different exposure times, and a second image captured by a second under-display camera; and obtaining a target image based on the first image and the second image. In this way, taking advantage of the large differences in the screen's transmittance for light signals of different wavelengths, under-display cameras that sense light signals in two different wavelength ranges are placed under the screen. The image information captured by the second under-display camera compensates for the image information lost by the first under-display camera, effectively improving the shooting quality of the under-display camera.

Description

Technical Field

[0001] This application relates to image processing technology, and more particularly to an under-display image processing method, apparatus, device, and storage medium.

Background Technology

[0002] An under-display camera hides a regular camera beneath the screen, allowing it to capture images through the screen above. However, the light reaching the camera is significantly reduced due to the screen's anode blockage, complicating traditional light propagation characteristics. A typical punch-hole front camera exhibits balanced light transmittance across different RGB channels, with very similar transmittance across different wavelengths, averaging over 90%. Under-display cameras, however, exhibit substantial transmittance variations depending on screen design, ranging from 10% to 50% across different wavelengths. Therefore, under-display cameras inherently suffer significant light loss.

[0003] Under-display camera imaging can be viewed as light passing through a slit array to form an image. Light diffraction occurs during this process, and different slit array configurations (different sizes, widths, and combinations) result in varying shapes, degrees of diffusion, and peak energy of the diffracted light spots. This leads to different degrees of loss in the images captured by the under-display camera. Common manifestations include severe image blurring, widespread overexposure of light sources, loss of image detail, and general color cast. Therefore, optimizing the image quality of under-display cameras is a pressing issue that needs to be addressed in the development of under-display photography technology.

Summary of the Invention

[0004] To address the aforementioned technical problems, embodiments of this application aim to provide an under-display image processing method, apparatus, device, and storage medium.

[0005] The technical solution of this application is implemented as follows:

[0006] Firstly, an under-display image processing method is provided, including:

[0007] Acquire a first image captured by the first under-screen camera of the shooting device at different exposure times, and a second image captured by the second under-screen camera;

[0008] Based on the first image and the second image, the target image is obtained;

[0009] The first under-display camera and the second under-display camera are located below the screen of the shooting device. The first under-display camera is used to sense light signals in a first wavelength range, and the second under-display camera is used to sense light signals in a second wavelength range. The first wavelength range is smaller than the second wavelength range.

[0010] Secondly, an under-display image processing apparatus is provided, comprising:

[0011] An acquisition module, configured to acquire at least two frames of the first image captured by the first under-screen camera of the shooting device under different exposure parameters, and a second image captured by the second under-screen camera;

[0012] The image processing module is configured to obtain a target image based on the at least two frames of the first image and the second image;

[0013] The first under-display camera and the second under-display camera are located below the screen of the shooting device. The first under-display camera is used to sense light signals in a first wavelength range, and the second under-display camera is used to sense light signals in a second wavelength range. The first wavelength range is smaller than the second wavelength range.

[0014] Thirdly, a terminal device is provided, comprising: a processor and a memory configured to store a computer program capable of running on the processor.

[0015] Wherein, when the processor is configured to run the computer program, it executes the steps of the aforementioned method.

[0016] Fourthly, a computer-readable storage medium is provided having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the aforementioned method.

[0017] This application provides an under-display image processing method, apparatus, device, and storage medium. The method includes: acquiring a first image captured by a first under-display camera of a shooting device at different exposure times, and a second image captured by a second under-display camera; and obtaining a target image based on the first image and the second image. In this way, taking advantage of the large differences in the screen's transmittance for light signals of different wavelengths, under-display cameras that sense light signals in two different wavelength ranges are placed under the screen. The image information captured by the second under-display camera compensates for the image information lost by the first under-display camera, effectively improving the shooting quality of the under-display camera.

Attached Figure Description

[0018] Figure 1 is a schematic diagram of the anode distribution on the screen;

[0019] Figure 2 is a schematic diagram of light diffraction;

[0020] Figure 3 is a schematic diagram illustrating the effect of the slit width on the diffraction fringes;

[0021] Figure 4 is a schematic diagram of an image captured by the on-screen camera;

[0022] Figure 5 is a schematic diagram of an image captured by the under-display camera;

[0023] Figure 6 is a schematic diagram of the first process of the under-display image processing method in the embodiments of this application;

[0024] Figure 7 is a schematic diagram of the composition of the shooting device in the embodiments of this application;

[0025] Figure 8 is a schematic diagram of the second process of the under-display image processing method in the embodiments of this application;

[0026] Figure 9 is a schematic diagram of the HDR fusion process in an embodiment of this application;

[0027] Figure 10 is a schematic diagram of the Laplacian pyramid fusion process in an embodiment of this application;

[0028] Figure 11 is a flowchart illustrating the network training method in an embodiment of this application;

[0029] Figure 12 is a schematic diagram of the imaging principle in an embodiment of this application;

[0030] Figure 13 is a schematic diagram illustrating the screen degradation principle in an embodiment of this application;

[0031] Figure 14 is a schematic diagram of the third process of the under-display image processing method in the embodiments of this application;

[0032] Figure 15 is a schematic diagram of the composition structure of the under-display image processing device in the embodiments of this application;

[0033] Figure 16 is a schematic diagram of the composition structure of the terminal device in the embodiments of this application.

Detailed Implementation

[0034] In order to gain a more detailed understanding of the features and technical content of the embodiments of this application, the implementation of the embodiments of this application will be described in detail below with reference to the accompanying drawings. The accompanying drawings are for reference and illustration only and are not intended to limit the embodiments of this application.

[0035] An under-display camera (UPC) hides a regular camera beneath the screen, allowing it to capture images through the screen above. However, the anodes of the screen block much of the incoming light, so the light reaching the camera is severely limited and no longer follows conventional propagation characteristics. A typical punch-hole camera (often called an "on-screen camera") exhibits relatively uniform light transmittance across the RGB channels, with very similar transmittance across different wavelengths, averaging over 90%. For an under-display camera, however, transmittance varies significantly with screen design, ranging from 10% to 50% across different wavelengths. Therefore, under-display cameras inherently suffer considerable light loss. For example, Figure 1 is a schematic diagram of the anode distribution on the screen. As shown in Figure 1, the lower left corner is the light-transmitting area, and the other areas are the normal display area. To increase the light transmittance for the under-display camera, the anode density in the light-transmitting area is lower than that in the normal display area.

[0036] Figure 2 is a schematic diagram of light diffraction. As shown in Figure 2, when light from the source S passes through the circular aperture H, it forms annular diffraction fringes on the imaging plane P; when it passes through the single slit G, it forms parallel diffraction fringes on the imaging plane P.

[0037] Figure 3 is a schematic diagram illustrating the effect of the slit width on the diffraction fringes. As shown in Figure 3, when the slit width is much larger than the wavelength of light, diffraction is barely noticeable. The narrower the slit, the more pronounced the diffraction.
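The dependence of diffraction on slit width described above can be illustrated with the textbook Fraunhofer single-slit relation sin(θ) = λ / a for the first dark fringe. This relation is standard optics and is not stated in the application; the function name and the example slit widths below are assumptions chosen for illustration only.

```python
import math

def first_minimum_angle_deg(wavelength_nm: float, slit_width_um: float) -> float:
    """Angle of the first dark fringe for Fraunhofer single-slit diffraction.

    Uses the textbook relation sin(theta) = wavelength / slit_width.
    """
    ratio = (wavelength_nm * 1e-9) / (slit_width_um * 1e-6)
    if ratio >= 1.0:
        # The slit is no wider than the wavelength: the central lobe fills the half-space.
        return 90.0
    return math.degrees(math.asin(ratio))

# Green light (550 nm) through progressively narrower slits (hypothetical widths).
for width_um in (100.0, 10.0, 1.0):
    print(width_um, round(first_minimum_angle_deg(550.0, width_um), 2))
```

The central lobe widens as the slit narrows, which matches the qualitative behaviour shown in Figure 3.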

[0038] Under-display camera imaging can be viewed as light passing through a slit array to form an image. Light diffraction occurs when light passes through these slits, and different slit array shapes (different sizes, widths, and combinations) result in varying shapes, degrees of diffusion, and peak energy of the diffracted light spots. This leads to different degrees of loss in the image captured by the under-display camera. Common manifestations include: severe image blurring, widespread overexposure of light sources, loss of image detail, and general color cast.

[0039] For example, Figure 4 is a schematic diagram of an image captured by the on-screen camera, and Figure 5 is a schematic diagram of an image captured by an under-display camera. Photos and videos taken by an under-display camera through the screen exhibit significant light diffraction compared to those captured by a normal punch-hole camera. To address this issue, this application provides an under-display image processing method. Figure 6 is a schematic diagram of the first process of the under-display image processing method in an embodiment of this application. As shown in Figure 6, the method may specifically include:

[0040] Step 601: Acquire the first image captured by the first under-screen camera of the shooting device at different exposure times, and the second image captured by the second under-screen camera;

[0041] The first under-display camera and the second under-display camera are located below the screen of the shooting device. The first under-display camera is used to sense light signals in a first wavelength range, and the second under-display camera is used to sense light signals in a second wavelength range. The first wavelength range is smaller than the second wavelength range.

[0042] In practical applications, the screen has different transmittance for light signals of different wavelengths. When the first wavelength range is smaller than the second wavelength range, the screen's transmittance for light signals in the second wavelength range is higher than for light signals in the first wavelength range.

[0043] To address the significant differences in screen transmittance for light signals of different wavelengths, under-display cameras that sense light signals in two different wavelength ranges are installed below the screen. The image information captured by the second under-display camera compensates for the image information lost by the first under-display camera, effectively improving the shooting quality of the under-display camera.

[0044] For example, Figure 7 This is a schematic diagram of the composition of the shooting device in an embodiment of this application. The screen includes a light-transmitting area and a normal display area. A first under-screen camera and a second under-screen camera are disposed below the light-transmitting area. The first under-screen camera captures two or more first images at different exposure times, and the second under-screen camera can capture a second image according to a fixed exposure time.

[0045] For example, in some embodiments, the light signal in the first wavelength range is a visible light signal, and the light signal in the second wavelength range is an infrared light signal; the first under-display camera is a visible light camera, and the second under-display camera is an infrared camera.

[0046] Natural light is composed of light waves of different wavelengths. The visible light spectrum for the human eye is approximately 390-780 nm. Electromagnetic waves shorter than 390 nm and longer than 780 nm are imperceptible to the human eye. Electromagnetic waves with wavelengths shorter than 390 nm are located outside the violet range of the visible light spectrum and are called ultraviolet light. Electromagnetic waves longer than 780 nm are located outside the red range of the visible light spectrum and are called infrared light. The infrared light spectrum ranges from 780 nm to 1 mm.

[0047] Because visible light and infrared light have different wavelengths, and different screens exhibit significant differences in transmittance across different wavelength bands, visible light under-display cameras can only capture a very limited amount of light. In contrast, infrared under-display cameras can capture much more infrared light, obtaining more image information and effectively compensating for the insufficient transmittance of visible light cameras. By using one visible light camera and one infrared camera in an under-display front camera setup, the high transmittance of the infrared band is utilized to capture image details. Algorithms are then used to fuse these details into an RGB image, resulting in a superior under-display camera effect.

[0048] Here, the first image is image data output by the visible light image sensor, and the second image is image data output by the infrared light image sensor.

[0049] Step 602: Obtain the target image based on the first image and the second image.

[0050] Here, the acquired first and second images undergo a series of image processing steps, including image fusion, diffraction restoration, noise reduction, color correction, and gamma correction, to obtain the final target image displayed to the user, for example an image in JPG format.

[0051] It should be noted that the under-display image processing method provided in this application embodiment can be applied to the processing of images or videos captured by an under-display camera.

[0052] Based on the above embodiments, the under-display image processing method is further illustrated with an example. Figure 8 is a schematic diagram of the second process of the under-display image processing method in an embodiment of this application. As shown in Figure 8, the method may specifically include:

[0053] Step 801: Acquire the first image captured by the first under-screen camera of the shooting device at different exposure times, and the second image captured by the second under-screen camera;

[0054] For example, two first images captured under long and short exposure times, or three first images captured under long, medium and short exposure times.

[0055] Step 802: Perform a first fusion process on at least two frames of the first image acquired at different exposure times to obtain a first fused image;

[0056] For example, the first fusion process is High-Dynamic Range (HDR) fusion, which synthesizes a final HDR image from Low-Dynamic Range (LDR) images captured at different exposure times, taking from the LDR image at each exposure time the content with the best detail. The result better reflects the visual appearance of the real environment. Compared to ordinary images, HDR images provide more dynamic range and image detail.

[0057] Figure 9 is a schematic diagram of the HDR fusion process in an embodiment of this application. As shown in Figure 9, HDR fusion includes:

[0058] Acquire three images captured by the sensor, including a long exposure image EV0, a medium exposure image EV-2, and a short exposure image EV-4.

[0059] Brightness alignment: Brightness matching is performed on the three frames to be fused. The long exposure image EV0 is used as the reference frame to determine the brightness multiple between the long and medium exposure images and between the long and short exposure images. The medium exposure image and the short exposure image are then multiplied by their respective multiples, which maps their brightness values to the brightness level of the long exposure image.

[0060] Feature point detection: Image feature point extraction involves dividing the image into blocks, using gradient domain minima, setting an interest point threshold, and extracting the image's feature vectors. These feature vectors effectively represent the main information of the image.

[0061] Image registration: Using the long exposure image EV0 as the reference frame, feature points between different images are matched. The matched feature point sequence can be used to calculate the transformation matrix Homography1 between long and medium exposure images, and the transformation matrix Homography2 between long and short exposure images.

[0062] Image transformation: The medium-exposure image EV-2 is transformed (Warp) according to the transformation matrix Homography1 to obtain the EV-2Warp frame. The short-exposure image EV-4 is transformed according to the transformation matrix Homography2 to obtain the EV-4Warp frame.

[0063] HDR fusion: Fusion maps are used to fuse the three frames into an HDR image.
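The brightness alignment, registration, and fusion steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of this application: the brightness multiple is assumed to follow from the EV difference (one EV step is a factor of two), the transformation matrix is estimated with a plain direct linear transform on already matched points, and the fusion maps are assumed to be normalized well-exposedness weights. All function names, point coordinates, and the `sigma` value are hypothetical, and the frames in the fusion example are assumed to be registered already.

```python
import numpy as np

def brightness_gain(ev_frame, ev_reference=0.0):
    """Gain mapping a frame at ev_frame to the reference exposure (one EV = factor 2)."""
    return 2.0 ** (ev_reference - ev_frame)

def estimate_homography(src, dst):
    """Estimate the 3x3 matrix H with dst ~ H @ src from at least 4 matched points."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def apply_homography(h, pts):
    """Map N x 2 points through H using homogeneous coordinates."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

def fuse_with_maps(frames, gains, sigma=0.2):
    """Fuse frames with per-pixel weights that favour well-exposed (mid-grey) pixels."""
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    weights = np.stack([np.exp(-((f - 0.5) ** 2) / (2.0 * sigma ** 2)) for f in frames])
    weights /= weights.sum(axis=0, keepdims=True)
    aligned = np.stack([f * g for f, g in zip(frames, gains)])
    return (weights * aligned).sum(axis=0), weights

# Brightness multiples for EV0, EV-2 and EV-4 relative to the EV0 reference.
gains = [brightness_gain(ev) for ev in (0.0, -2.0, -4.0)]

# Registration: recover a known transform from hypothetical matched feature points.
true_h = np.array([[1.01, 0.02, 3.0], [-0.01, 0.99, -2.0], [1e-4, 2e-4, 1.0]])
medium_pts = np.array([[10.0, 10.0], [90.0, 15.0], [80.0, 85.0], [12.0, 70.0], [50.0, 40.0]])
reference_pts = apply_homography(true_h, medium_pts)
homography1 = estimate_homography(medium_pts, reference_pts)

# Fusion: one dark pixel and one pixel that saturates in the long and medium frames.
radiance = np.array([[0.3, 8.0]])
frames = [np.clip(radiance / g, 0.0, 1.0) for g in gains]
hdr, fusion_maps = fuse_with_maps(frames, gains)
print(gains, np.round(hdr, 3))
```

In this sketch the saturated pixel is reconstructed almost entirely from the short exposure frame, while the dark pixel is consistent across all three brightness-aligned frames.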

[0064] Step 803: Perform diffraction repair on the first fused image based on the first repair network to obtain the repaired first fused image;

[0065] Here, the first fused image can be understood as the under-display image obtained by fusing the under-display images captured by the first under-display camera. The repaired first fused image can be understood as the fused on-screen image, which is equivalent to the on-screen image obtained by fusing the on-screen images captured by the on-screen camera.

[0066] Step 804: Perform diffraction repair on the second image based on the second repair network to obtain the repaired second image;

[0067] Here, the second image can be understood as an under-display image captured by the second under-display camera. The repaired second image can be understood as an on-screen image, equivalent to an on-screen image captured by the on-screen camera.

[0068] In practical applications, the first and second repair networks can be neural networks. The method further includes training the first and second repair networks, which specifically includes: acquiring a training dataset, wherein the training dataset includes first on-screen image data and first under-screen image data corresponding to the target under-display camera, the first on-screen image data and the first under-screen image data corresponding to each other; and training the target repair network based on the training dataset to obtain the trained target repair network;

[0069] Wherein, when the target under-display camera is the first under-display camera, the target repair network is the first repair network; when the target under-display camera is the second under-display camera, the target repair network is the second repair network.

[0070] Specifically, the target repair network processes the under-screen image data and outputs on-screen image data; the loss value of this output relative to the on-screen image data in the training dataset is calculated and used to adjust the network parameters.

[0071] Step 805: Perform a second fusion process on the repaired first fused image and the repaired second image to obtain a second fused image;

[0072] After diffraction restoration, the first fused image carries the high dynamic range fusion result, while the second image carries detail captured in the band with higher transmittance. Fusing these two restored images yields a higher-quality under-screen image.

[0073] For example, the second fusion process can be Laplacian pyramid fusion, also known as multi-resolution fusion algorithm. A Laplacian pyramid is built from the image, where each layer of the pyramid contains different frequency bands of the image, and these different frequency bands are fused separately.

[0074] Figure 10 is a schematic diagram of the Laplacian pyramid fusion process in an embodiment of this application. As shown in Figure 10:

[0075] First, the two images to be fused are decomposed using the Laplacian pyramid method. A multi-scale pyramid image sequence is obtained through multi-scale transformation, with the number of decomposition levels being:

[0076] floor(log2(min(H,W)))

[0077] Where H and W are the image height and width, and floor represents rounding down.
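As a rough illustration of the decomposition, the level count floor(log2(min(H, W))) and a simplified Laplacian pyramid can be sketched as below. This sketch makes assumptions that are not in the application: it uses 2x2 averaging and nearest-neighbour upsampling in place of Gaussian filtering, and it requires image sides that can be halved at every level. The function names are hypothetical.

```python
import math
import numpy as np

def pyramid_levels(height, width):
    """Number of decomposition levels: floor(log2(min(H, W)))."""
    return int(math.floor(math.log2(min(height, width))))

def downsample(img):
    """Halve the resolution by averaging each 2x2 block (stand-in for Gaussian filtering)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double the resolution by pixel repetition."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def build_laplacian_pyramid(img, levels):
    """Return band-pass layers from fine to coarse, followed by the coarsest low-pass image."""
    pyramid, current = [], img.astype(np.float64)
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller))  # detail lost by downsampling
        current = smaller
    pyramid.append(current)
    return pyramid

def reconstruct(pyramid):
    """Invert the decomposition by adding the layers back from coarse to fine."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        current = upsample(current) + band
    return current

image = np.random.default_rng(0).random((8, 8))
levels = pyramid_levels(*image.shape)
pyramid = build_laplacian_pyramid(image, levels)
print(levels, [layer.shape for layer in pyramid])
```

Because each layer stores exactly what downsampling discards, adding the layers back reproduces the input, so fusing layer by layer and then reconstructing yields a full-resolution fused image.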

[0078] Secondly, the image fusion method is determined. Determining the fusion method specifically includes the following:

[0079] Region energy-based fusion methods outperform pixel-based fusion methods (e.g., maximum value fusion). Fusion methods that maximize region energy produce the best image quality, with the highest values for information entropy, standard deviation, average gradient, and spatial frequency. The fused image contains rich information and is clear.

[0080] The energy E of an image patch (window) is calculated using pixel value statistics, and the fusion method is determined based on the obtained energy.

[0081]

[0082]

[0083] Here, m and n represent the window size, E represents the energy within the selected window, the superscript l indicates the pyramid level, L represents the regional pixel value characterizing luminance information, and the subscripts I and V denote the restored first fused image and the restored second image, respectively (e.g., infrared image and visible light image). After calculating the energy values of the two images, the similarity of the window information is determined based on the energy values.

[0084] Similarity calculation method:

[0085]

[0086] Here, the similarity M^l(x, y) of the l-th pyramid layer takes values in [-1, 1]. If M^l(x, y) ≥ t, weighted average fusion is used at this layer; otherwise, maximum value fusion is used. t is an externally adjustable parameter.
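The energy and similarity formulas are not reproduced in this text. As a hedged illustration only, the sketch below uses a formulation that is common in region-energy pyramid fusion and is consistent with the stated range [-1, 1]: the energy is the windowed sum of squared coefficients, and the similarity is twice the windowed cross term divided by the sum of the two energies. Whether the application uses exactly these expressions is an assumption, and the function names, window radius, and `eps` guard are hypothetical.

```python
import numpy as np

def window_sum(values, radius=1):
    """Sum over a (2*radius+1) x (2*radius+1) window around each pixel, with reflected borders."""
    padded = np.pad(values, radius, mode="reflect")
    h, w = values.shape
    total = np.zeros((h, w), dtype=np.float64)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            total += padded[dy:dy + h, dx:dx + w]
    return total

def region_energy(band, radius=1):
    """Assumed energy definition: windowed sum of squared pyramid coefficients."""
    return window_sum(band.astype(np.float64) ** 2, radius)

def region_similarity(band_i, band_v, radius=1, eps=1e-12):
    """Assumed similarity definition: 2 * windowed cross term / (E_I + E_V)."""
    cross = window_sum(band_i.astype(np.float64) * band_v, radius)
    total_energy = region_energy(band_i, radius) + region_energy(band_v, radius)
    return 2.0 * cross / (total_energy + eps)

rng = np.random.default_rng(0)
layer_i = rng.normal(size=(6, 6))
layer_v = rng.normal(size=(6, 6))
similarity = region_similarity(layer_i, layer_v)
print(similarity.min(), similarity.max())
```

With this definition, identical windows give a similarity of 1 and sign-inverted windows give -1, so comparing the value against the threshold t separates windows that agree from windows that conflict.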

[0087] The method of taking the weighted average of pixels has the smallest standard deviation, relatively uniform color tone, and good performance in transition areas. The fusion method that takes the largest pixel value can retain more information when there is uniform local brightness, but the contrast is lower.

[0088] Therefore, during fusion, the similarity is compared against the parameter t. If the similarity is less than t, the regional similarity is low and weighted fusion is likely to produce ghosting, so maximum value fusion is used. For highly similar regions with similarity of at least t, weighted averaging is performed to make the transition more balanced and delicate. The following are the two calculation methods for fusion:

[0089] Weighted average fusion:

[0090] L{F}^l(x,y) = G{W}^l(x,y) · L{I}^l(x,y) + (1 - G{W}^l(x,y)) · L{V}^l(x,y)

[0091] Maximum value fusion:

[0092]

[0093]

[0094] The superscript l represents the l-th layer of the pyramid, L represents the regional pixel value size characterizing the luminance information, the subscripts I and V respectively represent the first fused image after restoration and the second image after restoration, and G is the weight.
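The per-layer choice between the two fusion rules can be sketched as follows. The weighted average follows the formula in [0090]. The maximum value rule is not reproduced in this text, so the sketch assumes it keeps the coefficient from the image with the larger region energy; that choice, the function name, and the example values are assumptions for illustration.

```python
import numpy as np

def fuse_level(band_i, band_v, similarity, energy_i, energy_v, weight, t=0.6):
    """Fuse one pyramid layer: weighted average where similar, maximum selection elsewhere."""
    # Weighted average fusion: G * L_I + (1 - G) * L_V.
    weighted = weight * band_i + (1.0 - weight) * band_v
    # Assumed maximum value fusion: keep the coefficient with the larger region energy.
    maximum = np.where(energy_i >= energy_v, band_i, band_v)
    return np.where(similarity >= t, weighted, maximum)

band_i = np.array([[1.0, 4.0]])
band_v = np.array([[3.0, 0.0]])
similarity = np.array([[0.9, 0.1]])   # first pixel similar, second pixel dissimilar
energy_i = np.array([[1.0, 16.0]])
energy_v = np.array([[9.0, 0.0]])
fused = fuse_level(band_i, band_v, similarity, energy_i, energy_v, weight=0.5)
print(fused)
```

In this example the first pixel is averaged because its similarity exceeds t, and the second pixel takes the coefficient of the higher-energy image, which avoids blending two conflicting structures.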

[0095] Step 806: Obtain the target image based on the second fused image.

[0096] Exemplarily, the target image is obtained by performing image processing on the second fused image based on an image signal processor; wherein, the image processing includes: color correction, Gamma correction, and noise reduction processing.
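Two of the image signal processor steps named above, color correction and Gamma correction, can be illustrated with a minimal sketch. The 3x3 matrix values, the gamma of 2.2, and the function names are hypothetical and are not taken from the application; noise reduction is omitted.

```python
import numpy as np

def apply_ccm(rgb, ccm):
    """Color correction: multiply each linear RGB pixel by a 3x3 matrix, then clip to [0, 1]."""
    return np.clip(rgb @ ccm.T, 0.0, 1.0)

def apply_gamma(rgb, gamma=2.2):
    """Gamma correction: encode linear values with exponent 1 / gamma."""
    return np.clip(rgb, 0.0, 1.0) ** (1.0 / gamma)

# Hypothetical matrix whose rows sum to 1, so neutral grey stays neutral.
ccm = np.array([[1.5, -0.3, -0.2],
                [-0.2, 1.4, -0.2],
                [-0.1, -0.4, 1.5]])
pixels = np.array([[0.5, 0.5, 0.5], [0.8, 0.2, 0.1]])
output = apply_gamma(apply_ccm(pixels, ccm))
print(np.round(output, 3))
```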

[0097] With the above technical solution, and given that the screen's transmittance varies greatly across light signals of different bands, under-screen cameras that sense light signals in two different bands are placed under the screen. The image information collected by the second under-screen camera supplements the image information lost by the first under-screen camera. The first image and the second image are each subjected to diffraction restoration, and the two restored images are then fused by pyramid fusion. After fusion, the detail performance under the screen is further improved, enhancing the quality of the under-screen image.

[0098] Based on the above embodiments, Figure 11 is a schematic flowchart of the network training method in the embodiments of the present application. As shown in Figure 11, the training method includes:

[0099] Step 1101: When there is no screen in the light receiving direction of the target under-screen camera, obtain the first on-screen real image data collected by the target under-screen camera;

[0100] Step 1102: Process the first on-screen real image data based on the screen degradation model corresponding to the target under-screen camera to obtain first under-screen simulated image data;

[0101] Here, the screen degradation model is established based on the screen diffraction characteristics to simulate the screen degradation process. This application constructs screen degradation models by calibrating the diffraction characteristics of the first and second under-display cameras, respectively.

[0102] As shown in Figure 12, in an ideal lens imaging scenario, an object point on the focal plane is projected as a single image point. In reality, however, an ideal lens does not exist; lenses always have some imperfections that cause an object point to be projected as multiple points. The image formed by an ideal point after passing through a camera is described by the point spread function (PSF).

[0103] When an ideal point source is imaged through a circular aperture, the PSF takes a characteristic shape called the Airy pattern, shown on the imaging plane on the right of Figure 12. A PSF formed by diffraction in this way is called the diffraction-limited PSF.
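A diffraction-limited PSF of this kind can be approximated numerically. The sketch below relies on the standard Fraunhofer approximation, in which the PSF is the squared magnitude of the Fourier transform of the aperture; this approximation, the grid size, and the aperture radii are assumptions for illustration and are not taken from the application.

```python
import numpy as np

def circular_aperture(size, radius):
    """Binary mask of a circular aperture centred on a size x size grid."""
    y, x = np.indices((size, size)) - size // 2
    return (x ** 2 + y ** 2 <= radius ** 2).astype(np.float64)

def diffraction_psf(aperture):
    """Far-field PSF: squared magnitude of the aperture's Fourier transform, normalised to sum 1."""
    field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(aperture)))
    psf = np.abs(field) ** 2
    return psf / psf.sum()

size = 64
psf_small = diffraction_psf(circular_aperture(size, 4))
psf_large = diffraction_psf(circular_aperture(size, 16))
print(psf_small.max(), psf_large.max())
```

The smaller aperture spreads its energy over a wider Airy pattern, so its normalised peak is lower than that of the larger aperture.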

[0104] As shown in Figure 13, suppose an ideal lens, unaffected by diffraction, forms the image x, while the actual lens has a PSF of c and forms the image b. The relationship among these three is a typical convolution relationship:

[0105] x*c=b

[0106] Convolution in the spatial domain is equivalent to multiplication in the frequency domain:

[0107] F(x)·F(c)=F(b)

[0108] Since convolution in the spatial domain is equivalent to multiplication in the frequency domain, x can be recovered by performing a division in the frequency domain. This process is called deconvolution.

[0109] F(x) = F(b) / F(c)

[0110] The result of the deconvolution can be obtained by performing an inverse Fourier transform on the result of the division.

[0111] x = F^-1(F(b) / F(c)), where F^-1 denotes the inverse Fourier transform.
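The frequency-domain division can be checked numerically with a small sketch. It assumes circular convolution (as implied by the discrete Fourier transform), a noise-free image, and a PSF whose spectrum has no zeros; the small `eps` term guarding the division, the example kernel, and the function names are assumptions added for illustration.

```python
import numpy as np

def blur(x, c):
    """Circular convolution x * c computed as a product of spectra."""
    return np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(c, s=x.shape)))

def deconvolve(b, c, eps=1e-12):
    """Recover x from b = x * c by dividing F(b) by F(c) and transforming back."""
    fc = np.fft.fft2(c, s=b.shape)
    fb = np.fft.fft2(b)
    # Multiplying by conj(fc) / |fc|^2 equals dividing by fc; eps avoids division by zero.
    return np.real(np.fft.ifft2(fb * np.conj(fc) / (np.abs(fc) ** 2 + eps)))

rng = np.random.default_rng(0)
x = rng.random((16, 16))                      # image from the ideal lens
c = np.array([[0.0, 0.1, 0.0],
              [0.1, 0.6, 0.1],
              [0.0, 0.1, 0.0]])               # hypothetical PSF
b = blur(x, c)                                # image from the actual lens
recovered = deconvolve(b, c)
print(np.abs(recovered - x).max())
```

In practice noise makes plain division unstable at frequencies where F(c) is small, which is one reason a learned restoration network can be preferable to direct deconvolution.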

[0112] Using the aforementioned calibration principle, the two under-display cameras are processed separately. On-screen and under-display image data are collected and paired. The screen's diffraction characteristics are obtained through frequency domain division. Based on these diffraction characteristics, a screen degradation model can be effectively established to simulate screen loss, thereby simulating the image capture process and generating a training dataset for training the restoration network. The restoration network is then used to perform diffraction restoration on the images, making the images captured by the under-display camera approximate those captured by the on-screen camera.
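The calibration idea, obtaining the diffraction characteristics from a paired on-screen and under-screen capture by frequency-domain division, can be sketched as below. The sketch assumes circular convolution, no noise, and a calibration scene containing a bright point-like source so that the on-screen spectrum has no near-zero entries; the function name, the `eps` guard, and the synthetic data are hypothetical.

```python
import numpy as np

def estimate_psf(on_screen, under_screen, eps=1e-8):
    """Estimate the screen PSF c from a paired capture using F(c) = F(b) / F(x)."""
    fx = np.fft.fft2(on_screen)
    fb = np.fft.fft2(under_screen)
    return np.real(np.fft.ifft2(fb * np.conj(fx) / (np.abs(fx) ** 2 + eps)))

rng = np.random.default_rng(0)
on_screen = rng.random((16, 16))
on_screen[0, 0] += 100.0                      # bright point-like source in the scene

# Hypothetical PSF, stored with its centre at index (0, 0) for circular convolution.
true_psf = np.zeros((16, 16))
true_psf[0, 0] = 0.6
true_psf[0, 1] = true_psf[1, 0] = true_psf[0, -1] = true_psf[-1, 0] = 0.1

under_screen = np.real(np.fft.ifft2(np.fft.fft2(on_screen) * np.fft.fft2(true_psf)))
estimated_psf = estimate_psf(on_screen, under_screen)
print(np.abs(estimated_psf - true_psf).max())
```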

[0113] In practical applications, if the noise n is taken into account, the relationship between x, c, and b becomes:

[0114] b = c * x + n

[0115] In other words, the actual degradation model includes both the diffraction characteristics of the screen and the noise characteristics, so the restoration network performs noise reduction as well as diffraction restoration.
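
A minimal sketch of the noisy degradation model b = c * x + n follows. It assumes one-dimensional signals, circular convolution, and zero-mean Gaussian noise; the noise distribution, the function name `degrade_with_noise`, and the sample values are assumptions made for illustration, since this application does not specify them.

```python
import random

def circular_convolve(x, c):
    """Spatial-domain circular convolution c * x."""
    n = len(x)
    return [sum(c[j] * x[(i - j) % n] for j in range(n)) for i in range(n)]

def degrade_with_noise(x, c, sigma, rng):
    """Return b = c * x + n, where n is Gaussian noise with standard deviation sigma."""
    return [v + rng.gauss(0.0, sigma) for v in circular_convolve(x, c)]

rng = random.Random(0)                          # fixed seed for repeatability
x = [0.0, 1.0, 4.0, 2.0, 0.0, 0.0, 3.0, 1.0]    # on-screen data (hypothetical)
c = [0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2]    # hypothetical screen PSF
b_clean = degrade_with_noise(x, c, 0.0, rng)    # diffraction only
b_noisy = degrade_with_noise(x, c, 0.05, rng)   # diffraction plus noise
```

A network trained on pairs such as `(b_noisy, x)` has to undo both the blur and the noise, which matches the dual role described above.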

[0116] For example, in some embodiments, the method further includes: acquiring third on-screen real image data captured by the target under-display camera when there is no screen in the direction from which the target under-display camera receives light; acquiring third under-screen real image data captured by the target under-display camera when there is a screen in that direction, wherein the third on-screen real image data and the third under-screen real image data correspond to each other; and obtaining the screen degradation model based on the third on-screen real image data and the third under-screen real image data.

[0117] Step 1103: Form a first data pair from the first on-screen real image data and the first under-screen simulated image data to obtain the generated dataset;

[0118] Step 1104: When there is no screen in the direction from which the target under-display camera receives light, acquire the second on-screen real image data captured by the target under-display camera;

[0119] Step 1105: When there is a screen in the direction from which the target under-display camera receives light, acquire the second under-screen real image data captured by the target under-display camera;

[0120] Step 1106: Form a second data pair from the second on-screen real image data and the second under-screen real image data to obtain the collected dataset;

[0121] With no screen in the direction from which it receives light, the first under-screen camera collects on-screen image data. The corresponding under-screen image data is then obtained from the screen degradation model of the first under-screen camera, and the under-screen and on-screen image data are paired to form a generated dataset. The first under-screen camera is also used to collect under-screen image data and on-screen image data in actual use, which form a collected dataset.

[0122] With no screen in the direction from which it receives light, the second under-screen camera collects on-screen image data. The corresponding under-screen image data is then obtained from the screen degradation model of the second under-screen camera, and the under-screen and on-screen image data are paired to form a generated dataset. The second under-screen camera is also used to collect under-screen image data and on-screen image data in actual use, which form a collected dataset.

[0123] Training on both datasets prevents the network from fitting only the idealized generated data, which would leave it unsuited to real shooting environments, and thus improves the network's repair capability.
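
The way the generated and collected data pairs are combined into one training dataset can be sketched as follows. The `DataPair` container, the function names, and the toy degradation model (a simple brightness halving) are hypothetical and serve only to make the data flow concrete.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = List[float]

@dataclass
class DataPair:
    under_screen: Image   # network input
    on_screen: Image      # training target
    source: str           # "generated" or "collected"

def build_generated(on_screen_images: Sequence[Image],
                    degradation_model: Callable[[Image], Image]) -> List[DataPair]:
    """Simulate under-screen data from real on-screen data."""
    return [DataPair(degradation_model(img), img, "generated") for img in on_screen_images]

def build_collected(on_screen_images: Sequence[Image],
                    under_screen_images: Sequence[Image]) -> List[DataPair]:
    """Pair real on-screen captures with real under-screen captures."""
    if len(on_screen_images) != len(under_screen_images):
        raise ValueError("each on-screen image needs one under-screen image")
    return [DataPair(u, o, "collected") for o, u in zip(on_screen_images, under_screen_images)]

def build_training_dataset(generated: Sequence[DataPair],
                           collected: Sequence[DataPair]) -> List[DataPair]:
    """The training dataset is the union of both datasets."""
    return list(generated) + list(collected)

def toy_degradation(img: Image) -> Image:
    """Hypothetical stand-in for the screen degradation model."""
    return [0.5 * v for v in img]

generated = build_generated([[0.2, 0.8], [1.0, 0.4]], toy_degradation)
collected = build_collected([[0.6, 0.6]], [[0.25, 0.35]])
training = build_training_dataset(generated, collected)
```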

[0124] For example, in some embodiments, the method further includes: performing data augmentation on the generated dataset and the collected dataset to obtain an augmented generated dataset and a training dataset.

[0125] Data augmentation derives additional samples from existing data, which optimizes the training dataset and improves training efficiency. Data augmentation methods can include at least one of geometric transformations and color transformations. Geometric transformations include horizontal flipping, vertical flipping, deformation and scaling, rotation, and the like. Color transformations include adding noise, blurring, color changes, erasing, filling, and the like.
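
A minimal sketch of the geometric augmentations follows, assuming images are stored as nested lists of rows. The key point it illustrates is that the same transformation is applied to both images of a data pair so that they stay aligned; the function names are hypothetical.

```python
def hflip(img):
    """Horizontal flip: reverse every row."""
    return [row[::-1] for row in img]

def vflip(img):
    """Vertical flip: reverse the order of the rows."""
    return [row[:] for row in img[::-1]]

def rot90(img):
    """Rotate 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment_pair(under_screen, on_screen):
    """Return the original pair plus three geometrically transformed copies."""
    ops = [lambda im: im, hflip, vflip, rot90]
    # Apply each operation to both images so the pair remains pixel-aligned.
    return [(op(under_screen), op(on_screen)) for op in ops]

under = [[1, 2], [3, 4]]
on = [[10, 20], [30, 40]]
augmented = augment_pair(under, on)
```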

[0126] Step 1107: Use the generated dataset and the collected dataset to form a training dataset;

[0127] Step 1108: Train the target repair network based on the training dataset to obtain the trained target repair network.

[0128] Wherein, when the target under-display camera is the first under-display camera, the target repair network is the first repair network; when the target under-display camera is the second under-display camera, the target repair network is the second repair network.

[0129] During training, the repair network processes the under-screen image data and outputs predicted on-screen image data. A loss value between this output and the on-screen image data in the training dataset is calculated and used to adjust the network parameters of the repair network. The trained repair network is then used to perform diffraction repair on an image, yielding an image equivalent to one captured by an on-screen camera.
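
The training loop can be illustrated with a deliberately tiny stand-in for the repair network. In this sketch the "network" has only two parameters (a gain and an offset), the loss is the mean squared error, and the parameters are adjusted by plain gradient descent. The model, the loss choice, the learning rate, and the sample data are all assumptions; this application does not specify the network architecture or the loss function.

```python
def predict(params, under_screen):
    """Toy repair network: per-pixel gain and offset."""
    gain, offset = params
    return [gain * v + offset for v in under_screen]

def dataset_loss(params, pairs):
    """Mean squared error between network output and on-screen targets."""
    errors = [(p - t) ** 2
              for under, on in pairs
              for p, t in zip(predict(params, under), on)]
    return sum(errors) / len(errors)

def train(pairs, lr=0.5, epochs=2000):
    """Adjust the parameters by gradient descent on the mean squared error."""
    gain, offset = 1.0, 0.0
    for _ in range(epochs):
        grad_gain = grad_offset = 0.0
        count = 0
        for under, on in pairs:
            for u, t in zip(under, on):
                err = gain * u + offset - t
                grad_gain += 2 * err * u
                grad_offset += 2 * err
                count += 1
        gain -= lr * grad_gain / count
        offset -= lr * grad_offset / count
    return gain, offset

# Hypothetical pairs (under_screen, on_screen): the screen halves the signal.
pairs = [([0.1, 0.2, 0.4, 0.5], [0.2, 0.4, 0.8, 1.0]),
         ([0.0, 0.3, 0.15, 0.45], [0.0, 0.6, 0.3, 0.9])]
loss_before = dataset_loss((1.0, 0.0), pairs)
params = train(pairs)
loss_after = dataset_loss(params, pairs)
```

With this toy data the learned gain approaches 2 and the offset approaches 0, which undoes the simulated attenuation.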

[0130] The above-mentioned under-display image processing method can be applied to terminal devices equipped with under-display cameras, such as mobile phones, tablets, laptops, handheld computers, personal digital assistants (PDAs), portable media players (PMPs), wearable devices, cameras, etc.

[0131] Based on the above embodiments, an example is given in which the first under-display camera is a visible light camera and the second under-display camera is an infrared camera. Figure 14 is a schematic diagram of the third flow of the under-display image processing method in an embodiment of this application. As shown in Figure 14:

[0132] When the shutter is pressed, the shooting device receives the shooting command and controls the visible light camera to capture long exposure image EV0, medium exposure image EV-2 and short exposure image EV-4, and controls the infrared camera to capture a single frame infrared image.

[0133] HDR fusion is performed on the long exposure image EV0, the medium exposure image EV-2, and the short exposure image EV-4 to obtain an HDR fused frame;

[0134] The HDR restoration network (i.e., the first restoration network) performs diffraction restoration on the HDR fused frame to obtain the HDR restored frame, and the infrared restoration network (i.e., the second restoration network) performs diffraction restoration on the infrared frame to obtain the infrared restored frame.

[0135] The HDR restored frame and the infrared restored frame are fused using a Laplacian pyramid to obtain the fused restored frame in the RAW domain;

[0136] The RAW-domain fused restored frame is sent to the ISP, where it undergoes pre- and post-processing to output a JPG image for saving. Because diffraction repair in the RAW domain yields an image equivalent to one captured by an on-screen camera, the subsequent processing by the image signal processor is not disturbed by the under-screen camera. The image signal processor therefore needs no special processing flow, which preserves the integrity of its processing flow.

[0137] The visible light under-display camera captures multiple frames with different exposure times for HDR fusion. The infrared camera has a significant transmittance advantage in the infrared band, but it cannot compensate for the dynamic range loss caused by under-display capture. Therefore, after the sensor completes the exposure, the frames from the visible light under-display camera are first HDR fused in the RAW domain, and the image information captured by the infrared camera is then used to compensate for the image information lost by the visible light camera, effectively improving the shooting quality of the under-display camera.
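
The HDR fusion of the EV0, EV-2, and EV-4 frames can be sketched as follows. This application does not specify the fusion algorithm, so the sketch uses a simple stand-in: for each pixel it takes the longest exposure that is not clipped and rescales it to a common radiance scale. It assumes linear RAW values normalized to [0, 1] and that each -1 EV halves the exposure; the saturation threshold and the `capture` helper are hypothetical.

```python
# Relative exposure of each frame: every -1 EV halves the collected light.
EXPOSURE_GAIN = {"EV0": 1.0, "EV-2": 0.25, "EV-4": 0.0625}
ORDER = ("EV0", "EV-2", "EV-4")  # longest exposure first

def hdr_fuse(frames, saturation=0.98):
    """Fuse frames (dict: name -> list of linear pixel values) into radiance values."""
    fused = []
    for i in range(len(frames[ORDER[0]])):
        for name in ORDER:
            value = frames[name][i]
            # Use the longest exposure that is not clipped; fall back to the shortest.
            if value < saturation or name == ORDER[-1]:
                fused.append(value / EXPOSURE_GAIN[name])
                break
    return fused

def capture(radiance, gain):
    """Hypothetical sensor: linear response clipped at 1.0."""
    return [min(1.0, r * gain) for r in radiance]

scene = [0.1, 0.5, 2.0, 10.0]  # true radiance, spanning a wide dynamic range
frames = {name: capture(scene, gain) for name, gain in EXPOSURE_GAIN.items()}
hdr = hdr_fuse(frames)
```

In this example the two brightest scene values are clipped in EV0 but are recovered from the EV-2 and EV-4 frames.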

[0138] To implement the method of the embodiments of this application, based on the same inventive concept, the embodiments of this application also provide an under-display image processing device. As shown in Figure 15, the device 150 includes:

[0139] The acquisition module 1501 is configured to acquire at least two first images captured by the first under-screen camera of the shooting device under different exposure parameters, and a second image captured by the second under-screen camera;

[0140] Image processing module 1502 is configured to obtain a target image based on the at least two frames of the first image and the second image;

[0141] The first under-display camera and the second under-display camera are located below the screen of the shooting device. The first under-display camera is used to sense light signals in a first wavelength range, and the second under-display camera is used to sense light signals in a second wavelength range. The first wavelength range is smaller than the second wavelength range.

[0142] As shown in Figure 7, the shooting device includes a screen, a first under-screen camera, and a second under-screen camera. The screen includes a light-transmitting area and a normal display area. The first and second under-screen cameras are positioned below the light-transmitting area. The first under-screen camera captures two or more first images at different exposure times, and the second under-screen camera can capture a second image based on a fixed exposure time.

[0143] For example, in some embodiments, the light signal in the first wavelength range is a visible light signal, and the light signal in the second wavelength range is an infrared light signal; the first under-display camera is a visible light camera, and the second under-display camera is an infrared camera.

[0144] In this way, taking advantage of the large difference in the screen's transmittance for light signals of different wavelengths, under-display cameras that sense light signals in two different wavelength ranges are placed under the screen. The image information captured by the second under-display camera is used to compensate for the image information lost by the first under-display camera, effectively improving the shooting quality of the under-display camera.

[0145] For example, in some embodiments, the image processing module 1502 is configured to:

[0146] At least two frames of the first image acquired at different exposure times are subjected to a first fusion process to obtain a first fused image;

[0147] The first fused image is diffractively repaired based on the first repair network to obtain the repaired first fused image;

[0148] The second image is then subjected to diffraction repair based on the second repair network to obtain the repaired second image.

[0149] A second fusion process is performed on the repaired first fused image and the repaired second image to obtain a second fused image;

[0150] The target image is obtained based on the second fused image.
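
The order of operations performed by the image processing module can be summarized in a short sketch. The function `process_under_display` and its callable parameters are hypothetical; the stand-in implementations passed to it (mean fusion, identity repair, averaging, clamping) are placeholders that only make the example runnable and do not represent the actual fusion, repair networks, or image signal processor.

```python
def process_under_display(first_images, second_image, *, hdr_fuse,
                          first_repair, second_repair, second_fuse, isp):
    """Run the five stages in order and return the target image."""
    if len(first_images) < 2:
        raise ValueError("at least two first images are required")
    first_fused = hdr_fuse(first_images)                        # first fusion
    repaired_first = first_repair(first_fused)                  # first repair network
    repaired_second = second_repair(second_image)               # second repair network
    second_fused = second_fuse(repaired_first, repaired_second) # second fusion
    return isp(second_fused)                                    # target image

calls = []  # records the order in which the stages run

def mean_fuse(frames):
    calls.append("hdr_fuse")
    return [sum(px) / len(px) for px in zip(*frames)]

def identity_repair(name):
    def repair(img):
        calls.append(name)
        return list(img)
    return repair

def average_fuse(a, b):
    calls.append("second_fuse")
    return [(p + q) / 2 for p, q in zip(a, b)]

def clamp_isp(img):
    calls.append("isp")
    return [min(1.0, max(0.0, v)) for v in img]

target = process_under_display(
    [[0.25, 0.5], [0.75, 1.0]], [0.5, 0.25],
    hdr_fuse=mean_fuse,
    first_repair=identity_repair("first_repair"),
    second_repair=identity_repair("second_repair"),
    second_fuse=average_fuse,
    isp=clamp_isp,
)
```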

[0151] For example, in some embodiments, the image processing module 1502 is further configured to: acquire a training dataset, wherein the training dataset includes first on-screen image data and first under-screen image data corresponding to the target under-screen camera, and the first on-screen image data and the first under-screen image data correspond to each other; and train a target restoration network based on the training dataset to obtain a trained target restoration network;

[0152] Wherein, when the target under-display camera is the first under-display camera, the target repair network is the first repair network; when the target under-display camera is the second under-display camera, the target repair network is the second repair network.

[0153] For example, in some embodiments, the training dataset includes a generated dataset and a collected dataset;

[0154] The image processing module 1502 is further configured to: acquire first on-screen real image data collected by the target under-screen camera when there is no screen in the direction of light received by the target under-screen camera; process the first on-screen real image data based on the screen degradation model corresponding to the target under-screen camera to obtain first under-screen simulated image data; form a first data pair using the first on-screen real image data and the first under-screen simulated image data to obtain the generated dataset; acquire second on-screen real image data collected by the target under-screen camera when there is no screen in the direction of light received by the target under-screen camera; acquire second under-screen real image data collected by the target under-screen camera when there is a screen in the direction of light received by the target under-screen camera; and form a second data pair using the second on-screen real image data and the second under-screen real image data to obtain the collected dataset.

[0155] For example, in some embodiments, the image processing module 1502 is further configured to perform data augmentation on the generated dataset and the collected dataset to obtain an augmented generated dataset and a training dataset.

[0156] For example, in some embodiments, the image processing module 1502 is further configured to: acquire third on-screen real image data captured by the target under-display camera when there is no screen in the direction from which the target under-display camera receives light;

[0157] acquire third under-screen real image data captured by the target under-display camera when there is a screen in that direction, wherein the third on-screen real image data and the third under-screen real image data correspond to each other; and

[0158] obtain the screen degradation model based on the third on-screen real image data and the third under-screen real image data.

[0159] It should be noted that the embodiments of this application involve no first under-screen real image data. The second on-screen real image data and the second under-screen real image data denote one data pair, and the third on-screen real image data and the third under-screen real image data denote another data pair; the ordinals are used only for ease of understanding.

[0160] For example, in some embodiments, the first fusion process is a high dynamic range fusion, and the second fusion process is a Laplacian pyramid fusion.
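
A Laplacian pyramid fusion can be sketched in a few lines for one-dimensional signals. This is a simplified illustration under stated assumptions: pair averaging replaces the usual Gaussian filtering, detail coefficients are merged by keeping the one with the larger magnitude, and the low-frequency residuals are averaged. The merge rules, function names, and sample signals are not taken from this application, and the signal length must be divisible by 2 raised to the number of levels.

```python
def downsample(sig):
    """Halve the length by averaging adjacent pairs."""
    return [(sig[i] + sig[i + 1]) / 2 for i in range(0, len(sig), 2)]

def upsample(sig):
    """Double the length by repeating each sample."""
    return [v for v in sig for _ in range(2)]

def build_laplacian(sig, levels):
    """Return [detail_0, ..., detail_{levels-1}, low-frequency residual]."""
    pyramid, current = [], list(sig)
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append([a - b for a, b in zip(current, upsample(smaller))])
        current = smaller
    pyramid.append(current)
    return pyramid

def collapse(pyramid):
    """Invert build_laplacian by adding details back from coarse to fine."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = [d + u for d, u in zip(detail, upsample(current))]
    return current

def fuse(a, b, levels=2):
    """Fuse two signals level by level and reconstruct the result."""
    pa, pb = build_laplacian(a, levels), build_laplacian(b, levels)
    # Detail levels: keep the coefficient with the larger magnitude.
    fused = [[x if abs(x) >= abs(y) else y for x, y in zip(la, lb)]
             for la, lb in zip(pa[:-1], pb[:-1])]
    # Residual level: average the two low-frequency signals.
    fused.append([(x + y) / 2 for x, y in zip(pa[-1], pb[-1])])
    return collapse(fused)

visible = [1.0] * 8                                   # detail lost behind the screen
infrared = [0.0, 2.0, 0.0, 2.0, 0.0, 2.0, 0.0, 2.0]   # detail preserved
fused = fuse(visible, infrared)
```

Here the fused signal inherits the fine detail of the infrared input while the overall brightness level, which both inputs share, is kept.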

[0161] For example, in some embodiments, the image processing module 1502 is further configured to: perform image processing on the second fused image based on the image signal processor to obtain the target image;

[0162] The image processing includes color correction, gamma correction, and noise reduction.
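
The three named operations can be illustrated with a toy processing chain on a row of RGB pixels. The order of the stages, the 3-tap mean filter used as noise reduction, the 3x3 color correction matrix, and the gamma value of 2.2 are all assumptions chosen for illustration; an actual image signal processor is considerably more elaborate.

```python
def denoise(pixels):
    """3-tap mean filter along a row of RGB pixels, with edge replication."""
    n = len(pixels)
    out = []
    for i in range(n):
        window = [pixels[max(i - 1, 0)], pixels[i], pixels[min(i + 1, n - 1)]]
        out.append(tuple(sum(p[ch] for p in window) / 3 for ch in range(3)))
    return out

def color_correct(pixels, ccm):
    """Multiply every RGB pixel by a 3x3 color correction matrix."""
    return [tuple(sum(m * c for m, c in zip(row, px)) for row in ccm) for px in pixels]

def gamma_correct(pixels, gamma=2.2):
    """Clamp to [0, 1] and apply the encoding curve value ** (1 / gamma)."""
    return [tuple(min(1.0, max(0.0, c)) ** (1.0 / gamma) for c in px) for px in pixels]

def isp(pixels, ccm, gamma=2.2):
    """Noise reduction, then color correction, then gamma correction."""
    return gamma_correct(color_correct(denoise(pixels), ccm), gamma)

IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
row = [(0.25, 0.25, 0.25)] * 4          # uniform gray test row
processed = isp(row, IDENTITY)
```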

[0163] It should be noted that the aforementioned under-display image processing device can be a terminal device or an image processing chip used in a terminal device.

[0164] Based on the hardware implementation of each unit in the above-described under-display image processing device, this application embodiment also provides a terminal device. As shown in Figure 16, the terminal device 160 includes: a processor 1601 and a memory 1602 configured to store computer programs capable of running on the processor;

[0165] When the processor 1601 is configured to run a computer program, it executes the method steps described in the foregoing embodiments.

[0166] Of course, in practical applications, as shown in Figure 16, the various components in this terminal device are coupled together via bus system 1603. It can be understood that bus system 1603 is used to enable communication between these components. In addition to a data bus, bus system 1603 also includes a power bus, a control bus, and a status signal bus. However, for clarity, all buses are labeled as bus system 1603 in the figure.

[0167] In practical applications, the aforementioned processor can be at least one of the following: Application-Specific Integrated Circuit (ASIC), Digital Signal Processing Device (DSPD), Programmable Logic Device (PLD), Field-Programmable Gate Array (FPGA), controller, microcontroller, and microprocessor. It is understood that, for different devices, the electronic devices used to implement the functions of the aforementioned processor can also be other types, and the embodiments of this application do not specifically limit this.

[0168] The aforementioned memory can be volatile memory, such as random-access memory (RAM); or non-volatile memory, such as read-only memory (ROM), flash memory, hard disk drive (HDD), or solid-state drive (SSD); or a combination of the above types of memory, and provides instructions and data to the processor.

[0169] In practical applications, the aforementioned device can be a terminal device or a chip applied to a terminal device. In this application, the device can implement the functions of multiple units through software, hardware, or a combination of both, enabling the device to execute the under-display image processing method provided in any of the above embodiments. Furthermore, the technical effects of each technical solution of this device can be referenced to the technical effects of the corresponding technical solutions in the under-display image processing method, and will not be elaborated upon further in this application.

[0170] In an exemplary embodiment, this application also provides a computer-readable storage medium, such as a memory including a computer program, which can be executed by a processor of the under-display image processing device to perform the steps of the aforementioned method.

[0171] This application also provides a computer program product, including computer program instructions.

[0172] Optionally, the computer program product can be applied to the terminal device in the embodiments of this application, and the computer program instructions cause the computer to execute the corresponding processes implemented by the terminal device in the various methods of the embodiments of this application. For the sake of brevity, they will not be described in detail here.

[0173] This application also provides a computer program.

[0174] Optionally, the computer program can be applied to the terminal device in the embodiments of this application. When the computer program is run on the computer, it causes the computer to execute the corresponding processes implemented by the terminal device in the various methods of the embodiments of this application. For the sake of brevity, it will not be described in detail here.

[0175] It should be understood that the terminology used in this application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. The singular forms “a,” “an,” and “the” used in this application and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term “and / or” as used herein refers to and includes any or all possible combinations of one or more of the associated listed items. The expressions “having,” “may have,” “comprising,” and “including,” or “may include” and “may contain” used herein may be used to indicate the presence of a corresponding feature (e.g., an element such as a number, function, operation, or component), but do not exclude the presence of additional features.

[0176] It should be understood that although the terms first, second, third, etc., may be used in this application to describe various information, this information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another, and are not necessarily used to describe a specific order or sequence. For example, without departing from the scope of this invention, first information may also be referred to as second information, and similarly, second information may also be referred to as first information.

[0177] The technical solutions described in the embodiments of this application can be combined arbitrarily without conflict.

[0178] In the several embodiments provided in this application, it should be understood that the disclosed methods, apparatus, and devices can be implemented in other ways. The embodiments described above are merely illustrative. For example, the division of units is only a logical functional division, and in actual implementation, there may be other division methods, such as: multiple units or components can be combined, or integrated into another system, or some features can be ignored or not executed. In addition, the coupling, direct coupling, or communication connection between the various components shown or discussed can be through some interfaces, and the indirect coupling or communication connection of devices or units can be electrical, mechanical, or other forms.

[0179] The units described above as separate components may or may not be physically separate. The components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected to achieve the purpose of this embodiment according to actual needs.

[0180] In addition, each functional unit in the various embodiments of this application can be integrated into one processing unit, or each unit can be a separate unit, or two or more units can be integrated into one unit; the integrated unit can be implemented in hardware or in the form of hardware plus software functional units.

[0181] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any changes or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application.

Claims

1. An under-display image processing method, characterized in that, The method includes: Acquire at least two first images captured by the first under-screen camera of the shooting device at different exposure times, and a second image captured by the second under-screen camera; High dynamic range fusion is performed on at least two first images acquired at different exposure times to obtain a first fused image; The first fused image is diffractively repaired based on the first repair network to obtain the repaired first fused image; The second image is then subjected to diffraction repair based on the second repair network to obtain the repaired second image. A second fusion process is performed on the repaired first fused image and the repaired second image to obtain a second fused image; The target image is obtained based on the second fused image; The first under-display camera and the second under-display camera are located below the screen of the shooting device. The first under-display camera is used to sense light signals in a first wavelength range, and the second under-display camera is used to sense light signals in a second wavelength range. The first wavelength range is smaller than the second wavelength range.

2. The method according to claim 1, characterized in that, The method further includes: Obtain a training dataset; wherein the training dataset includes first on-screen image data and first under-screen image data corresponding to the target under-screen camera, and the first on-screen image data and the first under-screen image data correspond to each other; The target repair network is trained based on the training dataset to obtain the trained target repair network; Wherein, when the target under-display camera is the first under-display camera, the target repair network is the first repair network; When the target under-display camera is the second under-display camera, the target repair network is the second repair network.

3. The method according to claim 2, characterized in that, The training dataset includes a generated dataset and a collected dataset; The acquisition of the training dataset includes: When there is no screen in the direction from which the target under-display camera receives light, acquiring first on-screen real image data captured by the target under-display camera; Processing the first on-screen real image data based on the screen degradation model corresponding to the target under-display camera to obtain first under-screen simulated image data; Forming a first data pair from the first on-screen real image data and the first under-screen simulated image data to obtain the generated dataset; When there is no screen in the direction from which the target under-display camera receives light, acquiring second on-screen real image data captured by the target under-display camera; When there is a screen in the direction from which the target under-display camera receives light, acquiring second under-screen real image data captured by the target under-display camera; Forming a second data pair from the second on-screen real image data and the second under-screen real image data to obtain the collected dataset.

4. The method according to claim 3, characterized in that, The method further includes: Data augmentation is performed on the generated dataset and the collected dataset to obtain the augmented generated dataset and the training dataset.

5. The method according to claim 3, characterized in that, The method further includes: When there is no screen in the direction from which the target under-display camera receives light, acquiring third on-screen real image data captured by the target under-display camera; When there is a screen in the direction from which the target under-display camera receives light, acquiring third under-screen real image data captured by the target under-display camera; wherein the third on-screen real image data and the third under-screen real image data correspond to each other; Obtaining the screen degradation model based on the third on-screen real image data and the third under-screen real image data.

6. The method according to claim 1, characterized in that, The second fusion process is a Laplacian pyramid fusion.

7. The method according to claim 1, characterized in that, The process of obtaining the target image based on the second fused image includes: The second fused image is processed using an image signal processor to obtain the target image; The image processing includes color correction, gamma correction, and noise reduction.

8. The method according to any one of claims 1-7, characterized in that, The optical signal in the first band is a visible light signal, and the optical signal in the second band is an infrared light signal; The first under-display camera is a visible light camera, and the second under-display camera is an infrared camera.

9. An under-display image processing device, characterized in that, The device includes: The acquisition module is configured to acquire at least two first images captured by the first under-screen camera of the shooting device under different exposure parameters, and a second image captured by the second under-screen camera; The image processing module is configured to: perform high dynamic range fusion on at least two first images acquired at different exposure times to obtain a first fused image; perform diffraction repair on the first fused image based on a first repair network to obtain a repaired first fused image; perform diffraction repair on the second image based on a second repair network to obtain a repaired second image; perform a second fusion process on the repaired first fused image and the repaired second image to obtain a second fused image; and obtain a target image based on the second fused image. The first under-display camera and the second under-display camera are located below the screen of the shooting device. The first under-display camera is used to sense light signals in a first wavelength range, and the second under-display camera is used to sense light signals in a second wavelength range. The first wavelength range is smaller than the second wavelength range.

10. The apparatus according to claim 9, characterized in that, The optical signal in the first band is a visible light signal, and the optical signal in the second band is an infrared light signal; The first under-display camera is a visible light camera, and the second under-display camera is an infrared camera.

11. A terminal device, characterized in that, The terminal device includes: a processor and a memory configured to store computer programs capable of running on the processor. Wherein, when the processor is configured to run the computer program, it performs the steps of the method according to any one of claims 1 to 8.

12. A computer-readable storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 8.