Image processing methods and related devices

By performing mean and difference calculations on multiple frames of images captured under a stroboscopic light source, a stripe mapping image is generated, and the pixel weights are adjusted. This solves the problem of stripes in images captured under a stroboscopic light source and achieves efficient stripe removal.

CN120751271B · Active · Publication Date: 2026-06-30 · HONOR DEVICE CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: HONOR DEVICE CO LTD
Filing Date: 2024-08-30
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

Images captured under stroboscopic light sources often exhibit banding phenomena (such as bright and dark bands), which are difficult to remove effectively with existing techniques.

Method used

A fused image is generated by averaging multiple frames of images. The difference between the fused image and the image to be processed is calculated to obtain a stripe mapping image. Stripes are removed by weighted fusion of pixel values according to the stripe mapping image. The specific method includes extracting mask images of dark and bright stripes and adjusting the weight of pixels during the fusion process.

Benefits of technology

The method effectively removes banding and improves image quality, especially for images of stationary objects captured under a stroboscopic light source.

✦ Generated by Eureka AI based on patent content.


Abstract

This application provides an image processing method and related apparatus, applied to an electronic device. A camera operating in the electronic device captures multiple frames of an object with a first exposure time. The object is in an environment with a strobe light source and is stationary. The first exposure time is less than the strobe period. The method includes: performing an average operation on the multiple frames to obtain a fused image; obtaining a stripe mapping image based on the difference between the fused image and the image to be processed; and fusing the fused image and the image to be processed based on the stripe mapping image to obtain an image without stripes. During the fusion process, the larger the value of the first pixel in the stripe mapping image, the greater the weight of the fused image at the first pixel, and the smaller the weight of the image to be processed at the first pixel. This makes the first pixel after fusion more susceptible to the influence of the stripe-free fused image, thereby achieving the purpose of stripe removal.

Description

Technical Field

[0001] This application relates to the field of digital signal processing technology, and in particular to an image processing method and related apparatus.

Background Technology

[0002] A stroboscopic light source can be understood as a light source powered by alternating current with a certain frequency, such as a common fluorescent lamp.

[0003] Images taken under a stroboscopic light source may exhibit banding, also known as water ripples. For example, as shown in Figure 1, photos taken indoors with the lights on contain bands, including bright bands and dark bands.

Summary of the Invention

[0004] This application provides an image processing method and related apparatus to remove bands from an image. The disclosed technical solution is as follows:

[0005] The first aspect of this application provides an image processing method applied to an electronic device. A camera application running in the electronic device invokes the camera to capture multiple frames of an object with a first exposure time. The object is in an environment with a strobe light source and is stationary. The first exposure time is less than the strobe period. The method includes: performing an average operation on the multiple frames to obtain a fused image; obtaining a stripe mapping image based on the difference between the fused image and the image to be processed; and fusing the fused image and the image to be processed based on the stripe mapping image to obtain a stripe-removed image. During the fusion, the value of a first pixel in the stripe-removed image is obtained by weighted fusion of the value of the first pixel in the fused image and the value of the first pixel in the image to be processed. The larger the value of the first pixel in the stripe mapping image, the greater the weight of the first pixel in the fused image and the smaller the weight of the first pixel in the image to be processed.

[0006] Based on the characteristics of the camera and the environment, it is known that both the multi-frame images and the image to be processed contain bands, but the positions of the bands are different in the multi-frame images. Therefore, the fused image obtained by the mean operation does not contain bands. Thus, in the band mapping image obtained based on the difference between the fused image and the image to be processed, the larger the value of a pixel, the greater the difference between the fused image and the image to be processed at that pixel. Therefore, that pixel is more likely to be a pixel on a band. So, during the fusion process, the larger the value of the first pixel (which can be any pixel) in the band mapping image, the greater the weight of the fused image at the first pixel, and the smaller the weight of the image to be processed at the first pixel. This makes the pixel after fusion more affected by the band-free fused image, thereby achieving the purpose of removing bands.
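The weighting rule described in [0006] can be illustrated with a minimal sketch. It assumes NumPy arrays of equal shape and a mapping image that is already normalized to [0, 1] and used directly as the per-pixel weight of the fused image. The function name `fuse_with_banding_map` and the linear weighting are illustrative assumptions; the application does not specify them here.

```python
import numpy as np

def fuse_with_banding_map(fused, target, banding_map):
    """Blend two images per pixel; larger map values favor the fused image."""
    fused = np.asarray(fused, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    # Assumption: the map is normalized to [0, 1]; clip defensively.
    w = np.clip(np.asarray(banding_map, dtype=np.float64), 0.0, 1.0)
    # Weight w goes to the band-free fused image, (1 - w) to the image to be processed.
    return w * fused + (1.0 - w) * target

# Hypothetical 2x2 example: the image to be processed has a dark and a bright band pixel.
fused = np.array([[100.0, 100.0], [100.0, 100.0]])
target = np.array([[100.0, 60.0], [140.0, 100.0]])
banding_map = np.array([[0.0, 1.0], [0.5, 0.0]])
result = fuse_with_banding_map(fused, target, banding_map)
```

Where the map is 0 the image to be processed is kept unchanged, and where it is 1 the pixel is taken entirely from the fused image.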

[0007] In some implementations, obtaining the stripe mapping image based on the difference between the fused image and the image to be processed includes: obtaining the difference between the fused image and the image to be processed to obtain a stripe mask image; extracting dark stripe mask images and bright stripe mask images from the stripe mask image to obtain the stripe mapping image. Extracting the dark stripe mask images and bright stripe mask images helps to process dark and bright stripes separately, thereby improving the accuracy of the stripe mapping image and further improving the stripe removal effect.

[0008] In some implementations, extracting dark and bright stripe mask images from a stripe mask image includes: in the stripe mask image, keeping the original value of (or setting to 1) each pixel whose value is greater than or equal to a first threshold (such as threshold value thr_1), and setting each pixel whose value is less than the first threshold to 0, to obtain a dark stripe mask image; and, in the flipped image of the stripe mask image, keeping the original value of (or setting to 1) each pixel whose value is greater than or equal to the first threshold, and setting each pixel whose value is less than the first threshold to 0, to obtain a bright stripe mask image.
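The thresholding in [0008] might look like the following sketch. Several points are assumptions made only for illustration: the stripe mask is taken to be a signed difference (fused image minus image to be processed), so dark stripes appear as positive values; the "flipped image" is interpreted as the sign-inverted mask; and the names `extract_stripe_masks`, `thr`, and `binary` are hypothetical.

```python
import numpy as np

def extract_stripe_masks(stripe_mask, thr, binary=True):
    """Return (dark_mask, bright_mask) obtained by thresholding at thr."""
    stripe_mask = np.asarray(stripe_mask, dtype=np.float64)
    # Assumption: "flipped" means sign inversion of the signed difference.
    flipped = -stripe_mask

    def threshold(img):
        # Pixels >= thr keep their value (or become 1); all others become 0.
        kept = 1.0 if binary else img
        return np.where(img >= thr, kept, 0.0)

    return threshold(stripe_mask), threshold(flipped)

# Hypothetical signed stripe mask: one dark-stripe pixel, one bright-stripe pixel, one clean pixel.
stripe_mask = np.array([[30.0, -25.0, 5.0]])
dark, bright = extract_stripe_masks(stripe_mask, thr=20.0)
dark_raw, bright_raw = extract_stripe_masks(stripe_mask, thr=20.0, binary=False)
```

The `binary` flag covers both options named in the text: setting retained pixels to 1, or keeping their original values.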

[0009] The second aspect of this application provides an image processing method applied to an electronic device. A camera application running in the electronic device captures an object using a staggered high dynamic range (stagger HDR) camera. The object is in an environment with a strobe light source. The exposure duration of a long frame captured by the stagger HDR camera is greater than or equal to the strobe period, and the exposure duration of a short frame captured by the stagger HDR camera is less than the strobe period. The method includes: acquiring a stripe mapping image based on the difference between a first long frame and a first short frame. The first long frame and the first short frame have the same capture time. Given the exposure durations of the stagger HDR camera and the characteristics of the ambient light source, the difference between the first long frame and the first short frame is caused by striping and does not contain differences caused by motion; the stripe mapping image reflects this difference.

[0010] Based on the stripe mapping image, a first image and a second image are fused to obtain a stripe-removed image. The first image is obtained from the first long frame, and the second image is obtained from the first short frame. During the fusion, the value of a first pixel in the stripe-removed image is obtained by weighted fusion of the value of the first pixel in the first image and the value of the first pixel in the second image. The larger the value of the first pixel in the stripe mapping image, the greater the weight of the first pixel in the first image and the smaller the weight of the first pixel in the second image. The first image can be the first long frame. To obtain a higher-quality stripe-removed image, the first image can also be an image obtained by registering the first long frame to the first short frame and aligning its brightness. Similarly, the second image can be the first short frame, or an image obtained by registering the first short frame to the first long frame. Because a larger value at the first pixel of the stripe mapping image gives the first image more weight and the second image less weight, the first pixel of the fused result is influenced more by the stripe-free first image, which removes the stripes.

[0011] In some implementations, obtaining the stripe mapping image based on the difference between the first long frame and the first short frame includes: calculating the difference between the first long frame and the first short frame; obtaining a difference mask image, which represents the difference between the first long frame and the first short frame caused by object motion and striping; obtaining a stripe mask image based on a pre-obtained motion mask image and a difference mask image, where the motion mask image represents the difference between the first long frame and adjacent long frames caused by object motion, and the stripe mask image represents the difference between the first long frame and the first short frame caused by striping; and extracting dark stripe mask images and bright stripe mask images from the stripe mask image to obtain the stripe mapping image. Removing motion-induced differences from the difference mask image helps to obtain a more accurate stripe mapping image representing the stripes, thus improving the accuracy of stripe removal.

[0012] In some implementations, before obtaining the stripe mask image based on the pre-acquired motion mask image and the difference mask image, the method further includes: obtaining the motion mask image based on the difference between the first long frame and a second long frame, wherein the motion mask image represents the difference between the first long frame and the second long frame caused by the motion of the object. The first long frame and the second long frame are image frames with adjacent timestamps, which yields a more accurate motion difference.
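One way to picture [0011] and [0012] together is the sketch below. The application does not state how the motion mask is combined with the difference mask, so this sketch assumes a binary motion mask whose non-zero pixels are simply zeroed out of the difference mask. It also assumes the long and short frames are already registered and brightness-aligned. The function names and the threshold `motion_thr` are hypothetical.

```python
import numpy as np

def motion_mask_from_long_frames(long_a, long_b, motion_thr):
    """Binary mask of pixels that differ between two adjacent long frames."""
    a = np.asarray(long_a, dtype=np.float64)
    b = np.asarray(long_b, dtype=np.float64)
    # Long frames are exposed for at least one strobe period, so their
    # difference is attributed to object motion rather than to stripes.
    return (np.abs(a - b) >= motion_thr).astype(np.float64)

def stripe_mask_from_frames(long_frame, short_frame, motion_mask):
    """Difference mask (motion + stripes) with motion pixels suppressed."""
    diff_mask = (np.asarray(long_frame, dtype=np.float64)
                 - np.asarray(short_frame, dtype=np.float64))
    # Assumption: motion pixels are zeroed so that only stripe differences remain.
    return np.where(np.asarray(motion_mask) > 0, 0.0, diff_mask)

# Hypothetical 1x3 frames: pixel 1 carries a dark stripe, pixel 2 moved.
long_1 = np.array([[100.0, 100.0, 100.0]])
long_2 = np.array([[100.0, 100.0, 160.0]])
short_1 = np.array([[100.0, 70.0, 150.0]])
motion_mask = motion_mask_from_long_frames(long_1, long_2, motion_thr=20.0)
stripe_mask = stripe_mask_from_frames(long_1, short_1, motion_mask)
```

In this example the moving pixel is excluded, and only the stripe-induced difference of 30 survives in the stripe mask.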

[0013] In some implementations, extracting dark and bright stripe mask images from a stripe mask image includes: in the stripe mask image, keeping the original value of (or setting to 1) each pixel whose value is greater than or equal to a first threshold (such as threshold value thr_2), and setting each pixel whose value is less than the first threshold to 0, to obtain a dark stripe mask image; and, in the inverted image of the stripe mask image, keeping the original value of (or setting to 1) each pixel whose value is greater than or equal to the first threshold, and setting each pixel whose value is less than the first threshold to 0, to obtain a bright stripe mask image.

[0014] In some implementations, before fusing the first image and the second image based on the stripe mapping image, the method further includes: registering the first long frame and the first short frame to obtain a registered long frame and a registered short frame; and aligning the brightness of the registered long frame to that of the registered short frame to obtain a brightness-processed long frame. The first image is the brightness-processed long frame, and the second image is the registered short frame. Aligning the position and brightness of the first long frame and the first short frame helps to improve the quality of the stripe-removed image.

[0015] A third aspect of this application provides an electronic device, comprising: one or more processors, a memory, and a touch screen, wherein the memory is used to store program code, and the processor is used to run the program code, thereby enabling the electronic device to implement the image processing method provided in the first or second aspect of this application.

[0016] The fourth aspect of this application provides a computer-readable storage medium having instructions stored thereon, which, when executed on an electronic device, cause the electronic device to perform the image processing method provided in the first or second aspect of this application.

Description of the Drawings

[0017] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0018] Figure 1 is an example image containing stripes;

[0019] Figure 2 is a structural example diagram of an electronic device provided in the embodiments of this application;

[0020] Figure 3 is an example diagram of the software framework running in the electronic device provided in the embodiments of this application;

[0021] Figure 4 is a flowchart of an image processing method provided in the embodiments of this application;

[0022] Figure 5 is a flowchart of yet another image processing method provided in the embodiments of this application;

[0023] Figure 6 is a flowchart of obtaining the BandingMap in the image processing method provided in the embodiments of this application;

[0024] Figure 7 is a flowchart of yet another image processing method provided in the embodiments of this application;

[0025] Figure 8 is an example diagram of obtaining weights based on the BandingMap in the image processing method provided in the embodiments of this application;

[0026] Figure 9 is a flowchart of yet another image processing method provided in the embodiments of this application;

[0027] Figure 10 is a flowchart of another image processing method provided in the embodiments of this application.

Detailed Implementation

[0028] The terms "first," "second," and "third," etc., used in this application specification, claims, and drawings are used to distinguish different objects, not to limit a specific order.

[0029] In the embodiments of this application, the words "in some implementations" or "for example" are used to indicate examples, illustrations or descriptions, and should not be construed as being more preferred or more advantageous than other embodiments or designs.

[0030] To remove banding from an image, embodiments of this application disclose an image processing method applied to an electronic device.

[0031] Electronic devices include, but are not limited to, mobile phones, tablets, desktop, laptop, notebook computers, ultra-mobile personal computers (UMPCs), handheld computers, netbooks, personal digital assistants (PDAs), tablet computers (PADs), and wearable electronic devices such as smartwatches with cameras.

[0032] As shown in Figure 2, taking a mobile phone as an example, the electronic device 100 may include a processor 110, an internal memory 120, a display screen 130, a camera 140, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, and an audio module 170, etc.

[0033] It is understood that the structure illustrated in this embodiment does not constitute a specific limitation on the electronic device. In other embodiments, the electronic device may include more or fewer components than illustrated, or combine some components, or split some components, or have different component arrangements. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.

[0034] Processor 110 may include one or more processing units, such as application processors (APs), modem processors, graphics processing units (GPUs), image signal processors (ISPs), controllers, video codecs, digital signal processors (DSPs), baseband processors, and / or neural network processing units (NPUs). These different processing units may be independent devices or integrated into one or more processors.

[0035] Internal memory 120 can be used to store executable program code, including instructions. Processor 110 executes various functional applications and data processing of electronic device 100 by running the instructions stored in internal memory 120. Internal memory 120 may include a program storage area and a data storage area. The program storage area may store the operating system, at least one application program required for a function (such as sound playback, image playback, etc.), etc. The data storage area may store data created during the use of the electronic device (such as audio data, phonebook, etc.). Furthermore, internal memory 120 may include high-speed random access memory and may also include non-volatile memory, such as at least one disk storage device, flash memory device, universal flash storage (UFS), etc. Processor 110 executes various functional applications and data processing of electronic device 100 by running instructions stored in internal memory 120 and / or instructions stored in memory located within the processor.

[0036] Electronic devices implement display functions through a GPU, a display screen 130, and an application processor. The GPU is a microprocessor for image processing, connecting the display screen 130 and the application processor. The GPU is used to perform mathematical and geometric calculations and for graphics rendering. The processor 110 may include one or more GPUs, which execute program instructions to generate or modify display information.

[0037] The display screen 130 is used to display images, videos, etc. The display screen 130 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a quantum dot light-emitting diode (QLED), etc. In some embodiments, the electronic device may include one or N display screens 130, where N is a positive integer greater than 1.

[0038] Electronic device 100 can perform shooting functions through the ISP, camera 140, video codec, GPU, display screen 130, and application processor.

[0039] The ISP (Image Signal Processor) is used to process data fed back from the camera 140. For example, when taking a picture, the shutter is opened, and light is transmitted through the lens to the camera's photosensitive element. The light signal is converted into an electrical signal, and the camera's photosensitive element transmits the electrical signal to the ISP for processing, transforming it into an image visible to the naked eye. The ISP can also perform algorithmic optimizations on image noise, brightness, and skin tone. The ISP can also optimize parameters such as exposure and color temperature of the shooting scene. In some embodiments, the ISP can be integrated into the camera 140.

[0040] Camera 140 is used to capture still images or videos. The lens forms an optical image of an object and projects it onto a photosensitive element. The photosensitive element can be a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the light signal into an electrical signal, which is then passed to an ISP for conversion into a digital image signal. The ISP outputs the digital image signal to a DSP for processing. The DSP converts the digital image signal into image signals in standard RGB, YUV, or other formats. In some embodiments, the electronic device 100 may include one or N cameras 140, where N is a positive integer greater than 1.

[0041] In some embodiments, camera 140 is a camera with an exposure time shorter than the flicker period of the ambient light source; in other embodiments, camera 140 is a stagger high dynamic range (stagger HDR) camera. These characteristics of camera 140 result in banding in the captured images, as shown in Figure 1. The details are explained in the subsequent embodiments.

[0042] Digital signal processors (DSPs) are used to process digital signals. Besides digital image signals, they can also process other digital signals. For example, when electronic device 100 selects a frequency, the DSP can perform Fourier transforms on the frequency energy.

[0043] Video codecs are used to compress or decompress digital video. Electronic device 100 may support one or more video codecs. Thus, electronic device 100 can play or record videos in various encoding formats, such as Moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, etc.

[0044] Internal memory 120 can be used to store computer executable program code, which includes instructions. Processor 110 executes various functional applications and data processing of electronic device 100 by running the instructions stored in internal memory 120. Internal memory 120 may include a program storage area and a data storage area. The program storage area may store the operating system, at least one application program required for a function (such as sound playback function, image playback function, etc.), etc. The data storage area may store data created during the use of electronic device 100 (such as audio data, phone book, etc.). In addition, internal memory 120 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, universal flash storage (UFS), etc. Processor 110 executes various functional applications and data processing of electronic device 100 by running instructions stored in internal memory 120 and / or instructions stored in memory disposed in the processor.

[0045] Electronic device 100 can implement audio functions such as music playback and recording through audio module 170, speaker 170A, microphone 170B, and application processor.

[0046] The audio module 170 is used to convert digital audio information into analog audio signals for output, and also to convert analog audio input into digital audio signals. The audio module 170 can also be used for encoding and decoding audio signals. In some embodiments, the audio module 170 may be located in the processor 110, or some functional modules of the audio module 170 may be located in the processor 110.

[0047] The speaker 170A, also known as a "loudspeaker," is used to convert audio electrical signals into sound signals. A user can listen to music or take hands-free calls through the speaker 170A of the electronic device 100.

[0048] In some embodiments, the speaker 170A can play video information with special effects mentioned in the embodiments of this application.

[0049] Microphone 170B, also known as a "mic" or "sound transducer," is used to convert sound signals into electrical signals. When making a phone call or sending a voice message, the user can speak with their mouth close to microphone 170B to input the sound signal into microphone 170B.

[0050] In some embodiments, microphone 170B can capture the sound of the environment in which the electronic device is located while the camera is capturing video information with special effects.

[0051] The wireless communication function of electronic device 100 can be realized through antenna 1, antenna 2, mobile communication module 150, wireless communication module 160, modem processor and baseband processor, etc.

[0052] Antenna 1 and antenna 2 are used to transmit and receive electromagnetic wave signals.

[0053] The mobile communication module 150 can provide solutions for wireless communication, including 2G / 3G / 4G / 5G, applied to the electronic device 100. The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc. The mobile communication module 150 can receive electromagnetic waves via antenna 1, and perform filtering, amplification, and other processing on the received electromagnetic waves before transmitting them to a modem processor for demodulation. The mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves for radiation via antenna 1.

[0054] The wireless communication module 160 can provide solutions for wireless communication applications on the electronic device 100, including wireless local area networks (WLANs) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technologies. The wireless communication module 160 can be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via antenna 2, performs frequency modulation and filtering of the electromagnetic wave signals, and sends the processed signal to processor 110. The wireless communication module 160 can also receive signals to be transmitted from processor 110, perform frequency modulation and amplification, and convert them into electromagnetic waves for radiation via antenna 2.

[0055] An operating system runs on top of these components. Examples include iOS, Android, and Windows. Applications can be installed and run on this operating system.

[0056] Figure 3 This is a software structure block diagram of an electronic device according to an embodiment of this application.

[0057] A layered architecture divides software into several layers, each with a clear role and function. Layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into five layers, from top to bottom: the application layer, the application framework layer, the Android runtime and system libraries, the Hardware Abstraction Layer (HAL), and the kernel layer.

[0058] The application layer can include a series of application packages. For example, as shown in Figure 3, the application packages can include applications such as camera and gallery.

[0059] In some embodiments, the camera is used to capture videos or images in response to user actions. The gallery is used to store videos or images captured by the camera.

[0060] The application framework layer provides application programming interfaces (APIs) and a programming framework for applications in the application layer. The application framework layer includes some predefined functions. For example, as shown in Figure 3, the shooting-related modules in the application framework layer may include the camera framework (CameraFwk), the media recorder (Media Recorder), and the media provider (Media Provider).

[0061] The Android Runtime includes core libraries and a virtual machine. The Android Runtime is responsible for scheduling and managing the Android system. In some embodiments of this application, application cold starts run within the Android Runtime, which obtains the application's optimized file status parameters. The Android Runtime can then use these parameters to determine whether the optimized file has become outdated due to system upgrades and returns the result to the application management module.

[0062] The system library can include multiple functional modules. For example, a surface manager, a 3D graphics processing library (e.g., OpenGL ES), a 2D graphics engine (e.g., SGL), and a recording service.

[0063] The Surface Manager manages the display subsystem and provides fusion of 2D and 3D layers for multiple applications. The 3D graphics processing library implements 3D graphics drawing, image rendering, compositing, and layer processing. The 2D graphics engine is the drawing engine for 2D graphics. The recording service receives video streams acquired by the recording function and stores them in the electronic device's memory.

[0064] The HAL includes the camera HAL and algorithm modules related to the embodiments of this application.

[0065] The kernel layer is the layer between hardware and software. It contains at least display drivers, camera drivers, audio drivers, and sensor drivers. The camera HAL controls the camera by calling the camera driver.

[0066] It should be noted that although the embodiments of this application are illustrated using the Android system as an example, the basic principles are also applicable to electronic devices based on operating systems such as iOS and Windows.

[0067] Based on the software framework shown in Figure 3, the process by which an electronic device captures images is as follows:

[0068] In response to a user's image capture operation, the camera application invokes the camera via modules such as the camera framework, camera HAL, and camera driver. The camera then captures multiple frames of images in response to the invocation. These captured frames can be processed by algorithm modules and then transmitted to the camera application. In the embodiments of this application, the camera application generates a shooting result based on the multiple frames to respond to the user's shooting operation, and the shooting result is stored in the gallery.

[0069] In other words, the camera application responds to a user's shooting operation by invoking the camera, which then captures multiple frames of images. Based on these multiple frames, the electronic device generates the image that is stored in the gallery in response to the user's shooting operation. The multiple frames themselves are therefore not the shooting result that the electronic device stores in the gallery.

[0070] While the camera application generates the shooting result, the image processing method provided in the following embodiments of this application is used to remove the stripes in the image.

[0071] Figure 4 shows an image processing method provided in the embodiments of this application, which is applied in the following scenario. Taking the camera shown in Figure 2 as an example, the camera responds to the invocation by the camera application by sequentially capturing multiple frames of images with the same exposure time (the first exposure time). The object being photographed is in an environment with a strobe light source and is stationary. The camera's exposure time is less than the flicker period of the strobe light source. The flicker period is the duration spanning one continuous bright (on) interval and one continuous dark (off) interval of the light source. For example, a fluorescent lamp powered by 220 V, 50 Hz AC has a flicker period of 10 milliseconds, so the camera's exposure time is less than 10 milliseconds.
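The 10-millisecond figure in [0071] follows from the lamp brightening twice per AC cycle, so the light flickers at twice the mains frequency. The small sketch below reproduces that arithmetic; the function names are hypothetical and the doubling rule is an assumption that matches the fluorescent lamp example in the text.

```python
def flicker_period_ms(mains_hz):
    """Flicker period in milliseconds for a lamp driven directly by AC mains."""
    # Assumption: light output peaks on both half-cycles, i.e. at 2 * mains_hz.
    return 1000.0 / (2.0 * mains_hz)

def exposure_is_shorter_than_flicker(exposure_ms, mains_hz):
    """True when the exposure is shorter than one flicker period (the banding scenario)."""
    return exposure_ms < flicker_period_ms(mains_hz)

period_50hz = flicker_period_ms(50)  # 10.0 ms, as in the 50 Hz example
short_exposure = exposure_is_shorter_than_flicker(5.0, 50)
long_exposure = exposure_is_shorter_than_flicker(20.0, 50)
```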

[0072] Given the limitations of the above scenario, it is understandable that multiple frames captured by the camera have different timestamps and the same exposure, with the timestamp indicating the time the image was captured. Because the object being photographed is stationary (i.e., not moving), there is no displacement between the images of the object in each frame. Furthermore, because the exposure time is less than the flash duration of the stroboscopic light source, all multiple frames exhibit banding.

[0073] In Figure 4, dashed lines with arrows represent data flow, and solid lines with arrows represent process steps. The process shown in Figure 4 includes the following steps:

[0074] S11. Perform a mean operation on the multiple frames of images to obtain a multi-frame fused image Frame_GT (referred to simply as the fused image).

[0075] The mean operation adds the multiple frames together pixel by pixel and divides the sum by the number of frames to obtain the fused image.

[0076] Based on the above scenario, since the camera's exposure time is less than the flicker period of the strobe light source, each frame captured by the camera in the environment of the strobe light source includes bands. It can be understood that each frame may have local bands or global bands. Local bands are bands present in only part of the image, while global bands are bands present across the entire area of the image.

[0077] Since the multiple frames were captured at different times, the positions of the bands in the multiple frames are different. Therefore, by averaging the multiple frames, a fused image without bands can be obtained.

[0078] In some implementations, the multiple frames of images include the first frame (denoted as Frame_0), the second frame (denoted as Frame_1), ..., the Nth frame (denoted as Frame_N-1), and the (N+1)th frame (denoted as Frame_N). The fused image Frame_GT is obtained by adding Frame_0, Frame_1, ..., Frame_N-1 together and taking the average.
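
As an illustration of S11, the following is a minimal NumPy sketch of the mean operation. The toy frames are hypothetical: a flat scene of value 100 in which each frame has one dimmed row and one brightened row at a different position, chosen so that the bands cancel exactly in the average. With real captures the cancellation would only be approximate.

```python
import numpy as np


def mean_fuse(frames):
    # Stack the frames to shape (N, H, W) in floating point and
    # average along the frame axis to obtain Frame_GT.
    stack = np.stack([np.asarray(f, dtype=np.float32) for f in frames], axis=0)
    return stack.mean(axis=0)


# Hypothetical toy data: 4 frames of a 4x2 flat scene.
scene = np.full((4, 2), 100.0, dtype=np.float32)
frames = []
for k in range(4):
    f = scene.copy()
    f[k, :] -= 40.0            # dark band at row k
    f[(k + 2) % 4, :] += 40.0  # bright band two rows away
    frames.append(f)

frame_gt = mean_fuse(frames)
print(frame_gt[:, 0])
```

Averaging in floating point rather than in the sensor's integer type avoids overflow when the frames are summed.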

[0079] S12. Obtain the difference between the fused image and the image to be processed to obtain the banding mask image banding_mask.

[0080] It can be understood that the image to be processed also contains bands.

[0081] In some implementations, the image to be processed is Frame_N, that is, the most recently captured frame. The fused image is obtained from the image frames captured before the most recently captured frame.

[0082] In some implementations, the difference between Frame_GT and the image to be processed, Frame_N, is calculated to obtain the banding mask image.

[0083] Since there are no bands in the fused image, the difference image is a banding mask image, denoted as banding_mask. The banding mask image can be understood as an image that contains bright and dark bands, but does not contain other content.
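
A minimal sketch of S12 follows. It assumes the difference is taken as Frame_GT minus Frame_N, which is consistent with the later thresholding step treating large positive values as dark bands. The subtraction order and the function name are assumptions made for illustration.

```python
import numpy as np


def banding_mask_from(frame_gt, frame_n):
    # Signed difference in floating point: positive where Frame_N is darker
    # than Frame_GT (dark band), negative where it is brighter (bright band).
    gt = np.asarray(frame_gt, dtype=np.float32)
    fn = np.asarray(frame_n, dtype=np.float32)
    return gt - fn


# Hypothetical toy data: band-free reference and a frame with two bands.
frame_gt = np.full((4, 2), 100.0, dtype=np.float32)
frame_n = frame_gt.copy()
frame_n[1, :] = 60.0   # dark band
frame_n[3, :] = 140.0  # bright band

banding_mask = banding_mask_from(frame_gt, frame_n)
print(banding_mask[:, 0])
```

The conversion to a signed floating-point type matters in practice: subtracting unsigned 8-bit images directly would wrap around wherever Frame_N is brighter than Frame_GT.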

[0084] S13. Extract the dark band mask image and the bright band mask image from the banding_mask to obtain the BandingMap.

[0085] In some implementations, a threshold `thr_1` is set. Pixel values in `banding_mask` greater than or equal to `thr_1` are kept unchanged (or set to 1), while pixel values less than `thr_1` are set to 0, resulting in the dark band mask image `banding_dark_mask`. The inverted image of `banding_mask` is then obtained, for example, by multiplying `banding_mask` by -1. In the inverted image, pixel values greater than or equal to `thr_1` are kept unchanged (or set to 1), while pixel values less than `thr_1` are set to 0, resulting in the bright band mask image `banding_light_mask`. It can be understood that `banding_dark_mask` contains only dark bands and `banding_light_mask` contains only bright bands. The two mask images are collectively referred to as the BandingMap.

[0086] Based on the above acquisition method, the BandingMap may be a grayscale image or a binary image.
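
The thresholding in S13 can be sketched as below, covering both the grayscale variant (values kept) and the binary variant (values set to 1). The function name, the `binary` flag, and the toy values are illustrative assumptions, and the threshold is assumed to be positive.

```python
import numpy as np


def extract_banding_map(banding_mask, thr, binary=False):
    mask = np.asarray(banding_mask, dtype=np.float32)
    inverted = -mask  # inverted image: banding_mask multiplied by -1
    if binary:
        # Binary variant: 1 on a band, 0 elsewhere.
        dark = (mask >= thr).astype(np.float32)
        light = (inverted >= thr).astype(np.float32)
    else:
        # Grayscale variant: keep the value on a band, 0 elsewhere.
        dark = np.where(mask >= thr, mask, 0.0)
        light = np.where(inverted >= thr, inverted, 0.0)
    return dark, light


# Hypothetical toy data: no band, dark band, small noise, bright band.
banding_mask = np.array([0.0, 40.0, 5.0, -40.0], dtype=np.float32)
thr_1 = 10.0

dark_gray, light_gray = extract_banding_map(banding_mask, thr_1)
dark_bin, light_bin = extract_banding_map(banding_mask, thr_1, binary=True)
print(dark_gray, light_gray)
print(dark_bin, light_bin)
```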

[0087] S14. Based on the BandingMap, fuse the fused image Frame_GT and the image to be processed to obtain the image T with the bands removed.

[0088] The principle of fusion is that for each pixel in the BandingMap, the larger the value of the pixel, the greater the difference between the Frame_GT and the image to be processed at that pixel, which means that the pixel is very likely to be a pixel on the band. Therefore, the value of the pixel is mainly obtained based on the value of the Frame_GT at that pixel. Conversely, the smaller the value of the pixel, the less likely the pixel is to be a pixel on the band. The value of the pixel is mainly obtained based on the value of the image to be processed at that pixel, thereby achieving the purpose of removing the bands.

[0089] Specifically, a pixel in the BandingMap is called P, with coordinates (x1, y1). The pixel with coordinates (x1, y1) in Frame_GT is called P1, the pixel with coordinates (x1, y1) in the image to be processed Frame_N is called P2, and the pixel with coordinates (x1, y1) in the band-removed image T is called P3. P3 is obtained by a weighted fusion (such as a weighted sum) of P1 and P2, in which the larger the value of P, the greater the weight of P1 and the smaller the weight of P2.

[0090] In other words, when extended to each pixel, the larger the value of BandingMap at a pixel, the greater the weight of Frame_GT at that pixel, and the smaller the weight of Frame_N at that pixel.
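
The per-pixel relationship between P, P1, P2, and P3 can be sketched as a weighted sum. The weight rule used here (linear in the BandingMap value, saturating at a hypothetical `full_scale`) and the merging of dark and bright band values into one array are assumptions made only to keep the example small.

```python
import numpy as np


def fuse_with_banding_map(frame_gt, frame_n, banding_map, full_scale):
    gt = np.asarray(frame_gt, dtype=np.float32)
    fn = np.asarray(frame_n, dtype=np.float32)
    # Assumed rule: the weight of Frame_GT grows linearly with the
    # BandingMap value and is clipped to the range [0, 1].
    w_gt = np.clip(np.asarray(banding_map, dtype=np.float32) / full_scale, 0.0, 1.0)
    w_n = 1.0 - w_gt  # the two weights sum to 1 at every pixel
    return w_gt * gt + w_n * fn


# Hypothetical toy data: one clean pixel, a dark band, a bright band,
# and a weaker dark band.
frame_gt = np.array([100.0, 100.0, 100.0, 100.0], dtype=np.float32)
frame_n = np.array([100.0, 60.0, 140.0, 80.0], dtype=np.float32)
banding_map = np.array([0.0, 40.0, 40.0, 20.0], dtype=np.float32)

image_t = fuse_with_banding_map(frame_gt, frame_n, banding_map, full_scale=40.0)
print(image_t)
```

Pixels with a BandingMap value of 0 are taken unchanged from Frame_N, while pixels at full band strength are taken entirely from Frame_GT.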

[0091] The specific implementation method of the fusion will be described in detail in the following embodiments.

[0092] In this embodiment, unlike the traditional method of adjusting camera exposure time to prevent banding, when banding has already appeared, the Banding Map is used to determine the contribution of the band-free image and the image to be processed to the pixels in the image. In other words, the band-free pixels are used to correct or replace the banded pixels, thereby achieving the purpose of removing the banding.

[0093] The process shown in Figure 4 assumes that the object being photographed is not moving. In reality, the object being photographed may be in motion. To better remove motion-induced blur (referred to as motion blur), the cameras configured in electronic devices include stagger HDR cameras.

[0094] The stagger HDR camera can capture multiple frames of images with different exposure times simultaneously in a single shot.

[0095] In the following embodiments of this application, image frames obtained using a longer exposure time are referred to as long frames (N-frames), and image frames obtained using a shorter exposure time are referred to as short frames (S-frames). The longer exposure time is greater than or equal to the flicker period of the light source, and the shorter exposure time is less than the flicker period of the light source. For example, a fluorescent lamp powered by 220 V, 50 Hz AC has a flicker period of 10 milliseconds, and the stagger HDR camera is configured with a longer exposure time greater than or equal to 10 milliseconds and a shorter exposure time less than 10 milliseconds. Therefore, long frames captured under fluorescent lighting will not exhibit banding, but short frames will, as shown for example in Figure 1.

[0096] It can be understood that when the subject is moving, long frames obtained using a longer exposure time will exhibit motion blur, while short frames obtained using a shorter exposure time will not.

[0097] For any given frame, the time at which it was captured is used as its acquisition timestamp. Long frames (N-frames), in ascending order of acquisition timestamp, are designated N1, N2, and so on. A long frame and a short frame with the same acquisition timestamp (i.e., captured at the same time) are called a time-aligned long frame and short frame.

[0098] Figure 5 shows another image processing method provided in this application. In this embodiment, the input is a time-aligned long frame and short frame, taking N1 and S1 as an example. The output is an image T with stripes removed.

[0099] In Figure 5, dashed lines with arrows represent data flow, and solid lines with arrows represent process steps. The process shown in Figure 5 includes the following steps:

[0100] S21. Register the time-aligned long frame and short frame to obtain the registered long frame and the registered short frame.

[0101] The purpose of registration is to align pixel positions. Aligning pixel positions can be understood as corresponding pixels representing the same actual location. That is, it involves obtaining the correspondence between pixels in a long frame and a short frame. This correspondence is established based on the actual position of the subject. For example, a point in real space might be represented by pixel 'a' in a long frame and pixel 'b' in a short frame. Registration is about establishing the correspondence between pixel 'a' and pixel 'b'. Various methods can be used for registration, which will not be elaborated upon here.

[0102] S22. Perform brightness alignment of the registered long frame with the registered short frame to obtain the brightness-processed long frame.

[0103] In some implementations, a short frame is used as a reference frame, and a wrap calculation is performed on the long frame to obtain a long frame with luminance processing.

[0104] S23. Calculate the difference between the long frame and the short frame to obtain the BandingMap.

[0105] It can be understood that, since long frames do not have banding but short frames do, the BandingMap at least represents the difference caused by banding between the brightness-processed long frame and the registered short frame.

[0106] In some implementations, to reflect the differences caused by the stripes more accurately, the difference between the brightness-processed long frame and the registered short frame is calculated and used as the BandingMap.

[0107] BandingMap at least characterizes the differences caused by banding between long frames with luminance processing and short frames after registration, while removing the differences caused by the lack of positional and luminance alignment.

[0108] When the subject is moving, BandingMap also indicates the difference in brightness between long frames and short frames after registration caused by motion.

[0109] A banding map can be understood as an image where the larger the value of a pixel, the greater the difference between the long frame after brightness processing and the short frame after registration at that pixel.

[0110] The specific implementation of S23 will be described in detail in the following embodiments.

[0111] S24. Based on the Banding Map, fuse the long frame after brightness processing and the short frame after registration to obtain the image T with stripes removed.

[0112] The principle of fusion is that for each pixel in the BandingMap, the larger the value of the pixel, the greater the difference between the long frame and the short frame after registration and brightness alignment at that pixel. This indicates that the pixel is very likely to be a pixel on the stripe. Therefore, the value of the pixel is mainly obtained based on the value of the long frame at that pixel. Conversely, the smaller the value of the pixel, the less likely the pixel is to be a pixel on the stripe. The value of the pixel is mainly obtained based on the value of the short frame at that pixel, thereby achieving the purpose of stripe removal.

[0113] The specific implementation method of the fusion will be described in detail in the following embodiments.

[0114] In this embodiment, unlike the traditional method of adjusting camera exposure time to prevent banding, when banding has already appeared, information from long frames is used to compensate for banding in short frames, thereby achieving the purpose of removing banding.

[0115] Figure 6 shows the specific process of obtaining the BandingMap in S23. The process shown in Figure 6 includes the following steps:

[0116] S231. Calculate the absolute difference between adjacent long frames to obtain the motion alpha map.

[0117] It can be understood that adjacent long frames have different timestamps but the same exposure duration. When the subject is moving, the absolute difference between adjacent long frames represents the difference caused by the subject's motion; therefore, the motion alpha map represents the difference caused by the subject's motion.

[0118] In some implementations, to save computational resources and obtain more accurate image processing results, and because the input in Figure 5 is frame N1 and frame S1, the adjacent long frames here are frame N1 and frame N2.

[0119] In this embodiment, it is assumed that the object being photographed has moved in frame N2 compared with frame N1. As shown in Figure 6, the head of the person being photographed turns in frame N2 relative to frame N1.

[0120] S232. Binarize the motion alpha map to obtain the motion_mask.

[0121] In some implementations, a threshold thr_1 is set. Pixels with a value greater than or equal to thr_1 are considered to be in the moving region and are set to 0. Pixels with a value less than thr_1 are considered to be in the stationary region and are set to 1. Therefore, in motion_mask, 0 marks pixels affected by motion, and 1 marks pixels not affected by motion.
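
Steps S231 and S232 can be sketched together as follows. The function name, the toy frames, and the threshold value are hypothetical, and the adjacent long frames are assumed to be already aligned arrays of the same shape.

```python
import numpy as np


def motion_mask_from(long_a, long_b, thr):
    a = np.asarray(long_a, dtype=np.float32)
    b = np.asarray(long_b, dtype=np.float32)
    # S231: absolute difference between adjacent long frames.
    motion_alpha = np.abs(a - b)
    # S232: 0 marks the moving region, 1 marks the stationary region.
    motion_mask = np.where(motion_alpha >= thr, 0.0, 1.0).astype(np.float32)
    return motion_alpha, motion_mask


# Hypothetical toy data: only the top-right pixel changes noticeably
# between N1 and N2.
n1 = np.array([[100.0, 100.0], [100.0, 100.0]], dtype=np.float32)
n2 = np.array([[100.0, 180.0], [100.0, 102.0]], dtype=np.float32)

motion_alpha, motion_mask = motion_mask_from(n1, n2, thr=10.0)
print(motion_alpha)
print(motion_mask)
```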

[0122] S233. Calculate the difference between long frames and short frames with the same timestamp to obtain diff_mask.

[0123] The difference between the long frame and the short frame with the same timestamp reflects the bands.

[0124] As shown in Figure 5, the long frame and the short frame with the same timestamp are N1 and S1, respectively.

[0125] S234. Multiply motion_mask and diff_mask to obtain banding_mask.

[0126] Based on the rules for setting pixel values in motion_mask, pixels where motion_mask is 0 (pixels affected by motion) are set to 0 in banding_mask, while pixels where motion_mask is 1 (pixels not affected by motion) keep their diff_mask values. Therefore, banding_mask excludes pixels affected by motion and only includes pixels on the light and dark bands.
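
A minimal sketch of S233 and S234 is shown below. It assumes the difference is taken as long frame minus short frame, so that dark bands in the short frame come out positive, and that the two frames are already comparable in position and brightness. These choices, the function name, and the toy values are illustrative only.

```python
import numpy as np


def banding_mask_excluding_motion(long_frame, short_frame, motion_mask):
    lf = np.asarray(long_frame, dtype=np.float32)
    sf = np.asarray(short_frame, dtype=np.float32)
    # S233: signed difference between time-aligned long and short frames.
    diff_mask = lf - sf
    # S234: the element-wise product zeroes out pixels in the moving region.
    return np.asarray(motion_mask, dtype=np.float32) * diff_mask


# Hypothetical toy data: the top-right pixel differs because of motion,
# the top-left and bottom-right pixels differ because of bands.
n1 = np.array([[100.0, 100.0], [100.0, 100.0]], dtype=np.float32)
s1 = np.array([[60.0, 150.0], [100.0, 140.0]], dtype=np.float32)
motion_mask = np.array([[1.0, 0.0], [1.0, 1.0]], dtype=np.float32)

banding_mask = banding_mask_excluding_motion(n1, s1, motion_mask)
print(banding_mask)
```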

[0127] S235. Extract the dark band banding_dark_mask and the bright band banding_light_mask from the banding_mask to obtain the BandingMap.

[0128] Binarize the banding_mask to obtain the BandingMap.

[0129] In some implementations, a threshold `thr_2` is set. Pixels in the `banding_mask` that are greater than or equal to `thr_2` retain their original values, while pixels less than `thr_2` are set to 0, resulting in `banding_dark_mask`. The `banding_mask` is then multiplied by -1; in the result, pixels greater than or equal to `thr_2` again retain their values, while pixels less than `thr_2` are set to 0, resulting in `banding_light_mask`. `banding_dark_mask` and `banding_light_mask` are collectively referred to as the BandingMap. It can be understood that this method produces a grayscale BandingMap.

[0130] In other implementations, pixels in the banding_mask that are greater than or equal to the threshold value thr_2 are set to 1, and pixels less than thr_2 are set to 0, resulting in a banding_dark_mask. Multiplying the banding_mask by -1, and then setting pixels greater than or equal to the threshold value thr_2 to 1 again, and pixels less than thr_2 to 0, results in a banding_light_mask. This method produces a binary BandingMap.

[0131] It can be understood that, whether the BandingMap is a grayscale image or a binary image, 0 indicates that the pixel is not on a band, while a non-zero value indicates that the pixel is on a band.

[0132] The BandingMap acquisition method shown in Figure 6 is based on the characteristics of the long and short frames captured by the stagger HDR camera. It not only obtains the bands, but obtains them with motion effects removed, laying the foundation for accurately removing bands from the image to be processed.

[0133] It can be understood that the inputs S1 and N1 in the difference calculation step of Figure 6 are only examples. An alternative is to replace N1 with the aforementioned brightness-processed long frame obtained by transforming N1, and to replace S1 with the aforementioned registered short frame obtained by transforming S1.

[0134] As can be seen from Figure 4 and Figure 5, in the image processing method provided by the embodiments of this application, the images fused in the fusion steps (S14 and S24) are the image to be processed, i.e., the image from which stripes are to be removed, and the image without stripes. The image to be processed is, for example, Frame_N in Figure 4 or the registered short frame in Figure 5. The image without stripes is, for example, Frame_GT in Figure 4 or the brightness-processed long frame in Figure 5.

[0135] The fusion is based on the Banding Map. The Banding Map determines whether the pixels in the resulting image are primarily based on the original image or on the image without bands.

[0136] The specific methods of fusion are explained in detail below.

[0137] Figure 7 is a specific flow chart of the fusion step in an image processing method provided in an embodiment of this application. Based on the process shown in Figure 5, S24 includes the following steps:

[0138] S241. Based on the BandingMap and the preset first rule, obtain the first type of weight and the second type of weight.

[0139] As mentioned earlier, in the BandingMap, 0 indicates that the pixel is not on a band, and a non-zero value indicates that the pixel is on a band. The first rule stipulates that the larger the value in the BandingMap, the larger the first type of weight and the smaller the second type of weight. The first type of weight is the weight of the brightness-processed long frame, and the second type of weight is the weight of the registered short frame. The sum of the first type of weight and the second type of weight is 1.

[0140] For a binary BandingMap, the value 1 corresponds to a pre-configured first weight and second weight, and the value 0 corresponds to a pre-configured third weight and fourth weight. The first and third weights belong to the first type of weight, and the second and fourth weights belong to the second type of weight. Assuming a pixel P in the BandingMap has a value of 1 and coordinates (x, y), pixel P corresponds to the first and second weights. The first weight is the weight of the pixel at coordinates (x, y) in the brightness-processed long frame, and the second weight is the weight of the pixel at coordinates (x, y) in the registered short frame.

[0141] Similarly, obtain the first-class weight and the second-class weight corresponding to each pixel in the BandingMap.
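
For the binary case, the lookup of the pre-configured weights can be sketched as below. The numeric weight values are hypothetical placeholders; the text only requires that the first type of weight be larger on a band and that each pair sum to 1.

```python
import numpy as np

# Hypothetical pre-configured weights (not specified in this application).
FIRST_WEIGHT, SECOND_WEIGHT = 0.9, 0.1  # used where the BandingMap is 1
THIRD_WEIGHT, FOURTH_WEIGHT = 0.1, 0.9  # used where the BandingMap is 0


def binary_weights(banding_map):
    on_band = np.asarray(banding_map) == 1
    # First type of weight: applied to the brightness-processed long frame.
    w_long = np.where(on_band, FIRST_WEIGHT, THIRD_WEIGHT)
    # Second type of weight: applied to the registered short frame.
    w_short = np.where(on_band, SECOND_WEIGHT, FOURTH_WEIGHT)
    return w_long, w_short


banding_map = np.array([[1, 0], [0, 1]])
w_long, w_short = binary_weights(banding_map)
print(w_long)
print(w_short)
```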

[0142] For a grayscale BandingMap, the correspondence between non-zero values and the first type of weight is shown in Figure 8. In Figure 8, the horizontal axis represents the value of the BandingMap (i.e., the banding intensity), and the vertical axis represents the first type of weight corresponding to that value. As can be seen from Figure 8, within a certain range [X_min, X_max], the larger the value, the larger the corresponding first type of weight. The value of the first type of weight can be obtained from a preset range [Y_min, Y_max] by interpolation or other methods. For any pixel, the sum of the first type of weight and the second type of weight is 1, so once the first type of weight is obtained, the second type of weight can be obtained.
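
One way to realize the interpolation described for the grayscale case is piecewise-linear interpolation between the two range endpoints, sketched here with `np.interp`. The linear shape, the clamping outside the range, and the endpoint values in the example are assumptions; the actual curve is the one given by Figure 8.

```python
import numpy as np


def weights_from_grayscale_map(banding_map, x_min, x_max, y_min, y_max):
    values = np.asarray(banding_map, dtype=np.float64)
    # np.interp interpolates linearly between (x_min, y_min) and
    # (x_max, y_max), and clamps to y_min / y_max outside that range.
    w_long = np.interp(values, [x_min, x_max], [y_min, y_max])
    w_short = 1.0 - w_long  # the two types of weight sum to 1
    return w_long, w_short


# Hypothetical toy data and endpoints.
banding_map = np.array([0.0, 10.0, 25.0, 40.0, 80.0])
w_long, w_short = weights_from_grayscale_map(
    banding_map, x_min=10.0, x_max=40.0, y_min=0.0, y_max=1.0
)
print(w_long)
print(w_short)
```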

[0143] S242. Based on the first type of weight and the second type of weight, calculate the weighted sum of the brightness-processed long frame and the registered short frame to obtain the stripe-removed image T.

[0144] Figure 9 shows another specific fusion process based on Figure 5. It differs from the process shown in Figure 7 in that a weight mapping network, instead of the first rule, is used to obtain the first and second types of weights. It can be understood that the weight mapping network is pre-trained; the specific training process is not detailed here. After training, the weight mapping network has learned that, for pixels with larger values in the BandingMap, the first type of weight is larger and the second type of weight is smaller. The first type of weight is the weight of the brightness-processed long frame, and the second type of weight is the weight of the registered short frame.

[0145] Figure 10 shows yet another specific fusion process based on Figure 5. It differs from Figure 7 and Figure 9 in that, instead of obtaining weights separately, the BandingMap and the images to be fused are input into a fusion network together, and the output of the network is the fusion result. The images to be fused are the brightness-processed long frame and the registered short frame. The fusion network is pre-trained and has the following capability: it uses the BandingMap as prior information to guide the fusion of the images to be fused, such that pixels with larger values in the BandingMap have a larger first type of weight and a smaller second type of weight. The first type of weight is the weight of the brightness-processed long frame, and the second type of weight is the weight of the registered short frame.

[0146] All the above fusion methods are based on the BandingMap. The larger the value in the BandingMap, the greater the difference between the corresponding pixels in the long frame and the short frame, and the more likely the pixel is on a band, so the pixel value in the long frame is mainly used as the reference during fusion. Conversely, the smaller the value, the less likely the pixel is on a band, and the pixel value in the short frame is mainly used as the reference during fusion. The fusion result not only removes the bands but also retains the advantages of the short frame.

[0147] The fusion method has been explained above based on the process shown in Figure 5. Similarly, for the details of S14 in the process of Figure 4, refer to S241 and S242 shown in Figure 7, or S241 and S242 shown in Figure 9, or S241 shown in Figure 10, which will not be described again here.

[0148] It can be understood that the image processing method provided in the embodiments of this application is based on the BandingMap and uses long frames without banding (in Figure 4, the mean of the different image frames can be regarded as the long frame) to supplement the pixels on the bands in the short frames, thereby removing the bands in the short frames. This method processes images that already contain bands in order to remove them, rather than preventing bands from appearing. Therefore, there is no need to adjust the exposure time in advance to prevent bands from appearing.

[0149] Embodiments of this application also provide a computer-readable storage medium storing instructions thereon, which, when executed on an electronic device, cause the electronic device to perform the image processing method provided in the above embodiments.

[0150] Embodiments of this application also provide a computer program product that, when run on an electronic device, enables the electronic device to implement the image processing method provided in the above embodiments.

[0151] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any changes or substitutions within the technical scope disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. An image processing method, characterized in that it is applied to an electronic device, wherein a camera running in the electronic device, upon being invoked, captures multiple frames of images of an object at different times with a first exposure duration, the object being in an environment with a strobe light source and being stationary, and the first exposure duration being less than the strobe period of the strobe light source, the method comprising: calculating the mean of the multiple frames of images to obtain a fused image, wherein each of the multiple frames of images contains stripes, the positions of the stripes differ among the multiple frames of images, and the fused image does not contain stripes; obtaining a stripe mapping image based on the difference between the fused image and an image to be processed, wherein the image to be processed is a frame obtained after the multiple frames of images and contains stripes; and fusing the fused image and the image to be processed based on the stripe mapping image to obtain a stripe-removed image, wherein, during the fusion, the value of a first pixel in the stripe-removed image is obtained by weighted fusion of the value of the first pixel in the fused image and the value of the first pixel in the image to be processed; the larger the value of the first pixel in the stripe mapping image, the greater the weight of the fused image at the first pixel and the smaller the weight of the image to be processed at the first pixel; and the first pixel is any pixel.

2. The method according to claim 1, characterized in that obtaining the stripe mapping image based on the difference between the fused image and the image to be processed includes: obtaining the difference between the fused image and the image to be processed to obtain a stripe mask image; and extracting a dark stripe mask image and a bright stripe mask image from the stripe mask image to obtain the stripe mapping image.

3. The method according to claim 2, characterized in that extracting the dark stripe mask image and the bright stripe mask image from the stripe mask image includes: in the stripe mask image, retaining or setting to 1 the pixel values greater than or equal to a first threshold, and setting to 0 the pixel values less than the first threshold, to obtain the dark stripe mask image; and in the inverted image of the stripe mask image, retaining or setting to 1 the pixel values greater than or equal to the first threshold, and setting to 0 the pixel values less than the first threshold, to obtain the bright stripe mask image.

4. An image processing method, characterized in that it is applied to an electronic device, wherein a stacked high dynamic range (stagger HDR) camera running in the electronic device, upon being invoked, captures images of an object located in an environment with a strobe light source, the exposure duration of a long frame captured by the stagger HDR camera being greater than or equal to the strobe period of the strobe light source, and the exposure duration of a short frame captured by the stagger HDR camera being less than the strobe period, the method comprising: obtaining a stripe mapping image based on the difference between a first long frame and a first short frame, wherein the first long frame and the first short frame have the same shooting time, the first long frame does not contain stripes, and the first short frame contains stripes; and fusing a first image and a second image based on the stripe mapping image to obtain a stripe-removed image, wherein the first image is obtained based on the first long frame and the second image is obtained based on the first short frame; during the fusion, the value of a first pixel in the stripe-removed image is obtained by weighted fusion of the value of the first pixel in the first image and the value of the first pixel in the second image; and the larger the value of the first pixel in the stripe mapping image, the greater the weight of the first image at the first pixel and the smaller the weight of the second image at the first pixel.

5. The method according to claim 4, characterized in that obtaining the stripe mapping image based on the difference between the first long frame and the first short frame includes: calculating the difference between the first long frame and the first short frame to obtain a difference mask image, wherein the difference mask image represents the difference between the first long frame and the first short frame caused by the motion of the object and by the stripes; obtaining a stripe mask image based on a pre-acquired motion mask image and the difference mask image, wherein the motion mask image represents the difference between the first long frame and an adjacent long frame caused by the motion of the object, and the stripe mask image represents the difference between the first long frame and the first short frame caused by the stripes; and extracting a dark stripe mask image and a bright stripe mask image from the stripe mask image to obtain the stripe mapping image.

6. The method according to claim 5, characterized in that, before obtaining the stripe mask image based on the pre-acquired motion mask image and the difference mask image, the method further includes: obtaining the motion mask image based on the difference between the first long frame and a second long frame, wherein the motion mask image represents the difference between the first long frame and the second long frame caused by the motion of the object, and the first long frame and the second long frame are image frames with adjacent timestamps.

7. The method according to claim 5 or 6, characterized in that extracting the dark stripe mask image and the bright stripe mask image from the stripe mask image includes: in the stripe mask image, retaining or setting to 1 the pixel values greater than or equal to a first threshold, and setting to 0 the pixel values less than the first threshold, to obtain the dark stripe mask image; and in the inverted image of the stripe mask image, retaining or setting to 1 the pixel values greater than or equal to the first threshold, and setting to 0 the pixel values less than the first threshold, to obtain the bright stripe mask image.

8. The method according to any one of claims 4-7, characterized in that, before fusing the first image and the second image based on the stripe mapping image, the method further includes: registering the first long frame and the first short frame to obtain a registered long frame and a registered short frame; and performing luminance alignment of the registered long frame with the registered short frame to obtain a luminance-processed long frame, wherein the first image is the luminance-processed long frame and the second image is the registered short frame.

9. An electronic device, characterized in that, The electronic device includes: a camera, one or more processors, a memory, and a touchscreen; The camera captures multiple frames of images of the object with a first exposure duration, where the first exposure duration is less than the strobe period of the strobe light source; or, the camera is a stacked high dynamic range stagger HDR camera, where the exposure duration of the long frames captured by the stagger HDR camera is greater than or equal to the strobe period, and the exposure duration of the short frames captured by the stagger HDR camera is less than the strobe period. The memory is used to store program code; the processor is used to run the program code, so that the electronic device implements the image processing method as described in any one of claims 1 to 8.

10. A computer-readable storage medium, characterized in that, It stores instructions that, when executed on an electronic device, cause the electronic device to perform the image processing method as described in any one of claims 1 to 8.

11. A computer program product, characterized in that, when the computer program product is run on an electronic device, the electronic device is caused to implement the image processing method as described in any one of claims 1 to 8.