An image processing method, device and system

By generating a second tone mapping curve containing disparity maps and contrast information, the problem of contrast loss caused by tone mapping is solved, thereby improving the perceptual quality of 3D images.

CN115272440B · Active · Publication Date: 2026-06-23 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HUAWEI TECH CO LTD
Filing Date
2021-07-09
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Tone mapping processing reduces the brightness of binocular 3D images, resulting in a loss of contrast and affecting the user's 3D perception.

Method used

By acquiring the disparity map and contrast information of the binocular images, a second tone mapping curve is generated to compensate for the contrast decrease caused by tone mapping, ensuring that the contrast after mapping meets the minimum disparity change requirement for human eye 3D perception.

Benefits of technology

It improves the 3D perception of 3D images displayed on the display device, ensures that the contrast is not lower than the level before tone mapping, and enhances the user's 3D perception effect.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN115272440B_ABST
Patent Text Reader

Abstract

The application relates to the technical field of image processing, and discloses an image processing method, device and system, which are used to solve the problem that tone mapping has a large influence on the binocular 3D image perception. The tone mapping curve is determined for the display image of the display end device in combination with the parallax between binocular images, instead of simply generating the tone mapping curve based on the image content, so that the contrast of the binocular image after tone mapping at least reaches the level before tone mapping, the minimum parallax change requirement of the human eye 3D perception is met, and the 3D perception of the 3D image displayed by the display end device is improved.

Description

[0001] This application claims priority to Chinese Patent Application No. 202110484386.8, filed on April 30, 2021, entitled "Method and Apparatus for Image Tone Mapping", the entire contents of which are incorporated herein by reference.

Technical Field

[0002] This application relates to the field of image processing technology, and in particular to an image processing method, device and system.

Background

[0003] Dynamic range (DR) can be used to represent the ratio of the maximum to the minimum value of a variable in an image. The dynamic range in the real world is between 10⁻³ and 10⁶; a range of this extent is called high dynamic range (HDR). In other words, the brightness values of objects in the real world can be very high or very low. However, the display capabilities of display devices are usually limited. Therefore, before an image is displayed, it needs to undergo tone mapping processing, which maps the high dynamic range image to a display device with low dynamic range display capabilities. For binocular 3D images, however, tone mapping processing reduces the brightness of the image, resulting in a loss of global or local contrast and a decrease in the user's perception of the displayed binocular 3D image.

Summary of the Invention

[0004] This application provides an image processing method, device, and system to address the problem that tone mapping has a significant impact on the perception of binocular 3D images.

[0005] In a first aspect, embodiments of this application provide an image processing method, comprising: acquiring a binocular image and generating a first tone mapping curve for the binocular image, wherein the binocular image is a high dynamic range (HDR) image; acquiring a first contrast ratio of the binocular image and a disparity map between the binocular images; acquiring a second contrast ratio of the binocular image after tone mapping using the first tone mapping curve; and obtaining a second tone mapping curve based on the first contrast ratio, the second contrast ratio, and the disparity map; wherein the second tone mapping curve is used to ensure that the contrast ratio of the tone-mapped binocular image meets the minimum disparity change requirement for human eye 3D perception. Through the above scheme, the tone mapping curve is determined for the image displayed on the display device by combining the disparity between the binocular images, rather than simply generating the tone mapping curve based on the image content. This ensures that the contrast ratio of the tone-mapped binocular image is at least at the level before tone mapping, thus meeting the minimum disparity change requirement for human eye 3D perception and improving the 3D perception of the 3D image displayed on the display device.

[0006] In one possible design, obtaining a second tone mapping curve based on the first contrast ratio, the second contrast ratio, and the disparity map includes: determining the disparity gradient of a first pixel point based on the disparity map, where the first pixel point is any pixel point in the binocular image; obtaining a gain coefficient for the first pixel point based on the first contrast ratio, the second contrast ratio, and the disparity gradient of the first pixel point; the gain coefficient is used to compensate for the decrease in contrast of the first pixel point caused by tone mapping; and obtaining the second tone mapping curve based on the determined gain coefficients of each pixel point in the binocular image. In the above design, the gain coefficients are determined by comparing the contrast ratios before and after tone mapping using the disparity map and the first tone mapping curve, and are used to compensate for the decrease in contrast of each pixel point caused by tone mapping, so that the contrast ratio after tone mapping after compensation is at least at the level before tone mapping.
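The first step of this design, deriving a per-pixel disparity gradient from the disparity map, can be sketched as follows. The text does not specify the gradient operator, so this sketch assumes central differences (one-sided at the image borders) combined into a Euclidean magnitude. The function name `disparity_gradient` is hypothetical.

```python
import math


def disparity_gradient(disparity):
    """Return the gradient magnitude of a 2D disparity map (list of rows).

    Assumption: central differences inside the image and one-sided
    differences at the borders; the source does not name an operator.
    """
    rows, cols = len(disparity), len(disparity[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            x0, x1 = max(x - 1, 0), min(x + 1, cols - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, rows - 1)
            # Divide by the actual pixel distance between the two samples.
            dx = (disparity[y][x1] - disparity[y][x0]) / max(x1 - x0, 1)
            dy = (disparity[y1][x] - disparity[y0][x]) / max(y1 - y0, 1)
            grad[y][x] = math.hypot(dx, dy)
    return grad
```

A disparity map that increases by one per column yields a gradient of 1.0 at every pixel, which is a convenient sanity check.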

[0007] In one possible design, the gain coefficient satisfies the following condition:

[0008]

[0009] Where Q(x,y) represents the gain coefficient of a pixel, x represents the x-coordinate of the pixel, y represents the y-coordinate of the pixel, D′(x,y) represents the disparity gradient of the pixel, K⁻¹() represents the inverse function of the relationship between parallax perception sensitivity and contrast, C(x,y) represents the first contrast of the pixel, and C_tm(x,y) represents the second contrast of the pixel.

[0010] The above design method is used to determine the gain coefficient, which is simple and effective.

[0011] In one possible design, the relationship between parallax perception sensitivity and contrast satisfies the following condition:

[0012]

[0013] Where c represents contrast, K(c) represents parallax perception sensitivity, and J represents the fitting coefficient.

[0014] In one possible design, obtaining the second tone mapping curve based on the gain coefficient includes: using the gain coefficient as a weight, obtaining a weighted histogram based on the pixel values of the pixels included in the binocular image; and generating the second tone mapping curve based on the weighted histogram.

[0015] The above design generates tone mapping curves that can compensate for contrast using a weighted histogram, thereby improving 3D perception.
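The weighted-histogram design can be illustrated with a minimal sketch. It assumes pixel values normalized to [0, 1] and uniform histogram intervals, and it uses only the gain coefficient as the weight. The step from histogram to curve is not specified at this point in the text; the sketch assumes a normalized cumulative sum, as in histogram equalization. The names `weighted_histogram` and `curve_from_histogram` are hypothetical.

```python
def weighted_histogram(pixels, gains, num_bins):
    """Accumulate each pixel's gain into the interval containing its value.

    pixels: flat list of values in [0, 1]; gains: matching per-pixel weights.
    """
    hist = [0.0] * num_bins
    for value, gain in zip(pixels, gains):
        index = min(max(int(value * num_bins), 0), num_bins - 1)
        hist[index] += gain
    return hist


def curve_from_histogram(hist):
    """Assumed mapping: normalized cumulative sum of the histogram.

    Entry i is the output level at the upper edge of interval i.
    """
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total weight")
    curve, running = [], 0.0
    for count in hist:
        running += count
        curve.append(running / total)
    return curve
```

Intervals that collect large gains receive a steeper segment of the curve, so the pixels whose contrast matters most for depth perception keep more of the output range.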

[0016] In one possible design, obtaining the second tone mapping curve based on the gain coefficient includes: using the gain coefficient as a weight, obtaining a weighted histogram based on the pixel values of the pixels included in the binocular image; generating a third tone mapping curve based on the weighted histogram; and performing weighted processing on the first tone mapping curve and the third tone mapping curve to obtain the second tone mapping curve. For example, the weights of the first and third tone mapping curves can differ in different application scenarios. The above design, using a fusion of two histograms to obtain a tone mapping curve that compensates for contrast, offers greater flexibility and can be applied to different scenarios.
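The weighted processing of two curves can be sketched as a convex combination of two curves sampled at the same input levels. The text does not fix the form of the weighting, so the linear blend and the single scalar weight are assumptions, and `blend_curves` is a hypothetical name.

```python
def blend_curves(first_curve, third_curve, weight_first):
    """Blend two tone mapping curves sampled at identical input levels.

    weight_first in [0, 1] is the share given to the first curve;
    the remainder goes to the third curve (assumed linear blend).
    """
    if len(first_curve) != len(third_curve):
        raise ValueError("curves must be sampled at the same input levels")
    if not 0.0 <= weight_first <= 1.0:
        raise ValueError("weight_first must lie in [0, 1]")
    return [
        weight_first * a + (1.0 - weight_first) * b
        for a, b in zip(first_curve, third_curve)
    ]
```

A weight of 1.0 reproduces the first curve unchanged, and a weight of 0.0 yields the contrast-compensating curve alone.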

[0017] In one possible design, the weighted histogram satisfies the following condition:

[0018]

[0019] Where Q(x,y) represents the gain coefficient of the pixel, w(x,y) represents the adjustment factor of the pixel, I(x,y) represents the pixel value of the pixel, l_i represents the boundary of the i-th histogram interval, and h_i represents the value of the i-th histogram interval of the weighted histogram.

[0020] In one possible design, obtaining the second tone mapping curve based on the gain coefficient includes:

[0021] The parameters of the second tone mapping curve are determined based on the gain coefficient as follows:

[0022]

[0023] Where L′_2D represents the derivative of the first tone mapping curve function, L′_3D represents the derivative of the second tone mapping curve function, Q(x,y) represents the gain coefficient of the pixel, and arg min() represents the parameter values of the second tone mapping curve that minimize the value in parentheses. By solving for the mapping curve parameters as described above, a tone mapping curve for contrast compensation can be obtained based on the gain coefficient, which is simple and effective.

[0024] In one possible design, obtaining the second tone mapping curve based on the gain coefficient includes:

[0025] The parameters of the third tone mapping curve are determined based on the gain coefficient as follows:

[0026]

[0027] Where L′_2D represents the derivative of the first tone mapping curve function, L′_3D represents the derivative of the third tone mapping curve function, Q(x,y) represents the gain coefficient of the pixel, and arg min() represents the parameter values of the third tone mapping curve that minimize the value in parentheses.

[0028] The second tone mapping curve is obtained by weighting the first tone mapping curve and the third tone mapping curve.

[0029] For example, the weights of the first and third tone mapping curves can differ in different application scenarios. The above design obtains the tone mapping curve that compensates for contrast by solving for the mapping curve parameters, making it more flexible and applicable to different scenarios.

[0030] In one possible design, obtaining the second contrast of the binocular image after tone mapping using the first tone mapping curve includes:

[0031] The binocular image is tone-mapped using the first tone mapping curve, and the second contrast of the tone-mapped binocular image is obtained; or,

[0032] The contrast loss value of the binocular image after tone mapping is estimated based on the first tone mapping curve, and the second contrast is obtained based on the contrast loss value and the first contrast.

[0033] The above design provides two methods for obtaining a second contrast after tone mapping using the first tone mapping curve.

[0034] In one possible design, the method further includes: encoding the weighted histogram into the dynamic metadata of the bitstream.

[0035] In one possible design, the method further includes: encoding the second tone mapping curve into the dynamic metadata of the bitstream.

[0036] In one possible design, the method further includes: performing tone mapping processing on the binocular image according to the second tone mapping curve.

[0037] In a second aspect, embodiments of this application provide an image processing apparatus, comprising:

[0038] An acquisition module is used to acquire binocular images and generate a first tone mapping curve for the binocular images, wherein the binocular images are high dynamic range (HDR) images;

[0039] The first determining module is used to obtain the first contrast of the binocular image and the disparity map between the binocular images;

[0040] The second determining module is used to obtain the second contrast of the binocular image after tone mapping by the first tone mapping curve;

[0041] The third determining module is used to obtain the second tone mapping curve based on the first contrast, the second contrast, and the disparity map;

[0042] The second tone mapping curve is used to ensure that the contrast of the tone-mapped binocular image meets the minimum parallax change requirement for human eye 3D perception.

[0043] In one possible design, the third determining module is specifically used for: determining the disparity gradient of a first pixel point based on the disparity map, wherein the first pixel point is any pixel point in the binocular image; obtaining the gain coefficient of the first pixel point based on the first contrast, the second contrast, and the disparity gradient of the first pixel point; the gain coefficient is used to compensate for the decrease in contrast of the first pixel point caused by tone mapping; and obtaining the second tone mapping curve based on the determined gain coefficient of each pixel point in the binocular image.

[0044] In one possible design, the gain coefficient satisfies the following condition:

[0045]

[0046] Where Q(x,y) represents the gain coefficient of a pixel, x represents the x-coordinate of the pixel, y represents the y-coordinate of the pixel, D′(x,y) represents the disparity gradient of the pixel, K⁻¹() represents the inverse function of the relationship between parallax perception sensitivity and contrast, C(x,y) represents the first contrast of the pixel, and C_tm(x,y) represents the second contrast of the pixel.

[0047] In one possible design, the relationship between parallax perception sensitivity and contrast satisfies the following condition:

[0048]

[0049] Where c represents contrast, K(c) represents parallax perception sensitivity, and J represents the fitting coefficient.

[0050] In one possible design, the third determining module, when performing the operation of obtaining the second tone mapping curve based on the gain coefficient, is specifically used to: obtain a weighted histogram based on the pixel values of the pixels included in the binocular image, using the gain coefficient as a weight; and generate the second tone mapping curve based on the weighted histogram.

[0051] In one possible design, the third determining module, when performing the operation of obtaining the second tone mapping curve based on the gain coefficient, is specifically used to: obtain a weighted histogram based on the pixel values of the pixels included in the binocular image, using the gain coefficient as a weight; generate a third tone mapping curve based on the weighted histogram; and obtain the second tone mapping curve by weighting the first tone mapping curve and the third tone mapping curve.

[0052] In one possible design, the weighted histogram satisfies the following condition:

[0053]

[0054] Where Q(x,y) represents the gain coefficient of the pixel, w(x,y) represents the adjustment factor of the pixel, I(x,y) represents the pixel value of the pixel, l_i represents the boundary of the i-th histogram interval, and h_i represents the value of the i-th histogram interval of the weighted histogram.

[0055] In one possible design, when the third determining module performs the operation of obtaining the second tone mapping curve based on the gain coefficient, it is specifically used to: solve for the parameters of the second tone mapping curve based on the gain coefficient in the following manner:

[0056]

[0057] Where L′_2D represents the derivative of the first tone mapping curve function, L′_3D represents the derivative of the second tone mapping curve function, Q(x,y) represents the gain coefficient of the pixel, and arg min() represents the parameter values of the second tone mapping curve that minimize the value in parentheses.

[0058] In one possible design, when the third determining module performs the operation of obtaining the second tone mapping curve based on the gain coefficient, it is specifically used to: solve for the parameters of the third tone mapping curve based on the gain coefficient in the following manner:

[0059]

[0060] Where L′_2D represents the derivative of the first tone mapping curve function, L′_3D represents the derivative of the third tone mapping curve function, Q(x,y) represents the gain coefficient of the pixel, and arg min() represents the parameter values of the third tone mapping curve that minimize the value in parentheses.

[0061] The second tone mapping curve is obtained by weighting the first tone mapping curve and the third tone mapping curve.

[0062] In one possible design, the second determining module is specifically used to: perform tone mapping on the binocular image using the first tone mapping curve, and obtain the second contrast of the tone-mapped binocular image; or, estimate the contrast loss value of the binocular image after tone mapping using the first tone mapping curve based on the first tone mapping curve, and obtain the second contrast based on the contrast loss value and the first contrast.

[0063] In one possible design, an encoding module is also included for encoding the weighted histogram into the dynamic metadata of the bitstream.

[0064] In one possible design, an encoding module is also included to encode the second tone mapping curve into the dynamic metadata of the bitstream.

[0065] In one possible design, a tone mapping module is also included for performing tone mapping processing on the binocular image based on the second tone mapping curve.

[0066] In a third aspect, embodiments of this application provide an encoder for implementing the method described in the first aspect. It should be noted that the encoder does not perform tone mapping processing on the binocular image according to the second tone mapping curve. For example, the encoder may include the apparatus described in any design of the second aspect but does not include a tone mapping module.

[0067] In a fourth aspect, embodiments of this application provide an encoding device comprising: a non-volatile memory and a processor coupled together, wherein the processor invokes program code stored in the memory to execute the method described in the first aspect or any design of the first aspect. It should be noted that the processor does not perform tone mapping processing on the binocular image according to the second tone mapping curve.

[0068] In a fifth aspect, embodiments of this application provide a decoder for implementing the method described in the first aspect. It should be noted that the decoder does not perform the operation of encoding the weighted histogram into the bitstream or encoding the second tone mapping curve into the bitstream. For example, the decoder may include the apparatus described in any design of the second aspect but does not include an encoding module.

[0069] In a sixth aspect, embodiments of this application provide a decoding device comprising: a non-volatile memory and a processor coupled together, wherein the processor invokes program code stored in the memory to execute the method described in the first aspect or any design of the first aspect. It should be noted that the processor does not perform encoding operations, such as encoding the weighted histogram into the bitstream or encoding the second tone mapping curve into the bitstream.

[0070] In a seventh aspect, embodiments of this application provide an image processing system, including the encoder described in the third aspect and the decoder described in the fifth aspect, or including the encoding device described in the fourth aspect and the decoding device described in the sixth aspect.

[0071] In an eighth aspect, embodiments of this application provide a computer-readable storage medium storing program code, wherein the program code includes instructions for performing some or all of the steps of any of the methods of the first aspect.

[0072] In a ninth aspect, embodiments of this application provide a computer program product that, when run on a computer, causes the computer to perform some or all of the steps of any of the methods of the first aspect.

[0073] It should be understood that the beneficial effects of the second through ninth aspects of this application can be found in the relevant description of the first aspect, and will not be repeated here.

Brief Description of the Drawings

[0074] Figure 1 is a schematic diagram of the PQ photoelectric transfer function relationship;

[0075] Figure 2 is a schematic diagram of the HLG photoelectric transfer function relationship;

[0076] Figure 3 is a schematic diagram of the SLF photoelectric transfer function relationship;

[0077] Figure 4A is a schematic block diagram of the image display system in the embodiments of this application;

[0078] Figure 4B is a schematic diagram of object contrast in an embodiment of this application;

[0079] Figure 5 is a schematic diagram of the image processing method in the embodiments of this application;

[0080] Figure 6 is a schematic diagram of the process for obtaining the second tone mapping curve in an embodiment of this application;

[0081] Figure 7A is a schematic diagram of an image processing flow provided for Example 1 of the embodiments of this application;

[0082] Figure 7B is a schematic diagram of another image processing flow provided for Example 1 of the embodiments of this application;

[0083] Figure 8A is a schematic diagram of an image processing flow provided for Example 2 of the embodiments of this application;

[0084] Figure 8B is a schematic diagram of another image processing flow provided for Example 2 of the embodiments of this application;

[0085] Figure 9A is a schematic diagram of an image processing flow provided for Example 3 of the embodiments of this application;

[0086] Figure 9B is a schematic diagram of another image processing flow provided for Example 3 of the embodiments of this application;

[0087] Figure 10A is a schematic diagram of an image processing flow provided for Example 4 of the embodiments of this application;

[0088] Figure 10B is a schematic diagram of another image processing flow provided for Example 4 of the embodiments of this application;

[0089] Figure 11A is a schematic diagram of an image processing flow provided for Example 5 of the embodiments of this application;

[0090] Figure 11B is a schematic diagram of another image processing flow provided for Example 6 of the embodiments of this application;

[0091] Figure 12A is a schematic diagram of an image processing flow provided for Example 7 of the embodiments of this application;

[0092] Figure 12B is a schematic diagram of another image processing flow provided for Example 7 of the embodiments of this application;

[0093] Figure 13A is a schematic diagram of an image processing flow provided for Example 8 of the embodiments of this application;

[0094] Figure 13B is a schematic diagram of another image processing flow provided for Example 9 of the embodiments of this application;

[0095] Figure 14A is a schematic diagram of an image processing flow provided for Example 10 of the embodiments of this application;

[0096] Figure 14B is a schematic diagram of another image processing flow provided for Example 11 of the embodiments of this application;

[0097] Figure 15A is a schematic diagram of an image processing device according to an embodiment of this application;

[0098] Figure 15B is a schematic diagram of another image processing device in an embodiment of this application;

[0099] Figure 16 is a schematic diagram of the encoding device structure in an embodiment of this application;

[0100] Figure 17 is a schematic diagram of the decoding device structure in an embodiment of this application.

Detailed Description

[0101] The terms "first," "second," etc., in the specification, embodiments, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or apparatus that includes a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or apparatus.

[0102] It should be understood that in this application, "at least one (item)" means one or more, and "multiple" means two or more. "And/or" is used to describe the relationship between related objects, indicating that three relationships can exist. For example, "A and/or B" can represent three cases: only A exists, only B exists, and both A and B exist simultaneously, where A and B can be singular or plural. The character "/" generally indicates that the preceding and following related objects are in an "or" relationship. "At least one (item) of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one (item) of a, b, or c can represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.

[0103] Dynamic range represents the ratio between the maximum and minimum grayscale values within the displayable range of an image. In most current color digital images, each of the R, G, and B channels uses one byte (8 bits) for storage, meaning each channel represents a range of 0 to 255 grayscale levels. This 0-255 range is the image's dynamic range. However, in the real world, the dynamic range of the same scene is between 10⁻³ and 10⁶, which we call high dynamic range (HDR). In contrast to high dynamic range, the dynamic range of ordinary images is called low dynamic range (LDR). Therefore, the imaging process of a digital camera is essentially a mapping from the high dynamic range of the real world to the low dynamic range of the photograph.

[0104] The greater the dynamic range of an image, the more scene details it displays, the richer the brightness levels, and the more realistic the visual effect.

[0105] The process of optical digital imaging (e.g., the imaging process of a digital camera) involves converting the light radiation of a real scene into electrical signals through an image sensor and saving them as a digital image. The purpose of image display is to reproduce the real scene described by a digital image through a display device. The ultimate goal of both is to provide users with the same visual perception as if they were directly observing the real scene. The brightness levels in a real scene that light radiation (light signals) can display are almost linear; therefore, light signals are also called linear signals. However, in the process of converting light signals into electrical signals in optical digital imaging, not every light signal corresponds to an electrical signal, and the converted electrical signals are non-linear. Therefore, electrical signals are also called non-linear signals. The curve that converts light signals into electrical signals is called the opto-electronic transfer function (OETF). The OETFs involved in the embodiments of this application include the perceptual quantizer (PQ) OETF, the hybrid log-gamma (HLG) OETF, and the scene luminance fidelity (SLF) OETF.

[0106] The PQ photoelectric transfer function is a perceptual quantization photoelectric transfer function proposed based on the human eye's brightness perception model. It represents the conversion relationship between the linear signal values of image pixels and the nonlinear signal values in the PQ domain. Figure 1 is a schematic diagram of the PQ photoelectric transfer function relationship. The PQ photoelectric transfer function can be expressed as formula (1-1):

[0107]

[0108] The parameters in formula (1-1) are calculated as follows:

[0109]

[0110] where:

[0111] L represents the linear signal value, which is normalized to [0, 1].

[0112] L' represents the nonlinear signal value, with a range of [0, 1].

[0113] m1 is the PQ photoelectric transfer coefficient.

[0114] m2 is the PQ photoelectric transfer coefficient.

[0115] c1 is the PQ photoelectric transfer coefficient.

[0116] c2 is the PQ photoelectric transfer coefficient.

[0117] c3 is the PQ photoelectric transfer coefficient.
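As an illustration of the quantities named above, the following sketch evaluates the PQ curve in the form published in SMPTE ST 2084 and ITU-R BT.2100. The formula and the constant values are taken from that standard, not from this text, and are assumed to correspond to m1, m2, c1, c2, and c3 here. The function name `pq_oetf` is hypothetical.

```python
# Constants as published in SMPTE ST 2084 (assumed to match m1, m2, c1, c2, c3).
M1 = 2610 / 16384        # 0.1593017578125
M2 = 2523 / 4096 * 128   # 78.84375
C1 = 3424 / 4096         # 0.8359375
C2 = 2413 / 4096 * 32    # 18.8515625
C3 = 2392 / 4096 * 32    # 18.6875


def pq_oetf(linear: float) -> float:
    """Map a normalized linear value L in [0, 1] to a PQ-domain value L' in [0, 1]."""
    lin = min(max(linear, 0.0), 1.0)  # clamp to the normalized range
    lm = lin ** M1
    return ((C1 + C2 * lm) / (1.0 + C3 * lm)) ** M2
```

With L normalized so that 1.0 corresponds to 10000 cd/m², an input of 0.01 (100 cd/m²) maps to roughly the middle of the PQ code range, which shows how much of the signal range PQ devotes to darker levels.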

[0118] The HLG photoelectric transfer function is an improvement upon the traditional Gamma curve. It uses the traditional Gamma curve in the lower range and supplements it with a log curve in the higher range. Figure 2 is a schematic diagram of the HLG photoelectric transfer function relationship. The HLG photoelectric transfer function represents the conversion relationship between the linear signal value of an image pixel and the nonlinear signal value in the HLG domain. The HLG photoelectric transfer function can be expressed as equation (1-2):

[0119]

[0120] Where L represents the linear signal value, with a range of [0, 12]. L' represents the nonlinear signal value, with a range of [0, 1]. a = 0.17883277 represents the HLG photoelectric transfer coefficient. b = 0.28466892 represents the HLG photoelectric transfer coefficient. c = 0.55991073 represents the HLG photoelectric transfer coefficient.
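A minimal sketch of the HLG curve follows, using the piecewise form published in ARIB STD-B67 for a linear input range of [0, 12]. That form is assumed to be what equation (1-2) denotes, since it matches the stated ranges and the coefficients a, b, and c. The function name `hlg_oetf` is hypothetical.

```python
import math

A = 0.17883277
B = 0.28466892
C = 0.55991073


def hlg_oetf(linear: float) -> float:
    """Map a linear value L in [0, 12] to an HLG-domain value L' in [0, 1]."""
    lin = min(max(linear, 0.0), 12.0)  # clamp to the stated input range
    if lin <= 1.0:
        return 0.5 * math.sqrt(lin)      # gamma-like segment for the lower range
    return A * math.log(lin - B) + C     # log segment for the higher range
```

The two segments meet at L = 1, where both evaluate to 0.5, and the log segment reaches 1.0 at L = 12.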

[0121] The SLF photoelectric transfer function is the optimal curve obtained based on the brightness distribution of an HDR scene while satisfying the optical characteristics of the human eye. Figure 3 is a schematic diagram of the SLF photoelectric transfer function relationship.

[0122] The SLF photoelectric transfer curve represents the conversion relationship between the linear signal value of an image pixel and the nonlinear signal value in the SLF domain, as shown in formula (1-3):

[0123]

[0124] The SLF photoelectric transfer function can be expressed as formula (1-4):

[0125]

[0126] where:

[0127] L represents the linear signal value, which is normalized to [0,1]. L' represents the nonlinear signal value, which ranges from [0,1]. p = 2.3 represents the SLF photoelectric transfer coefficient. m = 0.14 represents the SLF photoelectric transfer coefficient. a = 1.12762 represents the SLF photoelectric transfer coefficient. b = -0.12762 represents the SLF photoelectric transfer coefficient.

[0128] The principle of binocular stereoscopic (3D) vision is that because there is a distance between the two eyes, the two images obtained by the left and right eyes are different. The human visual system merges the images from the left and right eyes and perceives depth by the difference between the images from the left and right eyes.

[0129] A disparity map is used to describe the correspondence between features in the left and right eye images, or the correspondence between the image points of the same physical point in different images. There are two formats for binocular stereo vision images. One format uses binocular images, where binocular cameras simultaneously capture images of the same scene from both the left and right viewpoints. The disparity map for binocular images can be derived using disparity estimation algorithms. The other format combines a monocular image with a depth map. The depth map describes the distance of each pixel in the monocular image from the observer. Using the monocular image and the depth map, the images of the left and right viewpoints can be calculated. As an example, a depth map can be converted into a disparity map, with the relationship: Disparity = Interpupillary Distance × Focal Length / Depth.
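The depth-to-disparity relationship stated above can be sketched per pixel as follows. The treatment of non-positive depth values (mapped to zero disparity) is an assumption made to avoid division by zero, and the function and parameter names are hypothetical.

```python
def depth_to_disparity(depth_map, interpupillary_distance, focal_length):
    """Apply Disparity = Interpupillary Distance x Focal Length / Depth per pixel.

    depth_map is a 2D list of depths; the result has the same shape.
    Assumption: pixels with depth <= 0 are treated as invalid and get 0.0.
    """
    scale = interpupillary_distance * focal_length
    return [
        [scale / depth if depth > 0 else 0.0 for depth in row]
        for row in depth_map
    ]
```

Because disparity is inversely proportional to depth, nearby objects produce large disparities and distant objects produce disparities close to zero.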

[0130] In the embodiments of this application, a voxel extends the concept of a pixel from two-dimensional space to three-dimensional space, and is analogous to the pixel, the smallest unit in two-dimensional space. Two-dimensional images are composed of pixels, while three-dimensional images are composed of voxels. A voxel is the smallest unit of a three-dimensional (3D) image. It should be noted that a voxel itself does not possess absolute coordinates, but depth information can be extracted from its relative position.

[0131] The 3D mesh (polygon mesh) involved in the embodiments of this application is a collection of vertices and polygons representing the shape of a polyhedron in 3D computer graphics. For example, the surface of an object can be divided into multiple triangular regions, each with a set of parameters to control rendering. The 3D mesh contains three-dimensional spatial information. Depth information can also be extracted from the 3D mesh.

[0132] 3D point clouds refer to points obtained through 3D scanning, where each point contains three-dimensional coordinates. Other properties can also be added, such as color components (R, G, B) or the surface reflectance of an object. Since point clouds contain three-dimensional coordinates, the corresponding depth map can be directly derived from these coordinates.

[0133] Figure 4A is a schematic block diagram of an exemplary image display system provided in an embodiment of this application. The image display system may include an image acquisition device 100 and an image processing device 200. For example, the image acquisition device 100 can acquire binocular 3D images, which may be HDR images or standard dynamic range (SDR) images. The image processing device 200 can process the binocular 3D images acquired by the image acquisition device 100.

[0134] Image acquisition device 100 and image processing device 200 can be communicatively connected via link 102, and image processing device 200 can receive image data from image acquisition device 100 via link 102. Link 102 may include one or more communication media or devices. The one or more communication media may include wireless and / or wired communication media, such as radio frequency (RF) spectrum or one or more physical transmission lines. Optionally, the one or more communication media may form part of a packet-based network, such as a local area network, wide area network, or global network (e.g., the Internet). The one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from image acquisition device 100 to image processing device 200. In an optional case, link 102 may be a Bluetooth wireless link.

[0135] For example, the image acquisition device 100 includes an image source 11 and a communication interface 12. The communication interface 12 can also be called a transmission interface. Optionally, the image acquisition device 100 may also include an image processor 13. In specific implementations, the image processor 13, the image source 11, and the communication interface 12 may be hardware components in the image acquisition device 100, or they may be software programs in the image acquisition device 100, or the communication interface 12 may be a combination of a hardware module and a software program.

[0136] Image source 11 may include or be any type of image capture device, for example one used to capture real-world images; and / or any type of device for generating images or comments (for screen content encoding, some text on the screen is also considered part of the image or picture to be encoded), such as a computer graphics processor for generating computer-animated images; or any type of device for acquiring and / or providing real-world images or computer-animated images (e.g., screen content or virtual reality (VR) images), and / or any combination thereof (e.g., augmented reality (AR) images). For example, image source 11 may be a camera for capturing images or a memory for storing images. Image source 11 may also include any type of (internal or external) interface for storing previously captured or generated images and / or acquiring or receiving images. When image source 11 is a camera, image source 11 may be, for example, a local camera or a camera integrated in the image acquisition device; when image source 11 is a memory, image source 11 may be a local memory or a memory integrated in the image acquisition device. When image source 11 includes an interface, the interface may be, for example, an external interface for receiving images from an external video source. The external video source may be, for example, an external image capture device such as a camera, an external storage, or an external image generation device. The external image generation device may be, for example, an external computer graphics processor, a computer, or a server.

[0137] An image can be viewed as a two-dimensional array or matrix of pixels. The pixels in the array can also be called sampling points. In one optional case, to represent color, each pixel includes three color components. For example, in RGB format or color space, an image includes corresponding red, green, and blue sampling arrays. However, in video encoding, each pixel is typically represented in a luma / chroma format or color space. For example, a YUV format image includes a luma component indicated by Y (sometimes also indicated by L) and two chroma components indicated by U and V. The luma component Y represents the brightness or grayscale level intensity (e.g., both are the same in a grayscale image), while the two chroma components U and V represent chroma or color information components. Accordingly, a YUV format image includes a luma sampling array of luma sample values (Y) and two chroma sampling arrays of chroma values (U and V). RGB format images can be converted or transformed to YUV format, and vice versa; this process is also called color conversion or color format conversion. If the image is black and white, it may include only a luma sampling array. In this embodiment of the application, the image transmitted from the image source 11 to the image processor can also be referred to as the original image data.
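As a non-limiting illustration of color format conversion, the following sketch converts between RGB and YUV. This application does not specify a conversion matrix; the BT.709 luma weights used here, and the function names, are assumptions made for the example.

```python
def rgb_to_yuv_bt709(r, g, b):
    """Convert normalized R, G, B values in [0, 1] to (Y, U, V).

    ASSUMPTION: BT.709 luma weights are used; the text names no matrix.
    """
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b  # luma component
    u = (b - y) / 1.8556  # scaled blue-difference chroma, in [-0.5, 0.5]
    v = (r - y) / 1.5748  # scaled red-difference chroma, in [-0.5, 0.5]
    return y, u, v


def yuv_to_rgb_bt709(y, u, v):
    """Inverse of rgb_to_yuv_bt709 under the same BT.709 assumption."""
    r = y + 1.5748 * v
    b = y + 1.8556 * u
    g = (y - 0.2126 * r - 0.0722 * b) / 0.7152
    return r, g, b
```

A gray pixel (R = G = B) maps to zero chroma, which matches the statement above that a black-and-white image may include only a luma sampling array.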

[0138] Image processor 13 is used to perform image processing, such as brightness mapping, tone mapping, color format conversion (e.g., from RGB to YUV format), color gamut conversion, saturation adjustment, color correction, resolution adjustment, or noise reduction.

[0139] Communication interface 12 can be used to receive image data that has undergone image processing, and can transmit the image data to image processing device 200 for further image processing via link 102, or to memory for storage. For example, communication interface 12 can be used to encapsulate the image data into a suitable format, such as data packets, for transmission over link 102.

[0140] The image processing device 200 includes a communication interface 21, an image processor 22, and a display device 23. These are described below:

[0141] Communication interface 21 can be used to receive image-processed image data from image acquisition device 100 or any other source, such as a storage device. Specific examples of communication interfaces 12 and 21 can be found in the foregoing description of the interfaces, and will not be repeated here. Communication interface 21 can be used to transmit or receive image-processed image data via link 102 between image acquisition device 100 and image processing device 200 or any other type of network. Communication interface 21 can, for example, be used to decapsulate data packets transmitted by communication interface 12 to obtain image-processed image data.

[0142] Both communication interface 21 and communication interface 12 can be configured as unidirectional or bidirectional communication interfaces, and can be used, for example, to send and receive messages to establish connections, acknowledge and exchange any other information related to the communication link and / or, for example, image data processed by image processing and / or data transmission. For example, communication interface 21 and communication interface 12 can be any type of interface according to any proprietary or standardized interface protocol, such as High Definition Multimedia Interface (HDMI), Mobile Industry Processor Interface (MIPI), MIPI-standardized Display Serial Interface (DSI), Video Electronics Standards Association (VESA) standardized Embedded Display Port (eDP), Display Port (DP), or V-By-One interface (a digital interface standard developed for image transmission), as well as various wired or wireless interfaces, optical interfaces, etc.

[0143] Image processor 22 is used to perform tone mapping processing on image data that has undergone image processing, so as to obtain tone-mapped image data. The processing performed by image processor 22 may also include: super-resolution, color format conversion (e.g., from YUV format to RGB format), noise reduction, color gamut conversion, saturation adjustment, brightness mapping, upsampling, downsampling, and image sharpening, etc., and can also be used to transmit tone-mapped image data to display device 23. It should be understood that image processor 13 and image processor 22 can be general-purpose central processing units (CPUs), systems on chips (SOCs), processors integrated on SOCs, separate processor chips, or controllers, etc.; image processor 22 and image processor 13 can also be dedicated processing devices, such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), dedicated video or graphics processors, graphics processing units (GPUs), and neural-network processing units (NPUs), etc. Image processor 13 and image processor 22 can also be processor groups consisting of multiple processors, which are coupled to each other through one or more buses.

[0144] Display device 23 is used to receive tone-mapped image data and display the image to a user or viewer. Display device 23 can be or may include any type of display for presenting the reconstructed image, such as an integrated or external display or monitor. For example, the display may include a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, a plasma display, a projector, a micro-LED display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP), or any other type of display. Optionally, display device 23 itself may have image processing capabilities, in which case tone mapping of the image can also be performed at the display device.

[0145] It should be understood that, in Figure 4A, the image acquisition device 100, the image processing device 200, and the display device 23 are illustrated as separate devices. In one optional case, a single image processing device may simultaneously possess the functions of both the image acquisition device 100 and the image processing device 200. In another optional case, a single image processing device may simultaneously possess the functions of both the image processing device 200 and the display device 23. In yet another optional case, a single image processing device may simultaneously possess the functions of all three: the image acquisition device 100, the image processing device 200, and the display device 23. For example, a smartphone may have a camera, an image processor, and a display screen, where the camera corresponds to the image acquisition device 100, the image processor corresponds to the image processing device 200, and the display screen corresponds to the display device 23. Similarly, a smart TV may have a webcam, an image processor, and a display screen, where the webcam corresponds to the image acquisition device 100, the image processor corresponds to the image processing device 200, and the display screen corresponds to the display device 23.

[0146] In one possible scenario, the image acquisition device 100 needs to transmit a video stream consisting of multiple consecutive images to the image processing device 200. Before transmitting the video stream to the image processing device 200, the image acquisition device 100 can encode each image in the video stream into a bitstream and transmit it to the image processing device 200 via link 102. In this scenario, after receiving the encoded bitstream, the image processing device 200 decodes the bitstream to obtain each image in the video stream and then performs further processing on each image. As an example, in this scenario, the image acquisition device 100 can be called an encoder or encoding device, and the image processing device 200 can be called a decoder or decoding device.

[0147] In one optional embodiment, the image processing device includes a hardware layer, an operating system layer running on top of the hardware layer, and an application layer running on top of the operating system layer. The hardware layer includes hardware such as a CPU, a memory management unit (MMU), and memory (also known as main memory). The operating system can be any one or more computer operating systems that implement business processing through processes, such as Linux, Unix, Android, iOS, or Windows. The application layer includes applications such as browsers, address books, word processing software, and instant messaging software. Furthermore, this application embodiment does not specifically limit the specific structure of the execution subject of the method provided in this application embodiment, as long as it can perform image processing by running relevant code. For example, the execution subject of the method provided in this application embodiment can be an electronic device, or a functional module in an electronic device capable of calling and executing programs, such as a processor in an electronic device.

[0148] It should be understood that the image acquisition device 100 and the image processing device 200 may include any type of handheld or stationary device, such as a notebook or laptop computer, mobile phone, smartphone, tablet computer, camera, desktop computer, set-top box, television, in-vehicle device, display device, digital media player, video game console, video streaming device (e.g., content service server or content distribution server), broadcast receiver device, or broadcast transmitter device, and may or may not use any type of operating system.

[0149] An image acquisition device can generate an image or video of a natural scene by receiving light signals from the scene. To facilitate image or video transmission, the light signals need to be converted into electrical signals, and the image information of each pixel is recorded using brightness or chromaticity values with a fixed range (e.g., grayscale values between 0 and 255). A display device can determine the brightness of an object at the time of capture based on the photoelectric transfer function and the brightness or grayscale value of each pixel in the image; that is, an electronic device can convert YUV or RGB information into brightness in nits. In one optional scenario, the image acquisition device is called the front-end device, and the display device is called the back-end device. However, the brightness of the object may exceed the display capability of the display device. Because the brightness information acquired by the image acquisition device does not match the brightness display capability of the display device, there are situations where a display device with low brightness display capability displays a high-brightness image, and vice versa. For example, a front-end device might capture a 4000 nit light signal, while the back-end display device (TV, tablet, etc.) only has an HDR display capability of 1000 nits. In these cases, tone mapping needs to be performed on the image acquired by the image acquisition device so that the image matches the display capability of the display device. For instance, a tone mapping curve can be used to tone map a high dynamic range image onto a display device with a low dynamic range capability. It should be understood that this tone mapping curve can be determined by the display device or by a processing device external to the display device.

[0150] Tone mapping methods can be divided into static and dynamic methods. Static mapping performs an overall tone mapping process on the same video content or the same hard disk content based on a single set of data; that is, the processing curve is typically the same for all scenes. Static mapping carries less information and has a simpler processing flow. However, using the same curve for every scene can lead to the loss of information in some scenes. For example, if the curve focuses on protecting bright areas, details in extremely dark scenes may be lost, and in extreme cases, extremely dark scenes may not be visible after tone mapping. Dynamic mapping, on the other hand, adjusts the curve dynamically for specific regions, for each scene, or for each frame. Dynamic mapping generally yields better results than static mapping.

[0151] When performing tone mapping on binocular 3D images, the brightness of the binocular 3D images is reduced, which leads to a loss of global or local contrast. Only when the contrast is strong enough can the visual systems of the left and right eyes merge the two images. Therefore, after tone mapping, the loss of global or local contrast in binocular 3D images affects the user's perception of the displayed binocular 3D images.

[0152] As shown in Figure 4B, after tone mapping, the brightness of a binocular image generally decreases, leading to a loss of global or local contrast. This loss of contrast translates into a decline in 3D perception. The applicant's research has found that contrast and the 3D perception effect are correlated with the parallax information of the binocular image. For example, as shown in Figure 4B, assume that objects A and B differ greatly in depth and are both far from the background. In binocular 3D display, the human eye can then easily distinguish the depths of the background and object A, as well as the depths of the background and object B. Even after tone mapping reduces the contrast of the binocular 3D image, within a certain range, the human eye can still easily perceive the depths of objects A and B relative to the background, thus perceiving the 3D effect. However, if the depths of A and B are very similar, 3D perception becomes more sensitive to the decrease in contrast. After tone mapping, the human eye may not perceive the depth difference between objects A and B, and might even mistakenly believe that A and B are the same object. Therefore, to better perceive objects A and B, it is necessary to protect their contrast and reduce the impact of tone mapping on their contrast.

[0153] Based on this, embodiments of this application provide an image processing method and apparatus that reduce the loss of 3D effect caused by the decrease in contrast by adjusting dynamic mapping. In the tone mapping process of binocular 3D images, the contrast and parallax information of the image source are combined so that the 3D perception effect of the tone-mapped image is as close as possible to the 3D perception effect of the image source. In other words, the contrast of the tone-mapped binocular image meets the parallax requirements, that is, it meets the user's 3D perception requirements.

[0154] Figure 5 is a schematic flowchart of an image processing method provided in this application. The method shown in Figure 5 can be performed by one or more processors used for tone mapping. These one or more processors may include the image processor in the image processing device 200 shown in Figure 4A, and / or the image processor in the image acquisition device 100 shown in Figure 4A. In other words, the method shown in Figure 5 can be implemented at the encoding end, at the decoding end, or jointly at the encoding end and the decoding end.

[0155] 501. Acquire the binocular image and generate the first tone mapping curve for the binocular image. The binocular image is an HDR image.

[0156] This application involves two different tone mapping curves, which are referred to as the first tone mapping curve and the second tone mapping curve for easy distinction. A binocular image can be understood as two 2D images, which are then stitched together to form a single 2D image. A tone mapping curve is then calculated based on the content of this 2D image. In this application, the tone mapping curve generated based on the 2D image content is called the first tone mapping curve, which can also be called an image-based tone mapping curve or a 2D image tone mapping curve. Other names are also possible, and this application does not impose any specific limitations on this. The tone mapping curve generated according to the scheme provided in this application, combined with the image's depth or parallax, is called the second tone mapping curve in this application, which can also be called a parallax-based tone mapping curve or a 3D image tone mapping curve.

[0157] In this embodiment of the application, any method of generating a tone mapping curve based on image content can be used to generate the first tone mapping curve. This embodiment of the application does not specifically limit this method.

[0158] 502. Obtain the first contrast of the binocular image.

[0159] This application does not specifically limit the method of obtaining image contrast. As an example, the following describes one possible method of obtaining image contrast.

[0160] The local root mean square (RMS) contrast within an M×M window centered at each pixel (x,y) is calculated using the following formula (2-1).

[0161]

[0162] Where $C_{RMS}(x,y)$ represents the local root mean square contrast of pixel (x,y); the mean term in formula (2-1) represents the average pixel value of all pixels within the M×M window centered at pixel (x,y); I(x,y) represents the pixel value of pixel (x,y); and (x,y) represents the coordinates of the pixel, where x is the horizontal coordinate and y is the vertical coordinate. The pixel value can be one of the red, green, and blue (RGB) components; max(R, G, B), that is, the maximum of the R, G, and B components at each pixel; the luminance Y (the Y in YCbCr); or another value.
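As a non-limiting sketch, a local root mean square contrast over an M×M window can be computed as shown below. Formula (2-1) is not reproduced in this text, so the definition used here (the square root of the mean squared deviation from the window mean) is an assumption; dividing the result by the window mean is a common variant. The edge handling (clamping coordinates to the image border) and the function name are also assumptions.

```python
import math


def local_rms_contrast(image, window=3):
    """Local RMS contrast of each pixel over a window x window neighborhood.

    image is a list of rows of pixel values. ASSUMPTIONS: contrast is the
    root mean squared deviation from the window mean, and coordinates
    outside the image are clamped to the nearest border pixel.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    height, width = len(image), len(image[0])
    half = window // 2
    contrast = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-half, half + 1)
                for dx in range(-half, half + 1)
            ]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            contrast[y][x] = math.sqrt(variance)
    return contrast
```

A uniform region yields zero contrast everywhere, while an isolated bright pixel yields a high contrast at and around its position.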

[0163] Furthermore, the image signal is decomposed at different spatial frequencies to obtain image signals in different spatial frequency bands. For each pixel, the frequency corresponding to the maximum pixel value after decomposition is obtained, and this frequency is taken as the dominant frequency f(x,y) of that pixel. For example, a cosine log filter can be used to decompose the image signal at different spatial frequencies.

[0164] Furthermore, the contrast of each pixel is obtained based on its local root mean square contrast and contrast sensitivity function (CSF). CSF is related to spatial frequency and luminance and is used to characterize the human eye's sensitivity to spatial frequency.

[0165] For example, the contrast of each pixel can be obtained by the following formula (2-2).

[0166]

[0167] C(x,y) represents the contrast of each pixel.

[0168] 503. Obtain the disparity map between the binocular images.

[0169] The method for obtaining the disparity map by performing disparity estimation on the binocular images in this embodiment is not specifically limited. As an example, the obtained disparity map can be expressed in arcseconds (arcsec).

[0170] The disparity D(x,y) of each pixel represents the distance, in pixels along the disparity direction, between pixel I1(x,y) in one of the binocular images and the corresponding pixel I2(x+D(x,y),y) in the other image.

[0171] It should be noted that the embodiments of this application do not specifically limit the execution order of steps 503 and 502. For example, step 503 can precede step 502, step 502 can precede step 503, or steps 502 and 503 can be executed simultaneously.

[0172] 504. Obtain the second contrast of the binocular image after tone mapping using the first tone mapping curve.

[0173] In some embodiments, the first tone mapping curve can be used to perform tone mapping on the binocular image, and the contrast is then calculated for the tone-mapped binocular image to obtain the second contrast. When calculating the contrast of the tone-mapped binocular image, the same calculation method as for the first contrast can be used, or a different calculation method can be used; this application does not specifically limit this.

[0174] In other embodiments, the contrast loss value can be estimated based on the first tone mapping curve, and then the second contrast value after tone mapping can be estimated by combining the first contrast value and the contrast loss value.

[0175] As an example, the contrast loss can be estimated using the derivative of the function of the first tone mapping curve. The physical meaning of the derivative is the slope of the curve at a given pixel value L. This slope is typically less than 1, representing a decrease or loss of contrast. The slope may also be equal to 1, in which case the contrast remains unchanged, or greater than 1, in which case the contrast at that pixel value L is enhanced.

[0176] Furthermore, the second contrast ratio can be estimated using the following formula (2-3):

[0177]

[0178] Where C(x,y) represents the first contrast of pixel (x,y), $C_{tm}(x,y)$ represents the estimated second contrast of pixel (x,y), $L'_{2D}(I(x,y))$ represents the derivative of the function of the first tone mapping curve, $L_{2D}(L)$ represents the function of the first tone mapping curve, and I(x,y) represents the pixel value of pixel (x,y).
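As an illustrative sketch only, the estimate described above can be written as follows. Formula (2-3) is not reproduced in this text; the sketch assumes that the second contrast is the first contrast multiplied by the slope of the first tone mapping curve at the pixel value. The curve is passed in as a callable on normalized values in [0,1], the slope is approximated by a finite difference, and all names are hypothetical.

```python
def curve_slope(curve, value, step=1e-4):
    """Finite-difference slope of a tone mapping curve at a pixel value.

    ASSUMPTION: curve is a callable defined on normalized values in [0, 1].
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be normalized to [0, 1]")
    low = max(0.0, value - step)
    high = min(1.0, value + step)
    return (curve(high) - curve(low)) / (high - low)


def estimate_second_contrast(first_contrast, pixel_value, curve):
    """Estimate the contrast after tone mapping.

    ASSUMPTION: second contrast = first contrast x slope of the curve at the
    pixel value; a slope below 1 therefore models a loss of contrast.
    """
    return first_contrast * curve_slope(curve, pixel_value)
```

For example, a curve that halves every pixel value has a slope of 0.5 everywhere, so a first contrast of 0.2 is estimated to become 0.1 after tone mapping.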

[0179] 505. Obtain the second tone mapping curve based on the first contrast ratio, the second contrast ratio, and the disparity map.

[0180] The second tone mapping curve is used to ensure that the contrast of the tone-mapped binocular image meets the minimum parallax change requirement for human eye 3D perception.

[0181] In one possible implementation, when obtaining the second tone mapping curve based on the first contrast ratio, the second contrast ratio, and the disparity map, a gain coefficient for contrast compensation can be calculated using the first contrast ratio, the second contrast ratio, and the disparity map, and then the second tone mapping curve capable of achieving contrast compensation can be obtained using the gain coefficient.

[0182] For example, as shown in Figure 6, step 505, which obtains the second tone mapping curve based on the first contrast, the second contrast, and the disparity map, can be implemented through steps 601-603.

[0183] 601. Determine the disparity gradient of the first pixel based on the disparity map. The first pixel can be any pixel in the binocular image.

[0184] As an example, let D(x,y) represent the disparity of pixel (x,y). Then the disparity gradient of pixel (x,y) can be determined by the following formula (2-4).

[0185]
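As a non-limiting sketch, a disparity gradient can be computed from a disparity map as shown below. Formula (2-4) is not reproduced in this text; the sketch assumes the gradient magnitude obtained from finite differences in the horizontal and vertical directions, with one-sided differences at the image border. The function name is hypothetical.

```python
import math


def disparity_gradient(disparity):
    """Gradient magnitude of a disparity map given as a list of rows.

    ASSUMPTION: the gradient is the Euclidean norm of central differences
    in x and y, falling back to one-sided differences at the border.
    """
    height, width = len(disparity), len(disparity[0])
    gradient = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            x0, x1 = max(x - 1, 0), min(x + 1, width - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, height - 1)
            dx = (disparity[y][x1] - disparity[y][x0]) / (x1 - x0) if x1 > x0 else 0.0
            dy = (disparity[y1][x] - disparity[y0][x]) / (y1 - y0) if y1 > y0 else 0.0
            gradient[y][x] = math.hypot(dx, dy)
    return gradient
```

A region of constant disparity has a gradient of zero, while a disparity that increases by one unit per pixel has a gradient of one.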

[0186] 602. The gain coefficient of the first pixel is obtained based on the first contrast ratio, the second contrast ratio, and the disparity gradient of the first pixel. The gain coefficient is used to compensate for the decrease in contrast of the first pixel caused by tone mapping.

[0187] As an example, when obtaining the gain coefficient based on the first contrast ratio, the second contrast ratio, and the parallax gradient, it can be determined by the following formula (2-5).

[0188]

[0189] Where Q(x,y) represents the gain coefficient of the pixel, x represents the horizontal coordinate of the pixel, y represents the vertical coordinate of the pixel, D′(x,y) represents the disparity gradient of the pixel, C(x,y) represents the first contrast of the pixel, and $C_{tm}(x,y)$ represents the second contrast of the pixel. K() represents the function relating parallax perception sensitivity to contrast, and $K^{-1}()$ represents the inverse of that function. Parallax perception sensitivity can be understood as the degree to which the human eye is sensitive to changes in parallax, or as the minimum parallax threshold at which the human eye can perceive changes in parallax given the contrast of an image.

[0190] As an example, the relationship function K() between parallax perception sensitivity and contrast can be obtained by fitting binocular 3D parallax perception test data, such as using Cormack's binocular 3D parallax perception test data to obtain K(). For example, the fitted K() is shown in Equation (2-6).

[0191]

[0192] Where c represents contrast, K(c) represents parallax perception sensitivity, and J represents the fitting coefficient. As an example, when using Cormack's binocular 3D parallax perception test data for fitting, J is set to 1.

[0193] Understandably, the smaller the contrast value c, the larger K(c) becomes. Generally, after tone mapping, the contrast of an image decreases, and the perceptual threshold for parallax changes increases. In areas with small parallax changes, that is, areas with small depth changes, the 3D perception effect weakens. To compensate for the change in parallax perception caused by the contrast change resulting from tone mapping, a gain coefficient for each pixel can be calculated using K(c), the first contrast before tone mapping, and the second contrast after tone mapping, as shown in formula (2-5). Using the gain coefficient, the contrast after tone mapping is increased to the minimum parallax perception threshold, i.e., $K^{-1}(D'(x,y))$, thus allowing the human eye to perceive a 3D effect. In some scenarios, if the contrast of the image source is less than this threshold, the solution provided in this application can enhance the contrast to the contrast level of the image source, ensuring that the contrast after tone mapping does not decrease significantly and achieving the same 3D perception effect as the image source.
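As an illustrative sketch only, the gain computation described above can be written as follows. Formulas (2-5) and (2-6) are not reproduced in this text. The sketch assumes that the target contrast is the smaller of the minimum perception threshold $K^{-1}(D'(x,y))$ and the first contrast, and that the gain is that target divided by the second contrast. The function `hypothetical_k_inverse` is a placeholder based on an assumed relationship K(c) = J / c; it is not the fitted function of this application and only reflects the property that K(c) grows as c shrinks.

```python
import math


def hypothetical_k_inverse(disparity_gradient, j=1.0):
    """Placeholder for the inverse of K(): contrast needed for a given gradient.

    ASSUMPTION: K(c) = j / c, hence K^-1(d) = j / d. This is NOT the fitted
    function of the application. A zero gradient needs unbounded contrast.
    """
    if disparity_gradient <= 0.0:
        return math.inf
    return j / disparity_gradient


def gain_coefficient(first_contrast, second_contrast, disparity_gradient,
                     k_inverse=hypothetical_k_inverse):
    """Gain that lifts the tone-mapped contrast toward the target contrast.

    ASSUMPTION: target = min(K^-1(D'), first contrast), gain = target /
    second contrast. A gain of 1.0 is returned when the second contrast
    is zero, since no finite gain can compensate in that case.
    """
    if second_contrast <= 0.0:
        return 1.0
    target = min(k_inverse(disparity_gradient), first_contrast)
    return target / second_contrast
```

With these assumptions, a pixel whose contrast fell from 0.4 to 0.2 receives a gain of 2 when the perception threshold exceeds the source contrast, and a gain of 1 when the tone-mapped contrast already equals the threshold.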

[0194] 603. Obtain the second tone mapping curve based on the gain coefficient of each pixel in the binocular image.

[0195] It should be noted that the solution provided in this application embodiment can be applied to the acquisition of local tone mapping curves or global tone mapping curves, and this application does not make any specific limitations on it.

[0196] In one possible implementation, when performing step 603 to obtain the second tone mapping curve based on the gain coefficient of each pixel in the binocular image, any of the following possible methods can be used.

[0197] In the first possible approach, the second tone mapping curve can be generated through histogram equalization. When calculating the histogram, the gain coefficient can be used as a weight to obtain a weighted histogram. Based on the weighted histogram, the second tone mapping curve can be generated using histogram equalization.

[0198] As an example, a weighted histogram can satisfy the conditions shown in formula (2-7).

[0199]

[0200] Where Q(x,y) represents the gain coefficient of a pixel, I(x,y) represents the pixel value of a pixel, and l i h represents the upper boundary of the i-th histogram interval in the weighted histogram. i This represents the value of the i-th histogram interval of the weighted histogram.

[0201] The above w(x,y) represents the adjustment factor for each pixel, which can be determined according to requirements. For example, the weight of edge pixels in a binocular image can be increased to improve contrast. Alternatively, different values ​​of w(x,y) can be used for different user groups. For instance, some user groups have lower sensitivity to 3D perception than the general population, so the proportion of weight in w(x,y) can be increased to improve contrast. For example, the value of w(x,y) can be 1.

[0202] In some embodiments, l_i can be determined using the following formula (2-8):

[0203]

[0204] Where i = 0, ..., N, l_min represents the minimum pixel value in the binocular image, l_max represents the maximum pixel value in the binocular image, and N represents the number of histogram intervals into which the range of pixel values in the binocular image is divided.

[0205] In other embodiments, l_i can be determined using the following formula (2-9):

[0206]

[0207] In this embodiment, the pixel values of the pixels in the binocular image can be normalized first, so that their value range is normalized to the interval from 0 to 1, and then the weighted histogram can be calculated in the manner provided above.
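For illustration, the interval boundaries and the normalization step might look like the sketch below. Formulas (2-8) and (2-9) are not reproduced in this passage; equally spaced boundaries between l_min and l_max (and between 0 and 1 after normalization) and min-max normalization are assumptions, and the function names are hypothetical.

```python
from typing import List, Sequence


def uniform_boundaries(l_min: float, l_max: float, n: int) -> List[float]:
    """Assumed boundaries l_0..l_N, equally spaced between l_min and l_max."""
    return [l_min + (l_max - l_min) * i / n for i in range(n + 1)]


def normalize(pixels: Sequence[float]) -> List[float]:
    """Min-max normalize pixel values to the interval [0, 1]."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat image has no range to stretch; map everything to 0.
        return [0.0 for _ in pixels]
    return [(p - lo) / (hi - lo) for p in pixels]
```

After normalization, `uniform_boundaries(0.0, 1.0, n)` gives the boundaries for the normalized case.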

[0208] It should be noted that this application is also applicable to local tone mapping processing. The binocular image can be divided into different regions, a weighted histogram can be calculated for each region, and then a tone mapping curve corresponding to each region can be generated using the local histogram equalization method.
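A rough sketch of the per-region variant follows. It assumes pixel values already normalized to [0, 1], rectangular tiles of fixed size, equally spaced histogram intervals, and w(x,y) = 1; none of these choices is fixed by the application, and the function name is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def tile_weighted_histograms(
    image: Sequence[Sequence[float]],
    gains: Sequence[Sequence[float]],
    tile_h: int,
    tile_w: int,
    n_bins: int,
) -> Dict[Tuple[int, int], List[float]]:
    """Compute one gain-weighted histogram per (tile_row, tile_col) region."""
    hists: Dict[Tuple[int, int], List[float]] = {}
    for r, (row, gain_row) in enumerate(zip(image, gains)):
        for c, (value, q) in enumerate(zip(row, gain_row)):
            key = (r // tile_h, c // tile_w)
            hist = hists.setdefault(key, [0.0] * n_bins)
            # Equal-width bins on [0, 1]; value == 1.0 goes to the last bin.
            b = min(int(value * n_bins), n_bins - 1)
            hist[b] += q
    return hists
```

Each returned histogram can then be equalized separately to obtain the tone mapping curve of its region.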

[0209] In the second possible approach, a histogram equalization method can be used to generate the second tone mapping curve. When calculating the histogram, the gain coefficient can be used as a weight to obtain a weighted histogram. Based on this weighted histogram, a histogram equalization method is used to generate the third tone mapping curve. Then, the first and third tone mapping curves are weighted to obtain the second tone mapping curve. For example, the second tone mapping curve can be obtained by merging the two curves using a weighted average. Of course, the weights of the two tone mapping curves can also be determined separately according to requirements.

[0210] It should be noted that in some scenarios, the display device's ability to display binocular 3D images is insufficient. In such cases, the weight of the first tone mapping curve can be configured to be larger, and the weight of the third tone mapping curve smaller. For example, the weight of the first tone mapping curve can be 1, and the weight of the third tone mapping curve can be 0. In other scenarios, when it is necessary to enhance the 3D perception effect, the weight of the third tone mapping curve can be configured to be larger, and the weight of the first tone mapping curve smaller. For example, the weight of the third tone mapping curve can be 1, and the weight of the first tone mapping curve can be 0.
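The weighted fusion of the first and third tone mapping curves can be sketched as a pointwise weighted average of two curves sampled at the same input levels. Normalizing by the sum of the weights is an assumption made here so that the result stays within the range of the two inputs; the function name is hypothetical.

```python
from typing import List, Sequence


def fuse_curves(
    first: Sequence[float],
    third: Sequence[float],
    w_first: float,
    w_third: float,
) -> List[float]:
    """Pointwise weighted average of two curves sampled at the same inputs."""
    if len(first) != len(third):
        raise ValueError("curves must be sampled at the same input levels")
    total = w_first + w_third
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [(w_first * a + w_third * b) / total for a, b in zip(first, third)]
```

Weights (1, 0) return the first curve unchanged, which matches the case of a display device with limited 3D capability; weights (0, 1) return the third curve.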

[0211] When calculating the weighted histogram, the method described in the first possible approach can be used, which will not be repeated here.

[0212] It should be noted that this application is also applicable to local tone mapping processing. The binocular image can be divided into different regions, a weighted histogram can be calculated for each region, and then a tone mapping curve corresponding to each region can be generated using the local histogram equalization method.

[0213] In the third possible approach, instead of using histogram equalization to generate the tone mapping curve, the gain coefficient can be applied to the calculation of the function of the second tone mapping curve. For example, the gain coefficient can be used to solve the parameters of the second tone mapping curve.

[0214] As an example, the parameters of the second tone mapping curve are solved based on the gain coefficient using the following formula (2-10);

[0215]

[0216] Where L′_2D represents the derivative of the first tone mapping curve function, L′_3D represents the derivative of the second tone mapping curve function, Q(x,y) represents the gain coefficient of the pixel, and arg min() represents the parameter values of the second tone mapping curve that minimize the value in parentheses.
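To make the idea of solving curve parameters from the gain coefficient concrete, here is a toy sketch. Formula (2-10) is not reproduced in this passage, so the objective used here, the sum over pixels of (L′_3D(I) − Q·L′_2D(I))², is an assumption. The one-parameter power-law family f(L) = L^n and the grid search are also stand-ins chosen for brevity; they are not the curve family or solver of this application.

```python
from typing import Callable, Sequence


def fit_exponent(
    pixels: Sequence[float],
    gains: Sequence[float],
    first_derivative: Callable[[float], float],
    candidates: Sequence[float],
) -> float:
    """Pick n so that the slope of L**n best matches Q times the first slope.

    pixels: pixel values I(x, y), assumed strictly positive.
    gains: gain coefficients Q(x, y), same order as pixels.
    first_derivative: derivative of the first tone mapping curve.
    candidates: exponent values to try (simple grid search).
    """

    def cost(n: float) -> float:
        # Derivative of L**n is n * L**(n - 1).
        return sum(
            (n * p ** (n - 1) - q * first_derivative(p)) ** 2
            for p, q in zip(pixels, gains)
        )

    return min(candidates, key=cost)
```

When every gain is 1, the best candidate reproduces the slope of the first curve, which is a quick sanity check on the objective.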

[0217] For example, taking the tone mapping curve as the SMPTE ST 2094-10 standard curve as an example, the first tone mapping curve is represented by the following formula (2-11).

[0218]

[0219] Where L represents the pixel value, before tone mapping, that is input to the first tone mapping curve, and the remaining symbols represent the adjustment parameters of the first tone mapping curve. These adjustment parameters can be calculated based on the content of the binocular images.

[0220] Suppose that the second tone mapping curve is represented by the following formula (2-12).

[0221]

[0222] Where L represents the pixel value, before tone mapping, that is input to the second tone mapping curve, and c1, c2, and n represent the adjustment parameters of the second tone mapping curve.

[0223] By solving the optimization problem of formula (2-10) above, the values of c1, c2, and n can be obtained. Specifically, they can be calculated as follows:

[0224]

[0225] In the fourth possible approach, a second tone mapping curve can be obtained by weighted fusion of the first tone mapping curve and the tone mapping curve obtained based on the gain coefficient. When obtaining the tone mapping curve based on the gain coefficient, the method shown in formula (2-10) above can be used.

[0226] The solutions provided in the embodiments of this application will be described in detail below with reference to specific application scenarios.

[0227] In the first possible scenario, the encoding end determines the weighted histogram and transmits it to the decoding end. The decoding end then determines the second tone mapping curve based on the weighted histogram. For example, the encoding end could be an image acquisition device, and the decoding end could be an image processing device.

[0228] Example 1 takes the first possible approach to obtaining the second tone mapping curve as an example; see Figure 7A and Figure 7B. Figure 7A shows a possible image processing flow.

[0229] 701a, The encoding end obtains the first contrast ratio of each pixel in the binocular image. See step 502 for details, which will not be repeated here.

[0230] A stereo image can be any video frame from the video stream.

[0231] 702a, The encoding end performs disparity estimation on the stereo image to obtain the disparity map of each pixel in the stereo image. For details, please refer to the relevant description in step 503, which will not be repeated here.

[0232] 703a, the encoding end determines the disparity gradient of each pixel based on the disparity map. For example, the method for determining the disparity gradient can be found in the relevant description in step 601, and will not be repeated here.

[0233] 704a, the encoding end performs tone mapping processing on the binocular image according to the first tone mapping curve, and obtains the second contrast of each pixel in the tone-mapped binocular image. For details, please refer to the relevant description in step 504, which will not be repeated here.

[0234] 705a: The encoding end obtains the gain coefficient of each pixel based on the first contrast ratio, the second contrast ratio, and the disparity gradient. For details, please refer to the relevant description in step 602; it will not be repeated here.

[0235] 706a: The encoder uses the gain coefficient of each pixel as a weight to obtain the weighted histogram of the stereo image.

[0236] 707a, the encoding end encodes the weighted histogram into the bitstream and sends it to the decoding end. For example, the weighted histogram can be encoded into the dynamic metadata of the bitstream.

[0237] The bitstream may include stereo image data and metadata describing the stereo image data. Both metadata and stereo image data are encoded into the bitstream. The metadata contains a description of the stereo image data. For example, the metadata may include static metadata and dynamic metadata. Static metadata can be used to describe the production environment of the entire video, such as information about the monitor used for color grading and correction, peak brightness, RGB coordinates, or white point coordinates, etc. For example, dynamic metadata generally includes a description of the stereo image content, such as the highest brightness, lowest brightness, and average brightness of the stereo image.

[0238] In this embodiment, a weighted histogram can be added to the dynamic metadata to generate the tone mapping curve between the stereo image and the display device. It should be understood that the bitstream may include data from multiple consecutive stereo images, and the weighted histogram included in the dynamic metadata may vary depending on the stereo images in the bitstream.
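Carrying the weighted histogram in per-frame dynamic metadata can be pictured with the sketch below. The container `DynamicMetadata`, its field names, and the JSON serialization are purely hypothetical stand-ins; a real bitstream would use the binary metadata syntax of the applicable standard, which this passage does not specify.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DynamicMetadata:
    """Hypothetical per-frame metadata; field names are illustrative only."""

    max_luminance: float
    min_luminance: float
    avg_luminance: float
    weighted_histogram: List[float] = field(default_factory=list)


def encode_metadata(meta: DynamicMetadata) -> bytes:
    """Serialize the metadata (JSON used here only as a stand-in format)."""
    return json.dumps(asdict(meta)).encode("utf-8")


def decode_metadata(payload: bytes) -> DynamicMetadata:
    """Recover the metadata, including the weighted histogram."""
    return DynamicMetadata(**json.loads(payload.decode("utf-8")))
```

Because the metadata is per frame, each stereo image in the bitstream can carry its own weighted histogram.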

[0239] 708a, the decoding end obtains the second tone mapping curve of the binocular image based on the weighted histogram.

[0240] For example, the second tone mapping curve can be generated from the weighted histogram using histogram equalization. See the description of the first possible approach above for details, which will not be repeated here.

[0241] For example, the decoding end can decode the data of the stereo image from the bitstream, as well as the weighted histogram of the stereo image from the dynamic metadata, and then further generate a second tone mapping curve based on the weighted histogram.

[0242] 709a, the decoding end performs tone mapping processing on the binocular image according to the second tone mapping curve and then outputs it.

[0243] As an example, the decoding end may also perform other processing on the stereo image before tone mapping processing, or perform other processing on the tone-mapped stereo image after tone mapping processing and before output. This application embodiment does not specifically limit this.
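The final tone mapping step at the decoding end can be sketched as a lookup into a sampled curve. Linear interpolation between the histogram boundaries and clamping outside the range are assumptions, as are the function and parameter names.

```python
from bisect import bisect_right
from typing import Sequence


def apply_curve(
    value: float, boundaries: Sequence[float], curve: Sequence[float]
) -> float:
    """Map one pixel value through a curve sampled at the given boundaries.

    boundaries: strictly increasing input levels [l_0, ..., l_N].
    curve: output level at each boundary, same length as boundaries.
    """
    if value <= boundaries[0]:
        return curve[0]
    if value >= boundaries[-1]:
        return curve[-1]
    i = bisect_right(boundaries, value) - 1
    # Linear interpolation inside interval [l_i, l_{i+1}).
    t = (value - boundaries[i]) / (boundaries[i + 1] - boundaries[i])
    return curve[i] + t * (curve[i + 1] - curve[i])
```

The same function would be applied to every pixel of both views of the stereo image so that the two views stay consistent.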

[0244] Figure 7B shows another possible image processing flow.

[0245] For steps 701b-703b, please refer to steps 701a-703a, which will not be repeated here.

[0246] 704b: The encoding end estimates the contrast loss value of each pixel in the binocular image after mapping by the first tone mapping curve based on the first tone mapping curve. For details, please refer to the relevant description in step 504, which will not be repeated here.

[0247] 705b, the encoder estimates the second contrast of the binocular image after mapping by the first tone mapping curve based on the contrast loss value of each pixel.

[0248] For steps 706b-710b, see steps 705a-709a, which will not be repeated here.

[0249] Example 2 takes the second possible approach to obtaining the second tone mapping curve as an example; see Figure 8A and Figure 8B. Figure 8A shows a possible image processing flow.

[0250] For steps 801a-807a, see steps 701a-707a; they will not be repeated here.

[0251] In some embodiments, the encoder may also encode the parameters of the first tone mapping curve into a bitstream.

[0252] 808a: The decoding end obtains the third tone mapping curve of the binocular image based on the weighted histogram.

[0253] 809a, the decoding end performs weighted fusion of the first tone mapping curve and the third tone mapping curve to obtain the second tone mapping curve.

[0254] In some embodiments, the decoding end can obtain the parameters of the first tone mapping curve from the bitstream to generate the first tone mapping curve. In other embodiments, the decoding end can also generate the first tone mapping curve based on the content of the stereo image.

[0255] 810a, see 709a, will not be repeated here.

[0256] Figure 8B shows another possible image processing flow.

[0257] For steps 801b-808b, see steps 701b-708b; they will not be repeated here.

[0258] For steps 809b-811b, please refer to steps 808a-810a, which will not be repeated here.

[0259] In the second possible scenario, the encoding end determines the weighted histogram and uses it to determine the second tone mapping curve. The parameters of the second tone mapping curve are then transmitted to the decoding end, which performs tone mapping processing based on the curve. For example, the encoding end could be an image acquisition device, and the decoding end could be an image processing device.

[0260] Example 3 takes the first possible approach to obtaining the second tone mapping curve as an example; see Figure 9A and Figure 9B. Figure 9A shows a possible image processing method.

[0261] For steps 901a-906a, see steps 701a-706a; they will not be repeated here.

[0262] 907a, the encoder obtains the second tone mapping curve of the binocular image based on the weighted histogram.

[0263] 908a, the encoder encodes the second tone mapping curve into the bitstream and sends it to the decoder. For example, the second tone mapping curve can be encoded into the dynamic metadata of the bitstream. It should be understood that the bitstream may include data from multiple consecutive stereo images, and the second tone mapping curve included in the dynamic metadata may differ as the stereo images in the bitstream change.

[0264] 909a: The decoding end decodes the second tone mapping curve from the bitstream, and then uses the second tone mapping curve to perform tone mapping processing on the stereo image before outputting it.

[0265] Figure 9B shows another possible image processing method.

[0266] For steps 901b-907b, see steps 701b-707b; they will not be repeated here.

[0267] For steps 908b-910b, see steps 907a-909a; they will not be repeated here.

[0268] In the third possible scenario, the encoding end determines the weighted histogram and uses it to determine the third tone mapping curve. The parameters of the third tone mapping curve are then transmitted to the decoding end. The decoding end performs a weighted fusion process based on the third and first tone mapping curves to obtain the second tone mapping curve, and then performs tone mapping processing based on the second tone mapping curve. For example, the encoding end could be an image acquisition device, and the decoding end could be an image processing device.

[0269] Example 4: see Figure 10A and Figure 10B. Figure 10A shows a possible image processing method.

[0270] 1001a-1006a, see 801a-806a, will not be repeated here.

[0271] 1007a, the encoder obtains the third tone mapping curve of the binocular image based on the weighted histogram.

[0272] 1008a, the encoder encodes the parameters of the third tone mapping curve into the bitstream. In some embodiments, the encoder may also encode the first tone mapping curve into the bitstream.

[0273] 1009a: The decoding end decodes the third tone mapping curve from the bitstream and performs weighted fusion of the third tone mapping curve and the first tone mapping curve to obtain the second tone mapping curve.

[0274] 1010a: The decoding end uses the second tone mapping curve to perform tone mapping processing on the stereo image before outputting it.

[0275] Figure 10B shows another possible image processing method.

[0276] 1001b-1007b, see 801b-807b, will not be repeated here.

[0277] 1008b-1011b, see 1007a-1010a, will not be repeated here.

[0278] As an example, the encoder can instead perform weighted fusion of the third tone mapping curve and the first tone mapping curve to obtain the second tone mapping curve, and then encode the second tone mapping curve into the bitstream. The decoder can then obtain the second tone mapping curve directly from the bitstream and use it to perform tone mapping processing on the stereo image.

[0279] In the fourth possible scenario, the encoding end does not perform the determination of the weighted histogram. Instead, it determines the third tone mapping curve by solving for the function parameters of the tone mapping curve based on the gain coefficient. The parameters of the third tone mapping curve are then transmitted to the decoding end. The decoding end performs a weighted fusion process based on the third and first tone mapping curves to obtain the second tone mapping curve, and then performs tone mapping processing based on the second tone mapping curve. For example, the encoding end can be an image acquisition device, and the decoding end can be an image processing device.

[0280] Example 5: see Figure 11A, which is a schematic diagram of a possible image processing method.

[0281] 1101a-1105a, see 701a-705a, will not be repeated here.

[0282] 1106a: The encoder generates a third tone mapping curve based on the gain coefficient and the first tone mapping curve. See the relevant description in the fourth possible method for details, which will not be repeated here.

[0283] 1107a, the encoder encodes the parameters of the third tone mapping curve into the bitstream. In some embodiments, the encoder may also encode the first tone mapping curve into the bitstream.

[0284] 1108a: The decoding end decodes the third tone mapping curve from the bitstream and performs weighted fusion of the third tone mapping curve and the first tone mapping curve to obtain the second tone mapping curve.

[0285] 1109a, the decoding end uses the second tone mapping curve to perform tone mapping processing on the stereo image before outputting it.

[0286] Example 6: see Figure 11B, which illustrates another possible image processing method.

[0287] 1101b-1106b, see 701b-706b, will not be repeated here.

[0288] For steps 1107b-1110b, see steps 1106a-1109a; they will not be repeated here.

[0289] In the fifth possible scenario, the encoding end does not determine the weighted histogram. Instead, it determines the second tone mapping curve by solving for the function parameters of the tone mapping curve based on the gain coefficient. The parameters of the second tone mapping curve are then transmitted to the decoding end, which performs tone mapping processing based on the second tone mapping curve. For example, the encoding end could be an image acquisition device, and the decoding end could be an image processing device.

[0290] Example 7: see Figure 12A, which is a schematic diagram of a possible image processing method.

[0291] 1201a-1205a, see 701a-705a, will not be repeated here.

[0292] 1206a: The encoder generates the second tone mapping curve based on the gain coefficient and the first tone mapping curve. See the relevant description in the fourth possible method for details, which will not be repeated here.

[0293] 1207a, the encoder encodes the parameters of the second tone mapping curve into the bitstream.

[0294] 1208a: The decoding end decodes the second tone mapping curve from the bitstream, and then uses the second tone mapping curve to perform tone mapping processing on the stereo image before outputting it.

[0295] Figure 12B illustrates another possible image processing method.

[0296] 1201b-1206b, see 701b-706b, will not be repeated here.

[0297] For steps 1207b-1209b, see steps 1206a-1208a; they will not be repeated here.

[0298] In the sixth possible scenario, the generation of the second tone mapping curve is entirely performed by the decoder. The encoder can encode the stereo image into the bitstream. Alternatively, after determining the first tone mapping curve, it can encode both the stereo image and the first tone mapping curve into the bitstream.

[0299] Example 8 takes the first possible approach to obtaining the second tone mapping curve as an example; see Figure 13A, which is a schematic diagram of a possible image processing method.

[0300] 1301a, The decoder obtains the first contrast ratio of each pixel in the stereo image. See step 502 for details, which will not be repeated here. The stereo image can be any video frame image from the video stream.

[0301] For example, an encoder can encode stereo images into a bitstream and send it to a decoder.

[0302] 1302a, The decoder performs disparity estimation on the stereo image to obtain the disparity map for each pixel of the stereo image. See step 503 for details, which will not be repeated here.

[0303] 1303a, The decoding end determines the disparity gradient of each pixel based on the disparity map. For example, the method for determining the disparity gradient can be found in the relevant description in step 601, and will not be repeated here.

[0304] 1304a, the decoding end performs tone mapping processing on the binocular image according to the first tone mapping curve, and obtains the second contrast of each pixel in the tone-mapped binocular image. For details, please refer to the relevant description in step 504, which will not be repeated here.

[0305] In one example, the encoding end can generate a first tone mapping curve based on the content of the stereo image and encode the parameters of the first tone mapping curve into the bitstream. In another example, the decoding end can generate the first tone mapping curve based on the content of the stereo image decoded from the bitstream.

[0306] 1305a: The decoding end obtains the gain coefficient of each pixel based on the first contrast ratio, the second contrast ratio, and the parallax gradient of each pixel. For details, please refer to the relevant description in step 602, which will not be repeated here.

[0307] 1306a, the decoding end uses the gain coefficient of each pixel as a weight to obtain the weighted histogram of the stereo image.

[0308] 1307a, the decoding end obtains the second tone mapping curve of the binocular image based on the weighted histogram.

[0309] For example, the second tone mapping curve can be generated from the weighted histogram using histogram equalization. See the description of the first possible approach above for details, which will not be repeated here.

[0310] 1308a: The decoding end performs tone mapping processing on the binocular image based on the second tone mapping curve and then outputs the result.

[0311] As an example, the decoding end may also perform other processing on the stereo image before tone mapping processing, or perform other processing on the tone-mapped stereo image after tone mapping processing and before output. This application embodiment does not specifically limit this.

[0312] Example 9 takes the second possible approach to obtaining the second tone mapping curve as an example; see Figure 13B, which is a schematic diagram of a possible image processing method.

[0313] 1301b-1306b, see steps 1301a-1306a, will not be repeated here.

[0314] 1307b-1309b, see steps 808a-810a, which will not be repeated here.

[0315] Example 10 takes the fourth possible approach to obtaining the second tone mapping curve as an example; see Figure 14A, which is a schematic diagram of a possible image processing method.

[0316] 1401a-1405a, see steps 1301a-1305a, which will not be repeated here.

[0317] 1406a: The decoder generates the third tone mapping curve based on the gain coefficient and the first tone mapping curve. See the relevant description in the fourth possible method for details, which will not be repeated here.

[0318] 1407a, the decoding end performs weighted fusion of the third tone mapping curve and the first tone mapping curve to obtain the second tone mapping curve.

[0319] 1408a, the decoding end uses the second tone mapping curve to perform tone mapping processing on the stereo image before outputting it.

[0320] Example 11 takes the third possible approach to obtaining the second tone mapping curve as an example; see Figure 14B, which is a schematic diagram of a possible image processing method.

[0321] For steps 1401b-1406b, please refer to steps 1301b-1306b; they will not be repeated here.

[0322] 1407b: The decoder generates the second tone mapping curve based on the gain coefficient and the first tone mapping curve. See the relevant description in the third possible method for details, which will not be repeated here.

[0323] 1408b, the decoding end uses the second tone mapping curve to perform tone mapping processing on the stereo image before outputting it.

[0324] Based on the same inventive concept as the method described above, embodiments of this application provide an image processing apparatus. As shown in Figure 15A and Figure 15B, the image processing apparatus may include an acquisition module 1501, a first determination module 1502, a second determination module 1503, and a third determination module 1504. The image processing apparatus may be applied to an encoder or a decoder.

[0325] The acquisition module 1501 is used to acquire a stereo image and generate a first tone mapping curve for the stereo image, wherein the stereo image is a high dynamic range (HDR) image.

[0326] The first determining module 1502 is used to obtain the first contrast of the binocular image and the disparity map between the binocular images;

[0327] The second determining module 1503 is used to obtain the second contrast of the binocular image after tone mapping by the first tone mapping curve.

[0328] The third determining module 1504 obtains a second tone mapping curve based on the first contrast, the second contrast, and the disparity map.

[0329] The second tone mapping curve is used to ensure that the contrast of the tone-mapped binocular image meets the minimum parallax change requirement for human eye 3D perception.

[0330] In one possible implementation, the third determining module 1504 is specifically configured to: determine the disparity gradient of a first pixel point based on the disparity map, wherein the first pixel point is any pixel point in the binocular image; obtain the gain coefficient of the first pixel point based on the first contrast, the second contrast, and the disparity gradient of the first pixel point; the gain coefficient is used to compensate for the decrease in contrast of the first pixel point caused by tone mapping; and obtain the second tone mapping curve based on the determined gain coefficient of each pixel point in the binocular image.
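As a sketch of the first operation of the third determining module, the snippet below derives a per-pixel disparity gradient from a disparity map. The method of step 601 is not reproduced in this passage, so the finite-difference gradient magnitude used here (central differences inside the image, one-sided differences at the borders) is an assumption, and the function name is hypothetical.

```python
import math
from typing import List, Sequence


def disparity_gradient(disparity: Sequence[Sequence[float]]) -> List[List[float]]:
    """Gradient magnitude D'(x, y) of a disparity map via finite differences."""
    h, w = len(disparity), len(disparity[0])
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbour indices, clamped to the image borders.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dx = (disparity[y][x1] - disparity[y][x0]) / max(x1 - x0, 1)
            dy = (disparity[y1][x] - disparity[y0][x]) / max(y1 - y0, 1)
            grad[y][x] = math.hypot(dx, dy)
    return grad
```

The resulting map, together with the first and second contrast, is what the gain coefficient of each pixel is computed from.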

[0331] In one possible implementation, the gain coefficient satisfies the following condition:

[0332]

[0333] Where Q(x,y) represents the gain coefficient of a pixel, x represents the horizontal coordinate of the pixel, y represents the vertical coordinate of the pixel, D′(x,y) represents the disparity gradient of the pixel, K^{-1}() represents the inverse function of the relationship between parallax perception sensitivity and contrast, C(x,y) represents the first contrast of the pixel, and C_tm(x,y) represents the second contrast of the pixel.

[0334] In one possible implementation, the relationship between parallax perception sensitivity and contrast satisfies the following condition:

[0335]

[0336] Where c represents contrast, K(c) represents parallax perception sensitivity, and J represents the fitting coefficient.

[0337] In one possible implementation, the third determining module 1504, when performing the operation of obtaining the second tone mapping curve based on the gain coefficient, is specifically configured to: obtain a weighted histogram based on the pixel values of the pixels included in the binocular image, using the gain coefficient as a weight; and generate the second tone mapping curve based on the weighted histogram.

[0338] In one possible implementation, the third determining module 1504, when performing the operation of obtaining the second tone mapping curve based on the gain coefficient, is specifically configured to: obtain a weighted histogram based on the pixel values of the pixels included in the binocular image, using the gain coefficient as a weight; generate a third tone mapping curve based on the weighted histogram; and obtain the second tone mapping curve by weighting the first tone mapping curve and the third tone mapping curve.

[0339] In one possible implementation, the weighted histogram satisfies the following condition:

[0340]

[0341] Where Q(x,y) represents the gain coefficient of the pixel, w(x,y) represents the adjustment factor of the pixel, I(x,y) represents the pixel value of the pixel, l_i represents the boundary of the i-th histogram interval, and h_i represents the value of the i-th histogram interval of the weighted histogram.

[0342] In one possible implementation, when the third determining module 1504 performs the operation of obtaining the second tone mapping curve based on the gain coefficient, it is specifically used to: solve for the parameters of the second tone mapping curve based on the gain coefficient in the following manner;

[0343]

[0344] Where L′_2D represents the derivative of the first tone mapping curve function, L′_3D represents the derivative of the second tone mapping curve function, Q(x,y) represents the gain coefficient of the pixel, and arg min() represents the parameter values of the second tone mapping curve that minimize the value in parentheses.

[0345] In one possible implementation, when the third determining module 1504 performs the operation of obtaining the second tone mapping curve based on the gain coefficient, it is specifically used to: solve the parameters of the third tone mapping curve based on the gain coefficient in the following manner;

[0346]

[0347] Where L′_2D represents the derivative of the first tone mapping curve function, L′_3D represents the derivative of the third tone mapping curve function, Q(x,y) represents the gain coefficient of the pixel, and arg min() represents the parameter values of the third tone mapping curve that minimize the value in parentheses.

[0348] The second tone mapping curve is obtained by weighting the first tone mapping curve and the third tone mapping curve.

[0349] In one possible implementation, the second determining module 1503 is specifically configured to: perform tone mapping on the binocular image using the first tone mapping curve, and obtain the second contrast of the tone-mapped binocular image; or, estimate the contrast loss value of the binocular image after tone mapping using the first tone mapping curve based on the first tone mapping curve, and obtain the second contrast based on the contrast loss value and the first contrast.

[0350] In some scenarios, when applied to an encoder, the image processing apparatus may further include an encoding module 1505A. This module can be used to encode the weighted histogram into the dynamic metadata of the bitstream. Alternatively, it can be used to encode the second tone mapping curve into the dynamic metadata of the bitstream.

[0351] In some scenarios, when applied to a decoder, the image processing device also includes a tone mapping module 1505B, used to perform tone mapping processing on the binocular image according to the second tone mapping curve.

[0352] This application also provides an encoding device. As shown in Figure 16, the encoding device 1600 may include a communication interface 1610 and a processor 1620. Optionally, the encoding device 1600 may also include a memory 1630. The memory 1630 may be located inside or outside the encoding device. The acquisition module 1501, the first determination module 1502, the second determination module 1503, the third determination module 1504, and the encoding module 1505A shown in Figure 15A can all be implemented by the processor 1620.

[0353] In one possible implementation, the processor 1620 is used to implement any method executed by the encoding end as described in Figures 7A to 14B, and to output the encoded bitstream through the communication interface 1610.

[0354] In the implementation process, each step of the processing flow can be completed through the integrated logic circuits in the processor 1620 or through software instructions, thereby completing the method executed by the encoding end in Figures 7A to 14B. For brevity, the details are not repeated here. The program code executed by the processor 1620 to implement the above method can be stored in the memory 1630. The memory 1630 and the processor 1620 are coupled.

[0355] Any communication interface involved in the embodiments of this application can be a circuit, a bus, a transceiver, or any other component that can be used to exchange information with another device; one example is the communication interface 1610 in the encoding device 1600. Exemplarily, the other device can be a device connected to the encoding device 1600, such as a decoding device.

[0356] The processor 1620 may operate in conjunction with the memory 1630. The memory 1630 may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or it may be a volatile memory, such as a random-access memory (RAM). The memory 1630 may also be any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.

[0357] The embodiments of this application do not limit the specific connection medium between the communication interface 1610, the processor 1620, and the memory 1630. In Figure 16 of the embodiments of this application, the memory 1630, the processor 1620, and the communication interface 1610 are connected via a bus, which is drawn as a thick line in Figure 16. The manner of connection between other components is shown for illustrative purposes only and is not limiting. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is represented by a single thick line in Figure 16, but this does not mean that there is only one bus or one type of bus.

[0358] This application also provides a decoding device. As shown in Figure 17, the decoding device 1700 may include a communication interface 1710 and a processor 1720. Optionally, the decoding device 1700 may also include a memory 1730. The memory 1730 may be located inside or outside the decoding device. The acquisition module 1501, the first determination module 1502, the second determination module 1503, the third determination module 1504, and the tone mapping module 1505B shown in Figure 15B can all be implemented by the processor 1720.

[0359] In one possible implementation, the processor 1720 is used to acquire the bitstream through the communication interface 1710 and to implement any method executed by the decoding end as described in Figures 7A to 14B.

[0360] In the implementation process, each step of the processing flow can be completed by the integrated logic circuits in the processor 1720 or by software instructions, thereby carrying out the method executed by the decoding end in Figures 7A to 14B. For brevity, the details are not repeated here. The program code executed by the processor 1720 to implement the above method can be stored in the memory 1730. The memory 1730 and the processor 1720 are coupled.

[0361] Any communication interface involved in the embodiments of this application can be a circuit, a bus, a transceiver, or any other component that can be used to exchange information with another device; one example is the communication interface 1710 in the decoding device 1700. Exemplarily, the other device can be a device connected to the decoding device 1700, such as an encoding device.

[0362] The processor 1720 may operate in conjunction with the memory 1730. The memory 1730 may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or it may be a volatile memory, such as a random-access memory (RAM). The memory 1730 may also be any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.

[0363] The embodiments of this application do not limit the specific connection medium between the communication interface 1710, the processor 1720, and the memory 1730. In Figure 17 of the embodiments of this application, the memory 1730, the processor 1720, and the communication interface 1710 are connected via a bus, which is drawn as a thick line in Figure 17. The manner of connection between other components is shown for illustrative purposes only and is not limiting. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is represented by a single thick line in Figure 17, but this does not mean that there is only one bus or one type of bus.

[0364] The processors involved in the embodiments of this application can be general-purpose processors, digital signal processors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components, and can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application can be directly manifested as being executed by a hardware processor, or executed by a combination of hardware and software modules within the processor.

[0365] The coupling in the embodiments of this application is an indirect coupling or communication connection between apparatuses, units, or modules, which can be electrical, mechanical, or in other forms, and is used for information exchange between the apparatuses, units, or modules.

[0366] Based on the above embodiments, this application also provides a computer storage medium storing software programs. When these software programs are read and executed by one or more processors, they can implement the methods provided in any one or more of the above embodiments. The computer storage medium may include various media capable of storing program code, such as a USB flash drive, portable hard drive, read-only memory, random access memory, magnetic disk, or optical disk.

[0367] Based on the above embodiments, this application also provides a chip, which includes a processor for implementing the functions involved in any one or more of the above embodiments, such as the method executed by the encoding end in Figures 7A to 14B, or the method executed by the decoding end in Figures 7A to 14B. Optionally, the chip further includes a memory for storing the program instructions and data that the processor needs. The chip may consist of a chip alone, or may include a chip and other discrete components.

[0368] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0369] This application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce an apparatus for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0370] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including an instruction apparatus, which implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0371] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0372] Obviously, those skilled in the art can make various modifications and variations to the embodiments of this application without departing from the scope of the embodiments of this application. Therefore, if these modifications and variations to the embodiments of this application fall within the scope of the claims of this application and their equivalents, this application also intends to include these modifications and variations.

Claims

1. An image processing method, characterized in that it comprises: acquiring a binocular image and generating a first tone mapping curve for the binocular image, wherein the binocular image is a high dynamic range (HDR) image; obtaining a first contrast of the binocular image and a disparity map between the binocular images; obtaining a second contrast of the binocular image after tone mapping using the first tone mapping curve; and obtaining a second tone mapping curve based on the first contrast, the second contrast, and the disparity map; wherein the second tone mapping curve is used to ensure that the contrast of the tone-mapped binocular image meets the minimum parallax change requirement for human eye 3D perception.

2. The method as described in claim 1, characterized in that obtaining the second tone mapping curve based on the first contrast, the second contrast, and the disparity map comprises: determining the disparity gradient of a first pixel based on the disparity map, where the first pixel is any pixel in the binocular image; obtaining the gain coefficient of the first pixel based on the first contrast of the first pixel, the second contrast of the first pixel, and the disparity gradient of the first pixel, where the gain coefficient is used to compensate for the decrease in contrast of the first pixel caused by tone mapping; and obtaining the second tone mapping curve based on the determined gain coefficient of each pixel in the binocular image.
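For illustration only, the first step of claim 2 (determining a per-pixel disparity gradient from the disparity map) could be sketched in Python as below. The finite-difference scheme, the use of the gradient magnitude, and the nested-list layout of the disparity map are assumptions of this example and are not specified in the claim. The gain coefficient itself is not sketched, because its formula is not reproduced in this text.

```python
import math

def disparity_gradient(disparity):
    """Per-pixel gradient magnitude of a 2D disparity map (list of rows).

    Central differences are used in the interior and one-sided
    differences at the borders (an assumed discretization).
    """
    h, w = len(disparity), len(disparity[0])
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            gx = (disparity[y][x1] - disparity[y][x0]) / max(x1 - x0, 1)
            gy = (disparity[y1][x] - disparity[y0][x]) / max(y1 - y0, 1)
            grad[y][x] = math.hypot(gx, gy)
    return grad
```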

3. The method as described in claim 2, characterized in that the gain coefficient satisfies the following condition: [formula not reproduced in this text]; where the symbols of the formula denote, respectively: the gain coefficient of a pixel; the x-coordinate (abscissa) of the pixel; the y-coordinate (ordinate) of the pixel; the disparity gradient of the pixel; the inverse function of the relationship between parallax perception sensitivity and contrast; the first contrast of the pixel; and the second contrast of the pixel.

4. The method as described in claim 3, characterized in that the relationship between parallax perception sensitivity and contrast satisfies the following condition: [formula not reproduced in this text]; where c denotes the contrast, and the remaining symbols of the formula denote, respectively, the parallax perception sensitivity and the fitting coefficient.

5. The method according to any one of claims 2-4, characterized in that obtaining the second tone mapping curve based on the gain coefficient comprises: obtaining a weighted histogram based on the pixel values of the pixels included in the binocular image, using the gain coefficient as the weight; and generating the second tone mapping curve based on the weighted histogram.
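One possible reading of claim 5 is sketched below in Python. The uniform binning of normalized pixel values and, in particular, the derivation of the curve as the normalized cumulative histogram (in the style of histogram equalization) are assumptions of this example; the claim only states that the curve is generated from the weighted histogram. The adjustment factor mentioned in claim 7 is omitted here.

```python
def weighted_histogram(pixels, gains, num_bins, max_value=1.0):
    """Accumulate each pixel's gain coefficient into the bin of its pixel value."""
    hist = [0.0] * num_bins
    for value, gain in zip(pixels, gains):
        idx = min(int(value / max_value * num_bins), num_bins - 1)
        hist[idx] += gain
    return hist

def curve_from_histogram(hist):
    """Assumed construction: normalized cumulative sum sampled at bin edges."""
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    curve, acc = [0.0], 0.0
    for count in hist:
        acc += count
        curve.append(acc / total)
    return curve
```

With this construction, bins that collect large gain coefficients receive a steeper segment of the curve, so the luminance ranges where contrast needs the most compensation are allotted more of the output range.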

6. The method according to any one of claims 2-4, characterized in that obtaining the second tone mapping curve based on the gain coefficient comprises: obtaining a weighted histogram based on the pixel values of the pixels included in the binocular image, using the gain coefficient as the weight; generating a third tone mapping curve based on the weighted histogram; and obtaining the second tone mapping curve by weighting the first tone mapping curve and the third tone mapping curve.
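The final weighting step of claim 6 can be sketched as follows, assuming that both curves are sampled at the same luminance points and that the weighting is a convex combination with a single scalar weight. The function name and the parameter `weight_first` are hypothetical, and the claim does not state how the weights are chosen.

```python
def blend_curves(first_curve, third_curve, weight_first):
    """Weighted combination of two tone mapping curves sampled at the same points."""
    if len(first_curve) != len(third_curve):
        raise ValueError("curves must be sampled at the same points")
    if not 0.0 <= weight_first <= 1.0:
        raise ValueError("weight_first must lie in [0, 1]")
    w = weight_first
    return [w * a + (1.0 - w) * b for a, b in zip(first_curve, third_curve)]
```

Setting `weight_first` to 1 reproduces the first curve unchanged, and setting it to 0 uses only the curve derived from the weighted histogram.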

7. The method according to claim 5 or 6, characterized in that the weighted histogram satisfies the following condition: [formula not reproduced in this text]; where the symbols of the formula denote, respectively: the gain coefficient of a pixel; the adjustment factor; the pixel value of the pixel; the boundaries of a given histogram interval; and the value of the weighted histogram in that histogram interval.

8. The method according to any one of claims 2-4, characterized in that obtaining the second tone mapping curve based on the gain coefficient comprises: determining the parameters of the second tone mapping curve based on the gain coefficient in the following manner: [formula not reproduced in this text]; where the symbols of the formula denote, respectively: the derivative of the first tone mapping curve function; the derivative of the second tone mapping curve function; the gain coefficient of a pixel; and the operator that returns the parameter values of the second tone mapping curve for which the expression in parentheses reaches its minimum.
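Because the formula of claim 8 is not reproduced in this text, the Python sketch below rests on an assumed objective: the parameters are chosen to minimize the sum, over pixels, of the squared difference between the gain coefficient times the derivative of the first curve and the derivative of the second curve. The power-law curve family, the single parameter `gamma`, and the grid search over `candidates` are likewise hypothetical and serve only to make the minimization concrete.

```python
def power_curve_derivative(lum, gamma):
    """Derivative of the hypothetical curve T(L) = L ** gamma."""
    return gamma * lum ** (gamma - 1.0)

def fit_second_curve_gamma(luminances, gains, first_gamma, candidates):
    """Pick the candidate gamma minimizing the assumed squared-error objective."""
    def cost(gamma):
        return sum(
            (g * power_curve_derivative(l, first_gamma)
             - power_curve_derivative(l, gamma)) ** 2
            for l, g in zip(luminances, gains)
        )
    return min(candidates, key=cost)
```

When every gain coefficient equals 1, the search returns the parameter of the first curve, which matches the intuition that no compensation is needed where tone mapping causes no perceptible contrast loss.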

9. The method according to any one of claims 2-4, characterized in that obtaining the second tone mapping curve based on the gain coefficient comprises: determining the parameters of a third tone mapping curve based on the gain coefficient in the following manner: [formula not reproduced in this text]; where the symbols of the formula denote, respectively: the derivative of the first tone mapping curve function; the derivative of the third tone mapping curve function; the gain coefficient of a pixel; and the operator that returns the parameter values of the third tone mapping curve for which the expression in parentheses reaches its minimum; and obtaining the second tone mapping curve by weighting the first tone mapping curve and the third tone mapping curve.

10. The method according to any one of claims 1-9, characterized in that obtaining the second contrast of the binocular image after tone mapping using the first tone mapping curve comprises: performing tone mapping on the binocular image using the first tone mapping curve, and obtaining the second contrast of the tone-mapped binocular image; or estimating, based on the first tone mapping curve, the contrast loss value of the binocular image after tone mapping, and obtaining the second contrast based on the contrast loss value and the first contrast.

11. The method according to any one of claims 5-7, characterized in that the method further comprises: encoding the weighted histogram into the dynamic metadata of the bitstream.

12. The method according to any one of claims 1-10, characterized in that the method further comprises: encoding the second tone mapping curve into the dynamic metadata of the bitstream.

13. The method according to any one of claims 1-10, characterized in that the method further comprises: performing tone mapping processing on the binocular image based on the second tone mapping curve.

14. An encoding device, characterized in that it comprises: a non-volatile memory and a processor coupled to each other, the processor calling program code stored in the memory to perform the method as described in any one of claims 1-12.

15. A decoding device, characterized in that it comprises: a non-volatile memory and a processor coupled to each other, the processor calling program code stored in the memory to perform the method as described in any one of claims 1-10 and 13.

16. An image processing system, characterized in that it comprises the encoding device as described in claim 14 and the decoding device as described in claim 15.

17. A computer-readable storage medium, characterized in that the computer-readable storage medium stores program code, the program code including instructions for a processor to execute the method as described in any one of claims 1-13.