HDR video rate-distortion optimization method and device based on adaptive distortion weight, equipment and medium
By constructing an adaptive distortion weight mapping model, the distortion weight factor is dynamically adjusted according to the brightness characteristics of HDR video, optimizing the encoding process, solving the problem of inefficient bitrate allocation in existing technologies, and improving the encoding efficiency and quality of HDR video.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- MALANSHAN AUDIO & VIDEO LABORATORY
- Filing Date
- 2026-03-19
- Publication Date
- 2026-06-19
AI Technical Summary
Existing video coding standards treat the distortion of every pixel as visually equivalent when coding HDR video. As a result, valuable bitrate is spread evenly across extremely bright and medium-brightness areas, which is inefficient: bitrate is not steered toward the regions where it matters most, so overall compression efficiency cannot be improved while maintaining subjective quality.
By constructing an adaptive distortion weight mapping model, the distortion weight factor is dynamically adjusted according to the brightness characteristics of HDR video, and a target rate distortion cost formula is constructed to optimize the encoding operation of different brightness regions during the encoding process.
It achieves more efficient bitrate allocation while maintaining subjective quality, thereby improving the overall compression efficiency and encoding quality of HDR videos.
Figure CN121864971B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of HDR video coding technology, and in particular to a method, apparatus, device and medium for HDR video rate-distortion optimization based on adaptive distortion weights.
Background Technology
[0002] Current mainstream video coding standards such as H.266 / VVC (Versatile Video Coding) and AV1 (a video compression format) typically use the following formula to calculate the rate-distortion cost during rate-distortion optimization: J = D + λR. Here, R is the number of bits required for encoding, and the distortion D is usually calculated as the sum of absolute differences (SAD) or the sum of squared errors (SSE) between the reconstructed and original values. The Lagrange multiplier λ is mainly determined by the quantization parameter. This mechanism has a fundamental problem: it assumes that the distortion of all pixels is visually equivalent. However, the human visual system's perception of brightness is non-linear, and this non-linear effect is more pronounced for HDR (High Dynamic Range) video, whose brightness range far exceeds that of SDR (Standard Dynamic Range) video. In extremely dark areas, most details are almost imperceptible (approaching black saturation); in dark areas, the human eye is very sensitive to small changes in brightness and to noise; in medium-brightness areas, the human eye's contrast sensitivity is highest; in extremely bright areas, due to the saturation effect of visual cells, the human eye's ability to distinguish details and distortion decreases significantly, and its tolerance for distortion is higher. Therefore, in HDR encoding, it is inefficient from a visual perception perspective to distribute the precious bitrate evenly between extremely bright and medium-brightness areas.
[0003] In conclusion, how to improve HDR video rate-distortion optimization so that bitrate is allocated more efficiently and overall compression efficiency improves while subjective quality is maintained is a pressing problem.
Summary of the Invention
[0004] In view of this, the purpose of this invention is to provide a method, apparatus, device, and medium for HDR video rate-distortion optimization based on adaptive distortion weights, which can optimize the HDR video rate-distortion optimization method to allocate bitrate more efficiently and improve overall compression efficiency while maintaining subjective quality. The specific solution is as follows:
[0005] Firstly, this application provides a method for optimizing HDR video rate-distortion based on adaptive distortion weights, including:
[0006] During the encoding of the target HDR video, each encoding region to be processed in the current video frame is determined, and the average brightness of each encoding region to be processed is calculated.
[0007] The average brightness of the region is input into a pre-constructed distortion weight mapping model to obtain the distortion weight factor corresponding to each of the regions to be processed for encoding; the distortion weight mapping model is a piecewise linear model used to characterize the relationship between brightness and distortion weight factor.
[0008] Based on the distortion weight factor, a target rate-distortion cost formula is constructed, and the target rate-distortion cost formula is applied to the rate-distortion optimization decision of the coding region to be processed, so as to complete the coding operation of the coding region to be processed.
[0009] Optionally, the distortion weight mapping model is a first preset inflection point distortion weight mapping model;
[0010] The construction process of the first preset inflection point distortion weight mapping model includes:
[0011] By using the target contrast sensitivity function model and the target HDR video standard, the preset brightness range is divided into a preset number of brightness intervals, and a corresponding set of inflection points is defined.
[0012] Based on the visual perception sensitivity of the brightness range, a corresponding inflection point brightness weight is defined for several inflection points in the inflection point set, and the inflection point brightness weight is determined as the distortion weight to obtain a piecewise linear model composed of the inflection point set and the distortion weight.
[0013] The piecewise linear model is determined as the first preset inflection point distortion weight mapping model.
[0014] Optionally, the distortion weight mapping model is a second preset inflection point distortion weight mapping model;
[0015] Before determining each encoding region to be processed in the current video frame, the process further includes:
[0016] The system uses a preset video scene switching detection technology to determine whether a scene switch has occurred in the current video frame, and obtains the corresponding judgment result.
[0017] If the judgment result indicates that a scene switch has occurred, luminance histogram statistics are computed for the target video frame of the current scene in the target HDR video to obtain the distribution characteristics of the luminance histogram.
[0018] Based on the distribution characteristics, the inflection point coordinates of several inflection points in the first preset inflection point distortion weight mapping model are adjusted to obtain the adjusted piecewise linear model, and the adjusted piecewise linear model is determined as the second preset inflection point distortion weight mapping model.
[0019] Optionally, adjusting the inflection point coordinates of several inflection points in the first preset inflection point distortion weight mapping model according to the distribution characteristics to obtain the adjusted piecewise linear model includes:
[0020] By using a preset linear interpolation adjustment strategy and / or a preset translation adjustment strategy, the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model are adjusted according to the distribution characteristics.
[0021] Determine the brightness contrast of the current video frame, and determine the weight scaling factor based on the brightness contrast, so as to adjust the inflection point distortion weight coordinates of the plurality of inflection points based on the weight scaling factor.
[0022] The inflection point coordinates include the inflection point brightness coordinates and the inflection point distortion weight coordinates.
[0023] Optionally, adjusting the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model according to the distribution characteristics using a preset linear interpolation adjustment strategy and / or a preset translation adjustment strategy includes:
[0024] The target brightness range of the target video frame is determined based on the distribution characteristics, and the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model are mapped to the target brightness range using linear interpolation technology.
[0025] And / or, based on the distribution characteristics, determine the average brightness of the main subject in the target video frame, determine the first target brightness interval with the highest distortion weight in the first preset inflection point distortion weight mapping model, and calculate the target translation amount using the average brightness of the main subject and the first target brightness interval, so as to translate the inflection point brightness coordinates of the plurality of inflection points based on the target translation amount.
[0026] Optionally, for any of the regions to be processed, the average brightness of the region is input into a pre-constructed distortion weight mapping model to obtain the distortion weight factors corresponding to each of the regions to be processed, including:
[0027] Determine several inflection point brightness values in a pre-constructed distortion weight mapping model, and compare the average brightness of the region with the several inflection point brightness values to determine the second target brightness range in which the average brightness of the region is located in the distortion weight mapping model;
[0028] If the number of inflection points corresponding to the second target brightness range is one, then the distortion weight value of the inflection point corresponding to the second target brightness range is determined as the distortion weight factor corresponding to the average brightness of the region.
[0029] If the number of inflection points corresponding to the second target brightness range is not one, then the distortion weighting factor is calculated using linear interpolation technology.
[0030] Optionally, calculating the average brightness of the region corresponding to the region to be coded includes:
[0031] The luminance component of each pixel in the encoding region to be processed is determined, and the luminance component is converted into linear luminance using an electro-optical transfer function (EOTF);
[0032] Calculate the average value of the linear brightness and determine the average value as the regional average brightness of the region to be coded.
[0033] Secondly, this application provides an HDR video rate-distortion optimization device based on adaptive distortion weights, comprising:
[0034] The brightness calculation module is used to determine each encoding region to be processed in the current video frame during the encoding of the target HDR video, and to calculate the average brightness of each encoding region to be processed.
[0035] A brightness input module is used to input the average brightness of the region into a pre-constructed distortion weight mapping model to obtain the distortion weight factor corresponding to each of the regions to be processed for encoding; the distortion weight mapping model is a piecewise linear model used to characterize the relationship between brightness and distortion weight factor.
[0036] The formula application module is used to construct a target rate-distortion cost formula based on the distortion weight factor, and apply the target rate-distortion cost formula to the rate-distortion optimization decision of the coding region to be processed, so as to complete the encoding operation of the coding region to be processed.
[0037] Thirdly, this application provides an electronic device, comprising:
[0038] Memory, used to store computer programs;
[0039] A processor is configured to execute the computer program to implement the aforementioned HDR video rate-distortion optimization method based on adaptive distortion weights.
[0040] Fourthly, this application provides a computer-readable storage medium for storing a computer program; wherein, when the computer program is executed by a processor, it implements the aforementioned HDR video rate-distortion optimization method based on adaptive distortion weights.
[0041] In this application, during the encoding of a target HDR video, each region to be processed in the current video frame is determined, and the average brightness of each region is calculated. The average brightness is input into a pre-constructed distortion weight mapping model to obtain the distortion weight factor corresponding to each region. The distortion weight mapping model is a piecewise linear model used to characterize the relationship between brightness and distortion weight factor. A target rate-distortion cost formula is constructed based on the distortion weight factor, and the target rate-distortion cost formula is applied to the rate-distortion optimization decision of the region to be processed to complete the encoding operation of the region. As can be seen from the above, in the process of encoding the target HDR video, this application determines each region to be processed in the current video frame and calculates its corresponding average brightness. The average brightness of the region is input into a pre-constructed distortion weight mapping model, which is a piecewise linear model used to characterize the relationship between brightness and distortion weight factor, thereby obtaining the corresponding distortion weight factor. Subsequently, a target rate-distortion cost formula is constructed based on the distortion weight factor and applied to the rate-distortion optimization decision of the region to be processed, completing the encoding operation of the region to be processed. In this way, through the above process of this application, the distortion weight factor is dynamically introduced based on brightness characteristics for rate-distortion optimization. 
Differentiated encoding optimization can be achieved for the visual perception characteristics of different brightness regions of HDR video, effectively improving the subjective visual quality of the encoded video, while taking into account encoding efficiency and bitrate control accuracy. This optimizes the HDR video rate-distortion optimization method to allocate bitrate more efficiently without changing subjective quality, thereby improving the overall compression efficiency.
Attached Figure Description
[0042] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.
[0043] Figure 1 is a flowchart of an HDR video rate-distortion optimization method based on adaptive distortion weights disclosed in this application;
[0044] Figure 2 is a schematic diagram of the structure of an HDR video rate-distortion optimization device based on adaptive distortion weights disclosed in this application;
[0045] Figure 3 is a structural diagram of an electronic device disclosed in this application.
Detailed Implementation
[0046] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0047] Current mainstream video coding standards such as H.266 / VVC and AV1 typically use the following formula to calculate the rate-distortion cost during rate-distortion optimization: J = D + λR. Here, R is the number of bits required for encoding, the distortion D is usually calculated as the sum of absolute differences or the sum of squared differences between the reconstructed value and the original value, and the Lagrange multiplier λ is mainly determined by the quantization parameter. This mechanism has a fundamental problem: it assumes that the distortion of all pixels is visually equivalent. However, the human visual system's perception of brightness is non-linear, and this non-linear effect is more significant for HDR video, whose brightness range is far greater than that of SDR video. In extremely dark areas, most details are almost imperceptible (approaching black saturation); in dark areas, the human eye is very sensitive to small changes in brightness and to noise; in medium-brightness areas, the human eye's contrast sensitivity is the highest; in extremely bright areas, due to the saturation effect of visual cells, the human eye's ability to distinguish details and distortion decreases significantly, and its tolerance for distortion is higher. Therefore, in HDR encoding, distributing the precious bitrate evenly between extremely bright and medium-brightness areas is inefficient from a visual perception perspective.
[0048] To overcome the aforementioned technical problems, this application provides an HDR video rate-distortion optimization method based on adaptive distortion weights, which can optimize the HDR video rate-distortion optimization method to allocate bitrate more efficiently and improve overall compression efficiency while maintaining subjective quality.
[0049] As shown in Figure 1, this embodiment of the invention discloses a method for HDR video rate-distortion optimization based on adaptive distortion weights, including:
[0050] Step S11: During the encoding of the target HDR video, determine each encoding region to be processed in the current video frame, and calculate the average brightness of each encoding region to be processed.
[0051] In this embodiment, during the encoding of the target HDR video, each encoding region to be processed in the current video frame is determined, and the average brightness of each encoding region to be processed is calculated. The encoding region to be processed can be the current encoding unit to be processed during the encoding process.
[0052] It should be noted that the average brightness of a region to be coded is calculated as follows: the luminance component of each pixel in the region is determined and converted into linear luminance, in nits, using an electro-optical conversion function; the average of the linear luminance over all pixels in the region is then calculated and determined as the regional average brightness of the region to be coded. The specific formula for calculating the average value is as follows:
[0053] L_avg = (Σ L) / N;
[0054] Wherein, L_avg is the average value; L is the linear luminance; N is the number of pixels contained in the region to be processed for encoding; Σ represents the summation operation. After obtaining the average value, it is determined as the regional average luminance of the region to be processed for encoding. In this way, this embodiment calculates luminance features for video frames by region, which can provide accurate partitioned luminance basis for subsequent differential coding optimization based on luminance differences, ensuring the adaptability of coding quality and efficiency for different luminance regions of HDR video; the luminance representation method based on pixel-level luminance conversion and mean calculation can eliminate the interference of nonlinear luminance mapping on regional luminance evaluation and accurately reflect the true luminance level of the encoded region.
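The regional average described in [0052] to [0054] can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the text does not name a specific electro-optical conversion function, so the SMPTE ST 2084 (PQ) EOTF is assumed, and the names `pq_eotf` and `region_average_luminance`, as well as the convention that luma samples are already normalized to [0, 1], are illustrative.

```python
# Assumption: the electro-optical conversion is the SMPTE ST 2084 (PQ) EOTF.
# The source text does not say which function is used.
M1 = 2610 / 16384          # 0.1593017578125
M2 = 2523 / 4096 * 128     # 78.84375
C1 = 3424 / 4096           # 0.8359375
C2 = 2413 / 4096 * 32      # 18.8515625
C3 = 2392 / 4096 * 32      # 18.6875


def pq_eotf(signal):
    """Map a normalized PQ sample in [0, 1] to linear luminance in nits."""
    e = min(max(signal, 0.0), 1.0) ** (1.0 / M2)
    return 10000.0 * (max(e - C1, 0.0) / (C2 - C3 * e)) ** (1.0 / M1)


def region_average_luminance(samples):
    """L_avg = (sum of L) / N over all pixels of one coding region."""
    linear = [pq_eotf(s) for s in samples]  # per-pixel linear luminance L
    return sum(linear) / len(linear)
```

Averaging after the conversion, rather than averaging the coded samples and converting once, is what keeps the non-linear signal mapping from biasing the regional estimate.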
[0055] Step S12: Input the average brightness of the region into the pre-constructed distortion weight mapping model to obtain the distortion weight factor corresponding to each of the regions to be processed and encoded; the distortion weight mapping model is a piecewise linear model used to characterize the relationship between brightness and distortion weight factor.
[0056] In this embodiment, the average brightness of each region is input into a pre-constructed distortion weight mapping model to obtain the distortion weight factor w for that region, which is used to adjust the rate-distortion cost calculation. The distortion weight mapping model is a piecewise linear model that characterizes the correspondence between brightness and distortion weight factor.
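As a rough illustration of how w could enter the cost, the sketch below assumes the weighted form J = w·D + λ·R, that is, the conventional cost J = D + λR with the distortion term scaled by w. The text up to this point does not state the target formula, so this form, the function names `rd_cost` and `pick_mode`, and the candidate numbers are assumptions made for illustration only.

```python
def rd_cost(distortion, bits, lam, weight=1.0):
    """Assumed weighted cost J = w * D + lambda * R (weight=1.0 gives J = D + lambda * R)."""
    return weight * distortion + lam * bits


def pick_mode(candidates, lam, weight):
    """Return the name of the candidate with the lowest assumed weighted cost.

    candidates: iterable of (name, distortion, bits) tuples (hypothetical layout).
    """
    best = min(candidates, key=lambda c: rd_cost(c[1], c[2], lam, weight))
    return best[0]


# Hypothetical candidates for one coding region: a cheap coarse mode
# and a more expensive, lower-distortion mode.
CANDIDATES = [("coarse", 400.0, 20), ("fine", 150.0, 60)]
```

With λ = 5, a weight of 0.7 (for example, an extremely bright region) selects "coarse", while a weight of 1.5 (a medium-brightness region) selects "fine", so the same candidates lead to different bit spending depending on regional brightness.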
[0057] It is understandable that the human visual system's perception of brightness is non-linear, especially for HDR video, whose brightness range far exceeds that of SDR, making this non-linear effect even more pronounced: in extremely dark areas, most details are almost imperceptible (approaching black saturation) and nearly indistinguishable; in dark areas, the human eye is highly sensitive to minute changes in brightness and noise; in medium brightness areas, the human eye's contrast sensitivity is highest; in extremely bright areas, due to the saturation effect of visual cells, the human eye's ability to distinguish details and distortion decreases significantly, resulting in a higher tolerance for distortion. Therefore, this embodiment can design a non-linear mapping function F(L), namely the distortion weight mapping model, which maps linear brightness to a weighting factor. This function should possess the following characteristics: in the medium brightness range, F(L) reaches its maximum value, indicating that the distortion weight is highest in this area, and the encoder must strive to protect it. In dark areas, the value of F(L) is higher to protect dark details from noise and blockiness. In extremely bright and extremely dark areas, the value of F(L) decreases significantly, indicating that greater distortion is permissible.
[0058] It should be noted that the distortion weight mapping model can be either the first preset inflection point distortion weight mapping model or the second preset inflection point distortion weight mapping model. The first preset inflection point distortion weight mapping model is constructed as follows: using the target contrast sensitivity function model and the target HDR video standard, a preset brightness range is divided into a preset number of brightness intervals, and a corresponding set of inflection points is defined; based on the visual perception sensitivity of the brightness intervals, a corresponding inflection point brightness weight is defined for the inflection points in the set, and these inflection point brightness weights are determined as distortion weights to obtain a piecewise linear model composed of the set of inflection points and the distortion weights; the piecewise linear model is then determined as the first preset inflection point distortion weight mapping model. The target contrast sensitivity function model can be the Barten model (a mathematical model for calculating the contrast sensitivity of the human eye); the target HDR video standard can be the tone mapping curve of HDR VIVID (a high dynamic range video technology standard). For example, the preset brightness range can be divided into seven brightness intervals, with a corresponding set of six inflection points (L1, w1), (L2, w2), (L3, w3), (L4, w4), (L5, w5), and (L6, w6). 
Then, according to the visual perception sensitivity of each brightness interval, a brightness value and an inflection point brightness weight are set for each inflection point in the set; as (brightness, weight) pairs, these are (0.1, 0.7), (1, 1.0), (10, 1.5), (100, 1.5), (1000, 1.0), and (4000, 0.7). The inflection point brightness weight is determined as the distortion weight, and a piecewise linear model composed of the set of inflection points and the distortion weights is constructed. This piecewise linear model is the first preset inflection point distortion weight mapping model. The second preset inflection point distortion weight mapping model is constructed as follows: before determining each encoding region to be processed in the current video frame, a preset video scene switching detection technology is used to determine whether a scene switch has occurred in the current video frame, and the corresponding judgment result is obtained; if the judgment result indicates that a scene switch has occurred, luminance histogram statistics are computed for the target video frame of the current scene in the target HDR video to obtain the distribution characteristics of the luminance histogram; the inflection point coordinates of several inflection points in the first preset inflection point distortion weight mapping model are adjusted according to the distribution characteristics to obtain an adjusted piecewise linear model, and the adjusted piecewise linear model is determined as the second preset inflection point distortion weight mapping model. That is, the preset video scene switching detection technology, a mature technology already available in the encoder, is used to determine whether a scene switch has occurred in the current video frame and to obtain the corresponding judgment result. 
If the judgment result indicates that a scene switch has occurred, that is, a new scene has been detected, then for the target video frame of the current scene in the target HDR video, such as the first frame or the first few frames, a luminance histogram is statistically analyzed to obtain the distribution characteristics of the luminance histogram. Then, based on the distribution characteristics, the inflection coordinates of several inflection points in the first preset inflection point distortion weight mapping model are dynamically adjusted to generate an adjusted piecewise linear model, and this model is determined as the second preset inflection point distortion weight mapping model.
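The first preset model described above can be sketched as a small lookup over the six example inflection points. This is a minimal sketch: the function name `distortion_weight` and the constant `DEFAULT_KNOTS` are illustrative, interpolation is done in linear luminance because the text says "linear interpolation" without naming a domain, and luminance values outside the outermost inflection points take the weight of the nearest inflection point, which matches the single-inflection-point case in the summary.

```python
from bisect import bisect_right

# Example inflection points from the text, as (brightness, weight) pairs.
DEFAULT_KNOTS = [
    (0.1, 0.7), (1.0, 1.0), (10.0, 1.5),
    (100.0, 1.5), (1000.0, 1.0), (4000.0, 0.7),
]


def distortion_weight(l_avg, knots=DEFAULT_KNOTS):
    """Evaluate the piecewise linear model F(L) at regional average brightness l_avg."""
    lums = [lum for lum, _ in knots]
    # Outermost intervals are bounded by a single inflection point: use its weight.
    if l_avg <= lums[0]:
        return knots[0][1]
    if l_avg >= lums[-1]:
        return knots[-1][1]
    # Inner intervals: linear interpolation between the two bounding inflection points.
    i = bisect_right(lums, l_avg)  # index of the first knot strictly above l_avg
    (l0, w0), (l1, w1) = knots[i - 1], knots[i]
    return w0 + (w1 - w0) * (l_avg - l0) / (l1 - l0)
```

For instance, `distortion_weight(50.0)` falls on the plateau between 10 and 100 and returns 1.5, while any value at or above 4000 returns 0.7.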
[0059] It should be further pointed out that the processing flow for adjusting the inflection point coordinates of several inflection points in the first preset inflection point distortion weight mapping model according to the distribution characteristics is as follows: by using a preset linear interpolation adjustment strategy and / or a preset translation adjustment strategy, the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model are adjusted according to the distribution characteristics; the brightness contrast of the current video frame is determined, and a weight scaling factor is determined according to the brightness contrast, so as to adjust the inflection point distortion weight coordinates of the several inflection points based on the weight scaling factor; wherein, the inflection point coordinates include the inflection point brightness coordinates and the inflection point distortion weight coordinates. That is, the inflection point coordinates encompass both inflection point brightness coordinates and inflection point distortion weight coordinates. Specifically, through a preset linear interpolation adjustment strategy and / or a preset translation adjustment strategy, the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model are adjusted according to the distribution characteristics. Simultaneously, the brightness contrast (e.g., standard deviation) of the current video frame is determined, and a weight scaling factor k is determined accordingly. Based on this weight scaling factor, the inflection point distortion weight coordinates of the aforementioned several inflection points are adjusted. Specifically, all weight values w in the final inflection point set are multiplied by the factor k. 
When the contrast is low, k < 1 is set to save bitrate; when the contrast is high, k ≥ 1 is set to preserve details.
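The weight scaling step can be sketched as follows. The text only says that k < 1 is used for low contrast and k ≥ 1 for high contrast, with the standard deviation as an example contrast measure; the threshold of 50 and the values 0.9 and 1.1 below are hypothetical placeholders, as are the function names.

```python
from statistics import pstdev


def choose_scale_factor(luminances, low_contrast_threshold=50.0,
                        k_low=0.9, k_high=1.1):
    """Pick k from the frame's brightness contrast (population standard deviation).

    The threshold and both k values are hypothetical; the source gives no numbers.
    """
    contrast = pstdev(luminances)
    return k_low if contrast < low_contrast_threshold else k_high


def scale_weights(knots, k):
    """Multiply every inflection point weight by k; brightness coordinates are unchanged."""
    return [(lum, w * k) for lum, w in knots]
```

A flat frame such as `[100.0, 100.0, 100.0, 100.0]` has zero standard deviation and therefore gets the low-contrast factor in this sketch.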
[0060] Furthermore, the processing flow for adjusting the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model by means of a preset linear interpolation adjustment strategy and / or a preset translation adjustment strategy is as follows: The target brightness range of the target video frame is determined based on the distribution characteristics, and the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model are mapped to the target brightness range using linear interpolation technology; and / or, the average brightness of the main subject in the target video frame is determined based on the distribution characteristics, and the first target brightness interval with the highest distortion weight in the first preset inflection point distortion weight mapping model is determined, and the target translation amount is calculated using the average brightness of the main subject and the first target brightness interval, so as to translate the inflection point brightness coordinates of the several inflection points based on the target translation amount. That is, the preset linear interpolation adjustment strategy is as follows: Based on the distribution characteristics, the target brightness range [L_min, L_max] of the target video frame is determined. Linear interpolation technology is used to map the brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model to this target brightness range, ensuring that the weight distribution always matches the actual dynamic range of the content. For example, for the preset inflection point (L2, w2), the formula for expressing its adjusted new brightness coordinate L2_adapted is as follows:
[0061] L2_adapted = L_min + L2 × (L_max - L_min) / 10000;
[0062] Wherein, L_min represents the minimum brightness value of the target brightness range; L_max represents the maximum brightness value of the target brightness range. All preset inflection points are converted into new adaptive inflection points, where the weight values remain unchanged. The preset translation adjustment strategy is as follows: Based on the distribution characteristics, the average brightness of the main subject in the target video frame is determined through face detection or visual saliency analysis. At the same time, the first target brightness interval [L3, L4] with the highest distortion weight in the first preset inflection point distortion weight mapping model is identified. Then, the first target brightness interval is translated to the interval centered on the average brightness of the main subject, and the corresponding target translation amount is calculated. The calculation formula for the target translation amount is as follows:
[0063] S = L_salient - (L3 + L4) / 2;
[0064] Where S is the target translation amount and L_salient is the average brightness of the main subject. After the target translation amount is obtained, the inflection point brightness coordinates of the several inflection points are translated by the target translation amount, while the weight values remain unchanged. In addition, this embodiment can combine the preset linear interpolation adjustment strategy and the preset translation adjustment strategy for coordinate adjustment. Specifically, the preset linear interpolation adjustment strategy is first used to perform range mapping, yielding a first set of inflection points Set_A; then, with Set_A as input, the preset translation adjustment strategy is used to fine-tune the alignment with the main subject, yielding a second set of inflection points Set_B.
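As a non-limiting illustration, the two coordinate adjustment strategies and their combination (Set_A, then Set_B) can be sketched in Python as follows. The inflection point values, the function names, and the rule used to pick the interval with the highest distortion weight (the adjacent pair of inflection points with the largest summed weight) are assumptions made for this sketch and are not specified in the description.

```python
from typing import List, Tuple

Knot = Tuple[float, float]  # (inflection point brightness, distortion weight)

PRESET_MAX = 10000.0  # denominator of the range-mapping formula in [0061]


def map_to_range(knots: List[Knot], l_min: float, l_max: float) -> List[Knot]:
    """Linear interpolation strategy: L_adapted = L_min + L * (L_max - L_min) / 10000."""
    return [(l_min + lum * (l_max - l_min) / PRESET_MAX, w) for lum, w in knots]


def peak_interval(knots: List[Knot]) -> Tuple[float, float]:
    """Assumed rule: adjacent inflection point pair with the largest summed weight."""
    i = max(range(len(knots) - 1), key=lambda k: knots[k][1] + knots[k + 1][1])
    return knots[i][0], knots[i + 1][0]


def translate(knots: List[Knot], l_salient: float) -> List[Knot]:
    """Translation strategy: S = L_salient - (L3 + L4) / 2; weights are unchanged."""
    low, high = peak_interval(knots)
    shift = l_salient - (low + high) / 2.0
    return [(lum + shift, w) for lum, w in knots]


# Illustrative preset inflection points (hypothetical values, not from the source).
preset = [(0.0, 0.6), (100.0, 1.0), (1000.0, 1.4),
          (2000.0, 1.4), (4000.0, 1.0), (10000.0, 0.5)]

set_a = map_to_range(preset, l_min=0.0, l_max=1000.0)  # range mapping
set_b = translate(set_a, l_salient=200.0)              # subject alignment fine-tuning
```

In this sketch the translated coordinates are not clamped, so they may fall outside [L_min, L_max]; the description does not state whether clamping is applied.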
[0065] It should be noted that, for any one of the aforementioned coding regions to be processed, the process of inputting the average brightness of the region into the pre-constructed distortion weight mapping model to obtain the distortion weight factor corresponding to each coding region to be processed is as follows: several inflection point brightness values are determined in the pre-constructed distortion weight mapping model, and the average brightness of the region is compared with these inflection point brightness values to determine the second target brightness interval of the distortion weight mapping model in which the average brightness of the region lies; if the number of inflection points corresponding to the second target brightness interval is one (that is, the average brightness of the region lies at or below the first inflection point, or above the last inflection point), the inflection point distortion weight value corresponding to the second target brightness interval is determined as the distortion weight factor corresponding to the average brightness of the region; if the number of inflection points corresponding to the second target brightness interval is not one, the distortion weight factor is calculated by linear interpolation between the two inflection points that bound the interval. That is, the distortion weight factor corresponding to the average brightness of the region can be calculated using a piecewise interpolation algorithm over the inflection points (L1, w1) to (L6, w6), whose logic is as follows:
[0066] If L_avg <= L1:
[0067] w_i = w1
[0068] Otherwise, if L_avg <= L2:
[0069] w_i = w1 + (w2 - w1) × (L_avg - L1) / (L2 - L1)
[0070] Otherwise, if L_avg <= L3:
[0071] w_i = w2 + (w3 - w2) × (L_avg - L2) / (L3 - L2)
[0072] Otherwise, if L_avg <= L4:
[0073] w_i = w3 + (w4 - w3) × (L_avg - L3) / (L4 - L3)
[0074] Otherwise, if L_avg <= L5:
[0075] w_i = w4 + (w5 - w4) × (L_avg - L4) / (L5 - L4)
[0076] Otherwise, if L_avg <= L6:
[0077] w_i = w5 + (w6 - w5) × (L_avg - L5) / (L6 - L5)
[0078] Otherwise:
[0079] w_i = w6;
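As a non-limiting illustration, the piecewise interpolation logic above can be written for any number of inflection points as in the following Python sketch. The function name and the inflection point values are assumptions made for this example; the sketch also assumes that the inflection point brightness values are strictly increasing.

```python
from bisect import bisect_left
from typing import Sequence, Tuple


def distortion_weight(l_avg: float, knots: Sequence[Tuple[float, float]]) -> float:
    """Map a regional average brightness to a distortion weight factor.

    knots holds (brightness, weight) pairs sorted by strictly increasing brightness.
    """
    lums = [lum for lum, _ in knots]
    if l_avg <= lums[0]:      # at or below the first inflection point: w_i = w1
        return knots[0][1]
    if l_avg > lums[-1]:      # above the last inflection point: w_i = w_last
        return knots[-1][1]
    # First index i with lums[i] >= l_avg, so lums[i - 1] < l_avg <= lums[i].
    i = bisect_left(lums, l_avg)
    (l0, w0), (l1, w1) = knots[i - 1], knots[i]
    return w0 + (w1 - w0) * (l_avg - l0) / (l1 - l0)


# Illustrative inflection points (hypothetical values, not from the source).
example_knots = [(0.0, 0.6), (100.0, 1.0), (1000.0, 1.4),
                 (2000.0, 1.4), (4000.0, 1.0), (10000.0, 0.5)]

w_mid = distortion_weight(50.0, example_knots)     # interpolated between knots 1 and 2
w_low = distortion_weight(-5.0, example_knots)     # clamped to the first weight
w_high = distortion_weight(20000.0, example_knots) # clamped to the last weight
```

A binary search is used here only for generality; with six inflection points the chain of comparisons in the description is equivalent.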
[0080] Specifically, several inflection point brightness values in the pre-constructed distortion weight mapping model are determined, and the average brightness of the region is compared with the several inflection point brightness values to clarify the second target brightness interval in the distortion weight mapping model where the average brightness of the region is located. If the number of inflection points corresponding to the second target brightness interval is one, the distortion weight value of the inflection point corresponding to the interval is directly determined as the distortion weight factor corresponding to the average brightness of the region. If the number of inflection points corresponding to the second target brightness interval is not one, the corresponding distortion weight factor is calculated by linear interpolation. In this way, this embodiment allocates distortion weight factors based on a piecewise linear model, enabling differentiated weight mapping for the visual characteristics of different brightness regions in HDR videos. This provides a brightness-related weight basis for subsequent rate-distortion optimization decisions, ensuring a balance between subjective quality and bitrate efficiency in HDR video encoding. The piecewise linear distortion weight mapping model, constructed by combining visual perception characteristics and HDR video standards, is computationally simple and easy to implement in the encoder. It accurately matches the perceptual differences of the human eye in different brightness ranges, improving the subjective quality performance of HDR video encoding. 
The model's dynamic adjustment strategy based on scene switching detection and brightness distribution characteristics allows the model to accurately adapt to the brightness characteristics of different video scenes and different video content, improving the model's adaptability to scene changes and ensuring a balance between subjective quality and bitrate efficiency in HDR video encoding under different scenarios. The model parameter optimization method, combining coordinate position adjustment and weight scaling, allows the model to more accurately adapt to the brightness distribution and contrast characteristics of the current video scene, further improving the matching degree between the model's output distortion weight factors and the video's visual perception characteristics.
[0081] Step S13: Construct a target rate-distortion cost formula based on the distortion weight factor, and apply the target rate-distortion cost formula to the rate-distortion optimization decision of the coding region to be processed, so as to complete the coding operation of the coding region to be processed.
[0082] In this embodiment, a target rate-distortion cost formula is constructed based on the distortion weight factor, and the target rate-distortion cost formula is applied to the rate-distortion optimization decision of the coding region to be processed, thereby completing the coding operation of the coding region to be processed.
[0083] Specifically, the traditional rate-distortion cost formula is first modified based on the distortion weighting factor. The specific expression of the traditional rate-distortion cost formula is as follows:
[0084] J = D + λ R;
[0085] Where J represents the rate-distortion cost; D is the distortion, usually calculated as the sum of absolute differences (SAD) or the sum of squared errors (SSE) between the reconstructed values and the original values; λ is the Lagrange multiplier, mainly determined by the quantization parameter; and R is the number of bits required for encoding. This embodiment modifies the traditional rate-distortion cost formula into a target rate-distortion cost formula, the specific expression of which is as follows:
[0086] J_new = w D + λ R;
[0087] Where w is the distortion weighting factor of the coding region to be processed. The target rate-distortion cost formula is then applied to the rate-distortion optimization decisions of the coding tools for the coding region to be processed, including but not limited to: coding unit partitioning decisions, in which the partitioning method that minimizes J_new is selected; and prediction mode selection (intra-frame / inter-frame), in which the prediction mode that minimizes J_new is selected. In this way, this embodiment introduces a rate-distortion optimization strategy based on a brightness-related distortion weighting factor, which reweights the distortion term of coding units in different brightness regions during rate-distortion optimization processes such as mode decision and motion estimation. As a result, the encoder tends to retain more detail in visually sensitive areas and to allow more compression distortion in non-sensitive areas. This improves bitrate control efficiency while ensuring the subjective visual quality of the video, thus optimizing overall coding performance.
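As a non-limiting illustration, the effect of the weighting factor on a mode decision can be sketched in Python as follows. The Candidate type, the function names, and all numeric values are hypothetical and serve only to show how J_new = w D + λ R changes the selected candidate.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    name: str          # e.g. a partitioning method or a prediction mode
    distortion: float  # D, for example SSE against the original block
    bits: float        # R, bits needed to code the block with this candidate


def weighted_rd_cost(distortion: float, bits: float, lam: float, weight: float) -> float:
    """Target rate-distortion cost: J_new = w * D + lambda * R."""
    return weight * distortion + lam * bits


def select_candidate(candidates: Iterable[Candidate], lam: float, weight: float) -> Candidate:
    """Pick the candidate that minimizes J_new."""
    return min(candidates,
               key=lambda c: weighted_rd_cost(c.distortion, c.bits, lam, weight))


# Hypothetical candidates for one coding region.
modes = [Candidate("detailed", distortion=1000.0, bits=40.0),
         Candidate("cheap", distortion=1400.0, bits=10.0)]

sensitive = select_candidate(modes, lam=10.0, weight=1.0)  # costs 1400 vs 1500
tolerant = select_candidate(modes, lam=10.0, weight=0.5)   # costs 900 vs 800
```

With w = 1.0 the lower-distortion candidate wins, while with w = 0.5 (a region where distortion is less visible) the cheaper candidate wins, which is the bitrate shift the embodiment describes.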
[0088] As can be seen from the above, in the process of encoding the target HDR video, this application embodiment determines each coding region to be processed in the current video frame and calculates its corresponding regional average brightness. The average brightness of the region is input into a pre-constructed distortion weight mapping model, which is a piecewise linear model used to characterize the relationship between brightness and distortion weight factor, thereby obtaining the corresponding distortion weight factor. Subsequently, a target rate-distortion cost formula is constructed based on the distortion weight factor and applied to the rate-distortion optimization decision of the coding region to be processed, completing the coding operation of the coding region to be processed. In this way, the distortion weight factor is introduced dynamically, based on brightness characteristics, into rate-distortion optimization. Differentiated encoding optimization can thus be achieved for the visual perception characteristics of different brightness regions of HDR video, effectively improving the subjective visual quality of the encoded video while taking into account encoding efficiency and bitrate control accuracy. This enables the HDR video rate-distortion optimization method to allocate bitrate more efficiently while maintaining subjective quality, thereby improving the overall compression efficiency.
[0089] Accordingly, as shown in Figure 2, an embodiment of this application also provides an HDR video rate-distortion optimization device based on adaptive distortion weights, including:
[0090] The brightness calculation module 11 is used to determine each encoding region to be processed in the current video frame during the encoding process of the target HDR video, and to calculate the average brightness of each encoding region to be processed.
[0091] The brightness input module 12 is used to input the average brightness of the region into a pre-constructed distortion weight mapping model to obtain the distortion weight factor corresponding to each coding region to be processed; the distortion weight mapping model is a piecewise linear model used to characterize the relationship between brightness and distortion weight factor.
[0092] Formula application module 13 is used to construct a target rate-distortion cost formula based on the distortion weight factor, and apply the target rate-distortion cost formula to the rate-distortion optimization decision of the coding region to be processed, so as to complete the coding operation of the coding region to be processed.
[0093] In some specific implementations, the distortion weight mapping model is a first preset inflection point distortion weight mapping model;
[0094] Specifically, the HDR video rate-distortion optimization device based on adaptive distortion weights may include:
[0095] The set definition unit is used to divide the preset brightness range into a preset number of brightness intervals by using the target contrast sensitivity function model and the target HDR video standard, and to define the corresponding set of inflection points;
[0096] The weight determination unit is used to define corresponding inflection point brightness weights for several inflection points in the inflection point set based on the visual perception sensitivity of the brightness range, and to determine the inflection point brightness weights as distortion weights, so as to obtain a piecewise linear model composed of the inflection point set and the distortion weights.
[0097] The model determination unit is used to determine the piecewise linear model as the first preset inflection point distortion weight mapping model.
[0098] In some specific implementations, the distortion weight mapping model is a second preset inflection point distortion weight mapping model;
[0099] The HDR video rate-distortion optimization device based on adaptive distortion weights may further include:
[0100] The condition judgment unit is used to determine whether a scene change has occurred in the current video frame using a preset video scene switching detection technology, and to obtain the corresponding judgment result.
[0101] The histogram statistics unit is used to perform luminance histogram statistics on the target video frame of the current scene in the target HDR video if the judgment result indicates that a scene switch has occurred, so as to obtain the distribution characteristics of the luminance histogram.
[0102] The model determination submodule is used to adjust the inflection point coordinates of several inflection points in the first preset inflection point distortion weight mapping model according to the distribution characteristics, to obtain the adjusted piecewise linear model, and to determine the adjusted piecewise linear model as the second preset inflection point distortion weight mapping model.
[0103] In some specific implementations, the model determining submodule may specifically include:
[0104] The coordinate adjustment unit is used to adjust the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model according to the distribution characteristics by means of a preset linear interpolation adjustment strategy and / or a preset translation adjustment strategy.
[0105] The first factor determination unit is used to determine the brightness contrast of the current video frame and determine the weight scaling factor based on the brightness contrast, so as to adjust the inflection point distortion weight coordinates of the plurality of inflection points based on the weight scaling factor.
[0106] The inflection point coordinates include the inflection point brightness coordinates and the inflection point distortion weight coordinates.
[0107] In some specific embodiments, the coordinate adjustment unit is specifically used to determine the target brightness range of the target video frame based on the distribution characteristics, and to map the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model to the target brightness range using linear interpolation technology; and / or, to determine the average brightness of the main subject in the target video frame based on the distribution characteristics, and to determine the first target brightness interval with the highest distortion weight in the first preset inflection point distortion weight mapping model, and to calculate the target translation amount using the average brightness of the main subject and the first target brightness interval, so as to translate the inflection point brightness coordinates of the several inflection points based on the target translation amount.
[0108] In some specific embodiments, the brightness input module 12 may specifically include:
[0109] A brightness comparison unit is used to determine several inflection point brightness values in the pre-constructed distortion weight mapping model, and to compare the average brightness of the region with the several inflection point brightness values to determine the second target brightness interval of the distortion weight mapping model in which the average brightness of the region is located.
[0110] The second factor determination unit is used to determine the distortion weight value of the inflection point corresponding to the second target brightness interval as the distortion weight factor corresponding to the average brightness of the region if the number of inflection points corresponding to the second target brightness interval is one.
[0111] The factor calculation unit is used to calculate the distortion weight factor using linear interpolation if the number of inflection points corresponding to the second target brightness interval is not one.
[0112] In some specific embodiments, the brightness calculation module 11 may specifically include:
[0113] A component conversion unit is used to determine the luminance component of each pixel in the coding region to be processed, and to convert the luminance component into linear light brightness using an electro-optical transfer function (EOTF);
[0114] A brightness determination unit is used to calculate the average value of the linear light brightness and determine the average value as the regional average brightness of the coding region to be processed.
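As a non-limiting illustration, the two units above can be sketched in Python as follows. The description does not name a specific electro-optical transfer function; this sketch assumes the PQ curve (SMPTE ST 2084, also used in ITU-R BT.2100), whose output range of 0 to 10000 cd/m² matches the constant 10000 in the range-mapping formula. It also assumes full-range 10-bit luma codes, which is a simplification; the function names are hypothetical.

```python
from typing import Sequence

# PQ (SMPTE ST 2084) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_eotf(signal: float) -> float:
    """Convert a normalized non-linear PQ signal in [0, 1] to linear light in cd/m^2."""
    e = min(max(signal, 0.0), 1.0) ** (1.0 / M2)
    return 10000.0 * (max(e - C1, 0.0) / (C2 - C3 * e)) ** (1.0 / M1)


def region_average_brightness(luma_codes: Sequence[int], bit_depth: int = 10) -> float:
    """Average linear light brightness of one coding region.

    Assumes full-range code values; narrow-range video would need a
    different normalization before the EOTF is applied.
    """
    if not luma_codes:
        raise ValueError("coding region contains no pixels")
    max_code = (1 << bit_depth) - 1
    linear = [pq_eotf(code / max_code) for code in luma_codes]
    return sum(linear) / len(linear)


# Hypothetical 2x2 region of 10-bit luma codes.
l_avg = region_average_brightness([400, 420, 510, 530])
```

The average is taken after conversion to linear light, as the description requires; averaging the code values first and converting afterwards would give a different, lower result because the EOTF is convex over most of its range.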
[0115] Furthermore, embodiments of this application also disclose an electronic device. Figure 3 is a structural diagram of an electronic device 20 according to an exemplary embodiment. The content of the diagram should not be construed as limiting the scope of this application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input / output interface 25, and a communication bus 26. The memory 22 stores a computer program, which is loaded and executed by the processor 21 to implement the relevant steps in the HDR video rate-distortion optimization method based on adaptive distortion weights disclosed in any of the foregoing embodiments. Furthermore, the electronic device 20 in this embodiment may specifically be a computer.
[0116] In this embodiment, the power supply 23 is used to provide operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and external devices, and the communication protocol it follows can be any communication protocol applicable to the technical solution of this application, and is not specifically limited here; the input / output interface 25 is used to acquire external input data or output data to the outside world, and its specific interface type can be selected according to specific application needs, and is not specifically limited here.
[0117] In addition, the memory 22, as a carrier for resource storage, can be a read-only memory, random access memory, disk or optical disk, etc. The resources stored thereon can include operating system 221, computer program 222, etc., and the storage method can be temporary storage or permanent storage.
[0118] The operating system 221, which may be Windows Server, Netware, Unix, Linux, etc., is used to manage and control the various hardware devices on the electronic device 20 and the computer program 222. In addition to including a computer program capable of performing the HDR video rate-distortion optimization method based on adaptive distortion weights as disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs capable of performing other specific tasks.
[0119] Furthermore, this application also discloses a computer-readable storage medium for storing a computer program; wherein, when the computer program is executed by a processor, it implements the aforementioned disclosed HDR video rate-distortion optimization method based on adaptive distortion weights. Specific steps of this method can be found in the corresponding content disclosed in the foregoing embodiments, and will not be repeated here.
[0120] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on its differences from the other embodiments. For parts that are similar or identical between embodiments, the embodiments may be referred to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant parts, refer to the description of the method.
[0121] Those skilled in the art will further recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0122] The steps of the methods or algorithms described in conjunction with the embodiments disclosed herein can be implemented directly by hardware, a software module executed by a processor, or a combination of both. The software module can be located in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art.
[0123] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0124] The technical solutions provided in this application have been described in detail above. Specific examples have been used to illustrate the principles and implementation methods of this application. The descriptions of the above embodiments are only for the purpose of helping to understand the methods and core ideas of this application. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.
Claims
1. A rate-distortion optimization method for HDR video based on adaptive distortion weights, characterized in that it comprises:
during the encoding of the target HDR video, determining each coding region to be processed in the current video frame, and calculating the regional average brightness of each coding region to be processed;
inputting the average brightness of the region into a pre-constructed distortion weight mapping model to obtain the distortion weight factor corresponding to each coding region to be processed, wherein the distortion weight mapping model is a piecewise linear model used to characterize the relationship between brightness and distortion weight factor;
constructing a target rate-distortion cost formula based on the distortion weight factor, and applying the target rate-distortion cost formula to the rate-distortion optimization decision of the coding region to be processed, so as to complete the coding operation of the coding region to be processed;
wherein the distortion weight mapping model is either a first preset inflection point distortion weight mapping model or a second preset inflection point distortion weight mapping model;
the construction process of the first preset inflection point distortion weight mapping model includes:
dividing the preset brightness range into a preset number of brightness intervals by using the target contrast sensitivity function model and the target HDR video standard, and defining a corresponding set of inflection points;
defining, based on the visual perception sensitivity of the brightness range, a corresponding inflection point brightness weight for several inflection points in the inflection point set, and determining the inflection point brightness weight as the distortion weight, to obtain a piecewise linear model composed of the inflection point set and the distortion weight;
determining the piecewise linear model as the first preset inflection point distortion weight mapping model;
before determining each coding region to be processed in the current video frame, the method further includes:
determining, by using a preset video scene switching detection technology, whether a scene switch has occurred in the current video frame, and obtaining the corresponding judgment result;
if the judgment result indicates that a scene switch has occurred, performing luminance histogram statistics on the target video frame of the current scene in the target HDR video to obtain the distribution characteristics of the luminance histogram;
adjusting, based on the distribution characteristics, the inflection point coordinates of several inflection points in the first preset inflection point distortion weight mapping model to obtain the adjusted piecewise linear model, and determining the adjusted piecewise linear model as the second preset inflection point distortion weight mapping model.
2. The HDR video rate-distortion optimization method based on adaptive distortion weights according to claim 1, characterized in that the step of adjusting the inflection point coordinates of several inflection points in the first preset inflection point distortion weight mapping model according to the distribution characteristics to obtain the adjusted piecewise linear model includes:
adjusting, by using a preset linear interpolation adjustment strategy and / or a preset translation adjustment strategy, the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model according to the distribution characteristics;
determining the brightness contrast of the current video frame, and determining the weight scaling factor based on the brightness contrast, so as to adjust the inflection point distortion weight coordinates of the several inflection points based on the weight scaling factor;
wherein the inflection point coordinates include the inflection point brightness coordinates and the inflection point distortion weight coordinates.
3. The HDR video rate-distortion optimization method based on adaptive distortion weights according to claim 2, characterized in that the step of adjusting the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model according to the distribution characteristics through a preset linear interpolation adjustment strategy and / or a preset translation adjustment strategy includes:
determining the target brightness range of the target video frame based on the distribution characteristics, and mapping the inflection point brightness coordinates of several inflection points in the first preset inflection point distortion weight mapping model to the target brightness range using linear interpolation technology;
and / or, determining the average brightness of the main subject in the target video frame based on the distribution characteristics, determining the first target brightness interval with the highest distortion weight in the first preset inflection point distortion weight mapping model, and calculating the target translation amount using the average brightness of the main subject and the first target brightness interval, so as to translate the inflection point brightness coordinates of the several inflection points based on the target translation amount.
4. The HDR video rate-distortion optimization method based on adaptive distortion weights according to claim 1, characterized in that, for any one of the coding regions to be processed, the step of inputting the average brightness of the region into a pre-constructed distortion weight mapping model to obtain the distortion weight factor corresponding to each coding region to be processed includes:
determining several inflection point brightness values in the pre-constructed distortion weight mapping model, and comparing the average brightness of the region with the several inflection point brightness values to determine the second target brightness interval of the distortion weight mapping model in which the average brightness of the region is located;
if the number of inflection points corresponding to the second target brightness interval is one, determining the distortion weight value of the inflection point corresponding to the second target brightness interval as the distortion weight factor corresponding to the average brightness of the region;
if the number of inflection points corresponding to the second target brightness interval is not one, calculating the distortion weight factor using linear interpolation technology.
5. The HDR video rate-distortion optimization method based on adaptive distortion weights according to any one of claims 1 to 4, characterized in that the step of calculating the regional average brightness of each coding region to be processed includes:
determining the luminance component of each pixel in the coding region to be processed, and converting the luminance component into linear light brightness using an electro-optical transfer function;
calculating the average value of the linear light brightness, and determining the average value as the regional average brightness of the coding region to be processed.
6. A rate-distortion optimization device for HDR video based on adaptive distortion weights, characterized in that it comprises:
a brightness calculation module, used to determine each coding region to be processed in the current video frame during the encoding of the target HDR video, and to calculate the regional average brightness of each coding region to be processed;
a brightness input module, used to input the average brightness of the region into a pre-constructed distortion weight mapping model to obtain the distortion weight factor corresponding to each coding region to be processed, wherein the distortion weight mapping model is a piecewise linear model used to characterize the relationship between brightness and distortion weight factor;
a formula application module, used to construct a target rate-distortion cost formula based on the distortion weight factor, and to apply the target rate-distortion cost formula to the rate-distortion optimization decision of the coding region to be processed, so as to complete the coding operation of the coding region to be processed;
wherein the distortion weight mapping model is either a first preset inflection point distortion weight mapping model or a second preset inflection point distortion weight mapping model;
the HDR video rate-distortion optimization device based on adaptive distortion weights includes:
a set definition unit, used to divide the preset brightness range into a preset number of brightness intervals by using the target contrast sensitivity function model and the target HDR video standard, and to define the corresponding set of inflection points;
a weight determination unit, used to define corresponding inflection point brightness weights for several inflection points in the inflection point set based on the visual perception sensitivity of the brightness range, and to determine the inflection point brightness weights as distortion weights, so as to obtain a piecewise linear model composed of the inflection point set and the distortion weights;
a model determination unit, used to determine the piecewise linear model as the first preset inflection point distortion weight mapping model;
the HDR video rate-distortion optimization device based on adaptive distortion weights further includes:
a condition judgment unit, used to determine whether a scene switch has occurred in the current video frame using a preset video scene switching detection technology, and to obtain the corresponding judgment result;
a histogram statistics unit, used to perform luminance histogram statistics on the target video frame of the current scene in the target HDR video if the judgment result indicates that a scene switch has occurred, so as to obtain the distribution characteristics of the luminance histogram;
a model determination submodule, used to adjust the inflection point coordinates of several inflection points in the first preset inflection point distortion weight mapping model according to the distribution characteristics, to obtain the adjusted piecewise linear model, and to determine the adjusted piecewise linear model as the second preset inflection point distortion weight mapping model.
7. An electronic device, characterized in that it comprises:
a memory, used to store a computer program;
a processor, used to execute the computer program to implement the HDR video rate-distortion optimization method based on adaptive distortion weights as described in any one of claims 1 to 5.
8. A computer-readable storage medium, characterized in that it is used to store a computer program, wherein, when the computer program is executed by a processor, it implements the HDR video rate-distortion optimization method based on adaptive distortion weights as described in any one of claims 1 to 5.