Variable bit rate control method and device based on AVS
By dynamically adjusting the target bit count of image groups and the bit count of encoded frames in AVS video encoding, the problem of unstable video output under frequent scene switching is solved, achieving better image quality and encoding efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA TELECOM CORP LTD
- Filing Date
- 2021-12-29
- Publication Date
- 2026-06-23
AI Technical Summary
Existing AVS bitrate control methods cannot provide stable video output when scenes change frequently, and they are also quite complex.
By determining the target number of bits for image groups in a video sequence and dynamically adjusting the number of bits for encoded frames based on peak signal-to-noise ratio and scene switching factor, variable bit rate control is optimized by combining frame complexity and encoding complexity.
It achieves more stable image quality, reduced coding complexity, and improved network bandwidth resource utilization in video sequences with frequent scene changes and intense motion.
Figure CN116418986B_ABST
Description
Technical Field
[0001] This invention relates to the field of video coding technology, and in particular to an AVS-based variable bit rate control method and apparatus, electronic device, and readable storage medium.
Background Technology
[0002] Rate control refers to achieving better and more stable image quality, and making effective use of network bandwidth resources, under a given channel transmission rate.
[0003] Bitrate control is mainly divided into constant bit rate (CBR) control and variable bit rate (VBR) control. CBR adapts poorly to video sequences with frequent scene changes and rapid motion. VBR reduces the bitrate of some parts of the image by lowering their quality, to compensate for the additional bandwidth requirements of other parts of the image, which can improve the resource utilization of transmission and storage media. However, the CBR bitrate control method has high computational complexity. Existing rate control for AVS (Audio Video Coding Standard) cannot provide smooth video output when scenes change frequently, and it is also highly complex.
Summary of the Invention
[0004] This invention provides a variable bitrate control method and device, electronic device, and readable storage medium based on AVS, aiming to solve the problem that existing bitrate control methods cannot achieve stable video output and have high bitrate control complexity for video sequences with frequent scene switching.
[0005] A first aspect of the present invention provides a variable bit rate control method based on AVS, the method comprising:
[0006] Determine the target number of bits for the i-th image group in the video sequence; where i is the image group number of the image group to be encoded in the video sequence.
[0007] When the frame to be encoded is a frame other than the first I-frame and the first P-frame in the i-th image group, the scene switching factor of the previous encoded frame adjacent to the frame to be encoded in the i-th image group is determined based on the peak signal-to-noise ratio of the previous encoded frame adjacent to the frame to be encoded in the i-th image group, the average peak signal-to-noise ratio of all encoded P-frames in the video sequence, and the peak signal-to-noise ratio of each encoded frame before the frame to be encoded in the i-th image group.
[0008] Adjust the remaining target number of bits in the i-th image group before encoding the frame to be encoded, based on the scene switching factor of the previous encoded frame adjacent to the frame to be encoded in the i-th image group, the target number of bits of the i-th image group, the number of encoded bits of the previous encoded frame adjacent to the frame to be encoded, the number of bits already encoded in the i-th image group, and the actual bit rate and frame rate of the encoded bitstream in the i-th image group.
[0009] Determine the target number of bits of the frame to be encoded in the i-th image group, based on the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group and the number of target bits remaining in the i-th image group before encoding the frame to be encoded.
[0010] At the same bitrate, the peak signal-to-noise ratio (PSNR) is higher for frames with low encoding complexity and smooth motion, and lower for frames with high encoding complexity and rapid motion. The PSNR also changes abruptly during scene transitions. In this embodiment, based on the PSNR of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the average PSNR of all P-frames encoded in the video sequence, and the PSNR of each encoded frame preceding the frame to be encoded in the i-th image group, the scene transition factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group can be accurately determined. This not only effectively detects scene transitions but also determines whether the transition is from a relatively simple scene to a complex scene or vice versa. The process of dynamically adjusting the remaining target bits in the i-th image group before encoding the frame to be encoded, based on the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded, increases the remaining target bits for scenes progressing from simple to complex, and decreases the remaining target bits for scenes progressing from complex to simple. This results in a higher degree of scene fit, especially for video sequences with rapid motion, frequent scene switching, and complex textures. The relative complexity of the preceding encoded frame adjacent to the frame to be encoded can accurately reflect the complexity of the frame to be encoded. In determining the target bit number of the frame to be encoded in the i-th image group, the relative complexity of the preceding encoded frame adjacent to the frame to be encoded is further referenced. 
Thus, the target bit number determined for the frame to be encoded is more closely aligned with the complexity of the frame to be encoded and the degree of scene switching of the target encoded frame, resulting in smoother image quality. It has a better encoding effect, especially for videos with frequent scene switching and complex textures, while significantly reducing encoding complexity. Encoding the frame to be encoded with this target bit number not only achieves high fidelity but also effectively improves the utilization of network bandwidth resources and has good real-time performance.
[0011] Optionally, determining the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group based on the peak signal-to-noise ratio (PSNR) of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the average PSNR of all encoded P frames in the video sequence, and the PSNR of each encoded frame preceding the frame to be encoded in the i-th image group includes:
[0012] The scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group is determined using the following formula:
[0013]
[0014] where a is the first preset coefficient and b is the second preset coefficient. The remaining terms of the formula are the peak signal-to-noise ratio (PSNR) of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the average PSNR of all P-frames encoded in the video sequence, and the PSNR of the d-th encoded frame preceding the frame to be encoded in the i-th image group; j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
[0015] Optionally, adjusting the remaining target bits in the i-th image group before encoding the frame to be encoded, based on the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the target number of bits in the i-th image group, the encoded number of bits in the preceding encoded frame adjacent to the frame to be encoded, the encoded number of bits in the i-th image group, the actual bitrate and frame rate of the encoded bitstream in the i-th image group, includes:
[0016] The number of target bits remaining in the i-th image group before encoding the frame to be encoded is adjusted using the following formula:
[0017]
[0018] where the terms of the formula are: the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; the difference between the target number of bits of the i-th image group and the number of bits already encoded in the i-th image group; the number of encoded bits of the preceding encoded frame adjacent to the frame to be encoded; and the actual bitrate of the encoded bitstream in the video sequence. c is the third preset coefficient, d is the fourth preset coefficient, the two thresholds are the first preset threshold and the second preset threshold, and j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
[0019] Optionally, determining the target number of bits for the frame to be encoded in the i-th image group based on the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, and the number of target bits remaining before encoding the frame to be encoded in the i-th image group, includes:
[0020] The target number of bits for the frame to be encoded in the i-th image group is determined using the following formula:
[0021]
[0022] where the terms of the formula are the number of target bits remaining before encoding the frame to be encoded in the i-th image group, the number of uncoded P-frames remaining in the i-th image group, and the frame layer target bitrate adjustment factor corresponding to the previous encoded frame adjacent to the frame to be encoded in the i-th image group; j is the frame number of the previous encoded frame adjacent to the frame to be encoded in the i-th image group.
[0023] Optionally, the frame layer target bitrate adjustment factor corresponding to the preceding encoded frame adjacent to the frame to be encoded in the i-th image group can be determined using the following formula:
[0024]
[0025] where the formula uses a first preset value, a second preset value, a third preset value, and a fourth preset value, together with two preset constants denoted q. It also uses the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
[0026]
[0027] where the terms are the relative complexity of the second-to-last encoded frame adjacent to the frame to be encoded in the i-th image group and, for the i-th image group, the relative complexity of each of the encoded first... P-frames.
[0028] Optionally, determining the target number of bits for the i-th image group in the video sequence includes:
[0029] When i=1, the target number of bits for the first image group in the video sequence is determined as the preset number of bits;
[0030] When i > 1, the target number of bits for the i-th image group in the video sequence is determined using the following formula:
[0031]
[0032] where the terms of the formula are the target average bitrate for all image groups in the video sequence, the total number of image groups in the video sequence, the adjustment factor for the i-th image group, the complexity factor of the i-th image group, and the bitstream balancing factor.
[0033] Optionally, the complexity factor of the i-th image group can be determined using the following formula:
[0034]
[0035] where the terms of the formula are the fifth preset coefficient, the predicted average complexity of all P-frames in the i-th image group, the average complexity of all P-frames in the (i-1)-th image group, and the average complexity of all P-frames encoded in the video sequence;
[0036] The bitstream balance factor is calculated using the following formula:
[0037]
[0038] where med(·) is the median (middle value) function, and the remaining terms are the actual bitrate of the encoded bitstream in the video sequence and the target average bitrate for all image groups in the video sequence.
[0039] Optionally, the method further includes:
[0040] The quantization parameters of the first I-frame in each image group of the video sequence and the quantization parameters of the first P-frame in the first image group of the video sequence are both set to the preset quantization parameters.
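The rule in the preceding paragraph can be illustrated with a small predicate. This is a minimal sketch only: the type `FramePos`, its field names, and the function `uses_preset_qp` are hypothetical names introduced for illustration and do not appear in the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FramePos:
    """Hypothetical description of a frame's position (illustration only)."""
    gop_index: int      # i, the 1-based image group (GOP) number in the sequence
    frame_type: str     # "I" or "P"
    type_ordinal: int   # 1 for the first frame of this type inside its GOP


def uses_preset_qp(pos: FramePos) -> bool:
    """Return True when the text assigns the preset quantization parameter.

    Per the paragraph above: the first I-frame of every image group, and the
    first P-frame of the first image group only.
    """
    if pos.type_ordinal != 1:
        return False
    if pos.frame_type == "I":
        return True                  # first I-frame of any GOP
    return pos.frame_type == "P" and pos.gop_index == 1


print(uses_preset_qp(FramePos(gop_index=2, frame_type="P", type_ordinal=1)))
```

Under this reading, the first P-frame of a later image group does not receive the preset quantization parameter, which is why the call above reports `False`.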
[0041] A second aspect of the present invention provides an AVS-based variable bit rate control device, the device comprising:
[0042] The target bit count determination module for an image group is used to determine the target bit count of the i-th image group in the video sequence; where i is the image group number of the image group to be encoded in the video sequence.
[0043] The scene switching factor determination module is used to determine the scene switching factor of the preceding encoded frame in the i-th image group when the frame to be encoded is a frame other than the first I-frame and the first P-frame in the i-th image group, based on the peak signal-to-noise ratio of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the average peak signal-to-noise ratio of all encoded P-frames in the video sequence, and the peak signal-to-noise ratio of each encoded frame before the frame to be encoded in the i-th image group.
[0044] The remaining target bit number adjustment module is used to adjust the remaining target bit number in the i-th image group before encoding the frame to be encoded, based on the scene switching factor of the previous encoded frame adjacent to the frame to be encoded in the i-th image group, the target bit number of the i-th image group, the encoded bit number of the previous encoded frame adjacent to the frame to be encoded, the encoded bit number in the i-th image group, the actual bit rate and frame rate of the encoded bitstream in the i-th image group;
[0045] The bit count determination module for the frame to be encoded is used to determine the target bit count of the frame to be encoded in the i-th image group based on the relative complexity of the previous encoded frame adjacent to the frame to be encoded in the i-th image group and the target bit count remaining before encoding the frame to be encoded in the i-th image group.
[0046] Optionally, the scene switching factor determination module includes:
[0047] The switching factor determination unit is used to determine, using the following formula, the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group:
[0048]
[0049] where a is the first preset coefficient and b is the second preset coefficient. The remaining terms of the formula are the peak signal-to-noise ratio (PSNR) of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the average PSNR of all P-frames encoded in the video sequence, and the PSNR of the d-th encoded frame preceding the frame to be encoded in the i-th image group; j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
[0050] Optionally, the remaining target bit number adjustment module includes:
[0051] The remaining target bit number adjustment unit is used to adjust the remaining target bit number in the i-th image group before encoding the frame to be encoded using the following formula:
[0052]
[0053] where the terms of the formula are: the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; the difference between the target number of bits of the i-th image group and the number of bits already encoded in the i-th image group; the number of encoded bits of the preceding encoded frame adjacent to the frame to be encoded; and the actual bitrate of the encoded bitstream in the video sequence. c is the third preset coefficient, d is the fourth preset coefficient, the two thresholds are the first preset threshold and the second preset threshold, and j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
[0054] Optionally, the bit count determination module for the frame to be encoded includes:
[0055] The bit count determination unit for the frame to be encoded is used to determine the target bit count of the frame to be encoded in the i-th image group using the following formula:
[0056]
[0057] where the terms of the formula are the number of target bits remaining before encoding the frame to be encoded in the i-th image group, the number of uncoded P-frames remaining in the i-th image group, and the frame layer target bitrate adjustment factor corresponding to the previous encoded frame adjacent to the frame to be encoded in the i-th image group; j is the frame number of the previous encoded frame adjacent to the frame to be encoded in the i-th image group.
[0058] Optionally, the bit count determination unit for the frame to be encoded includes: a frame-layer target bit rate adjustment factor determination subunit, which is used to determine the frame-layer target bit rate adjustment factor corresponding to the preceding encoded frame adjacent to the frame to be encoded in the i-th image group using the following formula:
[0059]
[0060] where the formula uses a first preset value, a second preset value, a third preset value, and a fourth preset value, together with two preset constants denoted q. It also uses the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
[0061]
[0062] where the terms are the relative complexity of the second-to-last encoded frame adjacent to the frame to be encoded in the i-th image group and, for the i-th image group, the relative complexity of each of the encoded first... P-frames.
[0063] Optionally, the target bit count determination module for the image group includes:
[0064] The first unit for determining the target bit count of an image group is used to determine the target bit count of the first image group in the video sequence as a preset bit count when i=1.
[0065] The second unit for determining the target bit count of an image group is used to determine the target bit count of the i-th image group in the video sequence using the following formula when i > 1:
[0066]
[0067] where the terms of the formula are the target average bitrate for all image groups in the video sequence, the total number of image groups in the video sequence, the adjustment factor for the i-th image group, the complexity factor of the i-th image group, and the bitstream balancing factor.
[0068] Optionally, the second unit for determining the target number of bits in the image group includes: a complexity factor determination subunit for the image group, which is used to determine the complexity factor of the i-th image group using the following formula:
[0069]
[0070] where the terms of the formula are the fifth preset coefficient, the predicted average complexity of all P-frames in the i-th image group, the average complexity of all P-frames in the (i-1)-th image group, and the average complexity of all P-frames encoded in the video sequence;
[0071] The bitstream balance factor is calculated using the following formula:
[0072]
[0073] where med(·) is the median (middle value) function, and the remaining terms are the actual bitrate of the encoded bitstream in the video sequence and the target average bitrate for all image groups in the video sequence.
[0074] Optionally, the device further includes:
[0075] The quantization parameter determination module is used to set the quantization parameters of the first I-frame in each image group of the video sequence and the quantization parameters of the first P-frame in the first image group of the video sequence to preset quantization parameters.
[0076] A third aspect of the present invention provides an electronic device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of any of the aforementioned AVS-based variable bit rate control methods.
[0077] A fourth aspect of the present invention provides a readable storage medium storing a computer program that, when executed by a processor, implements the steps of any of the aforementioned AVS-based variable bitrate control methods.
[0078] The aforementioned AVS-based variable bit rate control device, electronic device, and readable storage medium have the same or similar beneficial effects as the aforementioned AVS-based variable bit rate control method. To avoid repetition, they will not be described again here.
Attached Figure Description
[0079] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments of the present invention will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0080] Figure 1 is a flowchart illustrating the steps of an AVS-based variable bit rate control method in an embodiment of the present invention.
[0081] Figure 2 is a schematic diagram illustrating the steps of an AVS-based variable bit rate control method according to an embodiment of the present invention.
[0082] Figure 3 is a structural block diagram of an AVS-based variable bit rate control device in an embodiment of the present invention.
[0083] Figure 4 is a block diagram of another AVS-based variable bit rate control device in an embodiment of the present invention.
Detailed Implementation
[0084] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0085] Figure 1 is a flowchart illustrating the steps of an AVS-based variable bit rate control method according to an embodiment of the present invention. As shown in Figure 1, the AVS-based variable bit rate control method includes the following steps:
[0086] Step 101: Determine the target number of bits for the i-th image group in the video sequence; where i is the image group number of the image group to be encoded in the video sequence.
[0087] In this embodiment of the invention, the video sequence may contain at least one Group of Pictures (GOP). Hereinafter, GOP refers to a group of pictures. The number of GOPs contained in the video sequence is not specifically limited. The GOP to be encoded is the GOP that is about to be encoded. i is the GOP index of the GOP to be encoded in the video sequence, and the value of i is less than or equal to the total number of GOPs contained in the video sequence. The definition of i in the following text refers to this definition. For example, if the video sequence has 3 GOPs, and the GOP to be encoded is the first GOP in the video sequence, then i here is 1.
[0088] The target bit count for the i-th image group is the initially set bit count for the i-th image group. When determining the target bit count for the i-th image group in the video sequence, factors such as bandwidth and the actual situation of the i-th image group can be considered; the specific setting method is not limited here.
[0089] Optionally, when i=1, the target bit count of the first image group in the video sequence is determined as a preset bit count, which can be less than or equal to the maximum network bandwidth. The actual value of this preset bit count is not specifically limited.
[0090] i > 1 refers to all the image groups in the video sequence except for the first image group. For example, in the previous example, if the video sequence has 3 GOPs, i > 1 refers to the 2nd and 3rd GOPs in the video sequence.
[0091] Optionally, when i > 1, the target number of bits for the i-th image group in the video sequence is determined using the following formula:
[0092]
[0093] where the terms of the formula are: the target average bitrate for all image groups in the video sequence; N, the total number of image groups in the video sequence (in the previous example of a video sequence with 3 GOPs, N = 3); the adjustment factor for the i-th image group; the complexity factor of the i-th image group; and the bitstream balancing factor.
[0094] The adjustment factor for the i-th image group can be determined using the following formula:
[0095]
[0096] where e is the fifth preset coefficient, mainly used to compensate for cases where the complexity factor is too large. The value of e can be 0.1 to 0.9; for example, e can be 0.5. The remaining terms are the predicted average complexity of all P-frames in the i-th image group, the average complexity of all P-frames in the (i-1)-th image group, and the average complexity of all P-frames encoded in the video sequence. The bitstream balance factor is calculated using the following formula:
[0097]
[0098] where med(·) is the median (middle value) function. The value of m is chosen for accurate bitrate control and smooth image quality: bitrate fluctuations are usually limited to within 10%, so the value of m can be 10. The remaining terms are the actual bitrate of the encoded bitstream in the video sequence and the target average bitrate for all image groups in the video sequence.
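The formula for the bitstream balance factor is not reproduced in this text, so the following is only a sketch of how a median of three values can bound a fluctuation. It assumes, as an illustration and not as the patented formula, that the factor is the ratio of target to actual bitrate limited to the band 1 ± 1/m, which for m = 10 matches the 10% limit mentioned above. The function names and the direction of the ratio are assumptions.

```python
def med(x: float, y: float, z: float) -> float:
    """Median (middle value) of three numbers."""
    return sorted((x, y, z))[1]


def balance_factor(actual_bitrate: float, target_bitrate: float,
                   m: float = 10.0) -> float:
    """Hypothetical balance factor: target/actual ratio kept within 1 +/- 1/m.

    ASSUMPTION: the ratio direction and the band 1 +/- 1/m are illustrative
    choices; the source text only states that med(.) is a middle-value
    function and that m can be 10.
    """
    if actual_bitrate <= 0 or target_bitrate <= 0 or m <= 0:
        raise ValueError("bitrates and m must be positive")
    ratio = target_bitrate / actual_bitrate
    # Taking the median of (lower bound, ratio, upper bound) clamps the ratio.
    return med(1.0 - 1.0 / m, ratio, 1.0 + 1.0 / m)


# Actual bitrate is twice the target: the factor saturates at the lower bound.
print(balance_factor(actual_bitrate=2_000_000, target_bitrate=1_000_000))
```

Taking the median of a value and two bounds is a compact way to write a clamp, which is one plausible reason a middle-value function appears in a rate control formula.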
[0099] For example, when determining the target number of bits for the second GOP in the video sequence, i is 2. The average complexity of all P-frames in the (i-1)-th image group is then the average complexity of all P-frames in the first image group. At this point, only the first GOP in the video sequence has been encoded, so the average complexity of all P-frames encoded in the video sequence is the average complexity of all P-frames encoded in the first GOP.
[0100] When i > 1, the above formula can determine the target number of bits for the i-th image group in the video sequence more accurately.
[0101] Step 102: When the frame to be encoded is a frame other than the first I-frame and the first P-frame in the i-th image group, the scene switching factor of the previous encoded frame adjacent to the frame to be encoded in the i-th image group is determined based on the peak signal-to-noise ratio of the previous encoded frame adjacent to the frame to be encoded in the i-th image group, the average peak signal-to-noise ratio of all encoded P-frames in the video sequence, and the peak signal-to-noise ratio of each encoded frame before the frame to be encoded in the i-th image group.
[0102] The first frame of each image group is an I-frame. During bitrate control of frames other than the first I-frame and the first P-frame in the i-th image group, the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group is determined based on the peak signal-to-noise ratio (PSNR) of that preceding encoded frame, the average PSNR of all encoded P-frames in the video sequence, and the PSNR of each encoded frame preceding the frame to be encoded in the i-th image group. This scene switching factor not only detects scene changes effectively but also indicates whether the change is from a relatively simple scene to a complex scene or vice versa. For example, a larger scene switching factor indicates a change from a relatively simple scene to a complex scene, while a smaller scene switching factor indicates a change from a relatively complex scene to a relatively simple scene. The preceding encoded frame is adjacent to the frame to be encoded and is closer to the first I-frame in the i-th image group than the frame to be encoded.
[0103] For example, in the aforementioned case, if the frame structure of each GOP in the video sequence is IPPP, then during the bitrate control process for the second P-frame of the first GOP, the frame to be encoded is the second P-frame of the first GOP, and the preceding encoded frame adjacent to the frame to be encoded is the first P-frame of the first GOP. The only encoded P-frame in the video sequence at this point is the first P-frame of the first GOP, so the average PSNR of all encoded P-frames in the video sequence is the PSNR of the first encoded P-frame in the first GOP. In the first GOP, the PSNRs of the encoded frames preceding the frame to be encoded are the PSNR of the first encoded I-frame and the PSNR of the first encoded P-frame. The scene switching factor of the first encoded P-frame in the first GOP is then determined from the PSNR of the first encoded P-frame in the first GOP (which here serves both as the PSNR of the preceding encoded frame and as the average PSNR of all encoded P-frames) and the PSNRs of the first encoded I-frame and the first encoded P-frame in the first GOP.
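The bookkeeping in the IPPP example above can be sketched as follows. This is an illustrative helper only: the function name `scene_factor_inputs`, its parameters, and the PSNR values in the usage line are hypothetical and are not taken from the patent.

```python
from statistics import mean


def scene_factor_inputs(gop_frames, earlier_p_psnrs=()):
    """Collect the three PSNR inputs named in the text for the next frame.

    gop_frames: list of (frame_type, psnr) for the frames already encoded in
        the current image group, in coding order (hypothetical layout).
    earlier_p_psnrs: PSNRs of P-frames encoded in earlier image groups.

    Returns (prev_psnr, avg_p_psnr, gop_psnrs). The step only applies after
    the first I-frame and first P-frame, so at least one P-frame exists.
    """
    if not gop_frames:
        raise ValueError("no encoded frame precedes the frame to be encoded")
    prev_psnr = gop_frames[-1][1]                      # adjacent preceding frame
    p_psnrs = list(earlier_p_psnrs) + [p for t, p in gop_frames if t == "P"]
    avg_p_psnr = mean(p_psnrs)                         # all encoded P-frames
    gop_psnrs = [p for _, p in gop_frames]             # every frame, d = 1..j
    return prev_psnr, avg_p_psnr, gop_psnrs


# Second P-frame of the first GOP: only I1 and P1 have been encoded so far.
print(scene_factor_inputs([("I", 40.0), ("P", 36.0)]))
```

With only the first I-frame and first P-frame encoded, the preceding-frame PSNR and the average P-frame PSNR coincide, as the example in the text points out.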
[0104] At the same bitrate, the peak signal-to-noise ratio (PSNR) is higher for frames with low encoding complexity and smooth motion, and lower for frames with high encoding complexity and rapid motion. The PSNR also changes abruptly during scene transitions. In this embodiment, based on the PSNR of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the average PSNR of all P-frames encoded in the video sequence, and the PSNR of each encoded frame preceding the frame to be encoded in the i-th image group, the scene transition factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group can be accurately determined. This not only effectively detects scene transitions but also determines whether the transition is from a relatively simple scene to a complex scene or vice versa.
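For reference, PSNR is conventionally computed from the mean squared error between the original and reconstructed samples as 10·log10(MAX² / MSE). The sketch below uses that standard definition on flat lists of sample values; the function name and the 8-bit default peak of 255 are choices made for illustration, and the patent text does not specify how its encoder measures PSNR.

```python
import math


def psnr(original, reconstructed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf                       # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


# A frame that is reconstructed less faithfully yields a lower PSNR.
src = [100, 120, 140, 160]
print(psnr(src, [101, 119, 141, 159]) > psnr(src, [110, 110, 150, 150]))
```

A sudden drop in this value from one frame to the next, at an unchanged bitrate, is the kind of abrupt change the paragraph above associates with a scene transition.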
[0105] Optionally, the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group can be determined using the following formula. :
[0106]
[0107] Where 'a' is the first preset coefficient and 'b' is the second preset coefficient, a + b = 1, and both 'a' and 'b' are weighting factors. The closer the weighting factor is to the frame to be encoded, the more accurate the prediction of the frame is likely to be. Therefore, generally 'a' ≥ 'b'. For example, the value of 'a' can be 0.8-0.9. , The peak signal-to-noise ratio (PSNR) of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group. The average peak signal-to-noise ratio of all P-frames encoded in the video sequence. ; Let be the peak signal-to-noise ratio (PSNR) of the d-th encoded frame preceding the frame to be encoded in the i-th image group, where d ranges from 1 to j. Let j be the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group. The definition of j used below is the same as here. At the same bitrate, the PSNR is higher for frames with low coding complexity and smooth motion, and lower for frames with high coding complexity and rapid motion. The PSNR will change abruptly when a scene change occurs. The scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, determined by the above formula. It can accurately reflect the specific circumstances of scene switching.
[0108] For example, if the frame structure of each GOP in the video sequence is IPPP, then during bitrate control for the second P-frame of the first GOP, the preceding encoded frame adjacent to the frame to be encoded is the first P-frame of the first GOP, and j, the frame number of that first P-frame within the first GOP, is 2. The PSNR of the preceding encoded frame is then the PSNR of the encoded first P-frame of the first GOP. The encoded frames preceding the frame to be encoded within the first GOP are the first I-frame and the first P-frame, so the corresponding PSNR values are those of the encoded first I-frame and the encoded first P-frame of the first GOP, respectively.
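The formula of paragraph [0106] is not reproduced in this text, so the following Python sketch only illustrates one plausible shape for it. It assumes the factor is a weighted sum of two PSNR ratios, with weight a on the ratio against the mean PSNR of the frames already encoded in the current image group and weight b on the ratio against the average PSNR of all encoded P-frames. The function name, the parameter names, and the exact combination are hypothetical, not taken from the patent.

```python
from statistics import mean


def scene_switching_factor(psnr_prev, avg_psnr_all_p, psnr_in_gop, a=0.8, b=0.2):
    """Illustrative scene switching factor (ASSUMED form, not the patent's formula).

    psnr_prev      -- PSNR of the preceding encoded frame adjacent to the frame to be encoded
    avg_psnr_all_p -- average PSNR of all P-frames already encoded in the sequence
    psnr_in_gop    -- PSNRs of encoded frames d = 1..j in the current image group
    a, b           -- weighting coefficients, a + b = 1 and a >= b per the description
    """
    if abs(a + b - 1.0) > 1e-9:
        raise ValueError("a + b must equal 1")
    gop_mean = mean(psnr_in_gop)
    # Assumption: a ratio above 1 means the previous frame was easier
    # (higher PSNR) than the recent history, and below 1 means harder.
    return a * psnr_prev / gop_mean + b * psnr_prev / avg_psnr_all_p
```

Under this assumed form, a stationary scene gives a factor close to 1, which is consistent with the thresholds of roughly 0.8 to 1.2 quoted for TH1 and TH2 in the description.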
[0109] Step 103: Based on the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the target number of bits of the i-th image group, the number of encoded bits of the preceding encoded frame adjacent to the frame to be encoded, the number of bits already encoded in the i-th image group, and the actual bitrate and frame rate of the encoded bitstream in the i-th image group, adjust the remaining target number of bits in the i-th image group before encoding the frame to be encoded.
[0110] When the frame to be encoded is any frame other than the first I-frame and the first P-frame in the i-th image group, the remaining target bits in the i-th image group before encoding the frame to be encoded are determined based on the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded, the target number of bits in the i-th image group, the encoded number of bits in the preceding encoded frame adjacent to the frame to be encoded, the number of bits already encoded in the i-th image group, the actual bitrate of the encoded bitstream in the i-th image group, and the frame rate. Specifically, this step increases the remaining target bits for scenes progressing from simple to complex, and decreases the remaining target bits for scenes progressing from complex to simple, making the remaining target bits more closely match the scene, especially providing better adaptability for video sequences with rapid motion, frequent scene changes, and complex textures.
[0111] Optionally, in step 103, the number of remaining target bits in the i-th image group before encoding the frame to be encoded is adjusted using the following formula:
[0112]
[0113] In the formula, the terms are: the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; a quantity equal to the difference between the target number of bits of the i-th image group and the number of bits already encoded in the i-th image group; the number of encoded bits of the preceding encoded frame adjacent to the frame to be encoded; and the actual bitrate of the bitstream already encoded in the video sequence. c is the third preset coefficient and d is the fourth preset coefficient; TH1 is the first preset threshold and TH2 is the second preset threshold. Typically, TH1 is set to 1.1-1.2 and TH2 is set to 0.8-0.9. A scene switching factor greater than TH1 indicates that the scene predicted by the preceding encoded frame adjacent to the frame to be encoded changes drastically from complex to simple, so fewer bits are allocated to the frame to be encoded. A scene switching factor smaller than TH2 indicates that the predicted scene changes drastically from simple to complex, so more bits are allocated to the frame to be encoded. A scene switching factor between TH2 and TH1 indicates that the scene transition is likely to be relatively smooth. The target bit count determined for the frame to be encoded therefore matches the complexity of the frame and the degree of scene switching more closely, resulting in smoother image quality. This is particularly beneficial for videos with frequent scene switching, giving better encoding performance while significantly reducing encoding complexity. Encoding the frame with this target bit count not only achieves high fidelity but also effectively improves the utilization of network bandwidth resources. c and d are adjustment coefficients; typically, c can be set to 0.8 and d to 1.2.
[0114] For example, if the frame structure of each GOP in the video sequence is IPPP, then during bitrate control for the second P-frame of the first GOP, the preceding encoded frame adjacent to the frame to be encoded is the first P-frame of the first GOP, and j, the frame number of that first P-frame within the first GOP, is 2. The scene switching factor used is that of the first P-frame of the first GOP. The frames already encoded in the first GOP are the first I-frame and the first P-frame, so the number of bits already encoded in the image group equals the sum of the encoded bits of these two frames, and the actual bitrate of the encoded bitstream in the video sequence is the bitrate of these two frames. The difference term equals the target bit count of the first GOP, minus the encoded bit count of the first I-frame of the first GOP, and then minus the encoded bit count of the first P-frame. The number of encoded bits of the preceding encoded frame is the number of encoded bits of the first P-frame of the first GOP.
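The threshold behaviour described for step 103 can be sketched in Python as follows. This is a simplified illustration, not the formula of paragraph [0112], which is not reproduced in this text: it keeps only the remaining-bits difference and the TH1/TH2 comparison, and it omits the terms involving the preceding frame's encoded bits, the actual bitrate, and the frame rate. Pairing c = 0.8 with the "fewer bits" branch and d = 1.2 with the "more bits" branch is an assumption, as are the function and parameter names.

```python
def adjust_remaining_bits(gop_target_bits, gop_encoded_bits, factor,
                          th1=1.1, th2=0.9, c=0.8, d=1.2):
    """Scale the bits left in an image group by the scene switching factor.

    Simplified sketch: the source formula has further terms that are
    not recoverable from the text and are left out here.
    """
    # Remaining bits = GOP target minus bits already spent in this GOP.
    remaining = float(gop_target_bits - gop_encoded_bits)
    if factor > th1:
        # Complex -> simple scene: allocate fewer bits (assumed use of c).
        return c * remaining
    if factor < th2:
        # Simple -> complex scene: allocate more bits (assumed use of d).
        return d * remaining
    # Smooth transition: leave the remaining budget unchanged.
    return remaining
```

For instance, with a target of 1000 bits and 400 bits already spent, a factor of 1.3 shrinks the remaining budget of 600 bits, a factor of 0.7 enlarges it, and a factor of 1.0 leaves it as is.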
[0115] Step 104: Based on the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, and the number of target bits remaining in the i-th image group before encoding the frame to be encoded, determine the target number of bits for the frame to be encoded in the i-th image group.
[0116] When the frame to be encoded is a frame other than the first I-frame and the first P-frame in the i-th image group, the target bit count for the frame to be encoded in the i-th image group is determined based on the relative complexity of the preceding encoded frame adjacent to the frame to be encoded, and the number of target bits remaining before encoding the frame to be encoded in the i-th image group. The relative complexity of the preceding encoded frame adjacent to the frame to be encoded can accurately reflect the complexity of the frame to be encoded. In determining the target bit count for the frame to be encoded in the i-th image group, the relative complexity of the preceding encoded frame adjacent to the frame to be encoded is further referenced. Therefore, the target bit count determined for the frame to be encoded is more closely aligned with the complexity of the frame to be encoded and the scene switching degree of the target encoded frame, resulting in smoother image quality. In particular, it has a better encoding effect for videos with frequent scene switching, while greatly reducing the encoding complexity. Encoding the frame to be encoded with this target bit count not only achieves high fidelity but also effectively improves the utilization of network bandwidth resources.
[0117] Optionally, the target number of bits for the frame to be encoded in the i-th image group can be determined using the following formula:
[0118]
[0119] In the formula, the terms are: the number of target bits remaining in the i-th image group before encoding the frame to be encoded; the number of uncoded P-frames remaining in the i-th image group; and the frame-layer target bitrate adjustment factor corresponding to the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
[0120] For example, if the frame structure of each GOP in the video sequence is IPPP, then during bitrate control for the second P-frame of the first GOP, the preceding encoded frame adjacent to the frame to be encoded is the first P-frame of the first GOP, and j, the frame number of that first P-frame within the first GOP, is 2. The result of the formula is the target number of bits for the second P-frame of the first GOP. The remaining-bits term is the number of target bits remaining in the first GOP before encoding its second P-frame. The remaining uncoded P-frames in the first GOP are the second and third P-frames, so their count is 2. The adjustment factor is the frame-layer target bitrate adjustment factor of the already encoded first P-frame of the first GOP.
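As a rough illustration of step 104, the Python sketch below assumes that the frame target is an even share of the remaining bits over the uncoded P-frames, scaled by the frame-layer adjustment factor. The formula of paragraph [0118] is not reproduced in this text, so this combination, the function name, and the parameter names are all assumptions.

```python
def frame_target_bits(remaining_bits, uncoded_p_frames, adjustment_factor):
    """Target bits for the next frame (ASSUMED form: even share times factor)."""
    if uncoded_p_frames <= 0:
        raise ValueError("there must be at least one uncoded P-frame")
    # Even share of the remaining budget, scaled by the frame-layer factor.
    return adjustment_factor * remaining_bits / uncoded_p_frames
```

In the IPPP example, two P-frames remain uncoded before the second P-frame is encoded, so with 600 bits remaining and a factor of 1.0 this sketch would assign half of the budget to that frame.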
[0121] Optionally, the frame layer target bitrate adjustment factor corresponding to the preceding encoded frame adjacent to the frame to be encoded in the i-th image group can be determined using the following formula:
[0122]
[0123] In the formula, the four constant values are the first preset value, the second preset value, the third preset value, and the fourth preset value; p and q are both preset constants; and the remaining term is the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group;
[0124]
[0125] In the formula, the terms are the relative complexity of the second preceding encoded frame adjacent to the frame to be encoded in the i-th image group, and the relative complexity of each encoded P-frame in the i-th image group. The second preceding encoded frame is adjacent to the preceding encoded frame and is closer to the first I-frame of the i-th image group than the preceding encoded frame is. The first preset value can be 2.0, the second preset value can be 1.1, the third preset value can be 1.0, and the fourth preset value can be 0.8. The value of p can range from 1.6 to 1.9; for example, p can be 1.7. The value of q can range from 0.8 to 1; for example, q can be 0.95. With the four preset values, p, and q within the above ranges, the frame layer target bitrate adjustment factor corresponding to the preceding encoded frame adjacent to the frame to be encoded is determined relatively accurately, and the prediction for the frame to be encoded is also relatively accurate.
[0126] Optionally, the method may further include the following steps: setting the quantization parameters of the first I-frame in each image group of the video sequence and the quantization parameters of the first P-frame in the first image group of the video sequence to preset quantization parameters. These preset quantization parameters need to be compatible with the maximum network bandwidth. For example, the value range of these preset quantization parameters can be 20-30.
[0127] Figure 2 shows a schematic diagram of the steps of an AVS-based variable bit rate control method according to an embodiment of the present invention. The following example further explains and illustrates this application:
[0128] Suppose the video sequence has 3 GOPs and the frame structure of each GOP is IPPP. Referring to Figure 2, the process of bitrate control and video encoding for this video sequence is as follows: S1, determine the target bit count of the first GOP in the video sequence as a preset bit count; this preset bit count can be less than or equal to the maximum network bandwidth. S2, set the quantization parameters of the first I-frame and the first P-frame in the first GOP of the video sequence to preset quantization parameters. S3, for the second P-frame of the first GOP, determine the scene switching factor of the already encoded first P-frame in the first GOP, then determine the number of target bits remaining in the first GOP before encoding the second P-frame, then determine the target number of bits for the second P-frame in the first GOP, and complete the encoding of the second P-frame of the first GOP. S4, for the third P-frame of the first GOP, determine the scene switching factor of the already encoded second P-frame in the first GOP, then determine the number of target bits remaining in the first GOP before encoding the third P-frame, then determine the target number of bits for the third P-frame in the first GOP. Once the encoding of the third P-frame of the first GOP is completed, the encoding of the first GOP is also completed.
[0129] S5, For the second GOP, determine the target number of bits for the second GOP. S6. Set the quantization parameters of the first I-frame and the first P-frame in the second GOP to the preset quantization parameters in step S2. S7. For the second P-frame of the second GOP, determine the scene switching factor of the first P-frame already encoded in the second GOP. Then determine the number of target bits remaining in the second GOP before encoding the second P-frame. Next, determine the target number of bits for the second P-frame in the second GOP. Complete the encoding of the second P-frame of the second GOP. S8, for the third P-frame of the second GOP, determine the scene switching factor of the already encoded second P-frame in the second GOP. Then determine the number of target bits remaining in the second GOP before encoding the third P-frame. Next, determine the target number of bits for the third P-frame in the second GOP. Once the encoding of the third P-frame of the second GOP is completed, the encoding of the second GOP is also completed.
[0130] S9, For the third GOP, first determine the target number of bits for the third GOP. S10, set the quantization parameters of the first I-frame and the first P-frame in the third GOP to the preset quantization parameters in step S2. S11, for the second P-frame of the third GOP, determine the scene switching factor of the first P-frame already encoded in the third GOP. Then determine the number of target bits remaining in the third GOP before encoding the second P-frame. Next, determine the target number of bits for the second P-frame in the third GOP. Complete the encoding of the second P-frame of the third GOP. S12, for the third P-frame of the third GOP, determine the scene switching factor of the already encoded second P-frame in the third GOP. Then determine the number of target bits remaining in the third GOP before encoding the third P-frame. Next, determine the target number of bits for the third P-frame in the third GOP. Once the encoding of the third P-frame of the third GOP is completed, the encoding of the third GOP is also completed, thus completing the encoding of the video sequence.
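The order of operations in steps S1 to S12 can be summarised with a short Python sketch. It only enumerates the control flow of the worked example (3 GOPs, IPPP structure, preset quantization parameters for the first I-frame and first P-frame of each GOP); no bit counts are computed. The function name and the step strings are hypothetical.

```python
def encoding_schedule(num_gops=3, gop_structure="IPPP"):
    """List the rate-control actions in the order used by the worked example."""
    steps = []
    for g in range(1, num_gops + 1):
        steps.append(f"GOP {g}: determine GOP target bits")
        p_index = 0  # running count of P-frames seen in this GOP
        for frame_type in gop_structure:
            if frame_type == "P":
                p_index += 1
            label = "I" if frame_type == "I" else f"P{p_index}"
            if frame_type == "I" or p_index == 1:
                # First I-frame and first P-frame use the preset QP.
                steps.append(f"GOP {g}: encode {label} with preset QP")
            else:
                # Later P-frames go through the three rate-control steps.
                steps.append(f"GOP {g}: determine scene switching factor of P{p_index - 1}")
                steps.append(f"GOP {g}: adjust remaining target bits before {label}")
                steps.append(f"GOP {g}: determine target bits for {label} and encode it")
    return steps
```

Calling `encoding_schedule()` with the defaults yields nine actions per GOP: one GOP-level target, two preset-QP frames, and three actions for each of the second and third P-frames.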
[0131] At the same bitrate, the peak signal-to-noise ratio (PSNR) is relatively high when the encoding complexity is low and the motion is smooth, and relatively low when the encoding complexity is high and the motion is intense. The PSNR will change abruptly when the scene changes. In this embodiment of the invention, based on the peak signal-to-noise ratio (PSNR) of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the average PSNR of all P-frames encoded in the video sequence, and the PSNR of each encoded frame preceding the frame to be encoded in the i-th image group, the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group can be accurately determined. This not only effectively detects scene switching but also determines whether the switch is from a relatively simple scene to a complex scene or vice versa. Based on the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded, the remaining target bits in the i-th image group before encoding the frame to be encoded are dynamically adjusted. For scenes progressing from simple to complex, the remaining target bits are increased; for scenes progressing from complex to simple, the remaining target bits are decreased. This results in a higher degree of scene fit, especially for video sequences with intense motion, frequent scene switching, and complex textures, exhibiting better adaptability. The relative complexity of the preceding encoded frame adjacent to the frame to be encoded can accurately reflect the complexity of the frame to be encoded. In determining the target number of bits for the frame to be encoded in the i-th image group, the relative complexity of the preceding encoded frame adjacent to the frame to be encoded is further referenced. 
Thus, the target number of bits determined for the frame to be encoded is more closely aligned with the complexity of the frame to be encoded and the scene switching degree of the target encoded frame, resulting in smoother image quality. In particular, it has a better encoding effect for videos with frequent scene switching and complex textures. Moreover, compared with the complex linear regression analysis in the prior art, the embodiments of the present invention only involve simple addition, subtraction, multiplication, and division operations, which greatly reduces the encoding complexity. Encoding the frame to be encoded with the target number of bits can not only achieve high fidelity but also effectively improve the utilization of network bandwidth resources and has good real-time performance.
[0132] On the one hand, the embodiments of this invention have good application scenarios for cloud service providers' video cloud services, especially suitable for scenarios with less bandwidth constraints but high quality requirements, and are suitable for on-demand, recorded, or video storage systems that are not sensitive to latency. On the other hand, the embodiments of this invention are also particularly suitable for videos with complex motion scenes, such as traffic videos, sports competition videos, and movie videos, and can maintain high clarity and relatively stable output quality.
[0133] It should be noted that, for the sake of simplicity, the method embodiments are all described as a series of actions. However, those skilled in the art should understand that the embodiments of the present invention are not limited to the described order of actions, because according to the embodiments of the present invention, some steps can be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily essential to the embodiments of the present invention.
[0134] Figure 3 shows a structural block diagram of an AVS-based variable bit rate control device according to an embodiment of the present invention. Referring to Figure 3, this embodiment of the invention also provides an AVS-based variable bit rate control device, the device comprising:
[0135] The target bit count determination module 301 for image groups is used to determine the target bit count of the i-th image group in the video sequence; where i is the image group number of the image group to be encoded in the video sequence.
[0136] The scene switching factor determination module 302 is used to determine the scene switching factor of the preceding encoded frame in the i-th image group when the frame to be encoded is a frame other than the first I-frame and the first P-frame in the i-th image group, based on the peak signal-to-noise ratio of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the average peak signal-to-noise ratio of all encoded P-frames in the video sequence, and the peak signal-to-noise ratio of each encoded frame before the frame to be encoded in the i-th image group.
[0137] The remaining target bit number adjustment module 303 is used to adjust the remaining target bit number in the i-th image group before encoding the frame to be encoded based on the scene switching factor of the previous encoded frame adjacent to the frame to be encoded in the i-th image group, the target bit number of the i-th image group, the encoded bit number of the previous encoded frame adjacent to the frame to be encoded, the encoded bit number in the i-th image group, the actual bit rate and frame rate of the encoded bitstream in the i-th image group;
[0138] The bit count determination module 304 is used to determine the target bit count of the frame to be encoded in the i-th image group based on the relative complexity of the previous encoded frame adjacent to the frame to be encoded in the i-th image group and the target bit count remaining before encoding the frame to be encoded in the i-th image group.
[0139] Figure 4 shows a structural block diagram of another AVS-based variable bit rate control device according to an embodiment of the present invention. Optionally, the scene switching factor determination module 302 may include:
[0140] The switching factor determination unit 3021 is used to determine, using the following formula, the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group:
[0141]
[0142] Here, a is the first preset coefficient and b is the second preset coefficient. The remaining terms are: the peak signal-to-noise ratio (PSNR) of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; the average PSNR of all P-frames already encoded in the video sequence; and the PSNR of the d-th encoded frame preceding the frame to be encoded in the i-th image group. j is the frame number, within the i-th image group, of the preceding encoded frame adjacent to the frame to be encoded.
[0143] Optionally, the remaining target bit number adjustment module 303 may include:
[0144] The remaining target bit number adjustment unit 3031 is used to adjust the remaining target bit number in the i-th image group before encoding the frame to be encoded using the following formula:
[0145]
[0146] In the formula, the terms are: the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; a quantity equal to the difference between the target number of bits of the i-th image group and the number of bits already encoded in the i-th image group; the number of encoded bits of the preceding encoded frame adjacent to the frame to be encoded; and the actual bitrate of the bitstream already encoded in the video sequence. c is a third preset coefficient and d is a fourth preset coefficient; TH1 is the first preset threshold and TH2 is the second preset threshold.
[0147] Optionally, the bit count determination module 304 of the frame to be encoded may include:
[0148] The bit count determination unit 3041 for the frame to be encoded is used to determine the target bit count of the frame to be encoded in the i-th image group using the following formula:
[0149]
[0150] In the formula, the terms are: the number of target bits remaining in the i-th image group before encoding the frame to be encoded; the number of uncoded P-frames remaining in the i-th image group; and the frame layer target bitrate adjustment factor corresponding to the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
[0151] Optionally, the bit count determination unit 3041 of the frame to be encoded includes: a frame layer target bit rate adjustment factor determination subunit, which is used to determine the frame layer target bit rate adjustment factor corresponding to the previous encoded frame adjacent to the frame to be encoded in the i-th image group using the following formula.
[0152]
[0153] In the formula, the four constant values are the first preset value, the second preset value, the third preset value, and the fourth preset value; p and q are both constants; and the remaining term is the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group;
[0154]
[0155] In the formula, the terms are the relative complexity of the second preceding encoded frame adjacent to the frame to be encoded in the i-th image group, and the relative complexity of each encoded P-frame in the i-th image group.
[0156] Optionally, the target bit count determination module 301 for the image group may include:
[0157] The first target bit number determination unit 3011 for the image group is used to determine the target bit number of the first image group in the video sequence as a preset bit number when i=1.
[0158] The second target bit count determination unit 3012 for the image group is used to determine the target bit count of the i-th image group in the video sequence using the following formula when i > 1:
[0159]
[0160] In the formula, the terms are: the target average bitrate for all image groups in the video sequence; the total number of image groups in the video sequence; the adjustment factor of the i-th image group; the complexity factor of the i-th image group; and the bitstream balance factor.
[0161] Optionally, the second target bit count determination unit 3012 for the image group may include a complexity factor determination subunit for the image group, which is used to determine the complexity factor of the i-th image group using the following formula:
[0162]
[0163] In the formula, one coefficient is the fifth preset coefficient; the other terms are the predicted average complexity of all P-frames in the i-th image group, the average complexity of all P-frames in the (i-1)-th image group, and the average complexity of all P-frames already encoded in the video sequence;
[0164] The bitstream balance factor is calculated using the following formula:
[0165]
[0166] Here, med(·) is the median function; the terms are the actual bitrate of the bitstream already encoded in the video sequence and the target average bitrate for all image groups in the video sequence. The formula ensures that the deviation between the actual average bitrate and the target average bitrate does not exceed 10%.
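A median of three values is a common way to clamp a quantity to a range, which matches the stated 10% bound. The Python sketch below assumes the balance factor is the median of 0.9, the ratio of target average bitrate to actual bitrate, and 1.1. The arguments of med(·) are not reproduced in this text, so the ratio direction, the bounds, and the names used here are assumptions.

```python
from statistics import median


def bitstream_balance_factor(target_avg_bitrate, actual_bitrate, tolerance=0.10):
    """Clamp target/actual to [1 - tolerance, 1 + tolerance] via a median of three."""
    if actual_bitrate <= 0:
        raise ValueError("actual_bitrate must be positive")
    ratio = target_avg_bitrate / actual_bitrate
    # The median of (low, ratio, high) equals ratio clamped to [low, high].
    return median([1.0 - tolerance, ratio, 1.0 + tolerance])
```

When the actual bitrate already matches the target, the factor is 1 and the next image group's budget is not corrected; large overshoots or undershoots are limited to a 10% correction in either direction.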
[0167] Optionally, the device further includes:
[0168] The quantization parameter determination module is used to set the quantization parameters of the first I-frame in each image group of the video sequence and the quantization parameters of the first P-frame in the first image group of the video sequence to preset quantization parameters.
[0169] The AVS-based variable bit rate control device provided in this embodiment of the invention can implement the steps of the method embodiments of Figure 1 and Figure 2 and achieve the corresponding beneficial effects. To avoid repetition, they are not described again here.
[0170] This invention also provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and executable on the processor. When the computer program is executed by the processor, it implements the steps of any of the aforementioned AVS-based variable bit rate control methods.
[0171] The processors mentioned above can be general-purpose processors, including central processing units (CPUs), network processors (NPs), etc.; they can also be digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
[0172] The memory may include random access memory (RAM) or non-volatile memory, such as at least one disk storage device. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
[0173] This invention also provides a readable storage medium storing a computer program, which, when executed by a processor, implements the steps of any of the aforementioned AVS-based variable bit rate control methods.
[0174] It should be noted that, for the sake of simplicity, the method embodiments are all described as a series of actions. However, those skilled in the art should understand that the embodiments of this application are not limited to the described order of actions, because according to the embodiments of this application, some steps can be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily essential to the embodiments of this application.
[0175] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element.
[0176] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of the present invention.
[0177] The embodiments of the present invention have been described above with reference to the accompanying drawings. However, the present invention is not limited to the specific embodiments described above. The specific embodiments described above are merely illustrative and not restrictive. Those skilled in the art can make many other forms under the guidance of the present invention without departing from the spirit and scope of the claims. All of these forms are within the protection scope of the present invention.
Claims
1. An AVS-based variable bit rate control method, characterized in that, The method includes: Determine the target number of bits for the i-th image group in the video sequence; where i is the image group number of the image group to be encoded in the video sequence. When the frame to be encoded is any frame other than the first I-frame and the first P-frame in the i-th image group, the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group is determined based on the peak signal-to-noise ratio (PSNR) of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the average PSNR of all encoded P-frames in the video sequence, and the PSNR of each encoded frame preceding the frame to be encoded in the i-th image group. A larger scene switching factor indicates a switch from a simple scene to a complex scene, while a smaller scene switching factor indicates a switch from a complex scene to a simple scene. Based on the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the target number of bits in the i-th image group, the encoded number of bits of the preceding encoded frame adjacent to the frame to be encoded, the number of encoded bits in the i-th image group, the actual bitrate and frame rate of the encoded bitstream in the i-th image group, the remaining target number of bits in the i-th image group before encoding the frame to be encoded is adjusted; for scenes progressing from simple to complex, the remaining target number of bits is increased, and for scenes progressing from complex to simple, the remaining target number of bits is decreased. 
Based on the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, and the number of target bits remaining before encoding the frame to be encoded in the i-th image group, the target number of bits of the frame to be encoded in the i-th image group is determined; the relative complexity of the preceding encoded frame adjacent to the frame to be encoded reflects the complexity of the frame to be encoded.
2. The AVS-based variable bit rate control method of claim 1, characterized in that the step of determining the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, based on the peak signal-to-noise ratio (PSNR) of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the average PSNR of all encoded P-frames in the video sequence, and the PSNR of each encoded frame preceding the frame to be encoded in the i-th image group, includes: determining the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group using the following formula: wherein a is a first preset coefficient and b is a second preset coefficient; the remaining terms are the PSNR of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the average PSNR of all P-frames already encoded in the video sequence, and the PSNR of the (d-1)th encoded frame before the frame to be encoded in the i-th image group; and j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
3. The AVS-based variable bit rate control method according to claim 1, characterized in that adjusting the number of target bits remaining in the i-th image group before encoding the frame to be encoded, based on the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the target number of bits of the i-th image group, the number of encoded bits of the preceding encoded frame adjacent to the frame to be encoded, the number of bits already encoded in the i-th image group, and the actual bit rate and frame rate of the encoded bitstream in the i-th image group, comprises: adjusting the remaining target number of bits using a formula whose terms are defined as follows: the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; the unadjusted remaining number of bits, which equals the target number of bits of the i-th image group minus the number of bits already encoded in the i-th image group; the number of encoded bits of the preceding encoded frame adjacent to the frame to be encoded; the actual bit rate of the bitstream already encoded in the video sequence; c, a third preset coefficient; d, a fourth preset coefficient; a first preset threshold; and a second preset threshold; where j is the frame number, within the i-th image group, of the preceding encoded frame adjacent to the frame to be encoded.
4. The AVS-based variable bit rate control method according to claim 1, characterized in that determining the target number of bits of the frame to be encoded in the i-th image group, based on the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group and the number of target bits remaining in the i-th image group before encoding the frame to be encoded, comprises: determining the target number of bits of the frame to be encoded using a formula whose terms are defined as follows: the number of target bits remaining in the i-th image group before encoding the frame to be encoded; the number of remaining unencoded P-frames in the i-th image group; and the frame-layer target bit rate adjustment factor corresponding to the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; where j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
5. The AVS-based variable bit rate control method according to claim 4, characterized in that the frame-layer target bit rate adjustment factor corresponding to the preceding encoded frame adjacent to the frame to be encoded in the i-th image group is determined using a formula whose terms are defined as follows: a first preset value; a second preset value; a third preset value; a fourth preset value; q and one other symbol, both preset constants; the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; and the relative complexity of the second preceding encoded frame adjacent to the frame to be encoded in the i-th image group; where j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
6. The AVS-based variable bit rate control method according to any one of claims 1-5, characterized in that determining the target number of bits of the i-th image group in the video sequence comprises: when i = 1, determining the target number of bits of the first image group in the video sequence as a preset number of bits; and when i > 1, determining the target number of bits of the i-th image group in the video sequence using a formula whose terms are defined as follows: the target average bit rate of all image groups in the video sequence; the total number of image groups in the video sequence; the adjustment factor of the i-th image group; the complexity factor of the i-th image group; and the bitstream balancing factor.
7. The AVS-based variable bit rate control method according to claim 6, characterized in that the complexity factor of the i-th image group is determined using a formula whose terms are defined as follows: a fifth preset coefficient; the predicted average complexity of all P-frames in the i-th image group; the average complexity of all P-frames in the (i-1)th image group; and the average complexity of all P-frames that have been encoded in the video sequence. The bitstream balancing factor is calculated using a formula in which med(.) is a median function and the remaining terms are the actual bit rate of the encoded bitstream of the video sequence and the target average bit rate of all image groups in the video sequence.
8. The AVS-based variable bit rate control method according to any one of claims 1-5, characterized in that the method further comprises: setting the quantization parameter of the first I-frame in each image group of the video sequence and the quantization parameter of the first P-frame in the first image group of the video sequence to preset quantization parameters.
9. An AVS-based variable bit rate control apparatus, characterized in that the apparatus comprises:
an image group target bit count determination module, configured to determine the target number of bits of the i-th image group in a video sequence, where i is the sequence number of the image group to be encoded in the video sequence;
a scene switching factor determination module, configured to determine, when the frame to be encoded is any frame other than the first I-frame and the first P-frame in the i-th image group, the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, based on the peak signal-to-noise ratio (PSNR) of that preceding encoded frame, the average PSNR of all encoded P-frames in the video sequence, and the PSNR of each encoded frame preceding the frame to be encoded in the i-th image group, wherein a larger scene switching factor indicates a switch from a simple scene to a complex scene and a smaller scene switching factor indicates a switch from a complex scene to a simple scene;
a remaining target bit count adjustment module, configured to adjust the number of target bits remaining in the i-th image group before encoding the frame to be encoded, based on the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group, the target number of bits of the i-th image group, the number of encoded bits of the preceding encoded frame adjacent to the frame to be encoded, the number of bits already encoded in the i-th image group, and the actual bit rate and frame rate of the encoded bitstream in the i-th image group, wherein the remaining target number of bits is increased for a switch from a simple scene to a complex scene and decreased for a switch from a complex scene to a simple scene; and
a frame bit count determination module, configured to determine the target number of bits of the frame to be encoded in the i-th image group, based on the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group and the number of target bits remaining in the i-th image group before encoding the frame to be encoded, wherein the relative complexity of the preceding encoded frame adjacent to the frame to be encoded reflects the complexity of the frame to be encoded.
10. The AVS-based variable bit rate control apparatus according to claim 9, characterized in that the scene switching factor determination module comprises: a switching factor determination unit, configured to determine the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group using a formula whose terms are defined as follows: a is a first preset coefficient; b is a second preset coefficient; the PSNR terms are the PSNR of the (d-1)th encoded frame before the frame to be encoded in the i-th image group and the average PSNR of all P-frames that have been encoded in the video sequence; and j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
11. The AVS-based variable bit rate control apparatus according to claim 9, characterized in that the remaining target bit count adjustment module comprises: a remaining target bit count adjustment unit, configured to adjust the number of target bits remaining in the i-th image group before encoding the frame to be encoded using a formula whose terms are defined as follows: the scene switching factor of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; the unadjusted remaining number of bits, which equals the target number of bits of the i-th image group minus the number of bits already encoded in the i-th image group; the number of encoded bits of the preceding encoded frame adjacent to the frame to be encoded; the actual bit rate of the bitstream already encoded in the video sequence; c, a third preset coefficient; d, a fourth preset coefficient; a first preset threshold; and a second preset threshold; where j is the frame number, within the i-th image group, of the preceding encoded frame adjacent to the frame to be encoded.
12. The AVS-based variable bit rate control apparatus according to claim 9, characterized in that the frame bit count determination module comprises: a bit count determination unit, configured to determine the target number of bits of the frame to be encoded in the i-th image group using a formula whose terms are defined as follows: the number of target bits remaining in the i-th image group before encoding the frame to be encoded; the number of remaining unencoded P-frames in the i-th image group; and the frame-layer target bit rate adjustment factor corresponding to the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; where j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
13. The AVS-based variable bit rate control apparatus according to claim 12, characterized in that the bit count determination unit comprises a frame-layer target bit rate adjustment factor determination subunit, configured to determine the frame-layer target bit rate adjustment factor corresponding to the preceding encoded frame adjacent to the frame to be encoded in the i-th image group using a formula whose terms are defined as follows: a first preset value; a second preset value; a third preset value; a fourth preset value; q and one other symbol, both preset constants; the relative complexity of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group; the relative complexity of the second preceding encoded frame adjacent to the frame to be encoded in the i-th image group; the relative complexity of the i-th preceding encoded frame in the i-th image group; and the relative complexity of the i-th preceding encoded P-frame in the i-th image group; where j is the frame number of the preceding encoded frame adjacent to the frame to be encoded in the i-th image group.
14. The AVS-based variable bit rate control apparatus according to any one of claims 9-13, characterized in that the image group target bit count determination module comprises: a first image group target bit count determination unit, configured to determine the target number of bits of the first image group in the video sequence as a preset number of bits when i = 1; and a second image group target bit count determination unit, configured to determine, when i > 1, the target number of bits of the i-th image group in the video sequence using a formula whose terms are defined as follows: the target average bit rate of all image groups in the video sequence; the total number of image groups in the video sequence; the adjustment factor of the i-th image group; the complexity factor of the i-th image group; and the bitstream balancing factor.
15. The AVS-based variable bit rate control apparatus according to claim 14, characterized in that the second image group target bit count determination unit comprises: an image group complexity factor determination subunit, configured to determine the complexity factor of the i-th image group using a formula whose terms are defined as follows: a fifth preset coefficient; the predicted average complexity of all P-frames in the i-th image group; the average complexity of all P-frames in the (i-1)th image group; and the average complexity of all P-frames that have been encoded in the video sequence. The bitstream balancing factor is calculated using a formula in which med(.) is a median function and the remaining terms are the actual bit rate of the encoded bitstream of the video sequence and the target average bit rate of all image groups in the video sequence.
16. The AVS-based variable bit rate control apparatus according to any one of claims 9-13, characterized in that the apparatus further comprises: a quantization parameter determination module, configured to set the quantization parameter of the first I-frame in each image group of the video sequence and the quantization parameter of the first P-frame in the first image group of the video sequence to preset quantization parameters.
17. An electronic device, characterized in that the electronic device comprises a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the AVS-based variable bit rate control method according to any one of claims 1-8.
18. A readable storage medium, characterized in that a computer program is stored on the readable storage medium, wherein the computer program, when executed by a processor, implements the steps of the AVS-based variable bit rate control method according to any one of claims 1-8.
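As an illustration only, the frame handling described in claims 1 and 8 and the median function named in claim 7 can be sketched in Python. The names `frame_mode`, `med3`, and `adjust_remaining_bits`, the threshold and gain parameters, and the multiplicative update are all hypothetical. The claimed formulas are not reproduced in this text, so the sketch takes from the claims only which frames use preset quantization parameters, that a median function is used, and the direction of the remaining-bit adjustment.

```python
from statistics import median


def frame_mode(gop_index, frame_type, is_first_of_type):
    """Classify a frame by how the claims say it is handled.

    gop_index is 1-based; frame_type is "I" or "P"; is_first_of_type is
    True for the first frame of that type within its image group.
    """
    if is_first_of_type and frame_type == "I":
        return "preset_qp"      # claim 8: first I-frame of every image group
    if is_first_of_type and frame_type == "P":
        if gop_index == 1:
            return "preset_qp"  # claim 8: first P-frame of the first image group
        return "unspecified"    # not covered by the claims as reproduced here
    return "adaptive"           # claim 1: scene-switch-driven bit allocation


def med3(low, value, high):
    """Median of three values; with low <= high this clamps value.

    Assumption: the arguments passed to med(.) are not given in the text.
    """
    return median([low, value, high])


def adjust_remaining_bits(remaining, factor, low_thr, high_thr, gain):
    """Hypothetical update rule; only its direction comes from claim 1."""
    if factor > high_thr:       # simple -> complex scene: grant more bits
        return remaining * (1 + gain)
    if factor < low_thr:        # complex -> simple scene: grant fewer bits
        return remaining * (1 - gain)
    return remaining            # no scene switch detected: leave unchanged
```

For example, `frame_mode(2, "P", False)` returns `"adaptive"`, and `med3(0.5, 1.7, 1.5)` returns `1.5`, showing how a median of three bounds a ratio such as actual bit rate over target bit rate, assuming that is how the balancing factor is limited.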