Inter-frame prediction method, encoding apparatus, decoding apparatus, and storage medium
By utilizing pixel information from multiple reference blocks in inter-frame prediction technology to adjust the prediction pixel range of the current block, the problem of large prediction deviation under complex scene changes is solved, thus improving the video encoding and decoding effect.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- ZTE CORP
- Filing Date
- 2025-10-28
- Publication Date
- 2026-06-25
AI Technical Summary
Existing inter-frame prediction techniques often result in significant discrepancies between the predicted and actual frames in complex scene changes, negatively impacting video encoding and decoding performance.
By performing inter-frame prediction operations based on multiple reference blocks corresponding to the current block, the predicted pixel information of the current block is determined. The pixel range of the current block is determined by the pixel information of multiple reference blocks, and the predicted pixel information is adjusted to ensure that it is within a reasonable range.
It effectively reduces the deviation between predicted pixel information and actual pixel information, ensuring the video encoding and decoding effect.
Abstract
Description
Inter-frame prediction method, encoding device, decoding device and storage medium
[0001] This disclosure claims priority to Chinese patent application No. 202411866342.1, filed on December 16, 2024, the entire contents of which are incorporated herein by reference.
Technical Field
[0002] This disclosure relates to the field of video processing, and in particular to an inter-frame prediction method, an encoding device, a decoding device, and a storage medium.
Background Technology
[0003] Audio and video coding are core components of modern multimedia communication systems, widely used in streaming media services, video conferencing, online education, and other fields. To effectively compress video data, reduce transmission bandwidth requirements, and save storage space, inter-frame prediction technology is widely employed. Inter-frame prediction predicts the content of intermediate frames by utilizing the correlation between reference frames in a video sequence, thereby reducing redundant information.
Summary of the Invention
[0004] On the one hand, an inter-frame prediction method is provided, including: performing inter-frame prediction operations based on multiple reference blocks corresponding to the current block to determine the predicted pixel information of the current block; determining the pixel range of the current block through the pixel information of the multiple reference blocks, and adjusting the predicted pixel information of the current block based on the pixel range to obtain the adjusted pixel information of the current block.
[0005] On the other hand, an inter-frame prediction method is provided, comprising: receiving a bitstream; performing inter-frame prediction operations based on multiple reference blocks corresponding to the current block in the bitstream to determine the predicted pixel information of the current block; determining the pixel range of the current block through the pixel information of the multiple reference blocks, and adjusting the predicted pixel information of the current block based on the pixel range to obtain the adjusted pixel information of the current block.
[0006] In another aspect, an inter-frame prediction apparatus is provided, comprising: a processing unit; the processing unit is configured to perform inter-frame prediction operations based on multiple reference blocks corresponding to the current block to determine the predicted pixel information of the current block; the processing unit is further configured to determine the pixel range of the current block through the pixel information of the multiple reference blocks, and adjust the predicted pixel information of the current block based on the pixel range to obtain the adjusted pixel information of the current block.
[0007] In another aspect, an encoding device is provided, comprising: a memory and a processor; the memory and the processor are coupled; the memory is used to store a computer program; and the processor implements the method described above when executing the computer program.
[0008] In another aspect, a decoding device is provided, comprising: a memory and a processor; the memory and the processor are coupled; the memory is used to store a computer program; and the processor implements the method described above when executing the computer program.
[0009] In another aspect, a computer-readable storage medium is provided, on which computer program instructions are stored, which, when executed by a processor, implement the method described above.
[0010] In another aspect, a computer program product is provided, which includes computer program instructions that, when executed by a processor, implement the method described above.
Attached Figure Description
[0011] To more clearly illustrate the technical solutions in this disclosure, the accompanying drawings used in some embodiments of this disclosure will be briefly introduced below. Obviously, the accompanying drawings described below illustrate only some embodiments of this disclosure. For those skilled in the art, other drawings can be obtained based on these drawings.
[0012] Figure 1 is an architecture diagram of an audio and video encoding and decoding system according to some embodiments;
[0013] Figure 2 is a framework diagram of an audio and video encoding and decoding technology according to some embodiments;
[0014] Figure 3 is a flowchart of an audio / video decoding method according to some embodiments;
[0015] Figure 4 is a flowchart of an inter-frame prediction method according to some embodiments;
[0016] Figure 5 is a flowchart of another inter-frame prediction method according to some embodiments;
[0017] Figure 6 is a flowchart of another inter-frame prediction method according to some embodiments;
[0018] Figure 7 is a flowchart of another inter-frame prediction method according to some embodiments;
[0019] Figure 8 is a scene diagram of a bidirectional optical flow according to some embodiments;
[0020] Figure 9 is a flowchart of a bidirectional prediction correction according to some embodiments;
[0021] Figure 10 is a flowchart of another inter-frame prediction method according to some embodiments;
[0022] Figure 11 is a flowchart of another inter-frame prediction method according to some embodiments;
[0023] Figure 12 is a flowchart of another inter-frame prediction method according to some embodiments;
[0024] Figure 13 is a block diagram of an encoding device according to some embodiments;
[0025] Figure 14 is a block diagram of a decoding device according to some embodiments;
[0026] Figure 15 is a block diagram of an inter-frame prediction apparatus according to some embodiments.
Detailed Implementation
[0027] The technical solutions of this disclosure will now be clearly and completely described with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this disclosure, and not all embodiments. Based on the embodiments of this disclosure, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this disclosure.
[0028] It should be noted that, in this disclosure, the words "exemplarily" or "for example" are used to indicate examples, illustrations, or explanations. Any embodiment or design described as "exemplarily" or "for example" in this disclosure should not be construed as being more preferred or advantageous than other embodiments or designs. Specifically, the use of the words "exemplarily" or "for example" is intended to present the relevant concepts in a specific manner.
[0029] Hereinafter, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature.
[0030] In the description of this disclosure, unless otherwise stated, " / " means "or," for example, A / B can mean A or B. "And / or" in this document is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A alone, A and B simultaneously, and B alone. Furthermore, "at least one" means one or more, and "more than one" means two or more.
[0031] Audio and video coding are core components of modern multimedia communication systems, widely used in streaming media services, video conferencing, online education, and other fields. To effectively compress video data, reduce transmission bandwidth requirements, and save storage space, inter-frame prediction technology is widely employed. Inter-frame prediction predicts the content of intermediate frames by utilizing the correlation between reference frames in a video sequence, thereby reducing redundant information.
[0032] For example, inter-frame prediction techniques include bi-directional gradient correction (BGC), template-based bi-directional gradient correction (BGC-TM), bi-directional optical flow (BIO), and bi-prediction with CU-level weight (BCW), involving different prediction methods such as motion estimation compensation, bi-directional prediction, and pixel accuracy prediction.
[0033] However, in some scenarios, due to the complexity of image changes and insufficient coordination between different technologies, the images predicted by inter-frame prediction technology deviate too much from the actual images, affecting the video encoding and decoding effect.
[0034] To improve video encoding and decoding performance, one approach combines the different prediction methods mentioned above to perform inter-frame prediction. Taking the combination of BGC and BIO as an example, BGC is a technique that corrects bidirectional prediction values using the difference between bidirectional reference blocks. Here, the original bidirectional prediction value is the average of the unidirectional prediction values in both directions, and the magnitude of the correction depends on the difference between the bidirectional reference blocks. However, if the BIO condition is met, a BIO processing step needs to be added before BGC correction, that is, replacing the original bidirectional prediction value (average) with the prediction value processed by BIO technology.
[0035] In this case, since the predicted value needs to be processed by BIO technology before BGC correction, the bidirectional predicted value (mean) before correction will be affected by the BIO processing. Therefore, the bidirectional predicted value after BGC correction may not be within a reasonable range of pixel values, resulting in an excessive deviation between the predicted pixel value and the actual pixel value.
[0036] Therefore, in the technical solution provided in this disclosure, the inter-frame prediction device can perform inter-frame prediction operations based on multiple reference blocks corresponding to the current block to determine the predicted pixel information of the current block. Furthermore, the inter-frame prediction device can also determine the pixel range of the current block using the pixel information of the multiple reference blocks. Since there is a strong correlation between the current block and its corresponding multiple reference blocks, the inter-frame prediction device can adjust the predicted pixel information of the current block using the aforementioned pixel range, thereby avoiding the problem of excessive deviation between the predicted pixel information and the actual pixel information of the current block, and ensuring the video encoding and decoding effect.
[0037] The audio and video codecs involved in the embodiments of this disclosure include, but are not limited to, a series of standards in the Audio Video Coding Standard (AVS) (e.g., first-generation AVS standards (AVS1, AVS+), second-generation AVS standards (AVS2), third-generation AVS standards (AVS3), and fourth-generation AVS standards (AVS4), etc.), H.267, the Versatile Video Coding (VVC) standard (also known as H.266), the High Efficiency Video Coding (HEVC) standard (also known as H.265), the Advanced Video Coding (AVC) standard (also known as H.264 or H.264 / AVC), and a series of standards in the Moving Picture Experts Group (MPEG) (e.g., first-generation MPEG standards (MPEG-1), second-generation MPEG standards (MPEG-2), third-generation MPEG standards (MPEG-3), and fourth-generation MPEG standards (MPEG-4), etc.). This disclosure does not limit the audio and video codec-related specifications or standards involved.
[0038] For example, Figure 1 shows an architecture diagram of an audio / video encoding / decoding system provided in an embodiment of this disclosure. The audio / video encoding / decoding system includes an encoding device 10 and a decoding device 11. There can be one or more encoding devices 10 and one or more decoding devices 11; the number is not limited. It should be understood that, in this example, the inter-frame prediction device provided in this disclosure can be implemented in an audio / video encoding / decoding device with encoding and / or decoding capabilities; that is, the inter-frame prediction device can be implemented in the encoding device 10 of the audio / video encoding / decoding system, or it can be implemented in the decoding device 11 of the audio / video encoding / decoding system.
[0039] Here, the encoding device 10 includes an encoder 101 and an output interface 102, and the decoding device 11 includes a decoder 111 and an input interface 112. The output interface 102 of the encoding device 10 is connected to the input interface 112 of the decoding device 11.
[0040] Encoding device 10 encodes the original audio and video data using encoder 101 to generate a corresponding bitstream, and sends the bitstream to decoding device 11 through output interface 102. Decoding device 11 receives the bitstream from encoding device 10 through input interface 112, and decodes the bitstream using decoder 111 to generate audio and video data. For example, decoder 111 can be a High-Performance Model (HPM) decoder.
[0041] For example, the encoding device 10 and decoding device 11 described above can be any of the following devices: mobile phone, tablet computer, laptop computer, PDA, mobile internet device (MID), wearable device (e.g., smartwatch, smart bracelet, pedometer, etc.), vehicle-mounted device (e.g., car, bicycle, electric vehicle, airplane, ship, train, high-speed rail, etc.), virtual reality (VR) device, augmented reality (AR) device, wireless terminal in industrial control, smart home device (e.g., refrigerator, television, air conditioner, electricity meter, etc.), smart robot, workshop equipment, wireless terminal in self-driving, wireless terminal in remote medical surgery, wireless terminal in smart grid, wireless terminal in transportation safety, wireless terminal in smart city, wireless terminal in smart home, or flying equipment (e.g., intelligent robot, hot air balloon, drone, airplane). The encoding device 10 and decoding device 11 can also be servers, computing / processing nodes, computing / processing entities, computing / processing units, or over-the-top (OTT) devices such as OTT servers and OTT terminals (e.g., set-top boxes). OTT refers to various services provided to users by third parties other than network operators based on the operator's network. Examples of OTT services include OTT voice communication services, OTT multimedia services, and OTT data processing services.
[0042] For example, as shown in Figure 2, this is a framework diagram of an audio / video encoding / decoding technology provided in an embodiment of this disclosure, namely a block-based hybrid video encoding / decoding framework. The original audio / video data (also known as YUV or YCbCr, where Y represents luminance, and UV and CbCr represent chrominance, and can also be collectively referred to as pixel values) can be compressed into video through several key modules: prediction, transformation, quantization, entropy coding, bitrate control, and post-processing. Here, video compression is mainly performed from the perspectives of temporal redundancy and spatial redundancy. The block-based hybrid video coding framework can be described as follows:
[0043] (1) First, divide the current block according to several partitioning types.
[0044] For example, it can be divided into coding tree units (CTUs).
[0045] (2) Make predictions based on the results of the division.
[0046] This mainly involves intra-frame prediction and inter-frame prediction, used to remove spatial and temporal redundancy, respectively. For example, mode selection can determine the applicable mode for the current block, including intra-frame prediction, inter-frame prediction, and joint intra / inter-frame prediction.
[0047] Here, intra-frame prediction includes intra-frame estimation, which predicts information of adjacent uncoded blocks using the already coded blocks within the current frame, reducing spatial redundancy. Inter-frame prediction uses forward or backward reference frames for prediction, reducing temporal redundancy. Inter-frame prediction involves motion estimation, which determines the motion vectors between the current frame and the reference frame to facilitate inter-frame prediction.
[0048] (3) Transformation, scaling and quantization.
[0049] The difference between the original image and the predicted image obtained from the prediction is computed to obtain the residual. The residual is then transformed and quantized to further reduce redundancy, and finally encoded into a binary stream using entropy coding. Here, the quantized coefficients undergo inverse transform and scaling, chroma scaling, etc., to recover the residual signal. Before this, luminance mapping can be used to further improve compression efficiency.
[0050] (4) Filtering.
[0051] Finally, post-processing operations such as deblocking filter (DBF or DBK), sample adaptive offset (SAO), and adaptive loop filter (ALF) are used to eliminate distortion problems such as blocking and ringing. Intra-frame prediction data can be generated through filter control analysis, and filter control data can be generated by combining inverse luma mapping and loop filtering. The video signal can be output by decoding the image buffer.
[0052] Furthermore, rate control indicators adjust encoding parameters to adapt to target bit rate and quality requirements. The rate control data, quantization coefficients, intra-frame prediction data, filter control data, motion data, and other information involved in the above operations can be encoded in the header information of the bitstream using an encoding algorithm (e.g., context-based adaptive binary arithmetic coding (CABAC)) to meet the header information format requirements defined in the corresponding encoding specifications, ultimately outputting the bitstream.
[0053] For example, Figure 3 shows a flowchart of audio and video decoding provided in an embodiment of this disclosure, which is applicable to the video encoding and decoding framework of current video coding standards. Here, the decoding process of decoding device 11 may include the following steps (1)-(4):
[0054] Step (1): Decode the encoded binary stream.
[0055] For example, binary code streams can be decoded using CABAC.
[0056] Step (2): Perform inverse quantization and inverse transformation on the decoded result data.
[0057] Step (3): Based on the prediction results of the mode selection (such as motion compensation and intra-frame prediction), add the prediction results and the results after inverse quantization and inverse transformation to obtain the reconstructed image.
[0058] Step (4): Use post-processing operations such as DBF, SAO, and ALF to post-process the reconstructed image and store the processed image in the decoded image buffer to output the video signal.
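As a rough illustration of how the residual path on the encoding side and step (3) of the decoding process fit together, the following Python sketch quantizes a residual and then reconstructs the block from the prediction. It is a simplified sketch under stated assumptions: the transform stage is omitted, a uniform quantizer with step qstep is assumed, and the names quantize, encode_residual, and reconstruct are hypothetical rather than taken from any codec specification.

```python
from typing import List

Block = List[List[int]]  # rows of integer sample values


def quantize(value: int, qstep: int) -> int:
    """Uniform quantizer, rounding to nearest and symmetric around zero (assumed)."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + qstep // 2) // qstep)


def encode_residual(original: Block, prediction: Block, qstep: int) -> Block:
    """Encoder side: residual = original - prediction, then quantize (transform omitted)."""
    return [[quantize(o - p, qstep) for o, p in zip(orow, prow)]
            for orow, prow in zip(original, prediction)]


def reconstruct(prediction: Block, levels: Block, qstep: int) -> Block:
    """Decoder step (3): add the dequantized residual to the prediction."""
    return [[p + lvl * qstep for p, lvl in zip(prow, lrow)]
            for prow, lrow in zip(prediction, levels)]


original = [[52, 60], [61, 47]]
prediction = [[50, 50], [50, 50]]
levels = encode_residual(original, prediction, qstep=4)
recon = reconstruct(prediction, levels, qstep=4)
print(levels)  # [[1, 3], [3, -1]]
print(recon)   # [[54, 62], [62, 46]]
```

The reconstruction differs from the original by the quantization error only, which is why a more accurate prediction (a smaller residual) directly reduces the data that has to be coded.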
[0059] It should be noted that the various embodiments of this disclosure can be referenced or learned from each other. For example, the same or similar steps, method embodiments, system embodiments and device embodiments can be referenced from each other without limitation.
[0060] Taking the case where the inter-frame prediction device is an encoding device as an example, Figure 4 is a flowchart of an inter-frame prediction method provided by an embodiment of this disclosure. As shown in Figure 4, the method includes the following steps:
[0061] Step 401: Perform inter-frame prediction operation based on multiple reference blocks corresponding to the current block to determine the predicted pixel information of the current block.
[0062] For example, inter-frame prediction operations include bidirectional prediction, motion estimation compensation, and pixel precision prediction. Here, bidirectional prediction correction techniques include BGC, BGC-TM, BIO, and BCW. The inter-frame prediction operations can also be other inter-frame prediction algorithms not listed here.
[0063] In some embodiments, the predicted pixel information for the current block includes the predicted pixel values of target pixels in the current block. Target pixels are all or some of the pixels in the current block.
[0064] In this embodiment, the pixel information or pixel value can be pixel-related information of the corresponding block or pixel. For example, it can be luminance information (or luminance value), chrominance information (or chrominance value), etc., or it can be information obtained by function mapping of luminance information, chrominance information, etc.
[0065] In this disclosure, a "block" refers to a basic unit used for encoding or decoding in audio and video codecs. This unit can be a coding tree unit (CTU), a coding unit (CU), a prediction unit (PU), a transform unit (TU), a macroblock, a sub-block, etc. Different audio and video codec standards use different terminology; this disclosure uses only the term "block" for illustrative purposes. Here, a block corresponds to one or more pixels in a frame. For example, a CU can correspond to a set of pixels with sizes of 8x8, 16x16, 32x32, or 64x64.
[0066] During the encoding process, the encoding device determines relevant information (prediction information, motion information, etc.) of the reference block and the current block based on inter-frame prediction operations, and carries it in the data positions corresponding to the current block and the reference block in the bitstream. The inter-frame prediction operations in the encoding process and the inter-frame prediction operations in the decoding process are usually corresponding operations. For specific implementation methods, refer to relevant inter-frame prediction algorithms. The inter-frame prediction method provided in this disclosure is applicable to various inter-frame prediction algorithms, and no limitation is made on the inter-frame prediction algorithms involved.
[0067] Step 402: Determine the pixel range of the current block using the pixel information of multiple reference blocks, and adjust the predicted pixel information of the current block based on the pixel range to obtain the adjusted pixel information of the current block.
[0068] For example, in the encoding process, the encoding device can obtain pixel information of a reference block from a decoded block.
[0069] In some embodiments, the pixel information of the reference block includes the pixel value of the pixel corresponding to the target pixel in the reference block.
[0070] In some embodiments, the pixel range of the current block includes the pixel interval corresponding to the target pixel in the current block. The upper limit of the pixel interval is determined by the maximum pixel value of the pixels corresponding to the target pixel in multiple reference blocks, and the lower limit of the pixel interval is determined by the minimum pixel value of the pixels corresponding to the target pixel in multiple reference blocks.
[0071] For example, the pixel range of the current block can be composed of the pixel interval corresponding to the target pixel in the current block. In one example, the upper limit of the pixel interval can be the maximum pixel value of the pixel corresponding to the target pixel in multiple reference blocks, and the lower limit of the pixel interval can be the minimum pixel value of the pixel corresponding to the target pixel in multiple reference blocks.
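The per-pixel interval described above can be sketched in Python as follows. This is a minimal illustration, assuming each block is stored as a list of rows of integer sample values and that the co-located sample in each reference block is the pixel "corresponding to the target pixel"; the name pixel_intervals is hypothetical.

```python
from typing import List, Tuple

Block = List[List[int]]  # rows of integer sample values


def pixel_intervals(ref_blocks: List[Block]) -> Tuple[Block, Block]:
    """Return per-pixel (lower, upper) bounds over the co-located reference samples."""
    if not ref_blocks:
        raise ValueError("at least one reference block is required")
    height, width = len(ref_blocks[0]), len(ref_blocks[0][0])
    lower = [[min(b[y][x] for b in ref_blocks) for x in range(width)]
             for y in range(height)]
    upper = [[max(b[y][x] for b in ref_blocks) for x in range(width)]
             for y in range(height)]
    return lower, upper


ref0 = [[100, 120], [90, 60]]  # e.g. forward reference block
ref1 = [[110, 118], [95, 64]]  # e.g. backward reference block
lower, upper = pixel_intervals([ref0, ref1])
print(lower)  # [[100, 118], [90, 60]]
print(upper)  # [[110, 120], [95, 64]]
```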
[0072] In some embodiments, the adjusted pixel information of the current block includes the adjusted pixel value of the target pixel in the current block. The adjusted pixel value of the target pixel in the current block is located within the pixel interval corresponding to the target pixel.
[0073] Based on the above technical solution, the inter-frame prediction device can perform inter-frame prediction operations based on multiple reference blocks corresponding to the current block to determine the predicted pixel information of the current block. Furthermore, the inter-frame prediction device can also determine the pixel range of the current block using the pixel information of the multiple reference blocks. Since there is a strong correlation between the current block and its corresponding multiple reference blocks, the inter-frame prediction device can adjust the predicted pixel information of the current block using the aforementioned pixel range, thereby avoiding the problem of excessive deviation between the predicted pixel information and the actual pixel information of the current block, and ensuring the video encoding and decoding effect.
[0074] In some embodiments, the adjusted pixel value of the target pixel in the current block can be adjusted by range restriction or function mapping.
[0075] In one example, the adjusted pixel value of the target pixel in the current block satisfies at least one of the following 1-3:
[0076] 1. If the predicted pixel value of the target pixel is greater than the upper limit of the corresponding pixel interval (that is, the target pixel is the pixel whose pixel value needs to be adjusted), the adjusted pixel value of the target pixel is the upper limit of the corresponding pixel interval.
[0077] 2. If the predicted pixel value of the target pixel is less than the lower limit of the corresponding pixel interval (that is, the target pixel is the pixel whose pixel value needs to be adjusted), the adjusted pixel value of the target pixel is the lower limit of the corresponding pixel interval.
[0078] 3. When the predicted pixel value of the target pixel is less than or equal to the upper limit of the corresponding pixel interval and greater than or equal to the lower limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the predicted pixel value of the target pixel.
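Conditions 1-3 above amount to clamping each predicted value to its interval. The following Python sketch shows this under the assumption that the lower and upper limits are already available per pixel; the name clamp_prediction and the sample values are illustrative only.

```python
from typing import List

Block = List[List[int]]  # rows of integer sample values


def clamp_prediction(pred: Block, lower: Block, upper: Block) -> Block:
    """Limit each predicted value to [lower, upper] for the same pixel position."""
    return [[min(max(p, lo), hi) for p, lo, hi in zip(prow, lrow, urow)]
            for prow, lrow, urow in zip(pred, lower, upper)]


lower = [[100, 118], [90, 60]]
upper = [[110, 120], [95, 64]]
pred = [[115, 119], [80, 62]]
adjusted = clamp_prediction(pred, lower, upper)
print(adjusted)  # [[110, 119], [90, 62]]
```

In this example, 115 exceeds its upper limit and becomes 110 (condition 1), 80 is below its lower limit and becomes 90 (condition 2), and 119 and 62 are already inside their intervals and are left unchanged (condition 3).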
[0079] In another example, the adjusted pixel value of the target pixel can be mapped to the corresponding pixel interval using a function. This function can be determined by the pixel interval corresponding to the pixel. For example, this function can be a continuous function (such as a linear function, a non-linear function, etc.) or a discontinuous function (such as a piecewise function, a gradient function, etc.). In the example above, the condition satisfied by the adjusted pixel value of the target pixel can also be understood as a piecewise function.
[0080] Thus, when the predicted pixel information determined by the inter-frame prediction operation deviates little from the actual image information (that is, the predicted pixel value of the target pixel in the current block is within the corresponding pixel interval), the inter-frame prediction device does not need to adjust the predicted pixel information, which ensures the normal execution of the inter-frame prediction operation. However, when the predicted pixel information determined by the inter-frame prediction operation deviates too much from the actual image information (that is, the predicted pixel value of the target pixel in the current block is not within the corresponding pixel interval), the inter-frame prediction device can adjust and limit the predicted pixel information to achieve a smoother pixel transition and ensure the video encoding and decoding effect.
[0081] As one embodiment of this disclosure, and in conjunction with the embodiment shown in Figure 4, as shown in Figure 5, the above step 401 can be implemented by the following steps 501-502.
[0082] Step 501: For each of the multiple reference blocks, determine the unidirectional predicted pixel information of the current block based on the pixel information of the reference block.
[0083] Here, unidirectional predicted pixel information is used to characterize the pixel information of the current block predicted from the pixel information of the reference block. For example, in BGC, motion estimation determines the motion vector of the reference block relative to the current block, and motion compensation is performed based on the motion vector to obtain the unidirectional predicted pixel information of the current block.
[0084] Taking BGC in bidirectional prediction as an example, multiple reference blocks include forward reference blocks and backward reference blocks. The forward reference block is the block in the forward reference frame corresponding to the current block, and the backward reference block is the block in the backward reference frame corresponding to the current block. Prediction can be performed based on both the forward and backward directions, obtaining the unidirectional predicted pixel information pred0 and pred1 for the current block. pred0 is the unidirectional predicted pixel information for the current block determined based on the forward reference block, and pred1 is the unidirectional predicted pixel information for the current block determined based on the backward reference block.
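To make pred0 and pred1 concrete, the following Python sketch obtains a unidirectional prediction by copying a displaced block from a reference frame. It is a simplified sketch: integer-pixel motion vectors and clamping of coordinates at the frame boundary are assumptions (practical codecs also interpolate fractional positions), and the name motion_compensate and the sample frames are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of integer sample values


def motion_compensate(ref_frame: Frame, x: int, y: int, mv: Tuple[int, int],
                      width: int, height: int) -> Frame:
    """Copy the width x height block displaced by mv = (mvx, mvy) from ref_frame.

    (x, y) is the top-left corner of the current block. Coordinates that fall
    outside the frame are clamped to the nearest valid sample (assumed padding).
    """
    max_y, max_x = len(ref_frame) - 1, len(ref_frame[0]) - 1
    mvx, mvy = mv
    block = []
    for row in range(y + mvy, y + mvy + height):
        r = min(max(row, 0), max_y)
        block.append([ref_frame[r][min(max(col, 0), max_x)]
                      for col in range(x + mvx, x + mvx + width)])
    return block


forward_frame = [[r * 10 + c for c in range(4)] for r in range(4)]
backward_frame = [[100 + r * 10 + c for c in range(4)] for r in range(4)]

pred0 = motion_compensate(forward_frame, 1, 1, (1, 0), 2, 2)
pred1 = motion_compensate(backward_frame, 1, 1, (-1, 1), 2, 2)
print(pred0)  # [[12, 13], [22, 23]]
print(pred1)  # [[120, 121], [130, 131]]
```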
[0085] BGC is applicable in the following situations:
[0086] (1) The forward reference frame (Ref0) and the backward reference frame (Ref1) are valid frames.
[0087] A valid frame is a frame that meets specific standards or conditions and can be correctly encoded, transmitted, decoded, and displayed.
[0088] (2) The number of pixels in the current block is greater than or equal to 256.
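The two applicability conditions can be expressed as a simple predicate. In this sketch, how a frame is judged "valid" is left to the caller and passed in as a boolean; the function name bgc_applicable is hypothetical.

```python
def bgc_applicable(ref0_valid: bool, ref1_valid: bool, width: int, height: int) -> bool:
    """Check the two BGC conditions: both reference frames valid, and >= 256 pixels."""
    return ref0_valid and ref1_valid and width * height >= 256


print(bgc_applicable(True, True, 16, 16))   # True  (256 pixels)
print(bgc_applicable(True, True, 8, 16))    # False (128 pixels)
print(bgc_applicable(True, False, 32, 32))  # False (backward reference frame not valid)
```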
[0089] Step 502: Determine the predicted pixel information of the current block based on the unidirectional predicted pixel information of the current block corresponding to each of the multiple reference blocks.
[0090] Based on the above information, the uncorrected bidirectional prediction value predBI can be obtained from pred0 and pred1, for example as the mean of the unidirectional predicted pixel information pred0 and pred1. If no correction is needed, the predicted pixel information of the current block can be the mean information of the unidirectional predicted pixel information; if correction is needed, the predicted pixel information of the current block is the pixel information after correcting the mean information predBI of the unidirectional predicted pixel information.
[0091] For example, the predicted pixel information for the current block is determined by the following formula 1:
[0092] Here, pred represents the predicted pixel information for the current block, predBI represents the mean value of the unidirectional predicted pixel information of the multiple reference blocks, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction intensity, >> represents the right shift operator, BigFlag is used to indicate whether to correct the predicted pixel information, and BigIdx is used to indicate the correction direction.
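The symbol definitions given here are consistent with a piecewise correction of the following form. It is a reconstruction from those definitions and not the formula as filed; in particular, which value of BigIdx selects which correction direction is an assumption.

```latex
\[
\mathrm{pred} =
\begin{cases}
\mathrm{predBI}, & \mathrm{BigFlag} = 0 \\[4pt]
\mathrm{predBI} + \big((\mathrm{pred1} - \mathrm{pred0}) \gg k\big), & \mathrm{BigFlag} = 1,\ \mathrm{BigIdx} = 0 \\[4pt]
\mathrm{predBI} + \big((\mathrm{pred0} - \mathrm{pred1}) \gg k\big), & \mathrm{BigFlag} = 1,\ \mathrm{BigIdx} = 1
\end{cases}
\]
```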
[0093] BigFlag and BigIdx are two syntax elements that can be transmitted in the bitstream (e.g., corresponding to a normal inter-frame CU). The value of k can be set according to the actual situation. For example, it can be set to 3. When the current prediction mode is skip mode or direct mode, BigFlag, BigIdx and other motion information can be obtained from surrounding blocks or from the historical motion vector list, and do not need to be transmitted in the bitstream.
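The per-pixel correction controlled by BigFlag, BigIdx, and k can be sketched as follows. This is a hedged illustration, assuming the correction term is the difference between the two unidirectional predictions shifted right by k, and assuming that BigIdx = 0 adds (pred1 - pred0) >> k while BigIdx = 1 adds (pred0 - pred1) >> k. The sign convention and the function name bgc_correct are assumptions.

```python
def bgc_correct(pred_bi: int, pred0: int, pred1: int,
                big_flag: int, big_idx: int, k: int = 3) -> int:
    """Per-pixel gradient correction sketch.

    pred_bi : mean of the two unidirectional predictions (predBI)
    pred0   : prediction from the forward reference block
    pred1   : prediction from the backward reference block
    k       : correction intensity (3 is the example value in the text)
    """
    if big_flag == 0:
        return pred_bi  # no correction
    # Assumed sign convention: BigIdx selects which difference is added.
    diff = (pred1 - pred0) if big_idx == 0 else (pred0 - pred1)
    # Python's >> is an arithmetic shift, so negative differences round down.
    return pred_bi + (diff >> k)
```

With pred0 = 100, pred1 = 116, and predBI = 108, the sketch yields 108 without correction, 110 for BigIdx = 0, and 106 for BigIdx = 1.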
[0094] In some embodiments, the inter-frame prediction device can implement the BGC process described above using a fast decision-making method, thereby reducing coding complexity: when a designed condition is met, part of the rate-distortion optimization (RDO) process is skipped.
[0095] For example, the RDO cost (also called cost) when BigFlag = 0 is calculated and denoted as rdoCostWoBIG. The current best RDO cost is denoted as rdoCostBest. The RDO process at this time is as follows:
[0096] If rdoCostWoBIG is greater than 1.2*rdoCostBest, BigFlag = 0; otherwise, BigFlag = 1, and BigIdx is determined by RDO.
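The fast decision above can be sketched as follows. The function name and the returned tuple are illustrative assumptions; the 1.2 threshold is the value given in the text.

```python
from typing import Tuple


def fast_bgc_decision(rdo_cost_wo_big: float, rdo_cost_best: float,
                      threshold: float = 1.2) -> Tuple[int, bool]:
    """Return (BigFlag, needs_idx_rdo).

    If the cost without correction is already much worse than the current
    best cost, BigFlag is set to 0 and the RDO search over BigIdx is skipped.
    Otherwise BigFlag is 1 and BigIdx still has to be chosen by RDO.
    """
    if rdo_cost_wo_big > threshold * rdo_cost_best:
        return 0, False
    return 1, True
```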
[0097] As one embodiment of this disclosure, and in conjunction with the embodiment shown in Figure 5, as shown in Figure 6, the above step 502 can also be implemented by the following steps 601-602.
[0098] Step 601: Based on the unidirectional predicted pixel information of the current block corresponding to each of the multiple reference blocks, determine the cost and predicted pixel information of the current block in each of the multiple correction modes.
[0099] When determining the predicted pixel information of the current block, templates can be used to derive the BigFlag and BigIdx identifiers under different prediction modes (such as skip / direct mode and ultimate motion vector expression (UMVE) mode).
[0100] Here, for the current block, the template is obtained from the reconstructed regions adjacent to the top and left of the current block. For the reference block, the template is obtained from the reconstructed regions adjacent to the top and left of the reference block. For example, the reconstructed region can be the adjacent reconstructed row above the block and the adjacent reconstructed column to its left.
[0101] After determining the corresponding prediction mode, the cost and predicted pixel information of the current block under different correction modes can be further determined. For example, correction modes include no correction, forward correction, and backward correction. This correction mode can also be called a gradient correction mode. The cost may include template cost. Taking BGC-TM as an example, the derivation process is as follows:
[0102] The first step is to construct a candidate list of motion information and the corresponding inherited BigFlag and BigIdx lists.
[0103] The second step is to iterate through all motion information candidates.
[0104] If a motion information candidate is a bidirectional prediction and contains two valid reference frames, then perform the following operations:
[0105] (1) Calculate the cost under various correction modes.
[0106] (2) Select the one with the lowest cost as the correction mode corresponding to the motion information candidate.
[0107] For example, combining with Formula 1 above, when BigFlag = 0, the corresponding correction mode is no correction; when BigFlag = 1 and BigIdx = 0, the corresponding correction mode is forward correction; when BigFlag = 1 and BigIdx = 1, the corresponding correction mode is backward correction. The cost of the current block under different correction modes can be determined by the predicted pixel information determined based on Formula 1 above and the reconstructed pixel information of the current block for the corresponding correction mode.
[0108] Step 602: Determine that the predicted pixel information of the current block is the predicted pixel information corresponding to the target correction mode.
[0109] Here, the target correction mode is the correction mode with the lowest cost among multiple correction modes.
[0110] For example, the predicted pixel information corresponding to all correction modes can be calculated in the above manner, and the correction mode with the lowest cost is selected as the correction mode for bidirectional prediction of the current block. The current block is then corrected to obtain the final predicted pixel information of the current block. The values of BigFlag and BigIdx corresponding to this correction mode can be transmitted to the decoding device through the bitstream.
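The selection of the target correction mode by lowest template cost can be sketched as follows. This is a minimal sketch under stated assumptions: the template cost is taken to be the sum of absolute differences (SAD) between the reconstructed template of the current block and the corrected prediction built from the two reference templates, the sign convention of the correction follows the assumption that BigIdx = 0 adds (pred1 - pred0) >> k, and all function names are hypothetical.

```python
from typing import List, Tuple

# (BigFlag, BigIdx) pairs: no correction, forward correction, backward correction
MODES = [(0, 0), (1, 0), (1, 1)]


def bgc_pixel(p0: int, p1: int, big_flag: int, big_idx: int, k: int = 3) -> int:
    """Corrected bidirectional prediction for one pixel (assumed form)."""
    pred_bi = (p0 + p1 + 1) >> 1  # rounded mean; rounding is an assumption
    if big_flag == 0:
        return pred_bi
    diff = (p1 - p0) if big_idx == 0 else (p0 - p1)
    return pred_bi + (diff >> k)


def sad(a: List[int], b: List[int]) -> int:
    """Sum of absolute differences, used here as the template cost."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_correction_mode(cur_template: List[int], ref0_template: List[int],
                           ref1_template: List[int], k: int = 3) -> Tuple[int, int, int]:
    """Return (BigFlag, BigIdx, cost) of the lowest-cost correction mode."""
    best = None
    for big_flag, big_idx in MODES:
        pred = [bgc_pixel(a, b, big_flag, big_idx, k)
                for a, b in zip(ref0_template, ref1_template)]
        cost = sad(cur_template, pred)
        # Strict comparison keeps the earlier mode on ties (no correction first).
        if best is None or cost < best[2]:
            best = (big_flag, big_idx, cost)
    return best
```

Because both encoder and decoder can compute such a cost from already reconstructed templates, a derivation of this kind does not by itself require BigFlag and BigIdx to be signalled.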
[0111] As one embodiment of this disclosure, the inter-frame prediction device can perform weighted calculations on the unidirectional predicted pixel information corresponding to the reference blocks to reduce bias. Referring to the embodiment shown in Figure 5, as shown in Figure 7, the above step 502 can also be implemented through the following steps 701-702.
[0112] Step 701: Determine the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block.
[0113] Here, the weight coefficient can be used to characterize the degree of influence of the corresponding unidirectional predicted pixel information on the predicted pixel information of the current block.
[0114] In some embodiments, the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block are determined by the cost corresponding to all or part of the templates of each reference block in the plurality of reference blocks.
[0115] For example, taking multiple reference blocks including forward reference blocks and backward reference blocks as an example, the weight coefficient of the unidirectional predicted pixel information corresponding to each reference block can be determined by the following formula 2:
[0116] Here, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, cost0 represents the cost corresponding to all or part of the template of the forward reference block, and cost1 represents the cost corresponding to all or part of the template of the backward reference block.
[0117] For example, the aforementioned cost can be the RDO cost, which can be used to evaluate the rate-distortion performance corresponding to all or part of the templates of the reference block. This cost can be determined by the predicted pixel information and reconstructed pixel information corresponding to all or part of the templates of the reference block. Therefore, for unidirectional predicted pixel information with lower cost, i.e., corresponding to better rate-distortion performance, a higher weight coefficient can be configured; for unidirectional predicted pixel information with higher cost, i.e., corresponding to poorer rate-distortion performance, a lower weight coefficient can be configured.
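One way to realize "lower cost, higher weight" is an inverse-proportional split of the two costs. The sketch below is an assumption made for illustration and is not claimed to be Formula 2; the function name and the equal-weight fallback for zero total cost are likewise hypothetical.

```python
from typing import Tuple


def template_weights(cost0: float, cost1: float) -> Tuple[float, float]:
    """Return (w0, w1) such that the reference with the lower template cost
    receives the higher weight, and w0 + w1 = 1."""
    total = cost0 + cost1
    if total == 0:
        return 0.5, 0.5  # both templates match perfectly: fall back to equal weights
    return cost1 / total, cost0 / total
```

For example, costs of 10 (forward) and 30 (backward) give w0 = 0.75 and w1 = 0.25 under this assumption.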
[0118] Step 702: Determine the predicted pixel information of the current block based on the unidirectional predicted pixel information corresponding to each reference block in the multiple reference blocks and the weight coefficient of the unidirectional predicted pixel information corresponding to each reference block.
[0119] For example, taking multiple reference blocks including a forward reference block and a backward reference block as an example, the predicted pixel information of the current block can be determined by the following formula 3:
[0120] Here, pred represents the predicted pixel information of the current block, predBI represents the mean information of the unidirectional predicted pixel information of the multiple reference blocks, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction intensity, >> represents the right shift operator, BigFlag indicates whether to correct the predicted pixel information, and BigIdx indicates the correction direction.
[0121] Based on the above technical solution, the inter-frame prediction device can determine the weight coefficient of the unidirectional predicted pixel information corresponding to each reference block and then perform a weighted calculation on that information, so as to balance the influence of different reference blocks and reduce the pixel deviation between the predicted image and the actual image.
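A weighted variant of the correction can be sketched as follows. This is only one plausible reading, assuming the weighted sum w0*pred0 + w1*pred1 takes the place of the plain mean before the gradient correction is applied; it is not claimed to be Formula 3, and the function name, rounding, and sign convention are assumptions.

```python
def weighted_bgc_pixel(p0: int, p1: int, w0: float, w1: float,
                       big_flag: int, big_idx: int, k: int = 3) -> int:
    """Weighted bidirectional prediction with optional gradient correction.

    p0, p1 : unidirectional predictions from the forward / backward reference
    w0, w1 : weight coefficients, assumed to sum to 1
    """
    base = int(w0 * p0 + w1 * p1 + 0.5)  # rounded weighted sum (assumed rounding)
    if big_flag == 0:
        return base
    # Assumed sign convention: BigIdx = 0 adds (p1 - p0) >> k.
    diff = (p1 - p0) if big_idx == 0 else (p0 - p1)
    return base + (diff >> k)
```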
[0122] In one embodiment of this disclosure, the current block and the reference block are transmitted via a bitstream. The bitstream includes a first identifier. The first identifier is used to indicate whether the predicted pixel information of the current block is determined based on the weighting coefficients of the unidirectional predicted pixel information corresponding to each reference block.
[0123] When the inter-frame prediction method provided in this disclosure is applied to the encoding end, the encoding end can decide whether the predicted pixel information of the current block is to be determined based on the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block, and carry the above-mentioned first identifier in the bitstream (e.g., in the sequence header of the bitstream) so that the decoding end can identify the decision.
[0124] For example, the first identifier can be bgc_weight_enable_flag. As shown in Table 1 below, for the decoding end, the weighted calculation method in the above-mentioned correction process provided in this disclosure only takes effect when bgc_weight_enable_flag is the first value (e.g., 1); for the encoding end, the encoding end can determine whether it is necessary to determine the predicted pixel information of the current block based on the weight coefficient of the unidirectional predicted pixel information corresponding to each reference block, and carry the indication result as the value of bgc_weight_enable_flag in the bitstream, and indicate it by sending the bitstream to the decoding end.
[0125] Table 1. List of Sequence Header Definitions
[0126] Here, the sequence header includes the video sequence start code (video_sequence_start_code), as well as the profile identifier (profile_id), level identifier (level_id), and BGC indicator identifier (bgc_enable_flag). The BGC indicator identifier includes the BGC-TM indicator identifier (bgc_tm_enable_flag). The BGC-TM indicator identifier includes the first identifier (bgc_weight_enable_flag). f(n) represents an n-bit fixed-pattern string, and u(n) represents an n-bit unsigned integer.
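The nesting of the three flags described above can be sketched as a conditional parse. This is a hedged illustration: it assumes each flag is a single bit, that the flags appear consecutively, and that a nested flag is only present when its parent flag is set. The bit iterator and the function name are hypothetical, and the other sequence header fields are omitted.

```python
from typing import Dict, Iterator


def parse_bgc_flags(bits: Iterator[int]) -> Dict[str, int]:
    """Read bgc_enable_flag and, conditionally, the flags nested under it.
    Flags that are not present in the bitstream default to 0."""
    flags = {"bgc_enable_flag": 0, "bgc_tm_enable_flag": 0, "bgc_weight_enable_flag": 0}
    flags["bgc_enable_flag"] = next(bits)
    if flags["bgc_enable_flag"]:
        flags["bgc_tm_enable_flag"] = next(bits)
        if flags["bgc_tm_enable_flag"]:
            flags["bgc_weight_enable_flag"] = next(bits)
    return flags
```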
[0127] As one embodiment of this disclosure, the mean information of the unidirectional predicted pixel information can be corrected by bidirectional optical flow (BIO). For example, when the bidirectional optical flow condition is met, the above-mentioned mean information can be corrected by bidirectional optical flow.
[0128] For example, other inter-frame correction techniques exist before bidirectional gradient correction (BGC). BIO, as a correction technique preceding BGC, has a certain impact on BGC. The BIO technique utilizes optical flow principles to perform motion compensation for the remaining motion after bidirectional prediction.
[0129] As shown in Figure 8, the forward reference frame Ref0 includes the forward prediction block corresponding to the current block, and the backward reference frame Ref1 includes the backward prediction block corresponding to the current block. The current block is located in the current frame (the B-picture). BIO calculates the gradient values in the x and y directions for each pixel of the forward and backward prediction blocks (i.e., v_x τ0, v_y τ0, v_x τ1, and v_y τ1 in Figure 8), and calculates a calculation factor for each pixel based on the pixel value and the gradient values. BIO operates in units of 4x4 sub-blocks: the motion offset is further calculated from the above calculation factors, and all pixels within a 4x4 sub-block share the same motion offset. Here, (MVx0, MVy0) is the motion vector offset of the forward prediction block, and (MVx1, MVy1) is the motion vector offset of the backward prediction block. Each non-overlapping 4x4 block has one set of motion vector offsets.
[0130] Taking BGC combined with BIO as an example, as shown in Figure 9, after determining the mean information predBI, the inter-frame prediction device can determine whether the BIO condition is met. If the BIO condition is not met, the mean information predBI is used directly as the input to BGC, i.e., directly as the value before BGC correction. In this case, the predicted pixel information of the current block can be determined by Formula 1 or Formula 3 above. Subsequently, the inter-frame prediction device can perform subsequent operations, such as clipping / limiting (CLIP) operations, based on the predicted pixel information output by BGC.
[0131] If the BIO condition is met, the inter-frame prediction device can first perform BIO correction and then use the BIO-corrected prediction information predBIO as the input to BGC; that is, predBI in Formula 1 or Formula 3 above is replaced with predBIO.
[0132] However, after BIO correction, it is difficult to guarantee that the BGC correction result falls between the pixel values of the two reference blocks, which may lead to excessive deviation between the determined predicted pixel information and the actual image information. Therefore, the predicted pixel information can be restricted based on the technical solution provided in this disclosure.
[0133] Furthermore, when the bidirectional optical flow condition is met, the inter-frame prediction method provided in this disclosure can still choose whether or not to perform bidirectional optical flow correction, so as to further reduce the occurrence of excessive deviation between the predicted pixel information and the actual image information.
[0134] In some embodiments, the current block and the reference block are transmitted via a bitstream. The bitstream includes a second identifier. The second identifier is used to indicate whether, when the bidirectional optical flow condition is met, the mean information of the unidirectional predicted pixel information of the multiple reference blocks is corrected using bidirectional optical flow.
[0135] It should be understood that the function of the second identifier described above can be expressed in other ways. For example, the second identifier is used to indicate whether, when the bidirectional optical flow condition is met, the predicted pixel information of the current block is determined based on the mean information of the unidirectional predicted pixel information of the multiple reference blocks. These expressions have the same meaning: they indicate whether the information used to determine the predicted pixel information of the current block is the mean information predBI of the unidirectional predicted pixel information of the multiple reference blocks or the correction information predBIO obtained after bidirectional optical flow correction. In other words, the second identifier can be used to determine whether bidirectional optical flow affects the predicted pixel information when bidirectional prediction is performed and the bidirectional optical flow condition is met.
[0136] When the inter-frame prediction method provided in this disclosure is applied to the encoding end, the encoding end can decide whether to perform bidirectional optical flow correction on the mean information when the bidirectional optical flow condition is met, and carry the above-mentioned second identifier in the bitstream (e.g., in the CU of the bitstream) so that the decoding end can identify the decision.
[0137] For example, the second identifier can be bgc_mean_flag. As shown in Table 2 below, when bgc_mean_flag is a first value (e.g., a non-zero value), it indicates that the predicted pixel information of the current block is determined based on the mean information predBI, that is, no bidirectional optical flow correction is performed on the mean information of the unidirectional predicted pixel information of the multiple reference blocks. When bgc_mean_flag is a second value different from the first value (e.g., zero), it indicates that the predicted pixel information of the current block is determined based on the correction information predBIO, that is, bidirectional optical flow correction is performed on the mean information of the unidirectional predicted pixel information of the multiple reference blocks.
[0138] Table 2. List of Coding Unit Definitions
[0139] Here, the coding unit includes parameters such as position (x0, y0), width, height, mode, and components. It also includes the second identifier (bgc_mean_flag). u(n) represents an n-bit unsigned integer, and ae(v) represents a context-adaptive entropy-coded syntax element.
[0140] In some embodiments, taking multiple reference blocks including a forward reference block and a backward reference block as an example, if the bidirectional optical flow condition is not met, the predicted pixel information of the current block is determined by the above formula 1 or formula 3.
[0141] If the bidirectional optical flow condition is met, the predicted pixel information of the current block is determined by the following formula 4 or formula 5:
[0142] Here, pred represents the predicted pixel information of the current block, predBI represents the mean information of the unidirectional predicted pixel information of the multiple reference blocks, predBIO represents the correction information obtained by bidirectional optical flow correction of the mean information, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction intensity, >> represents the right shift operator, BigFlag indicates whether to correct the predicted pixel information, BigIdx indicates the correction direction, and bgc_mean_flag represents the second identifier.
[0143] In some embodiments, the current block and the reference block are transmitted via a bitstream. The bitstream includes a third identifier, which indicates whether the bidirectional optical flow condition is met.
[0144] For example, the third identifier can be BioFlag. When BioFlag = 1, it indicates that the bidirectional optical flow condition is met; when BioFlag = 0, it indicates that the bidirectional optical flow condition is not met.
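How BioFlag and bgc_mean_flag together choose the value that BGC corrects can be sketched as follows. The function name and the use of None for an unavailable BIO result are illustrative assumptions; the flag semantics follow the description of the second and third identifiers above.

```python
from typing import Optional


def select_bgc_input(pred_bi: int, pred_bio: Optional[int],
                     bio_flag: int, bgc_mean_flag: int) -> int:
    """Choose the value that is passed to BGC as the uncorrected prediction.

    bio_flag == 1      : the bidirectional optical flow condition is met
    bgc_mean_flag != 0 : use the mean information predBI (no BIO correction)
    bgc_mean_flag == 0 : use the BIO-corrected information predBIO
    """
    if bio_flag == 1 and bgc_mean_flag == 0:
        if pred_bio is None:
            raise ValueError("predBIO is required when BIO correction is selected")
        return pred_bio
    # BIO condition not met, or the mean is explicitly selected.
    return pred_bi
```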
[0145] Taking the case where the inter-frame prediction device is a decoding device as an example, Figure 10 is a flowchart of an inter-frame prediction method provided by an embodiment of this disclosure. As shown in Figure 10, the method includes the following steps:
[0146] Step 1001: Receive the bitstream.
[0147] For example, the encoding device can encode and generate a corresponding bitstream based on the original audio and video data, and send the bitstream to the decoding device.
[0148] Step 1002: Perform inter-frame prediction operation based on multiple reference blocks corresponding to the current block in the bitstream to determine the predicted pixel information of the current block.
[0149] In some embodiments, the predicted pixel information for the current block includes the predicted pixel values of target pixels in the current block. Target pixels are all or some of the pixels in the current block.
[0150] During the decoding process, the decoding device obtains relevant information about the current block and the reference blocks from the bitstream in order to perform the inter-frame prediction operation. The inter-frame prediction operation in the decoding process usually mirrors the corresponding operation in the encoding process. For a related description, please refer to step 401 above, which will not be repeated here.
[0151] Step 1003: Determine the pixel range of the current block using the pixel information of multiple reference blocks, and adjust the predicted pixel information of the current block based on the pixel range to obtain the adjusted pixel information of the current block.
[0152] For example, during the decoding process, the decoding device can obtain the pixel information of the reference block from the bitstream or from previously decoded blocks.
[0153] In some embodiments, the pixel information of the reference block includes the pixel value of the pixel corresponding to the target pixel in the reference block.
[0154] In some embodiments, the pixel range of the current block includes the pixel interval corresponding to the target pixel in the current block. The upper limit of the pixel interval is determined by the maximum pixel value of the pixels corresponding to the target pixel in multiple reference blocks, and the lower limit of the pixel interval is determined by the minimum pixel value of the pixels corresponding to the target pixel in multiple reference blocks.
[0155] For example, the pixel range of the current block can be composed of the pixel interval corresponding to the target pixel in the current block. In one example, the upper limit of the pixel interval can be the maximum pixel value of the pixel corresponding to the target pixel in multiple reference blocks, and the lower limit of the pixel interval can be the minimum pixel value of the pixel corresponding to the target pixel in multiple reference blocks.
[0156] In some embodiments, the adjusted pixel information of the current block includes the adjusted pixel value of the target pixel in the current block. The adjusted pixel value of the target pixel in the current block is located within the pixel interval corresponding to the target pixel.
[0157] In some embodiments, the adjusted pixel value of the target pixel in the current block can be obtained by range restriction or function mapping.
[0158] In one example, the adjusted pixel value of the target pixel in the current block satisfies at least one of the following 1-3:
[0159] 1. If the predicted pixel value of the target pixel is greater than the upper limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the upper limit of the corresponding pixel interval.
[0160] 2. If the predicted pixel value of the target pixel is less than the lower limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the lower limit of the corresponding pixel interval.
[0161] 3. If the predicted pixel value of the target pixel is less than or equal to the upper limit of the corresponding pixel interval and greater than or equal to the lower limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the predicted pixel value of the target pixel.
[0162] In another example, the predicted pixel value of the target pixel can be mapped into the corresponding pixel interval by a function to obtain the adjusted pixel value. This function can be determined by the pixel interval corresponding to the target pixel. For example, this function can be a continuous function (such as a linear function, a non-linear function, etc.) or a discontinuous function (such as a piecewise function, a gradient function, etc.). The conditions 1-3 in the example above can also be understood as a piecewise function.
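The range restriction of conditions 1-3 can be sketched for two reference blocks as follows. The function name and the list-of-rows block representation are illustrative assumptions; the interval bounds are the per-pixel minimum and maximum of the co-located reference pixels, as described above.

```python
from typing import List

Block = List[List[int]]  # rows of integer pixel values (illustrative representation)


def clip_to_reference_range(pred: Block, ref0: Block, ref1: Block) -> Block:
    """Clamp each predicted pixel into [min(ref0, ref1), max(ref0, ref1)]
    taken at the co-located position in the two reference blocks."""
    adjusted = []
    for pred_row, row0, row1 in zip(pred, ref0, ref1):
        out_row = []
        for p, a, b in zip(pred_row, row0, row1):
            lower, upper = min(a, b), max(a, b)
            # Above the interval: upper limit; below: lower limit; otherwise unchanged.
            out_row.append(min(max(p, lower), upper))
        adjusted.append(out_row)
    return adjusted
```

With reference pixels 100 and 116, predicted values 120, 95, and 105 become 116, 100, and 105 respectively.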
[0163] For related explanations of the above content, please refer to the corresponding implementation examples on the encoding side, which will not be repeated here.
[0164] As an embodiment of this disclosure, and in conjunction with the embodiment shown in Figure 10, as shown in Figure 11, the above step 1002 can also be implemented by the following steps 1101-1102.
[0165] Step 1101: For each of the multiple reference blocks, determine the unidirectional predicted pixel information of the current block based on the pixel information of the reference block.
[0166] Here, unidirectional predicted pixel information is used to characterize the pixel information of the current block predicted from the pixel information of the reference block. For example, in BGC, motion estimation is used to determine the motion vector of the reference block relative to the current block, and motion compensation is performed based on the motion vector to obtain the unidirectional predicted pixel information of the current block.
[0167] For related explanations, please refer to the description in step 501 above, which will not be repeated here.
[0168] Step 1102: Determine the predicted pixel information of the current block based on the unidirectional predicted pixel information of the current block corresponding to each of the multiple reference blocks.
[0169] In some embodiments, the predicted pixel information of the current block can be determined by Formula 1 above, which will not be elaborated here.
[0170] For example, the correction mode can be distinguished by relevant identifiers in the bitstream, such as BigFlag and BigIdx. During the decoding process, the decoding device can obtain these identifiers from the bitstream to determine the corresponding target correction mode.
[0171] For related explanations, please refer to the description in step 502 above, which will not be repeated here.
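On the decoding side, the mapping from the parsed identifiers to the target correction mode can be sketched as follows. The mapping follows the description given for Formula 1 (BigFlag = 0: no correction; BigFlag = 1 and BigIdx = 0: forward correction; BigFlag = 1 and BigIdx = 1: backward correction); the function name and the string labels are illustrative assumptions.

```python
def correction_mode(big_flag: int, big_idx: int) -> str:
    """Map the parsed BigFlag / BigIdx values to a correction mode label."""
    if big_flag == 0:
        return "none"  # BigIdx is not relevant when no correction is applied
    return "forward" if big_idx == 0 else "backward"
```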
[0172] As one embodiment of this disclosure, the inter-frame prediction device can perform weighted calculations on the unidirectional predicted pixel information corresponding to the reference blocks to reduce bias. Referring to the embodiment shown in Figure 11, as shown in Figure 12, the above step 1102 can also be implemented through the following steps 1201-1202.
[0173] Step 1201: Determine the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block.
[0174] Here, the weight coefficient can be used to characterize the degree of influence of the corresponding unidirectional predicted pixel information on the predicted pixel information of the current block.
[0175] In some embodiments, the bitstream may include weighting coefficients. The decoding device can directly obtain the weighting coefficients of the unidirectional predicted pixel information corresponding to each reference block from the bitstream.
[0176] In some embodiments, the bitstream may include information related to weighting coefficients. The decoding device can determine the weighting coefficients for the unidirectional predicted pixel information corresponding to each reference block based on the relevant information.
[0177] In some embodiments, the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block are determined by the cost corresponding to all or part of the templates of each reference block in the plurality of reference blocks.
[0178] For example, the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block are determined by the above formula 2, which will not be elaborated here.
[0179] Step 1202: Determine the predicted pixel information of the current block based on the unidirectional predicted pixel information corresponding to each reference block in the multiple reference blocks and the weight coefficient of the unidirectional predicted pixel information corresponding to each reference block.
[0180] For example, the predicted pixel information of the current block is determined by the above formula 3, which will not be elaborated here.
[0181] As one embodiment of this disclosure, the bitstream includes a first identifier. The first identifier is used to indicate whether the predicted pixel information of the current block is determined based on the weighting coefficients of the unidirectional predicted pixel information corresponding to each reference block.
[0182] When the inter-frame prediction method provided in this disclosure is applied to the decoding end, the decoding end can obtain the first identifier from the bitstream and then, according to the first identifier, decide whether the predicted pixel information of the current block is to be determined based on the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block.
[0183] For example, the first identifier can be bgc_weight_enable_flag. The definition of this identifier can be found in Table 1 above and will not be repeated here. The decoding end can parse bgc_weight_enable_flag from the bitstream. If bgc_weight_enable_flag is 1, the decoding end performs the same weighted calculation as the encoding end to ensure consistency between encoding and decoding.
[0184] As one embodiment of this disclosure, the mean information of the unidirectional predicted pixel information can be corrected by bidirectional optical flow. For example, when the bidirectional optical flow condition is met, the above-mentioned mean information can be corrected by bidirectional optical flow.
[0185] When the bidirectional optical flow condition is met, the inter-frame prediction method provided in this disclosure can still choose whether or not to perform bidirectional optical flow correction, so as to further reduce the occurrence of excessive deviation between the predicted pixel information and the actual image information.
[0186] In some embodiments, the bitstream includes a second identifier. The second identifier is used to indicate whether, when the bidirectional optical flow condition is met, the mean information of the unidirectional predicted pixel information of the multiple reference blocks is subject to bidirectional optical flow correction.
[0187] When the inter-frame prediction method provided in this disclosure is applied to the decoding end, the decoding end can obtain the second identifier from the bitstream and then, based on the second identifier, decide whether to perform bidirectional optical flow correction on the mean information when the bidirectional optical flow condition is met.
[0188] For details regarding the second identifier, please refer to the corresponding implementation examples on the coding side mentioned above; they will not be repeated here.
[0189] In some embodiments, taking multiple reference blocks including a forward reference block and a backward reference block as an example, if the bidirectional optical flow condition is not met, the predicted pixel information of the current block is determined by the above formula 1 or formula 3.
[0190] If the bidirectional optical flow condition is met, the predicted pixel information of the current block is determined by formula 4 or formula 5 above.
[0191] In some embodiments, the bitstream includes a third identifier. The third identifier is used to indicate whether the bidirectional optical flow condition is met.
[0192] For example, the third identifier can be BioFlag. When BioFlag = 1, it indicates that the bidirectional optical flow condition is met; when BioFlag = 0, it indicates that the bidirectional optical flow condition is not met.
[0193] It is understood that, in order to achieve the above-mentioned functions, the inter-frame prediction device includes hardware structures and / or software modules corresponding to the execution of each function. Those skilled in the art should readily recognize that, based on the algorithm steps of the examples described in conjunction with the embodiments of this disclosure, this disclosure can be implemented in hardware or a combination of hardware and computer software. Whether a function is executed in hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this disclosure.
[0194] Embodiments of this disclosure can divide the inter-frame prediction device into functional modules according to the above method embodiments. For example, each function can be assigned to a separate functional module, or two or more functions can be integrated into one functional module. The integrated module can be implemented in hardware or in software. It should be noted that the module division in the embodiments of this disclosure is illustrative and represents only one logical functional division; other divisions are possible in actual implementation. The following description uses the example of dividing the functional modules by function.
[0195] For example, taking an inter-frame prediction device as an encoding device, Figure 13 is a schematic diagram of the structure of an encoding device provided in an embodiment of this disclosure. The encoding device can execute the inter-frame prediction method provided in the above-described method embodiment. As shown in Figure 13, the encoding device 130 includes a processing unit 1301 and a communication unit 1302.
[0196] The processing unit 1301 is used to perform inter-frame prediction operations based on multiple reference blocks corresponding to the current block to determine the predicted pixel information of the current block.
[0197] The processing unit 1301 is further configured to determine the pixel range of the current block through the pixel information of multiple reference blocks, and adjust the predicted pixel information of the current block based on the pixel range to obtain the adjusted pixel information of the current block.
[0198] In some embodiments, the predicted pixel information of the current block includes the predicted pixel value of the target pixel in the current block, where the target pixel is all or some of the pixels in the current block; the pixel information of the reference block includes the pixel value of the pixel in the reference block that corresponds to the target pixel.
[0199] In some embodiments, the pixel range of the current block includes the pixel interval corresponding to the target pixel in the current block; the upper limit of the pixel interval is determined by the maximum pixel value of the pixel corresponding to the target pixel in a plurality of reference blocks; and the lower limit of the pixel interval is determined by the minimum pixel value of the pixel corresponding to the target pixel in a plurality of reference blocks.
[0200] In some embodiments, the adjusted pixel information of the current block includes the adjusted pixel value of the target pixel in the current block; the adjusted pixel value of the target pixel in the current block is located within the pixel interval corresponding to the target pixel.
[0201] In some embodiments, the adjusted pixel information of the current block includes the adjusted pixel value of the target pixel in the current block; the adjusted pixel value of the target pixel in the current block satisfies at least one of the following: if the predicted pixel value of the target pixel is greater than the upper limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the upper limit of the corresponding pixel interval; if the predicted pixel value of the target pixel is less than the lower limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the lower limit of the corresponding pixel interval; if the predicted pixel value of the target pixel is less than or equal to the upper limit of the corresponding pixel interval and greater than or equal to the lower limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the predicted pixel value of the target pixel.
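The adjustment described above can be illustrated with a minimal sketch. It assumes blocks are stored as row-major lists of integer pixel values and that the interval limits are taken directly as the minimum and maximum of the co-located reference pixels; the function name clamp_to_reference_range and the sample values are hypothetical and are not part of this disclosure.

```python
def clamp_to_reference_range(pred, ref_blocks):
    """Clamp each predicted pixel to the interval spanned by the
    co-located pixels of the reference blocks (illustrative only)."""
    adjusted = []
    for y, row in enumerate(pred):
        out_row = []
        for x, value in enumerate(row):
            # Pixels in the reference blocks that correspond to the target pixel.
            co_located = [ref[y][x] for ref in ref_blocks]
            lower, upper = min(co_located), max(co_located)
            # Above the upper limit -> upper limit; below the lower limit ->
            # lower limit; otherwise the predicted value is kept unchanged.
            out_row.append(min(max(value, lower), upper))
        adjusted.append(out_row)
    return adjusted


# Hypothetical 2x2 forward and backward reference blocks and a prediction.
ref0 = [[100, 120], [90, 60]]
ref1 = [[110, 100], [95, 80]]
pred = [[115, 110], [85, 70]]
adjusted = clamp_to_reference_range(pred, [ref0, ref1])
```

In this example the values 115 and 85 fall outside their intervals and are moved to the nearest limit, while 110 and 70 are left unchanged.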
[0202] In some embodiments, the processing unit 1301 is configured to, for each of the plurality of reference blocks, determine the unidirectional predicted pixel information of the current block based on the pixel information of that reference block; and determine the predicted pixel information of the current block based on the unidirectional predicted pixel information of the current block corresponding to each of the plurality of reference blocks.
[0203] In some embodiments, the plurality of reference blocks include a forward reference block and a backward reference block; the predicted pixel information of the current block is determined by the following formula:
[0204] Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean value of the unidirectional predicted pixel information of the multiple reference blocks, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction strength, >> represents the right shift operator, BigFlag is used to indicate whether to correct the predicted pixel information, and BigIdx is used to indicate the correction direction.
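To make the roles of these symbols concrete, the following sketch shows one plausible form of such a correction for a single pixel. The formula itself is an assumption written in the style of bidirectional gradient correction and is not taken from this disclosure: the rounded mean, the mapping of BigIdx values to directions, and the default k = 3 are all hypothetical choices.

```python
def corrected_prediction(pred0, pred1, big_flag, big_idx, k=3):
    """Assumed form of the correction (illustrative, per pixel):
    the mean of the two unidirectional predictions, optionally shifted
    by a scaled difference whose sign depends on the direction index."""
    # Rounded mean of the forward and backward predictions (assumed rounding).
    pred_bi = (pred0 + pred1 + 1) >> 1
    if not big_flag:
        # BigFlag = 0: no correction, the mean is used as the prediction.
        return pred_bi
    # BigIdx selects the correction direction (assumed mapping).
    diff = pred1 - pred0 if big_idx == 0 else pred0 - pred1
    # k controls the correction strength via an arithmetic right shift.
    return pred_bi + (diff >> k)


uncorrected = corrected_prediction(100, 116, big_flag=0, big_idx=0)
toward_pred1 = corrected_prediction(100, 116, big_flag=1, big_idx=0)
toward_pred0 = corrected_prediction(100, 116, big_flag=1, big_idx=1)
```

A larger k yields a smaller correction, since the difference between the two unidirectional predictions is shifted further to the right before it is added to the mean.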
[0205] In some embodiments, the processing unit 1301 is configured to determine the cost and predicted pixel information corresponding to each correction mode in a plurality of correction modes based on the unidirectional predicted pixel information of the current block corresponding to each reference block in a plurality of reference blocks; determine the predicted pixel information of the current block as the predicted pixel information corresponding to the target correction mode; and the target correction mode is the correction mode with the lowest cost among the plurality of correction modes.
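The selection of the target correction mode can be sketched as follows. This is a minimal illustration under stated assumptions: the three candidate modes, the correction formula, and the use of the sum of absolute differences against the original block as the cost are hypothetical choices; an encoder may instead use a rate-distortion cost or another measure.

```python
def candidate_prediction(pred0, pred1, big_flag, big_idx, k=3):
    """Assumed per-pixel correction used to build each candidate."""
    pred_bi = (pred0 + pred1 + 1) >> 1
    if not big_flag:
        return pred_bi
    diff = pred1 - pred0 if big_idx == 0 else pred0 - pred1
    return pred_bi + (diff >> k)


def sad(block_a, block_b):
    """Sum of absolute differences, used here as the (assumed) cost."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def select_correction_mode(original, pred0, pred1, k=3):
    """Return the lowest-cost (BigFlag, BigIdx) mode and its prediction."""
    modes = [(0, 0), (1, 0), (1, 1)]  # no correction, direction 0, direction 1
    candidates = {
        mode: [candidate_prediction(p0, p1, mode[0], mode[1], k)
               for p0, p1 in zip(pred0, pred1)]
        for mode in modes
    }
    best = min(modes, key=lambda mode: sad(original, candidates[mode]))
    return best, candidates[best]


# Hypothetical flattened two-pixel blocks.
best_mode, best_pred = select_correction_mode(
    original=[110, 50], pred0=[100, 60], pred1=[116, 44])
```

In this example the candidate for mode (1, 0) matches the original block exactly, so that mode is selected and its prediction becomes the predicted pixel information of the current block.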
[0206] In some embodiments, the processing unit 1301 is used to determine the weight coefficient of the unidirectional predicted pixel information corresponding to each reference block; and to determine the predicted pixel information of the current block based on the unidirectional predicted pixel information corresponding to each reference block and the weight coefficient of the unidirectional predicted pixel information corresponding to each reference block.
[0207] In some embodiments, the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block are determined by the cost corresponding to all or part of the templates of each reference block in the plurality of reference blocks.
[0208] In some embodiments, the plurality of reference blocks include a forward reference block and a backward reference block; the weighting coefficients of the unidirectional predicted pixel information corresponding to each reference block are determined by the following formula:
[0209] Here, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, cost0 represents the cost corresponding to all or part of the template of the forward reference block, and cost1 represents the cost corresponding to all or part of the template of the backward reference block.
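As an illustration of how template costs could drive the weight coefficients, the sketch below uses an inverse-cost rule, so the reference block whose template matches better (lower cost) receives the larger weight. This rule is an assumption for illustration and is not the formula of this disclosure; the function names, the equal-weight fallback, and the rounding are likewise hypothetical.

```python
def template_weights(cost0, cost1):
    """Assumed inverse-cost weighting: the lower-cost reference block
    receives the larger weight, and the weights sum to 1."""
    total = cost0 + cost1
    if total == 0:
        # Both templates match perfectly; fall back to equal weights.
        return 0.5, 0.5
    return cost1 / total, cost0 / total


def weighted_prediction(pred0, pred1, w0, w1):
    """Weighted combination of the two unidirectional predictions,
    rounded to the nearest integer (non-negative pixel values assumed)."""
    return int(w0 * pred0 + w1 * pred1 + 0.5)


# Hypothetical template costs: the forward template matches better.
w0, w1 = template_weights(cost0=10, cost1=30)
pred = weighted_prediction(100, 120, w0, w1)
```

With these hypothetical costs the forward prediction is weighted three times as heavily as the backward prediction, which pulls the result toward pred0.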
[0210] In some embodiments, the plurality of reference blocks include a forward reference block and a backward reference block; the predicted pixel information of the current block is determined by the following formula:
[0211] Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean value of the unidirectional predicted pixel information of the multiple reference blocks, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction strength, >> represents the right shift operator, BigFlag is used to indicate whether to correct the predicted pixel information, and BigIdx is used to indicate the correction direction.
[0212] In some embodiments, the current block and the reference block are transmitted via a bitstream, which includes a first identifier. The first identifier is used to indicate whether the predicted pixel information of the current block is determined based on the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block.
[0213] In some embodiments, the current block and the reference block are transmitted via a bitstream, which includes a second identifier. The second identifier is used to indicate whether, when the bidirectional optical flow condition is met, bidirectional optical flow correction is performed on the mean information of the unidirectional predicted pixel information of the multiple reference blocks.
[0214] In some embodiments, the plurality of reference blocks include a forward reference block and a backward reference block; when the bidirectional optical flow condition is met, the predicted pixel information of the current block is determined by the following formula:
[0215] Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean value of the unidirectional predicted pixel information of the multiple reference blocks, Pred_BIO represents the correction information obtained by performing bidirectional optical flow correction on the mean information, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction strength, >> represents the right shift operator, BigFlag is used to indicate whether to correct the predicted pixel information, BigIdx is used to indicate the correction direction, and bgc_mean_flag represents the second identifier. Alternatively, when the bidirectional optical flow condition is met, the predicted pixel information of the current block is determined by the following formula:
[0216] Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean value of the unidirectional predicted pixel information of the multiple reference blocks, Pred_BIO represents the correction information obtained by performing bidirectional optical flow correction on the mean information, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction strength, >> represents the right shift operator, BigFlag is used to indicate whether to correct the predicted pixel information, BigIdx is used to indicate the correction direction, and bgc_mean_flag represents the second identifier.
[0217] For example, taking the inter-frame prediction device as a decoding device, Figure 14 is a schematic diagram of the structure of a decoding device provided in an embodiment of this disclosure. The decoding device can execute the inter-frame prediction method provided in the above-described method embodiment. As shown in Figure 14, the decoding device 140 includes: a processing unit 1401 and a communication unit 1402.
[0218] The communication unit 1402 is used to receive the bitstream from the encoding device.
[0219] The processing unit 1401 is used to perform inter-frame prediction operations based on multiple reference blocks corresponding to the current block in the bitstream to determine the predicted pixel information of the current block.
[0220] The processing unit 1401 is used to determine the pixel range of the current block through the pixel information of multiple reference blocks, and adjust the predicted pixel information of the current block based on the pixel range to obtain the adjusted pixel information of the current block.
[0221] In some embodiments, the predicted pixel information of the current block includes the predicted pixel value of the target pixel in the current block, where the target pixel is all or some of the pixels in the current block; the pixel information of the reference block includes the pixel value of the pixel in the reference block that corresponds to the target pixel.
[0222] In some embodiments, the pixel range of the current block includes the pixel interval corresponding to the target pixel in the current block; the upper limit of the pixel interval is determined by the maximum pixel value of the pixel corresponding to the target pixel in a plurality of reference blocks; and the lower limit of the pixel interval is determined by the minimum pixel value of the pixel corresponding to the target pixel in a plurality of reference blocks.
[0223] In some embodiments, the adjusted pixel information of the current block includes the adjusted pixel value of the target pixel in the current block; the adjusted pixel value of the target pixel in the current block is located within the pixel interval corresponding to the target pixel.
[0224] In some embodiments, the adjusted pixel information of the current block includes the adjusted pixel value of the target pixel in the current block; the adjusted pixel value of the target pixel in the current block satisfies at least one of the following:
[0225] If the predicted pixel value of the target pixel is greater than the upper limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the upper limit of the corresponding pixel interval.
[0226] If the predicted pixel value of the target pixel is less than the lower limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the lower limit of the corresponding pixel interval.
[0227] If the predicted pixel value of the target pixel is less than or equal to the upper limit of the corresponding pixel interval and greater than or equal to the lower limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the predicted pixel value of the target pixel.
[0228] In some embodiments, the processing unit 1401 is configured to, for each of the plurality of reference blocks, determine the unidirectional predicted pixel information of the current block based on the pixel information of that reference block; and determine the predicted pixel information of the current block based on the unidirectional predicted pixel information of the current block corresponding to each of the plurality of reference blocks.
[0229] In some embodiments, the plurality of reference blocks include a forward reference block and a backward reference block; the predicted pixel information of the current block is determined by the following formula:
[0230] Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean value of the unidirectional predicted pixel information of the multiple reference blocks, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction strength, >> represents the right shift operator, BigFlag is used to indicate whether to correct the predicted pixel information, and BigIdx is used to indicate the correction direction.
[0231] In some embodiments, the processing unit 1401 is used to determine the weight coefficient of the unidirectional predicted pixel information corresponding to each reference block; and to determine the predicted pixel information of the current block based on the unidirectional predicted pixel information corresponding to each reference block and the weight coefficient of the unidirectional predicted pixel information corresponding to each reference block.
[0232] In some embodiments, the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block are determined by the cost corresponding to all or part of the templates of each reference block in the plurality of reference blocks.
[0233] In some embodiments, the plurality of reference blocks include a forward reference block and a backward reference block; the weighting coefficients of the unidirectional predicted pixel information corresponding to each reference block are determined by the following formula:
[0234] Here, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, cost0 represents the cost corresponding to all or part of the template of the forward reference block, and cost1 represents the cost corresponding to all or part of the template of the backward reference block.
[0235] In some embodiments, the plurality of reference blocks include a forward reference block and a backward reference block; the predicted pixel information of the current block is determined by the following formula:
[0236] Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean value of the unidirectional predicted pixel information of the multiple reference blocks, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction strength, >> represents the right shift operator, BigFlag is used to indicate whether to correct the predicted pixel information, and BigIdx is used to indicate the correction direction.
[0237] In some embodiments, the bitstream includes a first identifier; the first identifier is used to indicate whether the predicted pixel information of the current block is determined based on the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block.
[0238] In some embodiments, the bitstream includes a second identifier; the second identifier is used to indicate whether, when the bidirectional optical flow condition is met, bidirectional optical flow correction is performed on the mean information of the unidirectional predicted pixel information of the multiple reference blocks.
[0239] In some embodiments, the plurality of reference blocks include a forward reference block and a backward reference block; when the bidirectional optical flow condition is met, the predicted pixel information of the current block is determined by the following formula:
[0240] Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean value of the unidirectional predicted pixel information of the multiple reference blocks, Pred_BIO represents the correction information obtained by performing bidirectional optical flow correction on the mean information, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction strength, >> represents the right shift operator, BigFlag is used to indicate whether to correct the predicted pixel information, BigIdx is used to indicate the correction direction, and bgc_mean_flag represents the second identifier.
[0241] Alternatively, if the bidirectional optical flow condition is met, the predicted pixel information of the current block can be determined using the following formula:
[0242] Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean value of the unidirectional predicted pixel information of the multiple reference blocks, Pred_BIO represents the correction information obtained by performing bidirectional optical flow correction on the mean information, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction strength, >> represents the right shift operator, BigFlag is used to indicate whether to correct the predicted pixel information, BigIdx is used to indicate the correction direction, and bgc_mean_flag represents the second identifier.
[0243] Where the functions of the integrated modules described above are implemented in hardware, this disclosure provides another possible structure for the inter-frame prediction device involved in the above embodiments. This inter-frame prediction device can be an encoding device or a decoding device. As shown in Figure 15, the inter-frame prediction device 150 includes a processor 1502 and a bus 1504. For example, the inter-frame prediction device 150 may further include a memory 1501.
[0244] The processor 1502 may implement or execute the various exemplary logic blocks, modules, and circuits described in conjunction with the embodiments of this disclosure. The processor 1502 may be a central processing unit, a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The processor 1502 may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
[0245] The memory 1501 may be a read-only memory (ROM) or other type of static storage device capable of storing static information and instructions, random access memory (RAM) or other type of dynamic storage device capable of storing information and instructions, or electrically erasable programmable read-only memory (EEPROM), disk storage medium or other magnetic storage device, or any other medium capable of carrying or storing desired program code in the form of instructions or data structures and accessible by a computer, but is not limited thereto.
[0246] In some embodiments, the memory 1501 may exist independently of the processor 1502. The memory 1501 may be connected to the processor 1502 via a bus 1504 and is used to store instructions or program code. When the processor 1502 calls and executes the instructions or program code stored in the memory 1501, it can implement the method described in any embodiment of this disclosure.
[0247] In other embodiments, the memory 1501 may also be integrated with the processor 1502.
[0248] Bus 1504 can be an extended industry standard architecture (EISA) bus, etc. Bus 1504 can be divided into address bus, data bus, control bus, etc. For ease of illustration, only one thick line is used to represent it in Figure 15, but this does not mean that there is only one bus or one type of bus.
[0249] Some embodiments of this disclosure provide a computer-readable storage medium (e.g., a non-transitory computer-readable storage medium) storing computer program instructions that, when executed on a computer, cause the computer to perform the methods described in any of the above embodiments.
[0250] By way of example, the aforementioned computer-readable storage media may include, but are not limited to: magnetic storage devices (e.g., hard disks, floppy disks, or magnetic tapes), optical discs (e.g., compact discs (CDs), digital versatile discs (DVDs), etc.), smart cards, and flash memory devices (e.g., erasable programmable read-only memory (EPROM), cards, sticks, or key drives, etc.). The various computer-readable storage media described in this disclosure may represent one or more devices and / or other machine-readable storage media for storing information. The term "machine-readable storage medium" may include, but is not limited to, wireless channels and various other media capable of storing, containing, and / or carrying instructions and / or data.
[0251] This disclosure provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the methods described in any of the above embodiments.
[0252] The above description is merely a specific embodiment of this disclosure, but the scope of protection of this disclosure is not limited thereto. Any changes or substitutions within the technical scope disclosed in this disclosure should be included within the scope of protection of this disclosure. Therefore, the scope of protection of this disclosure should be determined by the scope of the claims.
Claims
1. An inter-frame prediction method, wherein, The method includes: Inter-frame prediction is performed based on multiple reference blocks corresponding to the current block to determine the predicted pixel information of the current block; The pixel range of the current block is determined by the pixel information of the multiple reference blocks, and the predicted pixel information of the current block is adjusted based on the pixel range to obtain the adjusted pixel information of the current block.
2. The method according to claim 1, wherein, The predicted pixel information of the current block includes the predicted pixel value of the target pixel in the current block, wherein the target pixel is all or some of the pixels in the current block; the pixel information of the reference block includes the pixel value of the pixel in the reference block that corresponds to the target pixel.
3. The method according to claim 2, wherein, The pixel range of the current block includes the pixel interval corresponding to the target pixel in the current block; the upper limit of the pixel interval is determined by the maximum pixel value of the pixel corresponding to the target pixel in the plurality of reference blocks; the lower limit of the pixel interval is determined by the minimum pixel value of the pixel corresponding to the target pixel in the plurality of reference blocks.
4. The method according to claim 3, wherein, The adjusted pixel information of the current block includes the adjusted pixel value of the target pixel in the current block; the adjusted pixel value of the target pixel in the current block is located within the pixel interval corresponding to the target pixel.
5. The method according to claim 3, wherein, The adjusted pixel information of the current block includes the adjusted pixel value of the target pixel in the current block; the adjusted pixel value of the target pixel in the current block satisfies at least one of the following: If the predicted pixel value of the target pixel is greater than the upper limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the upper limit of the corresponding pixel interval. If the predicted pixel value of the target pixel is less than the lower limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the lower limit of the corresponding pixel interval. If the predicted pixel value of the target pixel is less than or equal to the upper limit of the corresponding pixel interval and greater than or equal to the lower limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the predicted pixel value of the target pixel.
6. The method according to claim 1, wherein, The step of performing inter-frame prediction based on multiple reference blocks corresponding to the current block to determine the predicted pixel information of the current block includes: For each of the plurality of reference blocks, the unidirectional predicted pixel information of the current block is determined based on the pixel information of the reference block; The predicted pixel information of the current block is determined based on the unidirectional predicted pixel information of the current block corresponding to each of the plurality of reference blocks.
7. The method according to claim 6, wherein, The plurality of reference blocks includes a forward reference block and a backward reference block; the predicted pixel information of the current block is determined by the following formula: Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean value of the unidirectional predicted pixel information of the plurality of reference blocks, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction strength, >> represents the right shift operator, BigFlag is used to indicate whether to correct the predicted pixel information, and BigIdx is used to indicate the correction direction.
8. The method according to claim 6, wherein, The step of determining the predicted pixel information of the current block based on the unidirectional predicted pixel information of the current block corresponding to each of the plurality of reference blocks includes: Based on the unidirectional predicted pixel information of the current block corresponding to each of the plurality of reference blocks, the cost and predicted pixel information of the current block in each of the plurality of correction modes are determined; The predicted pixel information of the current block is determined to be the predicted pixel information corresponding to the target correction mode; the target correction mode is the correction mode with the lowest cost among the multiple correction modes.
9. The method according to claim 6, wherein, The step of determining the predicted pixel information of the current block based on the unidirectional predicted pixel information corresponding to each of the plurality of reference blocks includes: Determine the weighting coefficients for the unidirectional predicted pixel information corresponding to each reference block; The predicted pixel information of the current block is determined based on the unidirectional predicted pixel information corresponding to each of the plurality of reference blocks and the weight coefficient of the unidirectional predicted pixel information corresponding to each of the plurality of reference blocks.
10. The method according to claim 9, wherein the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block are determined by the costs corresponding to all or part of the templates of each of the plurality of reference blocks.
11. The method according to claim 10, wherein the plurality of reference blocks includes a forward reference block and a backward reference block, and the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block are determined by the following formula: Here, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, cost0 represents the cost corresponding to all or part of the templates of the forward reference block, and cost1 represents the cost corresponding to all or part of the templates of the backward reference block.
12. The method according to claim 9, wherein the plurality of reference blocks includes a forward reference block and a backward reference block, and the predicted pixel information of the current block is determined by the following formula: Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean of the unidirectional predicted pixel information of the plurality of reference blocks, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction intensity, >> represents the right shift operator, BigFlag indicates whether to correct the predicted pixel information, and BigIdx indicates the correction direction.
13. The method according to claim 6, wherein the current block and the reference blocks are transmitted via a bitstream, the bitstream including a first identifier; the first identifier indicates whether the predicted pixel information of the current block is determined based on the weight coefficient of the unidirectional predicted pixel information corresponding to each reference block.
14. The method according to claim 6, wherein the current block and the reference blocks are transmitted via a bitstream, the bitstream including a second identifier; the second identifier indicates whether, when the bidirectional optical flow condition is met, bidirectional optical flow correction is performed on the mean information of the unidirectional predicted pixel information of the plurality of reference blocks.
15. The method according to claim 14, wherein the plurality of reference blocks includes a forward reference block and a backward reference block; when the bidirectional optical flow condition is met, the predicted pixel information of the current block is determined by the following formula: Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean information of the unidirectional predicted pixel information of the plurality of reference blocks, Pred_BIO represents the correction information obtained by bidirectional optical flow correction of the mean information, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction intensity, >> represents the right shift operator, BigFlag indicates whether to correct the predicted pixel information, BigIdx indicates the correction direction, and bgc_mean_flag represents the second identifier; alternatively, when the bidirectional optical flow condition is met, the predicted pixel information of the current block is determined by the following formula: Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean information of the unidirectional predicted pixel information of the plurality of reference blocks, Pred_BIO represents the correction information obtained by bidirectional optical flow correction of the mean information, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction intensity, >> represents the right shift operator, BigFlag indicates whether to correct the predicted pixel information, BigIdx indicates the correction direction, and bgc_mean_flag represents the second identifier.
16. An inter-frame prediction method, wherein the method includes: receiving a bitstream; performing inter-frame prediction based on a plurality of reference blocks corresponding to a current block in the bitstream to determine predicted pixel information of the current block; and determining a pixel range of the current block from the pixel information of the plurality of reference blocks, and adjusting the predicted pixel information of the current block based on the pixel range to obtain adjusted pixel information of the current block.
17. The method according to claim 16, wherein the predicted pixel information of the current block includes the predicted pixel value of a target pixel in the current block, the target pixel being all or some of the pixels in the current block; the pixel information of each reference block includes the pixel value of the pixel in that reference block that corresponds to the target pixel.
18. The method according to claim 17, wherein the pixel range of the current block includes the pixel interval corresponding to the target pixel in the current block; the upper limit of the pixel interval is determined by the maximum pixel value among the pixels corresponding to the target pixel in the plurality of reference blocks; the lower limit of the pixel interval is determined by the minimum pixel value among the pixels corresponding to the target pixel in the plurality of reference blocks.
19. The method according to claim 18, wherein the adjusted pixel information of the current block includes the adjusted pixel value of the target pixel in the current block; the adjusted pixel value of the target pixel in the current block lies within the pixel interval corresponding to the target pixel.
20. The method according to claim 18, wherein the adjusted pixel information of the current block includes the adjusted pixel value of the target pixel in the current block; the adjusted pixel value of the target pixel in the current block satisfies at least one of the following: if the predicted pixel value of the target pixel is greater than the upper limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the upper limit of the corresponding pixel interval; if the predicted pixel value of the target pixel is less than the lower limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the lower limit of the corresponding pixel interval; if the predicted pixel value of the target pixel is less than or equal to the upper limit of the corresponding pixel interval and greater than or equal to the lower limit of the corresponding pixel interval, the adjusted pixel value of the target pixel is the predicted pixel value of the target pixel.
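The range adjustment described in claims 17 to 20 amounts to a per-pixel clamp, sketched below in Python. The sketch assumes that the interval limits are taken directly as the minimum and maximum of the co-located reference pixels (the claims only say the limits are "determined by" them), that blocks are flattened to equal-length lists, and that every pixel is a target pixel; the function name `clamp_to_reference_range` is hypothetical.

```python
from typing import List, Sequence


def clamp_to_reference_range(
    predicted: List[int], reference_blocks: Sequence[List[int]]
) -> List[int]:
    """Clamp each predicted pixel to [min, max] of its co-located reference pixels."""
    adjusted = []
    for i, value in enumerate(predicted):
        co_located = [ref[i] for ref in reference_blocks]
        lower, upper = min(co_located), max(co_located)
        # Above the interval -> upper limit; below -> lower limit; else unchanged.
        adjusted.append(min(max(value, lower), upper))
    return adjusted


if __name__ == "__main__":
    forward = [120, 100, 95]
    backward = [110, 105, 105]
    print(clamp_to_reference_range([130, 90, 100], [forward, backward]))
```

In the example, the first pixel exceeds its interval [110, 120] and is lowered to 120, the second falls below [100, 105] and is raised to 100, and the third already lies inside [95, 105] and is left unchanged.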
21. The method according to claim 16, wherein performing inter-frame prediction based on the plurality of reference blocks corresponding to the current block in the bitstream to determine the predicted pixel information of the current block includes: for each of the plurality of reference blocks, determining the unidirectional predicted pixel information of the current block based on the pixel information of that reference block; and determining the predicted pixel information of the current block based on the unidirectional predicted pixel information of the current block corresponding to each of the plurality of reference blocks.
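The simplest way to combine two unidirectional predictions is their per-pixel mean, which the dependent claims refer to as the mean of the unidirectional predicted pixel information. The Python sketch below is a minimal illustration; the rounding offset of 1 before the shift and the function name `bi_prediction_mean` are assumptions, not details stated in the claims.

```python
from typing import List


def bi_prediction_mean(pred0: List[int], pred1: List[int]) -> List[int]:
    """Per-pixel rounded average of the forward and backward predictions."""
    return [(a + b + 1) >> 1 for a, b in zip(pred0, pred1)]


if __name__ == "__main__":
    print(bi_prediction_mean([100, 51], [60, 52]))
```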
22. The method according to claim 21, wherein the plurality of reference blocks includes a forward reference block and a backward reference block, and the predicted pixel information of the current block is determined by the following formula: Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean of the unidirectional predicted pixel information of the plurality of reference blocks, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction intensity, >> represents the right shift operator, BigFlag indicates whether to correct the predicted pixel information, and BigIdx indicates the correction direction.
23. The method according to claim 21, wherein determining the predicted pixel information of the current block based on the unidirectional predicted pixel information corresponding to each of the plurality of reference blocks includes: determining a weight coefficient for the unidirectional predicted pixel information corresponding to each reference block; and determining the predicted pixel information of the current block based on the unidirectional predicted pixel information corresponding to each of the plurality of reference blocks and the weight coefficient of the unidirectional predicted pixel information corresponding to each of the plurality of reference blocks.
24. The method according to claim 23, wherein the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block are determined by the costs corresponding to all or part of the templates of each of the plurality of reference blocks.
25. The method according to claim 24, wherein the plurality of reference blocks includes a forward reference block and a backward reference block, and the weight coefficients of the unidirectional predicted pixel information corresponding to each reference block are determined by the following formula: Here, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, cost0 represents the cost corresponding to all or part of the templates of the forward reference block, and cost1 represents the cost corresponding to all or part of the templates of the backward reference block.
26. The method according to claim 23, wherein the plurality of reference blocks includes a forward reference block and a backward reference block, and the predicted pixel information of the current block is determined by the following formula: Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean of the unidirectional predicted pixel information of the plurality of reference blocks, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction intensity, >> represents the right shift operator, BigFlag indicates whether to correct the predicted pixel information, and BigIdx indicates the correction direction.
27. The method according to claim 21, wherein the bitstream includes a first identifier; the first identifier indicates whether the predicted pixel information of the current block is determined based on the weight coefficient of the unidirectional predicted pixel information corresponding to each reference block.
28. The method according to claim 21, wherein the bitstream includes a second identifier; the second identifier indicates whether, when the bidirectional optical flow condition is met, bidirectional optical flow correction is performed on the mean information of the unidirectional predicted pixel information of the plurality of reference blocks.
29. The method according to claim 28, wherein the plurality of reference blocks includes a forward reference block and a backward reference block; when the bidirectional optical flow condition is met, the predicted pixel information of the current block is determined by the following formula: Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean information of the unidirectional predicted pixel information of the plurality of reference blocks, Pred_BIO represents the correction information obtained by bidirectional optical flow correction of the mean information, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction intensity, >> represents the right shift operator, BigFlag indicates whether to correct the predicted pixel information, BigIdx indicates the correction direction, and bgc_mean_flag represents the second identifier; alternatively, when the bidirectional optical flow condition is met, the predicted pixel information of the current block is determined by the following formula: Here, Pred represents the predicted pixel information of the current block, Pred_BI represents the mean information of the unidirectional predicted pixel information of the plurality of reference blocks, Pred_BIO represents the correction information obtained by bidirectional optical flow correction of the mean information, pred0 represents the unidirectional predicted pixel information corresponding to the forward reference block, w0 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the forward reference block, pred1 represents the unidirectional predicted pixel information corresponding to the backward reference block, w1 represents the weight coefficient of the unidirectional predicted pixel information corresponding to the backward reference block, k represents the correction intensity, >> represents the right shift operator, BigFlag indicates whether to correct the predicted pixel information, BigIdx indicates the correction direction, and bgc_mean_flag represents the second identifier.
30. An encoding device, comprising: a memory and a processor, wherein the memory and the processor are coupled; the memory is configured to store instructions executable by the processor; and the processor, when executing the instructions, performs the method according to any one of claims 1 to 15.
31. A decoding device, comprising: a memory and a processor, wherein the memory and the processor are coupled; the memory is configured to store instructions executable by the processor; and the processor, when executing the instructions, performs the method according to any one of claims 16 to 29.
32. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions that, when executed on a computer, cause the computer to perform the method according to any one of claims 1 to 15 or the method according to any one of claims 16 to 29.
33. A computer program product, wherein the computer program product includes computer program instructions that, when executed by a processor, implement the method according to any one of claims 1 to 15 or the method according to any one of claims 16 to 29.