Decoding method, encoding method, decoder, and encoder
The decoding and encoding methods with geometric division modes and transformations address the challenge of improving compression efficiency in digital video coding, enhancing decompression efficiency and video clarity.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Patents
- Current Assignee / Owner
- GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP LTD
- Filing Date
- 2022-04-12
- Publication Date
- 2026-07-01
AI Technical Summary
Existing digital video compression technologies face challenges in improving compression efficiency to meet the growing demand for high video clarity and efficient data transmission and storage.
A decoding method and encoding method that incorporate geometric division modes and additional transformations to enhance prediction and residual block processing, utilizing a decoder and encoder with specific units for transformation, prediction, and reconstruction to improve decompression efficiency.
Enhances decompression efficiency by optimizing video coding processes through geometric partitioning and additional transformations, resulting in improved compression performance.
Abstract
Description
Technical Field
[0001] Embodiments of the present application relate to the technical field of image and video coding, and more specifically, to a decoding method, an encoding method, a decoder, and an encoder.
Background Art
[0002] Digital video compression technology mainly compresses huge volumes of digital video data for transmission, storage, and the like. With the rapid growth of Internet video and rising demands for video clarity, existing digital video compression standards can already realize video compression and decompression, but better digital video compression technology is still needed to improve compression efficiency.
Summary of the Invention
[0003] Embodiments of the present application provide a decoding method, an encoding method, a decoder, and an encoder, thereby improving compression efficiency.
[0004] In a first aspect, the present application provides a decoding method. This decoding method includes decoding a bitstream to obtain a first transform coefficient of a current block; performing a first transform on the first transform coefficient of the current block to obtain a second transform coefficient of the current block; performing a second transform on the second transform coefficient to obtain a residual block of the current block; performing prediction on the current block based on a first prediction mode and a second prediction mode corresponding to a geometric partitioning mode to obtain a predicted block of the current block; and obtaining a reconstructed block of the current block based on the predicted block of the current block and the residual block of the current block.
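The order of the steps in the first aspect can be sketched as follows. This is only an illustration under stated assumptions, not the method of the application: the helper names `blend_predictions` and `decode_block`, the 0 to 8 weight range, and passing the two transforms in as callables are all hypothetical, and the paragraph above does not say how the two predictions are combined.

```python
from typing import Callable, List

Block = List[List[int]]


def blend_predictions(pred_a: Block, pred_b: Block, weights: Block,
                      max_weight: int = 8) -> Block:
    """Blend two prediction blocks sample by sample.

    Assumption: each weight lies in 0..max_weight and applies to pred_a,
    while (max_weight - weight) applies to pred_b.
    """
    half = max_weight // 2  # rounding offset for the integer division
    return [
        [(w * a + (max_weight - w) * b + half) // max_weight
         for a, b, w in zip(row_a, row_b, row_w)]
        for row_a, row_b, row_w in zip(pred_a, pred_b, weights)
    ]


def decode_block(first_coeffs: Block,
                 first_transform: Callable[[Block], Block],
                 second_transform: Callable[[Block], Block],
                 pred_a: Block, pred_b: Block, weights: Block) -> Block:
    """Follow the step order of the first aspect for one block."""
    # First transform: first transform coefficients -> second coefficients.
    second_coeffs = first_transform(first_coeffs)
    # Second transform: second transform coefficients -> residual block.
    residual = second_transform(second_coeffs)
    # Prediction from two prediction modes (hypothetical weighted blend).
    predicted = blend_predictions(pred_a, pred_b, weights)
    # Reconstruction: predicted block plus residual block.
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(predicted, residual)]


def identity(block: Block) -> Block:
    """Placeholder transform used only to make the sketch runnable."""
    return [row[:] for row in block]


recon = decode_block(
    first_coeffs=[[1, 0], [0, -1]],
    first_transform=identity,
    second_transform=identity,
    pred_a=[[100, 100], [100, 100]],
    pred_b=[[20, 20], [20, 20]],
    weights=[[8, 4], [4, 0]],
)
print(recon)
```

In a real codec the two callables would be actual inverse transforms rather than the identity placeholder used here.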
[0005] In a second aspect, the present application provides an encoding method. This encoding method includes: performing prediction on a current block based on a first prediction mode and a second prediction mode corresponding to a geometric partitioning mode to obtain a predicted block of the current block; obtaining a residual block of the current block based on the predicted block of the current block; performing a third transform on the residual block of the current block to obtain a third transform coefficient of the current block; performing a fourth transform on the third transform coefficient to obtain a fourth transform coefficient of the current block; and encoding the fourth transform coefficient.
[0006] In a third aspect, the present application provides a decoder comprising a decoding unit, a transformation unit, a prediction unit, and a reconstruction unit. The decoding unit is configured to decode a bitstream to obtain a first transform coefficient of the current block. The transformation unit is configured to perform a first transform on the first transform coefficient to obtain a second transform coefficient of the current block, and to perform a second transform on the second transform coefficient to obtain a residual block of the current block. The prediction unit is configured to perform prediction on the current block based on a first prediction mode and a second prediction mode corresponding to a geometric partitioning mode to obtain a predicted block of the current block. The reconstruction unit is configured to obtain a reconstructed block of the current block based on the predicted block of the current block and the residual block of the current block.
[0007] In a fourth aspect, the present application provides an encoder comprising a prediction unit, a residual unit, a transformation unit, and an encoding unit. The prediction unit is configured to perform prediction on the current block based on a first prediction mode and a second prediction mode corresponding to a geometric partitioning mode to obtain a predicted block of the current block. The residual unit is configured to obtain a residual block of the current block based on the predicted block of the current block. The transformation unit is configured to perform a third transform on the residual block of the current block to obtain a third transform coefficient of the current block, and to perform a fourth transform on the third transform coefficient to obtain a fourth transform coefficient of the current block. The encoding unit is configured to encode the fourth transform coefficient.
[0008] In a fifth aspect, the present invention provides a decoder, which includes a processor and a computer-readable storage medium. The processor is configured to execute computer instructions. Computer instructions are stored in the computer-readable storage medium, and when the computer instructions are read and executed by the processor, the decoding method of the first aspect or each embodiment thereof is realized.
[0009] In one embodiment, one or more processors and one or more memories are provided.
[0010] In one embodiment, the computer-readable storage medium may be integrated with the processor, or it may be installed separately from the processor.
[0011] In a sixth aspect, the present invention provides an encoder, which includes a processor and a computer-readable storage medium. The processor is configured to execute computer instructions. Computer instructions are stored in the computer-readable storage medium, and when the computer instructions are read and executed by the processor, the encoding method of the second aspect or each embodiment thereof is realized.
[0012] In one embodiment, one or more processors and one or more memories are provided.
[0013] In one embodiment, the computer-readable storage medium may be integrated with the processor, or it may be installed separately from the processor.
[0014] In a seventh aspect, the present invention provides a computer-readable storage medium. Computer instructions are stored in this computer-readable storage medium, and when these computer instructions are read and executed by the processor of a computer device, the computer device executes the decoding method according to the first aspect or the encoding method according to the second aspect.
[0015] In an eighth aspect, the present application provides a bitstream, which is either the bitstream described in the first aspect or the bitstream described in the second aspect.
[0016] Based on the above technical solution, the present application can improve the decompression efficiency of the current block by introducing the first transform on top of the geometric partitioning mode and the second transform.
Brief Description of the Drawings
[0017] [Figure 1] A block diagram showing an encoding framework according to an embodiment of the present application.
[Figure 2] A schematic diagram showing the specific directions of 65 types of angle prediction modes according to an embodiment of the present application.
[Figure 3] An example of reference samples of a wide-angle prediction mode according to an embodiment of the present application.
[Figure 4] A schematic diagram of the MIP mode according to an embodiment of the present application.
[Figure 5] A schematic diagram showing the derivation of a prediction mode based on DIMD according to an embodiment of the present application.
[Figure 6] A schematic diagram showing the derivation of a prediction block based on DIMD according to an embodiment of the present application.
[Figure 7] A schematic diagram of a template used in TIMD according to an embodiment of the present application.
[Figure 8] An example of weight diagrams corresponding to 64 types of weight derivation modes in GPM for a square block according to an embodiment of the present application.
[Figure 9] An example of a dividing line of a weight derivation mode according to an embodiment of the present application.
[Figure 10] An example of weight diagrams corresponding to 56 types of weight derivation modes in AWP for a square block according to an embodiment of the present application.
[Figure 11] A schematic diagram of GPM or AWP according to an embodiment of the present application.
[Figure 12] An example of a DCT2-type base image according to an embodiment of the present application.
[Figure 13] An example of LFNST according to an embodiment of the present application.
[Figure 14] An example of a transform matrix set of LFNST according to an embodiment of the present application.
[Figure 15] A block diagram showing a decoding framework according to an embodiment of the present application.
[Figure 16] A flowchart showing a decoding method according to an embodiment of the present application.
[Figure 17] A flowchart showing an encoding method according to an embodiment of the present application.
[Figure 18] A block diagram showing a decoder according to an embodiment of the present application.
[Figure 19] A block diagram showing an encoder according to an embodiment of the present application.
[Figure 20] A block diagram showing an electronic device according to an embodiment of the present application.
Embodiments for Carrying out the Invention
[0018] Hereinafter, referring to the drawings, the technical solution in the embodiment of the present application will be described.
[0019] The solution according to the embodiment of the present application can be applied to the technical field of digital video coding, which includes, for example, the field of image coding, the field of video coding, the field of hardware video coding, the field of dedicated circuit video coding, and the field of real-time video coding, but is not limited thereto. Further, the solution according to the embodiment of the present application can be combined with the audio video coding standard (AVS), the second-generation AVS standard (AVS2), or the third-generation AVS standard (AVS3). As further examples, it can be combined with the H.264 / advanced video coding (AVC) standard, the H.265 / high efficiency video coding (HEVC) standard, and the H.266 / versatile video coding (VVC) standard, but is not limited thereto. Further, the solution according to the embodiment of the present application can perform lossy compression or lossless compression on an image. The lossless compression may be visually lossless compression or mathematically lossless compression.
[0020] A block-based hybrid coding framework is used in video coding standards. Each image (frame) in a video is divided into square largest coding units (LCUs) or coding tree units (CTUs) of the same size (e.g., 128x128, 64x64, etc.). Each largest coding unit or coding tree unit can be further divided into rectangular coding units (CUs) based on rules. Coding units can be further divided into prediction units (PUs), transform units (TUs), etc. The hybrid coding framework includes modules such as prediction, transform, quantization, entropy coding, and in-loop filtering. The prediction module includes intra-prediction and inter-prediction. Inter-prediction includes motion estimation and motion compensation. Because there is a strong correlation between adjacent samples within a single image of a video, video coding uses intra-prediction to eliminate spatial redundancy between adjacent samples. In intra-prediction, sample information within the current partitioned block is predicted by referring only to information from the same image. Because there is a strong similarity between adjacent images in a video, video coding uses inter-prediction to eliminate temporal redundancy between adjacent images and thereby improve coding efficiency. In inter-prediction, image information from different frames is referred to, and motion estimation is used to find the motion vector information that best matches the current partitioned block. The transform converts the image block remaining after prediction (that is, the residual) into the frequency domain and redistributes its energy. Combined with quantization, the transform can remove information to which the human eye is insensitive, which eliminates visual redundancy. Entropy coding can remove character redundancy based on the current context model and the probability information of the binary bitstream.
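As a small worked example of the first partitioning step, the sketch below counts how many CTUs cover a picture. The function name `ctu_grid` is hypothetical, and rounding up at the right and bottom picture edges (so that partial CTUs still count) is an assumption of this example rather than something stated above.

```python
def ctu_grid(width: int, height: int, ctu_size: int = 128) -> tuple:
    """Return (columns, rows) of square CTUs needed to cover a picture.

    Assumption: a picture whose size is not a multiple of ctu_size is
    covered by rounding up, so edge CTUs may be only partly filled.
    """
    cols = (width + ctu_size - 1) // ctu_size   # integer ceiling division
    rows = (height + ctu_size - 1) // ctu_size
    return cols, rows


for size in (128, 64):
    cols, rows = ctu_grid(1920, 1080, size)
    print(f"{size}x{size} CTUs: {cols} x {rows} = {cols * rows}")
```

For a 1920x1080 picture this gives a 15 x 9 grid of 128x128 CTUs and a 30 x 17 grid of 64x64 CTUs.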
[0021] In the digital video encoding process, the encoder first reads a grayscale or color image from the original video sequence and then encodes it. Here, a grayscale image may contain samples of the luma component, and a color image may contain samples of the chroma components. Optionally, the color image may further contain samples of the luma component. The color format of the original video sequence may be a luminance-chroma (YCbCr, YUV) format or a red-green-blue (RGB) format, etc. Specifically, after reading a grayscale or color image, the encoder divides it into blocks, generates a predicted block of the current block by performing intra-prediction or inter-prediction on the current block, obtains a residual block by subtracting the predicted block from the original block of the current block, transforms and quantizes the residual block to obtain a quantization coefficient matrix, and entropy-codes the quantization coefficient matrix for output to a bitstream. In the digital video decoding process, the decoding side generates a predicted block of the current block by performing intra-prediction or inter-prediction on the current block. Furthermore, the decoding side decodes the bitstream to obtain the quantization coefficient matrix, inversely quantizes and inversely transforms the quantization coefficient matrix to obtain the residual block, and adds the predicted block and the residual block to obtain the reconstructed block. The reconstructed blocks form the reconstructed image. The decoding side loop-filters the reconstructed image on an image or block basis to obtain the decoded image.
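The residual, quantization, and reconstruction steps described above can be illustrated with a deliberately simplified sketch. It omits the transform, entropy coding, and loop filtering, and it uses a plain uniform scalar quantizer; the function names and the quantization step `qstep` are assumptions of this example, not part of the application.

```python
def quantize(value: int, qstep: int) -> int:
    """Uniform scalar quantization with rounding to the nearest level."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + qstep // 2) // qstep)


def encode_block(original, predicted, qstep):
    """Encoder side: residual = original - predicted, then quantize.

    The transform step is omitted here for brevity (assumption).
    """
    return [[quantize(o - p, qstep) for o, p in zip(row_o, row_p)]
            for row_o, row_p in zip(original, predicted)]


def reconstruct_block(levels, predicted, qstep):
    """Decoder side: inverse quantize, then add the predicted block."""
    return [[p + level * qstep for level, p in zip(row_l, row_p)]
            for row_l, row_p in zip(levels, predicted)]


original = [[52, 55], [61, 59]]
predicted = [[50, 50], [60, 60]]
levels = encode_block(original, predicted, qstep=4)
reconstructed = reconstruct_block(levels, predicted, qstep=4)
print(levels)         # [[1, 1], [0, 0]]
print(reconstructed)  # [[54, 54], [60, 60]]
```

The reconstructed block differs from the original because quantization is lossy. Since `reconstruct_block` uses only the quantized levels and the predicted block, an encoder that runs the same function obtains exactly the block the decoder will see.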
[0022] The current block can be the current coding unit (CU) or the current prediction unit (PU), among others.
[0023] Furthermore, the encoding side also requires processing similar to that on the decoding side in order to obtain the decoded image. The decoded image can be used as a reference image for inter-prediction of subsequent images. Block partitioning information and the mode or parameter information for prediction, transform, quantization, entropy coding, loop filtering, and so on, as determined on the encoding side, are output to the bitstream as needed. The decoding side parses the bitstream and analyzes the existing information to determine the same block partitioning information and the same mode or parameter information for prediction, transform, quantization, entropy coding, and loop filtering as on the encoding side. This ensures that the decoded image obtained on the encoding side is the same as the decoded image obtained on the decoding side. The decoded image obtained on the encoding side is usually also called the reconstructed image. During prediction, the current block may be divided into prediction units, and during transform, the current block may be divided into transform units; the division into prediction units and the division into transform units may be the same or different. The above is the basic flow of video coding in a block-based hybrid coding framework. As technology advances, some modules or steps in the flow of the framework may be optimized. This application applies to the basic flow of a video codec in such a block-based hybrid coding framework.
[0024] To facilitate understanding, we will first briefly explain the encoding framework related to this application.
[0025] Figure 1 is a block diagram showing an encoding framework 100 according to an embodiment of the present application.
[0026] As shown in Figure 1, the encoding framework 100 may include an intra-prediction unit 180, an inter-prediction unit 170, a residual unit 110, a transform and quantization unit 120, an entropy encoding unit 130, an inverse transform and inverse quantization unit 140, and a loop filtering unit 150. Optionally, the encoding framework 100 may further include a decoded image buffer unit 160. The encoding framework 100 is also called a hybrid framework encoding mode.
[0027] The intra-prediction unit 180 or inter-prediction unit 170 can predict the block awaiting coding and output a predicted block. The residual unit 110 can calculate the residual block, i.e., the difference between the predicted block and the block awaiting coding, based on the predicted block and the block awaiting coding. The transform and quantization unit 120 performs operations such as transform and quantization on the residual block, thereby removing information to which the human eye is insensitive and eliminating visual redundancy. Optionally, the residual block before transform and quantization by the transform and quantization unit 120 may be called a time-domain residual block, and the time-domain residual block after transform and quantization by the transform and quantization unit 120 may be called a frequency residual block or frequency-domain residual block. The entropy encoding unit 130 can receive the quantized transform coefficients output by the transform and quantization unit 120 and output a bitstream based on them. For example, the entropy encoding unit 130 can eliminate character redundancy based on a target context model and the probability information of the binary bitstream. For example, the entropy encoding unit 130 can be used for context-based adaptive binary arithmetic coding (CABAC). The entropy encoding unit 130 may also be called a header information encoding unit. Optionally, in this application, the block awaiting coding may also be called an original block or a target block. The predicted block may also be called a predicted image block or an image prediction block, and may also be called a prediction signal or prediction information. A reconstructed block may also be called a reconstructed image block or image reconstruction block, and may also be called a reconstructed signal or reconstructed information. Furthermore, on the encoding side, the block awaiting coding may also be called an encoding block or an encoded image block. On the decoding side, the block awaiting coding may also be called a decoding block or a decoded image block. The block awaiting coding may be a CTU or a CU.
[0028] In the encoding framework 100, the difference between the predicted block and the block awaiting coding is calculated to obtain the residual block. Processes such as transformation and quantization are performed on the residual block, and the residual block is transmitted to the decoding side. Accordingly, the decoding side receives the bitstream, decodes it, obtains the residual block through steps such as inverse transformation and inverse quantization, and obtains the reconstructed block by adding the residual block to the predicted block obtained by the decoding side.
[0029] Furthermore, the inverse transform and inverse quantization unit 140, the loop filtering unit 150, and the decoded image buffer unit 160 within the encoding framework 100 can be used to form a decoder. The intra-prediction unit 180 or inter-prediction unit 170 can predict blocks awaiting coding based on existing reconstructed blocks, thereby ensuring that the encoding and decoding sides use reference frames in the same manner. In other words, the encoder can replicate the decoder's processing loop and thereby generate the same predictions as the decoding side. Specifically, the quantized transform coefficients are inversely quantized and inversely transformed by the inverse transform and inverse quantization unit 140 to replicate the approximate residual block on the decoding side. After the predicted block is added to this approximate residual block, the result can pass through the loop filtering unit 150, which smooths out blocking artifacts and other effects caused by block-based processing and quantization. The blocks output from the loop filtering unit 150 can be stored in the decoded image buffer unit 160 for use in predicting subsequent images.
[0030] Figure 1 is merely one example of this application and should not be interpreted as a limitation of this application.
[0031] For example, the loop filtering unit 150 within the encoding framework 100 may include a deblocking filter (DBF) and a sample adaptive offset (SAO). The role of the DBF is to remove blocking artifacts, and the role of the SAO is to remove ringing effects. In other embodiments of the present application, a neural-network-based loop filtering algorithm can be used in the encoding framework 100 to improve video compression efficiency. Alternatively, the encoding framework 100 may be a hybrid video coding framework based on a deep learning neural network. In one embodiment, on top of the deblocking filter and the SAO, a model based on a convolutional neural network can be used to calculate the filtered sample results. The network structure for the luminance component and the network structure for the chroma component of the loop filtering unit 150 may be the same or different. Given that the luminance component contains more visual information, the luminance component can be used to guide the filtering of the chroma component in order to improve the reconstruction quality of the chroma component.
[0032] Next, we will explain intra-prediction and inter-prediction.
[0033] Inter-prediction uses image information from different frames and motion estimation to search for the motion vector information that best matches the block awaiting coding, thereby eliminating temporal redundancy. The frames used for inter-prediction may be P-frames and / or B-frames, where P-frames refer to forward prediction frames and B-frames refer to bidirectional prediction frames.
[0034] Intra-prediction refers only to information from the same image to predict sample information within the block awaiting coding, which eliminates spatial redundancy. The frame used for intra-prediction may be an I-frame. For example, following a coding order from left to right and top to bottom, the top-left block, the upper block, and the left-side block can be used as reference information to predict the block awaiting coding. The block awaiting coding is in turn used as reference information for the next block. In this way, the entire image can be predicted. If the input digital video has a color format such as YUV 4:2:0, every four pixels of each image frame consist of four Y components and two UV components (one U and one V), and the encoding framework can encode the Y components (i.e., luminance blocks) and the UV components (i.e., chroma blocks) separately. Similarly, the decoding side can decode according to the format.
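The 4:2:0 sample ratio mentioned above can be checked with a short calculation. The function name `yuv420_plane_samples` is hypothetical, and rounding odd picture dimensions up for the chroma planes is an assumption of this sketch.

```python
def yuv420_plane_samples(width: int, height: int) -> tuple:
    """Return the number of samples in the Y, U, and V planes for 4:2:0.

    In 4:2:0, each chroma plane has half the luma resolution both
    horizontally and vertically. Odd dimensions are rounded up here
    (assumption of this sketch).
    """
    luma = width * height
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    return luma, chroma, chroma


y, u, v = yuv420_plane_samples(1920, 1080)
print(y, u, v)           # 2073600 518400 518400
print((y + u + v) / y)   # 1.5 samples per pixel on average
```

A 2x2 group of pixels therefore carries four Y samples plus one U and one V sample, which matches the "four Y components and two UV components" description.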
[0035] In the intra-prediction process, angle prediction modes and non-angle prediction modes are used to predict the block awaiting coding and obtain predicted blocks. Based on rate-distortion information calculated from the predicted blocks and the block awaiting coding, the optimal prediction mode for the block awaiting coding is selected, and this prediction mode can be transmitted to the decoding side via the bitstream. The decoding side can parse the prediction mode, perform prediction to obtain the predicted block of the target decoding block, and obtain the reconstructed block by adding the time-domain residual block obtained from the bitstream.
[0036] Through the development of successive digital video coding standards, non-angle prediction modes have remained relatively stable, consisting of the average (DC) mode and the planar mode. The number of angle prediction modes has increased with the evolution of digital video coding standards. Taking the H-series of international digital video coding standards as an example, the H.264 / AVC standard has only 8 angle prediction modes and 1 non-angle prediction mode. H.265 / HEVC expands this to 33 angle prediction modes and 2 non-angle prediction modes. In H.266 / VVC, the intra-prediction modes for luminance blocks are further expanded to include 67 conventional prediction modes and non-conventional prediction modes, namely matrix-weighted intra-prediction (MIP) modes. The 67 conventional prediction modes include the planar mode, the DC mode, and 65 angle prediction modes. The planar mode is typically used for blocks whose texture changes gradually, the DC mode is typically used for flat areas, as its name suggests, and the angle prediction modes are typically used for blocks with a relatively obvious directional texture.
[0037] In this application, the current block used for intra prediction may be a square block or a rectangular block.
[0038] Furthermore, in the past all intra-prediction blocks were square, so each angle prediction mode was equally likely to be used. Now, when the width and height of a block are not equal, for horizontal blocks (width greater than height) the upper reference samples are more likely to be used than the left reference samples, and for vertical blocks (height greater than width) the upper reference samples are less likely to be used than the left reference samples. Based on this, the present application introduces a wide-angle prediction mode. When a rectangular block is predicted, the conventional angle prediction mode is converted to the wide-angle prediction mode, and the prediction angle range available to the current block is then larger than when the rectangular block is predicted with the conventional angle prediction mode. Optionally, when the wide-angle prediction mode is used, the index of the conventional angle prediction mode can still be used for signaling; accordingly, after receiving the signal, the decoding side converts the conventional angle prediction mode to the wide-angle prediction mode. In this way, neither the total number of intra-prediction modes nor the intra-mode encoding method changes.
[0039] Figure 2 is a schematic diagram showing the specific directions of 65 types of angle prediction modes according to the embodiment of the present application.
[0040] As shown in Figure 2, index 0 represents the planar mode, index 1 represents the DC mode, and indices -14 to 80 each represent different angle prediction modes. Specifically, indices 2 to 66 represent the conventional angle prediction mode, and indices -1 to -14 and indices 67 to 80 represent the wide-angle prediction mode. In other words, the conventional intra-prediction mode represented by indices 2 to 66 can be used to predict square blocks, and the wide-angle prediction modes represented by indices -1 to -14 and 67 to 80 can be used to predict rectangular blocks.
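The index ranges above can be written as a small lookup. This is a minimal sketch; the function name `classify_intra_mode` and the returned labels are hypothetical and only restate the ranges given in the text.

```python
def classify_intra_mode(index: int) -> str:
    """Map an intra-prediction mode index to the category described above."""
    if index == 0:
        return "planar"
    if index == 1:
        return "DC"
    if 2 <= index <= 66:
        return "conventional angle"
    if -14 <= index <= -1 or 67 <= index <= 80:
        return "wide-angle"
    raise ValueError(f"index {index} is outside the range -14..80")


# Count how many indices fall into each angle category.
conventional = sum(
    classify_intra_mode(i) == "conventional angle" for i in range(-14, 81))
wide = sum(classify_intra_mode(i) == "wide-angle" for i in range(-14, 81))
print(conventional, wide)  # 65 28
```

The count of 65 conventional angle modes agrees with Figure 2, and the two wide-angle ranges contribute 14 indices each.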
[0041] It should be understood that the prediction mode indicated by the index x according to the present application can also be called the prediction mode x. For example, the intra prediction mode indicated by index 2 may sometimes be called intra prediction mode 2.
[0042] Figure 3 is an example of reference samples of the wide-angle prediction mode according to an embodiment of the present application.
[0043] As shown in Figure 3, in the wide-angle prediction mode, for a CU of size W×H, the number of reference samples above it is 2W + 1, and the number of reference samples on its left side is 2H + 1. Specifically, as shown in (a) of Figure 3, when W > H (for example, an 8×4 CU), in the vicinity of intra-prediction mode 2 (modes greater than intra-prediction mode 2), there are cases where the lower right corner point of the CU cannot be indexed to a reference sample, whereas in the vicinity of intra-prediction mode 66 (modes greater than intra-prediction mode 66), there are still points that can be indexed to a reference sample. Therefore, some of the quasi-horizontal angle modes in the vicinity of intra-prediction mode 2 (greater than intra-prediction mode 2) need to be replaced with some of the quasi-vertical angle modes in the vicinity of intra-prediction mode 66 (greater than intra-prediction mode 66), thereby expanding the prediction angle range. Similarly, as shown in (b) of Figure 3, when W < H (for example, a 4×8 CU), in the vicinity of intra-prediction mode 66 (modes less than intra-prediction mode 66), some points cannot be indexed to a reference sample, whereas in the vicinity of intra-prediction mode 2 (modes less than intra-prediction mode 2), there are still points that can be indexed to a reference sample. Therefore, some of the quasi-vertical angle modes in the vicinity of intra-prediction mode 66 (less than intra-prediction mode 66) need to be replaced with some of the quasi-horizontal angle modes in the vicinity of intra-prediction mode 2 (less than intra-prediction mode 2), thereby expanding the prediction angle range.
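The reference sample counts and the direction of the mode replacement can be summarized in a short sketch. The function names and the returned labels are hypothetical. How many modes are replaced for a given aspect ratio is not stated above, so the sketch deliberately does not compute it.

```python
def reference_sample_counts(width: int, height: int) -> tuple:
    """Return (samples above, samples to the left) for a W x H CU."""
    return 2 * width + 1, 2 * height + 1


def wide_angle_replacement(width: int, height: int):
    """Describe which conventional modes give way to wide-angle modes.

    Returns (replaced modes, replacing modes), or None for a square block.
    """
    if width > height:
        # Horizontal block: modes just above mode 2 are replaced by
        # wide-angle modes beyond mode 66 (indices 67 and up).
        return ("near mode 2", "beyond mode 66")
    if width < height:
        # Vertical block: modes just below mode 66 are replaced by
        # wide-angle modes below mode 2 (negative indices).
        return ("near mode 66", "below mode 2")
    return None


print(reference_sample_counts(8, 4))  # (17, 9)
print(wide_angle_replacement(8, 4))   # ('near mode 2', 'beyond mode 66')
print(wide_angle_replacement(4, 8))   # ('near mode 66', 'below mode 2')
```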
[0044] In some cases, the intra-prediction mode to be executed can be determined or selected based on the size of the current block. For example, an intra-prediction can be performed on the current block by determining or selecting a wide-angle prediction mode based on the size of the current block. For example, if the current block is a rectangular block (different width and height), an intra-prediction can be performed on the current block using the wide-angle prediction mode. The ratio of the width to height of the current block can be used to determine the angle prediction mode that is replaced in the wide-angle prediction mode and the angle prediction mode after the replacement. For example, when predicting the current block, any intra-prediction mode for angles that do not exceed the diagonals of the current block (from the bottom left corner to the top right corner of the current block) can be selected as the replacement angle prediction mode.
[0045] The following describes other intra-prediction modes related to this application.
[0046] (1) Matrix-based Intra Prediction (MIP) mode
[0047] The MIP mode is also known as Matrix-weighted Intra Prediction mode. The process involved in the MIP mode can be divided into three main steps: the downsampling process, the matrix multiplication process, and the upsampling process. Specifically, first, in the downsampling process, spatially adjacent reconstructed samples are downsampled. Next, the obtained downsampled sample sequence is used as the input vector for the matrix multiplication process, that is, the output vector of the downsampling process is used as the input vector for the matrix multiplication process. The input vector of the matrix multiplication process is multiplied by a predetermined matrix, and a bias vector is added to the result to output the calculated sample vector. Finally, the output vector of the matrix multiplication process is used as the input vector for the upsampling process to obtain the final prediction block by upsampling.
[0048] Figure 4 is a schematic diagram of the MIP mode according to the present invention.
[0049] As shown in Figure 4, in MIP mode, the downsampling process obtains a downsampled upper-adjacent reconstructed sample vector bdrytop by averaging the reconstructed samples adjacent above the current coding unit, and a downsampled left-adjacent reconstructed sample vector bdryleft by averaging the reconstructed samples adjacent to the left of the current coding unit. After bdrytop and bdryleft are obtained, they are used together as the input vector bdryred for the matrix multiplication process. Specifically, a sample vector can be obtained from bdrytop and bdryleft as Ak · bdryred + bk, where Ak is a preset matrix, bk is a preset bias vector, and k is the index of the MIP mode. After the sample vector is obtained, it is upsampled by linear interpolation to obtain a predicted sample block whose number of samples matches the actual number of samples in the coding unit.
[0050] In other words, to predict a block with width W and height H, MIP requires H reconstructed samples from the left column of the current block and W reconstructed samples from the row above the current block as input. MIP generates the predicted block based on three steps: averaging of reference samples, matrix-vector multiplication, and interpolation. The core of MIP is matrix-vector multiplication, which can be considered the process of generating the predicted block using input samples (reference samples) in a matrix-vector multiplication manner. Various matrices are provided for MIP, and differences in prediction methods can be reflected in differences in matrices; if the input samples are the same, different matrices will yield different results. Furthermore, the processes of averaging and interpolation of reference samples are designed to consider the trade-off between performance and complexity. For blocks with large sizes, averaging of reference samples can achieve an effect that approximates downsampling, allowing the input to be adapted to a relatively small matrix. Interpolation provides an upsampling effect. Thus, it is no longer necessary to provide a separate MIP matrix for each block size; instead, only one or a few matrices of specific sizes need to be provided. As the need for compression performance increases and hardware performance improves, more complex MIPs may appear in next-generation standards.
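The three MIP steps described above (averaging of reference samples, matrix-vector multiplication with a bias, and interpolation) can be illustrated with the following minimal sketch. It is not the normative process: the function names, the use of a single reduced size red_size for both the boundary and the reduced prediction block, the floating-point arithmetic, and the plain bilinear upsampling are all assumptions made for illustration, and the matrix and bias passed in are hypothetical rather than trained values.

```python
def average_boundary(samples, out_len):
    """Downsample one boundary line to out_len values by averaging equal groups."""
    group = len(samples) // out_len
    return [sum(samples[i * group:(i + 1) * group]) / group for i in range(out_len)]


def matvec_with_bias(matrix, vector, bias):
    """Compute A * v + b for a matrix given as a list of rows."""
    return [sum(a * v for a, v in zip(row, vector)) + b for row, b in zip(matrix, bias)]


def _locate(pos, size_in, size_out):
    """Map an output index to two neighbouring input indices and a fraction."""
    if size_in == 1 or size_out == 1:
        return 0, 0, 0.0
    f = pos * (size_in - 1) / (size_out - 1)
    i0 = int(f)
    i1 = min(i0 + 1, size_in - 1)
    return i0, i1, f - i0


def upsample_linear(block, out_h, out_w):
    """Bilinear upsampling of a small block (simplified, assumed form)."""
    in_h, in_w = len(block), len(block[0])
    out = []
    for y in range(out_h):
        y0, y1, fy = _locate(y, in_h, out_h)
        row = []
        for x in range(out_w):
            x0, x1, fx = _locate(x, in_w, out_w)
            top = block[y0][x0] * (1 - fx) + block[y0][x1] * fx
            bottom = block[y1][x0] * (1 - fx) + block[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def mip_predict(top, left, matrix, bias, red_size, width, height):
    """Sketch of MIP: average the boundaries, apply A * bdryred + b, upsample."""
    bdry_top = average_boundary(top, red_size)
    bdry_left = average_boundary(left, red_size)
    bdry_red = bdry_top + bdry_left  # concatenated input vector
    reduced = matvec_with_bias(matrix, bdry_red, bias)
    block = [reduced[r * red_size:(r + 1) * red_size] for r in range(red_size)]
    return upsample_linear(block, height, width)


# Hypothetical 4x4 matrix (2x2 reduced block, 4 boundary inputs) that simply
# averages its inputs; a real MIP matrix would come from training.
demo_matrix = [[0.25] * 4 for _ in range(4)]
demo_pred = mip_predict([100] * 8, [100] * 8, demo_matrix, [0.0] * 4, 2, 8, 8)
```

Because the reduced size is fixed, the same small matrix can serve blocks of several sizes, which is the trade-off between performance and complexity mentioned above.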
[0051] The MIP mode can be regarded as a simplification of a neural network; for example, the matrices used in the MIP mode can be obtained by training. Therefore, the MIP mode has relatively strong generalization ability and can achieve prediction effects that conventional prediction modes cannot. The MIP mode is a model obtained by simplifying, several times over, the hardware and software complexity of a neural-network-based intra-prediction model. Because they are based on a large number of training samples, the multiple prediction modes represent multiple models and parameter sets, which better cover the textures of natural sequences.
[0052] MIP mode is somewhat similar to planar mode, but clearly, MIP mode is more complex and flexible than planar mode.
[0053] The number of MIP modes can vary depending on the coding unit's block size. For example, a 4x4 coding unit has 16 prediction modes. An 8x8 coding unit, or a coding unit with a width equal to 4 or a height equal to 4, has 8 prediction modes. For other sizes of coding units, it has 6 prediction modes. MIP modes also have a transpose function. For prediction modes that match the current size, MIP modes allow the encoder to attempt a transpose calculation. Therefore, MIP modes require a flag indicating whether or not MIP modes are used for the current coding unit. Furthermore, if MIP modes are used for the current coding unit, a transpose flag must also be transmitted to the decoder.
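The dependence of the number of MIP modes on the block size described above can be written as a small lookup. This is only a sketch of the three cases as stated in this paragraph; the function name is hypothetical.

```python
def mip_mode_count(width, height):
    """Number of MIP prediction modes for a coding unit of the given size."""
    if width == 4 and height == 4:
        return 16
    if (width == 8 and height == 8) or width == 4 or height == 4:
        return 8
    return 6
```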
[0054] (2) Decoder side Intra Mode Derivation (DIMD) mode
[0055] The core of DIMD mode is that the decoder derives the intra-prediction mode using the same method as the encoder. This avoids transmitting the intra-prediction mode index of the current coding unit in the bitstream, thereby saving bit overhead.
[0056] The specific process for DIMD mode can be divided into the following two main steps:
[0057] Step 1: Derive the prediction mode.
[0058] Figure 5 is a schematic diagram showing the derivation of a prediction mode based on DIMD according to an embodiment of the present application.
[0059] As shown in Figure 5(a), DIMD derives prediction modes using samples within a template in the reconstruction region (reconstruction samples to the left and above the current block). For example, the template can include three adjacent rows of reconstruction samples above the current block, three adjacent columns of reconstruction samples to the left, and the corresponding adjacent reconstruction sample to the upper left. Based on this, multiple gradient values can be determined within the template according to a window (e.g., the window shown in Figure 5(b) or Figure 5(c)). Each gradient value is used to obtain one intra prediction mode (ipm) that fits its gradient direction. Based on this, the encoder can derive prediction modes by selecting the prediction mode corresponding to the largest gradient value and the prediction mode corresponding to the second largest gradient value among the multiple gradient values. For example, as shown in Figure 5(b), for a 4x4 block, all samples for which gradient values need to be determined are analyzed to obtain the corresponding gradient histogram. For example, as shown in Figure 5(c), for blocks of other sizes, all samples for which gradient values need to be determined are analyzed to obtain the corresponding gradient histogram. Finally, the prediction modes corresponding to the largest gradient and the second largest gradient in the gradient histogram are defined as the derived prediction modes.
[0060] Of course, the gradient histogram in this application is merely an example for determining the derived prediction mode, and when implemented in practice, it can be carried out in various simple forms, and this application is not particularly limited thereto. Furthermore, this application is not limited to the method of statistically calculating the gradient histogram; for example, the gradient histogram can be statistically calculated using the Sobel operator or other methods.
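As one possible illustration of the statistics described above, the following sketch builds a gradient histogram over a template using the Sobel operator and picks the two strongest entries. The mapping from a gradient direction to an actual intra-prediction mode is not reproduced here; instead, the orientation is quantized into num_bins hypothetical bins, and the amplitude |Gx| + |Gy| is an assumed measure. All names are illustrative.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient_histogram(samples, num_bins=8):
    """Accumulate gradient amplitudes per orientation bin over a 2-D template."""
    hist = [0] * num_bins
    for y in range(1, len(samples) - 1):
        for x in range(1, len(samples[0]) - 1):
            gx = sum(SOBEL_X[j][i] * samples[y - 1 + j][x - 1 + i]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * samples[y - 1 + j][x - 1 + i]
                     for j in range(3) for i in range(3))
            amplitude = abs(gx) + abs(gy)
            if amplitude == 0:
                continue  # flat area, no direction to record
            # Fold the orientation into [0, pi) and quantize it (assumed binning).
            angle = math.atan2(gy, gx) % math.pi
            bin_index = min(int(angle / math.pi * num_bins), num_bins - 1)
            hist[bin_index] += amplitude
    return hist


def two_strongest(hist):
    """Return the indices of the largest and second largest histogram entries."""
    order = sorted(range(len(hist)), key=lambda b: hist[b], reverse=True)
    return order[0], order[1]


# Template with a vertical edge: all gradient energy falls into one bin.
template = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
histogram = gradient_histogram(template)
first, second = two_strongest(histogram)
```

In this toy template the second largest entry is zero, which corresponds to the case in which only one derived prediction mode is meaningful.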
[0061] Step 2: Derive the predicted block.
[0062] Figure 6 is a schematic diagram showing the derivation of a predicted block based on DIMD according to an embodiment of the present application.
[0063] As shown in Figure 6, the encoder can weight the predicted values corresponding to three intra-prediction modes (planar mode and two intra-prediction modes derived based on DIMD). The encoder and decoder use the same prediction block derivation method to obtain the predicted block for the current block. Assuming that the prediction mode corresponding to the largest gradient value is prediction mode 1 and the prediction mode corresponding to the second largest gradient value is prediction mode 2, the encoder checks the following two conditions. 1. The gradient value for prediction mode 2 is not 0. 2. Neither prediction mode 1 nor prediction mode 2 is planar mode or DC prediction mode.
[0064] If the above two conditions are not both satisfied, the predicted sample values of the current block are calculated using only prediction mode 1, i.e., the normal prediction process is applied with prediction mode 1. Otherwise, i.e., if both conditions are satisfied, the predicted block of the current block is derived by weighted averaging. The specific method is as follows: planar mode accounts for 1/3 of the weight, and the remaining 2/3 is shared between prediction mode 1 and prediction mode 2. For example, the gradient amplitude value of prediction mode 1 is divided by the sum of the gradient amplitude values of prediction mode 1 and prediction mode 2, and the result, scaled by the remaining 2/3, is used as the weight of prediction mode 1. Likewise, the gradient amplitude value of prediction mode 2 is divided by the sum of the gradient amplitude values of prediction mode 1 and prediction mode 2, and the result, scaled by the remaining 2/3, is used as the weight of prediction mode 2. Finally, weighted averaging is performed on the predicted blocks obtained based on the above three prediction modes, i.e., predicted block 1, predicted block 2, and predicted block 3 obtained based on planar mode, prediction mode 1, and prediction mode 2, respectively, to obtain the predicted block of the current coding unit. The decoder obtains its predicted block by the same steps.
[0065] In other words, the specific weights for step 2 above are calculated as follows: Weight(PLANAR) = 1/3; Weight(mode1) = 2/3 * (amp1 / (amp1 + amp2)); Weight(mode2) = 1 - Weight(PLANAR) - Weight(mode1), where mode1 and mode2 represent prediction mode 1 and prediction mode 2, respectively, and amp1 and amp2 represent the gradient amplitude values for prediction mode 1 and prediction mode 2, respectively. In DIMD mode, a flag must be transmitted to the decoder. This flag indicates whether DIMD mode is used for the current coding unit.
[0066] Of course, the weighted averaging method described above is merely one example of this application and should not be understood as a limitation of this application.
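The weight calculation of step 2 can be sketched as follows. The function and parameter names are hypothetical, gradient amplitudes are assumed to be integers, and exact fractions are used only to keep the example free of rounding; an actual codec would typically use integer arithmetic.

```python
from fractions import Fraction


def dimd_weights(amp1, amp2, mode1_is_planar_or_dc=False, mode2_is_planar_or_dc=False):
    """Return (Weight(PLANAR), Weight(mode1), Weight(mode2)) for DIMD step 2."""
    fuse = amp2 != 0 and not (mode1_is_planar_or_dc or mode2_is_planar_or_dc)
    if not fuse:
        # Conditions not both satisfied: only prediction mode 1 is used.
        return Fraction(0), Fraction(1), Fraction(0)
    w_planar = Fraction(1, 3)
    w_mode1 = Fraction(2, 3) * Fraction(amp1, amp1 + amp2)
    w_mode2 = 1 - w_planar - w_mode1
    return w_planar, w_mode1, w_mode2


def blend_three(planar_block, block1, block2, weights):
    """Weighted average of the three predicted blocks, sample by sample."""
    w_p, w_1, w_2 = weights
    return [[float(w_p * p + w_1 * a + w_2 * b)
             for p, a, b in zip(row_p, row_1, row_2)]
            for row_p, row_1, row_2 in zip(planar_block, block1, block2)]


# With amp1 = 30 and amp2 = 10, mode 1 receives 2/3 * 3/4 = 1/2 and mode 2 receives 1/6.
weights = dimd_weights(30, 10)
fused = blend_three([[90]], [[120]], [[60]], weights)
```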
[0067] In summary, DIMD uses gradient analysis of the reconstructed sample to select an intra-prediction mode, and depending on the analysis results, it is possible to weight two intra-prediction modes and a planar mode. The advantage of DIMD is that, if a DIMD mode is selected for a block, there is no need to specifically indicate which intra-prediction mode is being used in the bitstream, as this is derived by the decoder itself through the process described above, thus saving some overhead.
[0068] (3) Template-based Intra Mode Derivation (TIMD) Prediction Mode
[0069] The technical principle of TIMD mode is relatively similar to that of DIMD mode described above, and in both cases, the codec performs the same operation to derive the prediction mode, saving the overhead of transmitting the mode index. TIMD mode can be understood in two main parts. First, cost information for each prediction mode is calculated based on a template, and the prediction mode corresponding to the smallest cost and the prediction mode corresponding to the second smallest cost are selected. The prediction mode corresponding to the smallest cost is denoted as prediction mode 1, and the prediction mode corresponding to the second smallest cost is denoted as prediction mode 2. If the ratio of the second smallest cost (costMode2) to the smallest cost (costMode1) satisfies a pre-set condition such as costMode2 < 2*costMode1, then the prediction block corresponding to prediction mode 1 and the prediction block corresponding to prediction mode 2 are weighted and merged according to the weights corresponding to prediction mode 1 and prediction mode 2 to obtain the final prediction block.
[0070] For example, the weights corresponding to prediction mode 1 and prediction mode 2 are determined based on the following method. weight1 = costMode2 / (costMode1+ costMode2); weight2 = 1 - weight1; weight1 is the weight of the prediction block corresponding to prediction mode 1, and weight2 is the weight of the prediction block corresponding to prediction mode 2. If the ratio of the second smallest cost costMode2 to the smallest cost costMode1 does not satisfy a predetermined condition, weight fusion between prediction blocks does not occur, and the prediction block corresponding to prediction mode 1 becomes a TIMD prediction block.
[0071] Furthermore, when intra-prediction is performed on the current block using TIMD mode, if the reconstructed sample template for the current block does not contain any available adjacent reconstructed samples, TIMD mode selects planar mode and performs intra-prediction on the current block, i.e., weighted fusion is not performed. As with DIMD mode, TIMD mode requires a flag to be transmitted to the decoder. This flag indicates whether TIMD mode is used for the current coding unit.
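The TIMD decision described above can be sketched as follows, under the example condition costMode2 < 2*costMode1. The function names, the dictionary of template costs, the predict callback, and the mode labels are hypothetical and serve only to show the control flow, including the fallback to planar mode when no template samples are available.

```python
def blend(block_a, block_b, weight_a, weight_b):
    """Sample-wise weighted sum of two prediction blocks."""
    return [[weight_a * a + weight_b * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(block_a, block_b)]


def timd_predict(template_costs, predict, template_available=True, planar_mode="PLANAR"):
    """Sketch of TIMD: pick the two cheapest modes and fuse them if costs are close.

    template_costs maps a mode label to its cost on the template;
    predict(mode) returns the prediction block of the current block for that mode.
    """
    if not template_available:
        return predict(planar_mode)  # no weighted fusion in this case
    ranked = sorted(template_costs, key=template_costs.get)
    mode1, mode2 = ranked[0], ranked[1]
    cost1, cost2 = template_costs[mode1], template_costs[mode2]
    pred1 = predict(mode1)
    if cost2 < 2 * cost1:
        weight1 = cost2 / (cost1 + cost2)
        weight2 = 1 - weight1
        return blend(pred1, predict(mode2), weight1, weight2)
    return pred1


# Toy 1x1 "blocks" for three hypothetical modes.
blocks = {"A": [[100]], "B": [[0]], "C": [[50]], "PLANAR": [[7]]}
fused = timd_predict({"A": 10, "B": 30, "C": 15}, blocks.get)
```

Note that the mode with the smaller cost receives the larger weight, since weight1 is proportional to costMode2.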
[0072] Figure 7 is a schematic diagram of a template used in TIMD according to the embodiment of the present application.
[0073] As shown in Figure 7, if the current block is a coding unit with width equal to M and height equal to N, the codec can select a reference template for the current block from a coding unit with width equal to 2(M+L1)+1 and height equal to 2(N+L2)+1, and calculate the template for the current block. In this case, if the template for the current block does not contain any available adjacent reconstruction samples, the TIMD mode selects planar mode and performs intra-prediction for the current block. For example, available adjacent reconstruction samples may be samples adjacent to the left and above the current CU in Figure 7, i.e., there are no available reconstruction samples in the shaded area. In other words, if there are no available reconstruction samples in the shaded area, the TIMD mode selects planar mode and performs intra-prediction for the current block.
[0074] Except at boundaries, when the current block is coded, reconstructed values can in theory be obtained to the left of and above the current block, i.e., the template of the current block contains available adjacent reconstructed samples. In a specific embodiment, the decoder predicts the template using a certain intra-prediction mode and compares the predicted values with the reconstructed values to obtain the cost of that intra-prediction mode on the template. Examples of the cost include the Sum of Absolute Differences (SAD), the Sum of Absolute Transformed Differences (SATD), and the Sum of Squared Errors (SSE). Since the template and the current block are adjacent to each other, the reconstructed samples in the template and the samples in the current block are correlated. Therefore, the behavior of a prediction mode on the template can be used to estimate the behavior of that prediction mode on the current block. In TIMD, the template is predicted using several candidate intra-prediction modes to obtain the costs of the candidate intra-prediction modes on the template, and the predicted values of the one or two intra-prediction modes with the lowest cost are taken as the intra-prediction values of the current block. If the difference between the costs of the two intra-prediction modes on the template is not large, the compression performance can be improved by taking a weighted average of the predicted values of the two intra-prediction modes. Optionally, the weights of the predicted values of the two prediction modes are related to the above costs; for example, the weights are inversely proportional to the costs.
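The cost computation on the template can be sketched as follows for SAD and SSE (SATD is omitted because it additionally requires a transform such as a Hadamard transform). The function names and the dictionary of template predictions are hypothetical.

```python
def sad(predicted, reconstructed):
    """Sum of Absolute Differences between two equally sized 2-D blocks."""
    return sum(abs(p - r)
               for row_p, row_r in zip(predicted, reconstructed)
               for p, r in zip(row_p, row_r))


def sse(predicted, reconstructed):
    """Sum of Squared Errors between two equally sized 2-D blocks."""
    return sum((p - r) ** 2
               for row_p, row_r in zip(predicted, reconstructed)
               for p, r in zip(row_p, row_r))


def rank_modes_by_template_cost(template_recon, template_preds, cost_fn=sad):
    """Return (mode, cost) pairs sorted by increasing cost on the template.

    template_preds maps each candidate mode label to its prediction of the template.
    """
    costs = {mode: cost_fn(pred, template_recon) for mode, pred in template_preds.items()}
    return sorted(costs.items(), key=lambda item: item[1])


recon = [[10, 12], [14, 16]]
candidates = {
    "mode_a": [[10, 12], [14, 15]],   # close to the reconstruction
    "mode_b": [[0, 0], [0, 0]],       # far from the reconstruction
}
ranking = rank_modes_by_template_cost(recon, candidates)
```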
[0075] In summary, TIMD allows for the selection of an intra-prediction mode by leveraging the predictive effect of the intra-prediction mode on the template, and weighting two intra-prediction modes according to their cost on the template. The advantage of TIMD is that, once a TIMD mode is selected for a block, there is no need to specifically indicate which intra-prediction mode is being used in the bitstream, as this is derived by the decoder itself through the process described above, thus saving some overhead.
[0076] Through a brief introduction to the above intra-prediction modes, the following can be observed: The technical principles of DIMD mode and TIMD mode are similar. Both utilize the fact that the decoder performs the same operation as the encoder to estimate the prediction mode of the current coding unit. In such prediction modes, if the complexity is acceptable, the transmission of the prediction mode index can be omitted, saving overhead and improving compression efficiency. However, due to the limitations of the available information and the fact that they do not significantly improve prediction quality in themselves, DIMD and TIMD modes are more effective in large areas where texture characteristics are consistent. When the texture changes slightly or the template area cannot cover it, the prediction effect of such prediction modes is inferior.
[0077] Furthermore, both DIMD and TIMD modes merge or weight prediction blocks obtained based on multiple conventional prediction modes. Merging prediction blocks can produce effects that cannot be achieved with a single prediction mode. In DIMD mode, the planar mode can be incorporated as an additional weighted prediction mode to enhance the spatial relevance between adjacent reconstructed samples and predicted samples, thereby improving the prediction effect of intra-prediction. However, because the prediction principle of the planar mode is relatively simple, using the planar mode as an additional weighted prediction mode for prediction blocks where there is a clear difference between the upper right corner and the lower left corner may have a counterproductive effect.
[0078] (4) Geometric partitioning mode (GPM) and angular weighted prediction (AWP).
[0079] In video coding standards, conventional unidirectional prediction searches for only one reference block of the same size as the current block, while conventional bidirectional prediction uses two reference blocks of the same size as the current block, and the sample value of each sample in the prediction block is the average of the samples at the corresponding positions in the two reference blocks, meaning that all samples in each reference block each account for 50%. Furthermore, with bidirectional weighted prediction, the proportions of the two reference blocks can differ; for example, all samples in the first reference block might account for 75%, and all samples in the second reference block might account for 25%, while the proportions of all samples in the same reference block remain the same. In addition, several optimization methods, such as decoder-side motion vector refinement (DMVR) and bidirectional optical flow (BIO or BDOF), can introduce some variation into the reference or prediction samples.
[0080] In both GPM and AWP, two reference blocks of the same size as the current block are used, but at some sample locations, 100% of the sample values from the corresponding location in the first reference block are used, at some sample locations, 100% of the sample values from the corresponding location in the second reference block are used, and in boundary regions (or regions called transition regions), a certain proportion of the sample values from the corresponding locations in the two reference blocks are used. The weights of the boundary regions also change gradually. The specific method of assigning these weights is determined by the weight derivation mode of GPM or AWP. The weight of each sample location is determined based on the weight derivation mode of GPM or AWP.
[0081] Of course, in some cases, for example when the block size is very small, some GPM or AWP modes may not be able to guarantee that 100% of the sample value at the corresponding position in the first reference block is used at some sample positions and 100% of the sample value at the corresponding position in the second reference block is used at other sample positions. In this case, GPM or AWP can be understood as using two reference blocks whose sizes differ from that of the current block, that is, using only the required portion of each as a reference block, or in other words, using the portion with a non-zero weight as the reference block and removing the portion with a weight of 0. The present application does not limit the specific implementation of this.
[0082] Figure 8 shows an example of a weight diagram corresponding to 64 different weight derivation modes in the GPM for a square block according to an embodiment of the present application.
[0083] As shown in Figure 8, GPM has weight diagrams corresponding to 64 different weight derivation modes for a square block. Here, for each weight derivation mode, the black area indicates that the weight value of the corresponding position in the first reference block is 0%, the white area indicates that the weight value of the corresponding position in the first reference block is 100%, and the gray area indicates, depending on the shade of color, that the weight value of the corresponding position in the first reference block is a weight value greater than 0% but less than 100%. The weight value of the corresponding position in the second reference block is the value obtained by subtracting the weight value of the corresponding position in the first reference block from 100%.
[0084] Figure 9 shows an example of a dividing line for a weight derivation mode according to an embodiment of the present application.
[0085] As shown in Figure 9, the division line for a weight derivation mode may be the line consisting of points at which the weights of the two prediction modes corresponding to the GPM are equal. In other words, within the region of the GPM weight matrix where the weights change, the division line consists of the points at which the two prediction modes have the same weight, i.e., the points with the midpoint weight. A point with the midpoint weight may or may not lie at the midpoint of the entire sample. For example, if the weights range from 0 to 8, the midpoint weight may be 4.
[0086] Figure 10 shows an example of a weight diagram corresponding to 56 different weight derivation modes in AWP for a square block according to an embodiment of the present application.
[0087] As shown in Figure 10, weight diagrams corresponding to 56 different weight derivation modes in AWP for square blocks are shown. Here, for each weight derivation mode, the black area indicates that the weight value of the corresponding position in the first reference block is 0%, the white area indicates that the weight value of the corresponding position in the first reference block is 100%, and the gray area indicates that the weight value of the corresponding position in the first reference block is a weight value greater than 0% but less than 100%, depending on the shade of color. The weight value of the corresponding position in the second reference block is the value obtained by subtracting the weight value of the corresponding position in the first reference block from 100%.
[0088] Note that the weight derivation methods for GPM and AWP can differ. For example, in GPM, the angle and offset amount are determined for each weight derivation mode, and then a corresponding weight diagram is calculated for each weight derivation mode. In AWP, a one-dimensional weight line is first determined based on each weight derivation mode, and then the one-dimensional weight line is tiled across the entire image using a method similar to intra-angle prediction to obtain a weight diagram corresponding to each weight derivation mode. Of course, in other alternative embodiments, the weight diagram corresponding to each weight derivation mode can also be called a weight matrix.
[0089] Next, we will explain the weight derivation method using GPM as an example.
[0090] The encoder can determine the corresponding partition line based on each weight derivation mode, and then determine the corresponding weight matrix based on the partition line. For example, the encoder can use Table 1 to determine the angle index variable angleIdx and the distance index variable distanceIdx corresponding to the weight derivation mode, where the weight derivation mode is identified by merge_gpm_partition_idx. The angle index variable angleIdx and the distance index variable distanceIdx can be understood as the variables that determine the partition line, that is, the angle and the offset of the partition line, respectively. After determining the partition line corresponding to each weight derivation mode, the encoder can determine the weight matrix corresponding to each weight derivation mode based on that partition line.
[0091] [Table 1]
[0092] As shown in Table 1, there are 64 types of weight derivation modes (for example, the 64 modes shown in Figure 8), and the index values (merge_gpm_partition_idx) for these modes range from 0 to 63. Each of the 64 weight derivation modes can correspond to one angle index variable angleIdx and one distance index variable distanceIdx. That is, each weight derivation mode can correspond to one partition line. Of course, one angle index variable angleIdx or one distance index variable distanceIdx can correspond to the index of one or more weight derivation modes, and Table 1 is merely an example of the present invention and should not be understood as a limitation on the present invention.
[0093] Since GPM can be used for three components (e.g., Y, Cb, Cr), the process of generating a GPM prediction sample matrix for one component can be made a subprocess, namely the weighted sample prediction process for geometric partitioning mode. This process can be invoked for each of the three components, differing only in the invocation parameters; this application uses the luminance component as an example. For example, the weighted sample prediction process for GPM can be used to derive the prediction matrix predSamplesL[ xL ][ yL ] for the current luminance block, where xL = 0..cbWidth-1 and yL = 0..cbHeight-1, and where nCbW is cbWidth and nCbH is cbHeight.
[0094] The inputs to the weighted sample prediction process for GPM are the current block width nCbW, the current block height nCbH, two (nCbW)x(nCbH) predicted sample matrices predSamplesLA and predSamplesLB, the angle index variable angleIdx of the GPM partition, the distance index variable distanceIdx of the GPM, and the component index variable cIdx, where, for example, cIdx equal to 0 indicates the luminance component. The output of the weighted sample prediction process for GPM is the (nCbW)x(nCbH) GPM predicted sample matrix pbSamples[x][y], where x = 0..nCbW-1 and y = 0..nCbH-1.
[0095] The predicted sample matrix pbSamples[x][y] can be derived as follows. First, the variables nW, nH, shift1, offset1, displacementX, displacementY, partFlip, and shiftHor are derived:
nW = ( cIdx == 0 ) ? nCbW : nCbW * SubWidthC
nH = ( cIdx == 0 ) ? nCbH : nCbH * SubHeightC
shift1 = Max( 5, 17 - BitDepth ), where BitDepth is the bit depth of the coding
offset1 = 1 << ( shift1 - 1 )
displacementX = angleIdx
displacementY = ( angleIdx + 8 ) % 32
partFlip = ( angleIdx >= 13 && angleIdx <= 27 ) ? 0 : 1
shiftHor = ( angleIdx % 16 == 8 || ( angleIdx % 16 != 0 && nH >= nW ) ) ? 0 : 1
Next, the variables offsetX and offsetY are derived. If the value of shiftHor is 0:
offsetX = ( -nW ) >> 1
offsetY = ( ( -nH ) >> 1 ) + ( angleIdx < 16 ? ( distanceIdx * nH ) >> 3 : -( ( distanceIdx * nH ) >> 3 ) )
Otherwise (i.e., the value of shiftHor is 1):
offsetX = ( ( -nW ) >> 1 ) + ( angleIdx < 16 ? ( distanceIdx * nW ) >> 3 : -( ( distanceIdx * nW ) >> 3 ) )
offsetY = ( -nH ) >> 1
Subsequently, for x = 0..nCbW - 1 and y = 0..nCbH - 1, the predicted sample matrix pbSamples[x][y] is derived. The variables xL and yL are derived as:
xL = ( cIdx == 0 ) ? x : x * SubWidthC
yL = ( cIdx == 0 ) ? y : y * SubHeightC
weightIdx = ( ( ( xL + offsetX ) << 1 ) + 1 ) * disLut[ displacementX ] + ( ( ( yL + offsetY ) << 1 ) + 1 ) * disLut[ displacementY ]
The values disLut[ displacementX ] and disLut[ displacementY ] can be determined from Table 2:
[Table 2]
weightIdxL = partFlip ? 32 + weightIdx : 32 - weightIdx
wValue = Clip3( 0, 8, ( weightIdxL + 4 ) >> 3 )
pbSamples[ x ][ y ] = Clip3( 0, ( 1 << BitDepth ) - 1, ( predSamplesLA[ x ][ y ] * wValue + predSamplesLB[ x ][ y ] * ( 8 - wValue ) + offset1 ) >> shift1 )
[0096] Here, pbSamples[x][y] represents the predicted sample at point (x,y). wValue represents the weights of the predicted values predSamplesLA[x][y] of the prediction matrix for one prediction mode at point (x,y), and (8-wValue) represents the weights of the predicted values predSamplesLB[x][y] of the prediction matrix for another prediction mode at point (x,y).
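The derivation above can be transcribed into a short sketch. The function name is hypothetical; the lookup table disLut is passed in by the caller because Table 2 is not reproduced in this text, and the two entries used in the usage example are placeholder values chosen only to make the example run, not values taken from Table 2. The default chroma subsampling factors of 2 are likewise an assumption and are unused for the luminance component (cIdx equal to 0).

```python
def clip3(low, high, value):
    """Clamp value to the range [low, high]."""
    return max(low, min(high, value))


def gpm_weighted_prediction(n_cb_w, n_cb_h, pred_a, pred_b, angle_idx, distance_idx,
                            dis_lut, bit_depth=10, c_idx=0,
                            sub_width_c=2, sub_height_c=2):
    """Return (pbSamples, wValue matrix), both indexed as [x][y]."""
    n_w = n_cb_w if c_idx == 0 else n_cb_w * sub_width_c
    n_h = n_cb_h if c_idx == 0 else n_cb_h * sub_height_c
    shift1 = max(5, 17 - bit_depth)
    offset1 = 1 << (shift1 - 1)
    displacement_x = angle_idx
    displacement_y = (angle_idx + 8) % 32
    part_flip = 0 if 13 <= angle_idx <= 27 else 1
    shift_hor = 0 if (angle_idx % 16 == 8
                      or (angle_idx % 16 != 0 and n_h >= n_w)) else 1

    if shift_hor == 0:
        step = (distance_idx * n_h) >> 3
        offset_x = (-n_w) >> 1
        offset_y = ((-n_h) >> 1) + (step if angle_idx < 16 else -step)
    else:
        step = (distance_idx * n_w) >> 3
        offset_x = ((-n_w) >> 1) + (step if angle_idx < 16 else -step)
        offset_y = (-n_h) >> 1

    weights = [[0] * n_cb_h for _ in range(n_cb_w)]
    pb_samples = [[0] * n_cb_h for _ in range(n_cb_w)]
    for x in range(n_cb_w):
        for y in range(n_cb_h):
            x_l = x if c_idx == 0 else x * sub_width_c
            y_l = y if c_idx == 0 else y * sub_height_c
            weight_idx = ((((x_l + offset_x) << 1) + 1) * dis_lut[displacement_x]
                          + (((y_l + offset_y) << 1) + 1) * dis_lut[displacement_y])
            weight_idx_l = 32 + weight_idx if part_flip else 32 - weight_idx
            w_value = clip3(0, 8, (weight_idx_l + 4) >> 3)
            weights[x][y] = w_value
            blended = (pred_a[x][y] * w_value + pred_b[x][y] * (8 - w_value) + offset1) >> shift1
            pb_samples[x][y] = clip3(0, (1 << bit_depth) - 1, blended)
    return pb_samples, weights


# Placeholder lookup entries for illustration only (NOT taken from Table 2).
placeholder_lut = {0: 8, 8: 0}
block_a = [[1600] * 8 for _ in range(8)]
block_b = [[0] * 8 for _ in range(8)]
samples, weight_matrix = gpm_weighted_prediction(8, 8, block_a, block_b, 0, 0, placeholder_lut)
```

Storing wValue for every position, as weight_matrix does here, yields a weight matrix; computing it on the fly for each sample gives the same predicted samples.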
[0097] Furthermore, for one weight derivation mode, it is possible to use it to derive one weight value wValue for each point, and then calculate one GPM prediction value pbSamples[x][y]. In this method, the weight wValue does not need to be in matrix form, but it can be understood that if the wValue for each position is stored in a matrix, it becomes a weight matrix. The principle of calculating a weight for each point and weighting it to obtain a GPM prediction value is the same as the principle of calculating all the weights and weighting them uniformly to obtain a GPM prediction sample matrix. The reason why "weight matrix" is used in many descriptions in this specification is to make the description easier to understand and to obtain a more intuitive diagram using the weight matrix, but in reality it can also be described using weights for each position. For example, the weight matrix derivation mode can also be called a weight derivation mode, but this application does not specifically limit it.
[0098] Furthermore, the partitioning of CU, PU, and TU all belong to partitioning methods based on rectangles. However, GPM and AWP achieve the effect of non-rectangular partitioning of predictions without partitioning. GPM and AWP use a weight mask of two reference blocks, i.e., the weight diagram or weight matrix described above. This mask determines the weights of the two reference blocks when generating the prediction block, or it can be easily understood that some positions of the prediction block are obtained based on the first reference block and other positions are obtained based on the second reference block, and the blending area is obtained by weighting the corresponding positions of the two reference blocks, thereby making the transition smoother. In GPM and AWP, since the current block is not partitioned into two CUs or PUs, the current block is treated as a single unit in post-prediction residual transformations, quantization, inverse transformations, and inverse quantization.
[0099] In GPM, two inter-prediction blocks can be combined using a weight matrix. This application extends this to combining any two prediction blocks. For example, two inter-prediction blocks, two intra-prediction blocks, or one inter-prediction block and one intra-prediction block can be combined. Furthermore, in screen content coding, one or two prediction blocks from intra-block copy (IBC) mode or palette mode can be used. For convenience of explanation, this application collectively refers to intra mode, inter mode, IBC mode, and palette mode as prediction modes. A prediction mode can be understood as information based on which the codec can generate one prediction block of the current block. For example, in intra-prediction, the prediction mode may be an intra-prediction mode such as DC mode, planar mode, or any of the angular intra-prediction modes. Of course, one or more pieces of auxiliary information can also be superimposed, such as an optimization method for intra reference samples or an optimization method applied after a preliminary prediction block is generated (e.g., filtering). For example, in inter-prediction, the prediction mode may be merge mode, merge with motion vector difference (MMVD) mode, or advanced motion vector prediction (AMVP) mode. For example, the prediction mode may be unidirectional, bidirectional, or multi-hypothesis prediction. Furthermore, in inter-prediction mode, if unidirectional prediction is used and one piece of motion information can be determined, the prediction block can be determined based on that motion information. In inter-prediction mode, if bidirectional prediction is used and two pieces of motion information can be determined, the prediction block can be determined based on the two pieces of motion information.
[0100] Figure 11 is a schematic diagram of a GPM or AWP according to an embodiment of the present application.
[0101] As shown in Figure 11, the information that needs to be determined in GPM can be represented as one weight derivation mode and two prediction modes. The weight derivation mode is used to determine the weight matrix or weights, and each of the two prediction modes determines one prediction block or prediction value. The weight derivation mode is also called the partition mode or weight matrix derivation mode. The two prediction modes may be the same or different prediction modes, and include, but are not limited to, the intra-prediction mode, inter-prediction mode, IBC mode, and palette mode.
[0102] The following explains the process of transforming residual blocks.
[0103] When encoding, a prediction is first made to the current block, using spatial or temporal correlations to obtain an image that is the same as or similar to the current block. For one block, the predicted block and the current block may be exactly the same, but it is difficult to guarantee that this will be the case for all blocks in a video. In particular, in natural video or video shot with a camera, the image texture is complex and there are factors such as noise in the image, so the predicted block and the current block are usually similar but different. Also, due to irregular motion, twisting deformation, occlusion, and changes in brightness in the video, it is difficult to perfectly predict the current block. Therefore, in a hybrid coding framework, the residual image is obtained by subtracting the predicted image from the original image of the current block, or in other words, the residual block is obtained by subtracting the predicted block from the current block. Since the residual block is usually much simpler than the original image, the compression efficiency can be greatly improved through prediction. Rather than encoding the residual block directly, a transformation is usually performed first. The transformation is to convert the residual image from the spatial domain to the frequency domain and remove the correlation of the residual image. After the residual image is converted to the frequency domain, the energy is often concentrated in the low-frequency region, so the non-zero coefficients after conversion are often concentrated in the upper left corner. Next, quantization is used for further compression. Also, since the human eye has difficulty perceiving high frequencies, a larger quantization step size can be used in the high-frequency region.
[0104] Image transformation technology transforms an original image so that it can be represented by an orthogonal function or orthogonal matrix. This transformation is two-dimensional, linear, and reversible. Generally, the original image is called a spatial domain image, the transformed image is called a transformed domain image (also called a frequency domain image), and the transformed domain image can be inversely transformed back into a spatial domain image. After image transformation, on the one hand, the characteristics of the image itself can be reflected more effectively, and on the other hand, energy can be concentrated on a small amount of data, which is advantageous for image storage, transmission, and processing.
[0105] In the field of image and video coding, an encoder can obtain residual blocks and then transform those residual blocks. Transformation methods include, but are not limited to, the Discrete Cosine Transform (DCT) and the Discrete Sine Transform (DST). Because the DCT has strong energy concentration characteristics, after the original image undergoes a DCT transformation, non-zero coefficients may exist only in certain regions (e.g., the upper-left corner region). Of course, in video coding, images are processed by dividing them into blocks, so transformations are also performed on a block basis. DCT types usable in video coding include, but are not limited to, DCT2 and DCT8, and DST types usable in video coding include, but are not limited to, DST7. Here, DCT2 is a transformation commonly used in video compression standards, and VVC can additionally use DCT8 and DST7. Although transformations are very useful for ordinary video compression, not every block needs to be transformed; in some cases, skipping the transformation yields better compression. Therefore, the encoder can sometimes choose whether or not to apply a transformation to the current block.
[0106] When an encoder transforms the current block in the current image, it is possible to transform the residual block of the current block using a basis function or a basis image. A basis image is an image representation of a basis function.
[0107] Figure 12 shows an example of a DCT2 type basis image according to an embodiment of the present application.
[0108] As shown in Figure 12, the DCT2 type basis image may be a basis image consisting of 8 × 8 small blocks created based on a basis function, where each small block consists of 8 × 8 elements (subblocks). In a specific embodiment, an 8 × 8 size block can be transformed using a basis image consisting of 8 × 8 small blocks, and the result is an 8 × 8 transformation coefficient matrix.
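The relationship between basis functions, the 8 × 8 coefficient matrix, and energy concentration can be sketched as follows. This is a floating-point illustration using the textbook orthonormal DCT-II, not the integer kernels of any standard; the helper names are hypothetical and the sketch handles square blocks only.

```python
import math


def dct2_matrix(n):
    """Orthonormal DCT-II matrix; row k holds the k-th one-dimensional basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_2d(block):
    """Coefficients = C * block * C^T for a square block."""
    c = dct2_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def inverse_2d(coeffs):
    """Block = C^T * coefficients * C, which undoes forward_2d."""
    c = dct2_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


# A perfectly flat 8x8 block: all energy ends up in the upper-left (DC) coefficient.
flat = [[10.0] * 8 for _ in range(8)]
coeffs = forward_2d(flat)
restored = inverse_2d(coeffs)
```

Each small block (u, v) of the basis image corresponds to the outer product of rows u and v of `dct2_matrix(8)`, and each entry of `coeffs` measures how strongly that pattern is present in the input block.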
[0109] As mentioned above, VVC can perform basic transformations on residual blocks using the DCT2 type, as well as using the DCT8 and DST7 types, meaning that the multiple transform selection (MTS) technique in VVC can be used. The transformation type corresponding to the basis function used for the basic transformation is also called the transformation kernel type used for the basic transformation. When performing a basic transformation, the encoder can improve compression performance by selecting the optimal transformation kernel type based on different residual distribution characteristics. The basic transformation is also called the core transform. In MTS, the transformation kernel type can be selected through several syntax elements. Below, we show MTS for selecting the transformation kernel type through syntax elements in conjunction with Table 3.
[0110] [Table 3]
[0111] As shown in Table 3, if the value of MTS_CU_flag is 0, the transformation kernel type for all horizontal and vertical base transformations is DCT2. If the value of MTS_CU_flag is 1, and the value of MTS_Hor_flag is 0, and the value of MTS_Ver_flag is 0, then the horizontal transformation kernel type is DST7, and the vertical transformation kernel type is DST7.
[0112] The VVC standard also allows for rewriting, or in other words, simplifying, the syntax of MTS. Specifically, VVC can use a single syntax element, mts_idx, to determine the transformation kernel type of the underlying transformation.
[0113] [Table 4]
[0114] As shown in Table 4, trTypeHor indicates the transformation kernel type for the horizontal transformation, and trTypeVer indicates the transformation kernel type for the vertical transformation. For both trTypeHor and trTypeVer, a value of 0 indicates a DCT2 type transformation, a value of 1 indicates a DST7 type transformation, and a value of 2 indicates a DCT8 type transformation.
[0115] Since there is a certain correlation between the residual distribution and the intra-prediction mode, this correlation can also be utilized in the underlying transformation. One method is to group the MTS transformation kernel types based on the intra-prediction mode. An example of one grouping is shown in the table below.
[0116] [Table 5]
[0117] As shown in Table 5, if the intra-prediction mode index is 0 or 1, MTS selects the transformation kernel type set with index 0. In VVC, the mode with index 0 is Planar, and the mode with index 1 is DC. Both DC and Planar produce relatively flat prediction values. If the intra-prediction mode index is between 2 and 12, MTS selects the transformation kernel type set with index 1. According to the intra-prediction mode diagram, the angles of modes 2 through 12 all point toward the lower left.
[0118] Each transformation kernel type set may have one choice of transformation kernel type for horizontal and vertical transformations, or it may have multiple choices for horizontal and vertical transformation kernel types. In other words, after selecting a transformation kernel type set based on the intra-prediction mode, it is possible to further subdivide it, for example, by flags or block size information, but this will not be explained here. The emphasis is that in the base transformation, the transformation kernel type set can be selected based on the intra-prediction mode. This method of selecting the transformation kernel type set for the base transformation based on the intra-prediction mode may also allow for more detailed grouping of base transformations in the future, and this application does not specifically limit this.
[0119] Furthermore, in this application, the transformation kernel type relating to the main transformation is also referred to as a transformation matrix, transformation type, or transformation kernel, and the transformation kernel type set relating to the main transformation is also referred to as a transformation matrix set, transformation type set, or transformation kernel set, and this application does not specifically limit these terms. That is, the selection of a transformation kernel type or transformation kernel type set relating to this application is also referred to as the selection of a transformation matrix or transformation matrix set, the selection of a transformation type or transformation type set, and the selection of a transformation kernel or transformation kernel set. The transformation kernel type or transformation type may include DCT2, DCT8, DST7, etc., and may also include DCT5, DST4, DST1, or Identity Transform (IDTR), etc.
[0120] Furthermore, blocks of different sizes can use a transformation kernel type of the corresponding size, but this will not be described in this application.
[0121] Images are two-dimensional, but the computational complexity and memory overhead of a direct two-dimensional transformation are more than hardware can readily bear. It should therefore be noted that the aforementioned DCT2, DCT8, and DST7 transformations are each split into a horizontal and a vertical one-dimensional transformation, i.e., performed in two steps. For example, the horizontal transformation can be performed first, followed by the vertical transformation, or vice versa. These transformation methods are relatively effective for horizontal and vertical textures but less effective for diagonal textures. Since horizontal and vertical textures are the most common, these methods are very useful for improving compression efficiency. However, as technology advances, processing only the residuals of horizontal and vertical textures is no longer sufficient to meet the demand for compression efficiency.
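The two-step, separable structure can be sketched with a one-dimensional transform applied first along rows and then along columns, or in the opposite order. The sketch uses a floating-point orthonormal DCT-II purely for illustration; the function names and the sample block are hypothetical.

```python
import math


def dct2_1d(vec):
    """Orthonormal one-dimensional DCT-II of a list of samples."""
    n = len(vec)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i, v in enumerate(vec)))
    return out


def transform_rows(block):
    """Horizontal pass: transform every row independently."""
    return [dct2_1d(row) for row in block]


def transform_cols(block):
    """Vertical pass: transform every column independently."""
    cols = [dct2_1d(list(col)) for col in zip(*block)]
    return [list(row) for row in zip(*cols)]


block = [[float((3 * x + 5 * y) % 7) for x in range(4)] for y in range(4)]
hor_then_ver = transform_cols(transform_rows(block))
ver_then_hor = transform_rows(transform_cols(block))
```

Both orders give the same coefficients up to floating-point rounding, because each pass is linear and acts on a different dimension. Each pass only ever sees one row or one column, which is also why a separable transform captures horizontal and vertical structure better than diagonal structure.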
[0122] Based on this, the present invention introduces the concept of a secondary transformation, that is, the encoder can perform a secondary transformation based on the primary transformation, thereby improving compression efficiency.
[0123] Exemplarily, the primary transform, also called the basic transform, is used to process horizontal and vertical textures and includes, but is not limited to, the DCT2, DCT8, and DST7 transforms mentioned above. The secondary transform is used to process diagonal textures and includes, but is not limited to, the low-frequency non-separable transform (LFNST). On the encoding side, the secondary transform is performed after the primary transform and before quantization. On the decoding side, the inverse secondary transform is performed after inverse quantization and before the inverse primary transform.
[0124] Figure 13 shows an example of an LFNST according to an embodiment of the present application.
[0125] As shown in Figure 13, on the encoding side, LFNST performs a secondary transformation on the low-frequency coefficients in the upper-left corner after the basic transformation. The basic transformation concentrates the energy in the upper-left corner by removing correlation from the image. The secondary transformation then removes correlation again from the low-frequency coefficients produced by the basic transformation. On the encoding side, if 16 coefficients are input to a 4x4 LFNST, 8 coefficients are output. If 64 coefficients are input to an 8x8 LFNST, 16 coefficients are output. On the decoding side, if 8 coefficients are input to a 4x4 inverse LFNST, 16 coefficients are output. If 16 coefficients are input to an 8x8 inverse LFNST, 64 coefficients are output.
[0126] When the encoder performs a secondary transformation on the current block in the current image, it can use a transformation matrix from a selected set of transformation matrices to transform the residual block of the current block. For example, if the secondary transformation is LFNST, a transformation matrix can refer to a matrix for transforming a texture in a certain diagonal direction, and a set of transformation matrices can contain matrices for transforming several similar diagonal textures.
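The input and output sizes quoted above follow directly from treating the secondary transformation as a single non-square matrix multiplication. In the sketch below the kernel is a placeholder built from the first rows of an orthonormal DCT-II matrix; the actual LFNST kernels are trained matrices that are not reproduced in this text, and every name here is hypothetical.

```python
import math


def placeholder_kernel(n_out, n_in):
    """n_out x n_in stand-in for a secondary-transform matrix (NOT a real LFNST kernel)."""
    rows = []
    for k in range(n_out):
        scale = math.sqrt(1.0 / n_in) if k == 0 else math.sqrt(2.0 / n_in)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n_in))
                     for i in range(n_in)])
    return rows


def forward_secondary(kernel, coeffs):
    """Encoder side: n_in primary coefficients -> n_out secondary coefficients."""
    return [sum(k * c for k, c in zip(row, coeffs)) for row in kernel]


def inverse_secondary(kernel, coeffs):
    """Decoder side: multiply by the transposed kernel, n_out -> n_in coefficients."""
    n_in = len(kernel[0])
    return [sum(kernel[r][i] * coeffs[r] for r in range(len(kernel)))
            for i in range(n_in)]


# (inputs, outputs) on the encoding side, as stated in the paragraph above.
SIZES = {"4x4": (16, 8), "8x8": (64, 16)}

n_in, n_out = SIZES["4x4"]
kernel_4x4 = placeholder_kernel(n_out, n_in)
low_freq = [float(16 - i) for i in range(n_in)]   # toy low-frequency coefficients
secondary = forward_secondary(kernel_4x4, low_freq)
restored = inverse_secondary(kernel_4x4, secondary)
```

Because fewer coefficients come out than go in, the round trip is exact only for inputs that lie in the span of the kernel rows; the flattening order of the two-dimensional coefficients into a vector is left out of this sketch.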
[0127] Figure 14 shows an example of a transformation matrix set for LFNST according to the present invention.
[0128] As shown in Figures 14(a) to (d), LFNST can have four sets of transformation matrices, and transformation matrices within the same set have similar diagonal textures. For example, the set of transformation matrices shown in Figure 14(a) may be a set of transformation matrices with index 0, the set of transformation matrices shown in Figure 14(b) may be a set of transformation matrices with index 1, the set of transformation matrices shown in Figure 14(c) may be a set of transformation matrices with index 2, and the set of transformation matrices shown in Figure 14(d) may be a set of transformation matrices with index 3.
[0129] For ease of understanding, in this application, the transformation matrix relating to a secondary transformation is also referred to as a transformation kernel, transformation kernel type, or basis function, or any similar or identical term; and the set of transformation matrices relating to a secondary transformation is also referred to as a transformation kernel set, transformation kernel type set, or basis function set, or any similar or identical term; these terms are not particularly limited in this application. That is, the selection of a transformation matrix or set of transformation matrices relating to this application is also referred to as the selection of a transformation kernel type or set of transformation kernel types, the selection of a transformation type or set of transformation types, and the selection of a transformation kernel or set of transformation kernels.
[0130] The following describes a related scheme for applying LFNST to intra-coded blocks.
[0131] In intra prediction, the current block is predicted using reconstructed samples around it as a reference. Since video is currently generally encoded from left to right and top to bottom, the reference samples available for the current block are typically on its left and top sides. In angular prediction, the reference samples are tiled across the current block along a specified angle to obtain the predicted values, which means that the predicted block has an obvious directional texture, and the residual of the current block after angular prediction statistically also exhibits an obvious angular characteristic. Therefore, the transformation matrix selected for LFNST can be associated with the intra prediction mode, i.e., after the intra prediction mode is determined, LFNST can use a set of transformation matrices whose texture direction fits the angular features of the intra prediction mode.
[0132] For illustrative purposes, we assume there are a total of four sets of transformation matrices for LFNST, with two transformation matrices per set. Table 6 shows the correspondence between the intra-prediction modes and the sets of transformation matrices.
[0133] [Table 6]
[0134] As shown in Table 6, intra-prediction modes 0 to 81 can be associated with indices in four sets of transformation matrices.
[0135] It should be noted that modes 81 to 83 are cross-component prediction modes used for chroma intra prediction; these modes are not available for luma intra prediction. By transposing the LFNST transformation matrix, a single set of transformation matrices can handle more angles. For example, both intra-prediction modes 13-23 and intra-prediction modes 45-55 correspond to transformation matrix set 2, but intra-prediction modes 13-23 are clearly closer to horizontal, and intra-prediction modes 45-55 are clearly closer to vertical, and therefore the transformation or inverse transformation corresponding to intra-prediction modes 45-55 needs to be adapted by a transposition.
[0136] In a specific embodiment, since LFNST has four sets of transformation matrices, the encoding side can determine which set to use for LFNST based on the intra-prediction mode used for the current block, and then determine which transformation matrix to use within that set. In other words, exploiting the correlation between the intra-prediction mode and the LFNST transformation matrix sets reduces the bitstream overhead of signaling the selected LFNST transformation matrix. Whether LFNST is used for the current block, and if so whether the first or the second transformation matrix of the set is used, can be determined from the bitstream and several conditions.
[0137] Of course, considering that there are 67 normal intra-prediction modes and LFNST has only 4 sets of transformation matrices, multiple prediction modes with similar angles have to share one set of LFNST transformation matrices. This is a trade-off between performance and complexity, as each transformation matrix requires memory to store its coefficients. As the demand for compression efficiency increases and hardware capabilities improve, LFNST can also be designed to be more complex, for example, by using larger transformation matrices, more sets of transformation matrices, or more transformation matrices in each set. Exemplarily, Table 7 shows another correspondence between intra-prediction modes and sets of transformation matrices.
[0138] [Table 7]
[0139] As shown in Table 7, 35 transformation matrix sets are used, with 3 transformation matrices used in each transformation matrix set. The correspondence between the transformation matrix sets and intra-prediction modes can be realized as follows: Intra-prediction modes 0 to 34 correspond to transformation matrix sets 0 to 34 in the forward direction, meaning that the larger the prediction mode number, the larger the index of the transformation matrix set; intra-prediction modes 35 to 67 correspond to transformation matrix sets 2 to 33 in the reverse direction due to transposition, meaning that the larger the prediction mode number, the smaller the index of the transformation matrix set; the remaining prediction modes can be uniformly corresponded to the transformation matrix set with index 2. In other words, if transposition is not considered, one intra-prediction mode corresponds to one transformation matrix set. With this design, a more appropriate LFNST transformation matrix can be obtained for the residuals corresponding to each intra-prediction mode, and the compression performance is also improved.
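The correspondence described in prose above can be sketched as a small lookup function. Table 7 itself is not reproduced here, so two points are assumptions of this sketch: the mirrored index is taken to be 68 minus the mode number (so that mode 35 maps to set 33 and mode 66 maps to set 2), and mode 67 is treated as one of the remaining modes, since the stated set range 2 to 33 holds only 32 sets. The function name is hypothetical.

```python
def lfnst_set_for_mode(intra_mode):
    """Return (set_index, transposed) for an intra-prediction mode number.

    Assumptions of this sketch: mirror rule 68 - mode for modes 35..66,
    and default set 2 for every other mode (wide-angle and the rest).
    """
    if 0 <= intra_mode <= 34:
        # Forward direction: larger mode number, larger set index.
        return intra_mode, False
    if 35 <= intra_mode <= 66:
        # Reverse direction with transposition: larger mode number, smaller set index.
        return 68 - intra_mode, True
    # Remaining prediction modes share the set with index 2.
    return 2, False


examples = {m: lfnst_set_for_mode(m) for m in (0, 18, 34, 35, 50, 66, 70)}
```

Under these assumptions, modes 18 and 50 (68 − 50 = 18) share set 18, with mode 50 using the transposed form, which mirrors the way modes 13-23 and 45-55 share one set in the four-set design.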
[0140] Of course, a one-to-one correspondence could theoretically be provided for the wide-angle modes as well, but such a design is not very cost-effective, and this application will not explain it in detail.
[0141] Furthermore, in order to adapt LFNST to MIP, in this application, the transformation matrix set adapted to the planar mode can be used as the transformation matrix set adapted to MIP.
[0142] It should be noted that LFNST is merely one example of a secondary transformation and should not be understood as a limitation on secondary transformations. For example, LFNST is a non-separable secondary transformation, and in other alternative embodiments, separable secondary transformations can be used to improve the compression efficiency of diagonal texture residuals.
[0143] Figure 15 is a block diagram showing a decoding framework 200 according to an embodiment of the present application.
[0144] As shown in Figure 15, the decoding framework 200 may include an entropy decoding unit 210, an inverse transform / inverse quantization unit 220, a residual unit 230, an intra prediction unit 240, an inter prediction unit 250, a loop filtering unit 260, and a decoding image buffer unit 270.
[0145] The entropy decoding unit 210 receives and decodes the bitstream to obtain a prediction block and a frequency-domain residual block. The frequency-domain residual block can be subjected to steps such as inverse transformation and inverse quantization through the inverse transformation / inverse quantization unit 220 to obtain a time-domain residual block. The residual unit 230 can obtain a reconstructed block by adding the prediction block obtained by the intra-prediction unit 240 or inter-prediction unit 250 to the time-domain residual block obtained after inverse transformation and inverse quantization through the inverse transformation / inverse quantization unit 220.
[0146] Figure 16 is a flowchart illustrating a decoding method 300 according to an embodiment of the present application. It should be understood that the decoding method 300 can be performed by a decoder. For example, the decoding method 300 is applied to the decoding framework 200 shown in Figure 15. For ease of explanation, a decoder is used as an example below.
[0147] As shown in Figure 16, the decoding method 300 may include some or all of the following:
S310: The decoder decodes the bitstream to obtain a first transformation coefficient of the current block.
S320: The decoder performs a first transformation on the first transformation coefficient to obtain a second transformation coefficient of the current block.
S330: The decoder performs a second transformation on the second transformation coefficient to obtain a residual block of the current block.
S340: The decoder predicts the current block based on a first prediction mode and a second prediction mode corresponding to a geometric partitioning mode to obtain a predicted block of the current block.
S350: The decoder obtains a reconstructed block of the current block based on the predicted block of the current block and the residual block of the current block.
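The order of the steps can be sketched as a toy pipeline. Everything in it is a stand-in: parsing (S310) is assumed to have already produced `first_coeffs`, the two transformations are passed in as hypothetical callables, and the prediction step blends two given prediction blocks with assumed integer weights in the range 0 to 8.

```python
def decode_block(first_coeffs, inverse_first, inverse_second,
                 pred0, pred1, weights, bit_depth=8):
    """Toy walk through S320 to S350; S310 (bitstream parsing) is assumed done."""
    second_coeffs = inverse_first(first_coeffs)    # S320: first transformation
    residual = inverse_second(second_coeffs)       # S330: second transformation
    max_val = (1 << bit_depth) - 1
    recon = []
    for y, res_row in enumerate(residual):
        row = []
        for x, res in enumerate(res_row):
            w = weights[y][x]
            # S340: weighted combination of the two predictions (weights 0..8 assumed).
            pred = (w * pred0[y][x] + (8 - w) * pred1[y][x] + 4) >> 3
            # S350: prediction plus residual, clipped to the sample range.
            row.append(min(max(pred + res, 0), max_val))
        recon.append(row)
    return recon


def identity(block):
    """Placeholder for a real inverse transformation."""
    return block


recon = decode_block(
    first_coeffs=[[3, -2], [0, 300]],
    inverse_first=identity,
    inverse_second=identity,
    pred0=[[100, 100], [100, 100]],
    pred1=[[20, 20], [20, 20]],
    weights=[[8, 4], [4, 0]],
)
```

With identity placeholders the "residual" is simply the input values, so the example only demonstrates the data flow: two inverse transformations in sequence, one blended prediction, and one addition with clipping.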
[0148] This invention can improve the decompression efficiency of the current block by introducing the first transformation on top of the geometric partitioning mode and the second transformation.
[0149] For example, the first transformation may be LFNST; that is, in this application, the compression efficiency of the residuals of diagonal textures is improved by combining LFNST with the geometric partitioning mode.
[0150] Of course, the method of adapting the geometric partitioning mode to LFNST can also be applied to other secondary transformation methods. For example, LFNST is a non-separable secondary transformation, and in other alternative embodiments, the geometric partitioning mode can also be applied to a separable secondary transformation; this application does not specifically limit this.
[0151] The geometric partitioning mode may be used for intra-prediction or inter-prediction. In other words, the prediction modes corresponding to the geometric partitioning mode (i.e., the first prediction mode and the second prediction mode) may both be intra-prediction modes, or both be inter-prediction modes, or may include both intra-prediction modes and inter-prediction modes simultaneously. In other words, in the geometric partitioning mode, two arbitrary prediction blocks can be combined using a weight matrix. For example, two inter-prediction blocks, or two intra-prediction blocks, or one inter-prediction block and one intra-prediction block can be combined. Furthermore, in screen content coding, one or two prediction blocks from intra-block copy (IBC) mode or palette mode can be used. For convenience of explanation, in this application, intra-mode, inter-mode, IBC mode, and palette mode are collectively referred to as prediction modes. A prediction mode can be understood as a basis on which the codec can generate information for one prediction block of the current block. Illustratively, in intra-prediction, the prediction mode may be an intra-prediction mode such as DC mode, planar mode, or various types of intra-angle prediction modes. Of course, some or more auxiliary information may be superimposed, such as an optimization method for intra-reference samples or an optimization method after generating rudimentary prediction blocks (e.g., filtering). Illustratively, in inter-prediction, the prediction mode may be a merge mode, a Merge with Motion Vector Difference (MMVD) mode, or an Advanced Motion Vector Prediction (AMVP) mode. Illustratively, the prediction mode may be unidirectional prediction, bidirectional prediction, or multi-hypothesis prediction. 
Furthermore, in inter prediction, if unidirectional prediction is used and one piece of motion information can be determined, the prediction block can be determined based on that motion information. If bidirectional prediction is used and two pieces of motion information can be determined, the prediction block can be determined based on those two pieces of motion information.
[0152] Of course, the scheme provided in this application is also applicable to any prediction mode that performs weighted predictions based on multiple prediction modes, such as AWP, and this application does not specifically limit it to such modes.
[0153] In some embodiments, S320 may include: decoding the bitstream to obtain a first flag and a second flag; and, if the first flag indicates that the geometric partitioning mode is permitted for predicting blocks in the current sequence and the second flag indicates that the first transformation is permitted for transforming blocks in the current sequence, performing the first transformation on the first transformation coefficient to obtain the second transformation coefficient.
[0154] For example, the current sequence is the image sequence that includes the current block.
[0155] For example, the first flag is used to control whether to use geometric partitioning mode for the current sequence.
[0156] Exemplarily, if the value of the first flag is a first value, the first flag indicates that the geometric partitioning mode is permitted for predicting blocks in the current sequence. If the value of the first flag is a second value, the first flag indicates that the geometric partitioning mode is not permitted for predicting blocks in the current sequence. In one embodiment, the first value is 0 and the second value is 1, and in another embodiment, the first value is 1 and the second value is 0. Of course, the first value or the second value may be any other value.
[0157] For example, the second flag is used to control whether to use the first transformation for the current sequence.
[0158] For example, if the value of the second flag is the third value, the second flag indicates that the first transformation is permitted to be used to perform a transformation on the block in the current sequence. If the value of the second flag is the fourth value, the second flag indicates that the first transformation is not permitted to be used to perform a transformation on the block in the current sequence. In one embodiment, the third value is 0 and the fourth value is 1, and in another embodiment, the third value is 1 and the fourth value is 0. Of course, the third value or the fourth value may be any other value.
[0159] For example, if the first flag is denoted as sps_gpm_enabled_flag and the second flag as sps_lfnst_enabled_flag, then when the values of both sps_gpm_enabled_flag and sps_lfnst_enabled_flag are 1, the first transformation is performed on the first transformation coefficient to obtain the second transformation coefficient.
[0160] For example, if the first flag indicates that the geometric partitioning mode is not permitted for predicting blocks in the current sequence, and / or the second flag indicates that the first transformation is not permitted for transforming blocks in the current sequence, then the first transformation is not performed on the first transformation coefficient; in other words, the second transformation can be applied directly to the first transformation coefficient to obtain the residual block of the current block.
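The two-flag condition and its fallback can be sketched as follows. The flag names come from the text; the value 1 meaning "permitted" is only one of the embodiments described above, and the function names and toy transformations are hypothetical.

```python
def first_transform_allowed(sps_gpm_enabled_flag, sps_lfnst_enabled_flag):
    """True only if both sequence-level flags permit their tool (1 = permitted here)."""
    return sps_gpm_enabled_flag == 1 and sps_lfnst_enabled_flag == 1


def residual_from_coeffs(first_coeffs, flags, inverse_first, inverse_second):
    """Apply the first transformation only when permitted, then the second one."""
    if first_transform_allowed(flags["sps_gpm_enabled_flag"],
                               flags["sps_lfnst_enabled_flag"]):
        second_coeffs = inverse_first(first_coeffs)
    else:
        # First transformation skipped: the second transformation
        # works directly on the first transformation coefficients.
        second_coeffs = first_coeffs
    return inverse_second(second_coeffs)


def toy_first(coeffs):
    """Placeholder first transformation: doubles every value."""
    return [2 * c for c in coeffs]


def toy_second(coeffs):
    """Placeholder second transformation: adds one to every value."""
    return [c + 1 for c in coeffs]


both_on = residual_from_coeffs(
    [1, 2, 3],
    {"sps_gpm_enabled_flag": 1, "sps_lfnst_enabled_flag": 1},
    toy_first, toy_second)
lfnst_off = residual_from_coeffs(
    [1, 2, 3],
    {"sps_gpm_enabled_flag": 1, "sps_lfnst_enabled_flag": 0},
    toy_first, toy_second)
```

The placeholder arithmetic has no meaning beyond making the two code paths distinguishable in the output.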
[0161] Of course, in other alternative embodiments, the first flag and / or the second flag may be replaced with flags at the picture, slice, largest coding unit (LCU), coding tree unit (CTU), coding unit (CU), prediction unit (PU), or transform unit (TU) levels. Alternatively, additional flags at the picture, slice, LCU, CTU, CU, PU, or TU levels may be added based on the first and second flags to indicate whether to use the geometric partitioning mode or the first transform. The embodiments of this application are not limited thereto.
[0162] In some embodiments, S320 may include: if the first flag indicates that the geometric partitioning mode is permitted for predicting blocks in the current sequence and the second flag indicates that the first transformation is permitted for transforming blocks in the current sequence, decoding the bitstream to obtain a third flag; and, if the third flag indicates that both the geometric partitioning mode and the first transformation are permitted for blocks in the current sequence, performing the first transformation on the first transformation coefficient to obtain the second transformation coefficient.
[0163] For example, the third flag is used to control whether both the geometric partitioning mode and the first transformation can be used.
[0164] For example, if the value of the third flag is a fifth value, the third flag indicates that using both the geometric partitioning mode and the first transformation together is permitted for blocks in the current sequence. If the value of the third flag is a sixth value, the third flag indicates that using both the geometric partitioning mode and the first transformation together is not permitted for blocks in the current sequence. In one embodiment, the fifth value is 0 and the sixth value is 1, and in another embodiment, the fifth value is 1 and the sixth value is 0. Of course, the fifth value or the sixth value may be any other value.
[0165] For example, if the first flag is denoted as sps_gpm_enabled_flag, the second flag as sps_lfnst_enabled_flag, and the third flag as sps_gpm_lfnst_enabled_flag, then when both sps_gpm_enabled_flag and sps_lfnst_enabled_flag have a value of 1, it is determined whether sps_gpm_lfnst_enabled_flag is 1. If sps_gpm_lfnst_enabled_flag is 1, the first transformation is performed on the first transformation coefficient to obtain the second transformation coefficient.
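The conditional parsing of the third flag can be sketched as follows. The sketch assumes that the three flags are read one after another and that the third flag is treated as 0 when it is not present; neither the parsing order nor the inferred value is specified in this text. `read_flag` is a hypothetical callable returning the next one-bit value.

```python
def parse_sequence_flags(read_flag):
    """Read the first two flags, and the third only if both are 1 (1 = permitted here)."""
    gpm = read_flag()
    lfnst = read_flag()
    if gpm == 1 and lfnst == 1:
        gpm_lfnst = read_flag()
    else:
        gpm_lfnst = 0   # assumed inferred value when the flag is absent
    return {
        "sps_gpm_enabled_flag": gpm,
        "sps_lfnst_enabled_flag": lfnst,
        "sps_gpm_lfnst_enabled_flag": gpm_lfnst,
    }


def first_transform_permitted(flags):
    """The third flag already implies the first two when parsed as above."""
    return flags["sps_gpm_lfnst_enabled_flag"] == 1


bits = iter([1, 1, 1])
flags = parse_sequence_flags(lambda: next(bits))
```

Reading the third flag only when the first two are set avoids spending a bit on a combination that is already ruled out.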
[0166] Of course, in other alternative embodiments, the third flag may be replaced with a flag at the picture, slice, largest coding unit (LCU), coding tree unit (CTU), coding unit (CU), prediction unit (PU), or transform unit (TU) level. Alternatively, based on the third flag, additional flag information at the picture, slice, LCU, CTU, CU, PU, or TU level may be added indicating whether to use the geometric partitioning mode or the first transform. The embodiments of this application are not limited thereto.
[0167] In some embodiments, S320 may include: if the first flag indicates that the geometric partitioning mode is permitted for predicting blocks in the current sequence, and the second flag indicates that the first transformation is permitted for transforming blocks in the current sequence, then, when the height and / or width of the current block is greater than or equal to a first threshold, the decoder performs the first transformation on the first transformation coefficient to obtain the second transformation coefficient.
[0168] For example, denote the first flag as sps_gpm_enabled_flag, the second flag as sps_lfnst_enabled_flag, and the third flag as sps_gpm_lfnst_enabled_flag. If both sps_gpm_enabled_flag and sps_lfnst_enabled_flag have a value of 1, the decoder determines the height and / or width of the current block, and if the height and / or width of the current block is greater than or equal to the first threshold, the first transformation is performed on the first transformation coefficient to obtain the second transformation coefficient.
[0169] For example, the first threshold can be 4, 8, 16, 32, 64, or any other value.
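The size condition above can be sketched as follows. The function name, the default threshold of 8, and the `require_both` switch are assumptions made for illustration; the text leaves open whether the height, the width, or both must reach the first threshold.

```python
def meets_size_condition(
    width: int,
    height: int,
    first_threshold: int = 8,
    require_both: bool = True,
) -> bool:
    """Check the block-size condition for applying the first transformation.

    Assumptions: first_threshold defaults to 8 (one of the example values),
    and require_both selects between the "and" and "or" readings of
    "height and / or width".
    """
    if require_both:
        return width >= first_threshold and height >= first_threshold
    return width >= first_threshold or height >= first_threshold
```

For example, under these assumptions a 4x16 block fails the "and" reading with a threshold of 8 but passes the "or" reading.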
[0170] In some embodiments, method 300 may further include, prior to S320, determining the set of transformation matrices to be used for the first transformation by a decoder.
[0171] In geometric partitioning mode, two prediction modes (i.e., a first prediction mode and a second prediction mode) are combined to predict the current block. Predicted blocks obtained by predicting the current block using different prediction modes may have different texture characteristics. Therefore, when geometric partitioning mode is selected for the current block, the first prediction mode may result in one texture characteristic for the predicted block of the current block, and the second prediction mode may result in another texture characteristic for the predicted block of the current block. In other words, after predicting the current block, from a statistical standpoint, the residual block of the current block will also show two texture characteristics. That is, the residual block of the current block does not necessarily conform to the rules that can be reflected in a given prediction mode. Therefore, for geometric partitioning mode, the decoder needs to determine a set of transformation matrices that conform to the characteristics before performing the first transformation on the first transformation coefficients. However, since the set of transformation matrices used for the first transformation is usually a set of transformation matrices defined based on a single intra-prediction mode, the relevant scheme for determining the transformation matrices used for the first transformation needs to be further improved for geometric partitioning mode, and each embodiment is described below illustratively.
[0172] In some embodiments, the set of transformation matrices used for the first transformation is the same as the set of transformation matrices that conform to the planar mode or DC mode.
[0173] For example, when the decoder checks the predicted mode of the current block, if a geometric partitioning mode is used for the current block, the decoder classifies the geometric partitioning mode and the planar mode (or DC mode) into one type and determines the set of transformation matrices to be used for the first transformation based on the planar mode (or DC mode). Alternatively, when the decoder checks the predicted mode of the current block, if a geometric partitioning mode is used for the current block, the encoder may return the planar mode (or DC mode) as the predicted mode of the current block, and based on this, the decoder may determine the set of transformation matrices that conforms to the planar mode (or DC mode) as the set of transformation matrices to be used for the first transformation. Alternatively, when the decoder checks the predicted mode of the current block, if a geometric partitioning mode is used for the current block, the decoder may consider that the set of transformation matrices used for the first transformation of the current block may be a set of transformation matrices that conforms to the planar mode (or DC mode).
[0174] In this embodiment, since both the planar mode (or DC mode) and the geometric partitioning mode can represent various texture characteristics, by determining a set of transformation matrices that conforms to the planar mode or DC mode as the set of transformation matrices used for the first transformation, it is not only possible to decode the current block based on the geometric partitioning mode and the first transformation, but it is also guaranteed that the texture characteristics of the set of transformation matrices used for the first transformation will be as similar as possible to the texture characteristics of the residual block of the current block, thereby improving decompression efficiency.
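One way to picture this embodiment is a helper that reports the prediction mode to be used when choosing the transformation matrix set. The sketch below is illustrative only: the mode numbers 0 (planar) and 1 (DC) follow common VVC-style numbering and are an assumption here, as are the function and parameter names.

```python
INTRA_PLANAR = 0  # assumption: VVC-style mode numbering
INTRA_DC = 1      # assumption: VVC-style mode numbering


def mode_for_transform_set(
    uses_gpm: bool,
    signalled_mode: int,
    fallback: int = INTRA_PLANAR,
) -> int:
    """Return the prediction mode used to select the transformation matrix set.

    When the current block uses the geometric partitioning mode, the block is
    treated as if it were predicted with planar (or DC) mode; otherwise the
    block's own mode is used unchanged.
    """
    if fallback not in (INTRA_PLANAR, INTRA_DC):
        raise ValueError("fallback must be the planar or DC mode")
    return fallback if uses_gpm else signalled_mode
```

The transformation matrix set is then selected from the returned mode in the same way as for an ordinary intra-predicted block.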
[0175] For example, if the prediction modes corresponding to the geometric partitioning modes (i.e., the first and second prediction modes) are both intra-prediction modes, or if the geometric partitioning modes are used for intra-prediction, the decoder determines that the set of transformation matrices used for the first transformation is the same as the set of transformation matrices that conform to the planar mode or DC mode.
[0176] For example, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block and both of the two prediction modes corresponding to the geometric partitioning mode are intra-prediction modes, the decoder classifies the geometric partitioning mode and the planar mode (or DC mode) into one type and determines the set of transformation matrices to be used for the first transformation based on the planar mode (or DC mode). Alternatively, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block and both of the two prediction modes corresponding to the geometric partitioning mode are intra-prediction modes, the encoder can return the planar mode (or DC mode) as the prediction mode of the current block, and based on this, the decoder can determine the set of transformation matrices that conforms to the planar mode (or DC mode) as the set of transformation matrices to be used for the first transformation. Alternatively, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block and both of the two prediction modes corresponding to the geometric partitioning mode are intra-prediction modes, the decoder can consider that the set of transformation matrices used for the first transformation of the current block may be a set of transformation matrices that conforms to the planar mode (or DC mode).
[0177] For example, if the prediction modes corresponding to the geometric partitioning mode (i.e., the first and second prediction modes) are both inter-prediction modes, or if the geometric partitioning mode is used for inter-prediction, the decoder determines that the set of transformation matrices used for the first transformation is the same as the set of transformation matrices that conforms to the planar mode or DC mode.
[0178] For example, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block and both of the two prediction modes corresponding to the geometric partitioning mode are inter-prediction modes, the decoder classifies the geometric partitioning mode and the planar mode (or DC mode) into one type and determines the set of transformation matrices to be used for the first transformation based on the planar mode (or DC mode). Alternatively, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block and both of the two prediction modes corresponding to the geometric partitioning mode are inter-prediction modes, the encoder can return the planar mode (or DC mode) as the prediction mode of the current block, and based on this, the decoder can determine the set of transformation matrices that conforms to the planar mode (or DC mode) as the set of transformation matrices to be used for the first transformation. Alternatively, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block and both of the two prediction modes corresponding to the geometric partitioning mode are inter-prediction modes, the decoder can consider that the set of transformation matrices used for the first transformation of the current block may be a set of transformation matrices that conforms to the planar mode (or DC mode).
[0179] In some embodiments, the decoder first determines a dividing line consisting of points where the weights of the first prediction mode and the weights of the second prediction mode are the same, then determines the angular index of the dividing line, and then, based on the angular index, determines the set of transformation matrices to be used for the first transformation.
[0180] For example, if the prediction modes corresponding to the geometric partitioning mode (i.e., the first and second prediction modes) are both intra-prediction modes, or if the geometric partitioning mode is used for intra-prediction, the decoder first determines a partition line consisting of points where the weights of the first prediction mode and the weights of the second prediction mode are the same, then determines the angular index of the partition line, and then, based on the angular index, determines the set of transformation matrices to be used for the first transformation. For example, the decoder determines the set of transformation matrices that fits the prediction mode corresponding to the angular index (e.g., an intra-prediction mode) as the set of transformation matrices to be used for the first transformation.
[0181] For example, if the prediction modes corresponding to the geometric partitioning mode (i.e., the first and second prediction modes) are both inter-prediction modes, or if the geometric partitioning mode is used for inter-prediction, the decoder first determines a partition line consisting of points where the weights of the first prediction mode and the weights of the second prediction mode are the same, then determines the angular index of the partition line, and then, based on the angular index, determines the set of transformation matrices to be used for the first transformation. For example, the decoder determines the set of transformation matrices that fits the prediction mode corresponding to the angular index (e.g., an intra-prediction mode) as the set of transformation matrices to be used for the first transformation.
[0182] For example, when predicting a current block using a geometric partitioning mode, the decoder can classify the geometric partitioning mode and the intra-prediction mode corresponding to the partition line into one type based on the partition line of the geometric partitioning mode when selecting the set of transformation matrices to use for the first transformation. That is, it can determine the set of transformation matrices to use for the first transformation based on the angle index. Specifically, it determines the angle index (angleIdx) of the partition line based on the weight derivation mode used for the geometric partitioning mode, and then determines the set of transformation matrices that fits the intra-prediction mode corresponding to the angle index as the set of transformation matrices to use for the first transformation. Alternatively, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block, the encoder can return the intra-prediction mode corresponding to the partition line as the prediction mode of the current block, and based on this, the decoder can determine the set of transformation matrices that fits the intra-prediction mode corresponding to the partition line as the set of transformation matrices to use for the first transformation. Alternatively, when the decoder checks the prediction mode for the current block, if a geometric partitioning mode is used for the current block, the decoder may assume that the set of transformation matrices used for the first transformation of the current block may be a set of transformation matrices that conforms to the intra-prediction mode corresponding to the partition line.
[0183] In this embodiment, the partition line or the angle index can represent not only the characteristics of the geometric partitioning mode but also, to some extent, the texture characteristics of the residual block of the current block. By associating the angle index with an intra-prediction mode, it is possible to decode the current block based on the geometric partitioning mode and the first transformation. Furthermore, it is ensured that the texture characteristics of the transformation matrix set used for the first transformation are as similar as possible to the texture characteristics of the residual block of the current block, thereby improving decompression efficiency.
[0184] For clarity, the partition line described above may be that of the weight derivation mode, the weight matrix, or the weight matrix derivation mode. For details, refer to the explanation of Figure 8 or 9 and related content; they are not repeated here.
[0185] By way of example, the decoder can determine the set of transformation matrices to be used for the first transformation based on a first mapping relationship and the angular index. The first mapping relationship includes a correspondence between at least one index and at least one intra-prediction mode, where the at least one index includes the angular index. Of course, the first mapping relationship may be implemented as a table, an array, or in another form, and this application is not specifically limited in this respect.
[0186] For example, the first mapping relationship can be realized as shown in Table 8 below.
[0187] [Table 8]
[0188] As shown in Table 8, the first mapping relationship can include 32 indices and an intra-prediction mode corresponding to each index.
[0189] The indices in Table 8 may include the angleIdx values shown in Table 1 and, of course, may also include indices other than those angleIdx values. For example, some indices correspond to 0 because they are not used in the geometric partitioning mode, that is, they are not included in Table 1. Furthermore, if the indices change in a later version, for example if more indices are used in the future, or if the angular intra-prediction modes change, for example if more angular intra-prediction modes are added in the future, the correspondence shown in Table 8 may be changed accordingly; this application is not particularly limited in this respect.
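The lookup through the first mapping relationship can be sketched as a simple table access. Table 8 is not reproduced here, so the entries in `TOY_MAPPING` below are placeholder values chosen only to make the example run; they are not taken from Table 8, and the function name is likewise an assumption.

```python
from typing import Mapping

# Placeholder entries for illustration only; NOT the contents of Table 8.
# An entry of 0 stands for an index that the geometric partitioning mode
# does not use.
TOY_MAPPING = {0: 50, 1: 0, 8: 18}


def intra_mode_for_angle_index(
    angle_idx: int,
    first_mapping: Mapping[int, int],
) -> int:
    """Look up the intra-prediction mode associated with an angle index."""
    if angle_idx not in first_mapping:
        raise KeyError(f"angle index {angle_idx} is not in the mapping")
    return first_mapping[angle_idx]
```

The returned intra-prediction mode is then used to select the transformation matrix set, exactly as it would be for a block predicted with that intra-prediction mode.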
[0190] In this embodiment, the decoder may determine the set of transformation matrices used for the first transformation based solely on the angle index of the dividing line. In other alternative embodiments, the decoder may also determine the set of transformation matrices used for the first transformation based solely on the distance index of the dividing line. For example, the decoder may determine the set of transformation matrices used for the first transformation to be one that conforms to a prediction mode (e.g., an intra-prediction mode) corresponding to the distance index of the dividing line. Alternatively, the decoder may determine the set of transformation matrices used for the first transformation based on the dividing line (i.e., the angle index of the dividing line and the distance index of the dividing line). For example, the decoder may determine the set of transformation matrices used for the first transformation to be one that conforms to a prediction mode (e.g., an intra-prediction mode) corresponding to the dividing line (i.e., the angle index of the dividing line and the distance index of the dividing line). This invention is not particularly limited to these.
[0191] In some embodiments, the decoder first determines the weight derivation mode to be used for the geometric partitioning mode, and then determines the set of transformation matrices that fit the intra-prediction mode corresponding to the weight derivation mode as the set of transformation matrices to be used for the first transformation.
[0192] For example, if the prediction modes corresponding to the geometric partitioning modes (i.e., the first and second prediction modes) are both intra-prediction modes, or if the geometric partitioning modes are used for intra-prediction, the decoder first determines the weight derivation mode to be used for the geometric partitioning mode, and then determines the set of transformation matrices that fit the intra-prediction mode corresponding to the weight derivation mode as the set of transformation matrices to be used for the first transformation.
[0193] For example, if the prediction modes corresponding to the geometric partitioning mode (i.e., the first and second prediction modes) are both inter-prediction modes, or if the geometric partitioning mode is used for inter-prediction, the decoder first determines the weight derivation mode to be used for the geometric partitioning mode, and then determines the set of transformation matrices that fits the intra-prediction mode corresponding to the weight derivation mode as the set of transformation matrices to be used for the first transformation.
[0194] For example, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block, the decoder classifies the geometric partitioning mode and the intra-prediction mode corresponding to the weight derivation mode into one type, and determines the set of transformation matrices to be used for the first transformation based on the intra-prediction mode corresponding to the weight derivation mode. Alternatively, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block, the encoder can return the intra-prediction mode corresponding to the weight derivation mode as the prediction mode of the current block, and based on this, the decoder can determine the set of transformation matrices that matches the intra-prediction mode corresponding to the weight derivation mode as the set of transformation matrices to be used for the first transformation. Alternatively, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block, the decoder can consider that the set of transformation matrices used for the first transformation of the current block may be a set of transformation matrices that matches the intra-prediction mode corresponding to the weight derivation mode.
[0195] In this embodiment, the weight derivation mode can not only represent the characteristics of the geometric partitioning mode, but also to some extent the texture characteristics of the residual blocks of the current block. By associating the weight derivation mode with the intra prediction mode, it is possible to decode the current block based on the geometric partitioning mode and the first transformation. Furthermore, it is ensured that the texture characteristics of the transformation matrix set used in the first transformation are as similar as possible to the texture characteristics of the residual blocks of the current block, thereby improving decompression efficiency.
[0196] By way of example, the decoder may determine, based on a second mapping relationship, the set of transformation matrices that conforms to the intra-prediction mode corresponding to the weight derivation mode used for the geometric partitioning mode as the set of transformation matrices used for the first transformation. The second mapping relationship includes a correspondence between at least one weight derivation mode and at least one intra-prediction mode, where the at least one weight derivation mode includes the weight derivation mode used for the geometric partitioning mode.
[0197] Of course, the second mapping relationship may be implemented in the form of a table, or in other forms such as an array, and this application is not specifically limited to these forms. The weight derivation mode described above can also be called the weight matrix or weight matrix derivation mode, and its specific details can be found in the explanation of Figure 8, Table 1 and related content, and will not be explained further here to avoid redundancy.
[0198] In some embodiments, the first prediction mode is the first intra-prediction mode, and the second prediction mode is the second intra-prediction mode. The decoder determines the third intra-prediction mode based on the first and second intra-prediction modes. The set of transformation matrices used for the first transformation is the same as the set of transformation matrices that conform to the third intra-prediction mode.
[0199] For example, the decoder can determine a set of transformation matrices that fits a third intra-prediction mode as the set of transformation matrices used for the first transformation.
[0200] For example, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block, the decoder can determine a third intra-prediction mode based on the first and second intra-prediction modes, classify the geometric partitioning mode and the third intra-prediction mode into one type, and based on this, the decoder can determine the set of transformation matrices to be used for the first transformation based on the third intra-prediction mode. Alternatively, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block, the encoder can return a third intra-prediction mode as the prediction mode of the current block, and based on this, the decoder can determine the set of transformation matrices that conforms to the third intra-prediction mode as the set of transformation matrices to be used for the first transformation. Alternatively, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block, the decoder can consider the set of transformation matrices used for the first transformation of the current block to be the set of transformation matrices that conforms to the third intra-prediction mode.
[0201] Of course, in other alternative embodiments, the decoder does not necessarily have to explicitly determine a third intra-prediction mode first, and then determine the set of transformation matrices to be used for the first transformation based on the third intra-prediction mode. Instead, the decoder directly uses the set of transformation matrices that fits the third intra-prediction mode as the set of transformation matrices to be used for the first transformation.
[0202] In some embodiments, the decoder determines the default prediction mode from among the first and second intra-prediction modes as the third intra-prediction mode. Alternatively, the decoder determines the intra-prediction mode from among the first and second intra-prediction modes that corresponds to the weight derivation mode used for the geometric partitioning mode as the third intra-prediction mode. Alternatively, the decoder determines the third intra-prediction mode based on the weights of the first intra-prediction mode and / or the weights of the second intra-prediction mode. Alternatively, the decoder determines the third intra-prediction mode based on the type of the first intra-prediction mode and the type of the second intra-prediction mode. Alternatively, the decoder determines the third intra-prediction mode based on the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode.
[0203] For example, when the decoder checks the prediction mode of the current block, and a geometric partitioning mode is used for the current block, it can determine the set of transformation matrices to be used for the first transformation based on the first intra-prediction mode and the second intra-prediction mode when selecting the set of transformation matrices to be used for the first transformation. In one embodiment, the determination can be made using the first intra-prediction mode in any case, i.e., in any case, the set of transformation matrices that conforms to the first intra-prediction mode is determined as the set of transformation matrices to be used for the first transformation; or, the determination can be made using the second intra-prediction mode in any case, i.e., in any case, the set of transformation matrices that conforms to the second intra-prediction mode is determined as the set of transformation matrices to be used for the first transformation. In another embodiment, the set of transformation matrices may be determined using a first intra-prediction mode, i.e., a set of transformation matrices that conforms to the first intra-prediction mode may be determined as the set of transformation matrices used for the first transformation; or, the set of transformation matrices may be determined using a second intra-prediction mode, i.e., a set of transformation matrices that conforms to the second intra-prediction mode may be determined as the set of transformation matrices used for the first transformation; furthermore, the set of transformation matrices may be determined using a planar mode or a DC mode, i.e., a set of transformation matrices that conforms to the planar mode or a DC mode may be determined as the set of transformation matrices used for the first transformation. 
Here, determining using a given prediction mode means classifying the geometric partitioning mode and that prediction mode into one type, and on this basis the decoder can determine the set of transformation matrices used for the first transformation based on that prediction mode. Alternatively, when the decoder checks the prediction mode of the current block, if a geometric partitioning mode is used for the current block, it can return that prediction mode, and based on this, the decoder can determine the set of transformation matrices used for the first transformation based on that prediction mode. Alternatively, when the decoder checks the prediction mode for the current block, if a geometric partitioning mode is used for the current block, the decoder may assume that the set of transformation matrices used for the first transformation of the current block may be a set of transformation matrices that fits that prediction mode.
[0204] For example, when the third intra-prediction mode is to be whichever of the first and second intra-prediction modes corresponds to the weight derivation mode used for the geometric partitioning mode, the decoder can identify that intra-prediction mode based on a third mapping relationship and determine it as the third intra-prediction mode.
[0205] The third mapping relationship includes a weight derivation mode corresponding to a first intra-prediction mode and a weight derivation mode corresponding to a second intra-prediction mode, wherein the weight derivation mode corresponding to the first intra-prediction mode includes a weight derivation mode used for geometric partitioning mode, or the weight derivation mode corresponding to the second intra-prediction mode includes a weight derivation mode used for geometric partitioning mode. In other words, based on the third mapping relationship, if the weight derivation mode corresponding to the first intra-prediction mode includes a weight derivation mode used for geometric partitioning mode, the first intra-prediction mode is determined as the third intra-prediction mode. If the weight derivation mode corresponding to the second intra-prediction mode includes a weight derivation mode used for geometric partitioning mode, the second intra-prediction mode is determined as the third intra-prediction mode. Alternatively, the third mapping relationship can be used to define a weight derivation mode corresponding to a first prediction mode and a weight derivation mode corresponding to a second prediction mode. In one specific embodiment, the third mapping relationship may include only the weight derivation mode corresponding to the first intra-prediction mode and the weight derivation mode corresponding to the second intra-prediction mode, while in another specific embodiment, the third mapping relationship may also include the weight derivation mode corresponding to intra-prediction modes other than the first intra-prediction mode and the second intra-prediction mode.
[0206] Of course, the third mapping relationship may be implemented in the form of a table, or in other forms such as an array, and this application is not specifically limited to these forms.
[0207] In some embodiments, the decoder can determine a third intra-prediction mode based on the weights of a first intra-prediction mode or a second intra-prediction mode used at the default position.
[0208] For example, the third intra-prediction mode is related to the weights of the first intra-prediction mode or the second intra-prediction mode used at the default position.
[0209] For example, the weights of the first intra-prediction mode used at the default position may be the weights used when predicting a point at the default position of the current block using the first intra-prediction mode. Similarly, the weights of the second intra-prediction mode used at the default position may be the weights used when predicting a point at the default position of the current block using the second intra-prediction mode. For example, in connection with the geometric partitioning modes described above, the weight of the first intra-prediction mode used at the default position may be the wValue calculated at the default position, and the weight of the second intra-prediction mode used at the default position may be 8 - wValue calculated at the default position.
[0210] In some embodiments, when the decoder determines a third intra-prediction mode based on the weights of a first intra-prediction mode and / or a second intra-prediction mode, it may determine the intra-prediction mode with the largest weight at the default position among the first and second intra-prediction modes as the third intra-prediction mode.
[0211] For example, if the weight of the first intra-prediction mode used at the default position is greater than the weight of the second intra-prediction mode used at the default position, the decoder can either determine the first intra-prediction mode as the third intra-prediction mode, or directly determine a set of transformation matrices that conforms to the first intra-prediction mode as the set of transformation matrices used for the first transformation. If the weight of the second intra-prediction mode used at the default position is greater than the weight of the first intra-prediction mode used at the default position, the decoder can either determine the second intra-prediction mode as the third intra-prediction mode, or directly determine a set of transformation matrices that conforms to the second intra-prediction mode as the set of transformation matrices used for the first transformation.
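The comparison at the default position can be sketched as follows, using the wValue and 8 - wValue weights described earlier. The function name is an assumption, and so is the tie-breaking rule: the text does not say which mode is chosen when both weights are equal, so this sketch arbitrarily prefers the first intra-prediction mode.

```python
def third_mode_from_default_position(
    w_value: int,
    first_intra_mode: int,
    second_intra_mode: int,
) -> int:
    """Pick the intra-prediction mode with the larger weight at the default position.

    w_value is the weight of the first intra-prediction mode at the default
    position (0..8); the second mode's weight is 8 - w_value.
    Assumption: on a tie (w_value == 4) the first mode is chosen.
    """
    if not 0 <= w_value <= 8:
        raise ValueError("w_value must be in the range 0..8")
    weight_first = w_value
    weight_second = 8 - w_value
    if weight_first >= weight_second:
        return first_intra_mode
    return second_intra_mode
```

The returned mode plays the role of the third intra-prediction mode, and the transformation matrix set that conforms to it is used for the first transformation.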
[0212] Of course, in other alternative embodiments, the third intra-prediction mode may be determined based solely on the weights of the first intra-prediction mode or the weights of the second intra-prediction mode. For example, if the weight of the first intra-prediction mode is greater than a certain threshold, the first intra-prediction mode can be determined as the third intra-prediction mode; otherwise, the second intra-prediction mode can be determined as the third intra-prediction mode. For example, if the weight of the second intra-prediction mode is greater than a certain threshold, the second intra-prediction mode can be determined as the third intra-prediction mode; otherwise, the first intra-prediction mode can be determined as the third intra-prediction mode.
[0213] Of course, in other alternative embodiments, the weights of the first intra-prediction mode may be the weights of the first intra-prediction mode statistically aggregated against the weight matrix used for the geometric partitioning mode, and similarly, the weights of the second intra-prediction mode may be the weights of the second intra-prediction mode statistically aggregated against the weight matrix used for the geometric partitioning mode. In other words, the decoder can first calculate the weight value for each point, statistically aggregate the weight values of all points for each of the first and / or second intra-prediction modes, and then determine the third intra-prediction mode based on the weights of the first and / or second intra-prediction modes. In one embodiment, if the weight of the first intra-prediction mode is greater than a threshold, the first intra-prediction mode can be determined as the third intra-prediction mode; otherwise, the second intra-prediction mode can be determined as the third intra-prediction mode. In another embodiment, the intra-prediction mode with the largest weight among the first intra-prediction mode and the second intra-prediction mode can be determined as the third intra-prediction mode.
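The aggregated-weight variant in paragraph [0213] can be sketched as follows. This is only an illustrative sketch: the function name, the `max_weight` value of 8, the assumption that the two weights at each point sum to `max_weight`, and the tie-break toward the first mode are not taken from this application.

```python
from typing import List, Optional


def pick_mode_by_aggregated_weight(
    weights_first: List[List[int]],
    mode_first: int,
    mode_second: int,
    max_weight: int = 8,
    threshold: Optional[int] = None,
) -> int:
    """Choose the third intra-prediction mode from weights summed over all points.

    weights_first[y][x] is the weight of the first mode at point (x, y).
    Assumption: the weight of the second mode is max_weight - weights_first[y][x].
    """
    total_first = sum(sum(row) for row in weights_first)
    num_points = sum(len(row) for row in weights_first)
    total_second = max_weight * num_points - total_first

    if threshold is not None:
        # Threshold variant: compare only the first mode's aggregated weight.
        return mode_first if total_first > threshold else mode_second

    # Largest-weight variant; ties go to the first mode (an assumption).
    return mode_first if total_first >= total_second else mode_second
```

With a 4x4 weight matrix whose rows are `[8, 8, 8, 0]`, the first mode accumulates 96 out of 128, so it is selected unless a threshold above 96 is supplied.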
[0214] In some embodiments, the default position is the center position.
[0215] Of course, in other alternative embodiments, the default position may be any other position, such as the top left, top right, bottom left, or bottom right.
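The comparison of weights at a default position described in paragraphs [0210] to [0215] can be sketched as follows. The function name, the position labels, the `max_weight` value of 8, the complementary-weight assumption, and the tie-break toward the first mode are hypothetical and serve illustration only.

```python
from typing import List


def pick_mode_by_default_position(
    weights_first: List[List[int]],
    mode_first: int,
    mode_second: int,
    max_weight: int = 8,
    position: str = "center",
) -> int:
    """Return the mode with the larger weight at the default position.

    weights_first[y][x] is the weight of the first mode at point (x, y).
    Assumption: the second mode's weight there is max_weight - weights_first[y][x].
    """
    height = len(weights_first)
    width = len(weights_first[0])
    candidates = {
        "center": (height // 2, width // 2),
        "top_left": (0, 0),
        "top_right": (0, width - 1),
        "bottom_left": (height - 1, 0),
        "bottom_right": (height - 1, width - 1),
    }
    y, x = candidates[position]
    weight_first = weights_first[y][x]
    weight_second = max_weight - weight_first
    # Ties go to the first mode; the text does not specify a tie-break.
    return mode_first if weight_first >= weight_second else mode_second
```

For a block whose left half is fully weighted toward the first mode and whose right half toward the second, the choice depends on which side of the dividing line the default position falls.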
[0216] In some embodiments, when the decoder determines a third intra-prediction mode based on the weights of a first intra-prediction mode or a second intra-prediction mode used at the default position, it may first determine a dividing line consisting of points where the weights of the first and second prediction modes are the same, and then determine the third intra-prediction mode based on the dividing line. For example, the decoder may determine the third intra-prediction mode based on the angle index and / or distance index of the dividing line.
[0217] For example, when the decoder determines, as the third intra-prediction mode, whichever of the first intra-prediction mode and the second intra-prediction mode corresponds to the angle index of the dividing line, it can make this determination based on a fourth mapping relationship.
[0218] The fourth mapping relationship includes the angle indices corresponding to the first intra-prediction mode and the angle indices corresponding to the second intra-prediction mode, wherein either the angle indices corresponding to the first intra-prediction mode or the angle indices corresponding to the second intra-prediction mode include the angle index of the dividing line. In other words, based on the fourth mapping relationship, if the angle indices corresponding to the first intra-prediction mode include the angle index of the dividing line, the first intra-prediction mode is determined as the third intra-prediction mode. If the angle indices corresponding to the second intra-prediction mode include the angle index of the dividing line, the second intra-prediction mode is determined as the third intra-prediction mode. Put differently, the fourth mapping relationship defines the angle indices corresponding to the first intra-prediction mode and the angle indices corresponding to the second intra-prediction mode. In one specific embodiment, the fourth mapping relationship may include only the angle indices corresponding to the first intra-prediction mode and the angle indices corresponding to the second intra-prediction mode, while in another specific embodiment, the fourth mapping relationship may also include the angle indices corresponding to intra-prediction modes other than the first intra-prediction mode and the second intra-prediction mode.
[0219] Of course, the fourth mapping relationship may be implemented in the form of a table, or in other forms such as an array, and this application is not specifically limited to these forms.
[0220] For example, when the decoder determines, as the third intra-prediction mode, whichever of the first intra-prediction mode and the second intra-prediction mode corresponds to the distance index of the dividing line, it can make this determination based on a fifth mapping relationship.
[0221] The fifth mapping relationship includes the distance indices corresponding to the first intra-prediction mode and the distance indices corresponding to the second intra-prediction mode, wherein either the distance indices corresponding to the first intra-prediction mode or the distance indices corresponding to the second intra-prediction mode include the distance index of the dividing line. In other words, based on the fifth mapping relationship, if the distance indices corresponding to the first intra-prediction mode include the distance index of the dividing line, the first intra-prediction mode is determined as the third intra-prediction mode. If the distance indices corresponding to the second intra-prediction mode include the distance index of the dividing line, the second intra-prediction mode is determined as the third intra-prediction mode. Put differently, the fifth mapping relationship defines the distance indices corresponding to the first intra-prediction mode and the distance indices corresponding to the second intra-prediction mode. In one specific embodiment, the fifth mapping relationship may include only the distance indices corresponding to the first intra-prediction mode and the distance indices corresponding to the second intra-prediction mode, while in another specific embodiment, the fifth mapping relationship may also include the distance indices corresponding to intra-prediction modes other than the first intra-prediction mode and the second intra-prediction mode.
[0222] Of course, the fifth mapping relationship may be implemented in the form of a table, or in other forms such as an array, and this application is not specifically limited to these forms.
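Both the fourth and the fifth mapping relationship amount to a lookup from an intra-prediction mode to the set of dividing-line indices associated with it. A minimal sketch follows, assuming the mapping is held as a dictionary; the function name, the dictionary layout, and the behavior of returning `None` when neither mode matches are hypothetical.

```python
from typing import Dict, Optional, Set


def pick_mode_by_index(
    line_index: int,
    mode_first: int,
    mode_second: int,
    mapping: Dict[int, Set[int]],
) -> Optional[int]:
    """Select the mode whose mapped indices contain the dividing-line index.

    mapping[mode] is the set of angle indices (fourth mapping relationship)
    or distance indices (fifth mapping relationship) assigned to that mode.
    Returns None when neither mode matches; the text leaves this case open.
    """
    if line_index in mapping.get(mode_first, set()):
        return mode_first
    if line_index in mapping.get(mode_second, set()):
        return mode_second
    return None
```

The same function serves either mapping: pass the angle index of the dividing line together with an angle-index table, or the distance index together with a distance-index table. The table may also hold entries for other intra-prediction modes, which the lookup simply ignores.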
[0223] In some embodiments, when the decoder determines a third intra-prediction mode based on the type of a first intra-prediction mode and the type of a second intra-prediction mode, if the first and second intra-prediction modes include an angle prediction mode and a non-angle prediction mode, the decoder determines the angle prediction mode as the third intra-prediction mode.
[0224] For example, when a decoder determines a third intra-prediction mode based on the types of the first and second intra-prediction modes, the priority given to an angle prediction mode as the third intra-prediction mode is higher than the priority given to a non-angle prediction mode as the third intra-prediction mode. For instance, if the first intra-prediction mode is an angle prediction mode and the second intra-prediction mode is a non-angle prediction mode (e.g., a planar mode or a DC mode), the decoder determines the first intra-prediction mode (i.e., the angle prediction mode) as the third intra-prediction mode.
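The type-based priority in paragraphs [0223] and [0224] can be sketched as below. The mode numbering (0 for planar, 1 for DC, 2 and above for angular modes) is an assumption made for illustration and is not stated in this passage; the function names are likewise hypothetical.

```python
from typing import Optional

PLANAR_MODE = 0  # assumed numbering, for illustration only
DC_MODE = 1      # assumed numbering, for illustration only


def is_angular(mode: int) -> bool:
    """Treat every mode other than planar and DC as an angle prediction mode."""
    return mode not in (PLANAR_MODE, DC_MODE)


def pick_mode_by_type(mode_first: int, mode_second: int) -> Optional[int]:
    """Prefer the angle prediction mode when exactly one of the two modes is angular.

    Returns None when both modes are of the same type, since this rule
    alone does not decide that case.
    """
    first_angular = is_angular(mode_first)
    second_angular = is_angular(mode_second)
    if first_angular and not second_angular:
        return mode_first
    if second_angular and not first_angular:
        return mode_second
    return None
```

When both modes are angular or both are non-angular, one of the other rules (weights, prediction angles, or a default mode) would have to make the choice.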
[0225] In some embodiments, when the decoder determines a third intra-prediction mode based on the prediction angle of a first intra-prediction mode and the prediction angle of a second intra-prediction mode, if the prediction angle of the first intra-prediction mode approaches the prediction angle of the second intra-prediction mode, the decoder may determine the first intra-prediction mode, the second intra-prediction mode, or an intra-prediction mode whose prediction angle is between the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode as the third intra-prediction mode. If the difference between the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode is large, the decoder may determine a planar mode or a DC mode as the third intra-prediction mode.
[0226] In some embodiments, if the absolute value of the difference between the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode is less than or equal to a second threshold, the intra-prediction mode corresponding to a first prediction angle is determined as the third intra-prediction mode, where the first prediction angle is determined based on the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode. If the absolute value of the difference between the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode is greater than the second threshold, the planar mode or the DC mode is determined as the third intra-prediction mode.
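The angle-difference rule in paragraphs [0225] and [0226] can be sketched as follows. The text only says that the first prediction angle is determined based on the two prediction angles; taking their average is an assumption made here. The function name, the `angle_to_mode` callback, and the use of mode number 0 for the planar fallback are also hypothetical.

```python
from typing import Callable

PLANAR_MODE = 0  # assumed numbering, for illustration only


def pick_mode_by_prediction_angle(
    angle_first: float,
    angle_second: float,
    second_threshold: float,
    angle_to_mode: Callable[[float], int],
    fallback_mode: int = PLANAR_MODE,
) -> int:
    """Choose the third intra-prediction mode from two prediction angles.

    If the angles are close, map an intermediate angle back to a mode;
    otherwise fall back to a non-angular mode (planar or DC).
    """
    if abs(angle_first - angle_second) <= second_threshold:
        # Assumption: the "first prediction angle" is the average of the two.
        first_prediction_angle = (angle_first + angle_second) / 2.0
        return angle_to_mode(first_prediction_angle)
    return fallback_mode
```

The sketch does not handle wrap-around of angles; a real implementation would need a convention for comparing angles near the ends of the supported range.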
[0227] In some embodiments, the decoder determines a set of transformation matrices that conforms to the geometric partitioning mode as the set of transformation matrices to be used for the first transformation.
[0228] For example, if both prediction modes corresponding to a geometric partitioning mode (i.e., the first and second prediction modes) are intra-prediction modes, or if a geometric partitioning mode is used for intra-prediction, the decoder determines the set of transformation matrices that conforms to the geometric partitioning mode as the set of transformation matrices used for the first transformation.
[0229] For example, if both prediction modes corresponding to the geometric partitioning mode (i.e., the first and second prediction modes) are inter-prediction modes, or if the geometric partitioning mode is used for inter-prediction, the decoder determines the set of transformation matrices that conforms to the geometric partitioning mode as the set of transformation matrices used for the first transformation.
[0230] For example, a decoder can define a set of transformation matrices that are adapted to or dedicated to geometric partitioning modes.
[0231] In some embodiments, the first transformation is used to process diagonal textures in the current block. The second transformation is used to process horizontal and vertical textures in the current block.
[0232] The decoding method according to the embodiments of the present application has been described in detail above from the viewpoint of the decoder. Below, with reference to Figure 17, the encoding method according to the embodiments of the present application is described from the viewpoint of the encoder.
[0233] Figure 17 is a flowchart illustrating an encoding method 400 according to an embodiment of the present application. It should be understood that the encoding method 400 may be performed by an encoder, and is applied, for example, to the encoding framework 100 shown in Figure 1. For the sake of explanation, an encoder will be used as an example.
[0234] As shown in Figure 17, the encoding method 400 may include the following steps:
S410: Predict the current block based on the first prediction mode and the second prediction mode corresponding to the geometric partitioning mode to obtain the predicted block of the current block.
S420: Obtain the residual block of the current block based on the predicted block of the current block.
S430: Perform a third transformation on the residual block of the current block to obtain the third transformation coefficient of the current block.
S440: Perform a fourth transformation on the third transformation coefficient to obtain the fourth transformation coefficient of the current block.
S450: Encode the fourth transformation coefficient.
[0235] It should be understood that the first transformation on the decoding side is the inverse of the fourth transformation on the encoding side, and the second transformation on the decoding side is the inverse of the third transformation on the encoding side. For example, the third transformation is the basic or principal transformation described above, and the fourth transformation is the secondary transformation described above. Accordingly, the first transformation may be the inverse of the secondary transformation, and the second transformation may be the inverse of the basic or principal transformation. For example, the first transformation may be the inverse LFNST, and the second transformation may be the inverse DCT2 type, inverse DCT8 type, or inverse DST7 type, etc. Accordingly, the third transformation may be the DCT2 type, DCT8 type, or DST7 type, etc., and the fourth transformation may be the LFNST.
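The ordering in paragraph [0235] (third then fourth transformation at the encoder, first then second transformation at the decoder) can be illustrated with a small one-dimensional round trip. This is a sketch under strong simplifications: it uses an orthonormal DCT-II as the third transformation and a plain rotation of the two lowest-frequency coefficients as a stand-in for the fourth transformation. The rotation is not an actual LFNST kernel, and prediction, quantization, and entropy coding are omitted. All names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def dct2_matrix(n: int) -> Matrix:
    """Orthonormal DCT-II matrix of size n x n (its inverse is its transpose)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        )
    return rows


def low_frequency_rotation(n: int, theta: float) -> Matrix:
    """Stand-in secondary transform: rotate coefficients 0 and 1, keep the rest."""
    m = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    m[0][0], m[0][1] = math.cos(theta), -math.sin(theta)
    m[1][0], m[1][1] = math.sin(theta), math.cos(theta)
    return m


def apply(m: Matrix, v: List[float]) -> List[float]:
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


PRIMARY = dct2_matrix(4)
SECONDARY = low_frequency_rotation(4, math.pi / 6)


def encode_residual(residual: List[float]) -> List[float]:
    third_coeffs = apply(PRIMARY, residual)         # third transformation
    fourth_coeffs = apply(SECONDARY, third_coeffs)  # fourth transformation
    return fourth_coeffs


def decode_coefficients(first_coeffs: List[float]) -> List[float]:
    second_coeffs = apply(transpose(SECONDARY), first_coeffs)  # first transformation
    residual = apply(transpose(PRIMARY), second_coeffs)        # second transformation
    return residual
```

Because both stand-in matrices are orthonormal, decoding the encoder output recovers the residual up to floating-point error, which is the inverse relationship the paragraph describes.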
[0236] In some embodiments, S450 may include encoding a first flag, a second flag, and the fourth transformation coefficient. The first flag indicates that the geometric partitioning mode is permitted to be used for predicting blocks in the current sequence, and the second flag indicates that the fourth transformation is permitted to be used for transforming blocks in the current sequence.
[0237] In some embodiments, S450 may include encoding a first flag, a second flag, the fourth transformation coefficient, and a third flag. The third flag indicates that the geometric partitioning mode and the fourth transformation are permitted to be used together for blocks in the current sequence.
[0238] In some embodiments, S440 may include performing the fourth transformation on the third transformation coefficient to obtain the fourth transformation coefficient when the height and / or width of the current block is greater than or equal to a first threshold.
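The conditions in paragraphs [0236] to [0238] can be combined into a single check, sketched below. The function name and parameter names are hypothetical, as is the choice to model the "and / or" in the size condition with a switch. Treating an absent third flag as "no additional restriction" is also an assumption.

```python
from typing import Optional


def may_apply_fourth_transform(
    first_flag: bool,
    second_flag: bool,
    width: int,
    height: int,
    first_threshold: int,
    third_flag: Optional[bool] = None,
    require_both_dimensions: bool = True,
) -> bool:
    """Decide whether the fourth transformation is applied to the current block.

    first_flag:  geometric partitioning mode permitted for the sequence.
    second_flag: fourth transformation permitted for the sequence.
    third_flag:  both tools permitted together; None means it is not signalled.
    """
    if not (first_flag and second_flag):
        return False
    if third_flag is not None and not third_flag:
        return False
    if require_both_dimensions:
        return width >= first_threshold and height >= first_threshold
    return width >= first_threshold or height >= first_threshold
```

For example, with a first threshold of 8, a 16x4 block passes only when `require_both_dimensions` is false.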
[0239] In some embodiments, prior to S440, method 400 may further include determining the set of transformation matrices to be used for the fourth transformation.
[0240] In some embodiments, the set of transformation matrices used for the fourth transformation is the same as the set of transformation matrices that conform to the planar mode or DC mode.
[0241] In some embodiments, determining the set of transformation matrices used for the fourth transformation includes: determining the dividing line consisting of points where the weight of the first prediction mode and the weight of the second prediction mode are the same; determining the angle index of the dividing line; and determining, based on the angle index, the set of transformation matrices used for the fourth transformation.
[0242] In some embodiments, determining the set of transformation matrices used for the fourth transformation includes: determining the weight derivation mode used for the geometric partitioning mode; and determining the set of transformation matrices that conforms to the intra-prediction mode corresponding to the weight derivation mode as the set of transformation matrices used for the fourth transformation.
[0243] In some embodiments, the first prediction mode is the first intra-prediction mode, and the second prediction mode is the second intra-prediction mode. Determining the set of transformation matrices used for the fourth transformation includes determining the third intra-prediction mode based on the first and second intra-prediction modes. The set of transformation matrices used for the fourth transformation is the same as the set of transformation matrices that fit the third intra-prediction mode.
[0244] In some embodiments, determining the third intra-prediction mode based on the first intra-prediction mode and the second intra-prediction mode includes: determining a default prediction mode from among the first intra-prediction mode and the second intra-prediction mode as the third intra-prediction mode; or determining, as the third intra-prediction mode, the intra-prediction mode among the first intra-prediction mode and the second intra-prediction mode that corresponds to the weight derivation mode used for the geometric partitioning mode; or determining the third intra-prediction mode based on the weights of the first intra-prediction mode and / or the weights of the second intra-prediction mode; or determining the third intra-prediction mode based on the type of the first intra-prediction mode and the type of the second intra-prediction mode; or determining the third intra-prediction mode based on the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode.
[0245] In some embodiments, determining a third intra-prediction mode based on the weights of a first intra-prediction mode and / or a second intra-prediction mode includes determining a third intra-prediction mode based on the weights of a first intra-prediction mode or a second intra-prediction mode used in the default position.
[0246] In some embodiments, determining the third intra prediction mode based on the weights of the first intra prediction mode used at the default position or the weights of the second intra prediction mode includes determining, as the third intra prediction mode, the intra prediction mode with the largest weight at the default position among the first intra prediction mode and the second intra prediction mode.
[0247] In some embodiments, the default position is the center position.
[0248] In some embodiments, determining the third intra prediction mode based on the types of the first intra prediction mode and the second intra prediction mode includes determining, as the third intra prediction mode, the angular prediction mode when the first intra prediction mode and the second intra prediction mode include an angular prediction mode and a non-angular prediction mode.
[0249] In some embodiments, determining the third intra prediction mode based on the prediction angles of the first intra prediction mode and the second intra prediction mode includes: when the absolute value of the difference between the prediction angle of the first intra prediction mode and the prediction angle of the second intra prediction mode is less than or equal to a second threshold, determining, as the third intra prediction mode, the intra prediction mode corresponding to a first prediction angle, where the first prediction angle is determined based on the prediction angle of the first intra prediction mode and the prediction angle of the second intra prediction mode; and when the absolute value of the difference between the prediction angle of the first intra prediction mode and the prediction angle of the second intra prediction mode is greater than the second threshold, determining the planar mode or the DC mode as the third intra prediction mode.
[0250] In some embodiments, determining the set of transformation matrices used for the fourth transformation includes determining, as the set of transformation matrices used for the fourth transformation, a set of transformation matrices that conforms to the geometric partitioning mode.
[0251] In some embodiments, the fourth transformation is used to process the diagonal texture in the current block. The third transformation is used to process the horizontal and vertical textures in the current block.
[0252] It can be understood that the encoding method is the reverse process of the decoding method. Therefore, for the specific method of the encoding method 400, the relevant content of the decoding method 300 can be referred to, and for the sake of simplicity of explanation, this specification will not explain it again here.
[0253] As described above, the preferred embodiments of the present application have been described in detail with reference to the accompanying drawings. However, the present application is not limited to the detailed content of the above embodiments. Within the scope of the technical idea of the present application, various simple changes can be made to the technical solutions of the present application, and all of these simple changes belong to the protection scope of the present application. For example, each specific technical feature described in the above specific embodiments may be combined by any appropriate means without contradiction. To avoid unnecessary duplication, this application will not explain various possible combinations again. Also, for example, among various different embodiments of the present application, any combination should be regarded as disclosed in the present application as long as it does not violate the idea of the present application. It should be understood that in various method embodiments of the present application, the magnitude of the sequence numbers of the above processes does not mean the execution order. The execution order of each process should be determined by its function and internal logic and should not constitute any limitation to the implementation process of the embodiments of the present application.
[0254] The method embodiments of the present application have been described in detail above. Hereinafter, the apparatus embodiments of the present application will be described in detail with reference to FIGS. 18 to 20.
[0255] FIG. 18 is a block diagram showing a decoder 500 according to an embodiment of the present application.
[0256] As shown in Figure 18, the decoder 500 may include a decoding unit 510, a transformation unit 520, a prediction unit 530, and a reconstruction unit 540. The decoding unit 510 is configured to decode a bitstream and obtain a first transformation coefficient of the current block. The transformation unit 520 is configured to perform a first transformation on the first transformation coefficient to obtain a second transformation coefficient of the current block, and to perform a second transformation on the second transformation coefficient to obtain a residual block of the current block. The prediction unit 530 is configured to predict the current block based on a first prediction mode and a second prediction mode corresponding to the geometric partitioning mode, and to obtain a predicted block of the current block. The reconstruction unit 540 is configured to obtain a reconstructed block of the current block based on the predicted block of the current block and the residual block of the current block.
[0257] In some embodiments, the transformation unit 520 is specifically configured to: decode the bitstream to obtain a first flag and a second flag; and, if the first flag indicates that the geometric partitioning mode is permitted to be used for predicting blocks in the current sequence and the second flag indicates that the first transformation is permitted to be used for transforming blocks in the current sequence, perform the first transformation on the first transformation coefficient to obtain the second transformation coefficient.
[0258] In some embodiments, the transformation unit 520 is specifically configured to: if the first flag indicates that the geometric partitioning mode is permitted to be used for predicting blocks in the current sequence and the second flag indicates that the first transformation is permitted to be used for transforming blocks in the current sequence, decode the bitstream to obtain a third flag; and, if the third flag indicates that the geometric partitioning mode and the first transformation are permitted to be used together for blocks in the current sequence, perform the first transformation on the first transformation coefficient to obtain the second transformation coefficient.
[0259] In some embodiments, the transformation unit 520 is specifically configured to: if the first flag indicates that the geometric partitioning mode is permitted to be used for predicting blocks in the current sequence and the second flag indicates that the first transformation is permitted to be used for transforming blocks in the current sequence, then, when the height and / or width of the current block exceeds a first threshold, perform the first transformation on the first transformation coefficient to obtain the second transformation coefficient.
[0260] In some embodiments, before performing the second transformation on the second transformation coefficient to obtain the residual block of the current block, the transformation unit 520 is further configured to determine the set of transformation matrices to be used for the first transformation.
[0261] In some embodiments, the set of transformation matrices used for the first transformation is the same as the set of transformation matrices that conform to the planar mode or DC mode.
[0262] In some embodiments, the transformation unit 520 is configured to specifically perform the following: determine a dividing line consisting of points where the weights of the first prediction mode and the weights of the second prediction mode are the same; determine the angular index of the dividing line; and determine the set of transformation matrices to be used for the first transformation based on the angular index.
[0263] In some embodiments, the transformation unit 520 is configured to specifically perform the following: determine the weight derivation mode to be used for the geometric partitioning mode; and determine the set of transformation matrices that fit the intra-prediction mode corresponding to the weight derivation mode as the set of transformation matrices to be used for the first transformation.
[0264] In some embodiments, the first prediction mode is the first intra-prediction mode, and the second prediction mode is the second intra-prediction mode. Specifically, the transformation unit 520 is configured to determine a third intra-prediction mode based on the first and second intra-prediction modes. The set of transformation matrices used for the first transformation is the same as the set of transformation matrices that conform to the third intra-prediction mode.
[0265] In some embodiments, the transformation unit 520 is specifically configured to: determine a default prediction mode from among the first and second intra-prediction modes as the third intra-prediction mode; or determine, as the third intra-prediction mode, the intra-prediction mode among the first and second intra-prediction modes that corresponds to the weight derivation mode used for the geometric partitioning mode; or determine the third intra-prediction mode based on the weights of the first intra-prediction mode and / or the weights of the second intra-prediction mode; or determine the third intra-prediction mode based on the type of the first intra-prediction mode and the type of the second intra-prediction mode; or determine the third intra-prediction mode based on the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode.
[0266] In some embodiments, the transformation unit 520 is specifically configured to determine the third intra prediction mode based on the weight of the first intra prediction mode or the weight of the second intra prediction mode used at the default position.
[0267] In some embodiments, the transformation unit 520 is specifically configured to determine, as the third intra prediction mode, the intra prediction mode with the largest weight at the default position among the first intra prediction mode and the second intra prediction mode.
[0268] In some embodiments, the default position is the center position.
[0269] In some embodiments, the transformation unit 520 is specifically configured to determine, as the third intra prediction mode, the angular prediction mode when the first intra prediction mode and the second intra prediction mode include an angular prediction mode and a non-angular prediction mode.
[0270] In some embodiments, the transformation unit 520 is specifically configured to: when the absolute value of the difference between the prediction angle of the first intra prediction mode and the prediction angle of the second intra prediction mode is less than or equal to a second threshold, determine the intra prediction mode corresponding to a first prediction angle as the third intra prediction mode, where the first prediction angle is determined based on the prediction angle of the first intra prediction mode and the prediction angle of the second intra prediction mode; and when the absolute value of the difference between the prediction angle of the first intra prediction mode and the prediction angle of the second intra prediction mode is greater than the second threshold, determine the planar mode or the DC mode as the third intra prediction mode.
[0271] In some embodiments, the transformation unit 520 is specifically configured to determine a set of transformation matrices adapted to the geometric partitioning mode as the set of transformation matrices used for the first transformation.
[0272] In some embodiments, the first transformation is used to process diagonal textures in the current block. The second transformation is used to process horizontal and vertical textures in the current block.
[0273] Figure 19 is a block diagram showing an encoder 600 according to an embodiment of the present application.
[0274] As shown in Figure 19, the encoder 600 may include a prediction unit 610, a residual unit 620, a transformation unit 630, and an encoding unit 640. The prediction unit 610 is configured to make a prediction for the current block based on a first prediction mode and a second prediction mode corresponding to the geometric partitioning mode, and to obtain the predicted block of the current block. The residual unit 620 is configured to obtain the residual block of the current block based on the predicted block of the current block. The transformation unit 630 is configured to perform a third transformation on the residual block of the current block and obtain the third transformation coefficient of the current block; and to perform a fourth transformation on the third transformation coefficient and obtain the fourth transformation coefficient of the current block. The encoding unit 640 is configured to encode the fourth transformation coefficient.
[0275] In some embodiments, the encoding unit 640 is configured to encode, specifically, a first flag, a second flag, and a fourth transformation coefficient. The first flag indicates that a geometric partitioning mode is permitted to be used to make predictions for blocks in the current sequence, and the second flag indicates that a fourth transformation is permitted to be used to perform transformations for blocks in the current sequence.
[0276] In some embodiments, the encoding unit 640 is configured to encode, specifically, a first flag, a second flag, a fourth transformation coefficient, and a third flag. The third flag indicates that both the geometric partitioning mode and the fourth transformation are permitted for the block in the current sequence.
[0277] In some embodiments, the transformation unit 630 is configured to perform the fourth transformation on the third transformation coefficient to obtain the fourth transformation coefficient when the height and / or width of the current block is greater than or equal to a first threshold.
[0278] In some embodiments, before performing the fourth transformation on the third transformation coefficient to obtain the fourth transformation coefficient of the current block, the transformation unit 630 is further configured to determine the set of transformation matrices to be used for the fourth transformation.
[0279] In some embodiments, the set of transformation matrices used for the fourth transformation is the same as the set of transformation matrices that conform to the planar mode or DC mode.
[0280] In some embodiments, the transformation unit 630 is configured to specifically perform the following: determine a dividing line consisting of points where the weights of the first prediction mode and the weights of the second prediction mode are the same; determine the angular index of the dividing line; and determine the set of transformation matrices to be used for the fourth transformation based on the angular index.
[0281] In some embodiments, the transformation unit 630 is configured to specifically perform the following: determine the weight derivation mode to be used for the geometric partitioning mode; and determine the set of transformation matrices that fit the intra-prediction mode corresponding to the weight derivation mode as the set of transformation matrices to be used for the fourth transformation.
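A two-stage lookup of this kind might look as follows. The table from weight derivation mode to intra-prediction mode contains made-up values, and the grouping of intra-prediction modes into four sets is loosely modeled on the set selection used for the low-frequency non-separable transform in VVC; both are assumptions of this sketch rather than values taken from the present application.

```python
from typing import Dict

PLANAR, DC = 0, 1  # non-angular modes in the assumed 0..66 mode numbering


def transform_set_for_intra_mode(intra_mode: int) -> int:
    """Map an intra-prediction mode to one of four transformation matrix sets.

    Assumed grouping (modeled on VVC): planar/DC -> 0, then symmetric
    ranges of angular modes -> 1, 2, 3.
    """
    if intra_mode in (PLANAR, DC):
        return 0
    if 2 <= intra_mode <= 12 or 56 <= intra_mode <= 66:
        return 1
    if 13 <= intra_mode <= 23 or 45 <= intra_mode <= 55:
        return 2
    if 24 <= intra_mode <= 44:
        return 3
    raise ValueError(f"unsupported intra-prediction mode: {intra_mode}")


# Hypothetical table: weight derivation mode -> corresponding intra mode.
ILLUSTRATIVE_WEIGHT_MODE_TO_INTRA: Dict[int, int] = {
    0: 18,  # e.g. a horizontal direction
    1: 50,  # e.g. a vertical direction
    2: 34,  # e.g. a diagonal direction
}


def transform_set_for_weight_mode(
        weight_mode: int,
        table: Dict[int, int] = ILLUSTRATIVE_WEIGHT_MODE_TO_INTRA) -> int:
    """Weight derivation mode -> intra-prediction mode -> set index."""
    return transform_set_for_intra_mode(table[weight_mode])
```

Keeping the two stages separate means the second stage can be shared with any other rule that first derives an intra-prediction mode and then selects the set that fits it.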
[0282] In some embodiments, the first prediction mode is the first intra-prediction mode, and the second prediction mode is the second intra-prediction mode. Specifically, the transformation unit 630 is configured to determine the third intra-prediction mode based on the first and second intra-prediction modes. The set of transformation matrices used for the fourth transformation is the same as the set of transformation matrices that fit the third intra-prediction mode.
[0283] In some embodiments, the transformation unit 630 is specifically configured to: determine a default prediction mode from among the first intra-prediction mode and the second intra-prediction mode as the third intra-prediction mode; or determine, as the third intra-prediction mode, the intra-prediction mode from among the first intra-prediction mode and the second intra-prediction mode that corresponds to the weight derivation mode used for the geometric partitioning mode; or determine the third intra-prediction mode based on the weights of the first intra-prediction mode and / or the weights of the second intra-prediction mode; or determine the third intra-prediction mode based on the type of the first intra-prediction mode and the type of the second intra-prediction mode; or determine the third intra-prediction mode based on the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode.
[0284] In some embodiments, the transformation unit 630 is specifically configured to determine the third intra-prediction mode based on the weight of the first intra-prediction mode or the weight of the second intra-prediction mode at a default position.
[0285] In some embodiments, the transformation unit 630 is specifically configured to determine, as the third intra-prediction mode, whichever of the first intra-prediction mode and the second intra-prediction mode has the greatest weight at the default position.
[0286] In some embodiments, the default position is the center position.
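Selecting the third intra-prediction mode from the weights at the center position can be sketched as below. The 0-to-8 weight range, the choice of `(height // 2, width // 2)` as the center of an even-sized block, and the tie-break in favor of the first mode are assumptions made for this example.

```python
from typing import List


def third_mode_by_center_weight(mode1: int, mode2: int,
                                weights: List[List[int]],
                                max_weight: int = 8) -> int:
    """Return the mode with the greater weight at the block center.

    weights[y][x] is the weight of the first intra-prediction mode; the
    second mode is assumed to have weight max_weight - weights[y][x].
    """
    center_y = len(weights) // 2
    center_x = len(weights[0]) // 2
    weight1 = weights[center_y][center_x]
    weight2 = max_weight - weight1
    # Tie-break toward the first mode (an assumption of this sketch).
    return mode1 if weight1 >= weight2 else mode2
```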
[0287] In some embodiments, if the first intra-prediction mode and the second intra-prediction mode include one angular prediction mode and one non-angular prediction mode, the transformation unit 630 is configured to determine the angular prediction mode as the third intra-prediction mode.
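The type-based rule can be expressed compactly. The sketch assumes a mode numbering in which 0 (planar) and 1 (DC) are the only non-angular modes; it returns `None` when the rule does not apply, so that a caller can fall back to another rule.

```python
from typing import Optional

NON_ANGULAR_MODES = {0, 1}  # planar and DC in the assumed numbering


def is_angular(mode: int) -> bool:
    """True for any mode other than planar or DC."""
    return mode not in NON_ANGULAR_MODES


def third_mode_prefer_angular(mode1: int, mode2: int) -> Optional[int]:
    """Pick the angular mode when exactly one of the two modes is angular."""
    if is_angular(mode1) != is_angular(mode2):
        return mode1 if is_angular(mode1) else mode2
    return None  # both angular or both non-angular: rule not applicable
```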
[0288] In some embodiments, determining the third intra-prediction mode based on the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode includes the following. If the absolute value of the difference between the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode is less than or equal to a second threshold, the intra-prediction mode corresponding to a first prediction angle is determined as the third intra-prediction mode, where the first prediction angle is determined based on the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode. If the absolute value of the difference between the prediction angle of the first intra-prediction mode and the prediction angle of the second intra-prediction mode is greater than the second threshold, the planar mode or the DC mode is determined as the third intra-prediction mode.
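The angle-based rule is sketched below. The text leaves open how the first prediction angle is derived from the two prediction angles; the sketch assumes a simple average. The `angle_to_mode` callback, which maps an angle to an intra-prediction mode, and the use of mode 0 for the planar fallback are likewise hypothetical.

```python
from typing import Callable

PLANAR = 0  # assumed mode number for the planar mode


def third_mode_from_angles(angle1_deg: float, angle2_deg: float,
                           second_threshold_deg: float,
                           angle_to_mode: Callable[[float], int],
                           fallback_mode: int = PLANAR) -> int:
    """Choose the third intra-prediction mode from two prediction angles."""
    if abs(angle1_deg - angle2_deg) <= second_threshold_deg:
        # Assumption: the first prediction angle is the average of the two.
        first_angle = (angle1_deg + angle2_deg) / 2
        return angle_to_mode(first_angle)
    # Angles differ too much: fall back to planar (or DC, via fallback_mode).
    return fallback_mode
```

The design intent, as the paragraph describes it, is that two similar directions are summarized by one intermediate direction, while two dissimilar directions are treated as having no dominant direction.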
[0289] In some embodiments, the transformation unit 630 is specifically configured to determine a set of transformation matrices that fits the geometric partitioning mode as the set of transformation matrices used for the fourth transformation.
[0290] In some embodiments, a fourth transformation is used to process diagonal textures in the current block. A third transformation is used to process horizontal and vertical textures in the current block.
[0291] Note that the apparatus embodiment and the method embodiment can correspond to each other, and for similar descriptions, refer to the method embodiment. To avoid duplication, such descriptions are omitted here. Specifically, the decoder 500 shown in Figure 18 may correspond to the entity that performs method 300 in the embodiment of the present application. Furthermore, the aforementioned and other operations and / or functions of each unit in the decoder 500 are each used to realize the corresponding process in each method, such as method 300. Similarly, the encoder 600 shown in Figure 19 may correspond to the entity that performs method 400 in the embodiment of the present application. That is, the aforementioned and other operations and / or functions of each unit in the encoder 600 are each used to realize the corresponding process in each method, such as method 400.
[0292] Furthermore, each unit in the decoder 500 or encoder 600 according to the embodiment of the present application may be integrated into one or more other units, or some of these units may be further divided into several functionally smaller units. This allows similar operation to be achieved without affecting the realization of the technical effects of the embodiment of the present application. The units are divided based on logic functions. In practical applications, the function of one unit may be realized by multiple units, or the function of multiple units may be realized by one unit. In other embodiments of the present application, the decoder 500 or encoder 600 may include other units, and in practical applications, these functions may be realized by the cooperation of other units or by the cooperation of multiple units. According to another embodiment of the present application, for example, a decoder 500 or encoder 600 according to an embodiment of the present application can be constructed by executing a computer program (including program code) capable of executing each step of the corresponding method in a general-purpose computing device such as a general-purpose computer equipped with processing elements and storage elements such as a central processing unit (CPU), random access memory (RAM), and read-only memory (ROM), thereby realizing the encoding method or decoding method according to an embodiment of the present application. The computer program can be recorded, for example, on a computer-readable storage medium, mounted on an electronic device via the computer-readable storage medium, and operated within it, thereby realizing the corresponding method in the embodiment of the present application.
[0293] In other words, the above-mentioned units may be implemented in hardware, in software in the form of instructions, or in a combination of hardware and software. Specifically, each step of the method embodiments of the present application can be completed by integrated logic circuits of hardware in a processor and / or by instructions in software form. The steps of the method disclosed in the embodiments of the present application can be executed and completed directly by a hardware decoding processor, or by a combination of hardware and software in a decoding processor. Optionally, the software can reside in a storage medium that is well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in a memory. The processor reads information from the memory and completes the steps of the method embodiments in combination with its hardware.
[0294] Figure 20 is a block diagram showing an electronic device 700 according to an embodiment of the present application.
[0295] As shown in Figure 20, the electronic device 700 includes at least a processor 710 and a computer-readable storage medium 720. The processor 710 and the computer-readable storage medium 720 may be connected by a bus or other means. The computer-readable storage medium 720 is used to store a computer program 721 containing computer instructions. The processor 710 is used to execute the computer instructions stored in the computer-readable storage medium 720. The processor 710 is the computing core and control core of the electronic device 700 and is suitable for implementing one or more computer instructions, specifically, for implementing a corresponding process or corresponding function by loading and executing one or more computer instructions.
[0296] For example, the processor 710 may also be called a central processing unit (CPU). The processor 710 may include, but is not limited to, a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
[0297] For example, the computer-readable storage medium 720 may be high-speed RAM or non-volatile memory, such as at least one magnetic disk storage device. Optionally, the computer-readable storage medium 720 may be at least one computer-readable storage medium located away from the processor 710. Specifically, the computer-readable storage medium 720 includes, but is not limited to, volatile memory and / or non-volatile memory. Non-volatile memory may be ROM, programmable ROM (PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or flash memory. Volatile memory may be RAM that functions as an external high-speed cache. As illustrative but non-limiting examples, various types of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (synch link DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DRRAM).
[0298] In one embodiment, the electronic device 700 may be an encoder or encoding framework according to the embodiment of the present application. The computer-readable storage medium 720 stores a first computer instruction. The processor 710 loads and executes the first computer instruction stored in the computer-readable storage medium 720 to realize the corresponding step in the encoding method according to the embodiment of the present application. In other words, the first computer instruction in the computer-readable storage medium 720 is loaded and executed by the processor 710 to perform the corresponding step. To avoid redundancy, the explanation is omitted here.
[0299] In one embodiment, the electronic device 700 may be a decoder or decoding framework according to the embodiment of the present application. The computer-readable storage medium 720 stores a second computer instruction. The processor 710 loads and executes the second computer instruction stored in the computer-readable storage medium 720 to realize the corresponding step in the decoding method according to the embodiment of the present application. In other words, the second computer instruction in the computer-readable storage medium 720 is loaded and executed by the processor 710 to perform the corresponding step. To avoid redundancy, the explanation is omitted here.
[0300] In another embodiment of the present application, a coding system is provided, comprising the encoder and decoder described above.
[0301] In another aspect of the present application, an embodiment of the present application provides a computer-readable storage medium (memory). The computer-readable storage medium is a storage device in the electronic device 700 and is used to store programs and data. For example, it may be the computer-readable storage medium 720. It should be understood that the computer-readable storage medium 720 herein may include an internal storage medium in the electronic device 700 and, of course, an extended storage medium supported by the electronic device 700. The computer-readable storage medium provides a storage space in which the operating system of the electronic device 700 is stored. Furthermore, the storage space stores one or more computer instructions suitable for being loaded and executed by the processor 710, which may be one or more computer programs 721 (including program code).
[0302] According to another aspect of the present application, a computer program product or computer program is provided. The computer program product or computer program includes computer instructions, which are stored in a computer-readable storage medium; for example, the computer instructions may be the computer program 721. In this case, the electronic device 700 may be a computer, and the processor 710 reads the computer instructions from the computer-readable storage medium 720 and executes them, thereby causing the computer to execute the encoding method or decoding method according to the various optional implementations described above.
[0303] In other words, when implemented by software, all or part of the above embodiments may be implemented in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes described in the embodiments of this application are executed, or all or part of the functions described in the embodiments of this application are implemented. The computer may be a general-purpose computer, a dedicated computer, a computer network, or other programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic cable, digital subscriber line (DSL), etc.) or wirelessly (e.g., infrared, radio, microwave, etc.).
[0304] It will be apparent to those skilled in the art that, in conjunction with the exemplary units and process steps described in the embodiments disclosed herein, the present application can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software will depend on the specific application of the technology and design limitations. Those skilled in the art can implement the described functions using different methods for each specific application, but these implementations should not be considered beyond the scope of the present application.
[0305] Finally, the above is merely a specific embodiment of the present application, and the scope of protection of the present application is not limited thereto. Any modification or substitution that a person skilled in the art could easily conceive within the scope of the art disclosed herein should be included within the scope of protection of the present application. Accordingly, the scope of protection of the present application should be determined by the scope of protection of the claims.
Claims
1. A decoding method, comprising: decoding a bitstream to obtain a first transformation coefficient of a current block; determining a weight derivation mode used for a geometric partitioning mode; determining, as a set of transformation matrices used for a first transformation, a set of transformation matrices that fits an intra-prediction mode corresponding to the weight derivation mode; performing the first transformation on the first transformation coefficient to obtain a second transformation coefficient of the current block; performing a second transformation on the second transformation coefficient to obtain a residual block of the current block; making a prediction for the current block based on a first prediction mode and a second prediction mode corresponding to the geometric partitioning mode to obtain a predicted block of the current block; and obtaining a reconstructed block of the current block based on the predicted block of the current block and the residual block of the current block.
2. The decoding method according to claim 1, wherein performing the first transformation on the first transformation coefficient to obtain the second transformation coefficient of the current block comprises: decoding the bitstream to obtain a first flag and a second flag; and, if the first flag indicates that the geometric partitioning mode is permitted to be used to make predictions for blocks in a current sequence and the second flag indicates that the first transformation is permitted to be used to perform transformations for blocks in the current sequence, performing the first transformation on the first transformation coefficient to obtain the second transformation coefficient.
3. The decoding method according to claim 2, wherein performing the first transformation on the first transformation coefficient to obtain the second transformation coefficient of the current block comprises: if the first flag indicates that the geometric partitioning mode is permitted to be used to make predictions for blocks in the current sequence, the second flag indicates that the first transformation is permitted to be used to perform transformations for blocks in the current sequence, and the height and / or width of the current block is greater than or equal to a first threshold, performing the first transformation on the first transformation coefficient to obtain the second transformation coefficient.
4. An encoding method, comprising: making a prediction for a current block based on a first prediction mode and a second prediction mode corresponding to a geometric partitioning mode to obtain a predicted block of the current block; obtaining a residual block of the current block based on the predicted block of the current block; performing a third transformation on the residual block of the current block to obtain a third transformation coefficient of the current block; determining a weight derivation mode used for the geometric partitioning mode; determining, as a set of transformation matrices used for a fourth transformation, a set of transformation matrices that fits an intra-prediction mode corresponding to the weight derivation mode; performing the fourth transformation on the third transformation coefficient to obtain a fourth transformation coefficient of the current block; and encoding the fourth transformation coefficient.
5. The encoding method according to claim 4, wherein encoding the fourth transformation coefficient comprises: encoding a first flag, a second flag, and the fourth transformation coefficient, wherein the first flag indicates that the geometric partitioning mode is permitted to be used to make predictions for blocks in a current sequence, and the second flag indicates that the fourth transformation is permitted to be used to perform transformations for blocks in the current sequence.
6. The encoding method according to claim 5, wherein performing the fourth transformation on the third transformation coefficient to obtain the fourth transformation coefficient of the current block comprises: when the height and / or width of the current block is greater than or equal to a first threshold, performing the fourth transformation on the third transformation coefficient to obtain the fourth transformation coefficient.
7. A decoder, comprising a decoding unit, a transformation unit, a prediction unit, and a reconstruction unit, wherein: the decoding unit is configured to decode a bitstream to obtain a first transformation coefficient of a current block; the transformation unit is configured to: determine a weight derivation mode used for a geometric partitioning mode; determine, as a set of transformation matrices used for a first transformation, a set of transformation matrices that fits an intra-prediction mode corresponding to the weight derivation mode; perform the first transformation on the first transformation coefficient to obtain a second transformation coefficient of the current block; and perform a second transformation on the second transformation coefficient to obtain a residual block of the current block; the prediction unit is configured to make a prediction for the current block based on a first prediction mode and a second prediction mode corresponding to the geometric partitioning mode to obtain a predicted block of the current block; and the reconstruction unit is configured to obtain a reconstructed block of the current block based on the predicted block of the current block and the residual block of the current block.
8. An encoder, comprising a prediction unit, a residual unit, a transformation unit, and an encoding unit, wherein: the prediction unit is configured to make a prediction for a current block based on a first prediction mode and a second prediction mode corresponding to a geometric partitioning mode to obtain a predicted block of the current block; the residual unit is configured to obtain a residual block of the current block based on the predicted block of the current block; the transformation unit is configured to: perform a third transformation on the residual block of the current block to obtain a third transformation coefficient of the current block; determine a weight derivation mode used for the geometric partitioning mode; determine, as a set of transformation matrices used for a fourth transformation, a set of transformation matrices that fits an intra-prediction mode corresponding to the weight derivation mode; and perform the fourth transformation on the third transformation coefficient to obtain a fourth transformation coefficient of the current block; and the encoding unit is configured to encode the fourth transformation coefficient.
9. A computer-readable storage medium storing a computer program, wherein the computer program causes a computer to execute the decoding method according to any one of claims 1 to 3.
10. A computer-readable storage medium storing a computer program, wherein the computer program causes a computer to execute the encoding method according to any one of claims 4 to 6.
11. A computer-readable storage medium storing a computer program and a bitstream, wherein the computer program, when executed by a processor, causes the processor to execute the encoding method according to any one of claims 4 to 6 to generate the bitstream.