Image signal encoding / decoding method and apparatus therefor

By dividing a coding block into multiple prediction blocks, deriving the motion information of each prediction block, and using an inter-frame motion information list to derive merge candidates, the invention addresses the insufficient compression performance of HEVC for high-definition video services and improves inter-frame prediction efficiency.

CN119728965B · Active · Publication Date: 2026-06-23 · GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP LTD
Filing Date: 2019-11-08
Publication Date: 2026-06-23

IPC: H04N19/105; H04N19/119; H04N19/136; H04N19/176; H04N19/503; H04N19/51

AI Tagging

Application Domain

Digital video signal modification

Technology Topics

Pattern recognition; Coding block

Abstract

The image decoding method according to the present application comprises the steps of: dividing a coding block into a first prediction unit and a second prediction unit; deriving a merge candidate list of the coding block; deriving first motion information of the first prediction unit and second motion information of the second prediction unit using the merge candidate list; and obtaining prediction samples in the coding block based on the first motion information and the second motion information.

Description

[0001] This application is a divisional application of the invention patent application filed on November 8, 2019, with application number 201980063338.2 and invention title "Image Signal Encoding / Decoding Method and Device Thereof".

Technical Field

[0002] This invention relates to video signal encoding / decoding methods and devices.

Background Technology

[0003] With the trend of increasingly larger display panels, there is a growing need for higher-quality video services. The biggest problem with high-definition video services is the significant increase in data volume. To address this issue, research is actively underway to improve video compression rates. As a representative example, in 2009, the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG) of the International Telecommunication Union-Telecommunication (ITU-T) established the Joint Collaborative Team on Video Coding (JCT-VC). JCT-VC proposed the video compression standard HEVC (High Efficiency Video Coding), which was approved on January 25, 2013, and its compression performance is approximately twice that of H.264 / AVC. However, with the rapid development of high-definition video services, the limitations of HEVC have gradually become apparent.

Summary of the Invention

[0004] Technical problems to be solved

[0005] The purpose of this invention is to provide a method for dividing a coding block into multiple prediction blocks when encoding / decoding a video signal, and an apparatus for performing the method.

[0006] The purpose of this invention is to provide a method for deriving motion information of each prediction block among multiple prediction blocks when encoding / decoding a video signal, and an apparatus for performing the method.

[0007] The purpose of this invention is to provide a method for deriving merging candidates using an inter-frame motion information list when encoding / decoding video signals, and an apparatus for performing the method.

[0008] The technical problems to be solved by the present invention are not limited to those mentioned above, and those skilled in the art to which the present invention pertains will clearly understand other technical problems not mentioned through the following description.

[0009] Technical solution

[0010] The video signal decoding / encoding method according to the present invention may include the following steps: dividing a coding block into a first prediction unit and a second prediction unit; deriving a merging candidate list of the coding block; using the merging candidate list to derive first motion information of the first prediction unit and second motion information of the second prediction unit; and obtaining prediction samples in the coding block based on the first motion information and the second motion information. In this case, whether to divide the coding block is determined according to the size of the coding block, and the first motion information of the first prediction unit is derived from a first merging candidate in the merging candidate list, and the second motion information of the second prediction unit is derived from a second merging candidate different from the first merging candidate.

[0011] In the video signal decoding / encoding method according to the present invention, when at least one of the width and height of the coding block is greater than a threshold, the coding block may not be divided.

[0012] In the video signal decoding / encoding method according to the present invention, the method may further include the following steps: decoding from the bitstream first index information specifying the first merging candidate and second index information specifying the second merging candidate, and when the value of the second index information is equal to or greater than the value of the first index information, the index of the second merging candidate is the value obtained by adding 1 to the value of the second index information.
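The index rule in the preceding paragraph can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function name `resolve_merge_candidate_indices` is hypothetical. Because the second candidate must differ from the first, the signalled second index can skip over the first index, which is why 1 is added when the signalled value is not smaller than the first index.

```python
from typing import Tuple


def resolve_merge_candidate_indices(first_idx: int, second_idx_signalled: int) -> Tuple[int, int]:
    """Map the two decoded index values to positions in the merge candidate list.

    Hypothetical helper: the signalled second index never needs to represent
    the position already taken by the first candidate, so values at or above
    first_idx are shifted up by one.
    """
    if second_idx_signalled >= first_idx:
        second_idx = second_idx_signalled + 1
    else:
        second_idx = second_idx_signalled
    return first_idx, second_idx


# Example: first index 2, signalled second index 2 -> second candidate is entry 3.
print(resolve_merge_candidate_indices(2, 2))
```

One consequence of this mapping is that the two resolved indices can never coincide, so the two prediction units always take their motion information from different merge candidates.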

[0013] In the video signal encoding / decoding method of the present invention, when the prediction sample is included in the boundary region between the first prediction unit and the second prediction unit, the prediction sample can be derived by a weighted sum operation of the first prediction sample derived based on the first motion information and the second prediction sample derived based on the second motion information.

[0014] In the video signal encoding / decoding method of the present invention, a first weighting value applied to the first prediction sample can be determined based on the x-axis coordinates and y-axis coordinates of the prediction sample.

[0015] In the video signal encoding / decoding method of the present invention, a second weighting value applied to the second prediction sample can be derived by subtracting the first weighting value from a constant.
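The weighted-sum derivation in the three preceding paragraphs can be illustrated with a short sketch. The constant 8 and the specific mapping from (x, y) to the first weight are assumptions made for illustration only; the text states only that the first weight depends on the sample coordinates and that the second weight is a constant minus the first. The sketch assumes a square block split along the diagonal x == y, with the first prediction unit on the upper-right side.

```python
WEIGHT_SUM = 8  # assumed constant; the second weight is WEIGHT_SUM minus the first


def first_weight(x: int, y: int) -> int:
    """Hypothetical weight for the first prediction sample.

    Grows with the signed distance (x - y) from the diagonal and is clamped
    to [0, WEIGHT_SUM], so samples far from the boundary use one predictor only.
    """
    return max(0, min(WEIGHT_SUM, WEIGHT_SUM // 2 + (x - y)))


def blend_sample(p1: int, p2: int, x: int, y: int) -> int:
    """Weighted sum of the two prediction samples at position (x, y)."""
    w1 = first_weight(x, y)
    w2 = WEIGHT_SUM - w1  # second weight derived by subtraction from the constant
    # Integer rounding: add half the divisor before dividing.
    return (w1 * p1 + w2 * p2 + WEIGHT_SUM // 2) // WEIGHT_SUM


# On the diagonal both predictors contribute equally.
print(blend_sample(100, 20, 0, 0))
```

Deriving the second weight by subtraction keeps the two weights normalized, so a flat region predicted identically by both motion vectors is left unchanged by the blend.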

[0016] In the video signal decoding / encoding method according to the present invention, the maximum number of merging candidates included in the merging candidate list can be determined based on whether the coding block is divided into the first prediction unit and the second prediction unit.

[0017] The features briefly outlined above are merely exemplary embodiments of the invention as described in the detailed description to follow, and do not limit the scope of the invention.

[0018] Beneficial effects

[0019] According to the present invention, inter-frame prediction efficiency can be improved by providing a method that divides a coding block into multiple prediction blocks and derives the motion information of each of the multiple prediction blocks.

[0020] According to the present invention, the efficiency of inter-frame prediction can be improved by providing a method for deriving merge candidates using a list of inter-frame motion information.

[0021] The effects that can be obtained in this invention are not limited to those described above, and other effects not mentioned will be clearly understood by those skilled in the art through the following description.

Attached Figure Description

[0022] Figure 1 is a block diagram of a video encoder according to an embodiment of the present invention.

[0023] Figure 2 is a block diagram of a video decoder according to an embodiment of the present invention.

[0024] Figure 3 is a diagram illustrating the basic coding tree unit of an embodiment of the present invention.

[0025] Figures 4(a), 4(b), 4(c), 4(d), and 4(e) are diagrams illustrating various partitioning types of coding blocks.

[0026] Figure 5 is a diagram illustrating the partitioning pattern of the coding tree unit.

[0027] Figure 6 is a flowchart of the inter-frame prediction method according to an embodiment of the present invention.

[0028] Figure 7 is a diagram showing the nonlinear motion of an object.

[0029] Figure 8 is a flowchart illustrating an inter-frame prediction method based on affine motion according to an embodiment of the present invention.

[0030] Figure 9 is a diagram showing an example of the affine seed vector for each affine motion model.

[0031] Figure 10 is a diagram showing an example of the affine vectors of a sub-block under a 4-parameter motion model.

[0032] Figure 11 is a flowchart of the process of deriving motion information of the current block in merge mode.

[0033] Figure 12 is a diagram showing an example of a candidate block used to derive merge candidates.

[0034] Figure 13 is a diagram showing the location of the reference sample.

[0035] Figure 14 is a diagram showing an example of a candidate block used to derive merge candidates.

[0036] Figure 15 is a flowchart showing the process of updating the inter-frame motion information list.

[0037] Figure 16 is a diagram illustrating an embodiment of updating the inter-frame merging candidate list.

[0038] Figure 17 is a diagram showing an example of how the index of a stored inter-frame merge candidate is updated.

[0039] Figure 18 is a diagram showing the location of a representative sub-block.

[0040] Figure 19 is a diagram showing an example of generating an inter-frame motion information list for each inter-frame prediction mode.

[0041] Figure 20 is a diagram illustrating an example of adding inter-frame merge candidates included in the long-term motion information list to the merge candidate list.

[0042] Figure 21 is a diagram illustrating an example of performing redundancy checks only on some merge candidates.

[0043] Figure 22 is a diagram illustrating an example of skipping redundancy checks on a specific merge candidate.

[0044] Figure 23 is a diagram illustrating an example of dividing a coding block into multiple prediction units using diagonals.

[0045] Figure 24 is a diagram illustrating an example of dividing a coding block into two prediction units.

[0046] Figure 25 is a diagram showing an example of dividing a coding block into multiple prediction blocks of different sizes.

[0047] Figure 26 is a diagram showing adjacent blocks used to derive triangle merging candidates.

[0048] Figure 27 is a diagram illustrating an example of determining the availability of neighboring blocks for each triangular prediction unit.

[0049] Figures 28 and 29 are diagrams illustrating an example of deriving a prediction sample based on a weighted sum of a first prediction sample and a second prediction sample.

Detailed Implementation

[0050] Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

[0051] Video encoding and decoding are performed on a block-by-block basis. For example, encoding / decoding processes such as transform, quantization, prediction, in-loop filtering, or reconstruction can be performed on coding blocks, transform blocks, or prediction blocks.

[0052] Hereinafter, the block to be encoded / decoded will be referred to as the "current block". For example, depending on the current encoding / decoding process step, the current block can represent a coding block, a transform block, or a prediction block.

[0053] Additionally, as used herein, the term "unit" refers to a basic unit used to perform a specific encoding / decoding process, and "block" can be understood as representing a sample array of a predetermined size. Unless otherwise stated, "block" and "unit" are used interchangeably. For example, in the embodiments described later, coding block and coding unit can be understood to have the same meaning.

[0054] Figure 1 is a block diagram of a video encoder according to an embodiment of the present invention.

[0055] Referring to Figure 1, the video encoding device 100 may include an image partitioning unit 110, prediction units 120 and 125, a transform unit 130, a quantization unit 135, a rearrangement unit 160, an entropy coding unit 165, an inverse quantization unit 140, an inverse transform unit 145, a filter unit 150, and a memory 155.

[0056] The components shown in Figure 1 are illustrated separately to represent the distinct functions of the video encoding device; this does not mean that each component is implemented as separate hardware or as a single software unit. That is, the components are listed individually for ease of explanation, and at least two components may be combined into one, or one component may be divided into multiple components that each perform a function. Such embodiments of integrated components and embodiments of separated components are also within the scope of this invention, provided they do not depart from its spirit.

[0057] Furthermore, some structural elements are not essential structural elements for performing the essential functions of this invention, but rather optional structural elements used only to improve performance. This invention can be implemented by including only the components necessary for realizing the essence of the invention, excluding the structural elements used only to improve performance, and structures including only the essential structural elements, excluding the optional structural elements used only to improve performance, are also within the scope of this invention.

[0058] The image partitioning unit 110 can divide the input image into at least one processing unit. In this case, the processing unit can be a prediction unit (PU), a transformation unit (TU), or a coding unit (CU). The image partitioning unit 110 divides an image into a combination of multiple coding units, prediction units, and transformation units, and can select a combination of coding units, prediction units, and transformation units to encode the image based on a predetermined criterion (e.g., a cost function).

[0059] For example, an image can be divided into multiple coding units. To divide an image into coding units, a recursive tree structure such as a quadtree structure can be used. With an image or the largest coding unit as the root, a coding unit that is divided into further coding units has as many child nodes as the number of coding units it is divided into. Coding units that are no longer divided according to certain constraints become leaf nodes. That is, when it is assumed that a coding unit can only be divided into squares, a coding unit can be divided into a maximum of four other coding units.
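The recursive quadtree division described in the preceding paragraph can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `quadtree_partition`, the `min_size` constraint, and the `should_split` callback (standing in for an encoder decision such as a cost function) are all hypothetical.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square coding unit


def quadtree_partition(x: int, y: int, size: int, min_size: int,
                       should_split: Callable[[int, int, int], bool]) -> List[Block]:
    """Return the leaf coding units of a square region, in z-scan order."""
    # A node becomes a leaf when the size constraint or the split decision stops it.
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves: List[Block] = []
    # A square-only split always yields exactly four child nodes.
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(quadtree_partition(x + dx, y + dy, half, min_size, should_split))
    return leaves


# Split a 64x64 root once, then keep splitting only the top-left child.
leaves = quadtree_partition(0, 0, 64, 16, lambda x, y, s: x == 0 and y == 0)
print(len(leaves), leaves[:4])
```

Whatever split decisions are made, the leaves tile the root region exactly, which is what allows a decoder to rebuild the same partitioning from the split flags alone.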

[0060] In the embodiments of the present invention, the coding unit may mean a unit that performs encoding, or it may mean a unit that performs decoding.

[0061] Prediction units within a coding unit can be divided into at least one shape of the same size, such as squares or rectangles, or a prediction unit within a coding unit can be divided into units with different shapes and / or sizes than another prediction unit.

[0062] When a prediction unit for intra-prediction is generated based on a coding unit that is not the smallest coding unit, intra-prediction can be performed without dividing the coding unit into multiple N×N prediction units.

[0063] Prediction units 120 and 125 may include an inter-frame prediction unit 120 that performs inter-frame prediction and an intra-frame prediction unit 125 that performs intra-frame prediction. It can be determined whether inter-frame prediction or intra-frame prediction is used for the prediction unit, and specific information (e.g., intra-frame prediction mode, motion vectors, reference image, etc.) is determined based on each prediction method. In this case, the processing unit on which prediction is performed may be different from the processing unit for which the prediction method and specific content are determined. For example, the prediction method and prediction mode may be determined per prediction unit, while the prediction itself may be performed per transform unit. The residual value (residual block) between the generated prediction block and the original block can be input to the transform unit 130. Furthermore, the prediction mode information, motion vector information, etc., used for prediction, along with the residual value, can be encoded in the entropy coding unit 165 and transmitted to the decoder. When using a specific coding mode, the original block can also be directly encoded and transmitted to the decoder without generating a prediction block through prediction units 120 and 125.

[0064] The inter-frame prediction unit 120 can predict prediction units based on information from at least one of the previous or next images of the current image. In some cases, it can also predict prediction units based on information from a portion of the encoded region within the current image. The inter-frame prediction unit 120 may include a reference image interpolation unit, a motion prediction unit, and a motion compensation unit.

[0065] The reference image interpolation unit receives reference image information from memory 155 and can generate pixel information at integer-pixel or fractional-pixel positions from the reference image. For luminance pixels, an 8-tap DCT-based interpolation filter with different filter coefficients can be used to generate fractional-pixel information in units of 1 / 4 pixel. For chrominance signals, a 4-tap DCT-based interpolation filter with different filter coefficients can be used to generate fractional-pixel information in units of 1 / 8 pixel.
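As an illustration of quarter-pixel luminance interpolation with an eight-coefficient DCT-based filter, the sketch below applies the filter along one row of samples. The coefficient sets are the HEVC luma interpolation coefficients and are used here as an assumption; the text above does not list coefficients. The function name `interpolate_row`, the edge padding by sample repetition, and the 8-bit clipping are likewise illustrative choices.

```python
from typing import List

# Assumed coefficients (HEVC luma filters), indexed by quarter-pel phase; each set sums to 64.
LUMA_FILTERS = {
    0: [0, 0, 0, 64, 0, 0, 0, 0],
    1: [-1, 4, -10, 58, 17, -5, 1, 0],
    2: [-1, 4, -11, 40, 40, -11, 4, -1],
    3: [0, 1, -5, 17, 58, -10, 4, -1],
}


def interpolate_row(samples: List[int], pos: int, frac: int) -> int:
    """Sample value at horizontal position pos + frac/4 within one row."""
    taps = LUMA_FILTERS[frac]
    acc = 0
    for k, coeff in enumerate(taps):
        # The filter window covers samples pos-3 .. pos+4; repeat edge samples at borders.
        idx = min(max(pos - 3 + k, 0), len(samples) - 1)
        acc += coeff * samples[idx]
    # Normalize by 64 with rounding, then clip to the 8-bit range.
    return min(255, max(0, (acc + 32) >> 6))


row = [0, 10, 20, 30, 40, 50, 60, 70]
print(interpolate_row(row, 3, 2))  # half-pel position between 30 and 40
```

A two-dimensional fractional position would typically be obtained by applying the same filter first along rows and then along columns of the intermediate result.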

[0066] The motion prediction unit can perform motion prediction based on a reference image interpolated by the reference image interpolation unit. Various methods can be used to calculate motion vectors, such as the Full Search-based Block Matching Algorithm (FBMA), the Three-Step Search (TSS), and the New Three-Step Search Algorithm (NTS). Motion vectors can have values in units of 1 / 2 pixel or 1 / 4 pixel based on the interpolated pixels. The motion prediction unit can predict the current prediction unit using different motion prediction methods, including skip mode, merge mode, Advanced Motion Vector Prediction (AMVP) mode, and intra block copy mode.
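A full-search block matching step of the kind named in the preceding paragraph can be sketched as below. This is a simplified integer-pixel illustration; the sum of absolute differences (SAD) as the matching cost, the function names, and the handling of positions outside the reference image (skipped) are assumptions rather than details taken from the text.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]


def sad(cur: Image, ref: Image, bx: int, by: int, rx: int, ry: int, n: int) -> int:
    """Sum of absolute differences between an n x n block in cur and one in ref."""
    return sum(abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def full_search(cur: Image, ref: Image, bx: int, by: int, n: int,
                search_range: int) -> Tuple[Optional[Tuple[int, int]], float]:
    """Test every integer displacement in the search window and keep the cheapest."""
    height, width = len(ref), len(ref[0])
    best_mv: Optional[Tuple[int, int]] = None
    best_cost = float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidate blocks that fall outside the reference image.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# The current image is the reference shifted by 2 samples in x and 1 in y.
ref = [[x * x + 3 * y * y + x * y for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
print(full_search(cur, ref, 2, 2, 2, 2))
```

Full search is exhaustive and therefore costly; the three-step variants mentioned above reduce the number of tested displacements at the risk of missing the global minimum.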

[0067] The intra-prediction unit 125 can generate prediction units based on reference pixel information surrounding the current block, that is, pixel information within the current image. When a neighboring block of the current prediction unit is a block on which inter-frame prediction has been performed, so that its reference pixels are pixels obtained by inter-frame prediction, the reference pixels included in that block can be replaced with reference pixel information of a surrounding block on which intra-frame prediction has been performed. That is, when a reference pixel is unavailable, at least one of the available reference pixels can be used in place of the unavailable reference pixel information.

[0068] In intra-frame prediction, the prediction mode can have an angular prediction mode that uses reference pixel information in the prediction direction and a non-angular mode that does not use direction information when performing prediction. The mode used to predict luminance information and the mode used to predict chrominance information can be different. To predict chrominance information, either the intra-frame prediction mode information used for predicting luminance information or the predicted luminance signal information can be applied.

[0069] When performing intra-frame prediction, if the size of the prediction unit is the same as the size of the transform unit, intra-frame prediction can be performed on the prediction unit based on the pixels to its left, the pixels to its upper left, and the pixels above it. However, if the size of the prediction unit is different from the size of the transform unit, intra-frame prediction can be performed using reference pixels based on the transform unit. Furthermore, intra-frame prediction using N×N partitioning can be applied only to the smallest coding unit.

[0070] Intra-prediction methods can generate prediction blocks after applying an Adaptive Intra Smoothing (AIS) filter to a reference pixel based on the prediction mode. The type of AIS filter used for the reference pixel may vary. To perform intra-prediction, the intra-prediction mode of the current prediction unit can be predicted from the intra-prediction modes of prediction units existing in the vicinity of the current prediction unit. When using mode information predicted from surrounding prediction units to predict the prediction mode of the current prediction unit, if the intra-prediction modes of the current prediction unit and those of the surrounding prediction units are the same, predetermined flag information can be used to convey information indicating that the prediction modes of the current prediction unit and those of the surrounding prediction units are the same. If the prediction modes of the current prediction unit and those of the surrounding prediction units are different, entropy coding can be performed to encode the prediction mode information of the current block.

[0071] Furthermore, a residual block including residual information can be generated, the residual information being the difference between the prediction unit generated by prediction units 120 and 125 and the original block of that prediction unit. The generated residual block can be input to the transform unit 130.

[0072] In the transform unit 130, a transform method such as Discrete Cosine Transform (DCT) or Discrete Sine Transform (DST) can be used to transform the residual block, which includes residual information between the original block and the prediction units generated by the prediction units 120 and 125. The DCT transform kernel includes at least one of DCT2 or DCT8, and the DST transform kernel includes DST7. Whether to apply DCT or DST to transform the residual block can be determined based on the intra-frame prediction mode information of the prediction units used to generate the residual block. Transformation of the residual block can also be skipped. A flag indicating whether to skip the transformation of the residual block can be encoded. Transformation skipping is allowed for residual blocks with a size below a threshold, or for luma or chroma components (4:4:4 format or below).

[0073] The quantization unit 135 can quantize the values that have been transformed into the frequency domain in the transformation unit 130. The quantization coefficients can be changed according to the importance of the block or video. The values calculated in the quantization unit 135 can be provided to the inverse quantization unit 140 and the rearrangement unit 160.

[0074] The rearrangement unit 160 can rearrange the coefficient values of the quantized residual values.

[0075] The rearrangement unit 160 can transform 2D block shape coefficients into 1D vector form using a coefficient scanning method. For example, the rearrangement unit 160 can use a zig-zag scan method to scan the DC coefficients and even the coefficients in the high-frequency domain, and transform them into 1D vector form. Depending on the size of the transform unit and the intra-frame prediction mode, instead of zig-zag scanning, vertical scanning along the column direction and horizontal scanning along the row direction can also be used to scan the 2D block shape coefficients. That is, the choice between zig-zag scanning, vertical scanning, and horizontal scanning can be determined based on the size of the transform unit and the intra-frame prediction mode.
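The three scan patterns named in the preceding paragraph can be sketched as follows. The zig-zag order shown is the conventional anti-diagonal traversal starting at the DC coefficient; the function names and the `mode` strings are hypothetical, and the rule for choosing a mode from the transform size and intra-frame prediction mode is not reproduced here.

```python
from typing import List, Tuple


def zigzag_order(n: int) -> List[Tuple[int, int]]:
    """(row, col) visiting order of a zig-zag scan over an n x n block."""
    order: List[Tuple[int, int]] = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Alternate direction on successive anti-diagonals.
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def scan_coefficients(block: List[List[int]], mode: str = "zigzag") -> List[int]:
    """Flatten a square 2D coefficient block into a 1D vector."""
    n = len(block)
    if mode == "zigzag":
        return [block[r][c] for r, c in zigzag_order(n)]
    if mode == "vertical":      # column by column
        return [block[r][c] for c in range(n) for r in range(n)]
    if mode == "horizontal":    # row by row
        return [block[r][c] for r in range(n) for c in range(n)]
    raise ValueError("unknown scan mode: " + mode)


block = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(scan_coefficients(block, "zigzag"))
```

A decoder can invert any of these scans by writing the 1D values back to the same coordinate sequence.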

[0076] The entropy coding unit 165 can perform entropy coding based on the value calculated by the rearrangement unit 160. For example, entropy coding can use various coding methods such as Exponential Golomb code, Context-Adaptive Variable Length Coding (CAVLC), and Context-Adaptive Binary Arithmetic Coding (CABAC).
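Of the entropy coding methods listed above, the Exponential Golomb code is simple enough to show in full. The sketch below implements the order-0 code for unsigned integers, using strings of '0' and '1' characters as a stand-in for a real bitstream; the function names are hypothetical.

```python
from typing import Tuple


def exp_golomb_encode(value: int) -> str:
    """Order-0 Exp-Golomb code word for a non-negative integer."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]              # binary form of value + 1
    return "0" * (len(binary) - 1) + binary  # prefix of zeros, one per extra bit


def exp_golomb_decode(bits: str, pos: int = 0) -> Tuple[int, int]:
    """Decode one code word starting at pos; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":          # count the zero prefix
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end


stream = "".join(exp_golomb_encode(v) for v in (0, 1, 3))
print(stream)  # code words are self-delimiting, so no separators are needed
```

Small values receive short code words, which suits syntax elements whose small values are the most probable.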

[0077] The entropy coding unit 165 can encode various information originating from the rearrangement unit 160 and the prediction units 120 and 125, such as residual coefficient information and block type information of the coding units, prediction mode information, partitioning unit information, prediction unit information, transform unit information, motion vector information, reference frame information, block interpolation information, and filtering information.

[0078] The coefficient values of the coding units input from the rearrangement unit 160 can be entropy encoded in the entropy coding unit 165.

[0079] The inverse quantization unit 140 and the inverse transform unit 145 perform inverse quantization on the multiple values quantized in the quantization unit 135, and perform inverse transform on the values transformed in the transform unit 130. The residual values generated in the inverse quantization unit 140 and the inverse transform unit 145 can be merged with the prediction units predicted by the motion prediction unit, motion compensation unit and intra-frame prediction unit included in the prediction units 120 and 125 to generate a reconstructed block.
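The reconstruction path in the preceding paragraph can be sketched in simplified form. The sketch assumes a uniform scalar quantizer with a single step size and treats the transform as skipped, so the residual is quantized directly; real codecs derive the step from a quantization parameter and apply an inverse transform between the two stages. All function names are hypothetical.

```python
from typing import List


def quantize(coeffs: List[int], step: int) -> List[int]:
    """Uniform scalar quantization with rounding to the nearest level."""
    levels = []
    for c in coeffs:
        magnitude = (abs(c) + step // 2) // step
        levels.append(magnitude if c >= 0 else -magnitude)
    return levels


def dequantize(levels: List[int], step: int) -> List[int]:
    """Inverse quantization: scale each level back by the step size."""
    return [level * step for level in levels]


def reconstruct(prediction: List[int], residual: List[int], max_val: int = 255) -> List[int]:
    """Add the decoded residual to the prediction and clip to the sample range."""
    return [min(max_val, max(0, p + r)) for p, r in zip(prediction, residual)]


residual = [17, -9, 4, 0]
levels = quantize(residual, 8)
print(reconstruct([100, 100, 250, 3], dequantize(levels, 8)))
```

Because the encoder runs this same path, its reference images match what the decoder will reconstruct, quantization error included.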

[0080] The filter unit 150 may include at least one of a deblocking filter, an offset correction unit, and an adaptive loop filter (ALF).

[0081] Deblocking filters remove block distortion generated in the reconstructed image at the boundaries between blocks. To determine whether to perform deblocking, the pixels in several columns or rows included in the block can be used to decide whether to apply a deblocking filter to the current block. When applying a deblocking filter to a block, a strong or weak filter can be applied depending on the required deblocking strength. Furthermore, when a deblocking filter is applied, horizontal filtering and vertical filtering can be processed in parallel.

[0082] The offset correction unit can correct, on a pixel-by-pixel basis, the offset between the deblocked image and the original image. To perform offset correction on a specific image, the following methods can be used: after dividing the pixels included in the image into a predetermined number of regions, determine the region to be offset and apply the offset to the corresponding region, or apply the offset by taking into account the edge information of each pixel.
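The edge-based variant of offset correction can be illustrated with a one-dimensional sketch. The classification rule used here (comparing each sample with its two horizontal neighbours) and the single offset magnitude are assumptions chosen for brevity; the text above states only that edge information of each pixel is taken into account.

```python
from typing import List


def edge_offset_row(row: List[int], offset: int, max_val: int = 255) -> List[int]:
    """Pull local minima up and local maxima down by a fixed offset."""
    out = list(row)
    # Border samples lack one neighbour and are left unchanged.
    for i in range(1, len(row) - 1):
        left, cur, right = row[i - 1], row[i], row[i + 1]
        if cur < left and cur < right:      # local valley
            out[i] = min(max_val, cur + offset)
        elif cur > left and cur > right:    # local peak
            out[i] = max(0, cur - offset)
    return out


print(edge_offset_row([10, 8, 10, 12, 10], 1))
```

Classification reads only the unfiltered input row, so the result does not depend on the order in which samples are processed.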

[0083] Adaptive Loop Filtering (ALF) can be performed based on a comparison between the filtered reconstructed image and the original image. After dividing the pixels in the image into predetermined groups, filtering can be performed differently for each group by determining a filter to be used for the corresponding group. Information on whether adaptive loop filtering is applied can be transmitted per coding unit (CU) for the luminance signal. The shape and filter coefficients of the adaptive loop filter to be applied can vary depending on the block. Furthermore, it is possible to apply the same type (fixed type) of adaptive loop filter regardless of the characteristics of the block to which it is applied.

[0084] The memory 155 can store the reconstructed blocks or images calculated by the filter unit 150, and can provide the stored reconstructed blocks or images to the prediction units 120 and 125 when performing inter-frame prediction.

[0085] Figure 2 is a block diagram of a video decoder according to an embodiment of the present invention.

[0086] Referring to Figure 2, the video decoder 200 may include an entropy decoding unit 210, a rearrangement unit 215, an inverse quantization unit 220, an inverse transform unit 225, prediction units 230 and 235, a filter unit 240, and a memory 245.

[0087] When a video bitstream is input from a video encoder, the input bitstream can be decoded by following the steps of the video encoder in reverse.

[0088] The entropy decoding unit 210 can perform entropy decoding in the reverse order of the entropy encoding steps performed in the entropy encoding unit of the video encoder. For example, corresponding to the methods performed in the video encoder, various methods such as Exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), and Context-Adaptive Binary Arithmetic Coding (CABAC) can be applied.

[0089] The entropy decoding unit 210 can decode information related to intra-frame prediction and inter-frame prediction performed by the encoder.

[0090] The rearrangement unit 215 can rearrange the bitstream entropy-decoded by the entropy decoding unit 210 based on the rearrangement method used in the encoder. Coefficients represented in one-dimensional vector form can be reconstructed into two-dimensional block-shaped coefficients. The rearrangement unit 215 receives information related to the coefficient scan performed in the encoder and can perform rearrangement by inverse scanning based on the scan order used in the encoder.

[0091] The inverse quantization unit 220 can perform inverse quantization based on the quantization parameters provided by the encoder and the coefficient values of the rearranged blocks.

[0092] The inverse transform unit 225 can perform an inverse discrete cosine transform or an inverse discrete sine transform on the result of the quantization performed by the video encoder. These are the inverses of the transforms performed in the transform unit of the encoder, i.e., of the discrete cosine transform and the discrete sine transform. The DCT transform kernel can include at least one of DCT2 or DCT8, and the DST transform kernel can include DST7. Alternatively, if the transform is skipped in the video encoder, the inverse transform unit 225 may not perform the inverse transform. The inverse transform can be performed based on the transform unit determined in the video encoder. In the inverse transform unit 225 of the video decoder, a transform method (e.g., DCT or DST) can be selectively performed based on multiple pieces of information such as the prediction method, the size of the current block, and the prediction direction.

[0093] Prediction units 230 and 235 can generate prediction blocks based on information related to prediction block generation provided by entropy decoding unit 210 and previously decoded block or image information provided by memory 245.

[0094] As described above, when intra-prediction is performed in the same manner as in the video encoder, if the size of the prediction unit is the same as the size of the transform unit, intra-prediction is performed on the prediction unit based on the pixels to its left, the pixels to its upper left, and the pixels above it. If the size of the prediction unit is different from the size of the transform unit, intra-prediction can be performed using reference pixels based on the transform unit. Furthermore, intra-prediction using only N×N partitioning for the smallest coding unit can also be applied.

[0095] Prediction units 230 and 235 may include a prediction unit determination unit, an inter-frame prediction unit, and an intra-frame prediction unit. The prediction unit determination unit receives various information input from the entropy decoding unit 210, such as prediction unit information, prediction mode information of the intra-frame prediction method, and motion prediction-related information of the inter-frame prediction method. It classifies prediction units according to the current coding unit and determines whether the prediction unit is performing inter-frame prediction or intra-frame prediction. The inter-frame prediction unit 230 can use the information required for inter-frame prediction of the current prediction unit provided by the video encoder and perform inter-frame prediction on the current prediction unit based on information included in at least one of the previous or next images of the current image to which the current prediction unit belongs. Alternatively, inter-frame prediction can also be performed based on information from a portion of the reconstructed region within the current image to which the current prediction unit belongs.

[0096] In order to perform inter-frame prediction, it is possible to determine, based on the coding unit, which of the following modes of motion prediction method is used for the prediction units included in the corresponding coding unit: Skip Mode, Merge Mode, Advanced Motion Vector Prediction Mode (AMVP Mode), or Intra-Block Copy Mode.

[0097] The intra-prediction unit 235 can generate prediction blocks based on pixel information within the current image. When the prediction unit is one on which intra-prediction is performed, intra-prediction can be performed based on the intra-prediction mode information of the prediction unit provided by the video encoder. The intra-prediction unit 235 may include an adaptive intra-smoothing (AIS) filter, a reference pixel interpolation unit, and a DC filter. The adaptive intra-smoothing filter is the part that performs filtering on the reference pixels of the current block, and whether to apply the filter can be determined according to the prediction mode of the current prediction unit. Adaptive intra-smoothing filtering can be performed on the reference pixels of the current block using the prediction mode of the prediction unit provided by the video encoder and the adaptive intra-smoothing filter information. If the prediction mode of the current block is a mode that does not perform adaptive intra-smoothing filtering, then the adaptive intra-smoothing filter may not be applied.

[0098] For the reference pixel interpolation unit, if the prediction mode of the prediction unit is one that performs intra-frame prediction based on pixel values interpolated from reference pixels, then reference pixels in integer or fractional pixel units can be generated by interpolating the reference pixels. If the prediction mode of the current prediction unit is a prediction mode that generates prediction blocks without interpolating reference pixels, then interpolation of reference pixels is not required. If the prediction mode of the current block is DC mode, then the DC filter can generate prediction blocks by filtering.

[0099] The reconstructed blocks or images can be provided to the filter unit 240. The filter unit 240 may include a deblocking filter, an offset correction unit, and an ALF.

[0100] Information on whether a deblocking filter is applied to a corresponding block or image, and on whether strong or weak filtering is applied when the deblocking filter is used, can be received from the video encoder. The deblocking filter of the video decoder can receive the deblocking-filter-related information provided by the video encoder and perform deblocking filtering on the corresponding block.

[0101] The offset correction unit can perform offset correction on the reconstructed video based on the type and amount of offset correction used during video encoding.

[0102] The ALF can be applied to the coding unit based on information provided by the encoder, such as whether the ALF is applied and ALF coefficient information. This ALF information can be provided by including it in a specific parameter set.

[0103] The memory 245 stores the reconstructed image or block, such that the image or block can be used as a reference image or reference block, and the reconstructed image can be provided to the output unit.

[0104] Figure 3 is a diagram illustrating the basic coding tree unit of an embodiment of the present invention.

[0105] The largest coding block can be defined as the coding tree block. An image can be divided into multiple coding tree units (CTUs). The coding tree unit is the largest coding unit and can also be called the largest coding unit (LCU). Figure 3 shows an example of dividing an image into multiple coding tree units.

[0106] The size of a coding tree unit can be defined at the image level or the sequence level. Therefore, information representing the size of a coding tree unit can be signaled through either an image parameter set or a sequence parameter set.

[0107] For example, the size of the coding tree unit for all images within a sequence can be set to 128×128. Alternatively, either 128×128 or 256×256 can be determined at the image level as the size of the coding tree unit. For example, the size of the coding tree unit in the first image can be set to 128×128, and the size of the coding tree unit in the second image can be set to 256×256.
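The effect of the coding tree unit size on how an image is divided can be sketched as follows. This is a minimal illustration, not part of the described method: the function name `ctu_grid` is hypothetical, and it assumes that coding tree units overlapping the right or bottom image boundary are still counted as whole units, which the text above does not state.

```python
def ctu_grid(image_width, image_height, ctu_size=128):
    """Return (columns, rows) of coding tree units needed to cover an image.

    Assumption (not from the source): partial CTUs at the right and bottom
    boundaries are counted as whole CTUs, hence the ceiling division.
    """
    cols = -(-image_width // ctu_size)   # ceiling division on integers
    rows = -(-image_height // ctu_size)
    return cols, rows


# A 1920x1080 image with the two CTU sizes mentioned in the text.
print(ctu_grid(1920, 1080, 128))  # (15, 9)
print(ctu_grid(1920, 1080, 256))  # (8, 5)
```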

[0108] Coded blocks can be generated by dividing the coding tree unit. A coded block represents the basic unit used for encoding / decoding processing. For example, prediction or transformation can be performed for each coded block, or the prediction coding mode can be determined for each coded block. The prediction coding mode represents the method for generating the predicted image. For example, prediction coding modes can include intra-prediction, inter-prediction, current picture referencing (CPR, also called intra-block copy (IBC)), or combined prediction. For a coded block, at least one of these prediction coding modes (intra-prediction, inter-prediction, current picture referencing, or combined prediction) can be used to generate the prediction block associated with that coded block.

[0109] Information representing the predictive coding mode of the current block can be transmitted via a bitstream signal. For example, this information could be a 1-bit flag indicating whether the predictive coding mode is intra-frame or inter-frame. Current image reference or combined prediction can be used only if the predictive coding mode of the current block is determined to be inter-frame.

[0110] The current image reference is used to set the current image as the reference image and obtain the prediction block of the current block from the encoded / decoded regions within the current image. Here, the current image means the image that includes the current block. Information indicating whether the current image reference is applied to the current block can be sent via a bitstream signal. For example, this information could be a 1-bit flag. When the flag is true, the prediction coding mode of the current block can be determined as the current image reference; when the flag is false, the prediction mode of the current block can be determined as inter-frame prediction.

[0111] Alternatively, the predictive coding mode for the current block can be determined based on a reference image index. For example, when the reference image index points to the current image, the predictive coding mode for the current block can be determined as current image reference. When the reference image index points to another image instead of the current image, the predictive coding mode for the current block can be determined as inter-frame prediction. That is, current image reference is a prediction method that uses information from already encoded / decoded regions within the current image, and inter-frame prediction is a prediction method that uses information from other encoded / decoded images.

[0112] Combined prediction represents a coding mode that combines two or more of intra-frame prediction, inter-frame prediction, and current image reference. For example, when combined prediction is applied, a first prediction block can be generated based on one of intra-frame prediction, inter-frame prediction, or current image reference, and a second prediction block can be generated based on another. Once the first and second prediction blocks are generated, a final prediction block can be generated by averaging them or computing their weighted sum. Information indicating whether combined prediction is applied can be transmitted via a bitstream signal. This information can be a 1-bit flag.
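The averaging or weighted summing of two prediction blocks can be sketched as below. This is only an illustration: the function name `combine_predictions`, the integer weights, and the round-to-nearest rule are assumptions, since the text does not specify the weights or the rounding.

```python
def combine_predictions(pred_a, pred_b, weight_a=1, weight_b=1):
    """Blend two equally sized prediction blocks sample by sample.

    With the default weights this is a plain average. The rounding offset
    (total // 2) is an assumed convention, not taken from the source.
    """
    total = weight_a + weight_b
    return [
        [(weight_a * a + weight_b * b + total // 2) // total
         for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(pred_a, pred_b)
    ]


first = [[100, 120]]    # e.g. a block from one prediction method
second = [[110, 100]]   # e.g. a block from another prediction method
print(combine_predictions(first, second))        # [[105, 110]]
print(combine_predictions(first, second, 3, 1))  # [[103, 115]]
```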

[0113] Figures 4(a), 4(b), 4(c), 4(d), and 4(e) are diagrams illustrating various partitioning types of coded blocks.

[0114] A coded block can be divided into multiple coded blocks based on quadtree partitioning, binary tree partitioning, or ternary tree partitioning. Furthermore, the divided coded blocks can be further divided into multiple coded blocks based on quadtree partitioning, binary tree partitioning, or ternary tree partitioning.

[0115] Quadtree partitioning is a partitioning technique that divides the current block into four blocks. As a result of quadtree partitioning, the current block can be divided into four square partitions (refer to “SPLIT_QT” in part 4(a) of Figure 4).

[0116] Binary tree partitioning refers to a partitioning technique that divides the current block into two blocks. The process of partitioning the current block along a vertical direction (i.e., using a vertical line crossing the current block) is called vertical binary tree partitioning, and the process of partitioning the current block along a horizontal direction (i.e., using a horizontal line crossing the current block) is called horizontal binary tree partitioning. After binary tree partitioning, the current block can be divided into two non-square partitions. "SPLIT_BT_VER" in part 4(b) represents the result of vertical binary tree partitioning, and "SPLIT_BT_HOR" in part 4(c) represents the result of horizontal binary tree partitioning.

[0117] Ternary tree partitioning refers to a partitioning technique that divides the current block into three blocks. The process of dividing the current block into three blocks along a vertical direction (i.e., using two vertical lines crossing the current block) is called vertical ternary tree partitioning, and the process of dividing the current block into three blocks along a horizontal direction (i.e., using two horizontal lines crossing the current block) is called horizontal ternary tree partitioning. After ternary tree partitioning, the current block can be divided into three non-square partitions. In this case, the width / height of the partition located at the center of the current block can be twice the width / height of the other partitions. "SPLIT_TT_VER" in part 4(d) represents the result of vertical ternary tree partitioning, and "SPLIT_TT_HOR" in part 4(e) represents the result of horizontal ternary tree partitioning.
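The geometry of the five partition types can be sketched as a function that returns the resulting sub-blocks. The mode names are the ones used in Figure 4; the function name `split_block` and the `(x, y, width, height)` tuple layout are hypothetical, and block dimensions are assumed to be divisible by 4.

```python
def split_block(x, y, w, h, mode):
    """Return the partitions of a block as (x, y, width, height) tuples."""
    if mode == "SPLIT_QT":          # four equal quadrants
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "SPLIT_BT_VER":      # one vertical line, two halves
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "SPLIT_BT_HOR":      # one horizontal line, two halves
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "SPLIT_TT_VER":      # two vertical lines, center is twice as wide
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    if mode == "SPLIT_TT_HOR":      # two horizontal lines, center is twice as tall
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    raise ValueError(f"unknown partition mode: {mode}")


print(split_block(0, 0, 32, 32, "SPLIT_TT_VER"))
# [(0, 0, 8, 32), (8, 0, 16, 32), (24, 0, 8, 32)]
```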

[0118] The number of times a coding tree unit is divided can be defined as the partitioning depth. The maximum partitioning depth of a coding tree unit can be determined at the sequence or image level. Therefore, the maximum partitioning depth of a coding tree unit can vary depending on different sequences or images.

[0119] Alternatively, the maximum partitioning depth can be determined individually for each of the multiple partitioning techniques. For example, the maximum partitioning depth allowed for quadtree partitioning can be different from the maximum partitioning depth allowed for binary tree partitioning and / or ternary tree partitioning.

[0120] The encoder can transmit information representing at least one of the partition shape or partition depth of the current block via a bitstream. The decoder can determine the partition shape and partition depth of the coding tree unit based on the information parsed from the bitstream.

[0121] Figure 5 is a diagram illustrating the partitioning pattern of the coding tree unit.

[0122] The process of dividing coding blocks using partitioning techniques such as quadtree partitioning, binary tree partitioning, and / or ternary tree partitioning is called multitree partitioning.

[0123] The coded blocks generated by applying a multi-way tree partitioning to the coded block can be called multiple downstream coded blocks. When the partitioning depth of the coded block is k, the partitioning depth of the multiple downstream coded blocks is set to k+1.

[0124] On the other hand, for multiple coding blocks with a partitioning depth of k+1, the coding block with a partitioning depth of k can be called the upstream coding block.

[0125] The partition type of the current coding block can be determined based on at least one of the partition shape of the upstream coding block or the partition type of the adjacent coding blocks. The adjacent coding blocks are adjacent to the current coding block and can include at least one of the current coding block's upper adjacent block, left adjacent block, or adjacent block to its upper left corner. The partition type can include at least one of whether to partition into a quadtree, whether to partition into a binary tree, the binary tree partition direction, whether to partition into a ternary tree, or the ternary tree partition direction.

[0126] To determine the shape of the coded block partition, information indicating whether the coded block has been partitioned can be sent via a bitstream signal. This information is a 1-bit flag "split_cu_flag," and when the flag is true, it indicates that the coded block has been partitioned using a multi-way tree partitioning technique.

[0127] When "split_cu_flag" is true, information indicating whether the coded block has been partitioned by a quadtree can be sent via a bitstream signal. This information is a 1-bit flag "split_qt_flag". When this flag is true, the coded block can be divided into 4 blocks.

[0128] For example, Figure 5 shows the coding tree unit partitioned by a quadtree to generate four coding blocks with a partition depth of 1. It also shows quadtree partitioning applied again to the first and fourth coding blocks generated as a result of that quadtree partitioning. Ultimately, four coding blocks with a partition depth of 2 can be generated.

[0129] Furthermore, a coded block with a partition depth of 3 can be generated by applying a quadtree partition to the coded block with a partition depth of 2 again.

[0130] When a quadtree partition is not applied to the coded block, whether to perform a binary tree partition or a ternary tree partition can be determined by considering at least one of the following: the size of the coded block, whether the coded block is located at an image boundary, the maximum partition depth, or the partition shape of adjacent blocks. When it is determined that a binary tree partition or a ternary tree partition is to be performed, information indicating the partition direction can be transmitted via a bitstream signal. This information can be a 1-bit flag "mtt_split_cu_vertical_flag". The partition direction (vertical or horizontal) can be determined based on this flag. In addition, information indicating whether a binary tree partition or a ternary tree partition is applied to the coded block can be transmitted via a bitstream signal. This information can be a 1-bit flag "mtt_split_cu_binary_flag". Whether binary tree partitioning or ternary tree partitioning is applied can be determined based on this flag.

[0131] For example, Figure 5 shows a vertical binary tree partitioning applied to a coded block with a partitioning depth of 1, a vertical ternary tree partitioning applied to the left coded block of the resulting coded blocks, and a vertical binary tree partitioning applied to the right coded block.
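The way the four partitioning flags select a partition type can be sketched as a small decision function. This is a hedged illustration: it assumes all four flags have already been parsed, that a true "mtt_split_cu_vertical_flag" means vertical and a true "mtt_split_cu_binary_flag" means binary (suggested by the flag names but not stated in the text), and the label "NO_SPLIT" is hypothetical.

```python
def split_mode(split_cu_flag, split_qt_flag=False,
               mtt_split_cu_vertical_flag=False,
               mtt_split_cu_binary_flag=False):
    """Map parsed partition flags to one of the partition types of Figure 4."""
    if not split_cu_flag:
        return "NO_SPLIT"            # hypothetical label for an unsplit block
    if split_qt_flag:
        return "SPLIT_QT"
    # Assumed polarity: True selects vertical / binary.
    direction = "VER" if mtt_split_cu_vertical_flag else "HOR"
    tree = "BT" if mtt_split_cu_binary_flag else "TT"
    return f"SPLIT_{tree}_{direction}"


print(split_mode(True, split_qt_flag=True))               # SPLIT_QT
print(split_mode(True, False, True, True))                # SPLIT_BT_VER
print(split_mode(True, False, False, False))              # SPLIT_TT_HOR
```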

[0132] Inter-frame prediction is a predictive coding mode that predicts the current block using information from a previous image. For example, a block in the previous image that is at the same position as the current block (hereinafter referred to as a collocated block) can be set as the prediction block for the current block. Hereinafter, the prediction block generated based on the block at the same position as the current block will be called a collocated prediction block.

[0133] On the other hand, if an object that existed in the previous image has moved to a different position in the current image, the object's motion can be used to effectively predict the current block. For example, if the direction and size of the object's movement can be known by comparing the previous and current images, the object's motion information can be considered to generate a predicted block (or predicted image) for the current block. Hereinafter, the predicted block generated using motion information can be referred to as a motion prediction block.

[0134] Residual blocks can be generated by subtracting prediction blocks from the current block. In this case, when there is motion of the object, motion prediction blocks can be used instead of collocated prediction blocks, thereby reducing the energy of the residual blocks and improving the compression performance of the residual blocks.
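The reduction in residual energy can be illustrated with a one-dimensional toy example. The sample values, the helper names `predict` and `residual_energy`, and the use of the sum of squared differences as the energy measure are all assumptions made for illustration.

```python
def predict(reference_row, start, length, mv_x=0):
    """Fetch a prediction block from the reference row, displaced by mv_x."""
    begin = start + mv_x
    return reference_row[begin:begin + length]


def residual_energy(current_block, prediction_block):
    """Sum of squared residual samples (an assumed energy measure)."""
    return sum((c - p) ** 2 for c, p in zip(current_block, prediction_block))


# A bright object (value 50) moves two samples to the right between images.
previous_row = [10, 10, 50, 50, 10, 10, 10, 10]
current_row = [10, 10, 10, 10, 50, 50, 10, 10]

start, length = 2, 4
current_block = current_row[start:start + length]

collocated = predict(previous_row, start, length)            # no motion
motion_comp = predict(previous_row, start, length, mv_x=-2)  # follows the object

print(residual_energy(current_block, collocated))   # 6400
print(residual_energy(current_block, motion_comp))  # 0
```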

[0135] As mentioned above, the process of generating prediction blocks using motion information can be called motion-compensated prediction. In most inter-frame predictions, prediction blocks can be generated based on motion-compensated prediction.

[0136] Motion information may include at least one of motion vectors, reference image indices, prediction directions, or bidirectional weighted indexes. Motion vectors represent the direction and magnitude of an object's movement. Reference image indices specify the reference image for the current block among a list of reference images. Prediction directions refer to any one of unidirectional L0 prediction, unidirectional L1 prediction, or bidirectional prediction (L0 and L1 prediction). Motion information in either the L0 or L1 direction can be used based on the prediction direction of the current block. Bidirectional weighted indexes specify the weights applied to the L0 prediction block and the weights applied to the L1 prediction block.

[0137] Figure 6 is a flowchart of the inter-frame prediction method according to an embodiment of the present invention.

[0138] Referring to Figure 6, the inter-frame prediction method includes: determining the inter-frame prediction mode of the current block (S601); obtaining motion information of the current block according to the determined inter-frame prediction mode (S602); and performing motion compensation prediction of the current block based on the obtained motion information (S603).

[0139] Inter-frame prediction modes represent various techniques used to determine the motion information of the current block, and can include inter-frame prediction modes using translational motion information and inter-frame prediction modes using affine motion information. For example, inter-frame prediction modes using translational motion information can include merging mode and advanced motion vector prediction mode, while inter-frame prediction modes using affine motion information can include affine merging mode and affine motion vector prediction mode. Based on the inter-frame prediction mode, the motion information of the current block can be determined based on neighboring blocks adjacent to the current block or information parsed from the bitstream.

[0140] The following section details the inter-frame prediction method using affine motion information.

[0141] Figure 7 is a diagram showing the non-linear motion of an object.

[0142] The motion of objects within a video may be non-linear. For example, as shown in Figure 7, non-linear motion of an object may occur, such as camera zoom-in, zoom-out, rotation, or affine transformation. When non-linear motion occurs, translational motion vectors cannot represent the object's motion effectively. Therefore, in parts where non-linear motion occurs, affine motion can be used instead of translational motion, thereby improving coding efficiency.

[0143] Figure 8 is a flowchart illustrating an inter-frame prediction method based on affine motion according to an embodiment of the present invention.

[0144] Whether to apply an affine motion-based inter-frame prediction technique to the current block can be determined based on information parsed from the bitstream. Specifically, whether to apply an affine motion-based inter-frame prediction technique to the current block can be determined based on at least one of a flag indicating whether an affine merging mode is applied to the current block or a flag indicating whether an affine motion vector prediction mode is applied to the current block.

[0145] When an inter-frame prediction technique based on affine motion is applied to the current block, the affine motion model of the current block can be determined (S1101). The affine motion model can be determined as at least one of a 6-parameter affine motion model or a 4-parameter affine motion model. The 6-parameter affine motion model uses 6 parameters to represent the affine motion, while the 4-parameter affine motion model uses 4 parameters to represent the affine motion.

[0146] Equation 1 represents the case of affine motion using 6 parameters. Affine motion represents translational motion relative to a predetermined region determined by an affine seed vector.

[0147] [Formula 1]

[0148] $v_x = ax - by + e$

[0149] $v_y = cx + dy + f$

[0150] While using six parameters to represent affine motion allows for the representation of complex motions, the increased number of bits required to encode each parameter reduces encoding efficiency. Therefore, affine motion can also be represented using four parameters. Equation 2 illustrates the case of representing affine motion using four parameters.

[0151] [Formula 2]

[0152] $v_x = ax - by + e$

[0153] $v_y = bx + ay + f$
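Formulas 1 and 2 can be written directly as functions to show how the two models relate. The function names are hypothetical and the parameter values are arbitrary illustrations; the expressions themselves follow the formulas above term by term.

```python
def affine_6param(x, y, a, b, c, d, e, f):
    """Formula 1: v_x = a*x - b*y + e, v_y = c*x + d*y + f."""
    return a * x - b * y + e, c * x + d * y + f


def affine_4param(x, y, a, b, e, f):
    """Formula 2: v_x = a*x - b*y + e, v_y = b*x + a*y + f."""
    return a * x - b * y + e, b * x + a * y + f


# The 4-parameter model equals the 6-parameter model with c = b and d = a.
print(affine_4param(4, 2, a=1.0, b=0.5, e=3.0, f=-1.0))                # (6.0, 3.0)
print(affine_6param(4, 2, a=1.0, b=0.5, c=0.5, d=1.0, e=3.0, f=-1.0))  # (6.0, 3.0)

# With a = b = 0 only the translational part (e, f) remains.
print(affine_4param(4, 2, a=0.0, b=0.0, e=3.0, f=-1.0))                # (3.0, -1.0)
```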

[0154] Information used to determine the affine motion model for the current block can be encoded and transmitted via a bitstream signal. For example, this information could be a 1-bit flag, "affine_type_flag". A value of 0 indicates the application of a 4-parameter affine motion model, and a value of 1 indicates the application of a 6-parameter affine motion model. The flag can be encoded at the slice, tile, or block level (e.g., a coding block or coding tree unit). When the flag is transmitted at the slice level, the affine motion model determined at that slice level can be applied to all blocks within that slice.

[0155] Alternatively, the affine motion model of the current block can be determined based on the affine inter-frame prediction mode of the current block. For example, when applying the affine merging mode, the affine motion model of the current block can be determined as a 4-parameter motion model. On the other hand, when applying the affine motion vector prediction mode, the information used to determine the affine motion model of the current block can be encoded and transmitted as a signal via a bitstream. For example, when applying the affine motion vector prediction mode to the current block, the affine motion model of the current block can be determined based on a 1-bit flag "affine_type_flag".

[0156] Next, the affine seed vector of the current block can be derived (S1102). When a 4-parameter affine motion model is selected, motion vectors at two control points of the current block can be derived. On the other hand, when a 6-parameter affine motion model is selected, motion vectors at three control points of the current block can be derived. The motion vectors at the control points can be called affine seed vectors. Control points can include at least one of the top-left, top-right, or bottom-left corners of the current block.

[0157] Figure 9 is a diagram showing an example of the affine seed vector for each affine motion model.

[0158] In a 4-parameter affine motion model, affine seed vectors related to two of the top-left, top-right, and bottom-left corners can be derived. For example, as shown in part (a) of Figure 9, when the 4-parameter affine motion model is selected, the affine vectors can be derived by using an affine seed vector sv0 associated with the top-left corner of the current block (e.g., the top-left sample (x0, y0)) and an affine seed vector sv1 associated with the top-right corner of the current block (e.g., the top-right sample (x1, y1)). Alternatively, the affine seed vector associated with the bottom-left corner can be used instead of the affine seed vector associated with the top-left corner, or vice versa.

[0159] In a 6-parameter affine motion model, affine seed vectors related to the top-left, top-right, and bottom-left corners can be derived. For example, as shown in part (b) of Figure 9, when the 6-parameter affine motion model is selected, the affine vectors can be derived by using the affine seed vector sv0 associated with the top-left corner of the current block (e.g., the top-left sample (x0, y0)), the affine seed vector sv1 associated with the top-right corner of the current block (e.g., the top-right sample (x1, y1)), and the affine seed vector sv2 associated with the bottom-left corner of the current block (e.g., the bottom-left sample (x2, y2)).

[0160] In the embodiments described later, under the 4-parameter affine motion model, the affine seed vectors of the upper left control point and the upper right control point are referred to as the first affine seed vector and the second affine seed vector, respectively. In the embodiments using the first and second affine seed vectors described later, at least one of the first and second affine seed vectors can be replaced by the affine seed vector of the lower left control point (the third affine seed vector) or the affine seed vector of the lower right control point (the fourth affine seed vector).

[0161] Furthermore, in the 6-parameter affine motion model, the affine seed vectors of the upper left control point, upper right control point, and lower left control point are respectively referred to as the first affine seed vector, the second affine seed vector, and the third affine seed vector. In the embodiments using the first, second, and third affine seed vectors described later, at least one of the first, second, and third affine seed vectors can be replaced by the affine seed vector of the lower right control point (the fourth affine seed vector).

[0162] The affine vector for each sub-block can be derived using an affine seed vector (S1103). Here, the affine vector represents the translational motion vector derived based on the affine seed vector. The affine vector of a sub-block can be called the affine sub-block motion vector or the sub-block motion vector.

[0163] Figure 10 is a diagram showing an example of the affine vectors of a sub-block under a 4-parameter motion model.

[0164] The affine vector of a sub-block can be derived based on the position of the control point, the position of the sub-block, and the affine seed vector. For example, Equation 3 shows an example of deriving the affine sub-block vector.

[0165] [Formula 3]

[0166]

[0167] In Formula 3, (x, y) represents the position of the sub-block. The position of the sub-block refers to the position of the reference sample included within it. The reference sample can be the sample located at the top-left corner of the sub-block, or a sample for which at least one of the x-axis or y-axis coordinates is at the center of the sub-block. (x0, y0) represents the position of the first control point, and $(sv_{0x}, sv_{0y})$ represents the first affine seed vector. Additionally, (x1, y1) represents the position of the second control point, and $(sv_{1x}, sv_{1y})$ represents the second affine seed vector.

[0168] When the first control point and the second control point correspond to the top left corner and the top right corner of the current block, respectively, x1-x0 can be set to the same value as the width of the current block.
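The derivation of sub-block vectors from two affine seed vectors can be sketched as follows. Formula 3 itself is not reproduced in this text, so the expression below is an assumption: it is the form obtained by requiring the 4-parameter model of Formula 2 to return the first seed vector at (x0, y0) and the second seed vector at (x1, y0). The function names, the 4×4 sub-block size, and the choice of the sub-block center as reference sample are also assumptions.

```python
def subblock_affine_vector(x, y, cp0, sv0, cp1, sv1):
    """Affine vector at (x, y) under an assumed 4-parameter model."""
    x0, y0 = cp0
    x1, _y1 = cp1
    span = x1 - x0                       # block width for top-left/top-right control points
    a = (sv1[0] - sv0[0]) / span         # parameter a of Formula 2
    b = (sv1[1] - sv0[1]) / span         # parameter b of Formula 2
    vx = a * (x - x0) - b * (y - y0) + sv0[0]
    vy = b * (x - x0) + a * (y - y0) + sv0[1]
    return vx, vy


def subblock_vectors(width, height, sv0, sv1, sub=4):
    """Map each sub-block's top-left position to its affine vector."""
    cp0, cp1 = (0, 0), (width, 0)        # so that x1 - x0 equals the block width
    vectors = {}
    for sy in range(0, height, sub):
        for sx in range(0, width, sub):
            cx, cy = sx + sub / 2, sy + sub / 2   # assumed reference sample: center
            vectors[(sx, sy)] = subblock_affine_vector(cx, cy, cp0, sv0, cp1, sv1)
    return vectors


# The model reproduces the seed vectors at the control points.
print(subblock_affine_vector(0, 0, (0, 0), (4, 0), (16, 0), (8, 4)))   # (4.0, 0.0)
print(subblock_affine_vector(16, 0, (0, 0), (4, 0), (16, 0), (8, 4)))  # (8.0, 4.0)

vectors = subblock_vectors(16, 16, sv0=(4, 0), sv1=(8, 4))
print(len(vectors), vectors[(0, 0)], vectors[(12, 0)])  # 16 (4.0, 1.0) (7.0, 4.0)
```

Each resulting vector is an ordinary translational motion vector, so motion compensation for a sub-block can proceed as it would for any translational block.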

[0169] Subsequently, motion compensation prediction for each sub-block can be performed using the affine vector of each sub-block (S1104). After performing motion compensation prediction, prediction blocks associated with each sub-block can be generated. The prediction blocks of the sub-blocks can be set as the prediction blocks of the current block.

[0170] Next, we will explain in detail the inter-frame prediction method that uses translational motion information.

[0171] Motion information for the current block can be derived from the motion information of other blocks. These other blocks can be blocks that were encoded / decoded using inter-frame prediction before the current block. Setting the motion information of the current block to be the same as that of other blocks is defined as a merging mode. Furthermore, setting the motion vectors of other blocks as the predicted values of the motion vectors of the current block is defined as a motion vector prediction mode.

[0172] Figure 11 is a flowchart of the process of deriving motion information of the current block in merge mode.

[0173] Merging candidates for the current block can be derived (S1101). Merging candidates for the current block can be derived from blocks that were encoded / decoded using inter-frame prediction before the current block.

[0174] Figure 12 is a diagram showing an example of a candidate block used to derive merge candidates.

[0175] Candidate blocks can include at least one of the following: neighboring blocks containing samples adjacent to the current block, or non-neighboring blocks containing samples not adjacent to the current block. Hereinafter, the samples used to determine candidate blocks will be designated as reference samples. Furthermore, reference samples adjacent to the current block will be referred to as adjacent reference samples, and reference samples not adjacent to the current block will be referred to as non-adjacent reference samples.

[0176] Adjacent reference samples can be included in the column adjacent to the leftmost column of the current block or in the row adjacent to the topmost row of the current block. For example, if the coordinates of the top-left sample of the current block are (0, 0), then at least one of the following blocks can be used as a candidate block: a block including a reference sample at position (-1, H-1), a block including a reference sample at position (W-1, -1), a block including a reference sample at position (W, -1), a block including a reference sample at position (-1, H), or a block including a reference sample at position (-1, -1). Referring to the accompanying drawings, adjacent blocks with indices 0 to 4 can be used as candidate blocks.
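The five adjacent reference sample positions can be computed from the block width W and height H as sketched below. The function name is hypothetical, and the list order simply follows the order in which the positions are listed above; it is not claimed to match the indices 0 to 4 used in the drawing.

```python
def adjacent_reference_samples(width, height):
    """Positions of adjacent reference samples, relative to the block's
    top-left sample at (0, 0). Order follows the text, not the figure indices."""
    return [
        (-1, height - 1),   # left column, level with the bottom row
        (width - 1, -1),    # top row, above the rightmost column
        (width, -1),        # top row, one sample past the right edge
        (-1, height),       # left column, one sample past the bottom edge
        (-1, -1),           # top-left corner
    ]


print(adjacent_reference_samples(16, 8))
# [(-1, 7), (15, -1), (16, -1), (-1, 8), (-1, -1)]
```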

[0177] A non-adjacent reference sample refers to a sample whose x-axis distance or y-axis distance to the reference sample adjacent to the current block has a predefined value. For example, a block containing a reference sample whose x-axis distance to the left reference sample is a predefined value, a block containing a non-adjacent sample whose y-axis distance to the upper reference sample is a predefined value, or a block containing non-adjacent samples whose x-axis and y-axis distances to the upper-left reference sample are both predefined values can be used as a candidate block. The predefined value can be an integer such as 4, 8, 12, 16, etc. Referring to the accompanying drawings, at least one of the blocks with indices from 5 to 26 can be used as a candidate block.

[0178] Samples that are not on the same vertical, horizontal, or diagonal line as adjacent reference samples can be set as non-adjacent reference samples.

[0179] Figure 13 is a diagram showing the positions of reference samples.

[0180] As in the example shown in Figure 13, the x-coordinate of a non-adjacent upper reference sample can be set to differ from that of the adjacent upper reference sample. For instance, when the position of the adjacent upper reference sample is (W-1, -1), the position of a non-adjacent upper reference sample that is N away from the adjacent upper reference sample along the y-axis can be set to ((W / 2)-1, -1-N), and the position of a non-adjacent upper reference sample that is 2N away from the adjacent upper reference sample along the y-axis can be set to (0, -1-2N). That is, the position of a non-adjacent reference sample can be determined based on the position of the adjacent reference sample and the distance between them.

[0181] In the following text, a candidate block containing an adjacent reference sample is called a neighboring block, and a block containing a non-adjacent reference sample is called a non-adjacent block.

[0182] When the distance between the current block and a candidate block is greater than or equal to a threshold, the candidate block can be set as unusable as a merging candidate. The threshold can be determined based on the size of the coding tree unit. For example, the threshold can be set to the height of the coding tree unit (ctu_height), or the height of the coding tree unit plus or minus an offset value (e.g., ctu_height ± N). The offset value N is a predefined value in the encoder and decoder, and can be set to 4, 8, 16, 32, or ctu_height.

[0183] If the difference between the y-axis coordinate of the current block and the y-axis coordinate of the samples included in the candidate block is greater than a threshold, the candidate block can be determined as unsuitable for merging.

[0184] Alternatively, candidate blocks that do not belong to the same coding tree unit as the current block can be set as unsuitable for merging. For example, when the reference sample exceeds the upper boundary of the coding tree unit to which the current block belongs, candidate blocks that include the reference sample can be set as unsuitable for merging.
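The two availability rules above can be sketched as follows. This is a minimal illustration, not the claimed method: the function names are hypothetical, coordinates are assumed to be picture-absolute luma sample positions, and coding tree unit rows are assumed to start at multiples of ctu_height.

```python
def exceeds_vertical_threshold(cur_y, cand_y, ctu_height, offset=0):
    """Return True if the y-axis distance to the candidate sample exceeds the threshold.

    The threshold is the CTU height plus an optional signed offset
    (hypothetical parameter standing in for the offset N of the text).
    """
    threshold = ctu_height + offset
    return abs(cur_y - cand_y) > threshold


def is_above_ctu_boundary(cur_y, cand_y, ctu_height):
    """Return True if the candidate's reference sample lies above the upper
    boundary of the coding tree unit that contains the current block."""
    # Assumption: CTU rows are aligned to multiples of ctu_height.
    ctu_top = (cur_y // ctu_height) * ctu_height
    return cand_y < ctu_top
```

A candidate block for which either function returns True would be treated as unusable as a merge candidate under the corresponding rule.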

[0185] If the upper boundary of the current block is adjacent to the upper boundary of a coding tree unit, multiple candidate blocks will be determined as unusable as merge candidates, which can reduce the encoding / decoding efficiency of the current block. To resolve this issue, candidate blocks can be configured such that the number of candidate blocks to the left of the current block is greater than the number of candidate blocks above the current block.

[0186] Figure 14 is a diagram showing an example of candidate blocks used to derive merge candidates.

[0187] As in the example shown in Figure 14, top blocks belonging to the N block rows above the current block and left blocks belonging to the M block columns to the left of the current block can be set as candidate blocks. In this case, by setting M to be greater than N, the number of left candidate blocks can be set to be greater than the number of top candidate blocks.

[0188] For example, the difference between the y-axis coordinate of the reference sample within the current block and the y-axis coordinate of the block above which can be used as a candidate block can be set to no more than N times the height of the current block. Additionally, the difference between the x-axis coordinate of the reference sample within the current block and the x-axis coordinate of the block to the left of which can be used as a candidate block can be set to no more than M times the width of the current block.

[0189] For example, in the example shown in Figure 14, blocks belonging to the two block rows above the current block and blocks belonging to the five block columns to the left of the current block are set as candidate blocks.

[0190] Merge candidates can also be derived from temporally neighboring blocks included in an image different from the current image. For example, a merge candidate can be derived from a collocated block included in a collocated image.

[0191] The motion information of a merge candidate can be set to be the same as that of the candidate block. For example, at least one of the motion vector, reference image index, prediction direction, or bidirectional weighted index of the candidate block can be set as the motion information of the merge candidate.

[0192] A merge candidate list including the merge candidates can be generated (S1102). The merge candidates can be classified into adjacent merge candidates derived from adjacent blocks adjacent to the current block, and non-adjacent merge candidates derived from non-adjacent blocks.

[0193] The indices of multiple merge candidates within the merge candidate list can be assigned in a predetermined order. For example, the index assigned to an adjacent merge candidate can have a smaller value than the index assigned to a non-adjacent merge candidate. Alternatively, the index shown for each block in Figure 12 or Figure 14 can be assigned to the corresponding merge candidate.

[0194] When the merge candidate list includes multiple merge candidates, at least one of the multiple merge candidates can be selected (S1103). At this time, information indicating whether the motion information of the current block is derived from an adjacent merge candidate can be signaled via the bitstream. The information can be a 1-bit flag. For example, the syntax element isAdjacentMergeFlag, which indicates whether the motion information of the current block is derived from an adjacent merge candidate, can be signaled via the bitstream. When the value of isAdjacentMergeFlag is 1, the motion information of the current block can be derived based on an adjacent merge candidate. On the other hand, when the value of isAdjacentMergeFlag is 0, the motion information of the current block can be derived based on a non-adjacent merge candidate.

[0195] Information specifying any one of the multiple merge candidates can be signaled via the bitstream. For example, information indicating the index of any one of the merge candidates included in the merge candidate list can be signaled via the bitstream.

[0196] When isAdjacentMergeFlag is 1, the syntax element merge_idx, which specifies one of the adjacent merge candidates, can be signaled. The maximum value of merge_idx can be set to the number of adjacent merge candidates minus 1.

[0197] When isAdjacentMergeFlag is 0, the syntax element NA_merge_idx, which specifies one of the non-adjacent merge candidates, can be signaled. NA_merge_idx indicates the value obtained by subtracting the number of adjacent merge candidates from the index of the non-adjacent merge candidate. The decoder can select a non-adjacent merge candidate by adding the number of adjacent merge candidates to the index specified by NA_merge_idx.
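The selection driven by the flag and the two index syntax elements can be sketched as below. The function and parameter names are hypothetical, and the sketch assumes that adjacent merge candidates occupy the lowest indices of the merge candidate list, which is one of the index orderings described above.

```python
def select_merge_candidate(merge_cand_list, num_adjacent,
                           is_adjacent_merge_flag,
                           merge_idx=None, na_merge_idx=None):
    """Pick a merge candidate from the parsed flag and index values.

    merge_cand_list: adjacent candidates first, then non-adjacent ones
                     (assumed ordering).
    num_adjacent:    number of adjacent merge candidates in the list.
    """
    if is_adjacent_merge_flag == 1:
        # merge_idx addresses the adjacent candidates directly.
        return merge_cand_list[merge_idx]
    # NA_merge_idx is relative to the first non-adjacent candidate, so the
    # decoder adds the number of adjacent candidates back.
    return merge_cand_list[num_adjacent + na_merge_idx]
```

For a list with three adjacent candidates followed by two non-adjacent ones, a flag of 0 with NA_merge_idx equal to 1 selects the entry at list index 4.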

[0198] When the number of merge candidates in the merge candidate list is less than a threshold, merge candidates included in the inter-frame motion information list can be added to the merge candidate list. The threshold can be the maximum number of merge candidates the merge candidate list can include, or that maximum number minus an offset. The offset can be an integer such as 1 or 2. The inter-frame motion information list can include merge candidates derived based on blocks encoded / decoded prior to the current block.

[0199] The inter-frame motion information list includes merging candidates derived from blocks encoded / decoded based on inter-frame prediction within the current image. For example, the motion information of the merging candidates included in the inter-frame motion information list can be set to be the same as the motion information of the blocks encoded / decoded based on inter-frame prediction. The motion information may include at least one of motion vectors, reference image indexes, prediction directions, or bidirectional weighted indexes.

[0200] For ease of explanation, the merging candidates included in the inter-frame motion information list are referred to as inter-frame merging candidates.

[0201] The maximum number of merge candidates that can be included in the inter-frame motion information list can be predefined in the encoder and decoder. For example, the maximum number of merge candidates that can be included in the inter-frame motion information list can be 1, 2, 3, 4, 5, 6, 7, 8 or greater (e.g., 16).

[0202] Alternatively, information representing the maximum number of merge candidates in the inter-frame motion information list can be signaled via the bitstream. This information can be signaled at the sequence, image, or slice level.

[0203] Alternatively, the maximum number of merge candidates for the inter-frame motion information list can be determined based on the image size, the size of the slice, or the size of the coding tree unit.

[0204] The inter-frame motion information list can be initialized at the level of images, slices, tiles, bricks, coding tree units, or coding tree unit lines (rows or columns). For example, the inter-frame motion information list is also initialized during slice initialization, and it may not include any merge candidates.

[0205] Alternatively, information indicating whether to initialize the inter-frame motion information list can be signaled via the bitstream. This information can be signaled at the slice, tile, brick, or block level. Until the information indicates that the list is to be initialized, the previously configured inter-frame motion information list can be used.

[0206] Alternatively, information related to initial inter-frame merge candidates can be signaled through the image parameter set or the slice header. Even when the slice is initialized, the inter-frame motion information list can include the initial inter-frame merge candidates. Thus, inter-frame merge candidates can be used for the first block encoded / decoded within the slice.

[0207] The blocks are encoded / decoded according to the encoding / decoding order, and multiple blocks encoded / decoded based on inter-frame prediction can be set as inter-frame merging candidates in sequence according to the encoding / decoding order.

[0208] Figure 15 is a flowchart showing the process of updating the inter-frame motion information list.

[0209] When performing inter-frame prediction on the current block (S1501), inter-frame merging candidates can be derived based on the current block (S1502). The motion information of the inter-frame merging candidates can be set to be the same as the motion information of the current block.

[0210] When the inter-frame motion information list is empty (S1503), inter-frame merging candidates derived from the current block can be added to the inter-frame motion information list (S1504).

[0211] When the inter-frame motion information list already includes inter-frame merge candidates (S1503), a redundancy check can be performed on the motion information of the current block (or the inter-frame merge candidate derived from the current block) (S1505). The redundancy check is used to determine whether the motion information of the inter-frame merge candidates stored in the inter-frame motion information list is the same as the motion information of the current block. Redundancy checks can be performed on all inter-frame merge candidates stored in the inter-frame motion information list. Alternatively, redundancy checks can be performed only on those stored inter-frame merge candidates whose index is above or below a threshold.

[0212] If inter-frame merge candidates with the same motion information as the current block are not included, inter-frame merge candidates derived from the current block can be added to the inter-frame motion information list (S1508). Whether inter-frame merge candidates are the same can be determined based on whether the motion information (e.g., motion vectors and / or reference image indexes, etc.) of the inter-frame merge candidates are the same.

[0213] In this case, when the maximum number of inter-frame merge candidates has been stored in the inter-frame motion information list (S1506), the earliest inter-frame merge candidate is deleted (S1507), and inter-frame merge candidates derived based on the current block can be added to the inter-frame motion information list (S1508).

[0214] Multiple inter-frame merge candidates can be identified based on their indices. When adding an inter-frame merge candidate derived from the current block to the inter-frame motion information list, the candidate is assigned the lowest index (e.g., 0), and the indices of already stored inter-frame merge candidates can be incremented by 1. In this case, when the maximum number of inter-frame merge candidates is stored in the inter-frame motion information list, the candidate with the highest index is removed.

[0215] Alternatively, when adding inter-frame merge candidates derived from the current block to the inter-frame motion information list, the inter-frame merge candidate can be assigned the largest index. For example, if the number of inter-frame merge candidates already stored in the inter-frame motion information list is less than the maximum value, the inter-frame merge candidate can be assigned an index with the same value as the number of stored inter-frame merge candidates. Alternatively, if the number of inter-frame merge candidates already stored in the inter-frame motion information list is equal to the maximum value, the inter-frame merge candidate can be assigned an index that is 1 less than the maximum value. Furthermore, the inter-frame merge candidate with the smallest index is removed, and the indices of the remaining stored inter-frame merge candidates are each reduced by 1.

[0216] Figure 16 is a diagram illustrating an embodiment of updating the inter-frame merge candidate list.

[0217] Assume that inter-frame merge candidates derived from the current block are added to the inter-frame merge candidate list, and the largest index is assigned to the inter-frame merge candidate. Also, assume that the inter-frame merge candidate list already stores the maximum number of inter-frame merge candidates.

[0218] When adding the inter-frame merge candidate HmvpCand[n+1] derived from the current block to the inter-frame merge candidate list HmvpCandList, the inter-frame merge candidate HmvpCand[0] with the smallest index is removed from the stored inter-frame merge candidates, and the indices of the remaining inter-frame merge candidates are each decreased by 1. In addition, the index of the inter-frame merge candidate HmvpCand[n+1] derived from the current block can be set to the maximum value (n in the example shown in Figure 16).

[0219] If an inter-frame merge candidate that is the same as the inter-frame merge candidate derived from the current block is already stored (S1505), the inter-frame merge candidate derived from the current block may not be added to the inter-frame motion information list (S1509).
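The update flow of steps S1503 to S1509 can be sketched as follows, using the variant in which the newest candidate receives the largest index. This is an illustrative sketch only: the function name is hypothetical, and each candidate is assumed to be a tuple such as (mv_x, mv_y, ref_idx) so that equality of tuples stands in for equality of motion information.

```python
def update_hmvp_list(hmvp_cand_list, mv_cand, max_num):
    """Update the inter-frame motion information list with mv_cand.

    Index 0 holds the earliest candidate; the newest one is appended
    at the end (largest index).
    """
    # S1503 / S1505: an empty list trivially passes the redundancy check.
    if mv_cand in hmvp_cand_list:
        # S1509: identical motion information is already stored; do not add.
        return hmvp_cand_list
    # S1506 / S1507: when the list is full, delete the earliest candidate;
    # the remaining indices shift down by 1.
    if len(hmvp_cand_list) >= max_num:
        hmvp_cand_list.pop(0)
    # S1508: add the candidate derived from the current block.
    hmvp_cand_list.append(mv_cand)
    return hmvp_cand_list
```

Calling the function once per inter-predicted block, in encoding / decoding order, keeps at most max_num distinct candidates, with the most recent one at the end.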

[0220] Alternatively, as inter-frame merge candidates derived from the current block are added to the inter-frame motion information list, previously stored inter-frame merge candidates that are identical to those candidates can also be removed. In this case, the indexes of the previously stored inter-frame merge candidates will be updated.

[0221] Figure 17 is a diagram showing an example of how the indices of stored inter-frame merge candidates are updated.

[0222] When the index of a stored inter-frame merge candidate that is identical to the inter-frame merge candidate mvCand derived from the current block is hIdx, the stored inter-frame merge candidate can be deleted, and the index of each inter-frame merge candidate with an index greater than hIdx can be decreased by 1. For example, in the example shown in Figure 17, HmvpCand[2], which is identical to mvCand, is removed from the inter-frame motion information list HmvpCandList, and the indices of HmvpCand[3] to HmvpCand[n] are each decreased by 1.

[0223] Furthermore, the inter-frame merge candidate mvCand derived from the current block can be added to the end of the inter-frame motion information list.

[0224] Alternatively, the index assigned to a stored inter-frame merge candidate that is identical to the inter-frame merge candidate derived from the current block can be updated. For example, the index of the stored inter-frame merge candidate can be changed to the minimum or maximum value.
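The alternative update, in which an identical stored candidate is removed and the new candidate is appended at the end, can be sketched as below. The function name is hypothetical; the handling of a full list without a duplicate follows the earliest-candidate removal described earlier and is included here as an assumption so that the sketch is complete.

```python
def update_hmvp_list_with_pruning(hmvp_cand_list, mv_cand, max_num):
    """Move-to-end update of the inter-frame motion information list."""
    if mv_cand in hmvp_cand_list:
        # Remove the identical stored candidate at index hIdx; candidates
        # with a larger index move down by 1.
        h_idx = hmvp_cand_list.index(mv_cand)
        del hmvp_cand_list[h_idx]
    elif len(hmvp_cand_list) >= max_num:
        # Assumed fallback: drop the earliest candidate when the list is full.
        del hmvp_cand_list[0]
    # Add the candidate derived from the current block at the end.
    hmvp_cand_list.append(mv_cand)
    return hmvp_cand_list
```

Compared with skipping the insertion, this variant keeps recently reused motion information at the largest index.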

[0225] Motion information of blocks included in a predetermined region can be set to not be added to the inter-frame motion information list. For example, inter-frame merge candidates derived from the motion information of blocks included in the merge processing region cannot be added to the inter-frame motion information list. Since the encoding / decoding order of the blocks included in the merge processing region is not defined, it is inappropriate to use the motion information of any of these blocks for inter-frame prediction of other blocks. Therefore, inter-frame merge candidates derived from the blocks included in the merge processing region may not be added to the inter-frame motion information list.

[0226] When performing motion compensation prediction in sub-block units, inter-frame merge candidates can be derived from the motion information of a representative sub-block among the multiple sub-blocks included in the current block. For example, when a sub-block merge candidate is used for the current block, an inter-frame merge candidate can be derived from the motion information of the representative sub-block among the sub-blocks.

[0227] The motion vector of a sub-block can be derived in the following order. First, any of the merge candidates included in the merge candidate list of the current block can be selected, and the initial shift vector (shVector) can be derived based on the motion vector of the selected merge candidate. Then, by adding the initial shift vector to the positions (xSb, ySb) of the reference samples (e.g., the top-left sample or the middle sample) of each sub-block within the coded block, a shifted sub-block with reference sample positions (xColSb, yColSb) can be derived. Equation 4 below shows the formula for deriving the shifted sub-block.

[0228] [Formula 4]

[0229] (xColSb, yColSb) = (xSb + (shVector[0] >> 4), ySb + (shVector[1] >> 4))

[0230] Next, the motion vector of the block at the same position corresponding to the center position of the sub-block including (xColSb, yColSb) is set as the motion vector of the sub-block including (xSb, ySb).
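A minimal sketch of Formula 4 follows. The function name is hypothetical, and the grouping of the shift is an assumption: the right shift by 4 is applied to each component of the shift vector before it is added to the sub-block position.

```python
def shifted_sub_block_position(x_sb, y_sb, sh_vector):
    """Return (xColSb, yColSb) for a sub-block reference sample at (x_sb, y_sb).

    sh_vector is the initial shift vector as an (x, y) pair of integers.
    Python's >> on negative integers is an arithmetic (floor) shift.
    """
    x_col_sb = x_sb + (sh_vector[0] >> 4)
    y_col_sb = y_sb + (sh_vector[1] >> 4)
    return (x_col_sb, y_col_sb)
```

For a sub-block at (16, 8) and a shift vector (35, -20), the components shift to 2 and -2, giving the position (18, 6).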

[0231] A representative sub-block can mean a sub-block that includes the top-left sample or the center sample of the current block.

[0232] Figure 18 is a diagram showing the position of a representative sub-block.

[0233] Figure 18 (a) shows an example of setting the sub-block located at the upper left of the current block as the representative sub-block. Figure 18 (b) shows an example of setting the sub-block located at the center of the current block as the representative sub-block. When performing motion compensation prediction on a sub-block basis, inter-frame merge candidates for the current block can be derived based on the motion vector of the sub-block that includes the top-left sample of the current block or the sub-block that includes the center sample of the current block.

[0234] Whether the current block is to be used as an inter-frame merge candidate can also be determined based on the inter-frame prediction mode of the current block. For example, blocks encoded / decoded based on an affine motion model can be set to be unavailable as inter-frame merge candidates. Thus, even if the current block is encoded / decoded using inter-frame prediction, the inter-frame motion information list will not be updated based on the current block if the inter-frame prediction mode of the current block is the affine prediction mode.

[0235] Alternatively, inter-frame merge candidates can be derived from at least one sub-block vector within the sub-blocks included in the block being encoded / decoded based on an affine motion model. For example, an inter-frame merge candidate can be derived using a sub-block located to the upper left, center, or upper right of the current block. Alternatively, the average of the sub-block vectors of multiple sub-blocks can be used as the motion vector for the inter-frame merge candidate.

[0236] Alternatively, inter-frame merge candidates can be derived based on the average of the affine seed vectors of the blocks encoded / decoded using an affine motion model. For example, the average of at least one of the first, second, or third affine seed vectors of the current block can be set as the motion vector of the inter-frame merge candidate.

[0237] Alternatively, inter-frame motion information lists can be configured for different inter-frame prediction modes. For example, at least one of the following can be defined: an inter-frame motion information list for blocks encoded / decoded via intra-block copy, an inter-frame motion information list for blocks encoded / decoded based on a translational motion model, or an inter-frame motion information list for blocks encoded / decoded based on an affine motion model. Any one of the multiple inter-frame motion information lists can be selected depending on the inter-frame prediction mode of the current block.

[0238] Figure 19 is a diagram showing an example of generating inter-frame motion information lists for different inter-frame prediction modes.

[0239] When encoding / decoding a block based on a non-affine motion model, inter-frame merge candidate `mvCand` derived from the block can be added to the inter-frame non-affine motion information list `HmvpCandList`. Conversely, when encoding / decoding a block based on an affine motion model, inter-frame merge candidate `mvAfCand` derived from the block can be added to the inter-frame affine motion information list `HmvpAfCandList`.

[0240] The affine seed vector of a block can be stored in an inter-frame merge candidate derived from the block encoded / decoded based on the affine motion model. Thus, the inter-frame merge candidate can be used as a merge candidate for deriving the affine seed vector of the current block.

[0241] In addition to the inter-frame motion information list described above (hereinafter referred to as the first inter-frame motion information list), a long-term motion information list (hereinafter referred to as the second inter-frame motion information list) can also be defined. The long-term motion information list includes long-term merge candidates.

[0242] When both the first and second inter-frame motion information lists are empty, inter-frame merge candidates can be added to the second inter-frame motion information list first. Only after the maximum number of available inter-frame merge candidates in the second inter-frame motion information list has been reached can inter-frame merge candidates be added to the first inter-frame motion information list.

[0243] Alternatively, an inter-frame merge candidate can be added to both the second inter-frame motion information list and the first inter-frame motion information list.

[0244] In this case, the already configured second inter-frame motion information list may no longer be updated. Alternatively, the second inter-frame motion information list may be updated when the decoded region is above a predetermined ratio of slices. Alternatively, the second inter-frame motion information list may be updated every N coding tree unit rows.

[0245] On the other hand, the first inter-frame motion information list can be updated whenever a block is encoded / decoded using inter-frame prediction. However, inter-frame merge candidates added to the second inter-frame motion information list can also be set not to be used to update the first inter-frame motion information list.

[0246] Information for selecting either the first inter-frame motion information list or the second inter-frame motion information list can be signaled via the bitstream. When the number of merge candidates included in the merge candidate list is less than a threshold, merge candidates included in the inter-frame motion information list indicated by the information can be added to the merge candidate list.

[0247] Alternatively, the inter-frame motion information list can be selected based on the size and shape of the current block, the inter-frame prediction mode, whether bidirectional prediction is applied, whether motion vector refinement is applied, or whether triangular partitioning is applied.

[0248] Alternatively, if the number of merge candidates included in the merge candidate list is still less than the maximum number even after adding the inter-frame merge candidates included in the first inter-frame motion information list, then the inter-frame merge candidates included in the second inter-frame motion information list can be added to the merge candidate list.

[0249] Figure 20 is a diagram illustrating an example of adding inter-frame merge candidates included in the long-term motion information list to the merge candidate list.

[0250] If the number of merge candidates in the merge candidate list is less than the maximum number, inter-frame merge candidates included in the first inter-frame motion information list HmvpCandList can be added to the merge candidate list. If the number of merge candidates in the merge candidate list is still less than the maximum number after adding the inter-frame merge candidates included in the first inter-frame motion information list, inter-frame merge candidates included in the long-term motion information list HmvpLTCandList can be added to the merge candidate list.
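The two-stage filling from HmvpCandList and then HmvpLTCandList can be sketched as follows. The function name is hypothetical, and two details are assumptions made for illustration: candidates are taken starting from the largest index, and a candidate identical to one already in the merge candidate list is skipped.

```python
def fill_merge_list(merge_cand_list, hmvp_cand_list, hmvp_lt_cand_list, max_num):
    """Append inter-frame merge candidates until max_num is reached.

    The first list is consulted before the long-term list.
    """
    for source in (hmvp_cand_list, hmvp_lt_cand_list):
        # Assumed order: largest index (most recent) first.
        for cand in reversed(source):
            if len(merge_cand_list) >= max_num:
                return merge_cand_list
            # Assumed full redundancy check against the merge candidate list.
            if cand not in merge_cand_list:
                merge_cand_list.append(cand)
    return merge_cand_list
```

The long-term list contributes candidates only when the first list leaves the merge candidate list short of max_num.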

[0251] Inter-frame merge candidates can be configured to include additional information besides motion information. For example, the size, shape, or partitioning information of the block can additionally be stored in the inter-frame merge candidate. When constructing the merge candidate list for the current block, only those inter-frame merge candidates whose size, shape, or partitioning information is the same as or similar to that of the current block are used, or such inter-frame merge candidates are preferentially added to the merge candidate list.

[0252] Alternatively, inter-frame motion information lists can be generated for different block sizes, shapes, or partitioning information. Among the multiple inter-frame motion information lists, the list corresponding to the shape, size, or partitioning information of the current block can be used to generate the merge candidate list for the current block.

[0253] If the number of merge candidates in the current block's merge candidate list is less than a threshold, inter-frame merge candidates included in the inter-frame motion information list can be added to the merge candidate list. The addition is performed in ascending or descending order of index. For example, inter-frame merge candidates can be added to the merge candidate list starting from the one with the largest index.

[0254] When adding inter-frame merge candidates included in the inter-frame motion information list to the merge candidate list, a redundancy check can be performed between the inter-frame merge candidates and multiple merge candidates already stored in the merge candidate list.

[0255] Redundancy checks can also be performed only on some of the inter-frame merge candidates included in the inter-frame motion information list. For example, redundancy checks can be performed only on inter-frame merge candidates with indices above or below a threshold. Alternatively, redundancy checks can be performed only on the N merge candidates with the largest indices or the N merge candidates with the smallest indices.

[0256] Alternatively, redundancy checks can be performed only on some of the merge candidates already stored in the merge candidate list. For example, redundancy checks can be performed only on merge candidates with an index above or below a threshold, or on merge candidates derived from a block at a specific location. A specific location may include at least one of the current block's left neighbor, top neighbor, top-right neighbor, or bottom-left neighbor.

[0257] Figure 21 is a diagram illustrating an example of performing redundancy checks only on some merge candidates.

[0258] When adding an inter-frame merge candidate HmvpCand[j] to the merge candidate list, a redundancy check can be performed between the inter-frame merge candidate and the two merge candidates with the largest indices, mergeCandList[NumMerge-2] and mergeCandList[NumMerge-1]. Here, NumMerge represents the number of available spatial and temporal merge candidates.

[0259] Unlike the example shown in the figure, when adding an inter-frame merge candidate HmvpCand[j] to the merge candidate list, a redundancy check can also be performed between the inter-frame merge candidate and the two merge candidates with the smallest index. For example, it can be verified whether mergeCandList[0] and mergeCandList[1] are the same as HmvpCand[j]. Alternatively, a redundancy check can be performed only on merge candidates derived from a specific location. For example, a redundancy check can be performed only on at least one of the merge candidates derived from the adjacent block to the left of the current block or the merge candidate derived from the adjacent block above the current block. When there is no merge candidate derived from a specific location in the merge candidate list, the inter-frame merge candidate can be added to the merge candidate list without performing a redundancy check.
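A partial redundancy check against only the two merge candidates with the largest indices, mergeCandList[NumMerge-2] and mergeCandList[NumMerge-1], might look like the sketch below. The function name is hypothetical, and num_merge is passed in explicitly as the number of spatial and temporal merge candidates available before any inter-frame merge candidate was added.

```python
def add_hmvp_with_partial_check(merge_cand_list, hmvp_cand, num_merge):
    """Add hmvp_cand unless it equals one of the last two of the first
    num_merge merge candidates. Returns True if the candidate was added."""
    start = max(0, num_merge - 2)
    # Compare only with mergeCandList[NumMerge-2] and mergeCandList[NumMerge-1].
    if hmvp_cand in merge_cand_list[start:num_merge]:
        return False
    merge_cand_list.append(hmvp_cand)
    return True
```

Because only two entries are compared, a candidate identical to an entry at a lower index is still added; this is the trade-off of a reduced check against a full one.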

[0260] If a merge candidate identical to the first inter-frame merge candidate is found, the redundancy check against that merge candidate can be skipped when performing the redundancy check on the second inter-frame merge candidate.

[0261] Figure 22 is a diagram illustrating an example of skipping the redundancy check on a specific merge candidate.

[0262] When adding the inter-frame merge candidate HmvpCand[i] with index i to the merge candidate list, a redundancy check can be performed between that inter-frame merge candidate and the merge candidates already stored in the merge candidate list. If a merge candidate mergeCandList[j] identical to HmvpCand[i] is found, HmvpCand[i] is not added to the merge candidate list, and a redundancy check can then be performed between the inter-frame merge candidate HmvpCand[i-1] with index i-1 and the merge candidates. In this case, the redundancy check between HmvpCand[i-1] and mergeCandList[j] can be skipped.

[0263] For example, in the example shown in Figure 22, HmvpCand[i] is determined to be the same as mergeCandList[2]. Therefore, HmvpCand[i] is not added to the merge candidate list, and a redundancy check can be performed on HmvpCand[i-1]. In this case, the redundancy check between HmvpCand[i-1] and mergeCandList[2] can be skipped.
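The skipping rule can be sketched as follows. The function name, the descending processing order, and the returned comparison count are choices made for illustration only. The sketch relies on the inter-frame motion information list itself holding no two identical candidates, so that a merge candidate already matched by one inter-frame merge candidate cannot match another.

```python
def add_hmvp_candidates(merge_cand_list, hmvp_cand_list, max_num):
    """Add inter-frame merge candidates, skipping checks against merge
    candidates that already matched an earlier inter-frame candidate.

    Returns the updated list and the number of comparisons performed.
    """
    num_merge = len(merge_cand_list)  # candidates present before this step
    matched = set()                   # indices j of already matched entries
    comparisons = 0
    for cand in reversed(hmvp_cand_list):  # HmvpCand[i], then HmvpCand[i-1], ...
        if len(merge_cand_list) >= max_num:
            break
        duplicate = False
        for j in range(num_merge):
            if j in matched:
                continue  # skipped redundancy check
            comparisons += 1
            if merge_cand_list[j] == cand:
                matched.add(j)
                duplicate = True
                break
        if not duplicate:
            merge_cand_list.append(cand)
    return merge_cand_list, comparisons
```

With three stored merge candidates and two inter-frame candidates of which the first processed equals mergeCandList[2], the second one is compared with only two entries instead of three.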

[0264] When the number of merge candidates in the current block's merge candidate list is less than a threshold, in addition to inter-frame merge candidates, at least one of pairwise merge candidates or zero merge candidates may be included. A pairwise merge candidate is a merge candidate whose motion vector is the average of the motion vectors of two or more merge candidates, while a zero merge candidate is a merge candidate whose motion vector is 0.

[0265] Merge candidates for the current block can be added in the following order.

[0266] Spatial merge candidate - Temporal merge candidate - Inter-frame merge candidate - (Inter-frame affine merge candidate) - Pairwise merge candidate - Zero merge candidate

[0267] Spatial merge candidates refer to merge candidates derived from at least one of adjacent or non-adjacent blocks, while temporal merge candidates refer to merge candidates derived from the previous reference image. The inter-frame affine merge candidate represents an inter-frame merge candidate derived from a block encoded / decoded using an affine motion model.
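
The construction order above can be sketched as follows. This is a simplified illustration under several assumptions: candidates are reduced to a motion vector (mv_x, mv_y), every added candidate is pruned against the whole list, the affine candidates are omitted, the pairwise candidate averages the first two candidates, and the averaging uses floor division. None of these details is fixed by the text.

```python
def build_merge_list(spatial, temporal, hmvp, max_cands):
    """Fill the list in the order spatial, temporal, inter-frame, pairwise, zero."""
    merge_list = []

    def push(cand):
        # Add only while there is room and the candidate is not a duplicate.
        if len(merge_list) < max_cands and cand not in merge_list:
            merge_list.append(cand)

    for group in (spatial, temporal, hmvp):
        for cand in group:
            push(cand)

    # Pairwise candidate: average of the first two candidates (assumed choice).
    if 2 <= len(merge_list) < max_cands:
        (ax, ay), (bx, by) = merge_list[0], merge_list[1]
        push(((ax + bx) // 2, (ay + by) // 2))

    # Zero candidates fill whatever room remains.
    while len(merge_list) < max_cands:
        merge_list.append((0, 0))
    return merge_list
```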

[0268] The inter-frame motion information list can also be used in advanced motion vector prediction mode. For example, if the number of motion vector prediction candidates included in the current block's motion vector prediction candidate list is less than a threshold, the inter-frame merging candidates included in the inter-frame motion information list are set as motion vector prediction candidates related to the current block. Specifically, the motion vectors of the inter-frame merging candidates are set as motion vector prediction candidates.

[0269] If any one of the motion vector prediction candidates included in the motion vector prediction candidate list for the current block is selected, the selected candidate is set as the motion vector prediction value for the current block. Then, after decoding the motion vector residual value for the current block, the motion vector for the current block can be obtained by adding the motion vector prediction value and the motion vector residual value.
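
The two steps above, filling the motion vector prediction candidate list from the inter-frame motion information list and reconstructing the motion vector, can be sketched as follows. The function names and the (mv_x, mv_y) tuple representation are assumptions for illustration, and no pruning is applied because the text does not specify one.

```python
def fill_mvp_candidates(mvp_candidates, hmvp_motion_vectors, max_cands):
    """Append motion vectors of inter-frame merge candidates until the list is full."""
    filled = list(mvp_candidates)
    for mv in hmvp_motion_vectors:
        if len(filled) >= max_cands:
            break
        filled.append(mv)
    return filled


def reconstruct_mv(mvp_candidates, mvp_idx, mvd):
    """Motion vector = selected prediction value + decoded residual value."""
    mvp_x, mvp_y = mvp_candidates[mvp_idx]
    mvd_x, mvd_y = mvd
    return (mvp_x + mvd_x, mvp_y + mvd_y)
```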

[0270] The candidate list for motion vector prediction of the current block can be constructed in the following order.

[0271] Spatial motion vector prediction candidate - Temporal motion vector prediction candidate - Inter-frame decoding region merging candidate - (Inter-frame decoding region affine merging candidate) - Zero motion vector prediction candidate

[0272] Spatial motion vector prediction candidates are those derived from at least one neighboring or non-neighboring block, while temporal motion vector prediction candidates are those derived from the previous reference image. The inter-frame affine merging candidate represents an inter-frame motion vector prediction candidate derived from a block encoded / decoded using an affine motion model. Zero motion vector prediction candidates represent candidates with a motion vector value of 0.

[0273] The coded block can be divided into multiple prediction units, and prediction can be performed on each of the divided prediction units. Here, a prediction unit represents the basic unit used for prediction.

[0274] A coded block can be divided using at least one of vertical lines, horizontal lines, oblique lines, or diagonal lines. Information for determining at least one of the number, angle, or position of the lines dividing the coded block can be signaled via the bitstream. For example, information indicating any one of the candidate partition types for the coded block can be signaled via the bitstream, or information specifying any one of a plurality of line candidates for dividing the coded block can be signaled via the bitstream. Alternatively, information for determining the number or type of line candidates for dividing the coded block can be signaled via the bitstream. For example, a 1-bit flag can be used to determine whether oblique lines with angles greater than that of the diagonal and / or oblique lines with angles smaller than that of the diagonal are available as line candidates.

[0275] Alternatively, at least one of the following can be adaptively determined based on at least one of the intra-frame prediction mode, inter-frame prediction mode, available merge candidate locations, or adjacent block partitioning types of the coding block: the number, angle, or location of the lines that partition the coding block.

[0276] If a coding block is divided into multiple prediction units, then intra-frame prediction or inter-frame prediction can be performed on each prediction unit.

[0277] Figure 23 is a diagram illustrating an example of dividing a coded block into multiple prediction units using diagonals.

[0278] As in the examples shown in Figure 23 (a) and Figure 23 (b), the coding block can be divided into two triangular prediction units using diagonals.

[0279] Figure 23 (a) and Figure 23 (b) show that a coding block can be divided into two prediction units using a diagonal line connecting two vertices of the coding block. However, a coding block can also be divided into two prediction units using a diagonal line at least one end of which does not pass through a vertex of the coding block.

[0280] Figure 24 is a diagram illustrating an example of dividing a coded block into two prediction units.

[0281] As in the examples shown in Figure 24 (a) and Figure 24 (b), the coding block can be divided into two prediction units using a diagonal line whose two ends respectively touch the upper and lower boundaries of the coding block.

[0282] Alternatively, as in the examples shown in Figure 24 (c) and Figure 24 (d), the coding block can be divided into two prediction units using a diagonal line whose two ends touch the left and right boundaries of the coding block, respectively.

[0283] Alternatively, the coded block can be divided into two prediction blocks of different sizes. For example, the dividing line of the coded block can be set to contact the two boundary surfaces that form a vertex, thereby dividing the coded block into two prediction units of different sizes.

[0284] Figure 25 is a diagram showing an example of dividing a coded block into multiple prediction blocks of different sizes.

[0285] As in the examples shown in Figure 25 (a) and Figure 25 (b), the coding block can be divided into two prediction units of different sizes by setting the diagonal connecting the top-left or bottom-right corner of the coding block to pass through the left, right, top, or bottom boundary of the coding block, instead of passing through the top-left or bottom-right corner of the coding block.

[0286] Alternatively, as in the examples shown in Figure 25 (c) and Figure 25 (d), the coding block can be divided into two prediction units of different sizes by setting the diagonal connecting the top-right or bottom-left corner of the coding block to pass through the left, right, top, or bottom boundary of the coding block, instead of passing through the top-right or bottom-left corner of the coding block.

[0287] Each prediction unit generated by dividing the coding block is called the "Nth prediction unit". For example, in the examples shown in Figures 23 to 25, PU1 can be defined as a first prediction unit, and PU2 can be defined as a second prediction unit. The first prediction unit can refer to a prediction unit that includes samples located in the lower left or upper left of the coding block, and the second prediction unit can refer to a prediction unit that includes samples located in the upper right or lower right of the coding block.

[0288] Conversely, a prediction unit that includes samples located in the upper right or lower right of the coding block can be defined as a first prediction unit, and a prediction unit that includes samples located in the lower left or upper left of the coding block can be defined as a second prediction unit.

[0289] The embodiments described later primarily illustrate examples of partitioning using diagonals. Specifically, the process of dividing a coding block into two prediction units using diagonals is called diagonal partitioning or triangular partitioning, and the prediction units generated based on diagonal partitioning are called triangular prediction units. However, it is also possible, in the embodiments described later, to use examples of partitioning using oblique lines at angles different from vertical lines, horizontal lines, or diagonals.

[0290] Whether to apply diagonal partitioning to a coding block can be determined based on at least one of the following: slice type, maximum number of merging candidates that may be included in the merge candidate list, size of the coding block, shape of the coding block, predicted coding mode of the coding block, or partitioning type of the parent node.

[0291] For example, whether to apply diagonal partitioning to the coded block can be determined based on whether the current slice is of type B. Diagonal partitioning is only allowed when the current slice is of type B.

[0292] Alternatively, the decision to apply diagonal partitioning to the coded block can be based on whether the maximum number of merge candidates included in the merge candidate list is two or more. Diagonal partitioning is only permitted when the maximum number of merge candidates included in the merge candidate list is two or more.

[0293] Alternatively, when at least one of the width or height of the coded block is greater than 64, a hardware implementation has the drawback that 64×64-sized data processing units are redundantly accessed. Therefore, when at least one of the width or height of the coded block is greater than a threshold, the coded block may not be divided into multiple prediction blocks. For example, when at least one of the height and width of the coded block is greater than 64 (e.g., when at least one of the width and height is 128), diagonal division may not be used.

[0294] Alternatively, considering the maximum number of samples that can be processed simultaneously in a hardware implementation, diagonal partitioning may not be allowed for coded blocks with a sample count greater than a threshold. For example, diagonal partitioning may not be permitted for coded tree blocks with a sample count greater than 4096.

[0295] Alternatively, diagonal splits may not be allowed for coding blocks containing fewer than a threshold number of samples. For example, when a coding block contains fewer than 64 samples, it can be configured not to apply diagonal splits to the coding block.

[0296] Alternatively, whether to apply diagonal division to the coding block can be determined based on whether the width-to-height ratio of the coding block is less than a first threshold or whether the width-to-height ratio of the coding block is greater than a second threshold. Here, the width-to-height ratio whRatio of the coding block can be determined as the ratio of the width CbW to the height CbH of the coding block, as shown in Formula 5 below.

[0297] [Formula 5]

[0298] whRatio=CbW / CbH

[0299] The second threshold can be the reciprocal of the first threshold. For example, if the first threshold is k, the second threshold can be 1 / k.

[0300] Diagonal division can only be applied to a coding block if the width-to-height ratio of the coding block is between the first threshold and the second threshold.

[0301] Alternatively, triangular partitioning can only be used if the width-to-height ratio of the coded block is less than a first threshold or greater than a second threshold. For example, when the first threshold is 16, diagonal partitioning is not allowed for coded blocks of size 64×4 or 4×64.
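
The ratio test built on Formula 5 can be sketched as follows. The sketch assumes one particular reading, chosen because it agrees with the 64×4 and 4×64 example: diagonal partitioning is allowed only when whRatio lies strictly between 1 / k and k. The function name and the use of exact fractions are also assumptions.

```python
from fractions import Fraction


def diagonal_allowed_by_ratio(cb_w, cb_h, k=16):
    """Allow diagonal partitioning only when 1/k < CbW/CbH < k (assumed reading)."""
    wh_ratio = Fraction(cb_w, cb_h)  # Formula 5: whRatio = CbW / CbH
    return Fraction(1, k) < wh_ratio < k
```

With k = 16, a 64×4 block has whRatio = 16 and a 4×64 block has whRatio = 1 / 16, so both are rejected, while a 32×4 block (whRatio = 8) is accepted.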

[0302] Alternatively, the permission to split diagonally can be determined based on the partitioning type of the parent node. For example, when the coded block serving as the parent node is partitioned based on a quadtree, diagonal partitioning can be applied to the coded blocks serving as leaf nodes. On the other hand, when the coded block serving as the parent node is partitioned based on a binary or ternary tree, the coded blocks serving as leaf nodes are set to not allow diagonal partitioning.

[0303] Alternatively, whether diagonal splitting is allowed can be determined based on the predictive coding mode of the coding block. For example, diagonal splitting may be permitted only when the coding block is coded using intra-frame prediction, only when the coding block is coded using inter-frame prediction, or only when the coding block is coded using a predefined inter-frame prediction mode. The predefined inter-frame prediction mode can represent at least one of a merging mode, an advanced motion vector prediction mode, an affine merging mode, or an affine motion vector prediction mode.

[0304] Alternatively, the size of the parallel processing region can be used to determine whether diagonal partitioning is allowed. For example, if the size of the coded block is larger than the size of the parallel processing region, diagonal partitioning may not be used.

[0305] Two or more of the listed conditions can also be considered together to determine whether to apply diagonal partitioning to the coded block.
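
Combining several of the listed conditions can be sketched as follows. The particular selection of conditions (slice type, maximum number of merge candidates, block side length, and minimum sample count), the default thresholds taken from the examples above, and the function name are assumptions for illustration; an actual codec may combine a different subset.

```python
def diagonal_partition_allowed(slice_type, max_merge_cands, cb_w, cb_h,
                               max_side=64, min_samples=64):
    """Return True only if every checked condition permits diagonal partitioning."""
    if slice_type != "B":                    # only B slices allow it
        return False
    if max_merge_cands < 2:                  # need at least two merge candidates
        return False
    if cb_w > max_side or cb_h > max_side:   # e.g. 128-wide blocks are excluded
        return False
    if cb_w * cb_h < min_samples:            # blocks with too few samples are excluded
        return False
    return True
```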

[0306] As another example, information indicating whether a diagonal partition has been applied to a coded block can be signaled via a bitstream. This information can be signaled at the sequence level, image level, slice level, or block level. For example, a flag indicating whether a triangle partition has been applied to a coded block can be signaled at the coded block level.

[0307] When it is determined that diagonal division is applied to a coded block, information indicating the number of lines dividing the coded block or the position of the lines can be signaled via the bitstream.

[0308] For example, when a coded block is divided by a diagonal, information indicating the direction of the diagonal dividing the coded block can be sent via a bitstream signal. For instance, a flag indicating the direction of the diagonal, `triangle_partition_type_flag`, can be sent via a bitstream signal. This flag indicates whether the coded block is divided by a diagonal connecting the top left and bottom right, or by a diagonal connecting the top right and bottom left. Dividing the coded block by a diagonal connecting the top left and bottom right is called a left triangle partitioning type, and dividing it by a diagonal connecting the top right and bottom left is called a right triangle partitioning type. For example, a flag value of 0 indicates a left triangle partitioning type, and a flag value of 1 indicates a right triangle partitioning type.
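
The effect of the direction flag can be visualized with a small sketch that labels each sample of a block as belonging to the first (1) or second (2) prediction unit. The function name, the rule that samples lying exactly on the dividing line belong to the first prediction unit, and the assignment of the lower-left / upper-left region to the first prediction unit are assumptions for illustration.

```python
def triangle_mask(cb_w, cb_h, triangle_partition_type_flag):
    """Label each sample 1 (first prediction unit) or 2 (second prediction unit)."""
    mask = []
    for y in range(cb_h):
        row = []
        for x in range(cb_w):
            if triangle_partition_type_flag == 0:
                # Left triangle type: diagonal from top left to bottom right.
                in_second = x * cb_h > y * cb_w
            else:
                # Right triangle type: diagonal from top right to bottom left.
                in_second = x * cb_h + y * cb_w >= cb_w * cb_h
            row.append(2 if in_second else 1)
        mask.append(row)
    return mask
```

For a 4×4 block, flag 0 places the first prediction unit in the lower-left triangle and flag 1 places it in the upper-left triangle.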

[0309] Alternatively, information indicating whether prediction units have the same size or information indicating the position of the diagonal used to divide the coded block can be transmitted via a bitstream signal. For example, if the information indicating the size of the prediction units indicates that the prediction units are the same size, the encoding of the information indicating the diagonal position is skipped, and the coded block can be divided into two prediction units using a diagonal line passing through the two vertices of the coded block. On the other hand, when the information indicating the size of the prediction units indicates that the prediction units are not the same size, the position of the diagonal used to divide the coded block can be determined based on the information indicating the position of the diagonal. For example, when a left triangle partitioning type is applied to the coded block, the position information can indicate whether the diagonal touches the left and lower boundaries or the upper and right boundaries of the coded block. Alternatively, when a right triangle partitioning type is applied to the coded block, the position information can indicate whether the diagonal touches the right and lower boundaries or the upper and left boundaries of the coded block.

[0310] Information indicating the partition type of a coded block can be signaled at the coded block level. Accordingly, the partition type can be determined individually for each coded block to which diagonal partitioning is applied.

[0311] As another example, for sequence, image, slice, tile, or coding tree unit, information indicating the partition type can be sent using signals. In this case, the partition type of the coding block with diagonal partitioning can be set to be the same within the sequence, image, slice, tile, or coding tree unit.

[0312] Alternatively, for the first coding unit within the coding tree unit that applies diagonal partitioning, the information used to determine the partitioning type is encoded and transmitted by a signal, and the second and subsequent coding units that apply diagonal partitioning are set to use the same partitioning type as the first coding unit.

[0313] As another example, the partitioning type of a coded block can be determined based on the partitioning type of adjacent blocks. Adjacent blocks can include at least one of the following: adjacent blocks at the top-left corner, adjacent blocks at the top-right corner, adjacent blocks at the bottom-left corner, adjacent blocks above, or adjacent blocks to the left. For example, the partitioning type of the current block can be set to the same type as that of its adjacent blocks. Alternatively, the partitioning type of the current block can be determined based on whether the top-left adjacent block applies a left triangle partitioning type, or whether the top-right or bottom-left adjacent block applies a right triangle partitioning type.

[0314] To perform motion prediction compensation on the first and second triangle prediction units, motion information for each unit can be derived. In this case, the motion information of the first and second triangle prediction units can be derived from the merging candidates included in the merging candidate list. To distinguish between a general merging candidate list and the merging candidate list used to derive motion information of triangle prediction units, the merging candidate list used to derive motion information of triangle prediction units is referred to as the triangle merging candidate list, and the merging candidates included in the triangle merging candidate list are referred to as triangle merging candidates. However, applying the aforementioned merging candidate deriving method and merging candidate list construction method to triangle merging candidates and triangle merging candidate list construction methods is also included within the spirit of this invention.

[0315] Information for determining the maximum number of triangle merging candidates that can be included in the triangle merging candidate list can be transmitted via a bitstream signal. This information can represent the difference between the maximum number of merge candidates that the merge candidate list can include and the maximum number of triangle merging candidates that the triangle merging candidate list can include.

[0316] Triangle merging candidates can be derived from spatially adjacent blocks and temporally adjacent blocks of the coded block.

[0317] Figure 26 is a diagram showing adjacent blocks used to derive triangle merging candidates.

[0318] Triangle merging candidates can be derived using at least one of the following: an upper adjacent block, a left adjacent block, or a collocated block included in an image different from that of the coded block. An upper adjacent block can include at least one of the following: a block containing the sample (xCb+CbW-1, yCb-1) located above the coded block; a block containing the sample (xCb+CbW, yCb-1) located above the coded block; or a block containing the sample (xCb-1, yCb-1) located above the coded block. A left adjacent block can include at least one of the following: a block containing the sample (xCb-1, yCb+CbH-1) located to the left of the coded block; or a block containing the sample (xCb-1, yCb+CbH) located to the left of the coded block. The collocated block can be defined as either a block that includes the sample (xCb+CbW, yCb+CbH) adjacent to the bottom-right corner of the coded block within the collocated image, or a block that includes the sample (xCb / 2, yCb / 2) located at the center of the coded block.
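
The sample positions listed above can be collected in a small helper. This is only a sketch: the function name and dictionary keys are hypothetical, the coordinates are copied from the text as written, and integer division for (xCb / 2, yCb / 2) is an assumption.

```python
def triangle_neighbor_positions(x_cb, y_cb, cb_w, cb_h):
    """Return the sample positions that identify each neighboring block."""
    return {
        "above": [
            (x_cb + cb_w - 1, y_cb - 1),
            (x_cb + cb_w, y_cb - 1),
            (x_cb - 1, y_cb - 1),
        ],
        "left": [
            (x_cb - 1, y_cb + cb_h - 1),
            (x_cb - 1, y_cb + cb_h),
        ],
        # Positions evaluated in the image that holds the block at the same position.
        "same_position": [
            (x_cb + cb_w, y_cb + cb_h),
            (x_cb // 2, y_cb // 2),  # as written in the text
        ],
    }
```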

[0319] Neighboring blocks can be searched in a predefined order, and triangle merge candidates can be constructed into a triangle merge candidate list in a predefined order. For example, triangle merge candidates can be searched in the order of B1, A1, B0, A0, C0, B2, and C1 to construct the triangle merge candidate list.

[0320] The motion information of the triangle prediction units can be derived based on the triangle merging candidate list. That is, triangle prediction units can share a single triangle merging candidate list.

[0321] To derive the motion information of the triangle prediction units, information specifying at least one of the triangle merging candidates included in the triangle merging candidate list can be signaled via the bitstream. For example, the index information merge_triangle_idx specifying at least one of the triangle merging candidates can be signaled via the bitstream.

[0322] The index information can specify a combination of merge candidates for the first triangular prediction unit and merge candidates for the second triangular prediction unit. For example, Table 1 below shows an example of a combination of merge candidates based on the index information merge_triangle_idx.

[0323] Table 1

[0324]

merge_triangle_idx | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
First triangle prediction unit | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | 3
Second triangle prediction unit | 0 | 1 | 2 | 1 | 0 | 3 | 4 | 0 | 0

merge_triangle_idx | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17
First triangle prediction unit | 4 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1
Second triangle prediction unit | 0 | 2 | 2 | 2 | 4 | 3 | 3 | 4 | 4

merge_triangle_idx | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26
First triangle prediction unit | 1 | 2 | 2 | 2 | 4 | 3 | 3 | 3 | 4
Second triangle prediction unit | 3 | 1 | 0 | 1 | 3 | 0 | 2 | 4 | 0

merge_triangle_idx | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35
First triangle prediction unit | 3 | 2 | 4 | 4 | 2 | 4 | 3 | 4 | 3
Second triangle prediction unit | 1 | 3 | 1 | 1 | 3 | 2 | 2 | 3 | 1

merge_triangle_idx | 36 | 37 | 38 | 39
First triangle prediction unit | 2 | 2 | 4 | 3
Second triangle prediction unit | 4 | 4 | 2 | 4

[0325] According to Table 1, a value of 0 in the index information `merge_triangle_idx` indicates that the motion information of the first triangular prediction unit is derived from the merge candidate at index 1, and the motion information of the second triangular prediction unit is derived from the merge candidate at index 0. In this way, both the triangle merging candidate used to derive the motion information of the first triangular prediction unit and the triangle merging candidate used to derive the motion information of the second triangular prediction unit can be determined through the index information `merge_triangle_idx`.

[0326] The partitioning type of the coding block using diagonal partitioning can also be determined based on the index information. That is, the index information can specify a combination of the merging candidates for the first triangular prediction unit, the merging candidates for the second triangular prediction unit, and the partitioning direction of the coding block. When determining the partitioning type of the coding block based on the index information, the information indicating the diagonal direction of the partitioned coding block, `triangle_partition_type_flag`, does not need to be encoded. Table 2 shows the partitioning types of the coding block based on the index information `merge_triangle_idx`.

[0327] Table 2

[0328]

merge_triangle_idx | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
TriangleDir | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 0

merge_triangle_idx | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17
TriangleDir | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1

merge_triangle_idx | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26
TriangleDir | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1

merge_triangle_idx | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35
TriangleDir | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0

merge_triangle_idx | 36 | 37 | 38 | 39
TriangleDir | 0 | 1 | 0 | 0

[0329] When the variable TriangleDir is 0, it indicates that the coding block uses the left triangle partitioning type; when TriangleDir is 1, it indicates that the coding block uses the right triangle partitioning type. By combining Tables 1 and 2, it can be set to specify the merging candidates for the first triangle prediction unit, the merging candidates for the second triangle prediction unit, and the partitioning direction of the coding block based on the index information merge_triangle_idx.
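
Combining Tables 1 and 2 amounts to a three-way lookup keyed by merge_triangle_idx. The sketch below transcribes the values of both tables; the list names and the function name are hypothetical.

```python
# Values transcribed from Table 1 (first / second unit) and Table 2 (TriangleDir).
FIRST_UNIT = [1, 0, 0, 0, 2, 0, 0, 1, 3, 4, 0, 1, 1, 0, 0, 1, 1, 1, 1, 2,
              2, 2, 4, 3, 3, 3, 4, 3, 2, 4, 4, 2, 4, 3, 4, 3, 2, 2, 4, 3]
SECOND_UNIT = [0, 1, 2, 1, 0, 3, 4, 0, 0, 0, 2, 2, 2, 4, 3, 3, 4, 4, 3, 1,
               0, 1, 3, 0, 2, 4, 0, 1, 3, 1, 1, 3, 2, 2, 3, 1, 4, 4, 2, 4]
TRIANGLE_DIR = [0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1,
                1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0]


def decode_merge_triangle_idx(merge_triangle_idx):
    """Return (first candidate index, second candidate index, TriangleDir)."""
    return (FIRST_UNIT[merge_triangle_idx],
            SECOND_UNIT[merge_triangle_idx],
            TRIANGLE_DIR[merge_triangle_idx])
```

The 40 entries cover every ordered pair of distinct candidates out of five (20 pairs), each with both partitioning directions, so a single index is enough to convey all three choices.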

[0330] As another example, index information can be signaled only for either the first or the second triangle prediction unit, and the index of the triangle merging candidate for the other triangle prediction unit can be determined based on this index information. For example, the triangle merging candidate for the first triangle prediction unit can be determined based on the index information merge_triangle_idx representing the index of any one of the triangle merging candidates. Furthermore, the triangle merging candidate for the second triangle prediction unit can be specified based on merge_triangle_idx. For example, the triangle merging candidate for the second triangle prediction unit can be derived by adding an offset to or subtracting an offset from the index information merge_triangle_idx. The offset can be an integer such as 1 or 2. For example, the triangle merging candidate for the second triangle prediction unit can be determined as the triangle merging candidate with an index of merge_triangle_idx plus 1. When merge_triangle_idx indicates the triangle merging candidate with the largest index value among the triangle merging candidates, the motion information of the second triangle prediction unit can be derived from the triangle merging candidate with index 0 or the triangle merging candidate with an index obtained by subtracting 1 from merge_triangle_idx.
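
The offset-based derivation can be sketched as follows for an offset of 1. The function name and the `wrap_to_zero` switch, which selects between the two fallbacks mentioned for the largest index, are assumptions for illustration.

```python
def second_candidate_index(merge_triangle_idx, num_cands, wrap_to_zero=True):
    """Derive the second unit's candidate index from the signaled index."""
    last = num_cands - 1
    if merge_triangle_idx == last:
        # Largest index: fall back to index 0, or to the index minus 1.
        return 0 if wrap_to_zero else merge_triangle_idx - 1
    return merge_triangle_idx + 1
```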

[0331] Alternatively, motion information for the second triangle prediction unit can be derived from a triangle merging candidate having the same reference image as the triangle merging candidate of the first triangle prediction unit specified according to the index information. Here, a triangle merging candidate having the same reference image as the triangle merging candidate of the first triangle prediction unit can represent a triangle merging candidate for which at least one of the L0 reference image or the L1 reference image is the same as that of the triangle merging candidate of the first triangle prediction unit. When multiple triangle merging candidates exist that have the same reference image as the triangle merging candidate of the first triangle prediction unit, any one can be selected based on at least one of whether the merging candidate includes bidirectional motion information or the difference between the index of the merging candidate and the index information.

[0332] As another example, index information can be transmitted via signals for the first triangle prediction unit and the second triangle prediction unit, respectively. For instance, a first index information 1st_merge_idx for determining triangle merging candidates for the first triangle prediction unit and a second index information 2nd_merge_idx for determining triangle merging candidates for the second triangle prediction unit can be transmitted via signals through a bitstream. The motion information of the first triangle prediction unit can be derived from the triangle merging candidates determined based on the first index information 1st_merge_idx, and the motion information of the second triangle prediction unit can be derived from the triangle merging candidates determined based on the second index information 2nd_merge_idx.

[0333] The first index information 1st_merge_idx can represent any index among the triangle merging candidates included in the triangle merging candidate list. The triangle merging candidate of the first triangle prediction unit can be determined as the triangle merging candidate pointed to by the first index information 1st_merge_idx.

[0334] The triangle merging candidate pointed to by the first index information 1st_merge_idx is set as unavailable as the triangle merging candidate of the second triangle prediction unit. Therefore, the second index information 2nd_merge_idx of the second triangle prediction unit can indicate the index of any one of the remaining triangle merging candidates other than the triangle merging candidate pointed to by the first index information. When the value of the second index information 2nd_merge_idx is less than the value of the first index information 1st_merge_idx, the triangle merging candidate of the second triangle prediction unit can be determined as the triangle merging candidate whose index is the value of the second index information 2nd_merge_idx. On the other hand, when the value of the second index information 2nd_merge_idx is the same as or greater than the value of the first index information 1st_merge_idx, the triangle merging candidate of the second triangle prediction unit can be determined as the triangle merging candidate whose index is the value of the second index information 2nd_merge_idx plus 1.
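
The mapping from the two signaled indices to candidate indices can be sketched as follows. Because Python identifiers cannot begin with a digit, 1st_merge_idx and 2nd_merge_idx are renamed `first_merge_idx` and `second_merge_idx`; the function name is hypothetical.

```python
def decode_triangle_indices(first_merge_idx, second_merge_idx):
    """Map the two signaled values to indices in the triangle merging candidate list."""
    first = first_merge_idx
    if second_merge_idx < first_merge_idx:
        second = second_merge_idx
    else:
        # Skip over the candidate already taken by the first prediction unit.
        second = second_merge_idx + 1
    return first, second
```

With five candidates, the second index only needs to take the values 0 to 3, and the derived candidate never coincides with that of the first prediction unit.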

[0335] Alternatively, the decision to signal the second index information can be determined based on the number of triangle merging candidates included in the triangle merging candidate list. For example, if the maximum number of triangle merging candidates that the triangle merging candidate list can include is no more than 2, signaling the second index information can be skipped. When signaling the second index information is skipped, the second triangle merging candidate can be derived by adding or subtracting an offset from the first index information. For example, when the maximum number of triangle merging candidates that the triangle merging candidate list can include is 2 and the first index information is index 0, the second triangle merging candidate can be derived by adding 1 to the first index information. Alternatively, when the maximum number of triangle merging candidates that the triangle merging candidate list can include is 2 and the first index information is 1, the second triangle merging candidate can be derived by subtracting 1 from the first index information.

[0336] Alternatively, when skipping the signal transmission of the second index information, the second index information can be set to a default value. The default value can be 0. By comparing the first and second index information, second triangle merging candidates can be derived. For example, when the second index information is less than the first index information, the merging candidate for index 0 is set as the second triangle merging candidate; when the second index information is the same as or greater than the first index information, the merging candidate for index 1 is set as the second triangle merging candidate.
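
For a list that holds at most two triangle merging candidates, the two derivations described above (offset and default value) can be sketched side by side. The function names are hypothetical, and the default value of 0 follows the text.

```python
def second_by_offset(first_idx):
    """Add 1 when the first index is 0, subtract 1 when it is 1."""
    return first_idx + 1 if first_idx == 0 else first_idx - 1


def second_by_default(first_idx, default_second_idx=0):
    """Compare the default second index with the first index."""
    # Smaller than the first index selects candidate 0, otherwise candidate 1.
    return 0 if default_second_idx < first_idx else 1
```

Both derivations select the candidate that the first prediction unit did not use, so they give the same result for a two-candidate list.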

[0337] When a triangle merging candidate has unidirectional motion information, that unidirectional motion information is set as the motion information of the triangle prediction unit. Conversely, when a triangle merging candidate has bidirectional motion information, only either L0 motion information or L1 motion information is set as the motion information of the triangle prediction unit. The choice between L0 and L1 motion information can be determined based on the index of the triangle merging candidate or the motion information of another triangle prediction unit.

[0338] For example, when the index of a triangle merging candidate is even, the L0 motion information of the triangle prediction unit is set to 0, and the L1 motion information of the triangle merging candidate is set as the L1 motion information of the triangle prediction unit. Conversely, when the index of a triangle merging candidate is odd, the L1 motion information of the triangle prediction unit is set to 0, and the L0 motion information of the triangle merging candidate is set as the L0 motion information of the triangle prediction unit. The opposite assignment is also possible: when the index of a triangle merging candidate is even, the L0 motion information of the triangle merging candidate is set as the L0 motion information of the triangle prediction unit, and when the index of a triangle merging candidate is odd, the L1 motion information of the triangle merging candidate is set as the L1 motion information of the triangle prediction unit. Alternatively, for the first triangle prediction unit, when the index of the triangle merging candidate is even, the L0 motion information of the triangle merging candidate is set as the L0 motion information of the first triangle prediction unit; on the other hand, for the second triangle prediction unit, when the index of the triangle merging candidate is odd, the L1 motion information of the triangle merging candidate is set as the L1 motion information of the second triangle prediction unit.

[0339] Alternatively, when the first triangle prediction unit has L0 motion information, the L0 motion information of the second triangle prediction unit is set to 0, and the L1 motion information of the triangle merging candidate is set as the L1 motion information of the second triangle prediction unit. On the other hand, when the first triangle prediction unit has L1 motion information, the L1 motion information of the second triangle prediction unit is set to 0, and the L0 motion information of the triangle merging candidate is set as the L0 motion information of the second triangle prediction unit.

[0340] The triangle merging candidate list used to derive motion information of the first triangle prediction unit and the triangle merging candidate list used to derive motion information of the second triangle prediction unit can also be set to be different.

[0341] For example, when specifying triangle merging candidates for deriving motion information of the first triangle prediction unit within the triangle merging candidate list based on index information associated with the first triangle prediction unit, the motion information of the second triangle prediction unit can be derived using a triangle merging list that includes the remaining triangle merging candidates other than those indicated by the index information. Specifically, the motion information of the second triangle prediction unit can be derived from any of the remaining triangle merging candidates.

[0342] Therefore, the maximum number of triangle merging candidates included in the triangle merging candidate list of the first triangle prediction unit may differ from the maximum number of triangle merging candidates included in the triangle merging candidate list of the second triangle prediction unit. For example, when the triangle merging candidate list of the first triangle prediction unit includes M merging candidates, the triangle merging candidate list of the second triangle prediction unit may include the M-1 merging candidates other than the one indicated by the index information of the first triangle prediction unit.
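The reduced list for the second triangle prediction unit can be sketched as below. This is a minimal example in which `second_unit_candidates` is a hypothetical name and the candidates are plain list elements.

```python
from typing import List, TypeVar

T = TypeVar("T")


def second_unit_candidates(candidates: List[T], first_idx: int) -> List[T]:
    """Return the M-1 candidates left after removing the first unit's choice."""
    return candidates[:first_idx] + candidates[first_idx + 1:]
```

Indexing this reduced list with the second index information gives the same candidate as adding 1 to any second index that is equal to or greater than the first index.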

[0343] As another example, merging candidates for each triangular prediction unit can be derived based on neighboring blocks adjacent to the coded block, and the availability of neighboring blocks can be determined by taking into account the shape or position of the triangular prediction unit.

[0344] Figure 27 is a diagram illustrating an example of determining the availability of neighboring blocks for each triangular prediction unit.

[0345] Neighboring blocks that are not adjacent to the first triangle prediction unit can be set as neighboring blocks that are unavailable to the first triangle prediction unit, and neighboring blocks that are not adjacent to the second triangle prediction unit can be set as neighboring blocks that are unavailable to the second triangle prediction unit.

[0346] For example, as in the example shown in Figure 27(a), when the left triangle partitioning type is applied to the coding block, it can be determined that blocks A1, A0, and A2, which among the neighboring blocks of the coding block are adjacent to the first triangle prediction unit, are available for the first triangle prediction unit, while blocks B0 and B1 are unavailable for the first triangle prediction unit. Therefore, the triangle merging candidate list associated with the first triangle prediction unit includes triangle merging candidates derived from blocks A1, A0, and A2, but excludes triangle merging candidates derived from blocks B0 and B1.

[0347] As in the example shown in Figure 27(b), when the left triangle partitioning type is applied to the coding block, it can be determined that blocks B0 and B1, which are adjacent to the second triangle prediction unit, are available for the second triangle prediction unit, while blocks A1, A0, and A2 are unavailable for the second triangle prediction unit. Therefore, the triangle merging candidate list associated with the second triangle prediction unit includes triangle merging candidates derived from blocks B0 and B1, but excludes triangle merging candidates derived from blocks A1, A0, and A2.

[0348] Therefore, the number or range of triangle merging candidates that a triangle prediction unit can use can be determined based on at least one of the location of the triangle prediction unit or the partitioning type of the coding block.
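The availability rule of the Figure 27 example can be sketched as a lookup table. This is an illustration for the left triangle partitioning type only, since the text gives no block assignment for the right triangle type. The names `LEFT_TRIANGLE_AVAILABLE` and `usable_candidates` are hypothetical.

```python
from typing import Dict, Tuple

# Neighboring blocks treated as available for each prediction unit
# when the left triangle partitioning type is applied (Figure 27 example).
LEFT_TRIANGLE_AVAILABLE: Dict[str, Tuple[str, ...]] = {
    "first": ("A1", "A0", "A2"),
    "second": ("B0", "B1"),
}


def usable_candidates(neighbor_candidates: Dict[str, object], unit: str) -> Dict[str, object]:
    """Keep only candidates derived from blocks adjacent to the given unit."""
    allowed = LEFT_TRIANGLE_AVAILABLE[unit]
    return {name: mi for name, mi in neighbor_candidates.items() if name in allowed}
```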

[0349] As another example, the merging mode can be applied to only one of the first triangle prediction unit and the second triangle prediction unit. The motion information of the other triangle prediction unit can then be set to be the same as the motion information of the triangle prediction unit to which the merging mode is applied, or can be derived by refining the motion information of the triangle prediction unit to which the merging mode is applied.

[0350] For example, the motion vector and reference image index of the first triangle prediction unit can be derived based on the triangle merging candidate. The motion vector of the first triangle prediction unit can be refined to derive the motion vector of the second triangle prediction unit. For example, the motion vector of the second triangle prediction unit can be derived by adding the refinement motion vector {Rx, Ry} to, or subtracting it from, the motion vector {mvD1LXx, mvD1LXy} of the first triangle prediction unit. The reference image index of the second triangle prediction unit can be set to be the same as that of the first triangle prediction unit.

[0351] Information for determining the refinement motion vector, which represents the difference between the motion vector of the first triangle prediction unit and the motion vector of the second triangle prediction unit, can be signaled via the bitstream. This information may include at least one of information representing the magnitude of the refinement motion vector or information representing the sign of the refinement motion vector.
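The refinement step can be sketched as follows. This is a minimal example assuming the magnitude and sign are given separately for each component, with the sign encoded as +1 or -1. Both the per-component layout and the function name are assumptions.

```python
from typing import Tuple

Vec = Tuple[int, int]


def refine_motion_vector(mv1: Vec, magnitude: Vec, sign: Vec) -> Vec:
    """Derive the second unit's motion vector as mv1 + sign * magnitude."""
    (mx, my), (rx, ry), (sx, sy) = mv1, magnitude, sign
    if sx not in (-1, 1) or sy not in (-1, 1):
        raise ValueError("sign components are assumed to be +1 or -1")
    return (mx + sx * rx, my + sy * ry)
```

The reference image index needs no separate derivation in this sketch, because it is copied from the first triangle prediction unit.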

[0352] Alternatively, the sign of the refinement motion vector can be derived based on at least one of the position or index of the triangular prediction unit, or the partition type applied to the coding block.

[0353] As another example, the motion vector and reference image index of either the first triangle prediction unit or the second triangle prediction unit can be signaled. The motion vector of the other triangle prediction unit can be derived by refining the signaled motion vector.

[0354] For example, the motion vector and reference image index of the first triangle prediction unit can be determined based on information signaled in the bitstream. Furthermore, the motion vector of the second triangle prediction unit can be derived by refining the motion vector of the first triangle prediction unit. For instance, the motion vector of the second triangle prediction unit can be derived by adding the refinement motion vector {Rx, Ry} to, or subtracting it from, the motion vector {mvD1LXx, mvD1LXy} of the first triangle prediction unit. The reference image index of the second triangle prediction unit can be set to be the same as that of the first triangle prediction unit.

[0355] Motion-compensated prediction can be performed on the coding block based on the motion information of the first and second triangular prediction units. In this case, image quality degradation may occur at the boundary between the first and second triangular prediction units. For example, an edge may appear along the boundary between the first and second triangular prediction units, breaking the continuity of image quality. To reduce image quality degradation at the boundary, prediction samples can be derived through smoothing filtering or weighted prediction.

[0356] Prediction samples in a coding block to which diagonal partitioning is applied can be derived as a weighted sum of a first prediction sample obtained based on the motion information of the first triangular prediction unit and a second prediction sample obtained based on the motion information of the second triangular prediction unit. Alternatively, prediction samples of the first triangular prediction unit can be derived from a first prediction block determined based on the motion information of the first triangular prediction unit, and prediction samples of the second triangular prediction unit can be derived from a second prediction block determined based on the motion information of the second triangular prediction unit. In that case, prediction samples located in the boundary region between the first and second triangular prediction units can be derived as a weighted sum of the first prediction sample included in the first prediction block and the second prediction sample included in the second prediction block. For example, Equation 6 below illustrates an example of deriving prediction samples of the first and second triangular prediction units.

[0357] [Formula 6]

[0358] P(x, y)=w1*P1(x, y)+(1-w1)*P2(x, y)

[0359] In Equation 6, P1 represents the first prediction sample, and P2 represents the second prediction sample. w1 represents the weight applied to the first prediction sample, and (1-w1) represents the weight applied to the second prediction sample. As shown in Equation 6, the weight applied to the second prediction sample can be derived by subtracting the weight applied to the first prediction sample from a constant value (1 in Equation 6).

[0360] When applying the left triangle partitioning type to a coding block, the boundary region may include predicted samples with the same x-axis and y-axis coordinates. On the other hand, when applying the right triangle partitioning type to a coding block, the boundary region may include predicted samples whose sum of x-axis and y-axis coordinates is above a first threshold and below a second threshold.

[0361] The size of the boundary region can be determined based on at least one of the following: the size of the coding block, the shape of the coding block, the motion information of the triangular prediction unit, the motion vector difference of the triangular prediction unit, the output order of the reference image, or the difference between the first and second prediction samples in the diagonal boundary.

[0362] Figures 28 and 29 are diagrams illustrating examples of deriving a prediction sample based on a weighted sum of a first prediction sample and a second prediction sample. Figure 28 shows an example in which the left triangle partitioning type is applied to a coding block, and Figure 29 shows an example in which the right triangle partitioning type is applied to a coding block. Additionally, Figures 28(a) and 29(a) show the prediction pattern for the luminance component, and Figures 28(b) and 29(b) show the prediction pattern for the chrominance component.

[0363] In the diagram shown, the numbers written in the prediction samples near the boundaries of the first and second prediction units represent the weighting values applied to the first prediction sample. For example, when the number written in the prediction sample is N, a weighting value of N / 8 is applied to the first prediction sample, and a weighting value of (1-(N / 8)) is applied to the second prediction sample, thereby deriving the prediction sample.
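The N / 8 weighting can be sketched in integer arithmetic as follows. This is an illustration of Equation 6 with w1 = N / 8. The rounding offset of 4 before the shift is an assumption, as the text does not specify a rounding rule, and `blend` is a hypothetical name.

```python
def blend(p1: int, p2: int, n: int) -> int:
    """Weighted sum of two prediction samples with weights N/8 and 1 - N/8."""
    if not 0 <= n <= 8:
        raise ValueError("N must lie in the range 0..8")
    # (N * P1 + (8 - N) * P2) / 8, with an assumed rounding offset of 4.
    return (n * p1 + (8 - n) * p2 + 4) >> 3
```

Setting N to 8 or 0 returns the first or second prediction sample unchanged, which matches the behavior described for non-boundary regions.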

[0364] In non-boundary regions, either the first or the second prediction sample can be determined as the prediction sample. Referring to the example of Figure 28, in regions where the absolute value of the difference between the x-axis and y-axis coordinates is greater than a threshold and which belong to the first triangular prediction unit, a first prediction sample derived from the motion information of the first triangular prediction unit can be determined as the prediction sample. On the other hand, in regions where the absolute value of the difference between the x-axis and y-axis coordinates is greater than the threshold and which belong to the second triangular prediction unit, a second prediction sample derived from the motion information of the second triangular prediction unit can be determined as the prediction sample.

[0365] Referring to the example of Figure 29, in regions where the sum of the x-axis and y-axis coordinates is less than a first threshold, a first prediction sample derived from the motion information of the first triangular prediction unit can be determined as the prediction sample. On the other hand, in regions where the sum of the x-axis and y-axis coordinates is greater than a second threshold, a second prediction sample derived from the motion information of the second triangular prediction unit can be determined as the prediction sample.

[0366] The threshold for identifying non-boundary regions can be determined based on at least one of the size of the coded block, the shape of the coded block, or the color components. For example, when the threshold associated with the luminance component is set to N, the threshold associated with the chrominance component can be set to N / 2.
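The region test for the right triangle partitioning type can be sketched as below. This is an illustration in which samples whose coordinate sum equals either threshold are treated as boundary samples, and the chrominance threshold uses integer division. Both choices, and the function names, are assumptions.

```python
def classify_right_partition(x: int, y: int, first_threshold: int, second_threshold: int) -> str:
    """Classify a sample position for the right triangle partitioning type."""
    s = x + y
    if s < first_threshold:
        return "first"      # use the first prediction sample as it is
    if s > second_threshold:
        return "second"     # use the second prediction sample as it is
    return "boundary"       # derive the sample as a weighted sum


def chroma_threshold(luma_threshold: int) -> int:
    """Chrominance threshold N / 2 for a luminance threshold N (integer division assumed)."""
    return luma_threshold // 2
```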

[0367] The predicted samples included in the boundary region can be derived based on a weighted sum of the first and second predicted samples. In this case, the weighting value applied to the first and second predicted samples can be determined based on at least one of the position of the predicted sample, the size of the coding block, the shape of the coding block, or the color component of the coding block.

[0368] For example, as in the example shown in Figure 28(a), prediction samples with the same x-axis and y-axis coordinates can be derived by applying the same weighting values to the first and second prediction samples. A prediction sample for which the absolute value of the difference between the x-axis and y-axis coordinates is 1 can be derived by setting the weighting ratio applied to the first and second prediction samples to (3:1) or (1:3). Furthermore, a prediction sample for which the absolute value of the difference between the x-axis and y-axis coordinates is 2 can be derived by setting the weighting ratio applied to the first and second prediction samples to (7:1) or (1:7).

[0369] Alternatively, as in the example shown in Figure 28(b), prediction samples with the same x-axis and y-axis coordinates can be derived by applying the same weighting values to the first and second prediction samples, and prediction samples for which the absolute value of the difference between the x-axis and y-axis coordinates is 1 can be derived by setting the weighting ratio applied to the first and second prediction samples to (7:1) or (1:7).

[0370] For example, as in the example shown in Figure 29(a), a prediction sample whose sum of x-axis and y-axis coordinates is 1 less than the width or height of the coding block can be derived by applying the same weighting values to the first and second prediction samples. A prediction sample whose sum of x-axis and y-axis coordinates is equal to, or 2 less than, the width or height of the coding block can be derived by setting the weighting ratio applied to the first and second prediction samples to (3:1) or (1:3). A prediction sample whose sum of x-axis and y-axis coordinates is 1 greater than, or 3 less than, the width or height of the coding block can be derived by setting the weighting ratio applied to the first and second prediction samples to (7:1) or (1:7).

[0371] Alternatively, as in the example shown in Figure 29(b), a prediction sample whose sum of x-axis and y-axis coordinates is 1 less than the width or height of the coding block can be derived by applying the same weighting values to the first and second prediction samples. A prediction sample whose sum of x-axis and y-axis coordinates is equal to, or 2 less than, the width or height of the coding block can be derived by setting the weighting ratio applied to the first and second prediction samples to (7:1) or (1:7).
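The Figure 29 weighting pattern can be sketched as a table keyed by the distance from the anti-diagonal. This is an illustration for a square block, written in the N / 8 notation of the figures. The direction of each ratio is an assumption based on the rule that the first prediction sample dominates where the coordinate sum is smaller, and the names are hypothetical.

```python
# N for the first prediction sample, keyed by d = (x + y) - (width - 1).
LUMA_N = {-2: 7, -1: 6, 0: 4, 1: 2, 2: 1}   # ratios 7:1, 3:1, 1:1, 1:3, 1:7
CHROMA_N = {-1: 7, 0: 4, 1: 1}              # ratios 7:1, 1:1, 1:7


def first_sample_weight_n(x: int, y: int, width: int, chroma: bool = False) -> int:
    """Return N, where N / 8 is the weight applied to the first prediction sample."""
    d = (x + y) - (width - 1)  # signed distance from the anti-diagonal
    table = CHROMA_N if chroma else LUMA_N
    if d in table:
        return table[d]
    # Outside the boundary region only one prediction sample is used.
    return 8 if d < 0 else 0
```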

[0372] As another example, the location of the predicted sample or the shape of the coding block can be considered to determine the weighting value. Equations 7 through 9 show examples of deriving the weighting value when a left-triangle partitioning type is applied to the coding block. Equation 7 shows an example of deriving the weighting value applied to the first predicted sample when the coding block is square in shape.

[0373] [Formula 7]

[0374] w1 = (x - y + 4) / 8

[0375] In Equation 7, x and y represent the positions of the predicted samples. When the coded block is not square, the weighting applied to the first predicted sample can be derived as shown in Equation 8 or 9. Equation 8 shows the case where the width of the coded block is greater than its height, and Equation 9 shows the case where the width of the coded block is less than its height.

[0376] [Formula 8]

[0377] w1=((x / whRatio)-y+4) / 8

[0378] [Formula 9]

[0379] w1 = (x - (y * whRatio) + 4) / 8
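Equations 7 to 9 can be evaluated as in the sketch below. Two assumptions are made: whRatio is taken to be the real-valued ratio CbW / CbH, which is not defined in this passage but keeps the weighting centered on the block diagonal, and the result is clamped to the range 0 to 1 so that it remains a valid weight. The function name is hypothetical.

```python
def w1_left(x: int, y: int, cb_w: int, cb_h: int) -> float:
    """Weight of the first prediction sample, left triangle partitioning type."""
    wh_ratio = cb_w / cb_h  # assumed definition of whRatio
    if cb_w == cb_h:
        w = (x - y + 4) / 8                      # Equation 7
    elif cb_w > cb_h:
        w = ((x / wh_ratio) - y + 4) / 8         # Equation 8
    else:
        w = (x - (y * wh_ratio) + 4) / 8         # Equation 9
    return min(1.0, max(0.0, w))                 # clamping is an assumption
```

Under these assumptions, samples on the diagonal of the block receive a weight of 0.5 for square and non-square blocks alike.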

[0380] When the right triangle partitioning type is applied to the coded block, the weights applied to the first predicted sample can be determined as shown in Equations 10 to 12. Equation 10 shows an example of deriving the weights applied to the first predicted sample when the coded block is square in shape.

[0381] [Formula 10]

[0382] w1 = ((CbW - 1 - x - y) + 4) / 8

[0383] In Equation 10, CbW represents the width of the coded block. When the coded block is not square, the weighting applied to the first predicted sample can be derived as shown in Equation 11 or 12. Equation 11 shows the case where the width of the coded block is greater than its height, and Equation 12 shows the case where the width of the coded block is less than its height.

[0384] [Formula 11]

[0385] w1 = ((CbH - 1 - (x / whRatio) - y) + 4) / 8

[0386] [Formula 12]

[0387] w1 = ((CbW - 1 - x - (y * whRatio)) + 4) / 8

[0388] In Equation 11, CbH represents the height of the coding block.
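Equations 10 to 12 can be evaluated in the same way. As before, the sketch assumes that whRatio is the real-valued ratio CbW / CbH and that the result is clamped to the range 0 to 1. Neither assumption is stated in this passage, and the function name is hypothetical.

```python
def w1_right(x: int, y: int, cb_w: int, cb_h: int) -> float:
    """Weight of the first prediction sample, right triangle partitioning type."""
    wh_ratio = cb_w / cb_h  # assumed definition of whRatio
    if cb_w == cb_h:
        w = ((cb_w - 1 - x - y) + 4) / 8                    # Equation 10
    elif cb_w > cb_h:
        w = ((cb_h - 1 - (x / wh_ratio) - y) + 4) / 8       # Equation 11
    else:
        w = ((cb_w - 1 - x - (y * wh_ratio)) + 4) / 8       # Equation 12
    return min(1.0, max(0.0, w))                            # clamping is an assumption


def w2_right(x: int, y: int, cb_w: int, cb_h: int) -> float:
    """Weight of the second prediction sample, 1 - w1 as in Equation 6."""
    return 1.0 - w1_right(x, y, cb_w, cb_h)
```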

[0389] As shown in the example, for the predicted samples within the boundary region, the samples included in the first triangular prediction unit can be derived by assigning a larger weighting value to the first predicted sample than to the second predicted sample, and the samples included in the second triangular prediction unit can be derived by assigning a larger weighting value to the second predicted sample than to the first predicted sample.

[0390] When diagonal partitioning is applied to a coding block, the coding block can be set so that the combined prediction mode, which combines an intra-prediction mode and the merging mode, is not applied.

[0391] Intra-frame prediction uses reconstructed samples that have already been encoded / decoded from the surrounding blocks to predict the current block. In this case, intra-frame prediction of the current block can use reconstructed samples before the application of the in-loop filter.

[0392] Intra-prediction techniques include matrix-based intra-prediction and general intra-prediction, which takes into account the directionality with respect to surrounding reconstructed samples. Information indicating the intra-prediction technique for the current block can be signaled via the bitstream. This information may be a 1-bit flag. Alternatively, the intra-prediction technique for the current block can be determined based on at least one of the position, size, or shape of the current block, or the intra-prediction technique of a neighboring block. For example, when the current block crosses an image boundary, the current block is set so that matrix-based intra-prediction is not applied.

[0393] Matrix-based intra-frame prediction is a method that obtains the prediction block of the current block by multiplying a matrix stored in both the encoder and the decoder with the reconstructed samples surrounding the current block. Information specifying any one of the stored matrices can be signaled via the bitstream. The decoder can then determine the matrix for intra-frame prediction of the current block based on this information and the size of the current block.

[0394] General intra-frame prediction is a method that uses either a non-angular intra-frame prediction mode or an angular intra-frame prediction mode to obtain the prediction block of the current block.

[0395] The residual image can be derived by subtracting the predicted image from the original image. In this case, when the residual image is transformed into the frequency domain, even if high-frequency components are removed, the subjective image quality of the video is not significantly degraded. Therefore, reducing the value of high-frequency components or setting the value of high-frequency components to 0 can improve compression efficiency without causing significant visual distortion. Reflecting these characteristics, the current block can be transformed to decompose the residual image into 2D frequency components. This transformation can be performed using transformation techniques such as Discrete Cosine Transform (DCT) or Discrete Sine Transform (DST).
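The idea of removing high-frequency components can be sketched with a one-dimensional orthonormal DCT-II. This is a simplified illustration only: the text describes a 2D transform of the residual, and the function names and the `keep` parameter are hypothetical.

```python
import math
from typing import List


def dct2(x: List[float]) -> List[float]:
    """Orthonormal 1D DCT-II of a residual row."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, v in enumerate(x)))
    return out


def idct2(coeffs: List[float]) -> List[float]:
    """Inverse of dct2 (the transpose of the orthonormal DCT-II matrix)."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c
                * math.cos(math.pi * (i + 0.5) * k / n)
                for k, c in enumerate(coeffs))
            for i in range(n)]


def drop_high_frequencies(coeffs: List[float], keep: int) -> List[float]:
    """Set every coefficient from position `keep` onward to 0."""
    return [c if k < keep else 0.0 for k, c in enumerate(coeffs)]
```

For a smooth residual, most of the energy sits in the first few coefficients, so zeroing the remainder changes the reconstructed values only slightly while leaving fewer non-zero coefficients to code.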

[0396] For some blocks of the residual image, the 2D transform may not be performed. Omitting the 2D transform in this way is called transform skipping. When transform skipping is applied, quantization can be applied to the residual values on which no transform was performed.

[0397] After the current block is transformed using DCT or DST, the transformed current block can be transformed again. In this case, the DCT-based or DST-based transform can be defined as the primary transform, and the process of transforming again the block to which the primary transform has been applied can be called the secondary transform.

[0398] The primary transform can be performed using any one of a plurality of transform kernel candidates. For example, the primary transform can be performed using any one of DCT2, DCT8, or DCT7.

[0399] Different transform kernels can be used for the horizontal and vertical directions. Information representing the combination of the horizontal transform kernel and the vertical transform kernel can also be signaled via the bitstream.

[0400] The units on which the primary and secondary transforms are performed may differ. For example, the primary transform can be performed on an 8×8 block, and the secondary transform can be performed on a 4×4 sub-block within the transformed 8×8 block. In this case, the transform coefficients of the remaining regions, on which the secondary transform is not performed, can also be set to 0.

[0401] Alternatively, the primary transform can be performed on a 4×4 block, and the secondary transform can be performed on an 8×8 region that includes the transformed 4×4 block.

[0402] Information indicating whether the secondary transform is performed can be signaled via the bitstream.

[0403] Alternatively, whether to perform the secondary transform can be determined based on whether the horizontal transform kernel and the vertical transform kernel are the same. For example, the secondary transform can be performed only when the horizontal and vertical transform kernels are the same. Alternatively, the secondary transform can be performed only when the horizontal and vertical transform kernels are different.

[0404] Alternatively, secondary transformations are permitted only when the horizontal and vertical transformations utilize predefined transformation kernels. For example, secondary transformations are permitted when the DCT2 transformation kernel is used for both horizontal and vertical transformations.

[0405] Alternatively, whether to perform the secondary transform can be determined based on the number of non-zero transform coefficients in the current block. For example, if the number of non-zero transform coefficients in the current block is less than or equal to a threshold, the secondary transform can be set not to be used, and if the number of non-zero transform coefficients in the current block is greater than the threshold, the secondary transform can be used. Alternatively, the secondary transform can be set to be used only if the current block is coded with intra-frame prediction.
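The alternative conditions for the secondary transform can be sketched as separate predicates. This is an illustration in which each function models one of the alternatives described above. The function names and the string labels for the kernels are hypothetical.

```python
from typing import Iterable


def allowed_when_kernels_match(h_kernel: str, v_kernel: str) -> bool:
    """Variant: secondary transform only when both directions use the same kernel."""
    return h_kernel == v_kernel


def allowed_for_predefined_kernel(h_kernel: str, v_kernel: str, required: str = "DCT2") -> bool:
    """Variant: secondary transform only when both directions use a predefined kernel."""
    return h_kernel == required and v_kernel == required


def allowed_by_coefficient_count(coeffs: Iterable[int], threshold: int) -> bool:
    """Variant: secondary transform only when more than `threshold` coefficients are non-zero."""
    return sum(1 for c in coeffs if c != 0) > threshold
```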

[0406] The decoder can perform the inverse of the secondary transform (second inverse transform), and the inverse of the primary transform (first inverse transform) can be performed on the result. The residual signal of the current block can be obtained as a result of the second inverse transform and the first inverse transform.

[0407] If transform and quantization are performed in the encoder, the decoder can obtain the residual block through inverse quantization and inverse transform. The decoder then adds the predicted block and the residual block together to obtain the reconstructed block of the current block.
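The reconstruction step can be sketched as a sample-wise addition. In this minimal example, clipping the sum to the valid sample range for an assumed bit depth is an addition not stated above, and `reconstruct` is a hypothetical name.

```python
from typing import List

Block = List[List[int]]


def reconstruct(pred: Block, resid: Block, bit_depth: int = 8) -> Block:
    """Add the residual block to the prediction block, sample by sample."""
    max_val = (1 << bit_depth) - 1
    # Clipping to [0, max_val] is an assumption made for this sketch.
    return [[min(max_val, max(0, p + r)) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, resid)]
```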

[0408] If a reconstructed block of the current block is obtained, in-loop filtering can be used to reduce information loss during quantization and encoding. In-loop filters can include at least one of a deblocking filter, a sample adaptive offset filter (SAO), or an adaptive loop filter (ALF).

[0409] Applying embodiments described with a focus on the decoding process to the encoding process, or embodiments described with a focus on the encoding process to the decoding process, is also included within the scope of this invention. Modifying embodiments described in a predetermined order so that they are performed in a different order is also included within the scope of this invention.

[0410] The embodiments have been described based on a series of steps or flowcharts, but this does not limit the chronological order of the invention, and the steps can be performed simultaneously or in a different order as needed. Furthermore, in the above embodiments, the structural elements constituting the block diagrams (e.g., units, modules, etc.) can each be implemented as a hardware device or as software, and multiple structural elements can be combined and implemented as a single hardware device or piece of software. The embodiments can be implemented in the form of program instructions, which can be executed by various computer components and recorded in a computer-readable recording medium. The computer-readable recording medium can include program instructions, data files, data structures, etc., individually or in combination. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROMs, RAMs, and flash memory. The hardware device can be configured to operate as one or more software modules to perform the processing according to the invention, and vice versa.

[0411] Industrial applicability

[0412] This invention can be applied to electronic devices that encode / decode video.

Claims

1. A video decoding method, comprising: deriving a merge candidate list of a current block; deriving first motion information of a first prediction unit based on a first merge candidate included in the merge candidate list, wherein the first merge candidate is determined based on first index information; deriving second motion information of a second prediction unit based on a second merge candidate included in the merge candidate list, wherein the second merge candidate is determined based on second index information; and determining a prediction sample of the current block by a weighted sum of a first prediction sample and a second prediction sample, wherein the first prediction sample is derived based on the first motion information and the second prediction sample is derived based on the second motion information, wherein the first index information and the second index information are obtained by decoding a bitstream, and wherein, when the value of the second index information is equal to or greater than the value of the first index information, the second merge candidate is determined to be the merge candidate whose index is obtained by adding 1 to the value of the second index information.

2. The method according to claim 1, wherein, when the value of the second index information is less than the value of the first index information, the second merge candidate is determined to be the merge candidate whose index is indicated by the second index information.

3. The method according to claim 1 or 2, wherein whether to apply a partition-based mode to the current block is determined based on at least one of the slice type, the size of the current block, the shape of the current block, and the prediction mode of the current block.

4. The method according to claim 3, wherein whether to apply the partition-based mode to the current block is determined based on whether the width-to-height ratio of the current block is lower than a first threshold.

5. The method according to claim 3, wherein whether to apply the partition-based mode to the current block is determined based on information indicating whether to apply the partition-based mode to the current block, wherein the information is sequence-level information.

6. The method according to claim 1 or 2, further comprising: determining the partition type of the current block based on information indicating the partition type of the current block, wherein the information is block-level information.

7. A video encoding method, comprising: deriving a merge candidate list of a current block; deriving first motion information of a first prediction unit based on a first merge candidate included in the merge candidate list, wherein the first merge candidate corresponds to first index information; deriving second motion information of a second prediction unit based on a second merge candidate included in the merge candidate list, wherein the second merge candidate corresponds to second index information; determining a prediction sample of the current block by a weighted sum of a first prediction sample and a second prediction sample, wherein the first prediction sample is derived based on the first motion information and the second prediction sample is derived based on the second motion information; and encoding the first index information and the second index information into a bitstream, wherein, when the value of the second index information is equal to or greater than the value of the first index information, the second index information is encoded using the value obtained by subtracting 1 from the index of the second merge candidate.

8. The method according to claim 7, wherein, when the value of the second index information is less than the value of the first index information, the second index information is encoded using a value equal to the index of the second merge candidate.

9. The method according to claim 7 or 8, wherein whether to apply a partition-based mode to the current block is determined based on at least one of the slice type, the size of the current block, the shape of the current block, and the prediction mode of the current block.

10. The method of claim 9, wherein whether to apply the partition-based mode to the current block is determined based on whether the width-to-height ratio of the current block is lower than a first threshold.

11. The method of claim 9, wherein information indicating whether to apply the partition-based mode to the current block is signaled via the bitstream at the sequence level.

12. The method according to claim 7 or 8, further comprising: signaling, via the bitstream at the block level, information indicating the partition type of the current block.

13. A video decoding apparatus, comprising an inter-frame prediction unit, the inter-frame prediction unit being configured to perform the method according to any one of claims 1 to 6.

14. A video encoding apparatus, comprising an inter-frame prediction unit, the inter-frame prediction unit being configured to perform the method according to any one of claims 7 to 12.

15. A method for transmitting a bitstream, characterized in that a bitstream is generated using the video encoding method according to any one of claims 7 to 12, and the bitstream is transmitted.

16. A computer-readable storage medium storing a computer program and a bitstream thereon, characterized in that, when the computer program is executed by a processor, it implements the video encoding method of any one of claims 7 to 12 to generate the bitstream.