Image encoding / decoding method and apparatus for transmitting compressed video data

By employing histogram-based intra prediction mode candidates and interpolation filters, the method enhances compression efficiency and prediction accuracy for high-resolution video, addressing the challenges of high data volumes and costs in high-quality video transmission and storage.

WO2026142389A1 · PCT designated stage · Publication Date: 2026-07-02 · KT CORP

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
KT CORP
Filing Date
2025-12-26
Publication Date
2026-07-02

Abstract

An image decoding method according to the present disclosure may comprise the steps of: decoding an intra prediction mode of the current block; and performing intra prediction on the current block, on the basis of the intra prediction mode. Here, the intra prediction mode of the current block is decoded on the basis of at least one intra prediction mode candidate list, and the at least one intra prediction mode candidate list may comprise intra prediction mode candidates derived from a histogram of the current block.

Description

Video encoding / decoding method and device for transmitting compressed video data

[0001] The present disclosure relates to a video signal processing method and apparatus.

[0002] Recently, the demand for high-resolution, high-quality video, such as HD (High Definition) and UHD (Ultra High Definition) video, has been increasing across various application fields. As video data becomes higher in resolution and quality, the relative volume of data increases compared to conventional video data; consequently, transmission and storage costs increase when video data is transmitted using existing wired or wireless broadband lines or stored using conventional storage media. To address these issues arising from the increase in video data resolution and quality, high-efficiency video compression technologies can be utilized.

[0003] Various video compression technologies exist, such as inter-frame prediction technology that predicts pixel values in the current picture from previous or subsequent pictures, intra-frame prediction technology that predicts pixel values in the current picture using pixel information within the current picture, and entropy coding technology that assigns short codes to values with high frequency and long codes to values with low frequency. By utilizing these video compression technologies, video data can be effectively compressed for transmission or storage.

[0004] Meanwhile, along with the increasing demand for high-resolution video, the demand for stereoscopic video content as a new video service is also rising. Discussions are underway regarding video compression technologies to effectively provide high-resolution and ultra-high-resolution stereoscopic video content.

[0005] The present disclosure aims to provide a method for deriving an intra prediction mode of a current block and an apparatus for the same, by using an intra prediction mode derived based on a histogram as an intra prediction mode candidate.

[0006] The present disclosure aims to provide a method for encoding / decoding an intra prediction mode of a current block using a plurality of intra prediction mode candidate lists, and an apparatus for the same.

[0007] The present disclosure aims to provide a method for selecting one of a plurality of interpolation filter candidates at the decoder side in the same manner as the encoder, and an apparatus for doing the same.

[0008] The present disclosure aims to provide a method for rearranging a transform kernel list and an apparatus for the same.

[0009] The technical problems to be solved in this disclosure are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art to which this disclosure belongs from the description below.

[0010] An image decoding method according to the present disclosure may include: a step of decoding an intra prediction mode of a current block; and a step of performing an intra prediction for the current block based on the intra prediction mode. In this case, the intra prediction mode of the current block is decoded based on at least one intra prediction mode candidate list, and the at least one intra prediction mode candidate list may include intra prediction mode candidates derived from a histogram of the current block.

[0011] In the image decoding method according to the present disclosure, among a plurality of intra prediction mode candidates, a predetermined number of intra prediction mode candidates with small costs are added to a first intra prediction mode candidate list, and the remaining intra prediction mode candidates may be added to a second intra prediction mode candidate list.

[0012] In the image decoding method according to the present disclosure, intra prediction mode candidates included in the intra prediction mode candidate list may be reordered in ascending order of cost.

[0013] In the image decoding method according to the present disclosure, the cost of an intra prediction mode candidate can be calculated based on the difference between the prediction samples obtained by performing intra prediction on the reference region of the current block based on the intra prediction mode candidate and the reconstructed samples within the reference region.

[0014] In the image decoding method according to the present disclosure, reordering may be performed only on some intra prediction mode candidates derived according to a previously defined method among the intra prediction mode candidates.

[0015] In the image decoding method according to the present disclosure, the cost of each intra prediction mode candidate included in the intra prediction mode candidate list is calculated, and the intra prediction mode candidate list may be updated to include only a predefined number of intra prediction mode candidates selected in order of lowest cost.
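
As an illustrative, non-limiting sketch of the cost-based handling of candidates described in paragraphs [0011] to [0015], the following Python fragment computes a cost per candidate, reorders the candidates in ascending order of cost, splits them into a first and a second list, and truncates a list to a predefined size. The sum of absolute differences (SAD) as the cost metric, the function names, the mode numbers, and the sample values are assumptions made for illustration; the disclosure only states that the cost is based on the difference between prediction samples and reconstructed samples in the reference region.

```python
import numpy as np

def template_cost(predicted, reconstructed):
    """SAD between predicted and reconstructed samples of the reference region (assumed metric)."""
    diff = predicted.astype(np.int64) - reconstructed.astype(np.int64)
    return int(np.abs(diff).sum())

def reorder_by_cost(candidates, costs):
    """Candidates in ascending order of cost; ties keep their original order."""
    return sorted(candidates, key=lambda mode: costs[mode])

def split_candidate_lists(candidates, costs, first_size):
    """Lowest-cost candidates go to the first list, the remainder to the second."""
    ordered = reorder_by_cost(candidates, costs)
    return ordered[:first_size], ordered[first_size:]

def keep_lowest_cost(candidates, costs, count):
    """Update a list so it holds only the `count` lowest-cost candidates."""
    return reorder_by_cost(candidates, costs)[:count]

# Hypothetical reference region and per-mode predictions of that region.
recon = np.array([[10, 12], [14, 16]])
preds = {
    0: np.array([[10, 12], [14, 15]]),
    18: np.array([[9, 9], [9, 9]]),
    50: np.array([[12, 12], [12, 12]]),
}
costs = {mode: template_cost(p, recon) for mode, p in preds.items()}
first_list, second_list = split_candidate_lists(list(preds), costs, first_size=2)
```

Because the costs are computed only from already reconstructed samples, an encoder and a decoder running the same procedure would arrive at the same ordering without extra signalling.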

[0016] In the image decoding method according to the present disclosure, a predefined number of intra prediction modes with high amplitude values on the histogram may be set as intra prediction mode candidates.

[0017] In the image decoding method according to the present disclosure, the histogram is a cumulative histogram of occurrence frequencies for each intra prediction mode, the intra prediction mode of a neighbor block adjacent to the current block is represented on the histogram, and the occurrence frequency of the intra prediction mode of the neighbor block can be set to the size of the neighbor block.
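
A minimal sketch of the neighbor-based histogram of paragraph [0017] is given below. It assumes that the "size" of a neighbor block means width multiplied by height and that ties are broken in favor of the smaller mode index; both are illustrative assumptions, as are the function names and the sample neighbors.

```python
from collections import Counter

def neighbor_mode_histogram(neighbors):
    """neighbors: iterable of (intra_mode, width, height) for blocks adjacent to the current block."""
    hist = Counter()
    for mode, width, height in neighbors:
        hist[mode] += width * height  # occurrence frequency set to the block size (assumed w*h)
    return hist

def top_modes(hist, count):
    """The `count` modes with the largest accumulated value."""
    ranked = sorted(hist.items(), key=lambda kv: (-kv[1], kv[0]))
    return [mode for mode, _ in ranked[:count]]

neighbors = [(50, 8, 8), (18, 16, 4), (50, 4, 4), (0, 4, 8)]
hist = neighbor_mode_histogram(neighbors)
candidates = top_modes(hist, 2)
```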

[0018] In the image decoding method according to the present disclosure, the histogram is a cumulative amplitude value for each intra prediction mode, the intra prediction mode of a reference sample within a reference region adjacent to the current block is represented on the histogram, and the intra prediction mode of the reference sample and the amplitude value of the intra prediction mode can be derived based on the horizontal slope and vertical slope of the reference sample.
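
The gradient-based histogram of paragraph [0018] can be sketched as follows. The sketch assumes a 3x3 Sobel mask for the horizontal and vertical slopes, an amplitude of |Gx| + |Gy|, and a uniform quantization of the gradient orientation into `num_bins` bins that stands in for the mapping to an intra prediction mode. The actual mapping from slope to mode index is not reproduced here, so the bin index is a hypothetical placeholder.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # horizontal slope
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])   # vertical slope

def gradient_histogram(region, num_bins=8):
    """Accumulate |Gx| + |Gy| per orientation bin over the interior samples of `region`."""
    hist = np.zeros(num_bins, dtype=np.int64)
    height, width = region.shape
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = region[y - 1:y + 2, x - 1:x + 2]
            gx = int((window * SOBEL_X).sum())
            gy = int((window * SOBEL_Y).sum())
            if gx == 0 and gy == 0:
                continue  # flat area: no orientation to record
            angle = np.arctan2(gy, gx) % np.pi              # orientation in [0, pi)
            bin_idx = int(angle / np.pi * num_bins) % num_bins
            hist[bin_idx] += abs(gx) + abs(gy)
    return hist

# Reference region containing a vertical edge between columns 1 and 2.
region = np.array([[0, 0, 10, 10]] * 4)
hist = gradient_histogram(region)
```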

[0019] In the image decoding method according to the present disclosure, the intra prediction is performed using one of a plurality of interpolation filters, and one of the plurality of interpolation filters can be specified based on an index decoded from a bitstream.

[0020] In the image decoding method according to the present disclosure, after calculating the cost of each of a plurality of interpolation filters, the intra prediction can be performed based on the interpolation filter with the smallest cost.

[0021] In the image decoding method according to the present disclosure, the cost of the interpolation filter can be calculated based on the difference between the prediction samples obtained by performing intra prediction on the reference region of the current block based on the interpolation filter and the reconstructed samples within the reference region.
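
The decoder-side filter selection of paragraphs [0020] and [0021] may be sketched as below. The two 4-tap half-sample filters, their names, the SAD cost, and the sample values are hypothetical and serve only to show the selection of the filter with the smallest cost.

```python
import numpy as np

# Hypothetical half-sample filter candidates; each set of taps sums to 64.
FILTERS = {
    "bilinear": np.array([0, 32, 32, 0]),
    "cubic": np.array([-4, 36, 36, -4]),
}

def interpolate(ref, taps):
    """Half-sample value for every 4-sample window of `ref`, rounded and scaled by 1/64."""
    windows = np.array([ref[i:i + 4] for i in range(len(ref) - 3)])
    return (windows @ taps + 32) >> 6

def select_filter(ref, reconstructed):
    """Return the name of the filter with the smallest SAD, together with all costs."""
    costs = {
        name: int(np.abs(interpolate(ref, taps) - reconstructed).sum())
        for name, taps in FILTERS.items()
    }
    return min(costs, key=costs.get), costs

ref = np.array([0, 0, 64, 64, 0, 0])
reconstructed = np.array([32, 72, 32])   # samples of the reference region
best, costs = select_filter(ref, reconstructed)
```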

[0022] In the image decoding method according to the present disclosure, based on each of a plurality of interpolation filters, an intra prediction is performed on the current block to derive a plurality of prediction blocks, and the plurality of prediction blocks are weighted to derive a weighted prediction block of the current block.
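
The weighting of paragraph [0022] can be illustrated with a rounded weighted average. The weights 3 and 1 and the sample values are assumptions; the disclosure does not fix particular weights in this passage.

```python
import numpy as np

def blend_predictions(pred_blocks, weights):
    """Rounded weighted average of prediction blocks obtained with different filters."""
    total = sum(weights)
    acc = sum(w * p.astype(np.int64) for w, p in zip(weights, pred_blocks))
    return (acc + total // 2) // total

pred_a = np.array([[10, 20]])   # prediction block from a first filter
pred_b = np.array([[20, 40]])   # prediction block from a second filter
blended = blend_predictions([pred_a, pred_b], [3, 1])
```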

[0023] A video encoding method according to the present disclosure may include: a step of performing an intra prediction for a current block based on an intra prediction mode of a current block; and a step of encoding the intra prediction mode of the current block. In this case, the intra prediction mode of the current block is encoded based on at least one intra prediction mode candidate list, and the at least one intra prediction mode candidate list may include intra prediction mode candidates derived from a histogram of the current block.

[0024] According to the present disclosure, a computer-readable recording medium may be provided that records instructions for storing / transmitting a bitstream generated by an image encoding method.

[0025] According to the present disclosure, a computer-readable recording medium may be provided that records instructions for performing an image decoding method or an image encoding method.

[0026] The features briefly summarized above regarding the present disclosure are merely exemplary aspects of the detailed description of the present disclosure that follows and do not limit the scope of the present disclosure.

[0027] According to the present disclosure, compression efficiency can be improved by setting an intra prediction mode derived based on a histogram as an intra prediction mode candidate.

[0028] According to the present disclosure, compression efficiency can be improved by encoding / decoding the intra prediction mode of the current block using a plurality of intra prediction mode candidate lists.

[0029] According to the present disclosure, by providing a method for selecting one of a plurality of interpolation filter candidates at the decoder side in the same way as the encoder, it is possible to improve prediction accuracy while increasing compression efficiency.

[0030] The present disclosure can increase compression efficiency by providing a method for rearranging a transform kernel list.

[0031] The effects obtainable from the present disclosure are not limited to those mentioned above, and other unmentioned effects will be clearly understood by those skilled in the art to which the present disclosure pertains from the description below.

[0032] FIG. 1 is a block diagram showing an image encoding device according to one embodiment of the present disclosure.

[0033] FIG. 2 is a block diagram showing an image decoding device according to an embodiment of the present disclosure.

[0034] FIG. 3 illustrates an image encoding / decoding method performed by an image encoding / decoding device according to the present disclosure.

[0035] FIG. 4 illustrates an example of a plurality of intra-prediction modes according to the present disclosure.

[0036] Figure 5 shows an example where the directional mode is extended.

[0037] FIG. 6 illustrates a planar mode-based intra prediction method according to the present disclosure.

[0038] FIG. 7 illustrates a DC mode-based intra prediction method according to the present disclosure.

[0039] FIG. 8 illustrates a directional mode-based intra prediction method according to the present disclosure.

[0040] Figure 9 illustrates a method for deriving samples of fractional positions.

[0041] Figures 10 and 11 illustrate the tangent values of the prediction angle, scaled by a factor of 32, for each intra prediction mode.

[0042] FIG. 12 is a diagram illustrating an intra-prediction pattern when the directional mode is one of modes 34 to 49.

[0043] Figure 13 is a diagram illustrating an example of generating an upper reference sample by interpolating left reference samples.

[0044] Figure 14 shows an example in which intra prediction is performed using reference samples arranged in a 1D array.

[0045] Figure 15 is a diagram illustrating an example of setting a reference area.

[0046] Figure 16 is a diagram showing an example of the configuration of a reference area.

[0047] Figure 17 illustrates the filter coefficients for the Sobel mask and the Prewitt mask, respectively.

[0048] Figure 18 shows the locations where the vertical and horizontal inclinations are obtained within the reference area.

[0049] Figure 19 shows an example of grouping directional modes into multiple intra-prediction mode groups.

[0050] FIG. 20 is a diagram illustrating an example in which a histogram is generated based on the frequency of intra-prediction mode usage of neighboring blocks adjacent to the current block.

[0051] FIG. 21 is a drawing illustrating a reference area around the current block.

[0052] Figures 22 and 23 show an example of performing intra prediction on a reference area based on planar mode.

[0053] Figures 24 and 25 show an example of performing intra prediction on a reference region based on DC mode.

[0054] Figures 26 and 27 illustrate an example of performing intra prediction on a reference region based on a directional mode.

[0055] Figure 28 illustrates neighboring blocks used to derive intra-prediction mode candidates.

[0056] Figure 29 is a diagram illustrating the process of performing inter-prediction in the encoder and decoder.

[0057] Figure 30 shows an example where motion estimation is performed.

[0058] Figures 31 and 32 show an example in which a predicted block of the current block is generated based on motion information generated through motion estimation.

[0059] Figure 33 shows the location referenced to derive the motion vector prediction value.

[0060] Figure 34 is a diagram illustrating a template-based motion estimation method.

[0061] Figure 35 shows examples of template configurations.

[0062] Figure 36 is a diagram illustrating a motion estimation method based on a bilateral matching method.

[0063] Figure 37 is a diagram illustrating a motion estimation method based on a unidirectional matching method.

[0064] Figures 38 and 39 illustrate examples in which prediction blocks are generated according to the precision of the motion vectors.

[0065] FIG. 40 shows an example in which motion compensation based on a translational model and a zooming model is performed for the current block.

[0066] Figure 41 shows an example in which motion compensation based on a translational model and a rotational model is performed for the current block.

[0067] Figures 42 and 43 show an example of generating a prediction block for the current block using control point motion vectors.

[0068] Figure 44 shows an example of generating a prediction block for the current block using three control point motion vectors.

[0069] Figure 45 shows an example in which a motion vector is derived in sub-block units.

[0070] Figures 46 and 47 show examples in which motion vectors are derived in units of sub-blocks within the current block when SbTMVP is applied.

[0071] Figures 48 and 49 are diagrams illustrating examples in which a prediction block is derived according to the precision of the motion vector.

[0072] FIGS. 50 and 51 are diagrams illustrating the processes of encoding and decoding, respectively, motion vector difference values when the AMVR method is applied.

[0073] FIG. 52 is a diagram illustrating a search area where the prediction vector of the current block is derived.

[0074] Figure 53 is a flowchart of a method for encoding a residual block in an encoder.

[0075] Figure 54 is a flowchart of a method for recovering residual blocks in a decoder.

[0076] Figures 55 and 56 are drawings showing an example to which the secondary transform is applied.

[0077] FIGS. 57 and 58 illustrate a secondary transform based on an asymmetric secondary transform kernel.

[0078] Figure 59 shows an example in which information about partial transformations is sequentially encoded / decoded.

[0079] Figure 60 shows an example where a partial transformation is applied to the current block.

[0080] Figure 61 is a diagram illustrating an example of calculating the cost of a candidate combination of transformation kernels.

[0081] Figure 62 shows an example in which a transformation / inverse transformation is performed based on each of the candidate transformation kernel combinations.

[0082] Figure 63 shows an example where candidates for transformation kernel combinations are reordered according to cost.

[0083] Figure 64 shows an example where some of the transformation kernel combination candidates are set to be available in the current block.

[0084] Figure 65 shows an example where partial transformation candidates are rearranged according to cost.

[0085] The present disclosure is susceptible to various modifications and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present disclosure to specific embodiments, and it should be understood that it includes all modifications, equivalents, and substitutions that fall within the spirit and scope of the present disclosure. Similar reference numerals have been used for similar components in the description of each drawing.

[0086] Terms such as "first," "second," etc., may be used to describe various components, but said components should not be limited by said terms. Such terms are used solely for the purpose of distinguishing one component from another. For example, without departing from the scope of the present disclosure, the first component may be named the second component, and similarly, the second component may be named the first component. The term "and / or" includes a combination of a plurality of related described items or any of a plurality of related described items.

[0087] When it is stated that one component is "connected" or "coupled" to another component, it should be understood that while it may be directly connected or coupled to that other component, there may also be other components in between. On the other hand, when it is stated that one component is "directly connected" or "directly coupled" to another component, it should be understood that there are no other components in between.

[0088] The terms used in this application are used merely to describe specific embodiments and are not intended to limit the disclosure. The singular expression includes the plural expression unless the context clearly indicates otherwise. In this application, terms such as “comprising” or “having” are intended to specify the presence of the features, numbers, steps, actions, components, parts, or combinations thereof described in the specification, and should be understood as not precluding the existence or addition of one or more other features, numbers, steps, actions, components, parts, or combinations thereof.

[0089] Hereinafter, preferred embodiments of the present disclosure will be described in more detail with reference to the attached drawings. Hereinafter, the same reference numerals are used for identical components in the drawings, and redundant descriptions of identical components are omitted.

[0090] FIG. 1 is a block diagram showing an image encoding device according to one embodiment of the present disclosure.

[0091] Referring to FIG. 1, the image encoding device (100) may include a picture partitioning unit (110), a prediction unit (120, 125), a transform unit (130), a quantization unit (135), a reordering unit (160), an entropy encoding unit (165), an inverse quantization unit (140), an inverse transform unit (145), a filter unit (150), and a memory (155).

[0092] Each component shown in FIG. 1 is depicted independently to represent different characteristic functions of the image encoding device and does not imply that each component consists of separate hardware or a single software unit. That is, each component is listed and included as a separate component for convenience of explanation, but at least two of the components may be combined to form a single component, or a single component may be divided into multiple components to perform functions, and such integrated and separated embodiments of each component are included within the scope of the present disclosure as long as they do not deviate from the essence of the present disclosure.

[0093] Additionally, some components may not be essential components performing an essential function in the present disclosure, but may be optional components merely for enhancing performance. The present disclosure may be implemented by including only the components essential to embody the essence of the present disclosure, excluding components used merely for enhancing performance, and a structure including only the essential components, excluding optional components used merely for enhancing performance, is also included within the scope of the rights of the present disclosure.

[0094] The picture partitioning unit (110) can divide an input picture into at least one processing unit. Here, the processing unit may be a Prediction Unit (PU), a Transform Unit (TU), or a Coding Unit (CU). The picture partitioning unit (110) can divide a picture into combinations of multiple coding units, prediction units, and transform units, and can encode the picture by selecting one combination of coding units, prediction units, and transform units based on a predetermined criterion (e.g., a cost function).

[0095] For example, a single picture can be divided into multiple coding units. To divide coding units within a picture, recursive tree structures such as a Quad Tree, Ternary Tree, or Binary Tree can be used. A coding unit divided into other coding units, with a single image or the largest coding unit as the root, can have as many child nodes as the number of divided coding units. A coding unit that is no longer divided according to certain limits becomes a leaf node. For example, assuming Quad Tree division is applied to a single coding unit, a single coding unit can be divided into up to four different coding units.
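
The recursive division described in paragraph [0095] can be sketched for the Quad Tree case as follows. The split decision callback, the minimum size of 8, and the 64x64 root are hypothetical; an encoder would typically drive the decision with a cost function, and Binary Tree or Ternary Tree splits are not shown.

```python
def quadtree_split(x, y, size, should_split, min_size=8):
    """Return the leaf coding units as (x, y, size) tuples."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                # Each split produces four child nodes of half the width and height.
                leaves += quadtree_split(x + dx, y + dy, half, should_split, min_size)
        return leaves
    return [(x, y, size)]  # not divided further: a leaf node

# Hypothetical decision: split the 64x64 root, then only its top-left 32x32 child.
def decide(x, y, size):
    return size == 64 or (x == 0 and y == 0 and size == 32)

leaves = quadtree_split(0, 0, 64, decide)
```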

[0096] In the embodiments of the present disclosure below, the term coding unit may refer to a unit in which encoding is performed or a unit in which decoding is performed.

[0097] Prediction units may be obtained by dividing a single coding unit into at least one square or rectangular partition of the same size, or a single coding unit may be divided such that one prediction unit has a different shape and / or size from another prediction unit within the same coding unit.

[0098] When performing intra-frame prediction, the transform unit and the prediction unit may be set to be the same. In this case, the coding unit may be divided into multiple transform units, and intra-frame prediction may be performed for each transform unit. The coding unit may be divided in a horizontal or vertical direction. The number of transform units generated by dividing the coding unit may be two or four, depending on the size of the coding unit. Alternatively, if the size of the transform unit is small, multiple transform units may be set as a single prediction unit.

[0099] The prediction unit (120, 125) may include an inter-frame prediction unit (120) that performs inter-frame prediction and an intra-frame prediction unit (125) that performs intra-frame prediction. It may determine whether to perform inter-frame prediction or intra-frame prediction for a coding unit, and determine specific information (e.g., reference sample line, intra-frame prediction mode, motion vector, reference picture, etc.) according to each prediction method. Here, the processing unit in which the prediction is performed and the processing unit in which the prediction method and specific details are determined may be different. For example, the prediction method, the prediction mode, and the like may be determined per coding unit, while the prediction may be performed per prediction unit or per transform unit. The residual value (residual block) between the generated prediction block and the original block may be input to the transform unit (130). In addition, the prediction mode information, motion vector information, etc. used for prediction may be encoded together with the residual value in the entropy encoding unit (165) and transmitted to the decoding device. When using a specific encoding mode, it is also possible to encode the original block as is and transmit it to the decoding device without generating a prediction block through the prediction unit (120, 125).

[0100] The inter-frame prediction unit (120) may predict a prediction unit based on information of at least one picture among the previous picture or the subsequent picture of the current picture, and in some cases, may predict a prediction unit based on information of a partially encoded area within the current picture. The inter-frame prediction unit (120) may include a reference picture interpolation unit, a motion prediction unit, and a motion compensation unit.

[0101] The reference picture interpolation unit receives reference picture information from the memory (155) and can generate sub-integer pixel information from the reference picture. In the case of luminance pixels, a DCT-based 8-tap interpolation filter with different filter coefficients can be used to generate sub-integer pixel information in 1 / 4 pixel units. In the case of chrominance signals, a DCT-based 4-tap interpolation filter with different filter coefficients can be used to generate sub-integer pixel information in 1 / 8 pixel units.
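
As a hedged illustration of an 8-tap interpolation of this kind, the fragment below computes one half-sample value along a row. The tap values are those of the HEVC half-sample luma filter and are used here only as an assumption; this passage does not list the coefficients of the disclosed filter, and quarter-sample positions would use other tap sets.

```python
import numpy as np

# Half-sample taps as used in HEVC luma interpolation (assumed here); they sum to 64.
HALF_SAMPLE_TAPS = np.array([-1, 4, -11, 40, 40, -11, 4, -1])

def half_sample(row, i, bit_depth=8):
    """Value halfway between row[i] and row[i + 1]; needs 3 samples before i and 4 after."""
    window = row[i - 3:i + 5]
    value = (int((window * HALF_SAMPLE_TAPS).sum()) + 32) >> 6   # round and divide by 64
    return min(max(value, 0), (1 << bit_depth) - 1)

ramp = np.arange(0, 80, 10)          # 0, 10, ..., 70
flat = np.full(8, 100)
mid_ramp = half_sample(ramp, 3)      # between 30 and 40
mid_flat = half_sample(flat, 3)
```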

[0102] The motion prediction unit can perform motion prediction based on a reference picture interpolated by the reference picture interpolation unit. Various methods, such as FBMA (Full search-based Block Matching Algorithm), TSS (Three Step Search), and NTS (New Three-Step Search Algorithm), can be used to calculate motion vectors. Based on the interpolated pixels, the motion vector can have motion vector values in units of 1 / 2 or 1 / 4 pixels. The motion prediction unit can predict the current prediction unit by using different motion prediction methods. Various motion prediction methods, such as the Skip method, Merge method, AMVP (Advanced Motion Vector Prediction) method, and Intra Block Copy method, can be used.
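
A full search-based block matching of the kind named above can be sketched at integer-pixel precision as follows. The SAD criterion, the function name, and the test picture are assumptions for illustration; fractional-pixel refinement on the interpolated picture is omitted.

```python
import numpy as np

def full_search(cur_block, ref, top, left, search_range):
    """Test every integer displacement in the window; return (sad, dx, dy) of the best match."""
    height, width = cur_block.shape
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + height > ref.shape[0] or x + width > ref.shape[1]:
                continue  # candidate block would fall outside the reference picture
            candidate = ref[y:y + height, x:x + width].astype(np.int64)
            sad = int(np.abs(candidate - cur_block).sum())
            if best is None or sad < best[0]:
                best = (sad, dx, dy)
    return best

ref = np.arange(64).reshape(8, 8)     # hypothetical reference picture
cur_block = ref[3:5, 4:6]             # content located at row 3, column 4 of the reference
result = full_search(cur_block, ref, top=2, left=2, search_range=2)
```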

[0103] The intra-frame prediction unit (125) can generate a prediction block based on reference pixel information, which is pixel information within the current picture. Reference pixel information can be derived from one reference pixel line selected from a plurality of reference pixel lines. The Nth reference pixel line among the plurality of reference pixel lines may include left pixels whose x-coordinate differs by N from that of the top-left pixel of the current block and top pixels whose y-coordinate differs by N from that of said top-left pixel. The number of reference pixel lines that the current block can select may be 1, 2, 3, or 4.

[0104] If a neighboring block of the current prediction unit is a block that has undergone inter-frame prediction, and the reference pixel is a pixel that has undergone inter-frame prediction, the reference pixel included in the block that has undergone inter-frame prediction can be replaced with the reference pixel information of a neighboring block that has undergone intra-frame prediction. That is, if the reference pixel is not available, the information of the unavailable reference pixel can be replaced with the information of at least one of the available reference pixels.

[0105] In intra-frame prediction, the prediction mode may include a directional prediction mode that uses reference pixel information according to the prediction direction, and a non-directional mode that does not use directional information when performing prediction. The mode for predicting luminance information and the mode for predicting chrominance information may be different, and the intra-frame prediction mode information used to predict luminance information or the predicted luminance signal information may be utilized to predict chrominance information.

[0106] When performing intra-frame prediction, if the size of the prediction unit and the size of the transformation unit are the same, intra-frame prediction for the prediction unit can be performed based on the pixels to the left of the prediction unit, the pixels at the top left, and the pixels at the top.

[0107] The intra-frame prediction method can generate a prediction block after applying a smoothing filter to a reference pixel according to the prediction mode. Whether to apply the smoothing filter may be determined depending on the selected reference pixel line.

[0108] To perform an intra-frame prediction method, the intra-frame prediction mode of the current prediction unit can be predicted from the intra-frame prediction mode of the prediction unit existing in the vicinity of the current prediction unit. When predicting the prediction mode of the current prediction unit using the mode information predicted from the surrounding prediction unit, if the intra-frame prediction mode of the current prediction unit and the surrounding prediction unit are the same, information indicating that the prediction modes of the current prediction unit and the surrounding prediction unit are the same can be transmitted using predetermined flag information; if the prediction modes of the current prediction unit and the surrounding prediction unit are different, entropy coding can be performed to encode the prediction mode information of the current block.
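
The flag-based signalling of paragraph [0108] may be sketched as below. The sketch assumes a candidate list derived from surrounding prediction units, an index into that list when the flag is set, and an index among the remaining modes otherwise; the symbol layout, the total of 67 modes, and the function names are hypothetical, and the entropy coding of the symbols themselves is not shown.

```python
def encode_mode(mode, candidates, num_modes):
    """Map an intra prediction mode to (flag, index) symbols."""
    if mode in candidates:
        return {"flag": 1, "index": candidates.index(mode)}
    remaining = [m for m in range(num_modes) if m not in candidates]
    return {"flag": 0, "index": remaining.index(mode)}

def decode_mode(symbols, candidates, num_modes):
    """Inverse of encode_mode, given the same candidate list."""
    if symbols["flag"] == 1:
        return candidates[symbols["index"]]
    remaining = [m for m in range(num_modes) if m not in candidates]
    return remaining[symbols["index"]]

candidates = [0, 50, 18]                      # hypothetical candidates from neighboring units
hit = encode_mode(18, candidates, num_modes=67)
miss = encode_mode(20, candidates, num_modes=67)
```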

[0109] Additionally, a residual block can be generated that includes residual value information, which is the difference between the prediction block generated by the prediction unit (120, 125) and the original block of the prediction unit. The generated residual block can be input to the transform unit (130).

[0110] In the transform unit (130), the residual block, which contains the residual value information between the original block and the prediction unit generated through the prediction unit (120, 125), can be transformed using a transform method such as DCT (Discrete Cosine Transform), DST (Discrete Sine Transform), or KLT (Karhunen-Loève Transform). Whether to apply DCT, DST, or KLT to transform the residual block can be determined based on at least one of the size of the transform unit, the shape of the transform unit, the prediction mode of the prediction unit, or the intra-frame prediction mode information of the prediction unit. Meanwhile, the transform can be performed separately in the horizontal direction and the vertical direction.

[0111] After the transforms in the horizontal and vertical directions are performed, a secondary transform can be performed. The secondary transform may be non-separable, that is, the horizontal and vertical directions are not separated. Final transform coefficients can be generated by performing the secondary transform on the transform coefficients obtained by the primary transform. Meanwhile, the number of final transform coefficients output by the secondary transform may be smaller than the number of transform coefficients input to the secondary transform. Specifically, the secondary transform can be performed using a reduced transform matrix whose numbers of columns and rows differ.
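
The effect of a reduced matrix with different numbers of rows and columns can be illustrated as follows. The first R rows of an orthonormal DCT-II matrix stand in for the kernel purely as an assumption, as do the sizes N = 16 and R = 4; the kernels of the disclosure are not reproduced here.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal n x n DCT-II matrix."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] /= np.sqrt(2.0)
    return mat

N, R = 16, 4
reduced = dct_matrix(N)[:R, :]            # R x N: fewer rows than columns

primary_coeffs = np.full(N, 5.0)          # N input coefficients (flattened)
secondary = reduced @ primary_coeffs      # forward: only R output coefficients
restored = reduced.T @ secondary          # inverse uses the transposed N x R matrix
```

The inverse is exact only for inputs that lie in the span of the retained rows, which is the case for the constant input used here.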

[0112] The quantization unit (135) can quantize the values ​​converted into the frequency domain in the conversion unit (130). The quantization coefficient may vary depending on the block or the importance of the image. The values ​​produced by the quantization unit (135) may be provided to the inverse quantization unit (140) and the reordering unit (160).

[0113] The reordering unit (160) can reorder the coefficient values of the quantized residual values.

[0114] The reordering unit (160) can convert two-dimensional block-shaped coefficients into a one-dimensional vector form through a coefficient scanning method. For example, the reordering unit (160) can scan from the DC coefficient to the high-frequency coefficients using a Zig-Zag Scan method and convert them into a one-dimensional vector form. Depending on the size of the transformation unit and the intra-frame prediction mode, instead of Zig-Zag Scan, a vertical scan that scans two-dimensional block-shaped coefficients in the column direction, a horizontal scan that scans two-dimensional block-shaped coefficients in the row direction, or a diagonal scan that scans two-dimensional block-shaped coefficients in the diagonal direction may be used. That is, depending on the size of the transformation unit and the intra-frame prediction mode, it can be determined whether to use a Zig-Zag Scan, a vertical scan, a horizontal scan, or a diagonal scan.
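The four scanning methods mentioned above can be sketched as follows. This is a minimal example only; the function names `scan_order` and `scan_block` are hypothetical, and the traversal direction along each diagonal is an assumption, since the disclosure does not fix it.

```python
def scan_order(width, height, method):
    """Return the (row, col) positions in the order a coefficient scan visits them."""
    if method == "horizontal":      # row direction
        return [(r, c) for r in range(height) for c in range(width)]
    if method == "vertical":        # column direction
        return [(r, c) for c in range(width) for r in range(height)]
    if method in ("zigzag", "diagonal"):
        order = []
        for s in range(width + height - 1):   # s = r + c selects one anti-diagonal
            diag = [(r, s - r) for r in range(height) if 0 <= s - r < width]
            if method == "zigzag" and s % 2 == 0:
                diag.reverse()                # zig-zag alternates direction
            order.extend(diag)
        return order
    raise ValueError(f"unknown scan method: {method}")


def scan_block(block, method="zigzag"):
    """Convert a 2-D coefficient block into a 1-D list using the given scan."""
    height, width = len(block), len(block[0])
    return [block[r][c] for r, c in scan_order(width, height, method)]


block = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
zigzag = scan_block(block, "zigzag")      # [1, 2, 4, 7, 5, 3, 6, 8, 9]
vertical = scan_block(block, "vertical")  # [1, 4, 7, 2, 5, 8, 3, 6, 9]
```

A decoder can restore the two-dimensional block by writing the one-dimensional coefficients back to the positions returned by `scan_order` in the same order.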

[0115] The entropy encoding unit (165) can perform entropy encoding based on the values calculated by the reordering unit (160). Entropy encoding can use various encoding methods, such as, for example, Exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), and Context-Adaptive Binary Arithmetic Coding (CABAC).

[0116] The entropy encoding unit (165) can encode various information from the reordering unit (160) and the prediction unit (120, 125), such as residual value coefficient information of the encoding unit, block type information, prediction mode information, division unit information, prediction unit information and transmission unit information, motion vector information, reference frame information, block interpolation information, and filtering information.

[0117] The entropy encoding unit (165) can entropy-encode the coefficient values of the encoding unit input from the reordering unit (160).

[0118] In the inverse quantization unit (140) and the inverse transformation unit (145), the values quantized in the quantization unit (135) are inversely quantized, and the values transformed in the transformation unit (130) are inversely transformed. The residual value generated by the inverse quantization unit (140) and the inverse transformation unit (145) can be combined with the prediction unit predicted through the motion estimation unit, motion compensation unit, and intra-frame prediction unit included in the prediction unit (120, 125) to generate a reconstructed block.

[0119] The filter unit (150) may include at least one of a deblocking filter, an offset correction unit, and an ALF (Adaptive Loop Filter).

[0120] The deblocking filter can remove block distortion caused by boundaries between blocks in the restored picture. To determine whether to perform deblocking, the decision to apply the deblocking filter to the current block can be made based on the pixels contained in a certain number of columns or rows within the block. When applying the deblocking filter to a block, a Strong Filter or a Weak Filter can be applied depending on the required deblocking filtering strength. Additionally, when applying the deblocking filter, horizontal and vertical filtering can be processed in parallel.

[0121] The offset correction unit can correct the offset from the original image on a pixel-by-pixel basis for the image that has undergone deblocking. To perform offset correction for a specific picture, a method can be used in which pixels included in the image are divided into a certain number of regions, the region to be offset is determined, and the offset is applied to that region, or a method can be used in which the offset is applied by considering the edge information of each pixel.

[0122] Adaptive Loop Filtering (ALF) can be performed based on a comparison between the filtered restored image and the original image. After dividing the pixels included in the image into predetermined groups, a single filter to be applied to each group can be determined, allowing for differential filtering for each group. Information regarding whether to apply ALF can be transmitted per coding unit (CU), and the shape and filter coefficients of the ALF filter to be applied may vary depending on each block. Additionally, an ALF filter of the same form (fixed form) may be applied regardless of the characteristics of the block to be applied.

[0123] The memory (155) can store a reconstructed block or picture output by the filter unit (150), and the stored reconstructed block or picture can be provided to the prediction unit (120, 125) when performing inter-frame prediction.

[0124] FIG. 2 is a block diagram showing an image decoding device according to an embodiment of the present disclosure.

[0125] Referring to FIG. 2, the image decoding device (200) may include an entropy decoding unit (210), a reordering unit (215), an inverse quantization unit (220), an inverse transformation unit (225), a prediction unit (230, 235), a filter unit (240), and a memory (245).

[0126] When a video bitstream is input from a video encoding device, the input bitstream can be decoded by a procedure that is the reverse of that of the video encoding device.

[0127] The entropy decoding unit (210) can perform entropy decoding by reversing the entropy encoding procedure performed by the entropy encoding unit of the image encoding device. For example, various methods such as Exponential Golomb, CAVLC (Context-Adaptive Variable Length Coding), and CABAC (Context-Adaptive Binary Arithmetic Coding) may be applied in correspondence with the method performed by the image encoding device.

[0128] The entropy decoding unit (210) can decode information related to intra-frame prediction and inter-frame prediction performed by the encoding device.

[0129] The reordering unit (215) can reorder the bitstream entropy-decoded by the entropy decoding unit (210), based on the reordering method used by the encoding device. Coefficients expressed in the form of a one-dimensional vector can be restored and reordered into coefficients in the form of a two-dimensional block. The reordering unit (215) can receive information related to the coefficient scanning performed by the encoding device and perform reordering by scanning in reverse based on the scanning order used by the encoding device.

[0130] The inverse quantization unit (220) can perform inverse quantization based on the coefficient values of the reordered block and the quantization parameters provided by the encoding device.

[0131] The inverse transform unit (225) can perform, on the quantization result produced by the image encoding device, the inverse of the transform performed by the transform unit. That is, it can perform at least one of an inverse transform of the second transform (second inverse transform) or an inverse transform for DCT, DST, and KLT (i.e., first inverse transform). The inverse transform can be performed based on a transmission unit determined by the image encoding device. The inverse transform unit (225) of the image decoder can determine a transform matrix for the second inverse transform or a transform technique for the first inverse transform (e.g., DCT, DST, KLT) according to multiple pieces of information such as a prediction method, the size and shape of the current block, a prediction mode, and an intra-frame prediction direction. Alternatively, information for determining the transform matrix or transform technique may be explicitly encoded and signaled.

[0132] The prediction unit (230, 235) can generate a prediction block based on the prediction block generation information provided by the entropy decoding unit (210) and the previously decoded block or picture information provided by the memory (245).

[0133] As described above, when performing intra-frame prediction identical to the operation in the video encoding device, if the size of the prediction unit and the size of the transform unit are the same, intra-frame prediction for the prediction unit is performed based on the pixels to the left of the prediction unit, the pixels to the top left, and the pixels to the top; however, if the size of the prediction unit and the size of the transform unit are different when performing intra-frame prediction, intra-frame prediction can be performed using reference pixels based on the transform unit. Additionally, intra-frame prediction using NxN partitioning only for the minimum encoding unit may also be used.

[0134] The prediction unit (230, 235) may include a prediction unit determination unit, an inter-frame prediction unit, and an intra-frame prediction unit. The prediction unit determination unit receives various information, such as prediction unit information input from the entropy decoding unit (210), prediction mode information of the intra-frame prediction method, and motion prediction related information of the inter-frame prediction method, distinguishes the prediction unit in the current encoding unit, and determines whether the prediction unit performs inter-frame prediction or intra-frame prediction. The inter-frame prediction unit (230) may perform inter-frame prediction for the current prediction unit based on information included in at least one picture among the previous picture or subsequent picture of the current picture containing the current prediction unit, using information necessary for inter-frame prediction of the current prediction unit provided by the video encoding device. Alternatively, it may perform inter-frame prediction based on information of a partially restored area within the current picture containing the current prediction unit.

[0135] To perform inter-frame prediction, it can be determined, on an encoding unit basis, whether the motion prediction method of the prediction unit included in the corresponding encoding unit is Skip Mode, Merge Mode, AMVP Mode, or Intra Block Copy Mode.

[0136] The intra-frame prediction unit (235) can generate a prediction block based on pixel information within the current picture. If the prediction unit is a prediction unit that has performed intra-frame prediction, it can perform intra-frame prediction based on the intra-frame prediction mode information of the prediction unit provided by the video encoding device. The intra-frame prediction unit (235) may include an Adaptive Intra Smoothing (AIS) filter, a reference pixel interpolation unit, and a DC filter. The AIS filter is a part that performs filtering on the reference pixel of the current block, and can determine whether to apply the filter based on the prediction mode of the current prediction unit. AIS filtering can be performed on the reference pixel of the current block using the prediction mode of the prediction unit and the AIS filter information provided by the video encoding device. If the prediction mode of the current block is a mode that does not perform AIS filtering, the AIS filter may not be applied.

[0137] When the prediction mode of the prediction unit is a mode that performs intra-frame prediction based on pixel values obtained by interpolating reference pixels, the reference pixel interpolation unit can interpolate the reference pixels to generate reference pixels at fractional positions (i.e., at a precision finer than one integer pixel). If the prediction mode of the current prediction unit is a prediction mode that generates a prediction block without interpolating the reference pixels, the reference pixels may not be interpolated. The DC filter can generate a prediction block through filtering when the prediction mode of the current block is DC mode.

[0138] The restored block or picture may be provided to a filter unit (240). The filter unit (240) may include a deblocking filter, an offset correction unit, and an ALF.

[0139] Information regarding whether a deblocking filter has been applied to the corresponding block or picture and, if a deblocking filter has been applied, information regarding whether a strong filter or a weak filter was applied can be received from the video encoding device. The deblocking filter of the video decoder receives the deblocking filter related information provided by the video encoding device, and the video decoder can perform deblocking filtering on the corresponding block.

[0140] The offset correction unit can perform offset correction on the restored image based on the type of offset correction and offset value information applied to the image during encoding.

[0141] ALF can be applied to the encoding unit based on information on whether to apply ALF, ALF coefficient information, etc., provided by the encoding device. This ALF information may be included in a specific parameter set.

[0142] The memory (245) can store the restored picture or block so that it can be used as a reference picture or reference block, and can also provide the restored picture to the output unit.

[0143] As described above, in the embodiments of the present disclosure below, the term "Coding Unit" is used as "encoding unit" for convenience of explanation, but it may be a unit that performs not only encoding but also decoding.

[0144] Additionally, the current block represents a block to be encoded / decoded, and depending on the encoding / decoding stage, it may represent a coding tree block (or coding tree unit), an encoding block (or encoding unit), a transform block (or transform unit), a prediction block (or prediction unit), or a block to which an in-loop filter is applied. In this specification, 'unit' represents a basic unit for performing a specific encoding / decoding process, and 'block' may represent a pixel array of a predetermined size. Unless otherwise distinguished, 'block' and 'unit' may be used with the same meaning. For example, in the embodiments described below, the encoding block (coding block) and the encoding unit (coding unit) may be understood as having the same meaning.

[0145] In addition, encoding parameters for the current block may be commonly applied to multiple color components of the current block. For example, if the encoding mode of the current block is determined, predictions for the Y component block, Cb component block, and Cr component block can be performed based on the corresponding encoding mode.

[0146] Alternatively, depending on the color component to be encoded / decoded, the current block may refer to a Y component block, a Cb component block, or a Cr component block.

[0147] Furthermore, the picture containing the current block will be referred to as the current picture.

[0148] In the encoder, the current picture can be divided into multiple reference blocks. Here, the reference block may be referred to as a CTU (Coding Tree Unit) or CTB (Coding Tree Block).

[0149] The size of the reference block may be predefined in the encoder and decoder. Alternatively, information related to the size of the reference block may be encoded and signaled to the decoder. This information may be encoded / decoded through an upper header. For example, this information may be encoded / decoded through a sequence parameter set or a picture header.

[0150] The reference block may be further divided into multiple blocks (i.e., multiple coding blocks) based on tree structure partitioning. Here, the tree structure partitioning may include at least one of quad tree partitioning, binary tree partitioning, or ternary tree partitioning.
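For illustration only, the three kinds of tree structure partitioning can be sketched as a function that returns the sub-blocks of a block given as (x, y, width, height). The mode names and the 1:2:1 ratio used for ternary tree partitioning are assumptions made for this sketch and are not specified in the present disclosure.

```python
def split_block(x, y, w, h, mode):
    """Split the block (x, y, w, h) and return its sub-blocks as (x, y, w, h) tuples.

    mode: "quad", "bin_ver", "bin_hor", "tri_ver" or "tri_hor" (hypothetical names).
    Ternary splits use an assumed 1:2:1 ratio.
    """
    if mode == "quad":
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "bin_ver":   # vertical split line: left and right halves
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "bin_hor":   # horizontal split line: top and bottom halves
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "tri_ver":
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    if mode == "tri_hor":
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    raise ValueError(f"unknown split mode: {mode}")


quads = split_block(0, 0, 64, 64, "quad")       # four 32x32 coding blocks
thirds = split_block(0, 0, 32, 16, "tri_ver")   # widths 8, 16, 8
```

Applying `split_block` recursively to the returned sub-blocks yields the tree structure of coding blocks.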

[0151] A prediction block for the current block can be obtained by performing prediction on the current block generated by dividing the reference block. Specifically, a prediction block for the current block can be obtained through inter-prediction or intra-prediction.

[0152] Inter-prediction may be intended to remove duplicate data between pictures, and intra-prediction may be intended to remove duplicate data within a picture. For example, a prediction block of the current block may be generated from a reference picture using motion information of the current block, or a prediction block of the current block may be generated from reference samples of the current block after determining the intra-prediction mode of the current block. Here, the motion information may include at least one of a motion vector, a reference picture index, and a prediction direction.

[0153] FIG. 3 illustrates an image encoding / decoding method performed by an image encoding / decoding device according to the present disclosure.

[0154] Referring to FIG. 3, a reference line for intra prediction of the current block can be determined (S300).

[0155] The current block may use one or more of the multiple reference line candidates predefined in the video encoding / decoding device as reference lines for intra-prediction. Here, the predefined reference line candidates may include a neighbor reference line adjacent to the current block to be decoded and N non-neighbor reference lines located 1 to N samples away from the boundary of the current block. N may be an integer of 1, 2, 3, or more. For convenience of explanation, it is assumed that the reference line candidates available to the current block consist of one neighbor reference line candidate and three non-neighbor reference line candidates, but the candidates are not limited thereto. That is, the reference line candidates available to the current block may of course include four or more non-neighbor reference line candidates.

[0156] A video encoding device can determine an optimal reference line candidate among a plurality of reference line candidates and encode an index to specify it. A video decoding device can determine the reference line of the current block based on the index signaled through a bitstream. The index can specify any one of the plurality of reference line candidates. The reference line candidate specified by the index can be used as the reference line of the current block.

[0157] The number of signaled indices to determine the reference line of the current block may be one, two, or more. For example, if the number of signaled indices is one, the current block may perform intra prediction using only a single reference line candidate specified by the signaled index among multiple reference line candidates. Or, if the number of signaled indices is two or more, the current block may perform intra prediction using multiple reference line candidates specified by multiple indices among multiple reference line candidates.

[0158] Referring to FIG. 3, the intra prediction mode of the current block can be determined (S310).

[0159] The intra prediction mode of the current block can be determined from among a plurality of intra prediction modes predefined in the video encoding / decoding device. The plurality of predefined intra prediction modes will be examined with reference to FIGS. 4 and 5.

[0160] FIG. 4 illustrates an example of a plurality of intra-prediction modes according to the present disclosure.

[0161] Referring to FIG. 4, a plurality of pre-defined intra-prediction modes in an image encoding / decoding device may be composed of non-directional modes and directional modes. The non-directional mode may include at least one of a planar mode or a DC mode. The directional mode may include directional modes 2 through 66.

[0162] The directional mode may be further extended than shown in FIG. 4. FIG. 5 shows an example of an extended directional mode.

[0163] In FIG. 5, modes -1 through -14 and modes 67 through 80 are shown as being added. These directional modes may be referred to as wide-angle intra prediction modes. Whether to use wide-angle intra prediction modes may be determined based on the shape of the current block. For example, if the current block is a non-square block whose width is greater than its height, some directional modes (e.g., 2 through 15) may be switched to wide-angle intra prediction modes between 67 and 80. On the other hand, if the current block is a non-square block whose height is greater than its width, some directional modes (e.g., 53 through 66) may be switched to wide-angle intra prediction modes between -1 and -14.

[0164] The range of available wide-angle intra prediction modes can be adaptively determined based on the width-to-height ratio of the current block. Table 1 shows the range of available wide-angle intra prediction modes based on the width-to-height ratio of the current block.

[0165] Table 1
Width / Height     Available wide-angle intra prediction mode range
W / H = 16         67~80
W / H = 8          67~78
W / H = 4          67~76
W / H = 2          67~74
W / H = 1          None
W / H = 1 / 2      -1~-8
W / H = 1 / 4      -1~-10
W / H = 1 / 8      -1~-12
W / H = 1 / 16     -1~-14
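The switching to wide-angle modes according to Table 1 may be sketched as below. This is an illustrative example under stated assumptions: the offsets +65 and -67 (so that mode 2 becomes 67 and mode 66 becomes -1) are not given in the disclosure, which only lists the available ranges, and block dimensions are assumed to be powers of two. The names `WIDE_ANGLE_COUNT` and `remap_wide_angle` are hypothetical.

```python
# Number of wide-angle modes available per aspect ratio, read off Table 1.
WIDE_ANGLE_COUNT = {2: 8, 4: 10, 8: 12, 16: 14}


def remap_wide_angle(mode, width, height):
    """Map a directional mode (2..66) to a wide-angle mode when applicable.

    Assumption: mode m becomes m + 65 for wide blocks and m - 67 for tall
    blocks; only the resulting ranges are taken from Table 1.
    """
    if width == height or not 2 <= mode <= 66:
        return mode                       # square block or non-directional mode
    ratio = max(width, height) // min(width, height)
    count = WIDE_ANGLE_COUNT.get(min(ratio, 16), 0)
    if width > height and mode < 2 + count:
        return mode + 65                  # lands in 67 .. 66 + count
    if height > width and mode > 66 - count:
        return mode - 67                  # lands in -count .. -1
    return mode


wide = remap_wide_angle(2, 16, 8)    # W / H = 2, so mode 2 becomes 67
tall = remap_wide_angle(66, 8, 16)   # W / H = 1 / 2, so mode 66 becomes -1
```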

[0166] Among the plurality of intra prediction modes mentioned above, K candidate modes (most probable modes, MPMs) can be selected, and a candidate list including the selected candidate modes can be generated. An index indicating any one of the candidate modes in the candidate list can be signaled. The intra prediction mode of the current block can be determined based on the candidate mode indicated by the index. For example, the candidate mode indicated by the index can be set as the intra prediction mode of the current block. Alternatively, the intra prediction mode of the current block may be determined based on the value of the candidate mode indicated by the index and a predetermined difference value. The difference value may be defined as the difference between the value of the intra prediction mode of the current block and the value of the candidate mode indicated by the index. The difference value may be signaled via a bitstream. Alternatively, the difference value may be a value predefined in the video encoding / decoding device. Alternatively, the intra prediction mode of the current block may be determined based on a flag indicating whether a mode identical to the intra prediction mode of the current block exists in the candidate list. For example, if the flag is a first value, the intra prediction mode of the current block may be determined from the candidate list. In this case, an index indicating any one of the multiple candidate modes belonging to the candidate list may be signaled, and the candidate mode indicated by the index may be set as the intra prediction mode of the current block. On the other hand, if the flag is a second value, any one of the remaining intra prediction modes may be set as the intra prediction mode of the current block. The remaining intra prediction modes refer to the modes among the predefined intra prediction modes other than the candidate modes belonging to the candidate list. If the flag is a second value, an index indicating any one of the remaining intra prediction modes may be signaled. The intra prediction mode indicated by the signaled index can be set as the intra prediction mode of the current block.
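The flag-based determination described above may be sketched as follows. This is a simplified example: the function name `decode_intra_mode`, the example candidate list, and the ascending ordering of the remaining modes are assumptions made for illustration, and the variant using a difference value is omitted.

```python
def decode_intra_mode(mpm_flag, index, candidate_list, all_modes):
    """Resolve the intra prediction mode from a flag and a signaled index.

    mpm_flag True  (first value):  index selects a mode from candidate_list.
    mpm_flag False (second value): index selects one of the remaining modes,
        i.e. all_modes minus the candidates (ascending order is an assumption).
    """
    if mpm_flag:
        return candidate_list[index]
    remaining = sorted(m for m in all_modes if m not in candidate_list)
    return remaining[index]


all_modes = range(67)                  # planar (0), DC (1), directional 2..66
candidates = [0, 1, 50, 18, 46, 54]    # example candidate list with K = 6

from_list = decode_intra_mode(True, 2, candidates, all_modes)    # candidate at index 2
from_rest = decode_intra_mode(False, 16, candidates, all_modes)  # 17th remaining mode
```

In this example the remaining modes start at 2 and skip 18, so index 16 in the remaining list resolves to mode 19.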

[0167] The intra prediction mode of a chroma block can be selected from among multiple intra prediction mode candidates of the chroma block. To this end, index information indicating one of the intra prediction mode candidates of the chroma block can be explicitly encoded and signaled through a bitstream. Table 2 is an example of intra prediction mode candidates of the chroma block.

[0168] Table 2: Intra prediction mode candidates of the chroma block
Index   Luma mode: 0   Luma mode: 50   Luma mode: 18   Luma mode: 1   Others
0       66             0               0               0              0
1       50             66              50              50             50
2       18             18              66              18             18
3       1              1               1               66             1
4       DM             DM              DM              DM             DM

[0169] In the example of Table 2, DM (Direct Mode) means setting the intra prediction mode of the luminance block located at the same position as the chroma block to the intra prediction mode of the chroma block. Meanwhile, the luminance block located at the same position as the chroma block can be determined based on the position of the top-left sample or the position of the center sample of the chroma block.

[0170] For example, if the intra prediction mode (luminance mode) of the luminance block is 0 (planar mode) and the index points to 2, the intra prediction mode of the chroma block can be determined as horizontal mode (18). As another example, if the intra prediction mode (luminance mode) of the luminance block is 1 (DC mode) and the index points to 0, the intra prediction mode of the chroma block can be determined as planar mode (0).
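The lookup of Table 2 can be expressed compactly, as in the following sketch. It assumes Table 2 is read as follows: indices 0 to 3 select planar (0), vertical (50), horizontal (18), and DC (1) respectively, mode 66 is substituted when the selected mode equals the luminance mode, and index 4 is DM. The function and constant names are hypothetical.

```python
PLANAR, DC, HOR, VER, DIAG = 0, 1, 18, 50, 66


def chroma_intra_mode(index, luma_mode):
    """Derive the chroma intra prediction mode from the index and the luminance mode."""
    if index == 4:                        # DM: reuse the luminance mode
        return luma_mode
    default = (PLANAR, VER, HOR, DC)[index]
    # If the default candidate equals the luminance mode, mode 66 takes its place.
    return DIAG if default == luma_mode else default


example_a = chroma_intra_mode(2, PLANAR)  # luminance mode 0, index 2 gives 18
example_b = chroma_intra_mode(0, DC)      # luminance mode 1, index 0 gives 0
```

The two calls reproduce the two examples given in the preceding paragraph.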

[0171] Consequently, the intra prediction mode of the chroma block may also be set to one of the intra prediction modes shown in FIG. 4 or FIG. 5. The intra prediction mode of the current block may also be used to determine the reference line of the current block, in which case step S310 may be performed before step S300.

[0172] Meanwhile, in the present disclosure, the chroma block may represent at least one of a Cb component block or a Cr component block.

[0173] Referring to FIG. 3, an intra prediction can be performed on the current block based on the reference line of the current block and the intra prediction mode (S320).

[0174] Hereinafter, with reference to FIGS. 6 to 8, we will examine in detail the intra prediction method for each intra prediction mode. However, for the sake of convenience of explanation, it is assumed that a single reference line is used for the intra prediction of the current block, but the intra prediction method described below can be applied in the same or similar way even when multiple reference lines are used.

[0175] FIG. 6 illustrates a planar mode-based intra prediction method according to the present disclosure.

[0176] Referring to FIG. 6, T represents a reference sample located at the upper-right corner of the current block, and L represents a reference sample located at the lower-left corner of the current block. P1 can be generated through horizontal interpolation. For example, P1 can be generated by interpolating T with a reference sample located on the same horizontal line as P1. P2 can be generated through vertical interpolation. For example, P2 can be generated by interpolating L with a reference sample located on the same vertical line as P2. The current sample within the current block can be predicted through the weighted sum of P1 and P2 as shown in the following Equation 1.

[0177] [Equation 1] $\dfrac{\alpha \times P1 + \beta \times P2}{\alpha + \beta}$

[0178] In Equation 1, weights α and β can be determined by considering the width and height of the current block. Depending on the width and height of the current block, weights α and β may have the same value or different values. If the width and height of the current block are the same, weights α and β can be set equally, and the predicted sample of the current sample can be set to the average value of P1 and P2. If the width and height of the current block are not the same, weights α and β may have different values. For example, if the width is greater than the height, a smaller value can be set for the weight corresponding to the width of the current block and a larger value can be set for the weight corresponding to the height of the current block. Alternatively, the opposite assignment may be used: if the width is greater than the height, a larger value can be set for the weight corresponding to the width of the current block and a smaller value can be set for the weight corresponding to the height of the current block. Here, the weight corresponding to the width of the current block may be β, and the weight corresponding to the height of the current block may be α.
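A minimal sketch of planar mode prediction is given below. Several details are assumptions made for illustration: P1 and P2 are obtained by distance-weighted linear interpolation, the weighted sum is normalized by α + β, and α is applied to P1 and β to P2. The function name `planar_predict` and the array layout are hypothetical.

```python
def planar_predict(top, left, width, height, alpha=1.0, beta=1.0):
    """Planar intra prediction sketch.

    top:  width + 1 samples above the block; top[width] is T (upper-right).
    left: height + 1 samples left of the block; left[height] is L (lower-left).
    alpha, beta: weights applied to P1 and P2 (the mapping is an assumption).
    """
    t, l = top[width], left[height]
    pred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # P1: horizontal interpolation between the left sample of this row and T.
            p1 = ((width - 1 - x) * left[y] + (x + 1) * t) / width
            # P2: vertical interpolation between the top sample of this column and L.
            p2 = ((height - 1 - y) * top[x] + (y + 1) * l) / height
            pred[y][x] = (alpha * p1 + beta * p2) / (alpha + beta)
    return pred


# With equal weights the result is the average of P1 and P2.
flat = planar_predict([100] * 5, [100] * 5, 4, 4)
```

With constant reference samples, every predicted sample equals that constant, which is a quick sanity check for the interpolation weights.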

[0179] FIG. 7 illustrates a DC mode-based intra prediction method according to the present disclosure.

[0180] Referring to FIG. 7, the average value of surrounding samples adjacent to the current block can be calculated, and the calculated average value can be set as the predicted value for all samples within the current block. Here, the surrounding samples may include the top reference sample and the left reference sample of the current block. However, depending on the shape of the current block, the average value may be calculated using only the top reference sample or only the left reference sample. For example, if the width of the current block is greater than the height, the average value may be calculated using only the top reference sample of the current block. Alternatively, if the ratio of the width to the height of the current block is greater than or equal to a predetermined threshold value, the average value may be calculated using only the top reference sample of the current block. Alternatively, if the ratio of the width to the height of the current block is less than or equal to a predetermined threshold value, the average value may be calculated using only the top reference sample of the current block. On the other hand, if the width of the current block is smaller than the height, the average value may be calculated using only the left reference sample of the current block. Alternatively, if the ratio of the width to the height of the current block is less than or equal to a predetermined threshold value, the average value may be calculated using only the left reference sample of the current block. Alternatively, if the ratio of the width to the height of the current block is greater than or equal to a predetermined threshold value, the average value can be calculated using only the left reference sample of the current block.
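One of the variants described above can be sketched as follows, assuming that wide blocks use only the top reference samples, tall blocks use only the left reference samples, and square blocks use both. The threshold-based variants are not shown, and the rounded integer average and the function name `dc_predict` are assumptions made for this sketch.

```python
def dc_predict(top, left, width, height):
    """DC intra prediction sketch: fill the block with a reference-sample average.

    top:  at least `width` samples above the block.
    left: at least `height` samples to the left of the block.
    """
    if width > height:
        refs = top[:width]                    # wide block: top samples only
    elif width < height:
        refs = left[:height]                  # tall block: left samples only
    else:
        refs = top[:width] + left[:height]    # square block: both
    dc = (sum(refs) + len(refs) // 2) // len(refs)   # rounded integer average
    return [[dc] * width for _ in range(height)]


wide_block = dc_predict([10, 20, 30, 40], [100, 100], 4, 2)   # average of top only
```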

[0181] FIG. 8 illustrates a directional mode-based intra prediction method according to the present disclosure.

[0182] If the intra prediction mode of the current block is a directional mode, projection can be performed on a reference line according to the angle of the directional mode. If a reference sample exists at the projected location, that reference sample can be set as the prediction sample of the current sample. If no reference sample exists at the projected location, a sample corresponding to the projected location can be generated using one or more neighboring samples adjacent to the projected location. For example, a sample corresponding to the projected location can be generated by performing interpolation based on two or more neighboring samples adjacent in both directions relative to the projected location. Alternatively, a single neighboring sample adjacent to the projected location can be set as the sample corresponding to the projected location. In this case, among multiple neighboring samples adjacent to the projected location, the neighboring sample closest to the projected location may be used. The sample corresponding to the projected location can be set as the prediction sample of the current sample.

[0183] Referring to FIG. 8, for the current sample B, if projection is performed to a reference line according to the angle of the intra-prediction mode at that location, a reference sample exists at the projected location (i.e., a reference sample at an integer location, R3). In this case, the reference sample at the projected location can be set as the prediction sample for the current sample B. For the current sample A, if projection is performed to a reference line according to the angle of the intra-prediction mode at that location, a reference sample (i.e., a reference sample at an integer location) does not exist at the projected location. In this case, a sample (r) at a fractional location can be generated by performing interpolation based on neighboring samples (e.g., R2 and R3) adjacent to the projected location. The generated sample (r) at a fractional location can be set as the prediction sample for the current sample A.
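The two cases of FIG. 8 (an integer projected location and a fractional projected location) can be illustrated with the sketch below. It is a simplified example that assumes a single top reference line immediately above the block, a non-negative tangent measured from the vertical axis, and linear interpolation of the two neighboring samples; the function name `predict_sample` is hypothetical.

```python
import math


def predict_sample(top_ref, x, y, tan_theta):
    """Predict sample (x, y) from the top reference line for one directional mode.

    top_ref[i] is the reference sample at horizontal position i on the row just
    above the block; tan_theta is the tangent of the prediction angle measured
    from the vertical axis (a simplifying assumption, non-negative here).
    """
    pos = x + (y + 1) * tan_theta      # projected location on the reference line
    base = math.floor(pos)
    frac = pos - base
    if frac == 0:
        return top_ref[base]           # integer location: copy the reference sample
    # Fractional location: interpolate the two neighboring reference samples.
    return (1 - frac) * top_ref[base] + frac * top_ref[base + 1]


refs = [10, 20, 30, 40, 50, 60]
sample_b = predict_sample(refs, 1, 0, 1.0)   # projects exactly onto refs[2]
sample_a = predict_sample(refs, 1, 0, 0.5)   # projects halfway between refs[1] and refs[2]
```

Here `sample_b` corresponds to the case of sample B (a reference sample exists at the projected location) and `sample_a` to the case of sample A (a fractional-location sample is interpolated).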

[0184] Figure 9 illustrates a method for deriving samples of fractional positions.

[0185] In the example of Fig. 9, the variable h represents the vertical distance (i.e., vertical distance) from the position of predicted sample A to the reference sample line, and the variable w represents the horizontal distance (i.e., horizontal distance) from the position of predicted sample A to the fractional position sample. Additionally, the variable θ represents a predefined angle according to the directionality of the intra-prediction mode, and the variable x represents the fractional position.

[0186] The variable w can be derived as shown in Equation 2 below.

[0187]

[0188] Subsequently, by removing the integer position from the variable w, the fractional position can finally be derived.

[0189] Fractional position samples can be generated by interpolating adjacent integer position reference samples. For example, fractional position reference samples at position x can be generated by interpolating integer position reference samples R2 and integer position reference samples R3.

[0190] In deriving fractional position samples, a scaling factor can be used to avoid real number operations. For example, if the scaling factor f is set to 32, the distance between neighboring integer reference samples can be set to 32 instead of 1, as in the example shown in FIG. 8 (b).

[0191] In addition, the tangent value for the angle θ determined by the directionality of the intra prediction mode can also be scaled up using the same scaling factor (e.g., 32).
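The scaled integer arithmetic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the relation w = h * tan(θ), the function names, the rounding offset, the two-tap linear interpolation, and the layout of the reference line are all assumptions, and the scaled tangent would in practice be taken from the tables of FIGS. 10 and 11.

```python
SCALE = 32  # scaling factor f from the text


def project(h, tan_scaled, scale=SCALE):
    """Split the scaled horizontal offset into integer and fractional parts.

    h          -- vertical distance (in samples) to the reference line
    tan_scaled -- tan(theta) multiplied by `scale` (non-negative here)
    Assumes w = h * tan(theta); returns (integer offset, fraction in 1/scale).
    """
    w_scaled = h * tan_scaled
    return w_scaled // scale, w_scaled % scale


def interpolate(left, right, frac, scale=SCALE):
    """Two-tap linear interpolation between adjacent integer samples."""
    # Adding scale // 2 before dividing rounds to nearest (an assumption).
    return ((scale - frac) * left + frac * right + scale // 2) // scale


def predict_sample(ref_line, x, h, tan_scaled):
    """Predict one sample from a 1D reference line.

    Hypothetical layout: ref_line[i] is the reference sample directly
    above column i of the current block.
    """
    idx, frac = project(h, tan_scaled)
    if frac == 0:
        return ref_line[x + idx]  # integer position: copy the sample
    return interpolate(ref_line[x + idx], ref_line[x + idx + 1], frac)
```

With a scaled tangent of 16 (tanθ = 0.5), a sample two rows below the reference line projects onto an integer position, while a sample one row below projects halfway between two reference samples and is interpolated.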

[0192] Figures 10 and 11 illustrate tangent values for angles scaled by 32 times for each intra prediction mode.

[0193] Figure 10 shows the scaled result of the tangent value for the non-wide angle intra prediction mode, and Figure 11 shows the scaled result of the tangent value for the wide angle intra prediction mode.

[0194] If the tangent value (tanθ) for the angle value in the intra prediction mode is positive, intra prediction can be performed using only one of the reference samples belonging to the top line of the current block (i.e., top reference samples) or the reference samples belonging to the left line of the current block (i.e., left reference samples). On the other hand, if the tangent value for the angle value in the intra prediction mode is negative, both the reference samples located at the top and the reference samples located at the left are utilized.

[0195] At this time, to simplify the implementation, the left reference samples may be projected upward or the top reference samples may be projected to the left to arrange the reference samples into a 1D array, and intra prediction may be performed using the reference samples in the 1D array.

[0196] FIG. 12 is a diagram illustrating an intra-prediction pattern when the directional mode is one of modes 34 to 49.

[0197] When the intra prediction mode of the current block is one of modes 34 to 49, intra prediction is performed using not only the upper reference samples of the current block but also the left reference samples. At this time, as in the example shown in FIG. 12, the reference samples located on the left side of the current block can be copied to the position of the upper line, or the reference samples located on the left side can be interpolated to generate the reference samples of the upper line.

[0198] For example, to obtain a reference sample for position A at the top of the current block, projection can be performed from position A on the top line to the left line of the current block, taking into account the directionality of the intra prediction mode of the current block. If the projected position is denoted as 'a', the value corresponding to position 'a' can be copied, or a fractional position value corresponding to 'a' can be generated and set as the value of position A. For example, if position 'a' is an integer position, the value of position A can be generated by copying the integer position reference sample. On the other hand, if position 'a' is a fractional position, the reference sample located above position 'a' and the reference sample located below position 'a' can be interpolated, and the interpolated value can be set as the value of position A. Meanwhile, the direction of projection from position A at the top of the current block to the left line of the current block may be parallel to, but opposite in sense to, the direction of the intra prediction mode of the current block.

[0199] Figure 13 is a diagram illustrating an example of generating an upper reference sample by interpolating left reference samples.

[0200] In Fig. 13, the variable h represents the horizontal distance between position A on the top line and position a on the left line. The variable w represents the vertical distance between position A on the top line and position a on the left line. Additionally, the variable θ represents a predefined angle according to the directionality of the intra prediction mode, and the variable x represents a fractional position.

[0201] The variable h can be derived as shown in Equation 3 below.

[0202]

[0203] Subsequently, by removing the integer position from the variable h, the fractional position can finally be derived.

[0204] In deriving fractional position samples, a scaling factor can be used to avoid real-number operations. For example, the tangent value for the angle θ can be scaled using a scaling factor f1. Here, since the direction of projection onto the left line is parallel and opposite to the direction of the directional prediction mode, the scaled tangent values shown in FIGS. 10 and 11 may also be used.

[0205] When a scaling factor f1 is applied, Equation 3 can be modified and used as shown in Equation 4 below.

[0206]

[0207] In the above manner, a 1D reference sample array can be constructed using only the reference samples belonging to the top line. As a result, an intra prediction for the current block can be performed using only the top reference samples constructed as a 1D array.

[0208] Figure 14 shows an example in which intra prediction is performed using reference samples arranged in a 1D array.

[0209] As shown in the example illustrated in Fig. 14, by projecting the left reference samples to generate the top reference samples, the prediction samples of the current block can be obtained using only the reference samples belonging to the top line.

[0210] Contrary to what is shown in FIGS. 12 and 14, a 1D reference sample array may be constructed using only the reference samples belonging to the left line by projecting the top reference sample onto the left line. Specifically, for directional modes 19 through 33 among the directional modes where the tangent value (tanθ) for the angle of the directional mode is negative, the reference samples belonging to the top line may be projected onto the left line to generate the left reference sample.

[0211] The intra prediction mode of the current block can also be derived using reference samples surrounding the current block. Specifically, the gradients for the horizontal and vertical directions of the reference samples are calculated, and the calculated gradients are used to derive the intra prediction mode of the current block.

[0212] Figure 15 is a diagram illustrating an example of setting a reference area.

[0213] For the sake of convenience of explanation, the current block size is assumed to be 4x4.

[0214] A reference area can be set to derive the intra prediction mode of the current block. For example, in FIG. 15, it is assumed that w0 columns adjacent to the left of the current block and h0 rows adjacent to the top of the current block are set as the reference area.

[0215] The number of columns (w0) and / or rows (h0) constituting the reference region may be fixed in the encoder and decoder. Alternatively, the number of columns (w0) and / or rows (h0) may be determined based on at least one of the size / shape of the current block, whether Intra Sub-Partitioning (ISP) is applied to the current block, or whether the current block is adjacent to a CTU boundary.

[0216] As another example, the size of the reference area may be determined according to the type of filter applied to the reference area. Specifically, the width and height of the filter can be set to the number of columns w0 and the number of rows h0, respectively. For example, assuming that a 3x3 mask as shown in FIG. 17, which will be described later, is used, the number of columns w0 and the number of rows h0 can each be set to 3.

[0217] The reference area may extend beyond the right boundary and / or bottom boundary of the current block. For example, in the example illustrated in FIG. 15, the reference area is shown as extending w1 from the right boundary of the current block and h1 from the bottom boundary of the current block.

[0218] The right extension distance w1 and / or bottom extension distance h1 can be set to be equal to the width and / or height of the current block. For example, if the size of the current block is 4x4, the right extension distance w1 can be set to 4, equal to the width of the current block, and the bottom extension distance h1 can be set to 4, equal to the height of the current block.

[0219] As another example, a reference area can also be set, as in the example shown in FIG. 16.

[0220] Specifically, as in the example illustrated in FIG. 16 (a), the right extension distance w1 and / or the bottom extension distance h1 can be set to 0. Furthermore, as in the example illustrated in FIG. 16 (b), the upper reference area can be formed using only reference samples with x-axis coordinates between 0 and (w-1), and the left reference area can be formed using only reference samples with y-axis coordinates between 0 and (h-1). Here, w represents the width of the current block, and h represents the height of the current block.

[0221] As another example, reference line candidates for intra prediction of the current block or at least one of the reference line candidates may be set as a reference region.

[0222] As another example, depending on whether the current block is adjacent to the CTU boundary, the reference area may be configured using only the top reference area or only the left reference area.

[0223] Filtering (i.e., convolution) using a mask within a reference region can be performed. In this case, the filter used may be at least one of a Sobel mask or a Prewitt mask that outputs a gradient value.

[0224] Figure 17 illustrates the filter coefficients for the Sobel mask and the Prewitt mask, respectively.

[0225] Filters of a different type than those shown in FIG. 17 may also be applied to the reference area. For example, instead of a 3x3 square filter, a 1D filter of 1x3 or 3x1, a rectangular filter of 2x3 or 3x2, a cross-shaped filter, or a diamond-shaped filter may be applied to the reference area. Alternatively, filters of a different size than those shown in FIG. 17 (e.g., 2x2, 4x4, or 5x5) may also be applied to the reference area.

[0226] The type of filter applied to the reference area may be predefined in the encoder and decoder. Alternatively, multiple filter candidates may be predefined, and index information pointing to one of the multiple filter candidates may be encoded and explicitly signaled through the bitstream.

[0227] As another example, at least one of a plurality of filter candidates may be adaptively selected based on at least one of the size / shape of the current block, whether ISP is applied to the current block, the size of the reference region, the intra prediction mode of neighboring blocks, or whether the current block touches a CTU boundary. Here, the neighboring blocks may include at least one of the top neighboring block or the left neighboring block of the current block.

[0228] The type of filter applied to the top reference area and the type of filter applied to the left reference area may be different.

[0229] By applying a vertical direction mask to a specific reference sample within a reference region, the vertical direction slope Dy for the reference sample can be obtained. Additionally, by applying a horizontal direction mask to a specific reference sample within a reference region, the horizontal direction slope Dx for the reference sample can be obtained.
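The slope computation can be sketched as a 3x3 convolution at one reference sample. The coefficients below are the standard Sobel values; FIG. 17 may use a different sign or orientation convention, so both the coefficients and the function name `slopes` should be read as assumptions.

```python
# Standard 3x3 Sobel coefficients (assumed; FIG. 17 is not reproduced here).
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))   # horizontal slope Dx
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))   # vertical slope Dy


def slopes(samples, x, y):
    """Return (Dx, Dy) for the sample at column x, row y.

    samples is a list of rows; the 3x3 window centred on (x, y) must lie
    entirely inside it.
    """
    dx = dy = 0
    for j in range(3):
        for i in range(3):
            value = samples[y + j - 1][x + i - 1]
            dx += SOBEL_X[j][i] * value
            dy += SOBEL_Y[j][i] * value
    return dx, dy
```

A window containing a purely vertical edge yields a non-zero Dx and a zero Dy, and a purely horizontal edge yields the opposite.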

[0230] Figure 18 shows the locations where the vertical and horizontal slopes are obtained within the reference area.

[0231] Assuming that a 3x3 mask is applied as in the example shown in FIG. 18, a vertical slope Dy and a horizontal slope Dx can be obtained for each of the reference samples that are not adjacent to the boundary of the reference region. For example, when w0 and h0 are 3 and w1 and h1 are 4, as in the example shown in FIG. 18, one vertical slope Dy and one horizontal slope Dx can be obtained for each of 17 reference samples, giving 17 vertical slopes and 17 horizontal slopes in total.
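The count of 17 samples can be checked with a short sketch. It assumes an L-shaped reference region made of a top strip of h0 rows spanning w0 + w + w1 columns and a left strip of w0 columns spanning h0 + h + h1 rows, and it counts the samples whose whole 3x3 neighbourhood lies inside the region. The function name and the coordinate layout are hypothetical.

```python
def interior_sample_count(w, h, w0, h0, w1, h1):
    """Count region samples whose full 3x3 window stays inside the region."""
    region = set()
    # Top strip: h0 rows, including the top-left corner and the w1 extension.
    for y in range(h0):
        for x in range(w0 + w + w1):
            region.add((x, y))
    # Left strip: w0 columns, including the h1 extension below the block.
    for x in range(w0):
        for y in range(h0 + h + h1):
            region.add((x, y))
    offsets = (-1, 0, 1)
    return sum(
        all((x + i, y + j) in region for j in offsets for i in offsets)
        for (x, y) in region
    )
```

For a 4x4 block with w0 = h0 = 3 and w1 = h1 = 4, the middle row of the top strip contributes 9 samples, the middle column of the left strip contributes 9, and the two share one corner sample, which gives 17.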

[0232] If a filter of a different size or shape than that shown in Fig. 18 is applied, the vertical slope Dy and the horizontal slope Dx can be obtained for more / fewer reference samples than shown.

[0233] Based on the vertical slope Dy and horizontal slope Dx of each reference sample, an intra-prediction mode can be determined for each reference sample.

[0234] A method of determining the intra prediction mode of a reference sample using the vertical slope Dy and the horizontal slope Dx is described below.

[0235] For example, if either the vertical slope Dy or the horizontal slope Dx is 0, the directional mode of the reference sample can be determined as the horizontal mode (18) or the vertical mode (50). Specifically, if the horizontal slope Dx is 0 and the vertical slope Dy is not 0, the intra-prediction mode of the reference sample can be determined as the vertical mode (50). Conversely, if the vertical slope Dy is 0 and the horizontal slope Dx is not 0, the intra-prediction mode of the reference sample can be determined as the horizontal mode (18).

[0236] If the vertical slope Dy and the horizontal slope Dx are both not zero, one of the remaining directional modes, excluding the horizontal mode and the vertical mode, can be determined as the intra-prediction mode of the reference sample.

[0237] Here, the intra prediction mode group to which the intra prediction mode of the reference sample belongs can be determined by comparing the absolute values of the vertical slope Dy and the horizontal slope Dx. An intra prediction mode group may consist of multiple directional modes of similar directionality.

[0238] Figure 19 shows an example of grouping directional modes into multiple intra-prediction mode groups.

[0239] In FIG. 19, directional modes are exemplified as being classified into four intra prediction mode groups (a to d) based on the horizontal direction mode (18), diagonal direction mode (34), and vertical direction mode (50).

[0240] In the illustrated example, groups a and b are symmetrical with respect to the horizontal direction mode (18), and groups c and d are symmetrical with respect to the vertical direction mode (50).

[0241] In addition, the angles of directional modes 36 through 66 are the same as the angles of modes 2 through 34 transposed.

[0242] If the absolute value of the horizontal slope Dx of a reference sample is greater than the absolute value of the vertical slope Dy, the directional mode of the reference sample may belong to group a or group b.

[0243] Conversely, if the absolute value of the vertical slope Dy of a reference sample is greater than the absolute value of the horizontal slope Dx, the directional mode of the reference sample may belong to group c or group d.

[0244] Table 3 shows the intra prediction mode groups to which the reference sample's intra prediction mode belongs, depending on the magnitudes of the horizontal slope Dx and the vertical slope Dy.

[0245] Table 3

Signs of Dx and Dy      if (|Dx| > |Dy|)      Else
Dx >= 0, Dy >= 0        b                     c
Dx < 0, Dy >= 0         a                     d
Dx >= 0, Dy < 0         a                     d
Dx < 0, Dy < 0          b                     c
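The zero-slope rules and the group assignment of Table 3 can be combined into one small sketch. The function name and the return convention are hypothetical, and returning None when both slopes are zero is an assumption, since that case is not addressed in the text.

```python
HORIZONTAL_MODE = 18
VERTICAL_MODE = 50


def classify(dx, dy):
    """Map a (Dx, Dy) pair to a fixed mode or to a mode group of Table 3.

    Returns ("mode", number), ("group", letter), or None for a flat sample.
    """
    if dx == 0 and dy == 0:
        return None                      # no direction (assumption)
    if dx == 0:
        return ("mode", VERTICAL_MODE)   # Dx == 0, Dy != 0
    if dy == 0:
        return ("mode", HORIZONTAL_MODE)  # Dy == 0, Dx != 0
    # Both slopes are non-zero here, so only their signs matter.
    same_sign = (dx > 0) == (dy > 0)
    if abs(dx) > abs(dy):
        return ("group", "b" if same_sign else "a")
    return ("group", "c" if same_sign else "d")
```

The "Else" column of Table 3 also covers |Dx| equal to |Dy|, so that case falls into group c or d in this sketch.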

[0246] Using the horizontal slope Dx and vertical slope Dy of the reference sample, the slope of the directional mode to be assigned to the reference sample can be derived. To this end, a variable R representing the ratio between the horizontal slope and the vertical slope can be derived as shown in Equation 5 below.

[0247]

[0248] As exemplified in Equation 5, the variable R can be derived by using the larger absolute value between the horizontal slope Dx and the vertical slope Dy as the denominator.

[0249] Subsequently, the directional mode of the reference sample can be determined by comparing the variable R with the tangent value (tanθ) for the angle of each directional mode. Specifically, a directional mode having the same tangent value as the variable R or the most similar tangent value can be assigned to the reference sample.

[0250] At this time, if the tangent values for the angles of the directional modes are stored in the encoder and decoder in a scaled state, as in the example illustrated in FIG. 10 or FIG. 11, the directional mode of the reference sample can be determined by scaling the variable R using the same scaling factor.
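Matching the scaled ratio R against the stored scaled tangents can be sketched without division. The table below is a hypothetical stand-in for the values of FIGS. 10 and 11, which are not reproduced in this text, and the function returns an index into that table rather than a mode number, since the mapping to mode numbers depends on the mode group.

```python
SCALE = 32
# Hypothetical scaled tangent magnitudes (tan(theta) * 32); the real values
# are those of FIGS. 10 and 11.
SCALED_TANGENTS = (1, 2, 3, 4, 6, 8, 10, 12, 14, 16, 18, 20, 23, 26, 29, 32)


def nearest_tangent_index(dx, dy, table=SCALED_TANGENTS, scale=SCALE):
    """Index of the table entry closest to the scaled ratio R * scale.

    Both slopes are assumed non-zero (the zero cases map directly to the
    horizontal or vertical mode). R uses the larger magnitude as the
    denominator, so 0 < R <= 1.
    """
    small, large = sorted((abs(dx), abs(dy)))
    # |R * scale - t| is minimised by comparing |scale * small - t * large|
    # instead; the ordering is the same because large > 0.
    return min(range(len(table)),
               key=lambda i: abs(scale * small - table[i] * large))
```

Cross-multiplying keeps the comparison in integer arithmetic, in line with the stated goal of avoiding real-number operations.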

[0251] Next, the amplitude of each of the reference samples can be derived. The amplitude can be derived as the sum of the absolute value of the horizontal slope Dx and the absolute value of the vertical slope Dy, as shown in Equation 6 below.

[0252] $\mathrm{Amplitude} = |D_x| + |D_y|$ (Equation 6)

[0253] Next, for each of the intra prediction modes, the amplitude value of each of the reference samples assigned to the same intra prediction mode can be accumulated.

[0254] $\mathrm{Histogram}[\mathit{intra\_mode}] \mathrel{+}= \mathrm{Amplitude}$ (Equation 7)

[0255] In Equation 7, intra_mode represents an intra prediction mode. For example, the amplitude accumulation value for a directional mode with mode number N can be derived by summing the amplitude values of reference samples assigned to mode N within a reference region, and the amplitude accumulation value for a directional mode with mode number M can be derived by summing the amplitude values of reference samples assigned to mode M within a reference region.

[0256] The buffer storing the amplitude accumulation values can be initialized on a per-block basis. For example, when specifying a reference area around the current block, the amplitude accumulation value for each intra prediction mode can be initialized to 0.

[0257] Through the above process, when a histogram recording the amplitude accumulation values for each intra prediction mode is derived, at least one intra prediction mode can be selected from the histogram in descending order of amplitude accumulation value. The number of selected intra prediction modes may be M, where M is a natural number greater than or equal to 1. The value of M may be predefined in the encoder and decoder. Alternatively, the value of M may be adaptively determined by considering at least one of the size / shape of the current block or whether ISP is applied to the current block. That is, the M intra prediction modes with the largest amplitude accumulation values may be selected.
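The accumulation and selection steps can be sketched as follows. The input format (one tuple of mode, Dx, Dy per reference sample), the function names, and the tie-break by lower mode number are assumptions made for illustration.

```python
from collections import defaultdict


def build_histogram(assignments):
    """Accumulate |Dx| + |Dy| per intra prediction mode.

    assignments -- iterable of (intra_mode, dx, dy), one per reference sample
    """
    histogram = defaultdict(int)  # every mode starts at 0 for each block
    for intra_mode, dx, dy in assignments:
        histogram[intra_mode] += abs(dx) + abs(dy)
    return dict(histogram)


def select_modes(histogram, m):
    """Return the m modes with the largest accumulated amplitude."""
    # Ties are broken by the lower mode number (an assumption).
    ranked = sorted(histogram.items(), key=lambda item: (-item[1], item[0]))
    return [mode for mode, _ in ranked[:m]]
```

Building a fresh histogram for each block corresponds to initializing the accumulation buffer to 0 on a per-block basis.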

[0258]

[0259] A histogram may also be derived in a manner different from the example described above. Specifically, a histogram can be constructed based on the frequency of intra-prediction mode usage of neighboring blocks adjacent to the current block.

[0260] FIG. 20 is a diagram illustrating an example in which a histogram is generated based on the frequency of intra-prediction mode usage of neighboring blocks adjacent to the current block.

[0261] For the sake of convenience of explanation, Wn is assumed to be the width of neighbor block n, and Hn is assumed to be the height of neighbor block n.

[0262] Neighbor blocks adjacent to the current block can be sequentially searched to accumulate the intra prediction modes of the neighbor blocks in the histogram. At this time, the amplitude value of the intra prediction mode can be set to the size value of the neighbor block (i.e., the number of samples within the neighbor block). Equation 8 shows an example where the amplitude value of the intra prediction mode is accumulated by the size of the neighbor block.

[0263] $\mathrm{Histogram}[\mathit{intra\_mode}] \mathrel{+}= W_n \times H_n$ (Equation 8)

[0264] For example, if an intra prediction mode exists for neighboring block A, the amplitude value (i.e., frequency of occurrence) of that intra prediction mode can be accumulated in the histogram by the size of neighboring block A (W_A x H_A).

[0265] In other words, the larger the size of neighboring blocks, the greater the amplitude value (i.e., frequency of occurrence) accumulated in the histogram. Conversely, the smaller the size of neighboring blocks, the smaller the amplitude value (i.e., frequency of occurrence) accumulated in the histogram.
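The size-weighted accumulation can be sketched as follows. The input format, the function name, and the use of None for a neighbor that has no intra prediction mode are assumptions.

```python
def neighbor_histogram(neighbors):
    """Accumulate each neighbor's intra mode, weighted by its sample count.

    neighbors -- iterable of (intra_mode, width, height); intra_mode is None
                 when the neighbor has no intra prediction mode (assumption)
    """
    histogram = {}
    for intra_mode, width, height in neighbors:
        if intra_mode is None:
            continue  # nothing to accumulate for this neighbor
        histogram[intra_mode] = histogram.get(intra_mode, 0) + width * height
    return histogram
```

An 8x8 neighbor therefore contributes four times as much to its mode as a 4x4 neighbor.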

[0266] In the example illustrated in FIG. 20, a histogram is exemplified as being derived using neighbor blocks adjacent to the left of the current block, neighbor blocks adjacent to the top of the current block, and neighbor blocks adjacent to the top-left of the current block.

[0267] Although not shown, neighboring blocks adjacent to the bottom-left of the current block and neighboring blocks adjacent to the top-right of the current block can also be used to derive the histogram.

[0268] Meanwhile, information indicating whether to use a histogram derived through slope analysis (i.e., a histogram derived by the embodiment of FIG. 18) or a histogram derived through frequency of occurrence (i.e., a histogram derived by the embodiment of FIG. 20) may be encoded and signaled.

[0269] At least one intra prediction mode selected from the histogram can be set as the intra prediction mode of the current block, and a prediction block of the current block can be obtained based on the intra prediction mode of the current block. For example, if one intra prediction mode is selected from the histogram, a prediction block obtained based on the selected intra prediction mode can be used as the final prediction block of the current block.

[0270] When multiple intra prediction modes are selected from a histogram, intra prediction can be performed based on each of the multiple intra prediction modes. Accordingly, when multiple prediction blocks are generated, the final prediction block of the current block can be obtained through an average operation or a weighted sum operation of the multiple prediction blocks.

[0271] At this time, for the weighted sum operation, the weights applied to each prediction block can be determined based on the amplitude of the intra prediction mode. That is, among the multiple intra prediction modes, the largest weight can be assigned to the prediction block derived based on the intra prediction mode with the largest amplitude, and the smallest weight can be assigned to the prediction block derived based on the intra prediction mode with the smallest amplitude.

[0272] At this time, the weight assigned to each prediction block can be determined based on the ratio between amplitudes. Alternatively, a weight value can be stored for each amplitude rank, and the weight mapped to the amplitude rank of the corresponding intra prediction mode can be applied to each prediction block.
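The ratio-based weighting can be sketched as a per-sample weighted sum in integer arithmetic. The function name, the block representation (lists of rows), and the round-to-nearest offset are assumptions.

```python
def blend(pred_blocks, amplitudes):
    """Weighted sum of prediction blocks, weights proportional to amplitude.

    pred_blocks -- list of equally sized 2D blocks (lists of rows)
    amplitudes  -- accumulated amplitude of the mode behind each block
    """
    total = sum(amplitudes)
    rows, cols = len(pred_blocks[0]), len(pred_blocks[0][0])
    return [
        [
            # total // 2 rounds the integer division to nearest (assumption).
            (sum(a * block[y][x] for a, block in zip(amplitudes, pred_blocks))
             + total // 2) // total
            for x in range(cols)
        ]
        for y in range(rows)
    ]
```

With amplitudes of 3 and 1, the first prediction block receives a weight of 3/4 and the second a weight of 1/4.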

[0273] A prediction block for the current block can be obtained by considering at least one default mode along with at least one intra prediction mode selected from the histogram. For example, multiple prediction blocks for the current block can be obtained by performing intra prediction based on each of the intra prediction mode and the default mode selected from the histogram. Subsequently, a final prediction block for the current block can be obtained through an average operation or a weighted sum operation of the multiple prediction blocks.

[0274] The number of default modes N may be an integer greater than or equal to 0, or alternatively an integer greater than or equal to 1. When M intra prediction modes are selected from the histogram, intra prediction can be performed based on each of the M intra prediction modes and N default modes to obtain (M+N) prediction blocks. Subsequently, the final prediction block of the current block can be obtained through an average operation or a weighted sum operation of the (M+N) prediction blocks.

[0275] The number of default modes N may be predefined in the encoder and decoder. Alternatively, the number of default modes N may be adaptively determined based on at least one of the size / shape of the current block, whether ISP is applied to the current block, or whether at least one intra prediction mode selected from the histogram includes a default mode.

[0276] The default mode may include at least one of a planar mode, a DC mode, or a predefined directional mode.

[0277] The encoder and decoder may also be configured to use a predefined mode (e.g., planar mode) among the modes listed above as the default mode.

[0278] Alternatively, the type of default mode may be adaptively determined based on the type of directional mode selected via the histogram. For example, if at least one directional mode selected via the histogram is a vertical mode or a horizontal mode, the planar mode or DC mode may be set as the default mode. On the other hand, if a vertical and / or horizontal mode is not selected via the histogram, the vertical mode or horizontal mode may be set as the default mode.
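The adaptive choice of default mode can be sketched as a simple rule. The mode numbers for planar (0) and DC (1) are assumed here, and picking planar over DC, or vertical over horizontal, is an arbitrary choice among the alternatives the text allows.

```python
PLANAR_MODE, DC_MODE = 0, 1              # assumed mode numbering
HORIZONTAL_MODE, VERTICAL_MODE = 18, 50


def choose_default_mode(selected_modes):
    """Pick a default mode from the modes selected via the histogram."""
    if HORIZONTAL_MODE in selected_modes or VERTICAL_MODE in selected_modes:
        return PLANAR_MODE   # DC_MODE would be equally permissible
    return VERTICAL_MODE     # HORIZONTAL_MODE would be equally permissible
```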

[0279] Instead of setting the region adjacent to the current block as the reference region, the reference block indicated by the block vector of the current block may be set as the reference region.

[0280] Depending on the shape of the current block, the availability of wide-angle intra prediction modes may be determined. For example, if the current block is a square shape with equal width and height, the directional modes selected from the histogram may consist of non-wide-angle intra prediction modes. On the other hand, if the current block is a non-square shape with different widths and heights, some of the directional modes selected from the histogram may be converted into wide-angle intra prediction modes.

[0281] When performing intra prediction based on an intra prediction mode derived through a histogram, a predefined reference line may be used. Here, the predefined reference line may be the reference line adjacent to the current block (i.e., index 0) or a non-adjacent reference line (e.g., index 1).

[0282] Information indicating whether to apply a method of performing intra prediction by selecting an intra prediction mode through the histogram described above can be encoded and signaled through a bitstream. The information may be a 1-bit flag.

[0283] Alternatively, whether to select an intra prediction mode through a histogram can be determined based on at least one of the size / shape of the current block, whether ISP is applied to the current block, whether the current block touches a CTU boundary, or whether neighboring blocks are encoded with intra prediction.

[0284] For example, if at least one of the top neighbor block or left neighbor block of the current block is not encoded in intra prediction, a method for selecting an intra prediction mode through a histogram can be applied to the current block.

[0285] The method of selecting an intra-prediction mode via a histogram can be applied to both the luminance component and the chroma component. Alternatively, the method described above can be applied only to the luminance component. Or, for each of the luminance component and the chroma component, it may be determined independently whether to select an intra-prediction mode via a histogram.

[0286] The intra prediction mode of the current block can also be derived by utilizing the region surrounding the current block. The surrounding region referenced to derive the intra prediction mode of the current block can be referred to as the reference region.

[0287] FIG. 21 is a drawing illustrating a reference area around the current block.

[0288] In FIG. 21, the width w and height h of the current block are both 4.

[0289] As shown in the example illustrated in FIG. 21, a surrounding area adjacent to the current block can be set as a reference area. Specifically, a left reference area adjacent to the left of the current block and a top reference area adjacent to the top of the current block can be set, respectively.

[0290] The size of the left reference area can be represented by w0, and the size of the top reference area can be represented by h0. For example, w0 represents the number of reference sample lines (i.e., reference sample columns) included in the left reference area, and h0 represents the number of reference sample lines (i.e., reference sample rows) included in the top reference area. In this case, w0 and h0 can each be a natural number greater than or equal to 1. Additionally, w0 and h0 may be predefined in the encoder and decoder.

[0291] For example, as shown in the example illustrated in FIG. 21, if the size of the current block is 4x4 or 2x2, the 4x4 or 2x2 area to the left of the current block can be set as the left reference area, and the 4x4 or 2x2 area to the top of the current block can be set as the top reference area.

[0292] Alternatively, at least one of the size w0 and / or h0 of the reference area may be adaptively determined based on at least one of the size of the current block, the shape of the current block, whether Intra Sub-partitioning (ISP) is applied to the current block, or whether the current block is adjacent to a CTU boundary. Here, the size of the current block represents at least one of the width, height, or product of the width and height of the current block. For example, at least one of the left reference area and the top reference area may be determined to be equal to the size of the current block. Alternatively, the left reference area may be set as a square area with a side length equal to the height of the current block, and the top reference area may be set as a square area with a side length equal to the width of the current block.

[0293] Alternatively, the size of the reference area can be determined by comparing the size of the current block with a threshold value. For example, if the size of the current block is greater than or equal to the threshold value, the size of at least one of the left reference area or the top reference area can be set to 4x4. On the other hand, if the size of the current block is less than the threshold value, the size of at least one of the left reference area or the top reference area can be set to 2x2.

[0294] Intra prediction can be performed on a reference region using reference samples from the reference region. Here, reference samples for the left reference region may belong to a column adjacent to the left of the left reference region, and reference samples for the top reference region may belong to a row adjacent to the top of the top reference region.

[0295] In the example illustrated in FIG. 21, w1 and h1 are variables representing the range of reference samples used to perform intra-prediction on a reference region. Specifically, w1 may represent the number of reference samples in the upper-right region of the upper reference region, and h1 may represent the number of reference samples in the lower-left region of the left reference region.

[0296] In the example shown in Fig. 21, w1 and h1 are both 4.

[0297] At this time, w1 and h1 may be predefined in the encoder and decoder. For example, w1 and h1 may each be an integer greater than or equal to 0, or alternatively a natural number greater than or equal to 1.

[0298] Alternatively, at least one of w1 or h1 may be adaptively determined based on at least one of the size of the current block, the shape of the current block, whether Intra Sub-partitioning (ISP) is applied to the current block, or whether the current block is adjacent to a CTU boundary. Here, the size of the current block represents at least one of the width, height, or the product of the width and height of the current block.

[0299] For example, if the current block size (e.g., width or height) is greater than or equal to a threshold value, at least one of w1 or h1 may be set to 8 or 16. On the other hand, if the current block size (e.g., width or height) is less than a threshold value, at least one of w1 or h1 may be set to 4.

[0300] Meanwhile, under the above conditions, w1 can be determined dependently on the width w of the current block, and h1 can be determined dependently on the height h of the current block.

[0301] Alternatively, if the current block is square, w1 and h1 may be identical. On the other hand, if the current block is non-square, w1 and h1 may be different.

[0302] Intra-prediction can be performed on a reference region using reference samples for the reference region. Specifically, after performing intra-prediction on the reference region based on multiple intra-prediction modes, the cost for each prediction result can be calculated.

[0303] Figures 22 and 23 show an example of performing intra prediction on a reference area based on planar mode.

[0304] Specifically, in FIG. 22, reference samples used to perform intra prediction based on planar mode for the left reference area and reference samples used to perform intra prediction based on planar mode for the top reference area are shown.

[0305] As in the example illustrated in FIG. 22, reference samples may be included in the line adjacent to the left of the left reference area and the line adjacent to the top of the top reference area.

[0306] Accordingly, left reference samples for the left reference area are adjacent to the left reference area, whereas top reference samples for the left reference area may not be adjacent to the left reference area.

[0307] Additionally, the top reference samples for the top reference area are adjacent to the top reference area, whereas the left reference samples for the top reference area may not be adjacent to the top reference area.

[0308] Alternatively, as in the example illustrated in FIG. 23, the reference samples for the upper reference area may consist of upper reference samples adjacent to the upper reference area and left reference samples adjacent to the upper reference area, and the reference samples for the left reference area may consist of upper reference samples adjacent to the left reference area and left reference samples adjacent to the left reference area.

[0309] Figures 24 and 25 show an example of performing intra prediction on a reference region based on DC mode.

[0310] When intra prediction based on DC mode is performed, the prediction samples can be set as the average value of the reference samples. In this case, as shown in the example illustrated in FIG. 24, the average value for the upper reference area (i.e., DCval) can be calculated using only the upper reference samples adjacent to the upper reference area, and the average value for the left reference area can be calculated using only the left reference samples adjacent to the left reference area.

[0311] Alternatively, as in the example illustrated in FIG. 25, the average value for the upper reference area can be derived using the left reference samples adjacent to the upper reference area together with the upper reference samples adjacent to the upper reference area, and the average value for the left reference area can be derived using the upper reference samples adjacent to the left reference area together with the left reference samples adjacent to the left reference area.
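The two DC variants can be sketched as follows, assuming reference samples are passed as flat lists and that the average is a rounded integer mean (the rounding rule is an assumption, as is the function name `dc_value`). Passing only one list corresponds to using samples on a single side; passing both corresponds to combining the upper and left samples.

```python
def dc_value(primary_samples, secondary_samples=None):
    """Rounded integer average of the supplied reference samples.

    primary_samples: samples on the side always used (e.g. upper samples
    for the upper reference area).
    secondary_samples: optional samples from the other side.
    """
    samples = list(primary_samples)
    if secondary_samples:
        samples += list(secondary_samples)
    # Add half the divisor before integer division to round to nearest.
    return (sum(samples) + len(samples) // 2) // len(samples)
```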

[0312] Figures 26 and 27 illustrate an example of performing intra prediction on a reference region based on a directional mode.

[0313] Meanwhile, depending on the directional mode, intra prediction for the reference region can be performed using only the reference samples belonging to the top row of the top reference region, or intra prediction for the reference region can be performed using only the reference samples belonging to the left column of the left reference region.

[0314] For example, FIG. 26 shows an example in which an intra prediction for a reference region is performed using only the reference samples belonging to the top row of the upper reference region.

[0315] For example, if the index of the directional mode is equal to or greater than the index of the top-left diagonal directional mode (i.e., 34), an intra prediction for the reference regions (i.e., the top reference region and the left reference region) can be performed using only the reference samples belonging to the top row of the top reference region. Accordingly, for the top reference region, reference samples adjacent to the top reference region are used, but for the left reference region, reference samples not adjacent to the left reference region may be used.

[0316] Alternatively, as in the example illustrated in FIG. 27, for the upper reference region, intra prediction may be performed using upper reference samples adjacent to the upper reference region, and for the left reference region, intra prediction may be performed using upper reference samples adjacent to the left reference region.

[0317] Meanwhile, if the index of the directional mode is smaller than the index of the vertical mode (i.e., 50), the reference samples belonging to the left column of the left reference area (i.e., left reference samples) can be projected to the top row of the top reference area according to the direction of the directional mode to derive the reference samples belonging to the top row (i.e., top reference samples). Meanwhile, if the position projected from the left reference samples is not an integer position, the left reference samples can be interpolated to obtain the top reference samples.

[0318] Although not illustrated, if the index of the directional mode is smaller than the index of the top-left diagonal directional mode, intra prediction for the reference regions (i.e., the top reference region and the left reference region) can be performed using only the reference samples belonging to the left column of the left reference region.

[0319] Meanwhile, if the index of the directional mode is greater than the index of the horizontal directional mode (i.e., 18), the reference samples belonging to the top row of the top reference area (i.e., top reference samples) can be projected to the left column of the left reference area according to the direction of the directional mode to derive the reference samples belonging to the left column (i.e., left reference samples). If the position projected from the top reference samples is not an integer position, the top reference samples can be interpolated to obtain the left reference samples.

[0320] After performing multiple intra predictions on a reference region based on multiple intra prediction modes, the cost for each intra prediction mode can be calculated. Specifically, the cost for an intra prediction mode can be calculated based on the difference between the reconstructed samples within the reference region and the predicted samples within the reference region obtained through intra prediction.

[0321] Meanwhile, the cost function for calculating the cost may include at least one of SAD (Sum of Absolute Difference), SATD (Sum of Absolute Transformed Differences), SSD (Sum of Squared Difference), or MR-SAD (Mean-Removed Sum of Absolute Differences).
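For illustration, three of the listed cost functions can be written as below, assuming the reconstructed and predicted samples of the reference region are given as equally long flat lists. SATD is omitted because it additionally requires a transform (typically a Hadamard transform) that the text does not specify. The function names are hypothetical.

```python
def sad(recon, pred):
    """Sum of Absolute Differences."""
    return sum(abs(r - p) for r, p in zip(recon, pred))


def ssd(recon, pred):
    """Sum of Squared Differences."""
    return sum((r - p) ** 2 for r, p in zip(recon, pred))


def mr_sad(recon, pred):
    """Mean-Removed SAD: the mean difference is subtracted first,
    so a constant brightness offset does not contribute to the cost."""
    mean_diff = (sum(recon) - sum(pred)) / len(recon)
    return sum(abs(r - p - mean_diff) for r, p in zip(recon, pred))
```

A prediction that differs from the reconstruction only by a constant offset has a non-zero SAD but a zero MR-SAD.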

[0322] Once the cost for each intra prediction mode is calculated, the intra prediction mode with the lowest cost can be selected.

[0323] Alternatively, N intra-prediction modes with low costs can be selected. Here, N is a natural number greater than or equal to 1, such as 2, 3, or 4.

[0324] Subsequently, N intra predictions are performed on the current block based on the N intra prediction modes to obtain N prediction blocks. The final prediction block of the current block can then be obtained as a weighted sum of the N prediction blocks.

[0325] Meanwhile, the weights for the weighted sum can be determined by the ratio of the costs of each intra-prediction mode. That is, if the cost of an intra-prediction mode is low, a high weight may be assigned to the prediction block derived from that intra-prediction mode. Conversely, if the cost of an intra-prediction mode is high, a low weight may be assigned to the prediction block derived from that intra-prediction mode.
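One possible weighting that satisfies this description is sketched below. The exact formula is an assumption: the text only requires that a lower cost yields a higher weight. Here each weight is proportional to the total cost minus the mode's own cost, which reduces to w0 = cost1 / (cost0 + cost1) for two modes. The names `cost_weights` and `blend` are hypothetical.

```python
def cost_weights(costs):
    """Weights that sum to 1 and decrease as the cost increases."""
    n = len(costs)
    total = sum(costs)
    if n == 1 or total == 0:
        return [1.0 / n] * n  # nothing to distinguish: equal weights
    return [(total - c) / ((n - 1) * total) for c in costs]


def blend(pred_blocks, weights):
    """Weighted sum of prediction blocks given as flat sample lists."""
    size = len(pred_blocks[0])
    return [round(sum(w * block[i] for w, block in zip(weights, pred_blocks)))
            for i in range(size)]
```

For example, costs of 10 and 30 give weights of 0.75 and 0.25, so the prediction block from the cheaper mode dominates the final prediction.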

[0326] Meanwhile, an intra prediction mode can be derived for each of the upper reference area and the left reference area. For example, based on the results of performing multiple intra predictions on the upper reference area, a first intra prediction mode with the lowest cost can be selected, and based on the results of performing multiple intra predictions on the left reference area, a second intra prediction mode with the lowest cost can be selected. Subsequently, based on the first intra prediction mode and the second intra prediction mode, two intra predictions can be performed on the current block to obtain a first prediction block and a second prediction block. The final prediction block of the current block can then be obtained as a weighted sum or an average of the first prediction block and the second prediction block.

[0327] Alternatively, at least one intra prediction mode selected in order of lowest cost may be inserted into the MPM list of the current block. For example, a first intra prediction mode derived from the top reference region and a second intra prediction mode derived from the left reference region may be inserted into the MPM list of the current block.

[0328] Afterwards, at least one of the intra prediction mode candidates included in the MPM list can be selected to perform an intra prediction for the current block.

[0329] Alternatively, for each of the left reference area and the top reference area, the cost for each intra prediction mode can be calculated. Subsequently, the area with the smaller cost among the left reference area and the top reference area can be selected, and the intra prediction mode having the smallest cost in the selected area can be set as the intra prediction mode of the current block. In this case, the cost of each reference area may be derived by summing the costs of the intra prediction modes for that reference area.

[0330] Meanwhile, the number and / or types of intra prediction modes applied to the left reference area and the intra prediction modes applied to the top reference area may be the same or different.

[0331] Alternatively, N intra prediction modes can be selected from the top reference area in ascending order of cost, and N intra prediction modes can be selected from the left reference area in ascending order of cost.

[0332] Subsequently, based on 2N intra prediction modes, the cost of 2N intra prediction modes can be calculated again by applying them to the upper reference area and the left reference area. That is, if the initial cost of an intra prediction mode was obtained by applying intra prediction to only one of the left reference area and the upper reference area, the cost of the intra prediction mode in this round can be obtained by applying intra prediction to the left reference area and the upper reference area.

[0333] Afterwards, the intra prediction mode with the smallest cost among 2N intra prediction modes, or M intra prediction modes selected in order of smallest cost, can be used for the intra prediction of the current block.
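The two-round selection can be sketched as follows. It assumes that per-region costs are available as dictionaries mapping mode index to cost, and that the second-round cost of a mode is the sum of its costs on the two regions; the text does not fix how the joint cost is formed, so the sum is an assumption. The function name `two_stage_select` is hypothetical.

```python
def two_stage_select(top_costs, left_costs, n, m=1):
    """Pick up to 2N modes per region, then re-rank them on both regions."""
    # Round 1: N lowest-cost modes from each region.
    best_top = sorted(top_costs, key=top_costs.get)[:n]
    best_left = sorted(left_costs, key=left_costs.get)[:n]
    # Merge while dropping duplicates and keeping first-seen order.
    shortlist = list(dict.fromkeys(best_top + best_left))
    # Round 2: evaluate each shortlisted mode on both regions.
    joint = {mode: top_costs[mode] + left_costs[mode] for mode in shortlist}
    return sorted(shortlist, key=joint.get)[:m]
```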

[0334] Meanwhile, in FIGS. 22 to 27, intra prediction for a reference region is exemplified as being performed using a single reference sample line adjacent to the reference region. Unlike the illustrated example, intra prediction for a reference region may also be performed using a reference sample line that is not adjacent to the reference region.

[0335] Specifically, by performing an intra-prediction on a reference region based on each of multiple reference sample lines, the cost can be calculated for each reference sample line. Accordingly, the cost can be calculated for a set combining the intra-prediction mode and the reference sample lines.

[0336] Subsequently, by selecting the combination of the intra prediction mode and reference sample line with the smallest cost, intra prediction for the current block can be performed.

[0337]

[0338] The intra prediction mode of the current block can be encoded / decoded using an intra prediction mode candidate list.

[0339] For example, information indicating whether the intra prediction mode of the current block is included in the intra prediction mode candidate list can be encoded / decoded.

[0340] If the intra prediction mode of the current block is included in the intra prediction mode candidate list, index information indicating an intra prediction mode candidate identical to the current block's intra prediction mode within the intra prediction mode candidate list can be encoded / decoded.

[0341] On the other hand, if the intra prediction mode of the current block is not included in the intra prediction mode candidate list, index information indicating one of the remaining intra prediction modes, excluding the intra prediction mode candidates included in the intra prediction mode candidate list, can be encoded / decoded. For example, if the number of intra prediction modes is 67 and the intra prediction mode candidate list contains 6 intra prediction mode candidates, index information indicating one of the 61 remaining intra prediction modes can be encoded / decoded.

[0342] Meanwhile, after reallocating the indices of the remaining intra prediction modes, index information pointing to the reallocated indices can be encoded / decoded. For example, after reallocating the indices of the 61 remaining intra prediction modes to values from 0 to 60, the reallocated index of the one among the 61 remaining intra prediction modes used as the intra prediction mode of the current block can be encoded / decoded as index information.

[0343] Meanwhile, the intra prediction mode candidate with the smallest candidate index (i.e., index 0) in the intra prediction mode candidate list may be a predefined intra prediction mode. For example, planar mode or DC mode can be set as the intra prediction mode candidate with candidate index 0.

[0344] Alternatively, instead of inserting a predefined intra prediction mode into the intra prediction mode candidate list, a flag indicating whether the current block’s intra prediction mode is the same as a predefined intra prediction mode may be separately encoded / decoded. For example, a flag indicating whether the current block’s intra prediction mode is the same as a planar mode (e.g., not_planar_flag) may be encoded / decoded.

[0345] Meanwhile, a flag indicating whether the current block's intra prediction mode is identical to a predefined intra prediction mode can be encoded / decoded only if a flag indicating whether the current block's intra prediction mode is included in the intra prediction mode candidate list (e.g., mpm_flag) indicates that the current block's intra prediction mode is included in the intra prediction mode candidate list.

[0346] That is, if mpm_flag indicates that the intra prediction mode of the current block is included in the intra prediction mode candidate list, the intra prediction mode of the current block may be set to a predefined intra prediction mode or to an intra prediction mode candidate included in the intra prediction mode candidate list. Specifically, if not_planar_flag is 0, the intra prediction mode of the current block may be set to planar mode. On the other hand, if not_planar_flag is 1, index information (e.g., mpm_idx) indicating one of the intra prediction mode candidates included in the intra prediction mode candidate list may be additionally encoded / decoded. The intra prediction mode candidate indicated by the index information may be set as the intra prediction mode of the current block.

[0347] On the other hand, if mpm_flag indicates that the intra prediction mode of the current block is not included in the intra prediction mode candidate list, the intra prediction mode of the current block may be set to one of the remaining intra prediction modes, excluding the predefined intra prediction mode and the intra prediction mode candidates included in the intra prediction mode candidate list.
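The decision flow for mpm_flag, not_planar_flag, and mpm_idx can be sketched as below. This is an illustrative sketch, not the normative parsing process: it assumes the syntax elements have already been entropy-decoded, that planar mode has index 0, and that the remaining modes are indexed in ascending order. `decode_intra_mode` and `rem_idx` are hypothetical names.

```python
PLANAR = 0  # assumed index of planar mode


def decode_intra_mode(mpm_flag, not_planar_flag=None, mpm_idx=None,
                      rem_idx=None, mpm_list=(), num_modes=67):
    """Map already-parsed syntax elements to an intra prediction mode."""
    if mpm_flag:
        if not not_planar_flag:
            return PLANAR             # predefined mode signalled by the flag
        return mpm_list[mpm_idx]      # candidate selected by mpm_idx
    # Otherwise pick among the modes that are neither the predefined
    # mode nor a member of the candidate list.
    excluded = set(mpm_list) | {PLANAR}
    remaining = [m for m in range(num_modes) if m not in excluded]
    return remaining[rem_idx]
```

In this sketch the candidate list does not contain planar mode, matching the variant in which the predefined mode is signalled by a separate flag instead of being inserted into the list.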

[0348] The intra prediction mode of the current block may be encoded / decoded using multiple intra prediction mode candidate lists. For convenience of explanation, it is assumed that among the two intra mode candidate lists, the size of the first intra mode candidate list is n1 and the size of the second intra mode candidate list is n2. Here, the size of the intra mode candidate list may represent the number of intra prediction mode candidates included in the intra mode candidate list. Also, it is assumed that n1 is 6 and n2 is 16.

[0349] In the encoder and decoder, intra prediction mode candidates to be inserted into the first intra prediction mode candidate list and the second intra prediction mode candidate list (i.e., n1 + n2 intra prediction mode candidates) can be derived.

[0350] A predefined intra prediction mode can be set as an intra prediction mode candidate. In this case, the predefined intra prediction mode may have the smallest candidate index (i.e., index 0) within the intra prediction mode candidate list. For example, planar mode or DC mode can be set as the intra prediction mode candidate with a candidate index of 0.

[0351] Neighboring blocks adjacent to the current block can be scanned in a predefined order, and the intra prediction mode discovered by the scanning can be set as an intra prediction mode candidate.

[0352] Figure 28 is intended to illustrate neighboring blocks used to derive intra-prediction mode candidates.

[0353] At least one intra prediction mode candidate can be derived from at least one of the top neighbor block including position A, the left neighbor block including position L, the upper right neighbor block including position AR, the lower left neighbor block including position BL, or the upper left neighbor block including position AL.

[0354] For example, neighbor blocks can be scanned in the order of left neighbor block (L), top neighbor block (A), bottom-left neighbor block (BL), top-right neighbor block (AR), and top-left neighbor block (AL) to derive at least one intra prediction mode candidate. For example, if an available intra prediction mode exists in a neighbor block to be scanned, the intra prediction mode of that neighbor block can be set as an intra prediction mode candidate.

[0355] The scan order of neighboring blocks may be predefined in the encoder and decoder.

[0356] Alternatively, the scanning order of neighboring blocks may be determined differently depending on the shape of the current block. For example, if the width of the current block is greater than its height, neighboring blocks can be scanned in the order of left neighboring block (L), top neighboring block (A), bottom-left neighboring block (BL), top-right neighboring block (AR), and top-left neighboring block (AL). On the other hand, if the height of the current block is greater than its width, neighboring blocks can be scanned in the order of top neighboring block (A), left neighboring block (L), bottom-left neighboring block (BL), top-right neighboring block (AR), and top-left neighboring block (AL).
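The shape-dependent scan can be sketched as follows. The position labels follow FIG. 28 as described above. Two points are assumptions: square blocks use the same order as wide blocks, and a mode already found at an earlier neighbor is skipped. `neighbor_modes` is a hypothetical mapping from position label to intra prediction mode, with `None` marking an unavailable neighbor.

```python
def neighbor_scan_order(width, height):
    """Scan order of neighbor positions for the given block shape."""
    if height > width:
        return ["A", "L", "BL", "AR", "AL"]
    return ["L", "A", "BL", "AR", "AL"]  # wide (and, by assumption, square)


def candidates_from_neighbors(width, height, neighbor_modes):
    """Collect available neighbor modes in scan order, without duplicates."""
    found = []
    for position in neighbor_scan_order(width, height):
        mode = neighbor_modes.get(position)
        if mode is not None and mode not in found:
            found.append(mode)
    return found
```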

[0357] Intra prediction mode candidates can also be derived based on a histogram. Specifically, when a histogram is generated from a reference region of the current block, m intra prediction modes can be selected in descending order of amplitude value within the histogram, and the selected m intra prediction modes can be set as intra prediction mode candidates. Here, the histogram may be a histogram derived through gradient analysis (i.e., FIG. 18) (hereinafter referred to as a gradient histogram) or a histogram derived through frequency analysis (i.e., FIG. 20) (hereinafter referred to as a frequency histogram).

[0358] Meanwhile, an intra prediction mode may be set as an intra prediction mode candidate only if the amplitude value of the intra prediction mode is greater than a threshold value. Here, the threshold value may be predefined in the encoder and decoder. Alternatively, the threshold value may be adaptively determined based on the size and / or shape of the current block.
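A minimal sketch of the histogram-based selection is given below, assuming the histogram is a dictionary mapping mode index to accumulated amplitude. Breaking ties in favor of the smaller mode index is an assumption, and `histogram_candidates` is a hypothetical name.

```python
def histogram_candidates(histogram, m, threshold=0):
    """Return up to m modes with the largest amplitudes above threshold."""
    ranked = sorted(histogram.items(), key=lambda item: (-item[1], item[0]))
    return [mode for mode, amplitude in ranked if amplitude > threshold][:m]
```

With amplitudes {50: 300, 2: 120, 18: 120, 34: 40}, m = 3 and a threshold of 50, modes 50, 2, and 18 are selected and mode 34 is rejected by the threshold.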

[0359] A new intra prediction mode candidate can be derived by adding or subtracting an offset from a previously derived intra prediction mode candidate. For example, at least one new intra prediction mode candidate can be derived by adding or subtracting an offset from the intra prediction mode candidate with the smallest candidate index or from the intra prediction mode candidate with the smallest candidate index among the directional intra prediction modes.

[0360] By sequentially changing the offset, multiple intra prediction mode candidates can be derived from a single intra prediction mode candidate. For example, at least one new intra prediction mode candidate can be derived by changing the offset in the order of -1, +1, -2, +2, -3, +3, -4, +4.

[0361] When the number of offsets is p, up to p additional intra prediction mode candidates can be derived from a single intra prediction mode candidate.

[0362] Deriving new intra prediction mode candidates by changing the offset can be performed repeatedly until the number of intra prediction mode candidates reaches the maximum number (i.e., n1+n2).

[0363] If the number of intra prediction mode candidates has not reached the maximum number even after additional candidates have been derived from one intra prediction mode candidate, further candidates can be derived by adding or subtracting an offset from the intra prediction mode candidate that is next in order.

[0364] If the number of intra prediction mode candidates derived according to the above methods does not reach the maximum number, a default intra prediction mode can be set as an intra prediction mode candidate. The default intra prediction mode may include at least one of planar mode, DC mode, a vertical direction mode, a horizontal direction mode, an upper right diagonal direction mode, an upper left diagonal direction mode, or a lower left diagonal direction mode.
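The offset-based derivation followed by default filling can be sketched as below. Several details are assumptions made for illustration: the offsets form the symmetric sequence -1, +1, -2, +2, -3, +3, -4, +4; offsets are applied only to directional modes (assumed to be indices 2 to 66); derived modes that fall outside that range are skipped rather than wrapped; and planar, DC, upper right diagonal, and lower left diagonal modes are assumed to have indices 0, 1, 66, and 2. `extend_candidates` is a hypothetical name.

```python
OFFSETS = [-1, +1, -2, +2, -3, +3, -4, +4]
# planar, DC, vertical, horizontal, upper-right, upper-left, lower-left
DEFAULTS = [0, 1, 50, 18, 66, 34, 2]


def extend_candidates(candidates, max_count, min_dir=2, max_dir=66):
    """Grow a candidate list with offset-derived modes, then defaults."""
    out = list(candidates)
    for base in candidates:                 # move to the next base when
        if not min_dir <= base <= max_dir:  # one base is exhausted
            continue                        # non-directional: no offsets
        for offset in OFFSETS:
            if len(out) >= max_count:
                return out
            mode = base + offset
            if min_dir <= mode <= max_dir and mode not in out:
                out.append(mode)
    for mode in DEFAULTS:                   # fill any remaining slots
        if len(out) >= max_count:
            break
        if mode not in out:
            out.append(mode)
    return out
```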

[0365] When intra prediction mode candidates are derived, a cost for each intra prediction mode candidate can be calculated. In this case, the cost of an intra prediction mode candidate can be calculated based on the difference between the prediction samples obtained by performing intra prediction on a template region adjacent to the current block based on that candidate and the reconstructed samples within the template region.

[0366] For example, the Sum of Absolute Difference (SAD) between the prediction samples obtained by performing intra prediction on a reference region based on an intra prediction mode candidate and the reconstructed samples within the reference region can be set as the cost of the intra prediction mode candidate.

[0367] Subsequently, intra prediction mode candidates can be reordered according to cost. Specifically, intra prediction mode candidates can be reordered in ascending order of cost.

[0368] To simplify the operation, cost calculation and reordering may be performed on only some of the intra prediction mode candidates. For example, cost calculation and reordering may be performed on only m intra prediction mode candidates derived from the histogram.

[0369] Alternatively, cost calculation and reordering can be performed only on intra prediction mode candidates derived from the histogram and intra prediction mode candidates derived from neighboring blocks of the current block.

[0370] Alternatively, cost calculation and reordering may be performed only on the remaining intra prediction mode candidates, excluding a predetermined number of intra prediction mode candidates with small candidate indices. In this case, the predetermined number of intra prediction mode candidates with small candidate indices may not be reordered. Accordingly, the candidate indices of the predetermined number of intra prediction mode candidates with small candidate indices may be maintained without change.

[0371] After reordering the intra prediction mode candidates, the reordered intra prediction mode candidates can be inserted into the first intra prediction mode candidate list and the second intra prediction mode candidate list. For example, among the reordered intra prediction mode candidates, n1 candidates with smaller candidate indices can be inserted into the first intra prediction mode candidate list, and the remaining n2 candidates can be inserted into the second intra prediction mode candidate list.
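The reordering and splitting steps can be sketched as follows, assuming a callable `cost_of` that returns the template cost of a mode. The optional `keep_first` parameter models the variant in which a predetermined number of leading candidates keep their indices. Both names, and the function name `reorder_and_split`, are hypothetical.

```python
def reorder_and_split(candidates, cost_of, n1, keep_first=0):
    """Sort candidates by ascending cost and split them into two lists."""
    fixed = list(candidates[:keep_first])              # indices unchanged
    rest = sorted(candidates[keep_first:], key=cost_of)  # stable sort
    ordered = fixed + rest
    return ordered[:n1], ordered[n1:]
```

Because Python's `sorted` is stable, candidates with equal cost keep their original relative order.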

[0372] Meanwhile, the above embodiment can be applied to reconstruct the first intra prediction mode candidate list and the second intra prediction mode candidate list.

[0373] Specifically, after constructing a first intra prediction mode candidate list and a second intra prediction mode candidate list, the cost for each intra prediction mode candidate included in the first intra prediction mode candidate list and the second intra prediction mode candidate list can be calculated. When the intra prediction mode candidates are reordered according to cost, the first intra prediction mode candidate list can be reconstructed with n1 candidates with small candidate indices (i.e., n1 candidates with small costs), and the second intra prediction mode candidate list can be reconstructed with the remaining n2 candidates.

[0374] That is, depending on the cost, an intra prediction mode candidate included in the first intra prediction mode candidate list may be moved to the second intra prediction mode candidate list, or an intra prediction mode candidate included in the second intra prediction mode candidate list may be moved to the first intra prediction mode candidate list.

[0375] As another example, candidate movement between lists can be restricted, and the intra prediction mode candidates can be configured to be reordered only within a single list. That is, the first intra prediction mode candidate list can be updated by sorting the intra prediction mode candidates included in the first intra prediction mode candidate list in ascending order of cost, and the second intra prediction mode candidate list can be updated by sorting the intra prediction mode candidates included in the second intra prediction mode candidate list in ascending order of cost.

[0376] Information indicating whether the intra prediction mode of the current block is included in the first intra prediction mode candidate list or the second intra prediction mode candidate list can be encoded and signaled.

[0377] For example, a flag indicating whether one of a plurality of intra prediction mode candidate lists contains an intra prediction mode candidate identical to the intra prediction mode of the current block may be encoded and signaled. If the flag indicates that one of the plurality of intra prediction mode candidate lists contains an intra prediction mode candidate identical to the intra prediction mode of the current block, index information specifying one of the plurality of intra prediction mode candidate lists may be encoded and signaled. For example, the index information may be a flag indicating one of a first intra prediction mode candidate list and a second intra prediction mode candidate list.

[0378] Alternatively, for each intra prediction mode candidate list, a flag indicating whether the intra prediction mode candidate list contains an intra prediction mode candidate identical to the intra prediction mode of the current block may be encoded and signaled. For example, a flag indicating whether the first intra prediction mode candidate list contains an intra prediction mode candidate identical to the intra prediction mode of the current block may be encoded and signaled. If the flag indicates that the first intra prediction mode candidate list does not contain an intra prediction mode candidate identical to the intra prediction mode of the current block, an additional flag indicating whether the second intra prediction mode candidate list contains an intra prediction mode candidate identical to the intra prediction mode of the current block may be encoded and signaled.

[0379] When one of the first intra prediction mode candidate list and the second intra prediction mode candidate list is selected, index information identifying an intra prediction mode candidate that is identical to the intra prediction mode of the current block among the intra prediction mode candidates included in the selected list may be additionally encoded and signaled.

[0380] If the first intra prediction mode candidate list and the second intra prediction mode candidate list do not contain an intra prediction mode identical to the intra prediction mode of the current block, information identifying one of the remaining intra prediction modes may be encoded and signaled. For example, among the 67 intra prediction modes, index information identifying the intra prediction mode of the current block among the 45 intra prediction modes, excluding the 6 intra prediction mode candidates included in the first intra prediction mode candidate list and the 16 intra prediction mode candidates included in the second intra prediction mode candidate list, may be encoded and signaled.
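An encoder-side sketch of the first signaling variant (one flag, then a list selector, then a candidate index) is shown below. The syntax element names `in_list_flag`, `list_idx`, `cand_idx`, and `rem_idx` are hypothetical, and the remaining modes are assumed to be indexed in ascending order of mode index.

```python
def signal_mode(mode, list1, list2, num_modes=67):
    """Return the syntax elements that would identify `mode`."""
    if mode in list1:
        return {"in_list_flag": 1, "list_idx": 0,
                "cand_idx": list1.index(mode)}
    if mode in list2:
        return {"in_list_flag": 1, "list_idx": 1,
                "cand_idx": list2.index(mode)}
    # Neither list contains the mode: index it among the remaining modes.
    remaining = [m for m in range(num_modes)
                 if m not in list1 and m not in list2]
    return {"in_list_flag": 0, "rem_idx": remaining.index(mode)}
```

With 6 candidates in the first list and 16 in the second, `remaining` holds 67 - 6 - 16 = 45 modes, so `rem_idx` ranges from 0 to 44.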

[0381] Meanwhile, the indices of the remaining intra prediction modes can be reallocated, and index information can be encoded / decoded based on the reallocated indices. That is, the index information being encoded / decoded can point to the reallocated index rather than the original index of the intra prediction mode.

[0382] Table 4 shows an example of how the indices of the remaining intra prediction modes are reallocated.

[0383] Intra Prediction Mode | Included Candidate List | Reassigned Index
0 | 1st Intra Prediction Mode Candidate List | -
1 | 2nd Intra Prediction Mode Candidate List | -
2 | 1st Intra Prediction Mode Candidate List | -
3 | 1st Intra Prediction Mode Candidate List | -
4 | - | 0
5 | - | 1
6 | - | 2
7 | - | 3
8 | 2nd Intra Prediction Mode Candidate List | -
9 | 1st Intra Prediction Mode Candidate List | -
… | … | …
62 | - | 40
63 | - | 41
64 | - | 42
65 | - | 43
66 | - | 44

[0384] As exemplified in Table 4, the indices of intra prediction modes not included in the first intra prediction mode candidate list or the second intra prediction mode candidate list may be changed by reassignment.

[0385] In the decoder, the original index of the remaining intra prediction mode can be restored by comparing the index of the remaining intra prediction mode with the indices of the intra prediction mode candidates. For example, when the intra prediction mode candidates are compared in ascending order of mode index, if the index of the remaining intra prediction mode is equal to or greater than the index of an intra prediction mode candidate, an update can be performed to increase the index of the remaining intra prediction mode by 1.
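The reallocation and its inverse can be sketched as below. The sketch assumes that the candidates are visited in ascending order of mode index and that the decoded index is incremented whenever it is greater than or equal to a candidate's index, which is the comparison that reproduces the values of Table 4 (for example, reassigned index 0 maps back to mode 4). The function names are hypothetical.

```python
def reallocate_index(mode, candidates):
    """Encoder side: subtract the number of candidates below the mode."""
    return mode - sum(1 for c in candidates if c < mode)


def restore_mode(rem_idx, candidates):
    """Decoder side: step over each candidate at or below the index."""
    mode = rem_idx
    for candidate in sorted(candidates):
        if mode >= candidate:
            mode += 1
    return mode
```

The two functions are inverses of each other for every mode that is not itself a candidate.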

[0386] More than two intra prediction mode candidate lists may be used. For example, the intra prediction mode of the current block can be encoded / decoded by additionally using a third intra prediction candidate list along with the first intra prediction candidate list and the second intra prediction candidate list.

[0387] M intra prediction modes selected based on the histogram of the reference region of the current block can be inserted into the third intra prediction mode candidate list as intra prediction mode candidates. The M intra prediction modes can be selected in descending order of amplitude value.

[0388] A histogram may be derived for each of multiple reference regions. For example, a histogram may be derived from at least one of a first reference region composed of the top reconstructed region and the left reconstructed region of the current block, a second reference region composed only of the top reconstructed region of the current block, and a third reference region composed only of the left reconstructed region of the current block. For example, a first histogram may be derived from the first reference region, a second histogram from the second reference region, and a third histogram from the third reference region.

[0389] Afterwards, at least one intra prediction mode can be selected from each histogram, and the selected intra prediction modes can be inserted into a third intra prediction mode candidate list.

[0390] For example, M1 intra prediction modes selected from the first histogram, M2 intra prediction modes selected from the second histogram, and M3 intra prediction modes selected from the third histogram can be inserted into the third intra prediction mode candidate list as intra prediction mode candidates.

[0391] Meanwhile, pruning may be performed on intra prediction mode candidates. That is, if an intra prediction mode candidate identical to the current intra prediction mode candidate is already included in the intra prediction mode candidate list, the current intra prediction mode candidate may not be inserted into the intra prediction mode candidate list.

[0392] Accordingly, when constructing the second histogram, amplitude values can be accumulated only for intra prediction modes excluding the M1 intra prediction modes selected from the first histogram.

[0393] Likewise, when constructing the third histogram, amplitude values can be accumulated only for intra prediction modes excluding the M1 intra prediction modes selected from the first histogram and the M2 intra prediction modes selected from the second histogram.

[0394] Alternatively, when selecting M2 intra prediction modes from the second histogram, M2 intra prediction modes can be selected from the remaining intra prediction modes excluding the M1 intra prediction modes selected from the first histogram.

[0395] Likewise, when selecting M3 intra prediction modes from the third histogram, M3 intra prediction modes can be selected from the remaining intra prediction modes excluding the M1 intra prediction modes selected from the first histogram and the M2 intra prediction modes selected from the second histogram.
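The selection with exclusion described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: each histogram is assumed to be a mapping from intra prediction mode number to accumulated amplitude, modes are assumed to be ranked from the largest amplitude downward, and ties are broken by the smaller mode number. The function names `select_top_modes` and `build_third_list` are hypothetical.

```python
def select_top_modes(histogram, count, excluded=()):
    """Return up to `count` modes with the largest amplitude, skipping `excluded`.

    `histogram` maps mode number -> accumulated amplitude (assumed layout).
    Ties are broken by the smaller mode number (an assumption of this sketch).
    """
    ranked = sorted(
        (mode for mode in histogram if mode not in excluded),
        key=lambda mode: (-histogram[mode], mode),
    )
    return ranked[:count]


def build_third_list(hist1, hist2, hist3, m1, m2, m3):
    """Pick M1, M2 and M3 modes from three histograms without duplicates."""
    chosen = []
    for hist, count in ((hist1, m1), (hist2, m2), (hist3, m3)):
        # Modes already chosen from an earlier histogram are excluded.
        chosen += select_top_modes(hist, count, excluded=chosen)
    return chosen


hist1 = {18: 90, 50: 70, 34: 10}   # top + left reference region
hist2 = {18: 80, 50: 60, 2: 30}    # top reference region only
hist3 = {18: 5, 66: 40, 2: 35}     # left reference region only
third_list = build_third_list(hist1, hist2, hist3, 1, 1, 1)
```

Because the exclusion is applied at selection time, no separate pruning step is needed for the third list in this sketch.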

[0396] Alternatively, if a duplicate intra prediction mode is selected, an alternative intra prediction mode for that mode may be derived, and then the alternative intra prediction mode may be inserted into the intra prediction mode candidate list.

[0397] For example, if an intra prediction mode selected from a first histogram is also selected from a second histogram, an alternative intra prediction mode can be derived by adding or subtracting an offset from the intra prediction mode selected from the second histogram. If the alternative intra prediction mode has already been inserted into the intra prediction mode candidate list, the offset can be changed to derive the alternative intra prediction mode again. Here, the offset can be an integer such as +1, -1, +2, or -2. Subsequently, the derived alternative intra prediction mode can be inserted into the intra prediction mode list as an intra prediction mode candidate.
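As a rough sketch of the offset-based derivation above: the offsets are tried in the order +1, -1, +2, -2 until a mode not yet in the candidate list is found. Wrapping within the directional range 2 to 66 is an assumption of this sketch, since the text does not state how out-of-range results are handled, and the function name `alternative_mode` is hypothetical.

```python
def alternative_mode(mode, candidate_list, offsets=(1, -1, 2, -2), lo=2, hi=66):
    """Derive a replacement for a duplicate directional mode.

    Each offset is tried in turn; the first result that is not already in
    `candidate_list` is returned. Results wrap around inside [lo, hi]
    (an assumption). Returns None if every offset yields a duplicate.
    """
    span = hi - lo + 1
    for offset in offsets:
        alt = (mode - lo + offset) % span + lo
        if alt not in candidate_list:
            return alt
    return None


# Mode 18 was selected from both histograms and mode 19 is already listed,
# so the +1 offset fails and the -1 offset is used instead.
replacement = alternative_mode(18, [18, 19])
```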

[0398] Meanwhile, the third intra prediction mode candidate list does not need to include the intra prediction mode candidates included in the first intra prediction mode candidate list and the second intra prediction mode candidate list.

[0399] Accordingly, among the remaining intra prediction modes excluding the intra prediction mode candidates included in the first intra prediction mode candidate list and the second intra prediction mode candidate list, an intra prediction mode candidate can be derived.

[0400] If the third intra prediction mode candidate list contains a candidate identical to the intra prediction mode of the current block, index information specifying one of the intra prediction mode candidates included in the third intra prediction mode candidate list can be encoded and signaled.

[0401] Only the single intra prediction mode with the largest amplitude value on the histogram may be used as an intra prediction mode candidate. That is, the third intra prediction mode candidate list may contain only one intra prediction mode candidate. In this case, the encoding / decoding of index information specifying the intra prediction mode candidate included in the third intra prediction mode candidate list may be omitted. That is, if the information indicating whether the third intra prediction mode candidate list contains an intra prediction mode candidate identical to the intra prediction mode of the current block indicates that it does, the single intra prediction mode candidate included in the third intra prediction mode candidate list may be set as the intra prediction mode of the current block.

[0402] If the third intra prediction mode candidate list includes multiple intra prediction mode candidates, the multiple intra prediction mode candidates may be reordered according to cost. In this case, the cost of the intra prediction mode candidate may be calculated based on the difference between the prediction samples obtained by performing intra prediction on the reference region of the current block and the restoration samples within the reference region.

[0403] Multiple intra prediction mode candidates included in the third intra prediction mode candidate list can be reordered in ascending order of cost. Subsequently, index information indicating one of the reordered intra prediction mode candidates can be encoded and signaled.

[0404] Alternatively, among the multiple intra prediction mode candidates included in the third intra prediction mode candidate list, only the intra prediction mode candidate with the smallest cost may be retained in the third intra prediction mode candidate list. Additionally, the remaining intra prediction mode candidates may be removed from the third intra prediction mode candidate list. Consequently, the updated third intra prediction mode candidate list contains only a single intra prediction mode candidate, and accordingly, the encoding / decoding of index information specifying the intra prediction mode candidate included in the third intra prediction mode candidate list may be omitted.
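The cost-based reordering and the single-candidate variant can be illustrated as below. This is a sketch under stated assumptions: the cost is taken to be the sum of absolute differences between predicted and restored samples of the reference region (the text only says the cost is based on their difference), and `predict` is a hypothetical callable that returns the prediction samples of the reference region for a given mode.

```python
def template_cost(predicted, restored):
    """Sum of absolute differences between two equally long sample lists."""
    return sum(abs(p - r) for p, r in zip(predicted, restored))


def reorder_by_cost(candidates, predict, restored, keep_best_only=False):
    """Sort candidates in ascending order of cost.

    `predict(mode)` is assumed to return the prediction samples obtained by
    applying `mode` to the reference region. With `keep_best_only`, only the
    lowest-cost candidate is kept, so no index needs to be signaled.
    """
    ranked = sorted(candidates, key=lambda m: template_cost(predict(m), restored))
    return ranked[:1] if keep_best_only else ranked


restored = [10, 12, 14, 16]
predictions = {18: [10, 12, 14, 20], 50: [10, 12, 14, 16], 2: [0, 0, 0, 0]}
reordered = reorder_by_cost([18, 50, 2], predictions.get, restored)
best_only = reorder_by_cost([18, 50, 2], predictions.get, restored, keep_best_only=True)
```

Since the cost depends only on already restored samples, the decoder can repeat the same ordering without additional signaling.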

[0405] Among the three intra prediction mode candidate lists described above, the third intra prediction mode candidate list may be generated first. Specifically, among the remaining intra prediction modes excluding the intra prediction mode candidates included in the third intra prediction mode candidate list, the intra prediction mode candidates inserted into the first intra prediction mode candidate list and the second intra prediction mode candidate list may be derived.

[0406] Based on at least one histogram derived from at least one reference region, the number of intra prediction modes available in the current block may be adjusted. For example, K intra prediction modes with low amplitude values in the histogram may be set as unavailable in the current block. For example, when K is 8, only 59 intra prediction modes, excluding 8 of the 67 intra prediction modes, may be set as available in the current block.

[0407] Accordingly, intra prediction mode candidates can be derived from among 59 intra prediction modes. Additionally, if an intra prediction mode identical to the current block’s intra prediction mode is not included in the intra prediction mode candidate list, the current block’s intra prediction mode can be derived from the remaining intra prediction modes among the 59 intra prediction modes, excluding the intra prediction mode candidates included in the intra prediction mode candidate list.

[0408] As the number of intra prediction modes available to the current block decreases, the amount of bits required to encode / decode the intra prediction mode of the current block may decrease.

[0409] Instead of immediately excluding the K intra prediction modes with low amplitude values, the cost of each of the K intra prediction modes can be calculated. Subsequently, the availability of a corresponding intra prediction mode can be determined based on its cost.

[0410] For example, among the K intra prediction modes, only the K1 intra prediction modes with the highest costs (where K1 is a natural number less than or equal to K) can be set as unavailable in the current block.

[0411] Here, K and / or K1 may be predefined values in the encoder and decoder.

[0412] Alternatively, information representing K and / or K1 can be encoded and signaled through an upper header.
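The two-stage restriction of available modes (K modes with the lowest amplitude, then K1 of those with the highest cost) might be sketched as follows. The dictionary layouts, the tie-breaking by mode number, and the name `unavailable_modes` are assumptions made for illustration only.

```python
def unavailable_modes(histogram, cost, k, k1):
    """Return the set of modes to be marked unavailable in the current block.

    Step 1: take the K modes with the lowest amplitude in `histogram`.
    Step 2: among them, keep only the K1 modes with the highest `cost`.
    Both inputs map mode number -> value; ties go to the smaller mode number.
    """
    lowest = sorted(histogram, key=lambda m: (histogram[m], m))[:k]
    return set(sorted(lowest, key=lambda m: (-cost[m], m))[:k1])


histogram = {2: 5, 18: 90, 34: 1, 50: 70, 66: 3}
cost = {2: 40, 18: 8, 34: 10, 50: 12, 66: 25}
removed = unavailable_modes(histogram, cost, k=3, k1=2)
available = sorted(m for m in histogram if m not in removed)
```

In this toy input, mode 34 has the lowest amplitude but also a low cost, so it stays available while modes 2 and 66 are removed.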

[0413] Alternatively, a third intra prediction mode candidate list may be generated using a gradient histogram, and a fourth intra prediction mode candidate list may be generated using an occurrence frequency histogram.

[0414] Alternatively, an integrated list of intra prediction mode candidates may be generated by combining intra prediction mode candidates derived from a gradient histogram and intra prediction mode candidates derived using a frequency histogram.

[0415] Meanwhile, in the embodiments described above, first to fourth intra prediction mode candidate lists were described. However, not all of the described intra prediction mode candidate lists must be used to encode / decode the intra prediction mode of the current block. For example, one, two, three, or four of the four described intra prediction mode candidate lists may be used to encode / decode the intra prediction mode of the current block.

[0416] In reallocating indices for the remaining intra prediction modes excluding intra prediction mode candidates, a reference mode may be used. The reference mode may represent one of the intra prediction mode candidates.

[0417] For example, the reference mode may be the first intra prediction mode candidate in the first intra prediction mode candidate list (i.e., the intra prediction mode candidate with a candidate index of 0).

[0418] Alternatively, the reference mode may be a non-directional intra-prediction mode (e.g., planar mode or DC mode), or a preset directional mode.

[0419] Alternatively, the reference mode may be the directional mode with the highest index or the directional mode with the lowest index among the directional modes included in the first intra prediction mode candidate list. Alternatively, the reference mode may be the intra prediction mode candidate that is a directional mode and has the smallest candidate index among the intra prediction mode candidates included in the first intra prediction mode candidate list.

[0420] Alternatively, the intra-prediction mode with the largest amplitude value on the histogram can be set as the reference mode.

[0421] Alternatively, the intra-prediction mode with the lowest cost can be set as the reference mode.

[0422] Table 5 shows an example of how the indices of the remaining intra prediction modes are reallocated.

[0423] Intra Prediction Mode | Included Candidate List | Reassigned Index
0 | 1st Intra Prediction Mode Candidate List |
1 | 2nd Intra Prediction Mode Candidate List |
2 | 1st Intra Prediction Mode Candidate List |
3 | 1st Intra Prediction Mode Candidate List |
4 | - | 2
5 | - | 4
6 | - | 6
7 | - | 8
8 | 2nd Intra Prediction Mode Candidate List |
9 | 1st Intra Prediction Mode Candidate List |
… | … | …
62 | - | 7
63 | - | 5
64 | - | 3
65 | - | 1
66 | - | 0

[0424] After establishing a reference mode among the remaining modes, indices can be reassigned in order of the remaining modes closest to the reference mode by alternating upward and downward directions relative to the reference mode. Here, the upward direction relative to the reference mode indicates the direction in which intra-prediction modes with indices larger than the reference mode exist, and the downward direction relative to the reference mode indicates the direction in which intra-prediction modes with indices smaller than the reference mode exist. In other words, the upward direction indicates the direction in which the index difference with the reference mode is positive, and the downward direction indicates the direction in which the index difference with the reference mode is negative.

[0425] Meanwhile, if there are no more intra-prediction modes in the upward or downward direction of the reference mode, the search can wrap around and continue from the opposite end of the mode range. For example, once the search in the upward direction has reached directional prediction mode 66, the search continues from directional prediction mode 2. Similarly, once the search in the downward direction has reached directional prediction mode 2, the search continues from directional prediction mode 66.

[0426] Indices can be reassigned by gradually increasing the absolute value of the offset from the reference mode in the upward or downward direction. For example, assume that mode 66, which has the lowest cost among the remaining intra prediction modes, is set as the reference mode. In this case, index 0 can be reassigned to mode 66.

[0427] Next, the next index can be assigned to the remaining intra-prediction mode closest to the reference mode in the downward direction of the reference mode. Accordingly, in Table 5, index 1 is exemplified as being reassigned to mode 65.

[0428] Next, the next index can be assigned to the remaining intra-prediction mode closest to the reference mode in the upward direction of the reference mode. Accordingly, in Table 5, index 2 is shown as being reassigned to mode 4.

[0429] Next, the next index can be assigned to the remaining intra-prediction mode that is the second closest to the reference mode in the downward direction of the reference mode. Accordingly, in Table 5, index 3 is shown to have been reassigned to mode 64.

[0430] Next, the next index can be assigned to the remaining intra-prediction mode that is the second closest to the reference mode in the upward direction from the reference mode. Accordingly, in Table 5, index 4 is shown as being reassigned to mode 5. Through the above reassignment process, the indices of the remaining intra-prediction modes that are not included in the intra-prediction mode candidate list can be reassigned. Additionally, by reassigning indices alternately in the downward and upward directions based on the reference mode, a remaining intra-prediction mode receives a smaller reassigned index the closer it is to the reference mode.
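The alternating reassignment walked through above can be sketched as follows. This is an illustrative reading of Table 5, not a definitive implementation: only directional modes 2 to 66 are indexed, the reference mode is assumed to be directional, the search wraps around at both ends of the range, and the downward direction is visited before the upward direction. The candidate set used in the example contains only the candidate modes visible in Table 5, since the rows for modes 10 to 61 are elided there; the name `reassign_indices` is hypothetical.

```python
def reassign_indices(candidates, reference, lo=2, hi=66):
    """Map each remaining directional mode to its reassigned index.

    Index 0 goes to the reference mode if it is itself a remaining mode.
    After that, the nearest unassigned remaining mode below the reference
    and the nearest one above it are taken in turn, wrapping around at the
    ends of the range [lo, hi].
    """
    span = hi - lo + 1
    remaining = [m for m in range(lo, hi + 1) if m not in candidates]
    order = [reference] if reference in remaining else []
    seen = set(order)
    # Lazy walks away from the reference mode in each direction.
    downward = ((reference - lo - k) % span + lo for k in range(1, span))
    upward = ((reference - lo + k) % span + lo for k in range(1, span))

    def next_remaining(walk):
        for mode in walk:
            if mode not in candidates and mode not in seen:
                return mode
        return None

    while len(order) < len(remaining):
        for walk in (downward, upward):
            mode = next_remaining(walk)
            if mode is not None:
                seen.add(mode)
                order.append(mode)
    return {mode: index for index, mode in enumerate(order)}


# Candidate modes visible in Table 5; reference mode 66.
table = reassign_indices({0, 1, 2, 3, 8, 9}, reference=66)
```

With these inputs the first assignments are mode 66 to index 0, mode 65 to index 1, mode 4 to index 2, mode 64 to index 3, and mode 5 to index 4, which matches the visible rows of Table 5.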

[0431] Meanwhile, the result of adding the offset to the reference mode can be constrained to indicate only a directional mode. Alternatively, it can be configured to indicate any remaining mode whose index has not yet been reassigned, without distinguishing between directional and non-directional modes.

[0432] Alternatively, if a predefined intra prediction mode among the remaining intra prediction modes is not included in the intra prediction mode candidate list, the smallest index may be reassigned to a predefined intra prediction mode. Here, the predefined intra prediction mode may be a planar mode, a DC mode, or a pre-set directional prediction mode.

[0433] Alternatively, the indices of the remaining intra-prediction modes can be reassigned in order of amplitude values on the histogram.

[0434]

[0435] Figure 29 is a diagram illustrating the process of performing inter-prediction in the encoder and decoder.

[0436] As shown in the example illustrated in FIG. 29, motion information for the current block can be obtained to perform inter-prediction (S2910). Here, the motion information may include at least one of a motion vector, a reference picture index, or a weight applied to the prediction block. For the current block, motion information for at least one of the L0 direction or the L1 direction may be obtained.

[0437] In the encoder, motion information of the current block can be derived through motion estimation, and the derived motion information can be encoded and signaled to the decoder. Meanwhile, the encoding / decoding of motion information may be based on a motion information merging mode, a motion vector prediction mode, a template-based motion estimation method, or a bilateral matching method, which will be described later.

[0438] In the decoder, motion information of the current block can be derived based on the information transmitted from the encoder.

[0439] Alternatively, motion information of the current block can be derived in the decoder in the same way as in the encoder. This method can be referred to as decoder-side motion estimation.

[0440] When motion information of the current block is derived, a prediction block for the current block can be obtained based on the derived motion information (S2920). For example, a reference block spaced apart by a motion vector from the position of the current block in a reference picture can be set as the prediction block of the current block.

[0441] Below, the process of performing inter-prediction will be explained in more detail.

[0442] The motion information of the current block can be generated through motion estimation.

[0443] Figure 30 shows an example where motion estimation is performed.

[0444] In FIG. 30, it is assumed that the Picture Order Count (POC) of the current picture is T, and the POC of the reference picture is (T-1).

[0445] A search range for motion estimation can be set from the same location as the reference point of the current block within the reference picture. Here, the reference point may be the location of the top-left sample of the current block.

[0446] For example, in FIG. 30, a rectangle of sizes (w0+w1) and (h0+h1) centered on a reference point is exemplified as being set as a search range. In the above example, w0, w1, h0, and h1 may have mutually identical values. Alternatively, at least one of w0, w1, h0, and h1 may be set to have a different value. Or, the sizes of w0, w1, h0, and h1 may be determined so as not to exceed the Coding Tree Unit (CTU) boundary, slice boundary, tile boundary, or picture boundary.

[0447] Within the search range, reference blocks of the same size as the current block can be set, and the cost of each reference block relative to the current block can be measured. The cost can be calculated using the similarity between the two blocks.

[0448] For example, the cost can be calculated based on the sum of absolute differences (SAD) between the original samples in the current block and the original samples (or restored samples) in the reference block. The smaller the sum of absolute differences, the lower the cost.

[0449] Afterward, the cost of each of the reference blocks is compared, and the reference block with the optimal cost can be set as the prediction block of the current block.

[0450] In addition, the distance between the current block and the reference block can be set as a motion vector. Specifically, the x-coordinate difference and the y-coordinate difference between the current block and the reference block can be set as a motion vector.

[0451] Furthermore, the index of the picture containing the reference block identified through motion estimation is set as the reference picture index.

[0452] In addition, the prediction direction can be set based on whether the reference picture belongs to the L0 reference picture list or the L1 reference picture list.

[0453] Additionally, motion estimation can be performed for the L0 direction and the L1 direction, respectively. If prediction is performed for both the L0 direction and the L1 direction, motion information for the L0 direction and motion information for the L1 direction can be generated, respectively.
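The block-matching search described above can be sketched as an exhaustive search over the search range. This is a simplified, integer-sample illustration: blocks are plain lists of rows, the cost is the sum of absolute differences, the search range is clipped only to the picture boundary, and the first position with the lowest cost wins. The names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, rx, ry):
    """Sum of absolute differences between `cur` and the block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[y][x] - ref[ry + y][rx + x])
        for y in range(len(cur))
        for x in range(len(cur[0]))
    )


def full_search(cur, ref, x0, y0, w0, w1, h0, h1):
    """Return ((mvx, mvy), cost) of the lowest-cost reference block.

    (x0, y0) is the top-left sample of the current block; the search range
    extends w0 to the left, w1 to the right, h0 upward and h1 downward,
    clipped so that the reference block stays inside the picture.
    """
    height, width = len(cur), len(cur[0])
    best = None
    for ry in range(max(0, y0 - h0), min(len(ref) - height, y0 + h1) + 1):
        for rx in range(max(0, x0 - w0), min(len(ref[0]) - width, x0 + w1) + 1):
            cost = sad(cur, ref, rx, ry)
            if best is None or cost < best[1]:
                # The motion vector is the coordinate difference.
                best = ((rx - x0, ry - y0), cost)
    return best


ref = [[y * 8 + x for x in range(8)] for y in range(8)]   # toy reference picture
cur = [row[3:5] for row in ref[2:4]]                      # content shifted right by 1
motion_vector, cost = full_search(cur, ref, x0=2, y0=2, w0=2, w1=2, h0=2, h1=2)
```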

[0454] Figures 31 and 32 show an example in which a predicted block of the current block is generated based on motion information generated through motion estimation.

[0455] Figure 31 shows an example of generating a prediction block with unidirectional (i.e., L0 direction) prediction, and Figure 32 shows an example of generating a prediction block with bidirectional (i.e., L0 and L1 directions) prediction.

[0456] In the case of unidirectional prediction, a prediction block of the current block is generated using a single set of motion information. For example, the motion information may include an L0 motion vector, an L0 reference picture index, and prediction direction information indicating the L0 direction.

[0457] In the case of bidirectional prediction, a prediction block is generated using two sets of motion information. For example, a reference block for the L0 direction, specified based on motion information for the L0 direction (L0 motion information), can be set as the L0 prediction block, and a reference block for the L1 direction, specified based on motion information for the L1 direction (L1 motion information), can be set as the L1 prediction block. Subsequently, the prediction block of the current block can be generated by performing a weighted sum of the L0 prediction block and the L1 prediction block.
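The weighted sum of the L0 and L1 prediction blocks can be illustrated with integer arithmetic. The specific weights, the assumption that they sum to 8, and the rounding shift of 3 are choices made for this sketch and are not taken from the text; the name `bi_predict` is hypothetical.

```python
def bi_predict(pred_l0, pred_l1, w0=4, w1=4, shift=3):
    """Blend two prediction blocks sample by sample.

    Each output sample is (w0 * p0 + w1 * p1 + rounding) >> shift, where
    w0 + w1 is assumed to equal 1 << shift. Equal weights give a rounded
    average of the two blocks.
    """
    rounding = 1 << (shift - 1)
    return [
        [(w0 * a + w1 * b + rounding) >> shift for a, b in zip(row0, row1)]
        for row0, row1 in zip(pred_l0, pred_l1)
    ]


averaged = bi_predict([[100, 50]], [[110, 50]])
weighted = bi_predict([[100, 50]], [[110, 50]], w0=6, w1=2)
```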

[0458] In the example illustrated in FIGS. 30 to 32, the L0 reference picture is shown as existing in the direction before the current picture (i.e., having a smaller POC value than the current picture), and the L1 reference picture is shown as existing in the direction after the current picture (i.e., having a larger POC value than the current picture).

[0459] However, unlike the illustrated example, the L0 reference picture may exist in the direction after the current picture, or the L1 reference picture may exist in the direction before the current picture. For example, both the L0 reference picture and the L1 reference picture may exist in the direction before the current picture, or both may exist in the direction after the current picture. Alternatively, bidirectional prediction may be performed using the L0 reference picture existing in the direction after the current picture and the L1 reference picture existing in the direction before the current picture.

[0460] The motion information of the block for which inter-prediction has been performed can be stored in memory. In this case, the motion information can be stored on a sample basis. Specifically, the motion information of the block to which a specific sample belongs can be stored as the motion information of that specific sample. The stored motion information can be used to derive the motion information of neighboring blocks to be encoded / decoded in the future.

[0461] In the encoder, information obtained by encoding residual samples, each corresponding to the difference between a sample of the current block (i.e., the original sample) and the corresponding prediction sample, and the motion information necessary to generate the prediction block can be signaled to the decoder. In the decoder, the signaled residual information is decoded to derive residual samples, and the prediction samples within the prediction block generated using the motion information are added to the residual samples to generate reconstructed samples.
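The relationship between original, prediction, residual, and reconstructed samples can be sketched as below. Transform and quantization are deliberately left out, and clipping the result to the valid range for the bit depth is an assumption of this sketch; the names `residual` and `reconstruct` are hypothetical.

```python
def residual(original, prediction):
    """Encoder side: per-sample difference between original and prediction."""
    return [[o - p for o, p in zip(orow, prow)] for orow, prow in zip(original, prediction)]


def reconstruct(prediction, resid, bit_depth=8):
    """Decoder side: add the residual to the prediction and clip to the sample range."""
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(prow, rrow)]
        for prow, rrow in zip(prediction, resid)
    ]


original = [[95, 200]]
prediction = [[100, 190]]
resid = residual(original, prediction)
restored = reconstruct(prediction, resid)
```

Without quantization the reconstruction equals the original; in a lossy codec the decoded residual only approximates the true difference.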

[0462] At this time, one of a plurality of inter-prediction modes may be selected to effectively compress motion information signaled to the decoder. Here, the plurality of inter-prediction modes may include a motion information merging mode and a motion vector prediction mode.

[0463] The motion vector prediction mode is a mode in which the difference between the motion vector and the motion vector prediction value is encoded and signaled. Here, the motion vector prediction value can be derived based on motion information of surrounding blocks or surrounding samples adjacent to the current block.

[0464] Figure 33 shows the location referenced to derive the motion vector prediction value.

[0465] For the sake of convenience of explanation, the current block is assumed to have a size of 4x4.

[0466] In the illustrated example, 'LB' represents a sample contained in the leftmost column and bottom row within the current block. 'RT' represents a sample contained in the rightmost column and top row within the current block. A0 through A4 represent samples adjacent to the left of the current block, and B0 through B5 represent samples adjacent to the top of the current block. For example, A1 represents a sample adjacent to the left of LB, and B1 represents a sample adjacent to the top of RT.

[0467] Col indicates the location of a sample adjacent to the bottom-right of the current block within the co-located picture. The co-located picture is a picture distinct from the current picture, and information to identify the co-located picture (e.g., co-located picture index) can be explicitly encoded and signaled in the bitstream. Alternatively, a reference picture having a predefined reference picture index can be set as the co-located picture.

[0468] The motion vector prediction value of the current block can be derived from at least one motion vector prediction candidate included in the Motion Vector Prediction List.

[0469] The number of motion vector prediction candidates that can be inserted into the motion vector prediction list (i.e., the size of the list) may be predefined in the encoder and decoder. For example, the maximum number of motion vector prediction candidates may be 2.

[0470] A motion vector stored at the location of a neighbor sample adjacent to the current block, or a scaled motion vector derived by scaling the said motion vector, can be inserted into the motion vector prediction list as a motion vector prediction candidate. At this time, the motion vector prediction candidate can be derived by scanning the neighbor samples adjacent to the current block according to a predefined order.

[0471] For example, it is possible to check whether a motion vector is stored at each location in the order from A0 to A4. Then, according to the above scan order, the first available motion vector found can be inserted into the motion vector prediction list as a motion vector prediction candidate.

[0472] As another example, while checking whether a motion vector is stored at each location in the order from A0 to A4, the first motion vector found at a location having the same reference picture as the current block can be inserted into the motion vector prediction list as a motion vector prediction candidate. If no neighbor sample with the same reference picture as the current block exists, a motion vector prediction candidate can be derived based on the first available motion vector found. Specifically, the first available motion vector found can be scaled, and the scaled motion vector can be inserted into the motion vector prediction list as a motion vector prediction candidate. In this case, scaling can be performed based on the difference in output order between the current picture and the reference picture of the current block (i.e., POC difference) and the difference in output order between the current picture and the neighbor sample's reference picture (i.e., POC difference).

[0473] Furthermore, it is possible to check whether a motion vector is stored at each location in the order from B0 to B5. Then, according to the above scan order, the first available motion vector found can be inserted into the motion vector prediction list as a motion vector prediction candidate.

[0474] As another example, while checking whether a motion vector is stored at each location in the order from B0 to B5, the first motion vector found at a location having the same reference picture as the current block can be inserted into the motion vector prediction list as a motion vector prediction candidate. If no neighbor sample with the same reference picture as the current block exists, a motion vector prediction candidate can be derived based on the first available motion vector found. Specifically, the first available motion vector found can be scaled, and the scaled motion vector can be inserted into the motion vector prediction list as a motion vector prediction candidate. In this case, scaling can be performed based on the difference in output order between the current picture and the reference picture of the current block (i.e., POC difference) and the difference in output order between the current picture and the neighbor sample's reference picture (i.e., POC difference).

[0475] As in the example described above, motion vector prediction candidates can be derived from samples adjacent to the left of the current block, and motion vector prediction candidates can be derived from samples adjacent to the top of the current block.
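The scan with a same-reference-picture preference and POC-based scaling can be sketched as follows. This is a simplified illustration: each neighbor location is represented as `None` (no motion vector stored) or a pair of motion vector and the POC of the neighbor's reference picture, and scaling uses ordinary rounding rather than the fixed-point arithmetic a real codec would use. The name `derive_mvp_candidate` is hypothetical.

```python
def derive_mvp_candidate(neighbors, cur_poc, ref_poc):
    """Derive one motion vector prediction candidate from scanned neighbors.

    `neighbors` lists the locations in scan order (e.g. A0..A4 or B0..B5);
    each entry is None or ((mvx, mvy), neighbor_ref_poc). A motion vector
    that points to the same reference picture as the current block is used
    as is; otherwise the first available one is scaled by the ratio of POC
    differences.
    """
    available = [n for n in neighbors if n is not None]
    for mv, neighbor_ref_poc in available:
        if neighbor_ref_poc == ref_poc:
            return mv
    if not available:
        return None
    (mvx, mvy), neighbor_ref_poc = available[0]
    tb = cur_poc - ref_poc            # current picture to its reference picture
    td = cur_poc - neighbor_ref_poc   # current picture to the neighbor's reference picture
    return (round(mvx * tb / td), round(mvy * tb / td))


left = [None, ((4, -2), 7), ((1, 1), 6)]
same_ref = derive_mvp_candidate(left, cur_poc=8, ref_poc=6)
scaled = derive_mvp_candidate(left[:2], cur_poc=8, ref_poc=6)
```

In the second call no neighbor refers to the picture with POC 6, so the first available vector (4, -2), which spans one picture, is stretched to span two pictures.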

[0476] In this case, a motion vector prediction candidate derived from the left sample may be inserted into the motion vector prediction list before a motion vector prediction candidate derived from the top sample. In this case, the index assigned to the motion vector prediction candidate derived from the left sample may have a smaller value than that of the motion vector prediction candidate derived from the top sample.

[0477] Conversely, motion vector prediction candidates derived from the top sample may be inserted into the motion vector prediction list before motion vector prediction candidates derived from the left sample.

[0478] Among the motion vector prediction candidates included in the above motion vector prediction list, the motion vector prediction candidate with the highest encoding efficiency can be set as the motion vector prediction value (Motion Vector Predictor, MVP) of the current block. Additionally, index information pointing to the motion vector prediction candidate set as the motion vector prediction value of the current block among multiple motion vector prediction candidates can be encoded and signaled to the decoder. If the number of motion vector prediction candidates is two, the index information may be a 1-bit flag (e.g., an MVP flag). Furthermore, the motion vector difference value (Motion Vector Difference, MVD), which is the difference between the motion vector of the current block and the motion vector prediction value, can be encoded and signaled to the decoder.

[0479] The decoder can construct a motion vector prediction list in the same way as the encoder. Additionally, it can decode index information from the bitstream and select one of multiple motion vector prediction candidates based on the decoded index information. The selected motion vector prediction candidate can be set as the motion vector prediction value of the current block.

[0480] In addition, the motion vector difference value can be decoded from the bitstream. Subsequently, the motion vector prediction value and the motion vector difference value are combined to derive the motion vector of the current block.
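The exchange of the index information and the motion vector difference between encoder and decoder can be sketched as below. Choosing the candidate with the smallest absolute difference is only a stand-in for "highest encoding efficiency", which the text does not define; the names `encode_mv` and `decode_mv` are hypothetical.

```python
def encode_mv(mv, mvp_list):
    """Encoder side: pick an MVP candidate and compute the MVD.

    The candidate minimizing |mvd_x| + |mvd_y| is chosen as a simple proxy
    for coding efficiency (an assumption of this sketch).
    """
    def mvd_size(index):
        px, py = mvp_list[index]
        return abs(mv[0] - px) + abs(mv[1] - py)

    index = min(range(len(mvp_list)), key=mvd_size)
    px, py = mvp_list[index]
    return index, (mv[0] - px, mv[1] - py)


def decode_mv(index, mvd, mvp_list):
    """Decoder side: motion vector = selected MVP + MVD."""
    px, py = mvp_list[index]
    return (px + mvd[0], py + mvd[1])


mvp_list = [(4, 0), (10, -3)]          # e.g. left-derived and top-derived candidates
mvp_flag, mvd = encode_mv((9, -2), mvp_list)
decoded = decode_mv(mvp_flag, mvd, mvp_list)
```

With two candidates, the index fits in a 1-bit flag, as noted above.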

[0481] When bidirectional prediction is applied to the current block, a motion vector prediction list can be generated for each of the L0 and L1 directions. That is, each motion vector prediction list can consist of motion vectors of the same direction. Accordingly, the motion vector of the current block and the motion vector prediction candidates included in the corresponding motion vector prediction list have the same direction.

[0482] When the motion vector prediction mode is selected, the reference picture index and prediction direction information can be explicitly encoded and signaled to the decoder. For example, if multiple reference pictures exist on a reference picture list and motion estimation is performed for each of the multiple reference pictures, a reference picture index for identifying the reference picture from which the motion information of the current block was derived among the multiple reference pictures can be explicitly encoded and signaled to the decoder.

[0483] In this case, if the reference picture list contains only one reference picture, the encoding / decoding of the reference picture index may be omitted.

[0484] The prediction direction information may be an index indicating one of L0 unidirectional prediction, L1 unidirectional prediction, or bidirectional prediction. Alternatively, an L0 flag indicating whether a prediction for the L0 direction is performed and an L1 flag indicating whether a prediction for the L1 direction is performed may be encoded and signaled, respectively.

[0485] The motion information merging mode is a mode that sets the motion information of the current block to be identical to the motion information of neighboring blocks. In the motion information merging mode, motion information can be encoded or decoded using a motion information merging list.

[0486] Motion information merging candidates can be derived based on motion information from neighboring blocks or neighbor samples adjacent to the current block. For example, after defining reference locations around the current block, it is possible to check whether motion information exists at the defined reference locations. If motion information exists at the defined reference locations, the motion information at those locations can be inserted into the motion information merging list as a motion information merging candidate.

[0487] In the example of FIG. 33, the previously defined reference positions may include at least one of A0, A1, B0, B1, B5, and Col. Furthermore, motion information merging candidates can be derived in the order of A1, B1, B0, A0, B5, and Col.

[0488] Among the motion information merging candidates included in the motion information merging list, the motion information of the candidate with the optimal cost can be set as the motion information of the current block. Furthermore, index information (e.g., a merge index) pointing to the selected motion information merging candidate among the multiple motion information merging candidates can be encoded and transmitted to a decoder.

[0489] In the decoder, a motion information merging list can be configured in the same way as in the encoder. Then, a motion information merging candidate can be selected based on the merge index decoded from the bitstream. The motion information of the selected motion information merging candidate can be set as the motion information of the current block.
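As a non-limiting illustration, the list construction and index-based selection described above may be sketched as follows. The record layout `(mv_x, mv_y, ref_idx)`, the function names, the duplicate check, and the `max_candidates` limit are assumptions introduced for this sketch and are not specified in the present disclosure.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical motion-information record: (mv_x, mv_y, ref_idx).
MotionInfo = Tuple[int, int, int]

# Derivation order stated for the reference positions of FIG. 33.
MERGE_ORDER = ["A1", "B1", "B0", "A0", "B5", "Col"]


def build_merge_list(neighbors: Dict[str, Optional[MotionInfo]],
                     max_candidates: int = 6) -> List[MotionInfo]:
    """Insert available motion information into the list in derivation order."""
    merge_list: List[MotionInfo] = []
    for pos in MERGE_ORDER:
        info = neighbors.get(pos)
        if info is None:        # no motion information at this position
            continue
        if info in merge_list:  # assumption: identical candidates are skipped
            continue
        merge_list.append(info)
        if len(merge_list) == max_candidates:
            break
    return merge_list


def select_merge_candidate(neighbors: Dict[str, Optional[MotionInfo]],
                           merge_index: int) -> MotionInfo:
    """Decoder side: rebuild the same list, then pick by the decoded index."""
    return build_merge_list(neighbors)[merge_index]
```

Because the encoder and the decoder run the same construction, only the merge index needs to be carried in the bitstream.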

[0490] Unlike the motion vector prediction list, the motion information merging list consists of a single list regardless of the prediction direction. That is, the motion information merging candidates included in the motion information merging list may have only L0 motion information or L1 motion information, or they may have bidirectional motion information (i.e., L0 motion information and L1 motion information).

[0491]

[0492] Motion information of the current block can also be derived using a previously restored sample area around the current block. Here, the restored sample area used to derive the motion information of the current block may be referred to as a template.

[0493] Figure 34 is a diagram illustrating a template-based motion estimation method.

[0494] In FIG. 30, it was explained that the predicted block of the current block is determined based on the cost between the current block and the reference block within the search range. According to the present embodiment, unlike FIG. 30, motion estimation for the current block can be performed based on the cost between a template adjacent to the current block (hereinafter referred to as the current template) and a reference template having the same size and shape as the current template.

[0495] For example, the cost can be calculated based on the sum of the absolute differences between the restored samples in the current template and the restored samples in the reference template. The smaller the sum of the absolute differences, the lower the cost.

[0496] When the reference template having the optimal cost with respect to the current template is determined within the search range, the reference block adjacent to that reference template can be set as the predicted block of the current block.
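As a non-limiting illustration, a template-based search using the sum of absolute differences may be sketched as follows. The full-search strategy, the integer-pel motion vectors, the template thickness `t`, and all function names are assumptions made for this sketch; pictures are represented as lists of rows.

```python
from typing import List, Optional, Tuple

Picture = List[List[int]]


def template_positions(x: int, y: int, w: int, h: int, t: int = 1):
    """(col, row) positions of t rows above and t columns left of the block."""
    top = [(x + i, y - j) for j in range(1, t + 1) for i in range(w)]
    left = [(x - j, y + i) for j in range(1, t + 1) for i in range(h)]
    return top + left


def template_sad(cur: Picture, ref: Picture, x: int, y: int, w: int, h: int,
                 mv: Tuple[int, int], t: int = 1) -> int:
    """SAD between the current template and the reference template at mv."""
    dx, dy = mv
    return sum(abs(cur[py][px] - ref[py + dy][px + dx])
               for px, py in template_positions(x, y, w, h, t))


def template_search(cur: Picture, ref: Picture, x: int, y: int, w: int, h: int,
                    search_range: int, t: int = 1):
    """Return (best_mv, best_cost) over an integer-pel full search."""
    height, width = len(ref), len(ref[0])
    best_mv: Optional[Tuple[int, int]] = None
    best_cost: Optional[int] = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip displacements whose template or block leaves the picture.
            if x + dx - t < 0 or y + dy - t < 0:
                continue
            if x + dx + w > width or y + dy + h > height:
                continue
            cost = template_sad(cur, ref, x, y, w, h, (dx, dy), t)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The search reads only restored samples outside the current block, which is why a decoder can repeat it without receiving the motion vector.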

[0497] Additionally, motion information of the current block can be set based on the distance between the current block and the reference block, the index of the picture to which the reference block belongs, and whether the reference picture is included in the L0 or L1 reference picture list.

[0498] Since the template is defined by the previously restored area surrounding the current block, the decoder can perform motion estimation itself in the same manner as the encoder. Accordingly, when deriving motion information using a template, there is no need to encode and signal the motion information, except for information indicating whether the template is being used.

[0499] The current template may include at least one of an area adjacent to the top of the current block or an area adjacent to the left. In this case, the area adjacent to the top may include at least one row, and the area adjacent to the left may include at least one column.

[0500] Figure 35 shows examples of template configurations.

[0501] The current template can be configured following one of the examples shown in FIG. 35.

[0502] Alternatively, unlike the example shown in FIG. 35, the template may be configured with only the area adjacent to the left of the current block, or with only the area adjacent to the top of the current block.

[0503] The size and / or shape of the current template may be predefined in the encoder and decoder.

[0504] Alternatively, multiple template candidates of different sizes and / or shapes can be defined, and index information specifying one of the multiple template candidates can be encoded and signaled to a decoder.

[0505] Alternatively, one of a plurality of template candidates may be adaptively selected based on at least one of the size, shape, or location of the current block. For example, if the current block touches the top boundary of the CTU, the current template may be configured using only the area adjacent to the left of the current block.

[0506] Motion estimation based on a template can be performed for each of the reference pictures stored in the reference picture list. Alternatively, motion estimation can be performed for only some of the reference pictures. For example, motion estimation can be performed only for the reference picture with a reference picture index of 0, or only for reference pictures with a reference picture index smaller than a threshold value, or for reference pictures with a POC difference with the current picture smaller than a threshold value.

[0507] Alternatively, after explicitly encoding and signaling the reference picture index, motion estimation can be performed only on the reference picture pointed to by the reference picture index.

[0508] Alternatively, motion estimation can be performed on a reference picture of a neighbor block corresponding to the current template. For example, if the template consists of a left neighbor area and a top neighbor area, at least one reference picture can be selected using at least one of the reference picture index of the left neighbor block or the reference picture index of the top neighbor block. Subsequently, motion estimation can be performed on the selected at least one reference picture.

[0509] Information indicating whether template-based motion estimation has been applied can be encoded and signaled to a decoder. The information may be a 1-bit flag. For example, if the flag is true (1), it indicates that template-based motion estimation is applied to the L0 and L1 directions of the current block. On the other hand, if the flag is false (0), it indicates that template-based motion estimation is not applied. In this case, motion information of the current block can be derived based on a motion information merging mode or a motion vector prediction mode.

[0510] Conversely to the above, if it is determined that the motion information merging mode and the motion vector prediction mode are not applied to the current block, then a template-based motion estimation may be applied. For example, if a first flag indicating whether the motion information merging mode is applied and a second flag indicating whether the motion vector prediction mode is applied are both 0, then a template-based motion estimation may be performed.

[0511] For each of the L0 and L1 directions, information indicating whether template-based motion estimation has been applied can be signaled. That is, whether template-based motion estimation is applied to the L0 direction and whether it is applied to the L1 direction can be determined independently of each other. Accordingly, while template-based motion estimation is applied to either the L0 or L1 direction, another mode (e.g., motion information merging mode or motion vector prediction mode) may be applied to the other.

[0512] If template-based motion estimation is applied to both the L0 and L1 directions, the prediction block of the current block can be generated based on the weighted sum operation of the L0 prediction block and the L1 prediction block. Likewise, even if template-based motion estimation is applied to only one of the L0 and L1 directions while another mode is applied to the other, the prediction block of the current block can be generated based on the weighted sum operation of the L0 prediction block and the L1 prediction block.

[0513] Alternatively, a template-based motion estimation method may be inserted as a motion information merging candidate in the motion information merging mode or as a motion vector prediction candidate in the motion vector prediction mode. In this case, whether to apply the template-based motion estimation method may be determined based on whether the selected motion information merging candidate or the selected motion vector prediction candidate points to the template-based motion estimation method.

[0514] Motion information of the current block can also be generated based on a two-way matching method.

[0515] Figure 36 is a diagram illustrating a motion estimation method based on a two-way matching method.

[0516] The two-way matching method can be performed only when the temporal order (i.e., POC) of the current picture lies between the temporal order of the L0 reference picture and the temporal order of the L1 reference picture.

[0517] When a two-way matching method is applied, a search range can be set for each of the L0 reference picture and the L1 reference picture. In this case, an L0 reference picture index for identifying the L0 reference picture and an L1 reference picture index for identifying the L1 reference picture can be encoded and signaled, respectively.

[0518] As another example, only the L0 reference picture index is encoded and signaled, and an L1 reference picture can be selected based on the distance between the current picture and the L0 reference picture (hereinafter referred to as the L0 POC difference). For example, among the L1 reference pictures included in the L1 reference picture list, an L1 reference picture can be selected in which the absolute value of the distance from the current picture (hereinafter referred to as the L1 POC difference) is equal to the absolute value of the distance between the current picture and the L0 reference picture. If there is no L1 reference picture having an L1 POC difference identical to the L0 POC difference, the L1 reference picture among the L1 reference pictures in which the L1 POC difference is most similar to the L0 POC difference can be selected.

[0519] At this time, among the L1 reference pictures, only L1 reference pictures that have a different temporal direction from the L0 reference picture can be used for two-way matching. For example, if the POC of the L0 reference picture is smaller than that of the current picture, one of the L1 reference pictures with a POC larger than that of the current picture can be selected.
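As a non-limiting illustration, the selection of the L1 reference picture from a signaled L0 reference picture may be sketched as follows. The function name and the tie-breaking rule (the lowest list index wins) are assumptions made for this sketch.

```python
from typing import List, Optional


def select_l1_ref(cur_poc: int, l0_poc: int, l1_pocs: List[int]) -> Optional[int]:
    """Return the index in the L1 list of the picture to pair with the L0 picture.

    Only L1 pictures on the opposite temporal side of the current picture are
    considered. Among them, the one whose POC difference is closest to the
    L0 POC difference is chosen (an exact match gives a gap of zero).
    """
    l0_diff = abs(cur_poc - l0_poc)
    l0_side = (l0_poc > cur_poc) - (l0_poc < cur_poc)   # -1 past, +1 future
    best_idx: Optional[int] = None
    best_gap: Optional[int] = None
    for idx, poc in enumerate(l1_pocs):
        side = (poc > cur_poc) - (poc < cur_poc)
        if side == 0 or side == l0_side:   # same temporal direction: not usable
            continue
        gap = abs(abs(cur_poc - poc) - l0_diff)
        if best_gap is None or gap < best_gap:   # assumption: first best wins
            best_idx, best_gap = idx, gap
    return best_idx
```

For example, with a current POC of 8 and an L0 reference picture at POC 6, an L1 list of POCs `[9, 10, 4, 12]` yields index 1 (POC 10), since its POC difference of 2 matches the L0 POC difference.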

[0520] Conversely to the above, only the L1 reference picture index is encoded and signaled, and the L0 reference picture is selected based on the distance between the current picture and the L1 reference picture.

[0521] Alternatively, a two-way matching method may be performed using the L0 reference picture closest to the current picture among the L0 reference pictures and the L1 reference picture closest to the current picture among the L1 reference pictures.

[0522] Alternatively, a two-way matching method may be performed using an L0 reference picture (e.g., index 0) assigned to a previously defined index in the L0 reference picture list and an L1 reference picture (e.g., index 0) assigned to a previously defined index in the L1 reference picture list.

[0523] Alternatively, LX (X is 0 or 1) reference picture may be selected based on an explicitly signaled reference picture index, and L|X-1| reference picture may be selected as the reference picture closest to the current picture among L|X-1| reference pictures, or as a reference picture having a predefined index within the L|X-1| reference picture list.

[0524] As another example, L0 and / or L1 reference pictures can be selected based on motion information of neighbor blocks of the current block. For example, L0 and / or L1 reference pictures to be used for two-way matching can be selected using the reference picture index of the left or top neighbor block of the current block.

[0525] The search range can be set within a predetermined range from the collocated blocks within the reference picture.

[0526] As another example, the search range can be set based on initial motion information. The initial motion information can be derived from the neighbor blocks of the current block. For example, the motion information of the current block's left neighbor block or top neighbor block can be set as the current block's initial motion information.

[0527] When the two-way matching method is applied, the L0 motion vector and the L1 motion vector are set in opposite directions. That is, the L0 motion vector and the L1 motion vector have opposite signs. Additionally, the magnitude of the LX motion vector can be proportional to the distance between the current picture and the LX reference picture (i.e., the POC difference).

[0528] Subsequently, motion estimation can be performed using the cost between a reference block (hereinafter referred to as the L0 reference block) within the search range of the L0 reference picture and a reference block (hereinafter referred to as the L1 reference block) within the search range of the L1 reference picture.

[0529] If an L0 reference block is selected at a displacement of (x, y) from the current block, an L1 reference block can be selected at a location displaced by (-Dx, -Dy) from the current block. Here, D can be determined by the ratio of the distance between the current picture and the L1 reference picture to the distance between the current picture and the L0 reference picture.

[0530] For example, in the example illustrated in FIG. 36, the absolute value of the distance between the current picture (T) and the L0 reference picture (T-1) and the absolute value of the distance between the current picture (T) and the L1 reference picture (T+1) are identical. Accordingly, in the illustrated example, the L0 motion vector (x0, y0) and the L1 motion vector (x1, y1) have the same magnitude but opposite directions. If an L1 reference picture with POC (T+2) is used instead, the L1 motion vector (x1, y1) will be set to (-2*x0, -2*y0).
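As a non-limiting illustration, the derivation of the L1 motion vector from the L0 motion vector may be sketched as follows, with D taken as the L1 POC distance divided by the L0 POC distance so that the result agrees with the FIG. 36 example. The function name is an assumption, and the result is kept as an exact fraction because no rounding rule is specified here.

```python
from fractions import Fraction
from typing import Tuple


def mirror_mv(mv_l0: Tuple[int, int], cur_poc: int,
              l0_poc: int, l1_poc: int) -> Tuple[Fraction, Fraction]:
    """Derive the L1 motion vector as (-D*x0, -D*y0)."""
    d0 = cur_poc - l0_poc   # signed distance to the L0 reference picture
    d1 = l1_poc - cur_poc   # signed distance to the L1 reference picture
    if d0 * d1 <= 0:
        # The current picture must lie strictly between the two references.
        raise ValueError("current POC is not between the L0 and L1 POCs")
    d = Fraction(d1, d0)    # D > 0 whenever the condition above holds
    return (-d * mv_l0[0], -d * mv_l0[1])
```

With a current POC of T = 10, an L0 motion vector of (3, -2) maps to (-3, 2) for an L1 reference picture at T+1 and to (-6, 4) for an L1 reference picture at T+2.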

[0531] When the L0 reference block and L1 reference block having the optimal cost are selected, the L0 reference block and L1 reference block can be set as the L0 prediction block and L1 prediction block of the current block, respectively. Subsequently, the final prediction block of the current block can be generated through a weighted sum operation of the L0 prediction block and the L1 prediction block.
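As a non-limiting illustration, a two-way matching search for the case of equal POC distances (as in FIG. 36, where the L1 motion vector is the negation of the L0 motion vector) may be sketched as follows. The SAD cost, the integer-pel full search, the equal weights in the weighted sum, and the function names are assumptions made for this sketch. The caller is assumed to keep the search window inside both pictures.

```python
from typing import List, Tuple

Picture = List[List[int]]


def pair_sad(ref0: Picture, ref1: Picture, x: int, y: int, w: int, h: int,
             mv0: Tuple[int, int], mv1: Tuple[int, int]) -> int:
    """SAD between the L0 reference block at mv0 and the L1 reference block at mv1."""
    return sum(abs(ref0[y + mv0[1] + j][x + mv0[0] + i]
                   - ref1[y + mv1[1] + j][x + mv1[0] + i])
               for j in range(h) for i in range(w))


def two_way_search(ref0: Picture, ref1: Picture, x: int, y: int,
                   w: int, h: int, search_range: int):
    """Return (mv0, mv1, prediction) minimizing the L0/L1 block mismatch."""
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            mv0, mv1 = (dx, dy), (-dx, -dy)   # mirrored vectors, D = 1
            cost = pair_sad(ref0, ref1, x, y, w, h, mv0, mv1)
            if best is None or cost < best[0]:
                best = (cost, mv0, mv1)
    _, mv0, mv1 = best
    # Assumption: equal weights with rounding for the weighted sum.
    pred = [[(ref0[y + mv0[1] + j][x + mv0[0] + i]
              + ref1[y + mv1[1] + j][x + mv1[0] + i] + 1) >> 1
             for i in range(w)] for j in range(h)]
    return mv0, mv1, pred
```

Note that the samples of the current block never enter the cost, so the same search can be repeated at the decoder.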

[0532] When a two-way matching method is applied, the decoder can perform motion estimation in the same way as the encoder. Accordingly, information indicating whether a two-way matching method is applied is explicitly encoded / decoded, while the encoding / decoding of motion information, such as motion vectors, can be omitted. As previously explained, at least one of the L0 reference picture index or the L1 reference picture index may be explicitly encoded / decoded.

[0533] As another example, information indicating whether a two-way matching method has been applied may be explicitly encoded / decoded; if the two-way matching method has been applied, the L0 motion vector or the L1 motion vector may be explicitly encoded and signaled. If the L0 motion vector is signaled, the L1 motion vector can be derived based on the POC difference between the current picture and the L0 reference picture and the POC difference between the current picture and the L1 reference picture. If the L1 motion vector is signaled, the L0 motion vector can be derived based on the same two POC differences. In this case, the encoder may explicitly encode whichever of the L0 motion vector and the L1 motion vector has the smaller magnitude.

[0534] Information indicating whether a two-way matching method is applied may be a 1-bit flag. For example, if the flag is true (e.g., 1), it may indicate that a two-way matching method is applied to the current block. If the flag is false (e.g., 0), it may indicate that a two-way matching method is not applied to the current block. In this case, a motion information merging mode or a motion vector prediction mode may be applied to the current block.

[0535] Conversely to the above, a two-way matching method may be applied only when it is determined that the motion information merging mode and the motion vector prediction mode are not applied to the current block. For example, if both the first flag indicating whether the motion information merging mode is applied and the second flag indicating whether the motion vector prediction mode is applied are 0, the two-way matching method may be applied.

[0536] Alternatively, a two-way matching method may be inserted as a motion information merging candidate in the motion information merging mode or as a motion vector prediction candidate in the motion vector prediction mode. In this case, whether to apply the two-way matching method may be determined based on whether the selected motion information merging candidate or the selected motion vector prediction candidate points to the two-way matching method.

[0537] In the two-way matching method, it was exemplified that the temporal order of the current picture must lie between the temporal order of the L0 reference picture and the temporal order of the L1 reference picture. A one-way (unidirectional) matching method, to which this constraint of the two-way matching method does not apply, may be applied to generate a predicted block of the current block. Specifically, in the unidirectional matching method, two reference pictures with a temporal order (i.e., POC) smaller than that of the current picture, or two reference pictures with a temporal order larger than that of the current picture, may be used. In this case, both of the two reference pictures may be derived from the L0 reference picture list or from the L1 reference picture list. Alternatively, one of the two reference pictures may be derived from the L0 reference picture list and the other from the L1 reference picture list.

[0538] Figure 37 is a diagram illustrating a motion estimation method based on a unidirectional matching method.

[0539] A unidirectional matching method can be performed based on two reference pictures (i.e., Forward reference pictures) that have a POC smaller than the current picture or two reference pictures (i.e., Backward reference pictures) that have a POC larger than the current picture. In FIG. 37, motion estimation based on a unidirectional matching method is exemplified as being performed based on a first reference picture (T-1) and a second reference picture (T-2) that have a POC smaller than the current picture (T).

[0540] In this case, a first reference picture index for identifying the first reference picture and a second reference picture index for identifying the second reference picture can each be encoded and signaled. Of the two reference pictures used in the unidirectional matching method, the one with the smaller POC difference from the current picture can be set as the first reference picture. Accordingly, once the first reference picture is selected, only those reference pictures in the reference picture list whose POC difference from the current picture is greater than that of the first reference picture can be set as the second reference picture. The second reference picture index can then point to one of the reference pictures obtained by reordering the reference pictures that have the same temporal direction as the first reference picture and a POC difference from the current picture greater than that of the first reference picture.
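As a non-limiting illustration, the reordered list addressed by the second reference picture index may be sketched as follows. The reordering criterion is not specified above; ascending POC difference is assumed here, and the function name is likewise an assumption.

```python
from typing import List


def second_ref_candidates(cur_poc: int, first_poc: int,
                          ref_pocs: List[int]) -> List[int]:
    """POCs eligible as the second reference picture, in reordered form.

    A picture is eligible if it lies on the same temporal side of the current
    picture as the first reference picture and is farther away from it.
    """
    first_diff = abs(cur_poc - first_poc)
    first_is_past = first_poc < cur_poc
    eligible = [poc for poc in ref_pocs
                if poc != cur_poc
                and (poc < cur_poc) == first_is_past
                and abs(cur_poc - poc) > first_diff]
    # Assumption: reorder by increasing POC difference from the current picture.
    return sorted(eligible, key=lambda poc: abs(cur_poc - poc))
```

For example, with a current POC of 8, a first reference picture at POC 7, and a reference picture list of POCs `[7, 4, 9, 6, 5]`, the reordered list is `[6, 5, 4]`, so a second reference picture index of 1 would identify POC 5.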

[0541] Conversely to the above, the reference picture with the larger POC difference with the current picture among the two reference pictures may be set as the first reference picture. In this case, the index of the second reference picture may be set to point to the index of one of the reordered reference pictures after reordering the reference pictures that have the same temporal direction as the first reference picture and have a smaller POC difference with the current picture than the first reference picture.

[0542] Alternatively, a unidirectional matching method may be performed using a reference picture assigned to a predefined index within the reference picture list and a reference picture having the same temporal direction. For example, a reference picture with an index of 0 within the reference picture list may be set as the first reference picture, and among the reference pictures with the same temporal direction as the first reference picture within the reference picture list, the reference picture with the smallest index may be selected as the second reference picture.

[0543] Both the first reference picture and the second reference picture can be selected from the L0 reference picture list or the L1 reference picture list. In FIG. 37, two L0 reference pictures are shown being used in a unidirectional matching method. Alternatively, the first reference picture may be selected from the L0 reference picture list and the second reference picture may be selected from the L1 reference picture list.

[0544] Information indicating whether the first reference picture and / or the second reference picture belongs to the L0 reference picture list or the L1 reference picture list may be additionally encoded / decoded.

[0545] Alternatively, unidirectional matching can be performed using whichever of the L0 reference picture list and the L1 reference picture list is set as the default. Alternatively, the two reference pictures can be selected from whichever of the L0 reference picture list and the L1 reference picture list contains the larger number of reference pictures.

[0546] Afterwards, a search range can be set within the first reference picture and the second reference picture.

[0547] The search range can be set within a predetermined range from the collocated blocks within the reference picture.

[0548] As another example, the search range can be set based on initial motion information. The initial motion information can be derived from the neighbor blocks of the current block. For example, the motion information of the current block's left neighbor block or top neighbor block can be set as the current block's initial motion information.

[0549] Subsequently, motion estimation can be performed using the cost between the first reference block within the search range of the first reference picture and the second reference block within the search range of the second reference picture.

[0550] In this case, under the unidirectional matching method, the magnitude of the motion vector should be set to increase in proportion to the distance between the current picture and the reference picture. Specifically, if a first reference block is selected at a displacement of (x, y) from the current block, the second reference block should be displaced from the current block by (Dx, Dy). Here, D can be determined by the ratio of the distance between the current picture and the second reference picture to the distance between the current picture and the first reference picture.

[0551] For example, in the example of FIG. 37, the distance between the current picture and the first reference picture (i.e., POC difference) is 1, and the distance between the current picture and the second reference picture (i.e., POC difference) is 2. Accordingly, if the first motion vector for the first reference block in the first reference picture is (x0, y0), the second motion vector (x1, y1) for the second reference block in the second reference picture can be set to (2x0, 2y0).
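As a non-limiting illustration, the scaling of the second motion vector under the unidirectional matching method may be sketched as follows, with D taken as the second POC distance divided by the first POC distance so that the result agrees with the FIG. 37 example. The function name is an assumption, and the result is kept as an exact fraction because no rounding rule is specified here.

```python
from fractions import Fraction
from typing import Tuple


def scale_second_mv(mv_first: Tuple[int, int], cur_poc: int,
                    first_poc: int, second_poc: int) -> Tuple[Fraction, Fraction]:
    """Derive the second motion vector as (D*x0, D*y0)."""
    d1 = cur_poc - first_poc    # signed distance to the first reference picture
    d2 = cur_poc - second_poc   # signed distance to the second reference picture
    if d1 * d2 <= 0:
        # Both reference pictures must lie on the same temporal side.
        raise ValueError("reference pictures are not in the same temporal direction")
    d = Fraction(d2, d1)        # D > 0 whenever the condition above holds
    return (d * mv_first[0], d * mv_first[1])
```

With POC differences of 1 and 2, a first motion vector of (3, -1) gives a second motion vector of (6, -2), matching the (2x0, 2y0) relation stated above.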

[0552] When a first reference block and a second reference block having optimal costs are selected, the first reference block and the second reference block can be set as the first prediction block and the second prediction block of the current block, respectively. Subsequently, the final prediction block of the current block can be generated through a weighted sum operation of the first prediction block and the second prediction block.

[0553] When a unidirectional matching method is applied, the decoder can perform motion estimation in the same way as the encoder. Accordingly, information indicating whether a unidirectional matching method is applied is explicitly encoded / decoded, while the encoding / decoding of motion information, such as motion vectors, can be omitted. As previously explained, at least one of the first reference picture index or the second reference picture index may be explicitly encoded / decoded.

[0554] As another example, information indicating whether a unidirectional matching method has been applied may be explicitly encoded / decoded, and if a unidirectional matching method has been applied, a first motion vector or a second motion vector may be explicitly encoded and signaled. If the first motion vector is signaled, the second motion vector may be derived based on the POC difference between the current picture and the first reference picture and the POC difference between the current picture and the second reference picture. If the second motion vector is signaled, the first motion vector may be derived based on the POC difference between the current picture and the first reference picture and the POC difference between the current picture and the second reference picture. In this case, the encoder may explicitly encode the one with the smaller magnitude between the first motion vector and the second motion vector.

[0555] Information indicating whether a unidirectional matching method is applied may be a 1-bit flag. For example, if the flag is true (e.g., 1), it may indicate that a unidirectional matching method is applied to the current block. If the flag is false (e.g., 0), it may indicate that a unidirectional matching method is not applied to the current block. In this case, a motion information merging mode or a motion vector prediction mode may be applied to the current block.

[0556] Conversely to the above, a unidirectional matching method may be applied only when it is determined that the motion information merging mode and the motion vector prediction mode are not applied to the current block. For example, if both the first flag indicating whether the motion information merging mode is applied and the second flag indicating whether the motion vector prediction mode is applied are 0, a unidirectional matching method may be applied.

[0557] Alternatively, a unidirectional matching method may be inserted as a motion information merging candidate in the motion information merging mode or as a motion vector prediction candidate in the motion vector prediction mode. In this case, whether to apply the unidirectional matching method may be determined based on whether the selected motion information merging candidate or the selected motion vector prediction candidate points to the unidirectional matching method.

[0558] By adjusting the precision of the motion vector, the movement of an object between frames can be tracked more precisely. Specifically, the position of each pixel within a picture is specified by integer coordinates, whereas the movement of an object between frames may not correspond to an integer number of pixels.

[0559] Considering this, motion vectors can be searched in fractional pixel units by performing interpolation on the reference picture.

[0560] Figures 38 and 39 illustrate examples in which prediction blocks are generated according to the precision of the motion vectors.

[0561] FIG. 38 shows the position of the current block in the current picture, and FIG. 39 illustrates an example in which a predicted block is acquired according to a motion vector.

[0562] Specifically, FIG. 39 (a) shows an example where the motion vector precision is in integer pixel units, and FIG. 39 (b) and (c) show examples where the motion vector precision is in 1 / 2 pixel units and 1 / 4 pixel units, respectively.

[0563] Motion vector precision can also be set in units smaller than those described. For example, motion vector precision can be set in units of 1 / 8 pixel, 1 / 16 pixel, or 1 / 32 pixel.

[0564] When the motion vector of the current block is expressed in integer units, a reference block composed of integer position samples can be set as the prediction block of the current block, as in the example illustrated in FIG. 39 (a).

[0565] On the other hand, when the motion vector of the current block is expressed in fractional units, a reference block composed of fractional position samples can be set as the prediction block of the current block, as in the examples illustrated in FIG. 39 (b) and (c). In this case, the fractional position samples within the reference block can be generated by interpolating integer position samples. The interpolation filter can have a size of 4 taps or 8 taps.

[0566] As another example, to reduce complexity, fractional position samples can be generated through linear interpolation using only integer position samples adjacent to the fractional position.
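As a non-limiting illustration, the generation of a fractional position sample by linear interpolation of the adjacent integer position samples may be sketched as follows. Quarter-pel coordinates (`shift=2`), non-negative coordinates, clamping at the picture edge, and the rounding offset are assumptions made for this sketch.

```python
from typing import List

Picture = List[List[int]]


def bilinear_sample(pic: Picture, x_q: int, y_q: int, shift: int = 2) -> int:
    """Sample at (x_q, y_q), given in units of 1 / (1 << shift) pixel.

    Uses only the (up to) four integer position samples surrounding the
    fractional position. Coordinates are assumed to be non-negative.
    """
    scale = 1 << shift
    x0, fx = x_q >> shift, x_q & (scale - 1)   # integer part, fractional part
    y0, fy = y_q >> shift, y_q & (scale - 1)
    x1 = min(x0 + 1, len(pic[0]) - 1)          # clamp at the right edge
    y1 = min(y0 + 1, len(pic) - 1)             # clamp at the bottom edge
    top = pic[y0][x0] * (scale - fx) + pic[y0][x1] * fx
    bottom = pic[y1][x0] * (scale - fx) + pic[y1][x1] * fx
    total = top * (scale - fy) + bottom * fy
    # Normalize by scale * scale with rounding to the nearest integer.
    return (total + ((scale * scale) >> 1)) // (scale * scale)
```

For the 2x2 picture `[[0, 8], [16, 24]]`, the half-pel position between the two top samples (`x_q=2, y_q=0`) evaluates to 4, and the quarter-pel position (`x_q=1, y_q=1`) evaluates to 6.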

[0567] Information indicating the motion vector precision of the current block can be encoded and signaled. For example, after assigning different indices to each of multiple motion vector precision candidates, the index of the motion vector precision candidate corresponding to the motion vector precision of the current block can be encoded and signaled.

[0568] In this case, the number and / or types of motion vector precision candidates available for the current block may be determined based on at least one of the size of the current block, the shape of the current block, the reference picture, or the motion compensation model. Here, the motion compensation model may include at least one of a translation model, a zooming model, or a rotation model. A motion compensation model in which at least one of a zooming model or a rotation model is combined with a translation model may be referred to as an affine model.

[0569] An index indicating one of the motion vector precision candidates available for the current block can be encoded. The maximum number of bits required to encode the index can be determined according to the number of motion vector precision candidates available for the current block.

[0570] By adjusting the precision of the motion vector, the motion vector can be searched more precisely, and accordingly, the prediction accuracy for the current block can be improved.

[0571] Meanwhile, motion vectors expressed as fractional positions can be scaled up to integers and encoded.
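As a non-limiting illustration, scaling a fractional motion vector up to integers before encoding may be sketched as follows. The function names and the use of exact fractions are assumptions made for this sketch. For example, at 1 / 4 pixel precision a displacement of 1.25 pixels is represented by the integer 5.

```python
from fractions import Fraction
from typing import Tuple


def to_integer_units(mv: Tuple[Fraction, Fraction],
                     precision: Fraction) -> Tuple[int, int]:
    """Express a motion vector (in pixels) as integer multiples of precision."""
    scaled = []
    for component in mv:
        q = Fraction(component) / precision
        if q.denominator != 1:
            raise ValueError("motion vector is not a multiple of the precision")
        scaled.append(int(q))
    return (scaled[0], scaled[1])


def from_integer_units(mv_int: Tuple[int, int],
                       precision: Fraction) -> Tuple[Fraction, Fraction]:
    """Inverse mapping: recover the displacement in pixels."""
    return (mv_int[0] * precision, mv_int[1] * precision)
```

The two functions are inverses of each other for any motion vector that is representable at the chosen precision.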

[0572] Compensation for the movement of an object may be performed based on at least one of a translation model to compensate for linear movement of the object (e.g., movement in the horizontal and / or vertical directions), a zooming model to compensate for changes in the size of the object, and a rotation model to compensate for rotational movement of the object. Here, zooming may refer to enlargement or reduction in size.

[0573] FIG. 40 shows an example in which motion compensation based on a translation model and a zooming model is performed for the current block.

[0574] For the convenience of explanation, the current block is assumed to have a size of 4x4, as shown in FIG. 40.

[0575] In FIG. 40, the variable α represents the scaling parameter. The size of the reference block can be derived by multiplying the size of the current block by the variable α.

[0576] A scaling parameter α less than 1 indicates that the reference block is smaller than the current block, and a scaling parameter α greater than 1 indicates that the reference block is larger than the current block.

[0577] Figures 40 (a) and (b) show examples where the scaling parameter α is less than 1, and Figure 40 (c) shows an example where the scaling parameter α is greater than 1.

[0578] Based on the motion vector of the current block, the top-left position of the reference block can be determined. Specifically, the top-left position of the reference block can be set to a position offset by the motion vector from the position corresponding to the top-left sample of the current block within the reference picture. Subsequently, a reference block can be set such that its width and height are each α times the width and height of the current block, respectively, according to a scaling parameter. Fractional position samples within the reference block can be generated by interpolating integer position samples.

[0579] The reference block derived by the motion vector and scaling parameter can be set as the prediction block of the current block.
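
The combination of translation and zooming described above can be sketched as follows. This is a minimal illustration under stated assumptions: the reference picture is a list of rows, prediction sample (i, j) is mapped to the reference position offset by (α·i, α·j) from the top-left of the reference block, and bilinear interpolation stands in for whatever interpolation filter is actually used. The names `bilinear` and `zoom_predict` are hypothetical.

```python
import math


def bilinear(ref, x, y):
    """Sample ref (a list of rows) at fractional (x, y), clamping at the borders."""
    h, w = len(ref), len(ref[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * ref[y0][x0] + fx * ref[y0][x1]
    bottom = (1 - fx) * ref[y1][x0] + fx * ref[y1][x1]
    return (1 - fy) * top + fy * bottom


def zoom_predict(ref, block_x, block_y, width, height, mv, alpha):
    """Predict a width x height block from a reference block alpha times its size."""
    # Top-left of the reference block: block position offset by the motion vector.
    left, top = block_x + mv[0], block_y + mv[1]
    pred = []
    for j in range(height):
        row = []
        for i in range(width):
            # Assumed mapping: step through the reference block in units of alpha.
            row.append(bilinear(ref, left + alpha * i, top + alpha * j))
        pred.append(row)
    return pred
```

With α = 1 the sketch reduces to ordinary translational motion compensation; with α < 1 the 4x4 prediction block is sampled from a smaller area of the reference picture, so fractional positions are interpolated.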

[0580] Meanwhile, information regarding the scaling parameter α can be encoded and signaled. Specifically, a different index is assigned to each of multiple scaling parameter candidates, and an index specifying the scaling parameter candidate applied to the current block can be encoded and signaled.

[0581] Alternatively, the scaling parameter of the current block may be derived based on the scaling parameter of a neighbor block. For example, the scaling parameter of a neighbor block at a predefined location can be set as the scaling parameter of the current block.

[0582] Alternatively, when multiple neighbor blocks are searched sequentially, the scaling parameter of the first available neighbor block found can be set as the scaling parameter of the current block.

[0583] Alternatively, the scaling parameter of a neighbor block can be set as a scaling parameter candidate. In this case, a scaling parameter candidate list containing multiple scaling parameter candidates can be generated by sequentially searching multiple neighbor blocks. One of the scaling parameter candidates included in the list can be set as the scaling parameter of the current block. In this case, an index indicating the candidate, among the multiple scaling parameter candidates, that is identical to the scaling parameter of the current block can be encoded and signaled.

[0584] Meanwhile, the neighbor blocks used to derive the scaling parameter of the current block may include at least one of the top neighbor block, left neighbor block, top-left neighbor block, top-right neighbor block, or bottom-left neighbor block.
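
The neighbor-based derivations above can be sketched as follows. This is only an illustration: the search order, the maximum list size, the pruning of duplicate values, and the default value used when no neighbor is available are all assumptions, and the function names are hypothetical.

```python
def first_available(neighbor_params, default=1.0):
    """Return the parameter of the first available neighbor in search order."""
    for value in neighbor_params:
        if value is not None:  # None marks an unavailable neighbor
            return value
    return default  # assumed fallback: no zooming


def build_candidate_list(neighbor_params, max_candidates=4):
    """Collect candidates by searching the neighbors sequentially."""
    candidates = []
    for value in neighbor_params:
        if value is None:
            continue
        if value in candidates:
            continue  # pruning duplicates is an assumption
        candidates.append(value)
        if len(candidates) == max_candidates:
            break
    return candidates
```

For example, with neighbors searched in the order top, left, top-left, top-right, bottom-left and parameters `[None, 0.5, 0.5, 2.0, None]`, the candidate list would be `[0.5, 2.0]`, and the encoder would signal index 0 or 1 to select between them.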

[0585] Figure 41 shows an example in which motion compensation based on a translational model and a rotational model is performed for the current block.

[0586] For the convenience of explanation, the current block is assumed to have a size of 4x4, as shown in FIG. 38.

[0587] First, as in the example illustrated in FIG. 41 (a), the position of a temporary block within a reference picture can be determined based on the motion vector of the current block. Specifically, the top-left sample of the temporary block can be set to the position that is offset by the motion vector from the position corresponding to the top-left sample of the current block within the reference picture.

[0588] Afterwards, the temporary block can be rotated as in the example shown in FIG. 41 (b). The block at the rotated position is set as a reference block, and the reference block can be set as a prediction block of the current block.

[0589] Meanwhile, a rotation matrix may be used when rotating a temporary block specified by a motion vector. That is, the predicted sample for the current block can be set to a sample at a position obtained by applying a rotation matrix to the sample position within the temporary block.

[0590] Equation 9 represents the rotation matrix.

[0591]

[0592] In Equation 9 above, (pos_x, pos_y) represents the position of a sample within a temporary block. That is, (pos_x, pos_y) can be derived by adding a motion vector to the position of the target sample to be predicted within the current block.

[0593] (pos_x', pos_y') represents the position rotated from the position of the sample within the temporary block, and θ represents the rotation angle.
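
For reference, a standard two-dimensional rotation that is consistent with the symbols defined above can be written as below. The sign convention (counter-clockwise for positive θ) and the choice of the coordinate origin as the rotation center are assumptions; in practice the coordinates would typically be taken relative to a rotation center such as a corner or the center of the temporary block.

```latex
\begin{pmatrix} pos_x' \\ pos_y' \end{pmatrix}
=
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} pos_x \\ pos_y \end{pmatrix}
```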

[0594] The sample value at position (pos_x', pos_y') within the reference picture can be set as the value of the predicted sample for the position of the sample to be predicted. If position (pos_x', pos_y') is a fractional position, the sample at that position can be generated by interpolating integer position samples.
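
The rotation step can be sketched as follows, assuming the standard counter-clockwise rotation matrix and, as a further assumption, rotation about the top-left sample of the temporary block. The function names are hypothetical, and the sketch only computes the rotated positions; sample values at fractional positions would then be obtained by interpolation.

```python
import math


def rotate_position(pos, theta, center=(0.0, 0.0)):
    """Rotate pos by theta (radians) around center with the standard 2-D matrix."""
    x, y = pos[0] - center[0], pos[1] - center[1]
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + center[0], s * x + c * y + center[1])


def rotated_sample_positions(block_x, block_y, width, height, mv, theta):
    """Reference positions for each prediction sample of a width x height block."""
    # (pos_x, pos_y): sample position in the current block plus the motion vector.
    left, top = block_x + mv[0], block_y + mv[1]
    center = (left, top)  # assumed rotation center: top-left of the temporary block
    return [
        [rotate_position((left + i, top + j), theta, center) for i in range(width)]
        for j in range(height)
    ]
```

With θ = 0 the positions coincide with those of the temporary block, so the sketch reduces to translational motion compensation.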

[0595] Meanwhile, information representing the rotation angle θ can be encoded and signaled. For example, after assigning different indices to each of a plurality of rotation angle candidates, the index of the rotation angle candidate corresponding to the rotation angle of the current block can be encoded and signaled.

[0596] Alternatively, the rotation angle of the current block can be derived based on the rotation angle of a neighbor block. For example, the rotation angle of a neighbor block at a predefined position can be set as the rotation angle of the current block.

[0597] Alternatively, when multiple neighbor blocks are searched sequentially, the rotation angle of the first available neighbor block found can be set as the rotation angle of the current block.

[0598] Alternatively, the rotation angle of a neighboring block can be set as a rotation angle candidate. In this case, a rotation angle candidate list containing multiple rotation angle candidates can be generated by sequentially searching multiple neighboring blocks. One of the multiple rotation angle candidates included in the list of multiple rotation angle candidates can be set as the rotation angle of the current block. In this case, an index indicating the candidate among the multiple rotation angle candidates that is identical to the rotation angle of the current block can be encoded and signaled.

[0599] Meanwhile, the neighbor blocks used to derive the rotation angle of the current block may include at least one of the top neighbor block, left neighbor block, top-left neighbor block, top-right neighbor block, or bottom-left neighbor block.

[0600] Although not illustrated, motion compensation for the current block can also be performed by simultaneously applying the translation, zooming, and rotation models.

[0601] Meanwhile, the motion vector precision for the current block or the number and / or types of motion vector precision candidates available for the current block may be determined differently depending on the motion compensation model.

[0602] For example, the number and / or types of motion vector precision candidates available for the current block may differ between the case where only a translation model is applied and the case where at least one of a zooming model or a rotation model is applied.

[0603] As a specific example, when a translation model is applied to the current block, candidates of at least 1 / 4 pixel unit may be available for the current block. On the other hand, when at least one of a zooming model or a rotation model is additionally applied along with the translation model to the current block, candidates of at least 1 / 16 pixel unit may be available for the current block.

[0604] Alternatively, if a translation model is applied to the current block, the motion vector precision of the current block may be set to 1 / 4 pixel units. On the other hand, if at least one of a zooming model or a rotation model is additionally applied to the current block along with the translation model, the motion vector precision of the current block may be set to 1 / 16 pixel units.

[0605] Meanwhile, available motion vector precision or available motion vector precision candidates for each motion compensation model may be stored in the encoder and decoder. Alternatively, information representing available motion vector precision or available motion vector precision candidates for each motion compensation model may be encoded and signaled through an upper header.

[0606] Motion compensation for an affine model, to which a zooming model and / or a rotation model are added to a translation model, can be performed using the motion vector of a control point. Here, the control point may correspond to a corner of the current block. For example, to perform motion compensation based on an affine model, at least one of the motion vector of the top-left corner, the motion vector of the top-right corner, or the motion vector of the bottom-left corner may be used.

[0607] Hereinafter, the motion vector of a control point will be referred to as the control point motion vector.

[0608] Figures 42 and 43 show an example of generating a prediction block for the current block using control point motion vectors.

[0609] For the convenience of explanation, the current block is assumed to have a size of 4x4, as shown in FIG. 38.

[0610] FIG. 42 (a) and (b) illustrate an example in which a prediction block for the current block is derived using the motion vector of the first control point corresponding to the top-left corner of the current block (first control point motion vector, A) and the motion vector of the second control point corresponding to the top-right corner of the current block (second control point motion vector, B).

[0611] Beyond the illustrated examples, it is also possible to derive the predicted block of the current block by additionally utilizing the motion vector of the bottom-left corner or by using the motion vector of the bottom-left corner instead of the top-right corner.

[0612] Figure 44 shows an example of generating a prediction block for the current block using three control point motion vectors.

[0613] FIG. 44 (a) and (b) illustrate an example in which a prediction block for the current block is derived using the motion vector of the first control point corresponding to the top-left corner of the current block (first control point motion vector, A), the motion vector of the second control point corresponding to the top-right corner of the current block (second control point motion vector, B), and the motion vector of the third control point corresponding to the bottom-left corner of the current block (third control point motion vector, C).

[0614] As shown in the examples illustrated in FIGS. 42 to 44, motion compensation for translation, zooming, and rotation of the current block can be performed using two or three control point motion vectors.

[0615] Information indicating the number of control point motion vectors can be encoded and signaled. The information can be signaled on a per-block basis. For example, the information can indicate whether two control point motion vectors or three control point motion vectors are used in the current block.

[0616] Alternatively, the number of control point motion vectors can be adaptively determined based on at least one of the size or shape of the current block.

[0617] Alternatively, if the control point motion vectors of the current block are derived from neighboring blocks, the number of control point motion vectors for the current block can be set to be equal to the number of control point motion vectors of neighboring blocks.

[0618] Using control point motion vectors, sample-specific motion vectors within the current block can be derived. Equation 10 represents a formula for deriving a motion vector for each sample using two control point motion vectors.

[0619]

[0620] In Equation 10 above, (mv_x, mv_y) represents the motion vector at the (x, y) position within the current block. (mv_Ax, mv_Ay) represents the first control point motion vector (A), and (mv_Bx, mv_By) represents the second control point motion vector (B). W represents the width of the current block.
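
For reference, the widely used four-parameter affine form (as found, for example, in VVC) can be written with the symbols defined above as shown below. It is given here as an assumption about the general shape of such a formula, not as the exact equation of the present disclosure.

```latex
\begin{aligned}
mv_x &= \frac{mv_{Bx} - mv_{Ax}}{W}\,x - \frac{mv_{By} - mv_{Ay}}{W}\,y + mv_{Ax} \\
mv_y &= \frac{mv_{By} - mv_{Ay}}{W}\,x + \frac{mv_{Bx} - mv_{Ax}}{W}\,y + mv_{Ay}
\end{aligned}
```

At (x, y) = (0, 0) this form returns the first control point motion vector, and at (x, y) = (W, 0) it returns the second.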

[0621] When three control point motion vectors are used, a motion vector per sample can be derived by Equation 11 below.

[0622]

[0623] In Equation 11 above, (mv_Cx, mv_Cy) represents the third control point motion vector (C).
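
For reference, the widely used six-parameter affine form (as found, for example, in VVC) can be written as shown below. It is an assumption about the general shape of such a formula rather than the exact equation of the present disclosure, and the symbol H, denoting the height of the current block, is introduced here for illustration.

```latex
\begin{aligned}
mv_x &= \frac{mv_{Bx} - mv_{Ax}}{W}\,x + \frac{mv_{Cx} - mv_{Ax}}{H}\,y + mv_{Ax} \\
mv_y &= \frac{mv_{By} - mv_{Ay}}{W}\,x + \frac{mv_{Cy} - mv_{Ay}}{H}\,y + mv_{Ay}
\end{aligned}
```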

[0624] When motion vectors are derived for each sample, motion compensation can be performed for each sample, as in the example illustrated in FIG. 43. Specifically, a reference sample indicated by the motion vector of the sample to be predicted can be set as a prediction sample for the sample to be predicted.

[0625] Meanwhile, if the motion vector of the sample to be predicted is expressed in fractional units, integer position samples can be interpolated to generate fractional position samples, and the generated fractional position samples can be set as prediction samples for the sample to be predicted.

[0626] At this time, the precision of the motion vector for each sample may differ. For example, the motion vector for the first prediction target sample may be derived in units of 1 / 2 pixels, while the motion vector for the second prediction target sample may be derived in units of 1 / 4 pixels.

[0627] In this case, fractional position samples can be generated according to the motion vector precision for each of the prediction target samples. Alternatively, the motion vector of the prediction target sample can be adjusted according to the reference motion vector precision, and then prediction samples for the prediction target sample can be derived based on the adjusted motion vector. For example, if the reference motion vector precision is 1 / 2, the motion vector for the second prediction target sample, which was derived in units of 1 / 4 pixels, can be adjusted to units of 1 / 2 pixels.

[0628] The reference motion vector precision can be determined in block units. Alternatively, the precision of the control point motion vectors can be set to the reference motion vector precision. Alternatively, the reference motion vector precision may be predefined in the encoder and decoder.

[0629] As another example, to reduce complexity, motion vectors can be derived at the sub-block level.

[0630] Figure 45 shows an example in which a motion vector is derived in sub-block units.

[0631] The size and / or shape of the sub-block may be predefined in the encoder and decoder. For example, the sub-block may be a square block of size 2x2 or 4x4.

[0632] Alternatively, the size and / or shape of the sub-block may be adaptively determined based on the size and / or shape of the current block. For example, if the current block is square, the sub-block may also be square. Conversely, if the current block is non-square, the sub-block may also be non-square.

[0633] Alternatively, information regarding at least one of the partitioning method or partitioning form of the current block may be explicitly encoded and signaled. For example, information regarding at least one of the size of a sub-block, the shape of a sub-block, the location of a partition line dividing the current block, or the number of partition lines may be explicitly encoded and signaled. The information may be encoded and signaled on a block-by-block basis, or it may be encoded and signaled through an upper header.

[0634] In FIG. 45, it is assumed that the sub-block is a square block of size 2x2.

[0635] The motion vector of a sub-block can be derived using the coordinates of a predefined location within the sub-block. Here, the predefined location may be one of the location of the top-left sample, the top-right sample, the bottom-left sample, the bottom-right sample, or the center location within the sub-block.

[0636] By substituting the coordinates of a predefined position within the sub-block into (x, y) of Equation 10, the motion vector of the sub-block can be derived.
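
Deriving one motion vector per sub-block can be sketched as follows. The sketch assumes the common four-parameter affine form built from the two control point motion vectors A (top-left) and B (top-right), and uses the center of each sub-block as the predefined position; both choices are assumptions for illustration, and the function names are hypothetical.

```python
def affine_mv_4param(x, y, mv_a, mv_b, width):
    """Motion vector at (x, y) from two control point motion vectors (assumed form)."""
    a = (mv_b[0] - mv_a[0]) / width  # zoom/rotation term from the x components
    b = (mv_b[1] - mv_a[1]) / width  # zoom/rotation term from the y components
    return (a * x - b * y + mv_a[0], b * x + a * y + mv_a[1])


def subblock_mvs(width, height, sub, mv_a, mv_b):
    """Map each sub-block's top-left offset to the motion vector at its center."""
    mvs = {}
    for sy in range(0, height, sub):
        for sx in range(0, width, sub):
            cx, cy = sx + sub / 2, sy + sub / 2  # assumed predefined position: center
            mvs[(sx, sy)] = affine_mv_4param(cx, cy, mv_a, mv_b, width)
    return mvs
```

For a 4x4 block with 2x2 sub-blocks, this yields four motion vectors instead of sixteen per-sample vectors, which is where the complexity reduction comes from. When both control point motion vectors are equal, every sub-block receives that same vector, so the model reduces to pure translation.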

[0637] As in the example described above, motion vectors can be derived in sub-block units based on an affine motion model.

[0638] Meanwhile, motion vectors can also be derived in sub-block units using a collocated picture. Deriving motion vectors in sub-block units using a collocated picture in this way can be referred to as SbTMVP (Sub-block Temporal Motion Vector Prediction).

[0639] A collocated picture may be one of the reference pictures included in the reference picture list. For example, a picture with index 0 in the reference picture list may be selected as the collocated picture.

[0640] Alternatively, information indicating the index of a reference picture set as a collocated picture within the reference picture list may be explicitly encoded and signaled.

[0641] Figures 46 and 47 show examples in which motion vectors are derived in units of sub-blocks within the current block when SbTMVP is applied.

[0642] The size and / or shape of the sub-block may be predefined in the encoder and decoder.

[0643] Alternatively, the size and / or shape of the sub-block may be adaptively determined according to the size and / or shape of the current block. For example, if at least one of the width or height of the current block is greater than a threshold value, the size of the sub-block may be set to 8x8. Otherwise, the size of the sub-block may be set to 4x4.

[0644] Alternatively, information indicating the size and / or shape of the sub-block may be explicitly encoded and signaled.

[0645] In the example illustrated in Fig. 46, it is assumed that the current block size is 16x16 and the sub-block size is 4x4.

[0646] When SbTMVP is applied, the initial motion vector of the current block can be derived. The initial motion vector can be derived based on at least one of a motion vector prediction list or a motion information merge list. For example, an index indicating one of the motion vector prediction candidates included in the motion vector prediction list can be encoded and signaled. The initial motion vector can be derived by adding a motion vector difference value to the motion vector prediction candidate indicated by the index. Meanwhile, the motion vector difference value can also be explicitly encoded and signaled.

[0647] Alternatively, the encoding of the index may be omitted, and a motion vector prediction candidate with a predefined index within the motion vector prediction list may be set as the prediction value for the initial motion vector. Here, the motion vector prediction candidate with a predefined index may be a motion vector prediction candidate with an index of 0 or a motion vector prediction candidate with the largest index.

[0648] Alternatively, an index indicating one of the motion information merge candidates included in the motion information merge list may be encoded and signaled. The initial motion vector may be set to be identical to the motion vector of the motion information merge candidate indicated by the index.

[0649] Alternatively, the encoding of the index can be omitted, and an initial motion vector can be derived based on a motion information merging candidate having a predefined index within the motion information merging list. Here, the motion information merging candidate having a predefined index may be a motion information merging candidate with an index of 0 or a motion information merging candidate with the largest index.

[0650] Alternatively, an initial motion vector can be derived using the motion vector of a neighbor block at a predefined position. Here, the neighbor block at the predefined position may be a left neighbor block or a top neighbor block.

[0651] The motion vector of a neighbor block at a predefined position can be set as the predicted value of the initial motion vector, and the initial motion vector can be derived by adding a difference value to the predicted value.

[0652] Alternatively, the motion vector of a neighbor block at a predefined position can be set as the initial motion vector.

[0653] Alternatively, the initial motion vector can be derived using a template-based motion estimation method (i.e., a template matching method) or two-way matching (also known as bilateral matching).

[0654] The precision of the initial motion vector may be predefined in the encoder and decoder. For example, the precision of the initial motion vector may be fixed in integer pixel units.

[0655] Alternatively, information indicating the precision of the initial motion vector may be explicitly encoded and signaled. The information may be an index indicating one of a plurality of motion vector precision candidates.

[0656] When deriving an initial motion vector using motion vector prediction candidates, the motion vector prediction candidates can be derived based on the motion vector precision of the initial motion vector. That is, after adjusting the motion vector prediction candidates to match the motion vector precision of the initial motion vector, the adjusted motion vector prediction candidates can be inserted into the motion vector prediction list.

[0657] When deriving an initial motion vector using motion information merging candidates, the motion information merging candidates can be derived based on the motion vector precision of the initial motion vector. That is, after adjusting the motion information merging candidates according to the motion vector precision of the initial motion vector, the adjusted motion information merging candidates can be inserted into the motion information merging list.

[0658] Meanwhile, among the motion information merging candidates included in the motion information merging list, only those candidates whose reference picture is identical to the collocated picture of the current block can be used to derive the initial motion vector. That is, if the reference picture of a motion information merging candidate is different from the collocated picture of the current block, the initial motion vector may not be derived from that motion information merging candidate.

[0659] If there are multiple candidates among the motion information merging candidates for which the reference picture is identical to the collocated picture of the current block, an index indicating one of the multiple candidates can be encoded and signaled. Alternatively, if there are multiple candidates among the motion information merging candidates for which the reference picture is identical to the collocated picture of the current block, an initial motion vector can be derived from the candidate with the smallest index or the candidate with the largest index among the multiple candidates.

[0660] If a motion information merging candidate has both motion information in the L0 direction and motion information in the L1 direction, one of the motion information in the L0 direction and the motion information in the L1 direction is selected according to a preset priority, and an initial motion vector can be derived from the selected motion information.

[0661] The priority can be determined based on at least one of the magnitude of the motion vector of the motion information merging candidate, the index of the reference picture of the motion information merging candidate, or whether the reference picture of the motion information merging candidate is the same as the collocated picture.

[0662] Alternatively, it may be set to always derive an initial motion vector based on motion information in the L0 direction.

[0663] When initial motion vectors are derived based on a template matching method, motion estimation can be performed according to the precision of the initial motion vectors. For example, if the precision of the initial motion vectors is in the integer pixel unit, motion estimation based on template matching can also be performed only at integer locations.

[0664] Similarly, when an initial motion vector is derived based on two-way matching, motion estimation can be performed according to the precision of the initial motion vector.

[0665] Meanwhile, as a result of the two-way matching, a motion vector for the L0 direction (L0 motion vector) and a motion vector for the L1 direction (L1 motion vector) are derived. In this case, according to a pre-set priority, one of the L0 motion vector and the L1 motion vector can be set as the initial motion vector.

[0666] Alternatively, the L0 motion vector may always be set as the initial motion vector.

[0667] Alternatively, information indicating which of the L0 motion vector and L1 motion vector is set as the initial motion vector may be encoded and signaled.

[0668] Once an initial motion vector is derived, the position of a collocated block within the collocated picture can be determined using the initial motion vector. For example, a block located at a position offset by the initial motion vector from the position corresponding to the current block within the collocated picture can be set as the collocated block. In this case, the position of the collocated block can be determined based on a predefined position within the current block. Here, the predefined position may be the top-left position, top-right position, bottom-left position, bottom-right position, or center position.

[0669] Depending on the division method of the current block, the collocated block can be divided into multiple collocated sub-blocks. Additionally, the motion vector of each collocated sub-block within the collocated block can be set as the motion vector of each sub-block within the current block.

[0670] As another example, the positions of collocated sub-blocks corresponding to each of the sub-blocks within the current block in the collocated picture can be determined using initial motion vectors. In this case, the positions of the collocated sub-blocks can be derived based on predefined positions within the sub-blocks. Here, the predefined positions may be the top-left, top-right, bottom-left, bottom-right, or center positions.

[0671] Subsequently, the motion vector of the collocated sub-block corresponding to the sub-block can be set as the motion vector of the sub-block. Specifically, the motion vector stored at the position within the collocated sub-block that corresponds to a predefined position within the sub-block can be set as the motion vector of the sub-block.

[0672] Meanwhile, if the motion information of the collocated sub-block is unavailable, a predefined motion vector can be set as the motion vector of the sub-block. Here, the predefined motion vector may be a zero vector (i.e., (0, 0)) or an initial motion vector.
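
The per-sub-block derivation and its fallback can be sketched as follows. The sketch is an illustration under assumptions: the motion field of the collocated picture is modeled as a dictionary keyed by integer sample position, with `None` or a missing key meaning that no motion vector is stored there (for example, an intra-coded area); the initial motion vector has integer precision; and the center of each sub-block is used as the predefined position. The function name is hypothetical.

```python
def sbtmvp_motion_vectors(col_motion, block_x, block_y, width, height, sub,
                          init_mv, default_mv=(0, 0)):
    """Derive one motion vector per sub-block from a collocated motion field."""
    mvs = {}
    for sy in range(0, height, sub):
        for sx in range(0, width, sub):
            # Predefined position (assumed: center of the sub-block),
            # shifted by the initial motion vector into the collocated picture.
            cx = block_x + sx + sub // 2 + init_mv[0]
            cy = block_y + sy + sub // 2 + init_mv[1]
            mv = col_motion.get((cx, cy))
            # Fallback when the collocated motion is unavailable.
            mvs[(sx, sy)] = mv if mv is not None else default_mv
    return mvs
```

Passing `default_mv=init_mv` instead of the zero vector selects the other fallback mentioned above.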

[0673] Alternatively, if the motion information of the collocated sub-block corresponding to the sub-block is unavailable, the motion vector of the sub-block may be derived from another location within the collocated sub-block.

[0674] Specifically, when a position corresponding to a predefined position within a collocated sub-block is encoded by intra-prediction, there is no motion vector at that position. For example, if a predefined position is assumed to be a central position (e.g., c10 in FIG. 47), and no motion vector is stored at the central position, the motion vector of the sub-block cannot be derived.

[0675] In this case, the motion vector of the sub-block can be derived based on the motion vector stored at a location different from the center position. Specifically, the motion vector of the sub-block can be derived from the motion vector stored at a location adjacent to the center position (e.g., top adjacent position c6, left adjacent position c9, or top-left adjacent position c5).

[0676] Alternatively, if the center location is unavailable, samples within the collocated sub-block may be searched according to the scan order, and the first available motion vector found may be set as the motion vector of the sub-block. Here, the scan order may be a horizontal scan, a vertical scan, a diagonal scan, or a raster scan.

[0677] Alternatively, if the motion information of the collocated sub-block is unavailable, the motion vector of the collocated block can be set as the motion vector of the sub-block. For example, the motion vector stored at the position within the collocated block that corresponds to a predefined position within the current block can be set as the motion vector of the sub-block.

[0678] As in the example described above, motion vectors can be derived in sub-block units using an affine motion model or SbTMVP. When motion vectors are derived in sub-block units, motion compensation can be performed for each sub-block based on the motion vector of each sub-block.

[0679] By performing motion compensation for each of the sub-blocks, a prediction block for the current block can be obtained. That is, the prediction block may be composed of prediction samples for each of the sub-blocks.

[0680] When detecting movement between frames, the precision of the motion vector can be adjusted. Specifically, the position of each sample within a picture is defined as an integer position. However, the position reflecting the movement can be a real number rather than an integer position.

[0681] Considering this, motion vectors can be searched more finely through reference picture interpolation.

[0682] Figures 48 and 49 are diagrams illustrating examples in which a prediction block is derived according to the precision of the motion vector.

[0683] FIG. 48 shows the position of the current block in the current picture, and FIG. 49 shows the position of the reference block according to the motion vector precision.

[0684] As shown in the example illustrated in FIGS. 48 and 49, the motion vector of the current block can be defined as the distance from the sample corresponding to the top-left position of the current block in the reference picture to the sample corresponding to the top-left position of the reference block in the reference picture.

[0685] FIG. 49 (a) illustrates the case where the motion vector precision of the current block is integer pel, FIG. 49 (b) illustrates the case where the motion vector precision of the current block is 1 / 2 pel, and FIG. 49 (c) illustrates the case where the motion vector precision of the current block is 1 / 4 pel.

[0686] In FIG. 49, the vector precision is expressed up to 1 / 4, but the motion vector can be expressed with even greater precision, such as 1 / 8, 1 / 16, or 1 / 32.

[0687] Meanwhile, information for indicating the motion vector precision of the current block may be encoded and signaled. For example, the information may be an index identifying one of the motion vector precision candidates. Specifically, a different index may be assigned to each of the motion vector precision candidates, and the information may indicate the index of the motion vector precision candidate applied to the current block.

[0688] By adjusting the precision of the motion vectors used for inter prediction, more precise motion vector detection may be possible. If the reference block indicated by the motion vector exists at a real-valued location, the samples at the real-valued location can be generated using samples at integer locations and an interpolation filter. Additionally, motion vectors represented by real numbers can be scaled up to integers for encoding / decoding.

[0689] Thus, the motion vector (MV), motion vector predicted value (MVP), and motion vector difference value (MVD) can be encoded / decoded into integer values through integerization. Specifically, the motion vector, motion vector predicted value, and / or motion vector difference value can be integerized based on the motion vector precision.

[0690] For example, if the motion vector precision is 1 / N, the motion vector difference value MVD can be converted to an integer by multiplying it by N. For example, if the motion vector difference value MVD is (4 / 16, 8 / 16), the motion vector difference value MVD can be converted to an integer by multiplying it by 16. That is, the converted motion vector difference value MVD can be expressed as (4, 8).

[0691] Based on motion vector precision, the actual MVD can be derived from the integerized MVD. For example, if the motion vector precision is 1 / N, the actual MVD can be derived by dividing the integerized MVD by N. For example, if the integerized MVD is (4, 8) and the motion vector precision is 1 / 8, the actual MVD can be (4 / 8, 8 / 8). Or, if the integerized MVD is (4, 8) and the motion vector precision is 1 / 4, the actual MVD can be (4 / 4, 8 / 4).
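
The integerization and restoration described above can be sketched with exact rational arithmetic. The function names are hypothetical, and raising an error when the MVD cannot be represented exactly at the chosen precision is an assumption of this sketch.

```python
from fractions import Fraction


def integerize_mvd(mvd, n):
    """Scale an MVD given as Fractions by n, for a motion vector precision of 1/n."""
    scaled = tuple(c * n for c in mvd)
    if any(s.denominator != 1 for s in scaled):
        raise ValueError("MVD is not representable at precision 1/%d" % n)
    return tuple(int(s) for s in scaled)


def restore_mvd(int_mvd, n):
    """Recover the actual MVD from the integerized MVD at precision 1/n."""
    return tuple(Fraction(c, n) for c in int_mvd)
```

For example, `integerize_mvd((Fraction(4, 16), Fraction(8, 16)), 16)` gives `(4, 8)`, and `restore_mvd((4, 8), 8)` gives the fractions 4/8 and 8/8 in reduced form, matching the examples in the text.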

[0692] Depending on the motion vector precision, the range of representation of the integerized MVD may differ. For example, assume that the motion vector difference value MVD is (4 / 16, 8 / 16) (i.e., (1 / 4, 2 / 4)). When the motion vector precision is 1 / 16, the integerized MVD is derived as (4, 8). On the other hand, when the motion vector precision is 1 / 4, the integerized MVD is derived as (1, 2).

[0693] Comparing the two cases above, if the motion vector precision is adjusted from 1 / 16 to 1 / 4, the value of the integerized MVD can be reduced from (4, 8) to (1, 2).
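The integerization arithmetic described in the preceding paragraphs can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the function names `integerize_mvd` and `restore_mvd` are hypothetical, and Python's `fractions.Fraction` stands in for fractional-sample values.

```python
from fractions import Fraction


def integerize_mvd(mvd, n):
    """Convert a fractional MVD to integers for precision 1/n (multiply by n)."""
    scaled = tuple(component * n for component in mvd)
    # Assumption: the MVD is exactly representable at precision 1/n.
    assert all(s.denominator == 1 for s in scaled), "MVD not representable at 1/n"
    return tuple(int(s) for s in scaled)


def restore_mvd(int_mvd, n):
    """Recover the actual MVD from the integerized MVD (divide by n)."""
    return tuple(Fraction(component, n) for component in int_mvd)


mvd = (Fraction(4, 16), Fraction(8, 16))
at_sixteenth = integerize_mvd(mvd, 16)  # precision 1/16
at_quarter = integerize_mvd(mvd, 4)     # precision 1/4, smaller integer values
```

With the example values from the text, precision 1 / 16 yields (4, 8) and precision 1 / 4 yields (1, 2), so the coarser precision leaves smaller integers to encode.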

[0694] Consequently, depending on the motion vector precision, the number of bits required to encode / decode the integerized motion vector difference value MVD may vary. Accordingly, a motion vector precision that minimizes the number of bits can be selected when encoding / decoding the motion vector difference value MVD. Then, based on the selected motion vector precision, the motion vector difference value MVD can be converted to an integer, and the integerized motion vector difference value MVD can be encoded / decoded. In addition, information regarding the motion vector precision can be additionally encoded / decoded.

[0695] In the decoder, the actual MVD can be restored from the decoded MVD based on the motion vector precision. Then, the motion vector MV can be derived by combining the restored MVD and the motion vector prediction value MVP.

[0696] As described above, adjusting the value of the motion vector difference value MVD, which is encoded / decoded based on motion vector precision, is called the AMVR (Adaptive Motion Vector Resolution) method.

[0697] FIGS. 50 and 51 are diagrams illustrating the processes of encoding and decoding motion vector difference values, respectively, when the AMVR method is applied.

[0698] For the sake of convenience of explanation, it is assumed that the motion vector and the motion vector difference value are expressed in units of 1 / 16 before integerization is performed, and 1 / 16 is referred to as the original motion vector precision.

[0699] The motion vector difference value MVD can be derived by subtracting the motion vector prediction value MVP from the motion vector MV (S5010).

[0700] The motion vector difference value MVD may consist of a horizontal component (i.e., the x-axis component) and a vertical component (i.e., the y-axis component).

[0701] When the motion vector difference value is 0, that is, when both the horizontal and vertical components are 0, the value of the motion vector difference value MVD to be encoded becomes 0 regardless of the motion vector precision. Therefore, when the motion vector difference value MVD is 0, the encoding of AMVR-related information can be omitted (S5020).

[0702] On the other hand, if the motion vector difference value is not zero, that is, if at least one of the horizontal component and the vertical component is not zero, the motion vector precision can be determined (S5030). Meanwhile, the motion vector precision can be encoded as AMVR-related information.

[0703] Information related to AMVR may include at least one of a flag (e.g., amvr_flag) indicating whether the AMVR method is applied to the current block and an index (e.g., amvr_prec_idx) indicating one of a plurality of motion vector precision candidates if the AMVR method is applied.

[0704] If the AMVR method is not applied to the current block, the motion vector precision can be set to a default value. In this case, amvr_flag can be encoded as a value of 0. Meanwhile, the default value can be 1, 1 / 2, 1 / 4, 1 / 8, or 1 / 16.

[0705] When the AMVR method is applied to the current block, an index indicating one of multiple motion vector precision candidates, i.e., amvr_prec_idx, may be additionally encoded. In this case, amvr_flag is encoded with a value of 1, and amvr_prec_idx may be encoded with a value from 0 to (n-1). Here, n represents the number of motion vector precision candidates. For example, the motion vector precision candidates may include at least one of 4, 2, 1, 1 / 2, 1 / 4, 1 / 8, or 1 / 16. Meanwhile, the default value may be excluded from the motion vector precision candidates indicated by the index. That is, if the motion vector precision of the current block is the default value, amvr_flag is encoded and signaled with a value of 0, and the encoding of amvr_prec_idx may be omitted.

[0706] In the encoder, the optimal motion vector precision can be determined by performing Rate Distortion Optimization (RDO) for each combination of amvr_flag and amvr_prec_idx. That is, by performing RDO for the following cases, the combination with the optimal cost can be selected.

[0707] 1) When amvr_flag is 0

[0708] 2) When amvr_flag is 1 and amvr_prec_idx is 0

[0709] 3) When amvr_flag is 1 and amvr_prec_idx is 1

[0710] 4) When amvr_flag is 1 and amvr_prec_idx is 2
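The selection among the four cases listed above can be sketched as a minimum-cost search. This is an illustrative sketch only: `select_amvr` and `rd_cost` are hypothetical names, the cost values are made up, and how the rate-distortion cost is actually computed is encoder-specific and not specified here.

```python
# Candidate (amvr_flag, amvr_prec_idx) combinations from cases 1) to 4).
# None means amvr_prec_idx is not signaled.
CANDIDATES = [(0, None), (1, 0), (1, 1), (1, 2)]


def select_amvr(rd_cost):
    """Return the candidate with the smallest cost.

    rd_cost is a hypothetical callable mapping (amvr_flag, amvr_prec_idx)
    to a rate-distortion cost.
    """
    return min(CANDIDATES, key=lambda cand: rd_cost(*cand))


# Toy cost table, for illustration only.
toy_costs = {(0, None): 12.5, (1, 0): 11.0, (1, 1): 9.75, (1, 2): 14.0}
best = select_amvr(lambda flag, idx: toy_costs[(flag, idx)])
```

With the toy costs above, the combination amvr_flag = 1 and amvr_prec_idx = 1 is selected because it has the smallest cost.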

[0711] Depending on the motion vector precision of the current block, a variable for scaling the motion vector difference value, i.e., a scaling parameter, can be set. For example, Table 6 shows the values of the variable amvrshift according to the motion vector precision.

[0712] [Table 6]
amvr_flag | amvr_prec_idx | amvrshift
0 (1 / 4) | - | 2
1 | 0 (1 / 2) | 3
1 | 1 (1-pel) | 4
1 | 2 (4-pel) | 6

[0713] If the finest motion vector precision applicable to the current block is 1 / 16, the motion vector precision can be expressed as shown in Equation 12 below.

[0714] [Equation 12]
$$\text{motion vector precision} = \frac{2^{\text{amvrshift}}}{16}$$

[0715] As shown in Table 6, when the value of amvr_flag is 0, the variable amvrshift is set to 2. This indicates that the motion vector precision is 1 / 4 according to Equation 12.

[0716] When the value of amvr_flag is 1, the variable amvrshift can be determined according to the value of amvr_prec_idx. For example, when amvr_prec_idx is 1, the variable amvrshift is set to 4. This indicates that the motion vector precision is 1 according to Equation 12.
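The mapping of Table 6 and the resulting precision can be sketched as a lookup. The sketch assumes that Equation 12 has the form 2^amvrshift / 16, which is consistent with the two examples above (amvrshift 2 gives 1 / 4, amvrshift 4 gives 1); the names `AMVR_SHIFT`, `amvr_shift`, and `mv_precision` are hypothetical.

```python
from fractions import Fraction

# (amvr_flag, amvr_prec_idx) -> amvrshift, following Table 6.
AMVR_SHIFT = {(0, None): 2, (1, 0): 3, (1, 1): 4, (1, 2): 6}


def amvr_shift(amvr_flag, amvr_prec_idx=None):
    """Look up amvrshift; amvr_prec_idx is ignored when amvr_flag is 0."""
    if amvr_flag == 0:
        return AMVR_SHIFT[(0, None)]
    return AMVR_SHIFT[(1, amvr_prec_idx)]


def mv_precision(shift, finest_denominator=16):
    """Precision in samples, assuming the finest precision is 1/16."""
    return Fraction(1 << shift, finest_denominator)


precisions = [mv_precision(amvr_shift(flag, idx)) for flag, idx in AMVR_SHIFT]
```

Under these assumptions the four rows of Table 6 evaluate to 1 / 4, 1 / 2, 1, and 4 samples, matching the precision labels in the table.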

[0717] In the encoder, the motion vector difference value MVD can be scaled down and encoded using the variable amvrshift, which is based on the motion vector precision. For example, Equation 13 shows an example of a scale-down operation being performed on the motion vector difference value MVD.

[0718] [Equation 13]
$$MVD'_x = MVD_x \gg \text{amvrshift}, \qquad MVD'_y = MVD_y \gg \text{amvrshift}$$

[0719] In Equation 13 above, MVD_x represents the horizontal component of the motion vector difference value, and MVD_y represents the vertical component of the motion vector difference value. MVD'_x and MVD'_y represent the results of performing the scale-down operation.

[0720] The encoder can encode the motion vector difference value with the changed precision, together with the AMVR-related information (S5040).

[0721] In the decoder, the motion vector difference value MVD can be decoded (S5110).

[0722] If the motion vector difference value is 0, the decoding of AMVR-related information is omitted, and the motion vector MV of the current block can be set to be the same as the motion vector prediction value (S5120).

[0723] On the other hand, if the motion vector difference value is not zero, that is, if at least one of the horizontal component and the vertical component is not zero, information related to AMVR can be additionally decoded (S5130).

[0724] Based on the AMVR-related information, the variable amvrshift for scaling the motion vector difference value can be derived. For example, as shown in Table 6, the variable amvrshift can be derived based on amvr_flag and / or amvr_prec_idx.

[0725] Afterwards, the decoded MVD can be scaled up using the variable amvrshift to obtain the motion vector difference value MVD restored to the original precision (S5140). Equation 14 shows an example of applying a scale-up operation to the decoded MVD.

[0726] [Equation 14]
$$MVD = MVD' \ll \text{amvrshift}$$

[0727] In Equation 14, MVD' represents the decoded motion vector difference value. MVD represents the motion vector difference value restored to its original precision, i.e., 1 / 16, through a scale-up operation.

[0728] Afterwards, the motion vector MV can be obtained by combining the motion vector difference value MVD restored to the original precision and the motion vector prediction value MVP.
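The encoder-side scale-down and the decoder-side steps S5110 to S5140 can be sketched as follows. The sketch assumes that the scale-down and scale-up operations are plain bit shifts by amvrshift and that motion vectors are stored as integers in 1 / 16-sample units; the function names `scale_down` and `decode_mv` are hypothetical.

```python
def scale_down(mvd, shift):
    """Encoder side: reduce an MVD in 1/16 units to the signaled precision."""
    # Assumption: each component is a multiple of (1 << shift), so the
    # arithmetic right shift is exact, including for negative components.
    return tuple(component >> shift for component in mvd)


def decode_mv(mvp, decoded_mvd, shift=None):
    """Decoder side: derive MV (1/16 units) from MVP and the decoded MVD."""
    if decoded_mvd == (0, 0):
        # Zero MVD: AMVR-related information is not parsed and MV equals MVP.
        return mvp
    # Scale the decoded MVD back up to the original 1/16 precision.
    mvd = tuple(component << shift for component in decoded_mvd)
    return tuple(p + d for p, d in zip(mvp, mvd))


mvp, mv = (35, -20), (51, -52)              # 1/16-sample units
mvd = (mv[0] - mvp[0], mv[1] - mvp[1])      # (16, -32)
coded = scale_down(mvd, 4)                  # 1-pel precision -> (1, -2)
restored_mv = decode_mv(mvp, coded, 4)
```

In this example the MVD (16, -32) is signaled as (1, -2) at 1-pel precision, and the decoder recovers the original motion vector (51, -52).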

[0729] As in the example above, when a motion vector prediction mode is applied, the decoder can derive the motion vector MV by combining the motion vector prediction value MVP and the motion vector difference value MVD.

[0730]

[0731] The predicted block of the current block can also be derived from the restored region within the current picture. Specifically, based on the block vector of the current block, a reference block within the current picture can be identified, and the reference block can be set as the predicted block of the current block. That is, the block vector can be set as the position difference between the reference block and the current block.

[0732] FIG. 52 is a diagram illustrating a search area where the prediction vector of the current block is derived.

[0733] In the example illustrated in FIG. 52, w0 and w1 are variables related to the width of the search area, and h0 and h1 are variables related to the height of the search area. Specifically, an upper restored area of size ((w0+w1) x h0) and a left restored area of size (w1 x (h0+h1)) can be set as the search area.

[0734] In the encoder, the reference block most similar to the current block within the search area can be determined. Subsequently, the position difference between the current block and the reference block is set as the block vector, and information regarding the block vector can be encoded and signaled so that the decoder can derive the block vector.
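The encoder-side search can be sketched as an exhaustive comparison over candidate positions. This is a simplified sketch under several assumptions: similarity is measured by the sum of absolute differences (SAD), a single 2-D list stands in for both the original and the restored samples, the candidate positions inside the search area are supplied by the caller, and the block vector is taken as reference position minus current position. All names are hypothetical.

```python
def sad(picture, x0, y0, x1, y1, w, h):
    """Sum of absolute differences between two w x h blocks of `picture`."""
    return sum(abs(picture[y0 + j][x0 + i] - picture[y1 + j][x1 + i])
               for j in range(h) for i in range(w))


def find_block_vector(picture, cur_x, cur_y, w, h, candidates):
    """Return (bv_x, bv_y) for the candidate block closest to the current block.

    `candidates` lists top-left positions of reference blocks in the search area.
    """
    ref_x, ref_y = min(
        candidates,
        key=lambda pos: sad(picture, cur_x, cur_y, pos[0], pos[1], w, h))
    return ref_x - cur_x, ref_y - cur_y


picture = [
    [1, 2, 9, 9],
    [3, 4, 9, 9],
    [9, 9, 1, 2],
    [9, 9, 3, 4],
]
# Current 2x2 block at (2, 2); candidates lie above and to the left of it.
bv = find_block_vector(picture, 2, 2, 2, 2, [(0, 0), (2, 0), (0, 2)])
```

Here the block at (0, 0) matches the current block exactly, so the resulting block vector is (-2, -2).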

[0735] Meanwhile, a prediction method using block vectors can be referred to as an intra-block copy mode.

[0736] Intra-block copy mode can be applied to both the luminance and chroma components. Alternatively, intra-block copy mode can be applied to only one of the luminance or chroma components.

[0737] Block vectors can be encoded based on a motion vector prediction mode or a motion information merging mode.

[0738] For example, similar to the case where a motion vector prediction mode is applied, a Block Vector Prediction List (BVP List) can be constructed, and the difference between a block vector predictor selected from the Block Vector Prediction List and a block vector can be encoded. Additionally, information indicating a selected block vector predictor within the Block Vector Prediction List can be encoded.

[0739] Alternatively, similar to the case where the motion information merging mode is applied, a block vector merging list can be constructed, and information indicating the block vector merging candidate whose block vector is identical to the block vector of the current block can be encoded.

[0740] Alternatively, the block vector of the current block can be derived through template matching. That is, after searching for the reference template most similar to the template of the current block within the current picture, the positional difference between the current template and the reference template can be set as the block vector of the current block.

[0741]

[0742] As explained, when deriving prediction samples of the current block based on a directional mode, fractional position reference samples can be derived by interpolating integer position reference samples according to the angle of the directional mode. In this case, one of a plurality of interpolation filters can be used to generate fractional position reference samples.

[0743] In the encoder, an index indicating one of a plurality of interpolation filters can be encoded and signaled. In the decoder, based on the interpolation filter indicated by the index, integer position reference samples can be interpolated to generate fractional position reference samples.

[0744] Alternatively, according to predefined conditions, one of a plurality of interpolation filters can be adaptively selected. The predefined conditions may include at least one of the size of the current block (e.g., the number of samples), the index of the reference sample line, or the difference between the intra prediction mode of the current block and the reference mode. Here, the reference mode may include at least one of a horizontal direction mode (e.g., mode 18) or a vertical direction mode (e.g., mode 50).

[0745] For example, when the intra prediction mode of the current block is one of the intra prediction modes facing upward (e.g., intra prediction modes from 34 to 66), the reference mode can be set to the vertical direction mode (i.e., intra prediction mode 50).

[0746] On the other hand, when the intra prediction mode of the current block is one of the intra prediction modes facing left (e.g., intra prediction modes from 2 to 33), the reference mode can be set to the horizontal direction mode (i.e., intra prediction mode 18).

[0747] Table 7 illustrates the conditions for selecting one of a plurality of interpolation filters.

[0748] [Table 7]
Condition | Threshold T
n < 64 | 24
64 <= n < 256 | 14
256 <= n < 1024 | 2
n >= 1024 | 0

[0749] In Table 7, n represents the size of the current block, i.e., the number of samples in the current block. As shown in the example illustrated in Table 7, the threshold T can be set differently according to the size of the current block.

[0750] One of multiple interpolation filters can be selected by comparing the absolute value of the difference between the intra prediction mode of the current block and the reference mode with the threshold T.

[0751] For example, the difference D between the current block's intra prediction mode and reference mode can be calculated, and then the absolute value of the difference D can be taken.

[0752] If the absolute value of the difference |D| is greater than the threshold T, a first interpolation filter may be used. Conversely, if the absolute value of the difference |D| is not greater than the threshold T, a second interpolation filter may be used. Between the first interpolation filter and the second interpolation filter, at least one of the number of taps or filter coefficients may differ. By adaptively selecting an interpolation filter according to a predefined condition, the same interpolation filter may be used in the encoder and decoder without encoding / decoding an index indicating one of the multiple interpolation filters.
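The condition-based selection described above can be sketched as follows. The thresholds follow Table 7 and the mode ranges follow the examples in the text (modes 34 to 66 compared with vertical mode 50, modes 2 to 33 compared with horizontal mode 18); the function names and the return labels "first" and "second" are hypothetical.

```python
HOR_MODE, VER_MODE = 18, 50


def threshold(n):
    """Threshold T from Table 7; n is the number of samples in the block."""
    if n < 64:
        return 24
    if n < 256:
        return 14
    if n < 1024:
        return 2
    return 0


def reference_mode(intra_mode):
    """Vertical mode for modes 34..66, horizontal mode for modes 2..33."""
    return VER_MODE if intra_mode >= 34 else HOR_MODE


def select_filter(intra_mode, width, height):
    """Pick the first filter if |D| > T, otherwise the second filter."""
    d = abs(intra_mode - reference_mode(intra_mode))
    return "first" if d > threshold(width * height) else "second"


choice_4x4 = select_filter(66, 4, 4)   # |66 - 50| = 16, T = 24
choice_8x8 = select_filter(66, 8, 8)   # |66 - 50| = 16, T = 14
```

For mode 66, a 4x4 block selects the second filter (16 is not greater than 24), while an 8x8 block selects the first filter (16 is greater than 14). Because both sides evaluate the same rule, no filter index needs to be signaled.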

[0753] Alternatively, based on the reference area of ​​the current block, the cost of each of the multiple interpolation filters may be calculated, and one of the multiple interpolation filters may be selected based on the cost.

[0754] For example, based on the intra prediction mode of the current block, prediction samples can be obtained by performing intra prediction on the reference region of the current block. In this case, intra prediction for the reference region can be performed using each of the multiple interpolation filters.

[0755] Subsequently, the difference between the predicted samples within the reference region obtained through intra prediction and the restored samples within the reference region can be set as the cost of the interpolation filter.

[0756] Once the cost of each of the multiple interpolation filters is calculated, the interpolation filter with the smallest cost can be used for intra prediction of the current block.
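The cost-based selection can be sketched as follows. The sketch assumes that the "difference" is measured as a sum of absolute differences (SAD) and that the predictions of the reference region for each filter are already available; the filter names and sample values are made up for illustration.

```python
def sad_cost(predicted, restored):
    """Cost of one filter: SAD between predicted and restored samples."""
    return sum(abs(p - r) for p, r in zip(predicted, restored))


def select_by_cost(predictions, restored):
    """predictions: {filter_name: predicted samples of the reference region}.

    Returns the lowest-cost filter, the filters ranked by ascending cost,
    and the per-filter costs.
    """
    costs = {name: sad_cost(pred, restored) for name, pred in predictions.items()}
    ranked = sorted(costs, key=costs.get)
    return ranked[0], ranked, costs


restored = [100, 102, 104, 106]
predictions = {
    "filter_a": [100, 101, 104, 108],
    "filter_b": [100, 102, 105, 106],
    "filter_c": [98, 100, 104, 110],
}
best, ranked, costs = select_by_cost(predictions, restored)
```

The decoder can repeat the same computation on the restored reference region, so the selected filter does not have to be signaled. The ranked list is also what an index rearranged in ascending order of cost would refer to.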

[0757] Alternatively, the indices of the interpolation filters may be rearranged in ascending order of cost, and then the index indicating one of the multiple interpolation filters may be encoded / decoded.

[0758] Alternatively, n interpolation filters may be selected from the multiple interpolation filters in ascending order of cost, and an index indicating one of the n interpolation filters may be encoded / decoded. Here, n may be a value equal to or smaller than the number of interpolation filter candidates N.

[0759] Alternatively, n prediction blocks can be obtained by selecting n interpolation filters in ascending order of cost from among the multiple interpolation filters and performing intra prediction based on each of the n interpolation filters. That is, n prediction blocks can be obtained by utilizing one intra prediction mode and n interpolation filters.

[0760] Subsequently, a weighted sum of the n prediction blocks can be obtained. The weighted sum of the prediction blocks can be set as the final prediction block of the current block. Alternatively, the final prediction block of the current block can be obtained by correcting the weighted sum of the prediction blocks.

[0761] When performing a weighted sum operation, the weights assigned to each of the n prediction blocks can be set to the same value.

[0762] Alternatively, the weights assigned to the prediction blocks can be adaptively determined based on the cost of the interpolation filter. For example, the weights assigned to prediction blocks derived from a low-cost interpolation filter may have larger values than the weights assigned to prediction blocks derived from a high-cost interpolation filter.
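The two weighting options can be sketched as follows. The text only requires that lower-cost filters receive larger weights; the inverse-cost formula used here is one possible choice and is an assumption, as are the function name `blend` and the rounding to integers.

```python
def blend(pred_blocks, costs=None):
    """Weighted sum of n prediction blocks given as flat lists of equal length."""
    n = len(pred_blocks)
    if costs is None:
        weights = [1.0 / n] * n                      # equal weights
    else:
        # Assumption: inverse-cost weighting, so lower cost -> larger weight.
        inverse = [1.0 / (c + 1) for c in costs]
        total = sum(inverse)
        weights = [v / total for v in inverse]
    return [round(sum(w * block[i] for w, block in zip(weights, pred_blocks)))
            for i in range(len(pred_blocks[0]))]


blocks = [[100, 104], [104, 108]]
equal = blend(blocks)                    # weights 1/2 and 1/2
by_cost = blend(blocks, costs=[1, 3])    # weights 2/3 and 1/3
```

With equal weights the result is [102, 106]; with costs 1 and 3 the first block is weighted twice as heavily and the result moves toward it, giving [101, 105].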

[0763] Even when an inter-prediction or intra-block copy mode is applied to the current block, one of multiple interpolation filters can be selected.

[0764] For example, if the motion vector or block vector of the current block indicates a fractional position, fractional position samples can be generated using one of a plurality of interpolation filters. In this case, one of the plurality of interpolation filters can be selected by calculating the cost for each of the plurality of interpolation filters.

[0765] The cost of the interpolation filter can be obtained based on the result of performing prediction on the template of the current block. For example, a reference template for the template of the current block can be derived based on the motion vector or block vector of the current block. In this case, the reference template can be derived using the interpolation filter whose cost is being calculated. That is, if the motion vector or block vector indicates a fractional position, the reference template can be derived by interpolating integer position samples.

[0766] Subsequently, the cost of the interpolation filter can be calculated based on the difference between the restored samples and the predicted samples (i.e., the samples in the reference template) within the current block's template.

[0767] Once the cost of each of the multiple interpolation filters is calculated, the interpolation filter with the smallest cost can be used when applying inter prediction or intra-block copy to the current block.

[0768] Alternatively, the indices of the interpolation filters may be rearranged in ascending order of cost, and then the index indicating one of the multiple interpolation filters may be encoded / decoded.

[0769] Alternatively, n interpolation filters may be selected from the multiple interpolation filters in ascending order of cost, and an index indicating one of the n interpolation filters may be encoded / decoded. Here, n may be a value equal to or smaller than the number of interpolation filter candidates N.

[0770] Alternatively, n interpolation filters can be selected in ascending order of cost from among the multiple interpolation filters, and n prediction blocks can be obtained by performing inter prediction or intra-block copy based on each of the n interpolation filters.

[0771] Subsequently, a weighted sum of the n prediction blocks can be obtained. The weighted sum of the prediction blocks can be set as the final prediction block of the current block. Alternatively, the final prediction block of the current block can be obtained by correcting the weighted sum of the prediction blocks.

[0772]

[0773] After performing prediction on the current block, the residual block can be obtained by subtracting the prediction block from the original block.

[0774] FIG. 53 is a flowchart of a method for encoding a residual block in an encoder, and FIG. 54 is a flowchart of a method for recovering a residual block in a decoder.

[0775] In the encoder, residual coefficients can be obtained by performing at least one of transform or quantization on the residual block (S5310, S5320). If quantization is omitted for the residual block, the residual coefficients may refer to transform coefficients obtained by transforming. Alternatively, if both transform and quantization are performed on the residual block, the residual coefficients may refer to quantized transform coefficients obtained by quantizing the transform coefficients. Alternatively, if the transformation is omitted for the residual block, the residual coefficients may be obtained by quantizing the residual samples.

[0776] In addition, the encoder can encode (specifically, entropy-encode) the residual coefficients and transmit the encoded data to the decoder (S5330).

[0777] The decoder decodes the encoded data to recover the residual coefficients (S5410). Then, it performs inverse quantization on the residual coefficients to derive transform coefficients (i.e., inverse quantized residual coefficients), and performs inverse transform on the transform coefficients to derive the residual block (S5420, S5430).

[0778] Information indicating whether a transformation is applied to the current block may be encoded and signaled. For example, transform_skip_flag may be encoded and signaled. If transform_skip_flag is 1, it indicates that no transformation is applied to the current block. Here, the transformation may include not only the first transformation described later, but also the second transformation. If transform_skip_flag is 0, it indicates that a transformation is applied to the current block. If transform_skip_flag is 0, the first transformation is necessarily applied to the current block, while the second transformation may be optionally applied.

[0779] The transformation for the current block may be performed based on at least one of a plurality of transformation kernel candidates. For example, a transformation kernel applicable to the current block may be a transformation kernel of the DCT (Discrete Cosine Transform) family or a transformation kernel of the DST (Discrete Sine Transform) family.

[0780] Equations 15 through 17 represent the basis functions of transformation kernels applicable to the current block. Equation 15 represents the basis function for DCT-2, Equation 16 represents the basis function for DCT-8, and Equation 17 represents the basis function for DST-7.

[0781]

[0782]

[0783]
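For reference, the forms of the DCT-2, DST-7, and DCT-8 basis functions commonly used in video coding can be sketched as follows. Whether Equations 15 to 17 use exactly this notation is an assumption; here i is the basis function index, j is the sample index, and n is the transform length. The helper `is_orthonormal` is a hypothetical check that each kernel is an orthonormal matrix.

```python
import math


def dct2(n):
    """T_i(j) = w0 * sqrt(2/n) * cos(pi*i*(2j+1)/(2n)), w0 = sqrt(1/2) if i == 0."""
    return [[(math.sqrt(0.5) if i == 0 else 1.0) * math.sqrt(2.0 / n)
             * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
             for j in range(n)] for i in range(n)]


def dst7(n):
    """T_i(j) = sqrt(4/(2n+1)) * sin(pi*(2i+1)*(j+1)/(2n+1))."""
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for i in range(n)]


def dct8(n):
    """T_i(j) = sqrt(4/(2n+1)) * cos(pi*(2i+1)*(2j+1)/(4n+2))."""
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for i in range(n)]


def is_orthonormal(t, tol=1e-9):
    """Check that the rows of t are mutually orthogonal with unit norm."""
    n = len(t)
    return all(abs(sum(t[a][k] * t[b][k] for k in range(n)) - (a == b)) < tol
               for a in range(n) for b in range(n))


kernels_ok = all(is_orthonormal(f(4)) for f in (dct2, dst7, dct8))
```

Orthonormality means that the inverse transform is simply the transpose of the kernel, which is why the decoder can reuse the same kernel definitions.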

[0784] If multiple transformation kernel candidates exist, information indicating the transformation kernel applied to the current block among the multiple transformation kernel candidates can be encoded and signaled. Here, the transformation kernel candidate may include at least one of DCT-2, DST-7, or DCT-8. Additionally, the information may be an index indicating one of the multiple transformation kernel candidates.

[0785] Meanwhile, the transformations for the horizontal direction and the vertical direction of the current block can be separated. In this case, a common transformation kernel can be applied to both the horizontal and vertical directions. That is, if one of multiple transformation kernel candidates is selected, the selected transformation kernel can be applied to both the horizontal and vertical transformations of the current block.

[0786] Alternatively, the transformation kernels for the horizontal and vertical directions can be determined independently. In this case, the encoder can encode and signal information indicating the transformation kernel for the horizontal direction and information indicating the transformation kernel for the vertical direction, respectively.

[0787] Alternatively, multiple transformation kernel combination candidates indicating a combination of a horizontal transformation kernel and a vertical transformation kernel may be defined, and information indicating one of the multiple transformation kernel combination candidates may be encoded and signaled. Table 8 is an example of multiple transformation kernel combination candidates.

[0788] [Table 8]
Index | 0 | 1 | 2 | 3 | 4
Horizontal transformation | DCT-2 | DST-7 | DCT-8 | DST-7 | DCT-8
Vertical transformation | DCT-2 | DST-7 | DST-7 | DCT-8 | DCT-8

[0789] In the decoder, a transformation kernel for the horizontal direction and a transformation kernel for the vertical direction can be determined based on an index indicating one of a plurality of candidate transformation kernel combinations.
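The decoder-side lookup can be sketched as a simple table indexed by the signaled value. The entries follow Table 8; the names `KERNEL_COMBINATIONS` and `kernels_for_index` are hypothetical.

```python
# index -> (horizontal kernel, vertical kernel), following Table 8.
KERNEL_COMBINATIONS = {
    0: ("DCT-2", "DCT-2"),
    1: ("DST-7", "DST-7"),
    2: ("DCT-8", "DST-7"),
    3: ("DST-7", "DCT-8"),
    4: ("DCT-8", "DCT-8"),
}


def kernels_for_index(index):
    """Return the (horizontal, vertical) kernel names for a decoded index."""
    horizontal, vertical = KERNEL_COMBINATIONS[index]
    return horizontal, vertical


horizontal, vertical = kernels_for_index(2)
```

For index 2, the horizontal transformation uses DCT-8 and the vertical transformation uses DST-7.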

[0790] Meanwhile, information indicating whether the transformation kernels for the horizontal and vertical transformations are determined jointly may be encoded / decoded. The information may be a 1-bit flag.

[0791] For example, if the above information indicates that the transformation kernels for the horizontal and vertical transformations are determined jointly, an index indicating one of multiple transformation kernel candidates may be encoded / decoded. The transformation kernel indicated by the index may be applied to both the horizontal and vertical transformations of the current block.

[0792] On the other hand, if the above information indicates that the transformation kernels for the horizontal and vertical transformations are not determined jointly, an index indicating one of multiple transformation kernel combination candidates may be encoded / decoded. The combination of the horizontal transformation kernel and the vertical transformation kernel indicated by the index may be applied to the current block.

[0793] After performing the aforementioned transformation, additional transformations may be performed on the current block. For the sake of convenience of explanation, the transformation performed by a DCT or DST-based transformation kernel will be referred to as the first transformation, and the transformation additionally applied to the result of the first transformation will be referred to as the second transformation. Furthermore, the transformation coefficients generated as a result of the first transformation will be referred to as the first transformation coefficients, and the transformation coefficients generated as a result of the second transformation will be referred to as the second transformation coefficients.

[0794] In the encoder, a second transformation can be performed on the first transformation coefficients generated as a result of performing the first transformation to generate second transformation coefficients.

[0795] When both the first and second transforms are performed in the encoder, the decoder can generate the first transform coefficients by performing a second inverse transform (i.e., the inverse transform of the second transform) on the inversely quantized residual coefficients (i.e., the second transform coefficients). Then, residual samples can be obtained by performing a first inverse transform (i.e., the inverse transform of the first transform) on the first transform coefficients.

[0796] The second transformation may be applied to at least some of the first transformation coefficients. For example, depending on the size of the second transformation kernel, the second transformation may be applied to 16, 48, or 64 first transformation coefficients. The shape of the region containing the first transformation coefficients to which the second transformation is applied may be square, non-square, or polygonal.

[0797] Equation 18 represents how the second transformation is applied.

[0798] [Equation 18]
$$B_{R \times 1} = T_{R \times N} \cdot A_{N \times 1}$$

[0799] When the second transformation is performed, the first transformation coefficients can be arranged in one dimension. For example, in Equation 18 above, $A_{N \times 1}$ represents the first transformation coefficients arranged in N rows and 1 column. Also, $B_{R \times 1}$ represents the second transformation coefficients arranged in R rows and 1 column. $T_{R \times N}$ represents the second transformation kernel consisting of R rows and N columns.

[0800] FIGS. 55 and 56 are diagrams illustrating examples in which the second transformation is applied.

[0801] FIG. 55 illustrates an example where the second transformation kernel is 64x64 in size. The first transformation coefficients generated as a result of the first transformation within an 8x8 block can be arranged in one dimension. At this time, the first transformation coefficients can be scanned using a predetermined scanning method to generate a one-dimensional array. The predetermined scanning method may include at least one of a diagonal scan, a horizontal scan, a vertical scan, or a raster scan.

[0802] When a 64x1 input matrix is generated through the above rearrangement, the second transformation coefficients can be derived through matrix multiplication between the 64x64 second transformation kernel and the 64x1 input matrix.

[0803] As a result of performing the second transformation, 64 second transformation coefficients are generated, and the second transformation coefficients can be rearranged within the 8x8 block. After quantizing the 8x8 block in which the second transformation coefficients have been rearranged, the quantized transformation block can be encoded.
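The 64-coefficient case can be sketched as a scan, a matrix-vector product, and a rearrangement. This is an illustrative sketch: a raster scan is used for both directions, and the kernel is a placeholder permutation (coefficient reversal) because the actual kernel values are not given here. All function names are hypothetical.

```python
def raster_scan(block):
    """Arrange a 2-D block into a one-dimensional list, row by row."""
    return [value for row in block for value in row]


def second_transform(kernel, coeffs):
    """B (R x 1) = T (R x N) * A (N x 1)."""
    assert all(len(row) == len(coeffs) for row in kernel)
    return [sum(t * a for t, a in zip(row, coeffs)) for row in kernel]


def to_block(values, width):
    """Rearrange a one-dimensional list back into rows of the given width."""
    return [values[i:i + width] for i in range(0, len(values), width)]


# 8x8 block of first transformation coefficients (toy values 0..63).
block = [[r * 8 + c for c in range(8)] for r in range(8)]
# Placeholder 64x64 kernel: a permutation that reverses the coefficient order.
kernel = [[1 if j == 63 - i else 0 for j in range(64)] for i in range(64)]
out = to_block(second_transform(kernel, raster_scan(block)), 8)
```

With the placeholder kernel the output block is the input block in reversed scan order, which makes it easy to confirm that the scan, multiplication, and rearrangement steps are wired together correctly.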

[0804] FIG. 56 shows an example where the second transformation kernel is 48x48 in size. Among the first transformation coefficients generated as a result of the first transformation within the 8x8 block, 48 first transformation coefficients can be rearranged in one dimension. At this time, the 48 first transformation coefficients may be included in a polygonal shape area excluding the 4x4 sub-block at the bottom right of the 8x8 block.

[0805] When 48 first transformation coefficients are rearranged into one dimension to generate a 48x1 input matrix, the second transformation coefficients can be derived through matrix multiplication between a 48x48 second transformation kernel and a 48x1 input matrix.

[0806] As a result of performing the second transformation, 48 second transformation coefficients are generated, and the second transformation coefficients can be rearranged within the 8x8 block. For example, the 48 second transformation coefficients can be rearranged in the polygonal area excluding the bottom-right 4x4 sub-block within the 8x8 block.

[0807] In regions where the second transformation coefficients are not placed, the first transformation coefficients may be retained as they are. After applying quantization to a block containing the second transformation coefficients and the first transformation coefficients, the quantized transformation block can be encoded.

[0808] Alternatively, the transformation coefficients in the region where the second transformation coefficients are not placed may be set to 0. That is, the values of the transformation coefficients in the region where the second transformation is not applied may be set to 0, and then quantization and encoding may proceed.

[0809] The size of the second transformation kernel can be determined based on the size of the current block. For example, if at least one of the width or height of the current block is 4, the second transformation can be applied to 16 first transformation coefficients. On the other hand, if the width and height of the current block are 8 or greater, the second transformation can be applied to 48 or 64 first transformation coefficients.
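The size rule in the preceding paragraph can be sketched as follows. The function name is hypothetical, and the sketch returns every count the text allows, since the text does not say how 48 and 64 are distinguished for blocks of 8x8 or larger.

```python
def second_transform_input_counts(width, height):
    """Allowed numbers of first transformation coefficients fed to the
    second transformation, based on the block size."""
    if width == 4 or height == 4:
        return (16,)
    if width >= 8 and height >= 8:
        return (48, 64)
    return ()  # other sizes are not covered by the example in the text


small = second_transform_input_counts(4, 16)
large = second_transform_input_counts(16, 8)
```

A 4x16 block maps to 16 coefficients, while a 16x8 block may use either 48 or 64 coefficients.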

[0810] Alternatively, information indicating the size and type of the second transformation kernel may be encoded and signaled. The information may be signaled at the block level. For example, information specifying at least one of the number of rows or the number of columns of the second transformation kernel may be encoded. Alternatively, a different index may be assigned to each combination of the number of rows and the number of columns, and then an index specifying one of the combinations may be encoded. Alternatively, a different index may be assigned to each of a plurality of second transformation kernel candidates, and then an index specifying one of the second transformation kernel candidates may be encoded. Here, the plurality of second transformation kernel candidates may differ from one another in at least one of size or coefficients.

[0811] Alternatively, based on the size of the current block, the size of the second transformation kernel may be determined, and an index specifying one of a plurality of second transformation kernel candidates having the determined size may be encoded.

[0812] In the example illustrated in FIGS. 55 and 56, a second transformation kernel is shown to be used in which the number of rows and the number of columns are the same. To simplify the second transformation, it is also possible to set the number of rows and the number of columns differently.

[0813] FIGS. 57 and 58 illustrate a second transformation based on an asymmetric form second transformation kernel.

[0814] The number of rows R of the second transformation kernel can be set to a value smaller than the number of columns N. For example, the number of rows R can be set to 8 and the number of columns N can be set to 48.

[0815] If the number of rows of the second transformation kernel decreases, the number of second transformation coefficients output as a result of the second transformation also decreases. For example, if a matrix multiplication is performed between an 8x48 second transformation kernel and a 48x1 input matrix, 8x1 second transformation coefficients are generated.

[0816] The eight second transformation coefficients can be rearranged within an 8x8 block. In this case, within the application area of the second transformation (i.e., the area containing the first transformation coefficients to which the second transformation is applied), the values of the transformation coefficients can be set to 0 in the areas where the second transformation coefficients are not assigned. For example, if the application area of the second transformation is a polygonal area containing 48 samples, the values of the transformation coefficients can be set to 0 in the remaining part of the polygonal area, excluding the positions where the eight second transformation coefficients are assigned.
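The dimension reduction of paragraphs [0814] to [0816] can be illustrated with a small sketch. The kernel values below are a toy selection matrix chosen only to make the shapes visible; they are not actual kernel coefficients. The helper names and the choice to put the outputs at the start of the application area are assumptions.

```python
def apply_reduced_kernel(kernel, coeffs):
    """Multiply an R x N kernel by N input coefficients, giving R outputs."""
    assert all(len(row) == len(coeffs) for row in kernel)
    return [sum(k * c for k, c in zip(row, coeffs)) for row in kernel]


def fill_application_area(secondary, area_size=48):
    """Lay out the R outputs and set the rest of the application area to 0.

    Putting the outputs first is an assumption of this sketch.
    """
    return list(secondary) + [0] * (area_size - len(secondary))


# Toy 8x48 kernel: row r simply picks input r (hypothetical values).
kernel = [[1 if c == r else 0 for c in range(48)] for r in range(8)]
first_coeffs = list(range(48))
second_coeffs = apply_reduced_kernel(kernel, first_coeffs)
area = fill_application_area(second_coeffs)
```

With R = 8 and N = 48, the 48 inputs collapse to 8 outputs, and the other 40 positions of the application area become 0.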

[0817] In regions where the second transformation is not applied, the first transformation coefficients can be maintained as they are.

[0818] Alternatively, at least some of the first transformation coefficients within the region where the second transformation is not applied can be encoded by converting them to zero. FIG. 58 illustrates an example in which at least some of the region where the second transformation is not applied is converted to zero.

[0819] As shown in the example illustrated in FIG. 58 (a), the values of the first transformation coefficients corresponding to the high-frequency region within the region where the second transformation is not performed can be converted to 0. For example, the values of the first transformation coefficients for which the sum of the x-axis and y-axis coordinates is greater than or equal to a threshold value can be converted to 0.

[0820] Alternatively, depending on a specific form, first transformation coefficients that are converted to zero may be selected. For example, as shown in the example illustrated in FIG. 58 (b), the first transformation coefficients included in the bottom n rows within the region where the second transformation is not performed may be converted to zero. Alternatively, as shown in the example illustrated in FIG. 58 (c), the first transformation coefficients included in the right n columns within the region where the second transformation is not performed may be converted to zero.

[0821] Alternatively, as in the example shown in (d) of FIG. 58, all first transformation coefficients within the region where the second transformation is not performed may be converted to 0.
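The four zero-out patterns of paragraphs [0819] to [0821] can be sketched as one function. The pattern names, the `not_applied` callback, and the parameter names are hypothetical and introduced only for illustration.

```python
def zero_out(block, not_applied, pattern, threshold=None, n=None):
    """Return a copy of block with selected first coefficients set to 0.

    not_applied(x, y) is True where the second transformation is not applied.
    pattern: "high_freq" (x + y >= threshold), "bottom_rows" (bottom n rows),
             "right_cols" (right n columns), or "all".
    """
    h, w = len(block), len(block[0])
    out = [row[:] for row in block]
    for y in range(h):
        for x in range(w):
            if not not_applied(x, y):
                continue  # coefficients of the applied area are untouched here
            if pattern == "high_freq":
                hit = x + y >= threshold
            elif pattern == "bottom_rows":
                hit = y >= h - n
            elif pattern == "right_cols":
                hit = x >= w - n
            elif pattern == "all":
                hit = True
            else:
                raise ValueError(f"unknown pattern: {pattern}")
            if hit:
                out[y][x] = 0
    return out


block = [[1] * 8 for _ in range(8)]


def outside(x, y):
    # Assumed non-applied region: the bottom-right 4x4 sub-block.
    return x >= 4 and y >= 4


def total(b):
    return sum(map(sum, b))
```

For example, `zero_out(block, outside, "bottom_rows", n=2)` clears the two bottom rows of the non-applied region only.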

[0822] The shape of the region containing first transformation coefficients that are converted to 0 may be determined based on at least one of the current block size, shape, intra prediction mode, or transformation kernel. Alternatively, an index that specifies one of a plurality of candidate shapes that matches the region may be encoded and signaled.

[0823] Whether the second transformation is permitted can be determined based on at least one of the encoding mode of the current block or the first transformation kernel. Here, the encoding mode refers to intra prediction or inter prediction. For example, if the current block is encoded with intra prediction, the second transformation is permitted, whereas if the current block is encoded with inter prediction, the second transformation may not be permitted.

[0824] Information indicating whether the second transformation has been applied may be encoded and signaled. The information may be a 1-bit flag. Depending on whether the flag is true or false, it may be determined whether the second transformation has been applied to the current block. Alternatively, the information may be index information. An index value of 0 indicates that the second transformation has not been applied to the current block. Conversely, an index value greater than 0 indicates that the second transformation has been applied to the current block. When the index value is greater than 0, the second transformation kernel can be identified by the index.

[0825] Information indicating whether a second transformation has been performed on the current block can be encoded individually for each color component. For example, for each of the luminance component (Y), the first chrominance component (Cb), and the second chrominance component (Cr), information indicating whether a second transformation has been performed can be encoded.

[0826] Alternatively, information indicating whether a second transformation has been performed on the color difference components may be jointly encoded. For example, whether the second transformation is applied may be determined jointly for the color difference components (Cb, Cr). That is, the first color difference component (Cb) and the second color difference component (Cr) may share information indicating whether the second transformation has been performed.

[0827] Alternatively, based on the tree structure, it may be determined whether the above information is encoded for each color component. For example, if the luminance component and the chroma component have the same tree structure, the three color components (i.e., Y, Cb, Cr) may share information indicating whether the second transformation is performed. On the other hand, if the luminance component and the chroma component have different tree structures, information indicating whether the second transformation is performed may be signaled for each of the luminance component and the chroma component, respectively.

[0828] Multiple second transformation kernel candidates can be grouped into multiple groups. One of the multiple groups can be specified based on at least one of the size, shape, or intra prediction mode of the current block. Once a group is specified, at least one of the second transformation kernel candidates included in the specified group can be specified using index information.

[0829] Table 9 shows an example in which one of a plurality of second transformation kernel sets is selected based on the intra prediction mode.

[0830] Table 9

Intra prediction mode | Second transformation kernel set
Intra prediction mode < 0 | 1
0 <= Intra prediction mode <= 1 | 0
2 <= Intra prediction mode <= 12 | 1
13 <= Intra prediction mode <= 23 | 2
24 <= Intra prediction mode <= 44 | 3
45 <= Intra prediction mode <= 55 | 2
56 <= Intra prediction mode <= 80 | 1
81 <= Intra prediction mode <= 83 | 0

[0831] In the example of Table 9, one of four second transformation kernel sets (i.e., second transformation kernel sets from index 0 to index 3) is selected according to the intra prediction mode. A lookup table defining the mapping relationship between the intra prediction modes of Table 9 and the second transformation kernel sets may be stored in the encoder and decoder.

[0832] When the second transformation kernel set is determined, one of the transformation kernels included in the second transformation kernel set can be selected through an index.
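The two-step selection (kernel set from the intra prediction mode, then kernel from an index) can be sketched as a lookup. The ranges are one reading of the rows of Table 9, and the kernel names in `kernel_sets` are placeholders, not kernels defined in the disclosure.

```python
# (low, high, kernel set index), bounds inclusive; one reading of Table 9.
TABLE_9 = [
    (0, 1, 0), (2, 12, 1), (13, 23, 2), (24, 44, 3),
    (45, 55, 2), (56, 80, 1), (81, 83, 0),
]


def kernel_set_for_mode(intra_mode):
    """Map an intra prediction mode to a second transformation kernel set index."""
    if intra_mode < 0:
        return 1
    for low, high, set_idx in TABLE_9:
        if low <= intra_mode <= high:
            return set_idx
    raise ValueError(f"intra prediction mode out of range: {intra_mode}")


def select_kernel(kernel_sets, intra_mode, kernel_idx):
    """Pick the kernel set from the mode, then the kernel by the signaled index."""
    return kernel_sets[kernel_set_for_mode(intra_mode)][kernel_idx]


# Placeholder kernel names (hypothetical), two candidates per set.
kernel_sets = [["k0a", "k0b"], ["k1a", "k1b"], ["k2a", "k2b"], ["k3a", "k3b"]]
chosen = select_kernel(kernel_sets, intra_mode=34, kernel_idx=1)
```

Because the same lookup table is held by both the encoder and the decoder, only `kernel_idx` needs to be signaled.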

[0833] Meanwhile, in the example of Table 9, intra prediction modes 81 through 83 may represent CCLM (Cross-Component Linear Model) modes for the chrominance component. When a CCLM mode is applied, the prediction samples of the chrominance component may be derived by applying prediction parameters to the reconstructed samples of the luminance component. Here, the prediction parameters may include at least one of a weight or an offset.

[0834] Meanwhile, prediction parameters can be derived based on reconstructed samples around the luminance block and chroma block. In this case, the range of reconstructed samples used to derive the prediction parameters may vary depending on the intra prediction mode.

[0835] For example, mode 81 may derive prediction parameters based on the upper reconstructed samples of the luminance block and chroma block, and mode 82 may derive prediction parameters using the upper reconstructed samples and the left reconstructed samples of the luminance block and chroma block. Mode 83 may derive prediction parameters using the left reconstructed samples of the luminance block and chroma block.
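The relationship between the CCLM modes and the neighboring samples they use can be sketched as follows. The sketch assumes the luminance samples have already been brought to the chrominance resolution and leaves the parameter derivation itself out, since the text does not specify it; both function names are hypothetical.

```python
def cclm_template_sides(mode):
    """Neighboring reconstructed samples used to derive the parameters."""
    sides = {81: ("top",), 82: ("top", "left"), 83: ("left",)}
    return sides[mode]


def cclm_predict(luma_rec, weight, offset):
    """Chrominance prediction as weight * luminance + offset, per sample."""
    return [[weight * s + offset for s in row] for row in luma_rec]


pred_chroma = cclm_predict([[10, 20], [30, 40]], weight=2, offset=3)
```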

[0836] Transformation / inverse transformation may be performed only on some regions within the current block. Here, the transformation may include at least one of the first transformation and the second transformation.

[0837] For example, a transformation can be performed only on a portion of the current block, and quantization and entropy encoding can be performed only on the transformation coefficients of that portion. Accordingly, residual coefficient information may be encoded and signaled only for a portion of the current block, and residual coefficient information may not be encoded or decoded for the remaining portion.

[0838] Accordingly, the decoder can obtain residual samples by performing inverse quantization and inverse transform only on a portion of the current block. For the remaining portion of the current block, the values of the residual coefficients (or residual samples) can all be set to 0.

[0839] As described above, performing a transformation or inverse transformation on only a part of the current block can be referred to as a partial transformation. For example, the current block may be divided into two regions, and a transformation or inverse transformation may be performed on only one of the divided regions. In this case, the region where the transformation or inverse transformation is performed can be referred to as the transformation-applied region. Meanwhile, the region where the transformation or inverse transformation is not performed can be referred to as the transformation-unapplied region.

[0840] Information about partial transformations can be encoded and signaled. The information about partial transformations may include information indicating whether partial transformations have been applied to the current block. The information may be a 1-bit flag (e.g., sbt_flag).

[0841] When a partial transformation is applied to the current block, the information regarding the partial transformation may further include at least one of information indicating the division ratio of the current block, information indicating the division direction of the current block, and information indicating the location of the area where the transformation is applied within the current block.

[0842] Information indicating the partition ratio of the current block indicates whether the partition ratio of the current block is 1:3 or 1:1. Information indicating the partition ratio of the current block may be a 1-bit flag (e.g., sbt_quad_flag). A partition ratio of 1:3 indicates that the current block is partitioned into an area 1 / 4 the size of the current block and an area 3 / 4 the size of the current block. A partition ratio of 1:1 indicates that the current block is partitioned into two areas 1 / 2 the size of the current block. The current block may be partitioned into two areas at ratios different from the above examples. For example, the partition ratio may be 7:1 or 15:1.

[0843] Information indicating the division direction of the current block indicates whether the division direction of the current block is horizontal or vertical. Information indicating the division direction of the current block may be a 1-bit flag (e.g., sbt_hor_flag).

[0844] Information indicating the location of the area where the transformation is applied within the current block indicates whether the transformation area is the first area or the second area within the current block. Information indicating the location of the area where the transformation is applied within the current block may be a 1-bit flag (e.g., sbt_pos_flag). In this case, if the partition ratio is not 1:1, the transformation area indicated by sbt_pos_flag may be the larger of the two areas. Or, if the partition ratio is not 1:1, the transformation area indicated by sbt_pos_flag may be the smaller of the two areas.

[0845] FIG. 59 shows an example in which information about partial transformations is sequentially encoded / decoded.

[0846] As shown in the example illustrated in FIG. 59, if the information indicating whether a partial transformation has been applied to the current block (e.g., sbt_flag) indicates that a partial transformation has been applied to the current block, additional information indicating the division ratio of the current block (e.g., sbt_quad_flag), information indicating the division direction of the current block (sbt_hor_flag), and information indicating the location of the area where the transformation is applied within the current block (sbt_pos_flag) may be further encoded / decoded.
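The conditional parsing of paragraph [0846] can be sketched as follows. The callback `read_flag` stands in for the entropy decoder and is hypothetical, and the order of the three follow-up flags mirrors the order in which the text lists them, which is an assumption about the actual bitstream order.

```python
def parse_sbt_info(read_flag):
    """Parse partial-transformation flags; read_flag() returns the next 1-bit flag."""
    info = {"sbt_flag": read_flag()}
    if info["sbt_flag"]:
        # The remaining flags are present only when partial transformation applies.
        info["sbt_quad_flag"] = read_flag()
        info["sbt_hor_flag"] = read_flag()
        info["sbt_pos_flag"] = read_flag()
    return info


bits_on = iter([1, 0, 1, 1])
parsed_on = parse_sbt_info(lambda: next(bits_on))

bits_off = iter([0, 1])
parsed_off = parse_sbt_info(lambda: next(bits_off))
leftover = next(bits_off)  # the flag after sbt_flag was not consumed
```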

[0847] The signaling of sbt_pos_flag described above can be skipped, and the value of sbt_pos_flag can instead be derived on the decoder side. For example, a gradient is derived at each prediction sample position within the prediction block of the current block, and an amplitude value is calculated from it. Then, the value of sbt_pos_flag can be set to 0 or 1 depending on which region contains the sample position with the largest amplitude value. For example, the region containing the position with the largest amplitude value can be set as the region to be transformed or as the region not to be transformed.
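A decoder-side derivation of sbt_pos_flag along these lines might look like the sketch below. Several choices are assumptions: the gradient is a clamped central difference, the amplitude is |gx| + |gy|, ties go to the first position in raster order, and a flag value of 0 is taken to mean that the first region is transformed.

```python
def gradient_amplitude(pred, x, y):
    """|gx| + |gy| from central differences with clamped borders (assumed filter)."""
    h, w = len(pred), len(pred[0])
    gx = pred[y][min(x + 1, w - 1)] - pred[y][max(x - 1, 0)]
    gy = pred[min(y + 1, h - 1)][x] - pred[max(y - 1, 0)][x]
    return abs(gx) + abs(gy)


def derive_sbt_pos_flag(pred, vertical_split, first_size, peak_is_transformed=True):
    """Derive sbt_pos_flag from the position of the largest gradient amplitude.

    first_size is the width (vertical split) or height (horizontal split)
    of the first region. Returns 0 if the first region is transformed.
    """
    h, w = len(pred), len(pred[0])
    best, peak_x, peak_y = -1, 0, 0
    for y in range(h):
        for x in range(w):
            amp = gradient_amplitude(pred, x, y)
            if amp > best:
                best, peak_x, peak_y = amp, x, y
    peak_in_first = (peak_x if vertical_split else peak_y) < first_size
    transform_first = peak_in_first == peak_is_transformed
    return 0 if transform_first else 1


# Prediction block with a sharp edge near the right side.
pred_right = [[10, 10, 10, 10, 10, 10, 100, 100] for _ in range(4)]
# Prediction block with a sharp edge near the left side.
pred_left = [[100, 100, 10, 10, 10, 10, 10, 10] for _ in range(4)]
```

The `peak_is_transformed` switch covers both alternatives in the text: the region containing the peak is either the one that is transformed or the one that is not.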

[0848] FIG. 60 shows an example where a partial transformation is applied to the current block.

[0849] In FIG. 60, area A represents the area where the transformation / inverse transformation is performed. Depending on sbt_pos_flag, the first or second area within the current block may be set as the transformation application area.

[0850] For the remaining region excluding region A, the transformation / inverse transformation may not be performed, and accordingly, the values of the residual coefficients / residual samples within the remaining region may be derived as 0.

[0851] w and h represent the width and height of the current block, respectively. w1 and h1 may be variables used to represent the size of the divided areas.

[0852] For example, if the splitting direction is vertical, w1 can be set to w / 2 or w / 4 depending on the value of sbt_quad_flag.

[0853] If the splitting direction is horizontal, h1 can be set to h / 2 or h / 4 depending on the value of sbt_quad_flag.
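The geometry of paragraphs [0849] to [0853] can be sketched as follows. The flag semantics are assumptions made for the sketch: sbt_quad_flag equal to 1 selects the 1/4 size, sbt_hor_flag equal to 1 selects a horizontal split, the area of size w1 or h1 is the transformed area, and sbt_pos_flag equal to 0 places it at the left or top.

```python
def sbt_transform_rect(w, h, sbt_quad_flag, sbt_hor_flag, sbt_pos_flag):
    """Return the transformed area as (x, y, width, height)."""
    div = 4 if sbt_quad_flag else 2
    if sbt_hor_flag:
        h1 = h // div
        y0 = 0 if sbt_pos_flag == 0 else h - h1
        return (0, y0, w, h1)
    w1 = w // div
    x0 = 0 if sbt_pos_flag == 0 else w - w1
    return (x0, 0, w1, h)


def expand_residual(region_resid, rect, w, h):
    """Place the decoded residual in the block; everything else is 0."""
    x0, y0, rw, rh = rect
    full = [[0] * w for _ in range(h)]
    for dy in range(rh):
        for dx in range(rw):
            full[y0 + dy][x0 + dx] = region_resid[dy][dx]
    return full


rect = sbt_transform_rect(16, 8, sbt_quad_flag=1, sbt_hor_flag=0, sbt_pos_flag=1)
full = expand_residual([[5] * 4 for _ in range(8)], rect, 16, 8)
```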

[0854] The transformation kernel of the transformation application area may be predefined in the encoder and decoder. Specifically, a corresponding transformation kernel may be predefined for each size of the transformation target area or for each location of the transformation target area.

[0855] For example, if the current block is divided vertically (FIG. 60 (a) and (b)), the vertical transformation kernel of the area to be transformed can be determined as DST-7. Meanwhile, if the area to be transformed is the first area within the current block, the horizontal transformation kernel can be determined as DCT-8. Conversely, if the area to be transformed is the second area within the current block, the horizontal transformation kernel can be determined as DST-7. Here, the first area represents the left area within the current block, and the second area represents the right area within the current block.

[0856] If the current block is divided horizontally (FIG. 60 (c) and (d)), the horizontal transformation kernel of the area to be transformed can be determined as DST-7. Meanwhile, if the area to be transformed is the first area within the current block, the vertical transformation kernel can be determined as DCT-8. Conversely, if the area to be transformed is the second area within the current block, the vertical transformation kernel can be determined as DST-7. Here, the first area represents the upper area within the current block, and the second area represents the lower area within the current block.

[0857] As another example, the transformation kernel of the transformation application area can be adaptively determined based on the size of the current block or the transformation application area.

[0858] For example, if at least one of the width or height of the current block or transformation application area is greater than a threshold value, both the horizontal transformation kernel and the vertical transformation kernel may be set to DCT-2.
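The kernel rules of paragraphs [0855], [0856], and [0858] can be combined in one sketch. The function name and the optional `size` and `threshold` parameters are hypothetical; the text does not give a threshold value, so the value 32 used in the usage line is purely illustrative.

```python
def sbt_kernels(vertical_split, first_area, size=None, threshold=None):
    """Return (horizontal_kernel, vertical_kernel) for the transformed area.

    size is an optional (width, height) pair; when a threshold is given and
    either dimension exceeds it, DCT-2 is used in both directions ([0858]).
    """
    if size is not None and threshold is not None and max(size) > threshold:
        return ("DCT-2", "DCT-2")
    # The kernel across the split depends on which area is transformed.
    position_kernel = "DCT-8" if first_area else "DST-7"
    if vertical_split:
        return (position_kernel, "DST-7")
    return ("DST-7", position_kernel)


large = sbt_kernels(True, True, size=(64, 16), threshold=32)
```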

[0859] Meanwhile, partial transformation can be applied not only to the luminance component but also to the chrominance component. In this case, the transformation kernel of the application area may be determined differently depending on the color component.

[0860] For example, in the case of the luma component, the transformation kernel of the transformation application area can be determined according to the example shown in FIG. 60.

[0861] On the other hand, for the color difference component, both the horizontal and vertical transformation kernels of the target area can be determined as DCT-2 (or DCT-8).

[0862] Meanwhile, depending on the encoding mode of the current block, it may be determined whether applying a partial transformation is permitted. Here, the encoding mode may represent intra-prediction or inter-prediction.

[0863] For example, if the current block is encoded with inter prediction, applying a partial transformation to the current block may not be allowed. Accordingly, when the encoding mode of the current block is inter prediction, the encoding / decoding of information regarding the partial transformation may be omitted. In addition, the decoder may infer the value of the information indicating whether a partial transformation has been applied to the current block (i.e., sbt_flag) to be False.

[0864] Alternatively, it may be permitted to apply partial transformations when the encoding mode of the current block is intra-prediction and when it is inter-prediction.

[0865] Meanwhile, the second transform / second inverse transform may be performed only when the encoding mode of the current block is intra prediction. In this case, a second transform kernel set for the second transform / second inverse transform may be determined based on the intra prediction mode of the current block.

[0866] As another example, even when the encoding mode of the current block is inter-prediction, it may be permissible to apply the second transformation / second inverse transformation. However, when the encoding mode of the current block is inter-prediction, the intra-prediction mode of the current block does not exist, so the second transformation kernel set cannot be selected based on the intra-prediction mode. Accordingly, when the encoding mode of the current block is inter-prediction, index information indicating one of the multiple second transformation kernel sets can be encoded and signaled.

[0867] As another example, if the encoding mode of the current block is inter-prediction, the decoder side can derive an intra-prediction mode for the current block in the same way as the encoder. The derived intra-prediction mode may be used only to select a second transformation kernel set and may not be used to predict the current block.

[0868] On the decoder side, an intra prediction mode for the current block can be derived based on the intra prediction mode histogram of the current block. Here, the intra prediction mode histogram can be obtained by applying a filter of a predetermined size to a template area adjacent to the current block.

[0869] Alternatively, the cost of each of multiple intra prediction modes may be calculated, and the intra prediction mode with the lowest cost may be selected as the intra prediction mode for the current block. Here, the cost of the intra prediction mode can be derived based on the difference between the prediction sample and the reconstructed sample obtained through the prediction, after performing a prediction on the template region adjacent to the current block based on the corresponding intra prediction mode.
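The cost-based derivation of paragraph [0869] can be sketched as follows. The sum of absolute differences is used as the cost, which is an assumption (the text only says the cost is based on the difference), and `predict_template` is a hypothetical callback that returns the predicted template samples for a given mode.

```python
def derive_intra_mode(template_rec, predict_template, candidate_modes):
    """Pick the mode whose template prediction is closest to the reconstruction."""
    def cost(mode):
        pred = predict_template(mode)
        return sum(abs(p - r) for p, r in zip(pred, template_rec))
    return min(candidate_modes, key=cost)


# Toy data: reconstructed template samples and per-mode predictions of them.
template_rec = [10, 12, 14, 16]
toy_predictions = {
    0: [13, 13, 13, 13],
    18: [10, 10, 10, 10],
    50: [10, 12, 14, 15],
}
best_mode = derive_intra_mode(template_rec, toy_predictions.__getitem__, [0, 18, 50])
```

Since the template is already reconstructed on both sides, the encoder and decoder can reach the same mode without any signaling.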

[0870]

[0871] The transformation information for the first transformation may be an index indicating one of the transformation kernel combination candidates included in the list of transformation kernel combination candidates (e.g., Table 8) that is defined in the encoder and decoder.

[0872] Meanwhile, to improve encoding / decoding efficiency, the transformation kernel combination candidates included in the transformation kernel combination candidate list may be reordered based on their costs, and the index of one of the reordered transformation kernel combination candidates may be encoded and signaled.

[0873] At this time, the cost of each transformation kernel combination candidate can be derived based on the restoration area around the current block.

[0874] FIG. 61 is a diagram illustrating an example of calculating the cost of a transformation kernel combination candidate.

[0875] As shown in the example illustrated in FIG. 61, the top restoration area adjacent to the top of the current block and the left restoration area adjacent to the left of the current block can be set as reference areas.

[0876] Subsequently, the cost of a candidate transformation kernel combination can be calculated based on the difference between the restored samples included in the reference region and the residual samples located at the boundary of the current block.

[0877] Specifically, assuming the coordinates of the top-left sample within the current block are (0, 0), the cost of the transformation kernel combination candidates can be derived based on the following Equations 19 to 21.

[0878]

[0879]

[0880]

[0881] In Equations 19 and 20, R(x, y) represents the restored sample at position (x, y).

[0882] ri(x, y) represents a residual sample within the current block. Here, i represents the index of the transformation kernel combination candidate for which the cost is calculated. That is, ri(x, y) may represent the values of the residual samples obtained when the transformation kernel combination candidate with index i is applied for cost calculation. For example, r0(x, y) represents the residual samples obtained when DCT2 is used for both the horizontal transformation type and the vertical transformation type.

[0883] On the other hand, while the value of a restored sample outside the current block is fixed, the value of a restored sample within the current block can vary depending on the transformation kernel combination candidate. In other words, the value of a restored sample within the current block is determined according to the transformation kernel combination candidate for which the cost is to be calculated.

[0884] Specifically, the encoder derives a prediction block for the current block and then generates a residual block by differencing the prediction block from the original block. Subsequently, a transformation is performed on the residual block based on each of the transformation kernel combination candidates. Additionally, the encoder can generate residual coefficients by performing quantization on the transformation coefficients. Afterward, the encoder can reconstruct the residual block by performing inverse quantization on the residual coefficients and then performing an inverse transformation on the inversely quantized residual coefficients.

[0885] In the decoder, the residual block can be restored by performing inverse quantization on the residual coefficients of the current block and then performing an inverse transform on the inversely quantized residual coefficients.

[0886] In the above transformation / inverse transformation process, the transformation / inverse transformation can be performed based on each of the candidate transformation kernel combinations.

[0887] FIG. 62 shows an example in which a transformation / inverse transformation is performed based on each of the candidate transformation kernel combinations.

[0888] When there are five candidate transform kernel combinations (for example, candidate transform kernel combinations with indices 0 to 4), the encoder can generate transform coefficients by performing a transform on the residual block based on each of the five candidate transform kernel combinations. Then, the encoder can select the optimal candidate transform kernel combination among the multiple candidate transform kernel combinations, taking into account the encoding efficiency.

[0889] Additionally, in the encoder and decoder, an inverse transform can be performed on the inversely quantized residual coefficients based on each of the transform kernel combination candidates. Subsequently, based on the residual blocks obtained based on each transform kernel combination candidate, the reconstructed samples of the current block are derived, and based on the derived reconstructed samples, the cost of each transform kernel combination candidate can be calculated.

[0890] Equation 19 represents the first cost calculated based on the restored samples located at the upper boundary of the current block.

[0891] Specifically, the vertical gradient at an upper boundary position of the current block can be set to the absolute value of the result obtained by subtracting the upper restored sample (i.e., the restored sample R(x, -1) within the reference region) and the lower restored sample (i.e., the restored sample (P(x, 1) + ri(x, 1)) within the current block) from twice the value of the restored sample R(x, 0) at that upper boundary position. Subsequently, the vertical gradients at the positions along the upper boundary of the current block can be summed to calculate the first cost c0.

[0892] Equation 20 represents the second cost calculated based on the restored samples located at the left boundary of the current block.

[0893] Specifically, the horizontal gradient at a left boundary position of the current block can be set to the absolute value of the result obtained by subtracting the left restored sample (i.e., the restored sample R(-1, y) within the reference region) and the right restored sample (i.e., the restored sample (P(1, y) + ri(1, y)) within the current block) from twice the value of the restored sample R(0, y) at that left boundary position. Subsequently, the horizontal gradients at the positions along the left boundary of the current block can be summed to calculate the second cost c1.

[0894] Afterwards, as shown in Equation 21, the cost of the candidate transformation kernel combination can be calculated by adding the first cost and the second cost.
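Read together, paragraphs [0881] to [0894] suggest the following form for Equations 19 to 21. This is a reconstruction from the prose only: the summation limits over the block width w and height h, the reading of P as the prediction sample, and the identity R(x, y) = P(x, y) + r_i(x, y) inside the current block are assumptions.

```latex
\begin{align}
c_0 &= \sum_{x=0}^{w-1} \bigl|\, 2R(x,0) - R(x,-1) - \bigl(P(x,1) + r_i(x,1)\bigr) \bigr| \tag{19} \\
c_1 &= \sum_{y=0}^{h-1} \bigl|\, 2R(0,y) - R(-1,y) - \bigl(P(1,y) + r_i(1,y)\bigr) \bigr| \tag{20} \\
\mathrm{cost}_i &= c_0 + c_1 \tag{21}
\end{align}
```

Each summand is a second difference across the block boundary, so a candidate whose residual produces a smooth transition from the reference region into the current block receives a low cost.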

[0895] Once the cost of each candidate transformation kernel combination is calculated, the candidate transformation kernel combinations can be reordered according to the cost.

[0896] FIG. 63 shows an example where candidates for transformation kernel combinations are reordered according to cost.

[0897] Candidate transformation kernel combinations can be reordered in ascending order of cost. That is, the smallest index can be assigned to the candidate transformation kernel combination with the smallest cost, and the largest index can be assigned to the candidate transformation kernel combination with the largest cost.
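The boundary cost and the ascending reordering can be sketched together. The cost follows one reading of the prose around Equations 19 to 21 (a second difference across the upper and left boundaries); the array layout, the function names, and the stable tie-breaking are assumptions.

```python
def boundary_cost(above, left, pred, resid):
    """Cost of one candidate: c0 (upper boundary) + c1 (left boundary).

    above[x] is R(x, -1) and left[y] is R(-1, y) in the reference areas;
    pred and resid are indexed [y][x] and give P and r_i inside the block.
    """
    h, w = len(pred), len(pred[0])

    def rec(x, y):
        # Restored sample inside the block for this candidate.
        return pred[y][x] + resid[y][x]

    c0 = sum(abs(2 * rec(x, 0) - above[x] - rec(x, 1)) for x in range(w))
    c1 = sum(abs(2 * rec(0, y) - left[y] - rec(1, y)) for y in range(h))
    return c0 + c1


def reorder_by_cost(candidates, costs):
    """Ascending cost; ties keep their original order (sorted is stable)."""
    order = sorted(range(len(candidates)), key=lambda i: costs[i])
    return [candidates[i] for i in order]


pred = [[10, 10], [10, 10]]
above, left = [10, 10], [10, 10]
smooth = boundary_cost(above, left, pred, [[0, 0], [0, 0]])
rough = boundary_cost(above, left, pred, [[4, 0], [0, 0]])
reordered = reorder_by_cost(["A", "B", "C"], [16, 0, 5])
```

Because the cost uses only samples available to both sides, the decoder can rebuild the same order and interpret the signaled index identically.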

[0898] Subsequently, based on the updated list of candidate transformation kernel combinations, the transformation type information of the current block can be encoded and signaled. That is, the index of the candidate selected as the transformation kernel combination of the current block within the updated list of candidate transformation kernel combinations can be encoded and signaled.

[0899] In the decoder, the list of candidate transformation kernel combinations is updated in the same way as in the encoder, and the inverse transformation can be performed based on the candidate transformation kernel combinations pointed to by the indices in the updated list.

[0900] The transformation kernel combination candidates used in the current block can be determined based on the costs of the transformation kernel combination candidates. For example, when the transformation kernel combination candidates are sorted in ascending order of cost, only the top N candidates can be determined as available for use in the current block.

[0901] Alternatively, it may be determined that only candidates whose costs are above or below a threshold value are available in the current block.

[0902] FIG. 64 shows an example where some of the transformation kernel combination candidates are set to be available in the current block.

[0903] In the example illustrated in FIG. 64, it is illustrated that three candidates for transformation kernel combinations with lower costs among five candidates for transformation kernel combinations are determined to be available. Meanwhile, the three available candidates for transformation kernel combinations can be rearranged in ascending order of cost.

[0904] An index indicating one of the candidate transformation kernel combinations available in the current block can be encoded and signaled. That is, the encoded index can indicate one of the top N candidates with the smallest cost.

[0905] If the number of available transformation kernel combination candidates is 1, the encoding / decoding of the index may be omitted. In this case, a transformation kernel combination candidate with index 0 may be selected for the current block.
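Restricting the list to the N lowest-cost candidates and omitting the index when only one remains can be sketched as follows. The callback `read_index` stands in for index decoding and is hypothetical.

```python
def available_candidates(candidates, costs, top_n):
    """Keep only the top_n lowest-cost candidates, in ascending cost order."""
    order = sorted(range(len(candidates)), key=lambda i: costs[i])
    return [candidates[i] for i in order[:top_n]]


def decode_selection(available, read_index):
    """Skip index decoding when a single candidate is available ([0905])."""
    idx = 0 if len(available) == 1 else read_index()
    return available[idx]


def must_not_read():
    raise AssertionError("index must not be parsed for a single candidate")


avail = available_candidates(list("ABCDE"), [9, 2, 7, 1, 5], top_n=3)
picked = decode_selection(avail, lambda: 2)
single = decode_selection(["D"], must_not_read)
```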

[0906] In the example described above, cost derivation and reordering are exemplified as being performed for all candidate transformation kernel combinations. For computational simplification, cost derivation and reordering may be performed only for a predefined number of candidate transformation kernel combinations.

[0907] For example, cost derivation and reordering can be performed on candidates for transformation kernel combinations whose indices are smaller than a predefined value or candidates for transformation kernel combinations whose indices are larger than a predefined value.

[0908] For example, cost derivation and reordering can be performed only on the remaining candidates (i.e., the transformation kernel combination candidates with indices 1 through 4), excluding the transformation kernel combination candidate with index 0. In this case, the index of the transformation kernel combination candidate excluded from cost derivation and reordering can remain unchanged even after reordering.
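Partial reordering, in which the candidate with index 0 keeps its position, can be sketched as follows. The function name and the `first_reordered` parameter are hypothetical; costs are supplied only for the candidates that take part in the reordering.

```python
def reorder_from(candidates, costs, first_reordered=1):
    """Reorder candidates from index first_reordered onward by ascending cost.

    costs maps an original index to its cost; excluded candidates need no cost.
    """
    head = candidates[:first_reordered]
    tail = sorted(range(first_reordered, len(candidates)), key=lambda i: costs[i])
    return head + [candidates[i] for i in tail]


partial = reorder_from(["c0", "c1", "c2", "c3", "c4"], {1: 8, 2: 3, 3: 9, 4: 1})
```

Skipping the cost derivation for the excluded candidate saves one inverse transformation per block in this sketch.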

[0909] Information indicating whether rearranging the transformation kernel combination candidates is allowed can be encoded and signaled.

[0910] Alternatively, it may be determined whether to reorder the transformation kernel combination candidates based on at least one of the current block's size, shape, encoding mode, or intra-prediction mode.

[0911] The second transformation and the inverse transformation for the second transformation (i.e., the second inverse transformation) can be performed using a transformation kernel included in the transformation kernel set. At this time, the transformation kernel set is determined based on the intra prediction mode of the current block, and an index indicating one of the transformation kernel candidates included in the transformation kernel set can be encoded / decoded.

[0912] Cost calculation and reordering can also be performed on the transformation kernel candidates within a transformation kernel set, using the same method as in the embodiment described above. That is, the transformation kernel candidates included in the transformation kernel set can be reordered in ascending order of cost. Subsequently, an index indicating one of the transformation kernel candidates within the updated transformation kernel set can be encoded / decoded.

[0913] Depending on the intra prediction mode, the process of selecting a transformation kernel set may be omitted, and only a single transformation kernel set may be used. In this case as well, cost calculation and reordering are performed on the transformation kernel candidates included in the transformation kernel set, and an index indicating one of the transformation kernel candidates included in the updated transformation kernel set can be encoded / decoded.

[0914]

[0915] In the example described above, it was explained that when a partial transformation is applied to the current block, the transformation kernel combination is determined according to the partitioning direction of the current block and the location of the transformation target area. That is, once the transformation target area is determined, transformation / inverse transformation can be performed on the transformation target area based on a predefined transformation kernel combination.

[0916] As another example, a plurality of candidate transformation kernel combinations may be provided for a transformation application area, and one of the candidate transformation kernel combinations may be selected to perform a transformation / inverse transformation on the transformation application area. In this case, depending on the location and / or size of the transformation application area, the number and / or types of transformation kernels that can constitute the candidate transformation kernel combinations may differ.

[0917] For example, Table 10 shows the configuration of candidate transformation kernel combinations when only DST-7 and DCT-8 are available in the transformation application area.

[0918]
Candidate | Horizontal | Vertical
0         | DST-7      | DST-7
1         | DST-7      | DCT-8
2         | DCT-8      | DST-7
3         | DCT-8      | DCT-8
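The candidate ordering in Table 10 coincides with the Cartesian product of the available kernels, taken with the horizontal kernel first. The sketch below assumes that this ordering generalizes to other sets of available kernels, which the disclosure does not state; the function name `kernel_combinations` is hypothetical.

```python
from itertools import product


def kernel_combinations(available):
    """Enumerate (horizontal, vertical) kernel pairs with a running index."""
    return [
        {"index": i, "hor": hor, "ver": ver}
        for i, (hor, ver) in enumerate(product(available, repeat=2))
    ]


# With only DST-7 and DCT-8 available, this reproduces the rows of Table 10.
table10 = kernel_combinations(["DST-7", "DCT-8"])
for row in table10:
    print(row["index"], row["hor"], row["ver"])
```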

[0919] For the transformation application area, an inverse transformation can be performed based on each of the candidate transformation kernel combinations to obtain the residual block of the current block. Specifically, residual samples of the transformation application area within the current block can be obtained through an inverse transformation based on each candidate transformation kernel combination. In contrast, residual samples of the area within the current block to which the transformation is not applied can be set to 0.

[0920] Afterwards, restoration samples are derived based on the residual block, and vertical gradients and / or horizontal gradients at the top and / or left boundary positions of the current block can be calculated. Then, the cost of a candidate transformation kernel combination can be calculated by adding the first cost obtained based on the vertical gradient and the second cost obtained based on the horizontal gradient.

[0921] Once the cost of each candidate transformation kernel combination is calculated, the candidates can be sorted in ascending order of cost. Subsequently, based on an index indicating one of the reordered candidates, a candidate transformation kernel combination to be applied to the transformation application area can be selected.

[0922] Alternatively, the one with the smallest cost among the candidate transformation kernel combinations can be selected as the candidate transformation kernel combination for the transformation application area.
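The cost derivation and ranking in the preceding paragraphs can be sketched as follows. The exact gradient definition is not given in this passage, so the sketch assumes that the vertical gradient is the absolute difference between each top-row restoration sample and the neighboring sample directly above it, and that the horizontal gradient is the corresponding difference across the left boundary. The inverse transformation itself is omitted; one residual block per candidate is taken as input. All names and sample values are hypothetical.

```python
def boundary_cost(pred, residual, above, left):
    """Cost = vertical-gradient cost (top boundary) + horizontal-gradient cost (left boundary)."""
    height, width = len(pred), len(pred[0])
    # Restoration samples: prediction plus residual.
    recon = [[pred[y][x] + residual[y][x] for x in range(width)] for y in range(height)]
    first_cost = sum(abs(recon[0][x] - above[x]) for x in range(width))    # vertical gradient
    second_cost = sum(abs(recon[y][0] - left[y]) for y in range(height))   # horizontal gradient
    return first_cost + second_cost


def rank_candidates(pred, residuals, above, left):
    """Return (candidate indices in ascending cost order, per-candidate costs)."""
    costs = [boundary_cost(pred, r, above, left) for r in residuals]
    order = sorted(range(len(residuals)), key=lambda i: costs[i])
    return order, costs


# Hypothetical 2x2 block with flat prediction and flat neighbors.
pred = [[100, 100], [100, 100]]
above = [100, 100]
left = [100, 100]
residuals = [
    [[4, 4], [4, 4]],    # candidate 0
    [[0, 0], [0, 0]],    # candidate 1
    [[1, -1], [0, 0]],   # candidate 2
]
order, costs = rank_candidates(pred, residuals, above, left)
print(order, costs)
```

Under this assumption, `order[0]` is the candidate that would be chosen when the smallest-cost candidate is selected without signaling an index.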

[0923]

[0924] As another example, when a partial transformation is applied to the current block, a predefined transformation type candidate may be used depending on the size and / or location of the transformation application area. However, instead of explicitly encoding / decoding information related to the partial transformation, the cost of each partial transformation candidate may be calculated to select the optimal partial transformation candidate. Here, the information related to the partial transformation may include at least one of a flag indicating whether a partial transformation is applied to the current block (e.g., sbt_flag), a flag indicating the splitting ratio of the current block (e.g., sbt_quad_flag), a flag indicating the splitting direction of the current block (e.g., sbt_hor_flag), or a flag indicating the location of the transformation application area within the current block (e.g., sbt_pos_flag).

[0925] Table 11 lists the partial transformation candidates. In Table 11, examples are provided showing how each candidate forms a mapping relationship with the existing partial transformation information.

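[0926]
Candidate | sbt_flag | sbt_quad_flag | sbt_hor_flag | sbt_pos_flag
0         | 0        | -             | -            | -
1         | 1        | 0             | 0            | 0
2         | 1        | 0             | 0            | 1
3         | 1        | 0             | 1            | 0
4         | 1        | 0             | 1            | 1
5         | 1        | 1             | 0            | 0
6         | 1        | 1             | 0            | 1
7         | 1        | 1             | 1            | 0
8         | 1        | 1             | 1            | 1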

[0927] For example, in Table 11, Candidate 0 indicates that no partial transformation is applied to the current block. Candidates 1 through 4 correspond to cases where a partial transformation is applied to the current block and the partition ratio is 1:3 (or 3:1). Candidates 5 through 8 correspond to cases where a partial transformation is applied to the current block and the partition ratio is 1:1.
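The mapping in Table 11 follows a regular pattern: for candidates 1 through 8, the three flags are the binary digits of the candidate index minus one. The sketch below reproduces the table rows from that observation; the function name `sbt_candidate_flags` is hypothetical, and `None` stands for the "-" entries of candidate 0.

```python
def sbt_candidate_flags(candidate):
    """Return the Table 11 flag values for a partial transformation candidate (0..8)."""
    if not 0 <= candidate <= 8:
        raise ValueError("candidate must be in the range 0..8")
    if candidate == 0:
        # No partial transformation: the remaining flags are not present.
        return {"sbt_flag": 0, "sbt_quad_flag": None,
                "sbt_hor_flag": None, "sbt_pos_flag": None}
    bits = candidate - 1
    return {
        "sbt_flag": 1,
        "sbt_quad_flag": (bits >> 2) & 1,
        "sbt_hor_flag": (bits >> 1) & 1,
        "sbt_pos_flag": bits & 1,
    }


for c in range(9):
    print(c, sbt_candidate_flags(c))
```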

[0928] Fewer partial transformation candidates than those listed may be used. For example, a flag indicating whether a partial transformation is applied to the current block (i.e., sbt_flag) may be explicitly encoded / decoded, and combinations of the split ratio, split direction, and location of the transformation application area may be set as partial transformation candidates.

[0929] Alternatively, at least one of the partitioning ratio, partitioning direction, or location of the transformation application area may be set as partial transformation candidates. In this case, information indicating any element not covered by the partial transformation candidates may be explicitly encoded / decoded.

[0930] For example, sbt_flag, sbt_quad_flag, and sbt_hor_flag may be explicitly encoded / decoded, and two partial transformation candidates may be set based on the location of the transformation application area.

[0931] Alternatively, sbt_flag and sbt_quad_flag may be explicitly encoded / decoded, and four partial transformation candidates may be set based on combinations of the current block's partitioning direction and the location of the transformation application area.

[0932] Residual blocks can be obtained by performing inverse transformations based on each of the partial transformation candidates. For example, as shown in Table 11, if there are 9 partial transformation candidates, residual blocks can be obtained for each of the 9 cases. After deriving a restoration block based on the residual blocks, the cost of each partial transformation candidate can be calculated based on the horizontal gradient and / or vertical gradient at the boundary positions of the current block.

[0933] Afterwards, the partial transformation candidates can be reordered in ascending order of cost.

[0934] Figure 65 shows an example where partial transformation candidates are rearranged according to cost.

[0935] After reordering partial transformation candidates according to cost, an index indicating one of the reordered partial transformation candidates can be encoded and signaled.

[0936] In the decoder, a partial transformation candidate indicated by an index can be selected to determine whether to apply the partial transformation to the current block, the size of the transformation application area, or the location of the transformation application area.

[0937] Alternatively, the partial transformation candidate with the optimal cost (i.e., the partial transformation candidate with the smallest cost) may be set as a prediction value, and information indicating whether the partial transformation application pattern of the current block is the same as the prediction value may be encoded and signaled. The information may be a 1-bit flag.

[0938] For example, the above flag being 1 may indicate that a partial transformation candidate with the optimal cost is applied to the current block.

[0939] On the other hand, the above flag being 0 may indicate that the partial transformation candidate with the optimal cost is not applied to the current block. In this case, information indicating the partial transformation application pattern of the current block may be additionally encoded / decoded. This information may be an index indicating one of the remaining partial transformation candidates, excluding the partial transformation candidate with the optimal cost.
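The flag-plus-index signaling described above can be sketched as a pair of helper functions. This is an illustrative sketch only: the function names are hypothetical, and it assumes that the remaining candidates are indexed in ascending cost order, which the passage does not specify.

```python
def encode_choice(ordered, chosen):
    """ordered: candidates sorted by ascending cost. Returns (flag, remaining_index)."""
    if chosen == ordered[0]:
        return 1, None                      # prediction value matches; no index is sent
    return 0, ordered[1:].index(chosen)     # index among the remaining candidates


def decode_choice(ordered, flag, remaining_index=None):
    """Inverse of encode_choice, given the same cost-ordered candidate list."""
    if flag == 1:
        return ordered[0]
    return ordered[1:][remaining_index]


# Hypothetical cost-ordered list of partial transformation candidates.
ordered = [4, 0, 7, 1]
for chosen in ordered:
    flag, idx = encode_choice(ordered, chosen)
    assert decode_choice(ordered, flag, idx) == chosen
    print(chosen, flag, idx)
```

Excluding the optimal-cost candidate from the index range means the index addresses one fewer candidate than the full list.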

[0940] Only some of the partial transformation candidates can be set to be available for use in the current block. For example, N partial transformation candidates selected in order of lowest cost can be set to be available for use in the current block. In this case, one of the N partial transformation candidates can be used for the partial transformation of the current block.
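Restricting the available candidates to the N with the lowest cost can be sketched as below. The function name `select_available` and the cost values are hypothetical, and ties are assumed to be broken by the original candidate order.

```python
def select_available(costs, n):
    """Return the indices of the n lowest-cost candidates, in ascending cost order."""
    ranked = sorted(range(len(costs)), key=lambda i: costs[i])  # stable for equal costs
    return ranked[:n]


# Hypothetical costs for the nine partial transformation candidates.
costs = [12, 3, 9, 3, 20, 7, 15, 1, 30]
available = select_available(costs, 4)
print(available)
```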

[0941]

[0942] Whether the described embodiments are applied to candidates for transformation type combinations, candidates for transformation kernels, or candidates for partial transformations can be adaptively determined based on the number of quantized transformation coefficients (i.e., residual coefficients) in the current block. Here, the number of residual coefficients may represent the number of residual coefficients having non-zero values, or the number from the first residual coefficient to the last non-zero residual coefficient in the current block.

[0943] For example, if the number of residual coefficients within the current block is smaller than a preset threshold value, the above-described embodiments may not be applied to the current block. Accordingly, the encoding / decoding of information indicating whether the above-described embodiments are applied may be omitted.
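The two ways of counting residual coefficients, and the threshold check built on them, can be sketched as follows. The sketch assumes the coefficients are given as a flat list in scan order; the function names and the sample values are hypothetical.

```python
def count_nonzero(coeffs):
    """Number of residual coefficients with a non-zero value."""
    return sum(1 for c in coeffs if c != 0)


def count_to_last_nonzero(coeffs):
    """Number of coefficients from the first one up to the last non-zero one."""
    last = -1
    for i, c in enumerate(coeffs):
        if c != 0:
            last = i
    return last + 1


def embodiments_applicable(coeffs, threshold, use_last_position=False):
    """False when the coefficient count is smaller than the threshold."""
    if use_last_position:
        n = count_to_last_nonzero(coeffs)
    else:
        n = count_nonzero(coeffs)
    return n >= threshold


coeffs = [5, 0, 0, -1, 0, 0]  # hypothetical quantized coefficients in scan order
print(count_nonzero(coeffs), count_to_last_nonzero(coeffs))
print(embodiments_applicable(coeffs, threshold=3))
print(embodiments_applicable(coeffs, threshold=3, use_last_position=True))
```

The two definitions can give different decisions for the same block, so the encoder and decoder would need to use the same one.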

[0944] For example, when the threshold value is 3 and the number of residual coefficients in the current block is 2, the above-describ...

Claims

1. An image decoding method comprising: decoding an intra prediction mode of a current block; and performing intra prediction on the current block based on the intra prediction mode, wherein the intra prediction mode of the current block is decoded based on at least one intra prediction mode candidate list, and the at least one intra prediction mode candidate list includes an intra prediction mode candidate derived from a histogram of the current block.

2. The image decoding method of claim 1, wherein, among a plurality of intra prediction mode candidates, a predefined number of intra prediction mode candidates with small costs are added to a first intra prediction mode candidate list, and the remaining intra prediction mode candidates are added to a second intra prediction mode candidate list.

3. The image decoding method of claim 1, wherein intra prediction mode candidates included in the intra prediction mode candidate list are reordered in ascending order of cost.

4. The image decoding method of claim 3, wherein the cost of an intra prediction mode candidate is calculated based on a difference between prediction samples, obtained by performing intra prediction on a reference region of the current block based on the intra prediction mode candidate, and restoration samples within the reference region.

5. The image decoding method of claim 3, wherein, among the intra prediction mode candidates, the reordering is performed only on some intra prediction mode candidates derived according to a predefined method.

6. The image decoding method of claim 1, wherein a cost of each intra prediction mode candidate included in the intra prediction mode candidate list is calculated, and the intra prediction mode candidate list is then updated to include only a predefined number of intra prediction mode candidates selected in order of lowest cost.

7. The image decoding method of claim 1, wherein a predefined number of intra prediction modes with high amplitude values on the histogram are set as intra prediction mode candidates.

8. The image decoding method of claim 1, wherein the histogram represents a cumulative frequency of occurrence for each intra prediction mode, the intra prediction mode of a neighboring block adjacent to the current block is reflected in the histogram, and the frequency of occurrence of the intra prediction mode of the neighboring block is set to the size of the neighboring block.

9. The image decoding method of claim 1, wherein the histogram represents cumulative amplitude values for each intra prediction mode, the intra prediction mode of a reference sample within a reference region adjacent to the current block is reflected in the histogram, and the intra prediction mode of the reference sample and the amplitude value of that intra prediction mode are derived based on a horizontal slope and a vertical slope of the reference sample.

10. The image decoding method of claim 1, wherein the intra prediction is performed using one of a plurality of interpolation filters, and the one of the plurality of interpolation filters is specified based on an index decoded from a bitstream.

11. The image decoding method of claim 1, wherein a cost of each of a plurality of interpolation filters is calculated, and the intra prediction is then performed based on the interpolation filter with the smallest cost.

12. The image decoding method of claim 11, wherein the cost of an interpolation filter is calculated based on a difference between prediction samples, obtained by performing intra prediction on a reference region of the current block based on the interpolation filter, and restoration samples within the reference region.

13. The image decoding method of claim 1, wherein the intra prediction is performed on the current block based on each of a plurality of interpolation filters to derive a plurality of prediction blocks, and a prediction block of the current block is derived as a weighted sum of the plurality of prediction blocks.

14. An image encoding method comprising: performing intra prediction on a current block based on an intra prediction mode of the current block; and encoding the intra prediction mode of the current block, wherein the intra prediction mode of the current block is encoded based on at least one intra prediction mode candidate list, and the at least one intra prediction mode candidate list includes an intra prediction mode candidate derived from a histogram of the current block.

15. An apparatus for transmitting compressed video data, comprising: a processor that acquires the compressed video data; and a transmission unit that transmits the compressed video data, wherein the compressed video data is generated through performing intra prediction on a current block based on an intra prediction mode of the current block and encoding the intra prediction mode of the current block, the intra prediction mode of the current block is encoded based on at least one intra prediction mode candidate list, and the at least one intra prediction mode candidate list includes an intra prediction mode candidate derived from a histogram of the current block.