Video encoding method and device for matrix-based intra prediction, and video decoding method and device

By replacing intra prediction mode indices with a matrix-based intra prediction mode and optimizing signaling, the method enhances coding efficiency and reduces bit rate for high-resolution video encoding and decoding.

WO2026135281A1 · PCT designated stage · Publication Date: 2026-06-25 · INTELLECTUAL DISCOVERY CO LTD


Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: INTELLECTUAL DISCOVERY CO LTD
Filing Date: 2025-12-18
Publication Date: 2026-06-25


IPC: H04N19/593; H04N19/70; H04N19/105; H04N19/176

AI Tagging

Application Domain

Digital video signal modification

Technology Topics

Video encoding Reference Region


AI Technical Summary

Technical Problem

Existing video encoding and decoding technologies face challenges in efficiently encoding and decoding high-resolution or high-quality video content, particularly in reducing signaling overhead and improving coding efficiency for matrix-based intra-prediction.

Method used

The method involves replacing specific intra prediction mode indices with a matrix-based intra prediction (MIP) mode and efficiently signaling MIP mode information, thereby reducing signaling overhead and improving representation accuracy.

Benefits of technology

This approach reduces the overall bit rate under the same image quality conditions and enhances subjective and objective image quality indicators, leading to improved coding efficiency.


Abstract

A video decoding method according to the present disclosure involves: determining, on the basis of prediction mode information, whether a prediction mode of the current block corresponds to an MIP mode; when the prediction mode of the current block corresponds to the MIP mode, acquiring prediction samples of the current block by determining a matrix and a reference region used for MIP on the basis of the size of the current block; and when one or more intra prediction mode indexes among a plurality of intra prediction mode indexes are set to indicate the MIP mode, determining the prediction mode of the current block as the MIP mode if intra prediction mode information about the current block indicates the MIP mode.


Description

Video encoding method and device for matrix-based intra prediction, and video decoding method and device

[0001] The present disclosure relates to the field of video encoding and decoding. More specifically, the present disclosure relates to a method for encoding and decoding information related to matrix-based intra-prediction, and to a video encoding and decoding method and apparatus utilizing such matrix-based intra-prediction.

[0002] With the development and widespread adoption of hardware capable of playing and storing high-resolution or high-quality video content, the need for codecs that can effectively encode or decode such content is increasing. Recently, methods to effectively compress this high-resolution or high-quality video content are being implemented. Representative examples of codecs include HEVC (High Efficiency Video Coding) and VVC (Versatile Video Coding).

[0003] In HEVC, a single picture is divided into one or more tiles / slices and then further divided into multiple Coding Tree Units (CTUs). VVC, on the other hand, can first divide a single picture into multiple sub-pictures. A sub-picture is defined as a group of rectangular slices and was added to VVC to support the ability to partially and independently encode, decode, and transmit the picture. A single sub-picture can be divided into tiles / slices, similar to HEVC. Additionally, VVC introduces the "brick" as a new picture division structure. Bricks are created by dividing tiles horizontally and serve as the basic unit for parallel processing. To process higher-resolution video than HEVC, VVC uses Coding Tree Units (CTUs) of up to 128x128 luma samples, four times the area of the 64x64 maximum in HEVC, as the basic unit for encoding and decoding.

[0004] In intra prediction, to eliminate spatial redundancy within a picture, a prediction block is generated using reconstructed pixels adjacent to the current coding block, and the difference (residual) between the prediction block and the current coding block is computed. In inter prediction, unlike intra prediction, the prediction block is generated by searching a previous or subsequent frame for the block most similar to the current coding block.

[0005] Numerous tools have been proposed to improve the efficiency and performance of intra coding. For example, in addition to directional intra prediction modes using DC, Planar, and reference samples of a predetermined direction, various intra coding methods have been proposed, such as DIMD (Decoder Side Intra Mode Derivation), TIMD (Template based Intra Mode Derivation), EIP (Edge-based Intra Prediction), OBIC (Orientation-Based Intra Coding), MIP (Matrix based Intra Prediction), SGPM (Spatial Geometric Partitioning Mode), IntraTMP (Intra Template Matching Prediction), IBC (Intra Block Copy), MRL (Multiple Reference Lines), and ISP (Intra Sub-partitions).

[0006] In response to the demand for continuous coding improvement, it is necessary to enhance coding efficiency by improving various existing coding tools.

[0007] The present disclosure is intended to provide a video encoding / decoding method and apparatus for efficiently encoding and decoding MIP mode information or MIP-related information in relation to matrix-based intra-prediction.

[0008] Embodiments of the present disclosure disclose a method of replacing at least one intra prediction mode index among a plurality of intra prediction mode indices with a matrix-based intra prediction (MIP) mode. Furthermore, the present disclosure enables the efficient representation of an MIP mode without the allocation of a separate new mode index by replacing a specific intra prediction mode candidate included in a Most Probable Mode (MPM) list or a non-MPM list with a mode indicating an MIP mode.

[0009] In addition, embodiments of the present disclosure provide a method for efficiently signaling various information required in a prediction process according to an MIP mode, based on an MIP mode index substitution structure.

[0010] According to embodiments of the present disclosure, by efficiently encoding prediction parameters and mode information related to a matrix-based intra prediction mode, the signaling overhead required for intra prediction can be effectively reduced, and at the same time, the representation accuracy of the prediction model can be improved. Accordingly, the overall bit rate can be reduced under the same image quality conditions, and subjective and objective image quality indicators are improved under the same bit rate conditions, so that the coding efficiency throughout the video encoding can be substantially improved.

[0011] FIG. 1 is a block diagram illustrating the configuration and operation of a video encoder for encoding images.

[0012] FIG. 2 is a diagram illustrating an example of a method for dividing blocks of an image.

[0013] FIGS. 3 and 4 are drawings for illustrating embodiments of an intra-prediction method.

[0014] FIG. 5 is a block diagram illustrating the configuration and operation of a video decoder for decoding an image.

[0015] FIG. 6 is a diagram illustrating the process of generating a prediction sample of the current block according to the MIP mode according to one embodiment of the present invention.

[0016] FIGS. 7a and 7b illustrate an example of a process for generating a predicted sample of a current block according to an MIP mode according to an embodiment of the present invention.

[0017] FIG. 8 is a diagram illustrating an interpolation process performed in MIP mode in one embodiment of the present invention.

[0018] FIG. 9 is a diagram illustrating the process of configuring an MPM list according to an embodiment of the present invention.

[0019] FIGS. 10a and 10b are drawings for explaining a method of determining the cost when configuring an MPM candidate according to one embodiment of the present disclosure.

[0020] FIG. 11 is a drawing showing an example of a reference area used in MIP mode according to one embodiment of the present disclosure.

[0021] FIG. 12 is a drawing showing another example of a reference area used in MIP mode according to one embodiment of the present disclosure.

[0022] FIG. 13 is a drawing illustrating a reference mode that selectively applies surrounding pixels included in a reference area according to one embodiment of the present disclosure.

[0023] FIG. 14 is a flowchart of a video encoding method according to one embodiment of the present disclosure.

[0024] FIG. 15 is a flowchart of a video decoding method according to one embodiment of the present disclosure.

[0025] A video decoding method according to one embodiment of the present disclosure comprises: a step of acquiring prediction mode information of a current block; a step of determining whether the prediction mode of the current block corresponds to a Matrix-based Intra prediction (MIP) mode based on the prediction mode information; a step of setting a reference region used for the MIP based on the size of the current block when it is determined that the prediction mode of the current block corresponds to the MIP mode; a step of determining a matrix used for the MIP; and a step of acquiring a prediction sample of the current block based on the reference region and the matrix, wherein the step of determining whether it corresponds to the MIP mode includes, when one or more of the intra prediction mode indices among a plurality of intra prediction mode indices are set to point to the MIP mode, if the intra prediction mode information of the current block points to the MIP mode, determining the prediction mode of the current block as the MIP mode.
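The mode determination step described above can be sketched as follows. This is only an illustrative sketch: the function names and the particular indices assumed to indicate the MIP mode (3 and 65 below) are hypothetical and are not specified in the source.

```python
# Hypothetical sketch of the MIP determination step of paragraph [0025].
# Which intra prediction mode indices are set to indicate MIP is an
# assumption made purely for illustration.

def is_mip_mode(intra_mode_index, mip_indices):
    """Return True when the signalled index is one set to indicate MIP."""
    return intra_mode_index in mip_indices


def determine_prediction_mode(intra_mode_index, mip_indices):
    """Return ('MIP', None) or ('INTRA', index) for the current block."""
    if is_mip_mode(intra_mode_index, mip_indices):
        # The decoder would next set the reference region and the matrix
        # based on the size of the current block.
        return ("MIP", None)
    return ("INTRA", intra_mode_index)


# Assumed example: indices 3 and 65 have been replaced by the MIP mode.
example_mip_indices = frozenset({3, 65})
print(determine_prediction_mode(65, example_mip_indices))
print(determine_prediction_mode(18, example_mip_indices))
```

When no index is set to indicate MIP (an empty set), every signalled index resolves to its ordinary intra prediction mode.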

[0026] A video encoding method according to one embodiment of the present disclosure comprises: a step of obtaining a prediction sample of the current block by applying a plurality of prediction modes to the current block; a step of determining the prediction mode of the current block based on the cost of the prediction sample; and a step of encoding the prediction mode information of the current block, wherein the step of obtaining the prediction sample of the current block includes a step of performing MIP, and when the prediction mode of the current block is determined to be a MIP mode and one or more intra prediction mode indices among a plurality of intra prediction mode indices are set to point to the MIP mode, the prediction mode information of the current block is set to one of the one or more intra prediction mode indices pointing to the MIP mode.

[0027] In addition, embodiments of the present invention disclose a video decoding device and a video encoding device for performing the aforementioned video decoding method and encoding method, and a computer-readable recording medium storing a program for executing the video decoding method and encoding method on a computer.

[0028] Hereinafter, a video encoding and decoding method and apparatus according to an embodiment of the present invention will be described in detail with reference to the attached drawings.

[0029] In the following description of the present invention, specific descriptions of related known functions or configurations will be omitted if it is determined that such detailed descriptions could unnecessarily obscure the essence of the invention. Furthermore, the terms described below are defined in consideration of their functions within the present invention, and these definitions may vary depending on the intentions or practices of the user or operator. Therefore, their definitions should be based on the overall content of the present invention.

[0030] In addition, the preferred embodiments of the present invention described below will focus on explaining the functional configurations that must be additionally provided for the present invention, while omitting as much as possible the system functional configurations that are already provided in each system functional configuration or are ordinarily provided in the technical field to which the present invention belongs, in order to efficiently explain the technical components constituting the present invention.

[0031] A person skilled in the art to which the present invention pertains will readily understand the function of conventionally used components among the functional configurations that are omitted and not illustrated below, and will also clearly understand the relationship between the omitted components and the components added for the present invention.

[0032] In this specification, a device that encodes an image to generate a video signal bitstream is referred to as an encoding device, an encoding apparatus, or an encoder, and a device that decodes a video signal bitstream to restore an image is referred to as a decoding device, a decoding apparatus, or a decoder.

[0033] A pixel or pel refers to the smallest unit that constitutes an image, and the terms pixel and sample may be used interchangeably. Generally, a sample can represent a pixel or its value, and it may represent only the pixel or its value of the luminance component, or only the pixel or its value of the chroma component.

[0034] Furthermore, the term "unit" is used to refer to a basic unit of image processing or a specific location within a picture, representing an image region containing at least one of the luminance component and the chrominance component. Specifically, the term "unit" can be used as a concept encompassing the Coding Tree Unit (CTU), Coding Unit (CU), Prediction Unit (PU), and Transform Unit (TU). Additionally, the term "block" represents an image region containing a specific component among the luminance and chrominance components, and an MxN block can represent a set of samples or transform coefficients consisting of M columns and N rows. Here, terms such as unit, block, partition, signal, and region may be used interchangeably.

[0035] Meanwhile, the term "picture" refers to a field or a frame, and these terms may be used interchangeably. For example, if the image is interlaced, a single frame is separated into an odd (or top) field and an even (or bottom) field, and each field is configured as a single picture unit for encoding or decoding. If the image is progressive, a single frame is configured as a picture for encoding or decoding.

[0036] FIG. 1 is a block diagram illustrating an encoding device according to an embodiment of the present invention, and is intended to explain the configuration and operation of a video encoder for encoding images.

[0037] Referring to FIG. 1, the video encoder (100) may be configured to include a conversion unit (110), a quantization unit (120), an inverse quantization unit (130), an inverse conversion unit (140), a filtering unit (150), a prediction unit (160), a DPB (Decoded Picture Buffer, 170), and an entropy coding unit (180).

[0038] The conversion unit (110) transforms the residual signal, which is the difference between the input video signal and the prediction signal generated by the prediction unit (160), to obtain transform coefficient values.

[0039] For this transform, for example, the Discrete Cosine Transform (DCT), Discrete Sine Transform (DST), or Wavelet Transform may be used.

[0040] The transformation kernel used for the transformation of a residual block may be a transformation kernel having separable characteristics of vertical transformation and horizontal transformation. In this case, the transformation of the residual block can be performed by separating it into a vertical transformation and a horizontal transformation. For example, an encoder may perform a vertical transformation by applying a transformation kernel in the vertical direction of the residual block. Additionally, an encoder may perform a horizontal transformation by applying a transformation kernel in the horizontal direction of the residual block.
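The separable transformation described above can be sketched as a vertical pass over the columns followed by a horizontal pass over the rows. This is a minimal floating-point sketch that assumes an orthonormal DCT-II kernel; the function names are hypothetical, and actual codecs use scaled integer kernels.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II kernel: row k is the k-th basis vector of length n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def transform_1d(kernel, vec):
    """Apply a transformation kernel to a one-dimensional vector."""
    return [sum(k * v for k, v in zip(row, vec)) for row in kernel]


def separable_transform(block, v_kernel, h_kernel):
    """Vertical transformation of each column, then horizontal of each row."""
    height, width = len(block), len(block[0])
    # Vertical pass: transform every column of the residual block.
    cols = [transform_1d(v_kernel, [block[r][c] for r in range(height)])
            for c in range(width)]
    intermediate = [[cols[c][r] for c in range(width)] for r in range(height)]
    # Horizontal pass: transform every row of the intermediate result.
    return [transform_1d(h_kernel, row) for row in intermediate]


kernel4 = dct_matrix(4)
flat_residual = [[10] * 4 for _ in range(4)]
coeffs = separable_transform(flat_residual, kernel4, kernel4)
print(round(coeffs[0][0], 6))  # a flat block puts all energy in the top-left
```

Because the two passes take separate kernels, a different transformation type can be passed for each direction.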

[0041] Meanwhile, a transformation kernel can be used as a term referring to a set of parameters used for transforming a residual signal, such as a transformation matrix, transformation array, transformation function, or transformation; it can be any one of a plurality of available kernels, and transformation kernels based on different transformation types may be used for vertical transformation and horizontal transformation, respectively.

[0042] Transformation coefficients may be distributed such that larger-magnitude coefficients are found towards the top-left corner of the block, and coefficients close to '0' are found towards the bottom-right corner. Additionally, as the current block size increases, there is a higher likelihood of '0' coefficients existing in the bottom-right region; to reduce the transformation complexity of large blocks, only an arbitrary top-left region may be retained, while the remaining regions may be reset to '0'.

[0043] In addition, an error signal may exist only in some areas of a coding block, in which case the transformation process may be performed only on some arbitrary areas. For example, in a block of size 2Nx2N, an error signal may exist only in the first 2NxN block, in which case the transformation process is performed only on the first 2NxN block, while the second 2NxN block may be encoded or decoded without the transformation process being performed.
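Retaining only a top-left region of coefficients can be sketched as below. The function name and the size of the retained region are assumptions for illustration; the source leaves the region arbitrary.

```python
def zero_out(coeffs, keep_width, keep_height):
    """Keep the top-left keep_width x keep_height coefficients, reset the rest to 0."""
    return [[value if (r < keep_height and c < keep_width) else 0
             for c, value in enumerate(row)]
            for r, row in enumerate(coeffs)]


coeffs = [
    [90, 12, 3, 1],
    [15, 6, 1, 0],
    [4, 2, 1, 0],
    [1, 0, 0, 1],
]
print(zero_out(coeffs, 2, 2))
# [[90, 12, 0, 0], [15, 6, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```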

[0044] The encoder may perform additional transformations before the transformation coefficients are quantized. The transformation method described above is referred to as a primary transform, and the additional transformation may be referred to as a secondary transform.

[0045] The secondary transform can be applied selectively to each residual block. For example, the encoder can improve coding efficiency by performing the secondary transform for regions where it is difficult to concentrate energy in the low-frequency region with the primary transform alone.

[0046] Specifically, the secondary transform may additionally be performed on blocks whose residual values are significant in directions other than the horizontal or vertical direction of the residual block, and unlike the primary transform, the secondary transform may not be separated into a vertical transformation and a horizontal transformation. The secondary transform described above may be referred to as a Low Frequency Non-Separable Transform (LFNST).

[0047] The quantization unit (120) quantizes the conversion coefficient value output from the conversion unit (110).

[0048] In order to increase coding efficiency, instead of coding the picture signal as is, a method is used to predict the picture using an area that has already been coded through a prediction unit (160), and to obtain a restored picture by adding the residual value between the original picture and the predicted picture to the predicted picture.

[0049] To prevent mismatches from occurring between the encoder and decoder, when performing prediction in the encoder, information available in the decoder must be used; to this end, the encoder may perform a process of restoring the currently encoded block.

[0050] The inverse quantization unit (130) inversely quantizes the conversion coefficient value, and the inverse conversion unit (140) restores the residual value using the inversely quantized conversion coefficient value.
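The reason the encoder restores the block itself can be seen in a small sketch. For brevity it assumes a uniform quantizer applied directly to residual samples (the transform is skipped), and all names and values are hypothetical.

```python
def quantize(value, step):
    """Uniform quantization with rounding half away from zero."""
    sign = -1 if value < 0 else 1
    return sign * int(abs(value) / step + 0.5)


def dequantize(level, step):
    """Inverse quantization: scale the level back by the step size."""
    return level * step


def reconstruct(prediction, levels, step):
    """Restored samples = prediction + inversely quantized residual."""
    return [p + dequantize(l, step) for p, l in zip(prediction, levels)]


original = [52, 55, 61]
prediction = [50, 50, 50]
step = 4

residual = [o - p for o, p in zip(original, prediction)]   # [2, 5, 11]
levels = [quantize(r, step) for r in residual]              # [1, 1, 3]

# Encoder and decoder run the same reconstruction from the same levels,
# so both hold identical reference samples for later prediction.
encoder_side = reconstruct(prediction, levels, step)
decoder_side = reconstruct(prediction, levels, step)
print(encoder_side)  # [54, 54, 62]
```

The restored samples differ from the original ones, which is why predicting from the original picture in the encoder would cause a mismatch with the decoder.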

[0051] The filtering unit (150) performs filtering operations using a deblocking filter, a Sample Adaptive Offset (SAO), an Adaptive Loop Filter (ALF), etc., to improve the quality of the restored picture and enhance encoding efficiency.

[0052] A deblocking filter is a filter for removing the blocking distortion generated at the boundaries between blocks in a restored picture, and the encoder can determine whether to apply the deblocking filter to a boundary based on the distribution of pixels included in a few columns or rows on either side of that boundary.

[0053] When a deblocking filter is applied, the filtering unit (150) can apply a long filter, a strong filter, or a weak filter depending on the deblocking filtering strength, and can process horizontal filtering and vertical filtering in parallel.

[0054] Sample Adaptive Offset (SAO) can be used to correct the offset from the original image on a pixel-by-pixel basis for a restored block to which a deblocking filter has been applied. The filtering unit (150) may use a Band Offset method, which divides the pixels included in the image into a certain number of regions (bands), determines the regions in which to perform offset correction, and applies the offset to those regions. Additionally, the filtering unit (150) may use an Edge Offset method to apply the offset by considering the edge information of each pixel.
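A Band Offset correction can be sketched as follows. The sketch assumes that the regions are equal-width intensity bands (32 bands for 8-bit samples, as in HEVC); the band count, the function name, and the offsets are assumptions, since the source only speaks of "a certain number of regions".

```python
def sao_band_offset(samples, band_offsets, bit_depth=8, num_bands=32):
    """Add a per-band offset to each sample and clip to the valid range.

    band_offsets maps a band index to its offset; bands that are not
    listed are left unchanged.
    """
    max_value = (1 << bit_depth) - 1
    band_width = (1 << bit_depth) // num_bands
    corrected = []
    for sample in samples:
        band = sample // band_width
        value = sample + band_offsets.get(band, 0)
        corrected.append(min(max(value, 0), max_value))
    return corrected


# Assumed example: correct band 1 (values 8..15) by +2 and band 31 by +3.
print(sao_band_offset([0, 7, 8, 100, 255], {1: 2, 31: 3}))
# [0, 7, 10, 100, 255]
```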

[0055] An Adaptive Loop Filter (ALF) is a method that divides pixels contained in an image into specific groups, determines a single filter to apply to each group, and performs differential filtering for each group. Information regarding whether to apply an Adaptive Loop Filter can be signaled at the coding unit level, and the shape and filter coefficients of the ALF filter to be applied may vary depending on each block. Additionally, the same type of Adaptive Loop Filter may be applied regardless of the characteristics of the target block.

[0056] The filtered picture can be stored in DPB (170) to be used as a reference picture.

[0057] The prediction unit (160) includes an intra prediction unit (161) and an inter prediction unit (165), and the intra prediction unit (161) performs intra prediction within the current picture. Various intra prediction methods such as DIMD (Decoder Side Intra Mode Derivation), TIMD (Template based Intra Mode Derivation), EIP (Edge-based Intra Prediction), OBIC (Orientation-Based Intra Coding), MIP (Matrix based Intra Prediction), SGPM (Spatial Geometric Partitioning Mode), IntraTMP (Intra Template Matching Prediction), IBC (Intra Block Copy), MRL (Multiple Reference Lines), and ISP (Intra Sub-partitions) may be applied as intra prediction methods.

[0058] The inter prediction unit (165) performs an inter prediction that predicts the current picture using a reference picture stored in the DPB (170).

[0059] The intra prediction unit (161) performs intra prediction from the restored regions within the current picture and transmits intra encoding information to the entropy coding unit (180). Here, the intra encoding information may include an intra prediction mode, an MPM (Most Probable Mode) flag, an MPM index, information regarding a reference sample, and at least one of various parameter information required for intra prediction according to each intra prediction method.

[0060] The inter prediction unit (165) refers to a specific area of the restored reference picture to find the part most similar to the current area and obtains a motion vector value which is the distance between the areas, and transmits motion information for the obtained reference area (reference direction indicator information (L0 prediction, L1 prediction, bidirectional prediction), reference picture index, motion vector information, etc.) to the entropy coding unit (180).

[0061] Additionally, the inter prediction unit (165) performs motion compensation using motion information to generate a prediction block for the current block, and transmits inter encoding information containing motion information for a reference area to the entropy coding unit (180).

[0062] Meanwhile, quantized transformation coefficients in the form of a two-dimensional array can be rearranged into a one-dimensional array for entropy coding.

[0063] The method for scanning quantized transform coefficients can be determined by the size of the transform block and the intra-prediction mode; diagonal, vertical, or horizontal scans can be applied, and the scan information can be signaled in block units or derived by the decoder according to a defined rule.
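The three scan types can be sketched as simple reorderings of a two-dimensional array into a one-dimensional one. This is an illustrative sketch only: the function names are hypothetical, and real codecs typically scan in sub-blocks and in reverse order starting from the last significant coefficient.

```python
def horizontal_scan(block):
    """Row by row, left to right."""
    return [value for row in block for value in row]


def vertical_scan(block):
    """Column by column, top to bottom."""
    return [block[r][c] for c in range(len(block[0])) for r in range(len(block))]


def diagonal_scan(block):
    """Each anti-diagonal is walked from bottom-left to top-right."""
    height, width = len(block), len(block[0])
    out = []
    for d in range(height + width - 1):
        for r in range(min(d, height - 1), -1, -1):
            c = d - r
            if c < width:
                out.append(block[r][c])
    return out


block = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(diagonal_scan(block))    # [1, 4, 2, 7, 5, 3, 8, 6, 9]
print(vertical_scan(block))    # [1, 4, 7, 2, 5, 8, 3, 6, 9]
print(horizontal_scan(block))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```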

[0064] The entropy coding unit (180) generates a bitstream by entropy coding information representing quantized conversion coefficients, intra-coding information, and inter-coding information, etc. For this purpose, a variable length coding (VLC) method and an arithmetic coding method may be used.

[0065] Variable Length Coding (VLC) converts input symbols into a sequence of codewords, the length of which can be variable. For example, frequently occurring symbols can be represented by short codewords, while infrequently occurring symbols can be represented by long codewords.
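The principle that frequent symbols receive short codewords can be illustrated with a Huffman code. Note that this is a swapped-in textbook technique chosen for brevity, not the CAVLC scheme; the function name is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix-free code in which frequent symbols get short codewords."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, {symbol: partial codeword}).
    heap = [(count, i, {sym: ""})
            for i, (sym, count) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        count0, _, table0 = heapq.heappop(heap)
        count1, _, table1 = heapq.heappop(heap)
        # Merge the two least frequent subtrees, prefixing one bit to each side.
        merged = {sym: "0" + code for sym, code in table0.items()}
        merged.update({sym: "1" + code for sym, code in table1.items()})
        heapq.heappush(heap, (count0 + count1, next_id, merged))
        next_id += 1
    return heap[0][2]


code = huffman_code("aaaaaaabbbc")
print(code)  # 'a' (7 occurrences) gets 1 bit; 'b' and 'c' get 2 bits each
```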

[0066] As a variable-length coding method, Context-based Adaptive Variable Length Coding (CAVLC) can be used.

[0067] Arithmetic coding uses the probability distribution of the data symbols to convert a sequence of consecutive data symbols into a single fractional number, so that each symbol can be represented with the optimal, possibly fractional, number of bits.
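The idea of mapping a whole symbol sequence to one fractional number can be sketched with exact rational arithmetic. This is a conceptual sketch assuming a fixed probability model; the names are hypothetical, and it omits the binarization and context adaptation used by practical coders.

```python
import math
from fractions import Fraction


def arithmetic_interval(message, probs):
    """Narrow the interval [low, low + width) once per symbol.

    Any fraction inside the final interval identifies the whole message.
    """
    cumulative = {}
    acc = Fraction(0)
    for sym in sorted(probs):
        cumulative[sym] = acc
        acc += probs[sym]
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        low += width * cumulative[sym]
        width *= probs[sym]
    return low, width


def ideal_bits(width):
    """Information content of the message in bits: -log2 of the interval width."""
    return -math.log2(float(width))


probs = {"a": Fraction(3, 4), "b": Fraction(1, 4)}
low, width = arithmetic_interval("aab", probs)
print(low, width)         # 27/64 9/64
print(ideal_bits(width))  # about 2.83 bits for three symbols
```

Three symbols cost fewer than three bits here because the likely symbol "a" contributes less than one bit each, something a whole-bit variable-length code cannot achieve.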

[0068] As an arithmetic coding method, the Context-based Adaptive Binary Arithmetic Coding (CABAC) method can be used.

[0069] The generated bitstream is encapsulated with NAL (Network Abstraction Layer) units as the basic unit.

[0070] NAL units are classified into VCL (Video Coding Layer) NAL units containing video data and non-VCL NAL units containing parameter information for decoding video data, and various types of VCL or non-VCL NAL units may exist.

[0071] A NAL unit consists of NAL header information and data in the form of a Raw Byte Sequence Payload (RBSP), and the NAL header information includes summary information about the RBSP. The RBSP of a VCL NAL unit contains an integer number of encoded coding tree units.

[0072] In order to decode a bitstream in a decoder, the bitstream must first be separated into NAL units, and then each separated NAL unit must be decoded. Meanwhile, the information required for decoding the bitstream can be transmitted in the Picture Parameter Set (PPS), Sequence Parameter Set (SPS), Video Parameter Set (VPS), etc.

[0073] Meanwhile, the configuration and operation of the encoder described with reference to FIG. 1 are according to an embodiment of the present invention, and some configurations may be omitted or added as needed.

[0074] Additionally, a single picture can be divided and encoded into sub-pictures, slices, tiles, etc. A sub-picture may include one or more slices or tiles. If a single picture is divided and encoded into multiple slices or tiles, it can be displayed on the screen only after all slices or tiles within the picture have been fully decoded.

[0075] When a single picture is encoded into multiple sub-pictures, any single sub-picture may be decoded and displayed on the screen. A slice may include multiple tiles or sub-pictures, and a tile may include multiple sub-pictures or slices.

[0076] Sub-pictures, slices, and tiles can be encoded or decoded independently of each other, which is effective for parallel processing and improving processing speed; however, since encoded information from adjacent sub-pictures, slices, and tiles cannot be utilized, the number of bits may increase.

[0077] Sub-pictures, slices, and tiles can each be divided into multiple Coding Tree Units (CTUs) and encoded.

[0078] The coding tree unit may consist of a 128x128 luminance coding tree block (Coding Tree Block, CTB) and two 64x64 chrominance coding tree blocks.

[0079] A single coding tree unit may not be divided and may constitute a single coding unit (CU) itself, or it may be divided into multiple coding units as shown in FIG. 2. A coding unit may consist of a luminance coding block (CB) and two chrominance coding blocks.

[0080] A single coding unit may consist of a single transform unit (TU) or be divided into multiple transform units. A transform unit may consist of a luminance transform block (TB) and two chrominance transform blocks.

[0081] Here, a coding unit represents a basic unit for processing a picture during the prediction, transformation, quantization, entropy coding, and decoding processes, and the size and shape of the coding unit within a single picture may not be constant.

[0082] The coding unit may have a square or non-square shape, and the non-square coding unit may include a vertical coding unit in which the height is greater than the width and a horizontal coding unit in which the width is greater than the height.

[0083] The coding tree unit is first partitioned into a Quad Tree (QT) structure, so that a single node of size 2Nx2N can be partitioned into four nodes of size NxN. Meanwhile, the Quad Tree partitioning can be performed recursively, and not all nodes need to be partitioned to the same depth.

[0084] The leaf nodes of a quad tree can be further divided into a Multi-Type Tree (MTT) structure. For example, in a Multi-Type Tree structure, a single node can be divided into a binary or ternary tree structure of horizontal or vertical splitting. Accordingly, there can be four types of splitting structures in a Multi-Type Tree structure: vertical binary splitting, horizontal binary splitting, vertical ternary splitting, and horizontal ternary splitting.

[0085] In each tree structure, the width and height of a node can both be powers of 2. For example, in a binary tree (BT) structure, a node of size 2Nx2N can be divided into two Nx2N nodes by a vertical binary split and into two 2NxN nodes by a horizontal binary split.

[0086] In addition, in a ternary tree (TT) structure, a node of size 2Nx2N can be divided into (N/2)x2N, Nx2N, and (N/2)x2N nodes by vertical ternary partitioning, and into 2Nx(N/2), 2NxN, and 2Nx(N/2) nodes by horizontal ternary partitioning. This multi-type tree partitioning can be performed recursively.
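The child sizes produced by the quad, binary, and ternary splits described above can be summarized in a short sketch. The split-mode labels and the function name are hypothetical; sizes are given as (width, height).

```python
def split_node(width, height, mode):
    """Return the (width, height) of each child node for one split."""
    if mode == "QT":        # quad tree: four equal quadrants
        return [(width // 2, height // 2)] * 4
    if mode == "BT_VER":    # vertical binary split: two side-by-side halves
        return [(width // 2, height)] * 2
    if mode == "BT_HOR":    # horizontal binary split: two stacked halves
        return [(width, height // 2)] * 2
    if mode == "TT_VER":    # vertical ternary split: 1/4, 1/2, 1/4 of the width
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    if mode == "TT_HOR":    # horizontal ternary split: 1/4, 1/2, 1/4 of the height
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    raise ValueError(f"unknown split mode: {mode}")


# A 32x32 node corresponds to 2Nx2N with N = 16.
for mode in ("QT", "BT_VER", "BT_HOR", "TT_VER", "TT_HOR"):
    print(mode, split_node(32, 32, mode))
```

Applying `split_node` again to any child models the recursive partitioning.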

[0087] Leaf nodes of a multi-type tree can serve as coding units, and if a coding unit is not larger than the maximum transformation length, it can be used as a unit of prediction and transformation without further splitting. On the other hand, if the width or height of a coding unit is greater than the maximum transformation length, it can be split into multiple transformation units without explicit signaling regarding splitting.
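The implicit split into transformation units can be sketched as a simple tiling. The maximum transformation length of 64 used here is an assumed value for illustration, and the function name is hypothetical.

```python
def implicit_tu_split(cu_width, cu_height, max_tr_len=64):
    """Tile a coding unit into transformation units no larger than max_tr_len.

    Returns (x, y, width, height) tuples. No split flag is needed because
    the decoder can derive the same tiling from the coding unit size alone.
    """
    tu_width = min(cu_width, max_tr_len)
    tu_height = min(cu_height, max_tr_len)
    return [(x, y, tu_width, tu_height)
            for y in range(0, cu_height, tu_height)
            for x in range(0, cu_width, tu_width)]


print(implicit_tu_split(32, 32))   # [(0, 0, 32, 32)]
print(implicit_tu_split(128, 64))  # [(0, 0, 64, 64), (64, 0, 64, 64)]
```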

[0088] The tree partitioning structure described above may have the same form (Single Tree) for the luminance block and the chrominance block, or different forms (Dual Tree) for the luminance block and the chrominance block.

[0089] Meanwhile, regarding the block division from the coding tree unit (CTU) to the coding unit (CU) as described above, the final coding unit can be determined by selecting the division structure with the smallest rate-distortion cost (RD cost) value within the allowable size and depth conditions through the rate-distortion optimization (RDO) process.
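The selection of the division structure with the smallest RD cost can be sketched as below. The sketch assumes the common Lagrangian form J = D + lambda * R, which the source does not spell out; the candidate names, distortion values, and rates are made up for illustration.

```python
def rd_cost(distortion, rate_bits, lam):
    """Assumed Lagrangian rate-distortion cost: J = D + lambda * R."""
    return distortion + lam * rate_bits


def choose_best(candidates, lam):
    """Pick the candidate division structure with the smallest RD cost."""
    return min(candidates,
               key=lambda c: rd_cost(c["distortion"], c["rate"], lam))


# Hypothetical candidates for one block.
candidates = [
    {"name": "no_split", "distortion": 1000, "rate": 20},
    {"name": "quad_split", "distortion": 400, "rate": 120},
    {"name": "binary_split", "distortion": 600, "rate": 60},
]

print(choose_best(candidates, lam=5)["name"])  # binary_split (cost 900)
print(choose_best(candidates, lam=1)["name"])  # quad_split (cost 520)
```

A larger lambda penalizes rate more heavily, so the choice shifts towards coarser partitions that need fewer bits.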

[0090] Hereinafter, embodiments of an intra-prediction method will be described in more detail with reference to FIGS. 3 and 4.

[0091] For intra prediction, intra prediction mode information may be signaled, and the intra prediction mode information may indicate any one of a plurality of intra prediction modes. The plurality of intra prediction modes may include various methods such as directional intra prediction mode as illustrated in FIG. 3, IntraTMP, IBC, DIMD, TIMD, OBIC, MIP, SGPM, MRL, ISP, etc. The video encoder (100) includes the intra prediction mode information applied to the current block in the bitstream and transmits it to the video decoder (200), and the video decoder (200) can determine the intra prediction mode of the current block by parsing the intra prediction information included in the bitstream.

[0092] For example, as illustrated in FIG. 3, the directional intra prediction mode may include a planar mode, a DC mode, and 65 directional modes, and each intra prediction mode may be indicated through an intra prediction mode index.

[0093] Intra-prediction mode index "0" indicates a planar mode, intra-prediction mode index "1" indicates a DC mode, and intra-prediction mode indices "2" through "66" can each indicate different directional modes.

[0094] The directional modes each indicate a different angle within a preset angle range. For example, a directional mode can indicate an angle within a range between 45 degrees and -135 degrees clockwise.

[0095] In this case, the intra prediction mode index "2" indicates a horizontal diagonal (HDIA) mode, the intra prediction mode index "18" indicates a horizontal (HOR) mode, the intra prediction mode index "34" indicates a diagonal (DIA) mode, the intra prediction mode index "50" indicates a vertical (VER) mode, and the intra prediction mode index "66" indicates a vertical diagonal (VDIA) mode.
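The index assignment described in paragraphs [0093] to [0095] can be summarized in a short Python sketch. The function name `describe_intra_mode` and the `ANGULAR_n` label format are hypothetical. The index values are taken from the text.

```python
PLANAR, DC = 0, 1
# Named directional modes listed in the text.
NAMED_ANGULAR = {2: "HDIA", 18: "HOR", 34: "DIA", 50: "VER", 66: "VDIA"}


def describe_intra_mode(index):
    """Map an intra prediction mode index (0..66) to a readable label."""
    if index == PLANAR:
        return "PLANAR"
    if index == DC:
        return "DC"
    if 2 <= index <= 66:
        # Unnamed directional modes get a generic label (illustrative only).
        return NAMED_ANGULAR.get(index, f"ANGULAR_{index}")
    raise ValueError(f"mode index out of range: {index}")


if __name__ == "__main__":
    for idx in (0, 1, 2, 18, 34, 50, 66, 7):
        print(idx, describe_intra_mode(idx))
```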

[0096] If the current block is a non-square block, 20 additional wide angular modes indicating angles greater than 45 degrees clockwise or less than -135 degrees may be used.

[0097] Based on the intra prediction mode information as described above, reference samples to be used for intra prediction for the current block are determined.

[0098] For example, if the intra prediction mode index indicates a specific directional mode, a reference sample corresponding to the angle from the current sample in the current block is used for prediction regarding the current sample. For intra prediction, surrounding already restored samples are used as reference samples, and the reference samples may be restored samples located to the left or above the current block.

[0099] Referring to FIG. 4, the reference samples may be samples adjacent to the left boundary and the upper boundary of the current block.

[0100] For example, if the current block size is NxN and samples from a single reference line adjacent to the current block are used for intra prediction, reference samples can be established using (2N*2+1) surrounding samples located to the left (L, Left), top (T, Top), and top-left (TL, Top-left) of the current block.
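As a rough illustration of the (2N*2+1) count, the following Python sketch lists the coordinates of a single reference line for an N×N block. It assumes that the top-left sample of the current block is at (0, 0), that 2N samples are taken above the block (including the above-right extension), and that 2N samples are taken to its left (including the below-left extension). This split is an assumption, since the text only gives the total. The function name `reference_positions` is hypothetical.

```python
def reference_positions(n):
    """Return (x, y) coordinates of the 4N+1 reference samples of an NxN block.

    Assumed layout: one top-left sample, 2N samples in the row above the
    block, and 2N samples in the column to the left of the block.
    """
    top_left = [(-1, -1)]
    top = [(x, -1) for x in range(2 * n)]
    left = [(-1, y) for y in range(2 * n)]
    return top_left + top + left


if __name__ == "__main__":
    positions = reference_positions(4)
    print(len(positions))  # 2N*2 + 1 with N = 4
```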

[0101] Meanwhile, samples of multiple reference lines (MRLs) may be used for intra prediction of the current block, and the multiple reference lines may consist of n reference lines located within a pre-set range from the current block. In this case, separate reference line index information indicating the reference lines to be set as reference pixels may be signaled.

[0102] In addition, if at least some of the samples to be used as reference samples have not yet been restored, reference samples can be obtained through a reference sample padding process, and a reference sample filtering process can be performed to reduce the error of intra-prediction.

[0103] FIG. 5 is a block diagram illustrating a decoding device according to an embodiment of the present invention, and is intended to explain the configuration and operation of a video decoder for decoding an image.

[0104] Referring to FIG. 5, the video decoder (200) may be configured to include an entropy decoding unit (210), an inverse quantization unit (220), an inverse transform unit (230), a filtering unit (240), a prediction unit (260), and a DPB (Decoded Picture Buffer, 270).

[0105] The entropy decoding unit (210) entropy-decodes the bitstream to extract transform coefficient information, intra-coding information, inter-coding information, etc. for each region.

[0106] For example, the entropy decoding unit (210) can obtain a binary code for the transform coefficient information of a specific region from the bitstream, and can obtain quantized transform coefficients by debinarizing the binary code.

[0107] The inverse quantization unit (220) inversely quantizes the quantized transform coefficients, and the inverse transform unit (230) restores the residual value using the inversely quantized transform coefficients. After performing a primary inverse transform on a transform block containing the inversely quantized transform coefficients, the inverse transform unit (230) can additionally perform a secondary inverse transform to obtain the residual value.

[0108] Meanwhile, the residual value obtained from the inverse transform unit (230) is added to the predicted value obtained from the prediction unit (260) to restore the original pixel value.
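The addition of residual and prediction can be sketched in Python as follows. Clipping the result to the valid sample range is an assumption added for the illustration (the text only describes the addition), and the function name `reconstruct` and the `bit_depth` parameter are hypothetical.

```python
def reconstruct(pred, resid, bit_depth=8):
    """Add a residual block to a prediction block, sample by sample.

    Clipping to [0, 2^bit_depth - 1] is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


if __name__ == "__main__":
    prediction = [[100, 250], [3, 128]]
    residual = [[-5, 10], [-7, 0]]
    print(reconstruct(prediction, residual))
```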

[0109] The filtering unit (240) performs filtering operations using a deblocking filter, a sample adaptive offset, an adaptive loop filter, etc., to improve the quality of the restored picture, and the filtered picture can be stored in the DPB (270) to be output or used as a reference picture for the next picture.

[0110] The prediction unit (260) includes an intra prediction unit (261) and an inter prediction unit (265), and generates a prediction picture by utilizing the encoding type decoded through the entropy decoding unit (210), the transform coefficients for each region, intra / inter encoding information, etc.

[0111] To restore the current block being decoded, the decoded regions of the current picture containing the current block or of other pictures may be utilized. A picture (or tile / slice) that performs intra prediction or intra-BC prediction using only the current picture for restoration is called an intra picture or I picture (or tile / slice), and a picture (or tile / slice) capable of performing intra prediction, inter prediction, and intra-BC prediction is called an inter picture (or tile / slice).

[0112] Meanwhile, a picture (or tile / slice) that uses at most one motion vector and reference picture index to predict sample values of each block among the inter-pictures (or tiles / slices) is called a predictive picture or P picture (or tile / slice), and a picture (or tile / slice) that uses at most two motion vectors and reference picture indices is called a Bi-predictive picture or B picture (or tile / slice).

[0113] That is, the P picture (or tile / slice) uses at most one set of motion information to predict each block, and the B picture (or tile / slice) uses at most two sets of motion information to predict each block. Here, a set of motion information may include one or more motion vectors and one reference picture index.

[0114] The intra prediction unit (261) generates a prediction block using intra encoding information and restored samples within the current picture, and the intra encoding information may include at least one of an intra prediction mode, an MPM (Most Probable Mode) flag, an MPM index, information regarding a reference sample, and various parameter information required for intra prediction according to each intra prediction method.

[0115] The intra prediction unit (261) can predict the sample values of the current block by using the restored samples located to the left and / or above the current block as reference samples.

[0116] For example, the reference samples may be samples adjacent to the left boundary of the current block and / or samples adjacent to the upper boundary, and among the samples of the surrounding blocks of the current block, samples located on a line within a preset distance from the left boundary of the current block and / or samples located on a line within a preset distance from the upper boundary of the current block. In this case, the surrounding blocks of the current block may include at least one of the left (L) block, upper (A) block, lower-left (BL) block, upper-right (AR) block, or upper-left (AL) block adjacent to the current block.

[0117] The inter prediction unit (265) generates a prediction block using reference picture and inter encoding information stored in the DPB (270), and the inter encoding information may include a set of motion information of the current block for the reference block (reference picture index, motion vector information, etc.).

[0118] Meanwhile, inter-prediction can include L0 prediction, L1 prediction, and bi-prediction.

[0119] L0 prediction means a prediction using one reference picture included in the L0 picture list, and L1 prediction means a prediction using one reference picture included in the L1 picture list. To do this, a set of motion information (e.g., motion vector and reference picture index) may be required.

[0120] In the bidirectional prediction method, up to two reference regions can be used, and the two reference regions may exist in the same reference picture or in different pictures. Accordingly, in the up to two sets of motion information used in the bidirectional prediction method, two motion vectors may correspond to the same reference picture index or to different reference picture indices.

[0121] At this time, the reference pictures are pictures located temporally before or after the current picture, and can be completed pictures that have already been restored, and the two reference regions used in the bidirectional prediction method can be regions selected from the L0 picture list and the L1 picture list, respectively.

[0122] The inter prediction unit (265) can obtain a reference block of the current block using a motion vector and a reference picture index, and the reference block exists within a reference picture corresponding to the reference picture index.

[0123] In addition, a sample value of a block specified by a motion vector or an interpolated value thereof can be used as a predictor for the current block. For motion prediction with subpel-level pixel accuracy, an 8-tap interpolation filter can be used for the luminance signal and a 4-tap interpolation filter can be used for the chrominance signal.

[0124] Meanwhile, the configuration and operation of the decoder described with reference to FIG. 5 is according to an embodiment of the present invention, and some configurations may be omitted or added as needed, and the decoder can decode an image by performing the reverse process of the encoding method of the encoder described above.

[0125] Meanwhile, as previously described, a matrix-based intra prediction (hereinafter referred to as MIP) may be used as one of the multiple intra prediction modes. MIP may also be referred to as MPDIP (Matrix-based position-Dependent Intra-Prediction). Below, the process of generating prediction samples according to the MIP mode in the intra prediction unit (161, 261) is described.

[0126] FIG. 6 is a diagram illustrating the process of generating a prediction sample of the current block according to the MIP mode according to one embodiment of the present invention.

[0127] When the MIP mode is applied to the current block, the intra prediction unit (161, 261) performs a matrix multiplication procedure using surrounding reference samples and, if necessary, further performs a horizontal / vertical interpolation procedure to obtain prediction samples for the current block.

[0128] In order to predict samples of a current block of size W*H with width (W) and height (H) in MIP mode, reference samples (r(k)) of a reference region consisting of restored left peripheral reference samples and upper peripheral reference samples of the current block are used as input values. As shown in Equation 1 below, a predicted sample (P(x,y)) of the current block at position (x,y) can be obtained through matrix multiplication operations between the reference samples (r(k)) and the matrix (M(x,y,k)).

[0129] Equation 1: $P(x,y) = \sum_{k} M(x,y,k) \cdot r(k)$

[0130] In Equation 1, k is the reference sample index. One of a plurality of MIP matrices can be set for each predicted pixel location (x,y) of the current block. Each of the plurality of MIP matrices includes weight vectors composed of weights that are multiplied by the surrounding reference pixels for each predicted pixel location.

[0131] FIGS. 7a and 7b illustrate an example of a process for generating a predicted sample of a current block according to an MIP mode according to an embodiment of the present invention.

[0132] To predict samples of a current block (700) of size W*H having width (W) and height (H), a reference sample (r(k)) used as an input for MIP mode can be obtained from the restored upper peripheral reference samples (710, 730) and left peripheral reference samples (720, 740) of the current block. For example, when the size of the current block (700) is 4*4, as shown in FIG. 7a, four upper peripheral reference samples (710) and four left peripheral reference samples (720) can be used as input values for matrix multiplication operations. Additionally, as shown in FIG. 7b, when the size of the current block (700) is 4*4, peripheral reference pixels reconstructed through an averaging process for eight upper peripheral reference samples (730) and eight left peripheral reference samples (740) can be used as input values for matrix multiplication operations. FIGS. 7a and 7b are merely examples, and the reference region including the reference pixels used in MIP mode is not limited thereto and may be changed. The method for setting the reference region will be described later.
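The averaging step of FIG. 7b can be sketched in Python. The text does not state how many inputs the averaging produces or how rounding is done. This sketch assumes that eight samples are reduced to four by averaging adjacent pairs with integer rounding. The function name `average_downsample` and the sample values are hypothetical.

```python
def average_downsample(samples, out_len):
    """Reduce a reference line to out_len values by averaging equal groups.

    Assumes len(samples) is a multiple of out_len and uses rounded integer
    averaging; both choices are assumptions of this sketch.
    """
    if len(samples) % out_len != 0:
        raise ValueError("sample count must be a multiple of out_len")
    group = len(samples) // out_len
    return [
        (sum(samples[i * group:(i + 1) * group]) + group // 2) // group
        for i in range(out_len)
    ]


if __name__ == "__main__":
    upper = [100, 102, 104, 106, 99, 101, 110, 112]  # hypothetical values
    print(average_downsample(upper, 4))
```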

[0133] It is assumed that the values of the surrounding reference pixels obtained through the above process, or the values of the surrounding reference pixels reconstructed through the averaging process, are as shown in Table 1 below.

[0134] Table 1
Reference sample | Reference sample value
T1[0] | 100
T1[1] | 102
T1[2] | 101
T1[3] | 99
T2[0] | 103
T2[1] | 105
T2[2] | 107
T2[3] | 108

[0135] In addition, it is assumed that the MIP matrix applied to each pixel position of the current 4x4 block is set as shown in Table 2 below.

[0136] Table 2
Pixel position (x,y) | Weight vector F(x,y,k)
(0,0) | [0.20, 0.15, 0.10, 0.05, 0.25, 0.15, 0.07, 0.03]
(1,0) | [0.25, 0.20, 0.10, 0.05, 0.20, 0.10, 0.07, 0.03]
(0,1) | [0.10, 0.10, 0.05, 0.05, 0.30, 0.25, 0.10, 0.05]
(1,1) | [0.15, 0.10, 0.10, 0.05, 0.25, 0.20, 0.10, 0.05]

[0137] For example, the predicted sample P(0,0) at the (0,0) position of the current block (700) and the predicted sample P(1,1) at the (1,1) position can be obtained as shown in Equations 2 and 3 below, respectively, through a weighted sum of the reference sample values of the surrounding pixels in Table 1 and the weight vectors applied to the (0,0) and (1,1) pixel positions in Table 2.

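[0138] Equation 2: $P(0,0) = 0.20 \cdot 100 + 0.15 \cdot 102 + 0.10 \cdot 101 + 0.05 \cdot 99 + 0.25 \cdot 103 + 0.15 \cdot 105 + 0.07 \cdot 107 + 0.03 \cdot 108 = 102.58$

[0139] Equation 3: $P(1,1) = 0.15 \cdot 100 + 0.10 \cdot 102 + 0.10 \cdot 101 + 0.05 \cdot 99 + 0.25 \cdot 103 + 0.20 \cdot 105 + 0.10 \cdot 107 + 0.05 \cdot 108 = 103.10$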
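The weighted sums for the pixel positions of Table 2 can be checked numerically with a short Python sketch. It assumes that the reference sample index k runs over T1[0] to T1[3] first and then T2[0] to T2[3], since the text does not state the ordering explicitly. The names `REF`, `WEIGHTS`, and `mip_predict` are hypothetical.

```python
# Table 1 values; the ordering T1[0..3] followed by T2[0..3] is an assumption.
REF = [100, 102, 101, 99, 103, 105, 107, 108]

# Table 2 weight vectors, keyed by pixel position (x, y).
WEIGHTS = {
    (0, 0): [0.20, 0.15, 0.10, 0.05, 0.25, 0.15, 0.07, 0.03],
    (1, 0): [0.25, 0.20, 0.10, 0.05, 0.20, 0.10, 0.07, 0.03],
    (0, 1): [0.10, 0.10, 0.05, 0.05, 0.30, 0.25, 0.10, 0.05],
    (1, 1): [0.15, 0.10, 0.10, 0.05, 0.25, 0.20, 0.10, 0.05],
}


def mip_predict(weights, ref):
    """Weighted sum of reference samples for one predicted pixel position."""
    if len(weights) != len(ref):
        raise ValueError("weight vector and reference samples differ in length")
    return sum(w * r for w, r in zip(weights, ref))


if __name__ == "__main__":
    for position, weight_vector in WEIGHTS.items():
        print(position, round(mip_predict(weight_vector, REF), 2))
```

Each weight vector in Table 2 sums to 1, so every predicted value stays within the range of the reference sample values.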

[0140] FIG. 8 is a diagram illustrating an interpolation process performed in MIP mode in one embodiment of the present invention.

[0141] Referring to FIG. 8, when the prediction mode of the current block is MIP mode, only the pixel values of the current block's specific locations (A1, A2, B1, B2, etc.) are predicted by applying the surrounding reference sample values and the MIP matrix as described in Equation 1 above, and the pixel values of the remaining locations can be predicted through a horizontal or vertical interpolation process using adjacent surrounding pixels. Pixels (a1 to a6, b1 to b6) located in the same row as the pixels of the specific locations (A1, A2, B1, B2, etc.) are predicted through horizontal linear interpolation using adjacent available surrounding pixels, and the remaining pixels (810) not located in the same row as the pixels of the specific locations (A1, A2, B1, B2, etc.) can be predicted through vertical linear interpolation using adjacent available surrounding pixels.

[0142] Meanwhile, whether the prediction mode of the current block is MIP mode can be signaled through a specific intra prediction mode. Existing codecs support various intra prediction modes. For example, in the intra prediction of VVC / H.266, a total of 65 intra prediction modes are supported for square blocks, and a total of 93 intra prediction modes are supported for non-square blocks. The multiple intra prediction modes include non-angular modes such as the Planar and DC modes, and multiple angular modes such as the example in FIG. 3 described above. In one embodiment, a specific intra prediction mode among the multiple intra prediction modes may be replaced to indicate the MIP mode. For example, the intra prediction modes {0, 1, (2+a*k)} (where a is a predetermined integer and k is any non-negative integer) may be replaced with the MIP mode. If a is set to 2, all intra prediction modes having indices {0, 1, 2, 4, 6, 8, ...} can be replaced with MIP modes. That is, if the intra prediction mode of the current block corresponds to one of {0, 1, 2, 4, 6, 8, ...}, the prediction mode of the current block can be determined as MIP mode instead of the original intra prediction mode. Information indicating the intra prediction mode to be replaced with MIP mode can be set identically in advance in the video encoder (100) and the video decoder (200). Additionally, the video encoder (100) can encode the information and include it in the bitstream transmitted to the video decoder (200). For example, as in the example above, if the intra prediction modes {0, 1, (2+a*k)} are replaced with MIP mode, the video encoder (100) can transmit the value of a to the video decoder (200), while {0, 1} can be set in advance to be replaced with MIP mode.
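The set {0, 1, (2+a*k)} can be enumerated with a small Python sketch. The upper bound of 66 is an assumption taken from the index range in paragraph [0093], and the function name `mip_replaced_modes` is hypothetical.

```python
def mip_replaced_modes(a, max_mode=66):
    """List the mode indices {0, 1, 2 + a*k} for k = 0, 1, 2, ...

    max_mode = 66 is assumed from the 0..66 index range described earlier.
    """
    if a <= 0:
        raise ValueError("a must be a positive integer")
    return [0, 1] + list(range(2, max_mode + 1, a))


if __name__ == "__main__":
    print(mip_replaced_modes(2)[:8])
    print(mip_replaced_modes(4)[:8])
```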

[0143] In one embodiment, the intra prediction mode that is replaced to point to the MIP mode can be adaptively set based on various conditions. For example, the intra prediction mode that is replaced to point to the MIP mode can be adaptively set based on the size, shape, slice type, QP (Quantization parameter) of the current block, information on surrounding reference blocks above and to the left, template cost obtained based on the template, intra prediction related tools (tool information) such as DIMD, TIMD, IntraTMP, OBIC of the current block, MPM value, prediction mode tool information of surrounding blocks, MRL related information, reference value of the block vector, and L-shaped reference area information.

[0144] For example, if a specific intra prediction mode is configured to be replaced with an MIP mode in different ways depending on the size of the current block, the specific intra prediction mode may be replaced with an MIP mode as shown in Table 3 below, depending on the size of the current block (W*H).

[0145] Table 3
Current block size | Intra prediction mode replaced by MIP mode
W≤16 && H≤16 | 0, 1, (2+2*k)
Others | 0, 1, (2+4*k)

[0146] According to Table 3, if the intra prediction mode of the current block corresponds to a specific intra prediction mode that is replaced by the MIP mode, the prediction mode of the current block may be determined to be the MIP mode. For example, if the width (W) and height (H) of the current block are both 16 or less, and the intra prediction mode of the current block corresponds to 0 (planar mode), 1 (DC mode), or (2+2*k) (where k is a non-negative integer), the prediction mode of the current block is determined to be the MIP mode. As another example, if the size of the current block is 8*16 and the intra prediction mode of the current block corresponds to one of {0, 1, 2, 4, 6, 8, 10, 12}, the prediction mode of the current block is determined to be the MIP mode. In addition, if the width (W) of the current block is greater than 16 or the height (H) is greater than 16, and the intra prediction mode of the current block corresponds to one of 0 (planar mode), 1 (DC mode), or (2+4*k), the prediction mode of the current block is determined to be the MIP mode. As another example, if the size of the current block is 16*32 and the intra prediction mode of the current block corresponds to one of {0, 1, 2, 6, 10, 14, 18}, the prediction mode of the current block is determined to be the MIP mode.
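The size-dependent rule of Table 3 can be expressed as a short Python check. This is a sketch of the rule as described, with hypothetical function names `mip_step` and `is_replaced_by_mip`.

```python
def mip_step(width, height):
    """Step between replaced directional modes: 2 for small blocks, else 4."""
    return 2 if width <= 16 and height <= 16 else 4


def is_replaced_by_mip(mode, width, height):
    """Return True if the mode index is one of {0, 1, 2 + step*k} for the block size."""
    if mode in (0, 1):  # planar and DC are replaced for every block size
        return True
    return mode >= 2 and (mode - 2) % mip_step(width, height) == 0


if __name__ == "__main__":
    print([m for m in range(13) if is_replaced_by_mip(m, 8, 16)])
    print([m for m in range(19) if is_replaced_by_mip(m, 16, 32)])
```

The two printed lists correspond to the 8*16 and 16*32 examples in the text.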

[0147] The MIP mode may be restricted depending on the size and shape of the current block, not limited to Table 3 described above. The MIP mode may not be applied if the size of the current block exceeds a predetermined size, if the shape of the current block is excessively large in only one direction (horizontal or vertical), if the ratio of width to height exceeds a certain value, or if the current block is not rectangular. For example, the MIP mode may be restricted for blocks that are excessively long in only one direction (horizontal or vertical), such as 4*32, 32*4, 8*32, or 32*8, or for blocks where both width and height exceed a predetermined threshold size (e.g., 32*32). In cases where the MIP mode is restricted in this way, the intra prediction mode of the current block is applied as is, and the intra prediction mode is not replaced by the MIP mode.
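One possible form of such a restriction check is sketched below in Python. The thresholds are hypothetical: `max_ratio=2` is chosen so that the 4*32, 32*4, 8*32, and 32*8 examples are restricted while 8*16 and 16*32 remain allowed, and `max_size=32` treats a block as too large only when both dimensions exceed 32. The text leaves the actual thresholds open, and the non-rectangular case is not modeled.

```python
def mip_allowed(width, height, max_size=32, max_ratio=2):
    """Return True if MIP may be applied to a width x height block.

    max_size and max_ratio are illustrative thresholds, not values
    fixed by the text.
    """
    if width > max_size and height > max_size:
        return False  # both dimensions exceed the size threshold
    ratio = max(width, height) // min(width, height)
    return ratio <= max_ratio  # reject blocks that are long in one direction


if __name__ == "__main__":
    for size in [(4, 32), (32, 8), (8, 16), (16, 32), (64, 64)]:
        print(size, mip_allowed(*size))
```

When `mip_allowed` returns False, the signaled intra prediction mode would be applied as is, without replacement by the MIP mode.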

[0148] FIG. 9 is a diagram illustrating the process of configuring an MPM list according to an embodiment of the present invention.

[0149] Referring to FIG. 9, an MPM list containing six candidates can be configured based on the intra prediction modes of the surrounding block (A) adjacent to the upper side of the current block and the surrounding block (L) adjacent to the left side. For example, if the size of the current block is W*H and the top-left corner position of the current block is (0,0), the surrounding block (A) adjacent to the upper side may be a block containing the surrounding pixel at the position (W-1, -1), and the surrounding block (L) adjacent to the left side may be a block containing the surrounding pixel at the position (-1, H-1).

[0150] If the intra prediction mode of the surrounding blocks (A, L) is not available, the prediction mode of the surrounding blocks can be set to planar mode to construct the MPM list. If the intra prediction mode of the surrounding blocks (A, L) is a non-angular mode such as planar or DC, a 6-MPM list including {Planar, DC, V, H, V-4, V+4} (V is vertical mode, H is horizontal mode) can be constructed. If the intra prediction mode of the surrounding blocks (A, L) is an angular mode, the MPM list can be constructed using the surrounding intra prediction modes of that angular mode. For example, the MPM list can be constructed using an intra prediction mode to which a predetermined offset (±1, ±2, ...) is applied to the intra prediction mode of the surrounding blocks (A, L). In this case, considering redundancy, prediction modes already included in the MPM list are not added to the MPM list.

[0151] Additionally, an MPM list may be constructed based on the intra prediction modes of the surrounding blocks adjacent to the upper side of the current block (A), adjacent to the left side (L), adjacent to the upper right side (AR), adjacent to the lower left side (BL), and adjacent to the upper left side (AL). Specifically, an MPM list containing 22 MPM candidates may be constructed based on the intra prediction modes of the surrounding blocks L, A, BL, AR, and AL, the surrounding intra prediction modes obtained by applying a predetermined offset (±1, ±2, ...) to the first two directional intra prediction modes available among them, and default modes. Here, the default modes may include a horizontal mode, a vertical mode, a diagonal mode in the upper-left direction, and a DIMD mode.

[0152] A PMPM (Primary MPM) containing the top 6 intra-prediction modes from the list of 22 MPMs may be configured, or a PMPM containing the top 5 intra-prediction modes and a Planar mode may be configured. The remaining 16 intra-prediction modes not included in the PMPM constitute a SMPM (Secondary MPM).

[0153] Meanwhile, when configuring the MPM candidates included in the MPM list, the candidates can be sorted in ascending order of a cost such as SATD (Sum of Absolute Transformed Differences) to determine their order in the MPM list. That is, the smaller the SATD of an MPM candidate, the higher its rank (the smaller its mpm_idx).

[0154] FIGS. 10a and 10b are drawings for explaining a method of determining the cost when configuring an MPM candidate according to one embodiment of the present disclosure.

[0155] Referring to FIG. 10a, a prediction mode according to an MPM candidate is applied to an upper template (1012) of a predetermined size composed of upper surrounding pixels of the current block (1011) to obtain a predicted value of the upper template (1012) from L-shaped surrounding reference pixels (1013), and a first cost such as SATD can be calculated based on the value of the upper template (1012) and the predicted value. Additionally, referring to FIG. 10b, a prediction mode according to an MPM candidate is applied to a left template (1022) of a predetermined size composed of left surrounding pixels of the current block (1021) to obtain a predicted value of the left template (1022) from L-shaped surrounding reference pixels (1023), and a second cost such as SATD can be calculated based on the value of the left template (1022) and the predicted value. The first and second costs obtained from the upper template (1012) and the left template (1022) can be summed to obtain the cost of the final template. Based on this template cost, MPM candidates included in the MPM list can be sorted. As described below, when a specific MPM candidate in the MPM list is replaced with an MIP mode, the cost of the template can be calculated according to the MIP mode when calculating the template cost according to the MPM candidate.
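The ordering by template cost can be sketched in Python as follows. The per-template costs are passed in as precomputed numbers, so the SATD computation itself is not shown. The function name `sort_mpm_by_template_cost` and the cost values in the example are hypothetical.

```python
def sort_mpm_by_template_cost(candidates, top_cost, left_cost):
    """Sort MPM candidates by the summed cost of the upper and left templates.

    top_cost and left_cost map each candidate mode to its first and second
    cost. Python's sort is stable, so candidates with equal total cost
    keep their original relative order.
    """
    def total(mode):
        return top_cost[mode] + left_cost[mode]

    return sorted(candidates, key=total)


if __name__ == "__main__":
    modes = [0, 50, 18]
    first = {0: 40, 50: 10, 18: 25}   # hypothetical upper-template costs
    second = {0: 30, 50: 35, 18: 5}   # hypothetical left-template costs
    # Position in the returned list plays the role of mpm_idx.
    print(sort_mpm_by_template_cost(modes, first, second))
```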

[0156] Whether the MPM list is in use can be signaled explicitly or implicitly. For example, if the MPM list is explicitly signaled, the video encoder (100) may transmit flag information (MPM_flag) indicating whether the MPM list is in use to the video decoder (200). If MPM_flag indicates that the MPM list is in use, the MPM index (mpm_idx) may be transmitted as intra prediction mode information instead of the intra prediction mode index information. In this specification, the intra prediction mode information of the current block may be the mpm index when the MPM list is in use, and general intra prediction mode index information when the MPM list is not in use. Whether the MPM list is in use may also be implicitly derived based on additional information related to the current block.
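On the decoder side, the explicit signaling case can be sketched in Python as below. The function name `decode_intra_mode` and its parameters are hypothetical, parsing of the bitstream is not modeled, and the non-MPM index is taken directly as the mode index, as the paragraph describes.

```python
def decode_intra_mode(mpm_flag, mpm_list, mpm_idx=None, mode_idx=None):
    """Derive the intra prediction mode from already-parsed syntax values.

    If mpm_flag is set, mpm_idx selects an entry of the MPM list; otherwise
    the general intra prediction mode index is used as is.
    """
    if mpm_flag:
        if mpm_idx is None or not 0 <= mpm_idx < len(mpm_list):
            raise ValueError("mpm_idx missing or out of range")
        return mpm_list[mpm_idx]
    if mode_idx is None:
        raise ValueError("mode_idx is required when the MPM list is not used")
    return mode_idx


if __name__ == "__main__":
    mpm = [0, 1, 50, 18, 46, 54]  # {Planar, DC, V, H, V-4, V+4}
    print(decode_intra_mode(True, mpm, mpm_idx=2))
    print(decode_intra_mode(False, mpm, mode_idx=33))
```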

[0157] Meanwhile, in one embodiment, a specific intra prediction mode within an MPM list or a non-MPM list may be replaced with an MIP mode, and in this case, if the prediction mode of the current block corresponds to a specific intra prediction mode that has been replaced with an MIP mode, the prediction mode of the current block may be determined as an MIP mode. Additionally, a specific intra prediction mode may be configured to be replaced with an MIP mode in different ways depending on the size of the current block. For example, depending on the size of the current block (W*H), a specific intra prediction mode that is replaced with an MIP mode from an initial intra prediction mode (Intra_Prediction_Mode) may be determined as shown in Table 4 below.

[0158] Table 4
Current block size | Intra prediction mode replaced by MIP mode
W≤16 && H≤16 | ((Intra_Prediction_Mode -2)+1>>1)*2+2
Others | ((Intra_Prediction_Mode -2)+2>>1)*4+2

[0159] For example, if the current block's width (W) and height (H) are both 16 or less and Intra_Prediction_Mode is 3, the intra prediction mode with a value of 4 (= ((3-2)+1>>1)*2+2) is replaced with the MIP mode. That is, the intra prediction mode index 4 is set to point to the MIP mode instead of pointing to the existing directional intra prediction mode. If Intra_Prediction_Mode is 6, the value of ((6-2)+1>>1)*2+2 is 6, so the intra prediction mode with a value of 6 is replaced with the MIP mode. If Intra_Prediction_Mode is 2, the value of ((2-2)+1>>1)*2+2 is 2, so the intra prediction mode with a value of 2 is replaced with the MIP mode. If Intra_Prediction_Mode is 8, the value of ((8-2)+1>>1)*2+2 is 8, so the intra prediction mode with a value of 8 is replaced with the MIP mode. In this way, the index of the intra prediction mode pointing to the MIP mode among all Intra_Prediction_Modes can be determined.

[0160] As another example, if the current block width (W) is greater than 16 or the height (H) is greater than 16 and Intra_Prediction_Mode is 3, the intra prediction mode with a value of 6 (= ((3-2)+2>>1)*4+2) is replaced with the MIP mode. If Intra_Prediction_Mode is 6, the intra prediction mode with a value of 14 (= ((6-2)+2>>1)*4+2) is replaced with the MIP mode.
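The Table 4 expressions can be evaluated with a short Python sketch. It assumes that the addition is performed before the right shift, as in C operator precedence, which reproduces the worked values in paragraphs [0159] and [0160]. The function name `mip_pointing_mode` is hypothetical, and the sketch is meant for directional mode indices of 2 or more.

```python
def mip_pointing_mode(intra_prediction_mode, width, height):
    """Evaluate the Table 4 expression for the given block size.

    The addition is applied before the right shift (assumed precedence).
    """
    if intra_prediction_mode < 2:
        raise ValueError("expression is applied to directional modes (index >= 2)")
    m = intra_prediction_mode - 2
    if width <= 16 and height <= 16:
        return ((m + 1) >> 1) * 2 + 2
    return ((m + 2) >> 1) * 4 + 2


if __name__ == "__main__":
    for mode in (2, 3, 6, 8):
        print(mode, mip_pointing_mode(mode, 16, 16))
    for mode in (3, 6):
        print(mode, mip_pointing_mode(mode, 32, 16))
```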

[0161] In this way, specific intra prediction modes can be determined to point to MIP modes. When the intra prediction mode of the current block is signaled through the MPM (Most Probable Mode) list, whether the prediction mode of the current block is a MIP mode can be determined by whether it corresponds to a specific intra prediction mode that has been replaced to point to a MIP mode within the MPM list. For example, as in the example above, if the width (W) and height (H) of the current block are both 16 or less, if the intra prediction mode of the current block is one of the intra prediction modes that has been replaced to a MIP mode such as {2, 4, 6}, the prediction mode of the current block can be determined to be a MIP mode.

[0162] The MIP mode may be restricted based on the size and shape of the current block, not limited to the aforementioned Table 4. The MIP mode may not be applied if the size of the current block exceeds a predetermined size, if the shape of the current block is excessively large in only one direction (horizontal or vertical), if the ratio of width to height exceeds a certain value, or if the current block is not rectangular. For example, the MIP mode may be restricted for blocks that are excessively long in only one direction (horizontal or vertical), such as 4*32, 32*4, 8*32, or 32*8, or for blocks where both width and height exceed a predetermined threshold size (e.g., 32*32). In cases where the MIP mode is restricted in this manner, the intra prediction mode of the current block determined from the MPM list is applied as is, and the intra prediction mode is not replaced by the MIP mode.

[0163] An intra prediction mode replaced by an MIP mode may be added to the MPM list, and redundancy is considered when configuring the MPM list. If an intra prediction mode replaced by an MIP mode is already included in the MPM list, an intra prediction mode obtained by adding a predetermined offset to the intra prediction mode replaced by an MIP mode may be added to the MPM list. For example, if an intra prediction mode such as {2, 4, 6...} is replaced by an MIP mode and {2,4,6...} is already configured in the MPM list, and an MPM candidate cannot be newly added to the MPM list due to redundancy considerations because it is already included in the MPM list as one of {2,4,6...}, then an intra prediction mode obtained by adding a predetermined offset to the intra prediction mode of that MPM candidate may be added to the MPM list. In this case, the offset may be set according to the current block size. For example, if the width and height are 16 or less, the offset is set to 2, otherwise it is set to 4, and the value obtained by adding the offset to the corresponding intra prediction mode can be included in the MPM list. Even when adding an MPM candidate with the offset added to the MPM list, redundancy is considered, and if the MPM candidate with the offset added is also included in the MPM list, it is not added to the MPM list.
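The redundancy handling with a size-dependent offset can be sketched in Python as follows. This is one reading of the paragraph above, with a hypothetical function name `add_mip_candidate`. The list is modified in place, and the return value reports whether anything was added.

```python
def add_mip_candidate(mpm_list, mode, width, height):
    """Add a MIP-replaced mode to the MPM list while avoiding duplicates.

    If the mode is already present, the mode plus an offset (2 when both
    dimensions are 16 or less, otherwise 4) is tried once instead.
    """
    if mode not in mpm_list:
        mpm_list.append(mode)
        return True
    offset = 2 if width <= 16 and height <= 16 else 4
    shifted = mode + offset
    if shifted not in mpm_list:
        mpm_list.append(shifted)
        return True
    return False  # the shifted mode is also a duplicate, so nothing is added


if __name__ == "__main__":
    mpm = [2]
    add_mip_candidate(mpm, 2, 8, 8)    # 2 is present, so 2 + 2 is added
    add_mip_candidate(mpm, 2, 8, 8)    # 2 and 4 are present, so nothing is added
    add_mip_candidate(mpm, 2, 32, 32)  # larger block: 2 + 4 is added
    print(mpm)
```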

[0164] In addition, as mentioned above, when MPM candidates are sorted in ascending order based on the cost of SATD when configuring MPM candidates included in the MPM list, the cost of the MPM candidate corresponding to the MIP mode is obtained by applying the MIP mode.

[0165] Meanwhile, since the matrix used for MIP is determined based on the current block size and intra-prediction mode, a separate matrix must be stored for each. This can place a burden on the memory of the encoder and decoder.

[0166] Accordingly, in one embodiment, elements included in a reference matrix (F_max) set for the largest available block size are subsampled based on the size of the current block, and the subsampled matrix can be used in MIP mode.

[0167] Specifically, a reference matrix F_max(x, y, k) (where k is the reference sample index), applied to a reference block size (e.g., 32*32) that is the largest block size to which MIP is applicable, is generated in advance through online and offline learning.

[0168] The matrix applied to blocks smaller than the maximum block size to which MIP is applicable (e.g., 32*32) can be determined by sampling some elements within the reference matrix F_max(x, y, k) according to a subsampling rate determined based on the size of the largest reference block and the size of the current block. If the maximum block size to which MIP is applicable is W_max*H_max and the current block size is W*H, the matrix F_W×H(x, y, k) used for the MIP of the current block can be obtained based on the following mathematical formula 4.

[0169]

[0170] According to mathematical formula 4, it can be determined which pixel location's weight vector to sample from among the pixel location-specific weight vectors included within the reference matrix F_max(x, y, k). For example, when the reference matrix F_max(x, y, k) is set for an 8*8 block and the current block size is 4*4, the weight vector applied to the predicted value of the pixel at position (a, b) within the current block is F_max(2a, 2b, k), namely the weight vector applied to the pixel at position (2a, 2b) among the weight vectors of the reference matrix F_max(x, y, k) applied to the 8*8 block. That is, the weight vector applied to the pixel at position (0,0) of the 8*8 reference block is used as the weight vector applied to the predicted value of the pixel at position (0,0) of the current block, the weight vector applied to the pixel at position (0,2) of the 8*8 reference block is used as the weight vector applied to the predicted value of the pixel at position (0,1) of the current block, the weight vector applied to the pixel at position (2,0) of the 8*8 reference block is used as the weight vector applied to the predicted value of the pixel at position (1,0) of the current block, and the weight vector applied to the pixel at position (2,2) of the 8*8 reference block is used as the weight vector applied to the predicted value of the pixel at position (1,1) of the current block.

[0171] In addition, each element included in the subsampled matrix is normalized according to the following mathematical formula 5 so that the sum of the elements (weights) applied to a specific position in the current block becomes a predetermined value (e.g., 1), and finally, the normalized subsampled matrix F'_W×H(x, y, k) can be used as the matrix for MIP prediction of the current block of size W*H.

[0172]

[0173] That is, when predicting the MIP of the current block of size W*H, a prediction sample (P(x,y)) at the (x,y) position within the current block is obtained using a normalized subsampled matrix according to the following mathematical formula 6.

[0174]
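The images for mathematical formulas 4 to 6 are not reproduced in this text. Forms consistent with the surrounding description (the (2a, 2b) sampling in the 8*8 to 4*4 example, division by the sum of the weights, and a weighted sum over reference samples) might be written as follows. The symbol R(k) for the k-th reference sample, K for the number of reference samples, and the assumption that W_max/W and H_max/H are integers are notation introduced here and are not taken from the disclosure.

```latex
\begin{align}
F_{W \times H}(x, y, k) &= F_{\max}\!\left(x \cdot \frac{W_{\max}}{W},\; y \cdot \frac{H_{\max}}{H},\; k\right) && \text{(cf. formula 4)} \\
F'_{W \times H}(x, y, k) &= \frac{F_{W \times H}(x, y, k)}{\sum_{j=0}^{K-1} F_{W \times H}(x, y, j)} && \text{(cf. formula 5)} \\
P(x, y) &= \sum_{k=0}^{K-1} F'_{W \times H}(x, y, k)\, R(k) && \text{(cf. formula 6)}
\end{align}
```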

[0175] For example, the process of setting the matrix used for MIP prediction of the current 4x4 block from the reference matrix set for an 8x8 block is explained as follows.

[0176] Assume that, among the weight vectors included in the reference matrix F_max(x, y, k) applied to an 8*8 block, the weight vectors applied to some pixel locations are as shown in Table 5 below.

[0177] Table 5

Position (x, y) within the reference block | Weight vector F_max(x, y, k)
(0,0) | [0.14, 0.10, 0.06, 0.04, 0.22, 0.20, 0.14, 1.10]
(2,0) | [0.12, 0.10, 0.08, 0.04, 0.23, 0.19, 0.14, 0.10]
(4,0) | [0.11, 0.09, 0.08, 0.05, 0.24, 0.20, 0.13, 0.10]
(6,0) | [0.10, 0.09, 0.07, 0.04, 0.25, 0.21, 0.13, 0.11]
(0,2) | [0.13, 0.09, 0.06, 0.05, 0.23, 0.20, 0.14, 0.10]
(2,2) | [0.12, 0.10, 0.07, 0.05, 0.23, 0.20, 0.14, 0.09]
(4,2) | [0.11, 0.09, 0.07, 0.05, 0.24, 0.20, 0.14, 0.10]
(6,2) | [0.10, 0.09, 0.08, 0.04, 0.25, 0.20, 0.14, 0.10]

[0178] As described above, the weight vector applied to the pixel at position (0,0) of the 8*8 reference block is used as the weight vector applied to the predicted value of the pixel at position (0,0) of the current block, the weight vector applied to the pixel at position (0,2) of the 8*8 reference block is used as the weight vector applied to the predicted value of the pixel at position (0,1) of the current block, the weight vector applied to the pixel at position (2,0) of the 8*8 reference block is used as the weight vector applied to the predicted value of the pixel at position (1,0) of the current block, and the weight vector applied to the pixel at position (2,2) of the 8*8 reference block is used as the weight vector applied to the predicted value of the pixel at position (1,1) of the current block. That is, according to the following Table 6, the weight vector applied to the pixel at the subsampling position of the reference block is used as the weight vector applied to the position of the corresponding 4*4 block.

[0179] Table 6

Position (x, y) within the 4*4 block | Corresponding position (x, y) within the reference block | Weight vector F_4×4(x, y, k)
(0,0) | (0,0) | [0.14, 0.10, 0.06, 0.04, 0.22, 0.20, 0.14, 1.10]
(1,0) | (2,0) | [0.12, 0.10, 0.08, 0.04, 0.23, 0.19, 0.14, 0.10]
(0,1) | (0,2) | [0.13, 0.09, 0.06, 0.05, 0.23, 0.20, 0.14, 0.10]
(1,1) | (2,2) | [0.12, 0.10, 0.07, 0.05, 0.23, 0.20, 0.14, 0.09]

[0180] Since the sum of the weights corresponding to the (1,0), (0,1), and (1,1) positions among the weight vectors in Table 6 is not 1, each weight is divided by the sum of the weights according to the aforementioned mathematical formula 5 to normalize the sum of all weights to 1, and through the normalization process for the subsampled matrix of Table 6, normalized subsampled weight vectors applied to each pixel position of a 4*4 block can be obtained as shown in the following Table 7.

[0181] Table 7

Position (x, y) within the 4*4 block | Sum of weights | Normalized weight vector F'_4×4(x, y, k)
(0,0) | 1.00 | [0.14, 0.10, 0.06, 0.04, 0.22, 0.20, 0.14, 1.10]
(1,0) | 0.90 | [0.133, 0.111, 0.089, 0.044, 0.256, 0.211, 0.156, 0.111]
(0,1) | 0.96 | [0.135, 0.094, 0.063, 0.052, 0.240, 0.208, 0.146, 0.104]
(1,1) | 0.90 | [0.133, 0.111, 0.078, 0.056, 0.256, 0.222, 0.156, 0.100]
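The subsample, normalize, and predict sequence described in this example can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the dictionary layout keyed by (x, y), the function names, and the tiny 4*4 reference matrix with two weights per position are all hypothetical and are used instead of the values in Tables 5 to 7 to keep the arithmetic easy to follow.

```python
def subsample(f_max, w_max, h_max, w, h):
    """Pick the weight vector at (x*W_max/W, y*H_max/H) for each (x, y)."""
    sx, sy = w_max // w, h_max // h  # assumes integer size ratios
    return {(x, y): list(f_max[(x * sx, y * sy)])
            for y in range(h) for x in range(w)}


def normalize(f):
    """Divide every weight by the sum of weights at its position."""
    return {pos: [v / sum(ws) for v in ws] for pos, ws in f.items()}


def predict(f_norm, ref_samples):
    """Weighted sum of reference samples at every position."""
    return {pos: sum(wk * rk for wk, rk in zip(ws, ref_samples))
            for pos, ws in f_norm.items()}


# Hypothetical 4*4 reference matrix with two weights per position.
f_max = {(x, y): [1.0 + x, 1.0 + y] for y in range(4) for x in range(4)}

f_2x2 = normalize(subsample(f_max, 4, 4, 2, 2))
pred = predict(f_2x2, [100.0, 200.0])
print(f_2x2[(1, 0)], pred[(1, 0)])
```

Position (1, 0) of the 2*2 block takes its weights from position (2, 0) of the reference block, mirroring the (a, b) to (2a, 2b) mapping in the text.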

[0182] Thus, in one embodiment, if the largest block size to which MIP is applicable is 32*32, only the reference matrix (F_max) applied to the 32*32 block size is generated in advance through online and offline learning, and the matrix applied to a block smaller than 32*32 can be determined by sampling some elements within the reference matrix (F_max) according to the subsampling ratio determined by the size of the largest block and the size of the current block. Additionally, each element included in the subsampled matrix is normalized such that the sum of the elements (weights) applied to a specific position in the current block becomes a specific value (e.g., 1), and finally, the normalized subsampled matrix can be used as the matrix for MIP prediction of the small block.

[0183] Meanwhile, the video encoder (100) may include information about the matrix in one of the data units such as sequence, GOP, frame, picture, slice, CTU, CU, TU, etc. and transmit it to the video decoder (200). For example, the video encoder (100) may form a matrix candidate group and signal information about the applied matrix candidate within the matrix candidate group, or signal a subsampling rate or normalization information regarding subsampling.

[0184] Meanwhile, in one embodiment, a prediction sample of the current block may be obtained through one of the following modes: i) a first mode in which a specific intra prediction mode among a plurality of intra prediction modes is replaced with an MIP mode and a prediction sample of the current block is obtained based on the intra prediction mode information of the current block; ii) a second mode in which the MIP mode is used as a standalone mode distinct from the intra prediction mode; and iii) a third mode in which, when the intra prediction mode of the current block corresponds to an intra prediction mode that has been replaced with the MIP mode, a prediction sample of the current block is obtained by combining a first prediction sample obtained based on the intra prediction mode information and a second prediction sample obtained according to the MIP mode.

[0185] The first mode is a mode in which a specific intra prediction mode among a plurality of intra prediction modes is replaced to point to the MIP mode as described above. If the intra prediction mode of the current block corresponds to the intra prediction mode replaced by the MIP mode, a prediction sample of the current block is obtained using MIP prediction, and if the intra prediction mode of the current block does not correspond to the intra prediction mode replaced by the MIP mode, a prediction sample of the current block is obtained according to a general intra prediction mode.

[0186] In the second mode, the MIP mode is set as a separate, independent mode distinct from the intra prediction mode. In the second mode, whether the prediction mode corresponds to the MIP mode may be implicitly determined based on other additional information, instead of the intra prediction mode information set to be replaced by the MIP mode as in the first mode described above. For example, whether the prediction mode of the current block corresponds to the MIP mode may be adaptively determined based on the size of the current block, adaptively determined based on the shape of the current block, or adaptively derived and set based on the slice type of the current block, QP, information on surrounding reference blocks above and to the left, template-based template cost, intra prediction tool information such as the current block's DIMD, TIMD, IntraTMP, and OBIC, MPM value, tool information used for intra / inter prediction of surrounding blocks, MRL information, reference value of the block vector, and L-shaped reference area value. That is, if the additional information satisfies a predetermined rule, the prediction mode of the current block may be set to the MIP mode.

[0187] Additionally, when the second mode is applied, the video encoder (100) may transmit information to the video decoder (200) indicating whether the prediction mode of the current block corresponds to the MIP mode. In this case, information indicating whether the prediction mode of the current block is the MIP mode may be signaled through separate information instead of intra-prediction mode information as in the first mode described above. The video decoder (200) can determine whether the prediction mode of the current block corresponds to the MIP mode through the information.

[0188] Additionally, the video encoder (100) can transmit information about the matrix used for MIP prediction to the video decoder (200) in data units such as sequence, GOP, frame, picture, slice, CTU, CU, TU, etc. The video encoder (100) can determine the optimal matrix among multiple matrices based on RDO and transmit information about the determined optimal matrix to the video decoder (200). Multiple matrices can be adaptively determined based on the size, shape, slice type, QP, information about surrounding reference blocks on the upper and left sides, template-based template cost, tool information related to intra prediction such as DIMD, TIMD, IntraTMP, OBIC of the current block, MPM, tool information used for intra / inter prediction of surrounding blocks, MRL information, block vector reference value, and the value of an L-shaped reference area. For example, if there are four applicable matrices depending on the size of the current block, the video encoder (100) can determine the optimal matrix among the four matrices based on RDO and include information pointing to the determined optimal matrix in the bitstream and transmit it to the video decoder (200).

[0189] Additionally, the video encoder (100) may first generate a set of matrix candidates as matrix information, determine an optimal matrix based on RDO among the matrix candidates included in the matrix candidate set, and transmit the determined matrix information to the video decoder (200). Specifically, the video encoder (100) may generate a set of matrix candidates using a default matrix candidate and a matrix candidate used in a surrounding block of the current block, and transmit index information pointing to the matrix applied to the current block from the matrix candidate set to the video decoder (200).

[0190] Similar to sorting techniques such as ARMC (Adaptive Reordering of Merge Candidate), candidate matrices included in a matrix candidate set can be sorted according to a cost obtained using a template matching method. That is, candidate matrices are sorted in ascending order according to the template matching cost, and index information pointing to the matrix applied to the current block in the sorted matrix candidate set can be signaled. One of various costs such as SAD, SSD, SATD, RM-SAD, and RM-SATD can be used as the template matching cost. The video decoder (200) can generate a matrix candidate set in the same way as the video encoder (100) and determine one matrix among the matrix candidate sets based on the transmitted matrix information.
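The candidate set construction and template-based reordering described above can be sketched as follows. The sketch is hypothetical in several respects: matrices are represented only by integer identifiers, `predict_template` stands in for whatever routine would predict the template samples with a given matrix, and SAD is used as the template matching cost (the text also permits SSD, SATD, RM-SAD, and RM-SATD).

```python
def build_candidate_set(default_ids, neighbor_ids):
    """Default candidates followed by neighbor candidates, duplicates dropped."""
    seen, candidates = set(), []
    for matrix_id in list(default_ids) + list(neighbor_ids):
        if matrix_id not in seen:
            seen.add(matrix_id)
            candidates.append(matrix_id)
    return candidates


def sad(a, b):
    """Sum of absolute differences between two sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def sort_by_template_cost(candidates, template, predict_template):
    """Ascending sort by template matching cost (stable for equal costs)."""
    return sorted(candidates, key=lambda m: sad(template, predict_template(m)))


# Hypothetical reconstructed template and per-matrix template predictions.
template = [10, 20, 30]
template_predictions = {0: [10, 25, 30], 1: [10, 20, 31], 2: [0, 0, 0]}

candidates = build_candidate_set([0, 1], [1, 2])
ordered = sort_by_template_cost(candidates, template, template_predictions.get)
signaled_index = ordered.index(0)  # index the encoder would signal for matrix 0
print(ordered, signaled_index)
```

Because the sort depends only on reconstructed template samples, a decoder running the same steps would arrive at the same ordering and can resolve the signaled index to the same matrix.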

[0191] In the third mode, if the intra prediction mode of the current block corresponds to an intra prediction mode that has been replaced by the MIP mode, a prediction sample of the current block is obtained by combining a first prediction sample obtained based on the intra prediction mode information and a second prediction sample obtained according to the MIP mode. This is because, depending on the image characteristics, the prediction value obtained through the intra prediction mode may have higher prediction efficiency than the prediction value based on MIP. Therefore, in the third mode, instead of unconditionally generating the prediction sample of the current block through the MIP mode as in the first mode, the first prediction sample and the second prediction sample are combined to obtain the prediction sample of the current block.

[0192] Whether the method of generating the prediction value of the current block corresponds to the third mode can be implicitly determined based on various additional information. For example, whether the third mode is applied can be adaptively determined based on the size of the current block, adaptively determined based on the shape of the current block, or adaptively derived and set based on the slice type of the current block, QP, information on surrounding reference blocks above and to the left, template-based template cost, intra prediction tool information such as DIMD, TIMD, IntraTMP, and OBIC of the current block, MPM value, tool information used for intra / inter prediction of surrounding blocks, MRL information, reference value of the block vector, and L-shaped reference area value. That is, when the additional information satisfies a predetermined rule, a prediction sample of the current block can be obtained by combining a first prediction sample obtained based on intra prediction mode information and a second prediction sample obtained according to the MIP mode.

[0193] In addition, a blending ratio, which is a weight applied when combining a first prediction sample and a second prediction sample, can be derived based on the above additional information. For example, if the first prediction sample is P1 and the second prediction sample is P2, the weights (W1, W2) multiplied by the first prediction sample and the second prediction sample are set based on the above additional information, and in this case, P1*W1+P2*W2 can be obtained as the prediction value of the current block.
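The combination P1*W1+P2*W2 can be illustrated sample by sample. This is a minimal sketch: the function name `blend` and the example weights 0.25 and 0.75 are assumptions, and the text does not state how the weights are derived from the additional information or whether they must sum to 1.

```python
def blend(p1, p2, w1, w2):
    """Combine two prediction blocks sample by sample as P1*W1 + P2*W2."""
    return [[a * w1 + b * w2 for a, b in zip(row1, row2)]
            for row1, row2 in zip(p1, p2)]


# Hypothetical 2*2 prediction blocks.
p1 = [[100, 104], [108, 112]]  # first prediction sample (intra prediction mode)
p2 = [[96, 100], [104, 108]]   # second prediction sample (MIP mode)

combined = blend(p1, p2, 0.25, 0.75)
print(combined)
```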

[0194] Additionally, when the third mode is applied, the video encoder (100) may transmit information to the video decoder (200) indicating whether the prediction value generation method of the current block corresponds to the third mode. The video encoder (100) may transmit the information to the video decoder (200) in data units such as sequence, GOP, frame, picture, slice, CTU, CU, TU, etc.

[0195] Meanwhile, the video encoder (100) can explicitly transmit, to the video decoder (200), information about which of the first mode, the second mode, and the third mode is applied to the prediction sample generation method of the current block. This information can be transmitted in data units such as sequence, GOP, frame, picture, slice, CTU, CU, TU, etc.

[0196] The mode applied to the prediction sample generation method of the current block among the first mode, second mode, and third mode can be adaptively set based on various additional information, and the video encoder (100) can transmit the additional information used to determine that mode to the video decoder (200) in data units such as sequence, GOP, frame, picture, slice, CTU, CU, TU, etc. For example, one of the first mode, second mode, and third mode may be adaptively determined according to the size of the current block, according to the shape of the current block, or based on at least one of the slice type of the current block, QP, information of surrounding reference blocks on the upper and left sides, template-based template cost, intra prediction tool information such as DIMD, TIMD, IntraTMP, and OBIC of the current block, MPM value, tool information used for intra / inter prediction of surrounding blocks, MRL information, reference value of block vector, and L-shaped reference area value. For example, if the mode is adaptively determined according to the current block size, the first mode may be selected for blocks of size 4*4 to 16*16, the second mode for blocks of size 16*16 to 32*32, and the third mode for blocks of size 32*32 to 256*256. In this case, the video encoder (100) may encode information about the block size range used for determining these modes and transmit it to the video decoder (200).

[0197] Meanwhile, as described above, in one embodiment, when the intra prediction mode of the current block is signaled through an MPM list, whether the prediction mode of the current block is an MIP mode can be determined by whether it corresponds to a specific intra prediction mode that has been replaced to point to an MIP mode within the MPM list. In this case, if the prediction mode of the current block corresponds to a specific intra prediction mode that has been replaced to point to an MIP mode within the MPM list, the prediction sample of the current block is obtained according to the MIP mode.

[0198] In another embodiment, if the prediction mode of the current block corresponds to a specific intra prediction mode that is replaced to point to the MIP mode within the MPM list, the prediction sample of the current block can be obtained by combining a first prediction sample obtained based on the intra prediction mode and a second prediction sample obtained according to the MIP mode, as in the third mode described above. For example, assume that the MPM list is {3, 6, 9}, that intra prediction mode 6 among these is an intra prediction mode replaced by the MIP mode, and that the intra prediction mode of the current block is intra prediction mode 6 included in the MPM list. In that case, instead of using the prediction value generated according to the MIP mode as the prediction sample of the current block, the prediction sample of the current block can be obtained by combining a first prediction sample, which is an intra prediction value obtained according to intra prediction mode 6, and a second prediction sample obtained according to the MIP mode. If the first prediction sample is denoted P1, the second prediction sample P2, and the weights multiplied by the first and second prediction samples W1 and W2, then P1*W1+P2*W2 can be obtained as the prediction sample of the current block.

[0199] Meanwhile, as described above, a PMPM (Primary MPM) including the top 6 intra prediction modes among the 22 intra prediction modes in the MPM list may be configured, or a PMPM including the top 5 intra prediction modes and a Planar mode may be configured. The remaining 16 intra prediction modes not included in the PMPM constitute an SMPM (Secondary MPM). When the MPM is configured with a PMPM and an SMPM in this way, the method of generating prediction samples for the current block may change depending on whether the prediction mode of the current block is included in the PMPM or the SMPM.

[0200] For example, if the prediction mode of the current block corresponds to a specific intra prediction mode that is replaced to point to the MIP mode within the PMPM list, a prediction sample of the current block can be obtained by combining a first prediction sample obtained based on the intra prediction mode and a second prediction sample obtained according to the MIP mode, as described in the third mode above. Alternatively, this combination may be applied when the prediction mode of the current block corresponds to such a replaced intra prediction mode within the SMPM list, or when it corresponds to such a replaced intra prediction mode within either the PMPM list or the SMPM list.

[0201] Information regarding the method of generating prediction samples for the current block, depending on whether the prediction mode of the current block is included in PMPM or SMPM, may be set implicitly or explicitly. When set implicitly, the information may be determined based on at least one of the following: the size of the current block, the shape of the current block, the slice type of the current block, QP, information on surrounding reference blocks on the upper and left sides, template-based template cost, intra prediction tool information such as DIMD, TIMD, IntraTMP, OBIC of the current block, MPM value, tool information used for intra / inter prediction of surrounding blocks, MRL information, reference value of the block vector, and L-shaped reference area value. When set explicitly, the video encoder (100) may transmit the information to the video decoder (200) in data units such as sequence, GOP, frame, picture, slice, CTU, CU, TU, etc.

[0202] In one embodiment, for all intra prediction modes included in the MPM list, a prediction sample of the current block can be obtained by combining a first prediction sample obtained based on an intra prediction mode and a second prediction sample obtained according to the MIP mode, as described in the third mode above. For example, if the MPM list is {3, 6, 9} and the intra prediction mode of the current block corresponds to one of the intra prediction modes 3, 6, and 9 included in the MPM list, a prediction sample of the current block can be obtained by combining a first prediction sample, which is an intra prediction value obtained according to that intra prediction mode, and a second prediction sample obtained according to the MIP mode.

[0203] In one embodiment, when the MPM is composed of PMPM and SMPM, a specific intra prediction mode included in the PMPM or SMPM may be changed to an MIP mode. In this case, the specific intra prediction mode that is replaced by the MIP mode may be determined based on at least one of the following: the size of the current block, the shape of the current block, the slice type of the current block, QP, information on the upper and left surrounding reference blocks, the template-based template cost, intra prediction tool information such as the current block's DIMD, TIMD, IntraTMP, and OBIC, the MPM value, tool information used for intra / inter prediction of surrounding blocks, MRL information, the reference value of the block vector, and the L-shaped reference area value. For example, if the intra prediction mode of the upper surrounding block and the left surrounding block included in the PMPM list is not an MIP mode, the intra prediction mode of the upper surrounding block and the left surrounding block may be changed to an MIP mode.

[0204] In one embodiment, a specific intra prediction mode that is replaced by an MIP mode can be determined based on history. For example, among the intra prediction modes included in the MPM list, the frequency of intra prediction modes that were previously replaced by an MIP mode is calculated, and the intra prediction modes with the highest frequency can be replaced by an MIP mode.
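The history-based selection can be sketched with a frequency count. The function name `select_modes_by_history`, the representation of the history as a plain list of previously replaced mode indices, and the parameter `num_modes` are assumptions for illustration; the text does not specify how the history is stored or how many modes are chosen.

```python
from collections import Counter


def select_modes_by_history(mpm_list, replaced_history, num_modes=1):
    """Pick the MPM-list modes most often replaced by the MIP mode so far."""
    # Count only history entries that are present in the current MPM list.
    counts = Counter(m for m in replaced_history if m in mpm_list)
    return [mode for mode, _ in counts.most_common(num_modes)]


# Hypothetical history of intra modes that were previously replaced by MIP.
history = [6, 6, 10, 34, 6, 10]
print(select_modes_by_history([2, 6, 10, 18], history))
print(select_modes_by_history([2, 6, 10, 18], history, num_modes=2))
```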

[0205] In one embodiment, if the intra prediction mode used in the reference block retrieved through IntraTMP is the MIP mode, the intra prediction mode used in the reference block may be replaced with the MIP mode.

[0206] Meanwhile, when MIP mode is used, the way in which existing tools that rely on intra prediction mode information use that information may change. For example, in transform coding methods such as MTS (Multiple Transform Selection), LFNST (Low-Frequency Non-Separable Transform), and NSPT (Non-Separable Primary Transform), the method of selecting the kernel set varies depending on the intra prediction mode. In this case, i) kernel sets for MTS, LFNST, and NSPT may be set based on the existing intra prediction mode by inheriting the existing intra prediction mode, ii) kernel sets may be set through a separate mapping table when MIP mode is set as a standalone mode without utilizing existing intra prediction mode information, such as in the second mode, or iii) separate kernel sets for MTS, LFNST, and NSPT suited to the prediction values according to MIP mode may be set and applied to transform coding, because the characteristics of the prediction values according to MIP mode may differ from the characteristics of the existing intra prediction values.

[0207] In addition, CIIP (Combined Inter and Intra Prediction), GPM (Geometric Partitioning Mode)-Intra, and TIMD (Template-Based Intra Mode Derivation) are techniques that improve prediction performance by utilizing intra prediction. When performing intra prediction in these techniques, intra prediction can be performed according to one of the aforementioned first and third modes. That is, in the case where the intra prediction mode of the current block is replaced by the MIP mode, the prediction value generated according to the MIP mode can be used as the intra prediction value in CIIP, GPM-Intra, TIMD, etc., instead of the prediction value according to the existing intra prediction mode as in the first mode, or the prediction value obtained by combining the first prediction value according to the existing intra prediction mode and the second prediction value generated according to the MIP mode as in the third mode can be used as the intra prediction value in CIIP, GPM-Intra, TIMD, etc.

[0208] Information regarding which of the first mode and the third mode is applied to obtain the intra prediction value in CIIP, GPM-Intra, TIMD, etc. may be set implicitly or explicitly. When set implicitly, the information can be determined based on at least one of the following: the size of the current block, the shape of the current block, the slice type of the current block, QP, information on surrounding reference blocks on the upper and left sides, the template-based template cost, information on intra prediction tools such as DIMD, TIMD, IntraTMP, OBIC, etc. of the current block, the MPM value, information on tools used for intra / inter prediction of surrounding blocks, MRL information, the reference value of the block vector, and the L-shaped reference area value. When set explicitly, the video encoder (100) can transmit the information to the video decoder (200) in data units such as sequence, GOP, frame, picture, slice, CTU, CU, TU, etc.

[0209] FIG. 14 is a flowchart of a video encoding method according to one embodiment of the present disclosure.

[0210] Referring to FIG. 14, the video encoder (100) obtains a prediction sample of the current block by applying a plurality of prediction modes to the current block (S1410). Then, the video encoder (100) determines the prediction mode of the current block based on the cost of the prediction sample. The video encoder (100) can apply various prediction modes and determine the optimal prediction mode based on Rate-Distortion Optimization (RDO).

[0211] The video encoder (100) encodes the prediction mode information of the determined current block (S1430).

[0212] As described above, the intra prediction unit (161) can perform intra prediction for the current block by applying a plurality of intra prediction modes, including an MIP mode, to the current block. If the prediction mode of the current block is determined to be an MIP mode and one or more of the plurality of intra prediction mode indices are set to point to the MIP mode, the prediction mode information of the current block can be set to one of the one or more intra prediction mode indices pointing to the MIP mode.

[0213] As described above, whether the prediction mode of the current block is MIP mode can be signaled through a specific intra prediction mode. Additionally, depending on the size of the current block, the specific intra prediction mode can be configured to be replaced by the MIP mode in different ways. If the prediction mode of the current block is MIP mode, specific intra prediction mode information replaced to point to the MIP mode can be encoded as the intra prediction mode information of the current block.

[0214] Additionally, a specific intra prediction mode within an MPM list or a non-MPM list may be replaced with an MIP mode, in which case the intra prediction mode information replaced to point to the MIP mode may be encoded as the prediction mode of the current block. If the intra prediction mode of the current block is signaled through an MPM list, and the prediction mode of the current block is an MIP mode, then the specific intra prediction mode information replaced to point to the MIP mode within the MPM list or a non-MPM list may be encoded as the intra prediction mode information of the current block.

[0215] The video encoder (100) includes information about the MIP in one of the data units such as sequence, GOP, frame, picture, slice, CTU, CU, TU, etc. and transmits it to the video decoder (200). If the current block is a CU, the video encoder (100) may include the intra prediction mode information in the syntax element of the CU.

[0216] FIG. 15 is a flowchart of a video decoding method according to one embodiment of the present disclosure.

[0217] The video decoder (200) obtains prediction mode information of the current block from the bitstream (S1510). Then, the video decoder (200) determines whether the prediction mode of the current block corresponds to the MIP mode based on the prediction mode information (S1520). When determining whether it corresponds to the MIP mode, the video decoder (200) determines whether the size of the current block corresponds to a size where MIP is applicable, and can determine whether the prediction mode of the current block corresponds to the MIP mode only if the size of the current block corresponds to a size where MIP is applicable.
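The size-gated check in step S1520 could be sketched as follows. This is a minimal illustration only: the function names, the applicable size range of 4 to 64 samples, and the representation of the MIP-pointing indices as a set are all assumptions and are not specified in the disclosure.

```python
from typing import Set


def is_mip_applicable(width: int, height: int, min_size: int = 4, max_size: int = 64) -> bool:
    """Hypothetical size rule: MIP is allowed only within an assumed size range."""
    return min_size <= width <= max_size and min_size <= height <= max_size


def is_mip_mode(mode_index: int, width: int, height: int, mip_mode_indices: Set[int]) -> bool:
    """Decide whether the decoded mode index denotes the MIP mode.

    The index is compared against the MIP-pointing indices only when the
    block size permits MIP; otherwise it is treated as a regular intra mode.
    """
    if not is_mip_applicable(width, height):
        return False
    return mode_index in mip_mode_indices
```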

[0218] As described above, whether the prediction mode of the current block is the MIP mode can be signaled through a specific intra prediction mode. Additionally, a specific intra prediction mode within an MPM list or a non-MPM list can be replaced with the MIP mode. In this case, if the intra prediction mode of the current block corresponds to the specific intra prediction mode that has been replaced to point to the MIP mode within the MPM list or the non-MPM list, the prediction mode of the current block can be determined to be the MIP mode.

[0219] As described above, the intra prediction unit (261) sets an MPM list using the intra prediction modes of a preset number of surrounding blocks at preset positions relative to the current block. The intra prediction unit (261) converts (replaces) some of the intra prediction modes included in the MPM list into MIP modes, and determines the MPM list by selectively adding the converted MIP modes to the MPM list while considering redundancy.

[0220] As shown in Table 4 above, the intra prediction unit (261) can i) change the intra prediction mode (Intra_Prediction_Mode) to the converted MIP mode (MIP_Mode) according to MIP_Mode={((Intra_Prediction_Mode-2)+1>>1)*2+2} if the width and height of the current block are less than or equal to a predetermined size, and ii) otherwise change the intra prediction mode (Intra_Prediction_Mode) to the converted MIP mode (MIP_Mode) according to MIP_Mode={((Intra_Prediction_Mode-2)+2>>1)*4+2}.
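As a worked illustration of the two conversion formulas in paragraph [0220], the following Python sketch evaluates them directly. It assumes C-style operator precedence, so that the addition is performed before the right shift, assumes the input is an angular mode index of 2 or greater, and uses a hypothetical predetermined size of 8; none of these assumptions is confirmed by the disclosure.

```python
def convert_to_mip_mode(intra_prediction_mode: int, width: int, height: int,
                        size_threshold: int = 8) -> int:
    """Map an intra prediction mode index to the converted MIP mode index."""
    if intra_prediction_mode < 2:
        raise ValueError("this sketch assumes an angular mode index (>= 2)")
    offset = intra_prediction_mode - 2
    if width <= size_threshold and height <= size_threshold:
        # i) MIP_Mode = (((Intra_Prediction_Mode - 2) + 1) >> 1) * 2 + 2
        return ((offset + 1) >> 1) * 2 + 2
    # ii) MIP_Mode = (((Intra_Prediction_Mode - 2) + 2) >> 1) * 4 + 2
    return ((offset + 2) >> 1) * 4 + 2
```

For example, under these assumptions mode 5 in a 4x4 block maps to 6, while the same mode in a 16x16 block maps to 10.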

[0221] The intra prediction unit (261) can skip the process of adding the converted MIP mode to the MPM list if the converted MIP mode is already included in the existing MPM list. Additionally, as described above, the intra prediction unit (261) can apply the prediction modes included in the MPM list to the template area to obtain a template cost, such as the SATD, and sort the prediction modes included in the MPM list in ascending order of the template cost.
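The MPM list construction in paragraphs [0219] to [0221] can be sketched roughly as follows. The function name `build_mpm_list`, the list capacity of 6, the choice to attempt conversion for every neighbor-derived mode, and the caller-supplied `convert` and `template_cost` callables are assumptions for illustration; the disclosure states only that some modes are converted, that redundant entries are skipped, and that the list is sorted by ascending template cost.

```python
from typing import Callable, Iterable, List


def build_mpm_list(neighbor_modes: Iterable[int],
                   convert: Callable[[int], int],
                   template_cost: Callable[[int], float],
                   max_size: int = 6) -> List[int]:
    """Build an MPM list with converted MIP modes, then sort by template cost."""
    mpm: List[int] = []
    # Start from the neighbor modes, dropping duplicates.
    for mode in neighbor_modes:
        if mode not in mpm and len(mpm) < max_size:
            mpm.append(mode)
    # Add the converted MIP mode of each entry unless it is already present.
    for mode in list(mpm):
        mip_mode = convert(mode)
        if mip_mode in mpm:
            continue  # redundancy check: skip the addition
        if len(mpm) < max_size:
            mpm.append(mip_mode)
    # Ascending template cost (for example, SATD on the template area).
    return sorted(mpm, key=template_cost)
```

Sorting by template cost places the modes that best predict the already reconstructed template at the front of the list, where they receive the shortest MPM indices.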

[0222] And, the intra prediction unit (261) obtains MPM index information pointing to one of the intra prediction modes in the MPM list as intra prediction mode information of the current block from the bitstream, and if the MPM index of the current block points to the MIP mode included in the MPM list, the prediction mode of the current block can be determined to be the MIP mode.

[0223] The intra prediction unit (261) of the video decoder (200) sets a reference area used for MIP based on the size of the current block when the prediction mode of the current block determined based on the prediction mode information is the MIP prediction mode (S1530).

[0224] As described above, when the prediction mode of the current block is the MIP mode, the intra prediction unit (261) may set a reference area based on the size of the current block or, when information regarding the reference area is explicitly signaled, parse that information from the bitstream to set the reference area. Additionally, based on the size of the current block, the size of the upper right surrounding pixels and the number of lower left surrounding pixels included in the reference area of the current block may be determined. For example, when the size of the current block is W×H, the surrounding pixels included in the reference area may be determined using 2W upper surrounding pixels and 2H left surrounding pixels. In this case, the number of surrounding pixels included in the reference area may be limited to a predetermined threshold or fewer (e.g., 50 or fewer).
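The reference area example in paragraph [0224] (2W upper pixels, 2H left pixels, capped at a threshold) might be sketched as below. The coordinate convention, with (0, 0) as the top-left sample of the current block, and the use of uniform subsampling to respect the threshold are assumptions of this sketch; the disclosure does not state how the limit is enforced.

```python
from typing import List, Tuple


def reference_positions(width: int, height: int, max_count: int = 50) -> List[Tuple[int, int]]:
    """Return (x, y) positions of the surrounding pixels in the reference area."""
    # 2W pixels in the row directly above the block (includes the upper-right part).
    top = [(x, -1) for x in range(2 * width)]
    # 2H pixels in the column directly left of the block (includes the lower-left part).
    left = [(-1, y) for y in range(2 * height)]
    positions = top + left
    # Assumed limiting rule: keep every step-th pixel so the count stays within max_count.
    step = -(-len(positions) // max_count)  # ceiling division
    return positions[::step]
```

Under these assumptions a 4x4 block keeps all 16 surrounding pixels, while a 16x16 block has 64 candidates and keeps every second one.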

[0225] Then, the intra prediction unit (261) determines the matrix used for MIP (S1540). As described above, the matrix used for MIP can be determined based on at least one of the current block size and the intra prediction mode. Additionally, the matrix used for MIP can be obtained by subsampling a reference matrix set for a block of reference size based on the current block size.
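The subsampling of a reference matrix mentioned in paragraph [0225] could look like the following minimal sketch. It assumes square block sizes, a reference size that is an integer multiple of the current size, and a uniform stride over both rows and columns; the disclosure does not specify which elements are retained, so this is one possible reading only.

```python
from typing import List


def subsample_matrix(ref_matrix: List[List[int]], ref_size: int, cur_size: int) -> List[List[int]]:
    """Derive a smaller matrix from a reference matrix by uniform subsampling."""
    if cur_size <= 0 or ref_size % cur_size != 0:
        raise ValueError("ref_size must be a positive multiple of cur_size")
    step = ref_size // cur_size
    # Keep every step-th row and, within each kept row, every step-th element.
    return [row[::step] for row in ref_matrix[::step]]
```

Deriving the smaller matrices from a single stored reference matrix avoids storing a separate matrix for every block size.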

[0226] Then, the intra prediction unit (261) obtains a prediction sample for each pixel location within the current block based on the set reference area and matrix (S1550). As described above, the prediction sample for each pixel location can be obtained by calculating the weighted sum of the reference samples within the reference area using the weight vector set for that pixel location.
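Step S1550 amounts to a matrix-vector product, which the following Python sketch illustrates. It assumes that each row of the matrix is the weight vector for one pixel location in raster order and that the reference samples have already been gathered into a flat vector; the function name `mip_predict` and the floating-point weights are illustrative choices, not details from the disclosure.

```python
from typing import List, Sequence


def mip_predict(matrix: Sequence[Sequence[float]], reference: Sequence[float],
                width: int, height: int) -> List[List[float]]:
    """Compute one prediction sample per pixel as a weighted sum of reference samples."""
    if len(matrix) != width * height:
        raise ValueError("matrix needs one weight vector per pixel location")
    if any(len(row) != len(reference) for row in matrix):
        raise ValueError("each weight vector must match the reference length")
    # Weighted sum for every pixel location, in raster order.
    flat = [sum(w * r for w, r in zip(row, reference)) for row in matrix]
    # Reshape the flat result into height rows of width samples.
    return [flat[y * width:(y + 1) * width] for y in range(height)]
```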

[0227] The methods described above in this specification may be performed by a processor of a video encoder or a video decoder. Additionally, the encoder may generate a bitstream that is decoded by a video signal processing method, and the bitstream generated by the encoder may be stored in a computer-readable non-transitory storage medium (recording medium).

[0228] The embodiments of the present invention described above may be implemented through various means. For example, the embodiments of the present invention may be implemented by hardware, firmware, software, or a combination thereof.

[0229] Some embodiments may also be implemented in the form of a recording medium containing computer-executable instructions, such as program modules executed by a computer. A computer-readable medium may be any available medium accessible by a computer and includes both volatile and non-volatile media, and both removable and non-removable media.

[0230] Additionally, computer-readable media may include both computer storage media and communication media. Computer storage media include both volatile and non-volatile, removable and non-removable media implemented by any method or technique for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically include computer-readable instructions, data structures, program modules, or other data in a modulated data signal, or other transmission mechanisms, and include any information transmission media.

Claims

1. A video decoding method comprising: a step of obtaining prediction mode information of a current block; a step of determining, based on the prediction mode information, whether the prediction mode of the current block corresponds to a Matrix-based Intra Prediction (MIP) mode; a step of setting, if it is determined that the prediction mode of the current block corresponds to the MIP mode, a reference region used for the MIP based on the size of the current block; a step of determining a matrix used for the MIP; and a step of obtaining a prediction sample of the current block based on the reference region and the matrix, wherein the step of determining whether the prediction mode corresponds to the MIP mode determines the prediction mode of the current block to be the MIP mode when one or more intra prediction mode indices among a plurality of intra prediction mode indices are set to point to the MIP mode and the intra prediction mode information of the current block points to the MIP mode.

2. The video decoding method of claim 1, wherein the step of determining whether the prediction mode corresponds to the MIP mode further comprises a step of determining whether the size of the current block corresponds to a size to which the MIP is applicable, and, if the size of the current block corresponds to a size to which the MIP is applicable, determines whether the prediction mode of the current block corresponds to the MIP mode.

3. The video decoding method of claim 1, further comprising: a step of setting an MPM (Most Probable Mode) list using the intra prediction modes of a preset number of surrounding blocks at preset positions relative to the current block; a step of converting some of the intra prediction modes included in the MPM list into the MIP mode; and a step of determining the MPM list by selectively adding the converted MIP mode to the MPM list while considering redundancy, wherein the intra prediction mode information of the current block is MPM index information pointing to one of the intra prediction modes in the MPM list, and wherein the step of determining whether the prediction mode corresponds to the MIP mode determines the prediction mode of the current block to be the MIP mode when the MPM index of the current block points to an MIP mode included in the MPM list.

4. The video decoding method of claim 3, wherein the step of determining the MPM list skips the process of adding the converted MIP mode to the MPM list when the converted MIP mode is already included in the existing MPM list.

5. The video decoding method of claim 3, wherein, when an intra prediction mode included in the MPM list is denoted Intra_Prediction_Mode and the converted MIP mode is denoted MIP_Mode, the step of converting into the MIP mode i) changes Intra_Prediction_Mode to the converted MIP_Mode according to the formula MIP_Mode={((Intra_Prediction_Mode-2)+1>>1)*2+2} if the width and height of the current block are less than or equal to a predetermined size, and ii) otherwise changes Intra_Prediction_Mode to the converted MIP_Mode according to the formula MIP_Mode={((Intra_Prediction_Mode-2)+2>>1)*4+2}.

6. The video decoding method of claim 3, wherein the step of determining the MPM list further comprises: a step of obtaining a template cost by applying the prediction modes included in the MPM list to a template area; and a step of sorting the prediction modes included in the MPM list in ascending order of the template cost.

7. The video decoding method of claim 6, wherein the template area includes an upper template area of a predetermined size adjacent to the upper side of the current block and a left template area of a predetermined size adjacent to the left side of the current block.

8. The video decoding method of claim 6, wherein a predicted sample of the template area is obtained from surrounding reference samples of the template area by applying a prediction mode included in the MPM list to the template area, and the template cost is obtained using the Sum of Absolute Transformed Difference (SATD) between the predicted sample of the template area and the template area.

9. The video decoding method of claim 1, wherein the step of setting the reference region determines, based on the size of the current block, the size of the upper-right surrounding pixels and the number of lower-left surrounding pixels of the current block included in the reference region, and wherein the number of surrounding pixels included in the reference region is less than or equal to a predetermined threshold.

10. The video decoding method of claim 1, wherein the step of determining the matrix used for the MIP obtains a subsampled matrix used for the MIP of the current block by subsampling, based on the size of the current block, elements included in a reference matrix (Fmax) set for the largest block size available for the MIP.

11. The video decoding method of claim 1, further comprising a step of determining one of the following modes: i) a first mode in which a specific intra prediction mode among a plurality of intra prediction modes is replaced by an MIP mode and a prediction sample of the current block is obtained based on intra prediction mode information of the current block; ii) a second mode in which the MIP mode is used as a standalone mode distinct from the intra prediction mode; and iii) a third mode in which, when the intra prediction mode of the current block corresponds to an intra prediction mode that has been replaced by an MIP mode, a prediction sample of the current block is obtained by combining a first prediction sample obtained based on intra prediction mode information and a second prediction sample obtained according to the MIP mode.

12. A video decoding device that performs the method of any one of claims 1 to 11.

13. A computer-readable recording medium storing a program for executing the method of any one of claims 1 through 11 on a computer.

14. A video encoding method comprising: a step of obtaining prediction samples of a current block by applying a plurality of prediction modes to the current block; a step of determining the prediction mode of the current block based on the cost of the prediction samples; and a step of encoding prediction mode information of the current block, wherein the step of obtaining the prediction samples of the current block includes a step of performing matrix-based intra prediction (MIP), and wherein, when the prediction mode of the current block is determined to be an MIP mode and one or more intra prediction mode indices among a plurality of intra prediction mode indices are set to point to the MIP mode, the prediction mode information of the current block is set to one of the one or more intra prediction mode indices pointing to the MIP mode.

15. A video encoding device that performs the method of claim 14.

16. A computer-readable recording medium storing a program for executing the method of claim 14 on a computer.