Intra prediction method and apparatus
By deriving the intra-frame prediction mode and applying filtering and correction techniques, the invention addresses low intra-frame prediction efficiency and improves image encoding and decoding efficiency, particularly for sub-block unit prediction and intra-frame prediction from multiple pixel lines.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP LTD
- Filing Date
- 2019-06-25
- Publication Date
- 2026-06-26
AI Technical Summary
Existing image encoding and decoding techniques are inefficient in intra-frame prediction, making it difficult to effectively improve encoding/decoding efficiency.
The intra-prediction mode of the current block is derived, a pixel line is determined from among multiple pixel lines, and intra-prediction is performed. Filtering and correction are applied selectively based on coding parameters, the pixel line position, and the intra-prediction mode; prediction can be performed in sub-block units; and the intra-prediction mode is derived using a default mode or multiple MPM candidates.
Sub-block unit prediction, reference pixel filtering, and prediction pixel correction improve the encoding/decoding efficiency of intra-frame prediction.
Figure CN116437082B_ABST
Description
[0001] Divisional Application Statement
[0002] This application is a divisional application of Chinese Patent Application No. 201980042368.5, entitled "Intra-frame Prediction Method and Apparatus", which entered the Chinese national phase based on PCT international patent application PCT/KR2019/007651 filed on June 25, 2019.
[0003] Cross-reference to related applications
[0004] This application is based on and claims priority to Korean Patent Application No. 10-2018-0072558, filed on June 25, 2018, the entire contents of which are incorporated herein by reference.
[0005] This application is based on and claims priority to Korean Patent Application No. 10-2018-0076783, filed on July 2, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
[0006] This invention relates to an image encoding and decoding technique, and more specifically, to a method and apparatus for encoding / decoding intra-frame prediction.
Background Technology
[0007] Recently, there has been an increasing demand for high-resolution, high-quality images in various application fields, such as high-definition (HD) images and ultra-high-definition (UHD) images. As a result, high-efficiency image compression technologies are being developed.
[0008] Image compression techniques include inter-frame prediction techniques that predict pixel values in the current image from images before or after the current image; intra-frame prediction techniques that use pixel information in the current image to predict pixel values in the current image; and entropy coding techniques that assign short codes to frequently occurring information and long codes to infrequently occurring information. These image compression techniques can be used to effectively compress image data for transmission or storage.
Summary of the Invention
[0009] Technical problem
[0010] This invention relates to an image encoding and decoding technique, and more specifically, to a method and apparatus for encoding / decoding intra-frame prediction.
[0011] Technical solution
[0012] According to the intra-prediction method and apparatus of the present invention, it is possible to derive the intra-prediction mode of the current block, determine the pixel lines among a plurality of pixel lines for intra-prediction of the current block, and perform intra-prediction of the current block based on the intra-prediction mode and the determined pixel lines.
[0013] According to the intra-frame prediction method and apparatus of the present invention, filtering can be performed on the first reference pixel of the determined pixel line.
[0014] According to the intra-frame prediction method and apparatus of the present invention, filtering can be selectively performed based on a first flag indicating whether filtering is performed on a first reference pixel used for intra-frame prediction.
[0015] According to the intra-frame prediction method and apparatus of the present invention, the first flag can be derived by the decoding apparatus based on the coding parameters of the current block. The coding parameters may include at least one of block size, component type, intra-frame prediction mode, or whether to apply intra-frame prediction at the sub-block level.
[0016] According to the intra-frame prediction method and apparatus of the present invention, the predicted pixels of the current block based on the intra-frame prediction can be corrected.
[0017] According to the intra-frame prediction method and apparatus of the present invention, the correction may further include: determining at least one of a second reference pixel or a weighting value for the correction based on the position of the predicted pixel of the current block.
[0018] According to the intra-prediction method and apparatus of the present invention, the correction step can be selectively performed by considering at least one of the following: the position of the pixel line of the current block, the intra-prediction mode of the current block, or whether to perform intra-prediction on a sub-block basis.
[0019] According to the intra-frame prediction method and apparatus of the present invention, the intra-frame prediction can be performed on a sub-block basis of the current block, and the sub-block is determined based on at least one of a second flag regarding whether to perform segmentation, segmentation direction information, or segmentation quantity information.
[0020] According to the intra-frame prediction method and apparatus of the present invention, the intra-frame prediction mode can be derived based on a predetermined default mode or multiple MPM candidates.
[0021] Technical effect
[0022] According to the present invention, encoding / decoding efficiency can be improved by predicting sub-block units.
[0023] According to the present invention, the encoding / decoding efficiency of intra-frame prediction can be improved by intra-frame prediction based on multiple pixel lines.
[0024] According to the present invention, the encoding / decoding efficiency of intra-frame prediction can be improved by performing filtering on reference pixels.
[0025] According to the present invention, the encoding / decoding efficiency of intra-frame prediction can be improved by correcting intra-frame prediction pixels.
[0026] According to the present invention, the encoding / decoding efficiency of the intra-prediction mode can be improved by deriving the intra-prediction mode based on a default mode or MPM candidates.
Description of the Drawings
[0027] Figure 1 is a block diagram of an image encoding device according to an embodiment of the present invention.
[0028] Figure 2 is a block diagram of an image decoding device according to an embodiment of the present invention.
[0029] Figure 3 is a schematic diagram illustrating the shape of a tree-based block.
[0030] Figure 4 is a schematic diagram illustrating the block shape based on type.
[0031] Figure 5 is a schematic diagram showing various block shapes that can be obtained through the block segmentation unit of the present invention.
[0032] Figure 6 is a schematic diagram illustrating tree-based segmentation according to an embodiment of the present invention.
[0033] Figure 7 is a schematic diagram illustrating tree-based segmentation according to an embodiment of the present invention.
[0034] Figure 8 illustrates the block segmentation process according to an embodiment of the present invention.
[0035] Figure 9 is a schematic diagram illustrating the intra-frame prediction modes predefined in an image encoding / decoding device.
[0036] Figure 10 shows an example in which pixels are compared across color spaces to obtain correlation information.
[0037] Figure 11 is a schematic diagram illustrating the configuration of reference pixels used for intra-frame prediction.
[0038] Figure 12 is a schematic diagram illustrating the range of reference pixels used for intra-frame prediction.
[0039] Figure 13 is a diagram showing the blocks adjacent to the current block that are used to generate the prediction block.
[0040] Figures 14 and 15 are partial examples used to confirm the segmentation information for each block.
[0041] Figure 16 is a schematic diagram illustrating various scenarios of segmented blocks.
[0042] Figure 17 shows an example of a segmented block according to an embodiment of the present invention.
[0043] Figure 18 shows various examples of intra-prediction mode candidate groups with respect to the setting of blocks that generate prediction information (in this example, the prediction block is 2N×N).
[0044] Figure 19 shows various examples of intra-prediction mode candidate groups with respect to the setting of blocks that generate prediction information (in this example, the prediction block is N×2N).
[0045] Figure 20 shows an example of a segmented block according to an embodiment of the present invention.
[0046] Figures 21 and 22 show various examples of intra-prediction mode candidate groups with respect to the setting of blocks that generate prediction information.
[0047] Figures 23 to 25 show examples of generating prediction blocks based on the prediction modes of neighboring blocks.
[0048] Figure 26 is a diagram illustrating the relationship between the current block and its neighboring blocks.
[0049] Figures 27 and 28 show intra-frame prediction considering the directionality of the prediction mode.
[0050] Figure 29 is a schematic diagram illustrating the configuration of reference pixels used for intra-frame prediction.
[0051] Figures 30 to 35 are schematic diagrams regarding the configuration of reference pixels.
Detailed Description
[0052] According to the intra-prediction method and apparatus of the present invention, an intra-prediction mode for a current block can be derived, a pixel line for intra-prediction of the current block can be determined from among a plurality of pixel lines, and intra-prediction of the current block can be performed based on the intra-prediction mode and the determined pixel line.
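The flow described above (derive a mode, choose one of several reference pixel lines, then predict) can be sketched as follows. This is a minimal illustration only, assuming a purely vertical prediction mode and a hypothetical data layout in which `ref_lines_above[k]` is the reconstructed row located k+1 rows above the current block; the function name and layout are not taken from the specification.

```python
from typing import List


def predict_vertical(ref_lines_above: List[List[int]], line_idx: int,
                     width: int, height: int) -> List[List[int]]:
    """Vertical intra prediction from one selected reference pixel line.

    ref_lines_above[k] is the row of reconstructed pixels k+1 rows above the
    current block (k = 0 is the adjacent line). Hypothetical layout.
    """
    if not 0 <= line_idx < len(ref_lines_above):
        raise ValueError("line_idx selects a pixel line that is not available")
    ref = ref_lines_above[line_idx][:width]
    if len(ref) < width:
        raise ValueError("reference line is shorter than the block width")
    # Vertical mode: every row copies the reference pixel of the same column.
    return [list(ref) for _ in range(height)]


# Three candidate pixel lines; the second one (index 1) is selected.
lines = [[10, 20, 30, 40], [12, 22, 32, 42], [14, 24, 34, 44]]
pred = predict_vertical(lines, line_idx=1, width=4, height=2)
```

Directional modes other than vertical would project each predicted pixel onto the selected line along the mode's angle; that step is omitted here to keep the sketch short.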
[0053] According to the intra-frame prediction method and apparatus of the present invention, filtering can be performed on the first reference pixel of the determined pixel line.
[0054] According to the intra-frame prediction method and apparatus of the present invention, the filtering can be selectively performed based on a first flag indicating whether filtering is performed on a first reference pixel used for intra-frame prediction.
[0055] According to the intra-prediction method and apparatus of the present invention, the first flag is derived by the decoding apparatus based on the coding parameters of the current block. The coding parameters may include at least one of block size, component type, intra-prediction mode, or whether intra-prediction is applied on a sub-block basis.
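As an illustration of flag-controlled reference pixel filtering, the sketch below derives a filtering flag from coding parameters and applies a smoothing filter only when the flag is set. The derivation rule, the 64-sample threshold, the mode names, and the [1, 2, 1] / 4 filter taps are all assumptions chosen for the example; the text above names the parameter types but does not fix a rule.

```python
from typing import List


def derive_filter_flag(width: int, height: int, is_luma: bool,
                       mode: str, sub_block_prediction: bool) -> bool:
    """Hypothetical decoder-side derivation of the first (filtering) flag."""
    if sub_block_prediction or not is_luma:
        return False          # assumed: no filtering for sub-blocks or chroma
    if mode == "DC":
        return False          # assumed: DC mode uses unfiltered references
    return width * height >= 64   # assumed size threshold


def smooth_reference(ref: List[int]) -> List[int]:
    """Apply a [1, 2, 1] / 4 filter with rounding; end pixels are kept."""
    if len(ref) < 3:
        return list(ref)
    out = list(ref)
    for i in range(1, len(ref) - 1):
        out[i] = (ref[i - 1] + 2 * ref[i] + ref[i + 1] + 2) >> 2
    return out


ref = [10, 20, 60, 20, 10]
flag = derive_filter_flag(8, 8, True, "ANGULAR", False)
filtered = smooth_reference(ref) if flag else ref
```

Because the flag is computed from values both sides already know, nothing needs to be signaled in the bitstream for it in this sketch.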
[0056] According to the intra-frame prediction method and apparatus of the present invention, the predicted pixels of the current block based on the intra-frame prediction can be corrected.
[0057] According to the intra-frame prediction method and apparatus of the present invention, the correction may further include: determining at least one of a second reference pixel or a weighting value for the correction based on the position of the predicted pixel of the current block.
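A position-dependent correction of this kind could look like the sketch below. It assumes that the second reference pixel is the left neighbor in the same row and that the weight decays with horizontal distance from the left edge; both choices, and the specific weight formula, are hypothetical and serve only to show how position can select the reference pixel and the weight.

```python
from typing import List


def correct_prediction(pred: List[List[int]], left_ref: List[int]) -> List[List[int]]:
    """Blend each predicted pixel with the left reference pixel of its row.

    The weight (out of 64) shrinks as x grows, so only pixels near the left
    boundary are corrected noticeably. Weight formula is an assumption.
    """
    corrected = []
    for y, row in enumerate(pred):
        new_row = []
        for x, value in enumerate(row):
            weight = 32 >> min(2 * x, 6)   # 32, 8, 2, 0, 0, ...
            new_row.append(((64 - weight) * value + weight * left_ref[y] + 32) >> 6)
        corrected.append(new_row)
    return corrected


corrected = correct_prediction([[100, 100, 100, 100]], left_ref=[60])
```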
[0058] According to the intra-prediction method and apparatus of the present invention, the correction step can be selectively performed by considering at least one of the following: the position of the pixel line of the current block, the intra-prediction mode of the current block, or whether to perform intra-prediction on a sub-block basis.
[0059] According to the intra-frame prediction method and apparatus of the present invention, the intra-frame prediction can be performed on a sub-block basis of the current block, and the sub-block can be determined based on at least one of a second flag regarding whether to perform segmentation, segmentation direction information, or segmentation quantity information.
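The three pieces of segmentation information named above (a flag, a direction, and a count) are enough to enumerate the sub-blocks. The sketch below assumes equal-sized sub-blocks and uses the hypothetical labels "HOR" (horizontal split lines, sub-blocks stacked vertically) and "VER" (vertical split lines, sub-blocks side by side).

```python
from typing import List, Tuple

SubBlock = Tuple[int, int, int, int]  # (x, y, width, height)


def split_into_sub_blocks(width: int, height: int, split_flag: bool,
                          direction: str = "HOR", count: int = 2) -> List[SubBlock]:
    """Enumerate sub-blocks of a block from flag, direction, and count."""
    if not split_flag:
        return [(0, 0, width, height)]      # no segmentation: one block
    if direction == "HOR":
        if height % count:
            raise ValueError("height is not divisible by count")
        h = height // count
        return [(0, i * h, width, h) for i in range(count)]
    if direction == "VER":
        if width % count:
            raise ValueError("width is not divisible by count")
        w = width // count
        return [(i * w, 0, w, height) for i in range(count)]
    raise ValueError("unknown split direction")


sub_blocks = split_into_sub_blocks(8, 8, True, "HOR", 4)
```

Intra-prediction would then run once per returned sub-block, in order.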
[0060] According to the intra-frame prediction method and apparatus of the present invention, the intra-frame prediction mode can be derived based on a predetermined default mode or multiple MPM candidates.
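One way to picture mode derivation from a default mode or MPM candidates is sketched below. The mode numbers, the candidate list size of three, the neighbor-first ordering, and the fill-in modes are all assumptions for illustration; this passage does not specify how the candidate list is constructed.

```python
from typing import List, Optional

PLANAR, DC = 0, 1                  # hypothetical mode numbers
DEFAULT_MODE = PLANAR              # assumed predetermined default mode
FILL_MODES = [PLANAR, DC, 50, 18]  # assumed fill-in candidates


def build_mpm_list(left_mode: Optional[int], above_mode: Optional[int],
                   size: int = 3) -> List[int]:
    """Collect distinct candidates: neighbor modes first, then fill-ins."""
    mpm: List[int] = []
    for mode in (left_mode, above_mode, *FILL_MODES):
        if mode is not None and mode not in mpm:
            mpm.append(mode)
        if len(mpm) == size:
            break
    return mpm


def derive_intra_mode(use_default_mode: bool, mpm_list: List[int],
                      mpm_idx: int = 0) -> int:
    """Return the default mode, or the MPM candidate selected by mpm_idx."""
    if use_default_mode:
        return DEFAULT_MODE
    return mpm_list[mpm_idx]


mpm = build_mpm_list(left_mode=50, above_mode=None)
mode = derive_intra_mode(False, mpm, mpm_idx=2)
```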
[0061] This invention can be modified and has many embodiments, and specific embodiments are described in detail herein with reference to the accompanying drawings. However, this is not intended to limit the invention to specific implementations, but rather to be understood as including all modifications, equivalents, or substitutions within the spirit and scope of the invention.
[0062] Terms such as first, second, A, B, etc., may be used to describe various elements, but these elements should not be limited by these terms. The purpose of these terms is solely to distinguish one element from another. For example, without departing from the scope of the invention, a first element may be referred to as a second element, and similarly, a second element may be referred to as a first element. The term "and / or" may include a combination of multiple related items or any one of multiple related items.
[0063] It should be understood that when an element is referred to as "connected" or "joined" to another element, it can be directly connected or joined to the other element, but there may be other elements in between. On the other hand, it should be understood that when an element is referred to as "directly connected" or "directly joined" to another element, there are no other elements in between.
[0064] The terminology used in this application is for describing specific embodiments only and is not intended to limit the invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. It is to be understood in this application that terms such as "comprising" or "having" are used to specify the presence of features, numbers, steps, actions, structural elements, components, or combinations thereof as described in the specification, and do not preclude the presence or addition of one or more other features, numbers, steps, actions, structural elements, components, or combinations thereof.
[0065] Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art. Terms such as those defined in common dictionaries should be interpreted according to their meaning in the context of the relevant art, and should not be interpreted in an idealized or overly formal sense unless expressly defined in this application.
[0066] Video encoding and decoding devices can be user terminals, such as personal computers (PCs), laptops, personal digital assistants (PDAs), portable multimedia players (PMPs), PlayStation Portable (PSP) consoles, wireless communication terminals, smartphones, televisions, virtual reality (VR) devices, augmented reality (AR) devices, mixed reality (MR) devices, head-mounted displays (HMDs), and smart glasses, or they can be server terminals, such as application servers and service servers. Such a device can include a communication device, such as a communication modem for communicating with various devices or with wired and wireless networks; a memory that stores the programs and data used for intra-frame or inter-frame prediction when encoding or decoding images; and a processor that performs calculation and control by executing the programs. Furthermore, an image encoded into a bitstream by the image encoding device can be transmitted to the image decoding device in real time or non-real time via wired and wireless communication networks such as the Internet, local area networks, wireless LAN networks, Wi-Fi networks, and mobile communication networks, or via various communication interfaces such as cables and Universal Serial Bus (USB). The image decoding device then decodes the bitstream to reconstruct and reproduce the image.
[0067] In addition, an image encoded into a bitstream by an image encoding device can be transmitted from the encoding device to the decoding device via a computer-readable storage medium.
[0068] The aforementioned image encoding device and image decoding device can each be separate devices, but they can be combined into one image encoding / decoding device depending on the circumstances. In this case, some configurations of the image encoding device, as technical elements substantially the same as some configurations of the image decoding device, can be implemented in a manner that includes at least the same structure or performs at least the same function.
[0069] Therefore, in the following detailed description of the technical elements and their operating principles, repeated descriptions of the corresponding technical elements will be omitted.
[0070] Furthermore, since the image decoding device corresponds to a computing device that applies the image encoding method executed by the image encoding device to decoding, the description below focuses on the image encoding device.
[0071] The computing device may include: a memory storing programs or software modules for implementing image encoding methods and / or image decoding methods; and a processor connected to the memory to execute the programs. The image encoding device may be referred to as an encoder, and the image decoding device may be referred to as a decoder.
[0072] Typically, an image can be configured as a series of still images, which can be categorized into Groups of Pictures (GOPs), and each still image can be referred to as a picture. In this context, a picture can represent either a frame or a field of a progressive or interlaced signal. When encoding / decoding is performed on a frame-by-frame basis, the picture can be represented as a 'frame', and when encoding / decoding is performed on a field-by-field basis, the picture can be represented as a 'field'. Although this invention assumes and describes progressive signals, it is also applicable to interlaced signals. As a higher-level concept, units such as GOPs or sequences can exist, and each picture can be divided into predetermined regions such as slices, tiles, blocks, etc. Furthermore, a GOP can include units such as I-pictures, P-pictures, and B-pictures. An I-picture can refer to a picture that is encoded / decoded by itself without using a reference picture, and P-pictures and B-pictures can refer to pictures that are encoded / decoded using a reference picture and performing actions such as motion estimation and motion compensation. Typically, in the case of a P-picture, both I-pictures and P-pictures can be used as reference pictures, and in the case of a B-picture, both I-pictures and P-pictures can be used as reference pictures. However, the above definitions can also change depending on the encoding / decoding settings.
[0073] Here, the encoded / decoded image is called the reference picture, and the referenced block or pixel is called the reference block and reference pixel. Furthermore, the reference data can be not only pixel values in the spatial domain, but also coefficient values in the frequency domain, as well as various encoding / decoding information generated and determined during the encoding / decoding process. For example, it can be intra-frame prediction-related information or motion-related information in the prediction unit, transform-related information in the transform / inverse transform unit, quantization information in the quantization / inverse quantization unit, encoding / decoding-related information (context information) in the encoding / decoding unit, and filtering-related information in the loop filtering unit, etc.
[0074] The smallest unit that makes up an image can be a pixel, and the number of bits used to represent a pixel is called the bit depth. Typically, the bit depth can be 8 bits, and other bit depths can be supported depending on the encoding settings. At least one bit depth can be supported for each color space. Additionally, at least one color space can be configured according to the image's color format. Depending on the color format, an image can consist of one or more pictures of a given size or one or more pictures of different sizes. For example, in the case of YCbCr 4:2:0, it can consist of one luminance component (Y in this example) and two chrominance components (Cb / Cr in this example), where the ratio of the chrominance component to the luminance component can be 1:2 horizontally and vertically. As another example, in 4:4:4, the components can have the same horizontal and vertical size. As illustrated above, when an image consists of more than one color space, the image can be divided into each color space.
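The component ratios above translate directly into plane sizes. The following sketch computes the luminance and chrominance plane dimensions for the common YCbCr formats; the function and table names are illustrative, and rounding up for odd dimensions is an assumption.

```python
from typing import Dict, Tuple

# (horizontal divisor, vertical divisor) applied to the chroma planes.
SUBSAMPLING: Dict[str, Tuple[int, int]] = {
    "4:2:0": (2, 2),   # chroma halved horizontally and vertically
    "4:2:2": (2, 1),   # chroma halved horizontally only
    "4:4:4": (1, 1),   # chroma at full resolution
}


def plane_sizes(width: int, height: int, color_format: str) -> Dict[str, Tuple[int, int]]:
    """Return the (width, height) of the Y, Cb, and Cr planes."""
    sx, sy = SUBSAMPLING[color_format]
    # Round up so odd luma dimensions still cover every pixel (assumption).
    chroma = ((width + sx - 1) // sx, (height + sy - 1) // sy)
    return {"Y": (width, height), "Cb": chroma, "Cr": chroma}


sizes_420 = plane_sizes(1920, 1080, "4:2:0")
sizes_444 = plane_sizes(1920, 1080, "4:4:4")
```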
[0075] In this invention, some color spaces (Y in this example) based on some color formats (YCbCr) are described, and similar applications can be performed for other color spaces according to the color formats (depending on the settings of a specific color space). However, it is also possible to produce partial differences in each color space (independent of the settings of a specific color space). That is, settings dependent on each color space may mean proportional to the composition ratio of each component, or have settings dependent on its composition ratio (e.g., 4:2:0, 4:2:2, 4:4:4, etc.), while settings independent of each color space may mean unrelated to the composition ratio of each component, or have settings only for the corresponding color space. In this invention, depending on the encoder / decoder, some components may have independent or dependent settings.
[0076] The setting information or syntax elements required during image encoding can be determined at the unit level of video, sequence, picture, slice, tile, block, etc. This information can be recorded in the bitstream in units such as the Video Parameter Set (VPS), Sequence Parameter Set (SPS), Picture Parameter Set (PPS), slice headers, tile headers, and block headers, and transmitted to the decoder. The decoder can perform parsing at the same level to recover the setting information transmitted from the encoder for use in the image decoding process. Furthermore, relevant information can be transmitted in the bitstream in the form of Supplemental Enhancement Information (SEI) or metadata for parsing and use. Each parameter set has a unique ID value, and a lower-level parameter set can hold the ID value of the higher-level parameter set that it references. For example, a lower-level parameter set can reference the information of the higher-level parameter set, among one or more higher-level parameter sets, whose ID value matches. In the examples of the various units mentioned above, if a unit includes one or more other units, that unit can be called the superordinate unit, and the included unit can be called the subordinate unit.
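The ID-based referencing between parameter sets can be sketched as a chain of lookups. The classes and fields below are hypothetical stand-ins (real parameter sets carry many more syntax elements); only the referencing pattern is the point.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class SequenceParameterSet:
    sps_id: int
    bit_depth: int            # hypothetical example field


@dataclass
class PictureParameterSet:
    pps_id: int
    sps_id: int               # ID of the higher-level set this PPS references
    init_qp: int              # hypothetical example field


@dataclass
class SliceHeader:
    pps_id: int               # ID of the PPS this slice references


def resolve(slice_header: SliceHeader,
            pps_table: Dict[int, PictureParameterSet],
            sps_table: Dict[int, SequenceParameterSet]
            ) -> Tuple[SequenceParameterSet, PictureParameterSet]:
    """Follow the ID chain slice header -> PPS -> SPS."""
    pps = pps_table[slice_header.pps_id]
    sps = sps_table[pps.sps_id]
    return sps, pps


sps_table = {0: SequenceParameterSet(0, 8), 1: SequenceParameterSet(1, 10)}
pps_table = {0: PictureParameterSet(0, sps_id=1, init_qp=26)}
sps, pps = resolve(SliceHeader(pps_id=0), pps_table, sps_table)
```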
[0077] The setting information generated in a unit may include content set independently by that unit or content that depends on the settings of a preceding, following, or superordinate unit. Here, a dependent setting can be understood as using flag information to indicate whether the corresponding unit follows the settings of the preceding, following, or superordinate unit (e.g., with a 1-bit flag, the settings are followed if the flag is 1 and not followed if it is 0). In this invention, the setting information is described mainly through examples of independent settings, but examples in which content that depends on the setting information of the preceding, following, or superordinate unit of the current unit is added or substituted may also be included.
[0078] Figure 1 is a block diagram of an image encoding device according to an embodiment of the present invention. Figure 2 is a block diagram of an image decoding device according to an embodiment of the present invention.
[0079] Referring to Figure 1, the image encoding apparatus may include a prediction unit, a subtraction unit, a transform unit, a quantization unit, an inverse quantization unit, an inverse transform unit, an addition unit, a loop filter unit, a memory and / or an encoding unit. Some of the above structures may not necessarily be included, some or all of them may be selectively included as needed, and some other configurations not shown may be included.
[0080] Referring to Figure 2, the image decoding device may include a decoding unit, a prediction unit, an inverse quantization unit, an inverse transform unit, an addition unit, a loop filter unit, and / or a memory. Some of the above structures may not necessarily be included, and some or all of them may be selectively included as needed. It may also include some other configurations not shown.
[0081] Image encoding and image decoding devices can be separate devices, but they can be configured into a single image encoding / decoding device as needed. In this case, some configurations of the image encoding device, as technical elements substantially the same as some configurations of the image decoding device, can be implemented in a manner that includes at least the same structure or performs at least the same function. Therefore, repeated descriptions of the corresponding technical elements will be omitted in the following detailed description of the technical elements and their operating principles. Since the image decoding device corresponds to a computing device that applies the image encoding method executed in the image encoding device to decoding, the description below focuses on the image encoding device. The image encoding device can be referred to as an encoder, and the image decoding device can be referred to as a decoder.
[0082] The prediction unit may include an intra-frame prediction unit that performs intra-frame prediction and an inter-frame prediction unit that performs inter-frame prediction. Intra-frame prediction can determine the intra-frame prediction mode by configuring the pixels of the neighboring blocks of the current block as reference pixels, and the prediction block can be generated using the intra-frame prediction mode. Inter-frame prediction can determine the motion information of the current block using one or more reference images, and generate the prediction block by performing motion compensation using the motion information. It is determined whether to use intra-frame prediction or inter-frame prediction for the current block (coding unit or prediction unit), and specific information based on each prediction method (e.g., intra-frame prediction mode, motion vector, reference image, etc.) can be determined. At this time, the processing unit that performs prediction and the processing unit that determines the prediction method and specific content can be determined according to the encoding / decoding settings. For example, the prediction method, prediction mode, etc., are determined by the prediction unit (or coding unit), and the prediction is performed by the prediction block unit (or coding unit, transform unit).
[0083] The subtraction unit subtracts the prediction block from the current block to generate a residual block. That is, the subtraction unit calculates the difference between the pixel value of each pixel in the current block to be encoded and the predicted pixel value of the corresponding pixel in the prediction block generated by the prediction unit, thereby generating a residual block, which is a block-shaped residual signal.
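The subtraction step is a per-pixel difference, as in this minimal sketch (the function name and list-of-rows block layout are illustrative):

```python
from typing import List


def residual_block(current: List[List[int]], prediction: List[List[int]]) -> List[List[int]]:
    """Subtract the prediction block from the current block, pixel by pixel."""
    if len(current) != len(prediction) or any(
            len(c) != len(p) for c, p in zip(current, prediction)):
        raise ValueError("current and prediction blocks must have the same shape")
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]


residual = residual_block([[52, 55], [61, 59]], [[50, 50], [60, 60]])
```

Residual values can be negative, which is why they are kept as signed integers rather than clipped to the pixel range.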
[0084] The transform unit can transform a signal belonging to the spatial domain into a signal belonging to the frequency domain, and the signal obtained through the transform process is called the transform coefficient. For example, a transform block with transform coefficients can be obtained by performing a transform on a residual block having a residual signal received from the subtraction unit, but the input signal is determined according to the encoding settings and is not limited to the residual signal.
[0085] The transform unit can use transform techniques such as the Hadamard Transform, Discrete Sine Transform (DST-based transform), or Discrete Cosine Transform (DCT-based transform) to transform the residual block. This invention is not limited to these techniques, and various transform techniques that are improved and modified can be used.
[0086] For example, at least one transformation technique among the transformations can be supported, and at least one detailed transformation technique can be supported in each transformation technique. In this case, the at least one detailed transformation technique can be a technique that modifies a portion of the basis vector differently in each transformation method. For example, DST-based and DCT-based transformations can be supported as transformation techniques. For DST, detailed transformation techniques such as DST-I, DST-II, DST-III, DST-V, DST-VI, DST-VII, and DST-VIII can be supported; for DCT, detailed transformation techniques such as DCT-I, DCT-II, DCT-III, DCT-V, DCT-VI, DCT-VII, and DCT-VIII can be supported.
[0087] One of the transformations (e.g., a transformation technique and a detail transformation technique) can be set as the basic transformation technique, thus supporting additional transformation techniques (e.g., multiple transformation techniques and multiple detail transformation techniques). Whether additional transformation techniques are supported can be determined at the unit level, such as sequence, image, slice, tile, etc., so that relevant information can be generated at that unit. When additional transformation techniques are supported, transformation technique selection information can be determined at the unit level, such as block, to generate relevant information.
[0088] Transformations can be performed in both the horizontal and vertical directions. For example, a complete two-dimensional transformation can be performed by using the basis vectors of the transformation to perform a one-dimensional transformation in the horizontal direction and a one-dimensional transformation in the vertical direction, thereby transforming pixel values in the spatial domain into the frequency domain.
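The separable structure described above (a one-dimensional pass over rows followed by a one-dimensional pass over columns) can be sketched with an orthonormal DCT-II basis. DCT-II is used here only as a familiar example; floating-point arithmetic is an assumption made for clarity, whereas practical codecs use scaled integer approximations.

```python
import math
from typing import List


def dct2_basis(n: int) -> List[List[float]]:
    """Orthonormal DCT-II basis; row k is the k-th basis vector of length n."""
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                      for i in range(n)])
    return basis


def transform_1d(basis: List[List[float]], samples: List[float]) -> List[float]:
    """Project the samples onto each basis vector."""
    return [sum(b * s for b, s in zip(row, samples)) for row in basis]


def transform_2d(block: List[List[float]]) -> List[List[float]]:
    """Horizontal 1-D pass on every row, then vertical 1-D pass on every column."""
    height, width = len(block), len(block[0])
    horizontal_basis = dct2_basis(width)
    vertical_basis = dct2_basis(height)
    row_pass = [transform_1d(horizontal_basis, row) for row in block]
    col_pass = [transform_1d(vertical_basis, [row_pass[y][x] for y in range(height)])
                for x in range(width)]
    # col_pass is indexed [x][v]; transpose back to [v][x].
    return [[col_pass[x][v] for x in range(width)] for v in range(height)]


# A flat 4x4 block: all energy should land in the DC coefficient.
coeffs = transform_2d([[10.0] * 4 for _ in range(4)])
```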
[0089] Additionally, adaptive transformations in the horizontal / vertical directions can be performed. Specifically, whether to perform an adaptive transformation can be determined based on at least one coding setting. For example, in the case of intra-frame prediction, when the prediction mode is horizontal, DCT-I can be applied horizontally and DST-I vertically; when the prediction mode is vertical, DST-VI can be applied horizontally and DCT-VI vertically; when diagonal down left, DCT-II can be applied horizontally and DCT-V vertically; and when diagonal down right, DST-I can be applied horizontally and DST-VI vertically.
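The mode-dependent pairing in the example above amounts to a lookup table from prediction mode to a (horizontal, vertical) transform pair. The sketch below copies the four pairings as stated in the text; the mode labels and the DCT-II fallback for unlisted modes are assumptions.

```python
from typing import Dict, Tuple

# (horizontal transform, vertical transform) per prediction mode,
# following the pairings given in the example above.
ADAPTIVE_TRANSFORMS: Dict[str, Tuple[str, str]] = {
    "HORIZONTAL": ("DCT-I", "DST-I"),
    "VERTICAL": ("DST-VI", "DCT-VI"),
    "DIAGONAL_DOWN_LEFT": ("DCT-II", "DCT-V"),
    "DIAGONAL_DOWN_RIGHT": ("DST-I", "DST-VI"),
}

# Assumed fallback for modes the example does not list.
DEFAULT_TRANSFORMS: Tuple[str, str] = ("DCT-II", "DCT-II")


def select_transforms(prediction_mode: str, adaptive: bool = True) -> Tuple[str, str]:
    """Pick the horizontal/vertical transform pair for a prediction mode."""
    if not adaptive:
        return DEFAULT_TRANSFORMS
    return ADAPTIVE_TRANSFORMS.get(prediction_mode, DEFAULT_TRANSFORMS)


horizontal_tx, vertical_tx = select_transforms("VERTICAL")
```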
[0090] The size and shape of each transform block can be determined based on the encoding cost of each candidate transform block, and encoding can be performed on the image data of each determined transform block and information such as the size and shape of each determined transform block.
[0091] The square transformation can be set as the basic transformation form, and additional transformation forms (e.g., rectangular shapes) can be supported. Whether to support additional transformation forms can be determined at the unit level (sequence, image, slice, tile, etc.), and related information can be generated at that unit level. Transformation form selection information can be determined at the unit level (block, etc.), and related information can be generated.
[0092] Furthermore, support for transform block formats can be determined based on the encoding information. In this case, the encoding information can correspond to slice type, encoding mode, block size and shape, block segmentation method, etc. That is, one transform format can be supported based on at least one encoding information, and multiple transform formats can be supported based on at least one encoding information. The former may be implicit, while the latter may be explicit. In the explicit case, adaptive selection information indicating the best candidate group among multiple candidate groups can be generated and included in the bitstream. It is understood that in the present invention, including this example, when encoding information is explicitly generated, the corresponding information is included in the bitstream in various units, and the decoder parses the relevant information in various units to recover the decoded information. Additionally, it is understood that when implicitly processing encoded / decoded information, the encoder and decoder process it using the same procedures and rules.
[0093] As an example, the supported transformations for rectangular shapes can be determined based on the slice type. In the case of I slices, supported transformations can be square transformations, while in the case of P / B slices, transformations can be either square or rectangular shape transformations.
[0094] As an example, the supported transformations for rectangular shapes can be determined based on the encoding mode. In the case of Intra, supported transformations can be for square shapes, while in the case of Inter, supported transformations can be for either square or rectangular shapes.
[0095] As an example, the supported transformations for rectangular shapes can be determined based on the size and shape of the block. Transformations supported in blocks of a predetermined size or larger can be square shape transformations, while transformations supported in blocks smaller than the predetermined size can be either square or rectangular shape transformations.
[0096] As an example, the supported transformations for rectangular shapes can be determined based on the method of block segmentation. When the block to which the transformation is performed is obtained through quadtree segmentation, the supported transformations can be square shapes, and when the block is obtained through binary tree segmentation, the supported transformations can be either square or rectangular shapes.
[0097] The examples above illustrate support for additional transformation forms based on a single encoding setting; multiple pieces of information can also be combined to determine whether additional transformation forms are supported. The invention is not limited to the examples above, which merely illustrate support for additional transformation forms based on various encoding settings, and various modified examples can be implemented.
[0098] Depending on the encoding settings or the characteristics of the image, the transformation process can be omitted. For example, depending on the encoding settings (assuming a lossless compression environment in this example), the transformation process (including the inverse processing) can be omitted. As another example, the transformation process can be omitted when the compression performance through transformation is not utilized according to the characteristics of the image. In this case, the omitted transformation can be the entire unit, or one of the horizontal and vertical units can be omitted, and whether such omission is supported can be determined based on the size and shape of the block.
[0099] For example, in a setting where horizontal and vertical transformation omissions are bundled, when the transformation omission flag is 1, transformations are not performed in the horizontal and vertical directions, while when it is 0, transformations can be performed in both directions. In a setting where horizontal and vertical transformations are operated independently, when the first transformation omission flag is 1, transformations are not performed in the horizontal direction; if it is 0, transformations are performed in the horizontal direction. When the second transformation omission flag is 1, transformations are not performed in the vertical direction; if it is 0, transformations are performed in the vertical direction.
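The two flag arrangements described above can be sketched as follows. This is a minimal illustration; the function name and the tuple representation of the flags are assumptions, not syntax from the source.

```python
def directions_to_transform(bundled, flags):
    """Return (do_horizontal, do_vertical) from transform-omission flags.

    bundled=True  : flags is a 1-tuple holding the single shared flag.
    bundled=False : flags is (first_flag, second_flag), where the first flag
                    controls the horizontal direction and the second flag
                    controls the vertical direction.
    A flag value of 1 means the transform is omitted in that direction.
    """
    if bundled:
        (skip,) = flags
        return (skip == 0, skip == 0)
    first, second = flags
    return (first == 0, second == 0)
```

For instance, with independent flags (1, 0) only the vertical transform is performed.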
[0100] Transform omission can be supported when the block size corresponds to range A, and not supported when it corresponds to range B. For example, if the horizontal length of the block is greater than M or the vertical length is greater than N, the transform omission flag is not supported; conversely, if the horizontal length of the block is less than m or the vertical length is less than n, the transform omission flag is supported. M (m) and N (n) can be the same or different. Transformation-related settings can be determined in units such as sequences, images, slices, etc.
[0101] If additional transform techniques are supported, the transform technique settings can be determined based on at least one encoding information. In this case, the encoding information may correspond to slice type, encoding mode, block size and shape, prediction mode, etc.
[0102] As an example, the supported transformation techniques can be determined based on the encoding mode. If it is Intra, the supported transformation techniques can be DCT-I, DCT-III, DCT-VI, DST-II, and DST-III; if it is Inter, the supported transformation techniques are DCT-II, DCT-III, and DST-III.
[0103] As an example, the supported transformation schemes can be determined based on the slice type. In the case of I slices, the supported transformation techniques can be DCT-I, DCT-II, and DCT-III; in the case of P slices, the supported transformation techniques can be DCT-V, DST-V, and DST-VI; and in the case of B slices, the supported transformation techniques can be DCT-I, DCT-II, and DST-III.
[0104] As an example, the supported transformation techniques can be determined based on the prediction mode. Prediction mode A may support DCT-I or DCT-II, prediction mode B may support DCT-I or DST-I, and prediction mode C may support DCT-I. In this case, prediction modes A and B can be directional modes, and prediction mode C can be a non-directional mode.
[0105] As an example, the supported transformation techniques can be determined based on the size and shape of the block. On blocks larger than a certain size, the supported transformation technique can be DCT-II; on blocks smaller than the certain size, the supported transformation techniques can be DCT-II or DST-V; and on blocks larger than or smaller than the certain size, the supported transformation techniques can be DCT-I, DCT-II, or DST-I. Additionally, transformation techniques supported in square shapes can be DCT-I or DCT-II, and transformation techniques supported in rectangular shapes can be DCT-I or DST-I.
[0106] The above example illustrates the transformation technique supported by a single encoded information, and multiple pieces of information can be combined to support additional transformation techniques. It is not limited to the above example and can be transformed into other examples. Furthermore, the transformation unit can send the information required to generate the transform block to the encoding unit to encode the information, incorporate it into the bitstream, and then send the information to the decoder. The decoder's decoding unit parses the information and uses it for the inverse transformation process.
[0107] The quantization unit can quantize the input signal, and the signal obtained through the quantization process is called the quantized coefficient. For example, a quantized block with quantized coefficients can be obtained by performing quantization on a residual block having residual transform coefficients received from the transform unit. At this time, the input signal is determined according to the encoding settings, which are not limited to the residual transform coefficients.
[0108] The quantization unit can perform quantization on the transformed residual block using quantization techniques such as dead zone uniform threshold quantization and a quantization weighted matrix, but is not limited to these; various improved and modified quantization techniques can be used.
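One common formulation of dead zone uniform threshold quantization is sketched below. The source does not give a formula, so the rounding offset of 1/3 and the function names are assumptions chosen for illustration; any offset below 1/2 widens the interval around zero that maps to level 0.

```python
import math

def deadzone_quantize(coeff, step, rounding=1.0 / 3.0):
    """Quantize one coefficient with a uniform step and a dead zone.

    rounding < 0.5 enlarges the zero bin (the dead zone); rounding = 0.5
    would give plain round-to-nearest uniform quantization.
    """
    sign = -1 if coeff < 0 else 1
    return sign * math.floor(abs(coeff) / step + rounding)

def dequantize(level, step):
    """Inverse quantization: scale the level back by the step size."""
    return level * step
```

With a step of 10, a coefficient of 5 falls in the dead zone and becomes level 0, whereas round-to-nearest would have produced level 1.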
[0109] In addition, the quantization unit can send the information required to generate the quantization block to the encoding unit to encode the information and record it in the bit stream. Then, the information is sent to the decoder, and the decoding unit of the decoder parses the information and uses it for the inverse quantization process.
[0110] Although the above example has been described under the assumption that the residual block is transformed and quantized by the transform unit and the quantization unit, the residual signal of the residual block can be transformed to generate a residual block with transform coefficients, and the quantization process can be omitted. Not only can the quantization process be performed without transforming the residual signal of the residual block into transform coefficients, but the transformation and quantization processes can also be omitted altogether. This can be determined based on the encoder settings.
[0111] The encoding unit generates a sequence of quantization coefficients, transform coefficients, or residual signals from the residual blocks generated by scanning in at least one scanning order (e.g., zigzag scanning, vertical scanning, horizontal scanning, etc.), and can perform encoding using at least one entropy coding technique. Information about the scanning order can be determined based on encoding settings (e.g., encoding mode, prediction mode, etc.), and related information can be generated implicitly or explicitly. For example, one of multiple scanning orders can be selected based on the intra-frame prediction mode. The scanning pattern can be set to one of various patterns such as zig-zag, diagonal, or raster.
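The scanning step above turns a two-dimensional coefficient block into a one-dimensional sequence. The sketch below shows three of the named orders; the function names are hypothetical, and the zigzag direction follows the widely used convention of starting at the top-left and moving right first.

```python
def horizontal_order(h, w):
    """Row by row, left to right."""
    return [(r, c) for r in range(h) for c in range(w)]

def vertical_order(h, w):
    """Column by column, top to bottom."""
    return [(r, c) for c in range(w) for r in range(h)]

def zigzag_order(h, w):
    """Walk the anti-diagonals, alternating direction on each one."""
    order = []
    for s in range(h + w - 1):
        diag = [(r, s - r) for r in range(h) if 0 <= s - r < w]
        # Even diagonals run from bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def scan(block, order):
    """Read block values in the given position order."""
    return [block[r][c] for r, c in order]

block = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
zigzag = scan(block, zigzag_order(3, 3))      # [1, 2, 4, 7, 5, 3, 6, 8, 9]
vertical = scan(block, vertical_order(3, 3))  # [1, 4, 7, 2, 5, 8, 3, 6, 9]
```

An encoder that selects the order from the intra-frame prediction mode would only need to choose which of these position lists to pass to scan.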
[0112] Furthermore, encoded data including the encoded information sent from each component can be generated and output to a bitstream, which can be implemented by a multiplexer (MUX). In this case, encoding techniques can be employed to perform encoding using methods such as Exponential Golomb, Context Adaptive Variable Length Coding (CAVLC), and Context Adaptive Binary Arithmetic Coding (CABAC), and the invention is not limited thereto; various improved and modified encoding techniques can be used.
[0113] When entropy coding (assuming CABAC syntax in this example) is performed on the residual block data and syntax elements such as information generated during encoding / decoding, the entropy coding device may include a binarizer, a context modeler, and a binary arithmetic coder. The binary arithmetic coder may include a regular coding engine and a bypass coding engine. The regular coding engine may be a component that operates with respect to the context modeler, and the bypass coding engine may be a component that operates independently of the context modeler.
[0114] Since the syntax elements input to the entropy encoding device may not be binary values, when a syntax element is not a binary value, the binarization unit can binarize it and output a bin string consisting of 0s and 1s. Here, a bin represents a bit that is 0 or 1, and it can be encoded by the binary arithmetic encoder. At this time, either the regular encoding unit or the bypass encoding unit can be selected based on the probability of occurrence of 0 and 1, which can be determined according to the encoding / decoding settings. If the syntax element is data in which 0 and 1 occur with equal frequency, the bypass encoding unit can be used; otherwise, the regular encoding unit can be used, and the result can be referenced through context modeling (or context information updating) when the regular encoding unit is next executed.
[0115] Here, the context is information about the probability of occurrence of a bin, and context modeling is the process of estimating the probability of a bin required for binary arithmetic encoding, taking the bin produced by binarization as input. For the probability estimation, the syntax element information of the bin, the position index of the bin in the bin string, the probability of bins included in blocks surrounding the current block, and the like can be used, and at least one context table can be used for this purpose. For example, for some flags, multiple context tables can be used depending on the combination of whether the flags are used in the surrounding blocks.
[0116] Various methods can be used when binarizing syntax elements. For example, they can be categorized into fixed-length binarization and variable-length binarization. In the case of variable-length binarization, unary binarization, truncated unary binarization, truncated Rice binarization, K-th exponential-Golomb binarization, and truncated binary binarization can be used. Furthermore, signed or unsigned binarization can be performed based on the range of values of the syntax element. The binarization process of syntax elements in this invention includes not only the binarization methods mentioned in the examples above but also other additional binarization methods.
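Several of the binarization methods listed above can be sketched as follows. The bit conventions here are assumptions, since the source does not fix them: unary codes use a run of 1s ended by a 0, and the exponential-Golomb code uses the zero-prefix form in which order 0 maps the values 0, 1, 2 to 1, 010, 011. Function names are hypothetical.

```python
def fixed_length(value, num_bits):
    """Fixed-length binarization: plain binary, zero-padded to num_bits."""
    return format(value, "0{}b".format(num_bits))

def unary(value):
    """Unary binarization: value ones followed by a terminating zero."""
    return "1" * value + "0"

def truncated_unary(value, c_max):
    """Like unary, but the terminating zero is dropped when value == c_max."""
    return "1" * value if value == c_max else "1" * value + "0"

def exp_golomb(value, k=0):
    """K-th order exponential-Golomb binarization (zero-prefix convention)."""
    shifted = value + (1 << k)
    prefix_len = shifted.bit_length() - k - 1
    return "0" * prefix_len + format(shifted, "b")
```

For example, unary(3) gives 1110, truncated_unary(3, 3) gives 111, and exp_golomb(2, 1) gives 0100.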
[0117] The inverse quantization unit and the inverse transform unit can be implemented by reversing the processes of the transform unit and the quantization unit. For example, the inverse quantization unit can perform inverse quantization on the quantized transform coefficients generated by the quantization unit, and the inverse transform unit can perform inverse transform on the inverse quantized transform coefficients to generate the restored residual block.
[0118] The adder reconstructs the current block by adding the predicted block and the reconstructed residual block. The reconstructed block can be stored in memory and used as reference data (prediction section, filter section, etc.).
[0119] The in-loop filtering unit may include at least one post-processing filtering component, such as a deblocking filter, a Sample Adaptive Offset (SAO), an Adaptive Loop Filter (ALF), etc. The deblocking filter can remove block distortion generated at the boundaries between blocks in the restored image. The ALF can perform filtering based on values obtained by comparing the restored image with the input image. Specifically, filtering can be performed based on values obtained by comparing the restored image after block filtering with the deblocking filter with the input image. Furthermore, filtering can be performed based on values obtained by comparing the restored image after block filtering with the SAO with the input image.
[0120] The memory can store restored blocks or images. These restored blocks or images stored in memory can be provided to the prediction unit that performs intra-frame or inter-frame prediction. Specifically, the encoder can treat the storage space in the form of a queue of compressed bitstreams as a Coded Picture Buffer (CPB), and the space storing decoded images in picture units as a Decoded Picture Buffer (DPB). In the case of the CPB, decoded units are stored in decoding order, and the decoding operation can be simulated in the decoder. The compressed bitstream can be stored during the simulation, and the bitstream output from the CPB is restored through the decoding process. The restored image is stored in the DPB, and the images stored in the DPB can be referenced in subsequent image encoding and decoding processes.
[0121] The decoding unit can be implemented by reversing the processes in the encoding unit. For example, a quantization coefficient sequence, a transform coefficient sequence, or a signal sequence can be received from a bitstream and decoded, or decoded data including decoding information can be parsed and sent to each component unit.
[0122] On the other hand, although not shown in Figure 1 and Figure 2, the image encoding and decoding apparatus may also include a block segmentation unit. Information about basic coding units can be obtained from the image segmentation unit, and the basic coding units can represent the basic (or starting) units used for prediction, transformation, quantization, etc., during the image encoding / decoding process. In this case, the coding units can be grouped into one luminance coding block and two chrominance coding blocks according to the color format (YCbCr in this example), and the size of each block can be determined according to the color format. In the examples described later, the description will be based on blocks (luminance components in this example). Here, it is assumed that the block is a unit that can be obtained after determining each unit, and it is assumed that a similar setup will be applied to other types of blocks.
[0123] The segmentation block section can be set for each element of the image encoding and decoding apparatus, and the size and shape of the block can be determined through this process. At this time, the block can be defined and set according to different configuration sections, corresponding to prediction blocks in the prediction section, transformation blocks in the transformation section, and quantization blocks in the quantization section. The invention is not limited thereto; block units can be further defined according to other component sections. The size and shape of the block can be defined by the horizontal and vertical lengths of the block.
[0124] In the block segmentation section, blocks can be represented as M×N, and can be obtained within the range of maximum and minimum values for each block. For example, if the block shape supports squares, the maximum block size is 256×256, and the minimum block size is 8×8, then blocks of size 2^m×2^m (where m is an integer from 3 to 8 in this example, e.g., 8×8, 16×16, 32×32, 64×64, 128×128, 256×256), blocks of size 2m×2m (where m is an integer from 4 to 128 in this example), or blocks of size m×m (where m is an integer from 8 to 256 in this example) can be obtained. Alternatively, if the block shape supports squares and rectangles and the block has the same range as in the example above, blocks of size 2^m×2^n can be obtained (in this example, m and n are integers from 3 to 8; assuming a maximum horizontal-to-vertical ratio of 2:1, examples include 8×8, 8×16, 16×8, 16×16, 16×32, 32×16, 32×32, 32×64, 64×32, 64×64, 64×128, 128×64, 128×128, 128×256, 256×128, 256×256; depending on the encoding / decoding settings, there may be no limit on the horizontal-to-vertical ratio, or there may be a maximum value for the ratio). Alternatively, blocks of size 2m×2n can be obtained (in this example, m and n are integers from 4 to 128). Alternatively, blocks of size m×n can be obtained (in this example, m and n are integers from 8 to 256).
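The list of square and rectangular sizes in the example above (power-of-two side lengths from 8 to 256 with a maximum horizontal-to-vertical ratio of 2:1) can be reproduced with a short enumeration. The function name and parameter names are hypothetical.

```python
def power_of_two_block_sizes(min_exp=3, max_exp=8, max_ratio=2):
    """List (width, height) pairs whose sides are powers of two in
    [2**min_exp, 2**max_exp] and whose longer side is at most max_ratio
    times the shorter side."""
    sizes = []
    for m in range(min_exp, max_exp + 1):
        for n in range(min_exp, max_exp + 1):
            w, h = 1 << m, 1 << n
            if max(w, h) <= max_ratio * min(w, h):
                sizes.append((w, h))
    return sizes

sizes = power_of_two_block_sizes()
# Starts with (8, 8), (8, 16), (16, 8) and ends with (256, 256): 16 sizes.
```

Setting max_ratio to 1 restricts the result to the six square sizes, and a large max_ratio corresponds to the case with no limit on the ratio.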
[0125] The available blocks can be determined based on the encoding / decoding settings (e.g., block type, segmentation method, segmentation settings, etc.). For example, a coding block can be obtained as a block of size 2^m×2^n, a prediction block as a block of size 2m×2n or m×n, and a transform block as a block of size 2^m×2^n. Based on the settings, information such as block size and range can be generated (e.g., information related to exponents and multiples).
[0126] Depending on the type of block, the aforementioned ranges (maximum and minimum values in this example) can be determined. Furthermore, in some blocks, the block range information can be generated explicitly, while in others, it can be determined implicitly. For example, relevant information can be explicitly generated in encoding and transform blocks, and implicitly processed in prediction blocks.
[0127] In the explicit case, at least one range of information can be generated. For example, in the case of a coded block, information about the range can be generated regarding the maximum and minimum values. Alternatively, it can be generated based on the difference between the maximum value and a preset minimum value (e.g., 8) (e.g., information based on the difference between the indices of the maximum and minimum values). Additionally, information about multiple ranges of the horizontal and vertical lengths of the rectangular block can be generated.
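One way to read the difference-based signaling above is sketched below. It assumes power-of-two sizes and interprets the "indices" as base-2 exponents; both points, along with the function names, are assumptions rather than details stated in the source.

```python
def log2_exact(size):
    """Base-2 exponent of a power-of-two size."""
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a positive power of two")
    return size.bit_length() - 1

def encode_max_size(max_size, preset_min=8):
    """Signal the maximum size as an exponent difference from the preset minimum."""
    return log2_exact(max_size) - log2_exact(preset_min)

def decode_max_size(delta, preset_min=8):
    """Recover the maximum size from the signaled exponent difference."""
    return preset_min << delta
```

With a preset minimum of 8, a maximum size of 256 would be signaled as the small value 5 instead of the size itself.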
[0128] In the implicit case, range information can be obtained based on encoding / decoding settings (e.g., block type, segmentation method, segmentation settings, etc.). For example, in the case of prediction blocks, the maximum and minimum values are obtained from candidate groups (M×N and m / 2×n / 2 in this example) within the encoding block (which serves as the higher-level unit, where the maximum size of the encoding block is M×N and the minimum size is m×n) obtained from the segmentation settings of the prediction block (quadtree segmentation with a segmentation depth of 0 in this example).
[0129] The size and shape of the initial (or starting) block of the segmentation unit can be determined by the higher-level unit. In the case of coded blocks, the basic coded block obtained from the image segmentation unit can be the initial block; in the case of prediction blocks, the coded block can be the initial block; and in the case of transform blocks, either the coded block or the prediction block can be the initial block, which can be determined according to the encoding / decoding settings. For example, when the encoding mode is Intra, the prediction block can be the higher-level unit of the transform block, while when it is Inter, the prediction block can be a unit independent of the transform block. The initial block can be divided into smaller blocks as the starting unit for segmentation. When the optimal size and shape for the segmentation of each block are determined, that block can be determined as the initial block of the lower-level unit. For example, in the former case, it can be the coded block, and in the latter case (lower-level unit), it can be the prediction block or the transform block. As in the example above, when determining the initial block of the lower-level unit, a segmentation process for searching for blocks with optimal size and shape (e.g., higher-level units) can be performed.
[0130] In summary, a block segmentation unit can segment a basic coding unit (or a maximum coding unit) into at least one coding unit (or a lower-level coding unit). Furthermore, a coding unit can be segmented into at least one prediction unit and at least one transform unit. A coding unit can be segmented into at least one coding block, a coding block can be segmented into at least one prediction block, and at least one transform block. A prediction unit can be segmented into at least one prediction block, and a transform unit can be segmented into at least one transform block.
[0131] As shown in the example above, when a block with optimal size and shape is found through the pattern determination process, pattern information (e.g., segmentation information) can be generated for that pattern. The pattern information, along with information generated in the constituent parts to which the block belongs (e.g., prediction-related information, transformation-related information, etc.), can be included in the bitstream and transmitted to the decoder, where it is parsed at the same level for use in the image decoding process.
[0132] The following will describe examples of segmentation methods, and will assume that the initial block has a square shape. However, the same or similar examples can be applied in the case of rectangles.
[0133] The segmentation block section can support various segmentation methods. For example, it can support tree-based segmentation or type-based segmentation, and other methods can be applied. In the case of tree-based segmentation, segmentation flags can be used to generate segmentation information, while in the case of type-based segmentation, index information on the block shapes included in a preset candidate group can be used to generate segmentation information.
[0134] Figure 3 is a schematic diagram illustrating tree-based block shapes.
[0135] Referring to Figure 3, a represents one 2N×2N block obtained without splitting; b represents two 2N×N blocks obtained through a partial split flag (horizontal split of a binary tree in this example); c represents two N×2N blocks obtained through a partial split flag (vertical split of a binary tree in this example); and d represents four N×N blocks obtained through a partial split flag (quad split of a quadtree, or horizontal and vertical splits of a binary tree, in this example). The shape of the block to be obtained can be determined based on the type of tree used for splitting. For example, when performing a quadtree split, the candidate blocks that can be obtained are a and d. When performing a binary tree split, the candidate blocks that can be obtained are a, b, c, and d. In the case of a quadtree, one split flag is supported; if the corresponding flag is '0', a can be obtained, and if the corresponding flag is '1', d can be obtained. In the case of a binary tree, multiple split flags are supported, one of which can be a flag indicating whether a split is performed, one of which can be a flag indicating whether the split is horizontal or vertical, and one of which can be a flag indicating whether overlapping horizontal / vertical splits are allowed. When overlap is allowed, the available candidate blocks can be a, b, c, and d; when overlap is not allowed, the available candidate blocks can be a, b, and c. Quadtree partitioning can be the basic tree-based partitioning method, and additional tree partitioning methods (a binary tree in this example) can be included in the tree-based partitioning methods. Multiple tree partitions can be performed when flags allowing additional tree partitions are implicitly or explicitly activated. Tree-based partitioning can be a method capable of performing recursive partitioning. That is, a partitioned block can be set again as an initial block and tree-based partitioning can be performed on it, which can be determined based on partitioning settings such as partition range and allowed partition depth. This can be an example of a hierarchical partitioning method.
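The split flags and the recursive behaviour described above can be sketched as follows. This is a minimal illustration: the function names, the want_split callback that stands in for the encoder's decision (or the parsed flag), and the min_size parameter are all hypothetical.

```python
def quadtree_split(width, height, split_flag):
    """Flag 0 keeps the block (shape a); flag 1 yields four quarters (shape d)."""
    if not split_flag:
        return [(width, height)]
    return [(width // 2, height // 2)] * 4

def binary_split(width, height, split_flag, vertical_flag=0):
    """No split (a), horizontal split into two 2N×N (b), or vertical into two N×2N (c)."""
    if not split_flag:
        return [(width, height)]
    if vertical_flag:
        return [(width // 2, height)] * 2
    return [(width, height // 2)] * 2

def partition(x, y, w, h, depth, max_depth, min_size, want_split):
    """Recursive quadtree: each sub-block becomes the initial block of the next level."""
    can_split = depth < max_depth and w // 2 >= min_size and h // 2 >= min_size
    if not can_split or not want_split(x, y, w, h, depth):
        return [(x, y, w, h)]
    hw, hh = w // 2, h // 2
    leaves = []
    for dy in (0, hh):
        for dx in (0, hw):
            leaves += partition(x + dx, y + dy, hw, hh, depth + 1,
                                max_depth, min_size, want_split)
    return leaves

# Split the 64x64 root, then split only its top-left 32x32 quadrant again.
leaves = partition(0, 0, 64, 64, 0, 2, 8,
                   lambda x, y, w, h, d: d == 0 or (x, y) == (0, 0))
```

The allowed depth and the minimum size play the role of the partitioning settings that bound the recursion.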
[0136] Figure 4 is a schematic diagram illustrating type-based block shapes.
[0137] As shown in Figure 4, depending on the type, a block after partitioning can have a one-partition shape (a in this example), two-partition shapes (b, c, d, e, f, and g in this example), or a four-partition shape (h in this example). Candidate groups can be configured through various settings. For example, a candidate group can be configured as a, b, c, n or a, b to g, n or a, n, q of Figure 5, but is not limited to these; including the examples described later, various variations are possible. When the flag allowing symmetric partitioning is activated, the supported blocks can be a, b, c, and h of Figure 4, and when the flag allowing asymmetric partitioning is activated, the supported blocks can be all of a to h of Figure 4. In the former case, relevant information can be implicitly activated (in this example, a flag allowing symmetric partitioning), while in the latter case, relevant information can be explicitly generated (in this example, a flag allowing asymmetric partitioning). Type-based partitioning can be a method that supports a single partitioning operation. Unlike tree-based partitioning, blocks obtained through type-based partitioning may not be partitioned further. This can be an example in which the allowed partitioning depth is zero (e.g., single-level partitioning).
[0138] Figure 5 is a schematic diagram showing various block shapes that can be obtained through the block segmentation section of the present invention.
[0139] Referring to Figure 5, blocks a to s can be obtained according to the segmentation settings and segmentation method, and additional block shapes not shown are also possible.
[0140] As an example, asymmetric partitioning can be allowed for tree-based partitioning. For instance, in the case of a binary tree, blocks such as b and c of Figure 5 (divided into multiple blocks in this example) can be obtained, or blocks such as b to g of Figure 5 (divided into multiple blocks in this example) can be obtained. If the flag allowing asymmetric partitioning is explicitly or implicitly deactivated according to the encoding / decoding settings, the available candidate blocks may be b or c (this example assumes that overlapping horizontal and vertical partitioning is not allowed), and when the flag allowing asymmetric partitioning is activated, the available candidate blocks may be b, d, e (horizontal partitioning in this example) or c, f, g (vertical partitioning in this example). This example can correspond to the case where the partitioning direction is determined by the horizontal or vertical partitioning flag, and the block shape is determined according to the flag allowing asymmetry; however, the invention is not limited to this and can be varied into other examples.
[0141] As an example, additional tree splitting can be allowed for tree-based splitting. For example, ternary tree, quadtree, and octree splitting can be performed to obtain n split blocks (in this example, n is 3, 4, or 8, where n is an integer). In the case of a ternary tree, the supported blocks (in this example, splitting into multiple blocks) can be h to m; in the case of a quadtree, the supported blocks can be n to p; and in the case of an octree, the supported block can be q. Whether tree-based splitting is supported can be implicitly determined based on the encoding / decoding settings, or this information can be explicitly generated. Furthermore, depending on the encoding / decoding settings, binary or quadtree splitting can be used alone, or a combination of both can be used. For example, in the case of a binary tree, blocks such as b and c of Figure 5 can be obtained, and when binary and ternary trees are used in combination (this example assumes partial overlap between the use of binary and ternary trees), blocks such as b, c, i, and l can be obtained. If the flag allowing additional splitting beyond the existing tree is explicitly or implicitly deactivated according to the encoding / decoding settings, the candidate blocks that can be obtained can be b or c, and when it is activated, the candidate blocks that can be obtained are b, i or b, h, i, j (horizontal splitting in this example) or c, l or c, k, l, m (vertical splitting in this example). This example can correspond to the case where the splitting direction is determined by horizontal or vertical splitting flags, and the block shape is determined according to the flag allowing additional splitting; however, the invention is not limited to this and variations are possible.
[0142] As an example, type-based blocks can allow non-rectangular partitions. For example, partitions in shapes r and s are possible. When combined with the type-based block candidate groups described above, blocks of a, b, c, h, r, s or a through h, r, s are supported blocks. Additionally, the candidate groups can include blocks supporting n partitions such as h through m (e.g., n is an integer, in this example, 3 in addition to 1, 2, and 4).
[0143] The segmentation method can be determined based on the encoding / decoding settings.
[0144] As an example, the segmentation method can be determined based on the type of the block. For instance, encoding blocks and transform blocks can use tree-based segmentation, while prediction blocks can use type-based segmentation. Furthermore, combinations of both types of segmentation methods can be used. For example, prediction blocks can use a segmentation method that mixes tree-based and type-based segmentation, and the segmentation method varies depending on at least one range applied to the block.
[0145] For example, the partitioning method can be determined based on the size of the block. For instance, certain ranges between the maximum and minimum values of a block (e.g., a×b to c×d, where the latter is larger) can be partitioned based on a tree, and certain ranges (e.g., e×f to g×h) can be partitioned based on a type. In this case, the range information based on the partitioning method can be explicitly generated or implicitly determined.
[0146] As an example, the segmentation method can be determined based on the shape of the block (or the block before segmentation). For instance, if the block shape is square, tree-based segmentation and type-based segmentation can be performed. Alternatively, when the block is rectangular, tree-based segmentation can be performed.
[0147] The segmentation settings can be determined based on the encoding / decoding settings.
[0148] As an example, the segmentation settings can be determined based on the type of block. For instance, in tree-based segmentation, the coding and prediction blocks can use quadtrees, while the transform block can use a binary tree. Alternatively, the allowed segmentation depth can be m in the coding block, n in the prediction block, and o in the transform block, where m, n, and o may be the same or different.
[0149] For example, the partitioning settings can be determined based on the size of the blocks. For instance, quadtree partitioning can be performed on certain ranges (e.g., a×b to c×d), while binary tree partitioning can be performed on certain ranges (e.g., e×f to g×h, where in this example, we assume c×d is greater than g×h). In this case, the range can include all ranges between the maximum and minimum values of the blocks, and these ranges can have settings that do not overlap with each other or can have overlapping settings. For example, the minimum value of some ranges can be equal to the maximum value of some ranges, or the minimum value of some ranges can be less than the maximum value of some ranges. If overlapping ranges exist, the partitioning method with the higher maximum value can have priority. That is, whether to perform a partitioning method with lower priority can be determined based on the partitioning result among the partitioning methods with priority. In this case, the range information based on the tree type can be explicitly generated or implicitly determined.
[0150] As another example, in some ranges of the block (same as the example above), type-based splitting with some candidate groups can be performed, and in other ranges (same as the example above), type-based splitting with some candidate groups (at least one configuration in this example is different from the previous candidate group) can be performed. In this case, the range can include all ranges between the maximum and minimum values of the block, and the ranges can have settings that do not overlap with each other.
[0151] As an example, the partitioning settings can be determined based on the shape of the block. For instance, when the block is square, a quadtree partition can be performed. And when the block is rectangular, a binary tree split can be performed.
[0152] For example, the segmentation settings can be determined based on encoding / decoding information (e.g., slice type, color components, encoding mode, etc.). For instance, if the slice type is I, quadtree (or binary tree) segmentation can be performed in certain ranges (e.g., a×b to c×d); if it is P, in certain ranges (e.g., e×f to g×h); and if it is B, in certain ranges (e.g., i×j to k×l). Furthermore, when the slice type is I, the allowed segmentation depth for quadtree (or binary tree) segmentation can be set to m; when the slice type is P, it can be set to n; and when the slice type is B, it can be set to o, where m, n, and o can be the same or different. Some slice types may have the same settings as other slice types (e.g., P and B slices).
[0153] As another example, when the color component is a luminance component, the allowed depth of the quadtree (or binary tree) partition can be set to m, and in the case of a chrominance component, it can be set to n, where m and n can be the same or different. Furthermore, the partition range of the quadtree (or binary tree) when the color component is a luminance component (e.g., a×b to c×d) and the range when it is a chrominance component (e.g., e×f to g×h) can be the same or different.
[0154] As another example, if the encoding mode is Intra, the allowed partition depth of the quadtree (or binary tree) can be m, and if it is Inter, it can be n, where m and n can be the same or different (in this example, n is assumed to be greater than m). Furthermore, the partition range of the quadtree (or binary tree) can be the same or different when the encoding mode is Intra and when it is Inter.
[0155] In the examples above, information about whether adaptive segmentation candidate group configuration is supported based on encoding / decoding information can be explicitly generated or implicitly determined.
[0156] The examples above have described how to determine the segmentation method and settings based on encoding / decoding settings. These examples illustrate some cases based on each element, and can be transformed into other forms. Furthermore, the segmentation method and settings can be determined based on a combination of multiple elements. For example, the segmentation method and settings can be determined based on the block type, size, shape, encoding / decoding information, etc.
[0157] Additionally, in the examples above, elements related to the segmentation method, settings, etc., can be implicitly determined or information can be explicitly generated to determine whether adaptation scenarios as shown in the examples above are allowed.
[0158] In the segmentation settings, the segmentation depth refers to the number of spatial divisions based on the initial block (in this example, the initial block's segmentation depth is 0), and increasing the segmentation depth allows for smaller segments. This depth-related setting can be changed depending on the segmentation method. For example, in a tree-based segmentation method, the quadtree segmentation depth and the binary tree segmentation depth can use a common depth, or they can use separate depths depending on the tree type.
[0159] In the examples above, when using a separate split depth based on the tree type, the split depth can be set to 0 at the split start position of the tree (the block before the split in this example). The split depth can be calculated centered on the split start position, rather than based on the split range of each tree (the maximum value in this example).
[0160] Figure 6 is a schematic diagram illustrating tree-based segmentation according to an embodiment of the present invention.
[0161] 'a' represents an example of quadtree and binary tree partitioning. Specifically, the top-left block of 'a' shows a quadtree partition, the top-right and bottom-left blocks show combined quadtree and binary tree partitions, and the bottom-right block shows a binary tree partition. In the diagram, solid lines (Quad1 in this example) represent the boundary lines of quadtree partitions, dashed lines (Binary1 in this example) represent the boundary lines of binary tree partitions, and thick solid lines (Binary2 in this example) also represent the boundary lines of binary tree partitions. The difference between the dashed and thick solid lines is the difference in partitioning method.
[0162] As an example (top-left block: the allowed quadtree split depth is 3; if the current block is N×N, splitting is performed until either the horizontal or vertical length reaches (N>>3), and splitting information is generated up to (N>>2); this applies to the examples described later; the maximum and minimum values of the quadtree are assumed to be N×N and (N>>3)×(N>>3)), when quadtree splitting is performed, the block can be divided into four blocks, each with 1 / 2 of the horizontal and vertical lengths. When splitting is activated, the splitting flag can be '1', and when splitting is deactivated, the splitting flag can be '0'. According to this setting, the splitting flags of the top-left block can be generated as shown in the top-left block of b.
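The flag generation described for the top-left block can be sketched as a short recursion. This is an illustrative sketch, not the codec's actual syntax: the function name `quadtree_split` and the `want_split` callback (standing in for the encoder's split decision) are assumptions, and N = 64 is chosen only to give concrete numbers.

```python
def quadtree_split(x, y, size, min_size, want_split, flags, leaves):
    """Recursively split a square block, recording explicit split flags.

    A flag is recorded only while a further split is still possible
    (size > min_size); at the minimum size nothing is signalled.
    """
    if size > min_size:
        flag = 1 if want_split(x, y, size) else 0
        flags.append((size, flag))
    else:
        flag = 0  # implicit: the block cannot be split any further
    if flag:
        half = size >> 1
        # Sub-block order: UL(0), UR(1), DL(2), DR(3)
        for dy in (0, half):
            for dx in (0, half):
                quadtree_split(x + dx, y + dy, half, min_size,
                               want_split, flags, leaves)
    else:
        leaves.append((x, y, size))


N = 64
flags, leaves = [], []
# Hypothetical decision: always split, so the tree reaches the minimum size.
quadtree_split(0, 0, N, N >> 3, lambda x, y, s: True, flags, leaves)
smallest_flagged = min(size for size, _ in flags)
```

With these assumptions the smallest block that carries an explicit flag has side N>>2, while the leaves have side N>>3, which matches the statement that splitting continues down to (N>>3) but splitting information is generated only up to (N>>2).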
[0163] As an example (top-right block: the allowed quadtree split depth is 0 and the allowed binary tree split depth is 4; the maximum and minimum values for quadtree splitting are N×N and (N>>2)×(N>>2), and the maximum and minimum values for binary tree splitting are (N>>1)×(N>>1) and (N>>3)×(N>>3)), when quadtree splitting is performed on the initial block, the top-right block can be divided into four blocks, each with 1 / 2 of the horizontal and vertical lengths. The size of each resulting block is (N>>1)×(N>>1), which means that binary tree splitting can be performed (in this example, the size is larger than the minimum value for quadtree splitting, but the quadtree split depth is limited). That is, this example can be one in which quadtree splitting and binary tree splitting cannot be used in an overlapping manner. The binary tree splitting information in this example can consist of multiple splitting flags. Some flags can be horizontal split flags (corresponding to x in x / y in this example), while others can be vertical split flags (corresponding to y in x / y in this example). The split flags can have settings similar to those of quadtree splitting (e.g., whether the split is activated). In this example, both flags can be activated at the same time. When flag information is shown as '-' in the figure, '-' may correspond to implicit flag processing, which may occur when additional splitting cannot be performed based on conditions such as the maximum value, minimum value, and split depth of the tree splitting. According to this setting, the split flags of the top-right block can be generated as shown in the top-right block of b.
[0164] As an example (bottom-left block: the allowed quadtree split depth is 3 and the allowed binary tree split depth is 2; the maximum and minimum values for quadtree splitting are N×N and (N>>3)×(N>>3), and the maximum and minimum values for binary tree splitting are (N>>2)×(N>>2) and (N>>4)×(N>>4); within the overlapping range, splitting priority is given to quadtree splitting), when quadtree splitting is performed on the initial block, the bottom-left block can be divided into four blocks, each with 1 / 2 of the horizontal and vertical lengths. The size of each resulting block is (N>>1)×(N>>1), which means that, under the settings of this example, both binary tree splitting and quadtree splitting can be performed. That is, this example can be one in which quadtree splitting and binary tree splitting can be used in an overlapping manner. In this case, whether to perform binary tree splitting can be determined based on the result of the quadtree splitting, which has priority. When quadtree splitting is performed, binary tree splitting is not performed; when quadtree splitting is not performed, binary tree splitting can be performed. If quadtree splitting is not performed, it may not be performed afterwards, even when the conditions that allow splitting under the above settings are met. The binary tree splitting information in this example can consist of multiple splitting flags. Some flags can be split flags (corresponding to x in x / y in this example), while others can be split direction flags (corresponding to y in x / y in this example, where x determines whether the y information is generated). The split flags can have settings similar to those of quadtree splitting. In this example, horizontal and vertical splits cannot be activated at the same time. When flag information is shown as '-' in the figure, '-' can have settings similar to those in the example above. According to this setting, the split flags of the bottom-left block can be generated as shown in the bottom-left block of b.
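The priority rule for overlapping quadtree and binary tree ranges can be sketched as follows. This is a simplified sketch under stated assumptions: the flag names `qt_split`, `bt_split`, and `bt_direction` are hypothetical, blocks are square, the split-depth limits are ignored, and the ranges used in the checks are illustrative values chosen so that they overlap.

```python
def signalled_flags(size, qt_range, bt_range, qt_flag, bt_flag=0, bt_dir=0):
    """Return the (name, value) pairs written for one square block.

    qt_range and bt_range are (min_size, max_size); a split is possible
    only if the block is inside the range and larger than its minimum.
    """
    out = []
    qt_min, qt_max = qt_range
    bt_min, bt_max = bt_range
    if qt_min < size <= qt_max:
        out.append(("qt_split", qt_flag))
        if qt_flag:
            return out  # quadtree has priority: no binary-tree flags follow
    if bt_min < size <= bt_max:
        out.append(("bt_split", bt_flag))
        if bt_flag:
            out.append(("bt_direction", bt_dir))  # sent only when splitting
    return out  # an empty list corresponds to '-' (implicit) in the figure


# Illustrative ranges: quadtree 8..64, binary tree 4..16 (overlap at 8..16).
in_overlap = signalled_flags(16, (8, 64), (4, 16), qt_flag=0, bt_flag=1, bt_dir=1)
```

For a 16×16 block inside the overlap, the binary tree flags appear only after a quadtree flag of 0, and the direction flag appears only when the binary split flag is 1, mirroring the x / y dependency described above.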
[0165] As an example (bottom-right block: the allowed binary tree split depth is 5, and the maximum and minimum values for binary tree splitting are N×N and (N>>2)×(N>>3)), when binary tree splitting is performed on the initial block, the bottom-right block can be divided into two blocks with 1 / 2 of the horizontal or vertical length. The split flag settings in this example can be the same as for the bottom-left block. When flag information is shown as '-' in the figure, '-' can have settings similar to those in the example above. This example shows the case where the minimum horizontal and vertical values of the binary tree are set differently. According to this setting, the split flags of the bottom-right block can be generated as shown in the bottom-right block of b.
[0166] As shown in the example above, after confirming the block information (e.g., type, size, shape, position, slice type, color composition, etc.), the segmentation method and segmentation settings can be determined, and the segmentation process can be executed accordingly.
[0167] Figure 7 is a schematic diagram illustrating tree-based segmentation according to an embodiment of the present invention.
[0168] Referring to blocks a and b, the thick solid line L0 represents the largest coded block, and the blocks separated by the thick solid line and the other lines L1 to L5 represent the segmented coded blocks. The numbers within the blocks indicate the positions of the sub-blocks (in this example, in raster-scan order), the number of '-' indicates the segmentation depth of the corresponding block, and the numbers on the boundary lines between blocks indicate the number of segments. For example, if a block is divided into four (a quadtree in this example), it has the order UL(0)-UR(1)-DL(2)-DR(3); if it is divided into two (a binary tree in this example), it has the order L or U(0)-R or D(1), which can be defined at each segmentation depth. The examples described later illustrate the case where the available coded blocks are limited.
[0169] As an example, suppose the largest coded block of 'a' is 64×64 and the smallest coded block is 16×16, and a quadtree is used for partitioning. In this case, since blocks 2-0, 2-1, and 2-2 (16×16 in this example) are equal to the size of the smallest coded block, they may not be partitioned into smaller blocks, such as blocks 2-3-0, 2-3-1, 2-3-2, and 2-3-3 (8×8 in this example). In this case, since the available blocks in blocks 2-0, 2-1, 2-2, and 2-3 are 16×16 blocks, i.e., a single candidate group, no block partitioning information is generated.
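The sub-block numbering used in this example can be turned into positions and sizes with a few lines of code. The sketch below assumes quadtree-only splitting and the order UL(0)-UR(1)-DL(2)-DR(3); the function name `locate` and the list representation of a label such as 2-3-0 are illustrative choices, not part of the source.

```python
def locate(path, max_size):
    """Map a quadtree index path such as [2, 3, 0] to (x, y, size).

    Each index follows the order UL(0), UR(1), DL(2), DR(3).
    """
    x = y = 0
    size = max_size
    for idx in path:
        size >>= 1
        x += (idx & 1) * size   # odd indices are the right-hand sub-blocks
        y += (idx >> 1) * size  # indices 2 and 3 are the lower sub-blocks
    return x, y, size


block_2_0 = locate([2, 0], 64)       # a 16x16 block in the lower-left quadrant
block_2_3_0 = locate([2, 3, 0], 64)  # an 8x8 block, below the 16x16 minimum
```

Under a 64×64 largest block, the path 2-0 yields a 16×16 block and 2-3-0 yields an 8×8 block, consistent with the sizes stated in the example.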
[0170] As an example, suppose the maximum coded block size of b is 64×64, the minimum coded block size is 8 in either the horizontal or vertical direction, and the allowed segmentation depth is 3. In this case, since the 1-0-1-1 block (16×16 in this example, with a segmentation depth of 3) satisfies the minimum coded block condition, it can be segmented into smaller blocks. However, since it equals the allowed segmentation depth, it may not be segmented into blocks with a deeper segmentation depth (1-0-1-0-0 and 1-0-1-0-1 blocks in this example). In this case, in the 1-0-1-0 and 1-0-1-1 blocks, since the available blocks are 16×8 blocks, i.e., a candidate group, no block segmentation information is generated.
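The rule that no partition information is generated when only one candidate remains can be sketched as follows. This is a minimal sketch assuming symmetric binary splits only; the function names and the way the limits are passed are hypothetical.

```python
def binary_candidates(w, h, depth, min_side, max_depth):
    """List the block shapes obtainable from a w x h block at a given depth."""
    cands = [(w, h)]                   # leaving the block unsplit is always possible
    if depth < max_depth:
        if h // 2 >= min_side:
            cands.append((w, h // 2))  # horizontal split (upper / lower halves)
        if w // 2 >= min_side:
            cands.append((w // 2, h))  # vertical split (left / right halves)
    return cands


def split_info_needed(cands):
    """Partition information is signalled only if there is a real choice."""
    return len(cands) > 1


at_depth_limit = binary_candidates(16, 8, depth=3, min_side=8, max_depth=3)
```

With a minimum side length of 8 and an allowed depth of 3, a 16×8 block at depth 3 has itself as the only candidate, so `split_info_needed` returns `False` and nothing is written for it.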
[0171] As shown in the examples above, quadtree splitting or binary tree splitting can be supported based on the encoding / decoding settings. Alternatively, a combination of quadtree and binary tree splitting can be supported. For example, one or both of these methods can be supported based on block size, splitting depth, etc. If a block belongs to the first block range, quadtree splitting can be supported, and if the block belongs to the second block range, binary tree splitting can be supported. When multiple splitting methods are supported, each method can have at least one setting such as maximum block size, minimum block size, and allowable splitting depth. The ranges can be set to overlap or not to overlap. Alternatively, one range can include another. The above settings can be determined based on single or combined factors such as slice type, encoding mode, color components, etc.
[0172] As an example, the segmentation settings can be determined based on the slice type. In the case of an I slice, the supported segmentation settings are 128×128 to 32×32 for quadtrees and 32×32 to 8×8 for binary trees. In the case of a P / B slice, the supported segmentation settings are 128×128 to 32×32 for quadtrees and 64×64 to 8×8 for binary trees.
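The slice-type-dependent ranges in this example can be expressed as a small lookup table. The sketch below uses the range values given above; the table layout and the function name `trees_in_range` are assumptions, and the check only tests whether a square block size lies within a range, not whether a further split is still possible.

```python
# (min_size, max_size) per tree type, taken from the slice-type example above.
SPLIT_RANGES = {
    "I": {"quad": (32, 128), "binary": (8, 32)},
    "P": {"quad": (32, 128), "binary": (8, 64)},
    "B": {"quad": (32, 128), "binary": (8, 64)},
}


def trees_in_range(slice_type, size):
    """Return the tree types whose supported range contains a size x size block."""
    return [tree for tree, (lo, hi) in SPLIT_RANGES[slice_type].items()
            if lo <= size <= hi]


i_slice_64 = trees_in_range("I", 64)
p_slice_64 = trees_in_range("P", 64)
```

A 64×64 block falls only in the quadtree range for an I slice, but in both ranges for a P or B slice, which is the difference this example is meant to show.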
[0173] As an example, the partitioning settings can be determined based on the encoding mode. When the encoding mode is Intra, for supported partitioning settings, a partitioning range of 64×64 to 8×8 and an allowed partitioning depth of 2 are supported in binary trees. When the encoding mode is Inter, for supported partitioning settings, a partitioning range of 32×32 to 8×8 and an allowed partitioning depth of 3 are supported in binary trees.
[0174] As an example, the segmentation settings can be determined based on the color components. In the case of the luminance component, the supported segmentation range is 256×256 to 64×64 in the quadtree case and 64×64 to 16×16 in the binary tree case. In the case of the chrominance component, the same settings as the luminance component are supported in the quadtree case (in this example, the length ratio of each block is set according to the chrominance format), and in the binary tree case, the supported segmentation range is 64×64 to 4×4 (in this example, it is assumed that the range in the same luminance component is 128×128 to 8×8, 4:2:0).
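The relation between the luminance and chrominance ranges in the 4:2:0 case can be checked with a short calculation. This is a sketch: the source only discusses 4:2:0, so the 4:2:2 and 4:4:4 entries are added here as the usual subsampling conventions, and the function names are hypothetical.

```python
# (horizontal, vertical) subsampling shifts for common chroma formats.
CHROMA_SHIFT = {"4:4:4": (0, 0), "4:2:2": (1, 0), "4:2:0": (1, 1)}


def chroma_size(luma_w, luma_h, chroma_format):
    """Size of the chroma block that corresponds to a luma block."""
    sx, sy = CHROMA_SHIFT[chroma_format]
    return luma_w >> sx, luma_h >> sy


def chroma_range(luma_max, luma_min, chroma_format):
    """Scale a (max, min) luma partition range to the chroma plane."""
    return (chroma_size(*luma_max, chroma_format),
            chroma_size(*luma_min, chroma_format))


binary_tree_chroma = chroma_range((128, 128), (8, 8), "4:2:0")
```

A luminance range of 128×128 to 8×8 maps to 64×64 to 4×4 in 4:2:0, which is the chrominance binary tree range stated in the example.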
[0175] The example illustrates different segmentation settings based on block type. Additionally, some blocks can be combined with other blocks to perform a segmentation process. For instance, when combining a coding block and a transform block into a single unit, a segmentation process is performed to obtain the optimal block size and shape, which can be the optimal size and shape for both the coding block and the transform block. Alternatively, coding blocks and transform blocks can be combined into a single unit, prediction blocks and transform blocks can be combined into a single unit, coding blocks, prediction blocks, and transform blocks can be combined into a single unit, and combinations can be performed on other blocks.
[0176] In this invention, the case of providing a separate partitioning setting in each block is described, but it is also possible to combine multiple units into one unit to have a single partitioning setting.
[0177] The information generated in the above process is recorded in the bitstream by the encoder in at least one unit such as sequence, image, slice, or tile, and the decoder parses the relevant information from the bitstream.
[0178] During image encoding / decoding, situations may arise where the input pixel values differ from the output pixel values. A pixel value adjustment process can be performed to prevent distortion caused by operational errors. The pixel value adjustment method is the process of adjusting pixel values that are outside the acceptable range to within that range; this process can also be called clipping.
[0179]
pixel_val' = Clip_x(pixel_val, minI, maxI)

Clip_x(A, B, C)
{
    if (A < B) output = B;
    else if (A > C) output = C;
    else output = A;
}
[0180] Table 1
[0181] Table 1 provides example code for the clipping function (Clip_x) used to perform pixel value adjustment. Referring to Table 1, the input pixel value (pixel_val) and the minimum (minI) and maximum (maxI) values of the allowed pixel value range are passed as parameters to the clipping function (Clip_x). If described based on bit depth, the minimum value (minI) can be 0, and the maximum value (maxI) can be (2^bit_depth - 1). When the clipping function (Clip_x) is executed, an input pixel value (pixel_val, parameter A) smaller than the minimum value (minI, parameter B) is changed to the minimum value (minI), and an input pixel value larger than the maximum value (maxI, parameter C) is changed to the maximum value (maxI). The output value (output) is then returned as the output pixel value (pixel_val') once the pixel value adjustment is complete.
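A runnable counterpart of Clip_x in Table 1 might look like the sketch below. The helper `bit_depth_range` is an assumption added for illustration; it simply returns the bit-depth-based range of 0 to 2^bit_depth - 1 described above.

```python
def clip_x(pixel_val, min_i, max_i):
    """Clamp pixel_val into [min_i, max_i], mirroring Clip_x in Table 1."""
    if pixel_val < min_i:
        return min_i
    if pixel_val > max_i:
        return max_i
    return pixel_val


def bit_depth_range(bit_depth):
    """Pixel value range implied by the bit depth: 0 .. 2^bit_depth - 1."""
    return 0, (1 << bit_depth) - 1


min_i, max_i = bit_depth_range(8)
adjusted = [clip_x(v, min_i, max_i) for v in (-4, 17, 255, 300)]
```

For an 8-bit range, values below 0 are raised to 0, values above 255 are lowered to 255, and in-range values pass through unchanged.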
[0182] At this point, the range of pixel values is determined based on the bit depth. However, depending on the type and characteristics of the image, the pixel values constituting the image (e.g., picture, slice, tile, block, etc.) are different, and therefore may not necessarily occur within all pixel value ranges. According to embodiments of the present invention, the range of pixel values constituting the actual image can be referenced and used in the image encoding / decoding process.
[0183] For example, in the pixel value adjustment method according to Table 1, the minimum value (minI) of the clipping function can be set to the minimum value among the pixel values that constitute the actual image, and the maximum value (maxI) of the clipping function can be set to the maximum value among the pixel values that constitute the actual image.
[0184] In summary, the image encoding / decoding apparatus may include a pixel value adjustment method based on bit depth and / or a pixel value adjustment method based on the range of pixel values constituting the image. The encoder / decoder may support flag information that determines whether an adaptive pixel value adjustment method is supported. When the flag information is '1', pixel value adjustment method selection information may be generated, and when the flag information is '0', a preset pixel value adjustment method (in this example, the bit depth-based method) may be used as the basic pixel value adjustment method. When the pixel value adjustment method selection information indicates the method based on the range of pixel values constituting the image, pixel value-related information of the image may be included. For example, information on the minimum and maximum values of each image per color component, as well as the median value described later, may be included. The encoder may record and transmit the information generated during adjustment in units such as video, sequence, image, slice, tile, and block, and the decoder may parse the recorded information to restore the relevant information in the same unit.
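The selection logic in this paragraph can be sketched as a single function. The parameter names are hypothetical: `adaptive_flag` stands for the flag information, `use_image_range` stands in for the pixel value adjustment method selection information, and `image_min` / `image_max` stand for the signalled range of the actual image.

```python
def select_clip_range(adaptive_flag, use_image_range, bit_depth,
                      image_min=None, image_max=None):
    """Pick the (min, max) pair used for clipping.

    adaptive_flag == 0: the preset bit-depth-based method is used.
    adaptive_flag == 1: use_image_range stands in for the selection info.
    """
    if adaptive_flag == 1 and use_image_range:
        return image_min, image_max  # signalled range of the actual image
    return 0, (1 << bit_depth) - 1   # bit-depth-based range


preset = select_clip_range(0, True, 8, 10, 190)
adaptive = select_clip_range(1, True, 8, 10, 190)
```

When the flag is 0 the signalled image range is ignored and the bit-depth-based range is returned; when it is 1 and the image-range method is selected, the signalled minimum and maximum are used instead.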
[0185] On the other hand, through the aforementioned process, and through pixel value adjustment based on bit depth or based on the range of pixel values constituting the image, the range of pixel values, including the minimum and maximum values, can be changed (determined or defined), and additional pixel value range information can also be changed (determined or defined). For example, the maximum and minimum values of the pixel values used to constitute the actual image can be changed, and the median value used to configure the pixel values can also be changed.
[0186] That is, during pixel value adjustment according to the bit depth, minI can represent the minimum pixel value of the image, maxI can represent the maximum pixel value of the image, I can represent the color component, and medianI can represent the central pixel value of the image. minI can be 0, maxI can be (1<<bit_depth)-1, medianI can be 1<<(bit_depth - 1), and medianI can be obtained in other forms, including the above example, according to the encoding / decoding settings. 'Median' is only a term used for description in the present invention, and can refer to one piece of the pixel value range information that can be changed (determined or defined) by the pixel value adjustment process during image encoding / decoding.
[0187] For example, during the pixel value adjustment process according to the range of pixel values constituting the image, minI can be the minimum pixel value of the image, maxI can be the maximum pixel value of the image, and medianI can be the central pixel value of the image. medianI can be the average of the pixel values in the image, can be the value located at the center when the pixels of the image are sorted, or can be a value obtained from the pixel value range information of the image. medianI can be derived from at least one of minI and maxI. That is, medianI can be a pixel value that lies within the pixel value range of the image.
[0188] Specifically, medianI can be a value obtained according to the pixel value range information of the image (in this example, minI, maxI), such as (minI + maxI) / 2 or (minI + maxI)>>1, (minI + maxI + 1) / 2, (minI + maxI + 1)>>1, and medianI can be obtained in other forms including the above examples according to the encoding / decoding setting.
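The median derivations listed above can be compared side by side in a short sketch. The function names and the `round_up` parameter are illustrative; they simply select between the (minI + maxI)>>1 and (minI + maxI + 1)>>1 forms.

```python
def median_from_bit_depth(bit_depth):
    """Bit-depth-based median: 1 << (bit_depth - 1)."""
    return 1 << (bit_depth - 1)


def median_from_range(min_i, max_i, round_up=False):
    """Range-based median: (minI + maxI) >> 1, or with +1 rounding offset."""
    offset = 1 if round_up else 0
    return (min_i + max_i + offset) >> 1


median_8bit = median_from_bit_depth(8)
median_range = median_from_range(10, 190)
```

The two rounding variants differ only when minI + maxI is odd; for example, a range of 10 to 191 gives 100 without the offset and 101 with it.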
[0189] Next, an embodiment of the pixel value adjustment process (median in this example) will be described.
[0190] As an example, the basic bit depth is 8 bits (0 to 255), and the pixel value adjustment process based on the range of pixel values constituting the image is selected (in this example, the minimum value is 10, the maximum value is 190; the intermediate value (average) derived from the minimum and maximum values is 100), and if the current block position is the first block in the image (in this example, the picture), there are no adjacent blocks (in this example, left, lower left, upper left, upper, upper right) for encoding / decoding, so the reference pixel can be filled with the median 100. The intra prediction process can be performed using the reference pixel according to the prediction mode.
[0191] As an example, the basic bit depth is 10 bits (0 to 1023), and a pixel value adjustment process is chosen based on the range of pixel values constituting the image (in this example, the median value is 600, and relevant syntax elements exist). The current block is the first block within the image (in this example, a slice, a tile). Neighboring blocks for encoding / decoding (in this example, left, bottom left, top left, top, top right) do not exist, so the reference pixel can be filled with the median value 600. Intra-frame prediction can be performed using this reference pixel according to the prediction mode.
[0192] As an example, the basic bit depth is 10 bits, and a pixel value adjustment process based on the range of pixel values constituting the image is selected (in this example, the median value is 112, and a relevant syntax element exists). A setting that determines, depending on the encoding mode of an adjacent block (intra-prediction / inter-prediction), whether pixels of that block can be used in the prediction of the current block is activated (in this example, when the encoding mode of the corresponding block is intra-prediction, the block can be used as a reference pixel source for the current block, and when it is inter-prediction, it cannot; when this setting is deactivated, the block can be used as a reference for the current block regardless of its encoding mode; the relevant syntax element is constrained_intra_pred_flag, which may appear in P or B image types). If the current block is located at the left edge of the image, there are no adjacent blocks for encoding / decoding on the left side (in this example, left, bottom left, top left). If adjacent blocks for encoding / decoding exist (in this example, top and top right) but cannot be used under the above setting because their encoding mode is inter-prediction, there are no usable reference pixels, so the reference pixels can be filled with the median value (112 in this example). That is, since there are no usable reference pixels, the median of the image pixel value range can be used to fill them. The intra-frame prediction process can then be performed using these reference pixels according to the prediction mode.
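The median fill in the three embodiments above can be sketched as follows. This is a simplified sketch: the dictionary layout for neighbouring blocks and the function name `reference_pixels` are assumptions, and only the case where no usable reference pixel exists is modelled; partial availability is outside the scope of this sketch.

```python
def reference_pixels(neighbors, count, median, constrained_intra_pred):
    """Collect usable reference pixels, or fill with the median if none exist.

    neighbors: list of None (block does not exist) or dicts with
    "mode" ("intra" or "inter") and "pixels" (reconstructed values).
    """
    usable = []
    for block in neighbors:
        if block is None:
            continue
        if constrained_intra_pred and block["mode"] != "intra":
            continue  # inter-coded neighbours are excluded by the setting
        usable.extend(block["pixels"])
    if not usable:
        return [median] * count
    return usable  # partial availability is not handled in this sketch


# Left, bottom-left, and top-left are missing; top and top-right are inter-coded.
neighbors = [None, None, None,
             {"mode": "inter", "pixels": [50, 60]},
             {"mode": "inter", "pixels": [70, 80]}]
filled = reference_pixels(neighbors, 4, 112, constrained_intra_pred=True)
```

With the constraint active, the inter-coded neighbours are rejected and every reference position receives the median 112; with the constraint off, the same neighbours supply their reconstructed pixels.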
[0193] In the above embodiments, various cases related to the median have been illustrated for the prediction unit, but the median can also be used in other components of video encoding / decoding. Furthermore, the present invention is not limited to the above embodiments and can be modified and extended in various cases.
[0194] In this invention, the pixel value adjustment process can be applied to the encoding / decoding processes of prediction units, transform units, quantization units, inverse quantization units, inverse transform units, filtering units, and memory. For example, the input pixel in the pixel value adjustment method can be a reference sample or a prediction sample in the prediction process, and can be a reconstructed sample in the transform, quantization, inverse transform, and inverse quantization processes. Furthermore, this pixel can be a reconstructed pixel in the loop filtering process, or it can be a stored pixel in the memory. In this case, the reconstructed pixel in the transform, quantization, and inverse transform / inverse quantization processes can refer to the reconstructed pixel before the application of the loop filter. The reconstructed pixel in the loop filter can refer to the reconstructed pixel after the application of the loop filter. The reconstructed pixel in the deblocking filter process can refer to the reconstructed pixel after the application of the deblocking filter. The reconstructed pixel in the SAO process can refer to the reconstructed pixel after the application of SAO. The reconstructed pixel in the ALF process can refer to the reconstructed pixel after the application of ALF. Examples of the various situations described above have been given, but the invention is not limited thereto and can be applied to the input, intermediate, and output steps of all encoding / decoding processes that invoke the pixel value adjustment process.
[0195] In the examples described later, it is assumed that the luminance component Y is supported by the clipping function Clip_Y and the chrominance components Cb and Cr are supported by the clipping functions Clip_Cb and Clip_Cr.
[0196] In this invention, the prediction part can be classified as intra-frame prediction and inter-frame prediction, and intra-frame prediction and inter-frame prediction can be defined as follows.
[0197] Intra-frame prediction can be a technique for generating prediction values from regions that have completed encoding / decoding of the current image (e.g., picture, slice, tile, etc.), while inter-frame prediction can be a technique for generating prediction values from an image (e.g., picture, slice, tile, etc.) that has completed encoding / decoding before the current image.
[0198] Alternatively, intra-frame prediction can be a technique for generating prediction values from regions of the current image that have completed encoding / decoding, excluding some prediction methods (for example, methods that generate prediction values from a reference image, such as block matching and template matching), and inter-frame prediction can be a technique for generating prediction values from at least one image that has completed encoding / decoding, where the at least one image can be configured to include the current image.
[0199] The encoding / decoding settings may follow one of the above definitions, and the following examples will be described assuming adherence to the first definition. Furthermore, although the invention is described under the assumption that the predicted value is obtained through prediction in the spatial domain, it is not limited thereto.
[0200] Figure 8 illustrates a block segmentation process according to an embodiment of the present invention. In detail, it shows examples of the block sizes and shapes that can be obtained from a basic coding block according to one or more segmentation methods.
[0201] In the diagram, thick solid lines represent basic coding blocks, thick dashed lines represent quadtree segmentation boundaries, double solid lines represent symmetric binary tree segmentation boundaries, solid lines represent ternary tree segmentation boundaries, and thin dashed lines represent asymmetric binary tree segmentation boundaries. All lines except the thick solid lines indicate the boundaries of each segmentation method. The segmentation settings described below (e.g., segmentation types, segmentation information, segmentation information configuration order, etc.) are not limited to the corresponding examples and can have various variations.
[0202] For ease of explanation, assume that the basic coding block (2N×2N, 128×128) has been split once (partition depth 0->1, i.e., the partition depth increases by 1) into four sub-blocks (N×N, 64×64), and that the top-left, top-right, bottom-left, and bottom-right blocks each have separate partitioning settings. Also assume that, for quadtree partitioning, the maximum coding block size is 128×128, the minimum coding block size is 8×8, and the maximum partitioning depth is 4, and that these settings apply commonly to each block.
[0203] (Example 1: top-left block, A1 to A6)
[0204] In this example, single-tree splitting is supported (quadtree in this example), and the size and shape of the obtainable blocks can be determined by one block splitting setting (e.g., maximum coding block, minimum coding block, splitting depth, etc.). Only one block shape can be obtained by splitting (the width and height are each halved), so the splitting information required for one splitting operation (relative to the pre-split 4M×4N block, with the splitting depth increasing by 1) is a single flag indicating whether to split (in this example, 0 means no split and 1 means split). The obtainable candidates can be 4M×4N and 2M×2N.
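As a minimal sketch of the quadtree settings assumed in [0202] (maximum block 128×128, minimum block 8×8, maximum depth 4), the following Python function lists the square block sizes reachable by repeated quadtree splitting. The function name and parameters are hypothetical and are not taken from the invention.

```python
def quadtree_block_sizes(max_size=128, min_size=8, max_depth=4):
    """Return (depth, side length) pairs reachable by quadtree splitting.

    Assumption: each quadtree split halves both width and height, and
    splitting stops at the minimum block size or the maximum depth.
    """
    sizes = []
    size, depth = max_size, 0
    while size >= min_size and depth <= max_depth:
        sizes.append((depth, size))
        size //= 2   # one quadtree split halves each side
        depth += 1   # and increases the splitting depth by 1
    return sizes


print(quadtree_block_sizes())
```

With the settings of [0202], both limits are reached at the same time: depth 4 corresponds to the 8×8 minimum block.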
[0205] (Example 2: top-right block, A7 to A11)
[0206] In this example, multi-tree partitioning is supported (quadtree and binary tree in this example), and the size and shape of the obtainable blocks can be determined by multiple block partitioning settings. It is assumed that, for the binary tree, the maximum coding block size is 64×64, the length of one side of the minimum coding block is 4, and the maximum partitioning depth is 4.
[0207] In this example, two or more blocks can be obtained by splitting (2 or 4 in this example), so the splitting information required for one splitting operation (increasing the quadtree splitting depth by 1) consists of a flag indicating whether to split, a flag indicating the type of split, a flag indicating the shape of the split, and a flag indicating the direction of the split. The available candidates are 4M×4N, 4M×2N, 2M×4N, 4M×N / 4M×3N, 4M×3N / 4M×N, M×4N / 3M×4N, and 3M×4N / M×4N.
[0208] If the quadtree and binary tree partitioning ranges overlap (i.e., both quadtree and binary tree partitioning are available at the current step), and the current block (the state before partitioning) is a block obtained by quadtree partitioning (that is, the parent block, whose partitioning depth is one less than the current depth, was split by the quadtree), then the partitioning information can be divided into the following cases. That is, when the blocks supported by each partitioning setting can be obtained through multiple partitioning methods, the partitioning information can be generated according to the following classification.
[0209] (1) Cases where quadtree partitioning and binary tree partitioning overlap.
[0210]
| | a | b | c | d | e |
|---|---|---|---|---|---|
| QT | 1 | | | | |
| No Split | 0 | 0 | | | |
| SBT hor | 0 | 1 | 0 | 0 | |
| ABT hor 1 / 4 | 0 | 1 | 0 | 1 | 0 |
| ABT hor 3 / 4 | 0 | 1 | 0 | 1 | 1 |
| SBT ver | 0 | 1 | 1 | 0 | |
| ABT ver 1 / 4 | 0 | 1 | 1 | 1 | 0 |
| ABT ver 3 / 4 | 0 | 1 | 1 | 1 | 1 |
[0211] In the table above, 'a' is a flag indicating whether to perform quadtree splitting; if it is 1, quadtree splitting (QT) is performed. If 'a' is 0, flag 'b', which indicates whether to perform binary tree splitting, is checked. If 'b' is 0, the block is not split any further (No Split); if 'b' is 1, binary tree splitting is performed.
[0212] `c` is a flag indicating the direction of the split: 0 indicates a horizontal split (hor), and 1 indicates a vertical split (ver). `d` is a flag indicating the form of the split: 0 indicates a symmetric split (SBT, Symmetric Binary Tree), and 1 indicates an asymmetric split (ABT, Asymmetric Binary Tree). Only when `d` is 1 is flag `e`, which carries the subdivision ratio (1 / 4 or 3 / 4) of the asymmetric split, checked. When `e` is 0, the left block of a left / right pair or the top block of a top / bottom pair takes the 1 / 4 ratio and the right or bottom block takes the 3 / 4 ratio; when `e` is 1, the ratios are reversed.
[0213] (2) Cases where only binary tree splitting can be performed
[0214] In the above table, flags b to e other than a can be used to represent the splitting information.
[0215] In Figure 8, block A7 is a case where quadtree splitting could be performed on the pre-split block (A7 to A11) (i.e., quadtree splitting was available, but binary tree splitting was performed instead), so it corresponds to the case of generating the splitting information in (1).
[0216] On the other hand, for A8 to A11, binary tree splitting rather than quadtree splitting has already been performed on the pre-split block (A8 to A11) (i.e., quadtree splitting can no longer be performed on blocks <A8 to A11>), so they correspond to the case of generating the splitting information in (2).
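The flag order described for cases (1) and (2) above can be sketched as a small Python parser. This is only an illustrative reading of the table in [0210]; the function name, the `qt_allowed` parameter, and the returned labels are hypothetical.

```python
def parse_split_qt_bt(bits, qt_allowed=True):
    """Decode flags a to e of the table in [0210] from a sequence of 0/1 values.

    qt_allowed=True corresponds to case (1), where flag 'a' is read first;
    qt_allowed=False corresponds to case (2), where only flags b to e are read.
    """
    it = iter(bits)
    if qt_allowed and next(it) == 1:          # a: quadtree split?
        return "QT"
    if next(it) == 0:                         # b: binary tree split?
        return "No Split"
    direction = "ver" if next(it) else "hor"  # c: split direction
    if next(it) == 0:                         # d: symmetric (0) or asymmetric (1)
        return f"SBT {direction}"
    ratio = "3/4" if next(it) else "1/4"      # e: asymmetric ratio
    return f"ABT {direction} {ratio}"


print(parse_split_qt_bt([0, 1, 1, 1, 1]))                # block such as A7, case (1)
print(parse_split_qt_bt([1, 0, 1, 0], qt_allowed=False))  # case (2), flag 'a' omitted
```

Because each flag is read only when the preceding flags require it, shorter codes such as QT and No Split consume fewer bits than the asymmetric splits.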
[0217] (Example 3: bottom-left block, A12 to A15)
[0218] In this example, multi-tree splitting (quadtree, binary tree, and ternary tree in this example) is supported, and the size and shape of the obtainable blocks can be determined by multiple block splitting settings. It is assumed that, for the binary tree / ternary tree, the maximum coding block is 64×64, the length of one side of the minimum coding block is 4, and the maximum splitting depth is 4.
[0219] In this example, two or more blocks can be obtained by splitting (2, 3, or 4 in this example), so the splitting information required for one splitting operation consists of a flag indicating whether to split, a flag indicating the splitting type, a flag indicating the splitting shape, and a flag indicating the splitting direction. The obtainable candidates can be 4M×4N, 4M×2N, 2M×4N, 4M×N / 4M×2N / 4M×N, and M×4N / 2M×4N / M×4N.
[0220] If the splitting ranges of the quadtree and binary tree / ternary tree overlap and the current block is a block obtained by quadtree splitting, the splitting information can be divided into the following cases.
[0221] (1) Cases where quadtree splitting and binary tree / ternary tree splitting overlap
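[0222]
| | a | b | c | d |
|---|---|---|---|---|
| QT | 1 | | | |
| No Split | 0 | 0 | | |
| BT hor | 0 | 1 | 0 | 0 |
| TT hor | 0 | 1 | 0 | 1 |
| BT ver | 0 | 1 | 1 | 0 |
| TT ver | 0 | 1 | 1 | 1 |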
[0223] In the table above, 'a' is a flag indicating whether to perform a quadtree split. If it is 1, a quadtree split is performed. If the flag is 0, then 'b' determines whether to perform a binary tree split or a ternary tree split. If 'b' is 0, no further splitting is performed on the block. If it is 1, either a binary tree split or a ternary tree split is performed.
[0224] c is a flag indicating the direction of the split. If it is 0, it indicates a horizontal split; if it is 1, it indicates a vertical split. d is a flag indicating the type of split. If it is 0, it indicates a binary split (BT); if it is 1, it indicates a ternary tree split (TT).
[0225] (2) Cases where only binary / ternary tree partitioning can be performed
[0226] In the table above, the splitting information can be represented by flags b to d, excluding a.
[0227] In Figure 8, blocks A12 and A15 correspond to the case in which quadtree partitioning could be performed on the pre-split block (A12 to A15), so they correspond to the case of generating the partitioning information in (1).
[0228] On the other hand, A13 and A14 correspond to the case in which the pre-split block (A13 and A14) was split by a ternary tree instead of a quadtree, so they correspond to the case of generating the partitioning information in (2).
[0229] (Example 4: bottom-right block, A16 to A20)
[0230] This example supports multi-tree partitioning (quadtree, binary tree, and ternary tree in this example), and the size and shape of the achievable blocks can be determined through multiple partition block settings. In this example, it is assumed that in the case of a binary / ternary tree, the maximum coded block is 64×64, the minimum coded block has a side length of 4, and the maximum partition depth is 4.
[0231] In this example, two or more blocks can be obtained by splitting (2, 3, or 4 in this example), so the splitting information required for one splitting operation consists of a flag indicating whether to split, a flag indicating the type of split, a flag indicating the form of the split, and a flag indicating the direction of the split. The possible candidates are 4M×4N, 4M×2N, 2M×4N, 4M×N / 4M×3N, 4M×3N / 4M×N, M×4N / 3M×4N, 3M×4N / M×4N, 4M×N / 4M×2N / 4M×N, and M×4N / 2M×4N / M×4N.
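To make the candidate list above concrete, the following Python sketch returns the sub-block sizes produced by each split type. It assumes that sizes are written as width×height, that a horizontal split divides the height, and that a vertical split divides the width; these conventions, the function name, and the split labels are assumptions made for illustration.

```python
def child_sizes(width, height, split):
    """Return the (width, height) of each sub-block for one split operation."""
    w, h = width, height
    table = {
        "No Split": [(w, h)],
        "QT": [(w // 2, h // 2)] * 4,                       # four 2M x 2N blocks
        "SBT hor": [(w, h // 2)] * 2,                       # two 4M x 2N blocks
        "SBT ver": [(w // 2, h)] * 2,                       # two 2M x 4N blocks
        "ABT hor 1/4": [(w, h // 4), (w, 3 * h // 4)],      # 4M x N / 4M x 3N
        "ABT hor 3/4": [(w, 3 * h // 4), (w, h // 4)],      # 4M x 3N / 4M x N
        "ABT ver 1/4": [(w // 4, h), (3 * w // 4, h)],      # M x 4N / 3M x 4N
        "ABT ver 3/4": [(3 * w // 4, h), (w // 4, h)],      # 3M x 4N / M x 4N
        "TT hor": [(w, h // 4), (w, h // 2), (w, h // 4)],  # 4M x N / 4M x 2N / 4M x N
        "TT ver": [(w // 4, h), (w // 2, h), (w // 4, h)],  # M x 4N / 2M x 4N / M x 4N
    }
    return table[split]


print(child_sizes(64, 64, "TT hor"))
print(child_sizes(64, 64, "ABT ver 3/4"))
```

In every case the sub-block areas sum to the area of the pre-split block, which is a quick consistency check on the candidate list.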
[0232] If the partitioning ranges of the quadtree and the binary / ternary tree overlap and the current block is obtained by partitioning through the quadtree, the partitioning information can be categorized as follows.
[0233] (1) Cases where quadtree partitioning and binary / ternary tree partitioning overlap.
[0234]
| | a | b | c | d | e | f |
|---|---|---|---|---|---|---|
| QT | 1 | | | | | |
| No Split | 0 | 0 | | | | |
| TT hor | 0 | 1 | 0 | 0 | | |
| SBT hor | 0 | 1 | 0 | 1 | 0 | |
| ABT hor 1 / 4 | 0 | 1 | 0 | 1 | 1 | 0 |
| ABT hor 3 / 4 | 0 | 1 | 0 | 1 | 1 | 1 |
| TT ver | 0 | 1 | 1 | 0 | | |
| SBT ver | 0 | 1 | 1 | 1 | 0 | |
| ABT ver 1 / 4 | 0 | 1 | 1 | 1 | 1 | 0 |
| ABT ver 3 / 4 | 0 | 1 | 1 | 1 | 1 | 1 |
[0235] In the table above, 'a' is a flag indicating whether to perform a quadtree split; if it is 1, a quadtree split is performed. If 'a' is 0, flag 'b' indicates whether to perform a binary tree or ternary tree split. If 'b' is 0, no further splitting is performed on the block; if it is 1, either a binary tree split or a ternary tree split is performed.
[0236] `c` is a flag indicating the direction of the split: 0 indicates a horizontal split, and 1 indicates a vertical split. `d` is a flag indicating the type of split: 0 indicates a ternary tree split, and 1 indicates a binary tree split. When `d` is 1, flag `e`, which indicates the form of the split, is checked: when `e` is 0, a symmetric split is performed, and when `e` is 1, an asymmetric split is performed. When `e` is 1, flag `f`, which carries the subdivision ratio of the asymmetric split, is checked in the same way as in the previous example.
[0237] (2) Cases where only binary / ternary tree partitioning can be performed
[0238] In the table above, the splitting information can be represented by flags b to f, excluding a.
[0239] In Figure 8, block A20 corresponds to the case where quadtree partitioning could be performed on the pre-split block (A16 to A19), so it corresponds to the case of generating the partitioning information in (1).
[0240] On the other hand, for A16 to A19, binary tree splitting rather than quadtree splitting has been performed on the pre-split block (A16 to A19), so they correspond to the case of generating the partitioning information in (2).
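From the encoder side, the table in [0234] can be viewed as a lookup from split type to flag string. The Python sketch below transcribes that table, drops flag a for case (2), and checks that no code is a prefix of another, which is what allows a decoder to read the flags one at a time. The names `SPLIT_CODES`, `split_bits`, and `is_prefix_free` are hypothetical.

```python
# Flag strings a..f transcribed from the table in [0234].
SPLIT_CODES = {
    "QT": "1",
    "No Split": "00",
    "TT hor": "0100",
    "SBT hor": "01010",
    "ABT hor 1/4": "010110",
    "ABT hor 3/4": "010111",
    "TT ver": "0110",
    "SBT ver": "01110",
    "ABT ver 1/4": "011110",
    "ABT ver 3/4": "011111",
}


def split_bits(split, qt_allowed=True):
    """Return the flag string for a split; flag 'a' is dropped in case (2)."""
    if not qt_allowed:
        if split == "QT":
            raise ValueError("quadtree split is not available in case (2)")
        return SPLIT_CODES[split][1:]
    return SPLIT_CODES[split]


def is_prefix_free(codes):
    """True if no code is a prefix of another (checked on sorted neighbours)."""
    ordered = sorted(codes)
    return all(not b.startswith(a) for a, b in zip(ordered, ordered[1:]))


print(split_bits("TT ver"))                     # case (1)
print(split_bits("SBT hor", qt_allowed=False))  # case (2)
print(is_prefix_free(SPLIT_CODES.values()))
```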
[0241] Next, the intra-frame prediction of the prediction unit in this invention will be described.
[0242] Figure 9 is an example diagram illustrating intra-frame prediction modes predefined in an image encoding / decoding device.
[0243] Referring to Figure 9, 67 prediction modes are configured as a prediction mode candidate group for intra-frame prediction, of which 65 are directional modes (2 to 66) and two are non-directional modes (DC, planar). In this case, directional modes can be distinguished by slope (e.g., dy / dx) or angle information (degrees). All or some of the prediction modes described in the above example can be included in the prediction mode candidate group of the luminance component or the chrominance component, and other additional modes can also be included in the prediction mode candidate group.
[0244] Furthermore, by utilizing the correlation between color spaces, a reconstructed block of another color space that has completed encoding / decoding can be used for the prediction of the current block, and a prediction mode supporting this can be included. For example, for the chrominance component, a reconstructed block of the luminance component corresponding to the current block can be used to generate the prediction block of the current block. That is, the prediction block can be generated from the reconstructed block in consideration of the correlation between color spaces.
[0245] The prediction mode candidate group can be adaptively determined based on encoding / decoding settings. To increase prediction accuracy, the number of candidates can be increased; conversely, to reduce the number of bits required to signal the prediction mode, the number of candidates can be decreased.
[0246] For example, one of candidate group A (67 modes: 65 directional modes and 2 non-directional modes), candidate group B (35 modes: 33 directional modes and 2 non-directional modes), or candidate group C (18 modes: 17 directional modes and 1 non-directional mode) can be used, and the candidate group can be adaptively selected or determined according to the size and shape of the block.
[0247] Furthermore, the prediction mode candidate group can have various configurations depending on the encoding / decoding settings. For example, as shown in Figure 2, the prediction mode candidate group of Figure 9 can be configured with equal intervals between modes, or it can be configured such that the number of modes between modes 18 and 34 in Figure 9 is greater than the number of modes between modes 2 and 18, or vice versa. The candidate group can also be adaptively configured based on the shape of the block (i.e., square, horizontally long rectangle, or vertically long rectangle). For example, if the width of the current block is greater than its height, intra-prediction modes 2 to 15 are not used and can be replaced with intra-prediction modes 67 to 80. Conversely, if the width of the current block is less than its height, intra-prediction modes 53 to 66 are not used and can be replaced with intra-prediction modes -14 to -1.
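The shape-dependent replacement of modes can be sketched as follows. The text above only gives the source and target ranges; the one-to-one offsets used here (adding 65 for wide blocks and subtracting 67 for tall blocks) are an assumption chosen so that the ranges line up end to end, and the function name is hypothetical.

```python
def remap_mode_for_shape(mode, width, height):
    """Replace unused directional modes according to the block shape.

    Assumed mapping: 2..15 -> 67..80 when width > height,
                     53..66 -> -14..-1 when width < height.
    """
    if width > height and 2 <= mode <= 15:
        return mode + 65
    if width < height and 53 <= mode <= 66:
        return mode - 67
    return mode  # square blocks and all other modes are left unchanged


print(remap_mode_for_shape(2, 16, 8))    # wide block
print(remap_mode_for_shape(66, 8, 16))   # tall block
print(remap_mode_for_shape(18, 16, 8))   # horizontal mode is outside both ranges
```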
[0248] In this invention, unless otherwise stated, it is assumed that intra-frame prediction is performed using a preset prediction mode candidate group (candidate group A) with equal mode intervals. However, the main elements of this invention can also be applied, with modification, to the adaptive intra-frame prediction settings described above.
[0249] The prediction modes shown in Figure 9 can be those supported when the block shape is square or rectangular. Additionally, when the block shape is rectangular, the supported prediction modes can differ from the example above. For example, the number of candidates in the prediction mode candidate group can differ, or the number of candidates can be the same while the prediction modes on the side of the longer block dimension are dense and those on the opposite side are sparse, or vice versa. In this invention, the prediction modes are described under the premise that the prediction mode settings of Figure 9 (equal intervals between directional modes) are supported regardless of the shape of the block, but the description can also be applied to other cases.
[0250] Various methods can be used to set the indices assigned to prediction modes. For directional modes, the index assigned to each mode can be determined from preset priority information based on the angle or slope of the prediction mode. For example, modes corresponding to the x-axis or y-axis (modes 18 and 50 in Figure 9) can have the highest priority, diagonal modes that differ by 45 degrees or -45 degrees from the horizontal or vertical modes (modes 2, 34, and 66) can have the next priority, and modes that differ by 22.5 degrees or -22.5 degrees from the diagonal modes can have the priority after that. Priority information can be set in this way (followed by 11.25 degrees or -11.25 degrees, and so on) or by various other methods.
[0251] Alternatively, indices can be allocated in a specific directional order starting from a preset prediction mode. For example, as shown in Figure 9, indices can be assigned clockwise starting from a certain diagonal mode (mode 2). The examples described later assume that indices are assigned clockwise starting from a preset prediction mode.
[0252] Furthermore, non-directional prediction modes can be assigned indices before the directional modes, between directional modes, or after all of them, as determined by the encoding / decoding settings. This example assumes that non-directional modes are assigned indices with the highest priority among the prediction modes (i.e., they receive the lowest indices: mode 0 is planar and mode 1 is DC).
[0253] Although various examples of assigning indices to prediction modes have been described above, indices can be assigned under other settings, without being limited to the examples above, and various modified examples are possible.
[0254] The examples above described priority information in the context of assigning indices to prediction modes. However, priority information can be used not only for prediction mode index assignment but also in the encoding / decoding process of the prediction mode. For example, priority information can be used for MPM configuration, and multiple sets of priority information can be supported in the encoding / decoding process of the prediction mode.
[0255] The following describes how to derive the intra-prediction mode (specifically the luma component) for the current block.
[0256] The current block can use a default mode predefined in the image encoding / decoding device. The default mode can be a directional mode or a non-directional mode. For example, a directional mode can include at least one of a vertical mode, a horizontal mode, or a diagonal mode. A non-directional mode can include at least one of a planar mode or a DC mode. If it is determined that the current block uses the default mode, the intra-prediction mode of the current block can be set to the default mode.
[0257] Alternatively, the intra-prediction mode of the current block can be derived based on multiple MPM candidates. First, a predetermined MPM candidate can be selected from the aforementioned prediction mode candidate group. The number of MPM candidates can be three, four, five, or more. MPM candidates can be derived based on the intra-prediction modes of neighboring blocks adjacent to the current block. The neighboring block can be a block adjacent to at least one of the left, top, upper left, lower left, or upper right edges of the current block.
[0258] Specifically, MPM candidates can be determined by considering whether the intra-prediction mode (candIntraPredModeA) of the left block and the intra-prediction mode (candIntraPredModeB) of the upper block are the same, and whether candIntraPredModeA and candIntraPredModeB are non-directional modes.
[0259] For example, if candIntraPredModeA and candIntraPredModeB are the same and candIntraPredModeA is not a non-directional mode, then the MPM candidates for the current block can include at least one of candIntraPredModeA, (candIntraPredModeA-n), (candIntraPredModeA+n), or a non-directional mode. Here, n can be an integer of 1, 2, or greater. The non-directional mode can include at least one of a planar mode or a DC mode. For example, the MPM candidates for the current block can be determined as shown in Table 2 below. The indices in Table 2 specify the position or priority of the MPM candidates, but are not limited to this. For example, index 1 or index 4 can be assigned to the DC mode.
[0260]
| Index | MPM Candidate |
|---|---|
| 0 | candIntraPredModeA |
| 1 | 2 + ((candIntraPredModeA + 61) % 64) |
| 2 | 2 + ((candIntraPredModeA - 1) % 64) |
| 3 | INTRA_DC |
| 4 | 2 + ((candIntraPredModeA + 60) % 64) |
[0261] Table 2
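The entries of Table 2 can be evaluated directly. The Python sketch below assumes the mode numbering of [0252] (planar is 0, DC is 1, directional modes are 2 to 66); the function name is hypothetical.

```python
INTRA_PLANAR, INTRA_DC = 0, 1  # numbering assumed from [0252]


def mpm_same_directional(cand_a):
    """Table 2: both neighbours share one directional mode cand_a (2..66)."""
    if not 2 <= cand_a <= 66:
        raise ValueError("Table 2 applies to directional modes only")
    return [
        cand_a,                     # index 0
        2 + ((cand_a + 61) % 64),   # index 1
        2 + ((cand_a - 1) % 64),    # index 2
        INTRA_DC,                   # index 3
        2 + ((cand_a + 60) % 64),   # index 4
    ]


print(mpm_same_directional(18))  # horizontal mode
print(mpm_same_directional(2))   # first directional mode
```

For mode 18 the list contains modes 17, 19, and 16, that is, the neighbouring directions at offsets -1, +1, and -2. The modulo terms make the offsets wrap around within the directional range at its ends, as the mode 2 case shows.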
[0262] Alternatively, if candIntraPredModeA and candIntraPredModeB are different, and neither candIntraPredModeA nor candIntraPredModeB is a non-directional mode, then the MPM candidates for the current block can include at least one of candIntraPredModeA, candIntraPredModeB, (maxAB-n), (maxAB+n), or a non-directional mode. Here, maxAB represents the maximum value of candIntraPredModeA and candIntraPredModeB, and n can be an integer of 1, 2, or greater. The non-directional mode can include at least one of a planar mode and a DC mode. For example, the MPM candidates for the current block can be determined as shown in Table 3 below. The indices in Table 3 specify the position or priority of the MPM candidates, but are not limited to this. For example, the DC mode can be assigned the maximum index. When the difference between candIntraPredModeA and candIntraPredModeB is within a predetermined threshold range, MPM candidate 1 in Table 3 will be applied; otherwise, MPM candidate 2 can be applied. Here, the threshold range can be a range greater than or equal to 2 and less than or equal to 62.
[0263]
| Index | MPM Candidate 1 | MPM Candidate 2 |
|---|---|---|
| 0 | candIntraPredModeA | candIntraPredModeA |
| 1 | candIntraPredModeB | candIntraPredModeB |
| 2 | INTRA_DC | INTRA_DC |
| 3 | 2 + ((maxAB + 61) % 64) | 2 + ((maxAB + 60) % 64) |
| 4 | 2 + ((maxAB - 1) % 64) | 2 + ((maxAB) % 64) |
[0264] Table 3
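Table 3 can be sketched in the same way. The code assumes that the "difference" between the two modes means the absolute difference and that the threshold range is the closed interval from 2 to 62 mentioned above; the function and parameter names are hypothetical.

```python
INTRA_DC = 1  # numbering assumed from [0252]


def mpm_two_directional(cand_a, cand_b, low=2, high=62):
    """Table 3: two different directional neighbour modes."""
    max_ab = max(cand_a, cand_b)
    diff = abs(cand_a - cand_b)  # assumption: absolute difference
    if low <= diff <= high:
        # MPM Candidate 1 column
        tail = [2 + ((max_ab + 61) % 64), 2 + ((max_ab - 1) % 64)]
    else:
        # MPM Candidate 2 column
        tail = [2 + ((max_ab + 60) % 64), 2 + (max_ab % 64)]
    return [cand_a, cand_b, INTRA_DC] + tail


print(mpm_two_directional(18, 50))  # difference 32, inside the range
print(mpm_two_directional(18, 19))  # difference 1, outside the range
```

In the second call the two neighbour modes are adjacent, so the Candidate 2 column uses offsets of two around maxAB. This avoids duplicating a mode that is already in the list.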
[0265] Alternatively, if candIntraPredModeA and candIntraPredModeB are not the same, and only one of candIntraPredModeA and candIntraPredModeB is a non-directional mode, then the MPM candidates for the current block can include at least one of maxAB, (maxAB-n), (maxAB+n), or a non-directional mode. Here, maxAB represents the maximum value of candIntraPredModeA and candIntraPredModeB, and n can be an integer of 1, 2, or greater. The non-directional mode can include at least one of a planar mode and a DC mode. For example, the MPM candidates for the current block can be determined as shown in Table 4 below. The indices in Table 4 specify the position or priority of the MPM candidates, but are not limited to this. For example, index 0 or the maximum index can be assigned to the DC mode.
[0266]
[0267]
[0268] Table 4
[0269] Alternatively, if candIntraPredModeA and candIntraPredModeB are not the same, and both candIntraPredModeA and candIntraPredModeB are non-directional modes, then the MPM candidates for the current block can include at least one of the following: non-directional mode, vertical mode, horizontal mode, (vertical mode - m), (vertical mode + m), (horizontal mode - m), or (horizontal mode + m). Here, m can be an integer of 1, 2, 3, 4, or larger. The non-directional mode can include at least one of planar mode and DC mode. For example, the MPM candidates for the current block can be determined as shown in Table 5 below. The index in Table 5 specifies the position or priority of the MPM candidate, but is not limited to this. For example, the horizontal mode can be assigned index1, or the maximum index can be assigned.
[0270]
| Index | MPM Candidate |
|---|---|
| 0 | INTRA_DC |
| 1 | Vertical mode |
| 2 | Horizontal mode |
| 3 | (Vertical mode - 4) |
| 4 | (Vertical mode + 4) |
[0271] Table 5
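The four cases described above can be summarised as a small classifier, together with the fixed list of Table 5. This is a sketch only: it assumes planar is 0, DC is 1, horizontal is mode 18, and vertical is mode 50 (the numbering of Figure 9 as described in the text). The case in which both neighbour modes are the same non-directional mode is not covered by the conditions as stated, so the classifier returns None for it. All names are hypothetical.

```python
INTRA_PLANAR, INTRA_DC = 0, 1
HOR_MODE, VER_MODE = 18, 50  # assumed from the Figure 9 numbering


def is_directional(mode):
    return mode >= 2


def mpm_case(cand_a, cand_b):
    """Return which table applies to the neighbour modes, or None."""
    a_dir, b_dir = is_directional(cand_a), is_directional(cand_b)
    if cand_a == cand_b:
        return "Table 2" if a_dir else None  # same non-directional: not stated
    if a_dir and b_dir:
        return "Table 3"
    if a_dir != b_dir:
        return "Table 4"
    return "Table 5"  # different modes, both non-directional


def mpm_both_non_directional(m=4):
    """Table 5 with offset m = 4 around the vertical mode."""
    return [INTRA_DC, VER_MODE, HOR_MODE, VER_MODE - m, VER_MODE + m]


print(mpm_case(0, 18), mpm_case(0, 1))
print(mpm_both_non_directional())
```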
[0272] Among the aforementioned multiple MPM candidates, the MPM candidate specified by the MPM index can be set as the intra-prediction mode for the current block. The MPM index can be encoded and signaled by the image coding device.
[0273] As described above, the intra-prediction mode can be derived by selectively using either the default mode or an MPM candidate. This selection can be based on a flag signaled by the coding device. In this case, the flag can indicate whether the intra-prediction mode of the current block is set to the default mode. If the flag is a first value, the intra-prediction mode of the current block is set to the default mode; otherwise, information such as whether the intra-prediction mode of the current block is derived from an MPM candidate, the MPM index, etc., can be signaled.
[0274] The chrominance component may have the same prediction mode candidate group as the luminance component, or it may have a candidate group configured from some of the modes in the prediction mode candidate group of the luminance component. In this case, the prediction mode candidate group of the chrominance component may have a fixed configuration or a variable (or adaptive) configuration.
[0275] (Fixed candidate group configuration vs. variable candidate group configuration)
[0276] As an example of a fixed configuration, some modes from the prediction mode candidate group of the luminance component (e.g., DC mode, planar mode, vertical mode, horizontal mode, and diagonal modes <e.g., at least one of DL, UL, and UR, where DL predicts from the bottom left to the top right, UL from the top left to the bottom right, and UR from the top right to the bottom left; these correspond to modes 2, 34, and 66 in Figure 9, respectively, and other diagonal modes are also possible>) are configured as the prediction mode candidate group of the chrominance component to perform intra-frame prediction.
[0277] As an example of a variable configuration, some modes from the prediction mode candidate group for the luminance component (e.g., DC mode, planar mode, vertical mode, horizontal mode, diagonal UR mode; assuming that the most frequently selected modes are typically configured as the basic prediction mode candidate group) are configured as the basic prediction mode candidate group for the chrominance component. However, it is possible that the modes included in the candidate group may not accurately reflect the characteristics of the chrominance component. To improve this, the configuration of the prediction mode candidate group for the chrominance component can be changed.
[0278] For example, at least one prediction mode of the luminance component block or sub-block at the same or corresponding position as the chrominance component block can be included in the basic prediction mode candidate group (Example 1 described later), or some modes can be replaced to configure a new prediction mode candidate group (Example 2 described later). If the region of the luminance component that corresponds to the chrominance component block <according to the color format> is not a single block but has been partitioned into multiple sub-blocks, the block at a preset position is used. In this case, the preset position is chosen from the upper-left, upper-right, lower-left, lower-right, center, upper-center, lower-center, left-center, and right-center positions of the luminance component block corresponding to the chrominance component block. Expressed in coordinates, the upper-left position includes the (0, 0) coordinate, the upper-right position includes (blk_width-1, 0), the lower-left position includes (0, blk_height-1), the lower-right position includes (blk_width-1, blk_height-1), and the center position includes one of (blk_width / 2-1, blk_height / 2-1), (blk_width / 2-1, blk_height / 2), and (blk_width / 2, blk_height / 2). The upper-center position includes one of (blk_width / 2-1, 0) and (blk_width / 2, 0), the lower-center position includes one of (blk_width / 2-1, blk_height-1) and (blk_width / 2, blk_height-1), the left-center position includes one of (0, blk_height / 2-1) and (0, blk_height / 2), and the right-center position includes one of (blk_width-1, blk_height / 2-1) and (blk_width-1, blk_height / 2). That is, the preset position refers to the block that includes the corresponding coordinate.
Here, blk_width and blk_height are the horizontal and vertical lengths of the luminance block; the coordinates are not limited to the above cases and can include others. In the following description, at least one luminance component prediction mode <or color mode> is added to the prediction mode candidate group of the chrominance component according to a preset priority <e.g., assumed to be upper left, upper right, lower left, lower right, center>. If two prediction modes are added, the mode of the upper-left block and the mode of the upper-right block are added according to this setting. However, when the upper-left and upper-right positions belong to the same block, the mode of the lower-left block, which has the next priority, is added as the second mode.
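The priority-based selection described above can be sketched as follows. The sketch assumes the example priority (upper left, upper right, lower left, lower right, center), uses (blk_width / 2, blk_height / 2) as the center coordinate, and represents the luminance partitioning by a caller-supplied lookup `block_at(x, y)` that returns a block identifier and its prediction mode. That lookup, and all function names, are hypothetical.

```python
PRIORITY = ["upper_left", "upper_right", "lower_left", "lower_right", "center"]


def preset_positions(blk_width, blk_height):
    """Coordinates of the preset positions inside the luminance block."""
    w, h = blk_width, blk_height
    return {
        "upper_left": (0, 0),
        "upper_right": (w - 1, 0),
        "lower_left": (0, h - 1),
        "lower_right": (w - 1, h - 1),
        "center": (w // 2, h // 2),
    }


def pick_luma_modes(blk_width, blk_height, block_at, count):
    """Collect up to `count` luma modes in priority order, one per distinct block."""
    positions = preset_positions(blk_width, blk_height)
    seen_blocks, modes = set(), []
    for name in PRIORITY:
        block_id, mode = block_at(*positions[name])
        if block_id in seen_blocks:
            continue  # position lies in a block that was already used
        seen_blocks.add(block_id)
        modes.append(mode)
        if len(modes) == count:
            break
    return modes


# A 16x16 luma block split into a top half (mode 50) and a bottom half (mode 18).
def block_at(x, y):
    return (0, 50) if y < 8 else (1, 18)


print(pick_luma_modes(16, 16, block_at, 2))
```

In this usage example the upper-left and upper-right positions fall in the same sub-block, so the second mode is taken from the lower-left position, as described in the text.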
[0279] Alternatively, at least one prediction mode of the adjacent blocks located to the left, top, top left, top right, bottom left, etc., centered on the current block, or the sub-blocks of the corresponding block (when adjacent blocks are configured as multiple blocks) can be included in the basic prediction mode candidate group. Alternatively, a new prediction mode candidate group can be configured by replacing some modes.
[0280] In addition to the above description, not only the prediction mode of the corresponding block or an adjacent block (of the luminance component) but also at least one mode derived from that prediction mode can be included as a prediction mode of the chrominance component. The examples described later include the prediction mode of the luminance component as a prediction mode of the chrominance component. A detailed description is omitted for examples that include, in the prediction mode candidate group of the chrominance component, a prediction mode derived from the prediction mode of the luminance component (e.g., taking adjacent directional modes as an example, when horizontal mode 18 is the prediction mode of the luminance component, modes 17, 19, 16, etc. are derived prediction modes; if multiple such prediction modes are configured in the prediction mode candidate group of the chrominance component, the priority of the candidate group configuration can be set in order from the prediction mode of the luminance component to the modes derived from it) or a prediction mode derived from adjacent blocks. However, the same settings described below, or modified versions of them, can be applied to such examples.
[0281] As Example 1, when the prediction mode of the luminance component matches one of the modes in the prediction mode candidate group of the chrominance component, the configuration of the candidate group stays the same (the number of candidates remains unchanged); when it matches none of them, the configuration of the candidate group changes (the number of candidates increases).
[0282] When the candidate group configuration in the example above stays the same, the indices of the prediction modes can remain the same or different indices can be assigned, which can be determined based on the encoding / decoding settings. For example, when the indices of the prediction mode candidate group of the chrominance component are planar 0, DC 1, vertical 2, horizontal 3, and diagonal UR 4, and the prediction mode of the luminance component is horizontal, the configuration of the prediction mode candidate group remains unchanged, and the index of each prediction mode either remains unchanged or is reassigned (in this example, horizontal 0, planar 1, DC 2, vertical 3, diagonal UR 4). This index reassignment is an example of a process performed to generate fewer mode bits during prediction mode encoding / decoding (assuming that fewer bits are assigned to smaller indices).
[0283] When the candidate group configuration in the example above changes, the prediction mode indices can be kept the same, or different indices can be assigned. For example, with the same prediction mode candidate group index setting as in the previous example, when the prediction mode of the luminance component is diagonal DL, the number of candidates in the prediction mode candidate group increases by 1. The indices of the existing candidates can remain unchanged with the newly added mode taking the last index (diagonal DL 5 in this example), or other indices can be assigned (diagonal DL 0, planar 1, DC 2, vertical 3, horizontal 4, diagonal UR 5 in this example).
[0284] As an example (2), when the prediction mode of the luminance component matches one of the candidate modes in the prediction mode candidate group, the configuration of the candidate group is the same (the mode of the candidate group remains unchanged), and when none of them match, the configuration of the candidate group is different (at least one of the modes in the candidate group is replaced).
[0285] When the candidate group configuration in the above example is the same, the prediction modes can keep the same indices or be assigned different indices. For example, when the indices of the prediction mode candidate group for the chrominance component are planar 0, DC 1, vertical 2, horizontal 3, and diagonal UR 4, and the prediction mode of the luminance component is vertical, the configuration of the candidate group remains unchanged, and the index of each prediction mode either remains unchanged or is reassigned (in this example, vertical 0, horizontal 1, diagonal UR 2, planar 3, DC 4; this is an example in which directional modes come first when the luminance component mode is directional and non-directional modes come first when the luminance component mode is non-directional, but the invention is not limited to this).
[0286] When the candidate group configuration in the above example is different, the indices of the unchanged prediction modes can be kept and the index of the replaced mode assigned to the new mode, or different indices can be assigned to multiple prediction modes. For example, with the same candidate group index setting as in the previous example, when the prediction mode of the luminance component is diagonal DL, one mode in the candidate group (diagonal UR in this example) is replaced, and the indices of the remaining candidates are unchanged. The index of the replaced mode can be assigned to the newly added mode (e.g., diagonal DL 4), or the indices can be reassigned (in this example, diagonal DL 0, planar 1, DC 2, vertical 3, horizontal 4).
[0287] In the preceding description, an example of resetting the indices was given for the purpose of allocating fewer mode bits. However, this is only an example based on the encoding / decoding settings, and other cases are possible. If the index of the prediction mode does not change, binarization can be performed by allocating fewer bits to smaller indices, or binarization can be performed regardless of index size. For example, when the reset prediction mode candidate group is planar 0, DC 1, vertical 2, horizontal 3, diagonal DL 4, even though a large index is allocated to diagonal DL, it can be set to receive fewer mode bits than the other prediction modes because the mode is obtained from the luminance component.
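Example (2) differs from example (1) in that the group size is fixed and a mode is replaced rather than added. A minimal sketch follows, assuming the replaced mode is always the last one in the base group; the function name `replace_chroma_candidate` and its parameters are hypothetical.

```python
def replace_chroma_candidate(base_modes, luma_mode, replace_pos=-1, luma_first=False):
    """Return a {mode: index} map with a fixed-size candidate group (example (2))."""
    modes = list(base_modes)
    if luma_mode not in modes:
        # No match: overwrite one existing mode so the group size is unchanged.
        # Which mode gets replaced is an assumption (last entry by default).
        modes[replace_pos] = luma_mode
    if luma_first:
        # Optional index reset: give the luma-derived mode the smallest index.
        modes.remove(luma_mode)
        modes.insert(0, luma_mode)
    return {mode: index for index, mode in enumerate(modes)}


BASE_GROUP = ["planar", "DC", "vertical", "horizontal", "diagonal_UR"]
print(replace_chroma_candidate(BASE_GROUP, "diagonal_DL"))
print(replace_chroma_candidate(BASE_GROUP, "diagonal_DL", luma_first=True))
```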
[0288] The prediction mode can be a mode supported independently of image type, or a mode whose support is determined by image type (e.g., a mode supported for image type I but not for image types P or B).
[0289] The content described in the examples above is not limited to these examples, and there may be additional or otherwise modified examples. Furthermore, the encoding / decoding settings described in the examples above can be implicitly determined, or the relevant information can be explicitly included in units such as video, sequence, image, slice, tile, etc.
[0290] (Obtaining predicted values in the same color space vs. obtaining predicted values in other color spaces)
[0291] The intra-frame prediction modes described in the above examples are prediction modes concerning a method (e.g., extrapolation, interpolation, averaging, etc.) for obtaining the data used to generate a prediction block from neighboring regions in the same time and space.
[0292] Additionally, prediction modes can be supported that concern a method for obtaining the data used to generate a prediction block from regions located in a different space at the same time.
[0293] For example, this could be a prediction mode concerning a method that uses the correlation between color spaces to obtain, from another color space, the data used to generate a prediction block. Taking YCbCr as an example, the correlation between color spaces can be the correlation between Y and Cb, Y and Cr, or Cb and Cr. That is, for the color difference component Cb or Cr, a reconstructed block of the luminance component corresponding to the current block can be used to generate the prediction block for the current block (color difference vs. luminance is the basic setting of the examples described later). Alternatively, a reconstructed block of one color difference component (Cb or Cr) corresponding to the current block of the other color difference component (Cr or Cb) can be used to generate the prediction block for that component (Cr or Cb). In this case, the reconstructed block of the other color space can be used directly as the prediction block (i.e., without correction), or a block obtained by considering the correlation between colors can be used as the prediction block (e.g., by correcting the reconstructed block according to P = a*R + b, where a and b are the values used for correction, R is the value obtained from the other color space, and P is the predicted value in the current color space).
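The correction P = a*R + b can be sketched as follows. The function name `predict_from_other_color_space` is hypothetical, and the rounding and clipping to the valid sample range for the bit depth are assumptions added for the sketch; the text only specifies the linear relation itself.

```python
def predict_from_other_color_space(recon_block, a=1.0, b=0.0, bit_depth=8):
    """Apply P = a*R + b to every sample R of a reconstructed block.

    recon_block is a list of rows of integer samples from the other color
    space. With a=1 and b=0 the block is copied without correction.
    """
    max_val = (1 << bit_depth) - 1  # assumption: clip to the sample range
    return [
        [min(max(int(round(a * r + b)), 0), max_val) for r in row]
        for row in recon_block
    ]


luma_recon = [[100, 200], [50, 254]]
print(predict_from_other_color_space(luma_recon, a=0.5, b=10))
```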
[0294] In this example, the description assumes that data obtained using the correlation of color spaces is used as the predicted value for the current block. However, it is also possible to use this data as a correction value to correct an existing predicted value for the current block (e.g., using residual values from other color spaces as correction values; that is, there are other predicted values, and these are corrected; although adding these predicted values together still yields a predicted value, this is described for the sake of clarity). In this invention, the former scenario is assumed, but the invention is not limited thereto, and correction values can be used in the same or different ways.
[0295] The prediction mode can be a mode supported independently of image type, or a mode whose support is determined by image type (e.g., a mode supported for image type I but not for image types P or B).
[0296] (Regions compared to obtain correlation information)
[0297] In the examples above, the correlation information between color spaces (a, b, etc.) can be explicitly included or implicitly obtained. In this case, the regions compared to obtain the correlation information can be 1) the current block of the color difference component and the corresponding block of the luminance component, or 2) the adjacent regions of the current block of the color difference component (e.g., left block, top block, top-left block, top-right block, bottom-left block, etc.) and the adjacent regions of the corresponding block of the luminance component. The former is an example of explicit processing, while the latter is an example of implicit processing.
[0298] For example, correlation information is obtained by comparing at least one pixel value in each color space. The pixel value to be compared can be the pixel value of one pixel in each color space, or a pixel value derived from multiple pixels through a filtering process such as weighted averaging. That is, the number of pixels referenced or used to obtain one compared pixel value in each color space can be one pixel vs. one pixel, one pixel vs. multiple pixels, etc., where the former may be the color space for generating the predicted value and the latter may be the color space for reference. These cases may arise according to the color format. Alternatively, independent of the color format, the pixel value of one pixel in the color difference component can be compared with the pixel value of the corresponding one pixel in the luminance component, or with a pixel value obtained by filtering multiple pixels in the luminance component (an a-tap separable 1D filter, a b x c mask non-separable 2D filter, a d-tap directional filter, etc.), and one of the two methods can be used according to the encoding / decoding settings. The above describes examples of color difference and luminance, but modified examples such as color difference (Cb) and color difference (Cr) are also possible.
[0299] In the above example, the region compared when implicitly obtaining correlation information can be the nearest pixel line of the current block of the current color component (e.g., including pixels from p[-1, -1] to p[blk_width-1, -1] and p[-1, 0] to p[-1, blk_height-1]) and the pixel line of the other color space corresponding to it, or multiple pixel lines of the current block of the current color component (e.g., in addition to the above, including the pixels of further pixel lines such as p[-2, -2] to p[blk_width-1, -2] and p[-2, -1] to p[-2, blk_height-1]) and the pixel lines of the other color space corresponding to them.
[0300] Specifically, assuming a color format of 4:2:0, to compare the pixel value of a pixel in the current color space (color difference in this example), one can use the pixel value of a pixel in a predetermined position (chosen from the top left, top right, bottom left, and bottom right within the 2×2) of the corresponding four pixels in another color space (luminance in this example). Alternatively, to compare the pixel value of a pixel in the chroma space, one can use the pixel value obtained by filtering multiple pixels in the luminance space (e.g., at least two pixels in the corresponding 2×2 pixels).
[0301] In summary, the parameter information can be derived from the reconstructed pixels of the regions adjacent to the current block and the reconstructed pixels of the corresponding regions in other color spaces. That is, at least one parameter (e.g., a or b, a1, b1 or a2, b2, etc.) can be generated based on the correlation information and used as a value by which the pixels of the reconstructed block in the other color space are multiplied (e.g., a, a1, a2) or which is added to them (e.g., b, b1, b2).
[0302] In this case, the comparison process can be performed after confirming the availability of the pixels to be compared. For example, when the adjacent regions are available, their pixels can be used for comparison; when they are unavailable, the handling can be determined based on the encoding / decoding settings. For example, pixels of unavailable adjacent regions can be excluded from the process of obtaining the correlation information between color spaces, or they can be included in the comparison process after the unavailable regions are filled.
[0303] For example, the exclusion from the process of obtaining correlation information between color spaces may apply to a region in which the pixels of at least one color space are unavailable. More specifically, it may apply when the pixels of one of the two color spaces are unavailable, or when the pixels of both color spaces are unavailable, which can be determined based on the encoding / decoding settings.
[0304] Alternatively, when the process of obtaining correlation information between color spaces is performed after filling the unavailable regions with data for comparison (an operation similar to the reference pixel filling process), various filling methods can be used. For example, the region can be filled with a preset pixel value (e.g., the median value of the bit depth, 1 << (bit_depth-1); a value between the minimum and maximum values of the actual pixels in the image; the average or median value of the actual pixels in the image; etc.), with adjacent pixels, or with values obtained by filtering adjacent pixels (an operation similar to the reference pixel filtering process), or other methods can be used.
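The text does not fix how a and b are computed from the compared pixels. As one possible illustration only, the sketch below fits them by ordinary least squares over paired neighboring samples; the least-squares choice, the function name `derive_linear_params`, and the fallback to a = 1, b = 0 when no fit is possible are all assumptions.

```python
def derive_linear_params(ref_neighbors, cur_neighbors):
    """Fit cur ~= a * ref + b over paired neighboring samples.

    ref_neighbors: reconstructed samples next to the corresponding block in
    the other color space; cur_neighbors: reconstructed samples next to the
    current block. Least squares is an assumed fitting rule.
    """
    n = len(ref_neighbors)
    if n == 0 or n != len(cur_neighbors):
        return 1.0, 0.0  # assumed fallback: copy without correction
    sum_r = sum(ref_neighbors)
    sum_c = sum(cur_neighbors)
    sum_rr = sum(r * r for r in ref_neighbors)
    sum_rc = sum(r * c for r, c in zip(ref_neighbors, cur_neighbors))
    denom = n * sum_rr - sum_r * sum_r
    if denom == 0:
        return 1.0, 0.0  # flat reference samples: slope is undefined
    a = (n * sum_rc - sum_r * sum_c) / denom
    b = (sum_c - a * sum_r) / n
    return a, b


a, b = derive_linear_params([10, 20, 30, 40], [25, 45, 65, 85])
print(a, b)
```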
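Two of the filling options can be sketched as follows: the preset mid value 1 << (bit_depth-1) and copying from adjacent available pixels. The function name `fill_unavailable`, the use of `None` to mark unavailable samples, and the nearest-neighbor copy rule are assumptions made for this sketch.

```python
def fill_unavailable(pixels, bit_depth=8, method="mid"):
    """Fill unavailable samples (marked None) in a 1D line of pixels."""
    mid = 1 << (bit_depth - 1)  # median value of the bit depth, e.g. 128 for 8 bits
    if method == "mid" or all(p is None for p in pixels):
        return [mid if p is None else p for p in pixels]
    if method != "nearest":
        raise ValueError("unknown fill method: " + method)
    filled = list(pixels)
    last = None
    for i, p in enumerate(filled):  # forward pass: copy the previous sample
        if p is None:
            filled[i] = last
        else:
            last = p
    nxt = None
    for i in range(len(filled) - 1, -1, -1):  # backward pass: leading gaps
        if filled[i] is None:
            filled[i] = nxt
        else:
            nxt = filled[i]
    return filled


print(fill_unavailable([None, 5, None, 9], method="nearest"))
print(fill_unavailable([None, None], bit_depth=10))
```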
[0305] Figure 10 shows an example of comparing pixels across color spaces to obtain correlation information. For ease of illustration, a 4:4:4 color format is assumed. For other color formats, the process described later can be understood in the same way by including a conversion based on the color component ratio.
[0306] R0 represents an example where the regions of both color spaces are available. Since both color space regions are available, pixels from the corresponding regions can be used in the comparison process to obtain the correlation information.
[0307] R1 represents an example where one of the two color space regions is unavailable (in this example, the adjacent region of the current color space is available, while the corresponding region in the other color space is unavailable). After the unavailable region is filled using one of various methods, it can be used in the comparison process.
[0308] R2 represents an example where one of the two color space regions is unavailable (in this example, the adjacent region of the current color space is unavailable, while the corresponding region of the other color space is available). Because one of the regions is unavailable, the corresponding regions of the two color spaces cannot be used in the comparison process.
[0309] R3 represents an example where the regions of both color spaces are unavailable. After the unavailable regions are filled using one of various methods, they can be used in the comparison process.
[0310] R4 represents an example where the regions of both color spaces are unavailable. Because both regions are unavailable, the corresponding regions of the two color spaces cannot be used in the comparison process.
[0311] In addition, unlike the example of Figure 10, various settings can be made when the adjacent regions of the current block or the corresponding regions of other color spaces are unavailable.
[0312] As an example, preset values (in this example, a is 1 and b is 0) can be assigned to a and b. This case might mean maintaining a mode that fills the prediction block of the current block with data from another color space. Furthermore, in this case the probability of occurrence (or selection) or the priority of the prediction mode can be set differently from existing cases when performing prediction mode encoding / decoding (e.g., treating the probability of selection as low or setting the priority low; in other words, because the correlation information has low accuracy, the accuracy of the prediction block obtained through this prediction mode might be very low, so it is estimated that the mode will ultimately not be selected as the best prediction mode).
[0313] As another example, a mode that fills the prediction block of the current block with data from another color space might not be supported because there is no data to compare. That is, the mode could be one that is supported only when at least one available region exists. In this case, when performing prediction mode encoding / decoding, it can be set to allow or disallow the substitution of this mode with another mode. The former could be a setting that maintains the number of prediction mode candidates, and the latter a setting that reduces the number of prediction mode candidates.
[0314] The invention is not limited to the examples above, and various modifications are possible.
[0315] In the examples above, a region is unavailable when it lies outside the boundary of an image (e.g., picture, slice, tile, etc.) (i.e., the current block and the region are not included in the same image) or when its encoding / decoding has not yet been completed. Additionally, unavailable cases can be added based on encoding / decoding settings (e.g., constrained_intra_pred_flag; for example, when the flag is 1 for a P or B slice / tile type and the encoding mode of the corresponding region is Inter).
[0316] In the examples described later, the aforementioned limitations may occur when a prediction block for the current block is generated using reconstructed data from another color space after the correlation information has been obtained through color space comparison. That is, as mentioned above, the use of this mode may be limited or unavailable when the region in another color space corresponding to the current block is determined to be unavailable.
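The five cases R0 to R4 reduce to a small decision rule: a region pair is compared when both sides are available, or when the setting allows unavailable sides to be filled first. The sketch below encodes that rule; the function name `can_compare_region` and the boolean `fill_allowed` setting are hypothetical.

```python
def can_compare_region(cur_available, other_available, fill_allowed):
    """Decide whether a pair of corresponding regions enters the comparison.

    cur_available: adjacent region of the current color space is available.
    other_available: corresponding region of the other color space is available.
    fill_allowed: unavailable regions are filled instead of being excluded.
    """
    if cur_available and other_available:
        return True  # R0: both sides usable as they are
    return fill_allowed  # R1/R3: fill then compare; R2/R4: exclude


# (cur_available, other_available, fill_allowed) for each case in the text.
CASES = {
    "R0": (True, True, False),
    "R1": (True, False, True),
    "R2": (False, True, False),
    "R3": (False, False, True),
    "R4": (False, False, False),
}
for name, args in CASES.items():
    print(name, can_compare_region(*args))
```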
[0317] The predicted value for the current block can be generated using parameters representing the correlation information between color spaces obtained through the above process and restored data from other color spaces corresponding to the current block. Here, the restored data from other color spaces used for the prediction of the current block can be the pixel value of a pixel at a preset location or the pixel value obtained through a filtering process.
[0318] For example, in the 4:4:4 case, to generate a predicted value for a single pixel in the chroma space, the pixel value of the corresponding pixel in the luma space can be used. Alternatively, the pixel value obtained by filtering multiple pixels in the luma space can be used (e.g., the corresponding pixel and pixels located to its left, right, top, bottom, top left, top right, bottom left, bottom right, etc.; taking 5-tap and 7-tap filters as examples, this can be understood as using two or three pixels, respectively, on each of the left, right, top, and bottom sides of the corresponding pixel).
[0319] For example, in the 4:2:0 case, to generate a predicted value for a pixel in the chroma space, the pixel value of a pixel at a preset position (selected from top left, top right, bottom left, and bottom right) among the four corresponding pixels in the luma space (one pixel of the chroma component corresponds to 2×2 pixels of the luma component) can be used. Alternatively, to generate a predicted value for a pixel in the chroma space, the pixel value obtained by filtering multiple pixels in the luma space (e.g., at least two pixels in the corresponding 2×2 pixels, or pixels located in the left, right, top, bottom, top left, top right, bottom left, and bottom right directions centered on the 2×2 pixels) can be used.
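For the 4:2:0 case, selecting one luma value for a chroma position can be sketched as below. The function name `luma_for_chroma_420` and the mode labels are hypothetical, and the rounded average of the 2×2 pixels stands in for the unspecified filter as an assumption.

```python
def luma_for_chroma_420(luma, cx, cy, mode="top_left"):
    """Return the luma value paired with chroma position (cx, cy) in 4:2:0.

    luma is a list of rows; one chroma pixel corresponds to the 2x2 luma
    pixels whose top-left corner is (2*cx, 2*cy).
    """
    x, y = 2 * cx, 2 * cy
    quad = {
        "top_left": luma[y][x],
        "top_right": luma[y][x + 1],
        "bottom_left": luma[y + 1][x],
        "bottom_right": luma[y + 1][x + 1],
    }
    if mode == "average":
        # Assumed filter: rounded mean of the four corresponding pixels.
        return (sum(quad.values()) + 2) >> 2
    return quad[mode]  # preset position inside the 2x2 group


luma_plane = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
print(luma_for_chroma_420(luma_plane, 0, 0, "average"))
```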
[0320] In summary, the parameters representing the correlation information obtained through the above process can be applied (multiplied or added, etc.) to the pixel values obtained in other color spaces to obtain pixel values, and these obtained pixel values can be used as the predicted values of pixels in the current color space.
[0321] The examples above have described some color formats and some pixel value acquisition processes, but they are not limited to these, and the same or modified examples can be used in other cases.
[0322] The concepts described in (Obtaining predicted values in the same color space vs. obtaining predicted values in other color spaces) can be applied to both fixed and variable candidate group configurations. For example, when predicted values cannot be obtained from other color spaces, a prediction mode that replaces the unavailable mode can be included in the candidate group.
[0323] Through the above examples, in the case of the above prediction mode, the relevant information (e.g., information about whether the mode is supported, parameter information, etc.) can be included in units such as video, sequence, picture, slice, and tile.
[0324] In summary, depending on the encoding / decoding settings, the prediction mode candidate group can be configured using prediction modes (mode A) associated with a method of obtaining the data for generating a prediction block from adjacent regions in the same time and space, or, in addition to these, the candidate group may include prediction modes (mode B) associated with a method of obtaining the data for generating a prediction block from regions located at the same time but in a different space.
[0325] In the examples above, the prediction mode candidate group can be configured using only Mode A or only Mode B, or it can be configured by using a combination of Mode A and Mode B. Relatedly, the configuration information for the prediction mode candidate group can be explicitly generated, or the information can be implicitly determined beforehand.
[0326] For example, the candidate groups can have the same configuration regardless of some encoding / decoding settings (image type in this example), or have individual configurations based on some encoding / decoding settings (e.g., using mode A, mode B_1 (color mode), and mode B_2 (color copy mode) to configure the prediction mode candidate group for image type I, using mode A and mode B_1 for image type P, and using mode A and mode B_2 for image type B, etc.).
[0327] In this invention, the description assumes that the prediction mode candidate group for the luminance component is configured as shown in Figure 9, and that the prediction mode candidate group for the color difference component is configured with the horizontal, vertical, and diagonal modes of Figure 9 (planar, DC, color mode 1, color mode 2, color mode 3, color copy mode 1, color copy mode 2, adjacent block mode 1 (left block), adjacent block mode 2 (top block)), but other settings for the prediction mode candidate groups are possible.
[0328] In the image coding method according to an embodiment of the present invention, intra-frame prediction can be configured as follows. The intra-frame prediction of the prediction unit may include a reference pixel configuration step, a prediction block generation step, a prediction mode determination step, and a prediction mode coding step. Furthermore, the image coding apparatus may be configured to include a reference pixel configuration unit, a prediction block generation unit, and a prediction mode coding unit for implementing the reference pixel configuration step, the prediction block generation step, the prediction mode determination step, and the prediction mode coding step. Some steps in the above process may be omitted, or other steps may be added, and the order may be changed to something other than the above sequence.
[0329] Furthermore, in the image decoding method according to an embodiment of the present invention, intra-frame prediction can be configured as follows. The intra-frame prediction of the prediction unit may include a prediction mode decoding step, a reference pixel configuration step, and a prediction block generation step. Additionally, the image decoding apparatus may be configured to include a prediction mode decoding unit, a reference pixel configuration unit, and a prediction block generation unit for implementing the prediction mode decoding step, the reference pixel configuration step, and the prediction block generation step. Some of the above steps may be omitted, or other steps may be added, and the order may be changed to something other than the above-described order.
[0330] In the prediction block generation step, intra-prediction can be performed on a unit of the current block (e.g., coding block, prediction block, transform block, etc.), or intra-prediction can be performed on a unit of predetermined sub-blocks. For this purpose, a flag indicating whether the current block is divided into sub-blocks for intra-prediction can be used. The flag can be encoded by the coding device and transmitted as a signal. If the flag is a first value, the current block is divided into multiple sub-blocks; otherwise, the current block is not divided into multiple sub-blocks. This division can be an additional division performed after the division based on the tree structure described above. Sub-blocks belonging to the current block share an intra-prediction mode, but can be configured with different reference pixels for each sub-block. Alternatively, the sub-blocks can use the same intra-prediction mode and reference pixels. Or, the sub-blocks can use the same reference pixels, but can use different intra-prediction modes for each sub-block.
[0331] The division can be performed in either the vertical or the horizontal direction. The division direction can be determined based on a flag signaled by the encoding device. For example, division can be performed horizontally when the flag is a first value, and vertically otherwise. Alternatively, the division direction can be determined based on the size of the current block. For example, division can be performed horizontally when the height of the current block is greater than a predetermined threshold size, and vertically when the width of the current block is greater than a predetermined threshold size. Here, the threshold size can be a fixed value predefined in the encoding / decoding device, or it can be determined based on information about the block size (e.g., the size of the maximum transform block, the size of the maximum coding block, etc.). Information about the block size can be signaled at one or more levels among sequence, image, slice, tile, brick, or CTU row.
[0332] The number of sub-blocks can be variably determined based on the current block's size, shape, segmentation depth, intra-prediction mode, etc. For example, when the current block is 4×8 or 8×4, it can be divided into two sub-blocks. Alternatively, if the current block is greater than or equal to 8×8, it can be divided into four sub-blocks.
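The size-based rule for the number of sub-blocks and the flag-based division direction can be sketched together. The function names are hypothetical. The sketch assumes that "greater than or equal to 8×8" means both width and height are at least 8, that other sizes are left undivided, and that horizontal division cuts the block along horizontal lines so that the height is divided.

```python
def num_subblocks(width, height):
    """Number of sub-blocks for a block, following the size rule in the text."""
    if (width, height) in ((4, 8), (8, 4)):
        return 2
    if width >= 8 and height >= 8:
        return 4
    return 1  # assumption: sizes not covered by the text are not divided


def split_into_subblocks(width, height, horizontal):
    """Return (x, y, w, h) tuples for the sub-blocks of a width x height block.

    horizontal=True stacks the sub-blocks top to bottom (height is divided);
    horizontal=False places them left to right (width is divided).
    """
    n = num_subblocks(width, height)
    if horizontal:
        sub_h = height // n
        return [(0, i * sub_h, width, sub_h) for i in range(n)]
    sub_w = width // n
    return [(i * sub_w, 0, sub_w, height) for i in range(n)]


print(split_into_subblocks(8, 8, horizontal=True))
print(split_into_subblocks(8, 4, horizontal=False))
```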
[0333] In this invention, the encoder will be described in detail. In the case of the decoder, since it can be deduced from the contents of the encoder, a detailed description is omitted.
[0334] Figure 11 is a schematic diagram illustrating the configuration of reference pixels for intra-frame prediction. The size and shape (M×N) of the current block for prediction can be obtained from the block segmentation unit, and the description assumes that intra-frame prediction is supported within the range of 4×4 to 128×128. Intra-frame prediction can typically be performed in units of prediction blocks, but depending on the settings of the block segmentation unit, it can be performed in units of coding blocks or transform blocks. After confirming the block information, the reference pixel configuration unit can configure the reference pixels used for prediction of the current block. In this case, the reference pixels can be managed in temporary memory (e.g., an array such as a main array or an auxiliary array), they are generated and removed for each intra-frame prediction process of a block, and the size of the temporary memory can be determined according to the configuration of the reference pixels.
[0335] In this example, the prediction of the current block is described with the current block as the center, assuming that the left block, top block, top-left block, top-right block, and bottom-left block are used for the prediction of the current block. However, it is not limited to this, and block candidate groups with other configurations can be used for the prediction of the current block. For example, the candidate group of adjacent blocks for the reference pixel can be an example based on raster or Z-scan, and a portion of the candidate group can be removed according to the scan order, or it can be configured to include other block candidate groups (e.g., additional configurations such as top block, bottom block, bottom-right block, etc.).
[0336] Alternatively, blocks in other color spaces (e.g., if the current block belongs to Cr, other color spaces correspond to Y or Cb) that are corresponding to the current block (e.g., having corresponding coordinates based on the same coordinates or color component composition ratios in each color space) can be used to predict the current block. Furthermore, for ease of description, this is illustrated by assuming a block is configured at the preset positions (left, top, top left, top right, bottom left), but at least one block can exist at each of those positions. That is, multiple sub-blocks can exist at the preset positions, based on the segmentation of the corresponding block.
[0337] In summary, the adjacent regions of the current block can be the locations of reference pixels used for intra-frame prediction of the current block, and depending on the prediction mode, regions corresponding to the current block in another color space can be further considered as the locations of reference pixels. Besides the examples above, the locations of reference pixels can be determined according to the prediction mode, method, etc. For example, when generating prediction blocks using methods such as block matching, the reference pixel locations can be considered as regions that have completed encoding / decoding before the current block of the current image, or regions included within the search range in the regions that have completed encoding / decoding (e.g., included to the left or top of the current block, or to the upper left, upper right, etc.).
[0338] As shown in Figure 11, the reference pixels used for prediction of the current block can be composed of adjacent pixels of the left block, top block, top-left block, top-right block, and bottom-left block (Ref_L, Ref_T, Ref_TL, Ref_TR, and Ref_BL in Figure 11). In this case, the reference pixels are typically composed of the pixels of the neighboring blocks nearest to the current block (a in Figure 11), but may also include other pixels (b in Figure 11 and the pixels of other outer lines). That is, at least one of the following can be used: the first pixel line a adjacent to the current block, the second pixel line b adjacent to the first pixel line, the third pixel line adjacent to the second pixel line, or the fourth pixel line adjacent to the third pixel line. For example, depending on the encoding / decoding settings, the multiple pixel lines can include all of the first to fourth pixel lines, or only the pixel lines other than the third pixel line. Alternatively, the multiple pixel lines can include only the first pixel line and the fourth pixel line.
[0339] The current block can perform intra-prediction by selectively referencing any one of the multiple pixel lines. This selection can be based on an index (refIdx) signaled by the encoding device. Alternatively, any one of the multiple pixel lines can be selectively used based on the size, shape, or segmentation type of the current block, whether the intra-prediction mode is non-directional, the angle of the intra-prediction mode, etc. For example, when the intra-prediction mode is planar or DC mode, only the first pixel line can be used. Alternatively, when the size (width or height) of the current block is less than or equal to a predetermined threshold, only the first pixel line can be used. Alternatively, when the angle of the intra-prediction mode is greater than (or less than) a predetermined threshold angle, only the first pixel line can be used. The threshold angle can be the angle of the intra-prediction mode corresponding to mode 2 or mode 66 of the aforementioned prediction mode candidate group.
[0340] On the other hand, pixels adjacent to the current block can be classified into at least one reference pixel layer. For example, the pixels closest to the current block can be ref_0 (pixels whose pixel distance from the boundary pixels of the current block is 1; p(-1, -1) to p(2m-1, -1), p(-1, 0) to p(-1, 2n-1)), the next adjacent pixels (pixels whose pixel distance from the boundary pixels of the current block is 2; p(-2, -2) to p(2m, -2), p(-2, -1) to p(-2, 2n)) can be ref_1, the next adjacent pixels (pixels whose pixel distance from the boundary pixels of the current block is 3; p(-3, -3) to p(2m+1, -3), p(-3, -2) to p(-3, 2n+1)) can be ref_2, and so on. That is, reference pixels can be classified into multiple reference pixel layers based on their pixel distance from the boundary pixels of the current block.
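The selection rules for the reference pixel line can be sketched as a small function. The function name `select_reference_line`, the string mode labels, the default set of allowed lines (first, second, and fourth, written as 0, 1, 3), and the threshold value are assumptions for illustration; the angle-based rule is omitted because the text leaves the direction of the comparison open.

```python
def select_reference_line(ref_idx, mode, width, height,
                          allowed_lines=(0, 1, 3), min_size=4):
    """Return the pixel line (0 = nearest) referenced by the current block.

    ref_idx indexes into allowed_lines, mirroring a signaled refIdx.
    """
    if mode in ("planar", "DC"):
        return 0  # non-directional modes use only the first pixel line
    if min(width, height) <= min_size:
        return 0  # small blocks use only the first pixel line (assumed threshold)
    return allowed_lines[ref_idx]


print(select_reference_line(2, "angular", 16, 16))
print(select_reference_line(2, "planar", 16, 16))
```

Under these rules a decoder could skip parsing refIdx whenever the first pixel line is implied, although whether refIdx is signaled in those cases is not stated in the text.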
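The coordinate ranges listed for ref_0, ref_1, and ref_2 follow one pattern in the layer index k, which the sketch below generalizes. The function name `reference_layer_coords` is hypothetical, m and n are used as in the ranges above, and the extension to k greater than 2 is an extrapolation of that pattern.

```python
def reference_layer_coords(m, n, k):
    """List the (x, y) positions of reference pixel layer ref_k.

    With d = k + 1 as the pixel distance from the block boundary, the top
    row runs from p(-d, -d) to p(2m-2+d, -d) and the left column runs from
    p(-d, -d+1) to p(-d, 2n-2+d), matching the ranges given for ref_0..ref_2.
    """
    d = k + 1
    top = [(x, -d) for x in range(-d, 2 * m - 1 + d)]
    left = [(-d, y) for y in range(-d + 1, 2 * n - 1 + d)]
    return top + left


coords = reference_layer_coords(4, 4, 1)
print(coords[0], coords[-1], len(coords))
```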
[0341] Additionally, reference pixel layers can be set differently for each adjacent block. For example, when the current block and the block adjacent to the top are used as reference blocks, reference pixels based on layer ref_0 can be used, and when the block adjacent to the top is used as the reference block, reference pixels based on layer ref_1 can be used.
[0342] At this point, the reference pixel set typically used during intra-frame prediction belongs to the neighboring blocks adjacent to the current block. These neighboring blocks are located at the lower left, left, upper left, top, and upper right of the current block, and the pixels belong to layer ref_0 (the pixels closest to the boundary pixels). Unless otherwise specified, reference pixels are assumed to be these pixels. However, only a portion of the pixels belonging to the aforementioned neighboring blocks can be used as the reference pixel set, or pixels belonging to two or more layers can be used as the reference pixel set. Here, the reference pixel set or layer can be implicitly determined (preset in the encoding / decoding device) or explicitly determined (information for the determination can be received from the encoding device).
[0343] Here, the description is based on the premise that the maximum number of supported reference pixel layers is 3, but it can also have a value greater than this. The number of reference pixel layers and the number of reference pixel sets (or, can be called reference pixel candidate groups) based on the positions of neighboring blocks can be set differently according to the size, shape, prediction mode, image type (I / P / B, where the image is a picture, slice, tile, etc.), color components, etc., and can include relevant information in units of sequence, picture, slice, tile.
[0344] This invention is described under the premise of assigning a low index (incrementing by 1 from 0) to the reference pixel layer closest to the current block, but the invention is not limited thereto. Furthermore, the relevant information regarding the reference pixel configuration described later can be generated under the above index settings (such as assigning short bits of binarization to the low index when selecting multiple reference pixels as a set).
[0345] Additionally, when two or more reference pixel layers are supported, the reference pixels included in the two or more reference pixel layers can be used for weighted averaging.
[0346] For example, a prediction block can be generated using reference pixels obtained as a weighted sum of pixels located in layer ref_0 (nearest neighbor pixel layer) and layer ref_1 (next pixel layer) of Figure 11. Here, depending on the prediction mode (e.g., the direction of the prediction mode), the pixels to which the weighted sum is applied in each reference pixel layer can be not only integer-unit pixels but also fractional-unit pixels. Furthermore, a single prediction block can be obtained by applying weights (e.g., 7:1, 3:1, 2:1, 1:1) to the prediction block obtained using reference pixels of the first reference pixel layer and the prediction block obtained using reference pixels of the second reference pixel layer, respectively. Here, a higher weight can be assigned to the prediction block obtained from the reference pixel layer closer to the current block.
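The weighting of two prediction blocks can be sketched as below. This is a minimal illustration, assuming integer sample values and round-to-nearest division; the rounding rule and the function name are assumptions, not part of the description.

```python
def blend_prediction_blocks(pred_ref0, pred_ref1, w0=3, w1=1):
    """Weighted average of two prediction blocks with weights w0:w1.

    pred_ref0 is the block predicted from layer ref_0 and pred_ref1 the
    block predicted from layer ref_1; the nearer layer gets the larger weight.
    """
    total = w0 + w1
    return [
        # Adding total // 2 before the integer division rounds to nearest (assumed).
        [(w0 * a + w1 * b + total // 2) // total for a, b in zip(row0, row1)]
        for row0, row1 in zip(pred_ref0, pred_ref1)
    ]
```

With a weight of 3:1, samples 100 (from ref_0) and 60 (from ref_1) blend to 90; with 1:1 they blend to 80.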
[0347] Typically, for the example above, the nearest pixel of the adjacent block can be used as the reference pixel, but it is not limited to this. For example, there can be a variety of cases (e.g., selecting ref_0 and ref_1 as reference pixel layers and generating the predicted pixel value by performing a weighted sum with ref_0 and ref_1, i.e., the implicit case).
[0348] Furthermore, reference pixel configuration information (e.g., selection information for reference pixel layers or sets) may not include preset information (e.g., when the reference pixel layer is preset to ref_0), and may be configured as ref_1, ref_2, ref_3, etc., but is not limited to this.
[0349] The above examples have described some cases of reference pixel configuration, which can be combined with various encoding / decoding information to determine the intra-frame prediction settings. In this case, the encoding / decoding information includes the image type, color component, size and shape of the current block, prediction mode (the type of prediction mode (directional, non-directional), the direction of the prediction mode (vertical, horizontal, diagonal 1, diagonal 2, etc.)), etc., and the intra-frame prediction settings (in this example, the reference pixel configuration settings) can be determined based on the encoding / decoding information of adjacent blocks and on the combination of the encoding / decoding information of the current block and adjacent blocks.
[0350] Figure 12 is a schematic diagram illustrating the reference pixel range used for intra-frame prediction. Specifically, the reference pixel range is determined based on the block size, shape, prediction mode configuration (in this example, the angle information of the prediction mode), etc. The arrows in Figure 12 indicate the pixels used for prediction.
[0351] Referring to Figure 12, pixels A, A', B, B', and C are the pixels at the lower right end of the 8×2, 2×8, 8×4, 4×8, and 8×8 blocks, respectively. To perform prediction for these pixels, the reference pixel range of each block can be determined from the pixels AT, AL, BT, BL, CT, and CL used in the upper and left blocks.
[0352] For example, for pixels A and A' (rectangular blocks), the reference pixels are located in the range of p(0, -1) to p(9, -1), p(-1, 0) to p(-1, 9), and p(-1, -1). For pixels B and B' (rectangular blocks), the reference pixels are located in the range of p(0, -1) to p(11, -1), p(-1, 0) to p(-1, 11), and p(-1, -1). For pixel C (square block), the reference pixels are located in the range of p(0, -1) to p(15, -1), p(-1, 0) to p(-1, 15), and p(-1, -1).
[0353] The range information of the reference pixels obtained through the above process (e.g., P(-1, -1), P(M+N-1, -1), P(-1, N+M-1), etc.) can be used in the intra-frame prediction process (e.g., reference pixel filtering, prediction pixel generation process, etc.). In addition, the cases supporting reference pixels are not limited to the above cases, and there can be various other cases.
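The ranges listed above follow directly from the block dimensions M and N. A minimal sketch, with an illustrative function name and return layout:

```python
def reference_pixel_range(m, n):
    """Reference pixel range of an M x N block as (start, end) coordinates."""
    last = m + n - 1  # index of the last reference pixel along each side
    return {
        "top": ((0, -1), (last, -1)),    # p(0, -1) .. p(M+N-1, -1)
        "left": ((-1, 0), (-1, last)),   # p(-1, 0) .. p(-1, N+M-1)
        "corner": (-1, -1),              # p(-1, -1)
    }
```

For the 8×2 and 2×8 blocks this gives p(9, -1) and p(-1, 9) as the end points, for 8×4 and 4×8 it gives 11, and for 8×8 it gives 15, in agreement with the example for Figure 12.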
[0354] The reference pixel configuration unit for intra-frame prediction may include a reference pixel generation unit, a reference pixel interpolation unit, a reference pixel filtering unit, etc., and may include all or part of the above configurations.
[0355] The reference pixel configuration unit can determine the availability of reference pixels to classify them into available and unavailable reference pixels. For example, if a block (or a candidate block of reference pixels) at a preset position is available, the corresponding block can be used as a reference pixel, and if it is unavailable, the block cannot be used as a reference pixel.
[0356] A reference pixel is deemed unavailable if at least one of the following conditions is met: it is located outside the image boundary; it does not belong to the same segmentation unit as the current block (e.g., slice, tile, etc.); encoding / decoding is incomplete; or its use is restricted according to encoding / decoding settings. Conversely, a pixel is deemed available if none of the above conditions are met.
[0357] Additionally, the use of reference pixels can be restricted by encoding / decoding settings. For example, the use of reference pixels can be restricted based on whether constrained intra-prediction is performed (e.g., constrained_intra_pred_flag). Constrained intra-prediction can be performed when error-robust encoding / decoding is required to withstand external factors such as the communication environment, or when attempting to prevent the use of blocks that have been referenced from other images and restored as reference pixels.
[0358] When the restricted intra-prediction is deactivated (e.g., constrained_intra_pred_flag = 0 in I-frame or P or B-frame types), all reference pixel candidate blocks are available, and when activated (e.g., constrained_intra_pred_flag = 1 in P or B-frame types), it can be determined whether to use the reference pixels of the corresponding block based on the encoding mode (intra-frame or inter-frame) of the reference pixel candidate block. That is, if the encoding mode of the block is Intra, the block can be used regardless of whether restricted intra-prediction is activated. In the case of Inter, it is determined whether it can be used (deactivated) or is unavailable (activated) based on whether restricted intra-prediction is activated.
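The availability conditions and the constrained intra-prediction rule above can be combined in a short sketch. The boolean inputs and the string values "intra" / "inter" are hypothetical representations chosen for illustration; only constrained_intra_pred_flag is named in the description.

```python
def is_reference_block_available(inside_picture, same_partition_unit,
                                 reconstructed, coding_mode,
                                 constrained_intra_pred_flag):
    """Decide whether a reference pixel candidate block can be used."""
    # Unavailable if outside the picture, in a different slice/tile,
    # or not yet encoded / decoded.
    if not (inside_picture and same_partition_unit and reconstructed):
        return False
    # With constrained intra-prediction active, inter-coded blocks are excluded;
    # intra-coded blocks remain usable regardless of the flag.
    if constrained_intra_pred_flag and coding_mode == "inter":
        return False
    return True
```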
[0359] Additionally, restricted intra-frame prediction can be applied based on the encoding mode of the reconstructed block corresponding to the current block in another color space. For example, if the current block belongs to some color difference components Cb, Cr, its availability can be determined based on the encoding mode of the block that has completed encoding / decoding of the luminance component Y corresponding to the current block. The above example could be an example of using a reconstructed block from another color space as a reference pixel. Alternatively, it could be an example of determining the encoding mode independently based on the color space.
[0360] At this point, when the reference pixel candidate block is encoded / decoded using some prediction method (e.g., in the current image, predicted by block matching or template matching, etc.), it can be determined whether to use the reference pixel based on the encoding / decoding settings.
[0361] As an example, when encoding / decoding is performed using the prediction method and the encoding mode is set to Intra, the corresponding block can be determined to be available. Alternatively, even with Intra, special cases can be allowed to make it unavailable.
[0362] For example, when performing encoding / decoding using the prediction method, a corresponding block can be determined to be unavailable when the encoding mode is set to Inter. Alternatively, even with Inter, special cases can be allowed to make it available.
[0363] That is, it can be determined whether to make exceptions for cases where the use is determined by the encoding mode, based on the encoding / decoding settings.
[0364] Restricted intra-frame prediction can be a setting applied to some image types (e.g., P or B slice / tile types, etc.).
[0365] Based on reference pixel availability, candidate reference pixels can be categorized into three cases: all candidate reference pixels are usable, some candidate reference pixels are usable, and none of the candidate reference pixels are usable. Except for the case where all candidate reference pixels are usable, the reference pixels at unavailable candidate positions can be filled or generated.
[0366] When a candidate reference pixel block is available, the pixel at a preset position in that block (assuming in this example that the pixel is adjacent to the current block) can be stored in the reference pixel memory of the current block. At this time, the pixel data at the corresponding block position can be stored in the reference pixel memory through processes such as direct copying or reference pixel filtering.
[0367] When a reference pixel candidate block is unavailable, the pixels obtained through the reference pixel generation process can be stored in the reference pixel memory of the current block.
[0368] In summary, reference pixels can be configured when the reference pixel candidate block is available, and reference pixels can be generated when the reference pixel candidate block is unavailable.
[0369] The following shows examples of using various methods to fill reference pixels at unavailable block locations.
[0370] For example, a reference pixel can be generated using any pixel value, and it can be a pixel value that belongs to a range of pixel values (e.g., a value derived from the minimum, maximum, median, etc., of a pixel value adjustment process based on bit depth or pixel value range information of the image). Specifically, this could be an example of what to do when all candidate reference pixel blocks are unavailable.
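As one illustration of a value derived from the bit depth, the sketch below uses the midpoint of the sample range. Choosing the midpoint (rather than the minimum or maximum) is an assumption made for this example.

```python
def default_reference_value(bit_depth):
    """Midpoint of the sample range [0, 2**bit_depth - 1] for a given bit depth."""
    return 1 << (bit_depth - 1)
```

For 8-bit content this yields 128, and for 10-bit content 512.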
[0371] Alternatively, reference pixels can be generated from a region whose encoding / decoding has been completed. Specifically, reference pixels can be generated from at least one available block adjacent to the unavailable block. In this case, at least one of extrapolation, interpolation, copying, etc. can be used, and the direction of reference pixel generation (or copying, extrapolation) can be clockwise or counterclockwise and can be determined according to the encoding / decoding settings. For example, the direction of reference pixel generation within a block can follow a preset direction or a direction adaptively determined based on the position of the unavailable block. Alternatively, for the region corresponding to the current block in another color space, the same method as in the above example can be used. The difference is that, whereas the process in the current color space fills the adjacent reference pixels of the current block, the process in the other color space fills the M×N block corresponding to the m×n current block. Therefore, the corresponding region can be generated using various methods, including the method described above (e.g., extrapolation of surrounding pixels in the vertical, horizontal, or diagonal direction, interpolation such as Planar, averaging, etc., where the filling direction runs from the surrounding pixels of the block corresponding to the current block toward the inside of that block). This example can be a case where a prediction mode that generates a prediction block from another color space is kept in the candidate group rather than excluded from it.
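Filling by copying along one scan direction can be sketched as follows. The sketch assumes the reference pixels are arranged in a one-dimensional list in a fixed scan order, with None marking unavailable positions; this data layout, the single scan direction, and the fallback to a default value are illustrative choices.

```python
def fill_unavailable(ref, default):
    """Fill None entries by copying the nearest available pixel in scan order."""
    filled = list(ref)
    # Locate the first available pixel; it seeds any leading unavailable run.
    first = next((v for v in filled if v is not None), None)
    if first is None:
        # No candidate is available: fall back to the default value.
        return [default] * len(filled)
    prev = first
    for i, v in enumerate(filled):
        if v is None:
            filled[i] = prev   # copy from the previous available pixel
        else:
            prev = v
    return filled
```

For instance, `fill_unavailable([None, None, 50, None, 70, None], 128)` gives `[50, 50, 50, 50, 70, 70]`.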
[0372] Furthermore, after configuring the reference pixels through the confirmation process of their availability, reference pixels with fractional units can be generated through linear interpolation. Alternatively, the reference pixel interpolation process can be performed after the reference pixel filtering process. Or, the filtering process can be performed only for the configured reference pixels. In short, it can be performed before the prediction block generation process.
[0373] At this point, in horizontal, vertical, and some diagonal modes (e.g., bottom right diagonal, bottom left diagonal, top right diagonal), as well as in non-directional, color, and color copy modes, the interpolation process is not performed. However, in other modes (other diagonal modes), interpolation can be performed.
[0374] The interpolation accuracy can be determined based on the supported candidate groups of prediction modes (or the total number of prediction modes), prediction mode configuration (e.g., prediction mode orientation angle, prediction mode interval), etc.
[0375] For reference pixel interpolation in fractional units, a preset filter (e.g., a 2-tap linear interpolation filter) can be used, or one of a group of multiple filter candidates (e.g., a 4-tap cubic filter, a 4-tap Gaussian filter, a 6-tap Wiener filter, or an 8-tap Kalman filter).
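A 2-tap linear interpolation filter for a fractional position can be sketched as below. The 1/32 sample precision is an assumption made for this example; the description leaves the interpolation accuracy to the prediction mode configuration.

```python
def interpolate_2tap(a, b, frac, precision_bits=5):
    """Linear interpolation between integer-position reference pixels a and b.

    frac is the fractional offset from a toward b in units of
    1 / 2**precision_bits (1/32 by default, an assumed precision).
    """
    scale = 1 << precision_bits
    # Weighted sum with a rounding offset of half the scale.
    return ((scale - frac) * a + frac * b + (scale >> 1)) >> precision_bits
```

With a = 100 and b = 132, an offset of 8/32 gives 108 and an offset of 16/32 gives 116.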
[0376] When using one of multiple filter candidate groups, filter selection information can be generated explicitly or determined implicitly, and can be determined based on encoding / decoding settings (e.g., interpolation precision, block size, shape, prediction mode, etc.).
[0377] For example, the interpolation filter to be used can be determined based on the range of block size, the interpolation accuracy, and the characteristics of the prediction mode (e.g., orientation information).
[0378] In detail, depending on the block size range, a preset interpolation filter a can be used in some range A, a preset interpolation filter b can be used in some range B, and one of multiple interpolation filters c can be used in some range C, and one of multiple interpolation filters d can be used in some range D, and a preset interpolation filter can be used in some ranges, and one of multiple interpolation filters can be used in some ranges. In this case, using one interpolation filter is implicit, using one of multiple interpolation filters is explicit, and the size of the block for segmenting the block size range can be M×N (in this example, M and N are 4, 8, 16, 32, 64, 128, etc.; that is, M and N can be the minimum or maximum value of each block size range).
[0379] The interpolation-related information can be included in units such as video, sequence, image, slice, tile, and block. The interpolation process can be performed in the reference pixel configuration unit or in the prediction block generation unit.
[0380] Additionally, after configuring the reference pixels, filtering can be performed on them to improve prediction accuracy by reducing the degradation remaining after the encoding / decoding process. In this case, a low-pass filter can be used. Whether to apply filtering can be determined based on the encoding / decoding settings. If filtering is applied, either fixed filtering or adaptive filtering can be used, and the encoding / decoding settings can be defined based on block size, shape, prediction mode, etc.
[0381] Fixed filtering refers to applying a preset filter to the reference pixel filtering section, while adaptive filtering refers to applying one of a plurality of filters to the reference pixel filtering section. In the case of adaptive filtering, one of the plurality of filters can be implicitly determined according to the encoding / decoding settings, or selection information can be explicitly generated, and the filter candidate group can include filters with 3 taps (e.g., [1, 2, 1] / 4) or 5 taps (e.g., [2, 3, 6, 3, 2]).
[0382] As an example, filtering can be omitted under certain settings (block range A).
[0383] As an example, filtering can be left unapplied in some settings (block range B, some modes C), and filtering can be applied using a preset filter (3-tap filter) in some settings (block range B, some modes D).
[0384] For example, filtering can be disabled in some settings (block range E, some modes F), and filtering can be applied using a preset filter (3-tap filter) in some settings (block range E, some modes G). Filtering can be applied using a preset filter (5-tap filter) in some settings (block range E, some modes H). Filtering can be performed by selecting one of multiple filters in some settings (block range E, some modes I).
[0385] As an example, filtering can be applied using a predefined filter (5-tap filter) under certain settings (block range J, some modes K), and further filtering can be applied using a preset filter (3-tap filter). That is, multiple filtering processes can be performed. In detail, further filtering can be applied based on the results of previous filtering.
[0386] In the above examples, the block size used to divide the block size range can be M×N (in this example, M and N are 4, 8, 16, 32, 64, 128, etc.; that is, M and N can be the minimum or maximum value of each block size range). Furthermore, the prediction modes can be broadly classified into directional modes, non-directional modes, color modes, color copy modes, etc., and more specifically into horizontal or vertical modes / diagonal modes (45-degree intervals) / mode 1 adjacent to the horizontal or vertical mode / mode 2 adjacent to the horizontal or vertical mode (with a slightly larger mode interval than the former), etc. That is, as described above, whether filtering is applied and the type of filtering can be determined based on the classified mode.
[0387] Furthermore, the examples above illustrate the application of adaptive filtering based on multiple factors such as block range and prediction mode. However, these multiple factors are not always necessary, and there can also be examples of performing adaptive filtering based on at least one factor. Additionally, various transformation examples are possible beyond those described above, and reference pixel filter information can be included at the level of video, sequence, image, slice, tile, or block.
[0388] The filtering described above can be selectively performed based on a predetermined flag. Here, the flag can indicate whether filtering is performed on the reference pixels used for intra-prediction. The flag can be encoded and signaled by the encoding device. Alternatively, the flag can be derived in the decoding device based on the encoding parameters of the current block. The encoding parameters may include at least one of the following: the location / region of the reference pixels, the block size, the component type, whether intra-prediction is applied on a sub-block basis, and the intra-prediction mode.
[0389] For example, if the reference pixel of the current block is the first pixel line adjacent to the current block, filtering can be performed on the reference pixel; otherwise, filtering can be omitted. Alternatively, if the number of pixels belonging to the current block is greater than a predetermined threshold, filtering can be performed on the reference pixel; otherwise, filtering may not be performed. The threshold is a value pre-agreed upon by the encoding / decoding device and can be an integer of 16, 32, 64, or larger. Alternatively, if the current block is larger than a predetermined threshold size, filtering can be performed on the reference pixel; otherwise, filtering may not be performed. The threshold size can be expressed as M×N and is a value pre-agreed upon by the encoding / decoding device, where M and N can be integers of 8, 16, 32, or larger. The threshold quantity or threshold size can be set to determine whether to perform filtering on the reference pixel solely by one of the threshold quantity or threshold size or a combination thereof. Alternatively, if the current block is a luminance component, filtering can be performed on the reference pixel; otherwise, filtering may not be performed. Alternatively, if the current block does not perform the above intra-prediction for sub-block units (i.e., the current block is not divided into multiple sub-blocks), then filtering is performed on the reference pixels; otherwise, filtering may not be performed on the reference pixels. Alternatively, if the intra-prediction mode of the current block is a non-directional mode or a predetermined directional mode, filtering may be performed on the reference pixels; otherwise, filtering may not be performed on the reference pixels. Here, the non-directional mode can be a planar mode or a DC mode. However, in the DC mode within the non-directional mode, filtering of the reference pixels may be restricted to not being performed. 
The directional mode can be an intra-prediction mode that references integer-position pixels. For example, the directional mode may include at least one of the intra-prediction modes corresponding to modes -14, -12, -10, -6, 2, 18, 34, 50, 66, 72, 78, and 80 shown in Figure 9. However, the directional modes can be restricted to exclude the horizontal and vertical modes corresponding to modes 18 and 50, respectively.
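The derivation criteria above are presented as alternatives. The sketch below combines all of them with a logical AND purely for illustration; that combination, the pixel-count threshold of 32, and the numbering 0 = planar, 1 = DC are assumptions. The set of integer-position modes is taken from the text, with modes 18 and 50 excluded.

```python
# Assumed numbering for non-directional modes: 0 = planar, 1 = DC.
PLANAR_MODE, DC_MODE = 0, 1
# Integer-position directional modes named in the text, minus modes 18 and 50.
INTEGER_ANGLE_MODES = {-14, -12, -10, -6, 2, 34, 66, 72, 78, 80}


def derive_filter_flag(ref_line_idx, width, height, is_luma,
                       uses_subblocks, intra_mode, min_pixels=32):
    """Derive the reference pixel filtering flag from encoding parameters."""
    if ref_line_idx != 0:            # only the first pixel line is filtered
        return False
    if width * height <= min_pixels:  # block must exceed the pixel-count threshold
        return False
    if not is_luma:                  # only the luminance component is filtered
        return False
    if uses_subblocks:               # no filtering for sub-block based prediction
        return False
    if intra_mode == DC_MODE:        # filtering restricted in DC mode
        return False
    return intra_mode == PLANAR_MODE or intra_mode in INTEGER_ANGLE_MODES
```

A decoder following a different combination of these criteria (for example, only the pixel line and block size checks) would simply drop the corresponding tests.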
[0390] When filtering a reference pixel according to the flag, filtering may be performed based on a filter predefined in the encoding / decoding device. The number of taps of the filter may be 1, 2, 3, 4, 5, or greater. The number of filter taps may be variably determined according to the position of the reference pixel. For example, a 1-tap filter may be applied to a reference pixel corresponding to at least one side of the lowermost, uppermost, leftmost, or rightmost side of a pixel line, and a 3-tap filter may be applied to the remaining reference pixels. Additionally, the filter strength may be variably determined according to the position of the reference pixel. For example, a filtering strength s1 may be applied to a reference pixel corresponding to at least one side of the lowermost, uppermost, leftmost, or rightmost side of a pixel line, and a filtering strength s2 may be applied to the remaining reference pixels (s1 < s2). The filter strength may be signaled in the encoding device or may be determined based on the above encoding parameters. When an n-tap filter is applied to a reference pixel, the filter may be applied to the current reference pixel and (n - 1) surrounding reference pixels. The surrounding reference pixels may represent pixels located in at least one of the upper, lower, left, or right directions of the current reference pixel. The surrounding reference pixels may belong to the same pixel line as the current reference pixel, and a part of the surrounding reference pixels may belong to a pixel line different from the current reference pixel.
[0391] For example, when the current reference pixel is located on the left side of the current block, the surrounding reference pixels may be pixels adjacent in at least one of the upper or lower directions of the current reference pixel. Or, when the current reference pixel is located on the upper side of the current block, the surrounding reference pixels may be pixels adjacent in at least one of the left and right directions of the current reference pixel. Or, when the current reference pixel is located at the upper left corner of the current block, the surrounding reference pixels may be pixels adjacent in at least one of the lower or right directions of the current reference pixel. The ratio between the coefficients of the filter may be [1:2:1], [1:3:1], or [1:4:1].
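The position-dependent filtering described above can be sketched for a single pixel line. The sketch applies a 1-tap filter (a copy) to the two end pixels and a 3-tap filter with coefficient ratio [1:c:1] to the remaining pixels; treating the line as a one-dimensional list and rounding to nearest are assumptions made for this example.

```python
def smooth_reference_line(line, center=2):
    """Filter a reference pixel line with a [1, center, 1] kernel.

    center = 2, 3, or 4 corresponds to the ratios [1:2:1], [1:3:1], [1:4:1].
    The first and last pixels are left unfiltered (1-tap).
    """
    total = center + 2  # sum of the three coefficients
    out = list(line)
    for i in range(1, len(line) - 1):
        acc = line[i - 1] + center * line[i] + line[i + 1]
        out[i] = (acc + total // 2) // total  # normalize with rounding (assumed)
    return out
```

For example, `smooth_reference_line([10, 20, 40, 80])` returns `[10, 23, 45, 80]`, and a constant line is left unchanged for any of the three ratios.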
[0392] The prediction block generation unit may generate a prediction block according to at least one prediction mode and use reference pixels based on an intra prediction mode. At this time, the reference pixels may be used in methods such as extrapolation (directional mode) according to the prediction mode, and may be used in methods such as interpolation or average (DC) or copy (non-directional mode). Meanwhile, as described above, the current block may use the filtered reference pixels or may use the unfiltered reference pixels.
[0393] Figure 13 is a diagram showing a block adjacent to the current block regarding the generation of a prediction block.
[0394] For example, among the directional modes, a mode between the horizontal mode and some diagonal modes (top right diagonal, including diagonals other than horizontal) can use the reference pixels of the bottom left block + left block (Ref_BL and Ref_L in Figure 13), the horizontal mode can use the reference pixels of the left block (Ref_L in Figure 13), a mode between the horizontal and vertical modes can use the reference pixels of the left block + top-left block + top block (Ref_L, Ref_TL, and Ref_T in Figure 13), the vertical mode can use the reference pixels of the top block (Ref_T in Figure 13), and a mode between the vertical mode and some diagonal modes (bottom left diagonal, including diagonals other than vertical) can use the reference pixels of the top block + top right block (Ref_T and Ref_TR in Figure 13). Alternatively, a non-directional mode can use the reference pixels of the left and top blocks (Ref_L and Ref_T in Figure 13), or of the bottom left block, left block, top left block, top block, and top right block (Ref_BL, Ref_L, Ref_TL, Ref_T, and Ref_TR in Figure 13). Alternatively, a color space-dependent mode (color copy mode) can use, as reference pixels, a reconstructed block of another color space (not shown in Figure 12, but referred to as Ref_Col in this invention; it denotes the collocated reference, that is, the block of a different color space at the same time instant).
[0395] Reference pixels used for intra-frame prediction can be categorized according to several concepts. For example, they can be categorized into first reference pixels and second reference pixels, where a first reference pixel can be a pixel directly used to generate the prediction value of the current block, and a second reference pixel can be a pixel indirectly used to generate the prediction value of the current block. Alternatively, a first reference pixel can be a pixel used to generate the prediction values of all pixels in the current block, and a second reference pixel can be a pixel used to generate the prediction values of some pixels in the current block. Alternatively, a first reference pixel can be a pixel used to generate the primary prediction value of the current block, and a second reference pixel can be a pixel used to generate the secondary prediction value of the current block. Or, a first reference pixel can be a pixel that is (unconditionally) located in the region at the starting point of the prediction direction of the current block, and a second reference pixel can be a pixel that is not (necessarily) located at the starting point of the prediction direction of the current block.
[0396] As in the example above, although reference pixels can be distinguished using various definitions, there are also cases where some definitions do not apply depending on the prediction mode. That is, it should be noted that the definitions used to distinguish reference pixels can vary depending on the prediction mode.
[0397] The reference pixels described in the above examples can be the first reference pixels, and second reference pixels can additionally participate in the generation of the prediction block. A mode between some diagonal modes (top right diagonal, including diagonals other than horizontal) and the horizontal mode can use the reference pixels of the top left block + top block + top right block (Ref_TL, Ref_T, and Ref_TR in Figure 13); the horizontal mode can use the reference pixels of the top left block + top block + top right block (Ref_TL, Ref_T, and Ref_TR in Figure 13); the vertical mode can use the reference pixels of the top left block + left block + bottom left block (Ref_TL, Ref_L, and Ref_BL in Figure 13); and a mode between the vertical mode and some diagonal modes (lower left diagonal, including diagonals other than vertical) can use the reference pixels of the top left block + left block + bottom left block (Ref_TL, Ref_L, and Ref_BL in Figure 13). These reference pixels can be used as the second reference pixels. Furthermore, in the non-directional mode and the color copy mode, the prediction block can be generated using the first reference pixels, or using both the first and second reference pixels.
[0398] Additionally, the second reference pixels can be considered to include not only pixels that have already been encoded / decoded, but also pixels in the current block (predicted pixels in this example). That is, a primary prediction value can be a pixel used to generate a secondary prediction value. In this invention, the example that treats already encoded / decoded pixels as the second reference pixels will mainly be described, but the invention is not limited to this, and there may also be modified examples that use pixels that have not yet been encoded / decoded (predicted pixels in this example).
[0399] Prediction blocks can be generated or corrected using multiple reference pixels to compensate for the shortcomings of existing prediction modes.
[0400] For example, a directional mode predicts a block by reflecting its directionality using some reference pixels (first reference pixels), but it may not accurately reflect changes within the block, potentially leading to reduced prediction accuracy. In such cases, prediction accuracy can be improved by using additional reference pixels (second reference pixels) to generate or correct the prediction block.
[0401] Therefore, examples of generating prediction blocks using various reference pixels as described in the examples above will be described in the following examples. However, the present invention is not limited to the cases described in the examples above, and the concepts can be derived and understood from the above definitions even without using terms such as first and second reference pixels.
[0402] Settings for generating prediction blocks using additional reference pixels can be explicitly determined or implicitly set. In the explicit case, units can include video, sequence, image, slice, tile, etc. The following examples will describe the implicit processing case, but the invention is not limited thereto, and other modifications (explicit or mixed cases) are possible.
[0403] Prediction blocks can be generated in various ways depending on the prediction mode. Specifically, the prediction method can be determined based on the position of the reference pixels used in the prediction mode. Alternatively, the prediction method can be determined based on the pixel positions within the block.
[0404] The following explains the horizontal mode.
[0405] For example, when the left block is used as a reference pixel (Ref_L in Figure 13), the prediction block can be generated in the horizontal direction using the nearest neighbor pixels of that block (1300 in Figure 13), e.g., by extrapolation.
[0406] Alternatively, a prediction block can be generated (or corrected; generation may involve the final prediction value; correction may not involve all pixels) using reference pixels adjacent to the current block corresponding to the horizontal direction. Specifically, the nearest neighbor pixels of the corresponding block (1310 in Figure 13; 1320 and 1330 can additionally be considered) can be used to correct the predicted value, and the degree of change or gradient information of those pixels (e.g., differences of pixel values such as R0-T0, T0-TL, T2-TL, T2-T0, T2-T1, etc.) can be fed back to the correction process.
[0407] At this point, the pixels to be corrected can be all pixels in the current block or can be limited to a subset of pixels (e.g., the subset can be determined in units of individual pixels that have no specific shape or lie at irregular positions, as in examples described later, or in units of pixels with a certain shape, such as lines; for ease of description, the examples described later assume that the unit is a line). If the correction is restricted to a subset of pixels, the unit can be determined as at least one line corresponding to the direction of the prediction mode. For example, the pixels corresponding to a through d can be included in the correction target, and the pixels corresponding to e through h can also be included. Additionally, the correction information obtained from the pixels neighboring the block can be applied identically regardless of the position of the line, or it can be applied differently on a line-by-line basis, such that less correction information is applied as the distance from the neighboring pixels increases (e.g., a larger division value can be set for differences such as L1-TL and L0-TL as the distance increases).
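The line-based correction described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the 8-bit clipping range, and the rule that the gradient of the top reference pixels is right-shifted by one additional bit per line are hypothetical choices and are not specified in this description.

```python
def horizontal_predict_with_gradient(left, top, top_left, num_lines=1):
    """Horizontal prediction with gradient correction of the top rows.

    left:      left reference pixels, one per block row
    top:       top reference pixels, one per block column
    top_left:  the top-left reference pixel (TL)
    num_lines: number of pixel lines (rows) to correct (assumed parameter)
    """
    height, width = len(left), len(top)
    # Plain horizontal mode: copy each left reference pixel along its row.
    pred = [[left[y]] * width for y in range(height)]
    for y in range(min(num_lines, height)):
        # Assumption: the divisor doubles for each line farther from the top.
        shift = y + 1
        for x in range(width):
            delta = (top[x] - top_left) >> shift
            pred[y][x] = max(0, min(255, left[y] + delta))
    return pred
```

With `num_lines=1` only the row a through d is adjusted; with `num_lines=2` the row e through h is adjusted as well, with a weaker correction.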
[0408] At this point, the pixels included in the object to be corrected can have only one setting in an image, or can be adaptively determined based on various encoding / decoding elements.
[0409] Taking the adaptive determination case as an example, the pixels to be corrected can be determined based on the block size. In blocks smaller than 8×8, no lines are corrected; for blocks larger than 8×8 and smaller than 32×32, only one pixel line can be corrected; and for blocks larger than 32×32, two pixel lines can be corrected. The definition of the block size range can be derived from the previous description of this invention.
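A size-based selection of this kind can be sketched as below. The comparison by sample count (width times height) and the handling of the exact boundary sizes 8×8 and 32×32 are assumptions made only for this illustration; the description leaves the range definition to the earlier part of the document.

```python
def correction_lines_by_size(width, height):
    """Return how many pixel lines are corrected for a block of this size.

    Assumptions (not fixed by the description): blocks are compared by
    sample count, 8x8 falls into the one-line range, and 32x32 falls
    into the two-line range.
    """
    samples = width * height
    if samples < 8 * 8:
        return 0  # small blocks: no correction
    if samples < 32 * 32:
        return 1  # medium blocks: one pixel line
    return 2      # large blocks: two pixel lines
```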
[0410] Alternatively, the pixels to be corrected can be determined based on the shape of the block (e.g., square or rectangle; specifically, a rectangle that is longer horizontally or vertically). For example, in the case of an 8×4 block, two pixel lines (a to h in Figure 13) can be corrected, and in the case of a 4×8 block, one pixel line (a to d in Figure 13) can be corrected. This is because, when the horizontal mode is determined for a horizontally elongated 8×4 block, the directionality of the current block may depend more on the upper block. Conversely, when the horizontal mode is determined for a vertically elongated 4×8 block, the directionality of the current block may not depend as much on the upper block. Furthermore, the opposite setting is also possible.
[0411] Alternatively, the pixels to be corrected can be determined based on the prediction mode. In the horizontal or vertical mode, a pixel lines can be the correction target, while in other modes, b pixel lines can be the correction target (here a and b denote numbers of lines). As mentioned above, in some modes (e.g., the non-directional DC mode, the color copying mode, etc.), the pixels to be corrected are not specified as rectangular units such as lines but in units of individual pixels (e.g., a to d, e, i, m). This will be described in detail in the examples described later.
[0412] In addition to the above descriptions, adaptive settings can be applied based on additional encoding / decoding elements. Although the above descriptions are limited to the horizontal mode, the same or similar settings can be applied to other modes as well. Furthermore, the above examples can be implemented through a combination of multiple elements rather than a single encoding / decoding element.
[0413] In the case of the vertical mode, since it can be derived by applying different directions to the prediction method used for the horizontal mode, a detailed description of it is omitted. Additionally, in the examples below, content that overlaps with the description of the horizontal mode is omitted.
[0414] The following explains the case of the diagonal up right mode.
[0415] For example, when the left block and the bottom-left block are used as reference pixels (first reference pixels or main reference pixels; Ref_L and Ref_BL in Figure 13), the prediction block can be generated in the diagonal direction using the nearest neighbor pixels of the corresponding blocks (1300 and 1340 in Figure 13), e.g., by extrapolation.
[0416] Alternatively, reference pixels adjacent to the current block at the position opposite to the diagonal (second reference pixels or auxiliary reference pixels; Ref_T and Ref_TR in Figure 13) can be used to generate (or correct) the prediction block. Specifically, the nearest neighbor pixels of the corresponding blocks (1310 and 1330 in Figure 13; 1320 can additionally be considered) can be used to correct the predicted value, and a weighted average of the auxiliary reference pixels and the main reference pixels can be fed back to the correction process. For example, the weights can be obtained based on at least one of the distance differences, along the x-axis or y-axis, between the predicted pixel and the main reference pixel or between the predicted pixel and the auxiliary reference pixels. Examples of the ratio of the weights applied to the main reference pixel and the auxiliary reference pixel range from 15:1 to 8:8. If there are two or more auxiliary reference pixels, the auxiliary reference pixels can have the same weight, as in 14:1:1, 12:2:2, 10:3:3, and 8:4:4, or different weights, as in 12:3:1, 10:4:2, and 8:6:2. In the latter case, the different weights can be determined according to how close each auxiliary reference pixel is to the direction of the current prediction mode, based on gradient information and the like; that is, the gradient between the predicted pixel and each auxiliary reference pixel is checked to determine which one is closer to the gradient of the current prediction mode.
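The weighted average of a main reference pixel and one or more auxiliary reference pixels can be illustrated with the short sketch below. The function name and the round-to-nearest integer division are assumptions for illustration; the weight tuples are taken from the example ratios given above, which all sum to 16.

```python
def weighted_blend(main, aux, weights):
    """Blend one main reference value with auxiliary reference values.

    main:    value derived from the main (first) reference pixel
    aux:     list of values from the auxiliary (second) reference pixels
    weights: integer weights, main first, e.g. (15, 1) or (12, 3, 1)
    """
    samples = [main] + list(aux)
    if len(samples) != len(weights):
        raise ValueError("one weight is required per sample")
    total = sum(weights)
    acc = sum(s * w for s, w in zip(samples, weights))
    # Integer division with a rounding offset (assumed rounding rule).
    return (acc + total // 2) // total
```

For example, `weighted_blend(100, [116], (15, 1))` stays close to the main reference value, while `weighted_blend(100, [116], (8, 8))` gives both pixels equal influence.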
[0417] At this point, the filtering can have a single setting within an image or can be adaptively determined based on various encoding / decoding elements.
[0418] Taking the adaptive determination case as an example, the pixels to be filtered (e.g., the number of pixels, etc.) can be determined based on the position of the pixel to be corrected. If the prediction mode is a diagonal mode (mode 2 in this example) and the pixel to be corrected is c, then prediction is performed using L3 (the first reference pixel in this example), and correction is performed using T3 (the second reference pixel in this example). That is, it can be a case where a first reference pixel and a second reference pixel are used for the prediction of a single pixel.
[0419] Alternatively, when the prediction mode is a diagonal mode (mode 3 in this example) and the pixel to be corrected is b, prediction is performed using L1* (or L2*; the first reference pixel in this example), which is obtained by interpolating a fractional-unit pixel between L1 and L2, and correction can be performed using T2 (the second reference pixel in this example) or using T3. Alternatively, correction can be performed using both T2 and T3, or using T2* (or T3*), which is obtained by interpolating a fractional-unit pixel between T2 and T3 based on the directionality of the prediction mode. That is, a single pixel can be predicted using one first reference pixel (in this example, assumed to be L1*, which can be regarded as two pixels when the directly used pixels are taken to be L1 and L2, or as two or more pixels depending on the filter used to interpolate L1*) and two second reference pixels (in this example, assumed to be T2 and T3, where L1* can be considered one pixel).
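The use of interpolated reference pixels such as L1* and T2* can be sketched as follows. The two-tap linear filter, the 1/32 fractional precision, the default 12:4 weighting, and all function and parameter names are assumptions chosen for illustration; the description does not fix the interpolation filter or the weights.

```python
def interpolate_fractional(p0, p1, frac, precision_bits=5):
    """Linearly interpolate between two integer-position pixels.

    frac is the fractional offset from p0 toward p1 in units of
    1 / 2**precision_bits (an assumed precision).
    """
    one = 1 << precision_bits
    return ((one - frac) * p0 + frac * p1 + (one >> 1)) >> precision_bits


def predict_pixel(l1, l2, t2, t3, frac, main_weight=12, aux_weight=4):
    """Predict one pixel from L1* (main) and correct it with T2* (auxiliary)."""
    first = interpolate_fractional(l1, l2, frac)   # L1* between L1 and L2
    second = interpolate_fractional(t2, t3, frac)  # T2* between T2 and T3
    total = main_weight + aux_weight
    return (main_weight * first + aux_weight * second + total // 2) // total
```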
[0420] In summary, at least one first reference pixel and at least one second reference pixel can be used for a pixel prediction, which can be determined based on the prediction mode and the position of the predicted pixel.
[0421] If the correction is limited to a limited number of pixels, the corrected pixels can be determined in units of at least one horizontal or vertical line, based on the intra-frame prediction mode direction. For example, pixels corresponding to a, e, i, m, or pixels corresponding to a through d can be included in the correction target. Furthermore, pixels corresponding to b, f, j, n, or pixels corresponding to e through h can also be included in the correction target. In some cases of diagonal up-right, pixels in horizontal units can be corrected, while in some cases of diagonal down-left, pixels in vertical units can be corrected, but this is not a limitation.
[0422] In addition to the above descriptions, adaptive settings can be applied based on additional encoding / decoding elements. The above descriptions focus on the restrictive case of the diagonal up right; however, the same or similar settings can be applied to other modes, not just in the examples above. Furthermore, the above examples can be implemented using combinations of multiple elements instead of a single encoding / decoding element.
[0423] In the case of diagonal down left, since the prediction method for diagonal up right can be obtained simply by applying different directions, a detailed description is omitted.
[0424] The following explains the case of the diagonal up left mode.
[0425] For example, when the left block, top-left block, and top block are used as reference pixels (first reference pixels or main reference pixels; Ref_L, Ref_TL, and Ref_T in Figure 13), the prediction block can be generated in the diagonal direction using the nearest neighbor pixels of the corresponding blocks (1300, 1310, and 1320 in Figure 13), e.g., by extrapolation.
[0426] Alternatively, reference pixels adjacent to the current block at positions matching the diagonal (second reference pixels or auxiliary reference pixels; Ref_L, Ref_TL, and Ref_T in Figure 13, located in the same positions as the main reference pixels) can be used to generate (or correct) the prediction block. Specifically, pixels other than the nearest neighbor pixels of the corresponding blocks (in Figure 13, the pixels to the left of 1300, the pixels to the left of, above, and to the upper left of 1320, the pixels above 1310, etc.) can be used to correct the predicted value, and a weighted average of the auxiliary reference pixels and the main reference pixels, linear extrapolation, or the like can be fed back to the correction process. For example, the ratio of the weights applied to the main reference pixel and the auxiliary reference pixel can range from 7:1 to 4:4. If there are two or more auxiliary reference pixels, the auxiliary reference pixels can have the same weight, as in 14:1:1, 12:2:2, 10:3:3, and 8:4:4, or different weights, as in 12:3:1, 10:4:2, and 8:6:2. In the latter case, the different weights can be determined based on whether each auxiliary reference pixel is adjacent to the main reference pixel.
[0427] If the correction is applied restrictively to a limited number of pixels, the pixels to be corrected can be determined on a unit of horizontal or vertical lines adjacent to the reference pixels used in the prediction mode. In this case, both horizontal and vertical lines can be considered simultaneously, and overlap can be allowed. For example, pixels corresponding to a through d and pixels corresponding to a, e, i, and m (a overlap) can be included in the correction target. Further, pixels corresponding to e through h and pixels corresponding to b, f, j, and n (a, b, e, and f overlap) can be included in the correction target.
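The overlap between horizontal and vertical correction lines can be made concrete with the sketch below. It assumes a 4×4 block whose pixels are labeled a through p in row-major order, which is consistent with the labels used above (a through d form the first row; a, e, i, m form the first column); the function name is hypothetical.

```python
import string


def correction_targets(num_lines, size=4):
    """Return (targets, overlap) as sets of pixel labels.

    Labels a..p are assigned in row-major order over a size x size block
    (an assumed labeling consistent with the examples in the text).
    """
    labels = string.ascii_lowercase[: size * size]
    # Pixels in the first num_lines horizontal lines (rows).
    rows = {labels[y * size + x] for y in range(num_lines) for x in range(size)}
    # Pixels in the first num_lines vertical lines (columns).
    cols = {labels[y * size + x] for y in range(size) for x in range(num_lines)}
    return rows | cols, rows & cols
```

With one line in each direction only pixel a is shared; with two lines, pixels a, b, e, and f are shared, matching the overlaps noted above.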
[0428] The following describes the case of non-directional mode (DC).
[0429] For example, when at least one of the left block, top block, top-left block, top-right block, and bottom-left block is used as a reference pixel, the prediction block can be generated using the nearest neighbor pixels of that block (assumed in this example to be pixels 1300 and 1310 in Figure 13), e.g., by averaging.
[0430] Alternatively, pixels adjacent to the reference pixels (second reference pixels or auxiliary reference pixels; in this example, pixels in Ref_L and Ref_T of Figure 13 at positions identical or similar to those of the main reference pixels, also including pixels at the next-nearest neighbor positions, similar to the case of diagonal up left) can be used to generate (or correct) the prediction block. Specifically, pixels at positions identical or similar to those of the main reference pixels of the corresponding block can be used to correct the predicted value, and a weighted average of the auxiliary reference pixels and the main reference pixels can be fed back to the correction process. For example, the ratio of the weights applied to the main reference pixel and the auxiliary reference pixel can range from 15:1 to 8:8. If there are two or more auxiliary reference pixels, the auxiliary reference pixels can have the same weight, as in 14:1:1, 12:2:2, 10:3:3, and 8:4:4, or different weights, as in 12:3:1, 10:4:2, and 8:6:2.
[0431] At this point, the filtering can have a single setting within an image or can be adaptively determined based on various encoding / decoding elements.
[0432] Taking the adaptive determination case as an example, the filter can be determined based on the block size. For the pixels located at the top-left, top, and left edges of the current block (in this example, it is assumed that the top-left pixel is filtered with the pixels to its left and above it, the pixels at the top edge are filtered with the pixel above them, and the pixels at the left edge are filtered with the pixel to their left), some filtering settings can be used in blocks smaller than 16×16 (in this example, filtering with weight ratios of 8:4:4 and 12:4), and other filtering settings can be used in blocks larger than 16×16 (in this example, filtering with weight ratios of 10:3:3 and 14:2).
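The size-dependent edge filtering for the DC mode can be sketched as follows, using the example weight ratios given above. The DC averaging rule, the comparison by sample count, the treatment of an exactly 16×16 block as the larger category, and the function name are assumptions for illustration only.

```python
def dc_predict_filtered(left, top):
    """DC prediction with size-dependent filtering of the block edges.

    left: nearest left reference pixels, one per block row
    top:  nearest top reference pixels, one per block column
    """
    height, width = len(left), len(top)
    count = height + width
    dc = (sum(left) + sum(top) + count // 2) // count
    # Assumption: blocks are compared by sample count against 16x16.
    small = width * height < 16 * 16
    corner_w, edge_w = ((8, 4, 4), (12, 4)) if small else ((10, 3, 3), (14, 2))
    pred = [[dc] * width for _ in range(height)]
    # Top-left pixel: filtered with its left and top neighbors.
    pred[0][0] = (corner_w[0] * dc + corner_w[1] * left[0]
                  + corner_w[2] * top[0] + 8) >> 4
    # Remaining top-row pixels: filtered with the pixel above.
    for x in range(1, width):
        pred[0][x] = (edge_w[0] * dc + edge_w[1] * top[x] + 8) >> 4
    # Remaining left-column pixels: filtered with the pixel to the left.
    for y in range(1, height):
        pred[y][0] = (edge_w[0] * dc + edge_w[1] * left[y] + 8) >> 4
    return pred
```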
[0433] Alternatively, the filter can be determined based on the shape of the block. For example, for a 16×8 block, the pixels at the top edge of the current block can be filtered (in this example, it is assumed that the pixels to the top left of, above, and to the top right of each such pixel are used; this can be regarded as an example in which the set of pixels used by the filter also changes; the weight ratio of the filter is 10:2:2:2), and the pixels at the left edge of the current block can be filtered (in this example, it is assumed that the pixel to the left of each such pixel is used; the weight ratio of the filter is 12:4). This example assumes that, in a block that is longer in the horizontal direction, the filter can use multiple pixels above the block. Furthermore, the opposite settings can also be implemented.
[0434] If the pixels to be corrected are limited to a few pixels, the corrected pixels can be determined on a unit of horizontal or vertical lines adjacent to the reference pixels used in the prediction mode. In this case, both horizontal and vertical lines can be considered simultaneously, and overlap can be allowed. For example, pixels corresponding to a through d and pixels corresponding to a, e, i, m (a overlap) can be included in the correction target. Further, pixels corresponding to e through h and pixels corresponding to b, f, j, n (a, b, e, f overlap) can be included in the correction target.
[0435] In addition to the above description, adaptive settings can be applied based on additional encoding / decoding elements. Although the above description is limited to the non-directional mode, the same or similar settings can be applied to other modes as well. Furthermore, the examples described above can be implemented using a combination of multiple elements instead of a single encoding / decoding element.
[0436] The following describes the color copying mode.
[0437] In the color copying mode, prediction blocks are generated using a different method than that used in the existing prediction modes, but reference pixels can be used in the same or a similar way to generate (or correct) the prediction blocks. Since the way of obtaining the prediction blocks can be derived from the examples above and those described later, its details are omitted.
[0438] For example, a prediction block can be generated by using (e.g., copying) a block in a different color space that corresponds to the current block as a reference pixel (first reference pixel or main reference pixel).
[0439] Alternatively, reference pixels of blocks adjacent to the current block (second reference pixels or auxiliary reference pixels; Ref_L, Ref_T, Ref_TL, Ref_TR, and Ref_BL in Figure 13) can be used to generate (or correct) the prediction block. Specifically, the nearest neighbor pixels of the corresponding blocks (assumed in this example to be 1300 and 1310 in Figure 13) can be used to correct the predicted value, and a weighted average of the auxiliary reference pixels and the main reference pixels can be fed back to the correction process. For example, the ratio of the weights applied to the main reference pixel and the auxiliary reference pixel can range from 15:1 to 8:8. If there are two or more auxiliary reference pixels, the auxiliary reference pixels can have the same weight, as in 14:1:1, 12:2:2, 10:3:3, and 8:4:4, or different weights, as in 12:3:1, 10:4:2, and 8:6:2.
[0440] Alternatively, reference pixels of blocks adjacent to the block obtained in the other color space (second reference pixels or auxiliary reference pixels; assuming that the picture in Figure 13 is the block in the other color space corresponding to the current block, Ref_L, Ref_T, Ref_TL, Ref_TR, and Ref_BL, as well as Ref_R, Ref_BR, and Ref_B, which are not shown in Figure 13) can be used to generate (or correct) the prediction block. Filtering can be performed on the pixel to be corrected and its surrounding pixels (e.g., a first reference pixel in a different color space or a first reference pixel and a second reference pixel in a different color space; that is, within the block, when applying weighted averaging, a first reference pixel to be corrected and a first reference pixel with applied filtering are needed; at the block boundary, a first reference pixel to be corrected and a first reference pixel with applied filtering are needed), and the result can be fed back to the correction process.
[0441] In cases where both scenarios occur simultaneously, not only pixels from neighboring blocks of the current block can be used for correction, but also pixels from predicted blocks obtained in other color spaces can be used. Furthermore, filtering is performed on the pixel to be corrected and its surrounding pixels (e.g., pixels from neighboring blocks of the target pixel and pixels within the current block adjacent to the target pixel). This filtering is then fed back to the correction process. (For example, at the target correction location, an MxN mask is used to apply filtering; in this case, the mask filters the pixel and all or some of the pixels in the top, bottom, left, right, upper left, upper right, lower left, and lower right regions.)
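Mask-based filtering around a target pixel can be sketched as below. The specific 3×3 mask in the usage note, the policy of skipping positions that fall outside the sample array and renormalizing by the weights actually used, and the function name are all assumptions for illustration; the description only states that an MxN mask covering all or some of the surrounding pixels can be applied.

```python
def apply_mask(samples, y, x, mask):
    """Filter the sample at (y, x) with an M x N integer mask.

    samples: 2-D list holding the pixels available for filtering
             (neighboring reference pixels and/or predicted pixels)
    mask:    2-D list of non-negative integer weights, centered on (y, x)
    """
    m, n = len(mask), len(mask[0])
    acc = 0
    weight = 0
    for dy in range(m):
        for dx in range(n):
            yy = y + dy - m // 2
            xx = x + dx - n // 2
            # Assumption: positions outside the array are simply skipped.
            if 0 <= yy < len(samples) and 0 <= xx < len(samples[0]):
                acc += mask[dy][dx] * samples[yy][xx]
                weight += mask[dy][dx]
    return (acc + weight // 2) // weight
```

A mask such as `[[1, 1, 1], [1, 8, 1], [1, 1, 1]]` keeps half of the weight on the target pixel and spreads the rest over its eight neighbors.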
[0442] This example illustrates the case where filtering is applied after obtaining the predicted value of the current block in another color space. However, it's also possible to use the value that has already been filtered in the corresponding color space before obtaining the predicted value as the predicted value for the current block. In this case, it should be noted that the only difference from the previous example is the order of the filtering steps, and the object being filtered is the same.
[0443] At this point, the filter has only one setting in an image or the filter is adaptively determined based on various encoding / decoding elements.
[0444] For example, in the case of adaptive determination, the filtering settings can be determined according to the prediction mode. Specifically, filtering can be applied adaptively according to the detailed mode within the color copying mode. For example, in some color copying modes (in this example, when one correlation information set is obtained from the adjacent area of the current block and the adjacent area of the corresponding block in the different color space), some filtering settings <1> can be adopted, and in other color copying modes (in this example, when multiple correlation information sets, i.e., a1 and b1, a2 and b2, are obtained, in contrast to the above mode), other filtering settings <2> can be adopted.
[0445] The filtering settings can determine whether filtering is applied. For example, depending on the filtering settings, filtering can be applied <1> or not applied <2>. Alternatively, filter A can be used <1> or filter B can be used <2>. Alternatively, the filtering can be applied to all pixels on the left side and the upper side of the current block, or only to some of the pixels on the left side and the upper side.
[0446] If the pixels to be corrected are limited to some pixels, the pixels to be corrected can be determined in units of horizontal or vertical lines adjacent to the reference pixels used in the prediction mode (in this example, auxiliary reference pixels, unlike in the previous examples). In this case, horizontal and vertical lines can be considered simultaneously, and overlap can be allowed.
[0447] For example, the pixels corresponding to a to d and the pixels corresponding to a, e, i, m (a overlaps) can be included in the correction target. Further, the pixels corresponding to e to h and the pixels corresponding to b, f, j, n (a, b, e, f overlap) can be included in the correction target.
[0448] In summary, the main reference pixels for generating the prediction block can be obtained from the other color space, and the auxiliary reference pixels for correcting the prediction block can be obtained from the blocks adjacent to the current block in the current color space, from the blocks adjacent to the corresponding block in the other color space, or from some pixels of the prediction block of the current block. That is, some pixels in the prediction block can be used to correct other pixels in the prediction block.
[0449] In addition to the above description, adaptive settings can be applied based on additional encoding / decoding elements. Although the above description is limited to the non-directional mode, the same or similar settings can be applied to other modes as well. Furthermore, the above examples can be implemented through a combination of multiple elements rather than a single encoding / decoding element.
[0450] This example uses the correlation between color spaces to obtain the predicted block of the current block, but the blocks used to obtain its correlation are obtained from the neighboring regions of the current block and the neighboring regions of corresponding blocks in different color spaces, so filtering can be applied to the block boundaries.
[0451] Depending on the encoding / decoding settings, there are several scenarios where multiple reference pixels can be used to generate the prediction block. Specifically, the encoding / decoding settings can be used to determine whether the use of a second reference pixel for generating or correcting the prediction block is supported.
[0452] As an example, it can be determined implicitly or explicitly whether to use additional pixels during the prediction process. If it is explicit, the above information can be included in units such as video, sequence, image, slice, tile, or block.
[0453] As an example, whether or not to use additional pixels during the prediction process can be applied to all prediction modes or only to some prediction modes. In this case, the prediction modes concerned can be at least one of the horizontal mode, the vertical mode, some diagonal modes, the non-directional mode, the color copying mode, etc.
[0454] As an example, whether to use additional pixels during the prediction process can be applied to all blocks or only some blocks. In this case, some blocks can be defined according to their size, shape, etc., and the corresponding blocks are MxN (for example, the lengths of M and N are 8, 16, 32, 64, etc.; if it is a square, the lengths are 8×8, 16×16, 32×32, 64×64, etc.; if it is a rectangle, it can have the shape of a 2:1 rectangle, a 4:1 rectangle, etc.).
[0455] Additionally, the use of additional pixels during prediction can be determined based on certain encoding / decoding settings. In this case, the encoding / decoding setting could be `constrained_intra_pred_flag`, which allows for the limited use of additional reference pixels during prediction.
[0456] For example, when the flag restricts the use of a region that includes the second reference pixel (i.e., a region that is filled through a process such as reference pixel padding according to the flag described above), the use of the second reference pixel in the prediction process can be restricted. Alternatively, the second reference pixel can be used in the prediction process independently of the flag.
[0457] In addition to the cases described in the examples above, there are various applications and modifications, such as combinations of one or more elements. Furthermore, although the examples above only describe some cases related to color copying modes, these examples can also be used, in the same or different ways, for prediction modes that generate or correct prediction blocks using multiple reference pixels, besides color copying.
[0458] The examples above have described the case where a single setting is used to generate or correct prediction blocks using multiple reference pixels in each prediction mode. However, multiple settings can be used for each prediction mode. That is, a candidate group containing multiple filtering settings can be configured, and selection information indicating the chosen setting can be generated.
[0459] In summary, information about whether to perform filtering can be processed explicitly or implicitly, and when filtering is performed, information about the filtering selection can be processed explicitly or implicitly. When processing information explicitly, the information can be included in units of video, sequence, image, slice, tile, and block.
[0460] The generated prediction blocks can be corrected, and the process of correcting the prediction blocks will be described below.
[0461] The correction process can be performed based on predetermined reference pixels and weighted values. In this case, the reference pixels and weighted values can be determined based on the position of the pixel in the current block to be corrected (hereinafter referred to as the current pixel). The reference pixels and weighted values can also be determined based on the intra-frame prediction mode of the current block.
[0462] When the intra-prediction mode of the current block is non-directional, the reference pixels refL and refT of the current pixel can belong to the first pixel line adjacent to the current block and can be located on the same horizontal / vertical line as the current pixel. The weighted value can include at least one of a first weighted value wL in the x-axis direction, a second weighted value wT in the y-axis direction, or a third weighted value wTL in the diagonal direction. The first weighted value is the weight applied to the left reference pixel, the second weighted value is the weight applied to the upper reference pixel, and the third weighted value is the weight applied to the upper-left reference pixel. Here, the first and second weighted values can be determined based on the position information of the current pixel and a predetermined scaling factor (nScale). The scaling factor can be determined based on the width (nTbW) and height (nTbH) of the current block. For example, the first weighted value wL[x] of the current pixel predPixel[x][y] can be determined as (32>>((x<<1)>>nScale)), and the second weighted value wT[y] can be determined as (32>>((y<<1)>>nScale)). The third weighted value wTL[x][y] can be determined as ((wL[x]>>4)+(wT[y]>>4)). However, when the intra-frame prediction mode is planar mode, the third weighted value can be determined as 0. The scaling factor can be set to ((Log2(nTbW)+Log2(nTbH)-2)>>2).
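The scaling factor and the position-dependent weights given above can be evaluated with the short sketch below, which shows how quickly the weights decay with distance from the reference pixels. The function names are hypothetical, and the block dimensions are assumed to be powers of two so that Log2 can be computed from the bit length.

```python
def pdpc_scale(width, height):
    """nScale = (Log2(nTbW) + Log2(nTbH) - 2) >> 2 for power-of-two sizes."""
    log2_w = width.bit_length() - 1
    log2_h = height.bit_length() - 1
    return (log2_w + log2_h - 2) >> 2


def axis_weights(length, n_scale):
    """wL[x] (or wT[y]) for every position along one axis of the block."""
    return [32 >> ((i << 1) >> n_scale) for i in range(length)]


def corner_weight(w_l, w_t, planar=False):
    """wTL for the non-directional case; 0 when the mode is planar."""
    return 0 if planar else (w_l >> 4) + (w_t >> 4)
```

For an 8×8 block the scaling factor is 1 and the weights along one axis are 32, 16, 8, 4, 2, 1, 0, 0, so only the pixels nearest the reference line receive a noticeable correction.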
[0463] When the intra-prediction mode of the current block is vertical / horizontal mode, the reference pixels refL and refT of the current pixel belong to the first pixel line adjacent to the current block and can be located on the same horizontal / vertical line as the current pixel. In vertical mode, the first weighted value wL[x] of the current pixel predPixel[x][y] is determined to be (32>>((x<<1)>>nScale)), the second weighted value wT[y] can be determined to be 0, and the third weighted value wTL[x][y] can be determined to be equal to the first weighted value. On the other hand, in horizontal mode, the first weighted value wL[x] of the current pixel predPixel[x][y] is determined to be 0, the second weighted value wT[y] is determined to be (32>>((y<<1)>>nScale)), and the third weighted value wTL[x][y] can be determined to be equal to the second weighted value.
[0464] When the intra-prediction mode of the current block is diagonal, the reference pixels refL and refT of the current pixel belong to the first pixel line adjacent to the current block and can be located on the same diagonal as the current pixel. Here, the diagonal has the same angle as the intra-prediction mode of the current block. The diagonal can represent a diagonal from the lower left to the upper right, or a diagonal from the upper left to the lower right. At this time, the first weighted value wL[x] of the current pixel predPixel[x][y] is determined to be (32>>((x<<1)>>nScale)), the second weighted value wT[y] is determined to be (32>>((y<<1)>>nScale)), and the third weighted value wTL[x][y] can be determined to be 0.
[0465] When the intra-prediction mode of the current block is less than or equal to mode 10, the reference pixels refL and refT of the current pixel belong to the first pixel line adjacent to the current block and can be located on the same diagonal as the current pixel. Here, the diagonal has the same angle as the intra-prediction mode of the current block. At this time, the reference pixels can be restricted so that only one of the left or top reference pixels of the current block is used. The first weighted value wL[x] of the current pixel predPixel[x][y] is determined to be 0, the second weighted value wT[y] is determined to be (32>>((y<<1)>>nScale)), and the third weighted value wTL[x][y] can be determined to be 0.
[0466] When the intra-prediction mode of the current block is greater than or equal to mode 58, the reference pixels refL and refT of the current pixel belong to the first pixel line adjacent to the current block and can be located on the same diagonal as the current pixel. Here, the diagonal has the same angle as the intra-prediction mode of the current block. At this time, the reference pixels can be restricted so that only one of the left or top reference pixels of the current block is used. The first weighted value wL[x] of the current pixel predPixel[x][y] is determined to be (32>>((x<<1)>>nScale)), the second weighted value wT[y] is determined to be 0, and the third weighted value wTL[x][y] can be determined to be 0.
[0467] Based on the determined reference pixels refL[x][y], refT[x][y] and weighted values wL[x], wT[y], wTL[x][y], the current pixel predPixels[x][y] can be corrected as shown in Formula 1 below.
[0468] [Formula 1]
[0469] predPixels[x][y]=clip1Cmp((refL[x][y]*wL[x]+refT[x][y]*wT[y]-p[-1][-1]*wTL[x][y]+(64-wL[x]-wT[y]+wTL[x][y])*predPixels[x][y]+32)>>6)
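Formula 1 can be illustrated for a single pixel with the short Python sketch below. The behaviour of clip1Cmp is not defined in this section; the sketch assumes it clamps the result to the valid sample range [0, 2^bitDepth - 1], and the helper names `clip1` and `correct_pixel` are hypothetical.

```python
def clip1(value: int, bit_depth: int = 8) -> int:
    """Assumed behaviour of clip1Cmp: clamp to [0, 2**bit_depth - 1]."""
    return max(0, min((1 << bit_depth) - 1, value))


def correct_pixel(pred, ref_l, ref_t, ref_tl, w_l, w_t, w_tl, bit_depth=8):
    """Apply Formula 1 to one predicted pixel.

    ref_l, ref_t: left and upper reference pixels (refL[x][y], refT[x][y])
    ref_tl:       upper-left reference pixel p[-1][-1]
    """
    total = (ref_l * w_l + ref_t * w_t - ref_tl * w_tl
             + (64 - w_l - w_t + w_tl) * pred + 32)
    return clip1(total >> 6, bit_depth)
```

When all three weights are 0 the formula returns the uncorrected predicted pixel, which is consistent with the weights shrinking to 0 far from the reference line.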
[0470] However, the above correction process can only be performed when intra-prediction on a sub-block basis is not performed in the current block. The correction process can only be performed when the reference pixel of the current block is the first pixel line. The correction process can only be performed when the intra-prediction mode of the current block corresponds to a specific mode. Here, the specific mode may include at least one of a non-directional mode, a vertical mode, a horizontal mode, a mode less than a predetermined first threshold mode, and a mode greater than a predetermined second threshold mode. The first threshold mode may be 8, 9, 10, 11, or 12, and the second threshold mode may be 56, 57, 58, 59, or 60.
[0471] The prediction mode determination unit performs a process of selecting the optimal mode from multiple candidate prediction modes. Typically, the mode with the optimal coding cost can be determined using a rate-distortion technique that considers block distortion (e.g., the distortion between the current block and the reconstructed block, measured as the sum of absolute differences (SAD), the sum of squared differences (SSD), etc.) and the number of bits generated by the corresponding mode. The prediction block generated based on the prediction mode determined through the above process can be sent to the subtraction unit and the addition unit.
[0472] The prediction mode encoding unit can encode the prediction mode selected by the prediction mode determination unit. It can encode the index information corresponding to the prediction mode in the prediction mode candidate group, or it can predict the prediction mode and encode the information related to that prediction. The former is a method of encoding the prediction mode directly without performing prediction; the latter is a method of predicting the prediction mode and encoding the mode prediction information together with the information obtained based on it. Furthermore, the former is an example that can be applied to the chroma component, and the latter is an example that can be applied to the luma component; however, the invention is not limited to these examples, and other cases are also possible.
[0473] When the prediction mode is predicted and encoded, the predicted value (or prediction information) of the prediction mode can be called the Most Probable Mode (MPM). In this case, preset prediction modes (e.g., DC mode, planar mode, vertical mode, horizontal mode, diagonal mode, etc.) or the prediction modes of spatially adjacent blocks (e.g., left block, top block, top-left block, top-right block, bottom-left block, etc.) can be configured as MPMs. In this example, the diagonal mode can represent the top-right diagonal, the bottom-right diagonal, and the bottom-left diagonal, and can correspond to modes 9, 2, and 66 of Figure 9.
[0474] Additionally, modes derived from modes already included in the MPM candidate group can be configured as MPM candidates. For example, for a directional mode already included in the MPM candidate group, modes whose mode interval differs from it by a (e.g., a is a non-zero integer such as 1, -1, 2, -2, etc.) can be added to the MPM candidate group. If mode 10 of Figure 9 is already included, the derived modes (mode 9, mode 11, mode 8, mode 12, etc.) can be newly (or additionally) included in the MPM candidate group.
[0475] The example above can correspond to the case of configuring the MPM candidate group with multiple modes. The MPM candidate group (or the number of MPM candidates) can be determined based on encoding / decoding settings (e.g., prediction mode candidate group, image type, block size, block shape, etc.), and can include at least one mode.
[0476] The prediction modes used to configure the MPM candidate group may have priorities. The order in which prediction modes are included in the MPM candidate group can be determined according to these priorities, and the configuration of the MPM candidate group is complete once the number of MPM candidates has been filled according to the priorities. The priority order may be: prediction modes of spatially adjacent blocks, preset prediction modes, and then modes derived from the prediction modes included earlier in the MPM candidate group; other modifications are also possible.
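The priority-based filling described above can be sketched as follows. This is an illustration under stated assumptions, not the method of the embodiment: it assumes modes 0 and 1 are the non-directional modes and modes 2 to 66 are directional (as suggested by the mode numbers mentioned for Figure 9), it simply skips derived modes that fall outside that range, and the function name `build_mpm_list` is hypothetical.

```python
PLANAR, DC = 0, 1
MAX_MODE = 66  # assumed highest directional mode index


def build_mpm_list(neighbor_modes, preset_modes, size, offsets=(-1, 1, -2, 2)):
    """Fill an MPM list in priority order: neighbors, presets, derived modes."""
    mpm = []

    def push(mode):
        # Skip unavailable modes and duplicates; stop once the list is full.
        if mode is not None and mode not in mpm and len(mpm) < size:
            mpm.append(mode)

    for mode in neighbor_modes:      # 1) modes of spatially adjacent blocks
        push(mode)
    for mode in preset_modes:        # 2) preset modes
        push(mode)
    for base in list(mpm):           # 3) modes derived from earlier entries
        if base < 2:
            continue                 # non-directional modes have no neighbors
        for a in offsets:
            if 2 <= base + a <= MAX_MODE:
                push(base + a)
    return mpm
```

With neighbor modes `[10, None, 10]`, presets `[PLANAR, DC]`, and a list size of 6, the sketch yields mode 10, the two presets, and then the derived modes 9, 11, and 8.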
[0477] When performing prediction mode encoding for the current block using MPM, information about whether the prediction mode matches the MPM can be generated (e.g., most_probable_mode_flag).
[0478] If it matches an MPM (e.g., most_probable_mode_flag = 1), additional MPM index information (e.g., mpm_idx) can be generated based on the MPM configuration. For example, if the MPM is configured with one prediction mode, no additional MPM index information will be generated; if the MPM is configured with multiple prediction modes, index information corresponding to the prediction mode of the current block can be generated in the MPM candidate group.
[0479] If it does not match an MPM (e.g., most_probable_mode_flag = 0), non-MPM index information (e.g., non_mpm_idx) corresponding to the prediction mode of the current block can be generated in the remaining prediction mode candidate group (or non-MPM candidate group) other than the MPM candidate group. This can be an example of configuring the non-MPM as a single group.
[0480] When the non-MPM candidate group consists of multiple groups, information indicating which group the prediction mode of the current block belongs to can be generated. For example, the non-MPM can be configured as groups A and B (assume that A consists of m prediction modes, B consists of n prediction modes, the non-MPM as a whole consists of m+n prediction modes, and n is greater than m; assume further that the modes of A are directional modes at equal intervals and the modes of B are directional modes not at equal intervals). If the prediction mode of the current block matches a prediction mode of group A (e.g., non_mpm_A_flag = 1), index information corresponding to the prediction mode of the current block can be generated in candidate group A. If it does not match (e.g., non_mpm_A_flag = 0), index information corresponding to the prediction mode of the current block can be generated in the remaining prediction mode candidate group (or candidate group B). As in the example above, the non-MPM can be configured as at least one prediction mode candidate group (or cluster), and the non-MPM configuration can be determined based on the prediction mode candidate group. For example, when the number of prediction modes in the candidate group is less than 35, there can be one non-MPM group; in other cases, there can be two or more non-MPM groups.
[0481] As in the example above, the purpose of configuring the non-MPM as multiple groups is to reduce the number of mode bits when the number of prediction modes is large and the prediction mode is not predicted by the MPM.
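The signalling decisions described in the preceding paragraphs can be sketched as a function that lists the syntax elements generated for a given mode. The flag names most_probable_mode_flag, mpm_idx, and non_mpm_A_flag come from the description; the index names `non_mpm_A_idx` and `non_mpm_B_idx` and the function name `encode_mode` are hypothetical, and binarization of the values is not modelled.

```python
def encode_mode(mode, mpm, group_a, group_b):
    """Return the (syntax_element, value) pairs generated for `mode`."""
    if mode in mpm:
        symbols = [("most_probable_mode_flag", 1)]
        # No index is needed when the MPM consists of a single mode.
        if len(mpm) > 1:
            symbols.append(("mpm_idx", mpm.index(mode)))
        return symbols

    symbols = [("most_probable_mode_flag", 0)]
    if mode in group_a:
        symbols.append(("non_mpm_A_flag", 1))
        symbols.append(("non_mpm_A_idx", group_a.index(mode)))
    else:
        symbols.append(("non_mpm_A_flag", 0))
        symbols.append(("non_mpm_B_idx", group_b.index(mode)))
    return symbols
```

Because group A is the smaller group in the example (m less than n), its index can be represented with fewer bits than an index into the whole non-MPM set, which is the bit saving the text refers to.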
[0482] When performing prediction mode encoding (or prediction mode decoding) of the current block using MPM, a binarization table can be generated separately for each prediction mode candidate group (e.g., MPM candidate group, non-MPM candidate group, etc.), and a binarization method applicable to each candidate group can be applied separately.
[0483] The prediction-related information generated by the prediction mode coding unit can be transmitted to the coding unit and included in the bit stream.
[0484] Figure 14 is an example of tree-based block segmentation according to an embodiment of the present invention.
[0485] In Figure 14, (a) represents a quadtree partition, (b) represents a horizontal split in a binary tree partition, and (c) represents a vertical split in a binary tree partition. In the figure, A to C represent the initial blocks (blocks before partitioning, e.g., coding tree units), and the numbers following these letters indicate the partition numbers assigned during the partitioning process. In the case of a quadtree, the top-left, top-right, bottom-left, and bottom-right blocks are assigned numbers 0 to 3 respectively; in the case of a binary tree, the left / top and right / bottom blocks are assigned numbers 0 and 1 respectively.
[0486] Referring to Figure 14, the segmentation status or information performed to obtain a given block can be identified from the letters and numbers assigned during the segmentation process.
[0487] For example, in (a) of Figure 14, block A00 is the top-left block (0 appended to A0) of the four blocks obtained by performing a quadtree partition on block A0, which is itself the top-left block (0 appended to A) of the four blocks obtained by performing a quadtree partition on the initial block A.
[0488] Alternatively, in (b) of Figure 14, block B10 is the upper block (0 appended to B1) of the two blocks obtained by performing a horizontal split of the binary tree partition on block B1, which is itself the lower block (1 appended to B) of the two blocks obtained by performing a horizontal split of the binary tree partition on the initial block B.
[0489] By performing segmentation as described above, the segmentation status and information of each block can be identified. This includes, for example, the supported segmentation settings (types of tree methods, etc.), the supported range of the block such as the minimum and maximum size (details depend on the supported range of the segmentation method), the allowed segmentation depth (details depend on the supported range of the segmentation method), the segmentation flags (details depend on the segmentation flags of the segmentation method), the image type, the encoding mode (intra / inter), and other encoding / decoding settings and information required to confirm the block segmentation status. With this information, it can be confirmed which block (parent block) the current block (child block) belonged to before the segmentation step used to obtain it. For example, in (a) of Figure 14, the adjacent blocks of block A31 may include A30, A12, and A13, and it can be confirmed that A30 and A31 belonged to the same block A3 in the previous splitting step. In the case of A12 and A13, in the previous step, i.e., the splitting step of A3, A12 and A13 belonged to another block A1, and only in the step before that can it be confirmed that they belonged to the same block A.
[0490] Since the above examples cover a single splitting operation (a four-way split of a quadtree or a horizontal / vertical split of a binary tree), the examples described below illustrate splitting based on multiple trees.
[0491] Figure 15 is an example of block segmentation based on multiple trees according to an embodiment of the present invention.
[0492] Referring to Figure 15, the segmentation status or information performed to obtain a given block can be identified from the letters and numbers assigned during the segmentation process. In this example, a letter does not represent an initial block; rather, each letter together with its number represents information about the segmentation.
[0493] For example, in the case of block A1A1B0, the upper-right block A1 is obtained when a quadtree split is performed on the initial block, the upper-right block (A1 appended to A1) is obtained when a quadtree split is performed on block A1, and the upper block (B0 appended to A1A1) is obtained when a horizontal split of the binary tree partition is performed on block A1A1.
[0494] Alternatively, in the case of block A3B1C1, the lower-right block A3 is obtained when a quadtree split is performed on the initial block, the lower block (B1 appended to A3) is obtained when a horizontal split of the binary tree partition is performed on block A3, and the right block (C1 appended to A3B1) is obtained when a vertical split of the binary tree partition is performed on block A3B1.
[0495] In this example, by using the segmentation information and following the segmentation steps of each block, relationship information between the current block and its neighboring blocks can be confirmed, such as the segmentation step up to which the current block and a neighboring block were the same block.
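The labeling convention used for Figure 15 can be sketched as a small decoder that turns a block label into its sequence of splits. The mapping of A, B, and C to quadtree, horizontal binary, and vertical binary splits follows the two examples above; the names `SPLITS` and `decode_label` are hypothetical.

```python
import re

# Letter -> (split type, position names indexed by the partition number)
SPLITS = {
    "A": ("quadtree", ["upper-left", "upper-right", "lower-left", "lower-right"]),
    "B": ("binary horizontal", ["upper", "lower"]),
    "C": ("binary vertical", ["left", "right"]),
}


def decode_label(label: str):
    """Decode a label such as 'A3B1C1' into (split type, position) steps."""
    steps = []
    for letter, digit in re.findall(r"([ABC])(\d)", label):
        split_type, positions = SPLITS[letter]
        steps.append((split_type, positions[int(digit)]))
    return steps
```

Comparing the decoded steps of two labels from the start shows the segmentation step up to which the two blocks were still the same block.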
[0496] Figure 14 and Figure 15 are some examples of how the segmentation information of each block can be confirmed. Various information and combinations of information (e.g., segmentation flags, depth information, maximum depth information, block range, etc.) can be used to confirm the segmentation information of each block, thereby confirming the relationships between blocks.
[0497] Figure 16 is a schematic diagram illustrating various cases of segmented blocks.
[0498] Typically, due to the diverse texture information present in images, it is difficult to perform encoding / decoding using a single encoding / decoding method. For example, some regions may have strong edge components in specific directions, while others may have complex regions lacking edge components. Block segmentation plays a crucial role in effectively encoding these regions.
[0499] The purpose of performing block segmentation is to effectively divide regions based on the features of an image. However, when only one segmentation method is used (e.g., quadtree segmentation), it may be difficult to properly reflect the characteristics of the image to perform the segmentation.
[0500] Referring to Figure 16, it can be confirmed that images with various textures are segmented based on quadtree segmentation and binary tree segmentation. Cases a through e may support only quadtree segmentation, while cases f through j may support only binary tree segmentation.
[0501] In this example, the partitions obtained by quadtree segmentation will be referred to as UL, UR, DL, and DR, and the binary tree partitions will be described on that basis.
[0502] The texture in a of Figure 16 may be the texture form best suited to quadtree partitioning: it can be divided into four parts by a single quadtree partition (Div_HV), and encoding / decoding can be performed on a block-by-block basis. Meanwhile, when binary tree partitioning is applied as shown in f of Figure 16, three partitions (two Div_V and one Div_H) may be required, compared with one for quadtree partitioning.
[0503] In b of Figure 16, the texture is divided into the upper and lower regions of the block. In b of Figure 16, one partition (Div_HV) is required when quadtree partitioning is applied, and in g of Figure 16, a single Div_H partition suffices when binary tree segmentation is applied. Assuming a quadtree segmentation flag requires 1 bit while a binary tree segmentation flag requires 2 or more bits, quadtree segmentation can be considered efficient in terms of flag bits. However, because encoding / decoding information (e.g., information for expressing texture information (residual signals, coding coefficients, information indicating the presence or absence of coding coefficients, etc.), prediction information (e.g., intra-prediction related information, inter-prediction related information), and transform information (e.g., transform type information, transform segmentation information, etc.)) is generated per block unit, the corresponding information is generated again for similar regions that have been split apart, so quadtree segmentation is less efficient than binary tree segmentation in this case.
[0504] In the other cases of Figure 16, the type of tree can likewise be determined based on the texture. The examples above show the importance of supporting variable block sizes and various segmentation methods, which enable regions to be divided effectively based on image characteristics.
[0505] The following is a detailed analysis of quadtree partitioning and binary tree partitioning.
[0506] Referring to b of Figure 16, consider the blocks adjacent to the bottom-right block (the current block), which in this example are the left block and the top block. The left block has characteristics similar to the current block, while the top block has different characteristics. Even though some blocks (in this example, the left block) have similar characteristics, they may already have been split apart because of the nature of quadtree partitioning.
[0507] In this case, because the current block has characteristics similar to the left block, similar encoding / decoding information may be generated. When intra-prediction mode prediction (e.g., the most probable mode, i.e., information that predicts the mode of the current block from adjacent blocks to reduce the number of bits of the current block's prediction mode) or a motion information prediction mode (e.g., skip mode, merge mode, or a mode used to reduce mode bits such as competition mode) is used, the information of the left block is valid for reference. That is, when the encoding information of the current block (e.g., intra-prediction information, inter-prediction information, filter information, coefficient information, etc.) is referenced from either the left or the upper block, the more accurate information (in this example, that of the left block) can be referenced.
[0508] Referring to e of Figure 16, assume that the bottom-rightmost block (a block obtained by splitting twice) is the current block. Similarly, of the left and upper blocks, the current block has characteristics similar to the upper block, so encoding performance can be improved by referencing the encoding information of the corresponding block (the upper block in this example).
[0509] Conversely, referring to j of Figure 16, consider the left block adjacent to the rightmost block (the current block, obtained by splitting twice). In this example there is no upper block, so only the left block is referenced, and it can be confirmed that the current block has characteristics different from the left block. It can thus be confirmed that a block obtained by quadtree splitting may be similar to some of its adjacent blocks, while a block obtained by binary tree splitting has a higher probability of having characteristics different from its adjacent blocks.
[0510] In detail, when a block (x) and its neighboring block (y) were the same block before splitting, in the case of a quadtree split the block (x) and the neighboring block (y) can have either similar or different characteristics. Conversely, in the case of a binary tree split, the block (x) and the neighboring block (y) have different characteristics in most cases. In other words, if the characteristics were similar, there would be no need to split the block and it would have been finalized before the split; in most cases, the block was split because the characteristics differ.
[0511] In addition to the examples above, in i of Figure 16, the lower-right block x and the lower-left block y may have the following relationship. That is, since the image characteristics of the upper region are similar, it is finalized as a block without being split, and since the image characteristics of the lower region differ, it is finalized after being split into the lower-left block and the lower-right block.
[0512] In summary, in the case of quadtree segmentation, a neighboring block that shares its parent block with the current block may have characteristics similar to or different from the current block, because a quadtree unconditionally splits a block in half in both the horizontal and vertical directions. In the case of binary tree segmentation, a block can be split horizontally or vertically according to image characteristics; therefore, for a neighboring block that shares its parent block with the current block, the fact that the split occurred may mean that it occurred because the characteristics differ.
[0513] (Quadtree partitioning)
[0514] An adjacent block whose block before the split is the same as that of the current block (assume the left block and the top block in this example, but not limited to these) can have characteristics similar to or different from the current block.
[0515] In addition, an adjacent block whose block before the split is different from that of the current block can have characteristics similar to or different from the current block.
[0516] (Binary tree partitioning)
[0517] An adjacent block whose block before the split is the same as that of the current block (assume the left or upper block in this example; since it is a binary tree, there is at most one such candidate) can have characteristics different from the current block.
[0518] In addition, an adjacent block whose block before the split is different from that of the current block can have characteristics similar to or different from the current block.
[0519] The following description is based on the assumptions above, which are the main assumptions of the present invention. Based on them, adjacent blocks can be divided into two cases: (1) the case where the characteristics of the adjacent block may be similar to or different from those of the current block, and (2) the case where the characteristics of the adjacent block are different from those of the current block.
[0520] Refer again to Figure 15.
[0521] As an example (the current block is A1A2), since the block A1 before the split that produced A1A2 is different from the block before the split that produced A0 (the initial block), the adjacent block A0 (the left block) can be classified as the normal case (i.e., it is not known whether its characteristics are similar to or different from the current block).
[0522] For the adjacent block A1A0 (the upper block), since the block A1 before the split that produced A1A2 is the same as the block A1 before the split that produced A1A0, the partitioning method can be checked. In this case, since it is a quadtree partition (A), the block is classified as the normal case.
[0523] As an example (the current block is A2B0B1), for the adjacent block A2B0B0 (the upper block), the block A2B0 before the split that produced A2B0B1 is the same as the block A2B0 before the split that produced A2B0B0, so the splitting method can be checked. In this case, since it is a binary tree split (B), the block is classified as an exception.
[0524] As an example (the current block is A3B1C0), since the block A3B1 before the split that produced A3B1C0 is different from the block A3 before the split that produced the adjacent block A3B0, block A3B0 can be classified as the normal case.
[0525] As illustrated in the examples above, adjacent blocks can be classified into the normal case and the exceptional case. In the normal case, it is not known whether the encoding information of the adjacent block can be used as the encoding information of the current block. In the exceptional case, it is determined with high likelihood that the encoding information of the adjacent block cannot be used as the encoding information of the current block.
[0526] Based on the above classification, a method can be used to obtain the prediction information of the current block from neighboring blocks.
[0527] In summary, confirm whether the current block and the adjacent block were the same block before the split (A).
[0528] If the result of A is the same, confirm the segmentation method of the current block (B) (if they were the same block, the adjacent block was obtained by the same segmentation method as the current block, so only the current block needs to be checked).
[0529] If the result of A is different, then exit (end).
[0530] If the result of B is a quadtree partition, then mark the adjacent blocks as normal and exit (end).
[0531] If the result of B is a binary tree partition, then mark the adjacent blocks as exceptions and exit (end).
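The classification steps above can be sketched using the block labels of Figure 15, where each split contributes one letter and one digit and the letters A, B, and C denote quadtree, horizontal binary, and vertical binary splits. The function names are hypothetical, and the sketch treats a neighbor with a different parent as the normal case, following the worked examples for blocks A0 and A3B0.

```python
def parent(label: str) -> str:
    """Label of the block before the last split (drop the final letter+digit)."""
    return label[:-2]


def classify_neighbor(current: str, neighbor: str) -> str:
    """Classify a neighboring block as 'normal' or 'exception'."""
    # Step A: were the two blocks the same block before the split?
    if parent(current) != parent(neighbor):
        return "normal"
    # Step B: check the split that produced the current block.
    last_split = current[-2]
    if last_split == "A":        # quadtree split
        return "normal"
    return "exception"           # binary tree split (B or C)
```

Applied to the examples of Figure 15, the neighbor A1A0 of A1A2 is normal (shared parent, quadtree split), while the neighbor A2B0B0 of A2B0B1 is an exception (shared parent, binary split).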
[0532] The above example can be applied to the setting of intra-frame prediction mode prediction candidate groups (related to the most likely mode) in this invention. Normal prediction candidate group settings (e.g., candidate group configuration priority, etc.) and exceptional prediction candidate group settings can be supported, and priority can be pushed back or candidate groups derived from that block can be excluded based on the state of the adjacent blocks.
[0533] At this point, the adjacent blocks to which the above settings apply can be limited to the spatial case (same space), or the settings can be applied to blocks derived from another color space of the same image, as in color copy mode. That is, the above settings can be performed considering the segmentation state of the block derived through color copy mode, etc.
[0534] In addition to the examples above, the following example is an example of adaptively determining encoding / decoding settings (e.g., prediction candidate group settings, reference pixel settings, etc.) based on the relationships between blocks (in the above examples, segmentation block information, etc., is used to identify the relative relationship between the current block and other blocks).
[0535] This example (luma component) is described under the assumption that there are 35 predefined intra-prediction modes in the encoding / decoding device, and that an MPM candidate group with a total of 3 candidates is configured from adjacent blocks (the left and top blocks in this example).
[0536] Two candidates can be configured by adding one candidate each from the left block L0 and the top block T0. If a candidate cannot be obtained from a block, it can be replaced and filled with a candidate such as DC mode, planar mode, vertical mode, horizontal mode, or diagonal mode. Once two candidates are filled through the above process, the remaining candidate can be filled considering various cases.
[0537] For example, if the candidates filled from each block are the same, the duplicate can be replaced with a mode adjacent to that mode in a manner that does not overlap with the modes already included in the candidate group (e.g., when the shared mode is k_mode: k_mode-2, k_mode-1, k_mode+1, k_mode+2, etc.). Alternatively, whether the candidates filled from each block are the same or different, the candidate group can be configured by adding planar mode, DC mode, vertical mode, horizontal mode, or diagonal mode.
[0538] Through the above process, the intra-prediction mode candidate group can be configured (normal case). The candidate group established in this way belongs to the normal case, and an adaptive intra-prediction mode candidate group can be configured depending on the neighboring blocks (exceptional case).
[0539] For example, if an adjacent block is marked as an exception, candidates derived from that block can be excluded. If the left block is marked as an exception, the candidate group can be configured with the upper block and preset prediction modes (e.g., DC mode, planar mode, vertical mode, horizontal mode, diagonal mode, a mode derived from the upper block, etc.). The above examples can be applied in the same or a similar way even when the MPM candidate group is configured with more than three MPM candidates.
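A minimal sketch of the three-candidate configuration, including the exclusion of neighbors marked as exceptions, is given below. The mode numbers are an assumption (a 35-mode numbering with planar = 0, DC = 1, horizontal = 10, vertical = 26, diagonal = 34); the description does not fix them for this example, and the function name `build_three_mpm` is hypothetical.

```python
# Assumed 35-mode numbering; not specified in the description.
PLANAR, DC, HORIZONTAL, VERTICAL, DIAGONAL = 0, 1, 10, 26, 34
PRESETS = [DC, PLANAR, VERTICAL, HORIZONTAL, DIAGONAL]


def build_three_mpm(left_mode, top_mode, left_exception=False, top_exception=False):
    """Build a 3-entry MPM list; neighbors marked as exceptions are excluded."""
    mpm = []

    def push(mode):
        if mode is not None and mode not in mpm and len(mpm) < 3:
            mpm.append(mode)

    left = None if left_exception else left_mode
    top = None if top_exception else top_mode
    push(left)
    push(top)
    # Same directional mode from both neighbors: add its adjacent modes.
    if left is not None and left == top and left >= 2:
        for offset in (-1, 1, -2, 2):
            if 2 <= left + offset <= 34:
                push(left + offset)
    for mode in PRESETS:             # fill what remains with preset modes
        push(mode)
    return mpm
```

With left mode 10 and top mode 26, marking the left block as an exception changes the list from `[10, 26, 1]` to `[26, 1, 0]`, so no bits are spent on a candidate that is unlikely to match.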
[0540] The following example (chroma component) is described under the assumption that there are five intra-prediction modes (DC mode, planar mode, vertical mode, horizontal mode, and color mode in this example), and that encoding / decoding is performed by configuring the prediction modes, adaptively prioritized and sorted, as a candidate group (i.e., encoding / decoding is performed directly without using MPM).
[0541] First, assign the color mode the highest priority (index 0 in this example; 1 bit is allocated for '0'), while assigning lower priority to the other modes (planar, vertical, horizontal, DC in this example) (indexes 1 to 4 in this example, '100', '101', '110', '111', 3 bits are allocated).
[0542] If the color mode matches one of the other prediction modes in the candidate group (DC, planar, vertical, horizontal), a preset prediction mode (e.g., diagonal mode, etc.) can be assigned to the priority position of the matching prediction mode (indexes 1 to 4 in this example). If no match is found, the candidate group configuration ends.
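The normal-case chroma candidate configuration can be sketched as follows. Modes are represented as plain strings for illustration only; the bin strings are those given in the example above, and the function name `build_chroma_candidates` is hypothetical.

```python
BINS = ["0", "100", "101", "110", "111"]  # index 0 gets 1 bit, indexes 1-4 get 3 bits


def build_chroma_candidates(color_mode: str, substitute: str = "diagonal"):
    """Return (mode, bin string) pairs with the color mode at index 0."""
    others = ["planar", "vertical", "horizontal", "DC"]
    # If the color mode duplicates one of the other modes, put the preset
    # substitute mode at that priority position instead.
    others = [substitute if mode == color_mode else mode for mode in others]
    return list(zip([color_mode] + others, BINS))
```

For example, when the color mode is the vertical mode, the vertical entry at index 2 is replaced by the diagonal mode, so the five candidates remain distinct.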
[0543] Through the above process, the candidate group for the intra-prediction mode can be configured. The example described corresponds to the normal case, and an adaptive candidate group can be configured depending on the block from which the color mode is obtained.
[0544] In this example, if the block in the other color space that corresponds to the current block and from which the color mode is obtained is configured as a single block (i.e., unsegmented), this can be considered the normal case; if it is configured as multiple blocks (i.e., segmented into two or more), it is considered an exception. Unlike the earlier examples, which classify according to whether the current block shares a parent block with an adjacent block, the segmentation method, etc., this example is based on the assumption that when the corresponding region in the other color space consists of multiple blocks, its characteristics are more likely to differ from those of the current block. In other words, it can be understood as an example of adaptively determining encoding / decoding settings based on the relationships between blocks.
[0545] Taking the above assumptions as an example, if a corresponding block is marked as an exception, the prediction mode derived from that block can have its priority deferred. In this case, one of the other prediction modes (planar, DC, vertical, horizontal) can be assigned to a high priority, and lower priorities can be assigned to other prediction modes and color modes not included in the priorities.
[0546] For ease of description, the above examples are described under certain assumptions, but are not limited thereto, and the same or similar applications can be used in the various embodiments of the present invention described above.
[0547] In summary, candidate group A can be used when there is no block marked as an exception among the adjacent blocks, and candidate group B can be used when there is a block marked as an exception. The above classification is divided into two cases, but the intra-prediction candidate group configuration in block units can also be implemented adaptively according to the exception state and the block position.
[0548] Furthermore, the above example assumes tree-based segmentation, but is not limited to this. Specifically, in the above example, it might be that a block obtained using at least one tree-based segmentation method is set as an encoding block, and prediction, transformation, etc., are performed directly without dividing the block into prediction blocks, transformation blocks, etc.
[0549] Another example of a segmentation setup is to use tree-based segmentation to obtain coded blocks and to obtain at least one prediction block based on the obtained coded blocks.
[0550] For example, suppose that tree-based partitioning (quadtree in this example) can be used to obtain coded blocks (2N×2N), and prediction blocks are obtained through type-based partitioning (in this example, supported candidate types are 2N×2N, 2N×N, N×2N, and N×N). In this case, when a coded block (assuming it is the parent block in this example) is partitioned into multiple prediction blocks (assuming these blocks are child blocks in this example), the aforementioned exception states and other settings can also be applied between prediction blocks.
[0551] As in Figure 16, if there are two prediction blocks (separated by a thin solid line) in a coded block (thick solid line), the lower block may have characteristics different from those of the upper block, so it does not refer to the coding information of the upper block or pushes its priority back.
[0552] Figure 17 shows an example of block partitioning according to an embodiment of the present invention. In detail, it shows an example in which a coding block (hatched block, 2N×2N) is obtained from a basic coding block (maximum coding block, 8N×8N) using quadtree-based partitioning, and the obtained coding block is partitioned into at least one prediction block (2N×2N, 2N×N, N×2N, N×N) by type-based partitioning.
[0553] The following describes the intra-prediction mode candidate group settings for the case of obtaining rectangular blocks (2N×N and N×2N).
[0554] Figure 18 shows various examples of intra-prediction mode candidate group settings for the block that generates prediction information (in this example, the prediction block is 2N×N).
[0555] This example assumes that there are 67 intra-prediction modes (for the luminance component) and that a total of 6 candidates are selected from neighboring blocks (left, top, top left, top right, and bottom left in this example) and configured in the MPM candidate group.
[0556] Referring to Figure 13, four candidates can be configured in the order L3-T3-B0-R0-TL, and two candidates can be configured from preset modes (e.g., planar, DC). If the maximum number (6 in this example) is not filled by the above configuration, prediction modes derived from prediction modes already included in the candidate group (e.g., k_mode-2, k_mode-1, k_mode+1, k_mode+2, etc. in the case of k_mode) and preset modes (e.g., vertical, horizontal, diagonal, etc.) can be included.
[0557] In this example, for spatially adjacent blocks, the candidate group priority is assumed to be the order left block - top block - bottom left block - top right block - top left block of the current block (specifically, the lowermost sub-block of the left block and the rightmost sub-block of the top block). For the preset modes, the order is assumed to be planar - DC - vertical - horizontal - diagonal mode.
[0558] Referring to Figure 18a, the candidate group settings are the same as in the example above. The current block (2N×N, PU0) can configure the candidate group in the order l1-t3-l2-tr-tl (the remainder is omitted because it repeats the above). In this example, the adjacent blocks of the current block can be blocks that have already been encoded / decoded (coded blocks, i.e., prediction blocks in other coded blocks).
[0559] Unlike the above, when the position of the current block corresponds to PU1, at least one intra-prediction mode candidate group setting can be supported. Generally, the closer a block is to the current block, the more likely it is to have characteristics similar to those of the current block, so configuring the candidate group from that block is most advantageous (1). On the other hand, in order to perform parallel encoding / decoding, it may be necessary to configure the candidate group without that block (2).
[0560] If the current block is PU1, then under candidate group configuration setting (1), as in Figure 18b, the candidate group can be configured in the order l3-c7-bl-k-l1 (k can be derived from c7 or tr, etc.), and under setting (2), as in Figure 18c, the candidate group can be configured in the order l3-bl-k-l1 (k can be derived from tr, etc.). The difference between the two examples lies in whether the intra-prediction mode of the previous block (the upper block) is included in the candidate group. That is, in the former case, the intra-prediction mode of the previous block is included in the candidate group to improve the efficiency of intra-prediction mode encoding / decoding. In the latter case, the intra-prediction mode of the previous block, which cannot be referenced because its encoding / decoding is not yet complete, is excluded from the candidate group for the sake of parallel processing, etc.
[0561] Figure 19 shows various examples of intra-prediction mode candidate group settings for the block that generates prediction information (in this example, the prediction block is N×2N).
[0562] This example assumes that there are 67 intra-prediction modes (for the luminance component) and that a total of 6 candidates are selected from neighboring blocks (left, top, top left, top right, and bottom left in this example) and configured in the MPM candidate group.
[0563] Referring to Figure 13, one candidate can be configured in the order L3-L2-L1-L0 (left block), one candidate in the order T3-T2-T1-T0 (top block), two candidates in the order B0-R0-TL (bottom left block, top right block, top left block), and two candidates from preset modes (e.g., planar, DC). If the maximum number is not filled by the above configuration, prediction modes derived from prediction modes already included in the candidate group, preset modes, etc., can be included. In this case, the priority of the candidate group configuration can be the order left block - top block - planar - DC - bottom left block - top right block - top left block.
[0564] Referring to Figure 19a, in the same manner as the candidate group setting in the example above, the current block (N×2N, PU0) can configure one candidate in the order l3-l2-l1-l0, one candidate in the order t1-t0, and two candidates in the order bl-t2-tl. In this example, the adjacent blocks of the current block can be blocks that have already been encoded / decoded (coded blocks; i.e., prediction blocks in other coded blocks).
[0565] Unlike the above, when the position of the current block corresponds to PU1, at least one intra-prediction mode candidate group setting can be supported. As in the example above, settings (1) and (2) are possible.
[0566] If the current block is PU1, then under candidate group configuration setting (1), as in Figure 19b, one candidate can be configured in the order c13-c9-c5-c1, one candidate in the order t3-t2, and two candidates in the order k-tr-t1 (k can be derived from bl or c13, etc.). Under setting (2), as in Figure 19c, one candidate can be configured in the order t3-t2, and two candidates in the order k-tr-t1 (k can be derived from bl, etc.). The difference between the two examples is whether the intra-prediction mode of the left block is included in the candidate group. That is, in the former case, the intra-prediction mode of the left block is included in the candidate group to improve the efficiency of intra-prediction mode encoding / decoding. In the latter case, the intra-prediction mode of the left block, which cannot be referenced because its encoding / decoding is not yet complete, is excluded from the candidate group for the sake of parallel processing, etc.
[0567] In this way, the candidate group configuration can be determined according to the candidate group configuration settings. In this example, the settings can be determined implicitly (in this example, the setting used for parallel processing), or the relevant information can be explicitly included in units such as video, sequence, picture, slice, tile, etc.
[0568] The above content can be summarized as follows. This assumes either implicit determination or explicit generation of the relevant information.
[0569] Confirm the intra-prediction mode candidate group configuration settings (A) in the early stages of encoding / decoding.
[0570] If the result of confirming A is a setting that allows a previous prediction block in the same coding block to be referenced, the intra-prediction mode of that block is included in the candidate group (end).
[0571] If the result of confirming A is a setting that prohibits referencing a previous prediction block in the same coding block, the intra-prediction mode of that block is excluded from the candidate group (end).
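The three steps above can be sketched as follows. This is only an illustrative sketch: the function name `configure_candidates`, the flag `allow_same_cb_reference` (standing in for setting A), and the insertion position of the previous block's mode are assumptions chosen to mirror the orders l3-c7-bl-k-l1 and l3-bl-k-l1 described for Figure 18.

```python
from typing import List, Optional


def configure_candidates(neighbor_modes: List[Optional[int]],
                         prev_pu_mode: Optional[int],
                         allow_same_cb_reference: bool,
                         insert_pos: int = 1) -> List[int]:
    """Collect candidate modes in priority order.

    neighbor_modes: modes of already coded neighbors (None if unavailable).
    prev_pu_mode:   mode of the previous prediction block in the same coding block.
    allow_same_cb_reference: hypothetical flag for setting A.
    """
    sources = list(neighbor_modes)
    if allow_same_cb_reference:
        # Setting (1): the previous prediction block may be referenced.
        sources.insert(insert_pos, prev_pu_mode)
    # Setting (2): the previous prediction block is simply left out.
    candidates: List[int] = []
    for mode in sources:
        if mode is not None and mode not in candidates:
            candidates.append(mode)
    return candidates


# Hypothetical mode numbers for l3, bl, k, l1 and for c7 (the previous block).
with_prev = configure_candidates([10, 20, 30, 40], 50, True)
without_prev = configure_candidates([10, 20, 30, 40], 50, False)
```

With the flag enabled the previous block's mode takes the second position; with it disabled the list is built from the remaining neighbors only, which is what makes the two prediction blocks independent for parallel processing.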
[0572] For ease of description, the above examples are described under certain assumptions, but are not limited thereto, and the same or similar applications can be applied to the various embodiments of the present invention described above.
[0573] Figure 20 shows an example of block partitioning according to an embodiment of the present invention. In detail, it shows an example in which a coding block (hatched block, A×B) is obtained from the basic coding block (maximum coding block) through binary tree-based partitioning (or multiple tree-based partitioning), and the obtained coding block is set as a prediction block.
[0574] At this point, the following describes the motion information prediction candidate group settings for the case of obtaining a rectangular block (AxB, A≠B).
[0575] Figure 21 shows various examples of intra-prediction mode candidate group settings for the block that generates prediction information (in this example, a coding block, 2N×N).
[0576] The following assumptions are made in this example: there are 67 intra-prediction modes (for the luminance component), and a total of 6 candidates are selected from neighboring blocks (left, top, top left, top right, and bottom left in this example) and configured in the MPM candidate group. The non-MPM candidates are configured into multiple groups (A and B in this example; among the non-MPM candidates, the modes more likely to be the prediction mode of the current block belong to A). A total of 16 candidates are configured in group A and a total of 45 candidates in group B.
[0577] At this point, group A includes candidates that are not included in the MPM candidate group and are classified according to a certain rule (e.g., consisting of equally spaced modes among the directional modes), or may include candidates that, according to the priority of the MPM candidate group, were not included in the final MPM candidate group. Group B may consist of the candidates included in neither the MPM candidate group nor group A.
[0578] Referring to Figure 21a, the current block (CU0 or PU0, assuming a size of 2N×N and a horizontal / vertical ratio of 2:1) can configure a candidate group of six candidates in the order l1-t3-planar-DC-l2-tr-tl-l1*-t3*-l2*-tr*-tl*-vertical-horizontal-diagonal mode. In the example above, * denotes the modes derived from the prediction mode of each block (e.g., the modes obtained by adding +1, -1, etc.).
[0579] On the other hand, referring to Figure 21b, the current block (CU1 or PU1, size 2N×N) can configure a candidate group of six candidates in the order l3-c7-planar-DC-bl-k-l1-l3*-c7*-bl*-k*-l1*-vertical-horizontal-diagonal mode.
[0580] In this example, when setting (2) of Figure 18 is applied, the order in Figure 21b can become l3-planar-DC-bl-k-l1-c7-l3*-bl*-k*-l1*-vertical-horizontal-diagonal-c7*; that is, the prediction mode of the upper block is excluded from the candidate group or its priority is pushed back, so it may be included in group A.
[0581] However, the difference from Figure 18 is that the block, even though rectangular, is partitioned as a coding unit and immediately set as a prediction unit without further partitioning. Therefore, the candidate group configuration settings shown in Figure 18 (such as setting (2)) may not be applicable to this example.
[0582] However, in the case of k in Figure 21, since it is a position where encoding / decoding is not yet complete, it can be derived from an adjacent block for which encoding / decoding is complete.
[0583] Figure 22 shows various examples of intra-prediction mode candidate group settings for the block that generates prediction information (in this example, a coding block, N×2N).
[0584] The description is based on the following assumption: in this example there are five intra-prediction modes (for the chroma component; DC mode, planar mode, vertical mode, horizontal mode, and color copy mode in this example), the prediction modes are adaptively prioritized and ordered as a candidate group, and encoding / decoding is performed accordingly.
[0585] At this point, the priority will be described based on the assumption that priority is determined from adjacent blocks (left, top, top left, top right, bottom left in this example).
[0586] Referring to Figure 13, the first-rank candidate can be determined from the left block L3, top block T3, top left block TL, top right block R0, and bottom left block B0. At this time, the mode that occurs most frequently among the prediction modes of these blocks can be determined as the first-rank candidate. If multiple modes share the highest frequency, a preset priority is applied (e.g., color copy mode - planar - vertical - horizontal - DC).
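The selection rule in this paragraph can be sketched as below. The mode names as strings, the function name `first_rank_candidate`, and the fallback used when no neighbor is available are illustrative assumptions; only the "most frequent, then preset priority" rule comes from the text.

```python
from collections import Counter
from typing import List, Optional

# Preset tie-break priority taken from the example in the text.
PRESET_PRIORITY = ["color_copy", "planar", "vertical", "horizontal", "dc"]


def first_rank_candidate(neighbor_modes: List[Optional[str]],
                         preset: List[str] = PRESET_PRIORITY) -> str:
    """Pick the most frequent neighbor mode; break ties with the preset order.

    neighbor_modes holds the modes of L3, T3, TL, R0, B0 (None if unavailable).
    Every mode is expected to appear in `preset`.
    """
    counts = Counter(m for m in neighbor_modes if m is not None)
    if not counts:
        # Assumption: with no usable neighbor, fall back to the preset head.
        return preset[0]
    best = max(counts.values())
    tied = [m for m in counts if counts[m] == best]
    return min(tied, key=preset.index)


tie_case = first_rank_candidate(["dc", "planar", "dc", "planar", "vertical"])
clear_case = first_rank_candidate(["dc", "dc", "planar", None, None])
```

In the tie case both DC and planar occur twice, so the preset order decides in favor of planar.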
[0587] This example can be considered similar to the MPM candidate setting in that a mode estimated to be the prediction mode of the current block is obtained from neighboring blocks, the priority is determined accordingly (i.e., the number of allocated bits is determined; for example, '0' for the first rank and '100', '101', '110', '111' for the second to fifth ranks), and the prediction mode is predicted. In the MPM-like signaling of this example, the first bit is determined to be 0 or 1, and if it is 1, two more bits are added; depending on the encoding / decoding settings, the first bit can be bypass coded or regular coded, and the remaining bits can be bypass coded.
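A minimal sketch of the bit allocation mentioned here, assuming five ranked modes: the highest-ranked mode gets the one-bit codeword '0', and the remaining four get a '1' prefix followed by a two-bit index. The function name `assign_codewords` is hypothetical.

```python
from typing import Dict, List


def assign_codewords(ranked_modes: List[str]) -> Dict[str, str]:
    """Map five modes, ordered by priority, to the codewords 0, 100, 101, 110, 111."""
    if len(ranked_modes) != 5:
        raise ValueError("this sketch assumes exactly five modes")
    codes = {ranked_modes[0]: "0"}
    for index, mode in enumerate(ranked_modes[1:]):
        # First bit 1 signals "not the first-rank mode"; two more bits pick one of four.
        codes[mode] = "1" + format(index, "02b")
    return codes


codes = assign_codewords(["planar", "color_copy", "vertical", "horizontal", "dc"])
```

Because no codeword is a prefix of another, a decoder can read the first bin and then decide whether two more bins follow.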
[0588] Referring to Figure 22a, the current block (CU0 or PU0, assuming a size of N×2N and a horizontal / vertical ratio of 1:2) can determine the first-rank candidate from the blocks l3, t1, tl, t2, and bl.
[0589] On the other hand, referring to Figure 22b, the current block (CU1 or PU1, size N×2N) can determine the first-rank candidate from the blocks c13, t3, t1, tr, and k.
[0590] In this example, when setting (2) of Figure 19 is applied, the current block CU1 can determine the first-rank candidate from the blocks t3, t1, tr, and k. Additionally, if the block corresponding to the current block in the other color space from which the color copy mode is acquired is not configured as a single block, the priority of the color copy mode is pushed back in the preset priority, changing the priority to planar - vertical - horizontal - DC - color copy mode, etc. As described above, this example rests on the following assumption. It differs from the examples that perform classification based on whether the current block and adjacent blocks share the same parent block, the partitioning method, etc.; instead, when the corresponding region in the other color space is configured as multiple blocks, it is more likely to have characteristics different from those of the current block. That is, it can be understood as an example of adaptively determining the encoding / decoding settings based on the relationship between blocks.
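The priority change for the color copy mode can be sketched as follows. The function name and the boolean argument are illustrative assumptions; the two orderings are the ones given in the text.

```python
from typing import List

DEFAULT_CHROMA_PRIORITY = ["color_copy", "planar", "vertical", "horizontal", "dc"]


def chroma_preset_priority(corresponding_is_single_block: bool) -> List[str]:
    """Return the preset priority, deferring color copy in the exception case."""
    if corresponding_is_single_block:
        return list(DEFAULT_CHROMA_PRIORITY)
    # Exception state: the corresponding region is split into multiple blocks,
    # so the color copy mode is moved to the end of the priority list.
    others = [m for m in DEFAULT_CHROMA_PRIORITY if m != "color_copy"]
    return others + ["color_copy"]


normal_order = chroma_preset_priority(True)
exception_order = chroma_preset_priority(False)
```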
[0591] However, the difference from Figure 19 is that the block, even though rectangular, is partitioned as a coding unit and immediately set as a prediction unit without further partitioning. Therefore, the candidate group configuration settings shown in Figure 19 may not be applicable to this example.
[0592] However, in the case of k in Figure 22, since it is a position where encoding / decoding is not yet complete, it can be derived from an adjacent block for which encoding / decoding is complete.
[0593] For ease of description, the above examples are described under certain assumptions, but are not limited thereto, and the same or similar applications may be found in the various embodiments of the present invention described above.
[0594] In the embodiments of Figure 21 and Figure 22, it is assumed that M×N (M≠N) blocks obtainable by binary tree partitioning occur consecutively.
[0595] In the above cases, the situations described for Figures 14 to 16 (examples of setting the candidate group by confirming the relationship between the current block and its neighboring blocks, etc.) may occur. That is, in Figure 21 and Figure 22, the adjacent blocks of CU0 and CU1 before the split are identical to each other, and CU0 and CU1 can be obtained by horizontal or vertical binary tree splitting.
[0596] In addition, in the embodiments of Figure 18 and Figure 19, it is assumed that multiple rectangular prediction blocks occur within the coded block after the coded block is partitioned.
[0597] In the above cases, the situations described for Figures 14 to 16 may also occur, and they may then conflict with the examples of Figure 18 and Figure 19. For example, in the case of PU1 in Figure 18, it can be determined whether to use the information of PU0, whereas in the case of Figures 14 to 16, the information of PU0 is not used because PU1 has characteristics different from those of PU0.
[0598] Regarding the above, candidate groups can be configured based on the initial encoding / decoding settings, assuming no conflicts. Candidate groups for motion information prediction can also be set based on various other encoding / decoding settings.
[0599] The above example illustrates the scenario of setting up a candidate group for the prediction mode. Additionally, it's possible to restrict the use of reference pixels from neighboring blocks marked with exception states for predicting the current block.
[0600] For example, when a prediction block is generated by distinguishing between a first reference pixel and a second reference pixel according to an embodiment of the invention, if the second reference pixel is included in a block marked as an exception, the use of the second reference pixel in generating the prediction block can be restricted. That is, the prediction block can be generated using only the first reference pixel.
[0601] In summary, the above example can be considered as an element in the encoding / decoding setup related to the use of the second reference pixel.
[0602] Various scenarios regarding the intra-frame prediction mode candidate group settings of the present invention will be described.
[0603] In this example, it is assumed that there are 67 intra-prediction modes, consisting of 65 directional modes and 2 non-directional modes (planar and DC). However, this is not a limitation, and other intra-prediction mode configurations can be used. In this example, it is assumed that six candidates are included in the MPM candidate group. However, this is not a limitation, and the MPM candidate group can also be configured with four, five, or seven candidates. In addition, a setting in which no overlapping modes exist in the candidate group is assumed. Furthermore, priority refers to the order that determines whether a candidate is included in the MPM candidate group, but it can also be regarded as an element used to determine the binarization, entropy encoding / decoding settings, etc., for each candidate belonging to the MPM candidate group. The following examples are described with a focus on the luminance component, but the same, similar, or modified applications are possible for the chroma component.
[0604] The modes included in the intra-prediction mode candidate group (e.g., MPM candidate group, etc.) of the present invention can be configured as prediction modes of spatially adjacent blocks, preset prediction modes, and prediction modes derived from the prediction modes included in the candidates, etc. At this time, the rules (e.g., priority, etc.) for configuring the candidate group can be determined according to the encoding / decoding settings.
[0605] The examples described later illustrate the case of a fixed candidate group configuration.
[0606] As an example (1), a fixed priority can be supported for configuring the intra-prediction mode candidate group (the MPM candidate group in this example). For example, a single preset priority can be supported: when adding prediction modes of spatially adjacent blocks to the candidate group, the order is left block (L3 in Figure 13) - top block (T3 in Figure 13) - bottom left block (B0 in Figure 13) - top right block (R0 in Figure 13) - top left block (TL in Figure 13), and when adding preset prediction modes, the order is planar - DC - vertical - horizontal - diagonal mode. Alternatively, a mixed configuration as in the examples above is possible, such as the order left block - top block - planar - DC - bottom left block - top right block - top left block. If the number of candidates cannot be filled even after the candidates are processed according to priority, the modes derived from the included prediction modes (e.g., +1, -1 of the left block mode, +1, -1 of the top block mode, etc.), the vertical mode, the horizontal mode, the diagonal mode, etc., can have the next priority.
[0607] In the example above, when adding the prediction modes of spatially adjacent blocks to the candidate group with the priority left block - top block - bottom left block - top right block - top left block, the prediction mode of the left block (L3 in Figure 13) is included in the candidate group first; if that mode does not exist, the process proceeds to the block with the next priority, and the prediction mode of the top block (T3 in Figure 13) is included in the candidate group. In this way, the modes are included in the candidate group in order, and when the prediction mode of a block is unavailable or overlaps with an already included mode, the process moves on to the next block.
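The fixed-priority construction can be sketched as below. Several details are assumptions made only so the sketch runs: planar and DC are numbered 0 and 1, the diagonal mode is taken as 34, derived modes that fall outside the directional range 2 to 66 are skipped rather than wrapped, and the function name `build_mpm` is hypothetical. The horizontal (18) and vertical (50) numbers follow the Figure 9 numbering quoted elsewhere in this description.

```python
from typing import List, Optional

PLANAR, DC, HOR, DIAG, VER = 0, 1, 18, 34, 50  # PLANAR, DC, DIAG values are assumed


def build_mpm(neighbor_modes: List[Optional[int]], num_mpm: int = 6) -> List[int]:
    """neighbor_modes: modes of left, top, bottom-left, top-right, top-left blocks."""
    mpm: List[int] = []

    def push(mode: Optional[int]) -> None:
        # Skip unavailable or overlapping modes and stop once the list is full.
        if mode is not None and mode not in mpm and len(mpm) < num_mpm:
            mpm.append(mode)

    for mode in neighbor_modes:      # 1) spatial neighbors in fixed priority
        push(mode)
    for mode in (PLANAR, DC):        # 2) preset non-directional modes
        push(mode)
    for base in list(mpm):           # 3) modes derived from included directional modes
        if base >= 2:
            for delta in (1, -1):
                if 2 <= base + delta <= 66:
                    push(base + delta)
    for mode in (VER, HOR, DIAG):    # 4) remaining preset modes
        push(mode)
    return mpm


mpm_a = build_mpm([50, 50, None, 18, None])
mpm_b = build_mpm([2, None, None, None, None])
```

In the first call the duplicate top mode and the two unavailable neighbors are skipped, so planar, DC, and the modes derived from mode 50 fill the remaining slots.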
[0608] As another example, when adding prediction modes of spatially adjacent blocks to the candidate group, the candidate group can be configured in the order left block - top block - bottom left block - top right block - top left block. In this case, the prediction mode of the left block is first taken from the block located at L3. If that mode is unavailable or overlaps with an already included mode, the candidate for the left block is filled from the next sub-blocks of the left block in the order L2, L1, L0. The same or similar settings apply to the top block (T3-T2-T1-T0), bottom left block (B0-B1-B2-B3), top right block (R0-R1-R2-R3), and top left block (TL). For example, if no mode can be added to the candidate group even after proceeding in the order L3-L2-L1-L0, the process can move on to the block with the next priority.
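The sub-block fallback described in this paragraph can be sketched as follows, assuming each neighbor direction is given as a list of sub-block modes in its scan order (for example L3, L2, L1, L0 for the left block). The function names are hypothetical.

```python
from typing import List, Optional


def first_usable(sub_block_modes: List[Optional[int]], taken: List[int]) -> Optional[int]:
    """Return the first mode that is available and not already in the candidate group."""
    for mode in sub_block_modes:
        if mode is not None and mode not in taken:
            return mode
    return None


def spatial_candidates(groups: List[List[Optional[int]]], max_candidates: int) -> List[int]:
    """groups: [left, top, bottom-left, top-right, top-left], each in scan order."""
    selected: List[int] = []
    for sub_block_modes in groups:
        mode = first_usable(sub_block_modes, selected)
        if mode is not None:
            selected.append(mode)
        if len(selected) == max_candidates:
            break
    return selected


groups = [
    [None, 10, 12, 12],        # left: L3 unavailable, so L2 is used
    [10, 10, 20, None],        # top: T3 and T2 overlap with 10, so T1 is used
    [None, None, None, None],  # bottom-left: nothing usable, move on
    [20, 30, 30, 30],          # top-right: R0 overlaps with 20, so R1 is used
    [10],                      # top-left: overlaps, nothing added
]
selected = spatial_candidates(groups, max_candidates=4)
```

Each direction contributes at most one mode, and a direction with no usable sub-block simply yields its turn to the next one.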
[0609] The examples described later illustrate the case of adaptive candidate group configuration. Adaptive candidate group configuration can be determined based on the state of the current block (e.g., the size and shape of the block), the state of neighboring blocks (e.g., the size and shape of the blocks, prediction modes, etc.), or the relationship between the current block and neighboring blocks.
[0610] As an example (3), adaptive priority can be supported for configuring intra-prediction mode candidate groups (MPM candidate groups in this example). Priorities can be set based on frequency. That is, frequently occurring prediction modes can have high priority, while less frequently occurring prediction modes can have low priority.
[0611] For example, priorities can be set based on the frequency of the prediction modes of spatially adjacent blocks.
[0612] A preset priority can be supported for cases with the same frequency. For example, suppose that among the left block, top block, bottom left block, top right block, and top left block (assuming one prediction mode is obtained from each block), the modes that occur twice are mode 6 and mode 31, and the mode that occurs once is mode 14. In this example (assuming that, in the order left block - top block - bottom left block - top right block - top left block, mode 6 occurs in the earlier block), mode 6 and mode 31 are included as the first and second candidates.
[0613] Planar and DC are included as the third and fourth candidates, and mode 14, which has a frequency of 1, is included as the fifth candidate. Modes 5 and 7 and modes 30 and 32, derived from the first and second candidates, can then have the next priority. After that, modes 13 and 15, derived from the fifth candidate, have the next priority, and then the vertical mode, horizontal mode, and diagonal mode can follow.
[0614] That is, prediction modes with a frequency of 2 or higher are assigned priority before the planar mode and DC mode, while prediction modes with a frequency of 1 are assigned priority after the planar mode and DC mode, followed by the derived modes and preset modes as in the example above.
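The frequency-based ordering of this example can be sketched as below. The numbering of planar (0), DC (1), and the diagonal mode (34) is assumed for illustration, derived modes outside 2 to 66 are dropped, and the function name `adaptive_priority` is hypothetical. The input uses the mode numbers 6, 31, and 14 from the text.

```python
from collections import Counter
from typing import List, Optional

PLANAR, DC, HOR, DIAG, VER = 0, 1, 18, 34, 50  # PLANAR, DC, DIAG values are assumed


def adaptive_priority(neighbor_modes: List[Optional[int]]) -> List[int]:
    """Order candidates: frequent modes, planar, DC, single modes, derived, presets."""
    modes = [m for m in neighbor_modes if m is not None]
    counts = Counter(modes)
    first_seen = {}
    for position, mode in enumerate(modes):
        first_seen.setdefault(mode, position)
    directional = [m for m in first_seen if m >= 2]  # kept in first-seen order
    frequent = sorted((m for m in directional if counts[m] >= 2),
                      key=lambda m: (-counts[m], first_seen[m]))
    single = [m for m in directional if counts[m] == 1]
    order = frequent + [PLANAR, DC] + single
    for base in frequent + single:
        order += [base - 1, base + 1]                # derived modes
    order += [VER, HOR, DIAG]
    result: List[int] = []
    for mode in order:
        valid = mode in (PLANAR, DC) or 2 <= mode <= 66
        if valid and mode not in result:
            result.append(mode)
    return result


# left, top, bottom-left, top-right, top-left neighbors
priority = adaptive_priority([6, 31, 14, 6, 31])
mpm = priority[:6]
```

Truncating the ordered list to six entries gives the MPM candidate group for this input; the remaining entries show the order in which further candidates would be considered.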
[0615] In summary, a preset priority (e.g., the order left block - top block - planar - DC - bottom left block - top right block - top left block) can be a priority set in consideration of the statistical characteristics of general images, and an adaptive priority (in this example, based on the frequency of the modes included in the candidate group) can be an example of partially modifying the preset priority in consideration of local characteristics of the image (in this example, planar and DC are fixed, modes that occur twice or more are placed before them, and modes that occur once are placed after them).
[0616] Furthermore, not only the priority of the candidate group configuration but also the binarization and entropy encoding / decoding settings for each candidate belonging to the MPM candidate group can be determined based on the frequency. As an example, the binarization of an MPM candidate with a frequency of m can be determined based on that frequency. Alternatively, context information for the candidate can be adaptively determined based on the frequency. That is, in this example, context information that sets a high selection probability for the mode can be used. Specifically, the context information can be set differently for each value of m from 1 to 4 (in this example, two of the six candidates are non-directional modes, so the maximum frequency is 4).
[0617] Consider a bin index (the order of the bits when binarization produces more than one bit; for example, when an MPM candidate is binarized as '010', the first to third bins are 0, 1, 0) whose bin takes the value 0 or 1, where 0 indicates that the candidate is selected as the MPM and 1 indicates that it is not. This covers both the case where a single bin determines whether the candidate is the final MPM and the case where additional bins must be checked; that is, for '010' above, if the first bin is 0, the second and third bins must also be checked to confirm whether the mode is the final MPM, whereas with a single bin the final MPM can be confirmed immediately from the 0 or 1 of that bin. In this case, context information with a high probability of 0 can be applied (that is, when performing binary arithmetic coding, the probability of 0 can be set to 90% and the probability of 1 to 10%, assuming that in the basic case the probabilities of 0 and 1 are 60% and 40%, respectively), and context adaptive binary arithmetic coding (CABAC) can be applied.
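To illustrate why a context with a higher probability of 0 helps when 0 is the likely bin value, the following sketch computes the ideal (information-theoretic) cost of a bin string under a fixed probability. It is not a CABAC engine, and the function name `ideal_bits` is hypothetical; only the 90% / 10% and 60% / 40% figures come from the text.

```python
import math


def ideal_bits(bins: str, p_zero: float) -> float:
    """Ideal code length, in bits, of a bin string under a fixed probability of 0."""
    total = 0.0
    for b in bins:
        p = p_zero if b == "0" else 1.0 - p_zero
        total += -math.log2(p)
    return total


cost_adapted = ideal_bits("0", 0.9)   # context tuned to a likely MPM hit
cost_default = ideal_bits("0", 0.6)   # basic case from the text
cost_miss = ideal_bits("1", 0.9)      # price paid when the likely value does not occur
```

A bin that matches the high-probability value costs well under one bit, while a mismatch costs more than it would under the basic case, so the adapted context pays off only when the frequency-based estimate is usually right.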
[0618] As an example (4), adaptive priority can be supported for configuring the intra-prediction mode candidate group (the MPM candidate group in this example). For example, the priority can be set based on the directionality of the prediction modes of spatially adjacent blocks.
[0619] At this point, the directionality categories can be divided into the mode group facing the upper right (2 to 17 in Figure 9), the horizontal mode group (18 in Figure 9), the mode group facing the lower right (19 to 49 in Figure 9), the vertical mode group (50 in Figure 9), the mode group facing the lower left (51 to 66 in Figure 9), and the non-directional mode group (planar, DC mode). Alternatively, they can be divided into the horizontal directional mode group (modes 2 to 34 in Figure 9), the vertical directional mode group (modes 35 to 66 in Figure 9), and the non-directional mode group (planar, DC mode), and various other configurations are possible.
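The two category schemes above can be sketched as simple range checks. The category labels as strings, the numbering of planar and DC as 0 and 1, and the function names are assumptions for illustration; the mode ranges are the ones quoted from Figure 9.

```python
PLANAR, DC = 0, 1  # assumed numbering for the non-directional modes


def fine_category(mode: int) -> str:
    """Six-way split: five directional groups plus the non-directional group."""
    if mode in (PLANAR, DC):
        return "non_directional"
    if 2 <= mode <= 17:
        return "upper_right"
    if mode == 18:
        return "horizontal"
    if 19 <= mode <= 49:
        return "lower_right"
    if mode == 50:
        return "vertical"
    if 51 <= mode <= 66:
        return "lower_left"
    raise ValueError(f"mode {mode} is outside the 67-mode range")


def coarse_category(mode: int) -> str:
    """Three-way split: horizontal-oriented, vertical-oriented, non-directional."""
    if mode in (PLANAR, DC):
        return "non_directional"
    if 2 <= mode <= 34:
        return "horizontal_oriented"
    if 35 <= mode <= 66:
        return "vertical_oriented"
    raise ValueError(f"mode {mode} is outside the 67-mode range")


# Count how many MPM candidates fall into each fine category.
mpm = [6, 31, 0, 1, 14, 5]
category_counts = {}
for m in mpm:
    category_counts[fine_category(m)] = category_counts.get(fine_category(m), 0) + 1
```

Per-category counts such as these are the kind of quantity on which binarization or context selection could be based.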
[0620] For example, if the basic candidate group priority is the order left block - top block - planar - DC - bottom left block - top right block - top left block, the candidate group can be configured in that order. After the candidate group is configured, however, the binarization and entropy encoding / decoding settings for each candidate can be determined based on the category. As an example, binarization that allocates fewer bits can be performed for a category that contains a large number of candidates in the MPM candidate group. Alternatively, context information can be adaptively determined based on the category. That is, context information can be determined based on the number of modes included in each category (e.g., when the first category has m modes and the second category has n modes, context information can be determined based on the combination of m and n).
[0621] As an example (5), adaptive priority can be supported based on the size and shape of the current block. For example, priority can be determined based on the size of the block, and priority can be determined based on the shape of the block.
[0622] If the block size is 32×32 or larger, candidates can be included in the candidate group in the order left block - top block - planar - DC - bottom left block - top right block - top left block. If it is smaller than 32×32, candidates can be included in the order left block - top block - bottom left block - top right block - top left block - planar - DC.
[0623] Alternatively, if the block is square, candidates can be included in the candidate group in the order left block - top block - planar - DC - bottom left block - top right block - top left block. If the block is rectangular (horizontally elongated), candidates can be included in the order top block - top right block - top left block - planar - DC - left block - bottom left block. If the block is rectangular (vertically elongated), candidates can be included in the order left block - bottom left block - top left block - planar - DC - top block - top right block. This example can be understood as the case where the blocks adjacent to the longer side of the block come earlier in the order.
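The size-based and shape-based orders of example (5) can be sketched as two lookup functions. The function names and string labels are hypothetical, and treating "32×32 or larger" as both width and height being at least 32 is an assumption; the orders themselves are copied from the two paragraphs above.

```python
from typing import List


def order_by_size(width: int, height: int) -> List[str]:
    """Priority by block size (assumes 'larger' means both sides >= 32)."""
    if width >= 32 and height >= 32:
        return ["left", "top", "planar", "dc", "bottom_left", "top_right", "top_left"]
    return ["left", "top", "bottom_left", "top_right", "top_left", "planar", "dc"]


def order_by_shape(width: int, height: int) -> List[str]:
    """Priority by block shape: neighbors along the longer side come first."""
    if width == height:
        return ["left", "top", "planar", "dc", "bottom_left", "top_right", "top_left"]
    if width > height:  # horizontally elongated
        return ["top", "top_right", "top_left", "planar", "dc", "left", "bottom_left"]
    return ["left", "bottom_left", "top_left", "planar", "dc", "top", "top_right"]


wide_order = order_by_shape(16, 8)
tall_order = order_by_shape(8, 16)
small_order = order_by_size(16, 16)
```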
[0624] As an example (6), adaptive priority can be supported based on the relationship between the current block and its neighboring blocks.
[0625] Referring to Figure 23a, an example is shown of generating a prediction block based on the prediction mode of a neighboring block (in this example, a sub-block of the left block).
[0626] When the prediction mode of the left block (in this example, the upper sub-block of the left block) is a directional mode among modes 51 to 66 of Figure 9 (modes tilted to the right of the vertical mode; Figure 9 is used for ease of description, and only its directional content is referred to), the reference pixels used when generating the prediction block of the left block correspond to the hatched portion (part of TL and T) in the figure. The right region of the left block may in fact be highly correlated with the left region of the left block, but since encoding / decoding has not yet been performed, the prediction block can be generated using reference pixels from the lower region of T, which has already been encoded / decoded.
[0627] Even though prediction is performed from the lower region of part of block T as described above, the final prediction mode is still determined to be one of modes 51 to 66. This can indicate that some region 2400 of the current block also has the same or a similar directionality (or edge, etc.) as that prediction mode.
[0628] In other words, this could mean that the prediction mode of the left block is more likely to be selected as the prediction mode of the current block. Alternatively, it could mean that modes 51 to 66 are more likely to be selected.
[0629] Referring to Figure 23b, and deriving the description from the above example, when the prediction mode of the top block is among modes 2 to 17 of Figure 9 (modes tilted downward from the horizontal mode), it may mean that some region of the current block also has the same directionality as that prediction mode.
[0630] In other words, this could mean that the prediction mode of the top block is more likely to be selected as the prediction mode of the current block. Alternatively, it could mean that modes 2 to 17 are more likely to be selected.
[0631] As described above, when the likelihood of a prediction mode occurring for the current block is determined from the prediction modes of neighboring blocks, the priority can be adaptively determined.
[0632] For example, when the basic candidate group is configured with the priority left block - top block - Planar - DC - bottom-left block - top-right block - top-left block, the priority remains unchanged in the case of Figure 23a, whereas in the case of Figure 23b the order may change to top block - left block - Planar - DC - bottom-left block - top-right block - top-left block.
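A minimal sketch of this reordering rule follows. It assumes the mode numbering of Figure 9 as used in the text (51 to 66 tilted right of vertical, 2 to 17 tilted downward from horizontal); the function names are hypothetical, and keeping the basic order when both conditions hold at once is an assumption, since the description does not cover that case.

```python
BASIC_ORDER = ["left", "top", "planar", "dc",
               "bottom_left", "top_right", "top_left"]


def is_vertical_right(mode):
    """Modes 51..66: tilted to the right of the vertical mode."""
    return 51 <= mode <= 66


def is_horizontal_down(mode):
    """Modes 2..17: tilted downward from the horizontal mode."""
    return 2 <= mode <= 17


def adapt_order(left_mode, top_mode):
    """Move the top block ahead of the left block in the situation of Figure 23b.

    Assumption: if the left block also matches the situation of Figure 23a,
    the basic order is kept.
    """
    order = list(BASIC_ORDER)
    if is_horizontal_down(top_mode) and not is_vertical_right(left_mode):
        order[0], order[1] = order[1], order[0]
    return order
```

Here `adapt_order(60, 30)` keeps the basic order (situation of Figure 23a), while `adapt_order(30, 10)` places the top block first (situation of Figure 23b).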
[0633] Alternatively, in the case of Figure 23a, the order can be changed to left block - (left+1) block - (left-1) block - top block - Planar - DC - bottom-left block - top-right block - top-left block, and in the case of Figure 23b, the order can be changed to top block - (top-1) block - (top+1) block - left block - Planar - DC - bottom-left block - top-right block - top-left block.
[0634] Referring to Figure 24a, an example is shown of generating a prediction block based on the prediction mode of a neighboring block (the left block in this example).
[0635] When the prediction mode of the left block is a directional mode among modes 51 to 66 of Figure 9, the pixels referenced to generate the prediction block of the left block correspond to the hatched portions TL and T in the figure. The right region of the left block may be highly correlated with the left region of the left block, but since encoding / decoding has not yet been performed, the prediction block can be generated using reference pixels from the lower region of T, which has already been encoded / decoded.
[0636] Even though prediction is performed from the lower region of block T as described above, the final prediction mode is still determined to be one of modes 51 to 66. This means that some region of the current block also has the same or a similar directionality (or edge, etc.) as that prediction mode.
[0637] In other words, this could mean that the prediction mode of the left block has a high probability of being selected. Alternatively, it could mean that modes 51 to 66 have a high probability of being selected.
[0638] Referring to Figure 24b, and deriving the description from the above example, when the prediction mode of the top block is among modes 2 to 17 of Figure 9, it may mean that some region of the current block also has the same directionality as that prediction mode.
[0639] In other words, this could mean that the prediction mode of the top block is more likely to be selected as the prediction mode of the current block. Alternatively, it could mean that modes 2 to 17 are more likely to be selected.
[0640] As described above, when the likelihood of a prediction mode occurring for the current block is determined from the prediction modes of neighboring blocks, the priority can be adaptively determined.
[0641] For example, when the basic candidate group is configured with the priority left block - top block - Planar - DC - bottom-left block - top-right block - top-left block, the priority remains unchanged in the case of Figure 24a, whereas in the case of Figure 23b the order may change to top block - left block - Planar - DC - bottom-left block - top-right block - top-left block. In the case of Figure 24a, the priority is the same as before; instead, the binarization and entropy encoding / decoding settings for the prediction mode candidate of the left block in the MPM candidate group can be adaptively determined. As an example, a binarization that allocates fewer bits to the prediction mode candidate of the left block can be chosen. Alternatively, context information can be adaptively determined. That is, in this example, context information that assigns a high selection probability to that candidate can be used.
[0642] If, at a given bin index, a value of 0 indicates that the candidate is selected as the MPM (whether that single bin determines the final MPM or additional bins must be checked to identify the final MPM) and a value of 1 indicates that it is not selected, CABAC can be applied using context information in which the probability of 0 occurring is high.
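The benefit of a context skewed toward 0 can be illustrated with an idealized cost model. The sketch below is not the CABAC state machine; it only uses the standard information-theoretic cost of -log2(p) bits for a bin coded with probability p, and the probability values are hypothetical.

```python
import math


def ideal_bits(bin_value, p_zero):
    """Ideal arithmetic-coding cost, in bits, of one bin under a context
    that assigns probability `p_zero` to the value 0."""
    p = p_zero if bin_value == 0 else 1.0 - p_zero
    return -math.log2(p)


def expected_bits(p_zero_actual, p_zero_context):
    """Average cost per bin when 0 really occurs with `p_zero_actual`
    but the coder uses a context with `p_zero_context`."""
    return (p_zero_actual * ideal_bits(0, p_zero_context)
            + (1.0 - p_zero_actual) * ideal_bits(1, p_zero_context))


# Hypothetical numbers: the left-block candidate is chosen 80% of the time.
matched = expected_bits(0.8, 0.8)     # context skewed toward 0
unmatched = expected_bits(0.8, 0.5)   # equiprobable (bypass-like) coding
```

Under these assumed statistics, the skewed context costs about 0.72 bits per bin on average, against 1 bit for equiprobable coding.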
[0643] Alternatively, in the case of Figure 24a, the order can be changed to left block - (left+1) block - (left-1) block - (left+2) block - (left-2) block - top block - Planar - DC - bottom-left block - top-right block - top-left block, and in the case of Figure 24b, the order can be changed to top block - (top-1) block - (top+1) block - (top-2) block - (top+2) block - left block - Planar - DC - bottom-left block - top-right block - top-left block.
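The orders in paragraphs [0633] and [0643] follow one pattern: the favored neighbor is followed by modes at increasing offsets from it, then by the other neighbor and the remaining candidates. The sketch below generates that pattern with a hypothetical `spread` parameter (1 for [0633], 2 for [0643]). The labels are symbolic, and the last candidate is taken to be the top-left block, as in the basic order of paragraph [0632]; that is an assumption.

```python
TAIL = ["planar", "dc", "bottom_left", "top_right", "top_left"]


def with_offsets(name, spread, plus_first):
    """Return `name` followed by its +/-1 .. +/-spread variants as labels."""
    labels = [name]
    for k in range(1, spread + 1):
        pair = ["%s+%d" % (name, k), "%s-%d" % (name, k)]
        labels += pair if plus_first else pair[::-1]
    return labels


def adaptive_order(case, spread=1):
    """Candidate order for case 'a' (left block favored) or 'b' (top block favored)."""
    if case == "a":
        return with_offsets("left", spread, True) + ["top"] + TAIL
    if case == "b":
        return with_offsets("top", spread, False) + ["left"] + TAIL
    raise ValueError("case must be 'a' or 'b'")
```

For example, `adaptive_order("a", spread=2)` begins with left, left+1, left-1, left+2, left-2 and then continues with the top block, Planar, and DC. A real implementation would also have to map the labels to mode numbers and drop duplicates, which is outside this sketch.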
[0644] Referring to Figure 25a, an example is shown of generating a prediction block based on the prediction mode of a neighboring block (the left block in this example).
[0645] Consider the case in which the prediction modes of the left block and the bottom-left block are directional modes among modes 51 to 66 of Figure 9. T...
Claims
1. An intra-frame prediction method, applied to a decoding device, characterized by comprising: deriving an intra-prediction mode of a current block; determining a sample line from among a plurality of sample lines used for intra-frame prediction of the current block; and performing intra-frame prediction of the current block based on the intra-prediction mode and the determined sample line; wherein a reference sample is interpolated using an interpolation filter, and using the interpolation filter includes using a preset interpolation filter or using one of a plurality of interpolation filter candidate groups; the method further comprising: selectively performing filtering on a first reference sample of the determined sample line based on a first flag indicating whether filtering is to be performed on the first reference sample used for intra-frame prediction; wherein the first flag is derived by the decoding device based on encoding parameters of the current block, the encoding parameters including at least whether intra-frame prediction is applied on a sub-block basis.
2. The intra-frame prediction method according to claim 1, characterized in that, when one of the plurality of interpolation filter candidate groups is used, filter selection information is determined; and the filter selection information is determined based on preset parameters set in the decoding settings, the preset parameters including at least a block size and a prediction mode.
3. The intra-frame prediction method according to claim 1, characterized in that, when the plurality of interpolation filter candidate groups include a 4-tap cubic filter and a 4-tap Gaussian filter, the interpolation filter to be used is determined according to the decoding settings.
4. The intra-frame prediction method according to claim 1, characterized in that the sample line is a reference sample line, and the sample line is determined based on the size and shape of the current block.
5. The intra-frame prediction method according to any one of claims 1 to 4, characterized in that the intra-frame prediction is performed on a sub-block basis within the current block, and the sub-block is determined based on at least one of a second flag indicating whether segmentation is to be performed, segmentation direction information, or segmentation quantity information.
6. An intra-frame prediction method, applied to an encoding device, characterized by comprising: deriving an intra-prediction mode of a current block; determining a sample line from among a plurality of sample lines used for intra-frame prediction of the current block; and performing intra-frame prediction of the current block based on the intra-prediction mode and the determined sample line; wherein a reference sample is interpolated using an interpolation filter, and using the interpolation filter includes using a preset interpolation filter or using one of a plurality of interpolation filter candidate groups; the method further comprising: selectively performing filtering on a first reference sample of the determined sample line based on a first flag indicating whether filtering is to be performed on the first reference sample used for intra-frame prediction; wherein the first flag is encoded and included in a bitstream according to encoding parameters of the current block, the encoding parameters including at least whether intra-frame prediction is applied on a sub-block basis.
7. The intra-frame prediction method according to claim 6, characterized in that, when one of the plurality of interpolation filter candidate groups is used, filter selection information is determined; and the filter selection information is determined based on preset parameters set in the encoding settings, the preset parameters including at least a block size and a prediction mode.
8. The intra-frame prediction method according to claim 6, characterized in that, when the plurality of interpolation filter candidate groups include a 4-tap cubic filter and a 4-tap Gaussian filter, the interpolation filter to be used is determined according to the encoding settings.
9. The intra-frame prediction method according to claim 6, characterized in that the sample line is a reference sample line, and the sample line is determined based on the size and shape of the current block.
10. The intra-frame prediction method according to any one of claims 6 to 9, characterized in that the intra-frame prediction is performed on a sub-block basis within the current block, and the sub-block is determined based on at least one of a second flag indicating whether segmentation is to be performed, segmentation direction information, or segmentation quantity information.
11. A decoding device, characterized by comprising: a prediction mode decoding unit configured to derive an intra-prediction mode of a current block; a reference sample configuration unit configured to determine a sample line from among a plurality of sample lines used for intra-frame prediction of the current block; a prediction block generation unit configured to perform intra-frame prediction of the current block based on the intra-prediction mode and the determined sample line; a reference sample interpolation unit configured to interpolate a reference sample using an interpolation filter, wherein using the interpolation filter includes using a preset interpolation filter or using one of a plurality of interpolation filter candidate groups; and a reference sample filtering unit configured to selectively filter a first reference sample of the determined sample line based on a first flag indicating whether filtering is to be performed on the first reference sample used for intra-frame prediction; wherein the first flag is derived by the decoding device based on coding parameters of the current block, the coding parameters including at least whether intra-frame prediction is applied on a sub-block basis.
12. An encoding device, characterized by comprising: a prediction mode determination unit configured to derive an intra-prediction mode of a current block; a reference sample configuration unit configured to determine a sample line from among a plurality of sample lines used for intra-frame prediction of the current block; a prediction block generation unit configured to perform intra-frame prediction of the current block based on the intra-prediction mode and the determined sample line; a reference sample interpolation unit configured to interpolate a reference sample using an interpolation filter, wherein using the interpolation filter includes using a preset interpolation filter or using one of a plurality of interpolation filter candidate groups; and a reference sample filtering unit configured to selectively filter a first reference sample of the determined sample line based on a first flag indicating whether filtering is to be performed on the first reference sample used for intra-frame prediction, and to encode the first flag and include it in a bitstream according to coding parameters of the current block, wherein the coding parameters include at least whether intra-frame prediction is applied on a sub-block basis.
13. A computing device, characterized by comprising: a memory configured to store a program executable on a processor; and a processor configured to, when executing the program, implement the intra-frame prediction method according to any one of claims 1 to 5, or the intra-frame prediction method according to any one of claims 6 to 10.
14. A computer storage medium, characterized in that the computer storage medium stores a program that, when executed by a processor, implements the intra-frame prediction method according to any one of claims 1 to 5, or the intra-frame prediction method according to any one of claims 6 to 10.
15. A computer storage medium storing a program and a bitstream, characterized in that the program, when executed by a processor, implements the method according to any one of claims 6 to 10 to generate the bitstream.