Intra prediction system and method using fractional pixel block vectors
Intra-template matching prediction using fractional pixel block vectors addresses inefficiencies in existing video encoding technologies, enhancing decoding and encoding accuracy and efficiency.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP LTD
- Filing Date
- 2024-06-14
- Publication Date
- 2026-07-01
Abstract
Description
Technical Field
[0001] (Cross-Reference to Related Applications) This application claims priority to U.S. Provisional Application No. 63/521,071, filed on June 14, 2023, with the title "FRACTIONAL-PEL INTRA TEMPLATE MATCHING", and the entire content of the provisional application is incorporated herein by reference.
[0002] Embodiments of the present invention relate to video encoding.
Background Art
[0003] Digital video has become mainstream and is used in a wide range of applications including digital television, videophone, and video conferencing. Advancements in computing technology and communication technology, along with efficient video encoding techniques, have made these digital video applications possible. Video data can be compressed using various video encoding techniques, whereby encoding of video data can be performed using one or more video encoding standards. Exemplary video encoding standards can include, but are not limited to, Versatile Video Coding (H.266/VVC), High-Efficiency Video Coding (H.265/HEVC), Advanced Video Coding (H.264/AVC), Moving Picture Experts Group (MPEG) coding, Enhanced Compression Model (ECM), etc.
Summary of the Invention
[0004] According to one aspect of the present invention, a decoding method is provided which is performed by a decoder. The method may include the processor analyzing a bitstream to determine an intra-template matching prediction (intraTMP) mode associated with the current block. The method may include the processor obtaining at least one fractional pixel block vector (BV) for decoding the current block. The method may include the processor obtaining a reference block based on the at least one fractional pixel BV. The method may include the processor decoding the current block based on the reference block. The method may include the processor obtaining a converted fractional pixel BV after decoding the current block based on the at least one fractional pixel BV. The method may include the processor storing the converted fractional pixel BV for decoding another block.
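The decoding steps above can be illustrated with a minimal sketch. The text at this point does not specify the BV precision, the interpolation filter, or how the converted BV is derived, so the quarter-sample units, the bilinear interpolation, and the rounding to integer-sample precision below are all assumptions made only for illustration, as are the names `sample_at`, `reference_block`, `convert_bv`, and `decode_block`.

```python
# Hypothetical sketch only: quarter-sample BVs, bilinear interpolation and
# rounding-to-integer conversion are assumptions, not taken from the text.
FRAC_BITS = 2              # assumed quarter-sample precision
SCALE = 1 << FRAC_BITS     # 4 fractional steps per sample


def sample_at(picture, x_q, y_q):
    """Bilinearly interpolate the picture at a quarter-sample position."""
    x0, y0 = x_q >> FRAC_BITS, y_q >> FRAC_BITS
    fx, fy = x_q & (SCALE - 1), y_q & (SCALE - 1)
    x1 = min(x0 + 1, len(picture[0]) - 1)
    y1 = min(y0 + 1, len(picture) - 1)
    top = picture[y0][x0] * (SCALE - fx) + picture[y0][x1] * fx
    bot = picture[y1][x0] * (SCALE - fx) + picture[y1][x1] * fx
    # Weighted sum with rounding; the total weight is SCALE * SCALE.
    return (top * (SCALE - fy) + bot * fy + SCALE * SCALE // 2) // (SCALE * SCALE)


def reference_block(picture, x, y, w, h, bv_q):
    """Fetch the w x h reference block displaced by a fractional BV.

    The caller must ensure the displaced block lies inside the
    already reconstructed area of the picture.
    """
    bx, by = bv_q
    return [[sample_at(picture, (x + i) * SCALE + bx, (y + j) * SCALE + by)
             for i in range(w)] for j in range(h)]


def convert_bv(bv_q):
    """Assumed conversion: round each component to integer-sample precision."""
    return tuple((c + SCALE // 2) >> FRAC_BITS for c in bv_q)


def decode_block(picture, x, y, w, h, bv_q, residual, bv_store):
    """Predict from the reference block, add the residual, store the converted BV."""
    pred = reference_block(picture, x, y, w, h, bv_q)
    recon = [[pred[j][i] + residual[j][i] for i in range(w)] for j in range(h)]
    bv_store[(x, y)] = convert_bv(bv_q)   # kept for decoding later blocks
    return recon
```

For example, with a horizontal ramp picture, a BV of (-10, 0) in quarter-sample units points 2.5 samples to the left, so each predicted sample falls halfway between two reconstructed samples, while the stored BV is the rounded integer displacement.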
[0005] According to another aspect of the present invention, a device for decoding is provided. The device may include a processor and a memory storing instructions. The memory stores instructions that, when executed by the processor, cause the processor to analyze a bitstream and determine the intraTMP mode associated with the current block. The memory stores instructions that, when executed by the processor, cause the processor to obtain at least one fractional pixel BV for decoding the current block. The memory stores instructions that, when executed by the processor, cause the processor to obtain a reference block based on at least one fractional pixel BV. The memory stores instructions that, when executed by the processor, cause the processor to decode the current block based on a reference block. The memory stores instructions that, when executed by the processor, cause the processor to obtain a converted fractional pixel BV after decoding the current block based on at least one fractional pixel BV. The memory stores instructions, and when these instructions are executed by the processor, the processor can store the converted fractional pixel BV for decoding another block.
[0006] According to another aspect of the present invention, a non-transitory computer-readable medium for storing instructions is provided. When executed by a processor, the instructions cause the processor to analyze a bitstream to determine the intraTMP mode associated with the current block. When executed by a processor, the instructions cause the processor to obtain at least one fractional pixel BV for decoding the current block. When executed by a processor, the instructions cause the processor to obtain a reference block based on at least one fractional pixel BV. When executed by a processor, the instructions cause the processor to decode the current block based on the reference block. When executed by a processor, the instructions cause the processor to obtain a converted fractional pixel BV after decoding the current block based on at least one fractional pixel BV. When executed by a processor, the instructions cause the processor to store the converted fractional pixel BV for decoding another block.
[0007] According to yet another aspect of the present invention, an encoding method is provided which is performed by an encoder. The method may include the processor obtaining at least one fractional pixel BV for encoding the current block. The method may include the processor obtaining a reference block based on at least one fractional pixel BV. The method may include the processor encoding the current block based on the reference block. The method may include the processor obtaining a converted fractional pixel BV after encoding the current block based on at least one fractional pixel BV. The method may include the processor storing the converted fractional pixel BV for encoding another block. The method may include the processor encoding the intraTMP mode associated with the current block into a bitstream.
[0008] According to yet another aspect of the present invention, an apparatus for encoding is provided. The apparatus may comprise a processor and a memory storing instructions. The memory stores instructions, which, when executed by the processor, cause the processor to acquire at least one fractional pixel BV for encoding the current block. The memory stores instructions, which, when executed by the processor, cause the processor to acquire a reference block based on at least one fractional pixel BV. The memory stores instructions, which, when executed by the processor, cause the processor to encode the current block based on a reference block. The memory stores instructions, which, when executed by the processor, cause the processor to acquire a converted fractional pixel BV after encoding the current block based on at least one fractional pixel BV. The memory stores instructions, which, when executed by the processor, cause the processor to store a converted fractional pixel BV for encoding another block. The memory stores instructions, which, when executed by the processor, cause the processor to encode the intraTMP mode associated with the current block into a bitstream.
[0009] According to another aspect of the present invention, a non-transitory computer-readable medium for storing encoder instructions is provided. When executed by a processor, the instructions cause the processor to acquire at least one fractional pixel BV for encoding the current block. When executed by a processor, the instructions cause the processor to acquire a reference block based on at least one fractional pixel BV. When executed by a processor, the instructions cause the processor to encode the current block based on the reference block. When executed by a processor, the instructions cause the processor to acquire a converted fractional pixel BV after encoding the current block based on at least one fractional pixel BV. When executed by a processor, the instructions cause the processor to store the converted fractional pixel BV for encoding another block. When executed by a processor, the instructions cause the processor to encode the intraTMP mode associated with the current block into a bitstream.
[0010] These exemplary embodiments are not intended to limit or define the present invention, but rather to provide examples that aid in understanding it. Additional embodiments are discussed in the detailed description, where further explanation is provided.
Brief Description of the Drawings
[0011]
[Figure 1] A block diagram of an exemplary encoding system according to some embodiments of the present invention.
[Figure 2] A block diagram of an exemplary decoding system according to some embodiments of the present invention.
[Figure 3] A detailed block diagram of an exemplary encoder in the encoding system shown in Figure 1, according to some embodiments of the present invention.
[Figure 4] A detailed block diagram of an exemplary decoder in the decoding system shown in Figure 2, according to some embodiments of the present invention.
[Figure 5] An exemplary image divided into coding tree units (CTUs) according to some embodiments of the present invention.
[Figure 6] An exemplary CTU divided into coding units (CUs) according to some embodiments of the present invention.
[Figure 7] A schematic diagram of a current CU block and spatially adjacent and non-adjacent reconstructed samples according to some embodiments of the present invention.
[Figure 8] A schematic diagram of the angular modes of VVC according to some embodiments of the present invention.
[Figure 9A] A diagram showing slice partitioning for intra-prediction according to some embodiments of the present invention.
[Figure 9B] A diagram showing tile partitioning for intra-prediction according to some embodiments of the present invention.
[Figure 9C] A diagram showing wavefront parallel processing for intra-prediction according to some embodiments of the present invention.
[Figure 10A] A diagram illustrating intra block copy (IBC) according to some embodiments of the present invention.
[Figure 10B] A diagram illustrating intra-template matching prediction (intraTMP) according to some embodiments of the present invention.
[Figure 10C] A diagram showing an extended search area for intraTMP according to some embodiments of the present invention.
[Figure 11] A diagram showing fractional pixel positions for intraTMP according to some embodiments of the present invention.
[Figure 12] A flowchart of a decoding method according to some embodiments of the present invention.
[Figure 13] A flowchart of a video encoding method according to some embodiments of the present invention.
Modes for Carrying Out the Invention
[0012] The drawings incorporated in and constituting a part of this specification illustrate embodiments of the present invention and, together with the description of the specification, further explain the principles of the present invention and serve to enable those skilled in the art to make and use the present invention.
[0013] Embodiments of the present invention are described with reference to the drawings.
[0014] Some configurations and arrangements will be discussed, but it should be understood that this is only for the purpose of explanation. Those skilled in the art can recognize that other configurations and arrangements can be used without departing from the spirit and scope of the present invention. It is clear to those skilled in the art that the present invention can also be used in various other applications.
[0015] It should be noted that terms such as "one embodiment", "embodiment", "exemplary embodiment", "some embodiments", "an embodiment", etc. referred to in the specification indicate that the described embodiment may include specific features, structures or characteristics, but not all embodiments necessarily include such specific features, structures or characteristics. Furthermore, such phrases do not necessarily refer to the same embodiment. Also, when describing specific features, structures or characteristics in combination with an embodiment, it is within the knowledge of those skilled in the relevant technical field to implement such features, structures or characteristics in combination with other embodiments, whether explicitly described or not.
[0016] In general, terminology can be understood, at least in part, from its usage in context. For example, the term “one or more” as used herein can be used to describe any singular feature, structure, or characteristic, or a combination of features, structures, or characteristics, and this is at least in part context-dependent. Similarly, terms such as “a,” “an,” or “the” can also be understood to convey singular or plural usage, and this is at least in part context-dependent. Furthermore, the term “based on” can be understood not necessarily to convey a single set of exclusive factors, but to allow for the presence of additional factors not necessarily explicitly described, and this is at least in part context-dependent.
[0017] Here, various aspects of a video encoding system will be described with reference to various devices and methods. These devices and methods will be described in the following detailed description and will be shown in the drawings by various modules, components, circuits, steps, operations, processes, algorithms, etc. (collectively referred to as “elements”). These elements may be implemented using electronic hardware, firmware, computer software, or any combination thereof. Whether these elements are implemented as hardware, firmware, or software depends on the specific application and the design constraints imposed on the overall system.
[0018] The techniques described herein can be used in a variety of video encoding applications. As described herein, video encoding includes both encoding and decoding of video. Video encoding and decoding can be performed in blocks. For example, encoding / decoding operations such as transformation, quantization, prediction, in-loop filtering, and reconstruction can be performed on an encoding block, a transformation block, or a prediction block. As described herein, the block being encoded / decoded is called the “current block”. For example, the current block can represent an encoding block, a transformation block, or a prediction block according to the current encoding / decoding process. Furthermore, it should be understood that the term “unit” as used in this invention refers to a basic unit for performing a particular encoding / decoding operation, while the term “block” refers to an array of samples of a given size. Unless otherwise specified, “block” and “unit” can be used interchangeably.
[0019] Figure 1 shows a block diagram of an exemplary encoding system 100 according to some embodiments of the present invention. Figure 2 shows a block diagram of an exemplary decoding system 200 according to some embodiments of the present invention. Either system 100 or 200 can be applied to or integrated into a variety of data-processing systems and devices, such as computers and wireless communication devices. For example, system 100 or 200 could be all or part of a mobile phone, desktop computer, laptop computer, tablet computer, in-vehicle computer, game console, printer, positioning device, wearable electronic device, smart sensor, virtual reality (VR) device, augmented reality (AR) device, or any other suitable electronic device with data processing capabilities. As shown in Figures 1 and 2, system 100 or 200 may include a processor 102, memory 104, and interface 106. These components are shown connected to each other by a bus, but other connection types are also possible. It should be understood that system 100 or 200 may include any other suitable components for performing the functions described herein.
[0020] Processor 102 may include microprocessors, such as graphics processing units (GPUs), image signal processors (ISPs), central processing units (CPUs), digital signal processors (DSPs), tensor processing units (TPUs), vision processing units (VPUs), neural processing units (NPUs), synergistic processing units (SPUs), or physics processing units (PPUs), as well as microcontroller units (MCUs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), programmable logic devices (PLDs), state machines, gate logic, discrete hardware circuits, and other suitable hardware configured to perform the various functions described in this invention. Although only one processor is shown in Figures 1 and 2, it should be understood that multiple processors may be included. The processor 102 may be a hardware device having one or more processing cores. The processor 102 can execute software. Software is broadly interpreted to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, execution threads, processes, functions, etc., whether they are called software, firmware, middleware, microcode, hardware description languages, or otherwise. Software may include computer instructions written in interpreted languages, compiled languages, or machine code. Other techniques for instructing hardware are also included in the broad category of software.
[0021] Memory 104 may broadly include both memory (also called primary/system memory) and storage devices (also called secondary memory). For example, memory 104 may include random-access memory (RAM), read-only memory (ROM), static RAM (SRAM), dynamic RAM (DRAM), ferro-electric RAM (FRAM®), electrically erasable programmable ROM (EEPROM), Compact Disc Read-Only Memory (CD-ROM), or other optical disc storage devices, magnetic disk storage devices such as hard disk drives (HDD), or other magnetic storage devices, flash drives, solid-state drives (SSD), or any other medium that can be used to carry or store desired program code in the form of instructions that can be accessed and executed by processor 102. More broadly, memory 104 can be implemented by any computer-readable medium, such as non-transitory computer-readable media. Although Figures 1 and 2 show only one memory, it should be understood that multiple memories may be included.
[0022] Interface 106 may broadly include data interfaces and communication interfaces, the communication interfaces being configured to send and receive signals in the process of sending and receiving information with other external network elements. For example, interface 106 may include input / output (I / O) devices and wired or wireless transceivers. Although only one interface is shown in Figures 1 and 2, it should be understood that multiple interfaces may be included.
[0023] The processor 102, memory 104, and interface 106 can be implemented in various forms in system 100 or 200 and perform video encoding functions. In some embodiments, the processor 102, memory 104, and interface 106 of system 100 or 200 are implemented (e.g., integrated) on one or more system-on-chip (SoCs). In one example, the processor 102, memory 104, and interface 106 can be integrated on an application processor (AP) SoC, which performs application processing in an operating system (OS) environment, including the execution of video encoding and video decoding applications. In another example, the processor 102, memory 104, and interface 106 can be integrated on a dedicated processor chip for video encoding, such as a GPU or ISP chip specialized for image and video processing in a real-time operating system (RTOS).
[0024] As shown in Figure 1, in the encoding system 100, the processor 102 may include one or more modules, for example, an encoder 101. Although Figure 1 shows that the encoder 101 resides within one processor 102, it should be understood that the encoder 101 may include one or more submodules that can be implemented on different processors that are close to or far from each other. The encoder 101 (and any corresponding submodule or subunit) may be a hardware unit of the processor 102 (e.g., part of an integrated circuit), which is designed to be used in conjunction with other components or software units implemented by the processor 102 by executing at least some programs (e.g., instructions). The program instructions may be stored in a computer-readable medium such as memory 104 and, when executed by the processor 102, can cause the processor to perform a process having one or more functions related to video encoding, such as image partitioning, inter-prediction, intra-prediction, transformation, quantization, filtering, entropy coding, etc., as will be described in detail below.
[0025] Similarly, as shown in Figure 2, in the decoding system 200, the processor 102 may include one or more modules, for example, a decoder 201. Although Figure 2 shows that the decoder 201 is located within one processor 102, it should be understood that the decoder 201 may include one or more submodules that can be implemented on different processors that are close to or far from each other. The decoder 201 (and any corresponding submodule or subunit) may be a hardware unit of the processor 102 (e.g., part of an integrated circuit) that is designed to be used in conjunction with other components or software units implemented by the processor 102 by executing at least some programs (e.g., instructions). The program instructions may be stored in a computer-readable medium such as memory 104 and, when executed by the processor 102, can cause the processor to perform a process having one or more functions related to video decoding, such as entropy decoding, inverse quantization, inverse transform, inter-prediction, intra-prediction, filtering, etc., as will be described in detail below.
[0026] Figure 3 shows a detailed block diagram of an exemplary encoder 101 in the encoding system 100 of Figure 1, according to some embodiments of the present invention. As shown in Figure 3, the encoder 101 may comprise a partitioning module 302, an inter-prediction module 304, an intra-prediction module 306, a transform module 308, a quantization module 310, an inverse quantization module 312, an inverse transform module 314, a filter module 316, a buffer module 318, and an encoding module 320. Each element shown in Figure 3 is shown independently to represent a different characteristic function in the video encoder, and it should be understood that this does not mean that each component is formed by a separate hardware configuration unit or a single piece of software. In other words, each element is listed as an independent element for convenience of explanation; at least two elements may be combined to form a single element, or one element may be divided into multiple elements to perform functions. Also, it should be understood that some elements are not essential for performing the functions described in the present invention, but are optional elements for improving performance. Furthermore, it should be understood that these elements can be implemented using electronic hardware, firmware, computer software, or any combination thereof. Whether these elements are implemented as hardware, firmware, or software depends on the specific application and design constraints imposed on the encoder 101.
[0027] The partitioning module 302 may be configured to partition a video input image into at least one processing unit. The image may be a video frame or a video field. In some embodiments, the image includes an array of luminance samples in monochrome format, or an array of luminance samples and two corresponding arrays of chrominance samples. In this case, the processing unit may be a prediction unit (PU), a transform unit (TU), or a coding unit (CU). The partitioning module 302 can partition the image into a plurality of combinations of coding units, prediction units, and transform units, and select one combination of coding units, prediction units, and transform units based on a predetermined rule (e.g., a cost function) to encode the image.
[0028] Like H.265/HEVC, H.266/VVC is a block-based hybrid spatial and temporal predictive coding scheme. As shown in Figure 5, during coding, the input image 500 is first divided into square blocks called CTUs 502 by the partitioning module 302. For example, a CTU 502 may be a 128 × 128 pixel block. As shown in Figure 6, each CTU 502 in the input image 500 may be divided into one or more CUs 602 by the partitioning module 302, which can be used for prediction and transformation. Unlike H.265/HEVC, in H.266/VVC, a CU 602 may be rectangular or square and can be coded without further division into prediction or transformation units. For example, as shown in Figure 6, dividing a CTU 502 into multiple CUs 602 may include quadtree division (shown by solid lines), binary tree division (shown by dashed lines), and ternary tree division (shown by dashed lines). According to some embodiments, each CU 602 may be the same size as its root CTU 502, or it may be a subdivision of the root CTU 502 that is as small as a 4 × 4 block.
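As a rough illustration of the partitioning described above, the following sketch computes how many 128 × 128 CTUs cover an image and recursively applies quadtree division down to a minimum size. It covers only the quadtree case (binary and ternary divisions are omitted), and the function names and the `should_split` callback are hypothetical; a real encoder would decide each split with a cost function.

```python
import math


def ctu_grid(width, height, ctu_size=128):
    """Number of CTU columns and rows needed to cover the image."""
    return math.ceil(width / ctu_size), math.ceil(height / ctu_size)


def quadtree_split(x, y, size, should_split, min_size=4):
    """Recursively split a square block into quadrants.

    Returns the leaf blocks as (x, y, size) tuples. `should_split` is a
    hypothetical callback standing in for the encoder's split decision.
    """
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_split(x + dx, y + dy, half, should_split, min_size)
    return leaves
```

For a 1920 × 1080 image this gives a grid of 15 × 9 CTUs, with the bottom row only partially covered by image samples.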
[0029] Referring to Figure 3, the inter-prediction module 304 may be configured to perform inter-prediction on the prediction unit, and the intra-prediction module 306 may be configured to perform intra-prediction on the prediction unit. Whether to perform inter-prediction or intra-prediction on the prediction unit may be decided, and specific information (e.g., intra-prediction mode, motion vector, reference image, etc.) may be determined according to each prediction method. In this case, the processing unit that performs the prediction may be different from the processing unit that determines the prediction method and specific content. For example, the prediction method and prediction mode may be determined in the prediction unit, and the prediction may be performed in the transformation unit. The residual coefficients in the residual block between the generated prediction block and the original block may be input to the transformation module 308. Furthermore, prediction mode information, motion vector information, etc., used for prediction may be encoded into a bitstream by the encoding module 320, along with the residual coefficients or quantization level. It should be understood that in some encoding modes, the original block may be encoded directly without generating a prediction block via the prediction module 304 or 306. Furthermore, it should be understood that in some encoding modes, prediction, transformation, and/or quantization may be skipped.
[0030] In some embodiments, the inter-prediction module 304 can predict prediction units based on information from at least one image that precedes or follows the current image, and in some cases, the inter-prediction module 304 can predict prediction units based on information from a partially encoded region within the current image. The inter-prediction module 304 may include submodules such as a reference image interpolation module, a motion prediction module, and a motion compensation module (not shown). For example, the reference image interpolation module can receive reference image information from the buffer module 318 and generate pixel information at integer-pixel or finer (sub-pixel) precision based on the reference image. For luminance pixels, an 8-tap interpolation filter based on a discrete cosine transform (DCT) with variable filter coefficients can be used to generate sub-pixel information in units of 1/4 pixel. For chrominance signals, a 4-tap interpolation filter based on a DCT with variable filter coefficients can be used to generate sub-pixel information in units of 1/8 pixel. The motion prediction module can perform motion prediction based on the reference image interpolated by the reference image interpolation module. Various methods can be used to calculate motion vectors, such as the full search-based block matching algorithm (FBMA), three-step search (TSS), and the new three-step search algorithm (NTS). Motion vectors can have values in units of 1/2, 1/4, or 1/16 pixel, or integer pixels, based on the interpolated pixels. The motion prediction module can predict the current prediction unit by changing the motion prediction method. For example, various methods such as the skip method, merge method, advanced motion vector prediction (AMVP), and intra-block copy method can be used as motion prediction methods.
[0031] Continuing to refer to Figure 3, in some embodiments, the intra-prediction module 306 can generate prediction units based on information from reference pixels around the current block (i.e., image information in the current image). The reference pixels may be located on reference lines not adjacent to the current block. If a block near the current prediction unit is an inter-predicted block, and therefore its reference pixels are inter-predicted pixels, the reference pixels included in the inter-predicted block can be replaced with reference pixel information from a nearby block on which intra-prediction was performed. That is, if a reference pixel is unavailable, at least one of the available reference pixels can be used in place of the unavailable reference pixel information. In intra-prediction, the prediction modes may include an angular prediction mode that uses reference pixel information based on the prediction direction, and a non-angular prediction mode that does not use direction information when performing the prediction. The mode for predicting luminance information may be different from the mode for predicting chrominance information, and chrominance information can be predicted using the intra-prediction mode information used to predict luminance information, or the predicted luminance signal information. When performing intra-prediction, if the size of the prediction unit is the same as the size of the transformation unit, intra-prediction can be performed on the prediction unit based on the left pixel, top-left pixel, and top pixel of the prediction unit. However, if the size of the prediction unit is different from the size of the transformation unit, intra-prediction can be performed using the reference pixels based on the transformation unit.
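The full search-based block matching mentioned above can be sketched as follows. This is a minimal integer-pixel version using the sum of absolute differences (SAD) as the matching cost; the choice of SAD, the function names, and the skipping of candidates that fall outside the reference image are assumptions for illustration, and sub-pixel refinement on interpolated samples is omitted.

```python
def sad(cur, ref, rx, ry):
    """Sum of absolute differences between `cur` and the block of `ref` at (rx, ry)."""
    return sum(abs(cur[j][i] - ref[ry + j][rx + i])
               for j in range(len(cur)) for i in range(len(cur[0])))


def full_search(cur, ref, x, y, search_range):
    """Exhaustively test every integer displacement within +/- search_range.

    Returns ((dx, dy), cost) for the lowest-cost candidate, or None when no
    candidate lies fully inside the reference image.
    """
    h, w = len(cur), len(cur[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            if rx < 0 or ry < 0 or rx + w > len(ref[0]) or ry + h > len(ref):
                continue  # candidate block falls outside the reference image
            cost = sad(cur, ref, rx, ry)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best
```

The cost of the exhaustive search grows with the square of the search range, which is why faster patterns such as three-step search are also listed in the text.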
[0032] The intra-prediction method can generate a prediction block after applying an adaptive intra-smoothing (AIS) filter to a reference pixel based on the prediction mode. The type of AIS filter applied to the reference pixel may be different. To perform the intra-prediction method, the intra-prediction mode of the current prediction unit can be predicted based on the intra-prediction modes of prediction units located in the vicinity of the current prediction unit. When predicting the prediction mode of the current prediction unit using mode information predicted from adjacent prediction units, if the intra-prediction mode of the current prediction unit is the same as that of a neighboring prediction unit, information indicating that the prediction mode of the current prediction unit is the same as that of a neighboring prediction unit can be transmitted using predetermined flag information. If the prediction mode of the current prediction unit is different from that of a neighboring prediction unit, the prediction mode information of the current block can be encoded using additional flag information.
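The flag-based mode signaling described above can be sketched roughly as follows. The syntax element names (`same_as_neighbor_flag`, `candidate_idx`, `explicit_mode`) and the way the candidate list is built from neighboring modes are hypothetical and do not correspond to the syntax of any particular standard.

```python
def signal_intra_mode(cur_mode, neighbor_modes):
    """Encoder side: build the (hypothetical) syntax elements for the mode."""
    candidates = list(dict.fromkeys(neighbor_modes))  # de-duplicate, keep order
    if cur_mode in candidates:
        # Mode matches a neighbor: send the flag and which neighbor it matches.
        return {"same_as_neighbor_flag": 1,
                "candidate_idx": candidates.index(cur_mode)}
    # Mode differs from all neighbors: send it explicitly.
    return {"same_as_neighbor_flag": 0, "explicit_mode": cur_mode}


def parse_intra_mode(syntax, neighbor_modes):
    """Decoder side: recover the mode from the syntax elements."""
    candidates = list(dict.fromkeys(neighbor_modes))
    if syntax["same_as_neighbor_flag"]:
        return candidates[syntax["candidate_idx"]]
    return syntax["explicit_mode"]
```

Because the decoder rebuilds the same candidate list from already decoded neighbors, a matching mode costs only the flag and a short index instead of a full mode number.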
[0033] As shown in Figure 3, a residual block can be generated that contains residual coefficient information (also referred to herein as “residuals”), that is, the difference between the prediction unit generated by the prediction module 304 or 306 and the original block. The generated residual block can be input to the transformation module 308. Additional details regarding the residuals and transforms that can be used for video coding are provided below.
[0034] In hybrid video coding systems, redundancy in the video signal is first utilized by applying an inter-prediction tool or an intra-prediction tool to each CU. The difference between the original sample of a CU and the predicted block of that CU is generally called the residual. Even after prediction, residuals can still have high spatial correlation. Conditional entropy coding can capture some of the spatial dependence between adjacent samples, but forming an entropy coding statistical model that fully utilizes the spatial correlation in residuals is computationally impractical. In contrast, transform coding is a practical and effective method for spatially decorrelating residuals.
[0035] For example, the transform module 308 can transform residuals using an integerized version of the two-dimensional discrete cosine transform (DCT), which can be applied separably in the horizontal and vertical directions. For an M × N block of residual samples (where M is the width of the block and N is the height of the block), the transform module 308 can obtain the transform coefficients by applying the M × M DCT to each row to obtain intermediate transform coefficients, and then applying the N × N DCT to each column of the intermediate transform coefficients.
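The row-then-column procedure described above can be illustrated with a minimal sketch. It assumes a floating-point orthonormal DCT-II rather than the integerized transform used in an actual codec, and the function names dct_1d and dct_2d_separable are hypothetical.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence (floating point, NOT integerized)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out

def dct_2d_separable(block):
    """block is a list of N rows, each holding M samples (M = width, N = height)."""
    # Horizontal pass: M-point DCT on each of the N rows.
    rows = [dct_1d(row) for row in block]
    # Vertical pass: N-point DCT on each of the M columns of the intermediate result.
    cols = [dct_1d([rows[r][c] for r in range(len(rows))])
            for c in range(len(rows[0]))]
    # Transpose back so the result is indexed [row][column].
    return [[cols[c][r] for c in range(len(cols))] for r in range(len(rows))]
```

For a constant block, all of the energy ends up in the top-left (DC) coefficient, which is the decorrelating behavior the paragraph relies on.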
[0036] In the case of an intra-encoded CU (also referred to herein as "intraCU"), spatially adjacent reconstructed samples are used to predict the current block, and the intra-prediction mode is signaled once for the entire CU. Each CU consists of one or more co-located coding blocks (CBs) corresponding to the color components of the video sequence. For example, consumer video typically employs a 4:2:0 chroma format, in which case each CU consists of one luminance CB and two chroma CBs, each having one quarter of the samples of the luminance CB. Intra-prediction and transform coding are performed at the prediction block (PB) and transform block (TB) levels, respectively. Each CB consists of a single TB, except in intra-subpartition (ISP) mode and in the case of implicit partitioning. For a luminance CB, the maximum side length of the TB is 64 and the minimum side length is 4. Furthermore, the luminance TB is further specified as a W×H rectangular block with width W and height H, where W, H ∈ {4, 8, 16, 32, 64}. For a chroma CB, the maximum side length of the TB is 32, and the chroma TB is a W×H rectangular block with width W and height H, where W, H ∈ {2, 4, 8, 16, 32}; however, to address memory architecture and throughput requirements, blocks of the shapes 2×H and 4×2 are excluded.
[0037] Figure 7 is a schematic diagram 700 of the current CU block 702 and spatially adjacent and non-adjacent reconstructed samples according to several embodiments of the present invention. In Figure 7, numbers 0, 1, 2, ... indicate pixel line indices for the current CU block 702.
[0038] In VVC, an intra-predictive sample of the current block is generated using a reference sample obtained from a reconstructed sample of an adjacent block. In the case of a W×H block, the reference sample consists of a vertical line of 2·H reconstructed samples that are spatially adjacent to the current block, located to the left of the block and extending downward, and a horizontal line of 2·W reconstructed samples that are located to the upper left and above the current block and extending to the right. This "L"-shaped sample set may be called a "reference line" in this invention. The reference line directly adjacent to the current CU block 702 is shown in Figure 7 as the line with index 0.
[0039] Like AVC and HEVC, VVC also supports an angular intra-prediction mode. Angular intra-prediction is a directional intra-prediction method. Compared to HEVC, VVC's angular intra-prediction is modified by improving prediction accuracy and adapting to a new partitioning framework. Improved prediction accuracy is achieved by increasing the number of angular prediction directions and by more accurate interpolation filters, while adaptation to a new partitioning framework is achieved by introducing a wide-angle intra-prediction mode. In VVC, the number of directional modes available in a given block increases from 33 in HEVC to 65. Figure 8 shows the angular modes 800 for VVC.
[0040] The directions of even indices between 2 and 66 are equivalent to the directions of the angular modes supported in HEVC. For a square block, the same number of angular modes are assigned to the top and left sides of the block. On the other hand, rectangular intrablocks, which do not exist in HEVC, are the central part of the VVC partitioning scheme, and additional intraprediction directions are assigned to the longer sides of the block. The additional modes assigned to the longer sides are called Wide-Angle Intra Prediction (WAIP) modes because they correspond to prediction directions that have an angle greater than 45° with respect to the horizontal or vertical modes. As shown in Figure 8, the WAIP modes for a given mode index are defined by mapping the original directional modes to modes that have the opposite direction and whose index offset is equal to 1. For a given rectangular block, the aspect ratio (e.g., the ratio of width to height) is used to determine which angular modes should be replaced by their corresponding wide-angle modes.
[0041] In the case of VVC square blocks, each pair of predicted samples that are adjacent horizontally or vertically are predicted from the adjacent pair of reference samples. In contrast, WAIP extends the angular range of directional prediction beyond 45°, so for coded blocks predicted using WAIP mode, adjacent predicted samples may be predicted from non-adjacent reference samples.
[0042] In addition to the directly adjacent lines of neighboring samples, one of the two non-neighboring reference lines (line 1 and line 2) shown in Figure 7 may contain an input sample for intra-prediction in VVC. In ECM, more non-neighboring reference lines may be used. The use of both neighboring and non-neighboring reference samples is called multiple reference line (MRL) prediction.
[0043] The intra-modes available for MRL are DC mode and angle prediction mode. However, not all of these modes can be combined with MRL for a given block. In VVC, MRL modes are always combined with modes in the Most Probable Mode (MPM) list. Such a combination means that when non-adjacent reference lines are used, the intra-prediction mode is one of the MPMs. The design of such MPM-based MRL prediction modes is motivated by the observation that non-adjacent reference lines are beneficial for texture patterns of sharp, strongly directional edges. In these cases, MPMs are more frequently chosen because there is usually a strong correlation between the texture patterns of adjacent blocks and the current block. On the other hand, choosing a non-MPM for intra-prediction indicates that the edges are not uniformly distributed in adjacent blocks, and therefore the MRL prediction mode is not expected to be very useful in this case. Furthermore, it has been observed that when the intra-prediction mode is a planar mode, MRL does not provide additional coding gain, because this mode is typically used for smooth regions. Therefore, MRL always excludes planar modes, which are one of the MPMs. The angle or DC prediction process in MRL is very similar to that for directly adjacent reference lines. However, for angle modes with non-integer slopes, a DCT-based interpolation filter (DCTIF) is always used. This design choice is supported by experimental results and is also consistent with the empirical observation that MRL is most beneficial for sharp, strongly directional edges, where DCTIF is more suitable because it retains more high frequencies than some other filters.
[0044] From a hardware design perspective, applying multiple reference lines as proposed in earlier methods requires the additional cost of line buffers to hold the extra reference lines. In typical hardware designs, line buffers are part of the on-chip memory architecture for image and video encoding, and minimizing their on-chip area is of paramount importance. To address this issue, MRL is disabled and is not signaled for coding units that touch the upper boundary of the CTU. In this way, the additional buffers for holding non-adjacent reference lines are limited to 128, which is the width of the largest unit size.
[0045] In some known methods, intra-prediction fusion methods have been proposed to improve the accuracy of intra-prediction. More specifically, if the current block is a luminance block, encoded in an angle mode with a non-integer slope rather than ISP mode, and the block size (width * height) is greater than 16, then two prediction blocks generated from two different reference lines are "fused," where the prediction fusion is calculated as a weighted sum of the two prediction blocks. More specifically, in the current signaling method in the bitstream, the first reference line $line_i$ with index $i$ is specified, and the prediction block generated from that reference line using the selected intra prediction mode is denoted as $p(line_i)$, where $p(\cdot)$ represents the operation of generating a prediction block from a reference line using a given intra prediction mode. In known methods, the reference line $line_{i+1}$ is implicitly selected as the second reference line. That is, the second reference line is located one index farther from the current block than the first reference line. Similarly, the predicted block generated from the second reference line is denoted as $p(line_{i+1})$. The weighted sum of the two prediction blocks is obtained according to equation (1) and functions as the predictor of the current block.
[0046] $p_{fusion} = w_0 \cdot p(line_i) + w_1 \cdot p(line_{i+1})$ (1)
Here, $p_{fusion}$ represents the fusion prediction, and $w_0$ and $w_1$ are two weighting coefficients, which are set to 3 / 4 and 1 / 4, respectively, in the experiment.
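As an illustration of equation (1), the following minimal sketch blends two prediction blocks sample by sample. The function name fuse_predictions and the representation of a block as a list of rows are assumptions made for the example; only the weights 3 / 4 and 1 / 4 come from the text.

```python
def fuse_predictions(pred_line_i, pred_line_i_plus_1, w0=0.75, w1=0.25):
    """Weighted sum of two equally sized prediction blocks, per equation (1).

    pred_line_i:         block predicted from reference line i
    pred_line_i_plus_1:  block predicted from reference line i+1
    """
    return [[w0 * a + w1 * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pred_line_i, pred_line_i_plus_1)]
```

A fixed-point implementation would typically replace the floating-point weights with integer weights and a shift; that detail is not specified here.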
[0047] In the intra-prediction method described above, the predictor is derived based on adjacent reference samples. However, this depends on the reference samples available to the current CU. Sample availability depends on two factors: (1) whether the sample has been reconstructed, and (2) whether the sample belongs to a logical unit that the current CU is permitted to use.
[0048] To determine whether a sample has been reconstructed, the VVC partitioning structure is considered. Referring to Figure 5, each image is partitioned into square CTUs, which are processed in raster scan order. When the intra-prediction method is run on the current CU 602 in the current CTU 502, samples belonging to other CTUs that are earlier than the current CTU 502 in the raster scan order are reconstructed and may be available for prediction. Samples belonging to CTUs that are later than the current CTU 502 in the raster scan order are not reconstructed and are therefore unavailable.
[0049] Each CTU502 itself is divided into CUs by a hierarchical structure consisting of quadtree, binary tree, and ternary tree partitions, an example of which is shown in Figure 6. The scan order of the CUs within the CTU502 is determined by the partition structure. In the case of a single-level partition, the partitions are scanned in the following order: (1) from left to right in the case of a horizontal binary or horizontal ternary tree partition, (2) from top to bottom in the case of a vertical binary or vertical ternary tree partition, and (3) from top left, top right, bottom left, and bottom right in the case of a quadtree partition.
[0050] If a partition includes further hierarchical divisions, all CUs within that partition are scanned before proceeding to the CUs of the next partition. Figure 6 shows an example where CTU502 is divided into 15 CUs. Each CU602 in Figure 6 is numbered from 1 to 15 to indicate the scan order. When the intra-prediction method is run on the current CU in the current CTU, samples belonging to other CUs that are earlier than the current CU in the current CTU's partition scan order are reconstructed and made available for prediction. Samples belonging to the current CU, or to CUs that are later than the current CU in the current CTU's partition scan order, are not reconstructed and are therefore unavailable.
[0051] Samples of CTUs that precede the current CTU in the raster scan order are considered reconstructed according to the definition above. However, they are not necessarily available for intra-prediction. In order to be considered available for prediction, they must also belong to a logical unit that the current CU is permitted to use. The image may be divided into sub-image partitions, each sub-image partition containing an integer number of CTUs. Figure 9A shows a diagram of an image slice partition 900 for intra-prediction according to some embodiments of the present invention. Samples belonging to a slice partition 904 other than the slice containing the current CU (of CTU 902) are not available for intra-prediction. Imposing this restriction makes it possible for slices to be decoded independently.
[0052] Figure 9B shows a diagram of image tiling 901 for intra-prediction according to some embodiments of the present invention. Samples belonging to tiling 906 other than the tile containing the current CU (CTU902) are unavailable for intra-prediction. Imposing this restriction makes it possible to decode tiles independently.
[0053] Figure 9C shows a diagram of wavefront parallel processing 903 of an image for intra-prediction according to some embodiments of the present invention.
[0054] The way an intra-prediction method handles the unavailability of reference samples required for prediction (e.g., due to slicing, tiling, or wavefront parallel processing) varies depending on the method. If such samples are unavailable, the method may simply be disabled. Alternatively, some form of extrapolation for the unavailable samples, such as boundary extension, may be performed.
[0055] In the intra-prediction method described with reference to Figures 9A to 9C above, predictors are derived only from spatially adjacent reference samples. However, greater coding gains can be achieved by extending the region of reconstructed samples that can be used to derive predictors. One example of this is the intra-block copy (IBC) mode.
[0056] Figure 10A shows a diagram of IBC1000 according to several embodiments of the present invention.
[0057] Referring to Figure 10A, when the current CU 1004 is predicted by the intra-block copy mode, a block vector (BV) 1010 is signaled, indicating which block in the same image is copied and will function as the predictor 1012 for the current block. The block vector can be signaled by signaling the block vector difference (BVD) in the bitstream, in which case the block vector is determined by adding the BVD to the block vector predictor. Alternatively, if the block vector from a previous CU perfectly matches the current block vector, it can be signaled using the merge flag. Regardless of the signaling mechanism, the block vector points to a location in the same image and indicates a sample block of the same size as the current CU 1004, and this sample block is used as the predictor block for the current CU 1004. Several restrictions can be applied to the block vector. The first restriction is that block vector 1010 may point only to a sample block in the current image that is available for intra-prediction. The second restriction is that block vector 1010 may be limited to a search region defined in the IBC tool, which may be smaller than the current image. For example, in VVC, the IBC search region is the current CTU 1002 and the previous CTU. In ECM, if the size of the CTU is 256x256, the IBC search region is the current CTU row 1006 and the CTU row above it, or if the size of the current CTU is 128x128 or less, the IBC search region is the current CTU row 1006 and the two CTU rows above it.
[0058] To further improve encoding performance, ECM-9.0 proposes a fractional pixel IBC method. More specifically, in addition to the existing integer pixel IBC, 1 / 16 pixel resolution is further supported. The 8-tap luminance filter and the chroma filter used for fractional motion compensation in VVC are used to interpolate fractional pixel values. After an IBC block is encoded, its BV is stored at 1 / 16 pixel resolution for encoding future blocks.
[0059] Figure 10B shows a diagram of intraTMP1001 according to several embodiments of the present invention.
[0060] Referring to Figure 10B, intraTMP is an intra-prediction mode similar to IBC in that the current CU 1022 is predicted by a sample block from the current image. intraTMP can only be selected as a prediction mode for CUs with a size of 64x64 or less. However, unlike IBC, in intraTMP, the block vector 1024 is not signaled in the bitstream. Instead, the decoder 201 compares a given L-shaped template or other shaped template of reconstructed samples adjacent to the current CU 1022 with a template of the same shape of a candidate predictor in a given search region. If the template is L-shaped, the adjacent samples to the left of and above the current CU 1022 or the intraTMP predictor 1026 are used. Let TmpW be the width of the left template region and TmpH be the height of the upper template region. Other template shapes include a left template and an upper template, where the left template includes only the left template region and the upper template includes only the upper template region.
[0061] The intraTMP predictor block is determined by finding the best candidate template that matches the current CU template. The best match can be determined by finding the template that minimizes the sum of absolute differences (SAD) or the sum of absolute transformed differences (SATD), or by comparing hashes between templates. The search algorithm through the search domain can be exhaustive (e.g., scanning templates in the search domain with a shift in sample resolution) or fast (e.g., performing a coarse search first, then a localized detailed search around the best match from the coarse search). In any case, the search algorithm is executed identically by both encoder 101 and decoder 201, so the intraTMP predictor is implicitly known to encoder 101 and decoder 201 without signaling in the bitstream. Figure 10B shows an example of intraTMP, where the current CU template and the best matching template are shown in hatched shading.
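The exhaustive SAD-based variant of this search can be sketched as follows. This is an illustration only: the picture is assumed to be a list of rows rec[y][x] of reconstructed samples, the candidate positions are assumed to have already been restricted to the search region by the caller, the top-left corner samples are left out of the L-shaped template, and all function names are hypothetical.

```python
def l_template(rec, x, y, blk_w, blk_h, tmp_w, tmp_h):
    """Collect the L-shaped template of the block whose top-left sample is (x, y)."""
    # tmp_h rows directly above the block, spanning the block width.
    above = [rec[y - dy][x + dx] for dy in range(1, tmp_h + 1) for dx in range(blk_w)]
    # tmp_w columns directly left of the block, spanning the block height.
    left = [rec[y + dy][x - dx] for dy in range(blk_h) for dx in range(1, tmp_w + 1)]
    return above + left

def sad(a, b):
    """Sum of absolute differences between two equally long sample lists."""
    return sum(abs(p - q) for p, q in zip(a, b))

def best_template_match(rec, cu_x, cu_y, blk_w, blk_h, tmp_w, tmp_h, candidates):
    """Return (cost, bv_x, bv_y) of the candidate whose template best matches."""
    cur = l_template(rec, cu_x, cu_y, blk_w, blk_h, tmp_w, tmp_h)
    best = None
    for x, y in candidates:
        cost = sad(cur, l_template(rec, x, y, blk_w, blk_h, tmp_w, tmp_h))
        if best is None or cost < best[0]:
            best = (cost, x - cu_x, y - cu_y)  # block vector relative to the CU
    return best
```

Because the function depends only on reconstructed samples, an encoder and a decoder running it with the same inputs arrive at the same block vector, which is why no block vector needs to be signaled.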
[0062] Continuing to refer to Figure 10B, for the intraTMP predictor 1026 to be selected, the sample block corresponding to the intraTMP predictor 1026 must be completely contained within the search area. The search area is shown by dashed shading in Figure 10B. Within the current CTU 1020, the search area is limited to a rectangular block of samples, with one corner constrained by the upper left corner of the current CTU 1020 and the other corner constrained by the upper left corner of the current CU 1022.
[0063] Outside the current CTU 1020, the search region is limited by imposing a maximum length on the intraTMP block vector (searchRangeWidth 1028, searchRangeHeight 1030), where searchRangeWidth 1028 and searchRangeHeight 1030 are set proportionally to the dimensions of the current CU 1022. That is, searchRangeWidth = a * BlkW and searchRangeHeight = a * BlkH, where "a" is a constant controlling the gain / complexity tradeoff, and BlkW and BlkH are the width and height of the current CU 1022, respectively. Here, "a" is set to 5 in the ECM-7.0 test software. searchRangeHeight 1030 limits the length of the block vector only in the negative vertical direction (i.e., upwards in the image). For block vectors with a positive vertical component, the search region is limited by the lower boundary of the current CTU row. For example, in Figure 10B, the search area extends to the lower boundary of the left CTU 1032, regardless of the value of searchRangeHeight 1030. Furthermore, these limitations on the search range do not apply within the current CTU 1020. For example, even in the case of a small CU where searchRangeWidth 1028 and searchRangeHeight 1030 may be smaller than the dimensions of the current CTU 1020, the search area still extends to the upper left corner of the current CTU 1020.
[0064] In addition to the constraints imposed by the search region, the intraTMP predictor 1026 and its template must consist of samples available for intra-prediction. For example, the search region boundary is still overridden by image, slice, or tile boundaries. Let the coordinates of the upper-left corner of the current CU relative to the image be (currCuX, currCuY). In this case, the left boundary of the intraTMP search region is initially intraTmpLeftBound = currCuX - searchRangeWidth. To take image boundaries into account, the left boundary is clipped to allow for a TmpW sample width for the predictor template: intraTmpLeftBound = max(intraTmpLeftBound, TmpW).
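The left-boundary computation can be sketched as below. The function name and snake_case parameter names are hypothetical; the default a = 5 is the ECM-7.0 value mentioned in the text, and only the image-boundary clip is modeled (slice and tile boundaries are not).

```python
def intra_tmp_left_bound(curr_cu_x, blk_w, tmp_w, a=5):
    """Left boundary of the intraTMP search region, clipped at the image edge."""
    search_range_width = a * blk_w            # searchRangeWidth = a * BlkW
    left = curr_cu_x - search_range_width     # initial, unclipped boundary
    # Keep tmp_w columns free at the image edge for the predictor's left template.
    return max(left, tmp_w)
```

For example, an 8-sample-wide CU at x = 100 with a 4-sample template yields a left bound of 60, while the same CU at x = 16 is clipped to 4.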
[0065] To speed up the template matching process, the search area is initially traversed both horizontally and vertically in increments of 2 pixels at a time. This is also known as a search subsampling factor of 2. This reduces the search complexity of template matching by a factor of 4. After finding the optimal match from the initial search, a refinement process is performed. Refinement is carried out through a second template matching search in a reduced range around the optimal match. In ECM-7.0, the reduced range is set to BlkH / 2.
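The coarse-then-refine strategy can be sketched generically as follows. The cost callback stands in for a template matching cost such as SAD; the function name, the inclusive (lo, hi) range convention, and the tie-breaking by tuple ordering are assumptions of this example, not details taken from ECM.

```python
def two_stage_search(cost, x_range, y_range, refine_range, step=2):
    """Coarse search on a subsampled grid, then full-resolution refinement.

    cost(x, y) returns the matching cost at a candidate position.
    x_range and y_range are inclusive (lo, hi) tuples bounding the search area.
    """
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    # Stage 1: visit every `step`-th position in both directions.
    _, cx, cy = min((cost(x, y), x, y)
                    for y in range(y_lo, y_hi + 1, step)
                    for x in range(x_lo, x_hi + 1, step))
    # Stage 2: visit every position within refine_range of the coarse best,
    # clipped so the refinement never leaves the search area.
    return min((cost(x, y), x, y)
               for y in range(max(y_lo, cy - refine_range),
                              min(y_hi, cy + refine_range) + 1)
               for x in range(max(x_lo, cx - refine_range),
                              min(x_hi, cx + refine_range) + 1))
```

With step = 2 the first stage evaluates roughly one quarter of the positions, and the second stage recovers positions that the coarse grid skipped.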
[0066] Figure 10C shows a diagram of the extended search area for intraTMP1003 according to some embodiments of the present invention.
[0067] Referring to Figure 10C, for small CUs, the search range can become excessively restrictive, making it difficult to obtain good predictors. For example, a 4x4 CU is only allowed a maximum search range of (20,20). To improve this situation for small CUs, some implementations impose a minimum limit on the intraTMP search range. For example, searchRangeWidth=max(a*BlkW,minSearchRange) and searchRangeHeight=max(a*BlkH,minSearchRange), where minSearchRange is set to 128.
[0068] This implementation excludes some areas within the current CTU 1020 that are available for prediction. Here, it is proposed to expand the search area within the current CTU 1020 to include the areas directly above and to the left of the current CU 1022. The proposed modified search area is shown in Figure 10C, where the areas added to the search area compared to Figure 10B are marked with hatched shading.
[0069] Figure 11 shows a diagram of fractional pixel position 1100 for intraTMP according to some embodiments of the present invention.
[0070] Referring to Figure 11, ECM-9.0 employs a multi-candidate intraTMP. A candidate list is constructed, candidate BVs are arranged in ascending order of their template matching cost, and the index of the selected candidate is signaled in a bitstream.
[0071] In ECM-9.0, fractional pixel precision is enabled for intraTMP. More specifically, an intraTMP block can have a BV with a 1 / 4 pixel fractional resolution. Three fractional pixel offsets (e.g., 1 / 2 pixel, 1 / 4 pixel, and 3 / 4 pixel) are supported in eight directions around an integer pixel position, thereby generating the fractional pixel positions shown in Figure 11. When a non-zero fractional pixel offset is signaled, a direction index is signaled to indicate which direction is used. A 4-tap DCT-IF interpolation filter in ECM is used for sub-pixel interpolation in intraTMP.
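The count of fractional positions implied by this paragraph (three offsets in each of eight directions) can be checked with a short sketch. It assumes that the eight directions are the horizontal, vertical, and diagonal neighbors of the integer position, and it expresses offsets in quarter-pixel units; the actual direction index assignment in ECM is not reproduced here, and the function name is hypothetical.

```python
def fractional_positions():
    """Enumerate (dx, dy) offsets, in quarter-pel units, around an integer position."""
    # ASSUMPTION: the eight directions are the 8-connected neighbor directions.
    directions = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                  if (dx, dy) != (0, 0)]
    magnitudes = (1, 2, 3)  # 1/4, 1/2, and 3/4 pixel in quarter-pel units
    return [(m * dx, m * dy) for m in magnitudes for dx, dy in directions]
```

Under these assumptions the function yields 24 distinct fractional positions around each integer pixel position.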
[0072] ECM-9.0 further employs filtered intraTMP, in which the intraTMP prediction block is refined by a derived filter model. The model parameters are derived using the current block template and the corresponding matching template. The prediction block is obtained by applying the derived model as a filter to the reference block.
[0073] ECM-9.0 employs a fusion method that uses a Wiener filter-based weight derivation method to blend multiple reference blocks and derive the final prediction block. The block vectors (BVs) of these reference blocks are obtained through a template matching search process.
[0074] ECM-9.0 utilizes three additional intraTMP modes, such as left template, top template, and L-shaped fusion mode. The left template and top template modes derive template matching candidates using only the left or top template, while the L-shaped fusion mode uses both the left and top templates. The fusion mode fuses the two best or five best L-shaped candidates using a linear combination formula based on template matching cost or mean-squared error (MSE) minimization.
[0075] In the current ECM-9.0, the syntax related to intraTMP is as shown in Table 1.
[0076] [Table 1]
[0077] Referring to Table 1, intra_tmp_flag indicates whether the current block's intra-prediction type is intraTMP, intra_tmp_fusion_flag indicates whether fusion is used for the current block, and intra_tmp_fusion_idx specifies the candidate set used for intraTMP fusion. The range of intra_tmp_fusion_idx is from 0 to 2, and intra_tmp_fusion_idx is used to indicate one of three candidate sets {BV0~BV4}, {BV5~BV9}, and {BV10~BV14}. intra_tmp_fusion_weight_type indicates whether a SAD-based weight derivation method or a Wiener-filter-based weight derivation method is used. intra_tmp_idx specifies the index of the BV in the candidate list used for the current block. The range of intra_tmp_idx is from 0 to 18. Candidates from the L-shaped template, top template, and left template are included in the same candidate list. intra_tmp_sub_pel_precision_idx specifies the precision index of the current block. intra_tmp_sub_pel_precision_idx ranges from 0 to 3, indicating integer pixel precision, 1 / 2 pixel precision, 1 / 4 pixel precision, and 3 / 4 pixel precision, respectively. intra_tmp_sub_pel_direction_idx specifies the sub-pixel direction index of the current block. intra_tmp_sub_pel_direction_idx ranges from 0 to 7.
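The index-to-meaning mappings stated above can be written out as a small sketch. The helper name fusion_candidate_set and the table name SUB_PEL_PRECISION are hypothetical; the mappings themselves follow the ranges given in the text.

```python
from fractions import Fraction

def fusion_candidate_set(intra_tmp_fusion_idx):
    """Return the BV candidate indices selected by intra_tmp_fusion_idx (0..2)."""
    if not 0 <= intra_tmp_fusion_idx <= 2:
        raise ValueError("intra_tmp_fusion_idx must be in the range 0..2")
    start = 5 * intra_tmp_fusion_idx          # sets {0..4}, {5..9}, {10..14}
    return list(range(start, start + 5))

# intra_tmp_sub_pel_precision_idx -> fractional offset magnitude in pixels.
SUB_PEL_PRECISION = {
    0: Fraction(0),      # integer pixel precision
    1: Fraction(1, 2),   # 1/2 pixel
    2: Fraction(1, 4),   # 1/4 pixel
    3: Fraction(3, 4),   # 3/4 pixel
}
```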
[0078] In the current ECM-9.0, an intraTMP block can be encoded with fractional pixel BV resolution only if the block is not encoded as a fused intraTMP (e.g., intra_tmp_fusion_flag is 1) or a filtered intraTMP (e.g., intra_tmp_filter_flag is 1). If a block is encoded as a fused intraTMP or a filtered intraTMP, the intraTMP block will only have integer pixel (full pixel) resolution BV.
[0079] After an intraTMP block is encoded, regardless of whether the current intraTMP encoded block has an integer pixel BV or a quarter-pixel fractional resolution BV, only the integer pixel BV information of the current intraTMP block is stored for future block encoding. More specifically, if the current intraTMP has a quarter-pixel fractional BV, the quarter-pixel fractional BV is first rounded to an integer pixel resolution. Then, the integer pixel BV is converted to a 1 / 16 pixel resolution (the current integer pixel BV is left-shifted by 4). The converted 1 / 16 pixel resolution BV is stored in ECM-9.0 for future block encoding.
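The storage behavior described in this paragraph, in which the fractional part of the BV is discarded before storage, can be sketched per BV component as follows. The rounding rule (round to nearest, with halves toward positive infinity) is an assumption, since the text does not specify it; the function name is hypothetical.

```python
def stored_bv_component(bv_quarter_pel):
    """Convert one quarter-pel BV component to the stored 1/16-pel value."""
    # Round the quarter-pel value to the nearest integer pixel.
    # ASSUMPTION: halves round toward +infinity; ECM may use a different rule.
    bv_int = (bv_quarter_pel + 2) >> 2
    # Express the integer-pel BV in 1/16-pel units (left shift by 4).
    return bv_int << 4
```

A quarter-pel component of 5 (1.25 pixels) is thus stored as 16, i.e., exactly 1 pixel in 1 / 16 pixel units, so the fractional information is lost for later blocks.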
[0080] Referring again to Figure 3, the transform module 308 can convert the video signal in the residual block from the pixel domain to the transform domain (e.g., the frequency domain, depending on the transform method). It should be understood that in some examples, the transform module 308 may be skipped and the video signal may not need to be converted to the transform domain.
[0081] The quantization module 310 may be configured to quantize the coefficient at each position in the coding block to generate a quantization level for that position. The current block may be a residual block. That is, the quantization module 310 can perform a quantization process on each residual block. A residual block may contain N × M positions (samples), each position associated with a transformed or untransformed video signal / data (e.g., luminance information and / or chroma information), where N and M are positive integers. Before quantization, the transformed or untransformed video signal at a particular position is referred to herein as a “coefficient”. After quantization, the quantized value of the coefficient is referred to herein as a “quantization level” or “level”.
[0082] Quantization can be used to reduce the dynamic range of a converted or unconverted video signal, resulting in fewer bits being used to represent the video signal. Quantization typically involves division by the quantization step size followed by rounding, while inverse quantization (also called dequantization) involves multiplication by the quantization step size. The quantization step size can be denoted by the quantization parameter (QP). This type of quantization is called scalar quantization. Quantization of all coefficients within a coding block can be performed independently, and this method of quantization is used in several existing video compression standards, such as H.264 / AVC and H.265 / HEVC. The QP in quantization can affect the bitrate used to encode / decode the video image. For example, a higher QP may result in a lower bitrate, and a lower QP may result in a higher bitrate.
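Scalar quantization and its inverse can be sketched as follows. The step size is passed in directly rather than derived from a QP, and rounding halves away from zero is an assumption of this example; the function names are hypothetical.

```python
def quantize(coeff, step):
    """Divide by the quantization step size and round to the nearest level."""
    sign = -1 if coeff < 0 else 1
    # ASSUMPTION: halves are rounded away from zero.
    return sign * int(abs(coeff) / step + 0.5)

def dequantize(level, step):
    """Inverse quantization: multiply the level by the quantization step size."""
    return level * step
```

For instance, a coefficient of 37 with a step size of 8 gives level 5 and reconstructs to 40; doubling the step size to 16 gives level 2, illustrating how a coarser step reduces the range of values to be coded at the cost of a larger reconstruction error.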
[0083] In the case of an N×M coded block, the two-dimensional (2D) coefficients of the block can be converted to a one-dimensional (1D) order in a specific coded scan order for the quantization and coding of the coefficients. Typically, the coded scan starts from the top-left corner of the coded block and stops at the bottom-right corner, or the last non-zero coefficient / level in the bottom-right direction. The coded scan order can include any suitable order, such as a "Z"-shaped scan order, a vertical (column) scan order, a horizontal (row) scan order, a diagonal scan order, or any combination thereof. The quantization of coefficients within the coded block can utilize coded scan order information. For example, it may depend on the state of the previous quantization level in the coded scan order. To further improve coding efficiency, the quantization module 310 may use multiple quantizers (e.g., two scalar quantizers). Which quantizer is used to quantize the current coefficient may depend on the information preceding the current coefficient in the coded scan order. Such a quantization process is called dependent quantization.
[0084] Referring to Figure 3, the encoding module 320 may be configured to encode the quantization level at each position in the encoding block into a bitstream. In some embodiments, the encoding module 320 can perform entropy coding on the encoding block. Entropy coding can convert each quantization level into a corresponding binary representation (e.g., binary bin) using various binarization methods (e.g., Golomb-Rice binarization). The binary representation can then be further compressed using an entropy coding algorithm. The compressed data can be added to the bitstream. In addition to quantization levels, the encoding module 320 can encode various other information, such as block type information, prediction mode information, partition unit information, prediction unit information, transform unit information, motion vector information, reference frame information, block interpolation information, and filtering information input from, for example, prediction modules 304 and 306. In some embodiments, the encoding module 320 can perform residual coding on the encoding block to convert the quantization levels into a bitstream. For example, after quantization, there may be N×M quantization levels for an N×M block. These N × M levels can be zero or non-zero values. If the non-zero levels are not binary, they can be further binarized into bins using, for example, a combination of truncated Rice (TR) and limited k-th order Exp-Golomb (EGk) binarization.
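Plain (untruncated) Golomb-Rice binarization, mentioned above as one possible binarization method, can be sketched as follows. The unary convention (a run of 1 bins terminated by a 0 bin) and the function name are assumptions of this example; the truncated and Exp-Golomb variants are not modeled.

```python
def golomb_rice_bins(value, k):
    """Binarize a non-negative integer with Rice parameter k into a bin string."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    # Prefix: quotient value >> k in unary (ASSUMPTION: 1s terminated by a 0).
    prefix = "1" * (value >> k) + "0"
    # Suffix: the k least significant bits of value, most significant first.
    suffix = format(value & ((1 << k) - 1), "0{}b".format(k)) if k > 0 else ""
    return prefix + suffix
```

With k = 2, the value 9 has quotient 2 and remainder 1, giving the bin string 11001. The resulting bins would then be passed to the arithmetic coder.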
[0085] Non-binary syntactic elements can be mapped to binary codewords. A bijective mapping between symbols and codewords (usually using simple structured coding) is called binarization. Binary arithmetic coding can be used to encode binary symbols (also called bins) for both binary syntactic elements and codewords used for non-binary data. The core coding engine of context-adaptive binary arithmetic coding (CABAC) can support two operating modes: a context coding mode where bins are encoded with adaptive probabilistic models, and a less complex bypass mode using a fixed probability of 1 / 2. Adaptive probabilistic models are also called contexts, and the assignment of probabilistic models to each bin is called context modeling.
[0086] As shown in Figure 3, the inverse quantization module 312 may be configured to perform inverse quantization on the quantization level, and the inverse transform module 314 may be configured to perform inverse transform on the coefficients transformed by the transform module 308. The reconstructed residual blocks generated by the inverse quantization module 312 and the inverse transform module 314 can be combined with the prediction units predicted by the prediction module 304 or 306 to generate a reconstructed block.
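As a non-limiting illustration, the combination of a reconstructed residual block with a prediction block can be sketched as follows. The function name, the list-of-rows block layout, and the clipping to the sample bit depth are assumptions made for illustration and are not taken from the description above.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add the residual to the prediction sample by sample.

    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```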
[0087] The filter module 316 may include at least one of the following: a deblocking filter, a sample adaptive offset (SAO), and an adaptive loop filter (ALF). The deblocking filter can remove block distortion generated by the boundaries between blocks in the reconstructed image. The SAO module can correct the offset to the original video on a pixel-by-pixel basis for the video on which deblocking has been performed. The ALF can be performed based on values obtained by comparing the reconstructed and filtered video with the original video. The buffer module 318 may be configured to store the reconstructed blocks or images calculated by the filter module 316, and can provide the reconstructed and stored blocks or images to the inter-prediction module 304 when inter-prediction is performed.
[0088] Figure 4 shows a detailed block diagram of an exemplary decoder 201 in the decoding system 200 of Figure 2, according to several embodiments of the present invention. As shown in Figure 4, the decoder 201 may comprise a decoding module 402, an inverse quantization module 404, an inverse transform module 406, an inter-prediction module 408, an intra-prediction module 410, a filter module 412, and a buffer module 414. Each element shown in Figure 4 is shown independently to represent a different characteristic function in the video decoder, and it should be understood that this does not mean that each component is formed by a separate hardware configuration unit or a single piece of software. In other words, for the sake of explanation, each element is included as an independent element, and at least two elements may be combined to form a single element, or one element may be divided into multiple elements to perform functions. Also, it should be understood that some elements are not essential for performing the functions described in the present invention, but are optional elements for improving performance. Furthermore, it should be understood that these elements can be implemented using electronic hardware, firmware, computer software, or any combination thereof. Whether these elements are implemented as hardware, firmware, or software depends on the specific application and design constraints imposed on decoder 201.
[0089] When a video bitstream is input from a video encoder (e.g., encoder 101), the input bitstream can be decoded by decoder 201 in the reverse process of the video encoder process. Therefore, for the sake of explanation, some details of the decoding that were described above with respect to encoding can be omitted. Decoding module 402 may be configured to decode the bitstream to obtain various information encoded in the bitstream, such as the quantization level of each position in the coding block. In some embodiments, decoding module 402 may perform entropy decoding (decompression) corresponding to the entropy coding (compression) performed by the encoder, such as variable-length coding (VLC), context-adaptive variable-length coding (CAVLC), CABAC, syntax-based binary arithmetic coding (SBAC), PIPE coding, etc., to obtain a binary representation (e.g., binary bins). The decoding module 402 may further convert the binary representation to a quantization level using Golomb-Rice binarization (including, for example, EGk binarization and combinations of TR and restricted EGk binarization). In addition to the quantization level of each position in the transform unit, the decoding module 402 may decode various other information, such as parameters used for Golomb-Rice binarization (e.g., Rice parameters), block type information of the coding unit, prediction mode information, partition unit information, prediction unit information, transform unit information, motion vector information, reference frame information, block interpolation information, and filtering information. During the decoding process, the decoding module 402 may perform rearrangement on the bitstream to reconstruct and rearrange the data from a 1D order to 2D rearranged blocks through a reverse scan method based on the coding scan order used by the encoder.
[0090] The inverse quantization module 404 may be configured to perform inverse quantization on the quantization level of each position in an encoded block (e.g., a 2D reconstructed block) to obtain coefficients for each position. In some embodiments, the inverse quantization module 404 may further perform dependent inverse quantization based on quantization parameters provided by the encoder, the quantization parameters including information related to the quantizer used in the dependent quantization, such as the quantization step size used by each quantizer.
[0091] The inverse transform module 406 may be configured to perform inverse transforms, e.g., inverse discrete cosine transform (DCT), inverse discrete sine transform (DST), and inverse Karhunen-Loeve transform (KLT), corresponding to the DCT, DST, and KLT performed by the encoder, respectively, to inversely transform the data from the transform domain (e.g., coefficients) to the pixel domain (e.g., luminance and / or chrominance information). In some embodiments, the inverse transform module 406 may selectively perform the transform operation (e.g., DCT, DST, KLT) based on multiple pieces of information, such as the prediction method, the current block size, and the prediction direction.
[0092] The inter-prediction module 408 and the intra-prediction module 410 may be configured to generate prediction blocks based on information related to the generation of prediction blocks provided by the decoding module 402 and information of previously decoded blocks or images provided by the buffer module 414. As described above, when intra-prediction is performed in the same manner as the encoder operation, if the size of the prediction unit and the size of the transform unit are the same, intra-prediction can be performed on the prediction unit based on the pixels to the left of the prediction unit, the pixels to the upper left, and the pixels to the top of the prediction unit. However, when performing intra-prediction, if the size of the prediction unit is different from the size of the transform unit, intra-prediction can be performed using reference pixels based on the transform unit.
[0093] In existing intraTMP methods, integer pixel BVs are stored for the encoded intraTMP blocks. This can result in suboptimal encoding performance.
[0094] Continuing to refer to Figure 4, to overcome these and other challenges, the present invention proposes that the intra-prediction module 410 directly converts the quarter-pixel fractional BV (instead of integer pixel BV) to a 1 / 16 pixel resolution (by left-shifting the current quarter-pixel BV by 2). The intra-TMP BV thus converted is called the converted fractional pixel BV. The intra-prediction module 410 then stores the converted fractional pixel BV after encoding the fractional pixel intra-TMP block. When encoding future blocks, the stored converted fractional pixel BV can be referenced by an encoding mode that reuses the BV stored by the intra-prediction module 410. Examples of such encoding modes include IBC-advanced motion vector prediction (AMVP), IBC-merge luminance block, and direct block vector (DBV) chromaticity block encoding.
[0095] More specifically, if the current block is predicted by intraTMP and is not encoded in either fused intraTMP mode or filtering intraTMP mode, the intra-prediction module 410 may directly convert the obtained quarter-pixel fractional resolution BV to a 1 / 16 pixel resolution (left-shifting the current quarter-pixel BV by 2), which can then be stored for the current block. If the current block is predicted by intraTMP and is encoded in either fused intraTMP or filtering intraTMP mode, the intra-prediction module 410 may convert the integer pixel resolution BV to a 1 / 16 pixel resolution (left-shifting the current integer pixel BV by 4), which can then be stored for the current block.
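As a non-limiting illustration, the two conversions to 1 / 16 pixel resolution can be sketched as follows: a quarter-pixel BV is left-shifted by 2 and an integer pixel BV is left-shifted by 4. The function names, the tuple representation of a BV, and the boolean mode flag are assumptions made for illustration.

```python
STORAGE_FRAC_BITS = 4  # stored BVs use 1/16-pixel units (2**4 positions per pixel)


def to_storage_resolution(bv, frac_bits):
    """Convert a BV given in 1/(2**frac_bits)-pixel units to 1/16-pixel units.

    frac_bits = 0 for an integer pixel BV, 2 for a quarter-pixel BV.
    """
    if not 0 <= frac_bits <= STORAGE_FRAC_BITS:
        raise ValueError("frac_bits must be between 0 and 4")
    shift = STORAGE_FRAC_BITS - frac_bits
    # Python's << on negative integers is arithmetic, so the sign is preserved.
    return (bv[0] << shift, bv[1] << shift)


def bv_for_storage(quarter_pel_bv, integer_bv, fused_or_filtered):
    """Pick the BV to store for an intraTMP block (hypothetical helper)."""
    if fused_or_filtered:
        return to_storage_resolution(integer_bv, 0)   # left shift by 4
    return to_storage_resolution(quarter_pel_bv, 2)   # left shift by 2
```

For example, a quarter-pixel BV of (5, -3) is stored as (20, -12), and an integer pixel BV of (-2, 1) is stored as (-32, 16).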
[0096] In some implementations, the intra-prediction module 410 may obtain fractional pixel BVs for an encoded fused or filtered intraTMP block by using the current block template and the reference block template, even when the current block is encoded in either a fused or filtered intraTMP mode. More specifically, the intra-prediction module 410 may obtain an integer pixel BV by using the current block template and the reference block template, such that the difference between the two templates is minimized within a given search range according to the current intraTMP process. The intra-prediction module 410 may then search around the obtained integer pixel BV for fractional pixel BVs with a specified resolution (e.g., half-pixel, quarter-pixel, or 1 / 16 pixel) to find the reference block template that further minimizes the difference between the current block template and the reference block template. As used herein, such fractional pixel BVs are referred to as template fractional pixel BVs.
[0097] As an example, a template fractional pixel BV may have three similar fractional pixel offsets in eight directions around an integer pixel position, such as 1 / 2 pixel, 1 / 4 pixel, and 3 / 4 pixel, as shown in Figure 11. As another example, a template fractional pixel BV may have an IBC-like fractional pixel, which may occupy all fractional pixels, such as 1 / 16 pixel positions.
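As a non-limiting illustration of the first example, the candidate positions can be enumerated as follows, under the reading that each of the eight directions carries the three offsets 1 / 4, 1 / 2, and 3 / 4 pixel. This reading, the quarter-pixel units, and the names used are assumptions made for illustration.

```python
# Eight directions around an integer pixel position (dx, dy).
DIRECTIONS = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]


def star_candidates(integer_bv, steps=(1, 2, 3)):
    """Candidate BVs in quarter-pixel units: steps 1, 2, 3 correspond to
    offsets of 1/4, 1/2 and 3/4 pixel along each direction."""
    cx, cy = integer_bv[0] * 4, integer_bv[1] * 4
    return [(cx + dx * s, cy + dy * s) for s in steps for dx, dy in DIRECTIONS]
```

Under this reading, 24 fractional pixel candidates are tested around each integer pixel position.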
[0098] In some implementations, the intra-prediction module 410 can still use the integer pixel BV of the filtered intraTMP block to find the reference block. The filtered reference block can then be used as a prediction for encoding the current block without any modification. After encoding the filtered intraTMP block, the template fractional pixel BV can be obtained and converted to a predetermined resolution, for example, 1 / 16 pixels. The converted template fractional pixel BV can be stored for the current block.
[0099] In some implementations, the intra-prediction module 410 may first obtain a template fractional pixel BV. For the filtered intraTMP block, the template fractional pixel BV is used to find a reference block. The intra-prediction module 410 may then apply filtering to the fractional pixel interpolated reference block. After encoding the filtered intraTMP block, the intra-prediction module 410 may convert the template fractional pixel BV to a predetermined resolution (e.g., 1 / 16 pixels) before storing it for the current block.
[0100] Assume there are N intraTMP reference blocks used in fused intraTMP mode. Among these N intraTMP reference blocks, the one with the smallest sum of absolute differences (SAD) is denoted as the first intraTMP reference block. Here, the intra-prediction module 410 can obtain the template fractional pixel BV by using the template of the current block and the template of the first intraTMP reference block.
[0101] In some implementations, the fused intraTMP remains unchanged. After encoding the fused intraTMP block, the intra-prediction module 410 may obtain the template fractional pixel BV and convert it to a predetermined resolution, for example, 1 / 16 pixels. The converted template fractional pixel BV can then be stored for the current block.
[0102] In some implementations, the intra-prediction module 410 may interpolate the first intraTMP reference block using the acquired template fractional pixel BV. Here, the interpolated reference block replaces the first intraTMP reference block to generate a fused intraTMP block. After encoding the current block, the template fractional pixel BV is converted to a predetermined resolution, for example, 1 / 16 pixels, and then stored for the current block.
[0103] In some implementations, the intra-prediction module 410 can further interpolate N intra-TMP reference blocks using the template fractional pixel BV obtained for the first intra-TMP reference block, and the interpolated reference block replaces the N intra-TMP reference blocks to generate a fused intra-TMP block. After encoding the current block, the template fractional pixel BV of the first intra-TMP is converted to a predetermined resolution (e.g., 1 / 16 pixels) and then stored for the current block.
[0104] In some implementations, the intra-prediction module 410 may obtain N template fractional pixel BVs using the template of the current block and the templates of N intraTMP reference blocks. The intra-prediction module 410 may further interpolate the N intraTMP reference blocks using the corresponding template fractional pixel BVs, and the interpolated N reference blocks replace the N intraTMP reference blocks to generate a fused intraTMP block. After encoding the current block, the first (optimal) template fractional pixel BV is converted to a predetermined resolution, for example, 1 / 16 pixels, and then stored for the current block.
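As a non-limiting illustration, the selection of the first intraTMP reference block and the generation of a fused block can be sketched as follows. The dictionary layout of a candidate and the equal-weight rounded average used for fusion are assumptions made for illustration; the description above does not fix the fusion weights.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized 2D blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def rank_by_template_sad(current_template, candidates):
    """Order candidates so that index 0 is the 'first' reference block.

    Each candidate is a dict with hypothetical keys 'template', 'block', 'bv'.
    """
    return sorted(candidates, key=lambda c: sad(current_template, c["template"]))


def fuse_blocks(blocks):
    """Equal-weight rounded average of N blocks (placeholder fusion rule)."""
    n = len(blocks)
    height, width = len(blocks[0]), len(blocks[0][0])
    return [
        [(sum(b[y][x] for b in blocks) + n // 2) // n for x in range(width)]
        for y in range(height)
    ]
```

In this sketch, the BV of the candidate at index 0 after ranking is the one that would be converted to the predetermined resolution and stored for the current block.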
[0105] By using the above fractional pixel technique, the coding performance of the intra-prediction module 410 can be improved.
[0106] For example, the inter-prediction module 408 may be configured to receive a bitstream from the encoder containing a reference frame, a current frame, and instructions for weighting coefficients associated with a multiple-hypothesis prediction (MHP) process. The inter-prediction module 408 may be configured to perform the MHP process on CUs located in the current frame based on search blocks in the reference frame (e.g., the reference frame and / or reference template). In some embodiments, to perform the MHP process, the inter-prediction module 408 may be configured to perform template matching on CUs in the current frame based on the search blocks and weighting coefficients in the reference frame to obtain motion information. In some embodiments, to perform the MHP process, the inter-prediction module 408 may be configured to identify the weighting coefficient index associated with the weighting coefficients based on template matching. The inter-prediction module 408 may be configured to identify the weighting coefficient code of the weighting coefficients based on instructions contained in the bitstream. The inter-prediction module 408 may perform the inter-prediction process based on the current frame, reference frame, weighting coefficient index, and weighting coefficient code of the weighting coefficients, and decode the bitstream.
[0107] A reconstructed block or image combined from the outputs of the inverse transform module 406 and the prediction module 408 or 410 may be provided to the filter module 412. The filter module 412 may include a deblocking filter, an offset correction module, and an ALF. The buffer module 414 stores the reconstructed image or block and may provide it as a reference image or reference block to the inter-prediction module 408 and output a reconstructed image.
[0108] In accordance with the scope of the present invention, the encoding module 320 and the decoding module 402 may be configured to encode images of a video by employing a quantization level binarization scheme with Rice parameters adapted to the bit depth and / or bit rate in order to improve encoding efficiency.
[0109] Figure 12 shows a flowchart of an exemplary method 1200 for video decoding according to some embodiments of the present invention. Method 1200 may be performed by a system, such as a decoding system 200, a decoder 201, or an intra-prediction module 410. Method 1200 may include operations 1202-1218, as described below. It should be understood that some steps are optional, and some steps may be performed simultaneously or in an order different from that shown in Figure 12.
[0110] Referring to Figure 12, in step 1202, the system analyzes the bitstream to determine the intraTMP mode associated with the current block. For example, referring to Figures 2 and 4, the decoder 201 may analyze the bitstream encoded by the encoder 101 based on at least one flag. By analyzing the bitstream, the decoder 201 may determine the intraTMP mode enabled for the current block based on the intraTMP flag or syntax element.
[0111] In step 1204, the system may obtain at least one fractional pixel BV to decode the current block. For example, referring to Figure 4, the intra-prediction module 410 obtains a quarter-pixel fractional BV.
[0112] In step 1206, the system may acquire a reference block based on at least one fractional pixel BV. For example, referring to Figure 4, the intra-prediction module 410 may acquire a reference block based on at least one fractional pixel BV.
[0113] In step 1208, the system may obtain a filtered intraTMP block by performing a filtering process on the reference block. For example, referring to Figure 4, the intra-prediction module 410 may first obtain the template fractional pixel BV. For the filtered intraTMP block, the template fractional pixel BV is used to find the reference block. The intra-prediction module 410 may then apply filtering to the fractional pixel interpolated reference block.
[0114] In step 1210, the system may decode the current block based on the reference block. For example, referring to Figure 4, decoder 201 may decode the current block based on the reference block.
[0115] In step 1212, the system may obtain a converted fractional pixel BV after decoding the current block, based on at least one fractional pixel BV. For example, referring to Figure 4, the fractional pixel BV (rather than the integer pixel BV) may be converted to a 1 / 16 pixel resolution (the current quarter-pixel BV is left-shifted by 2). The intraTMP BV thus converted is called the converted fractional pixel BV. The intra-prediction module 410 then stores the converted fractional pixel BV after decoding the intraTMP block. When decoding a future block, the stored converted fractional pixel BV may be referenced by a coding mode that reuses the BV stored by the intra-prediction module 410. Examples of such coding modes include IBC-AMVP, IBC-merge luminance block coding, DBV chromaticity block coding, and the intraTMP-merge mode. More specifically, if the current block is predicted by intraTMP and is not coded in either fused intraTMP mode or filtering intraTMP mode, the intra-prediction module 410 may directly convert the obtained quarter-pixel fractional resolution BV to a 1 / 16 pixel resolution (left-shifting the current quarter-pixel BV by 2), which can then be stored for the current block. If the current block is predicted by intraTMP and is coded in either fused intraTMP or filtering intraTMP mode, the intra-prediction module 410 may convert the integer pixel resolution BV to a 1 / 16 pixel resolution (left-shifting the current integer pixel BV by 4), which can then be stored for the current block.
[0116] In step 1214, the system may store the converted fractional pixel BV for decoding another block. For example, referring to Figure 4, the intra-prediction module 410 stores the converted fractional pixel BV after decoding a fractional pixel intraTMP block. When decoding a future block, the stored converted fractional pixel BV may be referenced by a coding mode that reuses the BV stored by the intra-prediction module 410. Examples of such coding modes include IBC-AMVP, IBC-merge luminance block coding, and DBV chromaticity block coding. More specifically, if the current block is predicted by intraTMP and is not coded in either fused intraTMP mode or filtering intraTMP mode, the intra-prediction module 410 may directly convert the obtained quarter-pixel fractional resolution BV to a 1 / 16 pixel resolution (left-shifting the current quarter-pixel BV by 2), which can then be stored for the current block. If the current block is predicted by intraTMP and is coded in either fused intraTMP or filtering intraTMP mode, the intra-prediction module 410 may convert the integer pixel resolution BV to a 1 / 16 pixel resolution (left-shifting the current integer pixel BV by 4), which can then be stored for the current block.
[0117] In step 1216, the system may analyze the bitstream to determine that IBC-AMVP, IBC-merge luminance block coding, and DBV chromaticity block coding are enabled for another block. For example, referring to Figure 2, decoder 201 may analyze the bitstream to determine that IBC-AMVP, IBC-merge luminance block coding, and DBV chromaticity block coding are enabled for another block. By analyzing the bitstream, decoder 201 may determine the intraTMP mode currently enabled for the block based on the intraTMP flag or syntax element.
[0118] In step 1218, the system may decode another block based on the converted fractional pixel BV using IBC-AMVP, IBC-merge luminance block coding, or DBV chromaticity block coding. For example, referring to Figure 2, decoder 201 may decode another block using the converted fractional pixel BV.
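As a non-limiting illustration, the storing of a converted fractional pixel BV (step 1214) and its later reuse as a candidate for another block (step 1218) can be sketched as follows. The class name, the keying of stored BVs by block position, and the neighbour lookup are assumptions made for illustration; the actual candidate derivation of IBC-AMVP, IBC-merge, or DBV is not reproduced here.

```python
class BvStore:
    """Keeps one BV per block position, always in 1/16-pixel units."""

    STORAGE_FRAC_BITS = 4

    def __init__(self):
        self._bvs = {}

    def store(self, block_pos, bv, frac_bits):
        """Convert bv from 1/(2**frac_bits)-pixel units and store it.

        frac_bits = 2 for a quarter-pixel BV, 0 for an integer pixel BV.
        """
        shift = self.STORAGE_FRAC_BITS - frac_bits
        self._bvs[block_pos] = (bv[0] << shift, bv[1] << shift)

    def candidates(self, neighbour_positions):
        """Return the stored BVs of the given neighbouring blocks, in order,
        skipping neighbours for which no BV has been stored."""
        return [self._bvs[p] for p in neighbour_positions if p in self._bvs]
```

A later block coded with a BV-reusing mode could, for example, call `candidates` with the positions of its left and above neighbours and select among the returned 1 / 16 pixel BVs.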
[0119] Figure 13 shows a flowchart of an exemplary video encoding method 1300 according to some embodiments of the present invention. Method 1300 may be performed by a system, such as an encoding system 100, an encoder 101, or an intra-prediction module 306. Method 1300 may include operations 1302-1318, as described below. It should be understood that some steps are optional, and some steps may be performed simultaneously or in an order different from that shown in Figure 13.
[0120] Referring to Figure 13, in step 1302, the system may obtain at least one fractional pixel BV for encoding the current block. For example, referring to Figure 3, the intra prediction module 306 obtains a quarter-pixel fractional BV.
[0121] In step 1304, the system may acquire a reference block based on at least one fractional pixel BV. For example, referring to Figure 3, the intra prediction module 306 may acquire a reference block based on at least one fractional pixel BV.
[0122] In step 1306, the system may obtain a filtered intraTMP block by performing a filtering process on the reference block. For example, referring to Figure 3, the intra-prediction module 306 may first obtain the template fractional pixel BV. For the filtered intraTMP block, the template fractional pixel BV is used to find the reference block. The intra-prediction module 306 may then apply filtering to the fractional pixel interpolated reference block.
[0123] In step 1308, the system may encode the current block based on the reference block. For example, referring to Figure 3, encoder 101 may encode the current block based on the reference block.
[0124] In step 1310, the system may obtain a converted fractional pixel BV after encoding the current block, based on at least one fractional pixel BV. For example, referring to Figure 3, the fractional pixel BV (rather than the integer pixel BV) may be converted to a 1 / 16 pixel resolution (the current quarter-pixel BV is left-shifted by 2). The intraTMP BV thus converted is called the converted fractional pixel BV. The intra-prediction module 306 then stores the converted fractional pixel BV after encoding the intraTMP block. When encoding a future block, the stored converted fractional pixel BV may be referenced by an encoding mode that reuses the BV stored by the intra-prediction module 306. Examples of such encoding modes include IBC-AMVP, IBC-merge luminance block encoding, DBV chromaticity block encoding, and the intraTMP-merge mode. More specifically, if the current block is predicted by intraTMP and is not encoded in either fused intraTMP mode or filtering intraTMP mode, the intra-prediction module 306 may directly convert the obtained quarter-pixel fractional resolution BV to a 1 / 16 pixel resolution (left-shifting the current quarter-pixel BV by 2), which can then be stored for the current block. If the current block is predicted by intraTMP and is encoded in either fused intraTMP or filtering intraTMP mode, the intra-prediction module 306 may convert the integer pixel resolution BV to a 1 / 16 pixel resolution (left-shifting the current integer pixel BV by 4), which can then be stored for the current block.
[0125] In step 1312, the system may store the converted fractional pixel BV for encoding another block. For example, referring to Figure 3, the intra-prediction module 306 stores the converted fractional pixel BV after encoding a fractional pixel intraTMP block. When encoding a future block, the stored converted fractional pixel BV may be referenced by an encoding mode that reuses the BV stored by the intra-prediction module 306. Examples of such encoding modes include IBC-AMVP, IBC-merge luminance block encoding, and DBV chromaticity block encoding. More specifically, if the current block is predicted by intraTMP and is not encoded in either fused intraTMP mode or filtering intraTMP mode, the intra-prediction module 306 may directly convert the obtained quarter-pixel fractional resolution BV to a 1 / 16 pixel resolution (left-shifting the current quarter-pixel BV by 2), which can then be stored for the current block. If the current block is predicted by intraTMP and is encoded in either fused intraTMP or filtering intraTMP mode, the intra-prediction module 306 may convert the integer pixel resolution BV to a 1 / 16 pixel resolution (left-shifting the current integer pixel BV by 4), which can then be stored for the current block.
[0126] In step 1314, the system may analyze the bitstream and determine that IBC-AMVP, IBC-merge luminance block coding, and DBV chromaticity block coding are enabled for another block. For example, referring to Figure 1, encoder 101 may analyze the bitstream and determine that IBC-AMVP, IBC-merge luminance block coding, and DBV chromaticity block coding are enabled for another block. By analyzing the bitstream, encoder 101 may determine the intraTMP mode currently enabled for the block based on the intraTMP flag or syntax element.
[0127] In step 1316, the system may encode another block based on the converted fractional pixel BV using IBC-AMVP, IBC-merge luminance block coding, or DBV chromaticity block coding. For example, referring to Figure 1, encoder 101 may encode another block using the converted fractional pixel BV.
[0128] In step 1318, the system may encode the intraTMP mode associated with the current block into a bitstream. For example, referring to Figures 1 and 3, encoder 101 may encode the intraTMP mode enabled for the current block into a bitstream based on the intraTMP flag or syntax element.
[0129] In each embodiment of the present invention, the functions described herein may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored as instructions on a non-transitory computer-readable medium. The computer-readable medium includes computer storage media. The storage medium may be any available medium that can be accessed by a processor (such as processor 102 in Figures 1 and 2). As a non-limiting example, such computer-readable media may include RAM, ROM, EEPROM, CD-ROM, or other optical disk storage devices, HDDs (such as magnetic disk storage devices or other magnetic storage devices), flash drives, SSDs, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and can be accessed by a processing system (such as a mobile device or computer). As used herein, magnetic disks and optical disks include CDs, laser discs, optical disks, digital video discs (DVDs), and floppy disks, where magnetic disks typically reproduce data magnetically and optical disks reproduce data optically with a laser. Any combination of the above should also be included within the scope of computer-readable media.
[0130] According to one aspect of the present invention, a decoding method is provided which is performed by a decoder. The method may include the processor analyzing a bitstream to determine the intraTMP mode associated with the current block. The method may include the processor obtaining at least one fractional pixel BV for decoding the current block. The method may include the processor obtaining a reference block based on at least one fractional pixel BV. The method may include the processor decoding the current block based on the reference block. The method may include the processor obtaining a converted fractional pixel BV after decoding the current block based on at least one fractional pixel BV. The method may include the processor storing the converted fractional pixel BV for decoding another block.
[0131] In some implementations, the method may include the processor analyzing the bitstream to determine whether IBC-AMVP, IBC-merge luminance block coding, and DBV chromaticity block coding are enabled for another block. In some implementations, the method may include the processor decoding another block using IBC-AMVP, IBC-merge luminance block coding, or DBV chromaticity block coding based on the converted fractional pixel BV.
[0132] In some implementations, obtaining at least one fractional pixel BV for decoding the current block by the processor may include obtaining an integer pixel BV by minimizing the difference between the template of the current block and the template of the reference block within a predefined search range. In some implementations, obtaining at least one fractional pixel BV for decoding the current block may include obtaining at least one fractional pixel BV by performing a search on an integer pixel BV using the resolution associated with at least one fractional pixel BV. In some implementations, at least one fractional pixel BV may be a template fractional pixel BV.
[0133] In some implementations, the method may involve the processor performing a filtering operation on the reference block to obtain a filtered intraTMP block. In some implementations, the current block may be decoded based on the filtered intraTMP block.
[0134] In some implementations, the intraTMP mode may include a fused intraTMP mode. In some implementations, the fused intraTMP mode may be associated with N reference blocks. In some implementations, the processor obtaining a reference block based on at least one fractional pixel BV may include the processor obtaining a first reference block from among the N reference blocks that has the smallest SAD. In some implementations, the template of the reference block used to obtain the template fractional pixel BV may be the template of the first reference block.
[0135] In some implementations, obtaining a reference block based on at least one fractional pixel BV by the processor may include the processor interpolating the first reference block to obtain an interpolated reference block. In some implementations, obtaining a reference block based on at least one fractional pixel BV by the processor may include the processor generating a fused intraTMP block based on the interpolated reference block. In some implementations, the current block may be decoded based on the fused intraTMP block.
[0136] In some implementations, obtaining at least one fractional pixel BV for decoding the current block by the processor may involve the processor obtaining N template fractional pixel BVs using the template of the current block and the corresponding templates of N intraTMP reference blocks. In some implementations, obtaining a reference block based on at least one fractional pixel BV by the processor may involve the processor interpolating each of the N associated intraTMP reference blocks using the corresponding one template fractional pixel BV in the N template fractional pixel BVs to generate a fused intraTMP block, which is the reference block. In some implementations, the current block may be decoded based on the fused intraTMP block.
[0137] According to another aspect of the present invention, a device for decoding is provided. The device may include a processor and a memory storing instructions. The memory stores instructions that, when executed by the processor, cause the processor to analyze a bitstream and determine the intraTMP mode associated with the current block. The memory stores instructions that, when executed by the processor, cause the processor to obtain at least one fractional pixel BV for decoding the current block. The memory stores instructions that, when executed by the processor, cause the processor to obtain a reference block based on at least one fractional pixel BV. The memory stores instructions that, when executed by the processor, cause the processor to decode the current block based on the reference block. The memory stores instructions that, when executed by the processor, cause the processor to obtain a converted fractional pixel BV after decoding the current block based on at least one fractional pixel BV. The memory stores instructions that, when executed by the processor, cause the processor to store the converted fractional pixel BV for decoding another block.
[0138] In some implementations, the memory stores instructions that, when executed by the processor, cause the processor to analyze a bitstream and determine whether IBC-AMVP, IBC-merge luma block coding, and DBV chroma block coding are enabled for another block. In some implementations, the memory stores instructions that, when executed by the processor, cause the processor to decode the other block using IBC-AMVP, IBC-merge luma block coding, or DBV chroma block coding based on the converted fractional pixel BV.
[0139] In some implementations, to obtain at least one fractional pixel BV for decoding the current block, the memory stores instructions that, when executed by the processor, may cause the processor to obtain an integer pixel BV by minimizing the difference between the template of the current block and the template of the reference block within a predefined search range. In some implementations, to obtain at least one fractional pixel BV for decoding the current block, the memory stores instructions that, when executed by the processor, may cause the processor to obtain at least one fractional pixel BV by performing a search on the integer pixel BV using the resolution associated with at least one fractional pixel BV, the at least one fractional pixel BV being a template fractional pixel BV.
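The two-stage search described above can be sketched in Python. The sketch makes several assumptions that are not stated in this section: the template is treated as a rectangular patch, the matching cost is SAD, fractional positions are sampled with bilinear interpolation, the fractional stage examines offsets within one sample of the integer pixel BV, and the default resolution is a quarter sample. The names `sample_bilinear`, `template_cost`, and `search_fractional_bv` are hypothetical, and no availability or boundary checks are performed.

```python
import numpy as np

def sample_bilinear(img, y, x, h, w):
    """Fetch an h x w patch whose top-left is at fractional position (y, x)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    fy, fx = y - y0, x - x0
    p = img[y0:y0 + h + 1, x0:x0 + w + 1].astype(np.float64)
    top = (1 - fx) * p[:h, :w] + fx * p[:h, 1:]
    bot = (1 - fx) * p[1:, :w] + fx * p[1:, 1:]
    return (1 - fy) * top + fy * bot

def template_cost(img, cur_tmpl, tmpl_pos, bv):
    """SAD between the current template and the template displaced by bv."""
    h, w = cur_tmpl.shape
    ref = sample_bilinear(img, tmpl_pos[0] + bv[0], tmpl_pos[1] + bv[1], h, w)
    return float(np.abs(ref - cur_tmpl).sum())

def search_fractional_bv(img, cur_tmpl, tmpl_pos, int_candidates, step=0.25):
    """Return (integer pixel BV, template fractional pixel BV).

    int_candidates : integer (dy, dx) BVs forming the predefined search range
    step           : assumed fractional resolution (0.25 = quarter sample)
    """
    # Stage 1: integer pixel BV with the minimum template difference.
    int_bv = min(int_candidates,
                 key=lambda bv: template_cost(img, cur_tmpl, tmpl_pos, bv))
    # Stage 2: refine at the fractional resolution near the integer BV.
    offsets = np.arange(-1 + step, 1, step)
    frac_cands = [(int_bv[0] + dy, int_bv[1] + dx)
                  for dy in offsets for dx in offsets]
    frac_bv = min(frac_cands,
                  key=lambda bv: template_cost(img, cur_tmpl, tmpl_pos, bv))
    return int_bv, (float(frac_bv[0]), float(frac_bv[1]))
```

Because the zero offset is among the fractional candidates, the refined BV in this sketch never has a higher template cost than the integer pixel BV it starts from.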
[0140] In some implementations, the memory stores instructions that, when executed by the processor, may cause the processor to obtain a filtered intraTMP block by performing a filtering operation on the reference block. In some implementations, the current block may be decoded based on the filtered intraTMP block.
[0141] In some implementations, the intraTMP mode may include a fused intraTMP mode. In some implementations, the fused intraTMP mode is associated with N reference blocks. In some implementations, to obtain a reference block based on at least one fractional pixel BV, the memory stores instructions that, when executed by the processor, cause the processor to obtain, from among the N reference blocks, a first reference block that has the smallest sum of absolute differences (SAD). In some implementations, the template of the reference block used to obtain the template fractional pixel BV may be the template of the first reference block.
[0142] In some implementations, to obtain a reference block based on at least one fractional pixel BV, the memory stores instructions that, when executed by the processor, may cause the processor to interpolate the first reference block to obtain an interpolated reference block. In some implementations, to obtain a reference block based on at least one fractional pixel BV, the memory stores instructions that, when executed by the processor, may cause the processor to generate a fused intraTMP block based on the interpolated reference block. In some implementations, the current block may be decoded based on the fused intraTMP block.
[0143] In some implementations, to obtain at least one fractional pixel BV for decoding the current block, the memory stores instructions that, when executed by the processor, may cause the processor to obtain N template fractional pixel BVs using the template of the current block and the respective templates of N intraTMP reference blocks. In some implementations, to obtain a reference block based on at least one fractional pixel BV, the memory stores instructions that, when executed by the processor, may cause the processor to generate a fused intraTMP block by interpolating each of the N intraTMP reference blocks using the corresponding one of the N template fractional pixel BVs. In some implementations, the fused intraTMP block may be the reference block. In some implementations, the current block may be decoded based on the fused intraTMP block.
[0144] According to another aspect of the present invention, a non-transitory computer-readable medium storing instructions is provided. The instructions, when executed by a processor, may cause the processor to analyze a bitstream to determine the intraTMP mode associated with the current block. The instructions, when executed by the processor, may cause the processor to obtain at least one fractional pixel BV for decoding the current block. The instructions, when executed by the processor, may cause the processor to obtain a reference block based on at least one fractional pixel BV. The instructions, when executed by the processor, may cause the processor to decode the current block based on the reference block. The instructions, when executed by the processor, may cause the processor to obtain a converted fractional pixel BV after decoding the current block based on at least one fractional pixel BV. The instructions, when executed by the processor, may cause the processor to store the converted fractional pixel BV for decoding another block.
[0145] In some implementations, the instructions, when executed by the processor, may cause the processor to parse the bitstream and determine that IBC-AMVP, IBC-merge luma block coding, and DBV chroma block coding are enabled for another block. In some implementations, the instructions, when executed by the processor, may cause the processor to decode the other block using IBC-AMVP, IBC-merge luma block coding, or DBV chroma block coding based on the converted fractional pixel BV.
[0146] In some implementations, to obtain at least one fractional pixel BV for decoding the current block, the instruction, when executed by the processor, may cause the processor to obtain an integer pixel BV by minimizing the difference between the template of the current block and the template of the reference block within a predefined search range. In some implementations, to obtain at least one fractional pixel BV for decoding the current block, the instruction, when executed by the processor, may cause the processor to obtain at least one fractional pixel BV by performing a search on an integer pixel BV using the resolution associated with at least one fractional pixel BV, the at least one fractional pixel BV being a template fractional pixel BV.
[0147] In some implementations, when the instruction is executed by the processor, the processor may perform a filtering operation on the reference block to obtain a filtered intraTMP block. In some implementations, the current block may be decoded based on the filtered intraTMP block.
[0148] In some implementations, the intraTMP mode may include a fused intraTMP mode. In some implementations, the fused intraTMP mode is associated with N reference blocks. In some implementations, to obtain a reference block based on at least one fractional pixel BV, the instructions, when executed by the processor, may cause the processor to obtain, from among the N reference blocks, a first reference block that has the smallest sum of absolute differences (SAD). In some implementations, the template of the reference block used to obtain the template fractional pixel BV may be the template of the first reference block.
[0149] In some implementations, in order to obtain a reference block based on at least one fractional pixel BV, the instruction, when executed by the processor, may cause the processor to interpolate a first reference block to obtain an interpolated reference block. In some implementations, in order to obtain a reference block based on at least one fractional pixel BV, the instruction, when executed by the processor, may cause the processor to generate a fused intraTMP block based on the interpolated reference block. In some implementations, the current block may be decoded based on the fused intraTMP block.
[0150] In some implementations, in order to obtain at least one fractional pixel BV for decoding the current block, the instruction, when executed by the processor, may cause the processor to obtain N template fractional pixel BVs using the template of the current block and the corresponding templates of the N intraTMP reference blocks. In some implementations, in order to obtain a reference block based on at least one fractional pixel BV, the instruction, when executed by the processor, may cause the processor to generate a fused intraTMP block by interpolating each intraTMP reference block in the associated N intraTMP reference blocks using the corresponding one template fractional pixel BV in the N template fractional pixel BVs. In some implementations, the fused intraTMP block may be a reference block. In some implementations, the current block may be decoded based on the fused intraTMP block.
[0151] According to yet another aspect of the present invention, an encoding method is provided which is performed by an encoder. The method may include the processor obtaining at least one fractional pixel BV for encoding the current block. The method may include the processor obtaining a reference block based on at least one fractional pixel BV. The method may include the processor encoding the current block based on the reference block. The method may include the processor obtaining a converted fractional pixel BV after encoding the current block based on at least one fractional pixel BV. The method may include the processor storing the converted fractional pixel BV for encoding another block. The method may include the processor encoding the intraTMP mode associated with the current block into a bitstream.
[0152] According to yet another aspect of the present invention, an apparatus for encoding is provided. The apparatus may comprise a processor and a memory storing instructions. The memory stores instructions that, when executed by the processor, cause the processor to acquire at least one fractional pixel BV for encoding the current block. The memory stores instructions that, when executed by the processor, cause the processor to acquire a reference block based on at least one fractional pixel BV. The memory stores instructions that, when executed by the processor, cause the processor to encode the current block based on a reference block. The memory stores instructions that, when executed by the processor, cause the processor to acquire a converted fractional pixel BV after encoding the current block based on at least one fractional pixel BV. The memory stores instructions that, when executed by the processor, cause the processor to store a converted fractional pixel BV for encoding another block. The memory stores instructions that, when executed by the processor, cause the processor to encode the intraTMP mode associated with the current block into a bitstream.
[0153] According to another aspect of the present invention, a non-transitory computer-readable medium storing encoder instructions is provided. When executed by a processor, the instructions cause the processor to obtain at least one fractional pixel BV for encoding the current block. When executed by the processor, the instructions cause the processor to obtain a reference block based on at least one fractional pixel BV. When executed by the processor, the instructions cause the processor to encode the current block based on the reference block. When executed by the processor, the instructions cause the processor to obtain a converted fractional pixel BV after encoding the current block based on at least one fractional pixel BV. When executed by the processor, the instructions cause the processor to store the converted fractional pixel BV for encoding another block. When executed by the processor, the instructions cause the processor to encode the intraTMP mode associated with the current block into a bitstream.
[0154] Since the above description reveals the general nature of the present invention, those skilled in the art can readily modify and / or adapt such examples to various applications without departing from the general concept of the present invention and without undue experimentation. Accordingly, based on the teachings and guidance presented herein, such adaptations and modifications are intended to be within the meaning and scope of equivalents of the disclosed examples. It should be understood that the phraseology and terminology herein are for the purpose of description and not of limitation, and that they should be interpreted by those skilled in the art in light of the teachings and guidance.
[0155] Embodiments of the present invention are described above relying on function building blocks that illustrate the implementation of specified functions and their relationships. For convenience of explanation, the boundaries of these function building blocks are arbitrarily defined herein. Alternative boundaries can be defined as long as the specified functions and their relationships are properly performed.
[0156] The Summary and Abstract sections may set forth one or more, but not all, exemplary embodiments of the invention as contemplated by the inventors, and are therefore not intended to limit the scope of the invention and the appended claims in any way.
[0157] Various functional blocks, modules, and steps have been disclosed above. The arrangements provided are illustrative and not limiting. Therefore, functional blocks, modules, and steps may be rearranged or combined in ways different from the examples provided above. Similarly, some embodiments may include only a subset of functional blocks, modules, and steps, and such subsets are also acceptable.
[0158] The breadth and scope of the present invention should not be limited by any of the exemplary embodiments described above, but should be defined solely in accordance with the appended claims and their equivalents.
Claims
1. A decoding method performed by a decoder, the method comprising: analyzing, by a processor, a bitstream to determine an intra-template matching prediction (intraTMP) mode associated with a current block; obtaining, by the processor, at least one fractional pixel block vector (BV) for decoding the current block; obtaining, by the processor, a reference block based on the at least one fractional pixel BV; decoding, by the processor, the current block based on the reference block; obtaining, by the processor, a converted fractional pixel BV after decoding the current block based on the at least one fractional pixel BV; and storing, by the processor, the converted fractional pixel BV for decoding another block.
2. The decoding method according to claim 1, further comprising: analyzing, by the processor, the bitstream to determine that intra-block copy (IBC)-advanced motion vector prediction (AMVP), IBC-merge luma block coding, and direct block vector (DBV) chroma block coding are enabled for the other block; and decoding, by the processor, the other block based on the converted fractional pixel BV using the IBC-AMVP, the IBC-merge luma block coding, or the DBV chroma block coding.
3. The decoding method according to claim 1, wherein obtaining, by the processor, the at least one fractional pixel BV for decoding the current block comprises: obtaining, by the processor, an integer pixel BV by minimizing a difference between a template of the current block and a template of the reference block within a predefined search range; and obtaining, by the processor, the at least one fractional pixel BV by performing a search on the integer pixel BV using a resolution associated with the at least one fractional pixel BV, wherein the at least one fractional pixel BV is a template fractional pixel BV.
4. The decoding method according to claim 3, further comprising obtaining, by the processor, a filtered intraTMP block by performing a filtering operation on the reference block, wherein the current block is decoded based on the filtered intraTMP block.
5. The decoding method according to claim 4, wherein the intraTMP mode includes a fused intraTMP mode, the fused intraTMP mode is associated with N reference blocks, and obtaining, by the processor, the reference block based on the at least one fractional pixel BV comprises obtaining, by the processor, a first reference block having the smallest sum of absolute differences (SAD) from among the N reference blocks, wherein the template of the reference block used to obtain the template fractional pixel BV is a template of the first reference block.
6. The decoding method according to claim 5, wherein obtaining, by the processor, the reference block based on the at least one fractional pixel BV comprises: interpolating, by the processor, the first reference block to obtain an interpolated reference block; and generating, by the processor, a fused intraTMP block based on the interpolated reference block, wherein the current block is decoded based on the fused intraTMP block.
7. The decoding method according to claim 5, wherein obtaining, by the processor, the at least one fractional pixel BV for decoding the current block comprises obtaining, by the processor, N template fractional pixel BVs using the template of the current block and respective templates of N intraTMP reference blocks, and obtaining, by the processor, the reference block based on the at least one fractional pixel BV comprises generating, by the processor, a fused intraTMP block by interpolating each of the N intraTMP reference blocks using a corresponding one of the N template fractional pixel BVs, wherein the fused intraTMP block is the reference block, and the current block is decoded based on the fused intraTMP block.
8. A device for decoding, comprising: a processor; and a memory configured to store instructions that, when executed by the processor, cause the processor to perform operations comprising: analyzing a bitstream to determine an intra-template matching prediction (intraTMP) mode associated with a current block; obtaining at least one fractional pixel block vector (BV) for decoding the current block; obtaining a reference block based on the at least one fractional pixel BV; decoding the current block based on the reference block; obtaining a converted fractional pixel BV after decoding the current block based on the at least one fractional pixel BV; and storing the converted fractional pixel BV for decoding another block.
9. The device for decoding according to claim 8, wherein the memory stores instructions that, when executed by the processor, cause the processor to perform operations comprising: analyzing the bitstream to determine whether intra-block copy (IBC)-advanced motion vector prediction (AMVP), IBC-merge luma block coding, and direct block vector (DBV) chroma block coding are enabled for the other block; and decoding the other block based on the converted fractional pixel BV using the IBC-AMVP, the IBC-merge luma block coding, or the DBV chroma block coding.
10. The device for decoding according to claim 8, wherein, to obtain the at least one fractional pixel BV for decoding the current block, the memory stores instructions that, when executed by the processor, cause the processor to perform operations comprising: obtaining an integer pixel BV by minimizing a difference between a template of the current block and a template of the reference block within a predefined search range; and obtaining the at least one fractional pixel BV by performing a search on the integer pixel BV using a resolution associated with the at least one fractional pixel BV, wherein the at least one fractional pixel BV is a template fractional pixel BV.
11. The device for decoding according to claim 10, wherein the memory stores instructions that, when executed by the processor, cause the processor to perform an operation of obtaining a filtered intraTMP block by performing a filtering operation on the reference block, and the current block is decoded based on the filtered intraTMP block.
12. The device for decoding according to claim 11, wherein the intraTMP mode includes a fused intraTMP mode, the fused intraTMP mode is associated with N reference blocks, and, to obtain the reference block based on the at least one fractional pixel BV, the memory stores instructions that, when executed by the processor, cause the processor to perform an operation of obtaining a first reference block having the smallest sum of absolute differences (SAD) from among the N reference blocks, wherein the template of the reference block used to obtain the template fractional pixel BV is a template of the first reference block.
13. The device for decoding according to claim 12, wherein, to obtain the reference block based on the at least one fractional pixel BV, the memory stores instructions that, when executed by the processor, cause the processor to perform operations comprising: interpolating the first reference block to obtain an interpolated reference block; and generating a fused intraTMP block based on the interpolated reference block, wherein the current block is decoded based on the fused intraTMP block.
14. The device for decoding according to claim 12, wherein, to obtain the at least one fractional pixel BV for decoding the current block, the memory stores instructions that, when executed by the processor, cause the processor to perform an operation of obtaining N template fractional pixel BVs using the template of the current block and respective templates of N intraTMP reference blocks, and, to obtain the reference block based on the at least one fractional pixel BV, the memory stores instructions that, when executed by the processor, cause the processor to perform an operation of generating a fused intraTMP block by interpolating each of the N intraTMP reference blocks using a corresponding one of the N template fractional pixel BVs, wherein the fused intraTMP block is the reference block, and the current block is decoded based on the fused intraTMP block.
15. A non-transitory computer-readable medium storing instructions that, when executed by a processor, cause the processor to perform operations comprising: analyzing a bitstream to determine an intra-template matching prediction (intraTMP) mode associated with a current block; obtaining at least one fractional pixel block vector (BV) for decoding the current block; obtaining a reference block based on the at least one fractional pixel BV; decoding the current block based on the reference block; obtaining a converted fractional pixel BV after decoding the current block based on the at least one fractional pixel BV; and storing the converted fractional pixel BV for decoding another block.
16. The non-transitory computer-readable medium according to claim 15, wherein the instructions, when executed by the processor, cause the processor to perform operations comprising: analyzing the bitstream to determine whether intra-block copy (IBC)-advanced motion vector prediction (AMVP), IBC-merge luma block coding, and direct block vector (DBV) chroma block coding are enabled for the other block; and decoding the other block based on the converted fractional pixel BV using the IBC-AMVP, the IBC-merge luma block coding, or the DBV chroma block coding.
17. The non-transitory computer-readable medium according to claim 15, wherein, to obtain the at least one fractional pixel BV for decoding the current block, the instructions, when executed by the processor, cause the processor to perform operations comprising: obtaining an integer pixel BV by minimizing a difference between a template of the current block and a template of the reference block within a predefined search range; obtaining the at least one fractional pixel BV by performing a search on the integer pixel BV using a resolution associated with the at least one fractional pixel BV, wherein the at least one fractional pixel BV is a template fractional pixel BV; and obtaining a filtered intraTMP block by performing a filtering operation on the reference block, wherein the current block is decoded based on the filtered intraTMP block.
18. The non-transitory computer-readable medium according to claim 17, wherein the intraTMP mode includes a fused intraTMP mode, the fused intraTMP mode is associated with N reference blocks, and, to obtain the reference block based on the at least one fractional pixel BV, the instructions, when executed by the processor, cause the processor to perform an operation of obtaining a first reference block having the smallest sum of absolute differences (SAD) from among the N reference blocks, wherein the template of the reference block used to obtain the template fractional pixel BV is a template of the first reference block.
19. The non-transitory computer-readable medium according to claim 18, wherein, to obtain the reference block based on the at least one fractional pixel BV, the instructions, when executed by the processor, cause the processor to perform operations comprising: interpolating the first reference block to obtain an interpolated reference block; and generating a fused intraTMP block based on the interpolated reference block, wherein the current block is decoded based on the fused intraTMP block.
20. The non-transitory computer-readable medium according to claim 18, wherein, to obtain the at least one fractional pixel BV for decoding the current block, the instructions, when executed by the processor, cause the processor to perform an operation of obtaining N template fractional pixel BVs using the template of the current block and respective templates of N intraTMP reference blocks, and, to obtain the reference block based on the at least one fractional pixel BV, the instructions, when executed by the processor, cause the processor to perform an operation of generating a fused intraTMP block by interpolating each of the N intraTMP reference blocks using a corresponding one of the N template fractional pixel BVs, wherein the fused intraTMP block is the reference block, and the current block is decoded based on the fused intraTMP block.
21. An encoding method performed by an encoder, the method comprising: obtaining, by a processor, at least one fractional pixel block vector (BV) for encoding a current block; obtaining, by the processor, a reference block based on the at least one fractional pixel BV; encoding, by the processor, the current block based on the reference block; obtaining, by the processor, a converted fractional pixel BV after encoding the current block based on the at least one fractional pixel BV; storing, by the processor, the converted fractional pixel BV for encoding another block; and encoding, by the processor, an intra-template matching prediction (intraTMP) mode associated with the current block into a bitstream.
22. A device for encoding, comprising: a processor; and a memory configured to store instructions that, when executed by the processor, cause the processor to perform operations comprising: obtaining at least one fractional pixel block vector (BV) for encoding a current block; obtaining a reference block based on the at least one fractional pixel BV; encoding the current block based on the reference block; obtaining a converted fractional pixel BV after encoding the current block based on the at least one fractional pixel BV; storing the converted fractional pixel BV for encoding another block; and encoding an intra-template matching prediction (intraTMP) mode associated with the current block into a bitstream.
23. A non-transitory computer-readable medium storing instructions that, when executed by a processor, cause the processor to perform operations comprising: obtaining at least one fractional pixel block vector (BV) for encoding a current block; obtaining a reference block based on the at least one fractional pixel BV; encoding the current block based on the reference block; obtaining a converted fractional pixel BV after encoding the current block based on the at least one fractional pixel BV; storing the converted fractional pixel BV for encoding another block; and encoding an intra-template matching prediction (intraTMP) mode associated with the current block into a bitstream.