Threshold-based filtering of template matching merge list
By employing threshold-based filtering of merge lists in template matching processes, the efficiency of video encoding and decoding is enhanced, addressing the challenges of large data sizes and optimizing merge candidate selection in video compression.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- LEE JUNG KYUNG
- Filing Date
- 2025-12-24
- Publication Date
- 2026-07-02
AI Technical Summary
Existing video encoding and decoding systems face challenges in efficiently compressing and decompressing video sequences due to large data sizes, requiring significant resources for storage and transmission, and existing threshold-based filtering methods are inadequate for optimizing merge lists in template matching processes.
Implementing threshold-based filtering of merge lists in template matching processes, utilizing intra prediction modes and block vector prediction techniques to refine and optimize the merge candidate selection, thereby enhancing the efficiency of video encoding and decoding.
Improves the compression efficiency of video sequences by reducing redundant information and optimizing merge lists, leading to more efficient storage and transmission of video data.
Abstract
Description
Docket No.: 24-2066PCT
TITLE
Threshold-based Filtering of Template Matching Merge List
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional Application No. 63/738,759, filed December 24, 2024, and No. 63/805,281, filed May 13, 2025, all of which are hereby incorporated by reference in their entireties.
BRIEF DESCRIPTION OF THE DRAWINGS
[0002] Some features are shown by way of example, and not by limitation, in the accompanying drawings. In the drawings, like numerals reference similar elements.
[0003] FIG. 1 illustrates an example video coding / decoding system in which embodiments of the present disclosure may be implemented.
[0004] FIG. 2 illustrates an example encoder in which embodiments of the present disclosure may be implemented.
[0005] FIG. 3 illustrates an example decoder in which embodiments of the present disclosure may be implemented.
[0006] FIG. 4 illustrates an example quadtree partitioning of a coding tree block (CTB).
[0007] FIG. 5 illustrates an example quadtree corresponding to the example quadtree partitioning of the CTB in FIG. 4.
[0008] FIG. 6 illustrates examples of binary tree and ternary tree partitions.
[0009] FIG. 7 illustrates an example of combined quadtree and multi-type tree partitioning of a CTB.
[0010] FIG. 8 illustrates an example tree corresponding to the combined quadtree and multi-type tree partitioning of the CTB shown in FIG. 7.
[0011] FIG. 9 illustrates an example set of reference samples determined for intra prediction of a current block.
[0012] FIG. 10A and FIG. 10B illustrate example intra prediction modes.
[0013] FIG. 11 illustrates an example of a current block and corresponding reference samples.
[0014] FIG. 12 illustrates an example of applying an intra prediction mode (e.g., an angular mode) for prediction of a current block.
[0015] FIG. 13A illustrates an example of inter prediction performed for a current block in a current picture.
[0016] FIG. 13B illustrates an example motion vector.
[0017] FIG. 14 illustrates an example of bi-prediction performed for a current block.
[0018] FIG. 15A illustrates example spatial candidate neighboring blocks relative to a current block being coded.
[0019] FIG. 15B illustrates example locations of two temporal, co-located blocks relative to a current block.
[0020] FIG. 16 illustrates an example of intra block copy (IBC).
[0021] FIG. 17 illustrates, for an example current block, a reference region or search area of reconstructed samples within which, in Intra Template Match Prediction (IntraTMP), a search is performed for a candidate reference block of which the template best matches the template of the current block.
[0022] FIG. 18 illustrates a current block in a current coding tree unit (CTU), and the reference region with the corresponding IntraTMP search regions R1-R6 identified.
[0023] FIG. 19A illustrates the sparse search stage of IntraTMP.
[0024] FIG. 19B illustrates a block vector according to IntraTMP where the block vector's refinement window is clipped at the boundary of a search region.
[0025] FIG. 20 illustrates an example of the top-template type that can be used in IntraTMP.
[0026] FIG. 21 illustrates an example of the left-template type that can be used in IntraTMP.
[0027] FIG. 22 illustrates an example of the L-shape template type that can be used in IntraTMP.
[0028] FIG. 23A and FIG. 23B illustrate adjacent blocks and non-adjacent blocks for determining block vector candidates, according to IntraTMP with merge candidates.
[0029] FIG. 24 illustrates an example IntraTMP with merge candidates process.
[0030] FIG. 25A shows an example of Auto-Relocated Block Vector Prediction (AR-BVP) applied to Intra Block Copy (IBC).
[0031] FIG. 25B shows examples of refinement windows of AR-BVP candidates derived from different positions associated with a guiding block vector.
[0032] FIG. 25C shows an example of IntraTMP with block vector predictor (BVP) candidates derived using IntraTMP AR-BVP merge.
[0033] FIG. 25D shows the IntraTMP with merge candidates process illustrated in FIG. 24 modified to include AR-BVP candidates in the merge list.
[0034] FIG. 26 shows the process shown in FIG. 25D, modified to include a filtering operation, according to some embodiments of this disclosure.
[0035] FIG. 27 shows example filtering operations being applied to an initial merge list according to some embodiments.
[0036] FIG. 28A, FIG. 28B, FIG. 28C, and FIG. 29 illustrate examples of generating a threshold and using the generated threshold for filtering the merge TMP list, according to some embodiments.
[0037] FIG. 30 illustrates a flowchart of an example method for encoding a bitstream where the process includes filtering of the merge list based on TMP cost, according to some embodiments.
[0038] FIG. 31 illustrates a flowchart of an example method for decoding a bitstream where the process includes filtering the merge list based on TMP cost, according to some embodiments.
[0039] FIG. 32 illustrates a flowchart of another example method for decoding / encoding a bitstream where the process includes filtering the merge list based on TMP cost, according to some embodiments.
[0040] FIG. 33 illustrates a block diagram of an example computer system in which embodiments of the present disclosure may be implemented.
DETAILED DESCRIPTION
[0041] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the disclosure. However, it will be apparent to those skilled in the art that the disclosure, including structures, systems, and methods, may be practiced without these specific details. The description and representation herein are the common means used by those experienced or skilled in the art to most effectively convey the substance of their work to others skilled in the art. In other instances, well-known methods, procedures, components, and circuitry have not been described in detail to avoid unnecessarily obscuring aspects of the disclosure.
[0042] References in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
[0043] Also, it is noted that individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed, but could have additional steps not included in a figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination can correspond to a return of the function to the calling function or the main function.
[0044] The term “computer-readable medium” includes, but is not limited to, portable or non-portable storage devices, optical storage devices, and various other mediums capable of storing, containing, or carrying instruction(s) and / or data. A computer-readable medium may include a non-transitory medium in which data can be stored and that does not include carrier waves and / or transitory electronic signals propagating wirelessly or over wired connections. Examples of a non-transitory medium may include, but are not limited to, a magnetic disk or tape, optical storage media such as compact disk (CD) or digital versatile disk (DVD), flash memory, memory or memory devices. A computer-readable medium may have stored thereon code and / or machine-executable instructions that may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and / or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, or the like.
[0045] Furthermore, embodiments may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks (e.g., a computer-program product) may be stored in a computer-readable or machine-readable medium. A processor(s) may perform the necessary tasks.
[0046] A video sequence, comprising multiple pictures / frames, may be represented in digital form for storage and / or transmission. Representing a video sequence in digital form may require a large quantity of bits. Large data sizes that may be associated with video sequences may require significant resources for storage and / or transmission. Video encoding may be used to compress a size of a video sequence for more efficient storage and / or transmission. Video decoding may be used to decompress a compressed video sequence for display and / or other forms of consumption.
[0047] FIG. 1 shows an example video coding / decoding system 100 in which embodiments of the present disclosure may be implemented. Video coding / decoding system 100 comprises a source device 102, a transmission medium 104, and a destination device 106. Source device 102 encodes a video sequence 108 into a bitstream 110 for more efficient storage and / or transmission. Source device 102 may store and / or send / transmit bitstream 110 to destination device 106 via transmission medium 104. Destination device 106 decodes bitstream 110 to display video sequence 108. Destination device 106 may receive bitstream 110 from source device 102 via transmission medium 104. Source device 102 and / or destination device 106 may be any of a plurality of different devices (e.g., a desktop computer, laptop computer, tablet computer, smart phone, wearable device, television, camera, video gaming console, set-top box, video streaming device, etc.).
[0048] Source device 102 may comprise (e.g., for encoding video sequence 108 into bitstream 110) one or more of a video source 112, an encoder 114, and / or an output interface 116. Video source 112 may provide and / or generate video sequence 108 based on a capture of a natural scene and / or a synthetically generated scene. A synthetically generated scene may be a scene comprising computer generated graphics and / or screen content. Video source 112 may comprise a video capture device (e.g., a video camera), a video archive comprising previously captured natural scenes and / or synthetically generated scenes, a video feed interface to receive captured natural scenes and / or synthetically generated scenes from a video content provider, and / or a processor to generate synthetic scenes.
[0049] A video sequence, such as video sequence 108, may comprise a series of pictures (also referred to as frames). A video sequence may achieve an impression of motion based on successive presentation of pictures of the video sequence using a constant time interval or variable time intervals between the pictures. A picture may comprise one or more sample arrays of intensity values. The intensity values may be taken (e.g., measured, determined, provided) at a series of regularly spaced locations within a picture. A color picture may comprise (e.g., typically comprises) a luminance sample array and two chrominance sample arrays. The luminance sample array may comprise intensity values representing the brightness (e.g., luma component, Y) of a picture. The chrominance sample arrays may comprise intensity values that respectively represent the blue and red components of a picture (e.g., chroma components, Cb and Cr) separate from the brightness. Other color picture sample arrays may be possible based on different color schemes (e.g., a red, green, blue (RGB) color scheme). A pixel, in a color picture, may refer to / comprise / be associated with all intensity values (e.g., luma component, chroma components), for a given location, in the sample arrays (e.g., three sample arrays are used for one luma component and two chroma components, respectively) used to represent color pictures. A monochrome picture may comprise a single, luminance sample array. A pixel, in a monochrome picture, may refer to / comprise / be associated with the intensity value (e.g., luma component) at a given location in the single, luminance sample array used to represent monochrome pictures.
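The relationship between sample arrays and pixels described above can be sketched as a small data structure. This is an illustrative sketch only: the names `Picture`, `Plane`, and `pixel` are hypothetical, and the sketch assumes that all three sample arrays have the same dimensions (no chroma subsampling), which the description does not require.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Plane = List[List[int]]  # one sample array: rows of intensity values


@dataclass
class Picture:
    """Hypothetical container for the sample arrays of one picture."""
    luma: Plane                 # luminance (Y) sample array
    cb: Optional[Plane] = None  # blue-difference chrominance array, absent for monochrome
    cr: Optional[Plane] = None  # red-difference chrominance array, absent for monochrome

    def pixel(self, row: int, col: int) -> Tuple[int, ...]:
        """Return all intensity values associated with one location."""
        y = self.luma[row][col]
        if self.cb is None or self.cr is None:
            return (y,)  # monochrome: a pixel is the single luma value
        # Assumption: chroma arrays have the same size as the luma array.
        return (y, self.cb[row][col], self.cr[row][col])


color = Picture(luma=[[16, 32], [48, 64]],
                cb=[[128, 120], [110, 100]],
                cr=[[128, 130], [140, 150]])
mono = Picture(luma=[[16, 32], [48, 64]])
print(color.pixel(1, 0), mono.pixel(1, 0))
```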
[0050] Encoder 114 may encode video sequence 108 into bitstream 110. Encoder 114 may apply / use (e.g., to encode video sequence 108) one or more prediction techniques to reduce redundant information in video sequence 108. Redundant information is information that may be predicted at a decoder and need not be transmitted to the decoder for accurate decoding of video sequence 108. For example, encoder 114 may apply spatial prediction (e.g., intra-frame or intra prediction), temporal prediction (e.g., inter-frame prediction or inter prediction), inter-layer prediction, and / or other prediction techniques to reduce redundant information in video sequence 108. Encoder 114 may partition pictures comprising video sequence 108 into rectangular regions referred to as blocks, for example, before applying one or more prediction techniques. Encoder 114 may then encode a block using the one or more of the prediction techniques.
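As a minimal illustration of partitioning a picture into rectangular blocks, the sketch below tiles a picture with a fixed block size in raster order. The function name `partition_into_blocks`, the fixed-size grid, and the clipping of blocks at the right and bottom picture edges are assumptions made for the example; they are not taken from the description, which leaves the partitioning scheme open.

```python
def partition_into_blocks(height, width, block_size):
    """Return (top, left, block_height, block_width) tuples covering the picture.

    Blocks are listed in raster order. Blocks at the right and bottom edges
    are clipped to the picture boundary (an assumption of this sketch).
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            blocks.append((top, left,
                           min(block_size, height - top),
                           min(block_size, width - left)))
    return blocks


# A 4x6 picture split with 4x4 blocks yields one full block and one clipped block.
print(partition_into_blocks(4, 6, 4))
```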
[0051] For temporal prediction, encoder 114 may search for a block similar to the block being encoded in another picture (e.g., referred to as a reference picture) of video sequence 108. The block determined during the search (e.g., referred to as a prediction block) may then be used to predict the block being encoded. For spatial prediction, encoder 114 may form a prediction block based on data from reconstructed neighboring samples of the block to be encoded within the same picture of video sequence 108. A reconstructed sample refers to a sample that was encoded and then decoded. Encoder 114 may determine a prediction error (e.g., also referred to as a residual) based on the difference between a block being encoded and a prediction block. The prediction error may represent non-redundant information that may be sent / transmitted to a decoder for accurate decoding of video sequence 108.
[0052] Encoder 114 may apply a transform to the prediction error (e.g., using a discrete cosine transform (DCT), or any other transform) to generate transform coefficients. Encoder 114 may form bitstream 110 based on the transform coefficients and other information used to determine prediction blocks using / based on prediction types, motion vectors, and / or prediction modes. Encoder 114 may perform one or more of quantization and entropy coding of the transform coefficients and / or the other information used to determine the prediction blocks, for example, before forming bitstream 110. The quantization and / or the entropy coding may further reduce the quantity of bits needed to store and / or transmit video sequence 108.
[0053] Output interface 116 may be configured to write and / or store bitstream 110 onto transmission medium 104 for transmission to destination device 106. In addition or alternatively, output interface 116 may be configured to send / transmit, upload, and / or stream bitstream 110 to destination device 106 via transmission medium 104. Output interface 116 may comprise a wired and / or a wireless transmitter configured to send / transmit, upload, and / or stream bitstream 110 in accordance with one or more proprietary, open-source, and / or standardized communication protocols (e.g., Digital Video Broadcasting (DVB) standards, Advanced Television Systems Committee (ATSC) standards, Integrated Services Digital Broadcasting (ISDB) standards, Data Over Cable Service Interface Specification (DOCSIS) standards, 3rd Generation Partnership Project (3GPP) standards, Institute of Electrical and Electronics Engineers (IEEE) standards, Internet Protocol (IP) standards, Wireless Application Protocol (WAP) standards, and / or any other communication protocol).
[0054] Transmission medium 104 may comprise wireless, wired, and / or computer readable medium. For example, transmission medium 104 may comprise one or more wires, cables, air interfaces, optical discs, flash memory, and / or magnetic memory. In addition or alternatively, transmission medium 104 may comprise one or more networks (e.g., the internet) or file servers configured to store and / or send / transmit encoded video data.
[0055] Destination device 106 may decode bitstream 110 into video sequence 108 for display. Destination device 106 may comprise one or more of an input interface 118, a decoder 120, and / or a video display 122. Input interface 118 may be configured to read bitstream 110 stored on transmission medium 104 by source device 102. In addition or alternatively, input interface 118 may be configured to receive, download, and / or stream bitstream 110 from source device 102 via transmission medium 104. Input interface 118 may comprise a wired and / or a wireless receiver configured to receive, download, and / or stream bitstream 110 in accordance with one or more proprietary, open-source, standardized communication protocols, and / or any other communication protocol (e.g., such as referenced herein).
[0056] Decoder 120 may decode video sequence 108 from encoded bitstream 110. The decoder 120 may generate prediction blocks for pictures of video sequence 108 in a similar manner as encoder 114 and determine the prediction errors for the blocks, for example, to decode video sequence 108. Decoder 120 may generate the prediction blocks using / based on prediction types, prediction modes, and / or motion vectors received in bitstream 110. Decoder 120 may determine the prediction errors using the transform coefficients received in bitstream 110. Decoder 120 may determine the prediction errors by weighting transform basis functions using the transform coefficients. Decoder 120 may combine the prediction blocks and the prediction errors to decode video sequence 108. The video sequence decoded at destination device 106 may or may not be identical to video sequence 108 as sent by source device 102. Decoder 120 may decode a video sequence that approximates video sequence 108, for example, because of lossy compression of video sequence 108 by encoder 114 and / or errors introduced into encoded bitstream 110 during transmission to destination device 106.
[0057] Video display 122 may display video sequence 108 to a user. Video display 122 may comprise a cathode ray tube (CRT) display, a liquid crystal display (LCD), a plasma display, a light emitting diode (LED) display, and / or any other display device suitable for displaying video sequence 108.
[0058] Video coding / decoding system 100 is merely an example and video encoding / decoding systems different from the video coding / decoding system 100 and / or modified versions of the video coding / decoding system 100 may similarly perform the methods and processes as described herein. For example, the video coding / decoding system 100 may comprise other components and / or arrangements. For example, video source 112 may be external to source device 102. Similarly, video display 122 may be external to destination device 106 or omitted altogether (e.g., if video sequence 108 is intended for consumption by a machine and / or storage device). In an example, source device 102 may further comprise a video decoder and destination device 106 may further comprise a video encoder. For example, source device 102 may be configured to further receive an encoded bitstream from destination device 106 to support two-way video transmission between the devices.
[0059] Encoder 114 and / or decoder 120 may operate according to one or more proprietary or industry video coding standards. For example, encoder 114 and / or decoder 120 may operate in accordance with one or more proprietary, open-source, and / or standardized protocols (e.g., International Telecommunication Union Telecommunication Standardization Sector (ITU-T) H.263, ITU-T H.264 and Moving Picture Experts Group (MPEG)-4 Visual (also known as Advanced Video Coding (AVC)), ITU-T H.265 and MPEG-H Part 2 (also known as High Efficiency Video Coding (HEVC)), ITU-T H.266 and MPEG-I Part 3 (also known as Versatile Video Coding (VVC)), the WebM VP8 and VP9 codecs, and / or AOMedia Video 1 (AV1), and / or any other video coding protocol).
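The statement that prediction errors may be determined by weighting transform basis functions using the transform coefficients can be illustrated with a one-dimensional sketch. The sketch assumes an orthonormal DCT-II basis and omits inverse quantization; the function names `dct_basis`, `inverse_transform`, and `decode_block` are hypothetical and chosen only for this example.

```python
import math


def dct_basis(k, n):
    """Return the k-th orthonormal DCT-II basis function sampled at n points."""
    scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
    return [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]


def inverse_transform(coeffs):
    """Rebuild the prediction error as a coefficient-weighted sum of basis functions."""
    n = len(coeffs)
    error = [0.0] * n
    for k, coeff in enumerate(coeffs):
        basis = dct_basis(k, n)
        for i in range(n):
            error[i] += coeff * basis[i]
    return error


def decode_block(prediction, coeffs):
    """Combine a prediction block with the prediction error rebuilt from coefficients."""
    error = inverse_transform(coeffs)
    return [p + round(e) for p, e in zip(prediction, error)]


# A single DC coefficient of 4.0 adds a constant error of 2 to each of four samples.
print(decode_block([10, 11, 12, 13], [4.0, 0.0, 0.0, 0.0]))
```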
[0060] FIG. 2 shows an example encoder. Encoder 200 as shown in FIG. 2 may implement one or more processes described herein. Encoder 200 may encode a video sequence 202 into a bitstream 204 for more efficient storage and / or transmission. Encoder 200 may be implemented in video coding / decoding system 100 as shown in FIG. 1 (e.g., as encoder 114) or in any computing, communication, or electronic device (e.g., desktop computer, laptop computer, tablet computer, smartphone, wearable device, television, camera, video gaming console, set-top box, video streaming device, etc.). Encoder 200 may comprise one or more of an inter prediction unit 206, an intra prediction unit 208, combiners 210 and 212, a transform and quantization unit (TR + Q) 214, an inverse transform and quantization unit (iTR + iQ) 216, an entropy coding unit 218, one or more filters 220, and / or a buffer 222.
[0061] Encoder 200 may partition pictures (e.g., frames) of (e.g., comprising) video sequence 202 into blocks and encode video sequence 202 on a block-by-block basis. Encoder 200 may perform / apply a prediction technique on a block being encoded using either inter prediction unit 206 or intra prediction unit 208. Inter prediction unit 206 may perform inter prediction by searching for a block similar to the block being encoded in another, reconstructed picture (e.g., a reference picture) of video sequence 202. A reconstructed picture refers to a picture that was encoded and then decoded. The block determined during the search (e.g., referred to as a prediction block) may then be used to predict the block being encoded to remove redundant information. Inter prediction unit 206 may exploit temporal redundancy or similarities in scene content from picture to picture in video sequence 202 to determine the prediction block. For example, scene content between pictures of video sequence 202 may be similar except for differences due to motion and / or affine transformation of the screen content over time.
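The search for a similar block in a reference picture can be sketched as an exhaustive block-matching search. The description does not specify a similarity measure or a search strategy; the sum of absolute differences (SAD), the square search window, and the names `sad`, `crop`, and `full_search` are assumptions made for illustration.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def crop(picture, top, left, height, width):
    """Extract a height x width block whose top-left sample is at (top, left)."""
    return [row[left:left + width] for row in picture[top:top + height]]


def full_search(current_block, reference, top, left, search_range):
    """Return (dy, dx, cost) of the best match within +/- search_range samples.

    (top, left) is the position of the current block in its own picture.
    Candidate positions that fall outside the reference picture are skipped.
    Returns None if no candidate position is valid.
    """
    height, width = len(current_block), len(current_block[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + height > len(reference) or x + width > len(reference[0]):
                continue
            cost = sad(current_block, crop(reference, y, x, height, width))
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best


# Reference picture with unique sample values; the current block equals the
# reference content at (2, 3) but sits at (1, 1) in the current picture.
reference = [[10 * r + c for c in range(6)] for r in range(6)]
current_block = crop(reference, 2, 3, 2, 2)
print(full_search(current_block, reference, top=1, left=1, search_range=2))
```

In this sketch the returned displacement plays the role of a motion vector, and the block it points to would serve as the prediction block.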
[0062] Intra prediction unit 208 may perform intra prediction by forming a prediction block based on data from reconstructed neighboring samples of the block to be encoded within the same picture of video sequence 202. A reconstructed sample refers to a sample that was encoded and then decoded. Intra prediction unit 208 may exploit spatial redundancy or similarities in scene content within a picture of video sequence 202 to determine the prediction block. For example, the texture of a region of scene content in a picture may be similar to the texture in the immediate surrounding area of the region of the scene content in the same picture.
[0063] Combiner 210 may determine a prediction error (e.g., referred to as a residual) based on the difference between the block being encoded and the prediction block. The prediction error may represent non-redundant information that may be sent / transmitted to a decoder for accurate decoding of video sequence 202.
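The difference operation performed by a combiner can be sketched as a sample-wise subtraction, with the inverse operation adding the error back to the prediction block. The function names `prediction_error` and `reconstruct` are hypothetical, and the sketch ignores the transform and quantization stages, so the reconstruction here is exact.

```python
def prediction_error(block, prediction):
    """Sample-wise difference between the block being encoded and its prediction block."""
    if len(block) != len(prediction) or any(len(a) != len(b) for a, b in zip(block, prediction)):
        raise ValueError("block and prediction must have the same dimensions")
    return [[o - p for o, p in zip(row_o, row_p)] for row_o, row_p in zip(block, prediction)]


def reconstruct(prediction, error):
    """Add a prediction error back onto a prediction block."""
    return [[p + e for p, e in zip(row_p, row_e)] for row_p, row_e in zip(prediction, error)]


block = [[10, 12], [8, 9]]
prediction = [[9, 12], [8, 11]]
residual = prediction_error(block, prediction)
print(residual)                           # small values where the prediction is good
print(reconstruct(prediction, residual))  # recovers the original block
```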
[0064] Transform and quantization unit (TR + Q) 214 may transform and quantize the prediction error. Transform and quantization unit 214 may transform the prediction error into transform coefficients by applying, for example, a DCT to reduce correlated information in the prediction error. Transform and quantization unit 214 may quantize the coefficients by mapping data of the transform coefficients to a predefined set of representative values. Transform and quantization unit 214 may quantize the coefficients to reduce irrelevant information in bitstream 204. The irrelevant information refers to information that may be removed from the coefficients without producing visible and / or perceptible distortion in video sequence 202 after decoding (e.g., at a receiving device).
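A minimal sketch of the transform and quantization steps follows. It assumes a separable orthonormal DCT-II and a uniform quantizer whose representative values are integer multiples of a step size; neither choice is mandated by the description, and the names `dct_1d`, `dct_2d`, and `quantize` are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a one-dimensional list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform each row, then each column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major order


def quantize(coeffs, step):
    """Map each coefficient to the index of the nearest multiple of step."""
    return [[round(c / step) for c in row] for row in coeffs]


# A flat 4x4 prediction error concentrates all of its energy in the DC coefficient.
flat_error = [[8] * 4 for _ in range(4)]
levels = quantize(dct_2d(flat_error), step=4)
print(levels)
```

For the flat block above, only the top-left (DC) level is non-zero, which illustrates how the transform reduces correlated information before quantization.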
[0065] Entropy coding unit 218 may apply one or more entropy coding methods to the quantized transform coefficients to further reduce the bit rate. For example, entropy coding unit 218 may apply context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), and / or syntax-based context-based binary arithmetic coding (SBAC). The entropy coded coefficients may be packed to form bitstream 204.
[0066] Inverse transform and quantization unit (iTR + iQ) 216 may inverse quantize and inverse transform the quantized transform coefficients to determine a reconstructed prediction error. Combiner 212 may combine the reconstructed prediction error with the prediction block to form a reconstructed block. Filter(s) 220 may filter the reconstructed block, for example, using a deblocking filter and / or a sample-adaptive offset (SAO) filter. Buffer 222 may store the reconstructed block for prediction of one or more other blocks in the same and / or different picture of video sequence 202.
[0067] Encoder 200 may further comprise an encoder control unit. The encoder control unit may be configured to control one or more units of encoder 200 as shown in FIG. 2. The encoder control unit may control the one or more units of encoder 200 such that bitstream 204 may be generated in conformance with the requirements of one or more proprietary coding protocols, industry video coding standards, and / or any other video coding protocol. For example, the encoder control unit may control the one or more units of encoder 200 such that bitstream 204 may be generated in conformance with one or more of ITU-T H.263, AVC, HEVC, VVC, VP8, VP9, AV1, and / or any other video coding standard / format.
[0068] The encoder control unit may be configured to attempt to minimize (or reduce) the bitrate of bitstream 204 and / or maximize (or increase) the reconstructed video quality (e.g., within the constraints of a proprietary coding protocol, industry video coding standard, and / or any other video coding protocol). For example, the encoder control unit may be configured to attempt to minimize or reduce the bitrate of bitstream 204 such that the reconstructed video quality does not fall below a certain level / threshold, and / or to maximize or increase the reconstructed video quality such that the bitrate of bitstream 204 does not exceed a certain level / threshold. The encoder control unit may determine / control one or more of: partitioning of the pictures of video sequence 202 into blocks, whether a block is inter predicted by inter prediction unit 206 or intra predicted by intra prediction unit 208, a motion vector for inter prediction of a block, an intra prediction mode among a plurality of intra prediction modes for intra prediction of a block, filtering performed by filter(s) 220, and / or one or more transform types and / or quantization parameters applied by transform and quantization unit 214. The encoder control unit may determine / control one or more of the above based on a rate-distortion measure for a block or picture being encoded. The encoder control unit may determine / control one or more of the above to reduce the rate-distortion measure for a block or picture being encoded.
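One common form of rate-distortion measure is a Lagrangian cost that adds the distortion to the rate weighted by a multiplier. The description does not define the measure, so the cost formula, the multiplier `lam`, and the names `rd_cost` and `choose_mode` below are assumptions used only to illustrate how a control unit might pick between candidate coding decisions.

```python
def rd_cost(distortion, rate_bits, lam):
    """Assumed Lagrangian cost: distortion plus lam times the rate in bits."""
    return distortion + lam * rate_bits


def choose_mode(candidates, lam):
    """Return the candidate with the lowest assumed rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c["distortion"], c["rate_bits"], lam))


# Hypothetical measurements for one block under two prediction types.
candidates = [
    {"name": "intra", "distortion": 100, "rate_bits": 20},
    {"name": "inter", "distortion": 120, "rate_bits": 5},
]
print(choose_mode(candidates, lam=2.0)["name"])  # rate weighs heavily: fewer bits wins
print(choose_mode(candidates, lam=0.5)["name"])  # rate weighs lightly: lower distortion wins
```

A larger multiplier favors candidates that spend fewer bits, while a smaller one favors candidates with lower distortion, which mirrors the bitrate and quality trade-off described above.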
[0069] The prediction type used to encode a block (intra or inter prediction), prediction information of the block (intra prediction mode if intra predicted, motion vector, etc.), and / or transform and / or quantization parameters, may be sent to entropy coding unit 218 to be further compressed (e.g., to reduce the bitrate). For example, entropy coding unit 218 may apply context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), and / or syntax-based context-based binary arithmetic coding (SBAC) to achieve further compression. The prediction type, prediction information, and / or transform and / or quantization parameters may be packed with the prediction error to form bitstream 204.
[0070] Encoder 200 is merely an example and encoders different from encoder 200 and / or modified versions of encoder 200 may perform the methods and processes as described herein. For example, encoder 200 may comprise other components and / or arrangements. One or more of the components shown in FIG. 2 may be optionally included in encoder 200 (e.g., entropy coding unit 218 and / or filters(s) 220).
[0071] FIG. 3 shows an example decoder. A decoder 300 as shown in FIG.3 may implement one or more processes described herein. Decoder 300 may decode a bitstream 302 into a decoded video sequence 304 for display and / or some other form of consumption. Decoder 300 may be implemented in video coding / decoding system 100 in FIG. 1 and / or in a computing, communication, or electronic device (e.g., desktop computer, laptop computer, tablet computer, smart phone, wearable device, television, camera, video gaming console, set-top box, and / or video streaming device). Decoder 300 may comprise an entropy decoding unit 306, an inverse transform and quantization (iTR + iQ) unit 308, a combiner 310, one or more filters 312, a buffer 314, an inter prediction unit 316, and / or an intra prediction unit 318.
[0072] Decoder 300 may comprise a decoder control unit configured to control one or more units of decoder 300. The decoder control unit may control the one or more units of decoder 300 such that bitstream 302 is decoded in conformance with the requirements of one or more proprietary coding protocols, industry video coding standards, and / or any other communication protocol. For example, the decoder control unit may control the one or more units of decoder 300 such that the bitstream 302 is decoded in conformance with one or more of ITU-T H.263, AVC, HEVC, WC, VP8, VP9, AV1, and / or any other video coding standard / format.
[0073] The decoder control unit may determine / control one or more of: whether a block is inter predicted by inter prediction unit 316 or intra predicted by intra prediction unit 318, a motion vector for inter prediction of a block, an intra prediction mode among a plurality of intra prediction modes for intra prediction of a block, filtering performed by filter(s) 312, and / or one or more inverse transform types and / or inverse quantization parameters to be applied by inverse transform and quantization unit 308. One or more of the control parameters used by the decoder control unit may be packed in bitstream 302.
[0074] Entropy decoding unit 306 may entropy decode the bitstream 302. For example, entropy decoding unit 306 may apply context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), and / or syntax-based context-based binary arithmetic coding (SBAC) to decompress the prediction type used to encode a block (intra or inter prediction), prediction information of the block (intra prediction mode if intra predicted, motion vector, etc.), and transform and quantization parameters. Inverse transform and quantization unit 308 may inverse quantize and / or inverse transform the quantized transform coefficients to determine a decoded prediction error. Combiner 310 may combine the decoded prediction error with a prediction block to form a decoded block. The prediction block may be generated by intra prediction unit 318 or inter prediction unit 316 (e.g., as described above with respect to encoder 200 in FIG. 2). Filter(s) 312 may filter the decoded block, for example, using a deblocking filter and / or a sample-adaptive offset (SAO) filter. Buffer 314 may store the decoded block for prediction of one or more other blocks in the same and / or different picture of the video sequence in bitstream 302. Decoded video sequence 304 may be output from filter(s) 312 as shown in FIG. 3.
[0075] Decoder 300 is merely an example and decoders different from decoder 300 and / or modified versions of decoder 300 may perform the methods and processes as described herein. For example, decoder 300 may have other components and / or arrangements. One or more of the components shown in FIG. 3 may be optionally included in decoder 300 (e.g., entropy decoding unit 306 and / or filter(s) 312).
[0076] Although not shown in FIGS. 2 and 3, each of encoder 200 and decoder 300 may further comprise an intra block copy unit in addition to inter prediction and intra prediction units. The intra block copy unit may perform / operate similar to an inter prediction unit but may predict blocks within the same picture. For example, the intra block copy unit may exploit repeated patterns that appear in screen content. The screen content may include computer generated text, graphics, animation, etc.
[0077] Video encoding and / or decoding may be performed on a block-by-block basis. The process of partitioning a picture into blocks may be adaptive based on the content of the picture. For example, larger block partitions may be used in areas of a picture with higher levels of homogeneity to improve coding efficiency.
[0078] A picture (e.g., in HEVC, or any other coding standard / format) may be partitioned into non-overlapping square blocks, which may be referred to as coding tree blocks (CTBs). The CTBs may comprise samples of a sample array. A CTB may have a size of $2^n \times 2^n$ samples, where n may be specified by a parameter of the encoding system. For example, n may be 4, 5, 6, or any other value. A CTB may have any other size. A CTB may be further partitioned by a recursive quadtree partitioning into coding blocks (CBs) of half vertical and half horizontal size. The CTB may form the root of the quadtree. A CB that is not split further as part of the recursive quadtree partitioning may be referred to as a leaf CB of the quadtree, and otherwise may be referred to as a non-leaf CB of the quadtree. A CB may have a minimum size specified by a parameter of the encoding system. For example, a CB may have a minimum size of 4x4, 8x8, 16x16, 32x32, 64x64 samples, or any other minimum size. A CB may be further partitioned into one or more prediction blocks (PBs) for performing inter and / or intra prediction. A PB may be a rectangular block of samples on which the same prediction type / mode may be applied. A CB may also be further partitioned into intra subpartitions (ISP) where the reconstructed samples of each sub-partition are available to generate the prediction of the next sub-partition. For example, a CB may be split into 2 to 4 sub-partitions. For transformations, a CB may be partitioned into one or more transform blocks (TBs). A TB may be a rectangular block of samples that may determine / indicate an applied transform size.
[0079] FIG. 4 shows an example quadtree partitioning of a CTB 400. FIG. 5 shows an example quadtree 500 corresponding to the example quadtree partitioning of CTB 400 in FIG. 4. As shown in the examples of FIGS. 4 and 5, CTB 400 may first be partitioned into four CBs of half vertical and half horizontal size. Three of the resulting CBs of the first level partitioning of CTB 400 are leaf CBs. The three leaf CBs of the first level partitioning of CTB 400 are respectively labeled 7, 8, and 9 in FIGS. 4 and 5. The non-leaf CB of the first level partitioning of CTB 400 is partitioned into four sub-CBs of half vertical and half horizontal size. Three of the resulting sub-CBs of the second level partitioning of CTB 400 are leaf CBs. The three leaf CBs of the second level partitioning of CTB 400 are respectively labeled 0, 5, and 6 in FIGS. 4 and 5. Finally, the non-leaf CB of the second level partitioning of CTB 400 is partitioned into four leaf CBs of half vertical and half horizontal size. The four leaf CBs are respectively labeled 1, 2, 3, and 4 in FIGS. 4 and 5.
[0080] The example CTB 400 of FIG. 4 is partitioned into 10 leaf CBs respectively labeled 0-9, but may be partitioned into other quantities of leaf CBs. The 10 leaf CBs may correspond to 10 CB leaf nodes (e.g., 10 CB leaf nodes of quadtree 500 as shown in FIG. 5). In other examples, a CTB may be partitioned into a different number of leaf CBs. The resulting quadtree partitioning of CTB 400 may be scanned using a z-scan (e.g., left-to-right, top-to-bottom) to form the sequence order for encoding / decoding the CB leaf nodes. A numeric label (e.g., indicator, index) of each CB leaf node in FIGS. 4 and 5 may correspond to the sequence order for encoding / decoding. For example, CB leaf node 0 may be encoded / decoded first and CB leaf node 9 may be encoded / decoded last. Although not shown in FIGS. 4 and 5, each CB leaf node may comprise one or more PBs and / or TBs.
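As a non-limiting illustration, the recursive quadtree partitioning and z-scan ordering described for CTB 400 may be sketched as follows. The nested-list representation, the function name `z_scan_leaves`, and the 64x64 CTB size are assumptions made for this sketch only. The split structure is inferred from the labels described for FIGS. 4 and 5, and the exact geometry of FIG. 4 may differ.

```python
from typing import List, Tuple, Union

# A node is either the string "leaf" or a list of four child nodes given in
# z-scan order: top-left, top-right, bottom-left, bottom-right.
Node = Union[str, list]


def z_scan_leaves(node: Node, x: int, y: int, size: int) -> List[Tuple[int, int, int]]:
    """Return (x, y, size) of every leaf CB in z-scan (coding) order."""
    if node == "leaf":
        return [(x, y, size)]
    half = size // 2
    offsets = [(0, 0), (half, 0), (0, half), (half, half)]
    leaves: List[Tuple[int, int, int]] = []
    for child, (dx, dy) in zip(node, offsets):
        leaves.extend(z_scan_leaves(child, x + dx, y + dy, half))
    return leaves


L = "leaf"
# Hypothetical encoding of the splits of CTB 400: the first quadrant is split
# again, and its second sub-quadrant is split a third time.
ctb_400 = [[L, [L, L, L, L], L, L], L, L, L]

leaves = z_scan_leaves(ctb_400, 0, 0, 64)  # 64x64 CTB, i.e., n = 6
for label, (x, y, size) in enumerate(leaves):
    print(label, (x, y), f"{size}x{size}")
```

In this sketch the position of a leaf in the returned list plays the role of the numeric label, so the first entry corresponds to CB leaf node 0 and the last entry to CB leaf node 9.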
[0081] A picture, in VVC (or in any other coding standard / format), may be partitioned in a similar manner (such as in HEVC). A picture may be first partitioned into non-overlapping square CTBs. The CTBs may then be partitioned, using a recursive quadtree partitioning, into CBs of half vertical and half horizontal size. A quadtree leaf node (e.g., in VVC) may be further partitioned by a binary tree or ternary tree partitioning (or any other partitioning) into CBs of unequal sizes.
[0082] FIG. 6 shows example binary tree and ternary tree partitions. A binary tree partition may divide a parent block in half in either a vertical direction 602 or a horizontal direction 604. The resulting partitions may be half in size as compared to the parent block. In other examples, the resulting partitions may correspond to sizes that are less than and / or greater than half of the parent block size. A ternary tree partition may divide a parent block into three parts in either a vertical direction 606 or a horizontal direction 608. FIG. 6 shows an example in which the middle partition may be twice as large as the other two end partitions in the ternary tree partitions. In other examples, partitions may be of other sizes relative to each other and to the parent block. Binary and ternary tree partitions are examples of multi-type tree partitioning. Multi-type tree partitions may comprise partitioning a parent block into other quantities of smaller blocks. The block partitioning strategy (e.g., in VVC) may be referred to as a combination of quadtree and multi-type tree partitioning (quadtree + multi-type tree partitioning) because of the addition of binary and / or ternary tree partitioning to quadtree partitioning.
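As a non-limiting illustration, the binary and ternary tree partitions of FIG. 6 may be sketched as a function that returns the resulting sub-block rectangles. The mode names (`BT_VER`, `BT_HOR`, `TT_VER`, `TT_HOR`) and the function name are hypothetical. The sketch assumes that a split in the vertical direction produces side-by-side partitions and that a ternary split uses the 1:2:1 ratio described above.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def split_block(block: Rect, mode: str) -> List[Rect]:
    """Split a parent block with a binary (BT) or ternary (TT) tree partition."""
    x, y, w, h = block
    if mode == "BT_VER":  # two halves, side by side
        return [(x, y, w // 2, h), (x + w // 2, y, w - w // 2, h)]
    if mode == "BT_HOR":  # two halves, stacked
        return [(x, y, w, h // 2), (x, y + h // 2, w, h - h // 2)]
    if mode == "TT_VER":  # quarter, half, quarter, side by side
        q = w // 4
        return [(x, y, q, h), (x + q, y, w - 2 * q, h), (x + w - q, y, q, h)]
    if mode == "TT_HOR":  # quarter, half, quarter, stacked
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, h - 2 * q), (x, y + h - q, w, q)]
    raise ValueError(f"unknown split mode: {mode}")


print(split_block((0, 0, 32, 32), "BT_HOR"))
print(split_block((0, 0, 32, 32), "TT_VER"))
```

Applying `split_block` recursively to the leaf nodes of a quadtree would give a combined quadtree and multi-type tree partitioning of the kind described above.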
[0083] FIG. 7 shows an example of combined quadtree and multi-type tree partitioning of a CTB 700. FIG. 8 shows an example tree 800 corresponding to the combined quadtree and multi-type tree partitioning of CTB 700 shown in FIG. 7. In both FIGS. 7 and 8, quadtree splits are shown in solid lines and multi-type tree splits are shown in dashed lines. For ease of explanation, CTB 700 is shown with the same quadtree partitioning as the CTB 400 described in FIG. 4, and a description of the quadtree partitioning of CTB 700, which is similar to that for CTB 400, is omitted. The quadtree partitioning of the CTB 700 is merely an example and a CTB may be quadtree partitioned in a manner different from the CTB 700. Additional multi-type tree partitions of CTB 700 may be made relative to three leaf CBs shown in FIG. 4. The three leaf CBs in FIG. 4 that are shown in FIG. 7 as being further partitioned may be leaf CBs 5, 8, and 9. The three leaf CBs may be further partitioned using one or more binary and / or ternary tree partitions.
[0084] The leaf CB 5 of FIG. 4 may be partitioned into two CBs based on a vertical binary tree partitioning. The two resulting CBs may be leaf CBs respectively labeled 5 and 6 in FIGS. 7 and 8. The leaf CB 8 of FIG. 4 may be partitioned into three CBs based on a vertical ternary tree partition. Two of the three resulting CBs may be leaf CBs respectively labeled 9 and 14 in FIGS. 7 and 8. The remaining, non-leaf CB may be partitioned first into two CBs based on a horizontal binary tree partition. One of the two CBs may be a leaf CB labeled 10. The other of the two CBs may be further partitioned into three CBs based on a vertical ternary tree partition. The resulting three CBs may be leaf CBs respectively labeled 11, 12, and 13 in FIGS. 7 and 8. The leaf CB 9 of FIG. 4 may be partitioned into three CBs based on a horizontal ternary tree partition. Two of the three CBs may be leaf CBs respectively labeled 15 and 19 in FIGS. 7 and 8. The remaining, non-leaf CB may be partitioned into three CBs based on another horizontal ternary tree partition. The resulting three CBs may all be leaf CBs respectively labeled 16, 17, and 18 in FIGS. 7 and 8.
[0085] Altogether, CTB 700 may be partitioned into 20 leaf CBs respectively labeled 0-19. The 20 leaf CBs may correspond to 20 leaf nodes (e.g., 20 leaf nodes of tree 800 shown in FIG. 8). The resulting combination of quadtree and multi-type tree partitioning of the CTB 700 may be scanned using a z-scan (left-to-right, top-to-bottom) to form the sequence order for encoding / decoding the CB leaf nodes. A numeric label of each CB leaf node in FIGS. 7 and 8 may correspond to the sequence order for encoding / decoding, with CB leaf node 0 encoded / decoded first and CB leaf node 19 encoded / decoded last. Although not shown in FIGS. 7 and 8, it should be noted that each CB leaf node may comprise one or more PBs and / or TBs.
[0086] A coding standard / format (e.g., HEVC, VVC, or any other coding standard / format) may define various units (e.g., in addition to specifying various blocks (e.g., CTBs, CBs, PBs, TBs)). Blocks may comprise a rectangular area of samples in a sample array. Units may comprise the collocated blocks of samples from the different sample arrays (e.g., luma and chroma sample arrays) that form a picture as well as syntax elements and prediction data of the blocks. A coding tree unit (CTU) may comprise the collocated CTBs of the different sample arrays and may form a complete entity in an encoded bitstream. A coding unit (CU) may comprise the collocated CBs of the different sample arrays and syntax structures used to code the samples of the CBs. A prediction unit (PU) may comprise the collocated PBs of the different sample arrays and syntax elements used to predict the PBs. A transform unit (TU) may comprise TBs of the different sample arrays and syntax elements used to transform the TBs.
[0087] A block may refer to any of a CTB, CB, PB, TB, CTU, CU, PU, and / or TU (e.g., in the context of HEVC, VVC, or any other coding format / standard). A block may be used to refer to similar data structures in the context of any video coding format / standard / protocol. For example, a block may refer to a macroblock in the AVC standard, a macroblock or a sub-block in the VP8 coding format, a superblock or a sub-block in the VP9 coding format, and / or a superblock or a sub-block in the AV1 coding format.
[0088] In intra prediction, samples of a block to be encoded (e.g., also referred to as a current block) may be predicted from samples in a line of samples immediately adjacent to the current block. For example, the line of samples may include samples of the column immediately adjacent to the left-most column of the current block and samples of the row immediately adjacent to the top-most row of the current block. The samples from the immediately adjacent column and row may be jointly referred to as reference samples. Each sample of the current block may be predicted (e.g., in an intra prediction mode) by projecting the position of the sample in the current block in a given direction to a point along the reference samples. The sample may be predicted by interpolating between the two closest reference samples of the projection point if the projection does not fall directly on a reference sample. A prediction error (e.g., referred to as a residual) may be determined for the current block based on differences between the predicted sample values and the original sample values of the current block.
[0089] Predicting samples and determining a prediction error based on a difference between the predicted samples and original samples may be performed (e.g., at an encoder) for a plurality of different intra prediction modes (e.g., including non-directional intra prediction modes). The encoder may select one of the plurality of intra prediction modes and its corresponding prediction error to encode the current block. The encoder may send an indication of the selected prediction mode and its corresponding prediction error to a decoder for decoding of the current block. The decoder may decode the current block by predicting the samples of the current block, using the intra prediction mode indicated by the encoder, and / or combining the predicted samples with the prediction error.
[0090] FIG. 9 shows an example set of reference samples 902 determined for intra prediction of a current block 904. Current block 904 may correspond to a block being encoded and / or decoded. Current block 904 may correspond to block 3 of partitioned CTB 700 as shown in FIG. 7. As described herein, the numeric labels 0-19 of the blocks of partitioned CTB 700 may correspond to the sequence order for encoding / decoding the blocks and may be used as such in the example of FIG. 9.
[0091] In some embodiments, reference samples 902 may include a line of samples immediately adjacent to current block 904 and include samples from a column and a row immediately adjacent to current block 904. For example, the line of samples may include reference samples to the left and / or above current block 904. In some embodiments, reference samples 902 may be obtained (or selected) from a reference line of multiple reference lines (MRL), which may include a line of samples adjacent to current block 904 and also a line of non-adjacent samples. The MRL may include reference lines identified by corresponding reference line indices that indicate an i-th line of samples adjacent to current block 904 such that the 0-th line indicates the reference line immediately adjacent (or closest) to current block 904 and a higher numbered i-th line indicates a line of samples further away from current block 904. An encoder may select a reference line from a set of MRL and signal an MRL index in the bitstream to indicate the selected reference line. For example, the encoder may signal a codeword encoding the MRL index. The decoder may decode the codeword to determine the MRL index that identifies a specific reference line used in intra prediction of current block 904.
[0092] For current block 904 that is w x h samples in size, reference samples 902 may comprise: 2w samples (or any other quantity of samples) of an i-th row (e.g., indicated by an MRL index) adjacent to the top-most row of current block 904, 2h samples (or any other quantity of samples) of the i-th column adjacent to the left-most column of current block 904, and the top left neighboring corner sample(s) extending from the i-th column and i-th row with respect to current block 904. Current block 904 may be square, such that w = h = s. In other examples, a current block need not be square, such that w ≠ h. Available samples from neighboring blocks of current block 904 may be used for constructing the set of reference samples 902. Samples may not be available for constructing the set of reference samples 902, for example, if the samples lie outside the picture of the current block, the samples are part of a different slice of the current block (e.g., if the concept of slices is used), and / or the samples belong to blocks that have been inter coded and constrained intra prediction is indicated. Intra prediction may not be dependent on inter predicted blocks, for example, if constrained intra prediction is indicated.
[0093] Samples that may not be available for constructing the set of reference samples 902 may comprise samples in blocks that have not already been encoded and reconstructed at an encoder and / or decoded at a decoder based on the sequence order for encoding / decoding. Restriction of such samples from inclusion in the set of reference samples 902 may allow identical prediction results to be determined at both the encoder and decoder. In the example of FIG. 9, samples from neighboring blocks 0, 1, 2, and 8 may be available to construct reference samples 902 given that these blocks are encoded and reconstructed at an encoder and decoded at a decoder prior to coding of current block 904. The samples from neighboring blocks 0, 1, 2, and 8 may be available to construct reference samples 902, for example, if there are no other issues (e.g., as mentioned above) preventing the availability of the samples from the neighboring blocks 0, 1, 2, and 8. The portion of reference samples 902 from neighboring block 6 may not be available due to the sequence order for encoding / decoding (e.g., because the block 6 may not have already been encoded and reconstructed at the encoder and / or decoded at the decoder based on the sequence order for encoding / decoding).
[0094] In some examples, unavailable samples from reference samples 902 may be filled with one or more of the available reference samples 902. For example, an unavailable reference sample may be filled with a nearest available reference sample. The nearest available reference sample may be determined by moving in a clockwise direction through reference samples 902 from the position of the unavailable reference sample. The reference samples 902 may be filled with the mid-value of the dynamic range of the picture being coded, for example, if no reference samples are available.
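As a non-limiting illustration, the filling of unavailable reference samples may be sketched as follows. The sketch assumes the reference samples are stored in a one-dimensional list in clockwise order and that `None` marks an unavailable sample. The function name and the `bit_depth` parameter are hypothetical. The fallback of searching in the opposite direction when no available sample remains in the clockwise direction is also an assumption of the sketch and is not described above.

```python
from typing import List, Optional


def fill_unavailable(ref: List[Optional[int]], bit_depth: int = 8) -> List[int]:
    """Replace unavailable (None) reference samples with available values."""
    n = len(ref)
    if all(sample is None for sample in ref):
        # No reference sample is available: use the mid-value of the dynamic range.
        return [1 << (bit_depth - 1)] * n
    filled = list(ref)
    for i in range(n):
        if ref[i] is None:
            # Move clockwise (increasing index) to the nearest available sample.
            value = next((ref[j] for j in range(i + 1, n) if ref[j] is not None), None)
            if value is None:
                # Assumption: otherwise fall back to the nearest available
                # sample in the opposite direction.
                value = next(ref[j] for j in range(i - 1, -1, -1) if ref[j] is not None)
            filled[i] = value
    return filled


print(fill_unavailable([None, None, 10, 20, None, 30, None]))
print(fill_unavailable([None, None, None], bit_depth=10))
```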
[0095] Samples of current block 904 may be intra predicted based on reference samples 902, for example, based on (e.g., after) determination and (optionally) filtering of reference samples 902. In some examples, a filtering scheme (e.g., a filtering algorithm) may be applied to reference samples 902 to improve prediction accuracy. The filtering scheme may be one of a plurality of filter types including at least: a smoothing filter (or reference sample smoothing filter) or an interpolation filter. In some examples, if reference samples of a given block are to be filtered, only one of the plurality of filter types is selected (e.g., activated) to be applied to the reference samples. For example, if the smoothing filter is selected (e.g., activated), the interpolation filter is not selected (e.g., disabled) or vice versa.
[0096] Many encoders / decoders may support a plurality of intra prediction modes in accordance with one or more video coding standards. For example, HEVC supports 35 intra prediction modes, including a planar mode, a direct current (DC) mode, and 33 angular modes. VVC supports 67 intra prediction modes, including a planar mode, a DC mode, and 65 angular modes. Planar and DC modes may be used to predict smooth and gradually changing regions of a picture. Angular modes may be used to predict directional structures in regions of a picture. Any quantity of intra prediction modes may be supported.
[0097] FIGS. 10A-B show example intra prediction modes. FIG. 10A shows 35 intra prediction modes, such as supported by HEVC. The 35 intra prediction modes may be indicated / identified by indices 0 to 34. Prediction mode 0 may correspond to planar mode. Prediction mode 1 may correspond to DC mode. Prediction modes 2-34 may correspond to angular modes. Prediction modes 2-18 may be referred to as horizontal prediction modes because the principal source of prediction is in the horizontal direction. Prediction modes 19-34 may be referred to as vertical prediction modes because the principal source of prediction is in the vertical direction.
[0098] FIG. 10B shows 67 intra prediction modes, such as supported by VVC. The 67 intra prediction modes may be indicated / identified by indices 0 to 66. Prediction mode 0 may correspond to planar mode. Prediction mode 1 may correspond to DC mode. Prediction modes 2-66 may correspond to angular modes. Prediction modes 2-34 may be referred to as horizontal prediction modes because the principal source of prediction is in the horizontal direction. Prediction modes 35-66 may be referred to as vertical prediction modes because the principal source of prediction is in the vertical direction. Some of the intra prediction modes illustrated in FIG. 10B may be adaptively replaced by wide-angle directions because blocks in VVC need not be squares.
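As a non-limiting illustration, the mode index ranges described for FIGS. 10A-B may be captured in a small classification helper. The function name and the returned strings are hypothetical, and wide-angle replacement of modes is not modeled in this sketch.

```python
def classify_intra_mode(mode: int, num_modes: int = 67) -> str:
    """Classify an intra prediction mode index for a 35-mode or 67-mode set."""
    if num_modes not in (35, 67):
        raise ValueError("this sketch only covers the 35-mode and 67-mode sets")
    if not 0 <= mode < num_modes:
        raise ValueError(f"mode index {mode} is outside 0..{num_modes - 1}")
    if mode == 0:
        return "planar"
    if mode == 1:
        return "dc"
    # Horizontal modes are 2-18 (35-mode set) or 2-34 (67-mode set);
    # the remaining angular modes are vertical.
    last_horizontal = 18 if num_modes == 35 else 34
    return "horizontal" if mode <= last_horizontal else "vertical"


print(classify_intra_mode(18, num_modes=35))
print(classify_intra_mode(35, num_modes=67))
```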
[0099] FIG. 11 shows a current block 904 and corresponding reference samples 902 from FIG. 9. To further describe how intra prediction modes are applied to determine a prediction (e.g., a prediction block) of current block 904, FIG. 11 shows current block 904 and reference samples 902, from a reference line among a set of multiple reference lines (MRL) 908-912, in a two-dimensional x, y plane, where a sample may be referenced as $p[x][y]$. To simplify the prediction process, reference samples 902 may be placed in two, one-dimensional arrays. The reference samples 902 belonging to a reference line $l$ from the set of MRL 908-912, above the current block 904, may be placed in the one-dimensional array $ref_1[x]$:
$$ref_1[x] = p[-l + x][-l], \quad (x \ge 0). \tag{1}$$
The reference samples 902 belonging to reference line $l$, to the left of current block 904, may be placed in the one-dimensional array $ref_2[y]$:
$$ref_2[y] = p[-l][-l + y], \quad (y \ge 0). \tag{2}$$
The variable $l$ represents how many lines away the selected reference line is from current block 904. For example, if reference line #0 908 is selected, then $l$ is set to 1 to indicate the reference line adjacent to current block 904. For example, if reference line #1 910 is selected, then $l$ is set to 2. For example, if reference line #2 912 is selected, then $l$ is set to 3.
[0100] In some examples, if MRL is not activated or selected, then reference samples 902 may be from reference line #0 908 that is immediately adjacent to current block 904. In this example, the variable $l$ in Equations (1) and (2) is set to 1.
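As a non-limiting illustration, the placement of reference samples into two one-dimensional arrays may be sketched as follows, reading Equations (1) and (2) as ref1[x] = p[-l + x][-l] and ref2[y] = p[-l][-l + y], with p[x][y] addressed relative to the top-left sample of the current block. The function name, the row-major picture layout `pic[Y][X]`, and the array lengths (2w + l and 2h + l entries) are assumptions of the sketch. All addressed samples are assumed to be available.

```python
from typing import List, Tuple


def build_reference_arrays(
    pic: List[List[int]], x0: int, y0: int, w: int, h: int, l: int = 1
) -> Tuple[List[int], List[int]]:
    """Collect ref1 (row above) and ref2 (column to the left) for reference line l."""
    height, width = len(pic), len(pic[0])
    if x0 - l < 0 or y0 - l < 0 or x0 + 2 * w > width or y0 + 2 * h > height:
        raise ValueError("reference samples would fall outside the picture")

    def p(x: int, y: int) -> int:
        # p[x][y] relative to the top-left sample (x0, y0) of the current block.
        return pic[y0 + y][x0 + x]

    ref1 = [p(-l + x, -l) for x in range(2 * w + l)]  # Equation (1)
    ref2 = [p(-l, -l + y) for y in range(2 * h + l)]  # Equation (2)
    return ref1, ref2


# Synthetic 16x16 picture whose sample value encodes its position as 100*Y + X.
pic = [[100 * Y + X for X in range(16)] for Y in range(16)]
ref1, ref2 = build_reference_arrays(pic, x0=4, y0=4, w=4, h=4, l=1)
print(ref1)
print(ref2)
```

With l equal to 1, both arrays start at the same top-left corner sample, which is consistent with Equations (1) and (2) sharing the sample p[-l][-l].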
[0101] The prediction process may comprise determination of a predicted sample $p[x][y]$ (e.g., a predicted value) at a location [x][y] in current block 904. For planar mode, a sample at the location [x][y] in current block 904 may be predicted by determining / calculating the mean of two interpolated values. The first of the two interpolated values may be based on a horizontal linear interpolation at the location [x][y] in current block 904. The second of the two interpolated values may be based on a vertical linear interpolation at location [x][y] in current block 904. The predicted sample $p[x][y]$ in current block 904 may be determined / calculated as:
$$p[x][y] = \frac{1}{2 \cdot s} \left( h[x][y] + v[x][y] + s \right), \tag{3}$$
where
$$h[x][y] = (s - x - 1) \cdot ref_2[y] + (x + 1) \cdot ref_1[s] \tag{4}$$
may be the horizontal linear interpolation at the location [x][y] in current block 904 and
$$v[x][y] = (s - y - 1) \cdot ref_1[x] + (y + 1) \cdot ref_2[s] \tag{5}$$
may be the vertical linear interpolation at the location [x][y] in current block 904. $s$ may be equal to a length of a side (e.g., a number of samples on a side) of the current block 904.
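As a non-limiting illustration, planar prediction may be sketched as follows, reading Equations (3) to (5) as p = (h + v + s) / (2s), with h = (s - x - 1) * ref2[y] + (x + 1) * ref1[s] and v = (s - y - 1) * ref1[x] + (y + 1) * ref2[s]. The function name and the use of integer floor division are assumptions of the sketch.

```python
from typing import List


def planar_predict(ref1: List[int], ref2: List[int], s: int) -> List[List[int]]:
    """Return the s x s planar prediction, indexed as pred[y][x]."""
    pred = [[0] * s for _ in range(s)]
    for y in range(s):
        for x in range(s):
            h = (s - x - 1) * ref2[y] + (x + 1) * ref1[s]  # horizontal interpolation
            v = (s - y - 1) * ref1[x] + (y + 1) * ref2[s]  # vertical interpolation
            # The "+ s" term rounds the mean when integer division is used.
            pred[y][x] = (h + v + s) // (2 * s)
    return pred


flat = planar_predict([100] * 5, [100] * 5, s=4)
print(flat)
```

A flat set of reference samples yields a flat prediction, which is a quick sanity check that the two interpolation weights in each of h and v sum to s.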
[0102] For DC mode, a sample at a location [x][y] in current block 904 may be predicted by the mean of the reference samples 902. The predicted sample $p[x][y]$ in current block 904 may be determined / calculated as:
$$p[x][y] = \frac{1}{2 \cdot s} \left( \sum_{x=0}^{s-1} ref_1[x] + \sum_{y=0}^{s-1} ref_2[y] \right). \tag{6}$$
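As a non-limiting illustration, DC prediction may be sketched as the mean of the first s entries of each reference array, following the reading of Equation (6) as the sum of both arrays divided by 2s. The function name and the rounding offset added before the integer division are assumptions of the sketch.

```python
from typing import List


def dc_predict(ref1: List[int], ref2: List[int], s: int) -> List[List[int]]:
    """Return an s x s block filled with the mean of the reference samples."""
    total = sum(ref1[:s]) + sum(ref2[:s])
    dc = (total + s) // (2 * s)  # assumption: round to nearest integer
    return [[dc] * s for _ in range(s)]


block = dc_predict([10, 20, 30, 40], [50, 60, 70, 80], s=4)
print(block[0])
```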
[0103] For angular modes, a sample at a location [x][y] in current block 904 may be predicted by projecting the location [x][y] in a direction specified by a given angular mode to a point on the horizontal or vertical line of samples comprising reference samples 902. The sample at the location [x][y] may be predicted by interpolating between the two closest reference samples of the projection point if the projection does not fall directly on a reference sample. The direction specified by the angular mode may be given by an angle $\varphi$ defined relative to the y-axis for vertical prediction modes (e.g., modes 19-34 in HEVC and modes 35-66 in VVC). The direction specified by the angular mode may be given by an angle $\varphi$ defined relative to the x-axis for horizontal prediction modes (e.g., modes 2-18 in HEVC and modes 2-34 in VVC).
[0104] FIG. 12 shows an example of applying an intra prediction mode (e.g., an angular mode such as vertical prediction mode 906) for prediction of a current block 904. FIG. 12 specifically shows prediction of a sample at a location [x][y] in current block 904 for a vertical prediction mode 906. Vertical prediction mode 906 may be given by an angle $\varphi$ with respect to the vertical axis. The location [x][y] in current block 904, in vertical prediction modes, may be projected to a point (e.g., referred to as a projection point) on the horizontal line of reference samples $ref_1[x]$. The reference samples 902 are only partially shown in FIG. 12 and shown as being from a reference line with reference line index of 0 for ease of illustration. Reference samples 902 may be from another reference line of the set of MRL, as explained with respect to FIG. 9. As shown in FIG. 12, the projection point on the horizontal line of reference samples $ref_1[x]$ may not be exactly on a reference sample. A predicted sample $p[x][y]$ in current block 904 may be determined / calculated by linearly interpolating between the two reference samples, for example, if the projection point falls at a fractional sample position between two reference samples. The predicted sample $p[x][y]$ may be determined / calculated as:
$$p[x][y] = (1 - i_f) \cdot ref_1[x + i_i + 1] + i_f \cdot ref_1[x + i_i + 2]. \tag{7}$$
$i_i$ may be the integer part of the horizontal displacement of the projection point relative to the location [x][y]. $i_i$ may be determined / calculated as a function of the tangent of the angle $\varphi$ of the vertical prediction mode 906 as:
$$i_i = \lfloor (y + 1) \cdot \tan \varphi \rfloor. \tag{8}$$
$i_f$ may be the fractional part of the horizontal displacement of the projection point relative to the location [x][y] and may be determined / calculated as:
$$i_f = ((y + 1) \cdot \tan \varphi) - \lfloor (y + 1) \cdot \tan \varphi \rfloor, \tag{9}$$
where $\lfloor \cdot \rfloor$ is the integer floor function.
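As a non-limiting illustration, prediction for a vertical prediction mode may be sketched with floating-point arithmetic, reading Equations (7) to (9) as a two-sample linear interpolation in ref1 at integer offset floor((y + 1) * tan(phi)) with the remaining fraction as the weight. The function name is hypothetical. The sketch assumes a non-negative angle given in radians and a ref1 array long enough to cover every projected position.

```python
import math
from typing import List


def angular_vertical_predict(ref1: List[float], w: int, h: int, phi: float) -> List[List[float]]:
    """Return the w x h prediction, indexed as pred[y][x], for a vertical mode."""
    tan_phi = math.tan(phi)
    pred = [[0.0] * w for _ in range(h)]
    for y in range(h):
        displacement = (y + 1) * tan_phi
        i_i = math.floor(displacement)  # integer part, Equation (8)
        i_f = displacement - i_i        # fractional part, Equation (9)
        for x in range(w):
            a = ref1[x + i_i + 1]
            b = ref1[x + i_i + 2]
            pred[y][x] = (1 - i_f) * a + i_f * b  # Equation (7)
    return pred


ramp = [10.0 * k for k in range(16)]
print(angular_vertical_predict(ramp, 4, 4, 0.0)[0])
print(angular_vertical_predict(ramp, 4, 4, math.atan(0.5))[0])
```

With an angle of zero every row copies the reference samples directly above it, which corresponds to the purely vertical direction.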
[0105] For horizontal prediction modes, a location [x][y] of a sample in current block 904 may be projected onto the vertical line of reference samples $ref_2[y]$. A predicted sample $p[x][y]$ for horizontal prediction modes may be determined / calculated as:
$$p[x][y] = (1 - i_f) \cdot ref_2[y + i_i + 1] + i_f \cdot ref_2[y + i_i + 2]. \tag{10}$$
$i_i$ may be the integer part of the vertical displacement of the projection point relative to the location [x][y]. $i_i$ may be determined / calculated as a function of the tangent of the angle $\varphi$ of the horizontal prediction mode as:
$$i_i = \lfloor (x + 1) \cdot \tan \varphi \rfloor. \tag{11}$$
$i_f$ may be the fractional part of the vertical displacement of the projection point relative to the location [x][y]. $i_f$ may be determined / calculated as:
$$i_f = ((x + 1) \cdot \tan \varphi) - \lfloor (x + 1) \cdot \tan \varphi \rfloor, \tag{12}$$
where $\lfloor \cdot \rfloor$ is the integer floor function.
[0106] The interpolation functions given by Equations (7) and (10) may be implemented by an encoder and / or a decoder (e.g., encoder 200 in FIG. 2 and / or decoder 300 in FIG. 3). The interpolation functions may be implemented by finite impulse response (FIR) filters. For example, the interpolation functions may be implemented as a set of two-tap FIR filters. The coefficients of the two-tap FIR filters may be respectively given by $(1 - i_f)$ and $i_f$. The predicted sample $p[x][y]$, in angular intra prediction, may be calculated with some predefined level of sample accuracy (e.g., 1 / 32 sample accuracy, or accuracy defined by any other metric). For 1 / 32 sample accuracy, the set of two-tap FIR interpolation filters may comprise up to 32 different two-tap FIR interpolation filters, one for each of the 32 possible values of the fractional part of the projected displacement $i_f$. In other examples, different levels of sample accuracy may be used.
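As a non-limiting illustration, a set of 32 two-tap filters for 1 / 32 sample accuracy may be sketched with integer coefficients, where the pair (32 - f, f) stands for the weights (1 - f / 32) and f / 32. The function names, the `precision_bits` parameter, and the rounding offset are assumptions of the sketch.

```python
from typing import List, Tuple


def two_tap_filter_bank(precision_bits: int = 5) -> List[Tuple[int, int]]:
    """Return one (c0, c1) coefficient pair per fractional position."""
    n = 1 << precision_bits  # 32 positions for 1/32 sample accuracy
    return [(n - f, f) for f in range(n)]


def interpolate(a: int, b: int, frac: int, bank: List[Tuple[int, int]]) -> int:
    """Interpolate between samples a and b at fractional position frac."""
    c0, c1 = bank[frac]
    n = c0 + c1
    # Assumption: add half of the normalization factor to round to nearest.
    return (c0 * a + c1 * b + n // 2) // n


bank = two_tap_filter_bank()
print(len(bank), bank[0], bank[16])
print(interpolate(100, 132, 8, bank))
```

Storing the coefficient pairs in a table indexed by the fractional position avoids recomputing the weights for every predicted sample.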
[0107] In some examples, the FIR filters may be used for predicting chroma samples and / or luma samples. For example, the two-tap interpolation FIR filter may be used for predicting chroma samples and a same and / or a different interpolation technique / filter may be used for luma samples. For example, a four-tap FIR filter may be used to determine a predicted value of a luma sample. Coefficients of the four-tap FIR filter may be determined based on $i_f$ (e.g., similar to the two-tap FIR filter). For 1 / 32 sample accuracy, the set of four-tap FIR filters may comprise up to 32 different four-tap FIR filters, one for each of the 32 possible values of the fractional part of the projected displacement $i_f$. In other examples, different levels of sample accuracy may be used. The set of four-tap FIR filters may be stored in a look-up table (LUT) and referenced based on $i_f$. A predicted sample $p[x][y]$, for vertical prediction modes, may be determined based on the four-tap FIR filter as:
$$p[x][y] = \sum_{i=0}^{3} fT[i] \cdot ref_1[x + iIdx + i], \tag{13}$$
where $fT[i]$, $i = 0 \ldots 3$, may be the filter coefficients, and $iIdx$ is the integer displacement. A predicted sample $p[x][y]$, for horizontal prediction modes, may be determined based on the four-tap FIR filter as:
$$p[x][y] = \sum_{i=0}^{3} fT[i] \cdot ref_2[y + iIdx + i]. \tag{14}$$
[0108] Supplementary reference samples may be determined / constructed if the location [x][y] of a sample in current block 904 to be predicted is projected to a negative x coordinate. The location [x][y] of a sample may be projected to a negative x coordinate, for example, if negative vertical prediction angles $\varphi$ are used. The supplementary reference samples may be determined / constructed by projecting the reference samples in $ref_2[y]$ in the vertical line of reference samples 902 to the horizontal line of reference samples 902 using the negative vertical prediction angle $\varphi$. Supplementary reference samples may be similarly determined / constructed, for example, if the location [x][y] of a sample in current block 904 to be predicted is projected to a negative y coordinate. The location [x][y] of a sample may be projected to a negative y coordinate, for example, if negative horizontal prediction angles $\varphi$ are used. The supplementary reference samples may be determined / constructed by projecting the reference samples in $ref_1[x]$ on the horizontal line of reference samples 902 to the vertical line of reference samples 902 using the negative horizontal prediction angle $\varphi$.
[0109] An encoder may determine / predict samples of a current block being encoded (e.g., current block 904) for a plurality of intra prediction modes (e.g., using one or more of the functions described herein). For example, an encoder may determine / predict samples of a current block for each of 35 intra prediction modes in HEVC and / or 67 intra prediction modes in VVC and / or extended intra prediction modes from WAIP for rectangular blocks. The encoder may determine, for each intra prediction mode applied, a corresponding prediction error for the current block based on a difference (e.g., sum of squared differences (SSD), sum of absolute differences (SAD), or sum of absolute transformed differences (SATD)) between the prediction samples, generated from reference samples 902 of a reference line (e.g., from a set of MRL), determined for the intra prediction mode and the original samples of the current block. The encoder may determine / select one of the intra prediction modes to encode the current block based on the determined prediction errors. For example, the encoder may determine / select one of the intra prediction modes that results in the smallest prediction error for the current block. In some examples, the encoder may determine / select the intra prediction mode and the associated reference line to encode the current block based on a rate-distortion measure (e.g., Lagrangian rate-distortion cost) determined using the prediction errors. The encoder may signal, in the bitstream to a decoder for decoding of the current block, an indication of the determined / selected intra prediction mode and an indication of the associated MRL index (which may indicate a reference line index). The encoder may also signal in the bitstream to the decoder a corresponding prediction error (e.g., residual) of the intra prediction mode.
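The mode selection described above may be sketched as follows. This is a minimal Python illustration, assuming the per-mode prediction blocks have already been generated; the mode names, the sample values, and the helper names are hypothetical, and a practical encoder may instead use SATD or a Lagrangian rate-distortion cost.

```python
def sad(pred, orig):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - o) for rp, ro in zip(pred, orig) for p, o in zip(rp, ro))

def ssd(pred, orig):
    """Sum of squared differences between two equally sized blocks."""
    return sum((p - o) ** 2 for rp, ro in zip(pred, orig) for p, o in zip(rp, ro))

def select_mode(original, predictions, metric=sad):
    """Return (mode, error) for the mode with the smallest prediction error.

    predictions maps a mode label to its predicted block.
    """
    errors = {mode: metric(pred, original) for mode, pred in predictions.items()}
    best = min(errors, key=errors.get)
    return best, errors[best]

# Hypothetical 2x2 block and three candidate predictions.
original = [[10, 12], [11, 13]]
predictions = {
    "dc": [[11, 11], [11, 11]],
    "vertical": [[10, 12], [10, 12]],
    "horizontal": [[10, 10], [11, 11]],
}
best_mode, best_error = select_mode(original, predictions)
```

In this toy case the vertical mode gives the smallest SAD, so it would be the mode indicated in the bitstream together with the residual.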
[0110] A decoder may determine / predict samples of a current block being decoded (e.g., current block 904) for an intra prediction mode. For example, a decoder may receive an indication of a reference line (e.g., a reference line index or an MRL index associated with the reference line index) and an intra prediction mode (e.g., an angular intra prediction mode) from an encoder for a current block. The decoder may retrieve a set of reference samples and perform intra prediction based on the MRL index and the intra prediction mode indicated by the encoder for the current block in a similar manner (e.g., as described above for the encoder). For example, the decoder may obtain the reference samples from a reference line indicated / identified by the decoded MRL index. In some examples, when MRL is not enabled / activated / selected, the reference line has reference line index 0 and is immediately adjacent to the current block. In these examples, no indication of MRL index is signaled.
[0111] The decoder may add predicted values of the samples (e.g., determined based on the intra prediction mode) of the current block to a residual of the current block to reconstruct the current block. In some examples, a decoder need not receive an indication of an angular intra prediction mode from an encoder for a current block. Instead, the decoder may determine an intra prediction mode through other decoder-side means (e.g., by applying a template-based intra mode derivation (TIMD) tool / technique).
[0112] While various examples herein correspond to intra prediction modes in HEVC and VVC, the methods, devices, and systems as described herein may be applied to / used for other intra prediction modes (e.g., as used in other video coding standards / formats, such as VP8, VP9, AV1, etc.).
[0113] Intra prediction may exploit correlations between spatially neighboring samples in the same picture of a video sequence to perform video compression. Inter prediction is another coding tool that may be used to perform video compression. Inter prediction may exploit correlations in the time domain between blocks of samples in different pictures of a video sequence. For example, an object may be seen across multiple pictures of a video sequence. The object may move (e.g., by some translation and / or affine motion) or remain stationary across the multiple pictures. A current block of samples in a current picture being encoded may have / be associated with a corresponding block of samples in a previously decoded picture. The corresponding block of samples may accurately predict the current block of samples. The corresponding block of samples may be displaced from the current block of samples, for example, due to movement of the object, represented in both blocks, across the respective pictures of the blocks. The previously decoded picture may be a reference picture. The corresponding block of samples in the reference picture may be a reference block for motion compensated prediction. An encoder may use a block matching technique to estimate the displacement (or motion) of the object and / or to determine the reference block in the reference picture.
[0114] Similar to intra prediction, an encoder may determine a difference between a current block and a prediction for a current block. An encoder may determine a difference, for example, based on / after determining / generating a prediction for a current block (e.g., using inter prediction). The difference may be a prediction error (e.g., a residual). The encoder may store and / or send (e.g., signal), in / via a bitstream, the prediction error and / or other related prediction information. The prediction error and / or other related prediction information may be used for decoding and / or other forms of consumption. A decoder may decode the current block by predicting the samples of the current block (e.g., by using the related prediction information) and combining the predicted samples with the prediction error.
[0115] FIG. 13A shows an example of inter prediction. The inter prediction may be performed for a current block 1300 in a current picture 1302 being encoded. An encoder (e.g., encoder 200 as shown in FIG. 2) may perform inter prediction to determine and / or generate a reference block 1304 in a reference picture 1306. Reference block 1304 may be used to predict the current block 1300. Reference pictures (e.g., reference picture 1306) may be prior decoded pictures available at the encoder and / or a decoder. Availability of a prior decoded picture may depend / be based on whether the prior decoded picture is available in a decoded picture buffer at the time current block 1300 is being encoded and / or decoded. The encoder may search the one or more reference pictures 1306 for a block (e.g., a candidate reference block) that is similar (or substantially similar) to current block 1300. The encoder may determine the best matching block from the blocks (e.g., candidate reference blocks) tested during the searching process. The best matching block may be a reference block 1304. The encoder may determine that reference block 1304 is the best matching reference block based on one or more cost criteria. The one or more cost criteria may comprise a rate-distortion criterion (e.g., Lagrangian rate-distortion cost). The one or more cost criteria may be based on a difference (e.g., SSD, SAD, and / or SATD) between prediction samples of reference block 1304 and original samples of current block 1300.
[0116] The encoder may search for reference block 1304 within a reference region (e.g., a search range 1308). The reference region (e.g., a search range 1308) may be positioned around a collocated block (or position) 1310, of current block 1300, in reference picture 1306. Collocated block 1310 may have a same position in the reference picture 1306 as the current block 1300 in the current picture 1302. The reference region (e.g., search range 1308) may at least partially extend outside of reference picture 1306. Constant boundary extension may be used, for example, if the reference region (e.g., search range 1308) extends outside of reference picture 1306. The constant boundary extension may be used such that values of the samples in a row or a column of reference picture 1306, immediately adjacent to a portion of the reference region (e.g., search range 1308) extending outside of reference picture 1306, may be used for sample locations outside of reference picture 1306. A subset of potential positions, or all potential positions, within the reference region (e.g., search range 1308) may be searched for reference block 1304. The encoder may utilize one or more search implementations to determine and / or generate the reference block 1304. For example, the encoder may determine a set of candidate search positions based on motion information of neighboring blocks (e.g., a motion vector 1312) to the current block 1300.
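A search of all positions in a reference region, with constant boundary extension for positions outside the reference picture, may be sketched as below. This Python example is only one possible search implementation under simplifying assumptions: it uses integer displacements, a SAD cost, square blocks, and a search range centered on the collocated position. All function names and the toy pictures are hypothetical.

```python
def sample(picture, x, y):
    """Read a sample, clamping coordinates (constant boundary extension)."""
    height, width = len(picture), len(picture[0])
    return picture[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

def block_sad(cur, ref, cx, cy, dx, dy, size):
    """SAD between the current block and the reference block displaced by (dx, dy)."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[cy + j][cx + i] - sample(ref, cx + dx + i, cy + dy + j))
    return total

def full_search(cur, ref, cx, cy, size, search_range):
    """Return (cost, dx, dy) of the lowest-SAD candidate in the search range."""
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            cost = block_sad(cur, ref, cx, cy, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# Toy pictures: the current picture is the reference content shifted by (+1, -2).
ref = [[16 * y + x for x in range(8)] for y in range(8)]
cur = [[sample(ref, x + 1, y - 2) for x in range(8)] for y in range(8)]
best = full_search(cur, ref, cx=3, cy=3, size=2, search_range=2)
```

Here the search recovers the displacement (1, -2) with zero cost. Searching only a subset of positions (e.g., positions seeded from neighboring motion vectors) would replace the two nested loops in `full_search` with an iteration over that subset.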
[0117] One or more reference pictures may be searched by the encoder during inter prediction to determine and / or generate the best matching reference block. The reference pictures searched by the encoder may be included in (e.g., added to) one or more reference picture lists. For example, in HEVC and VVC (and / or in one or more other communication protocols), two reference picture lists may be used (e.g., a reference picture list 0 and a reference picture list 1). A reference picture list may include one or more pictures. The reference picture 1306 of reference block 1304 may be indicated by a reference index pointing into a reference picture list comprising reference picture 1306.
[0118] FIG. 13B shows an example motion vector. A displacement between reference block 1304 and current block 1300 may be interpreted as an estimate of the motion between reference block 1304 and current block 1300 across their respective pictures. The displacement may be represented by a motion vector 1312. For example, motion vector 1312 may be indicated by a horizontal component (MVx) and a vertical component (MVy) relative to the position of current block 1300. A motion vector (e.g., motion vector 1312) may have fractional or integer resolution. A motion vector with fractional resolution may point between two samples in a reference picture to provide a better estimation of the motion of current block 1300. For example, a motion vector may have 1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 32, or any other fractional sample resolution. Interpolation between the two samples at integer positions may be used to generate a reference block and its corresponding samples at fractional positions, for example, if a motion vector points to a non-integer sample value in the reference picture. The interpolation may be performed by a filter with two or more taps.
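The handling of a fractional motion vector component may be sketched as follows. This Python example assumes the component is stored as an integer in units of 1 / 4 sample (two fractional bits) and uses a two-tap linear filter in one dimension; both choices, and the function names, are assumptions for illustration. Codecs typically use longer interpolation filters.

```python
def split_mv(mv, frac_bits=2):
    """Split a motion vector component into integer and fractional parts.

    mv is stored in units of 1 / (2 ** frac_bits) sample. The shift floors
    toward negative infinity, so the fractional part is never negative.
    """
    return mv >> frac_bits, mv & ((1 << frac_bits) - 1)

def interp_row(row, x, mv_x, frac_bits=2):
    """Two-tap (linear) interpolation of the sample that mv_x points to from x."""
    int_part, frac = split_mv(mv_x, frac_bits)
    a = row[x + int_part]
    if frac == 0:
        return float(a)  # integer position: no interpolation needed
    w = frac / (1 << frac_bits)
    b = row[x + int_part + 1]
    return (1 - w) * a + w * b

row = [0, 10, 20, 30, 40]
value = interp_row(row, x=1, mv_x=5)  # 5 quarter-samples = 1.25 samples
```

A displacement of 1.25 samples from position 1 lands a quarter of the way between row[2] and row[3].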
[0119] The encoder may determine a difference (e.g., a corresponding sample-by-sample difference) between reference block 1304 and current block 1300. The encoder may determine the difference between reference block 1304 and current block 1300, for example, based on / after reference block 1304 is determined and / or generated, using inter prediction, for current block 1300. The difference may be a prediction error (e.g., a residual). The encoder may store and / or send (e.g., signal), in / via a bitstream, the prediction error and / or related motion information. The prediction error and / or the related motion information may be used for decoding (e.g., decoding current block 1300) and / or other forms of consumption. The motion information may comprise the motion vector 1312 and a reference indicator / index. The reference indicator may indicate the reference picture 1306 in a reference picture list. In other examples, the motion information may comprise an indication of motion vector 1312 and / or an indication of the reference indicator / index. The reference indicator may indicate reference picture 1306 in the reference picture list comprising reference picture 1306. A decoder may decode current block 1300 by determining and / or generating the reference block 1304, which may correspond to / form (e.g., be considered as) a prediction of the current block 1300. The decoder may determine and / or generate the reference block 1304, for example, based on the related motion information. The decoder may decode current block 1300 based on combining the prediction (e.g., a reference block) with the prediction error (e.g., a residual block).
[0120] Inter prediction, as shown in FIG. 13A, may be performed using one reference picture 1306 as a source of a prediction for current block 1300. Inter prediction based on a prediction of a current block using a single picture may be referred to as uni-prediction.
[0121] Inter prediction of a current block, using bi-prediction, may be based on two pictures (e.g., the source of prediction may be from the two pictures). Bi-prediction may be useful, for example, if a video sequence comprises fast motion, camera panning, zooming, and / or scene changes. Bi-prediction also may be useful to capture fade outs of one scene or fade outs from one scene to another, where two pictures may effectively be displayed simultaneously with different levels of intensity.
[0122] One or both of uni-prediction and bi-prediction may be available / used for performing inter prediction (e.g., at an encoder and / or at a decoder). Performing a specific type of inter prediction (e.g., uni-prediction and / or bi-prediction) may depend on a slice type of the current block. For example, for P slices, only uni-prediction may be available / used for performing inter prediction. For B slices, either uni-prediction or bi-prediction may be available / used for performing inter prediction. An encoder may determine and / or generate a reference block, for predicting a current block, from a reference picture list 0, for example, if the encoder is using uni-prediction. An encoder may determine and / or generate a first reference block, for predicting a current block, from a reference picture list 0 and determine and / or generate a second reference block, for predicting the current block, from a reference picture list 1, for example, if the encoder is using bi-prediction.
[0123] FIG. 14 shows an example of bi-prediction. Two reference blocks 1402 and 1404 may be used to predict a current block 1400. Reference block 1402 may be in a reference picture of one of reference picture list 0 or reference picture list 1. Reference block 1404 may be in a reference picture of another one of reference picture list 0 or reference picture list 1. As shown in FIG. 14, reference block 1402 may be in a first picture that precedes (e.g., in time) a current picture of current block 1400, and the reference block 1404 may be in a second picture that succeeds (e.g., in time) the current picture of current block 1400. The first picture may precede the current picture in terms of a picture order count (POC). The second picture may succeed the current picture in terms of the POC. In other examples, the reference pictures may both precede or both succeed the current picture in terms of POC. A POC may be / indicate an order in which pictures are output (e.g., from a decoded picture buffer). A POC may be / indicate an order in which pictures are generally intended to be displayed. Pictures that are output may not necessarily be displayed but may undergo different processing and / or consumption (e.g., transcoding). The two reference blocks determined and / or generated using / for bi-prediction may correspond to (e.g., be comprised in) a same reference picture. The reference picture may be included in both the reference picture list 0 and the reference picture list 1, for example, if the two reference blocks correspond to the same reference picture.
[0124] A configurable weight and / or offset value may be applied to one or more inter prediction reference blocks. An encoder may enable the use of weighted prediction using a flag in a picture parameter set (PPS). The encoder may send / signal the weight and / or offset parameters in a slice segment header for current block 1400. Different weight and / or offset parameters may be sent / signaled for luma and / or chroma components.
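One way weights and an offset could be applied to two reference blocks is sketched below in Python. The linear combination shown, the floating-point arithmetic, and the parameter names are assumptions for illustration only; the text above does not fix a formula, and practical codecs use integer weights with rounding and clipping.

```python
def bi_predict(block0, block1, w0=0.5, w1=0.5, offset=0):
    """Combine two reference blocks sample by sample: w0 * p0 + w1 * p1 + offset."""
    return [
        [w0 * a + w1 * b + offset for a, b in zip(row0, row1)]
        for row0, row1 in zip(block0, block1)
    ]

block0 = [[100, 120]]  # hypothetical samples of a list 0 reference block
block1 = [[60, 80]]    # hypothetical samples of a list 1 reference block

average = bi_predict(block0, block1)                      # default equal weights
faded = bi_predict(block0, block1, w0=0.75, w1=0.25, offset=2)
```

Unequal weights of this kind may model a fade, where one picture contributes more intensity than the other.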
[0125] The encoder may determine and / or generate the reference blocks 1402 and 1404 for the current block 1400 using inter prediction. The encoder may determine a difference between current block 1400 and each of reference blocks 1402 and 1404. The differences may be prediction errors or residuals. The encoder may store and / or send / signal, in / via a bitstream, the prediction errors and / or their respective related motion information. The prediction errors and their respective related motion information may be used for decoding and / or other forms of consumption.
[0126] The motion information for reference block 1402 may comprise a motion vector 1406 and / or a reference indicator / index. The reference indicator may indicate a reference picture, of the reference block 1402, in a reference picture list. In some examples, the motion information for reference block 1402 may comprise an indication of motion vector 1406 and / or an indication of the reference index. The reference index may indicate the reference picture, of reference block 1402, in the reference picture list.
[0127] The motion information for reference block 1404 may comprise a motion vector 1408 and / or a reference index / indicator. The reference indicator may indicate a reference picture, of the reference block 1404, in a reference picture list. The motion information for reference block 1404 may comprise an indication of motion vector 1408 and / or an indication of the reference index. The reference index may indicate the reference picture, of the reference block 1404, in the reference picture list.
[0128] A decoder may decode current block 1400 by determining and / or generating the reference blocks 1402 and 1404. The decoder may determine and / or generate the reference blocks 1402 and 1404, for example, based on the respective related motion information for the reference blocks 1402 and 1404. The reference blocks 1402 and 1404 may correspond to / form (e.g., be considered as) the prediction (e.g., used to generate a prediction block) of the current block 1400. The decoder may decode the current block 1400 based on combining the prediction with the prediction errors.
[0129] Motion information may be predictively coded, for example, before being stored and / or sent / signaled in / via a bitstream (e.g., in HEVC, VVC, and / or other video coding standards / formats / protocols). The motion information for a current block may be predictively coded based on motion information of one or more blocks neighboring the current block. The motion information of the neighboring block(s) may often correlate with the motion information of the current block because the motion of an object represented in the current block is often the same as (or similar to) the motion of objects in the neighboring block(s). Motion information prediction techniques (such as those in HEVC and VVC) may comprise advanced motion vector prediction (AMVP) and / or inter prediction block merging (e.g., merge mode).
[0130] An encoder (e.g., encoder 200 as shown in FIG. 2) may code a motion vector. The encoder may code the motion vector (e.g., using AMVP) as a difference between a motion vector of a current block being coded and a motion vector predictor (MVP). An encoder may determine / select the MVP from a list of candidate MVPs. The candidate MVPs may be / correspond to previously decoded motion vectors of neighboring blocks in the current picture of the current block, and / or blocks at or near the collocated position of the current block in other reference pictures. The encoder and / or a decoder may reciprocally generate and / or determine the list of candidate MVPs.
[0131] The encoder may determine / select an MVP from the list of candidate MVPs. Then, the encoder may send / signal, in / via a bitstream, an indication of the selected MVP and / or a motion vector difference (MVD). The encoder may indicate the selected MVP in the bitstream using an index / indicator. The index may indicate the selected MVP in the list of candidate MVPs. The MVD may be determined / calculated based on a difference between the motion vector of the current block and the selected MVP. For example, for a motion vector (e.g., comprising a horizontal component (MVx) and a vertical component (MVy)) that indicates a position relative to a position of the current block being coded, the MVD may be represented by two components MVDx and MVDy. MVDx and MVDy may be determined / calculated as:
$$\mathit{MVD}_x = \mathit{MV}_x - \mathit{MVP}_x, \qquad (15)$$
$$\mathit{MVD}_y = \mathit{MV}_y - \mathit{MVP}_y. \qquad (16)$$
MVDx and MVDy may respectively represent horizontal and vertical components of the MVD. MVPx and MVPy may respectively represent horizontal and vertical components of the MVP.
[0132] A decoder (e.g., decoder 300 as shown in FIG. 3) may decode the motion vector by adding the MVD to the MVP indicated in / via the bitstream. The decoder may decode the current block by determining and / or generating the reference block. The decoder may determine and / or generate the reference block, for example, based on the decoded motion vector. The reference block may correspond to / form (e.g., be considered as) the prediction of the current block (e.g., a prediction block). The decoder may decode the current block by combining the prediction with the prediction error.
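Equations (15) and (16), together with the decoder-side addition described above, may be sketched as a round trip in Python. The `Vec` type, the function names, and the rule used to pick the MVP (smallest sum of absolute MVD components, as a stand-in for signaling cost) are assumptions for this example.

```python
from typing import NamedTuple

class Vec(NamedTuple):
    x: int
    y: int

def encode_mvd(mv, mvp):
    """Equations (15) and (16): MVD = MV - MVP, per component."""
    return Vec(mv.x - mvp.x, mv.y - mvp.y)

def decode_mv(mvd, mvp):
    """Decoder side: MV = MVD + MVP, per component."""
    return Vec(mvd.x + mvp.x, mvd.y + mvp.y)

def choose_mvp(mv, candidates):
    """Return the index of the candidate giving the smallest |MVDx| + |MVDy|.

    This cost is a hypothetical proxy for the bits needed to signal the MVD.
    """
    costs = [abs(mv.x - c.x) + abs(mv.y - c.y) for c in candidates]
    return costs.index(min(costs))

candidates = [Vec(0, 0), Vec(8, -3)]  # list built identically at encoder and decoder
mv = Vec(9, -4)

mvp_index = choose_mvp(mv, candidates)          # signaled in the bitstream
mvd = encode_mvd(mv, candidates[mvp_index])     # signaled in the bitstream
decoded = decode_mv(mvd, candidates[mvp_index])
```

Only the index and the small difference (1, -1) would need to be signaled, and the decoder recovers the original motion vector exactly.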
[0133] The list of candidate MVPs (e.g., in HEVC, VVC, and / or one or more other communication protocols), for AMVP, may comprise two or more candidates (e.g., candidates A and B). Candidates A and B may comprise: up to two (or any other quantity of) spatial candidate MVPs determined / derived from five (or any other quantity of) spatial neighboring blocks of a current block being coded; one (or any other quantity of) temporal candidate MVP determined / derived from two (or any other quantity of) temporal, co-located blocks (e.g., if both of the two spatial candidate MVPs are not available or are identical); and / or zero motion vector candidate MVPs (e.g., if one or both of the spatial candidate MVPs or temporal candidate MVPs are not available). Other quantities of spatial candidate MVPs, spatial neighboring blocks, temporal candidate MVPs, and / or temporal, co-located blocks may be used for the list of candidate MVPs.
[0134] FIG. 15A shows example spatial candidate neighboring blocks for a current block. For example, five (or any other quantity of) spatial candidate neighboring blocks may be located relative to a current block 1500 being encoded. The five spatial candidate neighboring blocks may be A0, A1, B0, B1, and B2. FIG. 15B shows temporal, co-located blocks for the current block. For example, two (or any other quantity of) temporal, co-located blocks may be located relative to current block 1500 being coded. The two temporal, co-located blocks may be C0 and C1. The two temporal, co-located blocks may be in one or more reference pictures that may be different from the current picture of current block 1500.
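The candidate ordering described above (spatial candidates first, a temporal candidate if the list is not yet full, then zero motion vectors) may be sketched in Python as follows. This is a simplified illustration, not the derivation of any standard: it scans the neighbors in a single pass, omits motion vector scaling, and represents an unavailable block as `None`. The function name and data layout are assumptions.

```python
def build_amvp_list(spatial_neighbors, temporal_blocks, list_size=2):
    """Build a list of candidate MVPs.

    spatial_neighbors, temporal_blocks: motion vectors as (x, y) tuples,
    or None where a block is unavailable.
    """
    candidates = []
    # Spatial candidates: skip unavailable neighbors and identical vectors.
    for mv in spatial_neighbors:
        if len(candidates) == list_size:
            break
        if mv is not None and mv not in candidates:
            candidates.append(mv)
    # Temporal candidate: used only if the spatial candidates left a gap.
    if len(candidates) < list_size:
        temporal = next((mv for mv in temporal_blocks if mv is not None), None)
        if temporal is not None:
            candidates.append(temporal)
    # Zero motion vectors fill any remaining entries.
    while len(candidates) < list_size:
        candidates.append((0, 0))
    return candidates

full = build_amvp_list([None, (4, 0), (4, 0), None, (1, 2)], [(7, 7), None])
with_temporal = build_amvp_list([None, (4, 0), (4, 0), None, None], [None, (7, 7)])
all_zero = build_amvp_list([None] * 5, [None, None])
```

Because the list is built from previously decoded information only, an encoder and a decoder running the same procedure arrive at the same list, which is what allows a short index to identify the selected MVP.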
[0135] An encoder (e.g., encoder 200 as shown in FIG. 2) may code a motion vector using inter prediction block merging (e.g., a merge mode). For example, the encoder (e.g., using merge mode) may reuse the same motion information of a neighboring block (e.g., one of neighboring blocks A0, A1, B0, B1, and B2) for inter prediction of a current block. For example, the encoder (e.g., using merge mode) may reuse the same motion information of a temporal, co-located block (e.g., one of temporal, co-located blocks C0 and C1) for inter prediction of a current block. An MVD need not be sent (e.g., indicated, signaled) for the current block because the same motion information as that of a neighboring block or a temporal, co-located block may be used for the current block (e.g., at the encoder and / or a decoder). A signaling overhead for sending / signaling the motion information of the current block may be reduced because the MVD need not be indicated for the current block. The encoder and / or the decoder may reciprocally generate a candidate list of motion information from neighboring blocks or temporal, co-located blocks of the current block (e.g., in a manner similar to AMVP). The encoder may determine to use (e.g., inherit) motion information, of one neighboring block or one temporal, co-located block in the candidate list, for predicting motion information of the current block being coded. The encoder may signal / send, in / via a bitstream, an indication of the determined motion information from the candidate list. For example, the encoder may signal / send an indicator / index. The index may indicate the determined motion information in the list of candidate motion information. The encoder may signal / send the index to indicate the determined motion information.
[0136] A list of candidate motion information for merge mode (e.g., in HEVC, VVC, or any other coding formats / standards / protocols) may comprise: up to four (or any other quantity of) spatial merge candidates derived / determined from five (or any other quantity of) spatial neighboring blocks (e.g., as shown in FIG. 15A); one (or any other quantity of) temporal merge candidate derived from two (or any other quantity of) temporal, co-located blocks (e.g., as shown in FIG. 15B); and / or additional merge candidates comprising bi-predictive candidates and zero motion vector candidates. In some examples, the spatial neighboring blocks and the temporal, co-located blocks used for merge mode may be the same as the spatial neighboring blocks and the temporal, co-located blocks used for AMVP.
[0137] Inter prediction may be performed in other ways and variants than those described herein. For example, motion information prediction techniques other than AMVP and merge mode may be used. While various examples herein correspond to inter prediction modes, such as used in HEVC and VVC, the methods, devices, and systems as described herein may be applied to / used for other inter prediction modes (e.g., as used for other video coding standards / formats such as VP8, VP9, AV1, etc.). History-based motion vector prediction (HMVP), combined intra / inter prediction mode (CIIP), and / or merge mode with motion vector difference (MMVD) (e.g., as described in VVC) may be performed / used and are within the scope of the present disclosure.
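The merge candidate list and the inheritance of motion information via a signaled index may be sketched in Python as below. The sketch is simplified under stated assumptions: motion information is modeled as a `((mvx, mvy), ref_idx)` tuple, unavailable blocks are `None`, the additional bi-predictive candidates are omitted, and zero candidates always use reference index 0. The list size of five and all names are hypothetical.

```python
def build_merge_list(spatial, temporal, max_candidates=5, max_spatial=4):
    """Build a merge candidate list of ((mvx, mvy), ref_idx) entries."""
    merge_list = []
    # Up to max_spatial distinct spatial candidates.
    for info in spatial:
        if len(merge_list) == max_spatial:
            break
        if info is not None and info not in merge_list:
            merge_list.append(info)
    # One temporal candidate from the first available co-located block.
    first_temporal = next((info for info in temporal if info is not None), None)
    if first_temporal is not None and len(merge_list) < max_candidates:
        merge_list.append(first_temporal)
    # Zero motion vector candidates fill the remaining entries.
    while len(merge_list) < max_candidates:
        merge_list.append(((0, 0), 0))
    return merge_list

spatial = [((1, 0), 0), ((1, 0), 0), None, ((2, -1), 1), ((3, 3), 0)]
temporal = [None, ((5, 5), 0)]

merge_list = build_merge_list(spatial, temporal)  # same list at encoder and decoder
merge_index = 3                                   # the only value signaled
inherited = merge_list[merge_index]               # motion information reused as-is
```

No MVD is involved: the current block simply inherits the motion vector and reference index of the indicated candidate.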
[0138] A block matching operation (or technique) may be applied / used (e.g., in inter prediction) to determine a reference block in a different picture than that of a current block being coded (e.g., encoded and / or decoded). A block matching operation also may be applied / used to determine a reference block in a same picture as that of a current block being coded. The reference block, in a same picture as that of the current block, as determined using block matching may often not accurately predict the current block (e.g., for camera captured videos). Prediction accuracy for screen content videos may not be similarly impacted, for example, if a reference block in the same picture as that of the current block is used for encoding. Screen content videos may comprise, for example, computer generated text, graphics, animation, etc. Screen content videos may comprise (e.g., may often comprise) repeated patterns (e.g., repeated patterns of text and / or graphics) within the same picture. Using a reference block (e.g., as determined using block matching), in a same picture as that of a current block being encoded, may provide efficient compression for screen content videos.
[0139] A prediction technique may be used (e.g., in HEVC, VVC, and / or any other coding standards / formats / protocols) to exploit correlation between blocks of samples within a same picture (e.g., of screen content videos). The prediction technique may be intra block copy (IBC) or current picture referencing (CPR). An encoder may apply / use a block matching technique (e.g., similar to inter prediction) to determine a displacement vector (e.g., a block vector (BV)). The BV may indicate a relative position of a reference block (e.g., in accordance with intra block compensated prediction), that best matches the current block, from a position of the current block. For example, the relative position of the reference block may be a relative position of a top-left corner (or any other point / sample) of the reference block. The BV may indicate a relative displacement from the current block to the reference block that best matches the current block. The encoder may determine the best matching reference block from blocks tested during a searching process (e.g., in a manner similar to that used for inter prediction). The encoder may determine that a reference block is the best matching reference block based on one or more cost criteria. The one or more cost criteria may comprise a rate-distortion criterion (e.g., Lagrangian rate-distortion cost). The one or more cost criteria may be based on, for example, one or more differences (e.g., an SSD, an SAD, an SATD, and / or a difference determined based on a hash function) between the prediction samples of the reference block and the original samples of the current block. A reference block may correspond to / comprise prior decoded blocks of samples (e.g., reconstructed samples) of the current picture. The reference block may comprise decoded blocks of samples of the current picture prior to being processed by in-loop filtering operations (e.g., deblocking and / or SAO filtering).
[0140] FIG. 16 shows an example of IBC (e.g., an IBC mode). The example shown in FIG. 16 may correspond to screen content. The rectangular portions / sections with arrows beginning at their boundaries may be the current blocks being encoded. The rectangular portions / sections that the arrows point to may be the reference blocks for predicting the respective current blocks.
[0141] A reference block may be determined and / or generated, for a current block, using IBC. The encoder may determine a difference (e.g., a corresponding sample-by-sample difference) between the reference block and the current block. The difference may be a prediction error or residual. The encoder may store and / or send / signal, in / via a bitstream, the prediction error and / or related prediction information. The prediction error and / or the related prediction information may be used for decoding and / or other forms of consumption. The prediction information may comprise a BV. The prediction information may comprise an indication of the BV. A decoder (e.g., decoder 300 as shown in FIG. 3) may decode the current block by determining and / or generating the reference block. The decoder may determine and / or generate the reference block, for example, based on the prediction information (e.g., the BV). The reference block may correspond to / form (e.g., be considered as) the prediction (e.g., a prediction block) of the current block. The decoder may decode the current block by combining the prediction (e.g., prediction block) with the prediction error (e.g., residual or residual block).
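The decoder-side IBC steps described above (locate the reference block with the BV, then add the residual) may be sketched in Python as follows. The function name and the toy picture are hypothetical, and the sketch only checks that the reference block lies inside the picture; it does not enforce that the referenced samples have already been reconstructed, which a real codec would require.

```python
def ibc_reconstruct(recon, x0, y0, bv, residual):
    """Reconstruct a block in place from a block vector and a residual.

    recon: partially reconstructed current picture (list of rows).
    (x0, y0): top-left corner of the current block.
    bv: (bvx, bvy) displacement from the current block to the reference block.
    Returns the prediction block that was copied.
    """
    bvx, bvy = bv
    height, width = len(residual), len(residual[0])
    rx, ry = x0 + bvx, y0 + bvy
    if rx < 0 or ry < 0 or ry + height > len(recon) or rx + width > len(recon[0]):
        raise ValueError("block vector points outside the picture")
    # Copy the prediction first so that writes below cannot alter it.
    prediction = [[recon[ry + j][rx + i] for i in range(width)] for j in range(height)]
    for j in range(height):
        for i in range(width):
            recon[y0 + j][x0 + i] = prediction[j][i] + residual[j][i]
    return prediction

# Toy 4x4 picture: the top-left 2x2 block is already reconstructed.
recon = [[1, 2, 0, 0], [3, 4, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
prediction = ibc_reconstruct(recon, x0=2, y0=0, bv=(-2, 0), residual=[[1, 0], [0, -1]])
```

After the call, the block at (2, 0) holds the copied samples plus the residual, while the reference block itself is unchanged.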
[0142] A BV may be predictively coded (e.g., in HEVC, WC, and / or any other coding standards / formats / protocols) before being stored and / or sent / sig naled in / via a bitstream. For example, the BV for a current block may be predictively coded based on a BV of one or more blocks neighboring the current block. For example, an encoder may predictively code a BV using the merge mode (e.g., in a manner similar to as described herein for inter prediction), AMVP (e.g., as described herein for inter prediction), or a technique similar to AMVP. The technique similar to AMVP may be BV prediction and difference coding (or AMVP for IBC).
[0143] An encoder (e.g., encoder 200 as shown in FIG.2) performing BV prediction and coding may code a BV as a difference between the BV of a current block being coded and a block vector predictor (BVP). An encoder may select / determine the BVP from a list of candidate BVPs. The candidate BVPs may comprise / correspond to previously decoded BVs of neighboring blocks in the current picture of the current block. The encoder and / or a decoder may reciprocally generate or determine the list of candidate BVPs.
[0144] The encoder may send / signal, in / via a bitstream, an indication of the selected BVP and a block vector difference (BVD). The encoder may indicate the selected BVP in the bitstream using an index / indicator. The index may indicate (e.g., point to) the selected BVP in the list of candidate BVPs. The BVD may be determined / calculated based on a difference between a BV of the current block and the selected BVP. For example, for a BV (e.g., represented by a horizontal component (BVx) and a vertical component (BVy)) that indicates a position relative to a position of the current block being coded, the BVD may be represented by two components BVDx and BVDy. BVDx and BVDy may be determined / calculated as:
$$\mathrm{BVD}_x = \mathrm{BV}_x - \mathrm{BVP}_x, \tag{17}$$
$$\mathrm{BVD}_y = \mathrm{BV}_y - \mathrm{BVP}_y. \tag{18}$$
BVDx and BVDy may respectively represent horizontal and vertical components of the BVD. BVPx and BVPy may respectively represent horizontal and vertical components of the BVP. A decoder (e.g., decoder 300 as shown in FIG. 3) may decode the BV by adding the BVD to the BVP indicated in / via the bitstream. The decoder may decode the current block by determining and / or generating the reference block. The decoder may determine and / or generate the reference block, for example, based on the decoded BV. The reference block may correspond to / form (e.g., be considered as) the prediction (e.g., a prediction block) of the current block. The decoder may decode the current block by combining the prediction (e.g., the prediction block) with the prediction error (e.g., residual or residual block).
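The relationship in equations (17) and (18) and its inverse at the decoder can be illustrated with a minimal C++ sketch. The type `BlockVector` and the functions `computeBvd` and `reconstructBv` are hypothetical names introduced only for illustration; they are not part of any codec API, and entropy coding of the index and the BVD is not shown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2-D block vector type (illustrative only).
struct BlockVector {
    int x;  // horizontal component
    int y;  // vertical component
};

inline bool operator==(const BlockVector& a, const BlockVector& b) {
    return a.x == b.x && a.y == b.y;
}

// Encoder side: BVD = BV - BVP, component by component (equations (17) and (18)).
BlockVector computeBvd(const BlockVector& bv, const BlockVector& bvp) {
    return {bv.x - bvp.x, bv.y - bvp.y};
}

// Decoder side: BV = BVP + BVD, where the BVP is selected from the
// reciprocally generated candidate list by the signaled index.
BlockVector reconstructBv(const std::vector<BlockVector>& candidates,
                          std::size_t bvpIndex,
                          const BlockVector& bvd) {
    const BlockVector& bvp = candidates.at(bvpIndex);  // throws if the index is out of range
    return {bvp.x + bvd.x, bvp.y + bvd.y};
}
```

In this sketch only the index and the (typically small) difference would need to be signaled; the decoder recovers the original BV exactly because it builds the same candidate list.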
[0145] A same BV as that of a neighboring block may be used for the current block and a BVD need not be separately signaled / sent for the current block, such as in the merge mode. A BVP (in the candidate BVPs), which may correspond to a decoded BV of the neighboring block, may itself be used as a BV for the current block. Not sending the BVD may reduce the signaling overhead.
[0146] A list of candidate BVPs (e.g., in HEVC, VVC, and / or any other coding standard / format / protocol) may comprise two (or more) candidates. The candidates may comprise candidates A and B. Candidates A and B may comprise: up to two (or any other quantity of) spatial candidate BVPs determined / derived from five (or any other quantity of) spatial neighboring blocks of a current block being encoded; and / or one or more of last two (or any other quantity of) coded BVs (e.g., if spatial neighboring candidates are not available). Spatial neighboring candidates may not be available, for example, if neighboring blocks are encoded using intra prediction or inter prediction. Locations of the spatial candidate neighboring blocks, relative to a current block, being encoded using IBC may be illustrated in a manner similar to spatial candidate neighboring blocks used for coding motion vectors in inter prediction (e.g., as shown in FIG. 15A). For example, five spatial candidate neighboring blocks of a current block being coded using IBC may be respectively denoted A0, A1, B0, B1, and B2 as shown in FIG. 15A.
[0147] The most probable mode (MPM) refers to the intra prediction mode (IPM) that is most likely to be the best mode for the current block being encoded or decoded. In current intra prediction techniques, the MPM is determined by analyzing the intra prediction modes of the neighboring CUs (e.g., also referred to as blocks) of a current block (or CU) to be coded (e.g., encoded or decoded). For example, VVC uses a list of 6 MPMs (referred to as the “MPM list”) for luma intra prediction. The MPM list is derived from the intra prediction modes of the neighboring CUs, and is updated as the encoder progresses through the video frame. When encoding a block, the encoder may determine if the current block is a candidate for any of the MPMs in the MPM list. If it is, the encoder then compares the prediction errors of the respective MPMs to determine which MPM from the MPM list is the best mode for the current block. If the current block is not a candidate for any of the MPMs in the MPM list, the encoder may then evaluate all intra prediction modes (e.g., 67 in VVC) to determine the best mode for the current block.
[0148] The use of MPMs can significantly improve the coding efficiency because the encoder does not need to signal the intra prediction mode for the current block if it is one of the MPMs. Instead, the decoder can infer the intra prediction mode for the current block from the corresponding MPM list reciprocally and identically generated at the decoder. Thus, signaling overhead in the bitstream may be reduced.
[0149] In some examples, three types of intra modes are considered to construct the MPM list: default intra modes; neighboring intra modes; and derived intra modes. A unified 6 MPM list is used for intra blocks irrespective of whether Multiple Reference Lines (MRL) and Intra Sub-Partitions (ISP) coding tools are applied. The MPM list for the current block is constructed based on intra modes of the left neighbor block (e.g., block corresponding to A1 in FIG. 15A) and the above neighbor block (e.g., block corresponding to B1 in FIG. 15A) of the current block. Suppose the mode of the left neighbor block is denoted as Left and the mode of the above neighbor block is denoted as Above. The unified MPM list may then be constructed as follows: when a neighboring block is not available, its intra mode is set to planar mode by default; if both modes Left and Above are non-angular modes, then the MPM list is set to include {planar, DC, V, H, V - 4, V + 4}, where “V” and “H” refer to vertical mode and horizontal mode, respectively; if one of modes Left and Above is an angular mode, and the other is non-angular, set a mode Max as the larger mode in Left and Above, and set the MPM list to include {planar, Max, Max - 1, Max + 1, Max - 2, Max + 2}; if Left and Above are both angular and they are different, set a mode Max and a mode Min as the larger mode in Left and Above and as the smaller mode in Left and Above, respectively, and thereafter, if Max - Min is equal to 1, then set the MPM list to include {planar, Left, Above, Min - 1, Max + 1, Min - 2}, if Max - Min is greater than or equal to 62, then set the MPM list to include {planar, Left, Above, Min + 1, Max - 1, Min + 2}, if Max - Min is equal to 2, set the MPM list to include {planar, Left, Above, Min + 1, Min - 1, Max + 1}, or otherwise, set the MPM list to include {planar, Left, Above, Min - 1, Min + 1, Max - 1}; and if Left and Above are both angular and they are the same, set the MPM list to include {planar, Left, Left - 1, Left + 1, Left - 2, Left + 2}.
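The case analysis above can be sketched in C++ as follows. This is a simplified illustration, not a codec implementation: the function name `buildMpmList` is hypothetical, the mode numbers assume the 67-mode convention (0 = planar, 1 = DC, 18 = horizontal, 50 = vertical), the fifth entry of the fallback case is read as Min + 1, and offsets such as Max + 2 are not wrapped back into the angular range, which a real codec would need to do.

```cpp
#include <algorithm>
#include <array>

// Assumed mode numbering: 0 = planar, 1 = DC, 2..66 angular, 18 = horizontal, 50 = vertical.
constexpr int kPlanar = 0;
constexpr int kDc = 1;
constexpr int kHor = 18;
constexpr int kVer = 50;

inline bool isAngular(int mode) { return mode > kDc; }

// Builds the unified 6-entry MPM list from the Left and Above neighbor modes.
// An unavailable neighbor is expected to be passed as kPlanar.
// Simplification: "mode +/- k" entries are NOT wrapped into the range 2..66.
std::array<int, 6> buildMpmList(int left, int above) {
    const int maxM = std::max(left, above);
    const int minM = std::min(left, above);

    // Both neighbors non-angular: default list.
    if (!isAngular(left) && !isAngular(above)) {
        return {kPlanar, kDc, kVer, kHor, kVer - 4, kVer + 4};
    }
    // Exactly one neighbor angular: build around the larger (angular) mode.
    if (isAngular(left) != isAngular(above)) {
        return {kPlanar, maxM, maxM - 1, maxM + 1, maxM - 2, maxM + 2};
    }
    // Both angular and identical.
    if (left == above) {
        return {kPlanar, left, left - 1, left + 1, left - 2, left + 2};
    }
    // Both angular and different: the choice depends on Max - Min.
    const int diff = maxM - minM;
    if (diff == 1) {
        return {kPlanar, left, above, minM - 1, maxM + 1, minM - 2};
    }
    if (diff >= 62) {
        return {kPlanar, left, above, minM + 1, maxM - 1, minM + 2};
    }
    if (diff == 2) {
        return {kPlanar, left, above, minM + 1, minM - 1, maxM + 1};
    }
    return {kPlanar, left, above, minM - 1, minM + 1, maxM - 1};
}
```

For instance, with Left = 30 and Above = 31 the difference is 1, so the sketch returns {planar, 30, 31, 29, 32, 28}.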
[0150] The encoder may encode an MPM index in the bitstream to indicate the position of the selected intra prediction mode in the MPM list to the decoder. The encoder may represent the MPM index as a codeword and entropy encode the codeword into the bitstream. The decoder may derive the MPM list in a manner identical to the encoder, and use the MPM index obtained from the codeword decoded from the bitstream to obtain the intra prediction mode from the MPM list derived at the decoder. In some instances, the first bin of the codeword representing the MPM index is context coded using an arithmetic coder (e.g., CABAC) so as to achieve additional coding efficiencies. For example, three contexts may be used, corresponding to whether the current intra block is MRL enabled, ISP enabled, or a normal intra block.
[0151] During the 6 MPM list generation process, pruning may be used to remove duplicated intra modes so that the MPM list includes only unique intra modes. For entropy coding of the 61 non-MPM modes (that is, the 67 modes in VVC minus the 6 MPMs), a truncated binary code (TBC) may be used.
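As an illustration of truncated binary coding for an alphabet of 61 symbols, the following C++ sketch produces the codeword as a string of '0' and '1' characters. The function names are hypothetical, and the sketch shows only the generic truncated binary construction, not how a particular codec maps non-MPM modes to symbol values or writes bins to the bitstream.

```cpp
#include <string>

// Returns the `len` least significant bits of `v`, most significant bit first.
static std::string toBits(unsigned v, unsigned len) {
    std::string s(len, '0');
    for (unsigned i = 0; i < len; ++i) {
        if (v & (1u << (len - 1 - i))) {
            s[i] = '1';
        }
    }
    return s;
}

// Truncated binary codeword for `value` in an alphabet of `n` symbols
// (0 <= value < n, n >= 1, n assumed small enough not to overflow the shifts).
std::string truncatedBinary(unsigned value, unsigned n) {
    unsigned k = 0;
    while ((1u << (k + 1)) <= n) {
        ++k;  // k = floor(log2(n))
    }
    const unsigned u = (1u << (k + 1)) - n;  // number of short, k-bit codewords
    // The first u symbols get k bits; the remaining ones get k + 1 bits.
    return (value < u) ? toBits(value, k) : toBits(value + u, k + 1);
}
```

With n = 61, k is 5 and u is 3, so the first three symbols receive 5-bit codewords and the remaining 58 symbols receive 6-bit codewords.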
[0152] In some implementations, the MPM list is extended to include 16 additional candidates, and is divided into two parts, the primary MPM (PMPM) (e.g., including 6 entries) and the secondary MPM (SMPM) (e.g., including 16 entries). In some implementations, the first entry in the general MPM list is the planar mode. The remaining entries include the intra modes of the adjacent neighboring blocks corresponding to positions left (L), above (A), below-left (BL), above-right (AR), and above-left (AL) (e.g., shown in FIG. 15A as A1, B1, A0, B0, and B2), and decoder-side intra mode derivation (DIMD) modes which are sorted in ascending order of a cost such as, for example, SAD, SSD, SATD, etc. In some examples, up to a preconfigured / predetermined number of modes (e.g., 5) with the smallest costs are added to the MPM list. The cost for a respective MPM (e.g., an IPM corresponding to an entry in the MPM list) may be computed between the prediction of the reconstructed samples of the template of the current block and the reconstructed samples. For example, the prediction may be generated by applying the respective MPM for the template. Sorted directional modes are added into the general MPM list, and then the default modes, until the general MPM list with 22 entries is constructed. In some examples, if a CU block is vertically oriented, the order of neighboring blocks corresponds to A, L, BL, AR, AL; otherwise, it is L, A, AL, AR, BL.
[0153] Intra-template matching prediction (IntraTMP) is a special intra-prediction mode that selects a prediction block within a pre-determined reference region (RR) or search area from the reconstructed samples within the current frame. IntraTMP uses a pre-defined template of the current block to search for a candidate reference block of which the template best matches the template of the current block. FIG. 17 illustrates, for an example current block 1700, a reference region 1712 or search area from the reconstructed samples 1704 within which a search is performed for a candidate reference block 1706 of which the template 1708 (“candidate reference block template”) best matches the template 1702 (“current block template”) of the current block 1700. In this example, the reference region is divided into four rectangular reference regions (R1, R2, R3, and R4).
[0154] By computing a cost function (e.g., SAD, SATD) between the template 1702 of the current block and the templates of several candidate reference blocks, N candidates with lower template costs, each indicated by a corresponding block vector predictor (BVP) candidate 1710 (BVP candidate may also be referred to herein as block vector (BV) candidate), are stored in an intraTMP list and ranked by lower cost value (ascending cost). This process is performed by both encoder and decoder.
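A minimal C++ sketch of this step is shown below, assuming SAD as the cost function and templates given as flat sample arrays. The names `BvCandidate`, `templateSad` and `insertCandidate` are hypothetical; the sketch only illustrates keeping the N lowest-cost candidates in ascending cost order, with ties kept in insertion order as a simplifying assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical candidate record: a block vector and its template cost.
struct BvCandidate {
    int bvx;
    int bvy;
    long cost;
};

// Sum of absolute differences between the current and the reference template.
// Both templates are assumed to hold the same number of samples.
long templateSad(const std::vector<int>& cur, const std::vector<int>& ref) {
    long sad = 0;
    const std::size_t n = std::min(cur.size(), ref.size());
    for (std::size_t i = 0; i < n; ++i) {
        sad += std::labs(static_cast<long>(cur[i]) - static_cast<long>(ref[i]));
    }
    return sad;
}

// Inserts `cand` into `list`, which is kept sorted by ascending cost and
// truncated to at most `maxSize` entries (the N best candidates).
void insertCandidate(std::vector<BvCandidate>& list, const BvCandidate& cand,
                     std::size_t maxSize) {
    auto pos = std::upper_bound(
        list.begin(), list.end(), cand,
        [](const BvCandidate& a, const BvCandidate& b) { return a.cost < b.cost; });
    list.insert(pos, cand);
    if (list.size() > maxSize) {
        list.pop_back();  // drop the highest-cost entry
    }
}
```

Because the encoder and the decoder run the same deterministic procedure on the same reconstructed samples, both end up with the same ordered list, which is what allows a list index to identify the chosen candidate.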
[0155] The residual blocks, obtained as the difference between the samples of the current block 1700 and the samples of the candidate reference blocks 1706 in the list, are computed, and the reference block with the best rate-distortion performance is selected as the best intraTMP reference block. An index indicating the position of the best BVP or BV candidate within the intraTMP list is signaled to the decoder in order to facilitate the block decoding using the IntraTMP prediction mode.
[0156] Due to the reference region having an irregular (non-rectangular) shape and to facilitate its hardware implementation, the template matching in the reference region 1712 is carried out in a set of rectangular sub-regions (R1 to R4 in the example of FIG. 17), whose dimensions are determined based on the current block's size and relative position inside the current CTU.
[0157] The dimensions of the global reference region 1712, comprising all sub-regions, are determined by the SearchRange_w and SearchRange_h parameters, which are set proportional to the current block 1700 dimensions (CbWidth, CbHeight) using a multifactor parameter denoted as 'a', which controls the gain / complexity trade-off. In some implementations of IntraTMP, the multifactor parameter 'a' may be uniform and equal to 5 or may be determined by the current block dimension:
$$\mathit{SearchRange}_w = a \cdot \mathit{CbWidth}$$
$$\mathit{SearchRange}_h = a \cdot \mathit{CbHeight}$$
[0158] In some implementations of IntraTMP, the global RR for block dimensions 4 and 8 were extended to 64 pixels, according to the following equations:
$$\mathit{SearchRange}_w = \max(64,\ 5 \cdot \mathit{CbWidth})$$
$$\mathit{SearchRange}_h = \max(64,\ 5 \cdot \mathit{CbHeight})$$
[0159] In practice, this has the effect of using a variable multifactor parameter 'a' based on the block dimensions, as shown in the following table:
Block size        64    32    16    8     4
Multifactor 'a'   5     5     5     8     16
SearchRange_w     320   160   80    64    64
SearchRange_h     320   160   80    64    64
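The search range rule and the effective multifactor it implies can be checked with a short C++ sketch. The names `SearchRange`, `intraTmpSearchRange` and `effectiveMultifactor` are hypothetical, and the defaults (a = 5, minimum range of 64) are taken from the equations above.

```cpp
#include <algorithm>

// Hypothetical container for the global reference region dimensions.
struct SearchRange {
    int w;
    int h;
};

// SearchRange = max(minRange, a * block dimension), applied per dimension.
SearchRange intraTmpSearchRange(int cbWidth, int cbHeight, int a = 5, int minRange = 64) {
    return {std::max(minRange, a * cbWidth), std::max(minRange, a * cbHeight)};
}

// Effective multifactor for one block dimension: the resulting search range
// divided by the block dimension (cbDim is assumed to be positive).
int effectiveMultifactor(int cbDim, int a = 5, int minRange = 64) {
    return std::max(minRange, a * cbDim) / cbDim;
}
```

For block dimensions 8 and 4 the clamp to 64 applies, giving effective multifactors of 8 and 16, while dimensions of 16 and above keep the nominal value of 5.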
[0160] FIG. 18 illustrates a current block 1800 in a current CTU 1804, and the reference region 1812 with example corresponding TMP search regions R1-R6 identified. An example reference block 1816, and reference block template 1818 and current block template 1808 are also illustrated.
[0161] In order to reduce the high computational burden of an exhaustive template matching search over the whole RR, the searching process is split into 2 steps: a sparse search step, and a refinement search step.
[0162] The sparse search is illustrated in FIG. 19A. The sparse search in some implementations is carried out in a regular grid using a subsampling interval of 3 in the horizontal and vertical directions. FIG. 19A shows an example candidate reference block 1916 in search region 4 (R4) and the corresponding reference template 1918. The template cost is computed for each reference block position within a search region (e.g., reference block position 1920 of reference block 1916), and the BVP candidates associated with the best (e.g., least cost) reference blocks are recorded in a sparse list and sorted in ascending order of cost. The subsample positions are shown in the form of dark squares within the RR, separately determined for each search region. In some implementations of IntraTMP, the size of the sparse list is set to 30.
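The regular sampling grid of the sparse search can be sketched in C++ as follows. The types `GridRect` and `GridPos` and the function `sparseGrid` are hypothetical; the sketch assumes that each search region is described by the rectangle of valid reference block positions and that the grid starts at the region's top-left position, neither of which is specified above.

```cpp
#include <vector>

// Hypothetical rectangle of valid reference block positions in one search region.
struct GridRect {
    int x;  // left-most position
    int y;  // top-most position
    int w;  // number of positions horizontally
    int h;  // number of positions vertically
};

struct GridPos {
    int x;
    int y;
};

// Enumerates the positions visited by the sparse search in one region,
// using the same subsampling interval horizontally and vertically.
std::vector<GridPos> sparseGrid(const GridRect& region, int step = 3) {
    std::vector<GridPos> out;
    if (step <= 0) {
        return out;  // guard against a non-advancing loop
    }
    for (int y = region.y; y < region.y + region.h; y += step) {
        for (int x = region.x; x < region.x + region.w; x += step) {
            out.push_back({x, y});
        }
    }
    return out;
}
```

With an interval of 3, roughly one in nine positions is evaluated, which is the source of the complexity reduction relative to an exhaustive search.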
[0163] The refinement search, the second step, is a refinement of the reference block candidates (correspondingly, BVP candidates) in the sparse list. In some implementations, the refinement is made in a window of 3x3 pixels around the sparse BVP candidates using a sampling interval of 1. If the refinement window (refine window) crosses into another search region, the refinement window is clipped to the region boundary to which the candidate belongs. FIG. 19B illustrates a current block 1906 and template 1908 located in a current CTU 1904 of a current frame 1900 (i.e., current picture), corresponding RR with respective search regions R1-R6, and an example BVP 1910 with its refinement search window 1912 being clipped where R4 (search region in which the BVP 1910 is located) borders R1 and R3. The 19 refined candidates with the lowest template cost (e.g., SAD cost) are selected for a “Refined IntraTMP list”.
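The clipping of the refinement window to the candidate's own search region can be sketched in C++ as below. The types `Region` and `Point` and the function `refinementPositions` are hypothetical, and the region is assumed to be an axis-aligned rectangle of valid positions.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical rectangle of valid positions for one search region.
struct Region {
    int x;
    int y;
    int w;
    int h;
};

struct Point {
    int x;
    int y;
};

// Positions of a windowSize x windowSize refinement window (sampling interval 1)
// centered on a sparse candidate and clipped to the region the candidate belongs to.
std::vector<Point> refinementPositions(const Point& center, const Region& region,
                                       int windowSize = 3) {
    const int half = windowSize / 2;
    const int x0 = std::max(center.x - half, region.x);
    const int x1 = std::min(center.x + half, region.x + region.w - 1);
    const int y0 = std::max(center.y - half, region.y);
    const int y1 = std::min(center.y + half, region.y + region.h - 1);

    std::vector<Point> out;
    for (int y = y0; y <= y1; ++y) {
        for (int x = x0; x <= x1; ++x) {
            out.push_back({x, y});
        }
    }
    return out;
}
```

A candidate in the interior of its region yields 9 positions, one on an edge yields 6, and one in a corner yields 4.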
[0164] Once the encoder / decoder has constructed the Refined IntraTMP List, the encoder / decoder can select among different IntraTMP sub-modes by checking the rate-distortion performance of each sub-mode. The IntraTMP sub-mode is signaled to the decoder in combination with an index to the best candidate in the Refined IntraTMP List or a cluster of candidates (e.g., the fusion sub-mode). The IntraTMP sub-modes may be the single predictor sub-mode, the fusion sub-mode, the sub-pel precision sub-mode, and the linear filter model sub-mode. In the single predictor sub-mode, a single BVP candidate is selected from the Refined IntraTMP List and signaled to the decoder. In the fusion sub-mode, multiple BVP candidates are blended to derive the final BV prediction block. The blending weights may be either computed from the template matching cost of each predictor or with a Wiener-filter-based weight derivation method. In the sub-pel precision sub-mode, when a single predictor is used, sub-pel precision can be used with 1 / 2-pel precision, 1 / 4-pel precision, and 3 / 4-pel precision, each with 8 possible directions. In the linear filter model sub-mode, a linear filter can be learned between the reference and current templates and applied to the reference block. This sub-mode can be used when a single predictor is used and sub-pel precision is not used.
[0165] In some implementations of IntraTMP, several types of template shapes may be used. Five types of templates have been proposed according to the current block location in the frame: top template, left template, L-shape template, an only top template type, and an only left template type.
[0166] The template type of top template may be used when only the current block’s top samples are available, such as when the current block is located at the left boundary of the picture. Consequently, the TMP cost is computed using the top samples of the current and reference blocks. In some implementations, the top template is four samples in height.
[0167] The template type of left-template may be used when only the left samples of the current block are available, such as when the current block is located at the top boundary of the picture. Consequently, the TMP cost is computed using exclusively the left samples of both the current and the reference block. In some implementations, the left template is four samples in width.
[0168] The L-shape template type is used in the other cases (e.g., the current block is not located at the top or left boundary of the picture) where the samples surrounding the current block included in the L-Shape are available. Consequently, the TMP cost is computed using the L-shape template of both the current and the reference block. In some implementations, the L-shape template is four samples in width and height.
[0169] The L-shape template introduces two more template types: the Only-Top (Only-T) and Only-Left (Only-L) templates. Therefore, in addition to the L-shape TMP cost, the TMP costs for the Only-T and Only-L templates are also computed, and the best N BV candidates are stored in separate Only-T and Only-L lists.
[0170] FIG. 20 depicts an example of the top-template type. Only the top templates (e.g., template 2001 of current block and template 2000 of reference block 1816) are used for the TMP cost computation in 2003. The sparse search 2003 is computed using a sampling interval (SI) 2002 of three, and one sparse list (sparse candidates list) 2004 is built using the best (lower cost) 30 BV candidates in some implementations. Those candidates are refined in 2006 using a 3x3 window 2005 with an SI of 1, and a Refined Candidates List 2007 is built.
[0171] For the left-template type, the same logic as for the top-template type is applied, but the left templates (2100 and 2101) of the reference and current blocks are used instead of the top templates, as illustrated in FIG. 21. The TMP cost calculation in search regions 2103, search intervals for sparse search 2102, sparse candidates list 2104, refinement of the sparse list 2106, refinement search windows 2105, and the refined candidates list 2107 of the Only-L process shown in FIG. 21 may be identical (except for the use of the Only-L template instead of the Only-T template) to 2003, 2002, 2004, 2006, 2005, and 2007, respectively, described in relation to FIG. 20.
[0172] FIG. 22 depicts an example of a current block that has available (e.g., reconstructed samples are available for) the L-shape template 2201, and the L-shape template 2200 of the reference block is used to compute the L-shape TMP cost in all search regions, as described for the top-template and left-template types.
[0173] In addition to the L-shape cost, the Only-Top TMP and Only-Left TMP costs may also be computed. In some implementations, the Sparse search builds three sparse lists, one Sparse L-Shape List with a size of 30 BV candidates, and two additional lists, the Only-T sparse list and Only-L sparse list, both with a length of 6 BV candidates.
[0174] These three sparse lists are refined by computing the respective template type cost using a window of 3x3 in some implementations. The best N BV candidates, which have obtained the lower TMP costs, are stored in three new refined lists: the Refined L-Shape List with a size of 19 BV candidates, the Refined Only-T List with a size of 3 BV candidates, and the Refined Only-L List with a size of 3 BV candidates. In some implementations, the final IntraTMP List has 19 candidates, like the Refined L-Shape List, but it is a combination of the BV candidates in the Refined L-Shape List and the Refined Only-T and Only-L Lists.
[0175] IntraTMP is described in F. Wang et al., “EE2-1.20i / j: Combination of IntraTMP tests”, JVET-AD0086, April 2023; L. Zhang et al., “EE2-1.11: Intra template matching prediction fusion”, JVET-AD0072, April 2023; J.-Y. Huo et al., “EE2-1.16: A Fusion method of Intra Template Matching Prediction (Intra TMP)”, JVET-AD0116, April 2023; and P. Lin et al., “EE2-1.19: IntraTMP with multiple modes”, JVET-AD0194, April 2023, the contents of which are herein incorporated by reference.
[0176] In addition to the sparse BVP candidates obtained by TMP searching within the RR in IntraTMP, another set of Merge BVP candidates is proposed in another technique, “IntraTMP with merge candidates”. In some implementations, Merge BVP candidates are a subset of the IBC merge candidates, comprising only the spatial candidates of the current block. In particular, the TMP Merge BVP candidates may use the 5 adjacent BVs from the adjacent blocks (e.g., 2300 in FIG. 23A) and the 20 non-adjacent BV candidates (e.g., 2301 in FIG. 23B) from the non-adjacent neighboring blocks encoded using an IBC or TMP mode.
[0177] The IntraTMP with merge candidates process is illustrated in FIG. 24. A maximum of 50 BVP candidates 2405 from the adjacent blocks and BVP candidates 2406 from non-adjacent blocks are ranked in ascending TMP cost at 2407, and the best 10 candidates (candidates with lower TMP costs) comprise the TMP merge list 2408, which is ordered in ascending TMP cost. A sparse list 2404 is generated at 2403 for the current block, as described above in relation to IntraTMP, by calculating TMP costs for reference templates (e.g., L-shape reference templates) 2400 and the current block’s template 2401 in search regions using a predetermined sampling interval 2402. In some implementations, up to 10 Merge BVP candidates are checked in the sparse list 2404 for duplicates, and the redundant BVPs in the TMP merge list are removed from the list at block 2409 to generate the updated TMP merge list 2410.
[0178] Thereafter, the Merge BVPs from the updated TMP merge list 2410 compete with the BVP candidates in the sparse list 2404, and the best 30 candidates from the sparse list 2404 and the updated merge list 2410, based on the TMP cost, are selected for the final or updated sparse list 2411.
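The duplicate removal and the subsequent competition between merge and sparse candidates can be sketched in C++ as follows. The type `Bvp` and the function `buildUpdatedSparseList` are hypothetical names; the sketch assumes that two candidates are duplicates when their vector components are equal and that, on equal TMP cost, sparse candidates keep precedence, neither of which is stated above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical BVP candidate with its TMP cost and origin.
struct Bvp {
    int x;
    int y;
    long cost;
    bool fromMerge;  // true if the candidate came from the TMP merge list
};

// Step 1: drop merge candidates that duplicate a sparse-list entry.
// Step 2: let the remaining merge candidates compete with the sparse
//         candidates and keep the `maxOut` lowest-cost entries.
std::vector<Bvp> buildUpdatedSparseList(std::vector<Bvp> sparse,
                                        const std::vector<Bvp>& mergeList,
                                        std::size_t maxMerge = 10,
                                        std::size_t maxOut = 30) {
    const std::size_t sparseCount = sparse.size();
    const std::size_t mergeCount = std::min(maxMerge, mergeList.size());
    for (std::size_t i = 0; i < mergeCount; ++i) {
        const Bvp& m = mergeList[i];
        bool duplicate = false;
        for (std::size_t j = 0; j < sparseCount; ++j) {
            if (sparse[j].x == m.x && sparse[j].y == m.y) {
                duplicate = true;
                break;
            }
        }
        if (!duplicate) {
            sparse.push_back(m);
        }
    }
    // Stable sort keeps sparse candidates ahead of merge candidates on ties.
    std::stable_sort(sparse.begin(), sparse.end(),
                     [](const Bvp& a, const Bvp& b) { return a.cost < b.cost; });
    if (sparse.size() > maxOut) {
        sparse.resize(maxOut);
    }
    return sparse;
}
```

The `fromMerge` flag is carried through so that a later refinement stage can choose a window size according to the origin of each candidate.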
[0179] BVPs in the updated sparse list 2411 are refined using a window whose size depends on the BVP type to generate the refined candidates list 2414. BVPs from the sparse searching in the regular TMP reference region use a 3x3 refinement window 2412 (sampling interval of 1 sample). Otherwise, BVPs included in the updated sparse list 2411 from the updated TMP merge list 2410, whether inside or outside the regular reference region, use a window size of 11x11 samples 2413 (sampling interval of 1 sample).
[0180] IntraTMP with merge candidates is described in K. Naser et al., “EE2-1.2: IntraTMP with merge candidates,” JVET-AG0151, January 2024, the content of which is herein incorporated by reference.
[0181] Another technique, known as auto-relocated block vector prediction (AR-BVP), may be used in the construction of the AMVP and / or merge IBC list. In some implementations, a guiding BV is selected from the BVP candidates in the AMVP and / or merge IBC list. A BV (referred to herein as a “coding BV”) pointing to a reference block of a block containing a position derived relative to a position pointed to by the guiding BV is identified. An AR-BVP candidate can be determined as the combination of the guiding BV and the identified coding BV. For example, the coding BV may be similar to a block vector difference (BVD) in AMVP IBC.
[0182] FIG. 25A shows an example of AR-BVP applied to IBC where the guiding BV (BV-1) 2501 is a BVP candidate in the IBC list. The guiding BV may be used to identify a first reference block (e.g., reference PU1), and five positions of the first reference block that are aligned with the five positions of the current block 2506 are also identified. Each of the five positions is checked to determine if a coding block containing the sample at the position is encoded / decoded using the IBC mode or IntraTMP mode. These positions correspond to the center of the block (CT) 2520, left-top (LT) 2521, right-top (RT) 2522, left-bottom (LB) 2523, and right-bottom (RB) 2524. In the case that AR-BVP is used in IBC and a block containing a sample at a position of at least one of the five positions (CT, LT, RT, LB, and RB) defined relative to the first reference block was encoded / decoded using IntraTMP or IBC mode, the block vector used to indicate the reference block of the first reference block is used as the coding BV to combine with the guiding BV.
[0183] FIG. 25A further shows one example of multiple candidate AR BVPs iteratively derived from the initial guiding BV 2501. Guiding BV 2501 points to the first reference block (PU1). The five positions of the first reference block PU1 correspond to the guiding BV 2501 applied to the five respective positions (CT, LT, RT, LB, and RB) of current block 1906. Each of the five positions of the first reference block may be checked to determine if a block at any of those positions was coded in an IBC or IntraTMP mode. For example, a block at the position CT of the first reference block may be determined to be coded using BV-CT1 2504. Then, BV-CT1 2504 may be applied to the first reference block to determine a second reference block (reference PU2), from which five positions of the second reference block may be checked to derive one or more AR-BVPs. For example, the first AR-BVP candidate (BVP-AR1) 2505 may be derived as the addition of the guiding BV 2501 and the coding BV (BV-CT1) 2504 used for the encoding / decoding of the block containing / at position CT of the first reference block. The second AR-BVP candidate, BV-AR2 2507, may be derived as the addition of the previous AR-BVP (BVP-AR1 2505) and a block vector BV-LT21 2506 derived from a block at the LT position of the second reference block (reference PU2). For example, each of the five positions of the second reference block (reference PU2) may be checked to determine if a block containing the sample at those respective positions was encoded / decoded in an IntraTMP or IBC mode. For example, the block at the LT position may satisfy the condition and may have been coded using BV-LT21 2506, which points to a third reference block (reference PU3).
[0184] This cascading process may be iterated multiple times, with the number of iterations referred to as a number of hops. In some examples, the cascading process may be constrained to one hop. For example, BVP-AR1 2505 may be added as a candidate AR BVP derived from BVP-1 2501, but BVP-AR2 2507, which corresponds to a second hop, would not be determined and added as a second candidate AR BVP.
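The hop-limited cascade can be sketched in C++ as below. All names (`ArBv`, `ArPos`, `CodingBvLookup`, `deriveArBvp`) are hypothetical. The lookup callback stands in for the check of whether the block covering a sample was coded in IBC or IntraTMP mode and, if so, returns its BV. The sketch follows only the first qualifying position at each hop, which is a simplifying assumption; validity checks on the resulting vectors are omitted.

```cpp
#include <functional>
#include <vector>

struct ArBv {
    int x;
    int y;
};

struct ArPos {
    int x;
    int y;
};

// Hypothetical lookup: returns true and fills `bv` if the block covering
// sample (x, y) was coded in IBC or IntraTMP mode, false otherwise.
using CodingBvLookup = std::function<bool(int x, int y, ArBv& bv)>;

// Derives up to `maxHops` AR-BVP candidates from one guiding BV.
// `positions` holds the checked positions of the current block (e.g., CT, LT, RT, LB, RB).
std::vector<ArBv> deriveArBvp(const std::vector<ArPos>& positions, ArBv guiding,
                              const CodingBvLookup& codingBvAt, int maxHops) {
    std::vector<ArBv> out;
    ArBv base = guiding;  // vector from the current block to the latest reference block
    for (int hop = 0; hop < maxHops; ++hop) {
        bool found = false;
        for (const ArPos& p : positions) {
            ArBv coding{0, 0};
            if (!codingBvAt(p.x + base.x, p.y + base.y, coding)) {
                continue;  // block at this position was not IBC / IntraTMP coded
            }
            base = {base.x + coding.x, base.y + coding.y};  // guiding BV + coding BV
            out.push_back(base);
            found = true;
            break;  // simplification: follow only the first hit at each hop
        }
        if (!found) {
            break;  // the chain cannot be extended further
        }
    }
    return out;
}
```

Calling the function with `maxHops` set to 1 corresponds to the one-hop constraint, in which only the first AR-BVP candidate is derived from a given guiding BV.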
[0185] The AR-BVP candidates 2505 (and 2507) may be included in the AMVP and / or merge lists after the spatial adjacent candidates. In some examples, AR-BVP candidates 2505 (and 2507) may be included in the AMVP and / or merge lists after the spatial non-adjacent candidates. In some examples, AR-BVP candidates may be included in the AMVP and / or merge lists after the HMVP candidates.
[0186] In some examples, the AR-BVP technique may be implemented for IntraTMP, in which case it is referred to as IntraTMP AR-BVP merge. This technique uses the BVPs in the IntraTMP merge list as guiding BVs for the AR-BVP process.
[0187] In some examples, the TMP AR-BVP candidates, 2505 and 2507, may be included in the TMP merge lists after the adjacent and non-adjacent merge candidates. In some examples, TMP AR-BVP candidates 2505 and 2507 and the adjacent and the non-adjacent merge candidates may be sorted in an ascending order of the TMP costs within the TMP merge list. In some embodiments, the merge candidates (adjacent and non-adjacent candidates) and the AR-BVP candidates outside the TMP reference region may be included first within the TMP merge list.
[0188] In some examples, the refinement window for the AR-BVP candidates may have a different window size than that of the TMP merge candidates and the TMP sparse candidates. FIG. 25B shows examples of refinement windows for AR-BVP candidates derived from different positions associated with the initial guiding BV 2501, according to some embodiments. The first AR-BVP candidate (BVP-AR1) 2503 is derived as the addition of the guiding BV 2501 and the coding BV (BV-CT1) 2502 used for the encoding / decoding of a block containing the central position (CT) of the reference block (reference PU). Since the BVs of a block are typically referenced with respect to the upper left corner of the block, the guiding BV 2501 is illustrated as starting from the upper left corner of the current block 1906. The BV associated with the block containing the central position CT, BV-CT1 2502, is shifted based on the guiding BV 2501 so that the addition of these two BVs can generate AR-BVP candidate BVP-AR1 2503. Similarly, another AR-BVP candidate, BV-RT1 2505, is derived based on the BV associated with a block containing the sample at the RT position of the reference block. An AR-BVP candidate may be refined with a refinement window 2514. The specific refinement window 2514 for a given AR-BVP candidate is set according to the AR-BVP candidate. For example, positions 2533 and 2535 pointed to by BVP-AR1 2503 and BVP-AR2 2505, respectively, are used to determine the respective refinement windows.
[0189] The following source code represents an example implementation for obtaining the five positions relative to a guiding BV and determining if a block at each of the five positions is coded based on IBC or IntraTMP mode:
    const PredictionUnit& pu = *cs.getPU(area.pos(), CHANNEL_TYPE_LUMA);  // pu is the top left corner of the CB
    // posCand indicates the 5 offset positions related to the CB. The pu position is equal to topLeft(), not center()
    Position posCand[5] = { pu.Y().center(), pu.Y().topLeft(), pu.Y().topRight(), pu.Y().bottomLeft(), pu.Y().bottomRight() };
    // EXAMPLE
    //   pu: {PU: x = 32, y = 24, width = 8, height = 8}
    //   posCand = {{x=36 y=28}, {x=32 y=24}, {x=39 y=24}, {x=32 y=31}, {x=39 y=31}}
    // Loop FOR to test the five positions around a first IBC Merge / AMVP candidate or TMP merge candidate
    // bvBasedMergeCandidatesITMP_AR is the list of TMP merge candidates
    for (int mergeIndex = 0; (mergeIndex < 25) && (bvBasedMergeCandidatesITMP_AR.size() < totalNum); mergeIndex++)
    {
      cMv_Sparse = SparseCandidatesITMP[mergeIndex];
      offsetX = cMv_Sparse.m_pX;
      offsetY = cMv_Sparse.m_pY;
      cMv = Mv(offsetX, offsetY);  // This is the merge candidate to use as base for the AR-BVP
      for (int n = 0; n < 5 && bvBasedMergeCandidatesITMP_AR.size() < totalNum; n++)  // Check the 5 positions around the Merge BV
      {
        // puCascaded is the merge candidate + one of the offsets (CT, TL, TR, BL, BR)
        const PredictionUnit* puCascaded = pu.cs->getPURestricted(posCand[n].offset(offsetX, offsetY), pu, pu.chType);
        if (!puCascaded || ((puCascaded->cu->predMode != MODE_IBC) && (!puCascaded->cu->tmpFlag)))
        {
          continue;  // If the block in that position was not encoded with IBC or TMP, skip it and go to the next among the 5
        }
        Mv arbv = cMv + puCascaded->bv;  // Build the AR-BVP candidate
        // Check if reference block pointed out for the AR-BVP is already decoded
        if (PU::validItmpBv(pu, arbv.hor, arbv.ver))
        {
          if (!PU::CheckBvAvailable(bvBasedMergeCandidatesITMP, arbv) && !PU::CheckBvAvailable(bvBasedMergeCandidatesITMP_AR, arbv))
          {
            // If the AR-BVP is valid and it is not already in the TMP merge list, it is added at the end of the TMP merge list
            bvBasedMergeCandidatesITMP_AR.push_back(arbv);
            if (bvBasedMergeCandidatesITMP_AR.size() >= totalNum)
            {
              break;  // If the list is complete, finish the TMP AR-BVP process
            }
          }
        }
      }
    }
[0190] FIG. 25C shows one example of IntraTMP with BVP candidates derived using IntraTMP AR-BVP merge as discussed above with respect to FIGS. 25A and 25B. In FIG. 25C, TMP merge candidate BVP-M1 2531, pointing to a location outside the TMP reference region, is used as a guiding BV to derive two AR-BVP candidates, BV-AR1 2533 and BV-AR2 2535. Likewise, TMP merge candidate BVP-M2 2536 is used as a guiding BV to derive a new AR-BVP candidate BV-AR3 2538.
[0191] FIG. 25D shows the IntraTMP with merge candidates process illustrated in FIG. 24, modified to include AR-BVP candidates in the merge list. In the illustrated process 2540, the blocks 2402, 2403, 2404, 2405, 2406, 2409, 2410, 2420, 2411, 2412, 2413, and 2414 function similarly to the correspondingly numbered blocks shown and described in relation to FIG. 24. Additionally, at blocks 2546 and 2544, sparse AR-BVP candidates and merge AR-BVP candidates, respectively, are determined. The merge AR-BVP candidates 2544 and sparse AR-BVP candidates 2546 are added to the initial merge list 2542, which comprises the adjacent merge BVP candidates 2405 and non-adjacent merge BVP candidates 2406.
[0192] At block 2548, reordering and pruning is performed on the merge list, which may include one or more of each of adjacent merge BVP candidates, non-adjacent merge BVP candidates, merge AR-BVP candidates, and sparse AR-BVP candidates. The reordering and pruning may be performed in a manner similar to that described in relation to block 2407 in FIG. 24, to generate the merge list 2550. As with the merge list 2408, merge list 2550 may be limited to a predetermined number of candidates (e.g., 10).
[0193] Subsequently, each BVP candidate in the updated sparse list 2411, which in process 2540 may include AR-BVP candidates in addition to other sparse BVP candidates and adjacent or non-adjacent merge BVP candidates, is subjected to a refinement search whose window size depends on whether the entry is a merge BVP candidate or an AR-BVP candidate. BVP candidates that are not merge candidates are assigned a refinement search window 2412 of size 3x3, BVP candidates that are merge candidates but not AR-BVP candidates are assigned a refinement search window 2413 of size 11x11, and BVP candidates that are merge candidates and AR-BVP candidates are assigned a refinement search window 2552 of size 5x5. The refinement search may be performed similarly to the refinement search described in relation to FIG. 24, to generate the refined list 2414 of BVP candidates.
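The window assignment described above amounts to a small lookup. The following is an illustrative sketch only; `CandidateKind` and `refinementWindowSize` are hypothetical names that do not appear in the disclosure, and the function returns the side length of the square window.

```cpp
// Hypothetical classification of one entry in the updated sparse list.
struct CandidateKind {
  bool isMerge;  // entry is a merge BVP candidate
  bool isArBvp;  // entry is an AR-BVP candidate
};

// Side length of the square refinement search window for one entry:
// 3 (3x3) for entries that are not merge candidates, 11 (11x11) for merge
// candidates that are not AR-BVP candidates, and 5 (5x5) for merge
// candidates that are also AR-BVP candidates.
inline int refinementWindowSize(CandidateKind kind) {
  if (!kind.isMerge) {
    return 3;
  }
  return kind.isArBvp ? 5 : 11;
}
```

In this sketch an entry that is not a merge candidate always receives the 3x3 window, regardless of the AR-BVP flag, which follows the order in which the three cases are stated above.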
[0194] The merge list, such as the merge lists produced by the processes shown in FIG. 24 and FIG. 25D, may accumulate a large number of BVP candidates, including those from TMP fusion, TMP non-fusion, IBC, DIMD, and their derived auto-relocated BVPs (AR-BVP). This unfiltered accumulation significantly increases the computational burden of reordering, pruning, clustering, and removing redundant candidates when the sparse and merge lists are merged.
[0195] BVP candidates in the merge lists of FIG. 24 and FIG. 25D lack differentiation regarding their origin (e.g., TMP, IBC, or DIMD). This indistinct categorization complicates the ranking, pruning, clustering, and redundancy removal processes when merging the sparse and merge lists.
[0196] To solve this problem of excessive time being consumed in the reordering and pruning of the merge list, embodiments of this disclosure provide for the filtering of merge candidates using a threshold from TMP cost. The filtering process, based on a threshold derived from the TMP cost, selectively eliminates less significant candidates from the merge list and improves overall efficiency.
[0197] The filtering process in example embodiments is incorporated into the TMP merge list generation workflow, as shown in FIG. 26 and FIG. 27. This integration ensures that only the most relevant BVP candidates proceed to the next stage of processing.
[0198] FIG. 26 shows the process shown in FIG. 25D, modified to include a filtering operation 2602, according to some embodiments of this disclosure. The filtering operation 2602 in process 2600 may be performed on the initial merge list 2542 before the AR-BVP candidates are added (2546). Thus, the filtering operation 2602 enables removing less significant adjacent or non-adjacent BVP candidates from the merge list prior to adding AR-BVP and other BVP candidates. Alternatively, the filtering operation 2602 may be performed after the AR-BVP candidates 2544 and other BVP candidates are added to the merge list but before the reordering and pruning 2549 is performed on the merge list to generate merge list 2550. As noted above, performing the filtering operation 2602 before the reordering and pruning 2549 can significantly reduce the time consumed in the reordering and pruning by eliminating BVP candidates that are highly unlikely to be selected.
[0199] FIG. 27 shows example filtering operations being applied to an initial merge list according to some embodiments. The list 2710 on the right is the merge list under construction for the current block, and the list 2700 on the left is an example TMP list of a neighboring CU (e.g., previously-reconstructed neighboring block of the current block). The list 2700 indicates, for each candidate, an index 2701 of the location in the merge list, the type 2702 of the BVP candidate and coordinates 2703 of the candidate. The list 2710 indicates, for each candidate, an index 2711 of the location in the merge list, the type 2712 of the BVP candidate and coordinates 2713 of the candidate.
[0200] In some embodiments, candidates from neighboring blocks (e.g., one or more fusion candidates 11-15 and / or one or more TMP candidates 1-3, such as shown in list 2700 from a neighboring CU) may be added to an initial merge list (e.g., initial merge list 2542), which is not shown in FIG. 27. Some embodiments may add more than one candidate from the neighboring block to the initial merge list, and may “tag” each candidate with a type (e.g., BVP-TMP, BVP-TMP fusion, IBC, etc., as, for example, shown in 2712 in list 2710). Some embodiments may also “tag” each candidate with the neighboring block from which it was obtained (e.g., with tags A1, A2, NA7 referring to adjacent (A) and non-adjacent (NA) neighboring blocks, as, for example, also shown in 2712 in list 2710).
[0201] For example, block 2705 collects the candidates’ information (e.g., tag) and tests the first three candidates of the list 2700 against a threshold, and block 2708 adds the candidates that have a cost less than the threshold to the merge list 2710. Note that the test at block 2705 is performed on candidates in the initial merge list. The TMP cost of the lowest cost candidate 11 with respect to the neighboring block may be used to filter candidates 12-15, associated with the fusion group indicated by a TMP fusion index, according to TMP costs of candidates 12-15 calculated with respect to the current template of the current block. For example, a TMP cost of a candidate, such as candidate 12, is based on a difference between the current template and a reference template of a reference block indicated by the candidate. The list 2700 was previously reordered according to TMP costs of the neighboring block, so the stored group of BVP candidates 11-15 will also be in order. In the illustrated example, all three of the first three BVP candidates in list 2700 have costs less than the threshold, and therefore, all three of those entries are added to the merge list 2710. More particularly, these BVP candidates are added to the initial merge list and, after blocks 2705 and 2708, are subsequently included at the end of the merge list 2710.
[0202] In block 2706, the information from the TMP fusion group, which was selected for the neighbor block, is tested to determine which entries (BV candidates) of that fusion group will be included in the merge list 2710 for the current block.
[0203] As another example, the fusion group 2 is filtered at 2708, and only one of the BVP candidates in the fusion group 2 (as shown in block 2706) is added to the merge list 2710.
[0204] Thus, candidates from TMP lists (e.g., list 2700) of neighboring blocks are added to the initial merge list, which is then filtered (and pruned) at block 2708 to obtain the merge list 2710. Note that it is possible that only one candidate is obtained from each neighboring block (e.g., one candidate 11 from TMP fusion group 2 used to code a neighboring block), and each of these candidates may be filtered according to a threshold obtained from a sparse list of the current block (e.g., the highest TMP cost among the costs of candidates in the sparse list).
[0205] The filtering block 2708 applies a filter to potential candidates from the list 2700 so that only candidates with a TMP cost less than a threshold are added to the merge list 2710 of the current block. It is possible that the entire list or a portion of the list, e.g., from idx 1 to the last idx of the last candidate in a TMP fusion group used by the neighboring block, may be stored. However, in some embodiments, the entire list 2700 is not stored for the neighboring block. In some examples, if the neighboring block is coded with TMP fusion idx 2, then the set of BV candidates (e.g., 11-15) is stored, so only these BV candidates 11-15 are filtered.
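The tagging and threshold test described for FIG. 27 can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: `TaggedCandidate` and `appendBelowThreshold` are hypothetical names, the tags are stored as plain strings, and each `cost` is assumed to have already been computed against the current template of the current block.

```cpp
#include <string>
#include <vector>

// Hypothetical merge-list entry carrying the two tags described above.
struct TaggedCandidate {
  int bvX;
  int bvY;
  std::string type;    // e.g. "BVP-TMP", "BVP-TMP fusion", "IBC"
  std::string source;  // neighboring block it came from, e.g. "A1", "NA7"
  long cost;           // TMP cost with respect to the current template
};

// Appends to the end of mergeList every potential candidate whose TMP cost
// is strictly less than the threshold; the others are discarded.
inline void appendBelowThreshold(const std::vector<TaggedCandidate>& potential,
                                 long threshold,
                                 std::vector<TaggedCandidate>& mergeList) {
  for (const TaggedCandidate& c : potential) {
    if (c.cost < threshold) {
      mergeList.push_back(c);
    }
  }
}
```

Keeping the tags on each entry is what later allows the fusion and non-fusion entries to be filtered against different thresholds.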
[0206] FIGS. 28A-B and FIG. 29 illustrate three examples of generating a threshold and using the generated threshold to filter candidates during construction of the merge TMP list.
[0207] FIG. 28A shows an example process 2800 for filtering fusion BVP candidates. FIG. 28A shows how candidates from TMP fusion mode are evaluated and filtered based on TMP cost thresholds. The TMP cost of the candidate at the first index of the TMP fusion group of the neighboring block is used as the threshold.
[0208] Note that BVP candidates may be obtained from neighboring blocks and added to an initial merge list (e.g., such as list 2707 in FIG. 27) of the current block, with each candidate / entry tagged with information (e.g., candidate type, neighbor candidate obtained from, etc.). Then each candidate in the initial merge list may be filtered according to its type (e.g., indicated by “tag”). At 2801, the tag or type of the BV candidates from the neighbor block is obtained, and at 2802, an entry from the initial merge list is obtained.
[0209] At 2803 it is determined whether the candidate is a fusion candidate, as, for example, indicated by a corresponding fusion flag associated with the candidate. If not a fusion candidate, the candidate is added to the merge list (e.g., list 2710 in FIG. 27) at 2808.
[0210] If at 2803 it is determined that the candidate is a fusion candidate, then the candidate's TMP cost is computed at 2810. At 2804 it is determined whether the candidate is the candidate at the lowest index in its fusion group. If it is the candidate with the lowest index in its fusion group, then at 2805, the TMP cost of that candidate is set as the threshold. It should be noted that this is not the TMP cost computed for the neighboring block, because the current block does not have access to the previously computed TMP cost. This is the TMP cost computed between the reference template indicated by the candidate and the current template of the current block.
[0211] At 2806, the candidate's TMP cost is compared to the threshold and, if it is lower, the candidate is added to the merge list at 2808. Alternatively, if it is determined at 2806 that the candidate's TMP cost is not less than the threshold, then at 2807 the candidate is removed and is not included in the merge list.
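The flow of process 2800 can be sketched as a single pass over the initial merge list. This is a minimal sketch under several assumptions that the text does not settle: the names `Entry`, `CostFn`, and `filterFusionCandidates` are hypothetical; the candidate at the lowest index of a fusion group is kept because it defines the threshold; and a fusion candidate whose group has no threshold yet is kept unfiltered. The cost callback stands in for the TMP cost computed against the current template of the current block.

```cpp
#include <functional>
#include <map>
#include <vector>

// Hypothetical entry of the initial merge list.
struct Entry {
  int bvX;
  int bvY;
  bool isFusion;     // fusion flag tested at 2803
  int fusionGroup;   // TMP fusion index of the neighboring block
  int indexInGroup;  // 0 for the lowest index in the fusion group
};

// Returns the TMP cost of an entry with respect to the current template.
using CostFn = std::function<long(const Entry&)>;

inline std::vector<Entry> filterFusionCandidates(
    const std::vector<Entry>& initialList, const CostFn& tmpCost) {
  std::vector<Entry> mergeList;
  std::map<int, long> thresholdOfGroup;
  for (const Entry& e : initialList) {
    if (!e.isFusion) {  // 2803: not a fusion candidate, add at 2808
      mergeList.push_back(e);
      continue;
    }
    const long cost = tmpCost(e);  // 2810
    if (e.indexInGroup == 0) {     // 2804
      thresholdOfGroup[e.fusionGroup] = cost;  // 2805
      mergeList.push_back(e);  // assumption: the threshold setter is kept
      continue;
    }
    const auto it = thresholdOfGroup.find(e.fusionGroup);
    // 2806: keep when strictly below the threshold (or, by assumption,
    // when no threshold exists for the group); otherwise drop (2807).
    if (it == thresholdOfGroup.end() || cost < it->second) {
      mergeList.push_back(e);  // 2808
    }
  }
  return mergeList;
}
```

A per-group map is used so that entries from several neighboring blocks, each with its own fusion group, can share one pass; a single running threshold would serve equally well if only one group is present.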
[0212] FIG. 28B illustrates a similar process 2820 for non-fusion TMP candidates using the first index of the TMP list of the neighboring block as a threshold.
[0213] As noted above, BVP candidates may be obtained from neighboring blocks and added to an initial merge list (e.g., such as list 2707 in FIG. 27) of the current block with tag information for each added candidate. At 2811, the tag or type of the BV candidates from the neighboring block is obtained, and at 2812, a candidate from the initial merge list is obtained.
[0214] At 2813, it is determined whether the candidate is a TMP candidate and not a fusion candidate. If the candidate is either not a TMP candidate or is a fusion candidate, the candidate is included in the merge list at 2818.
[0215] If at 2813 it is determined that the candidate is a TMP candidate and is not a fusion candidate, then at 2814 it is determined whether the candidate is at the first index in a TMP list. If the candidate is at the first index in a TMP list, then at 2815 the candidate’s TMP cost is set as the threshold.
[0216] At 2816, the candidate’s TMP cost is compared to the threshold. If the TMP cost is less than the threshold, then the candidate is included in the merge list at 2818. Otherwise, the candidate is removed at 2817 and is not included in the merge list.
[0217] FIG. 28C illustrates another process 2820 for filtering Merge BV candidates using the TMP cost of the last index of the Sparse TMP list as the threshold, according to some embodiments of this disclosure.
[0218] As described in relation to FIG. 24, at 2404, the sparse list is built. The cost of the last candidate in the sparse list (the highest cost in the list) is determined as a threshold at 2815. The cost may be stored or recorded at 2815. Also as described above, an initial merge list is generated at 2542.
[0219] At 2816, the merge candidate’s TMP cost that was computed at 2890 is compared to the threshold determined in 2815. In one embodiment, the threshold may be scaled by a predetermined value K. If the merge candidate’s TMP cost is less than the threshold or the scaled threshold, then the candidate is added to the merge list at 2818, and this candidate may be used as a guide candidate for the ARBVP list construction. Otherwise, the candidate is removed from the merge list at 2817.
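The sparse-list threshold and the optional scaling can be sketched as follows. The names `CostedBv`, `sparseThreshold`, and `filterMergeCandidates` are hypothetical, and applying K as a multiplicative factor is an assumption, since the text only states that the threshold may be scaled by a predetermined value K. The sketch takes the maximum cost rather than the last entry so that it does not depend on the sparse list being sorted.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical block vector with its TMP cost for the current block.
struct CostedBv {
  int x;
  int y;
  long cost;
};

// Threshold taken from the highest TMP cost in the sparse list, optionally
// scaled by k (assumed to be a multiplicative factor).
inline long sparseThreshold(const std::vector<CostedBv>& sparseList,
                            double k = 1.0) {
  assert(!sparseList.empty());
  const auto worst = std::max_element(
      sparseList.begin(), sparseList.end(),
      [](const CostedBv& a, const CostedBv& b) { return a.cost < b.cost; });
  return static_cast<long>(static_cast<double>(worst->cost) * k);
}

// Keeps the merge candidates whose TMP cost is strictly below the threshold.
inline std::vector<CostedBv> filterMergeCandidates(
    const std::vector<CostedBv>& initialMergeList, long threshold) {
  std::vector<CostedBv> kept;
  std::copy_if(initialMergeList.begin(), initialMergeList.end(),
               std::back_inserter(kept),
               [threshold](const CostedBv& c) { return c.cost < threshold; });
  return kept;
}
```

The candidates returned by `filterMergeCandidates` are the ones that would remain available as guide candidates for the ARBVP list construction.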
[0220] In one embodiment, the refinement windows of the merge candidates and the sparse candidates are analyzed. If two refinement windows overlap, they are clustered, and only the candidate with the lower TMP cost, whether merge or sparse, is recorded in the sparse list. The merge BV filtering removes the merge candidates with a cost higher than that of the last sparse candidate in the list, decreases encoder and decoder complexity by reducing the number of candidates in the ARBVP list construction, and consequently relieves the clustering process.
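The overlap test behind this clustering can be sketched for square windows. The sketch assumes, beyond what the text states, that each refinement window is a square of odd side length centered on the candidate's BV; `RefineCand`, `windowsOverlap`, and `survivor` are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical candidate with its refinement window and TMP cost.
struct RefineCand {
  int x;       // BV horizontal component (window center)
  int y;       // BV vertical component (window center)
  int window;  // side length of the square window, e.g. 3, 5, or 11
  long cost;   // TMP cost for the current block
};

// Two centered square windows overlap when the center distance along each
// axis does not exceed the sum of their half-widths.
inline bool windowsOverlap(const RefineCand& a, const RefineCand& b) {
  assert(a.window > 0 && b.window > 0);
  const int reach = a.window / 2 + b.window / 2;
  return std::abs(a.x - b.x) <= reach && std::abs(a.y - b.y) <= reach;
}

// When two windows are clustered, the candidate with the lower TMP cost is
// the one recorded; ties keep the first argument.
inline RefineCand survivor(const RefineCand& a, const RefineCand& b) {
  return (b.cost < a.cost) ? b : a;
}
```

For example, an 11x11 window centered at (0, 0) reaches column 5, and a 3x3 window centered at (6, 0) starts at column 5, so the two are treated as overlapping; moving the second center to (7, 0) separates them.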
[0221] In the embodiment shown, the BV filtering may be applied after the ARBVP list construction 2544, which does not impact the ARBVP list construction but simplifies the clustering process. In some embodiments, the BV filtering is applied to the initial merge list before ARBVP list construction.
[0222] FIG. 29 illustrates another example of generating a threshold and using the generated threshold to filter the merge TMP list. Process 2900, at 2901, obtains the tag or type for BV candidates from a neighbor block, and at 2902, obtains an entry from the initial merge list.
[0223] At 2903, it is determined whether the candidate's TMP flag is set. If the flag is set, then at 2904, the TMP list is reordered according to TMP cost, and at 2905, the highest TMP cost is selected as the threshold for candidates.
[0224] At 2906, the candidate's TMP cost is compared to the threshold. If the TMP cost is not less than the threshold, the candidate is removed at 2907, and at 2908, the merge list is updated. Otherwise, if the candidate's TMP cost is less than the threshold, the candidate is included in the final merge list.
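Steps 2904 through 2908 of process 2900 can be sketched as two small helpers. The names `ListEntry`, `thresholdFromTmpList`, and `removeNotBelow` are hypothetical, and the costs are assumed to be TMP costs already computed for the current block.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical list entry: index, BV coordinates, and TMP cost.
struct ListEntry {
  int idx;
  int bvX;
  int bvY;
  long cost;
};

// 2904-2905: reorder the TMP-flag-tagged list by ascending TMP cost and
// return the highest cost, which serves as the threshold.
inline long thresholdFromTmpList(std::vector<ListEntry>& tmpTagged) {
  assert(!tmpTagged.empty());
  std::sort(tmpTagged.begin(), tmpTagged.end(),
            [](const ListEntry& a, const ListEntry& b) {
              return a.cost < b.cost;
            });
  return tmpTagged.back().cost;
}

// 2906-2908: remove every entry whose TMP cost is not less than the
// threshold and leave the remaining entries in their original order.
inline void removeNotBelow(std::vector<ListEntry>& list, long threshold) {
  list.erase(std::remove_if(list.begin(), list.end(),
                            [threshold](const ListEntry& e) {
                              return !(e.cost < threshold);
                            }),
             list.end());
}
```

Because the comparison is strict, applying `removeNotBelow` to the TMP-flag-tagged list itself would also drop the entry that set the threshold; the sketch leaves that choice to the caller.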
[0225] An example TMP merge and / or ARBVP list of the current block with the TMP flag tag 2910 and an example TMP merge and / or ARBVP list of the current block with the other tag 2920 are shown. Each list includes an index, BVP type, and BVP coordinate for each candidate. Process 2900 enables, for example, selecting the highest TMP cost in the TMP flag tagged list as the threshold and filtering candidates in both the lists 2910 and 2920 based on the selected threshold. In some implementations, process 2900 can be used in relation to two other lists of BVP candidates, such as, for example, the sparse list and the merge list.
[0226] FIG. 30 illustrates a flowchart 3000 of a process for encoding a current block using IntraTMP by filtering the merge list based on TMP cost, according to some embodiments. In some examples, the process of flowchart 3000 may be performed by an encoder (e.g., encoder 200 of FIG. 2).
[0227] The process of flowchart 3000 may begin at block 3002. At block 3002, a first set of candidates and a second set of candidates for predicting the current block are obtained using vector information (e.g., block vector) of reconstructed neighboring blocks (e.g., adjacent, non-adjacent, history-based candidates, etc.).
[0228] At least the first set of candidates is located in a plurality of search regions that subdivide a reference region determined in accordance with at least the location of the current block. The candidates of the second set of candidates may belong to the reference region or may be outside the reference region.
[0229] The first set of candidates may comprise candidates determined according to IntraTMP (e.g., based on performing template matching), and the second set of candidates may comprise merge candidates determined according to IntraTMP with merge candidates. The second set of candidates may further include candidates determined according to IBC, DIMD, auto-relocated block vector prediction (AR-BVP), TIMD, IntraTMP with fusion, or other such candidate determining techniques. In an example, the first set of candidates is obtained from a sparse list (e.g., such as 2404 in FIG. 27) and the second set of candidates is from the initial merge list (e.g., such as 2542 in FIG. 27).
[0230] At block 3004, filtering of the second set of candidates is performed based on template matching costs of at least one of the first set of candidates or the second set of candidates.
[0231] In some implementations, the filtering is determined separately for template matching fusion candidates in the second set of candidates and non-fusion candidates in the second set of candidates. The filtering may be determined using separately determined TMP cost thresholds for template matching fusion candidates in the second set of candidates and non-fusion candidates in the second set of candidates.
[0232] In some implementations, the filtering comprises separately filtering each of two or more subsets of the second set of candidates, wherein each of the subsets comprises a different set of BVP candidate types.
[0233] In some implementations, the filtering comprises filtering the second set of candidates based on a template matching cost of a candidate in the first set of candidates. Examples of filtering the initial merge list according to some embodiments are described in relation to FIGS. 27-29.
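Filtering subsets separately, each against its own threshold, can be sketched with a per-type lookup. `TypedCand` and `filterPerType` are hypothetical names, the type labels are illustrative strings, and letting candidates of a type without a threshold pass through unfiltered is an assumption of this sketch.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical candidate tagged with its BVP type and TMP cost.
struct TypedCand {
  std::string type;  // e.g. "BVP-TMP fusion", "BVP-TMP", "IBC"
  long cost;         // TMP cost for the current block
};

// Filters each candidate against the threshold registered for its type.
// Candidates whose type has no registered threshold are kept (assumption).
inline std::vector<TypedCand> filterPerType(
    const std::vector<TypedCand>& candidates,
    const std::map<std::string, long>& thresholdOfType) {
  std::vector<TypedCand> kept;
  for (const TypedCand& c : candidates) {
    const auto it = thresholdOfType.find(c.type);
    if (it == thresholdOfType.end() || c.cost < it->second) {
      kept.push_back(c);
    }
  }
  return kept;
}
```

With a threshold of 100 for fusion candidates and 50 for non-fusion TMP candidates, a fusion candidate of cost 90 is kept while a non-fusion candidate of cost 60 is removed, even though 60 is the lower cost.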
[0234] In some embodiments, the second set of candidates includes BVP candidates in a TMP fusion group indicated by an index obtained from a (previously-reconstructed) neighboring block coded in a TMP fusion mode, where the second set of candidates are filtered based on a template matching cost of a BVP candidate with the lowest index in the TMP fusion group of the BVP candidates. For example, when a neighboring block is coded in TMP fusion and has a TMP fusion index 1, there is a list of BVPs such as, for example, BVP1 (first in the group), BVP2, BVP3. These BVPs are in order of TMP cost for the neighboring block. The TMP cost of BVP1 for the current block is computed and used to filter the other BVPs, BVP2 and BVP3, based on TMP costs of BVP2 and BVP3 computed with respect to the current block (e.g., difference between current template and reference template of a candidate reference block indicated by BVP2). Note that the reference block is at location of BVP2 + position of current block (top left corner / sample).
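The TMP cost of a BVP for the current block can be sketched as a sum of absolute differences over a template. This is an illustration under assumptions not stated in this passage: the difference measure is SAD, the template is an L-shaped strip of thickness `t` above and to the left of the block (including the top-left corner), and `Frame` and `templateCost` are hypothetical names. The reference block position is the current block position plus the BV, as noted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical reconstructed picture, stored row-major.
struct Frame {
  int width;
  int height;
  std::vector<std::uint8_t> samples;
  int at(int x, int y) const {
    return samples[static_cast<std::size_t>(y * width + x)];
  }
};

// SAD between the current template and the reference template of the block
// located at (curX + bvX, curY + bvY).
inline long templateCost(const Frame& f, int curX, int curY, int blkW,
                         int blkH, int bvX, int bvY, int t) {
  const int refX = curX + bvX;  // reference block = BV + current position
  const int refY = curY + bvY;
  assert(curX >= t && curY >= t && refX >= t && refY >= t);
  assert(curX + blkW <= f.width && curY + blkH <= f.height);
  assert(refX + blkW <= f.width && refY + blkH <= f.height);
  long sad = 0;
  // Strip above the block, including the top-left corner.
  for (int dy = -t; dy < 0; ++dy) {
    for (int dx = -t; dx < blkW; ++dx) {
      sad += std::abs(f.at(curX + dx, curY + dy) - f.at(refX + dx, refY + dy));
    }
  }
  // Strip to the left of the block.
  for (int dy = 0; dy < blkH; ++dy) {
    for (int dx = -t; dx < 0; ++dx) {
      sad += std::abs(f.at(curX + dx, curY + dy) - f.at(refX + dx, refY + dy));
    }
  }
  return sad;
}
```

Computing this cost once for BVP1 gives the threshold, and computing it for BVP2 and BVP3 gives the values compared against that threshold.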
[0235] In some embodiments, the second set of candidates may include one or more third BVP candidates obtained from a neighboring block coded using TMP. The third BVP candidates may be associated with the lowest TMP cost based on the neighboring block. The filtering may include filtering the third candidates based on a template matching cost of a candidate in the first set of candidates. For example, the 3 TMP candidates with the lowest indexes in a candidates list for the neighboring block may be obtained. Then these 3 candidates may be filtered based on a threshold determined according to TMP cost of the first candidates, with the TMP cost being between a reference template associated with a first candidate and the current template of the current block. For example, the threshold may be set to correspond to the highest TMP cost of TMP costs of the first set of candidates. These TMP costs are calculated with respect to the current template of the current block.
[0236] At block 3006, based on a template matching search of a plurality of refine windows corresponding to at least a subset of candidates of the first set of candidates and the filtered second set of candidates, a candidate for predicting the current block is selected. Optionally, reordering and pruning of the filtered second set of candidates may be performed before selecting the candidate. For example, in process 2600 shown in FIG. 26, a reordering and pruning 2549 is performed to obtain the merge list that is subsequently used to select the candidate. Because of the filtering based on TMP costs performed at block 3004 before the reordering and pruning, substantial improvements in the time consumed for the reordering and pruning operation can be observed.
[0237] At block 3008, the current block is encoded based on the selected candidate.
[0238] FIG. 31 illustrates a flowchart 3100 of a process for decoding a current block using IntraTMP by filtering the merge list based on TMP cost, according to some embodiments. In some examples, the process of flowchart 3100 may be performed by a decoder (e.g., decoder 300 of FIG. 3).
[0239] The process of flowchart 3100 may begin at block 3102. At block 3102, a first set of candidates and a second set of candidates for predicting the current block are obtained using vector information (e.g., block vector) of reconstructed neighboring blocks (e.g., adjacent, non-adjacent, history-based candidates, etc.).
[0240] At least the first set of candidates is located in a plurality of search regions that subdivide a reference region determined in accordance with at least the location of the current block. The candidates of the second set of candidates may belong to the reference region or may be outside the reference region.
[0241] The first set of candidates may comprise candidates determined according to IntraTMP (e.g., based on performing template matching), and the second set of candidates may comprise merge candidates determined according to IntraTMP with merge candidates. The second set of candidates may further include candidates determined according to IBC, DIMD, auto-relocated block vector prediction (AR-BVP), TIMD, IntraTMP with fusion, or other such candidate determining techniques. In an example, the first set of candidates is obtained from a sparse list (e.g., such as 2404 in FIG. 27) and the second set of candidates is from the initial merge list (e.g., such as 2542 in FIG. 27).
[0242] At block 3104, filtering of the second set of candidates is performed based on template matching costs of at least one of the first set of candidates or the second set of candidates.
[0243] In some implementations, the filtering is determined separately for template matching fusion candidates in the second set of candidates and non-fusion candidates in the second set of candidates. The filtering may be determined using separately determined TMP cost thresholds for template matching fusion candidates in the second set of candidates and non-fusion candidates in the second set of candidates.
[0244] In some implementations, the filtering comprises separately filtering each of two or more subsets of the second set of candidates, wherein each of the subsets comprises a different set of BVP candidate types.
[0245] In some implementations, the filtering comprises filtering the second set of candidates based on a template matching cost of a candidate in the first set of candidates. Examples of filtering the initial merge list according to some embodiments are described in relation to FIGS. 27-29.
[0246] In some embodiments, the second set of candidates includes BVP candidates in a TMP fusion group indicated by an index obtained from a (previously-reconstructed) neighboring block coded in a TMP fusion mode, where the second set of candidates are filtered based on a template matching cost of a BVP candidate with the lowest index in the TMP fusion group of the BVP candidates. For example, when a neighboring block is coded in TMP fusion and has a TMP fusion index 1, there is a list of BVPs such as, for example, BVP1 (first in the group), BVP2, BVP3. These BVPs are in order of TMP cost for the neighboring block. The TMP cost of BVP1 for the current block is computed and used to filter the other BVPs, BVP2 and BVP3, based on TMP costs of BVP2 and BVP3 computed with respect to the current block (e.g., difference between current template and reference template of a candidate reference block indicated by BVP2). Note that the reference block is at location of BVP2 + position of current block (top left corner / sample).
[0247] In some embodiments, the second set of candidates may include one or more third BVP candidates obtained from a neighboring block coded using TMP. The third BVP candidates may be associated with the lowest TMP cost based on the neighboring block. The filtering may include filtering the third candidates based on a template matching cost of a candidate in the first set of candidates. For example, the 3 TMP candidates with the lowest indexes in a candidates list for the neighboring block may be obtained. Then these 3 candidates may be filtered based on a threshold determined according to TMP cost of the first candidates, with the TMP cost being between a reference template associated with a first candidate and the current template of the current block. For example, the threshold may be set to correspond to the highest TMP cost of TMP costs of the first set of candidates. These TMP costs are calculated with respect to the current template of the current block.
[0248] At block 3106, based on a template matching search of a plurality of refine windows corresponding to at least a subset of candidates of the first set of candidates and the filtered second set of candidates, a candidate for predicting the current block is selected. Optionally, reordering and pruning of the filtered second set of candidates may be performed before selecting the candidate. For example, in process 2600 shown in FIG. 26, a reordering and pruning 2549 is performed to obtain the merge list that is subsequently used to select the candidate. Because of the filtering based on TMP costs performed at block 3104 before the reordering and pruning, substantial improvements in the time consumed for the reordering and pruning operation can be observed.
[0249] At block 3108, the current block is decoded (reconstructed) based on the selected candidate.
[0250] FIG. 32 illustrates a flowchart 3200 of a process for reconstructing or encoding a current block using IntraTMP by filtering the merge list based on TMP cost, according to some embodiments. In some examples, the process of flowchart 3200 may be performed by a decoder (e.g., decoder 300 of FIG. 3) or an encoder (e.g., encoder 200 of FIG. 2).
[0251] The process of flowchart 3200 may begin at 3202. At block 3202, a first set of candidates and a second set of candidates for predicting the current block are obtained using vector information (e.g., block vector) of reconstructed neighboring blocks (e.g., adjacent, non-adjacent, history-based candidates, etc.).
[0252] At least the first set of candidates is located in a plurality of search regions that subdivide a reference region determined in accordance with at least the location of the current block. The candidates of the second set of candidates may belong to the reference region or may be outside the reference region.
[0253] The first set of candidates may comprise candidates determined according to IntraTMP, and the second set of candidates may comprise merge candidates determined according to IntraTMP with merge candidates. The second set of candidates may further include candidates determined according to IBC, DIMD, auto-relocated block vector prediction (AR-BVP), TIMD, IntraTMP with fusion, or other such candidate determining techniques. In an example, the first set of candidates is obtained from a sparse list (e.g., such as 2404 in FIG. 27) and the second set of candidates is from the initial merge list (e.g., such as 2542 in FIG. 27).
[0254] At block 3204, filtering of the second set of candidates is performed based on template matching costs of at least one of the first set of candidates or the second set of candidates.
[0255] In some implementations, the filtering is determined separately for template matching fusion candidates in the second set of candidates and non-fusion candidates in the second set of candidates. The filtering may be determined using separately determined TMP cost thresholds for template matching fusion candidates in the second set of candidates and non-fusion candidates in the second set of candidates.
[0256] In some implementations, the filtering comprises separately filtering each of two or more subsets of the second set of candidates, wherein each of the subsets comprises a different set of BVP candidate types.
[0257] In some implementations, the filtering comprises filtering the second set of candidates based on a template matching cost of a candidate in the first set of candidates. Examples of filtering the initial merge list according to some embodiments are described in relation to FIGS. 27-29.
[0258] In some embodiments, the second set of candidates includes BVP candidates in a TMP fusion group indicated by an index obtained from a (previously-reconstructed) neighboring block coded in a TMP fusion mode, where the second set of candidates are filtered based on a template matching cost of a BVP candidate with the lowest index in the TMP fusion group of the BVP candidates. For example, when a neighboring block is coded in TMP fusion and has a TMP fusion index 1, there is a list of BVPs such as, for example, BVP1 (first in the group), BVP2, BVP3. These BVPs are in order of TMP cost for the neighboring block. The TMP cost of BVP1 for the current block is computed and used to filter the other BVPs, BVP2 and BVP3, based on TMP costs of BVP2 and BVP3 computed with respect to the current block (e.g., difference between current template and reference template of a candidate reference block indicated by BVP2).
[0259] In some embodiments, the second set of candidates may include one or more third BVP candidates obtained from a neighboring block coded using TMP. The third BVP candidates may be associated with the lowest TMP cost based on the neighboring block. The filtering may include filtering the third candidates based on a template matching cost of a candidate in the first set of candidates. For example, the 3 TMP candidates with the lowest indexes in a candidates list for the neighboring block may be obtained. Then these 3 candidates may be filtered based on a threshold determined according to TMP cost of the first candidates, with the TMP cost being between a reference template associated with a first candidate and the current template of the current block. For example, the threshold may be set to correspond to the highest TMP cost of TMP costs of the first set of candidates. These TMP costs are calculated with respect to the current template of the current block.
[0260] At block 3206, based on the first set of candidates and the filtered second set of candidates, a candidate for predicting the current block is selected. Optionally, reordering and pruning of the filtered second set of candidates may be performed before selecting the candidate. For example, in process 2600 shown in FIG. 26, a reordering and pruning 2549 is performed to obtain the merge list that is subsequently used to select the candidate. Because the filtering based on TMP costs at block 3204 is performed before the reordering and pruning, fewer candidates remain to be reordered and pruned, which can substantially reduce the time consumed by that operation.
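One possible shape of the optional reordering and pruning followed by selection is sketched below. It is a simplification under stated assumptions: pruning is modeled as removal of duplicate block vectors, the list is capped at a hypothetical `max_size`, and selection simply takes the lowest-cost entry rather than performing a search in refine windows.

```python
from typing import List, Tuple

Candidate = Tuple[Tuple[int, int], int]  # (block vector, TMP cost)


def build_merge_list(
    first_set: List[Candidate],
    filtered_second_set: List[Candidate],
    max_size: int,  # hypothetical list-size limit
) -> List[Candidate]:
    """Merge both sets, reorder by ascending TMP cost, and prune duplicates."""
    # sorted() is stable, so first-set entries precede equal-cost second-set ones.
    ordered = sorted(first_set + filtered_second_set, key=lambda cand: cand[1])
    seen = set()
    merge_list: List[Candidate] = []
    for block_vector, cost in ordered:
        if block_vector in seen:
            continue  # prune a candidate that repeats an earlier block vector
        seen.add(block_vector)
        merge_list.append((block_vector, cost))
        if len(merge_list) == max_size:
            break
    return merge_list


first = [((-4, 0), 10), ((-8, 0), 25)]
second = [((-4, 0), 10), ((0, -4), 7), ((-12, 0), 20)]  # already filtered
merge_list = build_merge_list(first, second, max_size=3)
selected = merge_list[0]  # simplification: pick the lowest-cost candidate
```

Because candidates were already removed by the threshold filter, the sort and duplicate check in this sketch operate on a shorter list, which is the source of the time saving noted above.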
[0261] At block 3208, the current block is reconstructed (at the decoder) or encoded (at the encoder) based on the selected candidate.
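The relationship between the encoder and decoder paths at block 3208 can be illustrated with a lossless toy example: the encoder forms a residual as the difference between the current block and the reference block indicated by the selected candidate (see claim 29), and the decoder combines that residual with the same reference block (see claim 27). Transform, quantization, and entropy coding are omitted; the helper names, the 8-bit clipping range, and the sample values are assumptions.

```python
from typing import List, Tuple

Block = List[List[int]]


def fetch_reference_block(
    picture: Block, x: int, y: int, bv: Tuple[int, int], width: int, height: int
) -> Block:
    """Copy the block displaced by `bv` from position (x, y).

    Assumes the displaced block lies fully inside the reconstructed area.
    """
    ref_x, ref_y = x + bv[0], y + bv[1]
    return [row[ref_x:ref_x + width] for row in picture[ref_y:ref_y + height]]


def make_residual(current: Block, reference: Block) -> Block:
    """Encoder side: residual = current block minus reference block."""
    return [[c - r for c, r in zip(cur_row, ref_row)]
            for cur_row, ref_row in zip(current, reference)]


def reconstruct(reference: Block, residual: Block, max_val: int = 255) -> Block:
    """Decoder side: reference block plus residual, clipped to the sample range."""
    return [[min(max(r + d, 0), max_val) for r, d in zip(ref_row, res_row)]
            for ref_row, res_row in zip(reference, residual)]


picture = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
    [40, 41, 42, 43],
]
original = [[22, 20], [29, 35]]  # current 2x2 block located at (x=2, y=2)
reference = fetch_reference_block(picture, x=2, y=2, bv=(-2, -1), width=2, height=2)
residual = make_residual(original, reference)
rebuilt = reconstruct(reference, residual)
```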
[0262] Embodiments of the present disclosure may be implemented in hardware using analog and / or digital circuits, in software, through the execution of instructions by one or more general purpose or special-purpose processors, or as a combination of hardware and software. Consequently, embodiments of the disclosure may be implemented in the environment of a computer system or other processing system. An example of such a computer system 3300 is shown in FIG. 33. Blocks depicted in the figures above, such as the blocks in FIGS. 1, 2, and 3, may execute on one or more computer systems 3300. Furthermore, each of the steps of the flowcharts depicted in this disclosure may be implemented on one or more computer systems 3300.
[0263] Computer system 3300 includes one or more processors, such as processor 3304. Processor 3304 may be, for example, a special purpose processor, general purpose processor, microprocessor, or digital signal processor. Processor 3304 may be connected to a communication infrastructure 3302 (for example, a bus or network). Computer system 3300 may also include a main memory 3306, such as random access memory (RAM), and may also include a secondary memory 3308.
[0264] Secondary memory 3308 may include, for example, a hard disk drive 3310 and / or a removable storage drive 3312, representing a magnetic tape drive, an optical disk drive, or the like. Removable storage drive 3312 may read from and / or write to a removable storage unit 3316 in a well-known manner. Removable storage unit 3316 represents a magnetic tape, optical disk, or the like, which is read by and written to by removable storage drive 3312. As will be appreciated by persons skilled in the relevant art(s), removable storage unit 3316 includes a computer usable storage medium having stored therein computer software and / or data.
[0265] In alternative implementations, secondary memory 3308 may include other similar means for allowing computer programs or other instructions to be loaded into computer system 3300. Such means may include, for example, a removable storage unit 3318 and an interface 3314. Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM or PROM) and associated socket, a thumb drive and USB port, and other removable storage units 3318 and interfaces 3314 which allow software and data to be transferred from removable storage unit 3318 to computer system 3300.
[0266] Computer system 3300 may also include a communications interface 3320. Communications interface 3320 allows software and data to be transferred between computer system 3300 and external devices. Examples of communications interface 3320 may include a modem, a network interface (such as an Ethernet card), a communications port, etc. Software and data transferred via communications interface 3320 are in the form of signals which may be electronic, electromagnetic, optical, or other signals capable of being received by communications interface 3320. These signals are provided to communications interface 3320 via a communications path 3322. Communications path 3322 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link, and other communications channels.
[0267] As used herein, the terms “computer program medium” and “computer readable medium” are used to refer to tangible storage media, such as removable storage units 3316 and 3318 or a hard disk installed in hard disk drive 3310. These computer program products are means for providing software to computer system 3300. Computer programs (also called computer control logic) may be stored in main memory 3306 and / or secondary memory 3308. Computer programs may also be received via communications interface 3320. Such computer programs, when executed, enable the computer system 3300 to implement the present disclosure as discussed herein. In particular, the computer programs, when executed, enable processor 3304 to implement the processes of the present disclosure, such as any of the methods described herein. Accordingly, such computer programs represent controllers of the computer system 3300.
[0268] In another embodiment, features of the disclosure may be implemented in hardware using, for example, hardware components such as application-specific integrated circuits (ASICs) and gate arrays. Implementation of a hardware state machine to perform the functions described herein will also be apparent to persons skilled in the art.
Claims
CLAIMS
What is claimed is:
1. A method comprising: performing, for predicting a current block, a template matching search in a search region to obtain a first set of vector candidates; using vector information of reconstructed neighboring blocks of the current block to obtain a second set of vector candidates for predicting the current block; filtering the second set of vector candidates based on the highest template matching cost of template matching costs of the first set of vector candidates; selecting, based on a template matching search in a plurality of refine windows corresponding to at least a subset of vector candidates of the first set of vector candidates and the filtered second set of vector candidates, a vector candidate for predicting the current block; and reconstructing the current block based on the selected vector candidate and a residual block decoded from a bitstream.
2. A method comprising: obtaining, for predicting a current block: a first set of vector candidates; and a second set of vector candidates using vector information of reconstructed neighboring blocks of the current block; filtering the second set of vector candidates based on template matching costs of at least one of the first set of vector candidates or the second set of vector candidates; selecting, based on the first set of vector candidates and the filtered second set of vector candidates, a vector candidate for predicting the current block; and reconstructing the current block based on the selected vector candidate and a residual block decoded from a bitstream.
3. A method comprising: performing, for predicting a current block, a template matching search in a search region to obtain a first set of vector candidates; using vector information of reconstructed neighboring blocks of the current block to obtain a second set of vector candidates for predicting the current block; filtering the second set of vector candidates based on the highest template matching cost of template matching costs of the first set of vector candidates; selecting, based on a template matching search in a plurality of refine windows corresponding to at least a subset of vector candidates of the first set of vector candidates and the filtered second set of vector candidates, a vector candidate for predicting the current block; and encoding, in a bitstream, a residual block based on the current block and the selected vector candidate.
4. A method comprising: obtaining, for predicting a current block: a first set of vector candidates; and a second set of vector candidates using vector information of reconstructed neighboring blocks of the current block; filtering the second set of vector candidates based on template matching costs of at least one of the first set of vector candidates or the second set of vector candidates; selecting, based on the first set of vector candidates and the filtered second set of vector candidates, a vector candidate for predicting the current block; and encoding, in a bitstream, a residual block based on the current block and the selected vector candidate.
5. The method of any one of claims 2 or 4, wherein the obtaining the first set of vector candidates comprises performing, for predicting a current block, a template matching search in a search region to obtain the first set of vector candidates.
6. The method of any one of claims 2 or 4-5, wherein the second set of vector candidates is filtered based on template matching costs of the first set of vector candidates.
7. The method of any one of claims 2 or 4-5, wherein the filtering comprises filtering the second set of vector candidates based on a template matching cost of a candidate in the first set of vector candidates.
8. The method of any one of claims 6-7, wherein the second set of vector candidates is filtered based on the highest template matching cost of the template matching costs of the first set of vector candidates.
9. The method of claim 8, wherein the filtered second set of vector candidates each have lower template matching costs than the highest template matching cost.
10. The method of any one of claims 1, 3, or 5-9, wherein the first set of vector candidates is located in and searched in a plurality of search regions that subdivide the search region determined in accordance with at least a location of the current block.
11. The method of claim 10, wherein each of the plurality of search regions is determined according to the location of the current block and a size of the current block.
12. The method of any one of claims 1, 3, or 5-11, wherein the performing the template matching search comprises determining template matching costs between a current template of the current block and each reference template of respective reference templates searched in the search region, wherein the first set of vector candidates correspond to a set of reference templates of the searched reference templates.
13. The method of claim 12, wherein the reference templates are searched in the search region based on a sampling interval of samples in the search region.
14. The method of any one of claims 12-13, wherein the set of reference templates have the smallest template matching costs of the template matching costs of the reference templates searched in the search region.
15. The method of any one of claims 1-14, wherein the filtering the second set of vector candidates comprises removing vector candidates from the second set of vector candidates.
16. The method of any one of claims 2 or 4-15, wherein the selecting the vector candidate further comprises selecting the vector candidate based on a template matching search in a plurality of refine windows corresponding to at least a subset of vector candidates of the first set of vector candidates and the filtered second set of vector candidates.
17. The method of claim 16, wherein the plurality of refine windows comprises a respective refine window for each candidate in the subset of vector candidates.
18. The method of any one of claims 16-17, wherein the subset of vector candidates has the smallest template matching costs of template matching costs of the first set of vector candidates and the filtered second set of vector candidates.
19. The method of any one of claims 1-18, wherein the filtering is determined separately for template matching fusion candidates in the second set of vector candidates and non-fusion candidates in the second set of vector candidates.
20. The method of claim 19, wherein the filtering is determined using separately determined template matching cost thresholds for template matching fusion candidates in the second set of vector candidates and non-fusion candidates in the second set of vector candidates.
21. The method of any one of claims 1-18, wherein the filtering comprises separately filtering each of two or more subsets of the second set of vector candidates, wherein each of the subsets comprises candidates that are of a different set of BVP candidate types.
22. The method of any one of claims 1-21, wherein the second set of vector candidates comprises block vector prediction (BVP) candidates in a template matching prediction (TMP) fusion group indicated by an index obtained from a neighboring block coded in a TMP fusion mode, and wherein the second set of vector candidates is filtered based on a template matching cost of a BVP candidate with the lowest index in the TMP fusion group of the BVP candidates.
23. The method of any one of claims 1-21, wherein the second set of vector candidates comprises a number of third BVP candidates associated with the lowest TMP cost of TMP costs of BVP candidates of a neighboring block coded using TMP, wherein the filtering comprises filtering the third candidates based on a template matching cost of a candidate in the first set of vector candidates.
24. The method of claim 23, wherein the template matching cost of the candidate is the highest template matching cost of template matching costs of the first set of vector candidates.
25. The method of any one of claims 1-21, wherein the second set of vector candidates comprises merge candidates obtained from previously-reconstructed neighboring blocks of the current block.
26. The method of any one of claims 1, 3, 5-21, or 25, wherein candidates of the first set of vector candidates are located in the search region, and at least one candidate of the second set of vector candidates is located outside of the search region.
27. The method of any one of claims 1-2 or 5-26, wherein the current block is reconstructed based on combining the residual block and a reference block indicated by the selected vector candidate.
28. The method of any one of claims 1-2 or 5-27, further comprising: decoding, from the bitstream, an indication of IntraTMP mode for coding the current block.
29. The method of any one of claims 3-26, wherein the residual block is determined based on a difference between the current block and a reference block indicated by the selected vector candidate.
30. The method of any one of claims 3-26 or 29, further comprising: encoding, in the bitstream, an indication of IntraTMP mode for coding the current block.
31. A non-transitory computer-readable medium comprising instructions that, when executed by one or more processors of an apparatus, cause the apparatus to perform the method of any one of claims 1-30.
32. An encoder comprising: one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the encoder to perform the method of any one of claims 3-26 or 29-30.
33. A non-transitory computer-readable recording medium storing a bitstream generated by the method for encoding a video according to any one of claims 3-26 or 29-30.
34. A decoder comprising: one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the decoder to perform the method of any one of claims 1-2 or 5-28.
35. A non-transitory computer readable medium storing a bitstream, which, when decoded by a decoder, causes the decoder to perform the method according to any one of claims 1-2 or 5-28.
36. A bitstream generated according to any one of claims 3-26 or 29-30.