Method, apparatus, and computer program for video coding using boundary filtering
Boundary filtering with PDPC mode optimizes IBC and IntraTMP modes in video coding, addressing memory bandwidth issues and enhancing coding efficiency in advanced standards like VVC.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Patents
- Current Assignee / Owner
- TENCENT AMERICA LLC
- Filing Date
- 2023-10-16
- Publication Date
- 2026-06-25
AI Technical Summary
Existing video coding techniques face challenges in efficiently utilizing intra-block copy (IBC) and intra-template matching (IntraTMP) modes due to increased memory bandwidth requirements and suboptimal implementation costs, particularly in advanced standards like Versatile Video Coding (VVC).
Boundary filtering is implemented using the Position-Dependent Predictor Combination (PDPC) mode, which applies position-dependent weights to predicted samples at block boundaries. This optimizes the intra-prediction process in IBC and IntraTMP modes to reduce memory bandwidth and improve coding efficiency.
The approach enhances video coding efficiency by reducing memory bandwidth requirements and lowering hardware complexity while maintaining coding performance in advanced video coding standards like VVC.
Abstract
Description
Technical Field
[0001] [Incorporation by Reference] This application claims the benefit of priority to U.S. Patent Application No. 18 / 380,011, filed on October 13, 2023, entitled "BOUNDARY FILTERING ON INTRABC AND INTRATMP CODED BLOCKS", which claims the benefit of priority to U.S. Provisional Patent Application No. 63 / 416,905, filed on October 17, 2022, entitled "PDPC on IntraBC and IntraTMP coded blocks". The disclosure of the prior application is hereby incorporated by reference in its entirety.
[0002] [Technical Field] This disclosure generally describes aspects related to video coding.
Background Art
[0003] The background description provided herein is for the purpose of generally presenting the context of the present disclosure. The work of the presently named inventors, to the extent that it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, is neither expressly nor impliedly admitted as prior art against the present disclosure.
[0004] Image / video compression helps transmit image / video data across different devices, storage, and networks with minimal quality degradation. In some cases, video codec techniques can compress video based on spatial and temporal redundancy. In one example, a video codec can use a technique called intra-prediction, which can compress images based on spatial redundancy. For example, intra-prediction can use reference data from the current picture being reconstructed for sample prediction. In another example, a video codec can use a technique called inter-prediction, which can compress images based on temporal redundancy. For example, inter-prediction can predict samples in the current picture from previously reconstructed pictures using motion compensation. Motion compensation can be indicated by motion vectors (MV).
Summary of the Invention
[0005] Aspects of this disclosure include methods and apparatus for video coding / decoding. In some examples, the apparatus for video decoding includes a processing circuit. The processing circuit receives a coded video bitstream containing a current picture that has a block coded in one of an intra-block copy (IBC) mode and an intra-template matching prediction (IntraTMP) mode. The processing circuit determines a predicted block of the block using that one of the IBC mode and the IntraTMP mode. In response to boundary filtering being applied to the block, the processing circuit applies boundary filtering to the predicted sample pred(x',y') located at position (x',y') in the predicted block, which corresponds to a sample in the block, by: determining the boundary filtering parameter W based on the coding information of the block; determining the weights used in the boundary filtering by right-shifting the parameter W according to the position (x',y'); and generating the filtered predicted sample based on a linear combination of the reference samples and the predicted sample according to the determined weights. For example, the coding information of the block may include the coding block size, the coding block aspect ratio, whether the block is a luma or chroma component, adjacent reconstructed samples of the block, boundary predicted samples in the predicted block, the difference between adjacent reconstructed samples of the block and boundary predicted samples in the predicted block, or the color format of the block.
[0006] In one embodiment, the weights include a weight $w_L$ associated with the left reference sample $R_{-1,y'}$, a weight $w_T$ associated with the top reference sample $R_{x',-1}$, and a weight $(64 - w_L - w_T)$ associated with the predicted sample pred(x',y'). The weight $w_L$ is equal to $W \gg ((x' \ll 1) \gg 0)$, and the weight $w_T$ is equal to $W \gg ((y' \ll 1) \gg 0)$.
[0007] In one aspect, the boundary filtering is performed by a position-dependent predictor combination (PDPC) filter, and the processing circuit generates the filtered prediction sample as $\mathrm{Clip}(0,\ (1 \ll \mathrm{BitDepth}) - 1,\ (w_L \times R_{-1,y'} + w_T \times R_{x',-1} + (64 - w_L - w_T) \times \mathrm{pred}(x',y') + 32) \gg 6)$, where BitDepth indicates the bit depth.
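The weight derivation of paragraph [0006] and the clipped combination of paragraph [0007] can be sketched in a few lines of Python. This is a minimal illustration rather than the claimed implementation: the function names, the list-of-rows block layout, and the default W = 8 (one of the example values given for W) are assumptions made for the example.

```python
def pdpc_weights(W, x, y):
    """Return (w_left, w_top) for the predicted sample at position (x, y)."""
    w_left = W >> ((x << 1) >> 0)   # decays with distance from the left edge
    w_top = W >> ((y << 1) >> 0)    # decays with distance from the top edge
    return w_left, w_top


def clip(lo, hi, value):
    """Clamp value to the closed range [lo, hi]."""
    return max(lo, min(hi, value))


def filter_sample(pred, left_ref, top_ref, W, x, y, bit_depth):
    """Blend one predicted sample with its left and top reference samples."""
    w_left, w_top = pdpc_weights(W, x, y)
    acc = (w_left * left_ref + w_top * top_ref
           + (64 - w_left - w_top) * pred + 32)   # +32 rounds before >> 6
    return clip(0, (1 << bit_depth) - 1, acc >> 6)


def filter_block(pred_block, left_col, top_row, W=8, bit_depth=8):
    """Apply the boundary filter to every sample of a predicted block.

    pred_block[y][x] is the predicted sample, left_col[y] stands for
    R(-1, y) and top_row[x] stands for R(x, -1) (hypothetical layout).
    """
    return [
        [filter_sample(pred_block[y][x], left_col[y], top_row[x],
                       W, x, y, bit_depth)
         for x in range(len(pred_block[0]))]
        for y in range(len(pred_block))
    ]
```

With W = 8 the weights reach zero from the third row and column onward, so only samples close to the top and left block boundaries are modified.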
[0008] In one example, the parameter W is determined as 8, 4, 16, or 2.
[0009] In one example, the processing circuit determines whether boundary filtering is applied to a block from adjacent reconstructed samples.
[0010] In one example, the processing circuit detects the content type of adjacent reconstructed samples. In response to the content type of adjacent reconstructed samples being screen content, boundary filtering is not applied to the block. In response to the content type of adjacent reconstructed samples not being screen content, boundary filtering is applied to the block.
[0011] In one example, the processing circuit checks the number of color values of adjacent reconstructed samples. In response to the number of color values being less than the color value threshold, the processing circuit detects the content type as screen content.
[0012] In one example, the adjacent reconstructed samples are of a specific color component, and the color values include the values of the specific color component.
[0013] In one example, the adjacent reconstructed samples are associated with multiple color components, and the color values include combinations of the respective values of the multiple color components.
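The content-type check of paragraphs [0010] to [0013] can be illustrated as follows. This is a sketch under stated assumptions: the threshold value of 8 and the function names are hypothetical, and the adjacent reconstructed samples are passed as a flat list of either single-component values or tuples of component values.

```python
def is_screen_content(neighbor_samples, color_threshold=8):
    """Classify neighbors as screen content when few distinct colors occur.

    Each entry is either one component value (e.g. luma) or a tuple such as
    (Y, Cb, Cr), so that a color is a combination of component values.
    color_threshold is a hypothetical value chosen for illustration.
    """
    return len(set(neighbor_samples)) < color_threshold


def should_apply_boundary_filter(neighbor_samples, color_threshold=8):
    """Boundary filtering is skipped for screen content, applied otherwise."""
    return not is_screen_content(neighbor_samples, color_threshold)
```

For instance, a neighborhood containing only two distinct luma values would be treated as screen content and left unfiltered, whereas a neighborhood with sixteen distinct values would be filtered.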
[0014] In one example, boundary filtering is applied only when the color component of a block is a luma component.
[0015] In one example, boundary filtering is applied to each color component associated with a block.
[0016] In one example, boundary filtering is applied only when the current slice containing a block is an intra-slice.
[0017] Aspects of this disclosure also provide a non-transitory computer-readable medium storing instructions that, when executed by a computer, cause the computer to perform a method for video decoding / encoding.
Brief Description of the Drawings
[0018] Further features, properties, and various advantages of the disclosed subject matter will become clearer from the following detailed description and accompanying drawings.
[0019] [Figure 1] This figure is a schematic illustration of an exemplary block diagram of a video processing system (100).
[0020] [Figure 2] This figure is a schematic illustration of an exemplary block diagram of a decoder.
[0021] [Figure 3] This figure is a schematic illustration of an exemplary block diagram of an encoder.
[0022] [Figure 4] This figure shows an example of the Intra Template Matching Prediction (IntraTMP) mode according to the aspects of this disclosure.
[0023] [Figure 5A] This figure shows an example of a reference sample for the Position-Dependent Predictor Combination (PDPC) mode, which can be applied to various prediction modes.
[Figure 5B] This figure shows an example of a reference sample for the Position-Dependent Predictor Combination (PDPC) mode, which can be applied to various prediction modes.
[Figure 5C] This figure shows an example of a reference sample for the Position-Dependent Predictor Combination (PDPC) mode, which can be applied to various prediction modes.
[Figure 5D] This figure shows an example of a reference sample for the Position-Dependent Predictor Combination (PDPC) mode, which can be applied to various prediction modes.
[0024] [Figure 6] This figure shows an example of boundary filtering using two tap filters according to an aspect of the present invention.
[0025] [Figure 7] This figure shows an example of boundary filtering according to the aspects of this disclosure.
[0026] [Figure 8] This figure shows a flowchart outlining a process according to one aspect of this disclosure.
[0027] [Figure 9] This figure shows a flowchart outlining another process according to one aspect of this disclosure.
[0028] [Figure 10] This is a schematic diagram of a computer system in one configuration.
Modes for Carrying Out the Invention
[0029] Figure 1 shows a block diagram of a video processing system (100) in several examples. The video processing system (100) is an example of a video encoder and decoder in a streaming environment, as an application of the disclosed subject matter. The disclosed subject matter can be equally applied to other video-enabled applications, including, for example, video conferencing, digital TV, streaming services, and the storage of compressed video on digital media such as CDs, DVDs, and memory sticks.
[0030] The video processing system (100) includes a capture subsystem (113), which may include a video source (101), such as a digital camera, which creates, for example, a stream (102) of uncompressed video pictures. In one example, the stream (102) includes samples captured by the digital camera. The stream (102) of video pictures is drawn as a thick line to highlight the high data volume compared to encoded video data (104) (or encoded video bitstream) and can be processed by an electronic device (120) which includes a video encoder (103) coupled to the video source (101). The video encoder (103) may include hardware, software, or a combination thereof to enable or implement aspects of the disclosed subject matter, as will be described in more detail below. The encoded video data (104) (or encoded video bitstream) is drawn as a thin line to highlight the low data volume compared to the stream (102) of video pictures and can be stored in a streaming server (105) for future use. One or more streaming client subsystems, such as client subsystems (106) and (108) in Figure 1, can access the streaming server (105) to retrieve copies (107) and (109) of the encoded video data (104). The client subsystem (106) may include a video decoder (110) within, for example, an electronic device (130). The video decoder (110) decodes the input copy (107) of the encoded video data and creates an output stream (111) of video pictures that can be rendered on a display (112) (e.g., a display screen) or other rendering device (not shown). In some streaming systems, the encoded video data (104), (107), and (109) (e.g., video bitstreams) can be encoded according to specific video coding / compression standards. Examples of these standards include ITU-T Recommendation H.265. In one example, a video coding standard under development is informally known as Versatile Video Coding (VVC). The disclosed subject matter may be used in the context of VVC.
[0031] It should be noted that electronic devices (120) and (130) may include other components (not shown). For example, electronic device (120) may include a video decoder (not shown), and similarly, electronic device (130) may also include a video encoder (not shown).
[0032] Figure 2 shows an exemplary block diagram of a video decoder (210). The video decoder (210) can be included in an electronic device (230). The electronic device (230) can include a receiver (231) (e.g., a receiving circuit). The video decoder (210) can be used in place of the video decoder (110) in the example of Figure 1.
[0033] The receiver (231) may receive one or more coded video sequences, for example, contained in a bitstream, to be decoded by a video decoder (210). In one embodiment, one coded video sequence may be received at a time, in which case the decoding of each coded video sequence is independent of the decoding of other coded video sequences. The coded video sequences may be received from a channel (201), which may be a hardware / software link to a storage device that stores coded video data. The receiver (231) may receive coded video data together with other data, such as coded audio data and / or auxiliary data streams, which may be transferred to their respective user entities (not shown). The receiver (231) may isolate the coded video sequences from other data. To counteract network jitter, a buffer memory (215) may be coupled between the receiver (231) and an entropy decoder / parser (220) (hereinafter, "parser (220)"). In certain applications, the buffer memory (215) is part of the video decoder (210). In other cases, it may be located outside the video decoder (210) (not shown). In still other cases, a buffer memory (not shown) may exist outside the video decoder (210), for example to counter network jitter, and another buffer memory (215) may exist inside the video decoder (210), for example to handle playback timing. When the receiver (231) is receiving data from a store / forward device with sufficient bandwidth and controllability or from an isochronous network, the buffer memory (215) may not be required or may be small. For use in best-effort packet networks such as the Internet, a buffer memory (215) may be required, which can be relatively large, and advantageously can be an adaptive size, and may be implemented at least in part in an external operating system or similar element (not shown) outside the video decoder (210).
[0034] The video decoder (210) may include a parser (220) to reconstruct symbols (221) from the coded video sequence. Categories of these symbols include information used to manage the operation of the video decoder (210) and information controlling a rendering device, such as a rendering device (212) (e.g., a display screen), which is not an integral part of the electronic device (230) as shown in Figure 2, but can be coupled to the electronic device (230). The rendering device control information may be in the form of Supplemental Enhancement Information (SEI) messages or Video Usability Information (VUI) parameter set fragments (not shown). The parser (220) can parse / entropy-decode the received coded video sequence. The coding of the coded video sequence may follow video coding techniques or standards and may follow various principles, including variable-length coding, Huffman coding, and arithmetic coding with or without context sensitivity. The parser (220) may extract from the coded video sequence a set of subgroup parameters for at least one subgroup of pixels in the video decoder, based on at least one parameter corresponding to the subgroup. Subgroups may include groups of pictures (GOPs), pictures, tiles, slices, macroblocks, coding units (CUs), blocks, transform units (TUs), prediction units (PUs), etc. The parser (220) may also extract from the coded video sequence information such as transform coefficients, quantization parameter values, motion vectors, etc.
[0035] The parser (220) may perform an entropy decoding / parsing operation on the video sequence received from buffer memory (215) to create symbols (221).
[0036] The reconstruction of the symbols (221) can involve multiple different units depending on the type of coded video picture or part thereof (e.g., inter and intra picture, inter and intra block) and other factors. Which units are involved, and how they are involved, can be controlled by subgroup control information parsed from the coded video sequence by the parser (220). The flow of such subgroup control information between the parser (220) and the multiple units below is not illustrated for clarity.
[0037] In addition to the functional blocks already described, the video decoder (210) can be conceptually subdivided into several functional units, as described below. In practical implementations operating under commercial constraints, many of these units can interact closely with each other and be integrated at least partially. However, for the purpose of illustrating the subject matter to be disclosed, the conceptual subdivision into functional units is appropriate below.
[0038] The first unit may be a scaler / inverse unit (251). The scaler / inverse unit (251) receives quantized transformation coefficients as symbols (221) from the parser (220), along with control information including which transformation to use, block size, quantization coefficients, quantization scaling matrix, etc. The scaler / inverse unit (251) can output a block containing sample values that can be input to the aggregator (255).
[0039] In some cases, the output samples of the scaler / inverse unit (251) may relate to intra-coded blocks. Intra-coded blocks are blocks that do not use predictive information from previously reconstructed pictures but can use predictive information from previously reconstructed portions of the current picture. Such predictive information can be provided by the intra-picture prediction unit (252). In some cases, the intra-picture prediction unit (252) generates a block of the same size and shape as the block being reconstructed, using surrounding already reconstructed information fetched from the current picture buffer (258). The current picture buffer (258) buffers, for example, a partially reconstructed current picture and / or a fully reconstructed current picture. The aggregator (255) may, on a sample-by-sample basis, add the predictive information generated by the intra-picture prediction unit (252) to the output sample information provided by the scaler / inverse unit (251).
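The sample-by-sample addition performed by the aggregator (255) can be sketched as below. The function name and block layout are hypothetical, and the final clamp to the valid sample range is an assumption added for the example; the paragraph above only describes the addition itself.

```python
def aggregate(prediction, residual, bit_depth=8):
    """Add residual samples to predicted samples, one sample at a time.

    Both inputs are lists of rows with identical dimensions. Clamping to
    [0, 2**bit_depth - 1] is an assumption made for this illustration.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max_val, max(0, p + r)) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```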
[0040] In other cases, the output samples of the scaler / inverse unit (251) may be related to blocks that are inter-coded and potentially motion-compensated. In such cases, the motion-compensated prediction unit (253) can access the reference picture memory (257) to fetch samples to be used for prediction. After motion-compensating the fetched samples according to the symbols (221) related to the blocks, these samples can be added by the aggregator (255) to the output of the scaler / inverse unit (251) (in this case, called residual samples or residual signals) to generate output sample information. The addresses in the reference picture memory (257) from which the motion-compensated prediction unit (253) fetches the prediction samples can be controlled by motion vectors, which are available to the motion-compensated prediction unit (253) in the form of symbols (221) that may have, for example, X, Y, and reference picture components. Motion compensation may also include interpolation of sample values fetched from the reference picture memory (257) when sub-sample-exact motion vectors are in use, motion vector prediction mechanisms, and so on.
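How a motion vector controls the fetch address in the reference picture memory can be illustrated with an integer-precision sketch. The function name and argument order are hypothetical, sub-sample interpolation is deliberately omitted, and clamping out-of-picture coordinates to the nearest edge sample is an assumption made to keep the example self-contained.

```python
def fetch_reference_block(ref_picture, x0, y0, width, height, mv_x, mv_y):
    """Fetch a width x height block displaced by an integer motion vector.

    (x0, y0) is the top-left corner of the current block and (mv_x, mv_y)
    is the displacement into ref_picture, which is a list of sample rows.
    Coordinates outside the picture are clamped to the nearest edge sample
    (an assumption for this sketch).
    """
    pic_h, pic_w = len(ref_picture), len(ref_picture[0])
    block = []
    for dy in range(height):
        ry = min(pic_h - 1, max(0, y0 + mv_y + dy))
        row = []
        for dx in range(width):
            rx = min(pic_w - 1, max(0, x0 + mv_x + dx))
            row.append(ref_picture[ry][rx])
        block.append(row)
    return block
```

The fetched block would then be added to the residual samples by the aggregator to form the output samples.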
[0041] The output samples of the aggregator (255) may be subjected to various loop filtering techniques within the loop filter unit (256). The video compression technique may include in-loop filtering techniques controlled by parameters contained in the coded video sequence (also called the coded video bitstream) and made available to the loop filter unit (256) as symbols (221) from the parser (220). The in-loop filtering may also respond to meta-information obtained during decoding of earlier portions (in decoding order) of the coded picture or coded video sequence, as well as to previously reconstructed and loop-filtered sample values.
[0042] The output of the loop filter unit (256) can be a sample stream that can be output to a rendering device such as a display (212) and can be stored in a reference picture memory (257) for use in future interpicture prediction.
[0043] A particular coded picture, once fully reconstructed, can be used as a reference picture for future predictions. For example, when a coded picture corresponding to the current picture is fully reconstructed and the coded picture is identified as a reference picture (e.g., by the parser (220)), the current picture buffer (258) can become part of the reference picture memory (257), and a fresh current picture buffer can be reallocated before starting the reconstruction of subsequent coded pictures.
[0044] The video decoder (210) may perform decoding operations according to a given video compression technique or standard, such as ITU-T Rec. H.265. The coded video sequence may conform to the syntax specified by the video compression technique or standard being used, in the sense that the coded video sequence conforms to both the syntax of the video compression technique or standard and the profile documented in the video compression technique or standard. Specifically, a profile may select particular tools from all the tools available in the video compression technique or standard as the only tools available for use under that profile. Also required for conformance is that the complexity of the coded video sequence is within the range defined by the level of the video compression technique or standard. In some cases, the level limits the maximum picture size, maximum frame rate, maximum reconstruction sample rate (e.g., measured in megasamples per second), maximum reference picture size, etc. The limits set by the level may, in some cases, be further restricted through Hypothetical Reference Decoder (HRD) specifications and metadata for HRD buffer management signaled in the coded video sequence.
[0045] In one embodiment, the receiver (231) may receive additional (redundant) data along with the encoded video. The additional data may be included as part of the coded video sequence. The additional data may be used by the video decoder (210) to properly decode the data and / or more accurately reconstruct the original video data. The additional data may take the form of, for example, temporal, spatial, or signal-to-noise ratio (SNR) enhancement layers, redundant slices, redundant pictures, forward error correction codes, etc.
[0046] Figure 3 shows an exemplary block diagram of a video encoder (303). The video encoder (303) is included in an electronic device (320). The electronic device (320) includes a transmitter (340) (e.g., a transmitting circuit). The video encoder (303) can be used in place of the video encoder (103) in the example of Figure 1.
[0047] The video encoder (303) may receive video samples from a video source (301) (not part of the electronic device (320) in the example in Figure 3) that can capture video images to be coded by the video encoder (303). In another example, the video source (301) is part of the electronic device (320).
[0048] The video source (301) may provide a source video sequence to be coded by the encoder (303) in the form of a digital video sample stream that can have any suitable bit depth (e.g., 8-bit, 10-bit, 12-bit, ...), any color space (e.g., BT.601 Y CrCb, RGB, ...), and any suitable sampling structure (e.g., Y CrCb 4:2:0, Y CrCb 4:4:4). In a media serving system, the video source (301) may be a storage device that stores pre-prepared video. In a video conferencing system, the video source (301) may be a camera that captures local image information as a video sequence. The video data may be provided as a plurality of individual pictures that convey motion when viewed in sequence. Each picture may be organized as a spatial array of pixels, in which case each pixel may have one or more samples depending on the sampling structure, color space, etc., in use.
[0049] In one embodiment, the video encoder (303) can code and compress the pictures of a source video sequence in real time or under any other required time constraints to obtain a coded video sequence (343). Implementing an appropriate coding speed is one function of the controller (350). In some embodiments, the controller (350) may also control and be functionally coupled to other functional units, as described below. This coupling is not illustrated for clarity. Parameters set by the controller (350) may include rate control-related parameters (picture skip, quantizer, lambda value of rate distortion optimization technique, ...), picture size, group of pictures (GOP) layout, maximum motion vector search range, etc. The controller (350) may be configured to have other appropriate functions that may be associated with the video encoder (303) optimized for a particular system design.
[0050] In some embodiments, a video encoder is configured to operate within a coding loop. As an oversimplified description, one example of a coding loop may include a source coder (330) (responsible for creating symbols, such as a symbol stream, based, for example, on an input picture to be coded and one or more reference pictures) and a (local) decoder (333) embedded in the video encoder (303). The decoder (333) reconstructs the symbols to create sample data in a similar manner to how a (remote) decoder would create it. The reconstructed sample stream (sample data) can be input into a reference picture memory (334). Since decoding the symbol stream yields bit-exact results independent of the decoder location (local or remote), the contents within the reference picture memory (334) are also bit-exact between the local encoder and the remote decoder. In other words, the predictive portion of the encoder "sees" the exact same sample values as the reference picture samples that the decoder "sees" when using predictions during decoding. This fundamental principle of reference picture synchronicity (and the resulting drift when synchronicity cannot be maintained, for example, due to channel errors) is also used in several related fields.
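The reference synchronicity described above can be illustrated with a toy one-dimensional coding loop. Everything in this sketch is an assumption made for illustration: each sample is predicted from the previously reconstructed sample, the residual is quantized with a hypothetical step size QSTEP, and the names `encode` and `decode` do not correspond to any unit in the figures.

```python
QSTEP = 4  # hypothetical quantization step; makes the coding lossy


def reconstruct(prediction, symbol):
    """Shared reconstruction rule used by both local and remote decoders."""
    return prediction + symbol * QSTEP


def encode(samples):
    """Toy encoder with an embedded local decoder.

    Prediction always uses the locally reconstructed value, never the
    original sample, so the encoder sees what a remote decoder will see.
    """
    symbols, local_recon, reference = [], [], 0
    for sample in samples:
        symbol = (sample - reference) // QSTEP     # quantized residual
        reference = reconstruct(reference, symbol)  # local decoder step
        symbols.append(symbol)
        local_recon.append(reference)
    return symbols, local_recon


def decode(symbols):
    """Toy remote decoder: applies the same reconstruction rule."""
    recon, reference = [], 0
    for symbol in symbols:
        reference = reconstruct(reference, symbol)
        recon.append(reference)
    return recon
```

Because both sides apply the same reconstruction rule to the same symbols, the locally reconstructed values match the remotely decoded ones exactly, even though they differ from the source samples.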
[0051] The operation of the “local” decoder (333) can be the same as that of a “remote” decoder, such as the video decoder (210), as has already been described above in relation to Figure 2. However, also briefly referring to Figure 2, since symbols are available and the encoding / decoding of symbols to the coded video sequence by the entropy coder (345) and parser (220) can be reversible, the entropy decoding portion of the video decoder (210), including the buffer memory (215) and parser (220), may not be fully implemented in the local decoder (333).
[0052] In one embodiment, decoder techniques other than the parsing / entropy decoding present in the decoder exist in the corresponding encoder in the same or substantially the same functional form. Therefore, the disclosed subject matter focuses on the operation of the decoder. The description of the encoder techniques can be abbreviated, as they are the inverse of the comprehensively described decoder techniques. More detailed descriptions are provided below only in specific areas.
[0053] During operation, in some examples, the source coder (330) may perform motion-compensated predictive coding, which predictively codes the input picture in relation to one or more previously coded pictures from a video sequence designated as “reference pictures”. In this way, the coding engine (332) codes the difference between the pixel blocks of the input picture and the pixel blocks of the reference picture which may be selected as a predictive reference for the input picture.
[0054] A local video decoder (333) can decode coded video data of a picture that may be designated as a reference picture based on symbols created by the source coder (330). The operation of the coding engine (332) may, advantageously, be a lossy process. When coded video data can be decoded by a video decoder (not shown in Figure 3), the reconstructed video sequence may typically be a replica of the source video sequence with some errors. The local video decoder (333) can replicate the decoding process that may be performed by the video decoder on the reference picture and store the reconstructed reference picture in the reference picture memory (334). In this way, the video encoder (303) can locally store a copy of the reconstructed reference picture having common content as a reconstructed reference picture acquired by the far-end video decoder (without transmission errors).
[0055] The predictor (335) may perform a predictive search for the coding engine (332). That is, for a new picture to be coded, the predictor (335) may search the reference picture memory (334) for sample data (as candidate reference pixel blocks) or specific metadata such as reference picture motion vectors, block shapes, etc., which may function as appropriate predictive references for the new picture. The predictor (335) may operate on a sample-block-by-pixel-block basis to find appropriate predictive references. In some cases, the input picture may have predictive references drawn from multiple reference pictures stored in the reference picture memory (334), as determined by the search results obtained by the predictor (335).
[0056] The controller (350) may manage the coding operations of the source coder (330), including, for example, setting parameters and subgroup parameters used to encode video data.
[0057] All outputs of the aforementioned functional units can be subjected to entropy coding in the entropy coder (345). The entropy coder (345) converts the symbols generated by the various functional units into coded video sequences by applying lossless compression to the symbols according to techniques such as Huffman coding, variable-length coding, and arithmetic coding.
[0058] The transmitter (340) may buffer the coded video sequence created by the entropy coder (345) and prepare it for transmission over the communication channel (360), which may be a hardware / software link to a storage device that stores the coded video data. The transmitter (340) may merge the coded video data from the video encoder (303) with other data to be transmitted, such as coded audio data and / or auxiliary data streams (sources not shown).
[0059] The controller (350) may manage the operation of the video encoder (303). During coding, the controller (350) may assign a specific coded picture type to each coded picture, and this coded picture type may affect the coding that can be applied to each picture. For example, a picture may often be assigned as one of the following picture types:
[0060] An intra-picture (I-picture) can be coded and decoded without using any other pictures in the sequence as a source for prediction. Some video codecs allow different types of intra-pictures, including, for example, Instantaneous Decoding Refresh (IDR) pictures.
[0061] Predictive pictures (P-pictures) can be coded and decoded using intra-prediction or inter-prediction, employing at most one motion vector and reference index to predict the sample values of each block.
[0062] Bidirectional predictive pictures (B-pictures) can be coded and decoded using intra-prediction or inter-prediction, employing at most two motion vectors and reference indices to predict the sample values of each block. Similarly, multiple-predictive pictures can use more than two reference pictures and associated metadata for the reconstruction of a single block.
[0063] A source picture is typically subdivided spatially into multiple sample blocks (e.g., blocks of 4x4, 8x8, 4x8, or 16x16 samples each) and coded block by block. Blocks may be coded predictively in relation to other (already coded) blocks, as determined by the coding assignment applied to the picture containing each block. For example, blocks of I-pictures may be coded non-predictively, or they may be coded predictively in relation to already coded blocks of the same picture (spatial prediction or intra-prediction). Pixel blocks of P-pictures may be coded predictively via spatial prediction or temporal prediction in relation to one previously coded reference picture. Blocks of B-pictures may be coded predictively via spatial prediction or temporal prediction in relation to one or two previously coded reference pictures.
[0064] The video encoder (303) may perform coding operations in accordance with a given video coding technique or standard, such as ITU-T Rec. H.265. In this operation, the video encoder (303) may perform various compression operations, including predictive coding operations that utilize temporal and spatial redundancy in the input video sequence. The coded video data may therefore conform to the syntax specified by the video coding technique or standard being used.
[0065] In one embodiment, the transmitter (340) may transmit additional data along with the encoded video. The source coder (330) may include such data as part of the coded video sequence. The additional data may include temporal / spatial / SNR enhancement layers, other forms of redundant data such as redundant pictures and slices, SEI messages, VUI parameter set fragments, etc.
[0066] Video can be captured as multiple source pictures (video pictures) in a time sequence. Intra-picture prediction (often abbreviated as intra-prediction) utilizes spatial correlations within a given picture, while inter-picture prediction utilizes (temporal or other) correlations between pictures. In one example, a particular picture being encoded / decoded is called the current picture and is divided into blocks. When a block in the current picture is analogous to a reference block in a previously coded and still-buffered reference picture in the video, the block in the current picture can be coded by a vector called a motion vector. The motion vector points to a reference block in the reference picture and may have a third dimension to identify the reference picture if multiple reference pictures are used.
[0067] In some embodiments, a bi-prediction technique can be used for inter-picture prediction. According to the bi-prediction technique, two reference pictures are used, such as a first reference picture and a second reference picture, both of which precede the current picture in decoding order (but may be in the past and the future, respectively, in display order). A block in the current picture can be coded by a first motion vector pointing to a first reference block in the first reference picture and a second motion vector pointing to a second reference block in the second reference picture. The block can be predicted by a combination of the first and second reference blocks.
[0068] Furthermore, coding efficiency can be improved by using a merge mode technique in inter-picture prediction.
[0069] According to some aspects of this disclosure, predictions such as inter-picture prediction and intra-picture prediction are performed in units of blocks. For example, according to the HEVC standard, pictures in a sequence of video pictures are divided into coding tree units (CTUs) for compression, and the CTUs in a picture have the same size, such as 64x64 pixels, 32x32 pixels, or 16x16 pixels. Generally, a CTU contains three coding tree blocks (CTBs): one luma CTB and two chroma CTBs. Each CTU can be recursively quadtree-partitioned into one or more coding units (CUs). For example, a 64x64 pixel CTU can be divided into one 64x64 pixel CU, four 32x32 pixel CUs, or sixteen 16x16 pixel CUs. In one example, each CU is analyzed to determine the prediction type of the CU, such as an inter-prediction type or an intra-prediction type. A CU is divided into one or more prediction units (PUs) depending on its temporal and / or spatial predictability. Generally, each PU contains a luma prediction block (PB) and two chroma PBs. In one embodiment, prediction operations in coding (encoding / decoding) are performed in units of prediction blocks. Using a luma prediction block as an example of a prediction block, the prediction block contains a matrix of pixel values (e.g., luma values) such as 8x8 pixels, 16x16 pixels, 8x16 pixels, 16x8 pixels, etc.
[0070] It should be noted that the video encoders (103) and (303) and the video decoders (110) and (210) can be implemented using any suitable technique. In one embodiment, the video encoders (103) and (303) and the video decoders (110) and (210) can be implemented using one or more integrated circuits. In another embodiment, the video encoders (103) and (303) and the video decoders (110) and (210) can be implemented using one or more processors that execute software instructions.
[0071] Intra block copy (IBC) mode can be used in image and / or video coding such as Versatile Video Coding (VVC). The concept of IBC was previously incorporated into the High Efficiency Video Coding (HEVC) standard. However, in some related techniques used in HEVC, the implementation cost needs to be reduced because of the size of the entire already reconstructed area of the current picture. A drawback of the IBC concept in some implementations such as HEVC is the requirement for additional memory within the decoded picture buffer (DPB), which may require hardware implementations to use external memory. Additional external memory access entails increased memory bandwidth. Some implementations, such as VVC, significantly reduce memory bandwidth requirements and hardware complexity by using a fixed memory, which allows IBC mode to be implemented using on-chip memory. A reference sample memory (RSM) can be used to store the samples of a single CTU. A special feature of the RSM is a continuous update mechanism that replaces the reconstructed samples of the left adjacent CTU with the reconstructed samples of the current CTU.
[0072] Furthermore, IBC mode block vector (BV) coding can utilize the concept of merge lists used in inter prediction. The IBC list construction process can consider two spatial neighbor BVs and five history-based BVs (HBVPs). In one example, only the first HBVP is compared to the spatial candidate when added to the candidate list. While normal inter prediction uses two different candidate lists, namely a candidate list for merge mode and a candidate list for normal mode, the IBC mode candidate list is used for both cases (e.g., IBC merge mode and IBC regular mode). Merge mode (e.g., IBC merge mode) may use up to six candidates from the list, while regular mode (e.g., IBC regular mode) uses only the first two candidates. Block vector difference (BVD) coding can utilize the motion vector difference (MVD) process, resulting in a final BV of arbitrary size. The reconstructed BV can point to an area outside the reference sample area, and in some examples, correction is required by removing the absolute offset in each direction using modulo operations with the width and height of the RSM.
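The modulo correction mentioned at the end of the preceding paragraph can be sketched as follows. This is a minimal illustration only: the function name `wrap_to_rsm` and the 128x128 RSM dimensions are assumptions made for the example, and the code is not the normative VVC derivation.

```python
def wrap_to_rsm(x, y, bv_x, bv_y, rsm_w=128, rsm_h=128):
    """Map the position referenced by a block vector into RSM coordinates.

    (x, y) is a sample position in the current picture and (bv_x, bv_y)
    is the reconstructed block vector. rsm_w and rsm_h are assumed RSM
    dimensions. Python's % returns a non-negative result for a positive
    modulus, so references left of or above the RSM origin wrap around.
    """
    ref_x = (x + bv_x) % rsm_w
    ref_y = (y + bv_y) % rsm_h
    return ref_x, ref_y


# A BV pointing 30 samples left and 5 samples up from position (10, 20).
print(wrap_to_rsm(10, 20, -30, -5))
```

The wrap keeps every reference inside the fixed-size memory regardless of how large the reconstructed BV is.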
[0073] Figure 4 shows an example of an Intra Template Matching Prediction (IntraTMP) mode according to an embodiment of the present invention. In embodiments such as the Enhanced Compression Model (ECM) software, IntraTMP is a special intra-prediction mode in which the best prediction block (e.g., matching block (421)) can be copied from the reconstructed portion of the current frame (or current picture), where the template (e.g., L-shaped template) (420) of the best prediction block matches the current template (410) of the current block (411). Within a predetermined search range, the encoder can search the reconstructed portion of the current frame for the template most similar to the current template, and the corresponding block can be used as the prediction block. The encoder can signal the use of IntraTMP mode, and the same prediction operation can be performed on the decoder side.
[0074] The prediction signal can be generated by matching the current template (410), such as the L-shaped causal neighbor of the current block (411), with the template of another block within a given search area. The exemplary search area shown in Figure 4 can include multiple CTUs (or superblocks). Referring to Figure 4, the search area can include the current CTU R1 (e.g., a part of the current CTU R1), the upper left CTU R2, the upper CTU R3, and the left CTU R4. The cost function can include any suitable cost function, such as the sum of absolute differences (SAD).
[0075] Within each region, the decoder can search for the template with the lowest cost (e.g., lowest SAD) relative to the current template, and can use the block associated with the template with the lowest cost as the prediction block.
[0076] The dimensions of the region indicated by (SearchRange_w, SearchRange_h) can be set to be proportional to the block dimensions (BlkW, BlkH), so that each pixel has a fixed number of SAD comparisons. Therefore,
$$\mathrm{SearchRange\_w} = a \times \mathrm{BlkW} \qquad \text{Equation (a)}$$
$$\mathrm{SearchRange\_h} = a \times \mathrm{BlkH} \qquad \text{Equation (b)}$$
[0077] The parameter "a" can be a constant that controls the trade-off between gain and complexity. In one example, "a" is 5.
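The template search described in the preceding paragraphs can be sketched as follows. This is a simplified illustration under stated assumptions: the template is one sample thick, the picture is a plain 2D list, and the causality check only admits candidates lying fully above or fully to the left of the current block. All function names are hypothetical and do not come from the ECM software.

```python
def search_range(blk_w, blk_h, a=5):
    """Equations (a) and (b): search range proportional to the block size."""
    return a * blk_w, a * blk_h


def template(pic, x, y, w, h):
    """L-shaped template: the row above and the column left of the block at (x, y)."""
    top = [pic[y - 1][x + i] for i in range(-1, w)]
    left = [pic[y + j][x - 1] for j in range(h)]
    return top + left


def sad(p_vals, q_vals):
    """Sum of absolute differences between two equally long sample lists."""
    return sum(abs(p - q) for p, q in zip(p_vals, q_vals))


def intra_tmp_search(pic, x0, y0, w, h, a=5):
    """Return (cost, (x, y)) for the candidate whose template has the lowest SAD."""
    rng_w, rng_h = search_range(w, h, a)
    cur = template(pic, x0, y0, w, h)
    best = None
    for cy in range(max(1, y0 - rng_h), y0 + 1):
        for cx in range(max(1, x0 - rng_w), x0 + 1):
            # Simplified causality check (an assumption of this sketch):
            # the candidate must lie fully above or fully left of the block.
            if not (cy + h <= y0 or cx + w <= x0):
                continue
            cost = sad(cur, template(pic, cx, cy, w, h))
            if best is None or cost < best[0]:
                best = (cost, (cx, cy))
    return best


# A picture that repeats every 4 columns, so an exact template match exists.
pic = [[(x % 4) * 10 + y for x in range(12)] for y in range(8)]
print(intra_tmp_search(pic, 8, 4, 4, 4))
```

Because the decoder can repeat the same search on reconstructed samples, only the use of the mode needs to be signaled, not the position of the matching block.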
[0078] The intra-template matching tool can be enabled for CUs with specific sizes, such as width and height of 64 or less. The maximum CU size for IntraTMP mode can be configurable. IntraTMP mode can be signaled at the CU level through a dedicated flag, for example, when decoder-side intra-mode derivation (DIMD) is not used for the current CU.
[0079] Boundary filtering may include applying adjustments (or filtering processes) to predicted samples in the prediction block of the current block, such as predicted samples at block boundaries, using nearby reconstructed samples from previously coded areas. In one example, boundary filtering includes applying adjustments (or filtering processes) to predicted samples at block boundaries using nearby reconstructed samples from previously coded areas, where the predicted samples at block boundaries are located within the prediction block. Boundary filtering using Position-Dependent Predictor Combinations (PDPC) mode can be applied to image and / or video coding. In VVC, the results of intra-prediction in DC mode, planar mode, and several angular modes can be further modified by the PDPC method. The PDPC mode is an intra-prediction method that calls a combination of boundary reference samples and HEVC-style intra-prediction using filtered boundary reference samples. The PDPC mode can be applied without signaling to the following intra-modes: planar mode, DC mode, angular mode with an intra-angle less than or equal to the horizontal (e.g., the angle corresponding to the horizontal mode), and angular mode with an intra-angle greater than or equal to the vertical (e.g., the angle corresponding to the vertical mode) and less than or equal to 80°. The PDPC mode is not applied if the current block is coded in block-based delta pulse code modulation (BDPCM) mode or if the multiple reference line (MRL) index is greater than 0.
[0080] The predicted sample pred(x',y') is predicted using a linear combination of the intra-prediction (DC, planar, or angular mode) and the reference samples according to equation (1) below:
[Equation (1)]
[0081] $R_{x,-1}$ and $R_{-1,y}$ can represent the reference samples located above and to the left of the current sample (x',y'), respectively.
[0082] Figures 5A to 5D show the reference samples ($R_{x,-1}$ and $R_{-1,y}$) for PDPC modes used (e.g., applied) in various prediction modes. Figure 5A shows an example of the reference samples ($R_{x,-1}$ and $R_{-1,y}$) used in the PDPC mode in the diagonal upper-right mode. Figure 5B shows the reference samples ($R_{x,-1}$ and $R_{-1,y}$) used in the PDPC mode in the diagonal lower-left mode. Figure 5C shows the reference samples ($R_{x,-1}$ and $R_{-1,y}$) used in the PDPC mode in the adjacent diagonal upper-right mode. Figure 5D shows the reference samples ($R_{x,-1}$ and $R_{-1,y}$) used in the PDPC mode in the adjacent diagonal lower-left mode. The predicted sample pred(x',y') can be located at (x',y') within the prediction block. For example, for the diagonal modes, the coordinate x of the reference sample $R_{x,-1}$ is given by x = x' + y' + 1, and the coordinate y of the reference sample $R_{-1,y}$ is similarly given by y = x' + y' + 1. For other angular modes, the reference samples $R_{x,-1}$ and $R_{-1,y}$ can be located at fractional sample positions. In this case, the sample value of the nearest integer sample position is used.
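[0083] In some examples (for instance, when PDPC mode is applied to the DC, planar, horizontal, and vertical intra-modes), equation (1) is:
$$w_L = 32 \gg ((x' \ll 1) \gg s) \qquad (2)$$
$$w_T = 32 \gg ((y' \ll 1) \gg s) \qquad (3)$$
$$\mathrm{pred}(x',y') = \mathrm{Clip}\big(0,\ (1 \ll \mathrm{BitDepth}) - 1,\ (w_L \times R_{-1,y'} + w_T \times R_{x',-1} + (64 - w_L - w_T) \times \mathrm{pred}(x',y') + 32) \gg 6\big) \qquad (4)$$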
[0084] When PDPC is applied to the DC, planar, horizontal, and vertical intra-modes, additional boundary filters, such as the HEVC DC mode boundary filter or the horizontal / vertical mode edge filters, are not needed. The PDPC processes for DC mode and planar mode can be identical. For angular modes, if the current angular mode is HOR_IDX or VER_IDX, the left or top reference sample is not used, respectively. The PDPC weights and scaling factors may depend on the prediction mode and block size. PDPC modes can be applied to blocks where both width and height are greater than or equal to a threshold, such as 4.
[0085] In some cases, a PDPC mode can be applied to a specific prediction block, such as an IntraBC and / or IntraTMP prediction block. In the original PDPC design, a PDPC mode designed for an intra-prediction mode, such as DC mode, planar mode, or angular mode, may not be optimized for a specific prediction mode, such as an IntraBC mode and / or IntraTMP prediction mode. In some cases, there is a coding loss if the original PDPC design is applied directly to an IntraBC and / or IntraTMP prediction mode. Each of the IBC mode and IntraTMP prediction mode can predict the current block using a reference block, where the reference block and the current block are in the same picture, and the reference block is indicated by a block vector pointing from the current block to the reference block. In some embodiments, a PDPC mode can be applied to one or more other prediction modes that use a reference block in the same picture. In some examples, an IBC mode or an IntraTMP prediction mode may be considered an intra-prediction (e.g., an intra-prediction mode) in the sense that a reference block in the same picture as the current block is used to predict the current block. In some examples, the IBC mode or IntraTMP prediction mode may be considered a separate mode distinct from intra-prediction and inter-prediction. This disclosure includes a method for applying boundary filtering (such as the PDPC mode) to prediction blocks coded by a particular prediction mode based on block / template matching, such as the IntraBC mode or IntraTMP mode. In one example, boundary filtering applies adjustments to prediction samples at block boundaries using nearby reconstructed samples from adjacent coded blocks.
[0086] The IBC mode is sometimes also called the IntraBC mode. The IBC mode can include different modes such as the IBC merge mode and the IBC regular mode. Boundary filtering can be applied to prediction blocks coded using the IBC mode or the IntraTMP mode. Boundary filtering can apply adjustments to prediction samples within a prediction block, such as prediction samples at block boundaries, using nearby reconstructed samples from previously coded areas. In one embodiment, boundary filtering is the same as the PDPC mode (also called the PDPC filter) applied to other intra-prediction modes, such as the DC mode and the planar mode, as described by equations (2) to (4).
[0087] In one embodiment, boundary filtering is based on other intra-prediction modes, such as DC mode and planar mode PDPC modes, with some adjustments.
[0088] In one example, a different value of the parameter s is used compared with the PDPC modes used in other intra-prediction modes.
[0089] Figure 6 shows an example of boundary filtering using two-tap filters according to an aspect of the present invention. In one aspect, for the left (upper) boundary prediction samples, boundary filtering is a weighted average of the left (upper) adjacent reconstructed samples and the left (upper) boundary prediction samples. Figure 6 shows an example of boundary filtering using two-tap filters on the top row and the left column of boundary prediction samples. The left adjacent reconstructed samples (613) and the upper adjacent reconstructed samples (611) may be neighbors of the current block (601). The number of top rows and left columns of prediction samples in the current block (601) that are filtered using the boundary filters may depend on the block size.
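A two-tap boundary filter of the kind shown in Figure 6 might look like the following sketch. The 1:3 weighting, the rounding, the restriction to a single row and column, and the handling of the corner sample (filtered from the left only) are all assumptions chosen for illustration; the disclosure does not fix these values.

```python
def two_tap_boundary_filter(pred, left_rec, top_rec, w_rec=1, shift=2):
    """Filter the left column and top row of `pred` with a two-tap filter.

    pred     : 2D list of predicted samples, indexed pred[y][x]
    left_rec : reconstructed column left of the block, one value per row
    top_rec  : reconstructed row above the block, one value per column
    The weights w_rec and (1 << shift) - w_rec are illustrative assumptions.
    """
    total = 1 << shift
    w_pred = total - w_rec
    rnd = total >> 1
    out = [row[:] for row in pred]
    for y in range(len(pred)):
        # Left boundary column: average with the left reconstructed neighbor.
        out[y][0] = (w_rec * left_rec[y] + w_pred * pred[y][0] + rnd) >> shift
    for x in range(1, len(pred[0])):
        # Top boundary row: average with the above reconstructed neighbor.
        # The corner sample (0, 0) was already handled by the left filter.
        out[0][x] = (w_rec * top_rec[x] + w_pred * pred[0][x] + rnd) >> shift
    return out


pred = [[100] * 4 for _ in range(4)]
filtered = two_tap_boundary_filter(pred, [60] * 4, [140] * 4)
print(filtered[0], filtered[1])
```

Only the boundary samples change; interior samples keep their predicted values.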
[0090] In one embodiment, block-level and / or high-level syntax (HLS) level flags are signaled to indicate whether PDPC mode is applied to IntraBC and / or IntraTMP prediction blocks. The HLS may be flags in the video parameter set (VPS), picture parameter set (PPS), sequence parameter set (SPS), adaptive parameter set (APS), slice header, frame header, tile header, or CTU header.
[0091] In one embodiment, the template matching (TM) cost of the current IBC or IntraTMP block (similar to, for example, the one used in the IntraTMP mode shown in Figure 4) may be used to determine whether and how to apply a boundary filter.
[0092] In one example, for a block coded in IBC mode, the TM cost is calculated based on the template area indicated by the BV of the current block. Boundary filtering can be disabled when the TM cost is less than or equal to a threshold T1, where T1 is, for example, equal to 0.
[0093] In another example, for a block coded in IntraTMP mode, the IntraTMP mode TM cost may be used to check against a threshold T2. If the TM cost is less than or equal to the threshold T2, boundary filtering may be disabled.
[0094] In one embodiment, the values of T1 and T2 can be different.
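The threshold test on the template matching cost can be sketched as below. The function name, the mode strings, and the default threshold values are hypothetical; only the comparison rule (filtering disabled when the cost is at or below the mode-specific threshold) is taken from the text.

```python
def boundary_filter_enabled(tm_cost, mode, t1=0, t2=0):
    """Decide from the TM cost whether boundary filtering is applied.

    mode is "IBC" or "IntraTMP" (labels assumed for this sketch);
    t1 and t2 are the respective thresholds and may differ.
    """
    if mode == "IBC":
        threshold = t1
    elif mode == "IntraTMP":
        threshold = t2
    else:
        raise ValueError("unknown mode: " + mode)
    # Filtering is disabled when the cost is less than or equal to the threshold.
    return tm_cost > threshold


print(boundary_filter_enabled(0, "IBC"))
print(boundary_filter_enabled(12, "IntraTMP", t2=10))
```

A zero cost means the templates match exactly, which suggests the copied block already agrees with its neighborhood and needs no boundary adjustment.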
[0095] In one embodiment, PDPC parameters such as parameter s depend on the template matching cost, as described above.
[0096] In one embodiment, boundary filters are not applied when the currently coded block is coded in IBC merge mode.
[0097] In another embodiment, when the current block is coded in IBC merge mode and the BVP is derived from adjacent upper or upper-right spatial candidates, only the left adjacent reconstructed sample is used for the boundary filter. In another embodiment, when the current block is coded in IBC merge mode and the BVP is derived from adjacent left or lower-left spatial candidates, only the upper adjacent reconstructed sample is used for the boundary filter.
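The candidate-dependent choice of reference samples for IBC merge mode can be sketched as follows. The candidate labels and the fallback of using both neighbors for any other candidate source (for example, a history-based candidate) are assumptions of this sketch, not statements from the disclosure.

```python
def boundary_refs_for_ibc_merge(bvp_source):
    """Return (use_left, use_top) for the boundary filter.

    bvp_source names the spatial candidate from which the block vector
    predictor was derived; the string labels are hypothetical.
    """
    if bvp_source in ("above", "above_right"):
        return True, False   # only the left adjacent reconstructed sample
    if bvp_source in ("left", "below_left"):
        return False, True   # only the upper adjacent reconstructed sample
    return True, True        # assumed fallback for other candidate sources


print(boundary_refs_for_ibc_merge("above_right"))
print(boundary_refs_for_ibc_merge("left"))
```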
[0098] In one embodiment, the residuals of blocks coded in IntraTMP mode and IntraBC mode are used to determine whether and how to apply a boundary filter.
[0099] In one example, a boundary filter is applied when the residual energy is greater than the threshold T1'.
[0100] In one example, the boundary filter is not applied when the residual energy is less than the threshold T2'.
[0101] In one example, the residual energy is measured by the SAD, SSE, SATD, or MSE of the residual block.
[0102] In one example, the threshold values T1' and T2' may differ for blocks coded by IntraTMP mode and IntraBC mode.
[0103] In one example, a PDPC parameter like parameter s depends on the energy of the residual.
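The residual-energy test can be illustrated with the sketch below. SATD is omitted because it requires a transform (typically Hadamard) that is outside the scope of this example. The function names and the sample threshold are hypothetical.

```python
def residual_energy(residual, metric="SAD"):
    """Measure the energy of a 2D residual block with the chosen metric."""
    flat = [v for row in residual for v in row]
    if metric == "SAD":
        return sum(abs(v) for v in flat)
    if metric == "SSE":
        return sum(v * v for v in flat)
    if metric == "MSE":
        return sum(v * v for v in flat) / len(flat)
    raise ValueError("unsupported metric: " + metric)


def filter_by_residual(residual, threshold, metric="SAD"):
    """Apply the boundary filter only when the energy exceeds the threshold."""
    return residual_energy(residual, metric) > threshold


res = [[1, -2], [3, -4]]
print(residual_energy(res, "SAD"), residual_energy(res, "SSE"), residual_energy(res, "MSE"))
print(filter_by_residual(res, threshold=8))
```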
[0104] In one embodiment, boundary filtering for IBC prediction blocks (also called IntraBC prediction blocks) and / or IntraTMP prediction blocks is a PDPC filter, but the parameters used in the PDPC mode applied to the IBC prediction block or IntraTMP prediction block may differ from the parameters used in the PDPC mode applied to other intra-prediction modes (such as planar mode, DC mode, etc.). In one example, the PDPC mode applied to the IBC prediction block and / or IntraTMP prediction block is described using equation (4).
[0105] Figure 7 shows an example of boundary filtering according to an aspect of the present invention. The current block can be coded in either IBC mode (or IntraBC mode) or IntraTMP mode. For example, the current block can be predicted in either IBC mode or IntraTMP mode, and the prediction block (701) of the current block can be determined (e.g., generated) using either IBC mode or IntraTMP mode. In one example, the prediction block (701) is an IBC prediction block predicted using IBC mode. In another example, the prediction block (701) is an IntraTMP prediction block predicted using IntraTMP mode. Boundary filtering can be applied to the prediction block (701) obtained using either IBC mode or IntraTMP mode. Therefore, boundary filtering can be applied to the predicted sample pred(x',y') (710) (marked with X in Figure 7), which is located at position (x',y') in the prediction block (701) and corresponds to a sample in the current block.
[0106] In one embodiment, the boundary filtering parameters may depend on the coding information of the current block. The boundary filtering parameters (including, for example, parameter W) can be determined based on the coding information of the current block. The coding information may include (i) size information of the current block, such as the current block size and current block aspect ratio; (ii) color component information, such as whether the current block is a luma component or a chroma component; (iii) adjacent reconstructed samples (also called reference samples) of the current block or prediction block (701); (iv) prediction samples within the prediction block (701), such as boundary prediction samples; (v) the difference between adjacent reconstructed samples of the current block and boundary prediction samples within the prediction block (701); and / or (vi) the color format of the current block, such as YUV 4:2:0, YUV 4:2:2, YUV 4:4:4, RGB, etc. Referring to Figure 7, the prediction block (701) includes prediction samples predicted using IBC mode or IntraTMP mode. In one example, boundary filtering is applied to prediction samples within prediction block (701). In another example, boundary filtering is applied to boundary prediction samples within prediction block (701). Boundary prediction samples may include one or more lines of prediction samples near the boundary of prediction block (701).
[0107] The weights used for boundary filtering can be determined by right-shifting the parameter W according to the position (x',y'). The filtered predicted sample can be generated based on a linear combination of the reference samples and the predicted sample pred(x',y') (710) according to their respective determined weights. The linear combination can be a weighted average of the reference samples and the predicted sample pred(x',y') (710) according to their respective determined weights. Boundary filtering can be performed on the predicted sample pred(x',y') (710) by calculating the weighted average of the reference samples and the predicted sample pred(x',y') (710) according to their respective weights. The reference samples can include adjacent reconstructed samples of the prediction block (701), such as the left reference sample $R_{-1,y'}$ (for example, the left adjacent reconstructed sample) and the above reference sample $R_{x',-1}$ (for example, the above adjacent reconstructed sample). The weights are sometimes also called filter coefficients. If boundary filtering is performed using PDPC mode (for example, boundary filtering is in PDPC mode), the weights are sometimes called PDPC weights or PDPC filter coefficients.
[0108] In one example, the weighted average is ($w_L \times R_{-1,y'} + w_T \times R_{x',-1} + w_{pred} \times \mathrm{pred}(x',y')$). The weights for boundary filtering can include the weight $w_L$ associated with the left reference sample $R_{-1,y'}$, the weight $w_T$ associated with the above reference sample $R_{x',-1}$, and the weight $w_{pred}$ associated with the predicted sample pred(x',y'). In one example, the sum of the weights is predefined or a constant (e.g., 64), and therefore one weight can be determined from the other two weights, as in $w_{pred} = (64 - w_L - w_T)$. In one example, the weighted average is clipped using a clipping function. In one example, boundary filtering is performed by a PDPC filter (or PDPC mode), and the filtered predicted sample (or boundary-filtered predicted sample) can be generated using equation (4), where the reference samples $R_{-1,y'}$ and $R_{x',-1}$ and the predicted sample (710) are shown in Figure 7. The parameter BitDepth in equation (4) can represent the bit depth.
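The weighted average with clipping can be sketched for a single sample as follows. The weight sum of 64 comes from the text; the rounding offset of 32 and the right shift by 6 are assumptions that follow from normalizing by 64, in the manner of the VVC PDPC process. The function names are hypothetical.

```python
def clip(lo, hi, v):
    """Clamp v to the inclusive range [lo, hi]."""
    return max(lo, min(hi, v))


def filter_sample(pred, r_left, r_top, w_l, w_t, bit_depth=10):
    """Weighted average of one predicted sample and its two reference samples.

    The three weights sum to 64, so the prediction weight is derived
    from the other two. The +32 offset and the shift by 6 (assumed here)
    implement rounded division by 64.
    """
    w_pred = 64 - w_l - w_t
    acc = w_l * r_left + w_t * r_top + w_pred * pred + 32
    return clip(0, (1 << bit_depth) - 1, acc >> 6)


# Equal pull from a darker left neighbor and a brighter top neighbor.
print(filter_sample(100, 60, 140, 8, 8, bit_depth=8))
```

With both reference weights at zero the function returns the prediction unchanged, which is the behavior expected far from the block boundary.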
[0109] Boundary filtering performed by PDPC (e.g., using equation (4)) on IBC prediction blocks or IntraTMP prediction blocks may differ from PDPC performed on intra prediction blocks acquired using, for example, intra prediction modes (e.g., DC mode, planar mode) (as described, e.g., using equations (1) to (3) or equations (2) to (4)).
[0110] In one embodiment, the parameters used in the PDPC mode for an IBC prediction block or an IntraTMP prediction block may differ from the parameters in the PDPC mode for an intra-prediction mode (e.g., DC mode, planar mode) or for an intra-prediction block (e.g., an intra-block predicted using an intra-prediction mode such as DC mode, planar mode, etc.).
[0111] In one example, when the PDPC mode is applied to an IntraBC and / or IntraTMP prediction block (e.g., prediction block (701)), the PDPC filter coefficients (weights) are derived using equations (5) to (6) below, where x' and y' represent the position of the processed sample (e.g., pred(x',y') (602)) within the prediction block (701). Exemplary values of W include, but are not limited to, 8, 4, 16, and 2. In one example, the parameter W is determined to be 8, 4, 16, or 2. In one example, W is not 32.
$$w_L = W \gg (x' \ll 1) \qquad (5)$$
$$w_T = W \gg (y' \ll 1) \qquad (6)$$
[0112] On the other hand, as mentioned above, the weights (e.g., $w_L$ and $w_T$) in the PDPC mode for intra prediction modes (such as planar mode, DC mode, etc.) can be obtained using equations (2) to (3) based on the parameter s and a constant value, which is 32.
[0113] Comparing equations (2) to (3) with equations (5) to (6), the weights $w_L$ and $w_T$ in the PDPC mode for IntraBC and / or IntraTMP prediction blocks can be obtained using equations (5) to (6) without using the parameter s (instead, the constant value "0" is used in equations (5) and (6)). The weights $w_L$ and $w_T$ in the PDPC mode for the intra prediction block can be obtained using equations (2) and (3) based on the parameter s without using the parameter W (for example, a constant value "32" is used). The parameter W in equations (5) and (6) can change, while the value "32" in equations (2) and (3) is fixed. W may also be different from 32. In equations (2) and (3), the parameter s can change, but in equations (5) and (6), s is replaced with the value "0". In one example, equations (5) and (6) are $w_L = W \gg (x' \ll 1)$ and $w_T = W \gg (y' \ll 1)$, and the weights $w_L$ and $w_T$ obtained using equations (5) to (6) for the IntraBC and / or IntraTMP prediction blocks are based on the parameter W and are independent of the parameter s.
[0114] Referring to equations (2) to (3), the parameters in the PDPC mode for the intra-prediction block may include parameter s. Referring to equations (5) to (6), the parameters in the PDPC mode for the IntraBC and / or IntraTMP prediction block may include parameter W. Comparing equations (2) to (3) with equations (5) to (6), the parameters in the PDPC mode for the IntraBC and / or IntraTMP prediction block (e.g., including parameter W) may be different from the parameters in the PDPC mode for the intra-prediction block (e.g., including parameter s).
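The two weight derivations compared above can be placed side by side in a short sketch. The form of equations (5) to (6) is stated in the text; the form used here for equations (2) to (3), with the constant 32 and the position shift reduced by s, is an assumption inferred from that comparison and from the VVC PDPC design. Function names are hypothetical.

```python
def pdpc_weights_intra(x, y, s):
    """Weights for regular intra modes: constant 32, position shift reduced by s."""
    w_l = 32 >> ((x << 1) >> s)
    w_t = 32 >> ((y << 1) >> s)
    return w_l, w_t


def pdpc_weights_ibc(x, y, w=8):
    """Weights for IntraBC / IntraTMP blocks: parameter W, with s replaced by 0."""
    w_l = w >> (x << 1)
    w_t = w >> (y << 1)
    return w_l, w_t


# Weights decay by a factor of four per sample when s is 0.
for pos in range(4):
    print(pos, pdpc_weights_ibc(pos, pos, w=8), pdpc_weights_intra(pos, pos, s=1))
```

Setting s to 0 and W to 32 makes the two derivations coincide, which shows that the IntraBC / IntraTMP variant differs only in the choice of the constant and in the removal of the size-dependent shift.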
[0115] In one embodiment, the parameters of a PDPC filter (e.g., a PDPC filter for IntraBC and / or IntraTMP prediction blocks) depend on coding information including, but not limited to, the coding block size, coding block aspect ratio, whether the current block is a luma or chroma component, adjacent reconstructed samples, boundary predicted samples within the prediction block (e.g., boundary IBC predicted samples), the difference between adjacent reconstructed samples and boundary predicted samples, and the color format (e.g., YUV 4:2:0, YUV 4:2:2, YUV 4:4:4, or RGB). In one example, a relationship may exist between the parameters of a PDPC filter (e.g., parameter W) (e.g., for IntraBC and / or IntraTMP prediction blocks) and the coding information. In one example, parameter W increases with coding block size. In another example, W decreases with coding block size. Thus, the parameters of a PDPC filter (e.g., a PDPC filter for IntraBC and / or IntraTMP prediction blocks) can be derived based on the coding information described above. The weights of a PDPC filter (e.g., $w_L$ and $w_T$) (for example, for IntraBC and / or IntraTMP prediction blocks) can depend on the coding information due to the dependence of the weights on the parameter W, for example, as shown in equations (5) to (6).
[0116] For example, the parameters of a PDPC filter (e.g., a PDPC filter for IntraBC and / or IntraTMP prediction blocks) may be derived separately for each color component or subset of color components.
[0117] In one embodiment, whether boundary filtering (e.g., PDPC mode) is applied to the current block, or to a prediction block (701) predicted by IBC or IntraTMP, can be determined from adjacent reconstructed samples. Whether a PDPC mode is applied can be adaptively determined based on the adjacent reconstructed samples; for example, whether a PDPC mode, such as that described by equations (4) to (6), is applied to the prediction block (701) can be adaptively determined based on the adjacent reconstructed samples.
[0118] In one example, a content type detection process is applied to the adjacent reconstructed samples. If the adjacent reconstructed samples are determined to be screen content (e.g., non-camera-captured content), the PDPC mode is not applied to the IntraBC and / or IntraTMP prediction blocks (e.g., IBC prediction blocks and / or IntraTMP prediction blocks). Otherwise, the PDPC mode is applied to the IntraBC and / or IntraTMP prediction blocks. For example, the content type of the adjacent reconstructed samples is detected. If the content type of the adjacent reconstructed samples is screen content, boundary filtering (e.g., PDPC mode) is not applied to the block (e.g., an IBC prediction block and / or IntraTMP prediction block such as the prediction block (701)). If the content type of the adjacent reconstructed samples is not screen content (e.g., the content type is natural camera-captured content), boundary filtering (e.g., PDPC mode) is applied to the block (e.g., an IBC prediction block and / or IntraTMP prediction block such as the prediction block (701)).
[0119] In one example, the content type detection process includes checking the number of distinct color values within the adjacent blocks. If the number of distinct color values is below a given threshold, the adjacent reconstructed samples can be determined to be screen content. For example, the number of color values (e.g., the number of distinct color values) of the adjacent reconstructed samples is checked. If the number of color values is less than the color value threshold, the content type of the adjacent reconstructed samples is detected as screen content.
[0120] In one example, the adjacent reconstructed samples are of a specific color component, and the color value includes the value of that specific color component. In another example, the color value represents the value of one specific color component, such as the luma component.
[0121] In one example, adjacent reconstructed samples are associated with multiple color components, and the color value includes combinations of the values of each of those color components. In another example, the color value represents a combination of values for multiple color components, such as a combination of Y, Cb, and Cr, or a combination of R, G, and B. In yet another example, Y represents the luma component, and Cb and Cr represent the chroma components.
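The distinct-color-count test for screen content can be sketched as below. Samples may be single component values or tuples such as (Y, Cb, Cr), matching the two cases in the text. The function names and the threshold of 8 are illustrative assumptions.

```python
def is_screen_content(neighbor_samples, color_threshold=8):
    """Classify neighbors as screen content when they contain few distinct colors.

    neighbor_samples holds either single component values (e.g., luma)
    or tuples combining several components, e.g., (Y, Cb, Cr).
    """
    return len(set(neighbor_samples)) < color_threshold


def pdpc_applies(neighbor_samples, color_threshold=8):
    """Boundary filtering is skipped when the neighbors look like screen content."""
    return not is_screen_content(neighbor_samples, color_threshold)


flat_ui = [(16, 128, 128)] * 10 + [(235, 128, 128)] * 10   # two colors only
gradient = list(range(64))                                  # many luma values
print(pdpc_applies(flat_ui), pdpc_applies(gradient))
```

Screen content tends to have sharp edges and few colors, so smoothing a copied block toward its neighbors is more likely to blur genuine detail there than in camera-captured content.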
[0122] In one example, boundary filtering (e.g., PDPC mode) is applied only to a predefined color component, for example, only when the color component of the current block (or prediction block (701)) (e.g., an IBC-coded block or an IntraTMP-coded block) is a luma component. Alternatively, boundary filtering can be applied to each color component associated with a block. In one embodiment, PDPC mode is applied to IntraBC and / or IntraTMP-coded blocks only when the current color component is luma (i.e., is a luma component). Alternatively, PDPC mode is applied to IntraBC and / or IntraTMP-coded blocks for all color components (e.g., all of Y, Cb, and Cr).
[0123] In one example, boundary filtering (e.g., PDPC mode) is applied to the prediction block (701) only when the current slice containing the current block is of a predefined slice type, such as an intra-slice.
[0124] In one embodiment, the PDPC mode is applied to IntraBC and / or IntraTMP coded blocks (e.g., IBC coded blocks or IntraTMP coded blocks) only when the current slice, which contains those blocks, is an intra-slice. In one example, whether the PDPC mode applies to intra-slices and / or inter-slices is signaled in high-level syntax (HLS). The HLS may be a flag in the VPS, PPS, SPS, APS, slice header, frame header, tile header, or CTU header.
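The color-component and slice-type conditions above can be combined into a single gating decision, sketched below. All names are hypothetical: `luma_only` and `intra_slice_only` stand in for configuration that could be predefined or derived from a high-level syntax flag, and the strings "Y" and "I" are placeholder encodings for the luma component and an intra-slice.

```python
def pdpc_allowed(color_component, slice_type,
                 luma_only=True, intra_slice_only=True):
    """Decide whether PDPC may be applied to an IBC/IntraTMP prediction block.

    color_component: "Y", "Cb", or "Cr" (hypothetical encoding).
    slice_type: "I" for an intra-slice, anything else for an inter-slice
        (hypothetical encoding).
    luma_only, intra_slice_only: hypothetical switches, e.g. set from HLS.
    """
    if luma_only and color_component != "Y":
        return False
    if intra_slice_only and slice_type != "I":
        return False
    return True
```

With both switches enabled, only luma blocks in intra-slices are filtered; clearing `luma_only` corresponds to the alternative in which all color components are filtered.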
[0125] Figure 8 is a flowchart outlining process (800) according to an embodiment of the present disclosure. Process (800) can be used in a video decoder. In various embodiments, process (800) is executed by processing circuits, such as a processing circuit that performs the functions of a video decoder (110) and a processing circuit that performs the functions of a video decoder (210). In some embodiments, process (800) is implemented with software instructions, and therefore, when a processing circuit executes a software instruction, the processing circuit executes process (800). Processing starts from (S801) and proceeds to (S810).
[0126] In (S810), a coded video bitstream can be received that includes a current picture having a block (e.g., the current block shown in Figure 7) coded in one of an intra block copy (IBC) mode and an intra template matching prediction (IntraTMP) mode.
[0127] In (S820), the prediction block of the block (for example, the prediction block (701) shown in Figure 7) can be determined using the one of the IBC mode and the IntraTMP mode.
[0128] In (S830), in response to boundary filtering being applied to a block, boundary filtering can be applied to the predicted sample pred(x',y') located at position (x',y') in the predicted block corresponding to the sample in the block, as explained with reference to Figure 7. The predicted sample pred(x',y') can be filtered by boundary filtering as follows: The weights used for boundary filtering can be determined, for example, by right-shifting the parameter W according to the position (x',y') within the prediction block. The filtered prediction samples can then be generated based on a linear combination of the reference samples and prediction samples, according to the determined weights.
[0129] In one example, the boundary filtering parameter W is determined (e.g., derived) based on the coding information of the block. The coding information may include the coding block size, coding block aspect ratio, whether the block is a luma or chroma component, adjacent reconstructed samples of the block, boundary prediction samples within the prediction block, the difference between adjacent reconstructed samples of the block and boundary prediction samples within the prediction block, or the color format of the block. In one example, the parameter W is determined to be 8, 4, 16, or 2.
[0130] The reference samples can include, as shown in Figure 7, a left reference sample R_{-1,y'} and an upper reference sample R_{x',-1}. The weights can include a weight w_L associated with the left reference sample R_{-1,y'}, a weight w_T associated with the upper reference sample R_{x',-1}, and a weight (64 - w_L - w_T) associated with the prediction sample pred(x', y'). The weight w_L can be determined as w_L = W >> ((x' << 1) >> 0), and the weight w_T can be determined as w_T = W >> ((y' << 1) >> 0).
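The weight derivation can be sketched as follows to show how quickly the reference-sample weights decay with distance from the block boundary. The function name and the default W = 16 (one of the example values given for W) are choices made for illustration only.

```python
def pdpc_weights(x, y, W=16):
    """Return (w_L, w_T, w_pred) for position (x, y) in the prediction block.

    Follows w_L = W >> ((x << 1) >> 0) and w_T = W >> ((y << 1) >> 0);
    the prediction sample receives the remaining weight 64 - w_L - w_T.
    """
    w_left = W >> ((x << 1) >> 0)
    w_top = W >> ((y << 1) >> 0)
    w_pred = 64 - w_left - w_top
    return w_left, w_top, w_pred
```

With W = 16, the left weight is 16 at x' = 0 and falls to 4, 1, and 0 at x' = 1, 2, and 3, so only the first few columns and rows next to the boundary are affected by the reference samples.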
[0131] In one example, boundary filtering is performed by a position-dependent predictor combination (PDPC) filter. The filtered prediction sample can be generated as Clip(0, (1 << BitDepth) - 1, (w_L × R_{-1,y'} + w_T × R_{x',-1} + (64 - w_L - w_T) × pred(x', y') + 32) >> 6). The parameter BitDepth indicates the bit depth.
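A per-sample sketch of the filter is given below. It assumes the three weights sum to 64, so that adding 32 and right-shifting by 6 performs a rounded division by 64; the function names, argument order, and default values are hypothetical.

```python
def clip(lo, hi, value):
    """Clamp value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def pdpc_filter_sample(pred, ref_left, ref_top, x, y, W=16, bit_depth=8):
    """Filter one prediction sample pred(x, y).

    ref_left is the left reference sample R(-1, y) and ref_top is the
    upper reference sample R(x, -1).
    """
    w_left = W >> ((x << 1) >> 0)
    w_top = W >> ((y << 1) >> 0)
    acc = (w_left * ref_left + w_top * ref_top
           + (64 - w_left - w_top) * pred + 32)
    return clip(0, (1 << bit_depth) - 1, acc >> 6)
```

For example, with pred = 100, a left reference of 200, an upper reference of 50, and W = 16, the sample at position (0, 0) becomes (16 × 200 + 16 × 50 + 32 × 100 + 32) >> 6 = 113, while a sample at position (3, 3) is left unchanged at 100 because both reference weights are zero.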
[0132] Next, the process proceeds to (S899) and terminates.
[0133] Process (800) can be appropriately adapted. Steps in process (800) can be modified and / or omitted. Additional steps can be added. Any appropriate implementation order can be used. In one example, whether boundary filtering is applied to a block is determined from adjacent reconstructed samples of the block or predicted block.
[0134] Figure 9 is a flowchart outlining process (900) according to an embodiment of the present disclosure. Process (900) can be used in a video encoder. In various embodiments, process (900) is executed by processing circuits, such as a processing circuit that performs the functions of a video encoder (103) and a processing circuit that performs the functions of a video encoder (303). In some embodiments, process (900) is implemented with software instructions, and therefore, when a processing circuit executes a software instruction, the processing circuit executes process (900). Processing starts at (S901) and proceeds to (S910).
[0135] In (S910), the prediction block of a block can be determined using one of an intra block copy (IBC) mode and an intra template matching prediction (IntraTMP) mode.
[0136] In (S920), in response to boundary filtering being applied to a block, boundary filtering can be applied to the predicted sample pred(x',y') located at position (x',y') in the predicted block corresponding to the sample in the block, as explained with reference to Figure 7. The predicted sample pred(x',y') can be filtered by boundary filtering as follows: The weights used for boundary filtering can be determined, for example, by shifting the parameter W to the right according to the position (x',y') within the prediction block. The filtered prediction samples can then be generated based on a linear combination of the reference samples and prediction samples, according to the determined weights.
[0137] In one example, the boundary filtering parameter W is determined (derived) based on the block size of the block, the block aspect ratio of the block, whether the block is a luma component or a chroma component, the adjacent reconstructed samples of the block, the boundary prediction samples in the prediction block, the difference between the adjacent reconstructed samples of the block and the boundary prediction samples in the prediction block, or the color format of the block. In one example, the parameter W is determined to be 8, 4, 16, or 2.
[0138] The reference samples can include, as shown in FIG. 7, a left reference sample R_{-1,y'} and an upper reference sample R_{x',-1}. The weights can include a weight w_L associated with the left reference sample R_{-1,y'}, a weight w_T associated with the upper reference sample R_{x',-1}, and a weight (64 - w_L - w_T) associated with the prediction sample pred(x', y'). The weight w_L can be determined as w_L = W >> ((x' << 1) >> 0), and the weight w_T can be determined as w_T = W >> ((y' << 1) >> 0).
[0139] In one example, boundary filtering is performed by a position-dependent predictor combination (PDPC) filter. The filtered prediction sample can be generated as Clip(0, (1 << BitDepth) - 1, (w_L × R_{-1,y'} + w_T × R_{x',-1} + (64 - w_L - w_T) × pred(x', y') + 32) >> 6). The parameter BitDepth indicates the bit depth.
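Because the encoder applies the same filter to every sample of the prediction block, the operation can also be sketched at block level. The layout below is an assumption for illustration: `pred_block[y][x]` holds pred(x, y), `left_refs[y]` holds R(-1, y), and `top_refs[x]` holds R(x, -1); none of these names come from the disclosure.

```python
def pdpc_filter_block(pred_block, left_refs, top_refs, W=16, bit_depth=8):
    """Apply the PDPC boundary filter to every sample of a prediction block."""
    max_val = (1 << bit_depth) - 1
    filtered = []
    for y, row in enumerate(pred_block):
        w_top = W >> ((y << 1) >> 0)
        out_row = []
        for x, pred in enumerate(row):
            w_left = W >> ((x << 1) >> 0)
            acc = (w_left * left_refs[y] + w_top * top_refs[x]
                   + (64 - w_left - w_top) * pred + 32)
            # Clip(0, (1 << BitDepth) - 1, acc >> 6)
            out_row.append(max(0, min(max_val, acc >> 6)))
        filtered.append(out_row)
    return filtered
```

When the reference samples equal the prediction samples, the block passes through unchanged, which is a convenient sanity check for an implementation.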
[0140] Then, the process proceeds to (S999) and ends.
[0141] Process (900) can be appropriately adapted. Steps in process (900) can be modified and / or omitted. Additional steps can be added. Any appropriate implementation order can be used.
[0142] The aspects, embodiments, and / or examples in this disclosure may be used separately or combined in any order. Each of the methods (or aspects), encoders, and decoders may be implemented by a processing circuit (e.g., one or more processors or one or more integrated circuits). In one example, one or more processors execute a program stored in a non-transitory computer-readable medium.
[0143] The above-described technology can be implemented as computer software using computer-readable instructions and physically stored on one or more computer-readable media. For example, Figure 10 shows a computer system (1000) suitable for implementing an aspect of the disclosed subject matter.
[0144] Computer software can be coded using any suitable machine code or computer language that may be subject to assembly, compilation, linking, or similar mechanisms, and can contain code that includes instructions that can be executed directly or through interpretation, microcode execution, etc., by one or more computer central processing units (CPUs), graphics processing units (GPUs), etc.
[0145] The instructions can be executed on various types of computers or their components, including, for example, personal computers, tablet computers, servers, smartphones, game devices, and Internet of Things devices.
[0146] The components shown in Figure 10 for the computer system (1000) are essentially illustrative and are not intended to imply any limitations on the scope or functionality of computer software implementing aspects of this disclosure. Furthermore, the configuration of the components should not be construed as having any dependence or requirement on any one or combination of components illustrated in the exemplary aspects of the computer system (1000).
[0147] The computer system (1000) may include certain human interface input devices. Such a human interface input device may respond to input from one or more human users, for example, through tactile input (keystrokes, swipes, data glove movements, etc.), audio input (voice, clapping, etc.), visual input (gestures, etc.), or olfactory input (not shown). The human interface input devices may also be used to capture certain media that are not necessarily directly related to conscious human input, such as audio (voices, music, ambient sounds, etc.), images (scanned images, photographic images obtained from still image cameras, etc.), or video (2D video, 3D video including stereoscopic images, etc.).
[0148] The human interface input device may include one or more of the following: keyboard (1001), mouse (1002), trackpad (1003), touch screen (1010), data glove (not shown), joystick (1005), microphone (1006), scanner (1007), and camera (1008) (only one of each is shown).
[0149] The computer system (1000) may also include certain human interface output devices. Such human interface output devices may stimulate the senses of one or more human users, for example, through tactile output, sound, light, and smell / taste. Such human interface output devices may include tactile output devices (e.g., tactile feedback via a touch screen (1010), data glove (not shown), or joystick (1005), although there may also be tactile feedback devices that do not function as input devices), audio output devices (e.g., speakers (10010), headphones (not shown)), visual output devices (screens (1010), including CRT screens, LCD screens, plasma screens, OLED screens, etc., with or without touch screen input functionality and with or without tactile feedback functionality, some of which may output two-dimensional visual output or output beyond three dimensions through means such as stereoscopic image output or virtual reality glasses (not shown), holographic displays, and smoke tanks (not shown)), and printers (not shown).
[0150] The computer system (1000) may also include human-accessible storage devices and their associated media, such as optical media including CD / DVD ROM / RW (1020) with CD / DVD or similar media (1021), thumb drives (1022), removable hard drives or solid-state drives (1023), legacy magnetic media such as tapes and floppy disks (not shown), and specialized ROM / ASIC / PLD-based devices such as security dongles (not shown).
[0151] Those skilled in the art should also understand that, when used in connection with the subject matter now disclosed, the term “computer-readable medium” does not encompass a transmission medium, carrier wave, or other transient signal.
[0152] The computer system (1000) may also include an interface (1054) to one or more communication networks (1055). The networks may be, for example, wireless, wired, or optical. The networks may further be local, wide-area, metropolitan, vehicular and industrial, real-time, delay-tolerant, etc. Examples of networks include local area networks such as Ethernet® and wireless LAN; cellular networks including GSM®, 3G, 4G, 5G, LTE, etc.; wired or wireless wide-area digital networks including cable TV, satellite TV, and terrestrial TV; and vehicular and industrial networks including CANBus, etc. Certain networks generally require external network interface adapters attached to specific general-purpose data ports or peripheral buses (1049) (e.g., USB ports on the computer system (1000)), while others are generally integrated into the core of the computer system (1000) by attachment to a system bus described below (e.g., an Ethernet® interface to a PC computer system or a cellular network interface to a smartphone computer system). Using any of these networks, the computer system (1000) can communicate with other entities. Such communication may be unidirectional and receive-only (e.g., broadcast television), unidirectional and send-only (e.g., from CANbus to specific CANbus devices), or bidirectional (e.g., to other computer systems using a local or wide-area digital network). As described above, specific protocols and protocol stacks may be used in each of these networks and network interfaces.
[0153] The aforementioned human interface device, human-accessible storage device, and network interface can be attached to the core (1040) of the computer system (1000).
[0154] The core (1040) may include one or more central processing units (CPUs) (1041), graphics processing units (GPUs) (1042), dedicated programmable processing units in the form of field-programmable gate arrays (FPGAs) (1043), hardware accelerators for specific tasks (1044), graphics adapters (1050), etc. These devices may be connected via a system bus (1048) along with read-only memory (ROM) (1045), random access memory (RAM) (1046), internal non-user-accessible hard drives, SSDs, and other internal mass storage (1047). In some computer systems, the system bus (1048) is accessible in the form of one or more physical plugs to allow expansion with additional CPUs, GPUs, etc. Peripheral devices can be attached directly to the core's system bus (1048) or via a peripheral bus (1049). For example, a screen (1010) can be connected to the graphics adapter (1050). The peripheral bus architecture includes PCI, USB, etc.
[0155] The CPU (1041), GPU (1042), FPGA (1043), and accelerator (1044) can execute specific instructions that, when combined, constitute the aforementioned computer code. This computer code can be stored in ROM (1045) or RAM (1046). Temporary data can also be stored in RAM (1046), while permanent data can be stored, for example, in internal mass storage (1047). High-speed storage and retrieval of any of the memory devices can be enabled through the use of cache memory that can be closely associated with one or more CPUs (1041), GPUs (1042), mass storage (1047), ROMs (1045), RAM (1046), etc.
[0156] A computer-readable medium may have computer code thereon for performing various computer-implemented operations. The medium and computer code may be specifically designed and constructed for the purposes of this disclosure, or they may be of a type that is well known and available to those skilled in the computer software technology.
[0157] For example, but not limited to, a computer system (1000) having an architecture, in particular a core (1040), can provide functionality as a result of a processor (including a CPU, GPU, FPGA, accelerator, etc.) executing software embodied in one or more tangible computer-readable media. Such computer-readable media can be user-accessible mass storage as described above, as well as media associated with specific storage of the core (1040) of a non-transient nature, such as core internal mass storage (1047) or ROM (1045). Software implementing various aspects of this disclosure can be stored in such devices and executed by the core (1040). The computer-readable media can include one or more memory devices or chips according to specific needs. The software can cause the core (1040) and in particular the processor (including a CPU, GPU, FPGA, etc.) therein to execute specific processes or specific parts of specific processes as described herein, including defining data structures to be stored in RAM (1046) and modifying such data structures according to processes defined by the software. As an addition or alternative, a computer system may provide functionality as a result of being embodied in a circuit (e.g., an accelerator (1044)) in logic hardwired or otherwise, which can operate in place of or with software to perform a particular process or a particular part of a particular process described herein. References to software include logic, and vice versa, as appropriate. References to a computer-readable medium may include a circuit (such as an integrated circuit (IC)) that stores software for execution, a circuit that embodies logic for execution, or both, where appropriate. This disclosure encompasses any appropriate combination of hardware and software.
[0158] In this disclosure, the use of “at least one of” or “one of” is intended to include any one or a combination of the enumerated elements. For example, “at least one of A, B, or C”; “at least one of A, B, and C”; “at least one of A, B, and / or C”; and references to at least one of A-C are intended to include A only, B only, C only, or any combination thereof. References to one of A or B, and one of A and B, are intended to include A or B or (A and B). The use of “one of” does not exclude any combination of the enumerated elements when applicable, such as when the elements are not mutually exclusive.
[0159] While this disclosure describes several exemplary embodiments, there are various modifications, substitutions, and equivalent alternatives that fall within the scope of this disclosure. Therefore, those skilled in the art will understand that various systems and methods, not expressly shown or described herein, embody the principles of this disclosure and thus fall within the spirit and scope of this disclosure.
Claims
1. A method of video decoding performed by a processor, the method comprising: receiving a coded video bitstream that includes a current picture having a block coded in one of an intra block copy (IBC) mode and an intra template matching prediction (IntraTMP) mode; determining a prediction block of the block using the one of the IBC mode and the IntraTMP mode; determining, based on adjacent reconstructed samples, whether to apply boundary filtering to the prediction block of the block, including detecting a content type of the adjacent reconstructed samples and determining, based on the detected content type, whether to apply the boundary filtering to the prediction block of the block; and applying the boundary filtering to the prediction block of the block, including applying the boundary filtering to a prediction sample pred(x', y') located at position (x', y') in the prediction block corresponding to a sample in the block, wherein applying the boundary filtering includes: determining a boundary filtering parameter W based on coding information of the block; determining weights used in the boundary filtering by right-shifting the parameter W according to the position (x', y'); and generating a filtered prediction sample based on a linear combination of reference samples and the prediction sample pred(x', y') according to the determined weights.
2. The weights include a weight w_L associated with a left reference sample R_{-1,y'}, a weight w_T associated with an upper reference sample R_{x',-1}, and a weight (64 - w_L - w_T) associated with the prediction sample pred(x', y'); the weight w_L is equal to W >> ((x' << 1) >> 0), and the weight w_T is equal to W >> ((y' << 1) >> 0). The method according to claim 1.
3. The boundary filtering is performed by a position-dependent predictor combination (PDPC) filter. Generating the filtered prediction sample includes generating the filtered prediction sample as Clip(0, (1 << BitDepth) - 1, (w_L × R_{-1,y'} + w_T × R_{x',-1} + (64 - w_L - w_T) × pred(x', y') + 32) >> 6), where BitDepth indicates the bit depth. The method according to claim 2.
4. The coding information of the block includes the coding block size, the coding block aspect ratio, whether the block is a luma component or a chroma component, the adjacent reconstructed samples of the block, the boundary prediction samples within the prediction block, the difference between the adjacent reconstructed samples of the block and the boundary prediction samples within the prediction block, or the color format of the block. The method according to claim 1.
5. The parameter W is determined to be 8, 4, 16, or 2. The method according to claim 1.
6. The step of determining whether to apply the boundary filtering includes: not applying the boundary filtering to the prediction block of the block in response to the content type of the adjacent reconstructed samples being screen content; and applying the boundary filtering to the prediction block of the block in response to the content type of the adjacent reconstructed samples not being screen content. The method according to claim 1.
7. Detecting the content type includes: checking the number of color values of the adjacent reconstructed samples; and detecting the content type as the screen content in response to the number of color values being less than a color value threshold. The method according to claim 6.
8. The adjacent reconstructed samples are of a specific color component, and the color values include the values of the specific color component. The method according to claim 7.
9. The adjacent reconstructed samples are associated with multiple color components, and the color values include combinations of the values of each of the multiple color components. The method according to claim 7.
10. A device for video decoding, comprising a processing circuit configured to perform the method described in any one of claims 1 to 9.
11. A computer program that, when executed by at least one processor, causes the at least one processor to perform the method according to any one of claims 1 to 9.