Methods and apparatus of adaptive predictor blending and adaptive boundary process order in overlapped block refinement for intra prediction in video coding

By integrating OBMC with normal motion compensation and refining intra block boundaries using intra prediction, the method addresses computational complexity and visual artefacts in video coding, enhancing coding efficiency and visual quality.

WO2026130282A1 · PCT designated stage · Publication Date: 2026-06-25 · MEDIATEK INC

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: MEDIATEK INC
Filing Date: 2025-12-15
Publication Date: 2026-06-25

AI Technical Summary

Technical Problem

Existing video coding systems face challenges in efficiently handling visual artefacts and reducing computational complexity due to the separate application of Overlapped Block Motion Compensation (OBMC) and Bi-Directional Optical Flow (BIO) processes, which increase bandwidth and memory requirements, and the lack of refinement for intra block boundaries in intra prediction.

Method used

The proposed method integrates OBMC with normal motion compensation processes, applies OBMC to intra block boundaries using intra prediction modes, and refines boundary pixels with intra prediction, while optimizing the order of geometric partitioning and sub-block processing to reduce computational complexity and improve coding efficiency.

Benefits of technology

This approach enhances coding efficiency by reducing computational complexity and memory bandwidth, improves visual quality by smoothing intra block boundaries, and provides better handling of intra prediction discontinuities, leading to improved video coding performance.

Patent Text Reader

Abstract

A method and apparatus of video coding using overlapped boundary processing for blended intra prediction are disclosed. According to this method, an intra prediction blending process and an overlapped boundary refinement process are applied to the current block. Whether the intra prediction blending process is applied to the current block before or after the overlapped boundary refinement process is pre-defined or is determined adaptively for the current block according to one or more conditions. A refined final predictor is derived, and the derivation process depends on whether the intra prediction blending process is applied to the current block or the current subblock before or after the overlapped boundary refinement process. The current block or the current subblock is encoded or decoded using the refined final predictor.

Description

METHODS AND APPARATUS OF ADAPTIVE PREDICTOR BLENDING AND ADAPTIVE BOUNDARY PROCESS ORDER IN OVERLAPPED BLOCK REFINEMENT FOR INTRA PREDICTION IN VIDEO CODING

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] The present invention is a non-Provisional Application of and claims priority to U.S. Provisional Patent Application No. 63/735,455, filed on December 18, 2024. The U.S. Provisional Patent Application is hereby incorporated by reference in its entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to a video coding system using overlapped block boundary processing for intra prediction. In particular, the present invention relates to the processing order for applying overlapped block boundary processing and blended intra prediction.

BACKGROUND AND RELATED ART

[0003] Versatile video coding (VVC) is the latest international video coding standard developed by the Joint Video Experts Team (JVET) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). The standard has been published as an ISO standard: ISO/IEC 23090-3:2021, Information technology - Coded representation of immersive media - Part 3: Versatile video coding, published Feb. 2021. VVC is developed based on its predecessor HEVC (High Efficiency Video Coding) by adding more coding tools to improve coding efficiency and also to handle various types of video sources including 3-dimensional (3D) video signals.

[0004] Fig. 1A illustrates an exemplary adaptive Inter / Intra video encoding system incorporating loop processing. For Intra Prediction 110, the prediction data is derived based on previously coded video data in the current picture. For Inter Prediction 112, Motion Estimation (ME) is performed at the encoder side and Motion Compensation (MC) is performed based on the result of ME to provide prediction data derived from other picture (s) and motion data. Switch 114 selects Intra Prediction 110 or Inter Prediction 112 and the selected prediction data is supplied to Adder 116 to form prediction errors, also called residues. The prediction error is then processed by Transform (T) 118 followed by Quantization (Q) 120. The transformed and quantized residues are then coded by Entropy Encoder 122 to be included in a video bitstream corresponding to the compressed video data. The bitstream associated with the transform coefficients is then packed with side information such as motion and coding modes associated with Intra prediction and Inter prediction, and other information such as parameters associated with loop filters applied to underlying image area. The side information associated with Intra Prediction 110, Inter prediction 112 and in-loop filter 130, is provided to Entropy Encoder 122 as shown in Fig. 1A. When an Inter-prediction mode is used, a reference picture or pictures have to be reconstructed at the encoder end as well. Consequently, the transformed and quantized residues are processed by Inverse Quantization (IQ) 124 and Inverse Transformation (IT) 126 to recover the residues. The residues are then added back to prediction data 136 at Reconstruction (REC) 128 to reconstruct video data. The reconstructed video data may be stored in Reference Picture Buffer 134 and used for prediction of other frames.

[0005] As shown in Fig. 1A, incoming video data undergoes a series of processing in the encoding system. The reconstructed video data from REC 128 may be subject to various impairments due to a series of processing. Accordingly, in-loop filter 130 is often applied to the reconstructed video data before the reconstructed video data are stored in the Reference Picture Buffer 134 in order to improve video quality. For example, deblocking filter (DF), Sample Adaptive Offset (SAO) and Adaptive Loop Filter (ALF) may be used. The loop filter information may need to be incorporated in the bitstream so that a decoder can properly recover the required information. Therefore, loop filter information is also provided to Entropy Encoder 122 for incorporation into the bitstream. In Fig. 1A, Loop filter 130 is applied to the reconstructed video before the reconstructed samples are stored in the reference picture buffer 134. The system in Fig. 1A is intended to illustrate an exemplary structure of a typical video encoder. It may correspond to the High Efficiency Video Coding (HEVC) system, VP8, VP9, H.264 or VVC.

[0006] The decoder, as shown in Fig. 1B, can use functional blocks that are similar to, or partly the same as, those of the encoder, except for Transform 118 and Quantization 120, since the decoder only needs Inverse Quantization 124 and Inverse Transform 126. Instead of Entropy Encoder 122, the decoder uses an Entropy Decoder 140 to decode the video bitstream into quantized transform coefficients and needed coding information (e.g. ILPF information, Intra prediction information and Inter prediction information). The Intra prediction 150 at the decoder side does not need to perform the mode search. Instead, the decoder only needs to generate Intra prediction according to Intra prediction information received from the Entropy Decoder 140. Furthermore, for Inter prediction, the decoder only needs to perform motion compensation (MC 152) according to Inter prediction information received from the Entropy Decoder 140 without the need for motion estimation.

[0007] Overlapped Block Motion Compensation (OBMC)

[0008] Overlapped Block Motion Compensation (OBMC) finds a Linear Minimum Mean Squared Error (LMMSE) estimate of a pixel intensity value based on motion-compensated signals derived from its nearby block motion vectors (MVs). From an estimation-theoretic perspective, these MVs are regarded as different plausible hypotheses for its true motion, and to maximize coding efficiency, their weights should minimize the mean squared prediction error subject to the unit-gain constraint.

[0009] When High Efficiency Video Coding (HEVC) was developed, several proposals were made using OBMC to provide coding gain. Some of them are described as follows.

[0010] In JCTVC-C251 (Peisong Chen, et. al., “Overlapped block motion compensation in TMuC” , Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO / IEC JTC1 / SC29 / WG11, 3rd Meeting: Guangzhou, CN, 7-15 October, 2010, Document: JCTVC-C251) , OBMC was applied to geometry partition. In geometry partition, it is very likely that a transform block contains pixels belonging to different partitions. In geometry partition, since two different motion vectors are used for motion compensation, the pixels at the partition boundary may have large discontinuities that can produce visual artefacts similar to blockiness. This in turn decreases the transform efficiency. Let the two regions created by a geometry partition be denoted by region 1 and region 2. A pixel from region 1 (2) is defined to be a boundary pixel if any of its four connected neighbours (left, top, right, and bottom) belongs to region 2 (1) . Fig. 2 shows an example where grey-dotted pixels belong to the boundary of region 1 (grey region) and white-dotted pixels belong to the boundary of region 2 (white region) . If a pixel is a boundary pixel, the motion compensation is performed using a weighted sum of the motion predictions from the two motion vectors. The weights are 3 / 4 for the prediction using the motion vector of the region containing the boundary pixel and 1 / 4 for the prediction using the motion vector of the other region. The overlapping boundaries improve the visual quality of the reconstructed video while also providing BD-rate gain.
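The boundary-pixel rule described for JCTVC-C251 can be illustrated with a minimal Python sketch. It marks a pixel as a boundary pixel when any of its four connected neighbours belongs to the other region, then blends boundary pixels with weights 3/4 and 1/4. The function names `boundary_mask` and `blend_geometry`, the list-of-lists sample layout, and the rounding offset are assumptions made for illustration and are not taken from the cited proposal.

```python
def boundary_mask(region):
    """Return True where a pixel has a 4-connected neighbour in the other region."""
    h, w = len(region), len(region[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # left, top, right, bottom neighbours
            for dy, dx in ((0, -1), (-1, 0), (0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and region[ny][nx] != region[y][x]:
                    mask[y][x] = True
                    break
    return mask


def blend_geometry(region, pred1, pred2):
    """Blend two motion-compensated predictions along the partition boundary.

    region[y][x] is 1 or 2; pred1 and pred2 are the predictions obtained with
    the motion vectors of region 1 and region 2 respectively.
    """
    mask = boundary_mask(region)
    out = []
    for y, row in enumerate(region):
        out_row = []
        for x, label in enumerate(row):
            own, other = (pred1, pred2) if label == 1 else (pred2, pred1)
            if mask[y][x]:
                # 3/4 own region + 1/4 other region; the "+ 2" rounding is assumed
                out_row.append((3 * own[y][x] + other[y][x] + 2) >> 2)
            else:
                out_row.append(own[y][x])
        out.append(out_row)
    return out
```

For a 2x4 block split vertically down the middle, only the two columns touching the partition line are blended; the outer columns keep their own prediction.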

[0011] In JCTVC-F299 (Liwei Guo, et. al., “CE2: Overlapped Block Motion Compensation for 2NxN and Nx2N Motion Partitions” , Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO / IEC JTC1 / SC29 / WG11, 6th Meeting: Torino, 14-22 July, 2011, Document: JCTVC-F299) , OBMC was applied to symmetrical motion partitions. If a coding unit (CU) is partitioned into 2 2NxN or Nx2N prediction units (PUs) , OBMC is applied to the horizontal boundary of the two 2NxN prediction blocks, and the vertical boundary of the two Nx2N prediction blocks. Since those partitions may have different motion vectors, the pixels at partition boundaries may have large discontinuities, which may generate visual artefacts and also reduce the transform / coding efficiency. In JCTVC-F299, OBMC is introduced to smooth the boundaries of motion partition.

[0012] Figs. 3A-B illustrate an example of OBMC for 2NxN (Fig. 3A) and Nx2N blocks (Fig. 3B) . The grey pixels are pixels belonging to Partition 0 and white pixels are pixels belonging to Partition 1. The overlapped region in the luma component is defined as 2 rows (columns) of pixels on each side of the horizontal (vertical) boundary. For pixels which are 1 row (column) apart from the partition boundary, i.e., pixels labelled as A in Figs. 3A-B, OBMC weighting factors are (3 / 4, 1 / 4) . For pixels which are 2 rows (columns) apart from the partition boundary, i.e., pixels labelled as B in Figs. 3A-B, OBMC weighting factors are (7 / 8, 1 / 8) . For chroma components, the overlapped region is defined as 1 row (column) of pixels on each side of the horizontal (vertical) boundary, and the weighting factors are (3 / 4, 1 / 4) .
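The luma weighting for a 2NxN split can be sketched in Python as follows. Rows one row away from the horizontal boundary use (3/4, 1/4) and rows two rows away use (7/8, 1/8); all other rows are left untouched. The function name `obmc_2nxn_luma`, the `boundary_row` parameter, and the integer rounding are hypothetical choices for this sketch. Chroma, which uses a single row with (3/4, 1/4), is not covered.

```python
# Distance in rows from the partition boundary -> (own weight, right shift).
# 1 row away: 3/4 own, 1/4 other. 2 rows away: 7/8 own, 1/8 other.
LUMA_WEIGHTS = {1: (3, 2), 2: (7, 3)}


def obmc_2nxn_luma(pred_p0, pred_p1, boundary_row):
    """Blend two full-block luma predictions across a horizontal boundary.

    pred_p0 / pred_p1: predictions of the whole block using the motion vector
    of Partition 0 / Partition 1. boundary_row: first row of Partition 1.
    """
    out = []
    for y in range(len(pred_p0)):
        in_p0 = y < boundary_row
        own, other = (pred_p0, pred_p1) if in_p0 else (pred_p1, pred_p0)
        dist = boundary_row - y if in_p0 else y - boundary_row + 1
        if dist in LUMA_WEIGHTS:
            w_own, shift = LUMA_WEIGHTS[dist]
            w_other = (1 << shift) - w_own
            rnd = 1 << (shift - 1)  # rounding offset (assumed)
            out.append([(w_own * a + w_other * b + rnd) >> shift
                        for a, b in zip(own[y], other[y])])
        else:
            out.append(list(own[y]))
    return out
```

The Nx2N case follows by applying the same weights to columns instead of rows.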

[0013] Currently, the OBMC is performed after normal MC, and BIO is also applied in these two MC processes, separately. That is, the MC results for the overlapped region between two CUs or PUs are generated by another process, not in the normal MC process. BIO (Bi-Directional Optical Flow) is then applied to refine these two MC results. This can help to skip the redundant OBMC and BIO processes when two neighbouring MVs are the same. However, the required bandwidth and MC operations for the overlapped region are increased compared to integrating the OBMC process into the normal MC process. For example, suppose the current PU size is 16x8, the overlapped region is 16x2, and the interpolation filter in MC is 8-tap. If the OBMC is performed after normal MC, then (16+7) x (8+7) + (16+7) x (2+7) = 552 reference pixels per reference list are needed for the current PU and the related OBMC. If the OBMC operations are combined with normal MC into one stage, then only (16+7) x (8+2+7) = 391 reference pixels per reference list are needed for the current PU and the related OBMC. Therefore, in the following, several methods are proposed to reduce the computational complexity or memory bandwidth of BIO when BIO and OBMC are enabled simultaneously.
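The reference-pixel arithmetic in this example can be reproduced with a short Python sketch. An N-tap interpolation filter needs N-1 extra reference samples in each dimension, which gives the "+7" terms for an 8-tap filter. The function names and parameter names below are hypothetical and are used only to restate the calculation.

```python
def ref_pixels_separate(width, height, overlap, taps=8):
    """Reference pixels per list when OBMC runs as a separate MC pass."""
    pad = taps - 1  # extra samples required by the interpolation filter
    return (width + pad) * (height + pad) + (width + pad) * (overlap + pad)


def ref_pixels_combined(width, height, overlap, taps=8):
    """Reference pixels per list when OBMC is merged into the normal MC."""
    pad = taps - 1
    return (width + pad) * (height + overlap + pad)


if __name__ == "__main__":
    # 16x8 PU, 16x2 overlapped region, 8-tap filter
    print(ref_pixels_separate(16, 8, 2))  # 552
    print(ref_pixels_combined(16, 8, 2))  # 391
```

The saving comes from fetching the filter padding rows once instead of twice.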

[0014] In the JEM (Joint Exploration Model), the OBMC is also applied. In the JEM, unlike in H.263, OBMC can be switched on and off using syntax at the CU level. When OBMC is used in the JEM, the OBMC is performed for all motion compensation (MC) block boundaries except for the right and bottom boundaries of a CU. Moreover, it is applied to both the luma and chroma components. In the JEM, an MC block corresponds to a coding block. When a CU is coded with sub-CU mode (including sub-CU merge, affine and FRUC mode), each sub-block of the CU is an MC block. To process CU boundaries in a uniform fashion, OBMC is performed at sub-block level for all MC block boundaries, where the sub-block size is set equal to 4×4, as illustrated in Figs. 4A-B.

[0015] When OBMC is applied to the current sub-block, besides the current motion vectors, the motion vectors of four connected neighbouring sub-blocks, if available and not identical to the current motion vector, are also used to derive the prediction block for the current sub-block. These multiple prediction blocks based on multiple motion vectors are combined to generate the final prediction signal of the current sub-block. A prediction block based on the motion vectors of a neighbouring sub-block is denoted as PNn, with n indicating an index for the neighbouring above, below, left and right sub-blocks, and the prediction block based on the motion vectors of the current sub-block is denoted as PC. Fig. 4A illustrates an example of OBMC for sub-blocks of the current CU 410 using a neighbouring above sub-block (i.e., PN1), a left neighbouring sub-block (i.e., PN2), and left and above sub-blocks (i.e., PN3). Fig. 4B illustrates an example of OBMC for the ATMVP mode, where block PN of the current CU 420 uses MVs from four neighbouring sub-blocks for OBMC. When PN is based on the motion information of a neighbouring sub-block that contains the same motion information as the current sub-block, the OBMC is not performed from PN. Otherwise, every sample of PN is added to the same sample in PC, i.e., four rows / columns of PN are added to PC. The weighting factors {1/4, 1/8, 1/16, 1/32} are used for PN and the weighting factors {3/4, 7/8, 15/16, 31/32} are used for PC. The exceptions are small MC blocks (i.e., when the height or width of the coding block is equal to 4 or a CU is coded with sub-CU mode), for which only two rows / columns of PN are added to PC. In this case, weighting factors {1/4, 1/8} are used for PN and weighting factors {3/4, 7/8} are used for PC. For PN generated based on the motion vectors of a vertically (horizontally) neighbouring sub-block, samples in the same row (column) of PN are added to PC with the same weighting factor.
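As a rough illustration of the JEM weighting, the Python sketch below blends PN from an above neighbour into PC row by row. It assumes that the row nearest the shared boundary (row 0 here) receives the largest PN weight of 1/4, that weights decrease with distance, and that integer rounding is used; the text above gives the weight sets but not this mapping or the rounding. The name `blend_above` and the `small_block` flag are hypothetical.

```python
# Right shifts for the PN weights 1/4, 1/8, 1/16, 1/32.
# The matching PC weights are 3/4, 7/8, 15/16, 31/32.
PN_SHIFTS = (2, 3, 4, 5)


def blend_above(pc, pn, small_block=False):
    """Blend PN (from the above neighbour's MV) into PC.

    pc, pn: lists of rows, row 0 adjacent to the top boundary (assumed).
    small_block: use only two rows, as for small MC blocks.
    """
    rows = 2 if small_block else 4
    out = [list(r) for r in pc]
    for r in range(min(rows, len(pc))):
        shift = PN_SHIFTS[r]
        w_pc = (1 << shift) - 1   # e.g. 3 for shift 2, giving 3/4
        rnd = 1 << (shift - 1)    # rounding offset (assumed)
        out[r] = [(w_pc * c + n + rnd) >> shift for c, n in zip(pc[r], pn[r])]
    return out
```

A left neighbour would be handled the same way with columns in place of rows.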

[0016] In the JEM, for a CU with size less than or equal to 256 luma samples, a CU level flag is signalled to indicate whether OBMC is applied or not for the current CU. For the CUs with size larger than 256 luma samples or not coded with the AMVP mode, OBMC is applied by default. At the encoder, when OBMC is applied for a CU, its impact is taken into account during the motion estimation stage. The prediction signal formed by OBMC using motion information of the top neighbouring block and the left neighbouring block is used to compensate the top and left boundaries of the original signal of the current CU, and then the normal motion estimation process is applied.

[0017] In JEM (Joint Exploration Model for VVC development) , the OBMC is applied. For example, as shown in Fig. 5, for a current block 510, if the above block and the left block are coded in an inter mode, it takes the MV of the above block to generate an OBMC block A and takes the MV of the left block to generate an OBMC block L. The predictors of OBMC block A and OBMC block L are blended with the current predictors. To reduce the memory bandwidth of OBMC, it is proposed to do the above 4-row MC and left 4-column MC with the neighbouring blocks. For example, when doing the above block MC, 4 additional rows are fetched to generate a block of (above block + OBMC block A) . The predictors of OBMC block A are stored in a buffer for coding the current block. When doing the left block MC, 4 additional columns are fetched to generate a block of (left block + OBMC block L) . The predictors of OBMC block L are stored in a buffer for coding the current block. Therefore, when doing the MC of the current block, four additional rows and four additional columns of reference pixels are fetched to generate the predictors of the current block, the OBMC block B, and the OBMC block R as shown in Fig. 6A (may also generate the OBMC block BR as shown in Fig. 6B) . The OBMC block B and the OBMC block R are stored in buffers for the OBMC process of the bottom neighbouring blocks and the right neighbouring blocks.

[0018] For an M x N block, if the MV is not integer and an 8-tap interpolation filter is applied, a reference block with size of (M+7) x (N+7) is used for motion compensation. However, if BIO and OBMC are applied, additional reference pixels are required, which increases the worst-case memory bandwidth.

[0019] There are two different schemes to implement OBMC.

[0020] In the first scheme, OBMC blocks are pre-generated when performing motion compensation for each block. These OBMC blocks will be stored in a local buffer for neighbouring blocks. In the second scheme, the OBMC blocks are generated before the blending process of each block when performing OBMC.

[0021] In both schemes, several methods are proposed to reduce the computational complexity, especially for the interpolation filtering, and the additional bandwidth requirement of OBMC.

[0022] Template Matching Based OBMC

[0023] A template matching-based OBMC scheme has been proposed recently (JVET-Y0076). As shown in Fig. 7, for each top block with a size of 4×4 at the top CU boundary, the above template size is equal to 4×1. In Fig. 7, box 710 corresponds to a CU. If N adjacent blocks have the same motion information, then the above template size is enlarged to 4N×1 since the MC operation can be processed at one time, in the same manner as in ECM-OBMC. For each left block with a size of 4×4 at the left CU boundary, the left template size is equal to 1×4 or 1×4N.

[0024] For each 4×4 top block (or group of N 4×4 blocks), the prediction value of boundary samples is derived according to the following steps:
- Take block A as the current block and its above neighbouring block AboveNeighbour_A as an example. The operation for left blocks is conducted in the same manner.
- First, three template matching costs (Cost1, Cost2, Cost3) are measured by the SAD between the reconstructed samples of a template and its corresponding reference samples derived by the MC process according to the following three types of motion information:
  Cost1 is calculated according to A’s motion information.
  Cost2 is calculated according to AboveNeighbour_A’s motion information.
  Cost3 is calculated according to the weighted prediction of A’s and AboveNeighbour_A’s motion information with weighting factors of 3/4 and 1/4 respectively.
- Second, choose one out of three approaches to calculate the final prediction results of boundary samples by comparing Cost1, Cost2 and Cost3.
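The three template matching costs can be sketched in Python as below. The sketch treats each template as a flat list of samples and takes the motion-compensated reference samples as inputs rather than computing them. The function names, the argument names, and the rounding used in the 3/4 and 1/4 weighted prediction are assumptions for illustration.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def template_costs(recon, ref_cur, ref_nbr):
    """Return (Cost1, Cost2, Cost3) for one template.

    recon:   reconstructed template samples
    ref_cur: reference samples obtained with the current block's motion
    ref_nbr: reference samples obtained with the above neighbour's motion
    """
    # Weighted prediction with factors 3/4 (current) and 1/4 (neighbour);
    # the "+ 2" rounding offset is an assumption.
    weighted = [(3 * c + n + 2) >> 2 for c, n in zip(ref_cur, ref_nbr)]
    return sad(recon, ref_cur), sad(recon, ref_nbr), sad(recon, weighted)
```

With a 4×1 template the lists have four entries; with N merged blocks they have 4N entries.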

[0025] The original MC result using the current block’s motion information is denoted as Pixel1, and the MC result using the neighbouring block’s motion information is denoted as Pixel2. The final prediction result is denoted as NewPixel.
- If Cost1 is minimum, then NewPixel(i, j) = Pixel1(i, j).
- If (Cost2 + (Cost2 >> 2) + (Cost2 >> 3)) <= Cost1, then blending mode 1 is used.
  For luma blocks, the number of blending pixel rows is 4.
  - NewPixel(i, 0) = (26×Pixel1(i, 0) + 6×Pixel2(i, 0) + 16) >> 5
  - NewPixel(i, 1) = (7×Pixel1(i, 1) + Pixel2(i, 1) + 4) >> 3
  - NewPixel(i, 2) = (15×Pixel1(i, 2) + Pixel2(i, 2) + 8) >> 4
  - NewPixel(i, 3) = (31×Pixel1(i, 3) + Pixel2(i, 3) + 16) >> 5
  For chroma blocks, the number of blending pixel rows is 1.
  - NewPixel(i, 0) = (26×Pixel1(i, 0) + 6×Pixel2(i, 0) + 16) >> 5
- If Cost1 <= Cost2, then blending mode 2 is used.
  For luma blocks, the number of blending pixel rows is 2.
  - NewPixel(i, 0) = (15×Pixel1(i, 0) + Pixel2(i, 0) + 8) >> 4
  - NewPixel(i, 1) = (31×Pixel1(i, 1) + Pixel2(i, 1) + 16) >> 5
  For chroma blocks, the number of blending pixel rows / columns is 1.
  - NewPixel(i, 0) = (15×Pixel1(i, 0) + Pixel2(i, 0) + 8) >> 4
- Otherwise, blending mode 3 is used.
  For luma blocks, the number of blending pixel rows is 4.
  - NewPixel(i, 1) = (7×Pixel1(i, 1) + Pixel2(i, 1) + 4) >> 3
  - NewPixel(i, 2) = (15×Pixel1(i, 2) + Pixel2(i, 2) + 8) >> 4
  - NewPixel(i, 3) = (31×Pixel1(i, 3) + Pixel2(i, 3) + 16) >> 5
  For chroma blocks, the number of blending pixel rows is 1.
  - NewPixel(i, 0) = (7×Pixel1(i, 0) + Pixel2(i, 0) + 4) >> 3.
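The luma formulas for blending modes 1 and 2 share one pattern: a weighted sum of Pixel1 and Pixel2 per row, a rounding offset, and a right shift. A minimal Python sketch of that pattern follows. The table layout, the name `blend_rows`, and the `[j][i]` indexing of sample arrays are assumptions; the weights and shifts are copied from the formulas above. Mode 3 and the chroma cases are left out of the sketch.

```python
# Per row index j: (weight of Pixel1, weight of Pixel2, right shift).
MODE1_LUMA = ((26, 6, 5), (7, 1, 3), (15, 1, 4), (31, 1, 5))
MODE2_LUMA = ((15, 1, 4), (31, 1, 5))


def blend_rows(pixel1, pixel2, table):
    """Apply NewPixel(i, j) = (w1*Pixel1 + w2*Pixel2 + rnd) >> shift per row j.

    pixel1, pixel2: lists of rows indexed as [j][i]. Rows beyond the table
    length are copied from pixel1 unchanged.
    """
    out = [list(row) for row in pixel1]
    for j, (w1, w2, shift) in enumerate(table):
        rnd = 1 << (shift - 1)  # equals the 16 / 4 / 8 / 16 offsets above
        out[j] = [(w1 * a + w2 * b + rnd) >> shift
                  for a, b in zip(pixel1[j], pixel2[j])]
    return out
```

Because each weight pair sums to a power of two matching the shift, blending two identical predictions returns the same value.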

[0026] JVET-AC0164 Non-EE2: Improvements on Local Illumination Compensation in ECM7.0

[0027] In ECM-7.0, local illumination compensation (LIC) is an inter coding technique that aims at addressing the illumination variations between one block and its prediction block. The LIC is based on a linear model where a scale α and an offset β are derived from the template samples neighbouring the current block and their corresponding prediction samples. The derived LIC parameters are then applied to adjust the prediction samples of the block as P′[x, y] = α·P[x, y] + β.

[0028] Currently, the LIC is only applicable to uni-predictive inter CUs that contain no less than 32 luma samples.

[0029] Additionally, overlapped block motion compensation (OBMC) is another inter tool in ECM7.0, which alleviates the discontinuities among the prediction samples of inter blocks by adjusting the boundary prediction samples of one inter block / sub-block using its neighbouring block’s MV. According to the existing ECM design, when the LIC is applied to one inter block, the OBMC is always disabled. Additionally, when a neighbouring block of the current CU applies the LIC, only its MVs are used to produce the corresponding prediction samples used for the OBMC process of the current CU.

[0030] The following modifications are proposed to further improve the coding efficiency of the LIC tool.

[0031] Bi-Predictive LIC

[0032] It is proposed to extend the existing LIC design to bi-predicted CUs. Specifically, when applying the proposed method to one bi-prediction block, two different linear models are derived to compensate the illumination changes that exist between the current block and its two prediction blocks. Then, the final bi-prediction of the current block is calculated as the combination of two uni-prediction blocks after the LIC adjustment, i.e., P′[x, y] = (1-ω)·p′0[x, y] + ω·p′1[x, y], with p′0[x, y] = α0·P0[x, y] + β0 and p′1[x, y] = α1·P1[x, y] + β1, where α0 and β0, and α1 and β1 indicate the scales and the offsets in L0 and L1, respectively; ω indicates the weight (as indicated by the CU-level BCW index) that is applied when combining the two uni-prediction blocks.
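The bi-predictive combination can be written out as a small Python sketch. It uses floating-point arithmetic for readability, whereas a codec would normally use fixed-point scales and integer BCW weights; that simplification, the function name `lic_bi_pred`, and the flat sample lists are assumptions for illustration.

```python
def lic_bi_pred(p0, p1, alpha0, beta0, alpha1, beta1, omega):
    """Combine two LIC-adjusted uni-predictions into one bi-prediction.

    p0, p1: prediction samples from L0 and L1 (flat lists of equal length).
    alpha*/beta*: per-list LIC scale and offset.
    omega: weight applied to the L1 prediction.
    """
    out = []
    for x0, x1 in zip(p0, p1):
        adj0 = alpha0 * x0 + beta0  # p'0 = alpha0 * P0 + beta0
        adj1 = alpha1 * x1 + beta1  # p'1 = alpha1 * P1 + beta1
        out.append((1 - omega) * adj0 + omega * adj1)
    return out
```

Setting omega to 0.5 gives the plain average of the two adjusted predictions.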

[0033] As in the current LIC design, one control flag is signalled for AMVP bi-predicted CUs to indicate the enabling / disabling of the LIC, while the flag is inherited from one neighbouring block for merge inter CUs (including AMVP-Merge mode). Additionally, the LIC is disabled when decoder-side motion vector refinement (DMVR) (including multi-pass DMVR, adaptive DMVR and affine DMVR) and bi-directional optical flow (BDOF) are applied.

[0034] To reuse the linear model derivation of the existing LIC, one iterative approach is applied to alternately derive the L0 and L1 linear models. Specifically, given the two MVs of the current block, it assumes T0 and T1 are the two predictions of the current block’s template T. The method first derives the L0 linear model (α0 and β0) that results in the minimum difference between T0 and T; then, the L1 linear model (α1 and β1) is calculated to minimize the difference between T1 and the updated template. Finally, the L0 linear model is refined again in the same way.

[0035] OBMC with LIC

[0036] The following two changes are applied to better handle the interaction between the LIC and the OBMC:
1) It is proposed to enable the OBMC for the inter blocks where the LIC is applied. Additionally, to achieve a better complexity / performance trade-off, the OBMC is only applied for refining the prediction samples on the top and left boundaries of one LIC CU, while the OBMC on the internal sub-block boundaries is always disabled.
2) Besides the MVs, it is proposed to also take the LIC parameters of one neighbouring block (when it is coded by the LIC) into consideration when generating its corresponding prediction samples for the OBMC of the current CU.

[0037] JVET-AJ0161 EE2-3.3: OBMC Extension with Intra Prediction

[0038] In OBMC of ECM, top and left boundary pixels of the current block are only blended with inter prediction block generated using motion information of neighbouring block. However, top and left boundary pixels adjacent to intra block remain un-refined due to the absence of motion information of neighbouring intra block. Consequently, discontinuities at these boundary pixels may still be present, which can result in large residual signals at the boundary. To address this issue, another OBMC extension was proposed in JVET-AI0154. In the proposed OBMC extension, in addition to the existing OBMC process, the top and left boundary pixels adjacent to intra block are blended with intra prediction subblock generated using the intra prediction mode derived by applying DIMD (Decoder-side Intra Mode Derivation) on the neighbouring reconstructed samples. An example of the OBMC extension with intra prediction is illustrated in Fig. 8 to show how boundary pixels adjacent to intra blocks are processed.

[0039] Furthermore, the blending of top and left boundary pixels adjacent to intra blocks is performed only when the intra prediction mode derived from DIMD falls within the range defined according to the availability of neighbouring reconstructed samples, as shown in Fig. 9. In Fig. 9, the range of the intra prediction modes for a top block A is from 34 to 66, and for a left block B it is from 2 to 34. This condition limits the usage of padded reference samples for intra prediction.
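The mode-range condition can be expressed as a short Python check. The sketch assumes both range ends are inclusive, which the text does not state explicitly, and uses a hypothetical function name and string labels for the boundary.

```python
# Allowed DIMD-derived intra mode ranges per boundary (inclusive ends assumed).
MODE_RANGE = {"top": (34, 66), "left": (2, 34)}


def intra_blend_allowed(dimd_mode, boundary):
    """Return True if boundary pixels may be blended with the intra predictor.

    dimd_mode: intra prediction mode index derived by DIMD.
    boundary:  "top" for a top block, "left" for a left block.
    """
    low, high = MODE_RANGE[boundary]
    return low <= dimd_mode <= high
```

A mode outside the range would need reference samples that are not reconstructed yet and would have to be padded, so blending is skipped in that case.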

[0040] Test 3.3a: OBMC extension with intra prediction

[0041] In test 3.3a, OBMC is extended to perform the refinement of the top and left boundary pixels adjacent to the intra block. These pixels are blended using an intra prediction block generated with the intra prediction mode derived by applying DIMD on the neighbouring reconstructed samples.

[0042] Test 3.3b: Test 3.3a + DIMD with 2x2 edge operator

[0043] In test 3.3b, the 3x3 edge operator used in test 3.3a is modified to the 2x2 edge operator proposed in JVET-AI0140 [2] to derive the intra prediction mode in DIMD. The horizontal and vertical filters of the 2x2 edge operator are defined as follows:

[0044] Test 3.3c: Test 3.3a + using intra prediction mode of neighbouring block

[0045] In test 3.3c, instead of deriving intra prediction mode with DIMD as in test 3.3a, intra prediction mode of the neighbouring block is used to generate the intra prediction block.

[0046] Test 3.3d: Test 3.3a + more neighbouring blocks are checked for available motion information to generate inter subblock

[0047] On top of test 3.3a, the search positions for available motion information are expanded as follows:
- For subblocks on the top boundary: In addition to the above block, the blocks to its left and right are also checked.
- For subblocks on the left boundary: In addition to the left block, the blocks above and below are also checked.

[0048] JVET-AJ0078 EE2-related: Extended Overlapped Block Blending for MV / BV Based Prediction

[0049] For additional coding gain, the JVET-AJ0078 contribution proposes an extended OBB for MV / BV based prediction. Specifically, this contribution extensively introduces OBB (Overlapped Block Blending) to IBC or IntraTMP applied current blocks regardless of prediction mode in adjacent blocks. Moreover, the JVET-AJ0078 contribution additionally allows OBB for inter predicted current blocks having adjacent blocks with BV.

[0050] To summarize, OBB applied cases are extended as shown in Table 1. Table 1 Extended OBB applied cases by the proposed method.

[0051] The subblock-based IPM derivation of adjacent blocks and the OBB blending processes follow those in EE2-3.3a and ECM-14.0, respectively. To reduce the complexity and enhance the robustness against noise, the subblock-based IPM derivation process is executed jointly when adjacent subblocks have the same intra prediction mode.

[0052] JVET-AJ0113 Non-EE2: Intra OBMC

[0053] To address the discontinuity between two non-inter coded blocks, the intra OBMC scheme is proposed in JVET-AJ0113.

[0054] The intra OBMC is applied to top and left boundaries of the current block as follows:
- For luma component:
  ● Both the current block and the neighbouring block are coded by BV modes;
  ● The current block is coded by a BV mode and the neighbouring block is coded by a non-BV intra mode;
- For chroma components:
  ● The current block is coded by a CCP mode and the neighbouring block is coded by a non-CCP and non-BV mode.

[0055] The BV mode includes IBC mode and intra TMP mode for luma, and DBV mode for chroma.

[0056] For 4: 2: 0 colour format, the intra OBMC is performed on 4x4 sub-block level for luma and 2x2 sub-block level for chroma. If intra OBMC is applied, each sub-block on the top and left boundaries within the current block is blended with a predictor generated by the information of the neighbouring block. If the neighbouring block is coded by a BV mode, the BV information (including BV, fusion parameters, LIC parameters etc. ) is used to generate the predictor. If the neighbouring block is coded by a non-BV intra mode, an intra prediction mode derived by applying DIMD method on the adjacent reconstructed samples is used to generate the predictor.

[0057] The blending weights for the proposed intra OBMC are identical to those used in inter OBMC when template matching-based OBMC is not applied.

[0058] The proposed intra OBMC is only applied to camera-captured sequences.

[0059] JVET-AJ0238 AHG12: OBMC Modifications

[0060] Two modifications related to OBMC operation orders are proposed.

[0061] The first modification is about GPM (Geometric Partitioning Mode) and OBMC. In the current ECM, when the two GPM partitions are coded with non-affine inter, the GPM blending is carried out first, and then OBMC is applied on top of blended GPM samples. It is proposed to apply OBMC separately on top of each GPM partition, and then apply blending on top of OBMC modified samples of the two partitions. Furthermore, if the corresponding partition contains subblock motion, OBMC is applied on those subblock boundaries as well.

[0062] The second modification is about the order of applying OBMC on block and sub-block boundaries. In the current ECM, OBMC is first carried out on block (CU) boundaries, and is then carried out on subblock boundaries. It is proposed to swap the order to first apply OBMC on subblock boundaries (if they are present within the CU) , and then apply OBMC on CU boundaries.

[0063] Decoder Side Intra Mode Derivation (DIMD)

[0064] When DIMD is applied, up to five intra modes are derived from the reconstructed neighbour samples, and those five predictors are combined with the planar mode predictor with the weights derived from the histogram of gradients as described in JVET-O0449 (Mohsen Abdoli, et al., “Non-CE3: Decoder-side Intra Mode Derivation with Prediction Fusion Using Planar”, Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 15th Meeting: Gothenburg, SE, 3–12 July 2019, Document JVET-O0449). The division operations in weight derivation are performed utilizing the same lookup table (LUT) based integerization scheme used by the CCLM. For example, the division operation in the orientation calculation, Orient = Gy / Gx, is computed by the following LUT-based scheme:

x = Floor(Log2(Gx))
normDiff = ((Gx << 4) >> x) & 15
x += (3 + (normDiff != 0) ? 1 : 0)
Orient = (Gy * (DivSigTable[normDiff] | 8) + (1 << (x - 1))) >> x

where DivSigTable[16] = {0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0}.

[0065] For a block of size W×H, the weight for each of the five derived modes is modified if either the above or the left histogram magnitude is twice as large as the other one. In this case, the weights are location dependent and computed as follows.

[0066] If the above histogram is twice the left, then:

[0067] If the left histogram is twice the above, then: where wDimd_i is the unmodified uniform weight of the DIMD selected as in JVET-O0449, and Δ_i is pre-defined and set to 10.

[0068] Derived intra modes are included into the primary list of intra most probable modes (MPM) , so the DIMD process is performed before the MPM list is constructed. The primary derived intra mode of a DIMD block is stored with a block and is used for MPM list construction of the neighbouring blocks.

[0069] Finally, note the region of neighbouring reconstructed samples used for computing the histogram of gradients is modified compared to JVET-O0449 method, depending on reconstructed samples availability. The region of decoded reference samples of current WxH luma CB is extended towards the above-right side if available, up to W additional columns. It is extended towards the bottom-left side if available, up to H additional rows.

[0070] DIMD Chroma Mode

[0071] The DIMD chroma mode uses the DIMD derivation method to derive the chroma intra prediction mode of the current block based on the neighbouring reconstructed Y, Cb and Cr samples in the second neighbouring row and column as shown in Figs. 10A-C for Y, Cb and Cr components (Fig. 10A, Fig. 10B and Fig. 10C) respectively. Specifically, a horizontal gradient and a vertical gradient are calculated for each collocated reconstructed luma sample of the current chroma block 1010, as well as the reconstructed Cb and Cr samples, to build a HoG. Then the intra prediction mode with the largest histogram amplitude values is used for performing chroma intra prediction of the current chroma blocks 1020 and 1030.

[0072] When the intra prediction mode derived from the DIMD chroma mode is the same as the intra prediction mode derived from the DM mode, the intra prediction mode with the second largest histogram amplitude value is used as the DIMD chroma mode. A CU level flag is signalled to indicate whether the proposed DIMD chroma mode is applied.

[0073] Finally, the luma region of reconstructed samples used for computing the histogram of gradients for chroma DIMD mode is modified compared to JVET-O0449. For a WxH pair of chroma CBs to predict, to build the histogram of gradients associated to the collocated luma CB, the pairs of a vertical gradient and a horizontal gradient are extracted from the second and third lines in this luma CB instead of being extracted from the regular set of DIMD decoded reference samples around this luma CB.

[0074] Fusion of Chroma Intra Prediction Modes

[0075] In ECM, two chroma intra prediction signals can be fused together. One of the two chroma intra prediction signals is predicted using one of the DM mode, DIMD chroma mode and the four default modes (non-LM mode) . The other chroma intra prediction signal is predicted using cross-component linear prediction modes (LM mode) . Two different methods are supported.

[0076] In the first method, the LM mode is fixed to MMLM_LT mode, and the final predictor is derived as follows:

predC(i, j) = (w0 × pred0(i, j) + w1 × pred1(i, j) + (1 << (shift - 1))) >> shift

where pred0(i, j) is the predictor obtained by applying the non-LM mode, pred1(i, j) is the predictor obtained by applying the MMLM_LT mode and predC(i, j) is the final predictor of the current chroma block. The two weights, w0 and w1, are determined by the intra prediction mode of adjacent chroma blocks and shift is set equal to 2. Specifically, when the above and left adjacent blocks are both coded with LM modes, {w0, w1} = {1, 3}; when the above and left adjacent blocks are both coded with non-LM modes, {w0, w1} = {3, 1}; otherwise, {w0, w1} = {2, 2}.
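As a non-limiting illustration of the first fusion method, the following Python sketch selects {w0, w1} from the coding modes of the above and left adjacent chroma blocks and applies the rounded integer blend with shift equal to 2. The function names and the Boolean arguments `above_is_lm` and `left_is_lm` are hypothetical and are not taken from the ECM software.

```python
from typing import List, Tuple

SHIFT = 2  # shift value stated in the text; the two weights therefore sum to 4


def fusion_weights(above_is_lm: bool, left_is_lm: bool) -> Tuple[int, int]:
    """Return (w0, w1) for the non-LM predictor and the MMLM_LT predictor."""
    if above_is_lm and left_is_lm:
        return 1, 3
    if not above_is_lm and not left_is_lm:
        return 3, 1
    return 2, 2


def fuse_chroma(pred0: List[List[int]], pred1: List[List[int]],
                above_is_lm: bool, left_is_lm: bool) -> List[List[int]]:
    """predC(i, j) = (w0*pred0 + w1*pred1 + (1 << (shift - 1))) >> shift."""
    w0, w1 = fusion_weights(above_is_lm, left_is_lm)
    rounding = 1 << (SHIFT - 1)
    return [[(w0 * a + w1 * b + rounding) >> SHIFT for a, b in zip(r0, r1)]
            for r0, r1 in zip(pred0, pred1)]
```

In this sketch, a block whose two adjacent neighbours are both LM coded leans towards the MMLM_LT predictor, and a block with two non-LM neighbours leans towards the non-LM predictor.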

[0077] In the second method, the LM mode can be either MMLM or CCLM mode, and the final predictor is derived as follows:

predC(i, j) = α0 × pred0(i, j) + α1 × rec′L(i, j) + α2 × β

where pred0(i, j) is the predictor obtained by applying the non-LM mode, rec′L(i, j) is the set of downsampled reconstructed luma samples at co-located positions and predC(i, j) is the final predictor of the current chroma block. β is a fixed value and is set equal to 512 for 10-bit contents. The three weights, α0, α1 and α2, are derived from the adjacent luma and chroma samples using the same LDL derivation method as in CCCM.

[0078] For the syntax design, one index is signalled to indicate whether fusion is applied and which method is used as shown in Table 2. It is noted that for I slices, the non-LM mode can be DM mode, DIMD chroma mode and the four default modes. For non-I slices, only DIMD chroma mode is allowed to be fused with LM modes. Table 2. Index signalled for indicating whether fusion is applied and which method being used

[0079] Fusion for Template-Based Intra Mode Derivation (TIMD)

[0080] For each intra prediction mode in MPMs, as well as the wide-angle modes if the above-right and / or bottom-left reference samples are available, SATD between the prediction and reconstruction samples of the template is calculated. First two intra prediction modes with the minimum SATD and one non-angular intra prediction mode (i.e. DC or Planar) with the lowest SATD cost are selected as the TIMD modes. These three TIMD modes are fused with the weights after applying PDPC (Position Dependent intra Prediction Combination) process, and such weighted intra prediction is used to code the current CU. PDPC is included in the derivation of the TIMD modes.

[0081] The conditions below are checked to determine whether the non-angular intra prediction mode is used in fusion: – the non-angular intra prediction mode is different from the two selected intra prediction modes. – costMode3 < 1.5*costMode1, where the costMode3 is the SATD cost of the non-angular intra prediction mode and costMode1 is the SATD cost of the first intra prediction mode.

[0082] If both of the conditions are true, three intra prediction modes are used to generate the prediction, and the weights of each intra prediction mode are computed from SATD cost:

[0083] Otherwise, the non-angular intra prediction mode is not used in prediction, and the costs of the two selected modes are compared with a threshold. In the test, the cost factor of 2 is applied as follows: costMode2 < 2*costMode1.

[0084] If this condition is true, the fusion is applied; otherwise, only mode1 is used.

[0085] Weights of the modes are computed from their SATD costs as follows:

weight1 = costMode2 / (costMode1 + costMode2)
weight2 = 1 - weight1
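The TIMD fusion decisions described above can be sketched as follows. This is a simplified illustration only: it uses floating-point weights, whereas the text states that the divisions are carried out with a LUT-based integerization scheme, and the three-mode weight formula is not reproduced here. The function names are hypothetical.

```python
from typing import Tuple


def use_non_angular(mode1: int, mode2: int, mode3: int,
                    cost_mode1: int, cost_mode3: int) -> bool:
    """True when both conditions for including the non-angular mode hold."""
    differs = mode3 != mode1 and mode3 != mode2
    # costMode3 < 1.5 * costMode1, written without floating point
    return differs and 2 * cost_mode3 < 3 * cost_mode1


def two_mode_weights(cost_mode1: int, cost_mode2: int) -> Tuple[float, float]:
    """Return (weight1, weight2); (1.0, 0.0) means only mode1 is used."""
    if cost_mode2 < 2 * cost_mode1:
        weight1 = cost_mode2 / (cost_mode1 + cost_mode2)
        return weight1, 1.0 - weight1
    return 1.0, 0.0
```

Note that the mode with the lower SATD cost receives the larger weight, because weight1 is proportional to the cost of the second mode.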

[0086] The division operations are conducted using the same lookup table (LUT) based integerization scheme used by the CCLM.

[0087] Besides, location-dependent sample-based fusion used in DIMD fusion process is used for the TIMD fusion, but the location-dependent criterion applying to amplitudes of the selected predictors is replaced by a SATD cost-based criterion. The location-dependent criterion is determined from a ratio of the normalized SATD of the selected TIMD predictors computed in the above and the left template area.

[0088] Intra Prediction Fusion

[0089] This intra prediction method derives predicted samples as a weighted combination of multiple predictors generated from different reference lines. In this process, multiple intra predictors are generated and then fused by weighted averaging. The process of deriving the predictors to be used in the fusion process is described as follows:
1) For angular intra prediction modes, including the single mode case of TIMD and DIMD, the proposed method derives intra prediction by weighting intra predictions obtained from multiple reference lines, represented as p_fusion = w0 × p_line + w1 × p_{line+1}, where p_line is the intra prediction from the default reference line and p_{line+1} is the prediction from the line above the default reference line. The weights are set as w0 = 3/4 and w1 = 1/4.
2) For TIMD mode with blending, p_line is used for the first mode (w0 = 1, w1 = 0) and p_{line+1} is used for the second mode (w0 = 0, w1 = 1).
3) For DIMD mode with blending, the number of predictors selected for a weighted average is increased from 3 to 6.
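For case 1) above, a minimal integer sketch of the two-line fusion with w0 = 3/4 and w1 = 1/4 is given below. The rounding offset of 2 before the right shift is an assumption made for this illustration, and the function name is hypothetical.

```python
from typing import List


def fuse_reference_lines(p_line: List[int], p_line_plus1: List[int]) -> List[int]:
    """Blend 3/4 of the default-line prediction with 1/4 of the next-line prediction.

    Round-to-nearest via the +2 offset is assumed, not taken from the text.
    """
    return [(3 * a + b + 2) >> 2 for a, b in zip(p_line, p_line_plus1)]
```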

[0090] The angular intra prediction fusion method is applied to luma blocks when the angular intra mode has a non-integer slope (i.e., reference sample interpolation is required) and the block size is greater than 16. It is used with MRL and is not applied to ISP coded blocks. In the method studied, PDPC is applied to the intra prediction that uses the reference line closest to the current block.

[0091] The TIMD mode with blending method is applied when all the following conditions are satisfied:
- both the first and second modes are angular prediction modes
- the current block is not an ISP coded block
- all of the following conditions are false:
○ abs(predModeIntra1 - predModeIntra2) is greater than Threshold. The value of Threshold is set to 8 or 4 depending on block size.
○ (predModeIntra1 - EXT_HOR_IDX) * (predModeIntra2 - EXT_HOR_IDX) is less than 0.
○ (predModeIntra1 - EXT_VER_IDX) * (predModeIntra2 - EXT_VER_IDX) is less than 0.
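The eligibility check listed above can be expressed compactly as in the following sketch. The values of Threshold, EXT_HOR_IDX and EXT_VER_IDX are passed in as parameters because they are not given in this passage; the function name and argument names are hypothetical.

```python
def timd_line_blending_allowed(mode1: int, mode2: int,
                               both_angular: bool, is_isp: bool,
                               threshold: int,
                               ext_hor_idx: int, ext_ver_idx: int) -> bool:
    """Return True when TIMD mode with blending may be applied."""
    if not both_angular or is_isp:
        return False
    # Each of the three sub-conditions must be false for blending to apply.
    too_far_apart = abs(mode1 - mode2) > threshold
    straddles_hor = (mode1 - ext_hor_idx) * (mode2 - ext_hor_idx) < 0
    straddles_ver = (mode1 - ext_ver_idx) * (mode2 - ext_ver_idx) < 0
    return not (too_far_apart or straddles_hor or straddles_ver)
```

The two product tests reject mode pairs that lie on opposite sides of the horizontal or vertical direction, so that only modes of similar orientation are blended across reference lines.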

[0092] Spatial Geometric Partitioning Mode (SGPM)

[0093] SGPM is an intra mode that resembles the inter coding tool of GPM, where the two prediction parts are generated by intra prediction. In this mode, a candidate list is built with each entry containing one partition split and two intra prediction modes as shown in Fig. 11, where Fig. 11A shows a block partitioned into two partitions with intra prediction modes 0 and 1, Fig. 11B shows the case of direct signalling of the three modes in the bit-stream, and Fig. 11C shows the case of efficient signalling by using the candidate index from a candidate list. 26 partition modes and 9 intra prediction modes are used to form the combinations. The length of the candidate list is set equal to 16.

[0094] The list is reordered using a template as shown in Fig. 12, where the SAD between the prediction and reconstruction of the template is used for reordering. The template size is fixed to 1.

[0095] For each partition mode, an IPM list is derived for each part using the same intra-inter GPM list derivation. The IPM list size is set to 3. In the list, TIMD derived mode is replaced by 2 derived modes with horizontal and vertical orientations. The list is further augmented with block-vector based prediction candidates obtained from the adjacent and non-adjacent merge candidates coded in IntraTMP or IBC mode. The template cost is employed to select the up to 6 block vectors. The final list contains up to 9 predictors: 3 regular intra modes and up to 6 block vectors based predictors.

[0096] The SGPM mode is applied with a restricted block size: 4<=width<=64, 4<=height<=64, width<height*8, height<width*8, width*height>=32.
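The block-size restriction above can be written as a single predicate, as in the following sketch (the function name is hypothetical):

```python
def sgpm_size_allowed(width: int, height: int) -> bool:
    """Check the SGPM block-size restriction stated in the text."""
    return (4 <= width <= 64 and 4 <= height <= 64
            and width < height * 8 and height < width * 8
            and width * height >= 32)
```

Under these conditions a 4x4 block is excluded by the area constraint, and blocks with an aspect ratio of 8:1 or more, such as 4x32 or 8x64, are excluded by the two ratio constraints.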

[0097] A PPS flag is coded to indicate whether no blending of two intra predictions is allowed. When this PPS flag is set to false, the following adaptive blending is also used for spatial GPM, where blending depth τ shown in Fig. 13 is derived as follows:
● If min(width, height) == 4, 1/2 τ is selected
● else if min(width, height) == 8, τ is selected
● else if min(width, height) == 16, 2 τ is selected
● else if min(width, height) == 32, 4 τ is selected
● else, 8 τ is selected.

[0098] Otherwise (i.e., the PPS flag is set to true) , 1 / 4 τ is always used for spatial GPM coded blocks to make sure no blending is used when SGPM block has partition angle completely horizontal or vertical, and much narrower blending width is used when SGPM block has other partition angles. It is noted that the flag is set to true in current Common Test Conditions (CTC) for the screen content videos.
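The selection of the blending depth in the two preceding paragraphs can be summarised by the following sketch, which returns the multiplier applied to τ. The function name and the argument name `no_blending_flag` (standing for the PPS flag) are hypothetical.

```python
from fractions import Fraction


def sgpm_blending_depth_scale(width: int, height: int,
                              no_blending_flag: bool) -> Fraction:
    """Return the factor applied to the blending depth tau for an SGPM block."""
    if no_blending_flag:
        # PPS flag set to true: 1/4 tau is always used.
        return Fraction(1, 4)
    short_side = min(width, height)
    if short_side == 4:
        return Fraction(1, 2)
    if short_side == 8:
        return Fraction(1)
    if short_side == 16:
        return Fraction(2)
    if short_side == 32:
        return Fraction(4)
    return Fraction(8)
```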

[0099] In Fig. 13, line 1340 corresponds to the GPM partition boundary and two thresholds (i.e., -τ and τ) correspond to lines 1342 and 1344 in Fig. 13. Furthermore, the angle 1310 and offset ρi 1320 are indicated for GPM index i and point 1530 corresponds to the centre of the block.

[0100] JVET-AH0209 EE2-2.13: Matrix Based Intra Prediction (a.k.a. Position Dependent Intra Prediction (PDP)) Replacing Conventional Intra Modes

[0101] In this test, a matrix of weights, defined for a block shape and an intra mode, is introduced. Those weights are multiplied by the neighbouring reference template to derive the prediction samples, replacing conventional intra prediction. The weights are applied to the reference samples of the L-shaped causal neighbourhood template as shown in Fig. 14.

[0102] The reference samples in the causal neighbourhood are denoted as r, and F(x, y) is the matrix of weights. Then, the prediction P(x, y) can be derived as P(x, y) = ∑_k F(x, y, k) * r(k), where k denotes the index of the reference sample in the template.
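The weighted sum above can be illustrated with the following pure-Python sketch. The nested-list layout `weights[y][x][k]` for F(x, y, k) and the function name are assumptions made for this illustration; trained filter coefficients are not reproduced.

```python
from typing import List, Sequence


def pdp_predict(weights: Sequence[Sequence[Sequence[float]]],
                ref: Sequence[float]) -> List[List[float]]:
    """Compute P(x, y) = sum over k of F(x, y, k) * r(k).

    weights[y][x][k] plays the role of F(x, y, k); ref[k] is r(k).
    """
    for row in weights:
        for w_xy in row:
            if len(w_xy) != len(ref):
                raise ValueError("each weight vector must match the template length")
    return [[sum(f * r for f, r in zip(w_xy, ref)) for w_xy in row]
            for row in weights]
```

For example, with a three-sample template r = [10, 20, 30], a position whose weights are [0.5, 0.5, 0] is predicted as 15.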

[0103] In the test, this prediction is used for block size with both width and height up to 32 (except for 4x32, 32x4, 8x32 and 32x8) . The template size is 2 for blocks with both width and height up to 16 and it is only used for mode 0, 1, and (2+2*k) . For other blocks, template size is set to 1; it is used for mode 0, 1, and (2+4*k) ; prediction is only performed for 16x16 positions, and the rest of the samples are generated by bilinear interpolation. For all block sizes, block shape and mode-based symmetry are used. Reference length is set to W and H for modes greater than 18 and less than 50 and set to 2*W and 2*H otherwise.

[0104] The filters are trained with BVI sequences, comprising 800 sequences with diverse resolutions.

[0105] The MAC numbers per block shape for reference lengths 2*W and 2*H are provided below. Note that the MAC numbers for reference lengths W and H are roughly half of these. The total number of coefficients is approximately 1.8M. Table 3. MAC numbers for reference length W and H

[0106] In the present invention, an OBMC-like boundary processing method is extended to intra coded blocks so that the neighbouring predictor from a neighbouring block can be blended with the current intra predictor to form a blended predictor for the overlapped boundary area of the current block.

BRIEF SUMMARY OF THE INVENTION

[0107] A method and apparatus of video coding using overlapped boundary processing for blended intra prediction are disclosed. According to this method, input data comprising a current block, a current subblock, a neighbouring block, a neighbouring subblock, or a combination thereof is received. An intra prediction blending process and an overlapped boundary refinement process are applied to the current block. Whether the intra prediction blending process is applied to the current block before or after the overlapped boundary refinement process is pre-defined or is determined adaptively for the current block according to one or more conditions. When the intra prediction blending process is applied to the current block or the current subblock before the overlapped boundary refinement process, a final intra predictor is generated for the current block or the current subblock by blending two or more intra predictions, and the overlapped boundary refinement process is then applied to overlapped boundaries associated with the final intra predictor to generate a refined final predictor. When the intra prediction blending process is applied to the current block or the current subblock after the overlapped boundary refinement process, the overlapped boundary refinement process is applied to the overlapped boundaries associated with at least one of said two or more intra predictions to form refined intra predictions, and the intra prediction blending process is applied to the refined intra predictions to generate the refined final predictor. The current block or the current subblock is encoded or decoded using the refined final predictor.

[0108] In one embodiment, said one or more conditions comprise prediction mode. In one embodiment, said one or more conditions comprise a predictor difference, and the predictor difference is compared with a threshold to determine whether the intra prediction blending process is applied to the current block before or after the overlapped boundary refinement process. In one embodiment, said one or more conditions comprise intra prediction angle difference or motion information.

[0109] In one embodiment, whether the intra prediction blending process is applied to the current block before or after the overlapped boundary refinement process is fixed according to one or more pre-defined conditions or determined according to one or more flags or syntax elements signalled. In one embodiment, said one or more flags or syntax elements are signalled or parsed at SPS (Sequence Parameter Set) -level, PPS (Picture Parameter Set) -level, picture header level, slice header level, CTU (Coding Tree Unit) -level, block level or a combination thereof.

[0110] In one embodiment, when the overlapped boundary refinement process is applied to multiple predictors, weightings, blending lines, blending rules, or adaptive blending decision in the overlapped boundary refinement process is allowed to be different for multiple predictors.

[0111] In one embodiment, the current block is coded in Matrix-Based Intra Prediction (MIP) , replacement of conventional intra modes with MIP (PDP) , Decoder-Side Intra Mode Derivation (DIMD) , Occurrence-Based Intra Coding (OBIC) , Template-Based Intra Mode Derivation (TIMD) , decoder-side derived intra prediction–related modes, spatial-GPM and GPM-related modes, intra prediction fusion, or chroma intra prediction mode fusion.

[0112] In one embodiment, when the current block is coded in Spatial Geometric Partitioning Mode (S-GPM), said two or more intra predictions are derived from S-GPM partitions of the current block respectively. In one embodiment, whether the intra prediction blending process is applied to the current block before or after the overlapped boundary refinement process is determined adaptively. In one embodiment, the intra prediction blending process is always applied to the current block after the overlapped boundary refinement process. In one embodiment, when the overlapped boundary refinement process is performed at the current subblock with a GPM partition line inside the current subblock, the overlapped boundary refinement process uses fewer blending lines or weaker blending weightings.

[0113] In one embodiment, when the current block is coded in Decoder-Side Intra Mode Derivation (DIMD), said two or more intra predictions are derived according to the DIMD process.

BRIEF DESCRIPTION OF THE DRAWINGS

[0114] Fig. 1A illustrates an exemplary adaptive Inter / Intra video encoding system incorporating loop processing.

[0115] Fig. 1B illustrates a corresponding decoder for the encoder in Fig. 1A.

[0116] Fig. 2 illustrates an example of overlapped motion compensation for geometry partitions.

[0117] Figs. 3A-B illustrate an example of OBMC for 2NxN (Fig. 3A) and Nx2N blocks (Fig. 3B) .

[0118] Fig. 4A illustrates an example of the sub-blocks to which OBMC is applied, where the example includes subblocks at a CU/PU boundary.

[0119] Fig. 4B illustrates an example of the sub-blocks to which OBMC is applied, where the example includes subblocks coded in the AMVP mode.

[0120] Fig. 5 illustrates an example of the OBMC processing using neighbouring blocks from above and left for the current block.

[0121] Fig. 6A illustrates an example of the OBMC processing for the right and bottom part of the current block using neighbouring blocks from right and bottom.

[0122] Fig. 6B illustrates an example of the OBMC processing for the right and bottom part of the current block using neighbouring blocks from right, bottom and bottom-right.

[0123] Fig. 7 illustrates an example of Template Matching based OBMC where, for each top block with a size of 4×4 at the top CU boundary, the above template size equals 4×1.

[0124] Fig. 8 illustrates an example of proposed OBMC extension with intra prediction according to JVET-AI0154.

[0125] Fig. 9 illustrates an example of constraint on the range of intra prediction modes for using OBMC extension with intra prediction according to JVET-AI0154.

[0126] Fig. 10 illustrates an example of neighbouring collocated reconstructed Y samples (Fig. 10A) , neighbouring reconstructed Cb samples (Fig. 10B) , and neighbouring reconstructed Cr samples (Fig. 10C) used for DIMD chroma mode.

[0127] Fig. 11 illustrates an example of Spatial Geometric Partitioning Mode (SGPM), where a block is partitioned into two parts and two intra prediction modes are used (Fig. 11A), the syntax coding for SGPM before using a simplified method is shown in Fig. 11B, and the syntax coding for SGPM using a simplified method is shown in Fig. 11C.

[0128] Fig. 12 illustrates an example of the template for Spatial GPM (SGPM).

[0129] Fig. 13 illustrates an example of blending weight ω0 using the geometric partitioning mode.

[0130] Fig. 14 illustrates an example of L shaped neighbourhood of a given predicted block for matrix based intra prediction.

[0131] Fig. 15 illustrates an example of overlapped block refinement in spatial-GPM coded block.

[0132] Fig. 16 illustrates a flowchart of an exemplary video coding system, where an overlapped boundary process is used for blended intra predictor according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

[0133] It will be readily understood that the components of the present invention, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the systems and methods of the present invention, as represented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. References throughout this specification to “one embodiment,” “an embodiment,” or similar language mean that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment.

[0134] Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, etc. In other examples, well-known structures, or operations are not shown or described in detail to avoid obscuring aspects of the invention. The illustrated embodiments of the invention will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout. The following description is intended only by way of example, and simply illustrates certain selected embodiments of apparatus and methods that are consistent with the invention as claimed herein.

[0135] Proposed Methods

[0136] Process Order in Overlapped Block Refinement in Intra Prediction

[0137] In the existing ECM, OBMC is performed after a final predictor is formed in some prediction modes. For example, two GPM partitions form a final predictor and OBMC is performed on the final predictor afterwards. As another example, for a bi-predicted block, two uni-predictors are generated and blended first, and OBMC is applied afterwards. In the existing method, the order of OBMC and predictor blending may not be suitable for all video contents. For example, OBMC can refine the predictor at the CU boundary of a GPM predicted block, but may adversely affect the partitioning angle near the CU boundary. In another example, if BCW is enabled for a bi-predicted block, the selected BCW index may be suitable for the current block, but can be sub-optimal if OBMC is applied afterwards. Besides, the existing ECM design does not allow the current intra coded block to perform overlapped block refinement, which may result in discontinuity between the current intra coded block and the neighbouring block. Thus, it is proposed to apply an OBMC-like process to refine the current intra coded block (called overlapped block refinement in this disclosure), and to adaptively change the order of overlapped block refinement and predictor blending at predictor fusion according to some conditions, such as prediction mode information, intra prediction angle, neighbouring prediction information, etc.

[0138] Furthermore, in the existing ECM, CU-boundary OBMC is applied first to refine the boundary of the current predictor, and subblock-boundary OBMC is subsequently applied to refine the internal boundaries of the current predictor if conditions are met. However, the existing OBMC boundary processing order may tend to smooth the block boundary strongly and the internal subblock boundaries weakly, since the reduced (refined) block boundary information is later used to refine the internal subblocks. This can be sub-optimal for various video contents or subblock modes since the subblock boundary edge may still exist. Besides, the existing ECM design does not allow the current intra coded block to perform overlapped block refinement, which may result in discontinuity between the current intra coded block and the neighbouring block. Thus, it is proposed to apply an OBMC-like process, referred to as overlapped block refinement, to refine the current intra-coded block. The boundary processing order of the overlapped block refinement may be adaptively changed according to certain conditions, such as prediction mode information, intra prediction angle, or neighbouring prediction information.

[0139] Adaptive Predictor Blending in Overlapped Block Refinement

[0140] When the current block is coded by an intra prediction mode, overlapped block refinement can be performed to refine the block boundary and the subblock boundary of the current intra predictor. It is proposed to adaptively change the overlapped block refinement process order during predictor fusion according to some conditions.

[0141] For example, when the current block or subblock, or its neighbouring block or subblock, is coded using an intra prediction mode, the order of overlapped block refinement and predictor blending can be either pre-defined or adaptively adjusted according to specific prediction modes. These modes include, but are not limited to, matrix-based intra prediction (MIP), replacement of conventional intra modes with MIP (PDP), decoder-side intra mode derivation (DIMD), occurrence-based intra coding (OBIC), template-based intra mode derivation (TIMD), decoder-side derived intra prediction–related modes, spatial-GPM and GPM-related modes, intra prediction fusion, and chroma intra prediction mode fusion. The decoder-side derived intra prediction–related modes can be DIMD, OBIC, TIMD or other modes in which the decoder side uses a statistical method or a cost-based method to derive the modes. The GPM-related modes can be spatial-GPM, GPM-intra, or other modes, where one or more partitions are derived using intra prediction methods and then blended afterwards.

[0142] Some syntax elements, flags, or indexes may be signalled to indicate the processing order during predictor fusion. For example, such signalling may be provided at the Sequence Parameter Set (SPS) level, Picture Parameter Set (PPS) level, picture header level, slice header level, Coding Tree Unit (CTU) level, block level, or any combination thereof.

[0143] Example 1: Current block or current subblock coded by intra prediction mode or neighbouring block or neighbouring subblock coded by intra prediction mode

[0144] In one embodiment, the order of overlapped block refinement and predictor blending can be adaptively changed according to some conditions, such as prediction modes. The order can be fixed according to the pre-defined conditions or be determined according to signalled flags or syntax elements.

[0145] In another embodiment, overlapped block refinement can be always applied to each predictor accordingly before the final predictor is blended according to some conditions, such as prediction mode.

[0146] In another embodiment, the final predictor can always be generated by fusing one or more intra predictors, and then overlapped block refinement can be applied to the final predictor for refinement according to some conditions, such as prediction mode.

[0147] In another embodiment, when overlapped block refinement is applied to one or more predictors, the weightings, blending lines, blending rules, or adaptive blending decision in the overlapped block refinement process can differ among the predictors.

[0148] Example 1-1: decoder-side intra modes derivation (DIMD)

[0149] In one example, the order of overlapped block refinement process and DIMD blending process can be adaptively changed. In another example, when the current block is coded by DIMD prediction mode, overlapped block refinement can be firstly performed at one or more DIMD derived predictors, and one or more DIMD derived predictors are blended afterward to generate the final predictor of DIMD prediction mode. For another example, when the current block is coded by DIMD prediction mode, one or more DIMD derived predictors can be firstly blended to generate the final predictor of DIMD prediction mode and then overlapped block refinement can be applied to refine the final predictor.

[0150] For example, the order of overlapped block refinement process and DIMD blending process is determined according to a predictor difference. For those one or more DIMD derived predictors, predictor difference is calculated and if predictor difference is smaller than or equal to a threshold, DIMD blending process can be performed firstly followed by the overlapped block refinement process; otherwise, overlapped block refinement process can be carried out firstly followed by the DIMD blending process.
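The threshold-based order decision described above might be sketched as follows. The paragraph does not specify how the predictor difference is measured; this sketch assumes the largest pairwise sum of absolute differences (SAD) among the derived predictors. The `blend` and `refine` callables are hypothetical placeholders for the DIMD blending process and the overlapped block refinement process.

```python
from typing import Callable, List, Sequence

Block = List[List[int]]


def predictor_difference(preds: Sequence[Block]) -> int:
    """Largest pairwise SAD among the predictors (an assumed difference measure)."""
    worst = 0
    for a in range(len(preds)):
        for b in range(a + 1, len(preds)):
            sad = sum(abs(p - q)
                      for row_a, row_b in zip(preds[a], preds[b])
                      for p, q in zip(row_a, row_b))
            worst = max(worst, sad)
    return worst


def choose_order(preds: Sequence[Block], threshold: int) -> str:
    """Blend first when the predictors are similar, refine first otherwise."""
    if predictor_difference(preds) <= threshold:
        return "blend_then_refine"
    return "refine_then_blend"


def final_predictor(preds: Sequence[Block], threshold: int,
                    blend: Callable[[Sequence[Block]], Block],
                    refine: Callable[[Block], Block]) -> Block:
    """Run blending and refinement in the order selected by choose_order."""
    if choose_order(preds, threshold) == "blend_then_refine":
        return refine(blend(preds))
    return blend([refine(p) for p in preds])
```

When the predictors are close to each other, refining once after blending costs a single refinement pass; when they differ strongly, each predictor is refined separately before blending.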

[0151] For example, when overlapped block refinement is applied to the generated final predictor of DIMD prediction mode, the blending weights of overlapped block refinement can be stronger or the blending process can use more blending lines compared to the original OBMC weightings, as shown in the following equations:

Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32
Row4 = (31*CurrTop[i] + 1*NeighTop[i] + 16) / 32.

[0152] In the above equations, CurrTop[i] denotes the predictor pixel value at position i along the top block boundary. NeighTop[i] denotes the neighbouring overlapped predictor pixel value at position i along the top block boundary. While the example only illustrates the processing on the top boundary, similar processing can be applied to the left boundary as well.
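The five-row weighting of paragraph [0151] can be sketched in code as below. This is a minimal sketch, not a normative implementation: it assumes that "/ 32" denotes integer division with 16 as the rounding offset (implemented as a right shift by 5), that sample values are non-negative, and that Row0 is the row nearest the top boundary. The names `STRONGER_WEIGHTS` and `blend_top_boundary` are hypothetical.

```python
from typing import List, Sequence, Tuple

# (current weight, neighbour weight) for Row0..Row4; each pair sums to 32.
STRONGER_WEIGHTS: List[Tuple[int, int]] = [(24, 8), (26, 6), (28, 4), (30, 2), (31, 1)]


def blend_top_boundary(
    curr_rows: Sequence[Sequence[int]],
    neigh_rows: Sequence[Sequence[int]],
    weights: Sequence[Tuple[int, int]] = STRONGER_WEIGHTS,
) -> List[List[int]]:
    """Blend the top rows of the current predictor with the neighbouring predictor.

    Rows beyond len(weights) are copied unchanged from the current predictor.
    """
    if len(curr_rows) < len(weights) or len(neigh_rows) < len(weights):
        raise ValueError("not enough rows for the requested number of blending lines")
    out = [list(row) for row in curr_rows]
    for r, (w_cur, w_nei) in enumerate(weights):
        out[r] = [
            (w_cur * c + w_nei * n + 16) >> 5  # assumed meaning of "(... + 16) / 32"
            for c, n in zip(curr_rows[r], neigh_rows[r])
        ]
    return out
```

Passing a shorter or different `weights` list gives the four-row and single-row variants used elsewhere in this description; the left boundary would be handled the same way on columns.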

[0153] For example, when overlapped block refinement is applied to the generated final predictor of DIMD prediction mode, the blending weights of overlapped block refinement can be weaker compared to original overlapped block refinement weightings, as shown in the following equations:
Row0 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row1 = (29*CurrTop[i] + 3*NeighTop[i] + 16) / 32
Row2 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32
Row3 = (31*CurrTop[i] + 1*NeighTop[i] + 16) / 32

[0154] Example 1-2: Occurrence Based Intra Coding (OBIC)

[0155] In one example, the order of overlapped block refinement process and the OBIC blending process can be adaptively changed. In another example, when the current block is coded by the OBIC prediction mode, overlapped block refinement can be firstly performed at one or more OBIC derived predictors and one or more OBIC derived predictors can be blended afterwards to generate the final predictor of OBIC prediction mode. For another example, when the current block is coded by OBIC prediction mode, one or more OBIC derived predictors can be firstly blended to generate the final predictor of OBIC prediction mode and then overlapped block refinement can be applied to refine the final predictor.

[0156] For example, the order of overlapped block refinement process and the OBIC blending process is determined according to the predictor difference. For those one or more OBIC derived predictors, the predictor difference is calculated, and if the predictor difference is smaller than or equal to a threshold, the OBIC blending process can be performed firstly followed by the overlapped block refinement process; otherwise, the overlapped block refinement process can be carried out firstly followed by the OBIC blending process.

[0157] For example, when overlapped block refinement is applied to the generated final predictor of the OBIC prediction mode, the blending weights of overlapped block refinement can be stronger or the blending process can use more blending lines compared to the original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32
Row4 = (31*CurrTop[i] + 1*NeighTop[i] + 16) / 32

[0158] For example, when overlapped block refinement is applied to the generated final predictor of the OBIC prediction mode, the blending weights of overlapped block refinement can be stronger compared to original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32

[0159] Example 1-3: Template based intra mode derivation (TIMD)

[0160] In one example, the order of overlapped block refinement process and TIMD fusion process can be adaptively changed. In another example, when the current block is coded by TIMD prediction mode, overlapped block refinement can be firstly performed at one or more TIMD derived predictors, and one or more TIMD derived predictors can be fused afterwards to generate the final predictor of TIMD prediction mode. For another example, when the current block is coded by TIMD prediction mode, one or more TIMD derived predictors can be firstly blended to generate the final predictor of TIMD prediction mode and then overlapped block refinement can be applied to refine the final predictor.

[0161] In another embodiment, overlapped block refinement can be applied to refine the template predictor in TIMD prediction mode.

[0162] In another embodiment, overlapped block refinement can be applied to refine the current intra predictor generated in TIMD prediction mode.

[0163] For example, the order of the overlapped block refinement process and the TIMD blending process is determined according to the current predictor difference or the template predictor difference. For those one or more template predictors, the template predictor difference is calculated and if the template predictor difference is smaller than or equal to a threshold, the TIMD blending process can be performed firstly followed by the overlapped block refinement process; otherwise, the overlapped block refinement process can be carried out firstly followed by the TIMD blending process. For those one or more TIMD derived predictors, the TIMD derived predictor difference is calculated and if the TIMD derived predictor difference is smaller than or equal to a threshold, the TIMD blending process can be performed firstly followed by the overlapped block refinement process; otherwise, the overlapped block refinement process can be carried out firstly followed by the TIMD blending process.

[0164] For example, when overlapped block refinement is applied to the generated final predictor of TIMD prediction mode, the blending weights of overlapped block refinement can be stronger or the blending process can use more blending lines compared to original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32
Row4 = (31*CurrTop[i] + 1*NeighTop[i] + 16) / 32

[0165] For example, when overlapped block refinement is applied to the generated final predictor of TIMD prediction mode, the blending weights of overlapped block refinement can be stronger compared to original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32

[0166] For example, when overlapped block refinement is applied to the TIMD template predictor, weaker weightings or fewer lines can be used in the overlapped block refinement process compared to the original OBMC weightings and blending lines, as shown in the following equation:
Row0 = (29*CurrTop[i] + 3*NeighTop[i] + 16) / 32

[0167] Example 1-4: Decoder-side derived intra prediction modes related modes

[0168] In one example, the order of the overlapped block refinement process and the decoder-side derived intra prediction mode process can be adaptively changed. In another example, when the current block is coded by a decoder-side derived intra prediction mode, overlapped block refinement can be firstly performed at one or more decoder-side derived predictors and the one or more decoder-side derived predictors can be fused afterwards to generate the final predictor of the decoder-side derived intra prediction mode. For another example, when the current block is coded by a decoder-side derived intra prediction mode, one or more decoder-side derived predictors are firstly blended to generate the final predictor of the decoder-side derived intra prediction mode and then overlapped block refinement can be applied to refine the final predictor.

[0169] For example, the order of overlapped block refinement process and decoder-side derived intra prediction mode related modes blending process is determined according to the predictor difference. For those one or more decoder-side derived predictors, the predictor difference is calculated and if predictor difference is smaller than or equal to a threshold, decoder-side derived intra prediction mode related modes blending process can be performed firstly followed by the overlapped block refinement process; otherwise, the overlapped block refinement process can be carried out firstly followed by decoder-side derived intra prediction modes related mode blending process.

[0170] For example, when overlapped block refinement is applied to the generated final predictor of decoder-side derived intra prediction modes related modes, the blending weights of overlapped block refinement can be stronger or the blending process can use more blending lines compared to original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32
Row4 = (31*CurrTop[i] + 1*NeighTop[i] + 16) / 32

[0171] For example, when overlapped block refinement is applied to the generated final predictor of decoder-side derived intra prediction mode related modes, the blending weights of overlapped block refinement can be stronger compared to original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32

[0172] Example 1-5: Spatial-GPM

[0173] In one example, the order of overlapped block refinement process and spatial-GPM partition blending can be adaptively changed. In another example, when the current block is coded by the spatial-GPM prediction mode, overlapped block refinement can be firstly performed at one or more spatial-GPM partitions and one or more spatial-GPM partitions blending can be performed to generate the final predictor of spatial-GPM prediction mode. For another example, when the current block is coded by the spatial-GPM prediction mode, one or more spatial-GPM partitions can be firstly blended to generate the final predictor of spatial GPM prediction mode, and then overlapped block refinement can be applied to refine the final predictor.

[0174] In another embodiment, when overlapped block refinement process is performed at spatial-GPM coded block, the blending weightings or the blending lines of overlapped block refinement process can be changed or different from original blending weightings or blending lines. For instance, as shown in Fig. 15, when overlapped block refinement is performed at the current subblock 1530, it is possible that there is a GPM partition line 1520 lying inside the current subblock. In this case, overlapped block refinement blending lines or blending weightings can be fewer or weaker since there is a GPM partition line. In Fig. 15, block 1510 corresponds to the current block, block 1530 corresponds to the current subblock with intra angular mode 1532, and block 1540 corresponds to the neighbouring subblock with intra angular mode 1542.

[0175] In another embodiment, when the overlapped block refinement process is performed at a spatial-GPM coded block, one or more pieces of stored intra prediction information at the current block can be used. For example, when the current block is coded by the spatial-GPM prediction mode, either the intra prediction mode from partition 0 or the intra prediction mode from partition 1 will be stored at the current block. However, in the overlapped block refinement process, the intra prediction mode information from partition 0, from partition 1, or from both of them can be used.

[0176] For example, the order of the overlapped block refinement process and the spatial-GPM blending process is determined according to the predictor difference. For one or more spatial-GPM partitions, the partition difference is calculated. If the partition difference is smaller than or equal to a threshold, the spatial-GPM blending process is performed first, followed by the overlapped block refinement process; otherwise, the overlapped block refinement process can be performed first, followed by the spatial-GPM blending process.

[0177] For example, when overlapped block refinement is applied to the generated final predictor of spatial-GPM prediction mode, the blending weights of overlapped block refinement can be stronger or the blending process can use more blending lines compared to original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32
Row4 = (31*CurrTop[i] + 1*NeighTop[i] + 16) / 32

[0178] For example, when overlapped block refinement is applied to the generated final predictor of spatial-GPM prediction mode, the blending weights of overlapped block refinement can be stronger compared to original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32

[0179] Example 1-6: GPM related modes

[0180] In one example, the order of the overlapped block refinement process and GPM related mode blending can be adaptively changed. In another example, when the current block is coded by a GPM related prediction mode, overlapped block refinement can be performed first at one or more GPM partitions, and one or more GPM partitions blending can be performed to generate the final predictor of GPM related prediction mode. For another example, when the current block is coded by the GPM related prediction mode, one or more GPM partitions can be blended first to generate the final predictor of GPM related prediction mode and then overlapped block refinement can be applied to refine the final predictor.

[0181] In another embodiment, when the overlapped block refinement process is performed at a GPM related mode coded block, the blending weightings or the blending lines of the overlapped block refinement process can be changed or different from the original blending weightings or blending lines. For instance, as shown in Fig. 15, when overlapped block refinement is performed at the current subblock, it is possible that there is a GPM partition line lying inside the current subblock. In this case, overlapped block refinement blending lines or blending weightings can be fewer or weaker since there is a GPM partition line.

[0182] In another embodiment, when the overlapped block refinement process is performed at a GPM related mode coded block, one or more pieces of stored intra prediction information at the current block can be used. For example, when the current block is coded by a GPM related prediction mode, either intra prediction mode from partition 0 or intra prediction mode from partition 1 will be stored at the current block. However, in the overlapped block refinement process, either one of intra prediction mode information from partition 0 or 1, or both of them can be used in the overlapped block refinement process.

[0183] For example, the order of overlapped block refinement process and GPM related mode blending process is determined according to a predictor difference. For one or more GPM partitions, a partition difference is calculated. If partition difference is smaller than or equal to a threshold, the GPM related mode blending process can be performed first, followed by the overlapped block refinement process; otherwise, the overlapped block refinement process can be performed first, followed by the GPM related mode blending process.

[0184] For example, when overlapped block refinement is applied to the generated final predictor of GPM related prediction mode, the blending weights of overlapped block refinement can be stronger or the blending process can use more blending lines compared to original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32
Row4 = (31*CurrTop[i] + 1*NeighTop[i] + 16) / 32

[0185] For example, when overlapped block refinement is applied to the generated final predictor of GPM related prediction mode, the blending weights of overlapped block refinement can be stronger compared to original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32

[0186] Example 1-7: Intra prediction fusion

[0187] In one example, the order of overlapped block refinement process and intra predictor fusion can be adaptively changed. In another example, when the current block is determined to use intra prediction fusion, the overlapped block refinement process can be each applied to the generated intra predictor first for refinement. Then, each generated intra predictor refined can be fused to generate the final intra predictor. In another example, the final intra predictor can be derived by fusing one or more intra predictors first, and then overlapped block refinement can be applied to the final predictor for refinement.

[0188] For example, the order of overlapped block refinement process and intra prediction fusion blending process is determined according to a predictor difference. For one or more intra prediction fusion derived predictors, the predictor difference is calculated and if the predictor difference is smaller than or equal to a threshold, the intra prediction fusion blending process can be performed first, followed by the overlapped block refinement process; otherwise, the overlapped block refinement process can be performed first, followed by the intra prediction fusion blending process.

[0189] For example, when overlapped block refinement is applied to the generated final predictor of intra prediction fusion prediction mode, the blending weights of overlapped block refinement can be stronger or the blending process can use more blending lines compared to original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32
Row4 = (31*CurrTop[i] + 1*NeighTop[i] + 16) / 32

[0190] For example, when overlapped block refinement is applied to the generated final predictor of intra prediction fusion prediction mode, the blending weights of overlapped block refinement can be stronger compared to original OBMC weightings, as shown in the following equations:
Row0 = (24*CurrTop[i] + 8*NeighTop[i] + 16) / 32
Row1 = (26*CurrTop[i] + 6*NeighTop[i] + 16) / 32
Row2 = (28*CurrTop[i] + 4*NeighTop[i] + 16) / 32
Row3 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32

[0191] Example 1-8: Fusion of chroma intra prediction modes

[0192] In one example, the order of overlapped block refinement process and chroma intra prediction fusion can be adaptively changed. In another example, when chroma intra prediction fusion is applied to the current block, the overlapped block refinement process can be applied to each generated chroma intra predictor for refinement. Then, each generated chroma intra predictor refined can be fused to generate the final predictor. In another example, the final intra predictor can be firstly derived by fusing one or more generated chroma intra predictors, and then overlapped block refinement can be applied to the final predictor for refinement.

[0193] For example, the order of the overlapped block refinement process and the chroma intra prediction fusion process is determined according to a chroma predictor difference or a difference between the down-sampled luma reconstruction and the chroma predictor. For one or more generated chroma intra predictors, the chroma predictor difference or the difference between the down-sampled luma reconstruction and the chroma predictor is calculated. If the difference is smaller than or equal to a threshold, the chroma intra prediction fusion process can be performed first, followed by the overlapped block refinement process; otherwise, the overlapped block refinement process can be performed first, followed by the chroma intra prediction fusion process.

[0194] For example, when overlapped block refinement is applied to the generated final predictor of chroma intra prediction fusion mode, the blending weights of overlapped block refinement can be weaker compared to original OBMC chroma weightings, as shown in the following equation:
Row0 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32

[0195] For example, when overlapped block refinement is applied to the generated final predictor of chroma intra prediction fusion mode, the blending weights of overlapped block refinement can be weaker compared to original OBMC chroma weightings, as shown in the following equation:
Row0 = (30*CurrTop[i] + 2*NeighTop[i] + 16) / 32

[0196] Adaptive Boundary Process Order in Overlapped Block Refinement

[0197] When the current block is coded by an intra prediction mode, overlapped block refinement can be performed to refine the current intra predictor block boundary and subblock block boundary. It is proposed to adaptively change the overlapped block refinement boundary process order during predictor fusion according to some conditions, such as intra prediction angle difference, prediction mode, signalled flag, predictor difference between the current predictor and the neighbouring predictor, or motion information. Some syntax elements, flags, or indexes can be signalled to indicate the boundary process order. The signalling can be at SPS-level, PPS-level, picture header level, slice header level, CTU level, or block level.

[0198] In one embodiment, the overlapped block refinement boundary process order can be adaptively changed according to some conditions, such as intra prediction angle difference, prediction mode, signalled flag, predictor difference between the current predictor and neighbouring predictor, or motion information.

[0199] In another embodiment, when the current block is coded in a subblock mode (e.g., ISP) , the overlapped block refinement boundary process order can be adaptively changed according to some conditions, such as intra prediction angle difference, prediction mode, signalled flag, predictor difference between the current predictor and neighbouring predictor, or motion information.

[0200] In another embodiment, when the current block is coded in a subblock mode (e.g., ISP) , the overlapped block refinement boundary process order can be always the subblock boundary first and then the CU boundary.

[0201] In another embodiment, CU boundary overlapped block refinement and subblock boundary overlapped block refinement can be enabled or disabled separately. That is, each boundary may decide whether to enable overlapped block refinement or not according to some conditions, such as intra prediction angle difference, prediction mode, signalled flag, predictor difference between the current predictor and a neighbouring predictor, or motion information.
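The boundary process order of paragraph [0200] and the separate enabling of paragraph [0201] can be combined in a small scheduling sketch. This is illustrative only: the function name `boundary_process_order`, its parameters, and the string labels are hypothetical, and the conditions that would set the flags (angle difference, prediction mode, signalled flag, and so on) are left to the caller.

```python
from typing import List


def boundary_process_order(
    subblock_mode: bool,
    cu_enabled: bool = True,
    subblock_enabled: bool = True,
    subblock_first: bool = True,
) -> List[str]:
    """Return the boundaries to refine, in processing order.

    subblock_first=True reproduces the fixed order "subblock boundary first,
    then CU boundary"; False models the adaptively changed order.
    A boundary whose flag is False is skipped entirely.
    """
    stages: List[str] = []
    if cu_enabled:
        stages.append("cu")
    if subblock_mode and subblock_enabled:
        if subblock_first:
            stages.insert(0, "subblock")
        else:
            stages.append("subblock")
    return stages
```

For a block coded in a subblock mode such as ISP, the default arguments yield the subblock boundary followed by the CU boundary.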

[0202] Predictor Generation in Intra Block Overlapped Block Refinement and OBMC When Current Block or Neighbouring Block is Intra Coded Block

[0203] In one embodiment, when the current block or neighbouring block is coded by spatial-GPM or GPM-intra or GPM related intra prediction modes, overlapped block refinement can be applied to refine the current predictor.

[0204] In another embodiment, overlapped block refinement can be enabled depending on the neighbouring template shape or existence of neighbouring intra coded blocks. For example, top CU boundary overlapped block refinement can be performed only when the top neighbouring template exists.

[0205] In another embodiment, when neighbouring block or the current block satisfies some conditions to generate other kinds of predictor, one or more new predictors can be generated and used in overlapped block refinement or OBMC. For example, when neighbouring block or the current block meets PDP conditions, PDP is used to generate the intra predictor for the overlapped block refinement process or the OBMC process.

[0206] In another embodiment, when the neighbouring block or the current block does not meet some conditions for generating other kinds of predictor, original predictor generation mechanism can be employed to generate the predictor in the overlapped block refinement process or the OBMC process. For example, when the neighbouring block or the current block does not meet PDP conditions, regular intra prediction method can be used to generate the intra predictor for the overlapped block refinement process or the OBMC process.

[0207] Syntax Design of Overlapped Block Refinement

[0208] When the current block is coded by an intra prediction mode, overlapped block refinement can be performed to refine the current intra predictor block boundary and subblock block boundary. When a neighbouring block is coded by the intra prediction mode, overlapped block refinement or OBMC can be performed to refine the current intra predictor, the current inter predictor, the current IBC predictor, or the current IntraTMP predictor. It may not always be beneficial to apply overlapped block refinement or OBMC to refine the current predictor. Accordingly, it is proposed to introduce one or more high-level syntax elements to selectively enable or disable the overlapped block refinement process, the OBMC process, or the blending process.

[0209] In one embodiment, one or more high-level syntax elements may be explicitly signalled to enable or disable the overlapped block refinement process, the OBMC process, or the blending process when the current block, current subblock, neighbouring block, or neighbouring subblock is intra-predicted. For example, such high-level syntax elements may include an SPS-level flag, a PPS-level flag, a picture header-level flag, a slice header-level flag, a frame-level flag, a CTU-level flag, or a block-level flag.

[0210] In another embodiment, some high-level syntax can be signalled or inferred implicitly to enable or to disable overlapped block refinement process or OBMC process or blending process when the current block or the current subblock or neighbouring block or neighbouring subblock is intra-predicted. For example, some high-level syntax can be SPS-level flag or PPS-level flag or picture header level flag or slice header level flag or frame level flag or CTU level flag or block level flag.
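One possible way for a decoder to resolve such flags is sketched below. The text does not define how flags at different levels interact, so the gating rule used here (a flag that is not signalled is inferred from the nearest higher level, and a disabling flag at a higher level cannot be overridden lower down) is an assumption, as are the level ordering, the omission of the frame level, and the names `LEVELS` and `refinement_enabled`.

```python
from typing import Dict, Optional, Tuple

# Assumed ordering from highest to lowest level; a subset of the levels named above.
LEVELS: Tuple[str, ...] = ("sps", "pps", "picture_header", "slice_header", "ctu", "block")


def refinement_enabled(flags: Dict[str, Optional[bool]], default: bool = False) -> bool:
    """Resolve the effective enable state for the current block.

    flags maps a level name to True/False when signalled, or None/absent
    when the flag is not signalled and must be inferred.
    """
    enabled = default
    for level in LEVELS:
        value = flags.get(level)
        if value is None:
            continue  # not signalled: inherit the state from the higher level
        if not value:
            return False  # disabled here: lower levels cannot re-enable
        enabled = True
    return enabled
```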

[0211] Any of the foregoing proposed methods can be implemented in encoders and/or decoders. For example, with reference to the exemplary encoder in Fig. 1A and the exemplary decoder in Fig. 1B, any of the proposed methods of adaptive boundary refinement for blended intra prediction can be implemented in a predictor derivation module of an encoder and/or a predictor derivation module of a decoder. Alternatively, any of the proposed methods can be implemented as a circuit coupled to the predictor derivation module of the encoder and/or the predictor derivation module of the decoder, so as to provide the information needed by the predictor derivation module. For example, the process for the proposed methods can be implemented at an encoder side or a decoder side, such as in an Intra/Inter coding module in a decoder (e.g. Intra Pred. 150 / MC 152 in Fig. 1B) or an Intra/Inter coding module in an encoder (e.g. Intra Pred. 110 / Inter Pred. 112 in Fig. 1A).

[0212] Fig. 16 illustrates a flowchart of an exemplary video coding system, where an overlapped boundary process is used for a blended intra predictor according to an embodiment of the present invention. The steps shown in the flowchart may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side. The steps shown in the flowchart may also be implemented based on hardware such as one or more electronic devices or processors arranged to perform the steps in the flowchart. According to this method, input data comprising a current block, a current subblock, a neighbouring block, a neighbouring subblock, or a combination thereof is received in step 1610. An intra prediction blending process and an overlapped boundary refinement process are applied to the current block in step 1620. Whether the intra prediction blending process is applied to the current block before or after the overlapped boundary refinement process is pre-defined or is determined adaptively for the current block according to one or more conditions. When the intra prediction blending process is applied to the current block or the current subblock before the overlapped boundary refinement process, a final intra predictor is generated for the current block or the current subblock by blending two or more intra predictions, and the overlapped boundary refinement process is then applied to overlapped boundaries associated with the final intra predictor to generate a refined final predictor. When the intra prediction blending process is applied to the current block or the current subblock after the overlapped boundary refinement process, the overlapped boundary refinement process is applied to the overlapped boundaries associated with at least one of said two or more intra predictions to form refined intra predictions, and the intra prediction blending process is applied to the refined intra predictions to generate the refined final predictor. The current block or the current subblock is encoded or decoded using the refined final predictor in step 1630.
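The two orderings of step 1620 can be illustrated with a short sketch on a single row of samples along the top boundary. Everything here is a simplification under stated assumptions: blending weights are assumed to sum to 64, refinement uses only the Row0 weights (24, 8) from the earlier examples, every intra prediction is refined in the refine-first branch (the text requires only at least one), and the names `blend`, `refine`, and `refined_final_predictor` are hypothetical.

```python
from typing import List, Sequence


def blend(preds: Sequence[Sequence[int]], weights: Sequence[int]) -> List[int]:
    """Weighted average of two or more intra predictions (weights assumed to sum to 64)."""
    if len(preds) != len(weights) or sum(weights) != 64:
        raise ValueError("need one weight per prediction, summing to 64")
    width = len(preds[0])
    return [(sum(w * p[i] for w, p in zip(weights, preds)) + 32) >> 6 for i in range(width)]


def refine(pred: Sequence[int], neigh: Sequence[int], w_cur: int = 24, w_nei: int = 8) -> List[int]:
    """Single-line overlapped boundary refinement using Row0-style weights."""
    return [(w_cur * c + w_nei * n + 16) >> 5 for c, n in zip(pred, neigh)]


def refined_final_predictor(
    preds: Sequence[Sequence[int]],
    weights: Sequence[int],
    neigh: Sequence[int],
    blend_first: bool,
) -> List[int]:
    """Step 1620: apply blending and boundary refinement in the selected order."""
    if blend_first:
        final = blend(preds, weights)   # final intra predictor
        return refine(final, neigh)     # refined final predictor
    refined = [refine(p, neigh) for p in preds]  # refined intra predictions
    return blend(refined, weights)      # refined final predictor
```

The `blend_first` argument would be either pre-defined or derived from a condition such as a predictor difference compared with a threshold; the returned row is what step 1630 would use for encoding or decoding.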

[0213] The flowchart shown is intended to illustrate an example of video coding according to the present invention. A person skilled in the art may modify each step, re-arrange the steps, split a step, or combine steps to practice the present invention without departing from the spirit of the present invention. In the disclosure, specific syntax and semantics have been used to illustrate examples to implement embodiments of the present invention. A skilled person may practice the present invention by substituting the syntax and semantics with equivalent syntax and semantics without departing from the spirit of the present invention.

[0214] The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced.

[0215] Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.

[0216] The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.

Claims

1. A method of video coding, the method comprising:
receiving input data comprising a current block, a current subblock, a neighbouring block, a neighbouring subblock, or a combination thereof;
applying an intra prediction blending process and an overlapped boundary refinement process to the current block;
wherein whether the intra prediction blending process is applied to the current block before or after the overlapped boundary refinement process is pre-defined or is determined adaptively for the current block according to one or more conditions;
wherein when the intra prediction blending process is applied to the current block or the current subblock before the overlapped boundary refinement process, a final intra predictor is generated for the current block or the current subblock by blending two or more intra predictions, and the overlapped boundary refinement process is then applied to overlapped boundaries associated with the final intra predictor to generate a refined final predictor; and
wherein when the intra prediction blending process is applied to the current block or the current subblock after the overlapped boundary refinement process, the overlapped boundary refinement process is applied to the overlapped boundaries associated with at least one of said two or more intra predictions to form refined intra predictions, and the intra prediction blending process is applied to the refined intra predictions to generate the refined final predictor; and
encoding or decoding the current block or the current subblock using the refined final predictor.

2. The method of Claim 1, wherein said one or more conditions comprise prediction mode.

3. The method of Claim 1, wherein said one or more conditions comprise a predictor difference, and the predictor difference is compared with a threshold to determine whether the intra prediction blending process is applied to the current block before or after the overlapped boundary refinement process.

4. The method of Claim 1, wherein said one or more conditions comprise intra prediction angle difference or motion information.

5. The method of Claim 1, wherein whether the intra prediction blending process is applied to the current block before or after the overlapped boundary refinement process is fixed according to one or more pre-defined conditions or determined according to one or more flags or syntax elements signalled.

6. The method of Claim 5, wherein said one or more flags or syntax elements are signalled or parsed at SPS (Sequence Parameter Set)-level, PPS (Picture Parameter Set)-level, picture header level, slice header level, CTU (Coding Tree Unit)-level, block level or a combination thereof.

7. The method of Claim 1, wherein when the overlapped boundary refinement process is applied to multiple predictors, weightings, blending lines, blending rules, or adaptive blending decision in the overlapped boundary refinement process is allowed to be different for multiple predictors.

8. The method of Claim 1, wherein the current block is coded in Matrix-Based Intra Prediction (MIP), replacement of conventional intra modes with MIP (PDP), Decoder-Side Intra Mode Derivation (DIMD), Occurrence-Based Intra Coding (OBIC), Template-Based Intra Mode Derivation (TIMD), decoder-side derived intra prediction–related modes, spatial-GPM and GPM-related modes, intra prediction fusion, or chroma intra prediction mode fusion.

9. The method of Claim 1, wherein when the current block is coded in Spatial Geometric Partitioning Mode (S-GPM), said two or more intra predictions are derived from S-GPM partitions of the current block respectively.

10. The method of Claim 9, wherein whether the intra prediction blending process is applied to the current block before or after the overlapped boundary refinement process is determined adaptively.

11. The method of Claim 9, wherein the intra prediction blending process is always applied to the current block after the overlapped boundary refinement process.

12. The method of Claim 9, wherein when the overlapped boundary refinement process is performed at the current subblock with a GPM partition line inside the current subblock, the overlapped boundary refinement process uses less blending lines or weaker blending weightings.

13. The method of Claim 1, wherein when the current block is coded in Decoder-Side Intra Mode Derivation (DIMD), said two or more intra predictions are derived according to DIMD process.

14. An apparatus for video coding, the apparatus comprising one or more electronics or processors arranged to:
receive input data comprising a current block, a current subblock, a neighbouring block, a neighbouring subblock, or a combination thereof;
apply an intra prediction blending process and an overlapped boundary refinement process to the current block;
wherein whether the intra prediction blending process is applied to the current block before or after the overlapped boundary refinement process is pre-defined or is determined adaptively for the current block according to one or more conditions;
wherein when the intra prediction blending process is applied to the current block or the current subblock before the overlapped boundary refinement process, a final intra predictor is generated for the current block or the current subblock by blending two or more intra predictions, and the overlapped boundary refinement process is then applied to overlapped boundaries associated with the final intra predictor to generate a refined final predictor; and
wherein when the intra prediction blending process is applied to the current block or the current subblock after the overlapped boundary refinement process, the overlapped boundary refinement process is applied to the overlapped boundaries associated with at least one of said two or more intra predictions to form refined intra predictions, and the intra prediction blending process is applied to the refined intra predictions to generate the refined final predictor; and
encode or decode the current block or the current subblock using the refined final predictor.
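
Illustrative sketch (not part of the claims): the two process orders recited in Claims 1 and 14 can be pictured with a minimal floating-point model. Everything named here is a hypothetical stand-in chosen for illustration: the helper names, the restriction to the top boundary, the line weights, the threshold value of 8.0, and the direction of the threshold rule (Claim 3 only states that a predictor difference is compared with a threshold, not which order each outcome selects).

```python
from typing import List, Optional, Sequence

Block = List[List[float]]  # rows of predicted samples


def blend(preds: Sequence[Block], weights: Sequence[float]) -> Block:
    """Intra prediction blending: weighted sum of two or more predictors."""
    h, w = len(preds[0]), len(preds[0][0])
    return [[sum(wt * p[y][x] for p, wt in zip(preds, weights))
             for x in range(w)] for y in range(h)]


def refine_top(pred: Block, nbr: Block, line_weights: Sequence[float]) -> Block:
    """Overlapped boundary refinement, top boundary only (assumption).

    Row y of `pred` is mixed with row y of `nbr`, a predictor derived from
    the above neighbour, using line_weights[y]; other rows are unchanged.
    """
    out = [row[:] for row in pred]
    for y, wn in enumerate(line_weights[:len(pred)]):
        out[y] = [(1.0 - wn) * c + wn * n for c, n in zip(pred[y], nbr[y])]
    return out


def mean_abs_diff(a: Block, b: Block) -> float:
    """Predictor difference measured as mean absolute difference (assumption)."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def refined_final_predictor(
    preds: Sequence[Block],
    blend_weights: Sequence[float],
    nbr: Block,
    per_pred_lines: Sequence[Sequence[float]],  # refine-first: may differ per predictor
    blended_lines: Sequence[float],             # blend-first: one set of line weights
    blend_first: Optional[bool] = None,         # None -> decide adaptively
    threshold: float = 8.0,                     # hypothetical value
) -> Block:
    if blend_first is None:
        # Hypothetical rule: similar predictors -> blend first.
        blend_first = mean_abs_diff(preds[0], preds[1]) < threshold
    if blend_first:
        # Order A: blend, then refine the boundaries of the final predictor.
        return refine_top(blend(preds, blend_weights), nbr, blended_lines)
    # Order B: refine each predictor, then blend the refined predictors.
    refined = [refine_top(p, nbr, lw) for p, lw in zip(preds, per_pred_lines)]
    return blend(refined, blend_weights)


p0 = [[100.0, 100.0], [100.0, 100.0]]
p1 = [[120.0, 120.0], [120.0, 120.0]]
nbr = [[80.0, 80.0], [80.0, 80.0]]
order_a = refined_final_predictor([p0, p1], (0.5, 0.5), nbr,
                                  [(0.25,), (0.25,)], (0.25,), blend_first=True)
order_b = refined_final_predictor([p0, p1], (0.5, 0.5), nbr,
                                  [(0.25,), (0.0,)], (0.25,), blend_first=False)
print(order_a)  # [[102.5, 102.5], [110.0, 110.0]]
print(order_b)  # [[107.5, 107.5], [110.0, 110.0]]
```

In this simplified model both steps are linear, so the two orders give the same result whenever every predictor is refined with identical line weights. They diverge once the refinement settings differ per predictor, as Claim 7 permits (here the second predictor is left unrefined), and a practical integer implementation would likely also differ through intermediate rounding.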