Implicit determination of transform skip mode

The transform mode is determined from the decoded coefficients of representative blocks and combined with zeroing operations and transform skip modes. This optimizes video encoding and decoding, addresses the low coding efficiency for large blocks and screen-content video, and improves compression performance.

CN115699737B · Active · Publication Date: 2026-06-26 · DOUYIN VISION CO LTD +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: DOUYIN VISION CO LTD
Filing Date: 2021-03-25
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

Existing video codec standards select and apply transform modes inefficiently when processing video blocks, particularly for large blocks and screen-content video, where conventional methods perform poorly.

Method used

The encoding and decoding of video blocks is optimized by determining whether to apply a horizontal or vertical specific transform mode based on the decoded coefficients of one or more representative blocks, combined with zeroing operations and transform skip modes.

Benefits of technology

It improves the efficiency of video encoding and decoding, particularly for large blocks and screen-content video, by reducing unnecessary transform operations, which improves coding efficiency and compression performance.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

Methods, systems, and apparatus, including computer programs, for video processing are described. An example video processing method includes performing a conversion between a video and a bitstream of the video according to a rule. The rule specifies use of a particular transform mode for the conversion at least at a first video unit level and a second video unit level.

Description

[0001] Cross-reference of related applications

[0002] Pursuant to applicable patent law and/or the rules of the Paris Convention, this application claims priority to and the benefit of International Patent Application No. PCT/CN2020/081198, filed on March 25, 2020. For all legal purposes, the entire disclosure of the aforementioned application is incorporated herein by reference as a part of the disclosure.

Technical Field

[0003] This patent document relates to image encoding and decoding as well as video encoding and decoding.

Background Technology

[0004] Digital video accounts for the largest share of bandwidth usage on the Internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, the bandwidth demand for digital video is expected to continue to grow.

Summary of the Invention

[0005] This document discloses techniques that can be used by video encoders and decoders to process a coded representation of video using control information useful for decoding the coded representation.

[0006] In one example aspect, a video processing method is disclosed. This method includes performing conversions between video and video bitstreams according to rules. These rules specify the use of a particular transformation mode for the conversion, at least at a first video unit level and a second video unit level.

[0007] In another example, a video processing method is disclosed. This method includes performing conversions between video blocks and video bitstreams according to rules. These rules specify syntax elements at the video unit level used to indicate the allowed set of transforms for the conversion.

[0008] In another example, a video processing method is disclosed. This method includes performing conversions between video blocks and a video bitstream according to rules. The rules specify the use of a particular transform mode for the conversion of the video blocks, determined based on a function associated with the energy of representative coefficients of one or more representative blocks of the video.

[0009] In another example, a video processing method is disclosed. The method includes: converting video blocks of a video to a codec representation of the video; determining, based on rules, whether a horizontally specific transform or a vertically specific transform is applied to the video block; and performing the transform based on the determination. The rules specify a relationship between the determination and representative coefficients among the decoded coefficients of one or more representative blocks of the video.

[0010] In another example, another video processing method is disclosed. This method includes: converting video blocks to a codec representation of the video; determining, based on rules, whether a horizontally specific transform or a vertically specific transform is applied to the video block; and performing the transformation based on that determination. The rules specify the relationship between the determination and the decoded luminance coefficients of the video block.

[0011] In another example, another video processing method is disclosed. This method includes: converting video blocks of a video to a codec representation of the video; determining, based on rules, whether a horizontally specific transform or a vertically specific transform is applied to the video block; and performing the transform based on that determination. The rules define the relationship between the determination and a value V, which is associated with decoded coefficients or representative coefficients of a representative block.

[0012] In another example, another video processing method is disclosed. This method includes determining that one or more syntax fields exist in the codec representation of a video, wherein the video contains one or more video blocks; and based on the one or more syntax fields, determining whether a horizontally specific transform or a vertically specific transform is enabled for the video blocks in the video.

[0013] In another example, another video processing method is disclosed. This method includes making a first determination regarding whether to enable the use of a specific transform for the conversion between video blocks and the codec representation of the video; making a second determination regarding whether to enable a zeroing operation during the conversion; and performing the conversion based on the first and second determinations.

[0014] In another example, another video processing method is disclosed. This method includes performing a conversion between video blocks and a codec representation of the video; wherein the video blocks are represented as codec blocks in the codec representation, wherein the non-zero coefficients of the codec blocks are restricted to one or more sub-regions; and wherein a specific transformation is applied to generate the codec blocks.

[0015] In yet another example, a video encoder apparatus is disclosed. The video encoder includes a processor configured to implement the methods described above.

[0016] In yet another example, a video decoder apparatus is disclosed. The video decoder includes a processor configured to implement the methods described above.

[0017] In yet another example, a computer-readable medium on which code is stored is disclosed. This code embodies one of the methods described herein in the form of processor-executable code.

[0018] These and other features are described in this document.

Attached Figure Description

[0019] Figure 1 shows a block diagram of an example video encoder.

[0020] Figure 2 shows an example of the 67 intra prediction modes.

[0021] Figure 3A shows an example of reference samples for wide-angle intra prediction.

[0022] Figure 3B shows another example of reference samples for wide-angle intra prediction.

[0023] Figure 4 shows the discontinuity problem that arises when the direction exceeds 45 degrees.

[0024] Figure 5A shows an example definition of the samples used by PDPC applied to the diagonal and adjacent angular intra modes.

[0025] Figure 5B shows another example definition of the samples used by PDPC applied to the diagonal and adjacent angular intra modes.

[0026] Figure 5C shows another example definition of the samples used by PDPC applied to the diagonal and adjacent angular intra modes.

[0027] Figure 5D shows yet another example definition of the samples used by PDPC applied to the diagonal and adjacent angular intra modes.

[0028] Figure 6 shows an example of the partitioning of 4×8 and 8×4 blocks.

[0029] Figure 7 shows an example of the partitioning of all blocks except 4×8, 8×4, and 4×4.

[0030] Figure 8 shows an example of the secondary transform in JEM.

[0031] Figure 9 shows an example of the reduced secondary transform LFNST.

[0032] Figure 10A shows an example of the forward reduced transform.

[0033] Figure 10B shows an example of the inverse reduced transform.

[0034] Figure 11 shows an example of the forward LFNST 8×8 process with a 16×48 matrix.

[0035] Figure 12 shows an example of scan positions 17 to 64 for non-zero elements.

[0036] Figure 13 shows examples of the sub-block transform modes SBT-V and SBT-H.

[0037] Figure 14A shows an example of scan region-based coefficient coding (SRCC).

[0038] Figure 14B shows another example of scan region-based coefficient coding (SRCC).

[0039] Figure 15A shows an example constraint of IST based on the position of non-zero coefficients.

[0040] Figure 15B shows another example constraint of IST based on the position of non-zero coefficients.

[0041] Figure 16A shows an example of a zero-out type TS coded block.

[0042] Figure 16B shows another example of a zero-out type TS coded block.

[0043] Figure 16C shows another example of a zero-out type TS coded block.

[0044] Figure 16D shows yet another example of a zero-out type TS coded block.

[0045] Figure 17 is a block diagram of an example video processing system.

[0046] Figure 18 is a block diagram illustrating a video encoding/decoding system according to some embodiments of the present disclosure.

[0047] Figure 19 is a block diagram illustrating an encoder according to some embodiments of the present disclosure.

[0048] Figure 20 is a block diagram illustrating a decoder according to some embodiments of the present disclosure.

[0049] Figure 21 is a block diagram of a video processing device.

[0050] Figure 22 is a flowchart of an example method for video processing.

[0051] Figure 23 is a flowchart representation of a video processing method based on this technology.

[0052] Figure 24 is a flowchart representation of another video processing method based on this technology.

[0053] Figure 25 is a flowchart representation of another video processing method based on this technology.

Detailed Implementation

[0054] The use of section headings in this document is for ease of understanding and does not limit the application of the technologies and embodiments disclosed in each section to that section only. Furthermore, the use of H.266 terminology in some specifications is merely for ease of understanding and not to limit the scope of the disclosed technologies. Thus, the technologies described herein are also applicable to other video codec protocols and designs.

[0055] 1. Overview

[0056] This document relates to video codec technology. Specifically, it concerns transform skip mode and transform types (e.g., the identity transform) in video coding. It can be applied to existing video codec standards (e.g., HEVC) or to the standard being finalized (Versatile Video Coding, VVC). It can also be applied to future video codec standards or video codecs.

[0057] 2. Preliminary Discussion

[0058] Video codec standards have primarily evolved through the development of the well-known ITU-T and ISO/IEC standards. ITU-T developed H.261 and H.263, while ISO/IEC developed MPEG-1 and MPEG-4 Visual. The two organizations jointly developed the H.262/MPEG-2 Video and H.264/MPEG-4 Advanced Video Coding (AVC) standards, as well as the H.265/HEVC standard. Since H.262, video codec standards have been based on a hybrid video coding structure that uses temporal prediction plus transform coding. To explore future video coding technologies beyond HEVC, VCEG and MPEG jointly founded the Joint Video Exploration Team (JVET) in 2015. Since then, JVET has adopted many new methods and incorporated them into reference software called the Joint Exploration Model (JEM). In April 2018, the Joint Video Experts Team (JVET) between VCEG (Q6/16) and ISO/IEC JTC1 SC29/WG11 (MPEG) was created to work on the VVC (Versatile Video Coding) standard, targeting a 50% bit rate reduction compared to HEVC.

[0059] 2.1. Encoding and decoding process of a typical video codec

[0060] Figure 1 shows an example of a VVC encoder block diagram, which contains three in-loop filtering blocks: the deblocking filter (DF), sample adaptive offset (SAO), and the adaptive loop filter (ALF). Unlike DF, which uses predefined filters, SAO and ALF use the original samples of the current picture to reduce the mean square error between the original and reconstructed samples, by adding an offset and by applying a finite impulse response (FIR) filter, respectively, with coded side information signaling the offsets and filter coefficients. ALF is the last processing stage for each picture and can be regarded as a tool that tries to catch and fix artifacts created by the previous stages.

[0061] 2.2. Intra-mode encoding and decoding with 67 intra-prediction modes

[0062] To capture arbitrary edge directions presented in natural video, the number of directional intra modes is extended from 33, as used in HEVC, to 65. The additional directional modes are depicted in Figure 2, and the planar and DC modes remain the same. These denser directional intra prediction modes apply to all block sizes and to both luma and chroma intra prediction.

[0063] Conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in the clockwise direction, as shown in Figure 2. In VTM2, for non-square blocks, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes. The replaced modes are signaled using the original method and remapped to the indices of the wide-angle modes after parsing. The total number of intra prediction modes is unchanged (i.e., 67), and the intra mode coding is unchanged.

[0064] In HEVC, every intra-coded block has a square shape, and the length of each side is a power of 2. Therefore, no division is needed to generate an intra predictor using DC mode. In VTM2, blocks can have a rectangular shape, which in the general case requires a division per block. To avoid division for DC prediction, only the longer side is used to compute the average for non-square blocks.
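The division-free DC average described above can be sketched as follows. This is a minimal illustration only, assuming the reference samples are given as plain lists whose lengths are powers of two; the function name `dc_predict` and the rounding offset are assumptions, not taken from this document.

```python
def dc_predict(top, left):
    """DC value for a W x H block from top (W samples) and left (H samples).

    Assumes len(top) and len(left) are powers of two, so the divisor
    is always a power of two and a right shift replaces the division.
    """
    w, h = len(top), len(left)
    if w == h:
        total, count = sum(top) + sum(left), w + h   # square: both sides
    elif w > h:
        total, count = sum(top), w                   # wide: longer (top) side only
    else:
        total, count = sum(left), h                  # tall: longer (left) side only
    shift = count.bit_length() - 1                   # log2(count)
    return (total + (count >> 1)) >> shift           # rounded average


# An 8x4 block uses only its 8 top samples.
print(dc_predict([10] * 8, [20] * 4))
```

Because the sample count is a power of two in every branch, the average reduces to an add and a shift.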

[0065] 2.3. Wide-angle intra prediction for non-square blocks

[0066] Conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in the clockwise direction. In VTM2, for non-square blocks, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes. The replaced modes are signaled using the original method and remapped to the indices of the wide-angle modes after parsing. The total number of intra prediction modes for a given block is unchanged (i.e., 67), and the intra mode coding is unchanged.

[0067] To support these prediction directions, a top reference of length 2W+1 and a left reference of length 2H+1 are defined, as shown in Figures 3A and 3B.

[0068] The number of modes replaced by wide-angle directional modes depends on the aspect ratio of the block. Table 1 shows the replaced intra prediction modes.

[0069] Table 1: Intra-prediction modes replaced by wide-angle mode

[0070]

[0071] As shown in Figure 4, in the case of wide-angle intra prediction, two vertically adjacent predicted samples may use two non-adjacent reference samples. Therefore, a low-pass reference sample filter and side smoothing are applied to wide-angle prediction to reduce the negative effect of the increased gap Δp_α.

[0072] 2.4. Position-dependent intra prediction combination

[0073] In VTM2, the results of intra prediction in planar mode are further modified by the position-dependent intra prediction combination (PDPC) method. PDPC is an intra prediction method that combines unfiltered boundary reference samples with HEVC-style intra prediction using filtered boundary reference samples. PDPC is applied to the following intra modes without signaling: planar, DC, horizontal, vertical, the bottom-left angular mode and its eight adjacent angular modes, and the top-right angular mode and its eight adjacent angular modes.

[0074] The prediction sample pred(x,y) is computed from a linear combination of the intra prediction (DC, planar, or angular) and the reference samples according to the following equation:

[0075] $$\mathrm{pred}(x,y) = \bigl(wL \times R_{-1,y} + wT \times R_{x,-1} - wTL \times R_{-1,-1} + (64 - wL - wT + wTL) \times \mathrm{pred}(x,y) + 32\bigr) \gg 6$$

[0076] where $R_{x,-1}$ and $R_{-1,y}$ denote the reference samples located above and to the left of the current sample (x,y), respectively, and $R_{-1,-1}$ denotes the reference sample located at the top-left corner of the current block.
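As a minimal sketch, the PDPC combination above can be evaluated for one sample with integer arithmetic. The function and parameter names are assumptions chosen for illustration; the weights are passed in rather than derived.

```python
def pdpc_sample(pred, r_left, r_top, r_topleft, wL, wT, wTL):
    """Apply the PDPC equation to one predicted sample.

    pred      -- intra prediction value at (x, y)
    r_left    -- reference sample R(-1, y)
    r_top     -- reference sample R(x, -1)
    r_topleft -- reference sample R(-1, -1)
    """
    return (wL * r_left + wT * r_top - wTL * r_topleft
            + (64 - wL - wT + wTL) * pred + 32) >> 6


# With all weights zero the prediction is returned unchanged.
print(pdpc_sample(100, 50, 60, 0, wL=0, wT=0, wTL=0))
# With wL = wT = 16 the references pull the value toward 50 and 60.
print(pdpc_sample(100, 50, 60, 0, wL=16, wT=16, wTL=0))
```

The four weights always sum to 64, so the final shift by 6 normalizes the result, and the added 32 provides rounding.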

[0077] If PDPC is applied to the DC, planar, horizontal, and vertical intra modes, no additional boundary filters are needed, such as the DC mode boundary filter or the horizontal/vertical mode edge filters required in HEVC.

[0078] Figures 5A to 5D illustrate the definition of the reference samples ($R_{x,-1}$, $R_{-1,y}$, and $R_{-1,-1}$) for PDPC applied to various prediction modes. The prediction sample pred(x',y') is located at (x',y') within the prediction block. The coordinate x of the reference sample $R_{x,-1}$ is given by x = x' + y' + 1, and the coordinate y of the reference sample $R_{-1,y}$ is similarly given by y = x' + y' + 1. Figure 5A shows the diagonal top-right mode. Figure 5B shows the diagonal bottom-left mode. Figure 5C shows the adjacent diagonal top-right mode. Figure 5D shows the adjacent diagonal bottom-left mode.

[0079] The PDPC weights depend on the prediction mode, as shown in Table 2.

[0080] Table 2: Examples of PDPC weights according to prediction mode

[0081]
Prediction mode | wT | wL | wTL
Diagonal top-right | 16>>((y’<<1)>>shift) | 16>>((x’<<1)>>shift) | 0
Diagonal bottom-left | 16>>((y’<<1)>>shift) | 16>>((x’<<1)>>shift) | 0
Adjacent diagonal top-right | 32>>((y’<<1)>>shift) | 0 | 0
Adjacent diagonal bottom-left | 0 | 32>>((x’<<1)>>shift) | 0
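The weight rules of Table 2 can be written as a small lookup. This is only a sketch: the mode names are hypothetical string labels, and `shift` is taken as an input because its derivation is not given in this excerpt.

```python
def pdpc_weights(mode, x, y, shift):
    """Return (wT, wL, wTL) for sample position (x, y) per Table 2.

    `mode` is a hypothetical label; `shift` must be supplied by the
    caller, since this excerpt does not define how it is derived.
    """
    if mode in ("diagonal_top_right", "diagonal_bottom_left"):
        return 16 >> ((y << 1) >> shift), 16 >> ((x << 1) >> shift), 0
    if mode == "adjacent_diagonal_top_right":
        return 32 >> ((y << 1) >> shift), 0, 0
    if mode == "adjacent_diagonal_bottom_left":
        return 0, 32 >> ((x << 1) >> shift), 0
    raise ValueError(f"unknown mode: {mode}")


# Weights decay as the sample moves away from the block boundary.
print(pdpc_weights("diagonal_top_right", x=2, y=3, shift=1))
```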

[0082] 2.5. Intra sub-partitions (ISP)

[0083] In some embodiments, ISP is proposed, which divides luma intra-predicted blocks vertically or horizontally into 2 or 4 sub-partitions depending on the block size, as shown in Table 3. Figures 6 and 7 show examples of the two possibilities. All sub-partitions satisfy the condition of having at least 16 samples.

[0084] Table 3: Number of sub-partitions depending on the block size

[0085]
Block size | Number of sub-partitions
4×4 | Not divided
4×8 and 8×4 | 2
All other cases | 4
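Table 3 maps directly onto a small helper. The sketch below also derives the sub-partition dimensions for a given split direction; the function names and the `horizontal_split` parameter are assumptions made for illustration.

```python
def isp_num_subpartitions(width, height):
    """Number of ISP sub-partitions for a luma block, per Table 3."""
    if (width, height) == (4, 4):
        return 1                      # not divided
    if (width, height) in ((4, 8), (8, 4)):
        return 2
    return 4


def isp_subpartition_sizes(width, height, horizontal_split):
    """List of (width, height) of each sub-partition for one split direction."""
    n = isp_num_subpartitions(width, height)
    if horizontal_split:
        return [(width, height // n)] * n   # stacked top to bottom
    return [(width // n, height)] * n       # placed left to right


print(isp_subpartition_sizes(8, 4, horizontal_split=False))
print(isp_subpartition_sizes(16, 16, horizontal_split=True))
```

For power-of-two block sizes from 4 to 64, every sub-partition produced this way contains at least 16 samples, which matches the condition stated above.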

[0086] For each of these sub-partitions, a residual signal is generated by entropy decoding the coefficients sent by the encoder, followed by inverse quantization and inverse transform. The sub-partition is then intra predicted, and the corresponding reconstructed samples are obtained by adding the residual signal to the prediction signal. The reconstructed values of each sub-partition are therefore available to generate the prediction of the next one, and the process repeats. All sub-partitions share the same intra mode.

[0087] Depending on the intra mode and the split used, two different classes of processing order are employed, referred to as normal order and reversed order. In normal order, the first sub-partition to be processed is the one containing the top-left sample of the CU, and processing then continues downwards (horizontal split) or rightwards (vertical split). As a result, the reference samples used to generate the sub-partition prediction signals are located only at the left and above sides of the lines. The reversed processing order, on the other hand, either starts with the sub-partition containing the bottom-left sample of the CU and continues upwards, or starts with the sub-partition containing the top-right sample of the CU and continues leftwards.

[0088] 2.6. Multiple Transform Selection (MTS)

[0089] In addition to DCT-II, which has been employed in HEVC, a Multiple Transform Selection (MTS) scheme is used for residual coding of both inter- and intra-coded blocks. It uses multiple transforms selected from DCT8 / DST7. The newly introduced transform matrices are DST-VII and DCT-VIII. Table 4 shows the basis functions of the selected DST / DCT.

[0090] Table 4: Transformation Types and Basis Functions

[0091]

[0092] There are two ways to enable MTS: explicit MTS and implicit MTS.

[0093] 2.6.1. Implicit MTS

[0094] Implicit MTS is a new tool in VVC.

[0095] Whether implicit MTS is enabled depends on the value of the variable implicitMtsEnabled. The derivation of the variable implicitMtsEnabled is as follows:

[0096] – If sps_mts_enabled_flag is equal to 1 and one or more of the following conditions are true, implicitMtsEnabled is set equal to 1:

[0097] – IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT (i.e., ISP is enabled).

[0098] – cu_sbt_flag is equal to 1 (i.e., SBT is enabled), and Max(nTbW, nTbH) is less than or equal to 32.

[0099] – sps_explicit_mts_intra_enabled_flag is equal to 0 (i.e., explicit MTS is disabled), CuPredMode[0][xTbY][yTbY] is equal to MODE_INTRA, lfnst_idx[x0][y0] is equal to 0, and intra_mip_flag[x0][y0] is equal to 0.

[0100] – Otherwise, implicitMtsEnabled is set equal to 0.
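The derivation of implicitMtsEnabled can be restated as a boolean function. This is a sketch under assumptions: the syntax elements are passed as plain arguments, and `isp_split` and `is_intra` are hypothetical stand-ins for "IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT" and "CuPredMode is equal to MODE_INTRA".

```python
def implicit_mts_enabled(sps_mts_enabled_flag, isp_split, cu_sbt_flag,
                         nTbW, nTbH, sps_explicit_mts_intra_enabled_flag,
                         is_intra, lfnst_idx, intra_mip_flag):
    """Return implicitMtsEnabled (0 or 1) following the listed conditions."""
    if sps_mts_enabled_flag != 1:
        return 0
    cond_isp = bool(isp_split)                                  # ISP is used
    cond_sbt = cu_sbt_flag == 1 and max(nTbW, nTbH) <= 32       # SBT is used
    cond_intra = (sps_explicit_mts_intra_enabled_flag == 0      # explicit MTS off
                  and is_intra and lfnst_idx == 0 and intra_mip_flag == 0)
    return int(cond_isp or cond_sbt or cond_intra)


# An intra block with explicit MTS disabled, no LFNST and no MIP.
print(implicit_mts_enabled(1, False, 0, 8, 8, 0, True, 0, 0))
```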

[0101] The derivation of the variable trTypeHor, which defines the horizontal transform kernel, and the variable trTypeVer, which defines the vertical transform kernel, is as follows:

[0102] – If one or more of the following conditions are true, trTypeHor and trTypeVer are set equal to 0 (i.e., DCT2):

[0103] – cIdx is greater than 0 (i.e., for a chroma component).

[0104] – IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT and lfnst_idx is not equal to 0.

[0105] Otherwise, if implicitMtsEnabled equals 1, the following applies:

[0106] – If cu_sbt_flag equals 1, then trTypeHor and trTypeVer are specified in Table 40 according to cu_sbt_horizontal_flag and cu_sbt_pos_flag.

[0107] Otherwise (cu_sbt_flag equals 0), the derivation of trTypeHor and trTypeVer is as follows:

[0108] trTypeHor=(nTbW>=4&&nTbW<=16)? 1:0 (1188)

[0109] trTypeVer=(nTbH>=4&&nTbH<=16)? 1:0 (1189)

[0110] Otherwise, trTypeHor and trTypeVer are specified in Table 39 according to mts_idx.

[0111] The derivation of variables nonZeroW and nonZeroH is as follows:

[0112] – If ApplyLfnstFlag is equal to 1, nTbW is greater than or equal to 4, and nTbH is greater than or equal to 4, the following applies:

[0113] nonZeroW=(nTbW==4||nTbH==4)? 4:8 (1190)

[0114] nonZeroH=(nTbW==4||nTbH==4)? 4:8 (1191)

[0115] Otherwise, the following applies:

[0116] nonZeroW=Min(nTbW,(trTypeHor>0)?16:32) (1192)

[0117] nonZeroH=Min(nTbH,(trTypeVer>0)?16:32) (1193)
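Equations (1188) to (1193) can be sketched together as two helpers. The first covers only the branch where implicitMtsEnabled is 1 and cu_sbt_flag is 0, because the tables used by the other branches are not reproduced in this excerpt. The function names and the `isp_split` parameter are assumptions.

```python
def implicit_tr_types(nTbW, nTbH, cIdx=0, isp_split=False, lfnst_idx=0):
    """(trTypeHor, trTypeVer) for implicitMtsEnabled == 1, cu_sbt_flag == 0."""
    if cIdx > 0 or (isp_split and lfnst_idx != 0):
        return 0, 0                                  # DCT2 in both directions
    trTypeHor = 1 if 4 <= nTbW <= 16 else 0          # (1188)
    trTypeVer = 1 if 4 <= nTbH <= 16 else 0          # (1189)
    return trTypeHor, trTypeVer


def non_zero_size(nTbW, nTbH, trTypeHor, trTypeVer, apply_lfnst_flag=0):
    """(nonZeroW, nonZeroH): size of the region that may hold non-zero coefficients."""
    if apply_lfnst_flag == 1 and nTbW >= 4 and nTbH >= 4:
        n = 4 if (nTbW == 4 or nTbH == 4) else 8     # (1190), (1191)
        return n, n
    return (min(nTbW, 16 if trTypeHor > 0 else 32),  # (1192)
            min(nTbH, 16 if trTypeVer > 0 else 32))  # (1193)


hor, ver = implicit_tr_types(8, 32)
print(hor, ver, non_zero_size(8, 32, hor, ver))
```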

[0118] 2.6.2. Explicit MTS

[0119] To control the MTS scheme, a flag is used to specify whether explicit MTS exists in the bitstream for intra / inter-frame use. Additionally, two separate enable flags are specified at the SPS level for intra and inter-frame use to indicate whether explicit MTS is enabled. When MTS is enabled at SPS, the CU-level transform index can be signaled to indicate whether MTS should be applied. Here, MTS applies only to luma. The MTS CU-level index (represented by mts_idx) is signaled when the following conditions are met.

[0120] – Both width and height are less than or equal to 32

[0121] – The luma CBF flag is equal to one

[0122] – Non-TS

[0123] – Non-ISP

[0124] – Non-SBT

[0125] – LFNST is disabled

[0126] – There exists a non-zero coefficient that is not at the DC position (top left of the block)

[0127] – There are no non-zero coefficients outside the top-left 16×16 region
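The signaling conditions listed above can be collected into one predicate. This is an illustrative sketch: the coefficient block is assumed to be a list of rows, and all parameter names are hypothetical.

```python
def mts_idx_is_signaled(coeffs, luma_cbf, transform_skip, isp, sbt, lfnst_idx):
    """True if all listed conditions for signaling mts_idx hold.

    coeffs -- quantized luma coefficients as a list of rows (height x width)
    """
    height, width = len(coeffs), len(coeffs[0])
    if width > 32 or height > 32:
        return False
    if luma_cbf != 1 or transform_skip or isp or sbt or lfnst_idx != 0:
        return False
    nonzero = [(r, c) for r in range(height) for c in range(width)
               if coeffs[r][c] != 0]
    has_non_dc = any(pos != (0, 0) for pos in nonzero)        # beyond DC
    inside_16x16 = all(r < 16 and c < 16 for r, c in nonzero)  # low-frequency only
    return has_non_dc and inside_16x16


# One non-DC coefficient in a 2x2 toy block.
print(mts_idx_is_signaled([[0, 1], [0, 0]], 1, False, False, False, 0))
```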

[0128] If the first bit of mts_idx is zero, DCT2 is applied in both directions. If the first bit of mts_idx is one, two more bits are additionally signaled to indicate the transform type in the horizontal and vertical directions, respectively. The transform and signaling mapping table is shown in Table 5. For transform matrix precision, 8-bit primary transform cores are used. Therefore, all transform cores used in HEVC are kept the same, including 4-point DCT-2 and DST-7, and 8-point, 16-point, and 32-point DCT-2. The other transform cores, including 64-point DCT-2, 4-point DCT-8, and 8-point, 16-point, and 32-point DST-7 and DCT-8, also use 8-bit primary transform cores.

[0129] Table 5: Signaling of MTS

[0130]

[0131] To reduce the complexity of large-sized DST-7 and DCT-8 blocks, the high-frequency transform coefficients are set to zero for DST-7 and DCT-8 blocks with a size (width or height, or both) equal to 32. Only the coefficients in the 16×16 low-frequency region are retained.

[0132] As in HEVC, the residual of a block can be coded with transform skip mode. To avoid redundancy in syntax coding, the transform skip flag is not signaled when the CU-level MTS_CU_flag is not equal to zero. The block size limitation for transform skip is the same as that for MTS in JEM4, which means that transform skip is applicable to a CU when both the block width and the block height are less than or equal to 32.

[0133] 2.6.3. Zeroing in MTS

[0134] In VTM8, large block size transforms up to 64×64 are enabled, primarily for higher resolution video such as 1080p and 4K sequences. For transform blocks with a size (width or height, or both) of 64 or more to which the DCT2 transform is applied, the high-frequency transform coefficients are zeroed out so that only the low-frequency coefficients are retained; all other coefficients are forced to zero without being signaled. For example, for an M×N transform block, where M is the block width and N is the block height, when M is not less than 64, only the left 32 columns of transform coefficients are retained. Similarly, when N is not less than 64, only the top 32 rows of transform coefficients are retained.

[0135] For transform blocks with a size (width or height, or both) of 32 or more to which the DCT8 or DST7 transform is applied, the high-frequency transform coefficients are zeroed out so that only the low-frequency coefficients are retained; all other coefficients are forced to zero without being signaled. For example, for an M×N transform block, where M is the block width and N is the block height, when M is not less than 32, only the left 16 columns of transform coefficients are retained. Similarly, when N is not less than 32, only the top 16 rows of transform coefficients are retained.
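The two zeroing rules can be sketched as a mask applied to a coefficient block. This is a minimal illustration assuming the block is a list of N rows of M values; the names `retained_size`, `zero_out`, and `is_dct2` are hypothetical.

```python
def retained_size(M, N, is_dct2):
    """(columns, rows) of low-frequency coefficients kept for an M x N block."""
    limit, keep = (64, 32) if is_dct2 else (32, 16)   # DCT2 vs. DST7/DCT8
    return (keep if M >= limit else M,
            keep if N >= limit else N)


def zero_out(block, is_dct2):
    """Return a copy of `block` with the high-frequency coefficients set to 0."""
    N, M = len(block), len(block[0])
    keep_w, keep_h = retained_size(M, N, is_dct2)
    return [[v if (r < keep_h and c < keep_w) else 0
             for c, v in enumerate(row)]
            for r, row in enumerate(block)]


# A 32x4 DST7/DCT8 block keeps only its left 16 columns.
out = zero_out([[1] * 32 for _ in range(4)], is_dct2=False)
print(sum(out[0]), len(out))
```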

[0136] 2.7. Low-frequency non-separable secondary transform (LFNST)

[0137] 2.7.1. JEM Non-Separable Secondary Transform (NSST)

[0138] In JEM, a secondary transform is applied between the forward primary transform and quantization (at the encoder) and between dequantization and the inverse primary transform (at the decoder). As shown in Figure 8, a 4×4 or 8×8 secondary transform is performed depending on the block size. For example, per 8×8 block, the 4×4 secondary transform is applied to small blocks (i.e., min(width, height) < 8), and the 8×8 secondary transform is applied to larger blocks (i.e., min(width, height) > 4).

[0139] The application of the non-separable transform is described below using a 4×4 input block as an example. To apply the non-separable transform, the 4×4 input block X

[0140] $$X = \begin{bmatrix} X_{00} & X_{01} & X_{02} & X_{03} \\ X_{10} & X_{11} & X_{12} & X_{13} \\ X_{20} & X_{21} & X_{22} & X_{23} \\ X_{30} & X_{31} & X_{32} & X_{33} \end{bmatrix}$$

[0141] is first represented as a vector $\vec{X}$:

[0142] $$\vec{X} = \begin{bmatrix} X_{00} & X_{01} & X_{02} & X_{03} & X_{10} & X_{11} & \cdots & X_{32} & X_{33} \end{bmatrix}^{T}$$

[0143] The non-separable transform is calculated as $\vec{F} = T \cdot \vec{X}$, where $\vec{F}$ denotes the transform coefficient vector and T is a 16×16 transform matrix. Subsequently, the 16×1 coefficient vector $\vec{F}$ is reorganized into a 4×4 block using the scan order (horizontal, vertical, or diagonal) of the block. Coefficients with smaller indices are placed at the smaller scan indices in the 4×4 coefficient block. There are a total of 35 transform sets, and each transform set uses 3 non-separable transform matrices (kernels). The mapping from the intra prediction mode to the transform set is predefined. For each transform set, the selected non-separable secondary transform candidate is further specified by an explicitly signaled secondary transform index. The index is signaled in the bitstream once per intra CU, after the transform coefficients.
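The three steps described above (flatten the 4×4 block row by row, multiply by a 16×16 matrix, place the result back along a scan order) can be sketched in plain Python. The identity matrix and the horizontal scan used here are placeholders for illustration only; they are not the JEM kernels or scan tables.

```python
# Horizontal scan: positions visited row by row (an illustrative choice).
HORIZONTAL_SCAN = [(r, c) for r in range(4) for c in range(4)]


def nsst_4x4(block, T, scan):
    """Apply a non-separable 16x16 transform T to a 4x4 block.

    block -- 4x4 list of rows
    T     -- 16x16 matrix as a list of rows (placeholder kernel here)
    scan  -- list of 16 (row, col) positions in scan order
    """
    x_vec = [block[r][c] for r in range(4) for c in range(4)]       # vectorize
    f_vec = [sum(T[i][j] * x_vec[j] for j in range(16))             # F = T * X
             for i in range(16)]
    out = [[0] * 4 for _ in range(4)]
    for idx, (r, c) in enumerate(scan):                             # re-block
        out[r][c] = f_vec[idx]
    return out


identity = [[1 if i == j else 0 for j in range(16)] for i in range(16)]
block = [[r * 4 + c for c in range(4)] for r in range(4)]
print(nsst_4x4(block, identity, HORIZONTAL_SCAN) == block)
```

With a real kernel, T would be one of the matrices of the selected transform set, and the scan would follow the scan order of the block.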

[0144] 2.7.2. Reduced Secondary Transform (LFNST)

[0145] In some embodiments, LFNST is introduced with a mapping to 4 transform sets (instead of 35 transform sets). In some implementations, 16×64 (which can be further reduced to 16×48) and 16×16 matrices are used for 8×8 and 4×4 blocks, respectively. For notational convenience, the 16×64 (or 16×48) transform is denoted LFNST8×8, and the 16×16 transform is denoted LFNST4×4. Figure 9 shows an example of LFNST.

[0146] LFNST calculation

[0147] The main idea of the reduced transform (RT) is to map an N-dimensional vector to an R-dimensional vector in a different space, where R / N (R < N) is the reduction factor.

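[0148] The RT matrix is an R×N matrix as follows:

[0149] $$T_{R \times N} = \begin{bmatrix} t_{11} & t_{12} & t_{13} & \cdots & t_{1N} \\ t_{21} & t_{22} & t_{23} & \cdots & t_{2N} \\ \vdots & & & \ddots & \vdots \\ t_{R1} & t_{R2} & t_{R3} & \cdots & t_{RN} \end{bmatrix}$$

[0150] Here, the R rows of the transform are R bases of the N-dimensional space. The inverse transform matrix for RT is the transpose of its forward transform. The forward and inverse RT are depicted in Figures 10A and 10B.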
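A toy example may help show how the forward reduced transform maps N values to R values and how the transpose maps them back. The 2×4 matrix below is a made-up orthonormal example, not an LFNST kernel, and the function names are assumptions.

```python
def forward_rt(T, x):
    """Forward reduced transform: y = T x, with T of size R x N."""
    return [sum(t * v for t, v in zip(row, x)) for row in T]


def inverse_rt(T, y):
    """Inverse reduced transform: x' = T^T y (the transpose of the forward matrix)."""
    n = len(T[0])
    return [sum(T[r][c] * y[r] for r in range(len(T))) for c in range(n)]


# R = 2, N = 4: the two rows are orthonormal basis vectors of 4-D space.
T = [[1, 0, 0, 0],
     [0, 1, 0, 0]]
y = forward_rt(T, [1, 2, 3, 4])
print(y, inverse_rt(T, y))
```

Because R < N, the inverse recovers only the component of the input that lies in the span of the R basis rows; the remaining components come back as zero.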

[0151] In this proposal, an RT with a reduction factor of 4 (1/4 size) is applied for LFNST 8×8. Hence, instead of 64×64, which is the conventional size of an 8×8 non-separable transform matrix, a 16×64 forward matrix is used. In other words, the 64×16 inverse LFNST matrix is used at the decoder side to generate the core (primary) transform coefficients in the top-left 8×8 region. The forward LFNST 8×8 uses 16×64 (or 8×64 for an 8×8 block) matrices so that it produces non-zero coefficients only in the top-left 4×4 region within the given 8×8 region. In other words, if LFNST is applied, the 8×8 region except the top-left 4×4 region has only zero coefficients. For LFNST 4×4, 16×16 (or 8×16 for a 4×4 block) direct matrix multiplication is applied.

[0152] The inverse LFNST is conditionally applied when the following two conditions are met:

[0153] a. Block size is greater than or equal to a given threshold (W>=4 && H>=4)

[0154] b. The transform skip mode flag is equal to zero.

[0155] If both the width (W) and height (H) of the transform coefficient block are greater than 4, then LFNST 8x8 is applied to the top-left 8×8 region of the transform coefficient block. Otherwise, LFNST 4x4 is applied to the top-left min(8,W)×min(8,H) region of the transform coefficient block.
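The conditions and the kernel-size rule above can be sketched as one function. The return format and the function name are assumptions for illustration; a return value of None stands for "inverse LFNST is not applied".

```python
def lfnst_selection(W, H, transform_skip_flag):
    """Pick the LFNST variant and the region it covers for a W x H coefficient block.

    Returns None when the two applicability conditions are not met,
    otherwise (name, (region_width, region_height)).
    """
    if not (W >= 4 and H >= 4) or transform_skip_flag != 0:
        return None
    if W > 4 and H > 4:
        return "LFNST8x8", (8, 8)
    return "LFNST4x4", (min(8, W), min(8, H))


print(lfnst_selection(16, 16, 0))
print(lfnst_selection(4, 16, 0))
```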

[0156] If the LFNST index is equal to 0, LFNST is not applied. Otherwise, LFNST is applied, and its kernel is selected according to the LFNST index. The LFNST selection method and the coding of the LFNST index are explained later.

[0157] In addition, LFNST is applied to intra CUs in both intra slices and inter slices, and to both luma and chroma. If dual tree is enabled, the LFNST indices for luma and chroma are signaled separately. For inter slices (where dual tree is disabled), a single LFNST index is signaled and used for both luma and chroma.

[0158] At the 13th JVET meeting, Intra Sub-Partitions (ISP) was adopted as a new intra prediction mode. When ISP mode is selected, LFNST is disabled and the LFNST index is not signaled, because the performance improvement is marginal even if LFNST is applied to every feasible partition block. Furthermore, disabling LFNST for the residual of ISP prediction reduces coding complexity.

[0159] LFNST selection

[0160] The LFNST matrix is selected from four transform sets, each consisting of two transforms. The transform set to apply is determined by the intra prediction mode, as follows:

[0161] 1) If one of the three CCLM modes is indicated, then select transform set 0.

[0162] 2) Otherwise, perform the transformation set selection according to Table 6.

[0163] Table 6: Transform Set Selection Table

[0164]

[0165]

[0166] The index used to access the table, denoted as IntraPredMode, has a range of [-14, 83]; it is a transformed mode index used for wide-angle intra prediction.

[0167] LFNST matrices of reduced dimension

[0168] For further simplification, 16×48 matrices are used instead of 16×64 matrices with the same transform set configuration. Each matrix takes 48 input samples from the three 4×4 blocks of the top-left 8×8 block that remain after excluding the bottom-right 4×4 block (Figure 11).
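The selection of the 48 inputs can be illustrated with a short sketch. The function name `gather_lfnst_inputs` is hypothetical, and the raster ordering of the gathered samples is an assumption; the text only states which samples are excluded.

```python
def gather_lfnst_inputs(block8x8):
    """Collect the 48 samples of an 8x8 block outside its bottom-right 4x4 quadrant.

    block8x8 is indexed as block8x8[y][x]. Raster order is assumed for illustration.
    """
    return [block8x8[y][x]
            for y in range(8)
            for x in range(8)
            if not (y >= 4 and x >= 4)]


# Label each sample with its raster position to make the selection visible.
block = [[y * 8 + x for x in range(8)] for y in range(8)]
inputs = gather_lfnst_inputs(block)
```

The result has 64 - 16 = 48 entries, which matches the 48 columns of a 16×48 matrix.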

[0169] LFNST signaling

[0170] The forward LFNST8×8 with R=16 uses a 16×64 matrix, so it produces non-zero coefficients only in the top-left 4×4 region of a given 8×8 region. In other words, if LFNST is applied, the 8×8 region has only zero coefficients outside the top-left 4×4 region. Therefore, when any non-zero element is detected within the 8×8 block region outside the top-left 4×4 region (as shown in Figure 12), the LFNST index is not coded, because this implies that LFNST was not applied. In this case, the LFNST index is inferred to be zero.
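The inference rule above can be sketched as follows. The names `lfnst_index_is_signaled` and `parse_lfnst_index`, and the `read_index` callable standing in for bitstream parsing, are hypothetical.

```python
def lfnst_index_is_signaled(region8x8):
    """True if all coefficients outside the top-left 4x4 of the 8x8 region are zero."""
    return all(region8x8[y][x] == 0
               for y in range(8)
               for x in range(8)
               if y >= 4 or x >= 4)


def parse_lfnst_index(region8x8, read_index):
    """Return the parsed index, or the inferred value 0 when the index is not coded."""
    if lfnst_index_is_signaled(region8x8):
        return read_index()  # hypothetical bitstream read
    return 0
```

A single non-zero coefficient outside the top-left 4×4 region is enough to skip parsing, so the decoder never reads an index that could not have been used.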

[0171] Zeroing range

[0172] Normally, any coefficient in a 4×4 subblock can be nonzero before applying the inverse LFNST to it. However, in some cases, there are constraints that some coefficients in the 4×4 subblock must be zero before applying the inverse LFNST to it.

[0173] Let nonZeroSize be a variable. After the coefficients are rearranged into a 1-D array before the inverse LFNST, any coefficient whose index is not less than nonZeroSize must be zero.

[0174] When nonZeroSize equals 16, the coefficients in the top left 4×4 sub-block have no zeroing constraint.

[0175] In some examples, nonZeroSize is set to 8 when the current block size is 4×4 or 8×8. For other block sizes, nonZeroSize is set to 16.
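A small sketch of the zeroing range follows. The function names are assumptions, and the input is taken to be the coefficients already rearranged into a 1-D array.

```python
def non_zero_size(width, height):
    """nonZeroSize per the example above: 8 for 4x4 and 8x8 blocks, 16 otherwise."""
    if (width, height) in ((4, 4), (8, 8)):
        return 8
    return 16


def satisfies_zeroing_constraint(coeffs_1d, width, height):
    """True if every coefficient at index >= nonZeroSize is zero."""
    limit = non_zero_size(width, height)
    return all(c == 0 for c in coeffs_1d[limit:])
```

With nonZeroSize equal to 16, the slice `coeffs_1d[16:]` of a 16-entry array is empty, so the check imposes no constraint on the top-left 4×4 sub-block, consistent with the text.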

[0176] 2.8. Affine linear weighted intra prediction (ALWIP, also known as matrix-based intra prediction)

[0177] In some embodiments, affine linear weighted intra prediction (ALWIP, also known as matrix-based intra prediction (MIP)) is used.

[0178] In some embodiments, two tests are performed. In Test 1, ALWIP is designed with an 8KB memory limit and a maximum of 4 multiplications per sample. Test 2 is similar to Test 1, but with a further simplified design in terms of memory requirements and model architecture.

[0179] • A single set of matrices and offset vectors for all block shapes.

[0180] • The number of modes for all block shapes is reduced to 19.

[0181] • Reduce memory requirements to 5760 10-bit values, or 7.20 kilobytes.

[0182] • Linear interpolation of the predicted samples is performed in a single step in each direction, instead of iterative interpolation in the first test.

[0183] 2.9. Sub-block Transform (SBT)

[0184] For an inter-predicted CU with cu_cbf equal to 1, cu_sbt_flag may be signaled to indicate whether the entire residual block or only a sub-part of the residual block is decoded. In the former case, the inter MTS information is further parsed to determine the transform type of the CU. In the latter case, a part of the residual block is coded with an inferred adaptive transform, and the other part of the residual block is zeroed out. SBT is not applied to the combined inter-intra mode.

[0185] In the sub-block transform, position-dependent transforms are applied to the luma transform blocks in SBT-V and SBT-H (chroma TBs always use DCT-2). The two positions of SBT-H and SBT-V are associated with different core transforms. More specifically, the horizontal and vertical transforms for each SBT position are specified in Figure 13. For example, the horizontal and vertical transforms for SBT-V position 0 are DCT-8 and DST-7, respectively. When one side of the residual TU is greater than 32, the corresponding transform is set to DCT-2. Therefore, the sub-block transform jointly specifies the TU tiling, cbf, and the horizontal and vertical transforms of the residual block, which may be considered a syntax shortcut for cases where the major residual of the block is on one side of the block.
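The position-dependent choice can be sketched as a lookup table. Only the SBT-V position 0 entry is stated in the text above; the other three entries are assumptions taken from the SBT design in VVC, since Figure 13 is not reproduced here. Applying the DCT-2 fallback per dimension is also an interpretation of "the corresponding transform".

```python
# (mode, position) -> (horizontal, vertical) luma transforms.
# Only ("SBT-V", 0) is given in the text; the rest are assumed from the VVC SBT design.
SBT_TRANSFORMS = {
    ("SBT-V", 0): ("DCT-8", "DST-7"),
    ("SBT-V", 1): ("DST-7", "DST-7"),
    ("SBT-H", 0): ("DST-7", "DCT-8"),
    ("SBT-H", 1): ("DST-7", "DST-7"),
}


def sbt_luma_transforms(mode, position, tu_width, tu_height):
    """Return (horizontal, vertical) transforms for a luma residual TU."""
    hor, ver = SBT_TRANSFORMS[(mode, position)]
    if tu_width > 32:   # side longer than 32: fall back to DCT-2 horizontally
        hor = "DCT-2"
    if tu_height > 32:  # side longer than 32: fall back to DCT-2 vertically
        ver = "DCT-2"
    return hor, ver
```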

[0186] 2.10. Scan Region-Based Coefficient Encoding / Decoding (SRCC)

[0187] SRCC has been adopted in AVS-3. With SRCC, the bottom-right position (SRx, SRy), as shown in Figures 14A to 14B, is signaled, and only the coefficients within the rectangle with four corners (0, 0), (SRx, 0), (0, SRy), and (SRx, SRy) are scanned and signaled. All coefficients outside the rectangle are zero.
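A minimal sketch of the scan region follows, assuming the encoder picks the smallest rectangle anchored at (0, 0) that covers all non-zero coefficients. The function names are hypothetical, and the scan order inside the rectangle is not modeled.

```python
def scan_region(coeffs):
    """Return (SRx, SRy) covering all non-zero coefficients, or None if the block is all zero.

    coeffs is indexed as coeffs[y][x].
    """
    xs = [x for row in coeffs for x, c in enumerate(row) if c != 0]
    ys = [y for y, row in enumerate(coeffs) if any(c != 0 for c in row)]
    if not xs:
        return None
    return max(xs), max(ys)


def coefficients_in_region(coeffs):
    """Return only the coefficients inside the rectangle (0, 0)..(SRx, SRy)."""
    region = scan_region(coeffs)
    if region is None:
        return []
    srx, sry = region
    return [row[:srx + 1] for row in coeffs[:sry + 1]]


block = [[5, 0, 0, 0],
         [0, 0, 3, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
```

For `block`, only a 3×2 rectangle needs to be scanned; everything outside it is known to be zero.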

[0188] 2.11. Implicit Selection of Transform (IST)

[0189] As disclosed in PCT / CN2019 / 090261 (incorporated herein by reference), an implicit selection of transform solution is given, in which the selection of the transform matrix (DCT2 for both the horizontal and vertical transforms, or DST7 for both) is determined by the parity of the number of non-zero coefficients in the transform block.

[0190] The proposed method is applied to the luma component of intra-coded blocks, excluding those coded with DT, and the allowed block sizes range from 4×4 to 32×32. The transform type is hidden in the transform coefficients. Specifically, the parity of the number of significant coefficients (e.g., non-zero coefficients) in a block indicates the transform type: an odd number indicates that DST-VII is applied, and an even number indicates that DCT-II is applied.
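The parity rule can be written in a few lines. The function name `ist_transform_type` is an assumption; the eligibility checks (luma, intra, non-DT, block size) are left out of this sketch.

```python
def ist_transform_type(coeffs):
    """Derive the transform type from the parity of the number of non-zero coefficients."""
    count = sum(1 for row in coeffs for c in row if c != 0)
    # Odd count: DST-VII. Even count (including zero): DCT-II.
    return "DST-VII" if count % 2 == 1 else "DCT-II"
```

Since no flag is transmitted, an encoder that prefers the other transform has to adjust its quantized coefficients so that the parity matches its choice.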

[0191] To eliminate the 32-point DST-7 introduced by IST, it is proposed to restrict the use of IST according to the range of the remaining scan region when SRCC is used. For example, as shown in Figures 15A to 15B, IST is not allowed when the x-coordinate or the y-coordinate of the bottom-right position of the remaining scan region is not less than 16. That is, in this case, DCT-II is applied directly.

[0192] In another case, when run-length coefficient coding is used, each non-zero coefficient needs to be checked. IST is not allowed when the x-coordinate or the y-coordinate of any non-zero coefficient position is not less than 16.
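Both restrictions can be sketched as simple predicates. The function names and the `limit` parameter are assumptions; the value 16 comes from the text.

```python
def ist_allowed_srcc(srx, sry, limit=16):
    """IST is allowed only if both bottom-right coordinates are less than the limit."""
    return srx < limit and sry < limit


def ist_allowed_run_length(nonzero_positions, limit=16):
    """IST is allowed only if every non-zero coefficient position (x, y) is below the limit."""
    return all(x < limit and y < limit for x, y in nonzero_positions)
```

When either predicate is false, DCT-II is applied directly and the coefficient parity carries no transform information.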

[0193] The corresponding syntax changes are indicated by bold, italic, and underlined text, as shown below:

[0194]

[0195]

[0196]

[0197] 3. Examples of technical problems solved by the disclosed technical solutions

[0198] The current designs of IST and MTS have the following problems:

[0199] 1. In VVC, the TS mode is signaled at the block level. However, while DCT2 and DST7 work well for residual blocks in camera-captured sequences, for screen-content video the transform skip (TS) mode is used more frequently than DST7. How to determine the use of the TS mode more efficiently needs further study.

[0200] 2. In VVC, the maximum allowed TS block size is set to 32×32. How to support large TS blocks requires further investigation.

[0201] 4. Example technologies and implementation examples

[0202] The items listed below should be considered as examples for explaining general concepts. These items should not be interpreted in a narrow way. Furthermore, these items can be combined in any way.

[0203] min(x, y) yields the smaller of x and y.

[0204] Implicit determination of transform skip mode / identity transform

[0205] It is proposed to determine whether to apply a horizontal and / or vertical identity transform (IT) (e.g., transform skip mode) to a current first block based on the decoded coefficients of one or more representative blocks. This method is called "implicit determination of IT". When both the horizontal and vertical transforms are IT, transform skip (TS) mode is applied to the current first block.

[0206] A “block” may be a transform unit (TU) / prediction unit (PU) / coding unit (CU) / transform block (TB) / prediction block (PB) / coding block (CB). A TU / PU / CU may include one or more color components: only the luma component under dual-tree partitioning when the color component currently being coded is luma; the two chroma components under dual-tree partitioning when the color component currently being coded is chroma; or all three color components in the single-tree case.

[0207] 1. The decoded coefficients may be associated with one or more representative blocks in the same color component as the current first block or in a different color component.

[0208] a. In one example, the representative block is the first block, and the decoded coefficients associated with the first block are used to determine the use of IT for the first block.

[0209] b. In one example, the determination of the use of IT for the first block may depend on the decoded coefficients of multiple blocks, including at least one block different from the first block.

[0210] i. In one example, multiple blocks may include the first block.

[0211] ii. In one example, multiple blocks may include one or more blocks that are adjacent to the first block.

[0212] iii. In one example, multiple blocks may include one or more blocks having the same block dimension as the first block.

[0213] iv. In one example, the multiple blocks may include the last N decoded blocks that precede the first block in decoding order and satisfy certain conditions (such as having the same prediction mode as the current block, e.g., all being intra-coded or IBC-coded, or having the same dimensions as the current block). N is an integer greater than 1.

[0214] v. In one example, multiple blocks may include one or more blocks that have a different color component than the first block.

[0215] 1) In one example, the first block can be in the luminance component. Multiple blocks can include blocks in the chrominance components (e.g., a second block in the Cb / B component and a third block in the Cr / R component).

[0216] a) In one example, the three blocks are in the same coding unit.

[0217] b) In addition, optionally, implicit MTS is applied only to the luma block and not to the chroma block.

[0218] 2) In one example, the first block in the first color component and those of the multiple blocks that are not in the first color component may be at corresponding or collocated positions in the picture.

[0219] 2. The decoded coefficients used to determine the use of IT are called representative coefficients.

[0220] a. In one example, the representative coefficients include only coefficients that are not equal to zero (referred to as significant coefficients).

[0221] b. In one example, the representative coefficients may be modified before being used to determine the use of IT.

[0222] i. For example, the representative coefficients may be clipped before being used to derive the transform.

[0223] ii. For example, the representative coefficients may be scaled before being used to derive the transform.

[0224] iii. For example, an offset may be added to the representative coefficients before they are used to derive the transform.

[0225] iv. For example, the representative coefficients may be filtered before being used to derive the transform.

[0226] v. For example, the coefficients or representative coefficients may be mapped to other values (e.g., via lookup tables or dequantization) before being used to derive the transform.

[0227] c. In one example, the representative coefficients are all the significant coefficients in the representative block.

[0228] d. Alternatively, the representative coefficients are a subset of the significant coefficients in the representative block.

[0229] i. In one example, the representative coefficients are those decoded significant coefficients that are odd.

[0230] 1) Alternatively, the representative coefficients are those decoded significant coefficients that are even.

[0231] ii. In one example, the representative coefficients are those decoded significant coefficients that are greater than or not less than a threshold.

[0232] 1) Alternatively, the representative coefficients are those decoded significant coefficients whose magnitude is greater than or not less than a threshold.

[0233] iii. In one example, the representative coefficients are those decoded significant coefficients that are less than or not greater than a threshold.

[0234] 1) Alternatively, the representative coefficients are those decoded significant coefficients whose magnitude is less than or not greater than a threshold.

[0235] iv. In one example, the representative coefficients are the first K (K>=1) decoded significant coefficients in decoding order.

[0236] v. In one example, the representative coefficients are the last K (K>=1) decoded significant coefficients in decoding order.

[0237] vi. In one example, the representative coefficients may be the coefficients at predefined positions within the block.

[0238] 1) In one example, the representative coefficients may include only one coefficient, located at coordinates (xPos, yPos) relative to the representative block. For example, xPos = yPos = 0.

[0239] 2) In one example, the representative coefficients may include only one coefficient, located at coordinates (xPos, yPos) relative to the representative block, where xPos and / or yPos satisfy the following conditions:

[0240] a) In one example, xPos is not greater than the threshold Tx (e.g., 31) and / or yPos is not greater than the threshold Ty (e.g., 31).

[0241] b) In one example, xPos is not less than the threshold Tx (e.g., 32) and / or yPos is not less than the threshold Ty (e.g., 32).

[0242] 3) For example, the location can depend on the dimensions of the block.

[0243] vii. In one example, representative coefficients can be those coefficients at predefined positions in the coefficient scan order.

[0244] e. Alternatively, the representative coefficients may also include zero coefficients.

[0245] f. Alternatively, the representative coefficients may be coefficients derived from the decoded coefficients, such as by limiting to a range, or by quantization.

[0246] g. In one example, the representative coefficients may be the coefficients preceding the last significant coefficient (the last significant coefficient itself may be included).
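Several of the selection options in item 2 can be sketched as small filters over a list of decoded coefficients given in decoding order. All function names are hypothetical, and each function corresponds to one listed alternative rather than to a single prescribed method.

```python
def significant(coeffs):
    """Item 2.a / 2.c: keep only the non-zero (significant) coefficients."""
    return [c for c in coeffs if c != 0]


def first_k(coeffs, k):
    """Item 2.d.iv: the first K (K >= 1) significant coefficients in decoding order."""
    return significant(coeffs)[:k]


def last_k(coeffs, k):
    """Item 2.d.v: the last K (K >= 1) significant coefficients in decoding order."""
    return significant(coeffs)[-k:]


def magnitude_at_least(coeffs, threshold):
    """Item 2.d.ii.1: significant coefficients whose magnitude is not less than a threshold."""
    return [c for c in significant(coeffs) if abs(c) >= threshold]


decoded = [7, 0, -3, 1, 0, 2]
```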

[0247] 3. The determination of whether to use IT for the first block may depend on the decoded luma coefficients of the first block.

[0248] a. In addition, optionally, the determined use of IT is applied only to the luma component of the first block, while DCT2 is always used for the chroma components of the first block.

[0249] b. Alternatively, the determined IT is applied to all color components of the first block. That is, the same transform matrix is applied to all color components of the first block.

[0250] 4. The determination of the use of IT may depend on a function of the representative coefficients, such as a function that takes the representative coefficients as input and outputs a value V.

[0251] a. In one example, V is derived as the number of representative coefficients.

[0252] i. Optionally, V is derived as the sum of representative coefficients.

[0253] 1) Optionally, V is derived as the sum of the levels (or absolute values) of the representative coefficients.

[0254] 2) Optionally, V can be derived as a level (or absolute value) of a representative coefficient (such as the last one).

[0255] 3) Optionally, V can be derived as the number of representative coefficients of even levels.

[0256] 4) Optionally, V can be derived as the number of representative coefficients of odd level.

[0257] 5) In addition, optionally, the sum can be limited to derive V.

[0258] ii. Alternatively, V is derived as the output of a function, where the function defines the residual energy distribution.

[0259] 1) In one example, the function returns the ratio of the sum of the absolute values of a subset of the representative coefficients to the sum of the absolute values of all representative coefficients.

[0260] 2) In one example, the function returns the ratio of the sum of the squared absolute values of a subset of the representative coefficients to the sum of the squared absolute values of all representative coefficients.

[0261] 3) In one example, the function returns whether the energy of the first K representative coefficients, multiplied by a scaling factor, is greater than the energy of the first M (M>K) representative coefficients or of all representative coefficients.

[0262] 4) In one example, the function returns whether the energy of the representative coefficients in a first sub-region of the representative block, multiplied by a scaling factor, is greater than the energy of the representative coefficients in a second sub-region that contains the first sub-region and is larger than the first sub-region.

[0263] a) Alternatively, the function returns whether the energy of the representative coefficients in a first sub-region of the representative block, multiplied by a scaling factor, is greater than the energy of the representative coefficients in a second sub-region that does not overlap with the first sub-region.

[0264] b) In one example, the first sub-region is the top-left M×N sub-region (e.g., M=N=1, DC only).

[0265] i. Alternatively, the first sub-region is the second sub-region excluding the top-left M×K sub-region (e.g., excluding DC).

[0266] c) In one example, the first sub-region is the top-left 4×4 sub-region.

[0267] 5) In the examples above, energy is defined as the sum of absolute values or the sum of squared values.

[0268] iii. Alternatively, V is derived as whether at least one representative coefficient is located outside a subregion of the representative block.

[0269] 1) In one example, a subregion is defined as the top left subregion of the representative block, for example, the top left quarter of the representative block.

[0270] b. In one example, the determination of the use of IT may depend on the parity of V.

[0271] i. For example, if V is even, then IT is used; but if V is odd, then IT is not used.

[0272] 1) Alternatively, if V is odd, IT is used; if V is even, IT is not used.

[0273] ii. In one example, if V is less than threshold T1, then IT is used; but if V is greater than threshold T2, then IT is not used.

[0274] 1) Optionally, if V is greater than threshold T1, then IT is used; if V is less than threshold T2, then IT is not used.

[0275] iii. For example, the threshold can depend on encoding / decoding information, such as block dimension and prediction mode.

[0276] iv. For example, the threshold can depend on QP.

[0277] c. In one example, the determination of the use of IT may depend on a combination of V and other coded information (e.g., prediction mode, slice type / picture type, block dimensions).
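The derivation of V and the resulting decision in item 4 can be sketched as follows. All names are hypothetical. The parity polarity is a parameter because the text lists alternatives, and the threshold rule returns `None` between T1 and T2 because the text leaves that range unspecified.

```python
def energy(values, squared=False):
    """Energy as the sum of absolute values, or the sum of squares when squared=True."""
    return sum(v * v if squared else abs(v) for v in values)


def v_count(rep_coeffs):
    """Item 4.a: V as the number of representative coefficients."""
    return len(rep_coeffs)


def v_sum_of_levels(rep_coeffs):
    """Item 4.a.i.1: V as the sum of the absolute levels."""
    return energy(rep_coeffs)


def use_it_by_parity(v, it_when_even=True):
    """Item 4.b.i: decide the use of IT from the parity of V."""
    return (v % 2 == 0) == it_when_even


def use_it_by_thresholds(v, t1, t2):
    """Item 4.b.ii: IT if V < T1, no IT if V > T2, undecided (None) otherwise."""
    if v < t1:
        return True
    if v > t2:
        return False
    return None


def dc_dominates(block, scale, squared=False):
    """Item 4.a.ii.4 with M = N = 1: is scale * energy(DC) greater than the block energy?"""
    dc_energy = energy([block[0][0]], squared)
    total_energy = energy([c for row in block for c in row], squared)
    return scale * dc_energy > total_energy
```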

[0278] 5. The determination of the use of IT can further depend on the encoding and decoding information of the current block.

[0279] a. In one example, the determination may also depend on mode information (e.g., inter-frame, intra-frame, or IBC).

[0280] b. In one example, the transformation determination may depend on the scan area, which is the smallest rectangle covering all valid coefficients (e.g., as depicted in Figure 14).

[0281] i. In one example, if the size of the scan region associated with the current block (e.g., width multiplied by height) is greater than a given threshold, a default transformation (such as DCT-2) can be used, including both horizontal and vertical transformations. Otherwise, rules such as those defined in bullet point 3 can be used (e.g., IT when V is even and DCT-2 when V is odd).

[0282] ii. In one example, if the width of the scan region associated with the current block is greater than (or less than) a given maximum width (e.g., 16), then a default horizontal transformation (such as DCT-2) can be used. Otherwise, rules such as those defined in bullet point 3 can be used.

[0283] iii. In one example, if the height of the scan region associated with the current block is greater than (or less than) a given maximum height (e.g., 16), a default vertical transformation (such as DCT-2) can be utilized. Otherwise, rules such as those defined in bullet point 3 can be used.

[0284] iv. In one example, the given dimensions are L×K, where L and K are integers, such as 16.

[0285] v. In one example, the default transformation matrix can be either DCT-2 or DST-7.
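Items 5.b.ii and 5.b.iii can be combined into a per-direction sketch. The function name `choose_transforms` is hypothetical, DCT-2 is assumed as the default transform, and the fallback rule (IT when V is even, DCT-2 when V is odd) follows the example given in item 5.b.i.

```python
def choose_transforms(sr_width, sr_height, v, max_width=16, max_height=16):
    """Return (horizontal, vertical) transforms given the scan region size and V."""
    implicit = "IT" if v % 2 == 0 else "DCT-2"
    # A scan region wider than the maximum forces the default horizontal transform.
    hor = "DCT-2" if sr_width > max_width else implicit
    # A scan region taller than the maximum forces the default vertical transform.
    ver = "DCT-2" if sr_height > max_height else implicit
    return hor, ver
```

Under these assumptions, a block with a wide but short scan region could end up with DCT-2 horizontally and IT vertically.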

[0286] 6. One or more of the methods disclosed in bullet points 1 through 5 can only be applied to a specific block.

[0287] a. For example, one or more of the methods disclosed in bullets 1 to 5 may be applied only to IBC-coded blocks and / or intra-coded blocks other than those coded with DT.

[0288] b. For example, one or more of the methods disclosed in bullet points 1 through 5 can only be applied to blocks with specific constraints on the coefficients.

[0289] i. A rectangle with four corners (0, 0), (CRx, 0), (0, CRy), and (CRx, CRy) is defined as a constrained rectangle, as in the SRCC method, for example. In one example, one or more of the methods disclosed in bullets 1 through 5 may be applied only if all coefficients outside the constrained rectangle are zero. For example, CRx = CRy = 16.

[0290] 1) For example, CRx = SRx and CRy = SRy, where (SRx, SRy) is defined in SRCC as described in Section 2.10.

[0291] 2) Alternatively, the above method may be applied only when the block width or block height is greater than K.

[0292] a) In one example, K equals 16.

[0293] b) In one example, the above method is applied only when the block width is greater than K1 and K1 equals CRx; or when the block height is greater than K2 and K2 equals CRy.

[0294]-[0295] ii. One or more of the methods may be applied only when the last non-zero coefficient (in forward scan order) satisfies certain conditions, for example, when its horizontal / vertical coordinate is not greater than a threshold (e.g., 16 / 32).

[0296] 7. When it is determined that IT is not to be used, default transformations such as DCT-2 or DST-7 can be used instead.

[0297] a. Optionally, when it is determined that IT is not to be used, one can choose from several default transformations such as DCT-2 or DST-7.

[0298] 8. To determine, from a transform set, the transform to be applied to a block coded with prediction mode A, representative coefficients (e.g., bullet 2) from one or more representative blocks (e.g., bullet 1) are used; the transform set may depend on prediction mode A and / or one or more syntax elements and / or other coded information (e.g., the use of DT, the block dimensions).

[0299] a. In one example, the transform set is {DCT2}.

[0300] b. In one example, the transformation set is {IT}.

[0301] c. In one example, the transform set includes {DCT2, IT}.

[0302] d. In one example, the transform set includes {DCT2, DST7}.

[0303] e. In one example, DCT2 is used when the number of representative coefficients is even. Otherwise (when the number of representative coefficients is odd), DST7 or IT is used, which can be determined by prediction mode A and / or one or more syntax elements and / or other codec information.

[0304] i. In one example, if prediction mode A is IBC, IT is always used when the number of representative coefficients is odd.

[0305] ii. In one example, if prediction mode A is intra-frame and DT is applied, DCT2 is always used when the number of representative coefficients is odd.

[0306] iii. In one example, if prediction mode A is intra-frame and DT is not applied, and one or more syntax elements indicate the implicit determination of IT or the implicit determination of transform skip mode, then IT is always used when the number of representative coefficients is odd.
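Item 8.e and its sub-items can be read as a small decision function. The function name, the mode strings, and the `implicit_it_enabled` parameter (standing in for the syntax elements) are assumptions. Two branches are not covered by the text and are marked as such in the comments.

```python
def select_transform(pred_mode, uses_dt, implicit_it_enabled, num_rep_coeffs):
    """Pick a transform from the parity of the representative-coefficient count."""
    if num_rep_coeffs % 2 == 0:
        return "DCT2"                      # even count: DCT2 (item 8.e)
    if pred_mode == "IBC":
        return "IT"                        # item 8.e.i
    if pred_mode == "INTRA":
        if uses_dt:
            return "DCT2"                  # item 8.e.ii
        if implicit_it_enabled:
            return "IT"                    # item 8.e.iii
        return "DST7"                      # assumption: IST-like behavior when IT is not enabled
    return "DCT2"                          # assumption: other modes are not covered by the text
```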

[0307] 9. Whether and / or how to apply the methods disclosed above may be signaled at the video region level (such as sequence level / picture level / slice level / tile group level / tile level / subpicture level).

[0308] a. In one example, the indication (e.g., a flag) may be signaled in the sequence header / picture header / SPS / VPS / DCI / DPS / PPS / APS / slice header / tile group header.

[0309] i. Additionally, alternatively, one or more syntax elements (e.g., one or more flags) may be signaled to specify whether the implicit determination of IT, or the implicit determination of transform skip mode, is enabled.

[0310] 1) In one example, a first flag may be signaled to control the use of the implicit determination of IT for IBC-coded blocks at the video region level.

[0311] a) Additionally, the flag may be signaled conditionally, depending on whether IBC is enabled.

[0312] b) Alternatively, whether the implicit determination of IT is enabled for IBC-coded blocks may be controlled by the same flag that controls the use of the implicit selection of transform (IST) mode for intra-coded blocks excluding the derived tree (DT) mode.

[0313] 2) In one example, a second flag may be signaled to control the use of the implicit determination of IT for intra-coded blocks at the video region level (e.g., blocks with derived tree (DT) mode may be excluded).

[0314] 3) In one example, a third flag may be signaled to control the use of the implicit determination of IT for inter-coded blocks at the video region level.

[0315] 4) In one example, a flag may be signaled to control the use of the implicit determination of IT for intra-coded blocks (e.g., blocks with DT mode may be excluded) and inter-coded blocks at the video region level.

[0316] 5) In one example, a flag may be signaled to control the use of the implicit determination of IT for IBC-coded blocks and intra-coded blocks (e.g., blocks with DT mode may be excluded) at the video region level.

[0317] ii. Additionally, optionally, when an implicit determination method for the video region is enabled (e.g., a flag is set to true), the following can be further applied:

[0318] 1) In one example, for IBC codec blocks, if IT is used for the block, TS mode is applied; otherwise, DCT2 is used.

[0319] 2) In one example, for intra-frame encoded blocks (e.g., blocks with DT mode can be excluded), if IT is used for the block, then TS mode is applied; otherwise, DCT2 is used.

[0320] iii. Additionally, alternatively, when the implicit determination of IT is disabled for the video region (e.g., the flag is false), the following may further apply:

[0321] 1) In one example, DCT-2 is used for IBC and / or encoded / decoded blocks.

[0322] 2) In one example, for intra-frame encoded blocks (e.g., excluding blocks with DT mode), DCT-2 or DST-7 can be determined on the fly, such as by IST.

[0323] b. Multi-level control of enabling / applying the IT method may be signaled at multiple video unit levels (e.g., sequence level, picture level, slice level).

[0324] i. In one example, the first video unit level is defined as the sequence level.

[0325] 1) Optionally, in addition, a first syntax element (e.g., a flag) is signaled in the sequence header / SPS to indicate the use of IT.

[0326] a) In one example, a first syntax element equal to 0 indicates that the implicitly selected transform skip (ISTS) method cannot be used in the sequence. Otherwise (a first syntax element equal to 1) indicates that the implicitly selected transform skip (ISTS) method can be used in the sequence.

[0327] b) Alternatively, the first syntax element may be conditionally signaled, for example, based on implicit selection of transform (IST) being enabled (e.g., ist_enable_flag equal to 1).

[0328] c) Alternatively, in addition, a default value is inferred when the first syntax element is not present. For example, IT is inferred to be disabled for the first video unit level.

[0329] ii. In one example, the second video unit level is defined as the picture level / slice level.

[0330] 1) Optionally, in addition, a second syntax element (e.g., a flag) is signaled in the picture header (e.g., intra picture header and / or inter picture header) / slice header to indicate the use of IT.

[0331] a) Optionally, in addition, the second syntax element may be conditionally signaled according to conditions such as "implicit selection of transform (IST) is enabled" (e.g., ist_enable_flag equal to 1) and / or "the IT method is enabled at the first video unit level (e.g., sequence)".

[0332] b) Alternatively, in addition, a default value is inferred when the second syntax element is not present. For example, for the second video unit level, IT is inferred to be disabled.

[0333] c. A syntax element indicating the allowed transform set may be signaled at a video unit level (e.g., picture level), and the syntax element may be conditionally signaled based on whether implicit selection of transform (IST) is enabled.

[0334] i. In one example, the syntax element (e.g., a flag or an index) is signaled in the picture header (e.g., an intra picture header and / or an inter picture header) / slice header.

[0335] 1) Optionally, in addition, N different allowed transformation sets are supported, the selection of which depends on the syntax element.

[0336] a) In one example, N is set to 2.

[0337] b) In one example, the choice of transform set may depend on block information, such as the coding mode of the CU, or whether the CU is coded in derived tree (DT) mode.

[0338] i. For example, DCT2 is always used for CUs with DT mode.

[0339] ii. For example, DCT2 is always used for chroma blocks.

[0340] c) In one example, the two sets are {DCT2, DST7} and {DCT2, IT}.

[0341] i. Optionally, these two sets are used for inter-coded blocks.

[0342] ii. Alternatively, these two sets are used for inter-coded blocks that are not coded with DT.

[0343] d) In one example, the two sets are {DCT2} and {DCT2, IT}.

[0344] i. Optionally, these two sets are used for intra block copy (IBC) coded blocks.

[0345] ii. Alternatively, these two sets are used for intra block copy (IBC) coded blocks that are not coded with DT.

[0346] 2) Optionally, and furthermore, when the syntax element is not present, a default value is inferred. For example, the allowed transform set is inferred to allow only one transform type (e.g., DCT2).

[0347] ii. In one example, the syntax element (e.g., a flag or index) is used to control the selection of transforms from the allowed transform set for blocks with a specific coding mode.

[0348] 1) In one example, this syntax element controls the selection of transforms from the allowed transform set for intra-coded blocks, excluding blocks to which DT is applied and blocks coded in pulse code modulation (PCM) mode.

[0349] 2) Alternatively, for blocks with other coding modes, the allowed transform set may be independent of the syntax element.

[0350] a) In one example, for an inter-coded block, the allowed transform set is {DCT2}.

[0351] b) In one example, for an IBC-coded block, the allowed transform set is {DCT2} or {DCT2, IT}, which may depend on, for example, whether IST is enabled for the current picture or whether IBC is enabled.
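The set selection described in item 9.c can be sketched as follows. This is only an illustrative Python sketch under stated assumptions: N = 2 with the sets {DCT2, DST7} and {DCT2, IT} from [0340], a flag value of 1 selecting {DCT2, IT} (the mapping of flag values is an assumption), and the IBC set depending only on whether IST is enabled. The function name `allowed_transform_set` and its parameters are hypothetical and are not taken from any specification. PCM blocks are left out for brevity.

```python
from typing import FrozenSet, Optional

DCT2_ONLY = frozenset({"DCT2"})
DCT2_DST7 = frozenset({"DCT2", "DST7"})
DCT2_IT = frozenset({"DCT2", "IT"})


def allowed_transform_set(mode: str,
                          set_flag: Optional[int],
                          ist_enabled: bool,
                          uses_dt: bool = False) -> FrozenSet[str]:
    """Return the allowed transform set for one block (hypothetical helper)."""
    # Syntax element absent: infer a default that allows one transform type ([0346]).
    if set_flag is None:
        return DCT2_ONLY
    if mode == "intra":
        # DCT2 is always used for CUs with DT mode ([0338]).
        if uses_dt:
            return DCT2_ONLY
        # Assumed mapping: flag 1 selects {DCT2, IT}, flag 0 selects {DCT2, DST7}.
        return DCT2_IT if set_flag == 1 else DCT2_DST7
    if mode == "ibc":
        # One option from [0351]: the set depends on whether IST is enabled.
        return DCT2_IT if ist_enabled else DCT2_ONLY
    # Inter-coded blocks: the set is independent of the syntax element ([0350]).
    return DCT2_ONLY
```

For instance, `allowed_transform_set("inter", 1, True)` yields {DCT2} regardless of the flag, which mirrors the example in [0350].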

[0352] 10. At the video region level, such as sequence level / picture level / slice level / tile group level / tile level / subpicture level, an indication of whether zeroing is applied to transform blocks (including identity transforms) is signaled.

[0353] a. In one example, the indication (e.g., a flag) can be signaled in the sequence header / picture header / SPS / VPS / DCI / DPS / PPS / APS / slice header / tile group header.

[0354] b. In one example, when the indication specifies that zeroing is enabled, only IT transforms are allowed.

[0355] c. In one example, when the indication specifies that zeroing is disabled, only non-IT transforms are allowed.

[0356] d. Additionally, optionally, the binarization / context modeling / allowed range of the last significant coefficient / bottom-right position (e.g., the maximum X / Y coordinates relative to the top-left position of the block) in SRCC can depend on this indication.

[0357] 11. A first rule (e.g., in items 1 to 7 above) can be used to determine the use of IT in a first block, and a second rule can be used to determine the transform type among those excluding IT.

[0358] a. In one example, the first rule can be defined as the residual energy distribution.

[0359] b. In one example, the second rule can be defined as the parity of the representative coefficients.

[0360] Transformation skip

[0361] 12. Zeroing is applied to IT (e.g., TS) coded blocks, so that non-zero coefficients are restricted to a specific sub-region of the block.

[0362] a. In one example, the zeroing range of an IT (e.g., TS) coded block is set to the upper-right K*L sub-region of the block, where K is set to min(T1, W) and L is set to min(T2, H), W and H are the block width and block height, respectively, and T1 and T2 are two thresholds.

[0363] i. In one example, T1 and / or T2 can be set to 32 or 16.

[0364] ii. Alternatively, the last non-zero coefficient should be located within the K*L subregion.

[0365]-[0366] iii. Additionally, optionally, the lower-right position (SRx, SRy) in the SRCC method should be located within the K*L sub-region.
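The region computation in item 12 can be illustrated with a short Python sketch. It assumes coefficients are stored as a list of rows, that the K*L sub-region sits in the upper-right corner as stated in [0362], and that T1 = T2 = 32 by default (one of the values given in [0363]). The function names are hypothetical.

```python
from typing import List, Tuple


def zero_out_size(width: int, height: int, t1: int = 32, t2: int = 32) -> Tuple[int, int]:
    """Return (K, L) with K = min(T1, W) and L = min(T2, H)."""
    return min(t1, width), min(t2, height)


def nonzero_confined_to_upper_right(coeffs: List[List[int]],
                                    t1: int = 32, t2: int = 32) -> bool:
    """Check that every non-zero coefficient lies in the upper-right K*L sub-region."""
    height = len(coeffs)
    width = len(coeffs[0])
    k, l = zero_out_size(width, height, t1, t2)
    for y, row in enumerate(coeffs):
        for x, c in enumerate(row):
            # Upper-right region: the last K columns and the first L rows.
            inside = (x >= width - k) and (y < l)
            if c != 0 and not inside:
                return False
    return True
```

A conforming bitstream under this example would satisfy `nonzero_confined_to_upper_right` for every IT-coded block, so a decoder could use the check to detect violations.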

[0367] 13. Multiple zeroing types are defined for IT (e.g., TS) coded blocks, where each type corresponds to a sub-region of the block and non-zero coefficients exist only in that sub-region.

[0368] a. In one example, non-zero coefficients exist only in the top-left K0*L0 subregion of the block.

[0369] b. In one example, non-zero coefficients exist only in the upper right K1*L1 subregion of the block.

[0370] i. Alternatively, signaling may be used to indicate the lower left position of a sub-region with a non-zero coefficient.

[0371] c. In one example, non-zero coefficients exist only in the lower left K2*L2 subregion of the block.

[0372] i. Alternatively, signaling may be used to indicate the upper right position of a sub-region with a non-zero coefficient.

[0373] d. In one example, non-zero coefficients exist only in the lower right K3*L3 subregion of the block.

[0374] i. Alternatively, signaling may be used to indicate the upper left position of a sub-region with a non-zero coefficient.

[0375] e. In addition, alternatively, the indication of the zeroing type for IT may be explicitly signaled or derived on the fly.
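The four zeroing types of item 13 can be sketched as a masking operation. The Python below is only an illustration: the corner names and the function `apply_zeroing` are hypothetical, coefficients are assumed to be stored as a list of rows, and K and L are assumed not to exceed the block width and height.

```python
from typing import List

CORNERS = ("top_left", "top_right", "bottom_left", "bottom_right")


def apply_zeroing(coeffs: List[List[int]], corner: str, k: int, l: int) -> List[List[int]]:
    """Keep coefficients inside the K*L sub-region at the given corner; zero the rest."""
    if corner not in CORNERS:
        raise ValueError(f"unknown zeroing type: {corner}")
    height = len(coeffs)
    width = len(coeffs[0])
    # Origin of the retained sub-region, derived from the corner name.
    x0 = 0 if corner.endswith("left") else width - k
    y0 = 0 if corner.startswith("top") else height - l
    return [
        [c if (x0 <= x < x0 + k and y0 <= y < y0 + l) else 0
         for x, c in enumerate(row)]
        for y, row in enumerate(coeffs)
    ]
```

With a 3×3 block `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]`, K = 2 and L = 1, the `"bottom_right"` type keeps only the values 8 and 9.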

[0376] 14. When at least one significant coefficient is outside the zeroing region defined for IT (e.g., TS), such as outside the top-left K0*L0 sub-region of the block, IT (e.g., TS) is not used in the block.

[0377] a. Alternatively, in this case, a default transform is inferred to be used.

[0378] 15. IT (e.g., TS) is used in a block when at least one significant coefficient is outside the zeroing region defined for another transform matrix (e.g., DST7 / DCT2 / DCT8), such as outside the top-left K0*L0 sub-region of the block.

[0379] a. Alternatively, in this case, the TS mode is inferred to be used.
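Items 14 and 15 describe two separate inference rules based on where significant coefficients fall. The following Python sketch shows each rule on its own; the two are alternatives and are not meant to be combined in one decoder. The sketch assumes a top-left K0*L0 zeroing region and treats any non-zero value as a significant coefficient. All function names are hypothetical.

```python
from typing import List


def has_coeff_outside_top_left(coeffs: List[List[int]], k0: int, l0: int) -> bool:
    """True if any non-zero coefficient lies outside the top-left K0*L0 sub-region."""
    return any(
        c != 0 and (x >= k0 or y >= l0)
        for y, row in enumerate(coeffs)
        for x, c in enumerate(row)
    )


def it_allowed(coeffs: List[List[int]], k0: int, l0: int) -> bool:
    """Item 14: IT is ruled out when a coefficient lies outside the IT zeroing region."""
    return not has_coeff_outside_top_left(coeffs, k0, l0)


def ts_inferred(coeffs: List[List[int]], k0: int, l0: int) -> bool:
    """Item 15: TS is inferred when a coefficient lies outside the zeroing region
    of another transform (e.g., DST7), given here by K0 and L0."""
    return has_coeff_outside_top_left(coeffs, k0, l0)
```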

[0380] Figures 16A to 16D show the various zeroing types of TS coded blocks. Figure 16A shows the top-left K0*L0 sub-region. Figure 16B shows the upper-right K1*L1 sub-region. Figure 16C shows the lower-left K2*L2 sub-region. Figure 16D shows the lower-right K3*L3 sub-region.

[0381] General

[0382] 16. The transformation matrix can be determined at the CU / CB level or the TU level.

[0383] a. In one example, the decision is made at the CU level, where all TUs share the same transformation matrix.

[0384] i. Alternatively, when a CU is divided into multiple TUs, the coefficients in one TU (e.g., the first TU or the last TU) or some or all of the TUs can be used to determine the transformation matrix.

[0385] b. Whether to use a CU-level solution or a TU-level solution may depend on the block size and / or VPDU size and / or maximum CTU size and / or encoding / decoding information of a block.

[0386] i. In one example, when the block size is larger than the VPDU size, the CU level determination method can be applied.
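The choice between a CU-level and a TU-level determination in item 16 can be sketched as below. This Python sketch assumes a VPDU size of 64, treats a block as "larger than the VPDU" when either dimension exceeds it, and falls back to the TU level otherwise; all three choices are assumptions, as are the function names.

```python
from typing import List, Sequence


def decision_level(block_w: int, block_h: int, vpdu_size: int = 64) -> str:
    """Return "CU" when the block exceeds the VPDU size ([0386]), otherwise "TU"."""
    if block_w > vpdu_size or block_h > vpdu_size:
        return "CU"
    return "TU"


def representative_coeffs(tus: Sequence[List[int]], which: str = "first") -> List[int]:
    """Pick the coefficients used for a CU-level decision when a CU has several TUs ([0384])."""
    if which == "first":
        return list(tus[0])
    if which == "last":
        return list(tus[-1])
    if which == "all":
        return [c for tu in tus for c in tu]
    raise ValueError(f"unknown selection: {which}")
```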

[0387] 17. Whether and / or how the disclosed methods are applied may depend on encoding / decoding information, which may include:

[0388] a. Block dimension.

[0389] i. In one example, the above method can be applied to blocks whose width and / or height are no greater than a threshold (e.g., 32).

[0390] ii. In one example, the above method can be applied to blocks whose width and / or height are not less than a threshold (e.g., 4).

[0391] iii. In one example, the above method can be applied to blocks whose width and / or height are less than a threshold (e.g., 64).

[0392] b. QP

[0393] c. Picture or slice type (such as I-frame or P / B-frame, I-slice or P / B-slice)

[0394] i. In one example, the proposed method can be enabled for I-frames but disabled for P / B frames.

[0395] d. Partitioning structure (single tree or dual tree)

[0396] i. In one example, the above method can be applied to slices / pictures / bricks / tiles for which single-tree partitioning is applied.

[0397] e. Coding modes (such as inter mode / intra mode / IBC mode, etc.).

[0398] i. In one example, the above method can be applied to intra-coded blocks.

[0399] ii. In one example, the above method can be applied to intra-coded blocks, excluding blocks to which DT is applied and blocks coded in PCM mode.

[0400] iii. In one example, the above method can be applied to intra-coded blocks, excluding blocks to which DT is applied and blocks coded in PCM mode, and to IBC-coded blocks.

[0401] f. Coding methods (such as intra sub-partitions (ISP), derived tree (DT) methods, etc.).

[0402] i. In one example, the above method can be disabled for intra-coded blocks to which DT is applied.

[0403] ii. In one example, the above method can be disabled for intra-coded blocks to which ISP is applied.

[0404] g. Color components

[0405] i. In one example, the above method can be applied to the luma block, but not to the chroma block.

[0406] h. Intra-frame prediction modes (such as DC, vertical, horizontal, etc.).

[0407] i. Motion information (such as MV and reference index).

[0408] j. Profile / level / tier of a standard

[0409] 5. Example Implementation

[0410] The following are some example embodiments of some aspects of the invention summarized in Section 4 above, which can be applied to the VVC specification.

[0411] 5.1. Example #1

[0412] This section presents an example of a scheme for Implicit Selection of Transform Skip mode (ISTS). Essentially, the scheme follows the design principles of Implicit Selection of Transforms (IST), which has already been adopted by AVS3. A high-level flag is signaled in the picture header to indicate whether ISTS is enabled. If ISTS is enabled, the allowed transform set is set to {DCT-II, TS}, and the TS mode is determined based on the parity of the number of non-zero coefficients in the block. Simulation results reportedly show that, compared to HPM6.0, the proposed ISTS achieves bitrate reductions of 15.86% and 12.79% for screen content coding in the AI and RA configurations, respectively. It is asserted that the increase in encoder and decoder complexity is negligible.

[0413] 5.1.1. Introduction

[0414] In the current AVS3 design, only DCT-II is allowed for coding the residual blocks of blocks coded in IBC mode. For blocks coded without derived tree (DT) partitioning, implicit selection of transforms (IST) is applied, allowing a block to choose between DCT-II and DST-VII based on the parity of the number of non-zero coefficients. However, DST-VII is much less efficient for screen content coding. Transform skip (TS) mode is an efficient coding method for screen content. Research is needed on how to allow the codec to support TS without explicit signaling for the block.

[0415] 5.1.2. The proposed method

[0416] In some embodiments, implicit selection of transform skip mode (ISTS) can be used. A high-level flag signaled in the picture header indicates whether ISTS is enabled.

[0417] When ISTS is enabled, the allowed transform set is set to {DCT-II, TS}, and the TS mode is determined based on the parity of the number of non-zero coefficients in the block, following the same design principles as IST. An odd number indicates that TS is applied, while an even number indicates that DCT-II is applied. ISTS applies to CUs coded in intra or IBC mode with sizes from 4×4 to 32×32, excluding CUs to which DT is applied or that use PCM.

[0418] When ISTS is disabled and IST is enabled, the allowed transform set is {DCT-II, DST-VII}, the same as in the current AVS3 design.
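The decoder-side parity rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the normative AVS3 process: the function name `ists_select` and its parameters are hypothetical, the size range is interpreted as both width and height lying in [4, 32], and the function returns `None` when ISTS does not apply so that other rules (such as IST) can decide.

```python
from typing import List, Optional


def ists_select(coeffs: List[List[int]],
                mode: str,
                ists_enabled: bool,
                uses_dt: bool = False,
                uses_pcm: bool = False) -> Optional[str]:
    """Return "TS" or "DCT-II" from coefficient parity, or None if ISTS does not apply."""
    height = len(coeffs)
    width = len(coeffs[0])
    eligible = (
        ists_enabled
        and mode in ("intra", "ibc")
        and 4 <= width <= 32
        and 4 <= height <= 32
        and not uses_dt
        and not uses_pcm
    )
    if not eligible:
        return None
    # Count the non-zero coefficients: odd selects TS, even selects DCT-II.
    nonzero = sum(1 for row in coeffs for c in row if c != 0)
    return "TS" if nonzero % 2 == 1 else "DCT-II"
```

Because the transform type is carried by the parity of the coefficient count, no per-block flag is needed; an encoder following this scheme would have to produce a coefficient block whose parity matches the transform it actually used.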

[0419] 5.1.3. Proposed Changes to the Syntax Table, Semantics, and Decoding Process

[0420] Most of the relevant parts that have been added or modified are underlined in bold italics, and some deleted parts are indicated by [[]].

[0421] 7.1.2.2 Sequence Header

[0422] Table 14: Sequence Header Definitions

[0423]

[0424]

[0425] 7.1.3.1 Intra-prediction picture header

[0426] Table 27: Intra-prediction picture header definition

[0427]

[0428] 7.1.3.2 Inter-prediction picture header

[0429] Table 28: Inter-prediction picture header definition

[0430]

[0431] Document revisions 7.1.7

[0432]

[0433] 7.2.2.2 Sequence Header

[0434]

[0435] 7.2.3.1 Intra-frame prediction header

[0436]

[0437] 9.6.3 Inverse Transformation

[0438] This clause defines the process of converting the M1×M2 transform coefficient matrix CoeffMatrix into the residual sample matrix ResidueMatrix.

[0439] If the intra-prediction mode is neither 'Intra_Luma_PCM' nor 'Intra_Chroma_PCM':

[0440] If the current transform block is a luma intra-prediction residual block, the values of M1 and M2 are both less than 64, and the value of IstTuFlag is equal to 1, the residual sample matrix ResidueMatrix is derived according to the method defined in 0.

[0441]

[0442] Otherwise, the residual sample matrix ResidueMatrix is derived according to the method defined in 9.6.3.1.

[0443] Otherwise (the intra-prediction mode is 'Intra_Luma_PCM' or 'Intra_Chroma_PCM'), the residual sample matrix ResidueMatrix is derived according to the method defined in 9.6.3.3.

[0444] 9.6.3.4 Implicit Inverse Transformation Skip Method

[0445]

[0446] Figure 17 is a block diagram illustrating an example video processing system 1700 in which various techniques disclosed herein can be implemented. Various implementations may include some or all of the components of system 1700. System 1700 may include an input 1702 for receiving video content. The video content may be received in a raw or uncompressed format (e.g., 8-bit or 10-bit multi-component pixel values), or in a compressed or encoded format. Input 1702 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interfaces include wired interfaces (e.g., Ethernet, passive optical network (PON), etc.) and wireless interfaces (e.g., Wi-Fi or cellular interfaces).

[0447] System 1700 may include a coding component 1704, which can implement the various coding or encoding methods described in this document. Coding component 1704 can reduce the average bit rate of the video from input 1702 to the output of coding component 1704 to produce a coded representation of the video. Coding techniques are therefore sometimes referred to as video compression or video transcoding techniques. The output of coding component 1704 can be stored, or transmitted via a connected communication link, as represented by component 1706. The stored or transmitted bitstream (or coded) representation of the video received at input 1702 can be used by component 1708 to generate pixel values or displayable video to be sent to display interface 1710. The process of generating user-viewable video from the bitstream is sometimes referred to as video decompression. Furthermore, although some video processing operations are referred to as "coding" operations or tools, it will be understood that the coding tools or operations are used at the encoder, and the corresponding decoding tools or operations that reverse the results of the coding will be performed by the decoder.

[0448] Examples of peripheral bus interfaces or display interfaces may include Universal Serial Bus (USB), High Definition Multimedia Interface (HDMI), or DisplayPort. Examples of storage interfaces include SATA (Serial Advanced Technology Attachment), PCI, IDE, etc. The techniques described in this document can be embodied in a variety of electronic devices, such as mobile phones, laptops, smartphones, or other devices capable of performing digital data processing and / or video display.

[0449] Figure 21 is a block diagram of a video processing apparatus 2100. Apparatus 2100 can be used to implement one or more methods described herein. Apparatus 2100 can be embodied in smartphones, tablets, computers, Internet of Things (IoT) receivers, etc. Apparatus 2100 may include one or more processors 2102, one or more memories 2104, and video processing hardware 2106. The processor(s) 2102 can be configured to implement one or more methods described in this document. The one or more memories 2104 can be used to store data and code used to implement the methods and techniques described herein. The video processing hardware 2106 can be used to implement some of the techniques described in this document in hardware circuitry.

[0450] Figure 18 is a block diagram illustrating an example video coding system 100 that can utilize the techniques disclosed herein.

[0451] As shown in Figure 18, the video coding system 100 may include a source device 110 and a destination device 120. The source device 110, which may be referred to as a video encoding device, generates encoded video data. The destination device 120, which may be referred to as a video decoding device, can decode the encoded video data generated by the source device 110.

[0452] The source device 110 may include a video source 112, a video encoder 114, and an input / output (I / O) interface 116.

[0453] Video source 112 may include a source such as a video capture device, an interface for receiving video data from a video content provider, and / or a computer graphics system for generating video data, or a combination of such sources. Video data may include one or more pictures. Video encoder 114 encodes the video data from video source 112 to generate a bitstream. The bitstream may include a sequence of bits forming a coded representation of the video data. The bitstream may include coded pictures and associated data. A coded picture is a coded representation of a picture. Associated data may include sequence parameter sets, picture parameter sets, and other syntax structures. I / O interface 116 may include a modulator / demodulator (modem) and / or a transmitter. Encoded video data can be transmitted directly to destination device 120 via I / O interface 116 through network 130a. Encoded video data may also be stored on storage medium / server 130b for access by destination device 120.

[0454] Destination device 120 may include I / O interface 126, video decoder 124 and display device 122.

[0455] I / O interface 126 may include a receiver and / or a modem. I / O interface 126 may acquire encoded video data from source device 110 or storage medium / server 130b. Video decoder 124 may decode the encoded video data. Display device 122 may display the decoded video data to a user. Display device 122 may be integrated with destination device 120, or may be external to destination device 120, in which case destination device 120 is configured to interface with the external display device.

[0456] The video encoder 114 and the video decoder 124 can operate according to video compression standards such as the High Efficiency Video Coding (HEVC) standard, the Versatile Video Coding (VVC) standard, and other current and / or future standards.

[0457] Figure 19 is a block diagram illustrating an example of a video encoder 200, which may be the video encoder 114 in the system 100 shown in Figure 18.

[0458] The video encoder 200 can be configured to perform any or all of the techniques disclosed herein. In the example of Figure 19, the video encoder 200 includes multiple functional components. The techniques described in this disclosure can be shared among the various components of the video encoder 200. In some examples, a processor can be configured to perform any or all of the techniques described in this disclosure.

[0459] The functional components of the video encoder 200 may include a segmentation unit 201, a prediction unit 202 (which may include a mode selection unit 203), a motion estimation unit 204, a motion compensation unit 205, an intra-frame prediction unit 206, a residual generation unit 207, a transform unit 208, a quantization unit 209, an inverse quantization unit 210, an inverse transform unit 211, a reconstruction unit 212, a buffer 213, and an entropy coding unit 214.

[0460] In other examples, the video encoder 200 may include more, fewer, or different functional components. In one example, the prediction unit 202 may include an intra-block copy (IBC) unit. The IBC unit can perform prediction in IBC mode, where at least one reference picture is the picture containing the current video block.

[0461] Furthermore, some components (such as motion estimation unit 204 and motion compensation unit 205) may be highly integrated, but are represented separately in the example of Figure 11 for illustrative purposes.

[0462] The segmentation unit 201 can segment an image into one or more video blocks. The video encoder 200 and the video decoder 300 can support various video block sizes.

[0463] The mode selection unit 203 can, for example, select one of the coding modes (intra or inter) based on error results, and provide the resulting intra-coded or inter-coded block to the residual generation unit 207 to generate residual block data, and to the reconstruction unit 212 to reconstruct the coded block for use as a reference picture. In some examples, the mode selection unit 203 can select a combined intra and inter prediction (CIIP) mode, in which the prediction is based on an inter-prediction signal and an intra-prediction signal. In the case of inter prediction, the mode selection unit 203 can also select the resolution of the motion vector for the block (e.g., sub-pixel precision or integer-pixel precision).

[0464] To perform inter-frame prediction on the current video block, motion estimation unit 204 can generate motion information for the current video block by comparing one or more reference frames from buffer 213 with the current video block. Motion compensation unit 205 can determine the predicted video block for the current video block based on the motion information and decoded samples of pictures from buffer 213 other than the picture associated with the current video block.

[0465] For example, motion estimation unit 204 and motion compensation unit 205 can perform different operations on the current video block, depending on whether the current video block is in an I-slice, P-slice, or B-slice.

[0466] In some examples, motion estimation unit 204 can perform unidirectional prediction on the current video block, and can search for a reference video block for the current video block in the reference images of list 0 or list 1. Then, motion estimation unit 204 can generate a reference index indicating the reference image in list 0 or list 1 containing the reference video block, and a motion vector indicating the spatial displacement between the current video block and the reference video block. Motion estimation unit 204 can output the reference index, prediction direction indicator, and motion vector as motion information for the current video block. Motion compensation unit 205 can generate a predicted video block for the current block based on the reference video block indicated by the motion information of the current video block.

[0467] In other examples, motion estimation unit 204 can perform bidirectional prediction on the current video block. Motion estimation unit 204 can search for a reference video block for the current video block in the reference images in list 0, and also search for another reference video block for the current video block in the reference images in list 1. Then, motion estimation unit 204 can generate reference indices indicating the reference images in lists 0 and 1 containing the reference video blocks, and motion vectors indicating the spatial displacement between the reference video blocks and the current video block. Motion estimation unit 204 can output the reference index and motion vector of the current video block as the motion information of the current video block. Motion compensation unit 205 can generate a predicted video block for the current video block based on the reference video blocks indicated by the motion information of the current video block.

[0468] In some examples, the motion estimation unit 204 can output a complete set of motion information for the decoder's decoding processing.

[0469] In some examples, motion estimation unit 204 may not output the complete set of motion information for the current video block. Instead, motion estimation unit 204 can signal the motion information of the current video block by referencing the motion information of another video block. For example, motion estimation unit 204 may determine that the motion information of the current video block is sufficiently similar to the motion information of a neighboring video block.

[0470] In one example, the motion estimation unit 204 may indicate, in a syntax structure associated with the current video block, a value that indicates to the video decoder 300 that the current video block has the same motion information as another video block.

[0471] In another example, motion estimation unit 204 can identify, in a syntax structure associated with the current video block, another video block and a motion vector difference (MVD). The motion vector difference indicates the difference between the motion vector of the current video block and the motion vector of the indicated video block. Video decoder 300 can use the motion vector of the indicated video block and the motion vector difference to determine the motion vector of the current video block.

[0472] As discussed above, the video encoder 200 can predictively signal motion vectors. Two examples of predictive signaling techniques that can be implemented by the video encoder 200 include Advanced Motion Vector Prediction (AMVP) and merge mode signaling.
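The motion vector difference mechanism described in the paragraph above amounts to a component-wise subtraction at the encoder and an addition at the decoder. The Python sketch below illustrates this; the type alias `MV` and both function names are hypothetical, and motion vectors are assumed to be integer (x, y) pairs in some fixed sub-pixel unit.

```python
from typing import Tuple

MV = Tuple[int, int]  # (horizontal, vertical) displacement, assumed integer units


def motion_vector_difference(current: MV, predictor: MV) -> MV:
    """Encoder side: MVD = current motion vector minus the indicated block's motion vector."""
    return current[0] - predictor[0], current[1] - predictor[1]


def reconstruct_motion_vector(predictor: MV, mvd: MV) -> MV:
    """Decoder side: current motion vector = indicated block's motion vector plus MVD."""
    return predictor[0] + mvd[0], predictor[1] + mvd[1]
```

When the two motion vectors are similar, the MVD components are small, which is why signaling the difference can take fewer bits than signaling the motion vector itself.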

[0473] Intra-prediction unit 206 can perform intra-prediction on the current video block. When intra-prediction unit 206 performs intra-prediction on the current video block, it can generate prediction data for the current video block based on decoded samples from other video blocks in the same frame. The prediction data for the current video block may include the predicted video block and various syntax elements.

[0474] The residual generation unit 207 can generate residual data for the current video block by subtracting (e.g., indicated by a minus sign) the predicted video block from the current video block. The residual data for the current video block can include residual video blocks corresponding to different sample components in the current video block.

[0475] In other examples, there may be no residual data for the current video block. For example, in skip mode, the residual generation unit 207 may not perform the subtraction operation.

[0476] The transform processing unit 208 can generate one or more transform coefficient video blocks for the current video block by applying one or more transforms to the residual video blocks associated with the current video block.

[0477] After the transform processing unit 208 generates a transform coefficient video block associated with the current video block, the quantization unit 209 can quantize the transform coefficient video block associated with the current video block based on one or more quantization parameter (QP) values associated with the current video block.

[0478] The inverse quantization unit 210 and the inverse transform unit 211 can apply inverse quantization and inverse transform to the transform coefficient video block, respectively, to reconstruct the residual video block based on the transform coefficient video block. The reconstruction unit 212 can add the reconstructed residual video block to the corresponding samples of one or more predicted video blocks generated by the prediction unit 202 to produce a reconstructed video block associated with the current block, which is then stored in the buffer 213.
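The residual, quantization, and reconstruction steps performed by units 207 to 212 can be sketched end to end. To keep the example short, the Python below assumes transform skip (the residual is quantized directly, with no DCT) and a plain uniform quantizer with an integer step size `qstep`; real codecs derive the step from QP and use more elaborate scaling. All function names are hypothetical.

```python
from typing import List, Tuple

Block = List[List[int]]


def quantize(value: int, qstep: int) -> int:
    """Uniform quantization with rounding to the nearest level (ties away from zero)."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + qstep // 2) // qstep)


def dequantize(level: int, qstep: int) -> int:
    """Inverse quantization: scale the level back by the step size."""
    return level * qstep


def encode_and_reconstruct(current: Block, prediction: Block, qstep: int) -> Tuple[Block, Block]:
    """Return (quantized levels, reconstructed block) for one block."""
    # Residual generation: current block minus predicted block.
    residual = [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(current, prediction)]
    # Transform skip is assumed, so the residual is quantized directly.
    levels = [[quantize(r, qstep) for r in row] for row in residual]
    # Inverse quantization, then add the prediction back (reconstruction).
    recon = [
        [p + dequantize(lv, qstep) for p, lv in zip(pr, lr)]
        for pr, lr in zip(prediction, levels)
    ]
    return levels, recon
```

The encoder runs the same reconstruction as the decoder so that both sides hold identical reference samples in their buffers; with `qstep = 1` the sketch is lossless.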

[0479] After the video block is reconstructed by reconstruction unit 212, a loop filtering operation can be performed to reduce blocking artifacts in the video block.

[0480] Entropy encoding unit 214 can receive data from other functional components of video encoder 200. When entropy encoding unit 214 receives data, it can perform one or more entropy encoding operations to generate entropy encoded data and output a bit stream including the entropy encoded data.

[0481] Figure 20 is a block diagram illustrating an example of a video decoder 300, which may be the video decoder 124 in the system 100 shown in Figure 18.

[0482] The video decoder 300 can be configured to perform any or all of the techniques disclosed herein. In the example of Figure 20, the video decoder 300 includes multiple functional components. The techniques described in this disclosure can be shared among the various components of the video decoder 300. In some examples, a processor can be configured to perform any or all of the techniques described in this disclosure.

[0483] In the example of Figure 20, the video decoder 300 includes an entropy decoding unit 301, a motion compensation unit 302, an intra-frame prediction unit 303, an inverse quantization unit 304, an inverse transform unit 305, a reconstruction unit 306, and a buffer 307. In some examples, the video decoder 300 can perform a decoding pass that is generally the reverse of the encoding pass described with respect to the video encoder 200 (Figure 19).

[0484] The entropy decoding unit 301 can retrieve the encoded bitstream. The encoded bitstream may include entropy-coded video data (e.g., encoded blocks of video data). The entropy decoding unit 301 can decode the entropy-coded video data, and from the entropy-decoded video data the motion compensation unit 302 can determine motion information, including motion vectors, motion vector precision, reference picture list indexes, and other motion information. For example, the motion compensation unit 302 can determine this information by performing AMVP and merge mode.

[0485] The motion compensation unit 302 can generate motion compensation blocks, possibly performing interpolation based on an interpolation filter. The syntax elements may include identifiers for the interpolation filter used at sub-pixel precision.

[0486] The motion compensation unit 302 can use the interpolation filter used by the video encoder 200 during video block encoding to calculate the interpolation of sub-integer pixels of the reference block. The motion compensation unit 302 can determine the interpolation filter used by the video encoder 200 based on the received syntax information, and use the interpolation filter to generate the prediction block.

[0487] The motion compensation unit 302 may use some of the syntax information to determine the sizes of the blocks used to encode the frame(s) and / or slice(s) of the encoded video sequence, partition information describing how each macroblock of a picture of the encoded video sequence is partitioned, modes indicating how each partition is encoded, one or more reference frames (and reference frame lists) for each inter-coded block, and other information for decoding the encoded video sequence.

[0488] Intra-prediction unit 303 can use, for example, an intra-prediction mode received in the bitstream to form prediction blocks from spatially adjacent blocks. Inverse quantization unit 304 inverse-quantizes (i.e., dequantizes) the quantized video block coefficients provided in the bitstream and decoded by entropy decoding unit 301. Inverse transform unit 305 applies an inverse transform.

[0489] The reconstruction unit 306 can add the residual block to the corresponding prediction block generated by the motion compensation unit 302 or the intra-frame prediction unit 303 to form a decoded block. If necessary, a deblocking filter can also be applied to filter the decoded block to remove blocking artifacts. The decoded video block is then stored in buffer 307, which provides reference blocks for subsequent motion compensation / intra-frame prediction and also produces decoded video for presentation on a display device.

[0490] Below is a list of preferred solutions for some embodiments.

[0491] The following solutions illustrate example embodiments of the techniques discussed in the previous section (e.g., item 1).

[0492] 1. A video processing method (e.g., method 2200 shown in Figure 22), comprising: determining (2202), for a conversion between a video block of a video and a coded representation of the video, based on a rule, whether a horizontal identity transform or a vertical identity transform is applied to the video block; and performing (2204) the conversion based on the determination, wherein the rule specifies a relationship between the determination and representative coefficients from the decoded coefficients of one or more representative blocks of the video.

[0493] 2. The method of Solution 1, wherein one or more representative blocks belong to the color component to which the video block belongs.

[0494] 3. The method of Solution 1, wherein one or more representative blocks belong to a color component that is different from the color component of the video block.

[0495] 4. The method of any one of solutions 1-3, wherein the one or more representative blocks include the video block.

[0496] 5. The method of any one of solutions 1-3, wherein the one or more representative blocks do not include the video block.

[0497] The following solutions illustrate example embodiments of the techniques discussed in the previous section (e.g., items 1 and 2).

[0498] 6. The method of any one of solutions 1-5, wherein the representative coefficients include decoded coefficients with non-zero values.

[0499] 7. The method of any one of solutions 1-6, wherein the relationship specifies that the representative coefficients are used in the form of modified coefficients obtained by modifying the representative coefficients.

[0500] 8. The method of any one of solutions 1-7, wherein the representative coefficients correspond to the significant coefficients among the decoded coefficients.

[0501] The following solutions illustrate example embodiments of the techniques discussed in the previous section (e.g., item 3).

[0502] 9. A video processing method, comprising: determining, for a conversion between a video block of a video and a codec representation of the video, whether to apply a horizontally specific transform or a vertically specific transform to the video block based on a rule; and performing the conversion based on the determination, wherein the rule specifies a relationship between the determination and the decoded luma coefficients of the video block.

[0503] 10. The method of Solution 1, wherein performing the conversion includes applying the horizontally specific transform or the vertically specific transform to the luma component of the video block and applying DCT2 to the chroma component of the video block.

[0504] The following solutions illustrate example embodiments of the techniques discussed in the previous section (e.g., Items 1 and 4).

[0505] 11. A video processing method, comprising: determining, for a conversion between a video block of a video and a codec representation of the video, whether to apply a horizontally specific transform or a vertically specific transform to the video block based on a rule; and performing the conversion based on the determination, wherein the rule defines a relationship between the determination and a value V associated with the decoded coefficients or the representative coefficients of a representative block.

[0506] 12. The method of Solution 11, wherein V equals the number of representative coefficients.

[0507] 13. The method of Solution 11, wherein V equals the sum of the values of the representative coefficients.

[0508] 14. The method of Solution 11, where V is a function of the residual energy distribution of the representative coefficient.

[0509] 15. The method of any one of solutions 11-14, wherein the relation is defined with respect to the parity of the value V.
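
Solutions 11-15 can be illustrated with a minimal sketch that computes V from a block's decoded coefficients and reports its parity. The function names, the choice of non-zero coefficients as the representative coefficients (following solution 6), and the `kind` parameter are assumptions made for illustration; the text does not fix which parity maps to which transform, so the sketch only exposes the parity.

```python
from typing import Iterable, List


def representative_coefficients(coeffs: Iterable[int]) -> List[int]:
    """Assumption: representative coefficients are the non-zero decoded ones."""
    return [c for c in coeffs if c != 0]


def value_v(coeffs: Iterable[int], kind: str = "count") -> int:
    """Compute V as the number of representative coefficients ("count")
    or as the sum of their values ("sum")."""
    reps = representative_coefficients(coeffs)
    if kind == "count":
        return len(reps)
    if kind == "sum":
        return sum(reps)
    raise ValueError(f"unknown kind: {kind!r}")


def v_is_even(coeffs: Iterable[int], kind: str = "count") -> bool:
    """Parity of V; a rule may key the transform decision on this bit."""
    return value_v(coeffs, kind) % 2 == 0


if __name__ == "__main__":
    block = [3, 0, -1, 0, 2]
    print(value_v(block, "count"), value_v(block, "sum"), v_is_even(block))
```

Because both encoder and decoder see the same decoded coefficients, a decision keyed on V needs no extra signaling bit for the block.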

[0510] The following solutions illustrate example embodiments of the techniques discussed in the previous section (e.g., item 5).

[0511] 16. The method of any of the above solutions, wherein the rule specifies that the relationship further depends on coding information of the video block.

[0512] 17. The method of Solution 16, wherein the coding information is the coding mode of the video block.

[0513] 18. The method of Solution 16, wherein the coding information comprises the smallest rectangular region covering all significant coefficients of the video block.

[0514] The following solutions illustrate example embodiments of the techniques discussed in the previous section (e.g., Item 6).

[0515] 19. The method of any of the above solutions, wherein the determination is performed because the video block is of a certain type or because of a constraint on its coefficients.

[0516] 20. The method of Solution 19, wherein the type corresponds to the intra block copy (IBC) mode.

[0517] 21. The method of Solution 19, wherein the constraint on the coefficients requires that coefficients outside a rectangular region of the current block be zero.

[0518] The following solutions illustrate example embodiments of the techniques discussed in the previous section (e.g., item 7).

[0519] 22. The method of any one of solutions 1-21, wherein, in the case that neither the horizontally specific transform nor the vertically specific transform is used, the conversion is performed using a DCT-2 transform or a DST-7 transform.

[0520] The following solutions illustrate example embodiments of the techniques discussed in the previous section (e.g., Item 9).

[0521] 23. The method of any one of solutions 1-22, wherein one or more syntax fields in the codec representation indicate whether the method is enabled for video blocks.

[0522] 24. The method of Solution 23, wherein the one or more syntax fields are included at the sequence level, picture level, slice level, tile group level, tile level, or sub-picture level.

[0523] 25. The method of any one of solutions 23-24, wherein the one or more syntax fields are included in a slice header or a picture header.

[0524] The following solutions illustrate example embodiments of the techniques discussed in the previous section (e.g., items 1 and 8).

[0525] 26. A video processing method, comprising: determining that one or more syntax fields are present in a codec representation of a video, wherein the video contains one or more video blocks; and determining, based on the one or more syntax fields, whether a horizontally specific transform or a vertically specific transform is enabled for the video blocks in the video.

[0526] 27. The method of Solution 1, wherein, in response to the one or more syntax fields indicating that implicit determination of the transform skip mode is enabled, for a conversion between a first video block of the video and the codec representation of the video, whether to apply a horizontally specific transform or a vertically specific transform to the first video block is determined based on a rule, and the conversion is performed based on the determination, wherein the rule specifies a relationship between the determination and representative coefficients among the decoded coefficients of one or more representative blocks of the video.

[0527] 28. The method of Solution 27, wherein the first video block is coded in intra block copy mode.

[0528] 29. The method of Solution 27, wherein the first video block is coded in intra mode.

[0529] 30. The method of Solution 27, wherein the first video block is coded in an intra mode other than the derivation tree (DT) mode.

[0530] 31. The method of Solution 27, wherein the parity is determined based on the number of non-zero coefficients in the first video block.

[0531] 32. The method of Solution 27, wherein the horizontally specific transform and the vertically specific transform are applied to the first video block when the parity of the number of non-zero coefficients in the first video block is even.

[0532] 33. The method of Solution 27, wherein the horizontally specific transform and the vertically specific transform are not applied to the first video block when the parity of the number of non-zero coefficients in the first video block is odd.

[0533] 34. The method of Solution 33, wherein DCT-2 is applied to the first video block.

[0534] 35. The method of Solution 32, further comprising: in response to the one or more syntax fields indicating that implicit determination of the transform skip mode is disabled, not applying the horizontally specific transform and the vertically specific transform to the first video block.

[0535] 36. The method of solution 32, wherein DCT-2 is applied to the first video block.
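
The gating described in solutions 26-36 can be sketched as a small decision function. This is an illustration under stated assumptions rather than the patented procedure: the function and label names are hypothetical, the mode strings are placeholders, even parity is taken to select the horizontally and vertically specific transforms as in solution 32, and every other case is assumed to fall back to DCT-2.

```python
from typing import Sequence


def count_nonzero(coeffs: Sequence[int]) -> int:
    """Number of non-zero decoded coefficients in the block."""
    return sum(1 for c in coeffs if c != 0)


def select_transform(implicit_ts_enabled: bool, mode: str,
                     coeffs: Sequence[int]) -> str:
    """Return "HV_SPECIFIC" or "DCT2" for a block with coded residual.

    implicit_ts_enabled: hypothetical flag derived from the syntax fields.
    mode: hypothetical label, one of "INTRA", "INTRA_DT", "IBC", "INTER".
    """
    # Syntax fields disable the tool: fall back to DCT-2 (solutions 35-36).
    if not implicit_ts_enabled:
        return "DCT2"
    # Assumption: only IBC and non-DT intra blocks are eligible (solutions 28-30).
    if mode not in ("IBC", "INTRA"):
        return "DCT2"
    # Even count selects the specific transforms (solution 32);
    # an odd count is assumed to select DCT-2.
    return "HV_SPECIFIC" if count_nonzero(coeffs) % 2 == 0 else "DCT2"


if __name__ == "__main__":
    print(select_transform(True, "IBC", [4, 0, -1, 0]))
    print(select_transform(True, "IBC", [4, 0, -1, 7]))
```

An encoder that wants a particular transform under such a rule would have to adjust the coefficient count (for example, during quantization) so that the parity matches its choice.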

[0536] The following solutions illustrate example embodiments of the techniques discussed in the previous section (e.g., items 9, 10).

[0537] 37. A video processing method, comprising: making a first determination regarding whether to enable the use of a specific transformation for a conversion between video blocks of a video and a codec representation of a video; making a second determination regarding whether to enable a zeroing operation during the conversion; and performing the conversion based on the first determination and the second determination.

[0538] 38. The method of solution 37, wherein one or more syntax fields of the first level in the codec representation indicate a first determination.

[0539] 39. The method of any one of solutions 37-38, wherein one or more syntax fields of the second level in the codec representation indicate a second determination.

[0540] 40. The method of any one of solutions 38-39, wherein the first level and the second level correspond to a header field at the sequence or picture level, a parameter set at the sequence or picture level, or an adaptation parameter set.

[0541] 41. The method of any one of solutions 37-40, wherein the conversion uses either the specific transform or the zeroing operation, but not both.

[0542] The following solutions illustrate example embodiments of the techniques discussed in the previous section (e.g., items 12 and 13).

[0543] 42. A video processing method, comprising: performing a conversion between video blocks of a video and a codec representation of the video; wherein the video blocks are represented as codec blocks in the codec representation, wherein the non-zero coefficients of the codec blocks are restricted to one or more sub-regions; and wherein a specific transformation is applied to generate the codec blocks.

[0544] 43. The method of Solution 1, wherein the one or more sub-regions include an upper right K×L sub-region of the video block, where K and L are integers, K = min(T1, W), L = min(T2, H), W and H are the width and height of the video block, respectively, and T1 and T2 are thresholds.

[0545] 44. The method of any one of solutions 42-43, wherein the encoding / decoding representation indicates the one or more sub-regions.
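
The sub-region constraint of solutions 42-44 can be sketched as follows. The helper names are hypothetical, the block is assumed to be a list of rows, and the sub-region is anchored at the upper right corner as stated in solution 43; the thresholds T1 and T2 are passed in because the text does not give their values.

```python
from typing import List, Tuple


def subregion_size(w: int, h: int, t1: int, t2: int) -> Tuple[int, int]:
    """K = min(T1, W) columns and L = min(T2, H) rows."""
    return min(t1, w), min(t2, h)


def zero_outside_upper_right(block: List[List[int]],
                             t1: int, t2: int) -> List[List[int]]:
    """Return a copy of block with coefficients outside the upper right
    K x L sub-region set to zero."""
    h = len(block)
    w = len(block[0])
    k, l = subregion_size(w, h, t1, t2)
    x0 = w - k  # first column of the upper right sub-region
    return [
        [block[y][x] if (y < l and x >= x0) else 0 for x in range(w)]
        for y in range(h)
    ]


if __name__ == "__main__":
    coeffs = [[1, 2, 3, 4],
              [5, 6, 7, 8]]
    print(zero_outside_upper_right(coeffs, t1=2, t2=1))
```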

[0546] The following solutions illustrate example embodiments of the techniques discussed in the previous section (items 16 and 17).

[0547] 45. The method of any one of solutions 1-44, wherein the video region includes a video encoding / decoding unit.

[0548] 46. The method of any one of solutions 1-45, wherein the video region is a prediction unit or a transform unit.

[0549] 47. The method of any one of solutions 1-46, wherein the video blocks satisfy a specific dimension condition.

[0550] 48. The method of any one of solutions 1-47, wherein the video block is encoded and decoded using a predefined range of quantization parameters.

[0551] 49. The method of any one of solutions 1-48, wherein the video region includes video images.

[0552] 50. The method of any one of solutions 1 to 49, wherein the conversion includes encoding the video into a codec representation.

[0553] 51. The method of any one of solutions 1 to 49, wherein the conversion includes decoding the codec representation to generate pixel values of the video.

[0554] 52. A video decoding apparatus, comprising a processor configured to implement the method described in one or more of solutions 1 to 51.

[0555] 53. A video encoding / decoding apparatus, comprising a processor configured to implement the method described in one or more of solutions 1 to 51.

[0556] 54. A computer program product having computer code stored thereon, which, when executed by a processor, causes the processor to implement the method described in any one of solutions 1 to 51.

[0557] 55. The methods, apparatus or systems described in this document.

[0558] Figure 23 is a flowchart representation of a method for video processing according to the present technology. Method 2300 includes, at operation 2310, performing a conversion between a video and a bitstream of the video according to a rule. The rule specifies that use of a specific transform mode for the conversion is indicated at least at a first video unit level and a second video unit level.

[0559] In some embodiments, a specific transform mode includes a transform skip mode. In transform skip mode, the residual of the prediction error between the current video block and the reference video block is represented in the bitstream without applying a transform. In some embodiments, the first video unit level includes the sequence level. In some embodiments, a first syntax element is included in the sequence header or sequence parameter set to indicate the use of an implicitly selected transform skip mode at the sequence level. In some embodiments, a first syntax element equal to 0 indicates that the implicitly selected transform skip mode is disabled at the sequence level. In some embodiments, a first syntax element equal to 1 indicates that the implicitly selected transform skip mode is enabled at the sequence level.

[0560] In some embodiments, the first syntax element is conditionally included in the sequence header or sequence parameter set based on a first syntax flag indicating whether implicit selection of transforms is enabled at the sequence level. In some embodiments, the first syntax element is included in the sequence header or sequence parameter set in response to the first syntax flag indicating that implicit selection of transforms is enabled at the sequence level. In some embodiments, a default value of the first syntax element is inferred in response to the first syntax element being omitted from the bitstream. In some embodiments, the default value indicates that the implicitly selected transform skip mode is disabled at the sequence level.

[0561] In some embodiments, the second video unit level includes a picture level or a slice level. In some embodiments, a second syntax element is included in a picture header or slice header to indicate the use of the implicitly selected transform skip mode at the picture level or slice level, the picture header including at least an intra-frame picture header or an inter-frame picture header. In some embodiments, the second syntax element is conditionally indicated based on the first syntax element or the syntax flag at the first video unit level. In some embodiments, a default value of the second syntax element is inferred in response to the second syntax element being omitted from the bitstream. In some embodiments, the default value indicates that the use of the specific transform mode is disabled at the second video unit level.
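
The conditional signaling and default inference described for the two video unit levels can be sketched with a toy parser. All names here (`ist_enable_flag`, `ists_enable_flag`, `ph_ists_enable_flag`, the dataclasses, and the one-bit-per-flag layout) are hypothetical and are not taken from any standard; the bitstream is modeled as an iterator of bits.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class SeqHeader:
    ist_enable_flag: int   # first syntax flag: implicit selection of transforms
    ists_enable_flag: int  # first syntax element: implicitly selected transform skip


@dataclass
class PicHeader:
    ph_ists_enable_flag: int  # second syntax element at picture level


def parse_seq_header(bits: Iterator[int]) -> SeqHeader:
    ist = next(bits)
    # The first syntax element is present only when the flag is 1;
    # otherwise it is omitted and inferred to be 0 (disabled).
    ists = next(bits) if ist == 1 else 0
    return SeqHeader(ist, ists)


def parse_pic_header(bits: Iterator[int], seq: SeqHeader) -> PicHeader:
    # The second syntax element is conditioned on the sequence-level
    # element and inferred to be 0 (disabled) when omitted.
    flag = next(bits) if seq.ists_enable_flag == 1 else 0
    return PicHeader(flag)


if __name__ == "__main__":
    seq = parse_seq_header(iter([1, 1]))
    pic = parse_pic_header(iter([0]), seq)
    print(seq, pic)
```

Conditioning the lower-level element on the higher-level one means a stream that disables the tool at the sequence level spends no bits on it in any picture or slice header.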

[0562] Figure 24 is a flowchart representation of a method for video processing according to the present technology. Method 2400 includes, at operation 2410, performing a conversion between a video block of a video and a bitstream of the video according to a rule. The rule specifies that a syntax element at a video unit level is used to indicate the allowed transform set used for the conversion.

[0563] In some embodiments, the video unit level includes the picture level. In some embodiments, the syntax element is conditionally included based on whether implicit selection of transforms is enabled at the video unit level. In some embodiments, the syntax element includes a flag or an index. In some embodiments, the syntax element is included in a picture header or slice header, the picture header including at least an intra-frame picture header or an inter-frame picture header.

[0564] In some embodiments, N transform sets are supported for the conversion, and the allowed transform set is selected from the N transform sets based on the syntax element. In some embodiments, N equals 2. In some embodiments, the selection of the allowed transform set is also based on coding information of the video block, which includes at least the coding mode or partition mode of the video block. In some embodiments, Discrete Cosine Transform Type-II (DCT2) is always used when the video block is coded using a derivation tree mode. In some embodiments, DCT2 is always used when the video block is a chroma block. In some embodiments, the N transform sets include {DCT2, DST7} and {DCT2, IT}, where DST7 represents Discrete Sine Transform Type-VII and IT represents implicit transform. In some embodiments, these N transform sets are applicable to intra-coded blocks. In some embodiments, these N transform sets are applicable to intra-coded blocks coded using a non-derivation-tree coding mode. In some embodiments, the N transform sets include {DCT2} and {DCT2, IT}. In some embodiments, these N transform sets are applicable to blocks coded using an intra block copy (IBC) coding mode.

[0565] In some embodiments, in response to the syntax element being omitted from the bitstream, a default value indicating a default allowed transform set is inferred. In some embodiments, the syntax element includes a flag or an index. In response to the video block being coded using a specific coding mode, the syntax element indicates the allowed transform set for the conversion. In some embodiments, the syntax element is used for the conversion when the video block is coded using an intra coding mode, or using an intra coding mode other than a derivation tree coding mode or a pulse code modulation (PCM) coding mode. In some embodiments, when the video block is not coded using an intra coding mode, the allowed transform set for the conversion is independent of the syntax element. In some embodiments, when the video block is an inter-coded block, the allowed transform set includes {DCT2}. In some embodiments, when the video block is an intra block copy (IBC) coded block, the allowed transform set includes {DCT2, IT}.
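
One way to read the embodiments above is as a lookup from coding mode and syntax element to an allowed transform set. The sketch below combines several of the listed embodiments into a single function, which is itself an assumption; the `Mode` enum, the function name, the index-to-set mapping (0 for {DCT2, DST7}, 1 for {DCT2, IT}), and the use of 0 as the inferred default are all hypothetical. PCM blocks are left out because they carry no transform.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    INTRA = "intra"        # intra, not derivation tree
    INTRA_DT = "intra_dt"  # intra with derivation tree mode
    INTER = "inter"
    IBC = "ibc"


def allowed_transform_set(mode: Mode, is_chroma: bool,
                          set_idx: int = 0) -> Tuple[str, ...]:
    """Return the allowed transform set for a block.

    set_idx models the picture-level syntax element; the default of 0
    stands in for the value inferred when the element is omitted.
    """
    # DCT2 is always used for chroma blocks and derivation tree blocks;
    # inter-coded blocks use {DCT2}.
    if is_chroma or mode in (Mode.INTRA_DT, Mode.INTER):
        return ("DCT2",)
    # IBC blocks: the set does not depend on the syntax element.
    if mode is Mode.IBC:
        return ("DCT2", "IT")
    # Non-DT intra blocks: the syntax element picks one of N = 2 sets.
    return ("DCT2", "IT") if set_idx == 1 else ("DCT2", "DST7")


if __name__ == "__main__":
    print(allowed_transform_set(Mode.INTRA, False, 1))
    print(allowed_transform_set(Mode.IBC, False))
```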

[0566] Figure 25 is a flowchart representation of a method for video processing according to the present technology. Method 2500 includes, at operation 2510, performing a conversion between a video block of a video and a bitstream of the video according to a rule. The rule specifies that use of a specific transform mode for the conversion of the video block is determined based on a function associated with the energy of representative coefficients of one or more representative blocks of the video.

[0567] In some embodiments, a specific transform mode includes a transform skip mode. In transform skip mode, the residual of the prediction error between the video block and the reference video block is represented in the bitstream without applying a transform. In some embodiments, the function returns whether the energy of the first K representative coefficients multiplied by a scaling factor is greater than the energy of the first M representative coefficients or all representative coefficients, where M is greater than K. In some embodiments, the function returns whether the energy of the representative coefficients in a first sub-region of the representative block multiplied by a scaling factor is greater than the energy of the representative coefficients in a second sub-region that includes the first sub-region and is larger than the first sub-region. In some embodiments, the function returns whether the energy of the representative coefficients in a first sub-region of the representative block multiplied by a scaling factor is greater than the energy of the representative coefficients in a second sub-region that does not overlap with the first sub-region.

[0568] In some embodiments, the first sub-region includes the top-left M×N region within the video block. In some embodiments, M = N = 1. In some embodiments, the first sub-region is the second sub-region excluding the top-left M×N region. In some embodiments, the first sub-region is the top-left 4×4 region. In some embodiments, the energy of the representative coefficients is defined as the sum of the absolute values of the representative coefficients or the sum of the squares of the representative coefficients.
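
The energy-based function can be sketched as a pair of predicates, one over the first K coefficients and one over a top-left sub-region. The function names are hypothetical, the 1-D coefficient list is assumed to be in scan order, M is assumed to count columns and N rows, and the scaling factor is a free parameter. The text does not say which outcome selects the transform skip mode, so the sketch only returns the comparison result.

```python
from typing import List, Sequence


def energy(coeffs: Sequence[int], squared: bool = False) -> int:
    """Sum of absolute values, or sum of squares when squared=True."""
    return sum(c * c if squared else abs(c) for c in coeffs)


def first_k_dominates(coeffs: Sequence[int], k: int, scale: int,
                      squared: bool = False) -> bool:
    """True if scale * E(first k coefficients) > E(all coefficients)."""
    return scale * energy(coeffs[:k], squared) > energy(coeffs, squared)


def topleft_dominates(block: List[List[int]], m: int, n: int, scale: int,
                      squared: bool = False) -> bool:
    """True if scale * E(top-left m x n region) > E(whole block).

    The whole block plays the role of the second sub-region that
    includes, and is larger than, the first one.
    """
    first = [c for row in block[:n] for c in row[:m]]
    whole = [c for row in block for c in row]
    return scale * energy(first, squared) > energy(whole, squared)


if __name__ == "__main__":
    print(first_k_dominates([10, -2, 1, 0], k=1, scale=2))
    print(topleft_dominates([[8, 1], [1, 0]], m=1, n=1, scale=2))
```

With M = N = 1 the first sub-region is just the DC position, so the predicate asks whether the block's energy is concentrated at DC, which is typical of transformed natural content and less so of untransformed screen-content residuals.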

[0569] In some embodiments, the applicability of one or more of the above methods is based on coding information of the video block. In some embodiments, the method is applicable to the video block when its width or height is less than a threshold. In some embodiments, the threshold is 64. In some embodiments, the method is applicable to the video block when the video block is intra coded without using a derivation tree mode or a pulse code modulation (PCM) mode. In some embodiments, the method is applicable to the video block when the video block is coded using intra block copy (IBC).
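
The applicability conditions can be gathered into one check. Combining the size condition and the mode condition with a logical AND is an assumption, since the text lists them as separate embodiments; the function name and mode labels are hypothetical.

```python
def method_applicable(width: int, height: int, mode: str,
                      threshold: int = 64) -> bool:
    """Return True if the implicit method may be applied to the block.

    mode: hypothetical label, one of "INTRA" (intra without derivation
    tree or PCM), "INTRA_DT", "PCM", "IBC", "INTER".
    """
    # Size condition as worded in the text: width or height below the threshold.
    size_ok = width < threshold or height < threshold
    # Mode condition: non-DT, non-PCM intra blocks, or IBC blocks.
    mode_ok = mode in ("INTRA", "IBC")
    return size_ok and mode_ok


if __name__ == "__main__":
    print(method_applicable(32, 32, "INTRA"))
    print(method_applicable(64, 64, "INTRA"))
```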

[0570] In some embodiments, the conversion includes encoding the video into a bitstream. In some embodiments, the conversion includes decoding the bitstream to generate the video.

[0571] In this document, the term "video processing" can refer to video encoding, video decoding, video compression, or video decompression. For example, a video compression algorithm can be applied during the conversion from the pixel representation of a video to the corresponding bitstream, or vice versa. As defined by the syntax, the bitstream representation of a current video block can, for example, correspond to bits that are either co-located or spread across different positions within the bitstream. For example, a macroblock can be encoded in terms of transformed and coded error residual values, and also using bits in headers and other fields in the bitstream. Furthermore, during the conversion, the decoder can parse the bitstream with the knowledge that some fields may be present or absent, based on the determinations described in the solutions above. Similarly, the encoder can determine whether certain syntax fields are to be included, and generate the codec representation accordingly by including or excluding those syntax fields.

[0572] The solutions and other solutions, examples, embodiments, modules, and functional operations disclosed in this document can be implemented in digital electronic circuits, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of the foregoing. The disclosed embodiments and other embodiments can be implemented as one or more computer program products, that is, one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, a data processing apparatus. The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing apparatus" includes all apparatus, devices, and machines for processing data, such as a programmable processor, a computer, or multiple processors or computers. In addition to hardware, the apparatus may also include code that creates an execution environment for the computer program in question, such as code constituting processor firmware, a protocol stack, a database management system, an operating system, or a combination thereof. A propagated signal is an artificially generated signal, such as a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to a suitable receiver device.

[0573] Computer programs (also known as programs, software, software applications, scripts, or code) can be written in any programming language, including compiled or interpreted languages, and can be deployed in any form, including as standalone programs or as modules, components, subroutines, or other units suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files storing one or more modules, subroutines, or code portions). A computer program can be deployed to execute on a single computer, or on multiple computers located at one site or distributed across multiple sites and interconnected through a communication network.

[0574] The processes and logic flows described in this document can be executed by one or more programmable processors, which execute one or more computer programs to perform functions by manipulating input data and generating output. The processes and logic flows can also be executed by dedicated logic circuitry, and the devices can be implemented as dedicated logic circuitry, such as FPGAs (Field-Programmable Gate Arrays) or ASICs (Application-Specific Integrated Circuits).

[0575] Processors suitable for executing computer programs include, for example, general-purpose and special-purpose microprocessors, as well as any one or more processors of any type of digital computer. Typically, a processor receives instructions and data from read-only memory or random access memory, or both. The basic components of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Typically, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices (e.g., magnetic disks, magneto-optical disks, or optical disks) for storing data. However, a computer does not require such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices such as EPROM, EEPROM, and flash memory devices; magnetic disks, such as internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM optical disks. The processor and memory may be supplemented by or incorporated into special-purpose logic circuitry.

[0576] Although this patent document contains numerous details, these details should not be construed as limiting the scope of any subject matter or claimed content, but rather as descriptions of features characteristic of specific embodiments of a particular technology. Certain features described in the context of individual embodiments in this patent document may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented individually or in any suitable sub-combination in multiple embodiments. Furthermore, although the foregoing features may be described as functioning in a particular combination, or even initially claimed to be so, in some cases one or more features may be removed from the claimed combination, and the claimed combination may refer to a sub-combination or a variation of a sub-combination.

[0577] Similarly, although these operations are described in a specific order in the accompanying drawings, this should not be construed as requiring that such operations be performed in the specific order or sequence shown, or requiring that all shown operations be performed to obtain the desired result. Furthermore, the separation of various system components in the embodiments described in this patent document should not be construed as requiring such separation in all embodiments.

[0578] Only some implementations and examples are described, and other implementations, enhancements and variations may be made based on the content described and illustrated in this patent document.

Claims

1. A video processing method, comprising: performing a conversion between a video block of a video and a bitstream of the video according to a rule, wherein the rule specifies that use of a specific transform mode for the conversion of the video block is determined based on a function associated with the energy of representative coefficients of one or more representative blocks of the video, and wherein the specific transform mode includes a transform skip mode in which the residual of the prediction error between the video block and a reference video block is represented in the bitstream without applying a transform.

2. The method according to claim 1, wherein the function returns whether the energy of the first K representative coefficients multiplied by a scaling factor is greater than the energy of the first M representative coefficients or of all representative coefficients, where M is greater than K.

3. The method according to claim 1, wherein the function returns whether the energy of the representative coefficients in a first sub-region of the representative block, multiplied by a scaling factor, is greater than the energy of the representative coefficients in a second sub-region that includes the first sub-region and is larger than the first sub-region.

4. The method according to claim 1, wherein the function returns whether the energy of the representative coefficients in a first sub-region of the representative block, multiplied by a scaling factor, is greater than the energy of the representative coefficients in a second sub-region that does not overlap with the first sub-region.

5. The method according to claim 3, wherein the first sub-region includes the top-left M×N region in the video block.

6. The method according to claim 5, wherein M = N = 1.

7. The method according to claim 4, wherein the first sub-region is the second sub-region excluding the top-left M×N region.

8. The method according to claim 4, wherein the first sub-region is the top-left 4×4 region.

9. The method according to claim 1, wherein the energy of the representative coefficients is defined as the sum of the absolute values of the representative coefficients or the sum of the squares of the representative coefficients.

10. The method according to claim 1, wherein the applicability of the method is based on coding information of the video block.

11. The method according to claim 10, wherein the method is applicable to the video block when its width or height is less than a threshold.

12. The method according to claim 11, wherein the threshold is 64.

13. The method according to claim 10, wherein the method is applicable to the video block when the video block is intra coded without using a derivation tree mode or a pulse code modulation (PCM) mode.

14. The method according to claim 10, wherein the method is applicable to the video block when the video block is coded using intra block copy (IBC).

15. The method according to claim 1, wherein the rule further specifies that use of a particular transform mode for the conversion is indicated at least at a first video unit level and a second video unit level.

16. The method according to claim 15, wherein the first video unit level includes a sequence level.

17. The method according to claim 16, wherein a first syntax element is included in a sequence header or a sequence parameter set to indicate use of the implicitly selected transform skip mode at the sequence level.

18. The method according to claim 17, wherein the first syntax element being equal to 0 indicates that the implicitly selected transform skip mode is disabled at the sequence level.

19. The method according to claim 17, wherein the first syntax element being equal to 1 indicates that the implicitly selected transform skip mode is enabled at the sequence level.

20. The method according to claim 17, wherein the first syntax element is conditionally included in the sequence header or the sequence parameter set based on a first syntax flag indicating whether implicit selection of transforms is enabled at the sequence level.

21. The method according to claim 20, wherein, in response to the first syntax flag indicating that implicit selection of transforms is enabled at the sequence level, the first syntax element is included in the sequence header or the sequence parameter set.

22. The method according to claim 20, wherein, in response to the first syntax element being omitted from the bitstream, a default value of the first syntax element is inferred.

23. The method according to claim 22, wherein the default value indicates that the implicitly selected transform skip mode is disabled at the sequence level.

24. The method according to claim 15, wherein the second video unit level includes a picture level or a slice level.

25. The method according to claim 24, wherein a second syntax element is included in a picture header or a slice header to indicate use of the implicitly selected transform skip mode at the picture level or the slice level, the picture header including at least an intra picture header or an inter picture header.

26. The method according to claim 25, wherein the second syntax element is conditionally indicated based on the first syntax element or a syntax flag at the first video unit level.

27. The method according to claim 25, wherein, in response to the second syntax element being omitted from the bitstream, a default value of the second syntax element is inferred.

28. The method according to claim 27, wherein the default value indicates that use of the particular transform mode is disabled at the second video unit level.
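
The two-level signaling of claims 15 to 28 can be sketched as a small parser. This is an illustration only: the field names, the one-bit coding of each element, and the use of a plain bit iterator are assumptions, not the syntax of any actual standard or of the patent.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class SequenceHeader:
    implicit_transform_enabled: int  # "first syntax flag" (hypothetical name)
    ists_enabled: int                # "first syntax element" (hypothetical name)


def parse_sequence_header(bits: Iterator[int]) -> SequenceHeader:
    """Read the sequence-level flag, then the element only if the flag is set."""
    flag = next(bits)
    # When the flag is 0 the element is omitted and inferred as 0 (disabled).
    ists = next(bits) if flag == 1 else 0
    return SequenceHeader(flag, ists)


def parse_picture_ists(bits: Iterator[int], seq: SequenceHeader) -> int:
    """Read the picture-level element only if enabled at the sequence level."""
    # When omitted, the element is inferred as 0 (disabled).
    return next(bits) if seq.ists_enabled == 1 else 0


# Bitstream [1, 1, 0]: flag set, sequence-level element 1, picture-level element 0.
stream = iter([1, 1, 0])
seq = parse_sequence_header(stream)
pic_ists = parse_picture_ists(stream, seq)
```

In this sketch, a sequence that disables the mode spends no bits on it in any picture header, since the lower-level element is inferred rather than read.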

29. The method according to claim 1, wherein the rule further specifies that a syntax element at a video unit level is used to indicate an allowed transform set for the conversion.

30. The method according to claim 29, wherein the video unit level includes a picture level.

31. The method according to claim 29, wherein the syntax element is conditionally included based on whether implicit selection of transforms is enabled at the video unit level.

32. The method according to claim 29, wherein the syntax element includes a flag or an index, and wherein the syntax element is included in a picture header or a slice header, the picture header including at least an intra picture header or an inter picture header.

33. The method according to claim 29, wherein N transform sets are supported for the conversion, and the allowed transform set is selected from the N transform sets based on the syntax element.

34. The method according to claim 33, wherein N is equal to 2.

35. The method according to claim 33, wherein the selection of the allowed transform set is further based on coding information of the video block, the coding information including at least a coding mode or a partitioning mode of the video block.

36. The method according to claim 35, wherein, when the video block is coded using the derived tree (DT) mode, Discrete Cosine Transform Type-II (DCT2) is always used.

37. The method according to claim 35, wherein, when the video block is a chroma block, Discrete Cosine Transform Type-II (DCT2) is always used.

38. The method according to claim 33, wherein the N transform sets include {DCT2, DST7} and {DCT2, IT}, where DCT2 represents Discrete Cosine Transform Type-II, DST7 represents Discrete Sine Transform Type-VII, and IT represents Implicit Transform.

39. The method according to claim 38, wherein the N transform sets are applicable to intra-coded blocks.

40. The method according to claim 38, wherein the N transform sets are applicable to intra-coded blocks that use a coding mode other than the derived tree (DT) mode.

41. The method according to claim 33, wherein the N transform sets include {DCT2} and {DCT2, IT}, where DCT2 represents Discrete Cosine Transform Type-II and IT represents Implicit Transform.

42. The method according to claim 41, wherein the N transform sets are applicable to intra block copy (IBC) coded blocks.

43. The method according to claim 41, wherein the N transform sets are applicable to intra block copy (IBC) coded blocks that use a coding mode other than the derived tree (DT) mode.

44. The method according to claim 29, wherein, in response to the syntax element being omitted from the bitstream, a default value indicating a default allowed transform set is inferred.

45. The method according to claim 29, wherein the syntax element includes a flag or an index, and wherein, in response to the video block being coded using a specific coding mode, the syntax element is used to indicate the allowed transform set for the conversion.

46. The method according to claim 45, wherein, when the video block is coded using an intra coding mode, or an intra coding mode other than the derived tree (DT) mode or the pulse code modulation (PCM) mode, the syntax element is used for the conversion.

47. The method according to claim 45, wherein, when the video block is not coded using an intra coding mode, the allowed transform set used for the conversion is independent of the syntax element.

48. The method according to claim 47, wherein, when the video block is an inter-coded block, the allowed transform set includes {DCT2}.

49. The method according to claim 47, wherein, when the video block is an intra block copy (IBC) coded block, the allowed transform set includes {DCT2, IT}.
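
One way to read the transform-set selection of claims 29 to 49 is sketched below. It follows the variant of claims 38 and 47 to 49 (the alternative sets of claim 41 are not modeled), and it leaves PCM blocks out. The mapping of index values to sets, the default index of 0 when the syntax element is omitted, and the names `Mode` and `allowed_transforms` are all assumptions made for illustration.

```python
from enum import Enum, auto
from typing import Optional, Tuple


class Mode(Enum):
    INTRA = auto()     # intra-coded, not DT
    INTRA_DT = auto()  # intra-coded with derived tree mode
    IBC = auto()       # intra block copy
    INTER = auto()


# N = 2 transform sets for intra blocks (claims 34 and 38).
# Which index selects which set is an assumption.
INTRA_SETS = (("DCT2", "DST7"), ("DCT2", "IT"))


def allowed_transforms(mode: Mode, is_chroma: bool = False,
                       set_index: Optional[int] = None) -> Tuple[str, ...]:
    """Return the allowed transform set for a block (illustrative only)."""
    # Claims 36 and 37: DT blocks and chroma blocks always use DCT2.
    if is_chroma or mode is Mode.INTRA_DT:
        return ("DCT2",)
    # Claims 47 to 49: non-intra blocks ignore the syntax element.
    if mode is Mode.INTER:
        return ("DCT2",)
    if mode is Mode.IBC:
        return ("DCT2", "IT")
    # Claim 44: infer a default when the element is omitted (0 is assumed here).
    index = 0 if set_index is None else set_index
    return INTRA_SETS[index]
```

For example, `allowed_transforms(Mode.INTRA, set_index=1)` yields `("DCT2", "IT")` under these assumptions, while an inter block yields `("DCT2",)` regardless of `set_index`.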

50. The method according to any one of claims 1 to 49, wherein the conversion includes encoding the video into the bitstream.

51. The method according to any one of claims 1 to 49, wherein the conversion includes decoding the video from the bitstream.

52. A method for storing a video bitstream, comprising: generating the bitstream by performing the method according to any one of claims 1 to 50; and storing the bitstream in a non-transitory computer-readable recording medium.

53. A video decoding apparatus, comprising a processor configured to implement the method according to any one of claims 1 to 49 and 51.

54. A video encoding apparatus, comprising a processor configured to implement the method according to any one of claims 1 to 50 and 52.

55. A computer program product having computer code stored thereon, the code, when executed by a processor, causing the processor to implement the method according to any one of claims 1 to 52.

56. A non-transitory computer-readable recording medium having a computer program and a bitstream stored thereon, wherein the computer program, when executed by a video processing apparatus, generates the bitstream using the method according to any one of claims 1 to 50.